Total Zero-Copy Serialization with rkyv
Why traditional serialization kills latency and how to implement true zero-copy data loading using rkyv in Rust.
In the world of web development, JSON is king. It is human-readable, flexible, and ubiquitous. But in the world of high-frequency trading (HFT) and systems engineering, serialization formats like JSON, Avro, or even Protobuf are the enemy.
The Serialization Tax
Consider a standard market data update:
{ "s": "BTC-USD", "p": 45000.50, "q": 1.5, "t": 1638291000 }
When your program receives this:
- Allocation: It allocates memory for the string.
- Parsing: It scans the bytes one at a time, typically with a state machine.
- Conversion: ASCII “45000.50” must be parsed into a float (expensive!).
- Layout: Fields are copied into a struct in memory.
This process can burn thousands of CPU cycles per message. In a system processing 10 million messages per second, this “Parsing Tax” can consume 80-90% of your CPU time.
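To make the conversion step concrete, here is a minimal sketch contrasting the two paths for a single price field. The function names and the choice of integer micro-units (6 decimal places) are assumptions made for illustration; a real JSON parser does considerably more work than this.

```rust
/// Text path: walk ASCII such as "45000.50" one byte at a time and build an
/// integer count of micro-units (6 decimal places, an assumed scale).
pub fn parse_price_micros(text: &[u8]) -> Option<u64> {
    let mut value: u64 = 0;
    let mut digits: u32 = 0;
    let mut frac_digits: Option<u32> = None; // None until '.' is seen
    for &b in text {
        match b {
            b'0'..=b'9' => {
                if let Some(n) = frac_digits.as_mut() {
                    if *n == 6 {
                        return None; // more precision than the scale can hold
                    }
                    *n += 1;
                }
                value = value.checked_mul(10)?.checked_add((b - b'0') as u64)?;
                digits += 1;
            }
            b'.' if frac_digits.is_none() => frac_digits = Some(0),
            _ => return None, // sign, exponent, whitespace: rejected in this sketch
        }
    }
    if digits == 0 {
        return None;
    }
    // Scale up for fractional digits that were not written out.
    let missing = 6 - frac_digits.unwrap_or(0);
    value.checked_mul(10u64.pow(missing))
}

/// Binary path: the field is already in machine representation.
pub fn read_price_micros(field: [u8; 8]) -> u64 {
    u64::from_le_bytes(field)
}
```

The text path branches on every byte and needs overflow checks; the binary path is a single fixed-width load.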
The Zero-Copy Promise
True Zero-Copy Serialization means the binary format on disk (or on the wire) is identical to the in-memory layout the reader uses.
- Deserialization becomes a pointer cast (effectively 0 ns: no parsing, no allocation).
- Access is instant: fields are read in place.
- Validation is optional when the source is trusted.
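The "pointer cast" idea can be sketched with the standard library alone. This is not how rkyv is implemented; it is a hand-rolled illustration that assumes a hypothetical `Tick` record, a writer and reader on the same machine (same endianness), and a buffer that happens to be 8-byte aligned.

```rust
use std::mem::{align_of, size_of, size_of_val};

/// Hypothetical fixed-layout record; `repr(C)` pins field order and padding.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Tick {
    pub timestamp: u64,
    pub price: u64,
    pub quantity: u64,
}

/// Reinterprets the front of `bytes` as a `Tick` without copying.
/// Returns `None` if the buffer is too short or misaligned.
pub fn view_tick(bytes: &[u8]) -> Option<&Tick> {
    if bytes.len() < size_of::<Tick>() {
        return None;
    }
    let ptr = bytes.as_ptr();
    if (ptr as usize) % align_of::<Tick>() != 0 {
        return None;
    }
    // SAFETY: length and alignment were checked above, and every bit pattern
    // is a valid `Tick` because all fields are plain `u64`s.
    Some(unsafe { &*(ptr as *const Tick) })
}

/// "Serialize" side of the sketch: lay the fields out as three words.
pub fn tick_to_words(tick: &Tick) -> [u64; 3] {
    [tick.timestamp, tick.price, tick.quantity]
}

/// Views `u64` storage as bytes; the storage is 8-byte aligned by construction.
pub fn words_as_bytes(words: &[u64]) -> &[u8] {
    // SAFETY: `u8` has alignment 1 and the length covers exactly the same memory.
    unsafe { std::slice::from_raw_parts(words.as_ptr() as *const u8, size_of_val(words)) }
}
```

The length and alignment checks are the cheap part of "validation"; a type with restricted bit patterns (enums, bools, offsets) would need more checks before the cast is sound.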
Enter rkyv
rkyv (pronounced “archive”) is the gold standard for zero-copy in Rust. Libraries like bincode pack bytes compactly but still have to copy them into a struct on load. rkyv instead defines an archived representation of each type that can be read directly from the byte buffer.
Relative Pointers: The Magic Trick
You might ask: “How can you store a Vec<u8> or String in a zero-copy format? Vectors are pointers to heap memory. If I send you my pointer 0x7ffee..., it points to garbage on your machine.”
rkyv solves this with Relative Pointers.
Instead of storing an absolute address (0x8000), it stores an offset (+32 bytes from here).
When you load the archive into memory:
- The root object sits at a known position in the buffer (rkyv places it at the end).
- The String field says “my data is at +64 bytes”.
- You follow the offset.
This makes the data relocatable. It doesn’t matter where in RAM it lands; the internal relationships are preserved.
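A relative pointer can be sketched in a few lines of std-only Rust. The layout below (a little-endian `i32` offset followed by a `u32` length, with the pointer placed at the front of the buffer) is an assumption chosen for illustration, not rkyv's actual encoding, and `resolve` and `build_archive` are hypothetical names.

```rust
use std::convert::{TryFrom, TryInto};

/// Reads an 8-byte relative pointer stored at `pos` (i32 offset, u32 length)
/// and returns the bytes it refers to, or `None` if anything is out of bounds.
pub fn resolve(buf: &[u8], pos: usize) -> Option<&[u8]> {
    let header = buf.get(pos..pos.checked_add(8)?)?;
    let offset = i32::from_le_bytes(header[0..4].try_into().ok()?);
    let len = u32::from_le_bytes(header[4..8].try_into().ok()?) as usize;
    // The target is measured from where the pointer itself lives,
    // not from the start of the process address space.
    let start = (pos as i64).checked_add(offset as i64)?;
    let start = usize::try_from(start).ok()?;
    buf.get(start..start.checked_add(len)?)
}

/// Builds a tiny archive: a relative pointer at offset 0, padding,
/// then the payload at offset 64.
pub fn build_archive(payload: &[u8]) -> Vec<u8> {
    let mut buf = vec![0u8; 64];
    buf[0..4].copy_from_slice(&64i32.to_le_bytes());
    buf[4..8].copy_from_slice(&(payload.len() as u32).to_le_bytes());
    buf.extend_from_slice(payload);
    buf
}
```

Copying the archive to a different position in a larger buffer and calling `resolve` at the new position yields the same payload, which is the relocatability property described above.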
Hands-On: Zero-Copy Market Data
Let’s build a zero-copy order book event.
1. Dependencies
[dependencies]
rkyv = { version = "0.7", features = ["validation"] }
2. Defining the Struct
We derive Archive, Serialize, and Deserialize. The #[archive(check_bytes)] attribute generates validation logic for the archived type (critical for untrusted input).
use rkyv::{Archive, Deserialize, Serialize};
#[derive(Archive, Deserialize, Serialize, Debug, PartialEq)]
#[archive(check_bytes)] // Derives CheckBytes for ArchivedMarketEvent (needs the "validation" feature)
#[repr(C)] // C layout for MarketEvent itself; the generated ArchivedMarketEvent has its own layout
pub struct MarketEvent {
    pub symbol: [u8; 8], // Fixed-size array avoids indirections
    pub timestamp: u64,
    pub price: u64,      // Fixed-point (e.g., satoshis)
    pub quantity: u64,
    pub side: u8,        // 0 = Bid, 1 = Ask
    // Note: avoided String and Vec for the "hot" path
}
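Fixed-size symbols and fixed-point prices need small helpers at the edges of the system. The sketch below is one way to write them; the helper names, the zero-padding convention for symbols shorter than 8 bytes, and the 6-decimal price scale are all assumptions, not part of rkyv or of the struct definition above.

```rust
/// Assumed price scale: 6 decimal places, so 3500_000000 means 3500.000000.
pub const PRICE_SCALE: u64 = 1_000_000;

/// Packs a symbol into 8 bytes, zero-padded on the right.
/// Returns `None` if the symbol does not fit.
pub fn pack_symbol(s: &str) -> Option<[u8; 8]> {
    let bytes = s.as_bytes();
    if bytes.len() > 8 {
        return None;
    }
    let mut out = [0u8; 8];
    out[..bytes.len()].copy_from_slice(bytes);
    Some(out)
}

/// Borrows the symbol text back out of the fixed array, stopping at the padding.
pub fn unpack_symbol(raw: &[u8; 8]) -> Option<&str> {
    let end = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    std::str::from_utf8(&raw[..end]).ok()
}

/// Formats a fixed-point price for display only; the hot path stays in integers.
pub fn format_price(value: u64) -> String {
    format!("{}.{:06}", value / PRICE_SCALE, value % PRICE_SCALE)
}
```

`unpack_symbol` returns a borrowed `&str`, so reading the symbol from an archived event still involves no allocation.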
3. Serialization (The “Slow” Path)
This happens at the ingress (Feed Handler).
let event = MarketEvent {
symbol: *b"ETH-USDC",
timestamp: 1620000000,
price: 3500_000000,
quantity: 10_000000,
side: 1,
};
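// Serialize into an aligned byte buffer. AllocSerializer writes to a
// heap-allocated AlignedVec; 256 sizes its scratch space, not a stack buffer.
use rkyv::ser::Serializer; // brings serialize_value into scope
let mut writer = rkyv::ser::serializers::AllocSerializer::<256>::default();
writer.serialize_value(&event).unwrap();
let bytes = writer.into_serializer().into_inner(); // AlignedVec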
4. Deserialization (The “Fast” Path)
This is what the Matching Engine does.
// UNSAFE: Trusted Zero-Copy (Fastest)
// If we trust the source (e.g., our own shared memory ring buffer)
let archived = unsafe { rkyv::archived_root::<MarketEvent>(&bytes) };
println!("Symbol: {:?}", std::str::from_utf8(&archived.symbol));
println!("Price: {}", archived.price);
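Cost: effectively a pointer calculation. archived_root locates the root from the end of the buffer and casts it; no parsing, allocation, or copying takes place. The call is unsafe because it assumes the buffer is correctly aligned and holds a valid archive.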
Advanced: Shared Memory Ring Buffers
In the ZeroCopy Sentinel, we combine rkyv with a shared memory file (/dev/shm).
- Writer serializes events directly into the memory-mapped file.
- Reader maps the same file.
- Reader receives a signal (or polls a cursor).
- Reader accesses archived_root at the specific offset.
This eliminates memcpy between processes. The data written by the Feed Handler is instantly visible to the Strategy Engine.
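The slot-and-cursor mechanics can be sketched in a single process. In the sketch below a `Vec<u8>` stands in for the memory-mapped region, records are fixed-size byte arrays rather than rkyv archives, and the `Ring` type and its methods are hypothetical. A real cross-process version would keep the cursor in the shared region as an atomic and would need appropriate memory ordering, both of which are omitted here.

```rust
/// Assumed fixed record size for every slot.
pub const SLOT_SIZE: usize = 32;

/// Single-process stand-in for a shared-memory ring of fixed-size slots.
pub struct Ring {
    region: Vec<u8>, // stand-in for the mapped /dev/shm file
    slots: usize,
    next_seq: u64,   // the writer's cursor: sequence number of the next record
}

impl Ring {
    pub fn new(slots: usize) -> Self {
        assert!(slots > 0, "ring needs at least one slot");
        Ring { region: vec![0u8; slots * SLOT_SIZE], slots, next_seq: 0 }
    }

    /// Byte offset of the slot that holds sequence number `seq`.
    fn slot_offset(&self, seq: u64) -> usize {
        (seq % self.slots as u64) as usize * SLOT_SIZE
    }

    /// Writes one record into the next slot and returns its sequence number.
    pub fn publish(&mut self, record: &[u8; SLOT_SIZE]) -> u64 {
        let seq = self.next_seq;
        let off = self.slot_offset(seq);
        self.region[off..off + SLOT_SIZE].copy_from_slice(record);
        self.next_seq += 1;
        seq
    }

    /// Borrows the record for `seq` in place, or returns `None` if it has not
    /// been published yet or has already been overwritten.
    pub fn read(&self, seq: u64) -> Option<&[u8]> {
        let oldest = self.next_seq.saturating_sub(self.slots as u64);
        if seq >= self.next_seq || seq < oldest {
            return None;
        }
        let off = self.slot_offset(seq);
        Some(&self.region[off..off + SLOT_SIZE])
    }
}
```

`read` hands back a slice into the region itself, which is where a call like archived_root would be applied in the rkyv version; the reader never copies the record out.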
Benchmarks
| Format | Deser Time | Allocation | Copying |
|---|---|---|---|
| JSON | 4,200 ns | Yes | Yes |
| Bincode | 120 ns | Yes | Yes |
| Alpaca | 60 ns | Yes | Yes |
| Cap’n Proto | 5 ns | No | No |
| rkyv | < 1 ns | No | No |
Summary
- Avoid Parsing: Parsing is overhead.
- Relocatable Data: Use relative pointers.
- Trusted ingress: Validate once at the edge, use unsafe zero-copy internally.
Next, we need a way to pass these zero-copy events between threads without locking. Enter the Disruptor.