2025-06-22
In this post, I'll show that several binary formats outperform JSON at serialization in NodeJS. However, Node's runtime and library ecosystem introduce complexity that must be dealt with to achieve this performance.
This is intended as a companion to Binary Formats are Better than JSON in Browsers. In that post, I explored how several binary formats outperform JSON in browsers. In this post, I'll focus on serialization performance in backend services written in NodeJS.
For all of these tests, the code is available here. I used the NYC citibike dataset as my data for benchmarking. In my benchmark, I serialize a list of 100,000 trips in different serialization formats.
Here are the final results, comparing the formats that I tested:
Naively, one would assume that binary formats are always better than JSON, since they produce smaller output, and fewer bytes ought to be faster. However, JSON serialization is implemented in C++ as part of V8, and can make use of optimizations that are not available to other libraries. A library like avsc is implemented in pure JavaScript, and so has fewer optimization opportunities.
In a past job, I wrote a doc arguing that Avro didn't perform as well as JSON for serialization, so we shouldn't use it. That no longer seems to be the case, so I wanted to write this follow-up.
I also found that it was important to optimize a couple of things in the serialization benchmark to get a good result. Without optimization, some of these serializers were 10x slower. I wanted to document this somewhere, in case it is helpful.
At the beginning of these tests, the results were radically different from the eventual optimized versions:
There were two big changes that I made to several of these serializers which were responsible for this speedup: avoiding the garbage created when setting up the object to serialize, and avoiding repeatedly growing buffers during serialization.
When I started CPU profiling this benchmark, it became very clear that we were spending almost all of our time inside the garbage collector. I quickly switched to looking at allocation sampling. Node makes this very easy using the inspector module -- third-party libraries are no longer required for doing scoped profiling!
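For reference, scoped allocation sampling with the built-in inspector module looks roughly like this (a sketch: the output filename and the shape of the workload function are placeholders, not the benchmark's actual code):

```ts
import * as inspector from "node:inspector";
import { writeFileSync } from "node:fs";

// Run a synchronous workload with heap allocation sampling scoped around it.
function profileAllocations(run: () => void): void {
  const session = new inspector.Session();
  session.connect();

  session.post("HeapProfiler.startSampling", () => {
    run();
    session.post("HeapProfiler.stopSampling", (err, result) => {
      if (err) throw err;
      // Load this file into the Memory panel of Chrome DevTools to inspect it.
      writeFileSync("allocations.heapprofile", JSON.stringify(result.profile));
      session.disconnect();
    });
  });
}
```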
Allocation sampling clearly showed that most of this garbage was created in the two aforementioned places: when setting up the object to serialize, and when growing buffers.
The first thing I noticed when optimizing this code was that we were creating most of our garbage when setting up the object for serialization. That code looked like this:
const response = {
  trips: trips.map((trip) => ({
    rideId: trip.rideId,
    rideableType: mapToAvroRideableType(trip.rideableType),
    startLat: trip.startLat || null,
    ...
  })),
};
This code was responsible for mapping between our internal types and the types expected by our Avro schema. In this case, I needed to replace some `undefined` values with `null`, and remap enums. In other cases, I needed to slightly rename fields, like `startTime` becoming `startTimeMs`.
At any rate, this created a huge amount of garbage, which bogged down our benchmark.
I was able to fix this by creating "remapper types", which looked like this:
class AvroTripTransformer {
  constructor(private underlying: Trip) {}

  get rideId(): string {
    return this.underlying.rideId;
  }

  get rideableType(): string {
    return mapToAvroRideableType(this.underlying.rideableType);
  }

  get startLat(): number | null {
    return this.underlying.startLat || null;
  }

  ...
}
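The object handed to the serializer then just wraps each trip, rather than copying every remapped field up front. Roughly (a sketch, assuming an avsc `Type` instance named `avroType`, which isn't shown here):

```ts
const response = {
  trips: trips.map((trip) => new AvroTripTransformer(trip)),
};

// avsc reads each field through the getters on demand, so the remapped
// values are computed lazily instead of being copied into a throwaway
// object for every trip.
const encoded = avroType.toBuffer(response);
```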
This mostly eliminated the garbage created by these objects, which more than doubled the speed of the Avro benchmark. Other serializers saw a noticeable, but less dramatic, speedup.
The second big thing I noticed was that many serializers allocated a lot of buffers. Most of the libraries I looked at closely implement dynamically growing buffers by allocating a new buffer at double the size whenever the current one fills up, and copying the data over. That is a lot of wasted effort in JavaScript, where allocating a Uint8Array is weirdly expensive relative to other languages.
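For illustration, the growth pattern these libraries implement internally looks something like this (a generic sketch, not any particular library's code):

```ts
// Grow a buffer by doubling until it can hold `needed` bytes.
// Every growth step allocates a fresh Uint8Array and copies everything
// written so far -- that allocation and copy is the wasted work.
function ensureCapacity(buf: Uint8Array, needed: number): Uint8Array {
  if (needed <= buf.length) return buf;
  let size = Math.max(buf.length, 1);
  while (size < needed) size *= 2;
  const grown = new Uint8Array(size);
  grown.set(buf);
  return grown;
}
```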
Some of these libraries permitted us to provide a buffer for the serializer to use instead, so I could simply allocate a large enough buffer up front, and avoid this cost.
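With avsc, for example, `Type#encode` accepts a destination buffer, so the benchmark can allocate once up front. A minimal sketch (the schema and buffer size here are placeholders, not the benchmark's real ones):

```ts
import * as avro from "avsc";

// Stand-in schema; the real benchmark uses the full Trip record.
const tripsType = avro.Type.forSchema({
  type: "array",
  items: {
    type: "record",
    name: "Trip",
    fields: [
      { name: "rideId", type: "string" },
      { name: "startLat", type: ["null", "double"], default: null },
    ],
  },
});

// One large up-front allocation instead of letting the library grow buffers.
const buffer = Buffer.alloc(64 * 1024 * 1024);

function serialize(trips: unknown): Buffer {
  const pos = tripsType.encode(trips, buffer, 0);
  if (pos < 0) {
    // avsc signals an undersized buffer with a negative return value.
    throw new Error("pre-allocated buffer is too small");
  }
  // A view over the written bytes; no copy.
  return buffer.subarray(0, pos);
}
```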
It seems to me that all of these libraries ought to just compute the length of the serialized message first, then allocate an appropriately sized buffer, rather than trying to dynamically grow buffers.
Avsc essentially does this, but in an inefficient way. Its underlying `Tap` implementation will silently overflow when presented with an object that is too big to serialize. It will then explicitly allocate an appropriately sized buffer, and repeat the serialization operation into this buffer. It would probably be faster to have an explicit way to determine the size of a serialized message.
Other libraries, like msgpackr, do not have a way to provide an appropriately sized buffer, but they do permit us to reuse a buffer previously returned by a prior call to msgpackr. By using this method and repeating the test a number of times, we can amortize away the initial attempt, which grows the buffer dynamically.
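Roughly what that looks like with msgpackr (a sketch; `useBuffer` is my reading of the library's API for handing a previously returned buffer back, so double-check it against the msgpackr docs):

```ts
import { Packr } from "msgpackr";

// Placeholder payload; the benchmark serializes 100,000 trips.
const trips = [{ rideId: "abc", startLat: 40.7 }];

const packr = new Packr();

// First call grows msgpackr's internal buffer as it goes.
let out = packr.pack(trips);

for (let i = 0; i < 1000; i++) {
  // Reuse the previously returned buffer so later iterations don't pay
  // for growing a fresh one (assumed API -- verify before relying on it).
  packr.useBuffer(out);
  out = packr.pack(trips);
}
```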
While I was able to speed up Bebop considerably, it was still much slower than JSON. It didn't have a reasonable way to provide a pre-allocated buffer to the serializer; I suspect that would help a lot.
Protobuf.js does not have a good API for avoiding garbage during setup. It requires creating explicit, library-specific `Message` objects, which we can't easily fake to avoid creating garbage.
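For context, the Protobuf.js flow looks roughly like this (the schema path and message names are hypothetical):

```ts
import * as protobuf from "protobufjs";

const root = protobuf.loadSync("trips.proto");
const TripList = root.lookupType("TripList");

function serialize(trips: object[]): Uint8Array {
  // fromObject copies the entire object graph into library-specific Message
  // instances -- this per-call copy is the garbage we can't easily avoid.
  const message = TripList.fromObject({ trips });
  return TripList.encode(message).finish();
}
```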
Protobuf.js avoids the issues with dynamically growing buffers out of the box. However, this appears to come at the cost of a much more complicated implementation. Garbage collection still accounts for 85% of the time spent serializing protobuf messages with Protobuf.js, so I'd guess that this optimized implementation costs more than it saves.
This situation with Protobuf.js is unfortunate. There's nothing in the Protobuf format that requires the serializer to be this slow.
I tried another library called Pbf, and it was an order of magnitude faster. It was also around 88% smaller than Protobuf.js, by lines of code. The code it generated was very easy to read and understand.
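With Pbf, the generated code is just plain functions that write into a Pbf instance. Roughly (the generated module name and message name are assumptions based on my schema, not the benchmark's actual files):

```ts
import Pbf from "pbf";
// Hypothetical module generated ahead of time by the `pbf` CLI from the trips schema.
import { TripList } from "./trips_proto.js";

function serialize(tripList: { trips: object[] }): Uint8Array {
  const pbf = new Pbf();
  TripList.write(tripList, pbf);
  return pbf.finish();
}
```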
To make a broad generalization, I've seen this kind of pattern with many JavaScript libraries. They are not designed with performance in mind, and end up having poor performance characteristics.
Compared with minimally optimized Rust code, JavaScript is just very slow. This is not surprising at all, but I think it bears remembering. If you have a choice, don't use JavaScript on the server.
Note that all of this high-performance JavaScript is rather subtle, and requires using a profiler and carefully weighing tradeoffs. I think it's probably harder to write this kind of JavaScript than it is to just use a compiled language with better performance characteristics, like Rust. Obviously, there are situations where NodeJS is a requirement, but it'd be best not to use it for anything that needs to perform well.