Serialization from NodeJS

2025-06-22

Introduction

In this post, I'll show that several binary formats outperform JSON at serialization in NodeJS. However, Node's runtime and library ecosystem introduce complexity that must be dealt with to achieve this performance.

This is intended as a companion to Binary Formats are Better than JSON in Browsers. In that post, I explored how several binary formats outperform JSON in browsers. In this post, I'll focus on serialization performance in backend services written in NodeJS.

For all of these tests, the code is available here. I used the NYC citibike dataset as my data for benchmarking. In my benchmark, I serialize a list of 100,000 trips in different serialization formats.

Here are the final results, comparing the formats that I tested:

Why is this interesting?

Naively, one would assume that binary formats are always better than JSON, since they produce a smaller output, and fewer bytes ought to be faster. However, JSON serialization is implemented in C++ as part of V8, and can make use of optimizations that are not available to other libraries. A library like avsc is implemented in pure JavaScript, and so has fewer optimization opportunities.

In a past job, I wrote a doc about how Avro didn't perform as well as JSON for serialization, so we shouldn't use it. This seems to no longer be the case, so I wanted to write this follow-up.

I also found that it was important to optimize a couple of things in the serialization benchmark to get a good result. Without optimization, some of these serializers were 10x slower. I wanted to document this somewhere, in case it is helpful.

Optimizations

At the beginning of these tests, the results were radically different from the eventual optimized versions:

There were two big changes that I made to several of these serializers which were responsible for this speedup: creating the input to the serializer without generating lots of garbage, and giving the serializer an appropriately sized buffer up front.

Profiling and Allocation Sampling:

When I started CPU profiling this benchmark, it became very clear that we were spending almost all of our time inside the garbage collector, so I quickly switched to allocation sampling. Node makes this very easy using the inspector module -- third-party libraries are no longer required for doing scoped profiling!
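
Here's a minimal sketch of scoped allocation sampling, using the promise-based inspector API available in recent Node versions; sampleAllocations and runBenchmark are illustrative names for the wrapper and the code under test:

import { Session } from 'node:inspector/promises';
import { writeFileSync } from 'node:fs';

async function sampleAllocations(runBenchmark: () => void): Promise<void> {
	const session = new Session();
	session.connect();

	// Start sampling, run the code under test, then stop and collect the profile.
	await session.post('HeapProfiler.enable');
	await session.post('HeapProfiler.startSampling');
	runBenchmark();
	const { profile } = await session.post('HeapProfiler.stopSampling');
	await session.post('HeapProfiler.disable');
	session.disconnect();

	// The resulting profile can be loaded into Chrome DevTools for inspection.
	writeFileSync('benchmark.heapprofile', JSON.stringify(profile));
}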

Allocation sampling clearly showed that most of this garbage was created in the two aforementioned places: when setting up the object to serialize, and when growing buffers.

Creating the input to the serializer without creating lots of garbage.

The first thing I noticed when optimizing this code was that we were creating most of our garbage when setting up the object for serialization, in code that looked like this:


	const response = {
		trips: trips.map((trip) => ({
			rideId: trip.rideId,
			rideableType: mapToAvroRideableType(trip.rideableType),
			startLat: trip.startLat || null,
			...
		})),
	};

This code was responsible for mapping between our internal types and the types expected by our Avro schema. In this case, I needed to replace some undefined values with null, and remap enums. In other cases, I needed to slightly rename fields, like startTime becoming startTimeMs.

At any rate, this created a huge amount of garbage, which bogged down our benchmark.

I was able to fix this by creating "remapper types", which looked like this:

class AvroTripTransformer {
	constructor(private underlying: Trip) {}

	get rideId(): string {
		return this.underlying.rideId;
	}

	get rideableType(): string {
		return mapToAvroRideableType(this.underlying.rideableType);
	}

	get startLat(): number | null {
		return this.underlying.startLat || null;
	}
	...
}
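
To use it, the setup code just wraps each trip instead of copying its fields, roughly like this (one small wrapper object is still allocated per trip, but the per-field copies go away):

	const response = {
		trips: trips.map((trip) => new AvroTripTransformer(trip)),
	};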

This mostly eliminated the garbage created by these objects, which more than doubled the speed of the Avro benchmark. Other serializers had a noticeable, but less dramatic, speedup.

Using an appropriately sized buffer.

The second big thing I noticed was that many serializers allocated a lot of buffers. Most of the libraries that I looked at closely implement dynamically growing buffers: when the current buffer fills up, they allocate a new buffer at double the size and copy the data into it. This is a lot of wasted effort in JavaScript, where Uint8Arrays are weirdly expensive relative to other languages.

Some of these libraries let us provide a buffer for the serializer to use instead, so I could simply allocate a large enough buffer up front and avoid this cost.
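
For example, avsc's Type exposes an encode method that writes into a caller-provided buffer. Here's a sketch of that approach; the schema and the buffer size are placeholders, not the ones from the benchmark:

import * as avro from 'avsc';

// Placeholder schema; the real benchmark uses the full trip record.
const responseType = avro.Type.forSchema({
	type: 'record',
	name: 'TripList',
	fields: [{ name: 'trips', type: { type: 'array', items: 'string' } }],
});

// One large buffer allocated up front and reused for every serialization.
const buffer = Buffer.alloc(64 * 1024 * 1024);

function serialize(response: unknown): Buffer {
	const pos = responseType.encode(response, buffer, 0);
	if (pos < 0) {
		// encode returns a negative number when the buffer is too small.
		throw new Error('pre-allocated buffer was too small');
	}
	return buffer.subarray(0, pos);
}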

It seems to me that all of these libraries ought to just compute the length of the serialized message first, then allocate an appropriately sized buffer, rather than trying to dynamically grow buffers.

Avsc essentially does this, but in an inefficient way. Its underlying Tap implementation will silently overflow when presented with an object that is too big to fit in its buffer. It will then allocate an appropriately sized buffer and repeat the serialization into this new buffer. It would probably be faster to have an explicit way to determine the size of a serialized message.

Other libraries, like Msgpackr, do not have a way to provide an appropriately sized buffer, but they do permit us to reuse a buffer previously returned by a prior call to msgpackr. By using this method and repeating the test a number of times, we can amortize away the cost of the initial attempt, which grows the buffer dynamically.
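
Here's a sketch of that amortization pattern with msgpackr; the payload and the iteration count are stand-ins for the benchmark's:

import { Packr } from 'msgpackr';

const packr = new Packr();
const trips: object[] = []; // stand-in for the 100,000-trip payload

// The first call pays the cost of growing msgpackr's target buffer.
let encoded = packr.pack({ trips });

// Later calls reuse that buffer, so repeating the serialization many times
// amortizes away the one-time growth cost.
for (let i = 0; i < 100; i++) {
	encoded = packr.pack({ trips });
}
console.log(encoded.length);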

Serializer-Specific Observations

Bebop

While I was able to speed up Bebop considerably, it was still much slower than JSON. It didn't have a reasonable way to provide a pre-allocated buffer to the serializer. I suspect this would help considerably.

Protobuf.js

Protobuf.js does not have a good API for avoiding garbage during setup. It requires creating explicit, library-specific Message objects, which we can't easily fake to avoid creating garbage.

Protobuf.js avoids the issues with dynamically growing buffers out of the box. However, this appears to be at the cost of a much more complicated implementation. Garbage collection still accounts for about 85% of the time spent serializing protobuf messages with Protobuf.js, so I'd guess that this optimized implementation costs more than it saves.

This situation with Protobuf.js is unfortunate. There's nothing in the Protobuf format that requires the serializer to be this slow.

I tried another library called Pbf, and it was an order of magnitude faster. It was also around 88% smaller than Protobuf.js, by lines of code. The code it generated was very easy to read and understand.
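
For reference, here's roughly what serializing with Pbf's generated code looks like; the TripList module is illustrative, standing in for whatever Pbf's code generator emits for the benchmark's .proto file:

import Pbf from 'pbf';
// Generated ahead of time with Pbf's compiler, e.g. `npx pbf trips.proto > trips.js`.
import { TripList } from './trips.js';

const response = { trips: [] }; // stand-in for the real payload

const pbf = new Pbf();
TripList.write(response, pbf);
const encoded = pbf.finish(); // Uint8Array containing the serialized message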

To make a broad generalization, I've seen this kind of pattern with many JavaScript libraries. They are not designed with performance in mind, and end up having poor performance characteristics.

Use a Different Programming Language

Compared with minimally optimized Rust code, JavaScript is just very slow. This is not surprising at all, but I think it bears remembering. If you have a choice, don't use JavaScript on the server.

Note that all of this high-performance JavaScript is rather subtle, and requires using a profiler and carefully weighing tradeoffs. I think it's probably harder to write this kind of JavaScript than it is to just use a compiled language with better performance characteristics, like Rust. Obviously, there are situations where NodeJS is a requirement, but it'd be best not to use it for anything that needs to perform well.

Conclusion