# What's the best way to concatenate Uint8Arrays?


This post is for people familiar with JavaScript’s `Uint8Array`.

Sometimes, I want to combine multiple `Uint8Array`s into one. Something like this:

``````const a = new Uint8Array([1, 2]);
const b = new Uint8Array([3, 4]);
const c = new Uint8Array([5]);

concatenate([a, b, c]);
// => Uint8Array(5) [1, 2, 3, 4, 5]
``````

What’s the best way to do this?

If you’re only using Node, use `Buffer.concat`.

If you’re not only using Node, I prefer this solution:

``````/**
* Combine multiple Uint8Arrays into one.
*
* @param {Uint8Array[]} uint8arrays
* @returns {Uint8Array}
*/
function concatenate(uint8arrays) {
  const totalLength = uint8arrays.reduce(
    (total, uint8array) => total + uint8array.byteLength,
    0
  );

  const result = new Uint8Array(totalLength);

  let offset = 0;
  uint8arrays.forEach((uint8array) => {
    result.set(uint8array, offset);
    offset += uint8array.byteLength;
  });

  return result;
}
``````

For this problem, I wanted to write a function, `concatenate`, which took an array of `Uint8Array`s as input and returned a single, combined `Uint8Array` as output. I didn’t want to install any libraries.

I found three reasonable ways to do this:

1. Allocate the result upfront, then update the result in pieces
2. Put them in a `Blob`
3. `Buffer.concat()` (requires Node)

Overall, I think option 1 is best. If you’re using Node, option 3 is probably best, just because it’s built in.

I also tried a few ideas that didn’t work well. We’ll see some of those failures below.

I tested these with Deno and Firefox. I generated random `Uint8Array`s of sizes between 0 bytes and 1 mebibyte, then tried to combine them. I did this with 0, 1, 10, 100, and 1000 inputs. (If you’re curious, check out the simple benchmark script I wrote.)
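The benchmark script itself isn't reproduced in this post, but a rough sketch of the setup described above might look like this. The helper names `makeRandomInputs` and `timeConcatenation` are hypothetical, and `Math.random` is only a convenient stand-in for whatever random source the real script uses.

```javascript
// Hypothetical helper: build `count` Uint8Arrays, each with a random
// length between 0 and `maxBytes` (inclusive), filled with random bytes.
function makeRandomInputs(count, maxBytes = 1024 * 1024) {
  const inputs = [];
  for (let i = 0; i < count; i++) {
    const length = Math.floor(Math.random() * (maxBytes + 1));
    const bytes = new Uint8Array(length);
    for (let j = 0; j < length; j++) {
      bytes[j] = Math.floor(Math.random() * 256);
    }
    inputs.push(bytes);
  }
  return inputs;
}

// Hypothetical helper: time a single call and return elapsed milliseconds.
// `await` makes this work for both sync and async implementations.
async function timeConcatenation(concatenate, inputs) {
  const start = performance.now();
  await concatenate(inputs);
  return performance.now() - start;
}
```

You could then call `timeConcatenation(concatenate, makeRandomInputs(count))` for each input count and compare the implementations. A single run is noisy, so repeating each measurement and averaging would give steadier numbers.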

Let’s look at these solutions.

## Reasonable Option 1: allocate the result, set each piece

Here’s the first solution I came up with:

``````function concatenate(uint8arrays) {
  // Determine the length of the result.
  const totalLength = uint8arrays.reduce(
    (total, uint8array) => total + uint8array.byteLength,
    0
  );

  // Allocate the result.
  const result = new Uint8Array(totalLength);

  // Copy each Uint8Array into the result.
  let offset = 0;
  uint8arrays.forEach((uint8array) => {
    result.set(uint8array, offset);
    offset += uint8array.byteLength;
  });

  return result;
}
``````

At a high level, it:

1. Allocates the result buffer. To do this, you need the length of the result, which you can get by adding up all the inputs’ lengths.
2. Copies the inputs into the result, piece by piece.

This option is reasonably fast and works for large inputs. I would reach for a solution like this in most cases.
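As a quick sanity check, here is a sketch of how this approach behaves on a few edge cases: no inputs, empty inputs, and inputs that are views into a larger buffer (created with `subarray`). The function is the same idea as above, just written with `for...of`; the sample values are made up for illustration.

```javascript
function concatenate(uint8arrays) {
  let totalLength = 0;
  for (const uint8array of uint8arrays) {
    totalLength += uint8array.byteLength;
  }

  const result = new Uint8Array(totalLength);
  let offset = 0;
  for (const uint8array of uint8arrays) {
    result.set(uint8array, offset);
    offset += uint8array.byteLength;
  }
  return result;
}

// No inputs: the result is an empty Uint8Array.
const none = concatenate([]);
none.length;
// => 0

// Empty inputs contribute 0 bytes, so they need no special handling.
const withEmpty = concatenate([
  new Uint8Array([1]),
  new Uint8Array(0),
  new Uint8Array([2]),
]);
Array.from(withEmpty);
// => [1, 2]

// Views work too: `set` copies only the bytes the view covers.
const big = new Uint8Array([9, 8, 7, 6, 5]);
const view = big.subarray(1, 3); // covers bytes 8 and 7
const fromView = concatenate([view, new Uint8Array([0])]);
Array.from(fromView);
// => [8, 7, 0]
```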

## Reasonable Option 2: put them in a `Blob`

Option 1 is good but is a bit long. Here’s a much shorter solution:

``````async function concatenate(uint8arrays) {
  // Put the inputs into a Blob.
  const blob = new Blob(uint8arrays);

  // Pull an ArrayBuffer out. (Has to be async.)
  const buffer = await blob.arrayBuffer();

  // Convert that ArrayBuffer to a Uint8Array.
  return new Uint8Array(buffer);
}
``````

It’s a little harder to read, but you could even shorten this further:

``````const concatenate = async (uint8arrays) =>
  new Uint8Array(await new Blob(uint8arrays).arrayBuffer());
``````

At a high level, this solution takes advantage of the fact that the `Blob` constructor accepts an array of `Uint8Array`s as input (among other things), and can then be converted to an `ArrayBuffer`. It’s easy to convert an `ArrayBuffer` to a `Uint8Array` once you have one.

I like the brevity of this solution, and also the fact that it takes more advantage of the standard library. However, it is asynchronous, which means you need to `await` it (or handle the promise). Also, in my informal testing, this version was about 5% slower than Option 1.

Overall, I think this solution is worse than Option 1 unless brevity is your primary goal, in which case it’s the better choice.
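Because this version returns a promise, its callers have to change too. Here is a small sketch of both calling styles, assuming a runtime where `Blob` is a global (browsers, Deno, and recent versions of Node). The names `concatenateViaBlob`, `main`, and `pending` are only for illustration.

```javascript
const concatenateViaBlob = async (uint8arrays) =>
  new Uint8Array(await new Blob(uint8arrays).arrayBuffer());

const parts = [
  new Uint8Array([1, 2]),
  new Uint8Array([3, 4]),
  new Uint8Array([5]),
];

// Style 1: await the result inside an async function.
async function main() {
  const combined = await concatenateViaBlob(parts);
  return Array.from(combined); // resolves to [1, 2, 3, 4, 5]
}

// Style 2: handle the promise directly with `then`.
const pending = concatenateViaBlob(parts).then(
  (combined) => combined.byteLength
);
// `pending` resolves to 5
```

Either way, the asynchrony spreads to every caller, which is the main cost of this option.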

## Reasonable Option 3: `Buffer.concat` (Node-only)

If you’re using Node, there’s an even briefer solution: `Buffer.concat`. It’s built in!

``````const a = new Uint8Array([1, 2]);
const b = new Uint8Array([3, 4]);
const c = new Uint8Array([5]);

Buffer.concat([a, b, c]);
// => <Buffer 01 02 03 04 05>
``````

This returns a `Buffer`. `Buffer`s are `Uint8Array`s with subtle differences that probably don’t affect you (and you can convert them to `Uint8Array`s easily if you wish).
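If you do want a plain `Uint8Array`, here is a Node-only sketch of two ways to convert. The first copies the bytes. The second creates a view over the same memory, so it has to pass along the `Buffer`’s `byteOffset` and `byteLength`, because a `Buffer` may sit partway into a larger pooled `ArrayBuffer`.

```javascript
const combined = Buffer.concat([
  new Uint8Array([1, 2]),
  new Uint8Array([3, 4]),
]);

// Option A: copy the bytes into a new, plain Uint8Array.
const copy = new Uint8Array(combined);

// Option B: create a Uint8Array view over the same memory (no copy).
const view = new Uint8Array(
  combined.buffer,
  combined.byteOffset,
  combined.byteLength
);

Buffer.isBuffer(copy);
// => false
Buffer.isBuffer(view);
// => false

// The view shares memory with the Buffer; the copy does not.
combined[0] = 99;
view[0];
// => 99
copy[0];
// => 1
```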

If you look at Node’s source code, `Buffer.concat` looks a lot like Option 1; it allocates a result buffer and copies each `Uint8Array` inside.

Unlike Option 1, `Buffer.concat` uses `Buffer.allocUnsafe` internally. These buffers are pulled from a memory pool that might contain stale data. It doesn’t set every value to 0 [1], unlike `new Uint8Array`:

``````new Uint8Array(3);
// => Uint8Array(3) [ 0, 0, 0 ]

Buffer.allocUnsafe(3);
// => <Buffer 62 4c 36> (your results may vary)
``````

`Buffer.allocUnsafe` is sometimes dangerous, as its name suggests, but it can be faster. In this case, it doesn’t matter what the result buffer is initialized to, because Node overwrites whatever was there before, so we can safely enjoy the performance benefit.

I recommend using this option if you’re only in the Node world. It was the fastest option I tested, and doesn’t require writing/importing any code.

## The graveyard of bad solutions

I tried a bunch of other ideas and all of them were bad for various reasons. If you want to see several ideas that don’t work…read on.

### Bad idea 1: a big array

My simplest (stupidest?) idea was to put everything into a big array, then convert that to a `Uint8Array` at the end.

``````// Warning: this solution is bad!
function concatenate(uint8arrays) {
  const array = [];
  for (const uint8array of uint8arrays) {
    array.push(...uint8array);
  }
  return new Uint8Array(array);
}
``````

This worked okay for smaller inputs but crashed with larger inputs. These arrays can get quite big.

I evaluated a similar idea where you would pass an array-like object (e.g., `{ length: 3, "0": 9, "1": 8, "2": 7 }`). This “solution” had a similar problem (allocating a huge object) and failed for similar reasons. [2]
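For reference, the array-like variant relies on the fact that the `Uint8Array` constructor accepts any object with a `length` and numeric keys. Here is a toy sketch of how that might look; the function name `concatenateViaArrayLike` is hypothetical, this is not necessarily the exact code that was tested, and it carries the same warning as the big-array version.

```javascript
// Warning: this solution is bad!
function concatenateViaArrayLike(uint8arrays) {
  // Build a plain object shaped like an array: { length, 0: ..., 1: ... }.
  const arrayLike = { length: 0 };
  for (const uint8array of uint8arrays) {
    for (const byte of uint8array) {
      arrayLike[arrayLike.length] = byte;
      arrayLike.length++;
    }
  }

  // The constructor reads `length`, then each index in order.
  return new Uint8Array(arrayLike);
}

concatenateViaArrayLike([new Uint8Array([9, 8]), new Uint8Array([7])]);
// => Uint8Array(3) [9, 8, 7]
```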

### Bad idea 2: a generator

To avoid allocating a giant array myself, I tried using a generator function:

``````// Warning: this solution is bad!
new Uint8Array(
  (function* () {
    for (const uint8array of uint8arrays) {
      yield* uint8array;
    }
  })()
);
``````

This was about 10× slower for small inputs and ran out of memory for larger ones. I assume this is because the result size is unknown upfront, so JavaScript has to store everything in an internal buffer somewhere, effectively allocating that giant array I was trying to avoid.

### Bad idea 3: an iterator (no generators)

In my anecdotal experience, generators can be slower than hand-rolled iterator code, so I tried writing the iterator myself.

This code was long, but here’s an abbreviated version [3]:

``````// Warning: this solution is bad!

class ByteIterator {
  /* ...code skipped... */
}

class ByteIterable {
  // ...code skipped...
  [Symbol.iterator]() {
    return new ByteIterator(this.uint8arrays);
  }
}

const iterator = (uint8arrays) =>
  new Uint8Array(new ByteIterable(uint8arrays));
``````

This was faster than the generator solution but still much slower than the other good solutions, and still ran out of memory for large inputs. I assume it failed for the same reason as the generator one.

### Bad idea 4: `Uint8Array.from`

`Uint8Array.from` accepts a “map function” as its second argument, which we can abuse for this purpose.

This code was also very long. Here’s a very abbreviated version:

``````// Warning: this solution is bad!
Uint8Array.from({ length: totalLength }, () => {
  // This is pseudocode:
  return nextByteFromLatestUint8Array;
});
``````

Unlike all the other bogus ideas I tried, this one does actually work. It’s not a complete failure!

However, when compared to the other good options, I don’t think it has any benefits at all. It’s a lot slower (the worst I saw was 16× slower!), harder to understand, more code, and less compatible with browsers and runtimes.
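One way the pseudocode could be fleshed out is sketched below. This is not necessarily the code that was benchmarked; the name `concatenateViaFrom` and the cursor variables `index` and `position` are assumptions made for this example.

```javascript
// Warning: this solution is bad!
function concatenateViaFrom(uint8arrays) {
  const totalLength = uint8arrays.reduce(
    (total, uint8array) => total + uint8array.byteLength,
    0
  );

  let index = 0; // which input we are currently reading from
  let position = 0; // how far into that input we are

  // The map function is called once per output byte, in order.
  return Uint8Array.from({ length: totalLength }, () => {
    // Move to the next input once the current one is used up.
    // This also skips over empty inputs.
    while (position >= uint8arrays[index].length) {
      index++;
      position = 0;
    }
    return uint8arrays[index][position++];
  });
}

concatenateViaFrom([new Uint8Array([1, 2]), new Uint8Array([3])]);
// => Uint8Array(3) [1, 2, 3]
```

The per-byte callback is what makes this slow: every output byte costs a function call, whereas `set` copies a whole input at once.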

## Summary

I recommend using `Buffer.concat` if you’re using Node and Option 1 otherwise.

Contact me if I missed something!

1. Unless you override its default behavior.

2. There might be a way to make this idea workable using proxies, but I didn’t try that.

3. If you want to read some long code that doesn’t work, here’s the full iterator-based “solution”.