Combining JavaScript Arrays
This is a quick, simple post on JavaScript techniques. We're going to cover different methods for combining/merging two JS arrays, and the pros/cons of each approach.
Let's start with the scenario:
var a = [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ];
var b = [ "foo", "bar", "baz", "bam", "bun", "fun" ];
The simple concatenation of a and b would, obviously, be:

[ 1, 2, 3, 4, 5, 6, 7, 8, 9, "foo", "bar", "baz", "bam", "bun", "fun" ]
concat(..)
The most common approach is:
var c = a.concat( b );

a; // [1,2,3,4,5,6,7,8,9]
b; // ["foo","bar","baz","bam","bun","fun"]

c; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]
As you can see, c is a whole new array that represents the combination of the two a and b arrays, leaving a and b untouched. Simple, right?
What if a is 10,000 items, and b is 10,000 items? c is now 20,000 items, which constitutes basically doubling the memory usage of a and b.
"No problem!", you say. We just unset a
and b
so they are garbage collected, right? Problem solved!
a = b = null; // `a` and `b` can go away now
Meh. For only a couple of small arrays, this is fine. But for large arrays, or for repeating this process a lot of times, or for working in memory-limited environments, it leaves a lot to be desired.
Looped Insertion
OK, let's just append one array's contents onto the other, using Array#push(..):
// `b` onto `a`:
for (var i=0; i < b.length; i++) {
    a.push( b[i] );
}

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

b = null;
Now, a has the result of both the original a plus the contents of b.

Better for memory, it would seem.
But what if a was small and b was comparatively really big? For both memory and speed reasons, you'd probably want to push the smaller a onto the front of b rather than the longer b onto the end of a. No problem, just replace push(..) with unshift(..) and loop in the opposite direction:
// `a` into `b`:
for (var i=a.length-1; i >= 0; i--) {
    b.unshift( a[i] );
}

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

a = null;
Functional Tricks
Unfortunately, for loops are ugly and harder to maintain. Can we do any better?
Here's our first attempt, using Array#reduce:
// `b` onto `a`:
a = b.reduce( function(coll,item){
    coll.push( item );
    return coll;
}, a );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

// or `a` into `b`:
b = a.reduceRight( function(coll,item){
    coll.unshift( item );
    return coll;
}, b );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]
Array#reduce(..) and Array#reduceRight(..) are nice, but they are a tad clunky. ES6 => arrow-functions will slim them down slightly, but they still require a function call per item, which is unfortunate.
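For illustration, here's a sketch of how the first form might slim down with an ES6 arrow-function:

// `b` onto `a`, arrow-function style:
a = b.reduce( (coll, item) => {
    coll.push( item );
    return coll;
}, a );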
What about:
// `b` onto `a`:
a.push.apply( a, b );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

// or `a` into `b`:
b.unshift.apply( b, a );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]
That's a lot nicer, right!? Especially since the unshift(..) approach here doesn't need to worry about the reverse ordering as in the previous attempts. ES6's spread operator will be even nicer: a.push( ...b ) or b.unshift( ...a ).
But, things aren't as rosy as they might seem. In both cases, passing either a or b as apply(..)'s second argument (or via the ... spread operator) means that the array is being spread out as arguments to the function.
The first major problem is that we're effectively doubling the size (temporarily, of course!) of the thing being appended by essentially copying its contents to the stack for the function call. Moreover, different JS engines have different implementation-dependent limitations to the number of arguments that can be passed.
So, if the array being added on has a million items in it, you'd almost certainly way exceed the size of the stack allowed for that push(..) or unshift(..) call. Ugh. It'll work just fine for a few thousand elements, but you have to be careful not to exceed a reasonably safe limit.
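To make the stack concern concrete, here's a minimal sketch of how you might probe it; the exact threshold and error message are engine-dependent:

// illustrative only; the argument limit varies by engine
var big = new Array( 1000000 );
var target = [];

try {
    target.push.apply( target, big );
}
catch (err) {
    // e.g. "RangeError: Maximum call stack size exceeded"
    console.log( "exceeded the engine's limit:", err.message );
}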
Note: You can try the same thing with splice(..), but you'll reach the same conclusions as with push(..) / unshift(..).
One option would be to use this approach, but batch up segments at the max safe size:
function combineInto(a,b) {
    var len = a.length;
    // work backwards through `a` in chunks, so each chunk
    // lands at the front of `b` in its original order
    for (var i=len; i > 0; i=i-5000) {
        b.unshift.apply( b, a.slice( Math.max( 0, i - 5000 ), i ) );
    }
}
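Usage would look something like this (with a and b as defined at the top of the post):

combineInto( a, b );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]
a; // [1,2,3,4,5,6,7,8,9] (only read from, never modified)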
Wait, we're going backwards in terms of readability (and perhaps even performance!). Let's quit before we give up all our gains so far.
Summary
Array#concat(..) is the tried and true approach for combining two (or more!) arrays. But the hidden danger is that it's creating a new array instead of modifying one of the existing ones.
There are options which modify-in-place, but they have various trade-offs.
Given the various pros/cons, perhaps the best of all of the options (including others not shown) are reduce(..) and reduceRight(..).
Whatever you choose, it's probably a good idea to critically think about your array merging strategy rather than taking it for granted.
About Kyle Simpson
Kyle Simpson is a web-oriented software engineer, widely acclaimed for his "You Don't Know JS" book series and nearly 1M hours viewed of his online courses. Kyle's superpower is asking better questions, and he deeply believes in maximally using the minimally-necessary tools for any task. As a "human-centric technologist", he's passionate about bringing humans and technology together, evolving engineering organizations toward solving the right problems in simpler ways. Kyle will always fight for the people behind the pixels.
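[@max's original suggestion didn't survive the page; judging from the replies below, it was presumably a join/split round-trip along these lines:]

var c = ( a.join(",") + "," + b.join(",") ).split(",");

c; // ["1","2","3","4","5","6","7","8","9","foo","bar","baz","bam","bun","fun"] (every element is now a string)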
@max:

1) That would only work for array elements which can be serialized to a string value and back. It wouldn't work for arrays of complex objects, for instance (my normal use-case).

2) It would totally fall down if any value that was stringified had a "," appearing in it, like a list of names ("Simpson, Kyle").

3) It triples the memory usage. :(
@Kyle:

It would be interesting to see some benchmarks for the different methods, both in performance and in memory usage. One should not trust optimizations without proof, especially since the hypotheses above depend on the underlying implementation of the array. However, I can only speculate, as my knowledge of the underlying implementation is not good enough in this case.
@max: As a note to your suggestion: strings are immutable objects in JavaScript. So in your example, a.join and b.join would create two strings, the "+" operations would create a third string, and finally the "split" would create an array. However, the GC should be really fast, as I think the strings will end up on the stack and not the heap.
@nils:

I wasn't really asserting optimizations. I don't think it's possible to absolutely reason about such things. But it's guaranteed that a.concat(b) has to produce another array, and I have definitely run into such problems in node where I ran my VM out of memory when working with large arrays.

I suspect that performance benchmarking (CPU or memory) of such low-level JS operations would be very tough to do accurately, as the engine has so many more things it's doing on top of your code.
The performance benchmark is easy. The memory benchmark is a bit trickier, but in Chrome it would be possible to observe the memory consumption.

For the interested reader, there is a benchmark created on jsperf: http://jsperf.com/combining-js-arrays

From the benchmark numbers, it's not worth using these methods unless Array.prototype.concat is causing problems. Also, please note that even if an approach looks like it's using less memory, that might not be the case, because the underlying implementation might create a new array and copy data from the old array to the new one (note that in this case I'm talking about the underlying array implementation).

I tried to find an explanation of the underlying implementation, but I could not find one. It would be really interesting if someone could find that.

P.S. It's worth noting the difference between Chrome and Firefox, where "looped insertion" is actually on par with "concat".
@kyle:

Please consider adding the benchmark to the article. Also, I agree with your comment above regarding optimizations. What's true today might not be tomorrow (or in another browser).
I’ve been enjoying these snippets of information. Once I see most of them, it just seems like such a “duh!” moment.
I use this approach in JS a lot. Should be quite efficient.
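[jovi's snippet didn't survive the page; from the replies below, which mention a while(length) loop that empties b, it was presumably something like:]

// drain `b` into `a`, emptying `b` as it goes
while (b.length) {
    a.push( b.shift() );
}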
@jovi-

Of course, that works similarly to the for loop approach I show in the post, except that yours also mutates b by emptying it (in addition to mutating a). That may be tolerable in certain cases, but I can think of several cases where b would need to stay untouched in its merging into a.
@kyle: That is true, but in cases where b may well be mutated, it is an efficient way with a minimum of function calls and without temporarily doubling the required memory. So I consider it quite efficient, but then again I have no benchmarks to prove that.

If b may not be mutated, however, it seems kinda obvious that one cannot get around some usage of extra memory. Personally, even then I'd prefer my method operating on [].concat(b), simply for readability reasons.

Anyways, thanks for the great article! Totally forgot to mention that :)
I am shocked by how slow my while(length) approach actually runs… I don't think I will use it again :)

I played around a bit on jsfiddle: http://jsfiddle.net/loopmode/j90aqhse/5/

100000 iterations work fine on Win7/Chrome37 with most approaches, but a million makes the browser freeze, despite the try/catch, unless using your chunked combineInto approach. As expected, that one does its job: http://jsfiddle.net/loopmode/mh0q345e/3/
Nice post!

I created a new post about 15 JavaScript Hacks, and I wrote one tip about combining arrays; of course I added this post as a reference :)

The post is in Portuguese, but I hope you enjoy it: http://udgwebdev.com/15-javascript-hacks/
And do a bit more work if you need to loop through c.
BTW, always use "for/in" (never the arithmetic loop) to loop through an array. (I know that's not what they say on SO, but it works when the start index is not zero, when the end index is not "length - 1", when the array is sparse, when … http://martinrinehart.com/frontend-engineering/engineers/javascript/arrays/array-loops.html )
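[For illustration, a small sketch of the sparse-array case the commenter means:]

// `for/in` visits only the indices that actually exist:
var sparse = [];
sparse[2] = "x";
sparse[9] = "y";

for (var idx in sparse) {
    console.log( idx, sparse[idx] ); // "2" "x", then "9" "y"
}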
Using for/in is never a good idea, not even for objects. Why? Because it uses a boatload of resources; knowing the length is always better. So an object which has a length property would be better suited to iterating by the length property (if the values are referenced that way).

The fastest loop is a reverse while loop.
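[A sketch of the "reverse while" idiom being referred to:]

// loop an array from the end toward the start:
var i = a.length;
while (i--) {
    console.log( a[i] ); // visits items in reverse index order
}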
Just a small use case inside of a function constructor.
Depending on what you are going to use that new array for, you may get away with cheating. If you just want something you can forEach over, you don't need to copy at all; just return a new object.

http://jsperf.com/combining-js-arrays/8

It is cheating, of course, since you may want a real Array sometimes, but the option is there. Lastly, this is just code I wrote in the console to show an example. You could make a new "class", let's call it List, that has the same API as an Array. Things you can't fake would be the third argument to iterator functions and bracket notation, but you could offer a getter method instead of bracket access.
Sometimes it's OK to cheat. If all you want is to be able to forEach over it later, this is all you need. This is basically the "bit more work" that Martin mentioned above.

http://jsperf.com/combining-js-arrays/8

You could make a List "class" and get most of the methods that Arrays have without doing any copying at all.
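[A minimal sketch of that no-copy idea; the combinedView name and exact shape are hypothetical:]

// a read-only "view" over both arrays; nothing is copied
function combinedView(a, b) {
    return {
        length: a.length + b.length,
        forEach: function(fn) {
            var offset = a.length;
            a.forEach( function(item, i){ fn( item, i ); } );
            b.forEach( function(item, i){ fn( item, offset + i ); } );
        }
    };
}

combinedView( [1,2], ["foo"] ).forEach( function(item, i){
    console.log( i, item ); // 0 1, then 1 2, then 2 "foo"
} );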
Just wanted to note that the reduce approach doesn't require assignment, because it modifies a in place.

Slight error in your examples: a = b.reduce( .. , a ) should become just b.reduce( .. , a ). Else, if a is actually part of a larger object (var big = {a:[]}, a = big.a), you would not change big.a but just a…. Besides the fact that the "a =" is simply redundant.
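[Presumably the form the commenter has in mind, applied to the article's example:]

// `b` onto `a`; no reassignment needed, since `a` is mutated in place:
b.reduce( function(coll,item){
    coll.push( item );
    return coll;
}, a );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]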
Is there a big difference between forEach and reduce in this scenario?

And a little over a year later, we have this:
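[The snippet didn't survive; given the "year later" timing, presumably the ES6 spread form, something like:]

var c = [ ...a, ...b ];

c; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]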
I love the direction JavaScript is heading in.
You are the champ! Thanks for this!!!
A naive question, but something like this: http://jsperf.com/combining-js-arrays seems to differ from your views…
How do you append two row arrays into a single array, with the first row array as column 1 and the second row array as column 2, in JavaScript?
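[A minimal sketch of that kind of column-wise pairing, assuming both rows are the same length; zipColumns is a hypothetical name:]

function zipColumns(row1, row2) {
    var table = [];
    for (var i=0; i < row1.length; i++) {
        table.push( [ row1[i], row2[i] ] );
    }
    return table;
}

zipColumns( [1,2,3], ["foo","bar","baz"] );
// [ [1,"foo"], [2,"bar"], [3,"baz"] ]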
I like concat(..). While I agree it's not the best for size, I think it's also worth mentioning that it doesn't have side effects.
This page ripped off your article, almost word-for-word:
https://medium.com/@Andela/combining-arrays-in-javascript-3b1b9cf874a1
You've missed the best and most obvious solution: just because the concat function leaves the original arrays untouched doesn't mean you have to!
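[The snippet is missing; presumably the self-assignment form:]

a = a.concat( b );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]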
What if you wanted the new array to match the elements up by their corresponding indexes? Not ending up with the simple concatenation above, but with each pair of elements joined (separated by a space)?
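[The example outputs are missing; presumably something like this index-wise pairing, assuming equal-length arrays:]

var nums = [ 1, 2, 3 ];
var names = [ "foo", "bar", "baz" ];

var c = nums.map( function(item, i){
    return item + " " + names[i];
} );

c; // [ "1 foo", "2 bar", "3 baz" ]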
What about this? This doesn't mutate any array you input, but it allows you to combine multiple arrays into a fresh array very easily:
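[The snippet didn't survive; based on the description, presumably a variadic concat helper along these lines (combine is a hypothetical name):]

function combine() {
    // concat every argument array onto a fresh empty array
    return [].concat.apply( [], arguments );
}

var c = combine( a, b );

c; // fresh combined array; neither `a` nor `b` is mutated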