Combining JavaScript Arrays


This is a quick, simple post on JavaScript techniques. We're going to cover different methods for combining/merging two JS arrays, and the pros/cons of each approach.

Let's start with the scenario:

var a = [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ];
var b = [ "foo", "bar", "baz", "bam", "bun", "fun" ];

The simple concatenation of a and b would, obviously, be:

[
   1, 2, 3, 4, 5, 6, 7, 8, 9,
   "foo", "bar", "baz", "bam", "bun", "fun"
]

concat(..)

The most common approach is:

var c = a.concat( b );

a; // [1,2,3,4,5,6,7,8,9]
b; // ["foo","bar","baz","bam","bun","fun"]

c; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

As you can see, c is a whole new array that represents the combination of the two a and b arrays, leaving a and b untouched. Simple, right?
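To make the "untouched" part concrete, here's a minimal sketch. The names `first`, `second`, `third`, and `merged` are made up for this illustration. It shows that concat(..) hands back a brand-new array, and that it accepts more than one array at a time:

```javascript
var first = [ 1, 2, 3 ];
var second = [ "foo", "bar" ];
var third = [ true, false ];

// concat(..) accepts any number of arrays and returns a new one
var merged = first.concat( second, third );

merged;             // [1,2,3,"foo","bar",true,false]
merged === first;   // false -- a different array object
first.length;       // 3 -- `first` was not modified
```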

What if a is 10,000 items, and b is 10,000 items? c is now 20,000 items, which constitutes basically doubling the memory usage of a and b.

"No problem!", you say. We just unset a and b so they are garbage collected, right? Problem solved!

a = b = null; // `a` and `b` can go away now

Meh. For only a couple of small arrays, this is fine. But for large arrays, or when repeating this process many times, or when working in memory-limited environments, it leaves a lot to be desired.

Looped Insertion

OK, let's just append one array's contents onto the other, using Array#push(..):

// `b` onto `a`
for (var i=0; i < b.length; i++) {
    a.push( b[i] );
}

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

b = null;

Now, a holds both the original contents of a and the contents of b.

Better for memory, it would seem.

But what if a was small and b was comparatively really big? For both memory and speed reasons, you'd probably want to push the smaller a onto the front of b rather than the longer b onto the end of a. No problem, just replace push(..) with unshift(..) and loop in the opposite direction:

// `a` into `b`:
for (var i=a.length-1; i >= 0; i--) {
    b.unshift( a[i] );
}

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

a = null;

Functional Tricks

Unfortunately, for loops are ugly and harder to maintain. Can we do any better?

Here's our first attempt, using Array#reduce:

// `b` onto `a`:
a = b.reduce( function(coll,item){
    coll.push( item );
    return coll;
}, a );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

// or `a` into `b`:
b = a.reduceRight( function(coll,item){
    coll.unshift( item );
    return coll;
}, b );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

Array#reduce(..) and Array#reduceRight(..) are nice, but they are a tad clunky. ES6 => arrow-functions will slim them down slightly, but they still require a function call per item, which is unfortunate.
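For reference, here's roughly what the arrow-function versions could look like. This is only a sketch: the second pair of arrays (`front` and `back`) is introduced purely so both directions can be shown side by side, and the comma-operator trick is one of several ways to return the collection.

```javascript
var a = [ 1, 2, 3 ];
var b = [ "foo", "bar", "baz" ];

// `b` onto `a`: push(..) returns the new length, so the comma
// operator is used to hand `coll` back to reduce(..)
b.reduce( (coll,item) => ( coll.push( item ), coll ), a );

a; // [1,2,3,"foo","bar","baz"]

var front = [ 1, 2, 3 ];
var back = [ "foo", "bar", "baz" ];

// `front` into `back`: walk `front` from the end so the order is kept
front.reduceRight( (coll,item) => ( coll.unshift( item ), coll ), back );

back; // [1,2,3,"foo","bar","baz"]
```

Shorter, yes, but there's still one function call for every item.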

What about:

// `b` onto `a`:
a.push.apply( a, b );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

// or `a` into `b`:
b.unshift.apply( b, a );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

That's a lot nicer, right!? Especially since the unshift(..) approach here doesn't need to worry about the reverse ordering as in the previous attempts. ES6's spread operator will be even nicer: a.push( ...b ) or b.unshift( ...a ).
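Spelled out in full, the spread versions might look like the sketch below. The second pair of arrays (`front` and `back`) is only there so both directions can be shown without one result feeding into the other.

```javascript
var a = [ 1, 2, 3 ];
var b = [ "foo", "bar", "baz" ];

// `b` onto `a`:
a.push( ...b );

a; // [1,2,3,"foo","bar","baz"]

var front = [ 1, 2, 3 ];
var back = [ "foo", "bar", "baz" ];

// `front` into `back`: all items are inserted in one call,
// so their order is preserved
back.unshift( ...front );

back; // [1,2,3,"foo","bar","baz"]
```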

But, things aren't as rosy as they might seem. In both cases, passing either a or b to apply(..)'s second argument (or via the ... spread operator) means that the array is being spread out as arguments to the function.

The first major problem is that we're effectively doubling the size (temporarily, of course!) of the thing being appended by essentially copying its contents to the stack for the function call. Moreover, different JS engines have different implementation-dependent limits on the number of arguments that can be passed.

So, if the array being added on has a million items in it, you'd almost certainly way exceed the size of the stack allowed for that push(..) or unshift(..) call. Ugh. It'll work just fine for a few thousand elements, but you have to be careful not to exceed a reasonably safe limit.
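If you're curious where your engine draws that line, here's a rough probe. The helper name `tryPushAll` is made up for this example, and I'm assuming the engine reports "too many arguments" as a RangeError, which is typical but not something to rely on. Whether the big call at the end succeeds is entirely engine-dependent.

```javascript
// hypothetical helper: attempts `push.apply(..)` and reports whether
// the engine accepted that many arguments
function tryPushAll(target,source) {
    try {
        target.push.apply( target, source );
        return true;
    }
    catch (err) {
        // assumption: the argument limit shows up as a RangeError
        if (err instanceof RangeError) return false;
        throw err;
    }
}

var small = [ 1, 2, 3 ];
tryPushAll( small, [ "foo", "bar" ] );   // true
small;                                   // [1,2,3,"foo","bar"]

// may be `true` or `false`, depending on the engine and its limits
var bigResult = tryPushAll( [], new Array( 1000000 ) );
```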

Note: You can try the same thing with splice(..), but you'll have the same conclusions as with push(..) / unshift(..).

One option would be to use this approach, but batch up segments at the max safe size:

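function combineInto(a,b) {
    var len = a.length;
    // take the chunks of `a` from the end, so they keep their
    // original order once each one is unshifted onto `b`
    for (var i=len; i > 0; i=i-5000) {
        b.unshift.apply( b, a.slice( Math.max( 0, i-5000 ), i ) );
    }
}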
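For the other direction (`b` onto the end of `a`), a similar sketch with push(..) might look like this. The name `appendInto` and the optional `chunkSize` parameter are invented for this illustration, and the default of 5000 is just the same guess at a safe size used above.

```javascript
// hypothetical counterpart: appends `b` onto `a` in batches
function appendInto(a,b,chunkSize) {
    chunkSize = chunkSize || 5000;
    for (var i=0; i < b.length; i=i+chunkSize) {
        // each batch is small enough to be passed as arguments
        a.push.apply( a, b.slice( i, i+chunkSize ) );
    }
    return a;
}

var a = [ 1, 2, 3 ];
var b = [ "foo", "bar", "baz", "bam", "bun" ];

appendInto( a, b, 2 );   // tiny batch size, just to exercise the loop

a; // [1,2,3,"foo","bar","baz","bam","bun"]
```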

Wait, we're going backwards in terms of readability (and perhaps even performance!). Let's quit before we give up all our gains so far.

Summary

Array#concat(..) is the tried and true approach for combining two (or more!) arrays. But the hidden danger is that it's creating a new array instead of modifying one of the existing ones.

There are options which modify-in-place, but they have various trade-offs.

Given the various pros/cons, perhaps the best of all of the options (including others not shown) is reduce(..) and reduceRight(..).

Whatever you choose, it's probably a good idea to critically think about your array merging strategy rather than taking it for granted.

Kyle Simpson

About Kyle Simpson

Kyle Simpson is an Open Web Evangelist from Austin, TX, who's passionate about all things JavaScript. He's an author, workshop trainer, tech speaker, and OSS contributor/leader.


Discussion

  1. max
    c=(a.join(",")+","+b.join(",")).split(",")
    • 1) that would only work for array elements which can be serialized to a string value and back. wouldn’t work for arrays of complex objects, for instance (my normal use-case).

      2) it would totally fall down if any value that was stringified had a “,” appearing in it, like a list of names (“Simpson, Kyle”)

      3) it triples the memory usage. :(

  2. nils

    @Kyle:
    It would be interesting to see some benchmarks for the different methods, both in performance and in memory usage. One should not trust optimizations without proof. Especially since the hypothesis above should depend on the underlying implementation of the array. However, I can only speculate as my knowledge of the underlying implementation is not good enough in this case.

    @max: As a note to your suggestion. Strings are immutable objects in JavaScript. So in your example a.join and b.join would create two strings. The “+” operation would create a third string. And finally the “split” would create an array. However, the GC should be really fast, as I think the strings will end up on the stack and not the heap.

    • @nils:

      I wasn’t really asserting optimizations. I don’t think it’s possible to absolutely reason about such things. But it’s guaranteed that a.concat(b) has to produce another array, and I have definitely run into such problems in node where I ran my VM out of memory when working with large arrays.

      I suspect that performance benchmarking (CPU or memory) of such low-level JS operations would be very tough to do accurately, as the engine has so many more things it’s doing on top of your code.

    • nils

      The performance benchmark is easy. The memory benchmark is a bit trickier, but in Chrome it would be possible to observe the memory consumption.

      For the interested reader there is a benchmark created on jsperf: http://jsperf.com/combining-js-arrays

      From the benchmark numbers it’s not worth using these methods unless Array.prototype.concat is causing problems. Also, please note that even if the approach looks like it’s using less memory, that might not be the case, because the underlying implementation might create a new array and copy data from the old array to the new array (please note that in this case I’m talking about the underlying array implementation).

      I tried to find an explanation of the underlying implementation but I could not find one. It would be really interesting if someone could find that.

      P.S. It’s worth noting the difference between Chrome and Firefox, where “looped insertion” is actually on par with “concat”.

    • nils

      @kyle:

      Please consider adding the benchmark to the article. Also, I agree with your comment above regarding optimizations. What’s true today might not be tomorrow (or in another browser).

  3. I’ve been enjoying these snippets of information. Once I see most of them, it just seems like such a “duh!” moment.

  4. jovi
    var a = [1,2,3],
        b = ["a", "b", "c"];
    while (b.length) {
        a.push(b.shift())
    }
    

    I use this approach in JS a lot. Should be quite efficient.

    • @jovi-

      Of course, that works similarly to the for loop approach I show in the post, except that yours also mutates b by emptying it (in addition to mutating a). That may be tolerable in certain cases, but I can think of several cases where b would need to stay untouched in its merging into a.

    • It sounds better…

  5. jovi

    @kyle that is true, but in case b may well be mutated, it is an efficient way with minimum amount of function calls and without temporarily doubling required memory. So I consider it quite efficient, but then again I have no benchmarks to prove that.

    If b may not be mutated, however, it seems kinda obvious that one cannot get around some usage of extra memory. Personally, even then I’d prefer my method operating on [].concat(b) simply for readability reasons.

    Anyways, thanks for the great article! totally forgot to mention that :)

  6. Jovi

    I am shocked by how slow my while(length) approach actually runs… I don’t think i will use it again :)

    I played around a bit on jsfiddle:
    http://jsfiddle.net/loopmode/j90aqhse/5/

    100000 iterations work fine on Win7/Chrome37 with most approaches, but a million makes the browser freeze, despite try/catch, unless using your chunked combineInto approach.
    As expected, that one does its job:
    http://jsfiddle.net/loopmode/mh0q345e/3/

  7. Nice post!

    I created a new post about 15 Javascript Hacks and I wrote one tip about combining arrays, of course I added this post as reference :)

    The post is in portuguese but I hope you enjoy it
    http://udgwebdev.com/15-javascript-hacks/

  8. var c = [a, b];
    

    And do a bit more work if you need to loop through c.

    BTW, always use “for/in” (never the arithmetic loop) to loop through an array. (I know that’s not what they say on SO, but it works when the start index is not zero, when the end index is not ‘length – 1’, when the array is sparse, when … http://martinrinehart.com/frontend-engineering/engineers/javascript/arrays/array-loops.html ).

    • Chris

      Using for/in is never a good idea, not even for objects. Why? Because it takes a load of resources; knowing the length is always better. So an object which has a length param would be better suited to go by the length property (if the values are as referenced).

      Fastest loop is a reverse while loop.

      var len = this.length = element.length;
      while(len--){
        this[len] = element[len];
      }

      Just a small use case inside of a function constructor.

  9. Depending on what you are going to use that new array for, you may get away with cheating. If you just want something you can .forEach() over, you don’t need to copy at all; just return a new object.

    http://jsperf.com/combining-js-arrays/8

    It is cheating of course since you may want a real Array sometimes, but the option is there. Lastly, this is just code I wrote in the console to show an example. You could make a new “class”, let’s call it List, that has the same API as an Array, that is, add list.{map, filter, reduce, reduceRight, length}. Things you can’t fake would be the third argument to iterator functions and bracket notation. But you could do list.get(i) instead of array[i].

  10. Sometimes it’s ok to cheat. If all you want is to be able to forEach over it later this is all you need. This is basically the “bit more work” that Martin mentioned above.

    function cheatCombine(a,b) {
      return {
        forEach: function(iter) {
          var alen = a.length, len = alen + b.length
          for (var i = 0; i < len; i++) {
            if (i < alen) {
              iter(a[i], i)
            }
            else {
              iter(b[i - alen], i)
            }
          }
        }
      }
    }
    

    http://jsperf.com/combining-js-arrays/8

    You could make a List “class” and get most of the methods that Arrays have without doing any copying at all.

  11. Just wanted to note that the reduce approach doesn’t require assignment because it modifies a in-place:

    b.reduce( function(coll,item){
        coll.push( item );
        return coll;
    }, a);
    
  12. Jörn Berkefeld

    slight error in your examples:

    a = b.reduce( function(coll,item){
        coll.push( item );
        return coll;
    }, a );
    

    should become

    b.reduce( function(coll,item){
        coll.push( item );
        return coll;
    }, a );
    

    else if a is actually part of a larger object (var big = {a:[]}, a=big.a) you would not change big.a but just a….

    besides the fact that the “a=” is simply redundant.

    • Cesar Vidril

      Is there a big difference between forEach and reduce in this scenario?

  13. And a little over a year later, we have this:

      a.push(...b);
    

    I love the direction JavaScript is heading in.

  14. Tarun

    Although it’s a naive question, something like http://jsperf.com/combining-js-arrays
    seems to differ from your views…

  15. sunny

    how to append two row arrays into a single array with first row array as column 1 and second row array as column 2?
    in Javascript

  16. Scott

    I like [].concat.apply([], arrayOfArrays). While I agree it’s not the best for size, I think it’s also worth mentioning that it doesn’t have side effects.
