Collaborative Editing in JavaScript: An Intro to Operational Transformation

By  on  

I've set out to build a robust collaborative code editor for the web. It's called Codr, and it lets developers work together in real time - like Google Docs for code. For web developers, Codr doubles as a shared reactive work surface where every change is instantly rendered for all viewers. Check out Codr's newly launched Kickstarter campaign to learn more.

A collaborative editor allows multiple people to edit the same document simultaneously and to see each other's edits and selection changes as they occur. Concurrent text editing allows for engaging and efficient collaboration that would otherwise be impossible. Building Codr has enabled me to better understand and (I hope) convey how to build a fast and reliable collaborative application.

The Challenge

If you've built a collaborative editor or have talked with someone who has, then you know that gracefully handling concurrent edits in a multi-user environment is challenging. It turns out, however, that a few relatively simple concepts greatly simplify this problem. Below I'll share what I've learned in this regard through building Codr.

The primary challenge associated with collaborative editing is concurrency control. Codr uses a concurrency control mechanism based on Operational Transformation (OT). If you'd like to read up about the history and theory of OT, then check out the wikipedia page. I'll introduce some of the theory below, but this post is intended as an implementor's guide and is hands-on rather than abstract.

Codr is built in JavaScript and code examples are in JavaScript. Significant logic needs to be shared between the server and client to support collaborative editing, so a node/iojs backend is an excellent choice. In the interest of readability, code examples are in ES6.

A Naive Approach to Collaborative Editing

In a zero-latency environment, you might write a collaborative editor like this:

Client

editor.on('edit', (operation) => 
    socket.send('edit', operation));
socket.on('edit', (operation) => 
    editor.applyEdit(operation));

Server

socket.on('edit', (operation) => {
    document.applyEdit(operation);
    getOtherSockets(socket).forEach((otherSocket) => 
        otherSocket.emit('edit', operation)
    );
});

Every action is conceptualized as either an insert or delete operation. Each operation is:

  1. Locally applied in the editing component
  2. Sent to the server
  3. Applied to a server-side copy of the document
  4. Broadcast to other remote editors
  5. Locally applied to each remote editor's copy of the document

Latency Breaks Things

When you introduce latency between the client and server, however, you run into problems. As you've likely foreseen, latency in a collaborative editor introduces the possibility of version conflicts. For example:

Starting document state:

bcd

User 1 inserts a at document start. The operation looks like this:

{
    type: 'insert',
    lines: ['a'],
    range: {
        start: { row: 0, column: 0}
        end: {row: 0, column: 1}
    }
}

At the same time, User 2 types e at document end:

{
    type: 'insert',
    lines: ['e'],
    range: {
        start: { row: 0, column: 3}
        end: {row: 0, column: 4}
    }
}

What should happen is that User 1 and User 2 end up with:

abcde

In reality, User 1 sees:

bcd    <-- Starting Document State
abcd   <-- Apply local "insert 'a'" operation at offset 0
abced  <-- Apply remote "insert 'e'" operation at offset 3

And User 2 sees:

bcd    <-- Starting Document State
bcde   <-- Apply local "insert 'e'" operation at offset 3
abcde  <-- Apply remote "insert 'a'" operation at offset 0

Oops! 'abced' != 'abcde' - the shared document is now in an inconsistent state.

The Easy Fix is Too Slow

The above conflict occurs because each user is "optimistically" applying edits locally without first ensuring that no one else is making edits. Since User 1 changed the document out from under User 2, a conflict occurred. User 2's edit operation presupposes a document state that no longer exists by the time it's applied to User 1's document.

A simple fix is to switch to a pessimistic concurrency control model where each client requests an exclusive write lock from the server before applying updates locally. This avoids conflicts altogether. Unfortunately, the lag resulting from such an approach over an average internet connection would render the editor unusable.

Operational Transformation to the Rescue

Operational Transformation (OT) is a technique to support concurrent editing without compromising performance. Using OT, each client optimistically updates its own document locally and the OT implementation figures out how to automatically resolve conflicts.

OT dictates that when we apply a remote operation we first "transform" the operation to compensate for conflicting edits from other users. The goals are two-fold:

  1. Ensure that all clients end up with consistent document states
  2. Ensure that the intent of each edit operation is preserved

In my original example, we'd want to transform User 2's insert operation to insert at character offset 4 instead of offset 3 when we apply it to User 1's document. This way, we respect User 2's intention to insert e after d and ensure that both users end up with the same document state.

Using OT, User 1 will see:

bcd    <-- Starting Document State
abcd   <-- Apply local "insert 'a'" operation at offset 0
abcde  <-- Apply TRANSFORMED "insert 'e'" operation at offset 4

And User 2 will see:

bcd    <-- Starting Document State
bcde   <-- Apply local "insert 'e'" operation at offset 3
abcde  <-- Apply remote "insert 'a'" operation at offset 0

The Life-Cycle of An Operation

A helpful way to visualize how edits are synchronized using OT is to think of a collaborative document as a git repository:

  1. Edit operations are commits
  2. The server is the master branch
  3. Each client is a topic branch off of master

Merging Edits Into Master (Server-Side) When you make an edit in Codr, the following occurs:

  1. The Codr client branches from master and locally applies your edit
  2. The Codr client makes a merge request to the server

Here's git's lovely (slightly adapted) diagram. Letters reference commits (operations):

Before Merge:

      A topic (client)
     /
    D---E---F master (server)

After Merge:

      A ------ topic
     /         \
    D---E---F---G master

To do the merge, the server updates (transforms) operation A so that it still make sense in light of preceding operations E and F, then applies the transformed operation (G) to master. The transformed operation is directly analogous to a git merge commit.

Rebasing Onto Master (Client-Side) After an operation is transformed and applied server-side, it is broadcasted to the other clients. When a client receives the change, it does the equivalent of a git rebase:

  1. Reverts all "pending" (non-merged) local operations
  2. Applies remote operation
  3. Re-applies pending operations, transforming each operation against the new operation from the server

By rebasing the client rather than merging the remote operation as is done server-side, Codr ensure that edits are applied in the same order across all clients.

Establishing a Canonical Order of Edit Operations

The order in which edit operations are applied is important. Imagine that two users type the characters a and b simultaneously at the same document offset. The order in which the operations occur will determine if ab or ba is shown. Since latency is variable, we can't know with certainty what order the events actually occurred in, but it's nonetheless important that all clients agree on the same ordering of events. Codr treats the order in which events arrive at the server as the canonical order.

The server stores a version number for the document which is incremented whenever an operation is applied. When the server receives an operation, it tags the operation with the current version number before broadcasting it to the other clients. The server also sends a message to the client initiating the operation indicating the new version. This way every client knows what its "server version" is.

Whenever a client sends an operation to the server, it also sends the client's current server version. This tells the server where the client "branched", so the server knows what previous operations the new change needs to be transformed against.

Transforming an Operation

The core of Codr's OT logic is this function:

function transformOperation(operation1, operation2) {
    // Modify operation2 such that its intent is preserved
    // subsequent to intervening change operation1
}

I won't go into the full logic here, as it gets involved, but here are some examples:

  1. If op1 inserted line(s) before op2's line, increase op2's line offset accordingly.

  2. If op1 inserted text before op2 on the same line, increase op2's character offset accordingly.

  3. If op1 occurred entirely after op2, then don't do anything.

  4. If op1 inserts text into a range that op2 deletes, then grow op2's deletion range to include the inserted text and add the inserted text. Note: Another approach would be to split op2 into two deletion actions, one on either side of op1's insertion, thus preserving the inserted text.

  5. If op1 and op2 are both range deletion operations and the ranges overlap, then shrink op2's deletion range to only include text NOT deleted by op1.

Syncing Cursor Position and Selection

A user selection is simply a text range. If the start and end points of the range are equal, then the range is a collapsed cursor. When the user selection changes, the client sends the new selection to the server and the server broadcasts the selection to the other clients. As with editing operations, Codr transforms the selection against conflicting operations from other users. The transform logic for a selection is simply a subset of the logic needed to transform an insert or delete operation.

Undo/Redo

Codr gives each user their own undo stack. This is important for a good editing experience: otherwise hitting CMD+Z could undo someone else's edit in a different part of the document.

Giving each user their own undo stack also requires OT. In fact, this is one case where OT would be necessary even in a zero-latency environment. Imagine the following scenario:

abc     <-- User 1 types "abc"
abcde   <-- User 2 types "de"
ce      <-- User 1 deletes "bcd"
??      <-- User 2 hits CMD+Z

User2's last action was:

{
    type: 'insert',
    lines: ['de'],
    range: {
        start: { row: 0, column: 3}
        end: {row: 0, column: 5}
    }
}

The inverse (undo) action would be:

{
    type: 'delete',
    lines: ['de'],
    range: {
        start: { row: 0, column: 3}
        end: {row: 0, column: 5}
    }
}

But we obviously can't just apply the inverse action. Thanks to User 1's intervening change, there is no longer a character offset 3 in the document!

Once again, we can use OT:

var undoOperation = getInverseOperation(myLastOperation);
getOperationsAfterMyLastOperation().forEach((operation) => 
    transformOperation(operation, undoOperation);
);
editor.applyEdit(undoOperation);
socket.emit('edit', undoOperation);

By transforming the undo operation against subsequent operations from other clients, Codr will instead apply the following operation on undo, achieving the desired behavior.

{
    type: 'delete',
    lines: ['e'],
    range: {
        start: { row: 0, column: 1}
        end: {row: 0, column: 2}
    }
}

Implementing undo/redo correctly is one of the more challenging aspects of building a collaborative editor. The full solution is somewhat more involved than what I've described above because you need to undo contiguous inserts and deletes as a unit. Since operations that were contiguous may become non-contiguous due to edits made by other collaborators, this is non-trivial. What's cool though, is that we can reuse the same OT used for syncing edits to achieve per-user undo histories.

Conclusion

OT is a powerful tool that allows us to build high-performance collaborative apps with support for non-blocking concurrent editing. I hope that this summary of Codr's collaborative implementation provides a helpful starting point for understanding OT. A huge thanks to David for his invitation to let me share this piece on his blog.

Want to learn more about Codr? Check out the KickStarter campaign or tweet to @CodrEditor to request an invite.

Alden Daniels

About Alden Daniels

Alden is the lead front-end engineer at OneSpot. He likes using web technologies to build high-performance apps that humans will enjoy using. He lives and works in the beautiful city of Austin, TX. When he's not hacking on code, he enjoys all things outdoors, travel, and good times with family and friends. He's also an insufferable punster.

Recent Features

  • By
    Write Better JavaScript with Promises

    You've probably heard the talk around the water cooler about how promises are the future. All of the cool kids are using them, but you don't see what makes them so special. Can't you just use a callback? What's the big deal? In this article, we'll...

  • By
    Introducing MooTools Templated

    One major problem with creating UI components with the MooTools JavaScript framework is that there isn't a great way of allowing customization of template and ease of node creation. As of today, there are two ways of creating: new Element Madness The first way to create UI-driven...

Incredible Demos

Discussion

  1. Your effort is much appreciated. I’m looking for good collaborative editing tools for years.

    What i really want is integration in local IDE’s and code editors. Especially IntelliJ IDEA and Sublime Text. Maybe Atom, too.

    Is that planned? Otherwise i don’t see any value for more complicated workflows. For example for projects that use a build tool like gulp.

    • Forgive me, i now see that my comment was completely out of scope.

      Great post and keep up the good work!

  2. Risto Novik

    Great idea! Collaborative editing topic is complicated and the solution are also hard to test and implement.

    Couple questions about your implementation:

    – Have you looked at the ShareJS project which solves also similar problem ?
    – Do you plan to separate the core implementation of OT, so it would be possible to use in other projects ?
    – How would you scale your solution beyond single instance, what are the limits ?
    – Is there going to be offline support and how do you solve it using the OT ?

    Great post.

    • Risto Novik

      Forgot one more question, any plans to support other data structures (Sets, Vectors, key-value stores) besides the text sequence editing ?

  3. Hi Risto – Apologies for my drastically delayed response. I apparently was not automatically subscribed to comments.

    > Have you looked at the ShareJS project which solves also similar problem?

    I’m familiar with ShareJS, but haven’t used it. One of my goals when creating Codr was to become more conversant with OT. This motivated me to build my own solution. Also, I don’t think ShareJS provides built-in support for separate undo stacks.

    > Do you plan to separate the core implementation of OT, so it would be possible to use in other projects ?

    I’d like to. This will take some time, however, so whether or not I get to this depends on whether the KickStarter campaign is funded.

    > How would you scale your solution beyond single instance, what are the limits?

    Scaling out to multiple machines doesn’t introduce too much complexity so long as actions affecting a specific document are all routed to the same machine. Some intelligent load balancing / routing would accomplish this.

    > Is there going to be offline support and how do you solve it using the OT

    OT makes this entirely possible. However, the result may be unhelpful if you and someone else spend a lot of time editing the same document and don’t see the merged results till later. Since Codr’s primary purpose is to facilitate realtime collaboration, this is something I’m unlikely to add in the near future.

    > Forgot one more question, any plans to support other data structures (Sets, Vectors, key-value stores) besides the text sequence editing?

    No, although a lot of the logic could be re-used for these. It looks like ShareJS supports pluggable transform types, which is a good approach.

  4. Shanmu

    Hi Alden,

    Great post, I understand the core concept of collaborative editing.

    I am stuck with the OT algorithm, how it actually transforms the position of operation?

    I am wondering the actual algorithm will be? How does it work behind the scenes?

    Can I get any sudo view of the algorithm? Or any reference link for OT algorithm.

    Thanks in advance.

Wrap your code in <pre class="{language}"></pre> tags, link to a GitHub gist, JSFiddle fiddle, or CodePen pen to embed!