Writing

How Real-Time Collaborative Editors work? [Operational Transformation]

Recently, I’ve been diving deep on how real-time collaborative editing works. I have been reading various papers on both the methods of implementing collaborative editing : Operational Transformation, which goes back to 1989, when it was first implemented by the GROVE (GRoup Outtie Viewing Editor) system (this algorithm is quite old, and Google uses this algorithm for collaborative editing for Google Docs, Google Slides, Wave, etc) and Conflict-Free Replicated Data Types, which is a much newer approach to real-time editing.

Real challenge of collaborative editing

If you know how real-time collaborative editing works, then you may know that handling concurrent editing in multi user environment gracefully is very challenging. However, a few simple concepts can simplify this problem. The main challenge, as mentioned, with collaborative editing is the concurrency control [concurrent edits] to the document are not commutative. This needs to be causally ordered before applying either by undoing history, or by transforming the operations [operational transformation] before applying them to make them seem commutative.

Bringing Latency into action

Introducing latency between the client and server is where the problems arise. Latency in a collaborative editor introduces the possibility of version conflicts. Here, where, I said Operational Transformation will come into action.

Let’s take an example :

Starting Client’s state :

ABCD

Starting Server’s state :

ABCD

Now, let’s say, Client enters X in between C and D , the operation would look something like this :

insert(X,3) //where 3 is the position where x is going to be added (0=A, 1=B, 2=C ..)

And at the same time, Server deletes b , the operation would be :

delete(B,1)

What actually should happen is that the client and server should both end with ACXD but in reality, client ends with ACXD

Starting Document State -> ABCD
"Insert 'X'" operation at offset 3 [local] -> ABCXD
"Delete 'B'" operation at offset 1 [remote] -> ACXD

but the server ends with ACDX.

Starting Document State -> ABCD
"Delete 'B'" operation at offset 1 [local] -> ACD
"Insert 'X'" operation at offset 3 [remote] -> ACDX

Ofcourse, ACXD != ACDX and the document which is shared now is in wrong state.

Here is where the Operational Transformation algorithm comes to the rescue.

Operational Transformation Algorithm

Operational Transformation (OT) is an algorithm/technique for the transformation of operations such that they can be applied to documents whose states have diverged, bringing them both back to the same state.

How does Operational Transformation work?

A short overview of how OT works :

Let’s apply Operational Transformation in the original example.

If we apply OT, Client will see :

Starting Document State -> ABCD
"Insert 'X'" operation at offset 3 [local] -> ABCXD
"Delete 'B'" operation at offset 1 [transformed] -> ACXD

and Server will see :

Starting Document State -> ABCD
"Delete 'B'" operation at offset 1 [local] -> ACD
"Insert 'X'" operation at offset 2 [transformed] -> ACXD //Transform function would add add it in the new (3 - 1 = 2) position 

Client - Server [OT] Approach to Collaborative Editing

The transform function is used to build a client-server protocol that can handle collaboration between any number of clients.

Choosing a Client-Server architecture will allow scouting a large number of clients without actually complicating the environment. Also, there will be a single system which holds the source of truth i.e. the server, so even if the clients crash/go offline for a long time, we can go back to the server and fetch the document easily.

This source of truth also forces the client to wait for the server to acknowledge the operation that the client has just sent which would mean that the client always stays on the server’s OT path. This would help in keeping a single history of operations without actually having to keep a mirror of the state for each client that is connected. That would eventually mean the number of clients that are connected to the server would have only one single copy of the document on the server.

Conclusion

Operational Transformation is a very powerful tool that allows to build great collaborative apps with support for non-blocking concurrent editing. I am looking forward to dive deep into the algorithm (and ofcourse extending this blogpost) and possibly working on an editor which supports it as well.

Resources

I’ve read the following papers and articles to learn about Operational Transformation.