Sunny Li / 10.12.2022

How does encrypted collaboration work in real-time?

Real-time collaboration relies on complex algorithms. How can it work with end-to-end encryption and complete privacy?
Multiple cursors collaborating on an encrypted document.
Collaborating in real time is a technical challenge that spans networking, infrastructure, product, and text editing. For many internet users, Google Docs introduced real-time collaboration to writing, with text emerging from multiple colorful cursors as people worked together. Today, remote collaboration and remote work go hand in hand, and real-time editing is a daily necessity.

Yet organizations now demand higher privacy and security standards, particularly as they collaborate across time zones and share data containing critical personal or professional information. As a result, protections such as end-to-end encryption have become mandatory in many business contexts and in consumer messaging.

How has collaboration kept up?

This article explores how real-time collaboration operates in an end-to-end encrypted environment, where no centralized server has access to users’ sensitive data in plain text. After an overview of how collaboration works today in products such as Google Docs, we’ll cover newer, distributed models that change the privacy paradigm and enable seamless real-time work while preserving end-to-end encryption of all user and document data.

A brief history of collaborating in real-time

Operational transforms (OT) are a technique from computer science for creating and maintaining consistency in a distributed system. Each edit is expressed as an operation, such as inserting or deleting text at a given position, and concurrent operations are transformed against one another so that every copy of the data ends up in the same state.

Google Docs is a popular example of a system that uses operational transforms. It is a cloud-based word processor that allows users to edit documents collaboratively in real time. When multiple users edit a document at the same time, each user's edits are captured as operations. Those operations are transformed to account for concurrent edits and then applied to the document on each user's screen, so that everyone sees the same document with the most up-to-date changes.

Operational transforms are a powerful tool for maintaining consistency in a distributed system, and they are not limited to text. For example, they can be used to build a shared whiteboard that multiple users draw on simultaneously.
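To make the idea concrete, here is a minimal sketch of a transform for two concurrent text insertions. It is an illustration only, not Google's implementation: the `Insert` type, the `site` tie-breaker, and the `apply_op` and `transform` helpers are hypothetical names, and a real system must also handle deletions, longer edit histories, and a server that orders operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical edit operation: insert `text` at index `pos`."""
    pos: int
    text: str
    site: int  # id of the client that produced the edit, used to break ties


def apply_op(doc: str, op: Insert) -> str:
    """Apply an insert to a plain string document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so that it can be applied after the concurrent edit `other`."""
    if op.pos < other.pos or (op.pos == other.pos and op.site < other.site):
        return op  # `other` landed to the right of `op`, so `op` is unaffected
    # `other` inserted text at or before op.pos, so shift `op` to the right
    return Insert(op.pos + len(other.text), op.text, op.site)


base = "abc"
a = Insert(pos=1, text="X", site=1)  # client 1 types X after "a"
b = Insert(pos=2, text="Y", site=2)  # client 2 types Y after "b"

# Each client applies its own edit first, then the transformed remote edit.
client_1 = apply_op(apply_op(base, a), transform(b, a))
client_2 = apply_op(apply_op(base, b), transform(a, b))

print(client_1, client_2)  # both print "aXbYc"
```

Without the transform step, client 1 would insert "Y" at index 2 of "aXbc" and the two copies would diverge.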

What breaks with end-to-end encryption?

In a collaborative system built on operational transforms, a single centralized entity is required to keep track of the current system state and apply operations to that state. When Google Docs was released in 2006, this centralized model was the established way to collaborate in real time on the web.

However, the prospect of having every keystroke and edit pass through centralized Google servers may now give users pause. With privacy scandals rippling through big tech in recent years, hundreds of millions of users are choosing to limit the amount of personal and sensitive data they share with these companies.

Collaborative documents frequently contain highly sensitive personal or corporate information, from new designs to client lists. As a result, end-to-end encryption has become a necessity. In this model, no centralized authority has access to user data, so a server cannot read the operations it would need to transform, and server-mediated operational transforms stop working. This has led to a new, state-of-the-art data structure for collaborating in real time: the CRDT.

What are CRDTs, and how do they enable secure, real-time editing?

Conflict-free replicated data types (CRDTs) are a category of data structures designed to achieve strong eventual consistency (SEC) in distributed systems. They come in two main flavors: operation-based CRDTs, also called commutative replicated data types (CmRDTs), and state-based CRDTs, also called convergent replicated data types (CvRDTs).

Eventual consistency is a weak form of consistency that allows some stale reads but guarantees that all writes will eventually be visible to all readers. Strong eventual consistency adds a further guarantee: any two replicas that have received the same set of updates are in the same state, regardless of the order in which those updates arrived.

CRDTs are designed to be replicated across multiple nodes in a distributed system. Each node has its own copy of the CRDT data structure, and the copies are kept in sync by exchanging updates between the nodes. When a node changes its copy of the CRDT, it propagates the change to the other nodes, which then apply the change to their own copies. This process continues until all the nodes in the system hold the same copy of the CRDT.

The key property that allows CRDTs to achieve strong eventual consistency is commutativity. Commutativity is a property of mathematical operations stating that the order in which the operations are performed does not affect the result. For example, addition is commutative because the order of the operands does not affect the result:

1 + 2 = 2 + 1

Multiplication is also commutative:

2 * 3 = 3 * 2

Subtraction, however, is not commutative, because the order of the operands does affect the result:

2 - 3 ≠ 3 - 2

CRDTs are designed to be commutative.
This means that the order in which updates are applied does not affect the final state of the CRDT, an absolutely critical property for real-time collaboration. Because updates can be applied in any order, they can also be applied asynchronously. As long as all the updates are eventually applied, every replica reaches the same final state.

There are many different types of CRDTs, each designed for a specific type of data. The most common types are counters, sets, and maps.

Counters are the simplest type of CRDT. They track a single number, such as the number of likes on a post. Counters have two operations: increment and decrement. These operations are commutative, so the order in which they are applied does not affect the result.

Sets are another common type of CRDT. They track a set of values, such as the set of users who have liked a post. Sets have two operations: add and remove. On an ordinary set, a concurrent add and remove of the same value would produce different results depending on the order, so CRDT sets attach extra metadata, such as a unique tag for each add, to make the two operations commute.

Maps are the most complex of the three. They track a mapping of keys to values, such as the mapping of user IDs to user names, and typically support put, remove, get, and clear. Plain puts to the same key do not commute, because the last one applied would win. A map CRDT therefore resolves concurrent updates with a deterministic rule, such as keeping the write with the latest timestamp, so that every replica settles on the same value. Applying the same update more than once also has the same effect as applying it once.

CRDTs are a powerful tool for achieving strong eventual consistency. They are designed to be replicated across multiple nodes in a distributed system and to be updated asynchronously. The key property that makes this possible is commutativity, which allows updates to be applied in any order without affecting the result.
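As a rough illustration of these properties, the sketch below implements two small state-based CRDTs: a counter that tracks per-replica increment and decrement totals, and a map that resolves concurrent writes with a last-writer-wins rule. The class names, the caller-supplied timestamps, and the use of the replica name as a tie-breaker are assumptions made for this example, not the design of any particular library.

```python
from dataclasses import dataclass, field


@dataclass
class PNCounter:
    """State-based counter: per-replica totals of increments and decrements."""
    replica: str
    incs: dict = field(default_factory=dict)  # replica id -> total increments
    decs: dict = field(default_factory=dict)  # replica id -> total decrements

    def increment(self, n: int = 1) -> None:
        self.incs[self.replica] = self.incs.get(self.replica, 0) + n

    def decrement(self, n: int = 1) -> None:
        self.decs[self.replica] = self.decs.get(self.replica, 0) + n

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other: "PNCounter") -> None:
        """Keep the per-replica maximum, so merge order and repeats do not matter."""
        for mine, theirs in ((self.incs, other.incs), (self.decs, other.decs)):
            for replica, count in theirs.items():
                mine[replica] = max(mine.get(replica, 0), count)


@dataclass
class LWWMap:
    """Map whose entries keep the write with the highest (timestamp, replica)."""
    entries: dict = field(default_factory=dict)  # key -> (timestamp, replica, value)

    def put(self, key, value, timestamp: int, replica: str) -> None:
        current = self.entries.get(key)
        if current is None or (timestamp, replica) > current[:2]:
            self.entries[key] = (timestamp, replica, value)

    def get(self, key):
        return self.entries[key][2]

    def merge(self, other: "LWWMap") -> None:
        for key, (timestamp, replica, value) in other.entries.items():
            self.put(key, value, timestamp, replica)


# Two replicas update independently, then exchange state in both directions.
alice, bob = PNCounter("alice"), PNCounter("bob")
alice.increment(3)
bob.increment(2)
bob.decrement(1)
alice.merge(bob)
bob.merge(alice)
bob.merge(alice)  # receiving the same state twice changes nothing

names_a, names_b = LWWMap(), LWWMap()
names_a.put("user-1", "Ada", timestamp=1, replica="alice")
names_b.put("user-1", "Grace", timestamp=2, replica="bob")
names_a.merge(names_b)
names_b.merge(names_a)

print(alice.value(), bob.value())                  # 4 4
print(names_a.get("user-1"), names_b.get("user-1"))  # Grace Grace
```

Because each merge only ever keeps a maximum, replicas can exchange state in any order, and more than once, and still agree.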

CRDTs for collaboration

CRDTs’ consistency properties make them ideal candidates for facilitating decentralized, end-to-end encrypted collaboration. Because the order in which operations are applied does not change the state a system converges to, users may exchange distinct operations on documents or data structures and still arrive at a shared final state.

In the last decade, CRDTs have advanced to the point where they are deployed in production systems for real-time collaboration on documents, spreadsheets, whiteboards, and more. Furthermore, because they allow for a distributed architecture, participants can exchange fully end-to-end encrypted updates, with no centralized party that tracks and maintains user data.

Optimistically, CRDTs may become the future of all collaborative products, thanks to their wide extensibility and attractive properties.
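The sketch below shows, under simplifying assumptions, how two editors could exchange text updates and converge without a server that understands the document. Each character gets a unique id and records the id of the character it was typed after, and deletions are kept as tombstones. This is a toy design written for this article, not the algorithm Yjs uses, and `TextReplica` and its methods are hypothetical names. Encryption is left out: in an end-to-end encrypted system, the update tuples are what would be encrypted before leaving the device.

```python
ROOT = (0, "")  # sentinel id that marks the start of the document


class TextReplica:
    """Toy sequence CRDT: every character remembers the id it was typed after."""

    def __init__(self, site: str):
        self.site = site
        self.clock = 0        # Lamport-style counter, keeps new ids larger than seen ids
        self.nodes = {}       # char id -> (parent id, character); grows only
        self.deleted = set()  # ids of removed characters (tombstones)

    def _visible(self):
        children = {}
        for node_id, (parent, _) in self.nodes.items():
            children.setdefault(parent, []).append(node_id)
        ordered, stack = [], [ROOT]
        while stack:  # depth-first walk, visiting the newest child of a node first
            node = stack.pop()
            if node != ROOT and node not in self.deleted:
                ordered.append(node)
            stack.extend(sorted(children.get(node, [])))
        return ordered

    def text(self) -> str:
        return "".join(self.nodes[i][1] for i in self._visible())

    def insert(self, index: int, char: str):
        parent = self._visible()[index - 1] if index > 0 else ROOT
        self.clock += 1
        update = ("insert", (self.clock, self.site), parent, char)
        self.apply(update)
        return update  # this tuple is what would be sent to other replicas

    def delete(self, index: int):
        update = ("delete", self._visible()[index], None, None)
        self.apply(update)
        return update

    def apply(self, update) -> None:
        kind, char_id, parent, char = update
        if kind == "insert":
            self.nodes[char_id] = (parent, char)
            self.clock = max(self.clock, char_id[0])
        else:
            self.deleted.add(char_id)


alice, bob = TextReplica("alice"), TextReplica("bob")
for update in (alice.insert(0, "H"), alice.insert(1, "i")):
    bob.apply(update)

# Concurrent edits: alice appends "!", while bob deletes "H" and appends "?".
from_alice = [alice.insert(2, "!")]
from_bob = [bob.delete(0), bob.insert(1, "?")]

for update in reversed(from_bob):  # delivered out of order on purpose
    alice.apply(update)
for update in from_alice:
    bob.apply(update)

print(alice.text(), bob.text())  # both print "i?!"
```

Both replicas store the same set of characters and tombstones once all updates have arrived, so the order of delivery does not affect the text they render.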

Conclusion and more reading

To try Skiff, which uses the Yjs CRDT for production real-time collaboration, you can visit skiff.com to sign up and create an account for end-to-end encrypted mail, notes, file storage, and more. To learn more about how Skiff uses CRDTs, visit Skiff’s blog or whitepaper, which goes into greater technical depth on many of the design challenges and decisions made to build a privacy-first, end-to-end encrypted workspace.
