From Isaac, Mumu, Shoshin, to CRDTs

Apr 23, 2024

Special thanks to Rob Morris and James Addison for insightful pointers.

This essay captures the first presentation given at Autonomous Anonymous 2024 in Lisbon. The video recording is available here.

Topology’s founding vision is to create sovereign, persistent, and interoperable realities on the Internet. We considered programmable blockchains as the best foundations for building these realities. At 2021, many design patterns had emerged for onchain financial applications, but very few had been concerned with onchain simulations in 2D or 3D that are multiplayer, interactive, and malleable like our reality. The challenge, therefore, involved a technical part and a creative part. What affordances do blockchains provide, or what can they possibly provide with architectural innovations? That was the technical part. Given the affordances, among the possible realities that can be constructed with them, what are the ones that are deeply human and engaging to be worth our time? That was the creative part.

We believed that beautiful answers exist and we were motivated to discover them – through experimentation.

Isaac

Numerical simulation is the foundation of game physics. For sovereign, therefore inter-objective, realities, their physics must be tamper-proof. Isaac was our first experimental game. Its design centers around a constrained four-body simulation that runs in a Cairo smart contract [1]. Its game mechanics was explained in detail in a past post.

As an execution environment, blockchain is unwieldy. The raw game experience suffered from unstable tick interval which was the block time. Computation-wise, the complexity of the simulation function was limited by block gas limit; in theory it was possible to spread out a simulation run across multiple transactions, one per block, with intermediate states saved between consecutive transactions. Yet at that time gas metering on Starknet was not available. The idea of spending multiple full blocks worth of gas for one single game tick was worrisome.

At the level of player coordination, we observed that most coordination effort happened on a Discord channel. We noted the importance of copresence as the foundation for collaboration and coordination.

Mumu

We decided to take the block time out of the equation of game experience by implementing “local-first” game mechanics. The concept coincided with the local-first software principle [2], but at that time we have not learned of its existence. Mumu was our second experimental game, designed to be a simulation puzzle to be solved in trial-and-error locally. Only solutions that the players are happy with are voluntarily submitted onchain. We implemented the simulation logic twice, once in Cairo that runs onchain for verification and ranking, and once in Typescript that runs locally in the player’s browser.

Game design-wise, we explored giving players a visual programming interface to specify behaviors. We wanted to explore design patterns that allow participants of a malleable reality to “program” the reality in accessible ways. We saw first-hand how players were able to engineer beautiful combinatorics from relatively simple rulesets.

Shoshin

Shoshin was a “local-first” fighting game. While actual fighting games have players interact in real time, a local-first fighting game asks players to design and submit fighting behaviors, which enter fighting simulations with other submitted behaviors.

In some sense, Shoshin was an “upgrade” to Mumu. Instead of solving static puzzles as in Mumu, Shoshin players submitted “solutions” that were measured against other solutions in interactive simulations, which effectively put psychology into gameplay. Shoshin’s visual programming interface was much more expressive than Mumu’s. Details about Shoshin’s design were covered in another past post.

In-person playtests were held, once in Paris for PvE mode, and once in California as a PvP tournament. We noticed how much spark and engagement were generated spontaneously in the physical spaces where playtesters were present.

During the PvP tournament, we explored concepts around projectile attacks (~hadouken) and king-of-the-hill mechanics (proposed by Non; to be explained in a future post). Yet it was at that juncture that we decided to explore an entirely different avenue for running programs trustlessly over peer-to-peer networks.

Problems

Among the three vertices of the blockchain trilemma, decentralization and security are non-negotiable for self-sovereign realities. Blockchains that maximize these two properties suffer in scalability. Yet, scalability is key to enabling copresence and malleability by many concurrent users in a shared reality. How to fix this?

The modular thesis for blockchain scaling stacks multiple “application threads” on top of a shared “parent thread”. However, each application thread is still a single thread. A decentralized sequencer seeks to reach consensus among a decentralized network of nodes on a total ordering (sequence) of events, which pays a heavy communication cost to coordination over the Internet. On the other hand, advocates for verifiable computing propose the idea of “zk servers”, or centralized sequencers that submit serialized program states in the form of proof-carrying data to the underlying parent chain. This approach introduces liveness bottleneck and vulnerability to censorship.

To construct a general-purpose medium that achieves copresence and highly concurrent malleability for reality-making or autonomous worlding [3], we must look beyond the blockchain paradigm and the trilemma that limits it.

CRDTs

We heard about CRDTs through a call with mconcat in the Cosmos ecosystem, where the context was multi-party programmable state channels. Following the call we fell into the CRDT rabbit hole.

In 2006, the web-based text editor Writerly was acquired by Google. It soon turned into Google Doc. Its collaborative editing function used Operational Transformation (OT), an approach that was proposed in 1989 [4], formalized in 1995 [5], and improved by researchers over many years. Its general idea is as follows. Replicate a document into many user copies or replicas. A user can view and mutate its own replica in a non-blocking manner. Changes at one replica are propagated to all other replicas. Incoming changes are transformed then applied to the local replica. The transformation step is meant to ensure document consistency. Conceptually I like to analogize OT to Lorentz Transformation, which is the transformation of physical quantities between observer frames whose relative velocities are non-negligible with respect to the speed of light.

The Wave Protocol (developed alongside Google Wave, which was shut down in 2012; this article offers a wonderful recap of its original motivations) used OT to preserve consistency of wavelet data across the users that are sharing the same wavelet from different wave servers connected in a federation model [6,7]. As with all protocols that did not get adopted as broadly as they intended, there were many reasons behind the downfall of the Wave Protocol. Among those, some technical ones were that OT was complex to get right and it still needed centralized servers. Together, these reasons limit the protocol’s extensibility and scalability.

Around the same time, researchers were proposing peer-to-peer collaborative editor algorithms without OT [8]. This line of work eventually became formalized into Convergence/Commutative Replicated Data Types, which then coalesced into Conflict-free Replicated Data Types (CRDTs) [9]. In short, CRDTs are “multiplayer data types”. Each replica of a CRDT records extra “spacetime” metadata with every operation performed so that any two valid replica states are always mergeable into another valid replica state. This property allows for distributed execution without coordination [10].

Throughout 2010s and early 2020s, improvements were made around the issues of composing and garbage-collecting CRDTs, allowing CRDTs to be deployed in production software. Geo-replicated databases use CRDT for state synchronization [11]. Code editors use CRDT to support collaborative teletype [12,13]. Apple Notes uses CRDT for synchronizing across devices. The work on Merkle-CRDT [14] (transporting CRDT updates in a Merkle DAG) spawned peer-to-peer database solutions [15].

On the day we fell into the CRDT rabbit hole, despite CRDT’s origin in collaborative document editing, one thing was instantly clear: the difference between collaborative editing and multiplayer interactive simulations, particularly those without stringent timing constraints, is only an increase in dimension. Multiple users typing into the same document positions is not unlike rigid bodies attempting to move into the same coordinates in space.

Few had considered using CRDTs for making games. From our own survey and from communication with Martin Kleppmann, author of BFT-CRDT [16], we only know of one academic work that considered the intersection of peer-to-peer CRDT and videogames [17]. We also came across the work by James Addison and Rob Morris on a tiny Minecraft implementation with CRDTs [18].

To prove our intuition, we used Yjs to construct a peer-to-peer multiplayer platformer prototype where each log-in user controls a penguin character and operates at 60 frames per second. CRDTs were used to represent and synchronize the state of the game world. WebRTC was used as the network transport.

Later, as a proof of concept for collision resolution, we endowed each user with a CRDT mailbox so that users can request one another to back off when intersection occurs.

The source code is available here. Note that the prototype was not tolerant to Byzantine faults and did not preserve causal ordering among operations.

Topology Protocol in a nutshell

The goal of Topology protocol is to create a general-purpose medium for anyone to construct self-sovereign malleable realities on the Internet, and to experience copresence in them. In a nutshell, the protocol is based on the following architectural concept: user-defined BFT-CRDT actors running over a peer-to-peer network that functions like a “random access memory” for the world computer. Explanations are left to the next post.

Reference

[1] Goldberg et. al., Cairo – a Turing-complete STARK-friendly CPU architecture, 2021. link

[2] Kleppmann et. al., Local-first software – You own your data, in spite of the cloud, 2019. link

[3] ludens, Autonomous Worlds (Part 1), 2022. link

[4] Ellis et. al., Concurrency Control in Groupware Systems, 1989. link

[5] Nichols et. al., High-Latency, Low-Bandwidth Windowing in the Jupiter Collaboration System, 1995. link

[6] Wang et. al., Google Wave Operational Transformation, 2010. link

[7] Gregorio, Google Wave, 2009. link

[8] Oster et. al., Real time group editors without Operational transformation, 2005. link

[9] Shapiro et. al., Conflict-free Replicated Data Types, 2011. link

[10] Bailis et. al., Coordination Avoidance in Database Systems, 2014. link

[11] Active-Active geo-distribution. link

[12] Atom’s teletype-crdt. link

[13] Jahns, How we made Jupyter Notebooks collaborative with Yjs, 2021. link

[14] Sanjuán et. al., Merkle-CRDTs – Merkle-DAGs meet CRDTs, 2020. link

[15] OrbitDB. link

[16] Kleppmann et. al., Making CRDTs Byzantine Fault Tolerant, 2022. link

[17] van der Linde et. al., Practical Client-side Replication: Weak Consistency Semantics for Insecure Settings, 2020. link

[18] Addison & Morris, “Metaverse in a box” demo, 2022. link