Why CRDT didn't work out as well for collaborative editing xi-editor

457 points by UkiahSmith, about 6 years ago

16 comments

tomsmeding, about 6 years ago

Relevant discussion from a couple of days ago: <a href="https://news.ycombinator.com/item?id=19845776" rel="nofollow">https://news.ycombinator.com/item?id=19845776</a>

chubot, about 6 years ago

I don't have much experience in this area, but I'd be interested in an overview of how different pieces of software handle the concurrent / multiplayer editing problem, like:

- Etherpad
- Google Docs
- Apache / Google Wave (open sourced: <a href="http://incubator.apache.org/projects/wave.html" rel="nofollow">http://incubator.apache.org/projects/wave.html</a>)
- repl.it <a href="https://repl.it/site/blog/multi" rel="nofollow">https://repl.it/site/blog/multi</a>
- Figma <a href="https://www.figma.com/blog/multiplayer-editing-in-figma/" rel="nofollow">https://www.figma.com/blog/multiplayer-editing-in-figma/</a> (image editing rather than text editing)

So is the gist of it that OT relies on central servers and they all use OT rather than CRDT? That was not entirely clear to me.

Looking at the xi analysis, it sounds like "IME" is specific to a desktop application using X (?), so it doesn't apply to any of this web software.

The rest of them seem like they do apply?

Is the problem that you have to pay a "CRDT tax" for every piece of state in the application? I thought the same was true of OT. Don't you have to express every piece of state within those constraints too?

repl.it doesn't seem to have a problem with syntax highlighting or brace matching (try it, it's pretty slick). So is it just that they paid that tax with a lot of code, or is there something else fundamentally different about xi vs. repl.it? Or maybe xi is going for a lot lower latency than web-based editors?

Recent thread about OT vs. CRDT that might be interesting: <a href="https://news.ycombinator.com/item?id=18191867" rel="nofollow">https://news.ycombinator.com/item?id=18191867</a>

lewisl9029, about 6 years ago

> Indeed, the literature of CRDT does specify a mathematically correct answer. But this does not always line up with what humans would find the most faithful rendering of intent.

This is a very salient point that anyone thinking of using CRDTs to "solve" synchronization in a user-facing application needs to take into consideration. Yes, CRDTs will guarantee that clients converge to an identical, mathematically "consistent" state eventually, but there's no guarantee that the mathematically "consistent" state will make any sense to the application business logic that needs to consume that state, or to the human that needs to reason about the rendered result. That is a completely different can of worms that we'll still have to tackle to build a usable application.

Here's a great example to illustrate this from Martin Kleppmann's talk on this topic: <a href="https://youtu.be/yCcWpzY8dIA?t=2634" rel="nofollow">https://youtu.be/yCcWpzY8dIA?t=2634</a>

The rest of the talk is also highly recommended for anyone interested in an approachable primer on CRDTs.

The trade-offs to CRDTs mentioned by the author in the context of text editors make sense, but I would be curious to hear from the Xray team on what their current thinking on the topic is, given that they have collaborative editing as an explicit core objective (which might shift the value prop in favor of using CRDTs, relatively speaking, since in Xi it seems to be only an aspirational goal), and that their approach to implementation was similar but not quite identical to Xi's:

> Our use of a CRDT is similar to the Xi editor, but the approach we're exploring is somewhat different. Our current understanding is that in Xi, the buffer is stored in a rope data structure, then a secondary layer is used to incorporate edits. In Xray, the fundamental storage structure of all text is itself a CRDT. It's similar to Xi's rope in that it uses a copy-on-write B-tree to index all inserted fragments, but it does not require any secondary system for incorporating edits.

<a href="https://github.com/atom/xray#text-is-stored-in-a-copy-on-write-crdt" rel="nofollow">https://github.com/atom/xray#text-is-stored-in-a-copy-on-wri...</a>

laughinghan, about 6 years ago

@dang, can we edit the title? It's inaccurate. This post is about how the CRDT didn't work for asynchronous editing by automated tools like syntax highlighting, automatic bracket balancing, etc. The post explicitly contrasts those use cases with collaborative editing as a use case that the author didn't implement, but where they think "the CRDT is not unreasonable".

coldtea, about 6 years ago

Maybe first build a capable editor, with plugins, etc (xi-editor is not that yet) and worry about "collaborative editing" later?

And even for that, I think simply "taking turns" (where users share an editor session, can chat with each other, and can switch sequentially who gets to actively edit) is enough for 99% of cases, and is not more difficult than mere single-person editing (since there are no conflicts).

lucb1e, about 6 years ago

At the risk of asking a stupid question: is there a reason other than offline support why we bother with conflict resolution algorithms?

Every time concurrent editors come up, one of the main points of discussion is the pros and cons of different possible conflict resolution algorithms. People seem to be spending a lot of time on debating and implementing that. The way I see it, whichever packet reaches the server first gets applied first. Send something like "line 9 column 19: insert <Enter>", and when another client whose cursor is on line 15 receives that, it moves the cursor down to line 16 and scrolls the view down one line. Because you can see each other's cursors and selections, it shouldn't be hard to avoid typing in the same place. Unless you have round trip times of multiple seconds (satellite uplinks maybe?), and unless you edit continuously with more than, say, one person per ten sentences, you should hardly ever need it, and if it happens, the person editing will notice within two seconds and just wait a second for the other to finish. It's not as if you can reliably apply edits anyway: as the article already describes, changing a line from ABC to EFG concurrently with someone modifying B to D does not really have a good outcome. In a more realistic example, it would be changing "its" to "it's" concurrently with changing the word to "that". There is no good solution (the server wouldn't know which person to ignore: the apostrophe inserter or the replacer), so someone will have to resolve it manually anyway, so why bother with complex resolution algorithms? Heck, I'd be fine if my editor would do exclusive locks for the line I'm on before I can start typing.

For slow things like the customer report, internal documents, code, etc., I use something like git. Collaborative editing is (to me) for realtime things like jotting down notes about what I'm working on and looking at what others are working on right now, where even a proper revision control system is too cumbersome (git pull, vim notes.txt, small edit, :wq, git commit, git push, repeat) because someone might be working on the same thing. In such a case, where I need to work together on a file in real time, I'm not working offline, so this conflict resolution is by definition unnecessary. Is that different from the majority of people that use collaborative editing?
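
The "server order wins, everyone else shifts their cursor" scheme described above can be sketched in a few lines. This is only an illustration of that comment's example, not code from xi or any real editor; `Cursor` and `shift_for_remote_newline` are made-up names, and lines and columns are assumed to be 1-based.

```python
from dataclasses import dataclass


@dataclass
class Cursor:
    line: int    # 1-based line number (assumption for this sketch)
    column: int  # 1-based column number (assumption for this sketch)


def shift_for_remote_newline(cursor: Cursor, at_line: int, at_column: int) -> Cursor:
    """Return where `cursor` ends up after a remote line break at (at_line, at_column)."""
    if cursor.line > at_line:
        # Every line below the break moves down by one; the column is unchanged.
        return Cursor(cursor.line + 1, cursor.column)
    if cursor.line == at_line and cursor.column >= at_column:
        # Text at or after the break becomes the start of the next line.
        # Treating "cursor exactly at the break" as moving is a choice, not a rule.
        return Cursor(cursor.line + 1, cursor.column - at_column + 1)
    # The cursor sits before the break, so nothing changes.
    return cursor


# The example from the comment: "line 9 column 19: insert <Enter>",
# received by a client whose cursor is on line 15.
moved = shift_for_remote_newline(Cursor(line=15, column=4), at_line=9, at_column=19)
print(moved)
```

Note that this only works as written when the receiving client has no unsent edits of its own. If it does, the coordinates in the incoming message refer to a document state the client no longer has, which is the case OT and CRDTs are designed to handle.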

nicodemus26, about 6 years ago

I think CRDTs would make much more sense in a projectional editor than a text one. When the changes are mutations to the abstract syntax tree, it's more well defined how a merge would end. Also, the merge results don't have the opportunity to be invalid syntax.

catpolice, about 6 years ago

"For syntax highlighting, any form of OT or CRDT is overkill; the highlighting is a stateless function of the document, so if there's a conflict, you can just toss the highlighting state and start again."

I first became interested in CRDTs in a case where this wasn't really true. I was writing an IDE for a custom in-house DSL - think of the application as a special language for interacting with a gigantic and very strange database. Basically, the problem was that the use case really stretched the bounds of what is normally done with syntax highlighting. Some requirements:

- It had syntax and semantic highlighting, where the visual feedback associated with a term would depend on the results of queries to the remote database
- It had to be able to handle documents of several megabytes (and many thousands of terms) fairly smoothly, with as little noticeable lag or flicker as possible
- It couldn't swamp the database with unnecessary requests
- The document itself had implicit procedural state (e.g. if you wrote a command that, if evaluated, would alter the state of a term on the database, appearances of that term later in the document needed to be highlighted as if those changes had already been applied)

So I definitely couldn't throw out metadata and start over with every change. I ended up with a kind of algebraic editing model that allowed me to put bounds on what needed to be updated with every edit and calculate a minimal set of state changes to flow forward. It was extraordinarily complicated. I never got around to learning enough about CRDTs to determine if they'd be simpler than the solution I came up with, but they do seem to target some similar issues.

hansjorg, about 6 years ago

I'm assuming CRDT refers to conflict-free replicated data type: <a href="https://en.m.wikipedia.org/wiki/Conflict-free_replicated_data_type" rel="nofollow">https://en.m.wikipedia.org/wiki/Conflict-free_replicated_dat...</a>

OT, operational transformation: <a href="https://en.m.wikipedia.org/wiki/Operational_transformation" rel="nofollow">https://en.m.wikipedia.org/wiki/Operational_transformation</a>

colemickens, about 6 years ago

Thank you for writing this up Raph. I've been following CRDT usage in Xray/Xi and am curious to see where collaborative editing goes. I appreciate you thinking about it upfront.

lacampbell, about 6 years ago

Having a bit of difficulty following this, so I'll break down my understanding of CRDTs and see if someone can help me out.

A CRDT can be thought of as an algebraic structure, consisting of a data type D and a join function. So for all a, b, c in D, it's:

Associative:
<pre><code>
  join(a, join(b, c)) == join(join(a, b), c)
</code></pre>
Commutative:
<pre><code>
  join(a, b) == join(b, a)
</code></pre>
Idempotent:
<pre><code>
  join(a, a) == a
</code></pre>
Partially ordered:
<pre><code>
  if join(a, b) == b then a <= b
  a <= a == true
  if (a <= b) and (b <= a) then a == b
  if (a <= b) and (b <= c) then (a <= c)
</code></pre>
So given all of that, I am not sure why the example in the article holds. I assume it's a consequence of the partial ordering, but I don't know what the partial ordering is. What's the join operation and what's the data type?
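
For readers who want something concrete to poke at, here is a small sketch of those laws on the simplest CRDT I know of, a grow-only set whose join is set union. It does not answer the question of what xi's data type or join actually is; the sample values and the names `join` and `leq` are just illustrative.

```python
from itertools import product


def join(a: frozenset, b: frozenset) -> frozenset:
    """Merge two replica states of a grow-only set: the join is plain set union."""
    return a | b


def leq(a: frozenset, b: frozenset) -> bool:
    """The partial order induced by the join: a <= b exactly when join(a, b) == b."""
    return join(a, b) == b


samples = [frozenset(), frozenset({"x"}), frozenset({"y"}), frozenset({"x", "y"})]

# Check the laws listed above on every triple of sample states.
for a, b, c in product(samples, repeat=3):
    assert join(a, join(b, c)) == join(join(a, b), c)  # associative
    assert join(a, b) == join(b, a)                    # commutative
    assert join(a, a) == a                             # idempotent
    assert leq(a, a)                                   # reflexive
    if leq(a, b) and leq(b, a):
        assert a == b                                  # antisymmetric
    if leq(a, b) and leq(b, c):
        assert leq(a, c)                               # transitive

# Two replicas that diverged and then merged in either order converge.
replica_1 = frozenset({"x"})
replica_2 = frozenset({"y"})
merged_12 = join(replica_1, replica_2)
merged_21 = join(replica_2, replica_1)
print(merged_12 == merged_21)
```

For this toy type the induced order is just the subset relation, and `{"x"}` and `{"y"}` are incomparable, which is why the order is only partial. A text CRDT needs a much richer state than a set of strings, so treat this as a check of the algebra rather than a model of an editor.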

josephg, about 6 years ago

I replied with my thoughts to the github issue, but they might be of interest to people reading along here too. I've got some experience on these systems (wave, sharejs, sharedb, etc).

> As a side note, I've heard an interesting theory about why CRDT-type solutions are relatively popular in the cloud. To do OT well, you need to elect a centralized server, which is responsible for all edits to a document. I believe the word for this is "server affinity," and Google implements it very well. They need to, for Jupiter-style OT (Google Docs) to work.

You don't need to do this. (Although I'm not sure if we knew that on the wave team). You can implement an OT system on top of any database that has a transactional write model. The approach is to enter a retry loop where you first try to apply the operation (but in a way that will reject the operation if the expected version numbers don't match). If an error happens, fetch the concurrent edits, transform and retry. Firepad implemented this retry loop from the client, and it worked much better than I expected. Here is a POC of a collaborative editor on top of statecraft - <a href="https://home.seph.codes/edit/test" rel="nofollow">https://home.seph.codes/edit/test</a> . The only OT code on the server is this middleware function: <a href="https://github.com/josephg/statecraft/blob/b6a82f34268238c90a8b3e600ea39ad1558cd12b/core/lib/stores/ot.ts#L36-L110" rel="nofollow">https://github.com/josephg/statecraft/blob/b6a82f34268238c90...</a> .

In my experience the reason why semi- or fully-centralized systems are popular in products like google docs is that they're easier to implement. Access control in a decentralized system like git is harder. Gossip networks don't perform as well as straight offset-based event logs (kafka and friends). And if you have a canonical incoming stream of edits, it's easier to reason about.

---

> I have a stronger conclusion: any attempt to automate resolving simultaneous editing conflicts that, e.g., git merge could not resolve, will fail in a way that fatally confuses users.

I think you have to act with intent about what you want to happen when two users edit the same text at the same time. There are basically 2 approaches:

1. Resolve to some sort of best-effort outcome. (Eg "DE F G" or "E F GD")
2. Generate an error of some sort (eg via conflict markers) and let the user explicitly resolve the conflict

As much as it pains me to say, for code I think the most correct answer is to use approach (1) when the code is being edited live and (2) when the code is being edited offline / asynchronously. When we can see each other's changes in realtime, humans handle this sort of thing pretty well. We'll back off if someone is actively editing a sentence and we'll see them typing and let them finish their thought. If anything goes wrong we'll just correct it (together) before moving on. The problem happens when we're not online, and we edit the same piece of code independently, "blind" as it were. And in those cases, I think version control systems have the right approach - because the automated merge is often wrong.

(More: <a href="https://github.com/xi-editor/xi-editor/issues/1187#issuecomment-491551004" rel="nofollow">https://github.com/xi-editor/xi-editor/issues/1187#issuecomm...</a> )
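
The retry loop described in that comment (try to apply at an expected version, and on a mismatch fetch the concurrent edits, transform, and retry) might look roughly like the sketch below. Everything here is a stand-in: `Store` is an in-memory substitute for a database with transactional writes, operations are insert-only, and none of the names come from statecraft, Firepad, or ShareDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # character offset in the document
    text: str  # text to insert at that offset
    site: str  # author id, used only to break ties at equal offsets


class VersionConflict(Exception):
    """Raised when a write names a version that is no longer current."""


class Store:
    """Hypothetical stand-in for a database with compare-and-set style writes."""

    def __init__(self, doc: str = "") -> None:
        self.doc = doc
        self.log: list[Insert] = []  # log[i] takes the document from version i to i + 1

    @property
    def version(self) -> int:
        return len(self.log)

    def try_apply(self, op: Insert, expected_version: int) -> None:
        # Reject the write unless the caller saw the latest version.
        if expected_version != self.version:
            raise VersionConflict
        self.doc = self.doc[:op.pos] + op.text + self.doc[op.pos:]
        self.log.append(op)

    def ops_since(self, version: int) -> list[Insert]:
        return self.log[version:]


def transform(op: Insert, applied: Insert) -> Insert:
    """Rewrite `op` so it keeps its intent once `applied` is already in the document."""
    if applied.pos < op.pos or (applied.pos == op.pos and applied.site < op.site):
        return Insert(op.pos + len(applied.text), op.text, op.site)
    return op


def submit(store: Store, op: Insert, base_version: int) -> int:
    """Apply `op`, written against `base_version`, retrying after concurrent edits."""
    while True:
        try:
            store.try_apply(op, base_version)
            return store.version
        except VersionConflict:
            # Catch up: transform over every edit we have not seen, then retry.
            for concurrent in store.ops_since(base_version):
                op = transform(op, concurrent)
                base_version += 1


# Alice and Bob both edit version 0 of "hello world".
store = Store("hello world")
submit(store, Insert(5, ",", "alice"), base_version=0)
submit(store, Insert(11, "!", "bob"), base_version=0)
print(store.doc)
```

Bob's insert is rejected on the first attempt, shifted one character right to account for Alice's comma, and accepted on the retry. A real system would also need deletes, a transform that covers every pair of operation types, and a store whose version check and write are genuinely atomic.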

EGreg, about 6 years ago

From these comments it seems that OT requires a central server while CRDT can have a far more flexible topology. Is this true? And don't we have robust implementations of CRDT for simple trees?

microcolonel, about 6 years ago

I think it would be interesting to let the language mode control the rope, or delegate subtrees of the rope to a mode. This way, you could represent things like lexical scope in the tree of the rope, and a language-specific tokenizer could further reduce the complexity of syntax formatting.

Emacs has the concept of "faces", and many Emacs major modes have proper parsers, lexers, and even some static analyzers that they use to apply the faces. If the rope resembled the AST, then many of the issues Raph talks about could be greatly reduced by localizing edits to their area of influence. If you edit inside a token, and somebody else deletes that whole token, then it is pretty clear how to resolve that. You could conceive of natural language modes which produce humanistic hierarchies, or modes with internal formats other than text (which may have a cached text view on them) like spreadsheets or debuggers.

marknadal, about 6 years ago

TL;DR: CRDTs cannot be "bolted on top".

----

I really don't like this answer, but it is sadly true - even as an expert in the space (my database <a href="https://github.com/amark/gun" rel="nofollow">https://github.com/amark/gun</a> is one of the few CRDT-based systems out there). And there is a simple reason for this:

Distributed systems are composable, they can be used to build higher-level strongly consistent systems on top. (Note: Sacrificing AP along the way, but then you can have a "tunable" system where each record you save you decide what consistency requirement you need, fast or slow.)

However centralized systems are not composable, you can't go "down" the ladder of abstraction by adding more stuff.

atheowaway4z, about 6 years ago

Parsing & syntax highlighting before CRDT might give better results