TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

CRDT: Fractional Indexing

343 pointsby mblodeover 2 years ago

19 comments

jitlover 2 years ago
To me the other algorithms described in the list are more novel and interesting:<p><a href="https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;crdt-tree-based-indexing&#x2F;" rel="nofollow">https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;crdt-tree-based-indexing&#x2F;</a> - for when precise order is critical, like paragraphs in a document. This algorithm is <i>almost</i> like storing adjacency information like a linked list, but is more convergent. Very interesting for [my use-case](<a href="https:&#x2F;&#x2F;www.notion.so&#x2F;blog&#x2F;data-model-behind-notion" rel="nofollow">https:&#x2F;&#x2F;www.notion.so&#x2F;blog&#x2F;data-model-behind-notion</a>).<p><a href="https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;crdt-mutable-tree-hierarchy&#x2F;" rel="nofollow">https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;crdt-mutable-tree-hierarchy&#x2F;</a> - for tree-shaped data, like blocks in a Notion page that should have exactly one parent, but allow concurrent re-parenting operations<p><a href="https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;log-spaced-snapshots&#x2F;" rel="nofollow">https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;log-spaced-snapshots&#x2F;</a> - log space snapshots, for choosing what fidelity of historical information to store. For context, many CRDTs for rich text or sequences store unbounded history so that any edit made at any time can be merged into the sequence. For long-lived documents, this could be impractical to sync to all clients or keep in &quot;hot&quot; memory. Instead, we can decide to compact historical data and move it to cold storage, imposing a time boundary on what writes the system can accept on the hot path. The log-spaced snapshots algorithm here could be used to decide what should be kept &quot;hot&quot;, and how to tune the cold storage.
评论 #33768462 未加载
评论 #33770369 未加载
samwillisover 2 years ago
Anyone unsure of what a CRDT is (I think everyone on HN must know by now), this is the perfect intro: <a href="https:&#x2F;&#x2F;www.inkandswitch.com&#x2F;peritext&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.inkandswitch.com&#x2F;peritext&#x2F;</a><p>The two most widely used CRDT implementations (combining JSON like general purpose types and rich text editing types) are:<p>- Automerge <a href="https:&#x2F;&#x2F;github.com&#x2F;automerge&#x2F;automerge" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;automerge&#x2F;automerge</a><p>- Yjs <a href="https:&#x2F;&#x2F;github.com&#x2F;yjs&#x2F;yjs" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;yjs&#x2F;yjs</a><p>Both have JS and Rust implementations, and have bindings to most online rich text editors.<p>CRDTs are addictive one you get into them.
评论 #33768162 未加载
评论 #33766778 未加载
LAC-Techover 2 years ago
CRDTs are often talked about in the same breath as collaborative editing software, but they&#x27;re useful for much more than that.<p>They really are a theoretical model of how distributed, convergent, multi-master systems have to work. IE the DT in CRDT could be a whole datastore, not as just an individual document.<p>(Wish I could remember who on HN alerted me to this. I had read the paper but didn&#x27;t grok the full implications).
评论 #33766098 未加载
评论 #33765667 未加载
评论 #33765544 未加载
paulgbover 2 years ago
This stuff is fun to play with. I implemented a Rust version of fractional indexing based on another of Evan’s blog posts.<p><a href="https:&#x2F;&#x2F;docs.rs&#x2F;fractional_index&#x2F;latest&#x2F;fractional_index&#x2F;" rel="nofollow">https:&#x2F;&#x2F;docs.rs&#x2F;fractional_index&#x2F;latest&#x2F;fractional_index&#x2F;</a>
评论 #33764964 未加载
评论 #33765590 未加载
superb-owlover 2 years ago
CRDTs are incredible. I recommend checking out the original link [1] as well as this “awesome” list [2]<p>[1] <a href="https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;" rel="nofollow">https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;</a><p>[2] <a href="https:&#x2F;&#x2F;github.com&#x2F;alangibson&#x2F;awesome-crdt" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;alangibson&#x2F;awesome-crdt</a>
conaclosover 2 years ago
This is basically the idea behind Logoot [Weis_2009] that was improved by LSeq [Nédelec_2013] and later extended to the first block-wise sequence CRDT: LogootSplit [André_2013]. LogootSplit was recently improved as Dotted LogootSplit [1] [Elvinger_2021].<p>Disclamer: I&#x27;m the author of Dotted LogootSplit.<p>[Weis_2009] <a href="https:&#x2F;&#x2F;hal.inria.fr&#x2F;inria-00432368" rel="nofollow">https:&#x2F;&#x2F;hal.inria.fr&#x2F;inria-00432368</a><p>[Nédelec_2013] <a href="https:&#x2F;&#x2F;hal.archives-ouvertes.fr&#x2F;hal-00921633&#x2F;en" rel="nofollow">https:&#x2F;&#x2F;hal.archives-ouvertes.fr&#x2F;hal-00921633&#x2F;en</a><p>[André_2013] <a href="https:&#x2F;&#x2F;hal.archives-ouvertes.fr&#x2F;hal-01246212" rel="nofollow">https:&#x2F;&#x2F;hal.archives-ouvertes.fr&#x2F;hal-01246212</a><p>[Elvinger_2021] <a href="https:&#x2F;&#x2F;hal.univ-lorraine.fr&#x2F;tel-03284806" rel="nofollow">https:&#x2F;&#x2F;hal.univ-lorraine.fr&#x2F;tel-03284806</a><p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;coast-team&#x2F;dotted-logootsplit" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;coast-team&#x2F;dotted-logootsplit</a>
throwaway81523over 2 years ago
This is kind of interesting but &quot;fractional indexing&quot; doesn&#x27;t seem to be a computer science topic, and I think it might be clearer to treat these indexes as lists of numbers (or ordinals in ω^ω, if you prefer) rather than fractions. Those are simpler to generate and compare than arbitrary-precision fractions. Or as jitl&#x27;s post suggests, using trees as indexes (I haven&#x27;t yet looked at jitl&#x27;s linked articles). Those would presumably have order type ε₀. It&#x27;s not clear to me why you&#x27;d want that, but it seems doable. In all these schemes you might occasionally want a &quot;stop the world garbage collection&quot; where you reset all the indices to be ordinary integers or maybe pairs of integers. I guess that is also doable without having to pause all the updates, at least if you use pairs.
评论 #33767185 未加载
lukeramsdenover 2 years ago
Reminds me a little of &quot;Lexorank&quot; used in Jira[0], except by using decimals instead of base-36 strings, which I guess could be more efficient? The &quot;rebalancing&quot; aspect is interesting because on a long enough timescale for a document you will definitely want to do it. Would love to read up more on any algorithms for doing this in a p2p manner - with a central server it&#x27;s probably quite easy using some tombstoning or something.<p>[0] <a href="https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;40718900&#x2F;jiras-lexorank-algorithm-for-new-stories" rel="nofollow">https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;40718900&#x2F;jiras-lexorank-...</a>
fatneckbeardzover 2 years ago
reminds me a lot of Ford Circle &#x2F; Farey Diagram &#x2F; Stern Brocot tree<p>Basically a tree of fractions where you take two rational points on a number line, a&#x2F;b and c&#x2F;d, then the next point in the tree is (a+b) &#x2F; (c+d). Turns out that every single point you create this way has a unique position and never duplicate each other, and it forms a tree like structure.<p><a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Ford_circle" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Ford_circle</a><p>not sure if this would be useful, but basically it could be a fractional index that has a built in tree structure, since it basically means any fraction is a leaf on a Stern-Brocot tree.
评论 #33772001 未加载
newhousebover 2 years ago
Is this the same as LSeq [1] except rather than using bytes one is basically using digits in a floating point representation (given this is JS where most things are floats)?<p>[1] <a href="https:&#x2F;&#x2F;bartoszsypytkowski.com&#x2F;operation-based-crdts-arrays-1&#x2F;" rel="nofollow">https:&#x2F;&#x2F;bartoszsypytkowski.com&#x2F;operation-based-crdts-arrays-...</a>
brileeover 2 years ago
Doesn&#x27;t this end up being effectively a binary heap, with a maximum tree depth of 23 (floating point mantissa precision)? I imagine there must be a rebalancing operation required every so often, possibly more frequently for pathological insertion orders.
评论 #33766055 未加载
lelandfeover 2 years ago
For those unaware: Evan is the cofounder, former CTO, and technical wizard behind Figma; creator of esbuild, etc.
bhlover 2 years ago
Is there a good resource on designing collaborative apps with CRDTS, that is which type of CRDT and conflict resolution to pick for a given application? Something like <a href="https:&#x2F;&#x2F;mattweidner.com&#x2F;2022&#x2F;02&#x2F;10&#x2F;collaborative-data-design.html" rel="nofollow">https:&#x2F;&#x2F;mattweidner.com&#x2F;2022&#x2F;02&#x2F;10&#x2F;collaborative-data-design...</a> but generalizes.
johnxieover 2 years ago
Thanks for sharing this, we implemented OT in our real-time editor, but CRDT&#x27;s potential is evident in creating a collaborative experience, in particular fractional indexing and reparenting of mutable tree for our use cases.<p>We should end the CRDT vs OT debate once and for all.
estover 2 years ago
Reminds me of TokuDB engine for MySQL. Good times.
parenthesesover 2 years ago
Maybe I’m missing the point. Random offsets feels like an inelegant solution in the space of elegant solutions (read: CRDTs).
netikover 2 years ago
this reminds me of libketama sharding
dangover 2 years ago
This looks like great stuff if you follow the pointers, but lists don&#x27;t make good submissions to HN (which itself is already a list). They tend not to lead to deep discussion because comments are about lowest common denominator of the items on the list, and this is usually pretty generic. What&#x27;s better is to pick the most interesting item on the list and submit that instead. <a href="https:&#x2F;&#x2F;hn.algolia.com&#x2F;?dateRange=all&amp;page=0&amp;prefix=true&amp;sort=byDate&amp;type=comment&amp;query=denominator%20list%20by:dang" rel="nofollow">https:&#x2F;&#x2F;hn.algolia.com&#x2F;?dateRange=all&amp;page=0&amp;prefix=true&amp;sor...</a><p>Edit: let&#x27;s do that in this case. I&#x27;ve changed the URL from <a href="https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;" rel="nofollow">https:&#x2F;&#x2F;madebyevan.com&#x2F;algos&#x2F;</a> based on <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=33764875" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=33764875</a>.
评论 #33765041 未加载
jiffyjeffover 2 years ago
Conflict-free replicated data type, or CRDT, is often referred to as simply “CRDT” in trade jargon without any definition. Apparently.