Section 2 of OP points us to 5.5 regarding "load balancing", after reminding us that "Optimal graph partitioning is an NP-complete problem".<p>In 5.5 we read about <i>single node</i> internal processor load balancing and also the happy comment in the following 5.6 ("Fault Tolerance") that "in MOSAIC, due to its single-machine design, handling fault tolerance is as simple as checkpointing the intermediate state data (i.e., vertex array)."<p>I would be more interested to hear about processing a "Trillion-Edge" graph distributed over "Two" machines: Partitioning a general graph is not as simple as partitioning k/v buckets or relational graphs; and, 10^12 edge order sounds like a lot until you recognize the 'edge amplification' fact of capturing a logic domain using the weak semantics of named associations.