Wikidata did a comprehensive analysis of graph DBs [0] and settled on BlazeGraph, with TitanDB coming a close second.<p>Notably, there are quite a few omissions; DGraph and Cayley [1] are two of them. Interestingly, both are developed by Googlers. Cayley is used by Kythe.io [2], a Google project that kind of competes with srclib [3] by SourceGraph.<p>Cayley has a native JavaScript interface, which makes it an interesting choice for Node.js-based apps.<p>At work, we settled on TitanDB, primarily because it supports DynamoDB/Cassandra for storage and Elasticsearch for indexing. Most graph DBs rely on some storage engine or other underneath. Cayley supports LevelDB, for instance, whereas TitanDB supports BerkeleyDB in addition to the aforementioned DynamoDB and Cassandra.<p>[0] <a href="https://docs.google.com/a/wikimedia.org/spreadsheets/d/1MXikljoSUVP77w7JKf9EXN40OB-ZkMqT8Y5b2NYVKbU/edit#gid=0" rel="nofollow">https://docs.google.com/a/wikimedia.org/spreadsheets/d/1MXik...</a><p>[1] <a href="https://github.com/google/cayley" rel="nofollow">https://github.com/google/cayley</a><p>[2] <a href="https://kythe.io" rel="nofollow">https://kythe.io</a><p>[3] <a href="https://srclib.org" rel="nofollow">https://srclib.org</a>
This is really exciting. I've been hoping for a robust, distributed open source Graph database ever since I first played with Freebase (which clearly had some amazing secret sauce, long-since purchased by Google). The engineer behind DGraph has worked on Google Knowledge Graph, the spiritual successor to Freebase, and obviously understands the space incredibly well: <a href="https://twitter.com/manishrjain" rel="nofollow">https://twitter.com/manishrjain</a>
This looks excellent!<p>Some questions because I need something like this:<p>What does "distributed" mean in this context? Can the graph size be larger than the storage on a single node? If so, how is it partitioned (I think Titan was randomly partitioned)?<p>Has any thought been given to in-graph processing (PageRank etc)?
All the graph database traversals I've seen are fairly simple (Friend of a friend, Movies starring X).<p>Are they a good choice for turn-by-turn navigation, and answering questions (given a traffic dataset) like: "What has been the quickest route between A and P, departing at 8am on a Monday morning?"
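To make the routing question concrete: "quickest route departing at 8am" is a time-dependent shortest-path problem, where the cost of an edge depends on when you enter it. Below is a minimal sketch using Dijkstra's algorithm over arrival times. It is not how any particular graph DB does it. The graph, the `rush` helper, and the travel times are all made up for illustration, and it assumes that leaving later never gets you there earlier (the FIFO property).

```python
import heapq

def earliest_arrival(graph, source, target, depart):
    """Time-dependent Dijkstra.

    graph maps node -> list of (neighbor, travel_time_fn), where
    travel_time_fn(t) returns the minutes needed when entering the
    edge at time t (minutes since midnight). Assumes FIFO edges.
    """
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == target:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, travel_time in graph.get(node, []):
            arrival = t + travel_time(t)
            if arrival < best.get(nxt, float("inf")):
                best[nxt] = arrival
                heapq.heappush(heap, (arrival, nxt))
    return None  # target unreachable

def rush(base, peak):
    """Hypothetical traffic profile: slower between 07:00 and 09:00."""
    return lambda t: peak if 7 * 60 <= t < 9 * 60 else base

# Made-up road network from A to P
graph = {
    "A": [("B", rush(10, 30)), ("C", rush(15, 15))],
    "B": [("P", rush(10, 10))],
    "C": [("P", rush(10, 12))],
}

monday_8am = earliest_arrival(graph, "A", "P", 8 * 60)
monday_10am = earliest_arrival(graph, "A", "P", 10 * 60)
```

With this toy data the best route changes with the departure time (via C at 8am, via B at 10am), which is exactly what a plain friend-of-a-friend style traversal has no way to express.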
Your landing page is missing any kind of "evidence" that it is scalable, low-latency or high-throughput.<p>Also, if you are sharding on predicate you will end up in big trouble. Predicates in most RDF datasets are not at all evenly distributed, tending more towards extreme value distributions. E.g. in UniProt the most common predicate has 2,419,000,171 occurrences, the least common just 1!<p>Also, if you are going to benchmark, can I suggest the rather good LDBC ones [1]? Even if for marketing reasons you don't want the results public, they are good for showing where you can improve.<p>[1] <a href="http://www.ldbcouncil.org/" rel="nofollow">http://www.ldbcouncil.org/</a>
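A rough illustration of the concern about partitioning by predicate. This is a sketch with made-up triples and a hypothetical `shard_loads` helper, not DGraph's actual partitioning scheme. It hashes each triple to a shard by a chosen key and counts how many triples each shard ends up holding.

```python
import zlib
from collections import Counter

def shard_loads(triples, num_shards, key):
    """Return the number of triples assigned to each shard.

    key(triple) picks the string that gets hashed, e.g. the
    predicate or the subject. crc32 is used only because it is
    deterministic across runs.
    """
    loads = Counter()
    for triple in triples:
        shard = zlib.crc32(key(triple).encode("utf-8")) % num_shards
        loads[shard] += 1
    return [loads[i] for i in range(num_shards)]

# Made-up, skewed dataset: one predicate dominates, one is rare
triples = [(f"s{i}", "rdf:type", f"o{i % 7}") for i in range(9000)]
triples += [(f"s{i}", "ex:name", f"n{i}") for i in range(900)]
triples += [(f"s{i}", "ex:rare", "x") for i in range(100)]

by_predicate = shard_loads(triples, 4, key=lambda t: t[1])
by_subject = shard_loads(triples, 4, key=lambda t: t[0])

print("by predicate:", by_predicate)
print("by subject:  ", by_subject)
```

Whichever shard gets the dominant predicate holds at least 90% of the data here, while hashing by subject spreads the same triples far more evenly. Real datasets like UniProt are much more skewed than this toy one.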
Is a graph DB suitable for use cases like products/homes/cars etc., where users mostly do "and" queries to narrow down the result set? If so, is it faster than a traditional SQL DB?
This looks very promising!<p>Are you planning to add filtering, supported by indexes? Seems a bit useless for production use if you can't filter a query by predicate, or even sort/limit. You could layer something like Elasticsearch on top of it, but then you lose all the graph support.<p>Any thoughts on enforcing schemas?