TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Show HN: A Schemaless Data Store Within Your SQL Database

45 点作者 dfragnito超过 3 年前
Hi HN, we&#x27;re Eric and Dean, creators of SchemafreeSQL. Its roots go back to an on-line Web App Development Environment we developed back in 1999. It was comprised of an IDE, Web Server, Object Store, Virtual File System, Template System, and polyglot (Java, JavaScript, and Python). Of course, we named it “.OS”. Then we ended up dropping it. But that&#x27;s a story for another time. It was the ease of use of the Object Store from .OS that we really missed, which brings us back to SchemafreeSQL. It provides an enhanced API to your SQL Database which allows it to function as a Schemaless Objectstore. Yes, &quot;Schemaless Objectstore&quot; might be an oxymoron if you consider that objects have strict &quot;schemas&quot;. So what are we talking about here?<p>It started with a simple goal. We wanted to store data just as simply as one could throw a JSON document into a doc store, but while maintaining the power of the data&#x27;s relations and references in an object sense.<p>What we ended up with was a JSON over HTTP API (for ubiquitous access from any language), that maps any JSON structure into an optimized, static set of indexed tables in any SQL database (MySQL being our first port). Using it, we can immediately store any data we want, without having to create tables and indexes and optimize them.<p>Inserting data is as easy as sending a JSON doc to a doc store, but in SFSQL all nested relations inherent within that document are stored as objects and primitives. And, with a little extra JSON syntax we can use references and eliminate duplication of data (e.g. &quot;#ref&quot;: &quot;oid-goes-here&quot;), all from within the same source JSON doc. There is also no added speed penalty for referencing an object. In fact, all parent nodes in the JSON source data are stored as objects. So any object can be referenced from anywhere, allowing for complex relations (e.g. any-to-any), enabling rapid prototyping of data structures (no tables to create), and enabling unique on-the-fly relations if need be.<p>Queries are issued using JOIN-free SQL query syntax (e.g. SELECT COUNT(*) WHERE $s:person.address.state=&#x27;NY&#x27;) that eliminates the verbosity of the standard SQL JOIN, yet still maintains access to the primitive and aggregate functions of the underlying SQL database (e.g. FORMAT(), COUNT(), etc.) which simply get passed through.<p>What it&#x27;s not? It&#x27;s not a database on it&#x27;s own and it&#x27;s not a horizontally scalable big data store. Although, since it can extend practically any SQL database, it&#x27;s ability to scale is in large part tied to the scaling ability of the chosen SQL database that it&#x27;s extending. It&#x27;s also not an ORM as there is no strict mapping of objects in code to objects in the datastore. And yes, there were trade offs in it&#x27;s design. One design choice was that every primitive be indexed, which results in better performance than against a straight SQL table without the proper indexes, though slower than a straight SQL table with a composite index tailored to that query. But, the ability to JOIN to an existing SQL table within a SFSQL query is on the roadmap which will allow you have the best of both worlds (custom tables w&#x2F;custom indexes referenced from within a SFSQL query). As well, there is a situation one encounters in indexing choices (but that you won&#x27;t encounter with SFSQL) that we like to call the &quot;Index Permutation Limitation&quot;. Simply put, there is a real limit to the number of composite indexes that you can create for any table&#x2F;collection. This is especially a problem when - for instance - you want to give end users the ability to query across any of the possible attributes across the myriad product types that a big online store may carry. We get into detail on this point within our FAQ.<p>So what it is? It&#x27;s an extension to SQL databases that makes storing and querying and modeling and just generally using a datastore... easier. We&#x27;re hoping it will stir up some interesting use-cases.<p>Thanks in advance for the feedback!

10 条评论

productceo超过 3 年前
&gt;Web App Development Environment we developed back in 1999. It was comprised of an IDE, Web Server, Object Store, Virtual File System, Template System, and polyglot (Java, JavaScript, and Python).<p>When you made this, why did you believe building this was what people needed?<p>And what did you end up observing that made you discontinue the project?
AtlasBarfed超过 3 年前
Why isn&#x27;t it horizontally scalable? Are there tons of joins?<p>I wrote a JSON&#x2F;Document&#x2F;Property Graph database front end with Cassandra and DynamoDB backends, so that is ... kinda ... schemaless.<p>I ended up doing a lot of things that RethinkDB and OrientDB did but didn&#x27;t scale like cassandra. I was trying to do a streaming document layer. Also a JSON document database with simple relations is a property graph database.<p>But Elassandra and similar projects kinda did the same thing so I dropped it. But I came out of it thinking it shouldn&#x27;t be hard to scale JSON, which is basically schemaless.<p>I had schema-optional things too where certain named properties&#x2F;elements of a JSON doc would be rendered onto a fixed table and schema, are you.guys doing anything like that? I think a schema optional or options for partial schemas for documents would enable the transition of schemaless data to schema&#x27;d stores in enterprise system evolution.<p>Do you guys intermix all documents like graph databases do, which seems a bit insane if you have to do some large scale table scan or migration of data and have to sift through data? Or build an index on a heap of documents with only some of them needing indexing?
rmbyrro超过 3 年前
Hey, quick question: would this by any chance work as an AWS Lambda container image [1] connected to Aurora Serverless [2] (MySQL or Postgres), perhaps using the Data API [3]?<p>If yes, I think this is a big plus for teams looking for a managed infra.<p>[1] <a href="https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;lambda&#x2F;latest&#x2F;dg&#x2F;lambda-images.html" rel="nofollow">https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;lambda&#x2F;latest&#x2F;dg&#x2F;lambda-images.h...</a><p>[2] <a href="https:&#x2F;&#x2F;aws.amazon.com&#x2F;rds&#x2F;aurora&#x2F;serverless&#x2F;" rel="nofollow">https:&#x2F;&#x2F;aws.amazon.com&#x2F;rds&#x2F;aurora&#x2F;serverless&#x2F;</a><p>[3] <a href="https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AmazonRDS&#x2F;latest&#x2F;AuroraUserGuide&#x2F;data-api.html" rel="nofollow">https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AmazonRDS&#x2F;latest&#x2F;AuroraUserGuide...</a>
评论 #30300796 未加载
brad0超过 3 年前
Interesting project! I&#x27;m curious about how this compares to other tech:<p>- How is data stored internally?<p>- What are the tradeoffs with this implementation?<p>- Do you generate indexes on data as it comes in?<p>- How are nested keys handled?<p>- Are indexes eventually consistent?
评论 #30293251 未加载
rmbyrro超过 3 年前
Sounds interesting and useful.<p>How likely is it for this product to be alive and well maintained for the next 5 years at least?
评论 #30295735 未加载
mdellavo超过 3 年前
Are you unrolling the nested JSON data structures and storing as traditional K&#x2F;Vs in an EAV pattern? Possibly using one table for each datatype or using a sparse table?<p>I&#x27;d be curious how this performs for complex queries - does this rely heavily on index intersection?
评论 #30295423 未加载
评论 #30294008 未加载
cryptonector超过 3 年前
Want schema-less SQL? Use an EAV (entity-attribute-value) schema.
评论 #30295489 未加载
评论 #30295463 未加载
azth超过 3 年前
Is it similar to this: <a href="https:&#x2F;&#x2F;eng.uber.com&#x2F;logging" rel="nofollow">https:&#x2F;&#x2F;eng.uber.com&#x2F;logging</a> ?
评论 #30296885 未加载
pvg超过 3 年前
There&#x27;s something wrong with your account creation process, it rejects just about any auto-generated (by Chrome) password even though they easily meet the listed criteria.
评论 #30294356 未加载
评论 #30294100 未加载
评论 #30294031 未加载
crooked-v超过 3 年前
This sounds like a lot of reinvention of the JSON&#x2F;JSONB support that Postgres has already had since 9.2&#x2F;9.4. They have a slide deck with a variety of in-depth examples. <a href="https:&#x2F;&#x2F;wiki.postgresql.org&#x2F;images&#x2F;7&#x2F;7c&#x2F;JSON-by-example_FOSDEM_2016.pdf" rel="nofollow">https:&#x2F;&#x2F;wiki.postgresql.org&#x2F;images&#x2F;7&#x2F;7c&#x2F;JSON-by-example_FOSD...</a><p><pre><code> CREATE INDEX review_review_jsonb ON reviews USING GIN (review_jsonb); -- Select data with JSON SELECT review_jsonb#&gt;&gt; &#x27;{product,title}&#x27; AS title , avg((review_jsonb#&gt;&gt; &#x27;{review,rating}&#x27;)::int) AS average_rating FROM reviews WHERE review_jsonb@&gt;&#x27;{&quot;product&quot;: {&quot;category&quot;: &quot;Sheet Music &amp; Scores&quot;}}&#x27; GROUP BY title ORDER BY average_rating DESC ;</code></pre>
评论 #30294259 未加载
评论 #30293913 未加载
评论 #30294621 未加载
评论 #30293177 未加载