TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Show HN: A Schemaless Data Store Within Your SQL Database

45 pointsby dfragnitoover 3 years ago
Hi HN, we&#x27;re Eric and Dean, creators of SchemafreeSQL. Its roots go back to an on-line Web App Development Environment we developed back in 1999. It was comprised of an IDE, Web Server, Object Store, Virtual File System, Template System, and polyglot (Java, JavaScript, and Python). Of course, we named it “.OS”. Then we ended up dropping it. But that&#x27;s a story for another time. It was the ease of use of the Object Store from .OS that we really missed, which brings us back to SchemafreeSQL. It provides an enhanced API to your SQL Database which allows it to function as a Schemaless Objectstore. Yes, &quot;Schemaless Objectstore&quot; might be an oxymoron if you consider that objects have strict &quot;schemas&quot;. So what are we talking about here?<p>It started with a simple goal. We wanted to store data just as simply as one could throw a JSON document into a doc store, but while maintaining the power of the data&#x27;s relations and references in an object sense.<p>What we ended up with was a JSON over HTTP API (for ubiquitous access from any language), that maps any JSON structure into an optimized, static set of indexed tables in any SQL database (MySQL being our first port). Using it, we can immediately store any data we want, without having to create tables and indexes and optimize them.<p>Inserting data is as easy as sending a JSON doc to a doc store, but in SFSQL all nested relations inherent within that document are stored as objects and primitives. And, with a little extra JSON syntax we can use references and eliminate duplication of data (e.g. &quot;#ref&quot;: &quot;oid-goes-here&quot;), all from within the same source JSON doc. There is also no added speed penalty for referencing an object. In fact, all parent nodes in the JSON source data are stored as objects. So any object can be referenced from anywhere, allowing for complex relations (e.g. any-to-any), enabling rapid prototyping of data structures (no tables to create), and enabling unique on-the-fly relations if need be.<p>Queries are issued using JOIN-free SQL query syntax (e.g. SELECT COUNT(*) WHERE $s:person.address.state=&#x27;NY&#x27;) that eliminates the verbosity of the standard SQL JOIN, yet still maintains access to the primitive and aggregate functions of the underlying SQL database (e.g. FORMAT(), COUNT(), etc.) which simply get passed through.<p>What it&#x27;s not? It&#x27;s not a database on it&#x27;s own and it&#x27;s not a horizontally scalable big data store. Although, since it can extend practically any SQL database, it&#x27;s ability to scale is in large part tied to the scaling ability of the chosen SQL database that it&#x27;s extending. It&#x27;s also not an ORM as there is no strict mapping of objects in code to objects in the datastore. And yes, there were trade offs in it&#x27;s design. One design choice was that every primitive be indexed, which results in better performance than against a straight SQL table without the proper indexes, though slower than a straight SQL table with a composite index tailored to that query. But, the ability to JOIN to an existing SQL table within a SFSQL query is on the roadmap which will allow you have the best of both worlds (custom tables w&#x2F;custom indexes referenced from within a SFSQL query). As well, there is a situation one encounters in indexing choices (but that you won&#x27;t encounter with SFSQL) that we like to call the &quot;Index Permutation Limitation&quot;. Simply put, there is a real limit to the number of composite indexes that you can create for any table&#x2F;collection. This is especially a problem when - for instance - you want to give end users the ability to query across any of the possible attributes across the myriad product types that a big online store may carry. We get into detail on this point within our FAQ.<p>So what it is? It&#x27;s an extension to SQL databases that makes storing and querying and modeling and just generally using a datastore... easier. We&#x27;re hoping it will stir up some interesting use-cases.<p>Thanks in advance for the feedback!

10 comments

productceoabout 3 years ago
&gt;Web App Development Environment we developed back in 1999. It was comprised of an IDE, Web Server, Object Store, Virtual File System, Template System, and polyglot (Java, JavaScript, and Python).<p>When you made this, why did you believe building this was what people needed?<p>And what did you end up observing that made you discontinue the project?
AtlasBarfedover 3 years ago
Why isn&#x27;t it horizontally scalable? Are there tons of joins?<p>I wrote a JSON&#x2F;Document&#x2F;Property Graph database front end with Cassandra and DynamoDB backends, so that is ... kinda ... schemaless.<p>I ended up doing a lot of things that RethinkDB and OrientDB did but didn&#x27;t scale like cassandra. I was trying to do a streaming document layer. Also a JSON document database with simple relations is a property graph database.<p>But Elassandra and similar projects kinda did the same thing so I dropped it. But I came out of it thinking it shouldn&#x27;t be hard to scale JSON, which is basically schemaless.<p>I had schema-optional things too where certain named properties&#x2F;elements of a JSON doc would be rendered onto a fixed table and schema, are you.guys doing anything like that? I think a schema optional or options for partial schemas for documents would enable the transition of schemaless data to schema&#x27;d stores in enterprise system evolution.<p>Do you guys intermix all documents like graph databases do, which seems a bit insane if you have to do some large scale table scan or migration of data and have to sift through data? Or build an index on a heap of documents with only some of them needing indexing?
rmbyrroover 3 years ago
Hey, quick question: would this by any chance work as an AWS Lambda container image [1] connected to Aurora Serverless [2] (MySQL or Postgres), perhaps using the Data API [3]?<p>If yes, I think this is a big plus for teams looking for a managed infra.<p>[1] <a href="https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;lambda&#x2F;latest&#x2F;dg&#x2F;lambda-images.html" rel="nofollow">https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;lambda&#x2F;latest&#x2F;dg&#x2F;lambda-images.h...</a><p>[2] <a href="https:&#x2F;&#x2F;aws.amazon.com&#x2F;rds&#x2F;aurora&#x2F;serverless&#x2F;" rel="nofollow">https:&#x2F;&#x2F;aws.amazon.com&#x2F;rds&#x2F;aurora&#x2F;serverless&#x2F;</a><p>[3] <a href="https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AmazonRDS&#x2F;latest&#x2F;AuroraUserGuide&#x2F;data-api.html" rel="nofollow">https:&#x2F;&#x2F;docs.aws.amazon.com&#x2F;AmazonRDS&#x2F;latest&#x2F;AuroraUserGuide...</a>
评论 #30300796 未加载
brad0over 3 years ago
Interesting project! I&#x27;m curious about how this compares to other tech:<p>- How is data stored internally?<p>- What are the tradeoffs with this implementation?<p>- Do you generate indexes on data as it comes in?<p>- How are nested keys handled?<p>- Are indexes eventually consistent?
评论 #30293251 未加载
rmbyrroover 3 years ago
Sounds interesting and useful.<p>How likely is it for this product to be alive and well maintained for the next 5 years at least?
评论 #30295735 未加载
mdellavoover 3 years ago
Are you unrolling the nested JSON data structures and storing as traditional K&#x2F;Vs in an EAV pattern? Possibly using one table for each datatype or using a sparse table?<p>I&#x27;d be curious how this performs for complex queries - does this rely heavily on index intersection?
评论 #30295423 未加载
评论 #30294008 未加载
cryptonectorover 3 years ago
Want schema-less SQL? Use an EAV (entity-attribute-value) schema.
评论 #30295489 未加载
评论 #30295463 未加载
azthover 3 years ago
Is it similar to this: <a href="https:&#x2F;&#x2F;eng.uber.com&#x2F;logging" rel="nofollow">https:&#x2F;&#x2F;eng.uber.com&#x2F;logging</a> ?
评论 #30296885 未加载
pvgover 3 years ago
There&#x27;s something wrong with your account creation process, it rejects just about any auto-generated (by Chrome) password even though they easily meet the listed criteria.
评论 #30294356 未加载
评论 #30294100 未加载
评论 #30294031 未加载
crooked-vover 3 years ago
This sounds like a lot of reinvention of the JSON&#x2F;JSONB support that Postgres has already had since 9.2&#x2F;9.4. They have a slide deck with a variety of in-depth examples. <a href="https:&#x2F;&#x2F;wiki.postgresql.org&#x2F;images&#x2F;7&#x2F;7c&#x2F;JSON-by-example_FOSDEM_2016.pdf" rel="nofollow">https:&#x2F;&#x2F;wiki.postgresql.org&#x2F;images&#x2F;7&#x2F;7c&#x2F;JSON-by-example_FOSD...</a><p><pre><code> CREATE INDEX review_review_jsonb ON reviews USING GIN (review_jsonb); -- Select data with JSON SELECT review_jsonb#&gt;&gt; &#x27;{product,title}&#x27; AS title , avg((review_jsonb#&gt;&gt; &#x27;{review,rating}&#x27;)::int) AS average_rating FROM reviews WHERE review_jsonb@&gt;&#x27;{&quot;product&quot;: {&quot;category&quot;: &quot;Sheet Music &amp; Scores&quot;}}&#x27; GROUP BY title ORDER BY average_rating DESC ;</code></pre>
评论 #30294259 未加载
评论 #30293913 未加载
评论 #30294621 未加载
评论 #30293177 未加载