<i>> Too often, web tiers are full of boilerplate that does nothing except convert a result set into JSON.</i><p>Yeah. This is the problem: we've abandoned the hypermedia architecture of the web for a dumb-data, RPC model. I suppose if you are going to go that direction and things are simple enough, you can jam it in the DB and get smoking perf.<p>But as things get more complicated, where does the business logic live? Maybe in the database as stored procedures? It's less crazy than it sounds.<p>An alternative is to cut out the <i>other</i> middle man: JSON, ditch heavy front-end javascript and return HTML instead, going back to the hypermedia approach. The big problem with that is a step back in UX functionality, but there are tools[1][2][3] for addressing that.<p>[1] - <a href="https://htmx.org" rel="nofollow">https://htmx.org</a> (my library)<p>[2] - <a href="https://unpoly.com" rel="nofollow">https://unpoly.com</a><p>[3] - <a href="https://hotwired.dev" rel="nofollow">https://hotwired.dev</a>
A lot of commenters are jumping on the "cut out the middle tier" title.<p>The bulk of the content is simply about formatting JSON in the DB instead of manually mapping rows in the application layer.<p>It doesn't say "eliminate your API layer" or "have no application logic between client and db", as most here are assuming.<p>I find the methods described helpful in that I can convert to the data structures I ultimately want in one pass instead of two.<p>Doesn't mean I don't have validation or a traditional API layer. Just easier to use.
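For anyone who hasn't read the article, a minimal sketch of what that one-pass conversion looks like in PostgreSQL (the table and column names here are hypothetical):<p>  -- Hypothetical schema: employees(id, name, dept_id).
  -- The database builds the JSON array the API layer would
  -- otherwise assemble row by row.
  SELECT COALESCE(json_agg(json_build_object('id', e.id, 'name', e.name)), '[]') AS employees
  FROM employees e
  WHERE e.dept_id = 42;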
If one wishes to "cut out the middle tier", they might be better served by investigating a project like PostgREST [0].<p>[0] <a href="https://postgrest.org/en/stable/" rel="nofollow">https://postgrest.org/en/stable/</a>
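For the unfamiliar: PostgREST serves the tables and views in its configured schema as JSON endpoints, so exposing data is just a matter of defining a view. A rough sketch, with a hypothetical api schema and hypothetical table names:<p>  -- PostgREST exposes views in its configured schema as REST endpoints.
  CREATE VIEW api.employee_summary AS
    SELECT e.id, e.name, d.name AS department
    FROM employees e
    JOIN departments d ON d.id = e.dept_id;
  -- A request like GET /employee_summary?department=eq.Sales then returns JSON.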
As with everything, it depends.<p>I've implemented a large system managing billions of records using exactly this approach of cutting out a lot of boilerplate in the application layer.<p>The most important thing is to be pragmatic. This approach works great for CRUD as well as many types of complex analytical queries. However, for cases where the complexity or performance of the SQL was unacceptable, we also decoded data into application data structures for further processing.<p>When done well, you can get the best of both worlds as needed.
It is a really, really bad idea to put business logic in your database, and there are people in this thread downvoting everyone who points it out. Please do not do this in production - I've had to maintain systems that married business logic and the database, and it was a nightmare to maintain and cost a fortune to fix.
The obligatory comment on these posts is always "check out PostgREST". We use it for all frontend APIs. Combined with Node or any other middleware, we add additional code when everything can't be done in plpgsql, but requests are usually just passed straight through Node.
"Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should."<p>While interesting, the advice offered in this post is generally bad, or at least in complete. OK, so, you "cut out the middle tier..." what now? Are web clients connecting to PostGRES directly? Will PostGRES handle connection authentication and request authorization? Logging that a particular user looked at particular data? Can the client cache this data? If so, for how long? Even taking the author's premise, this is not a good pattern - a simple JSON get request still comes with a bunch of other stuff that this doesn't bother to address.<p>But the premise is wrong - few applications just spit out JSON like this, they also have to handle updates, and there's business logic involved in that. Data needs to be structured in a way that's reasonable for clients to consume, which isn't always a row of data (maybe it should be collapsing all values in a column, so that in the JSON the client just gets a JSON array of the values).
The more things change, the more they stay the same.<p>If we lived in XHTML 2.0 universe instead of HTML5 universe, then that would be the way you do things. Not with SQL and JSON mind you, but with XML everywhere.<p>I had a glimpse into that world and it looked pretty good at the time. XML databases + XQuery can produce whatever you want as long as it's XML. And XML could be many things. Many horrible things too, made by people with arms growing out of places they shouldn't.
It is pretty easy to get any result set straight from PostgreSQL as a JSON object:<p>  WITH _ AS (
    -- real query goes here
  )
  SELECT COALESCE(array_to_json(array_agg(_.*)), '[]') AS xyz FROM _;<p>From there it behaves as standard JSON, which can be easily loaded into any language.
If you are using Rails and ActiveModelSerializers, I wrote a gem to take your serializer definition and build the whole JSON result in Postgres [0]. It hasn't gotten much use, but I know of at least one company that has used it to get 100x performance boosts on #index endpoints.<p>It generates jsonapi right now, since that is standard for Ember projects, but it'd be pretty easy to add an adapter for some other format.<p>[0] <a href="https://github.com/pjungwir/active_model_serializers_pg" rel="nofollow">https://github.com/pjungwir/active_model_serializers_pg</a>
I made a side project to experiment with this approach (and Rust) a few years ago. It worked pretty well, but it made simple queries more complex[1]. I was a little bit worried that complex queries would become a lot more complex, so I'm not sure how well this approach would scale.<p>It removed a crazy amount of boilerplate code [2] though, and made the project much more focused on the actual data, which is always a good thing.<p>Ever since, I've been very curious to try edgedb [3], since it promises to do similar things without making the queries more complex.<p>[1]<a href="https://github.com/green7ea/newsy/blob/master/src/feed_overview.sql" rel="nofollow">https://github.com/green7ea/newsy/blob/master/src/feed_overv...</a><p>[2]<a href="https://github.com/green7ea/newsy/blob/master/src/main.rs#L44" rel="nofollow">https://github.com/green7ea/newsy/blob/master/src/main.rs#L4...</a><p>[3]<a href="https://www.edgedb.com/" rel="nofollow">https://www.edgedb.com/</a>
Databases are providing greater native functionality, which will reduce our effort in writing boilerplate code.<p>But the underlying SQL gets complex, and it gets harder to maintain high-performance code - or hire good engineers.<p>This has made me interested in EdgeDB [1]. They are abstracting the high-performance SQL bit away from the engineer by introducing a new query language, EdgeQL [2].<p>You can think of EdgeQL and EdgeDB as the next step in abstraction, just like the many programming languages that did the same for machine code and beyond. You may be able to write assembly by hand, but it's tiresome and may not actually lead to high performance. Also, a better UX for the engineer (eg: Kotlin on the JVM) improves time to market and is better for the ecosystem overall.<p>[1] <a href="https://www.edgedb.com/" rel="nofollow">https://www.edgedb.com/</a><p>[2] <a href="https://www.edgedb.com/docs/edgeql/index" rel="nofollow">https://www.edgedb.com/docs/edgeql/index</a>
Shameless plug: I made bson-to-json for more or less the same purpose with MongoDB: converting the raw BSON response from the DB directly to JSON strings. Saves lots of CPU cycles and GC pressure.<p><a href="https://github.com/zbjornson/bson-to-json" rel="nofollow">https://github.com/zbjornson/bson-to-json</a>
The thing that excites me about this is more efficient ORM querying of data. Previously you either had to make multiple queries to fill parent and child relationships (whether on demand or explicitly, by prefilling a cache), or make a larger join where a lot of data gets duplicated in 1:N relationships (a few chained levels of 1:N relationships lead to extremely inefficient data transfer). Now you can just get back a JSON data structure of exactly what each field should be. There are inefficiencies at scale here as well, with duplicated key names, but that could possibly be worked around to a degree with arrays and an array-type index of some sort.
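A minimal sketch of that nesting in PostgreSQL, assuming hypothetical departments(id, name) and employees(id, name, dept_id) tables:<p>  -- One round trip, no duplicated parent columns: each department
  -- row carries its children as a nested JSON array.
  SELECT json_build_object(
           'department', d.name,
           'employees',
           (SELECT COALESCE(json_agg(json_build_object('id', e.id, 'name', e.name)), '[]')
            FROM employees e
            WHERE e.dept_id = d.id)
         ) AS department
  FROM departments d;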
Once you build V8 into Postgres, things get really interesting (thin JavaScript projections, and a bunch of other stuff).<p><a href="https://github.com/plv8/plv8" rel="nofollow">https://github.com/plv8/plv8</a>
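A rough sketch of what that looks like, assuming the plv8 extension is installed and plv8's automatic conversion between returned JavaScript objects and the json type (function and column names are hypothetical):<p>  CREATE EXTENSION IF NOT EXISTS plv8;

  CREATE OR REPLACE FUNCTION employee_json(name text, salary numeric)
  RETURNS json AS $$
    // Plain JavaScript running inside Postgres; plv8 converts the
    // returned object to the json return type.
    return { name: name, senior: salary > 100000 };
  $$ LANGUAGE plv8;

  -- SELECT employee_json(e.name, e.salary) FROM employees e;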
There's a performance question I have often wondered about.<p>Discussions like these encourage me to put more work into the DB layer; however, only recently have SQL databases like PostgreSQL become horizontally scalable (I know read replicas are a thing, and maybe they are the answer).<p>So why would I put more work and effort into the less scalable data storage layer rather than the horizontally scalable, stateless app layer?
I did the approach with nesting jsonb_agg queries; it worked for me, but it was a bit hacky and sometimes complex. I had hoped for Graph Query Language (the one supposed to become part of the SQL standard) to make it easier to retrieve structured JSON from flat tables, but it doesn't look ready yet.
Does it scale well? I feel like doing all business logic in procedures would require a company to use one huge database in order to intersect its various domains.
It's a bad idea, and I'd like to see how people actually test using this model. Now you have to run PG locally every time you need to unit test something? Also, PG returns some JSON, but if the object is not exactly what you want to send back, do you create another object and merge the two? Just don't do it; it's a terrible design. It "works" for simple APIs, but for anything serious this is wrong.
Please just no, do not do this. Even the dumbest, simplest system there is will evolve beyond simple CRUD access to tables. The second you put your schema on the wire, that schema can never change again. That's just such an insane commitment to make.
I'm not impressed. The same is easy in Neo4j's Cypher:<p>  MATCH (e:Employee)
  RETURN e { .* } AS employee;<p>  MATCH (e:Employee)
  RETURN collect(e { .* }) AS employees;<p>  MATCH (e:Employee)-[:IN]->(d:Department)
  WITH d, collect(e { .* }) AS employees
  RETURN d { .*, employees } AS department;