<i>> Too often, web tiers are full of boilerplate that does nothing except convert a result set into JSON.</i><p>Yeah. This is the problem: we've abandoned the hypermedia architecture of the web for a dumb-data, RPC model. I suppose if you are going to go that direction and things are simple enough, you can jam it in the DB and get smoking perf.<p>But as things get more complicated, where does the business logic live? Maybe in the database as stored procedures? It's less crazy than it sounds.<p>An alternative is to cut out the <i>other</i> middle man: JSON, ditch heavy front-end javascript and return HTML instead, going back to the hypermedia approach. The big problem with that is a step back in UX functionality, but there are tools[1][2][3] for addressing that.<p>[1] - <a href="https://htmx.org" rel="nofollow">https://htmx.org</a> (my library)<p>[2] - <a href="https://unpoly.com" rel="nofollow">https://unpoly.com</a><p>[3] - <a href="https://hotwired.dev" rel="nofollow">https://hotwired.dev</a>
A lot of commenters are jumping on the "cut out the middle tier" title.<p>The bulk of the content is simply about formatting JSON in the DB instead of manually mapping rows in the application layer.<p>It doesn't say "eliminate your API layer" or "have no application logic between client and db", as most here are assuming.<p>I find the methods described helpful in that I can convert to the data structures I ultimately want in one pass instead of two.<p>Doesn't mean I don't have validation or a traditional API layer. Just easier to use.
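For anyone who hasn't read the article, a minimal sketch of what that one-pass conversion looks like in PostgreSQL (the table and column names here are hypothetical):<p>  -- Hypothetical schema: employees(id, name, dept_id).
  -- The database builds the JSON array the API layer would
  -- otherwise assemble row by row.
  SELECT COALESCE(json_agg(json_build_object('id', e.id, 'name', e.name)), '[]') AS employees
  FROM employees e
  WHERE e.dept_id = 42;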
If one wishes to "cut out the middle tier", they might be better served by investigating a project like PostgREST [0].<p>[0] <a href="https://postgrest.org/en/stable/" rel="nofollow">https://postgrest.org/en/stable/</a>
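For the unfamiliar: PostgREST serves the tables and views in its configured schema as JSON endpoints, so exposing data is just a matter of defining a view. A rough sketch, with a hypothetical api schema and hypothetical table names:<p>  -- PostgREST exposes views in its configured schema as REST endpoints.
  CREATE VIEW api.employee_summary AS
    SELECT e.id, e.name, d.name AS department
    FROM employees e
    JOIN departments d ON d.id = e.dept_id;
  -- A request like GET /employee_summary?department=eq.Sales then returns JSON.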
As with everything, it depends.<p>I've implemented a large system managing billions of records using exactly this approach of cutting out a lot of boilerplate in the application layer.<p>The most important thing is to be pragmatic. This approach works great for CRUD as well as many types of complex analytical queries. However, for cases where the complexity or performance of the SQL was unacceptable, we also decoded data into application data structures for further processing.<p>When done well, you can get the best of both worlds as needed.
It is a really, really bad idea to put business logic in your database, and there are people in this thread downvoting everyone who points it out. Please do not do this in production - I've had to maintain systems that married business logic and the database, and it was a nightmare to maintain and cost a fortune to fix.
The obligatory comment on these posts is always "check out PostgREST". We use it for all frontend APIs. Combined with Node or any other middleware, we add additional code when everything can't be done in plpgsql, but requests are usually just passed straight through Node.
"Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should."<p>While interesting, the advice offered in this post is generally bad, or at least in complete. OK, so, you "cut out the middle tier..." what now? Are web clients connecting to PostGRES directly? Will PostGRES handle connection authentication and request authorization? Logging that a particular user looked at particular data? Can the client cache this data? If so, for how long? Even taking the author's premise, this is not a good pattern - a simple JSON get request still comes with a bunch of other stuff that this doesn't bother to address.<p>But the premise is wrong - few applications just spit out JSON like this, they also have to handle updates, and there's business logic involved in that. Data needs to be structured in a way that's reasonable for clients to consume, which isn't always a row of data (maybe it should be collapsing all values in a column, so that in the JSON the client just gets a JSON array of the values).
The more things change, the more they stay the same.<p>If we lived in XHTML 2.0 universe instead of HTML5 universe, then that would be the way you do things. Not with SQL and JSON mind you, but with XML everywhere.<p>I had a glimpse into that world and it looked pretty good at the time. XML databases + XQuery can produce whatever you want as long as it's XML. And XML could be many things. Many horrible things too, made by people with arms growing out of places they shouldn't.
It is pretty easy to get any result set straight from PostgreSQL as a JSON object:<p>  WITH _ AS (
    -- real query goes here
  )
  SELECT COALESCE(array_to_json(array_agg(_.*)), '[]') AS xyz FROM _;<p>From there it behaves as standard JSON, which can be easily loaded into any language.
If you are using Rails and ActiveModelSerializers, I wrote a gem to take your serializer definition and build the whole JSON result in Postgres [0]. It hasn't gotten much use, but I know of at least one company that has used it to get 100x performance boosts on #index endpoints.<p>It generates jsonapi right now, since that is standard for Ember projects, but it'd be pretty easy to add an adapter for some other format.<p>[0] <a href="https://github.com/pjungwir/active_model_serializers_pg" rel="nofollow">https://github.com/pjungwir/active_model_serializers_pg</a>
I made a side project to experiment with this approach (and Rust) a few years ago. It worked pretty well, but it made simple queries more complex[1]. I was a little bit worried that complex queries would become a lot more complex, so I'm not sure how well this approach would scale.<p>It removed a crazy amount of boilerplate code [2] though, and made the project much more focused on the actual data, which is always a good thing.<p>Ever since, I've been very curious to try edgedb [3], since it promises to do similar things without making the queries more complex.<p>[1]<a href="https://github.com/green7ea/newsy/blob/master/src/feed_overview.sql" rel="nofollow">https://github.com/green7ea/newsy/blob/master/src/feed_overv...</a><p>[2]<a href="https://github.com/green7ea/newsy/blob/master/src/main.rs#L44" rel="nofollow">https://github.com/green7ea/newsy/blob/master/src/main.rs#L4...</a><p>[3]<a href="https://www.edgedb.com/" rel="nofollow">https://www.edgedb.com/</a>
Databases are providing greater native functionality, which will reduce our effort in writing boilerplate code.<p>But the underlying SQL gets complex, and it gets harder to maintain high-performance code - or hire good engineers.<p>This has made me interested in EdgeDB [1]. They are abstracting the high-performance SQL bit away from the engineer by introducing a new query language, EdgeQL [2].<p>You can think of EdgeQL and EdgeDB as the next step in abstraction, just like the many programming languages that did the same for machine code and beyond. You may be able to write assembly by hand, but it's tiresome and may not actually lead to high performance. Also, a better UX for the engineer (eg: Kotlin on the JVM) improves time to market and is better for the ecosystem overall.<p>[1] <a href="https://www.edgedb.com/" rel="nofollow">https://www.edgedb.com/</a><p>[2] <a href="https://www.edgedb.com/docs/edgeql/index" rel="nofollow">https://www.edgedb.com/docs/edgeql/index</a>
Shameless plug: I made bson-to-json for more or less the same purpose with MongoDB: converting the raw BSON response from the DB directly to JSON strings. Saves lots of CPU cycles and GC pressure.<p><a href="https://github.com/zbjornson/bson-to-json" rel="nofollow">https://github.com/zbjornson/bson-to-json</a>
The thing that excites me about this is more efficient ORM querying of data. Previously you either had to make multiple queries to fill parent and child relationships (whether on demand or explicitly, by prefilling a cache), or make a larger join where a lot of data gets duplicated in 1:N relationships (a few chained levels of 1:N relationships lead to extremely inefficient data transfer). Now you can just get back a JSON data structure of exactly what each field should be. There are inefficiencies at scale here as well, with duplicated key names, but that could possibly be worked around to a degree with arrays and an array-type index of some sort.
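A minimal sketch of that nesting in PostgreSQL, assuming hypothetical departments(id, name) and employees(id, name, dept_id) tables:<p>  -- One round trip, no duplicated parent columns: each department
  -- row carries its children as a nested JSON array.
  SELECT json_build_object(
           'department', d.name,
           'employees',
           (SELECT COALESCE(json_agg(json_build_object('id', e.id, 'name', e.name)), '[]')
            FROM employees e
            WHERE e.dept_id = d.id)
         ) AS department
  FROM departments d;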
Once you build V8 into Postgres, things get really interesting (thin JavaScript projections, and a bunch of other stuff).<p><a href="https://github.com/plv8/plv8" rel="nofollow">https://github.com/plv8/plv8</a>
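A rough sketch of what that looks like, assuming the plv8 extension is installed and plv8's automatic conversion between returned JavaScript objects and the json type (function and column names are hypothetical):<p>  CREATE EXTENSION IF NOT EXISTS plv8;

  CREATE OR REPLACE FUNCTION employee_json(name text, salary numeric)
  RETURNS json AS $$
    // Plain JavaScript running inside Postgres; plv8 converts the
    // returned object to the json return type.
    return { name: name, senior: salary > 100000 };
  $$ LANGUAGE plv8;

  -- SELECT employee_json(e.name, e.salary) FROM employees e;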
There's a performance question I have often wondered about.<p>Discussions like these encourage me to put more work into the DB layer; however, only recently have SQL databases like PostgreSQL become horizontally scalable (I know read replicas are a thing, and maybe they are the answer).<p>So why would I put more work and effort into the less scalable data storage layer rather than the horizontally scalable, stateless app layer?
I did the approach with nesting jsonb_agg queries; it worked for me, but it was a bit hacky and sometimes complex. I had hoped for Graph Query Language (the one supposed to become part of the SQL standard) to make it easier to retrieve structured JSON from flat tables, but it doesn't look ready yet.
Does it scale well? I feel like doing all business logic in procedures would require a company to use one huge database in order to intersect its various domains.
It's a bad idea, and I'd like to see how people actually test using this model. Now you have to run PG locally every time you need to unit test something? Also, PG returns some JSON, but if the object is not exactly what you want to send back, do you create another object and merge the two? Just don't do it; it's a terrible design. It "works" for simple APIs, but for anything serious this is wrong.
Please just no, do not do this. Even the dumbest, simplest system there is will evolve beyond simple CRUD access to tables. The second you put your schema on the wire, that schema can never change again. That's just such an insane commitment to make.
I'm not impressed. The same is easy in Neo4j's Cypher:<p>  MATCH (e:Employee)
  RETURN e { .* } AS employee;<p>  MATCH (e:Employee)
  RETURN collect(e { .* }) AS employees;<p>  MATCH (e:Employee)-[:IN]->(d:Department)
  WITH d, collect(e { .* }) AS employees
  RETURN d { .*, employees } AS department;