Streaming data to clients at the edge (apps or services) is a hard problem. We built an approach that keeps Postgres at the center and lets clients consume a stream of data that is already in Postgres, without any additional moving parts.<p>This works with read-replica-style scaling or with Postgres flavours that scale out easily (e.g. Cosmos Postgres, Yugabyte & Cockroach coming soon).
I think there’s a huge unacknowledged gap in the database industry for good streaming products.<p>If we are building a report or dashboard that we pull up a few times a day, then a pull-based model where we query the database on page load is fine.<p>For almost anything else, such as an app, a microservice, an alerting system, a web page, or a dashboard, we want to be able to update it in near real time for the sake of the user experience. Receiving a stream of query results is by far the easiest way to do this.<p>Polling is obviously a poor interim solution.<p>I think streaming will be a huge story in data over the next decade. Products are coming through now, which is a start.
While I love the ingenuity of the developers at Hasura (because I've personally been through these scaling challenges), I always get a gag reaction with GraphQL. I've honestly tried hard to digest it, and I can tell you that at large scale, where a single DB won't cut it, you would either need to develop a large federation layer like Netflix (<a href="https://netflixtechblog.com/how-netflix-scales-its-api-with-graphql-federation-part-1-ae3557c187e2" rel="nofollow">https://netflixtechblog.com/how-netflix-scales-its-api-with-...</a>), or just rip it out. Streaming might alleviate the problem a little, but the real problem still lurks under the hood. The fact that the front-end community wants to fit everything under GraphQL bothers me, because every backend developer knows that a single tool or technology is rarely the best fit for every problem. Remember the golden words, THERE IS NO SILVER BULLET!
I've been using Hasura in production for small commercial projects deployed on AWS, and I've been positively impressed by the stability and the speed-up. It makes spinning up a GraphQL backend with row-level security straightforward.
FD: I work at Hasura.<p>Seeing some feedback on GraphQL - Hasura has had support for converting templated GraphQL into RESTish endpoints (with OpenAPI Spec docs if needed). We are planning to do the same for this streaming API as well - does anyone have good examples of existing REST/RESTish endpoints that do something similar?
We have written a post[1] on building a real-time chat app with Streaming Subscriptions on Postgres. It gives a quick overview of the architecture used and how you can leverage the API on the client side with AuthZ. There’s a live demo that you all can try out.[2]<p>[1] <a href="https://hasura.io/blog/building-real-time-chat-apps-with-graphql-streaming-subscriptions/" rel="nofollow">https://hasura.io/blog/building-real-time-chat-apps-with-gra...</a>
[2] <a href="https://eclectic-dragon-25a38c.netlify.app/" rel="nofollow">https://eclectic-dragon-25a38c.netlify.app/</a><p>Would love to see more use cases coming out of this :)
Has anyone come across neat tools for load-testing streaming APIs?<p>We used <a href="https://github.com/hasura/graphql-bench" rel="nofollow">https://github.com/hasura/graphql-bench</a> and a set of scripts to monitor the runtime characteristics of Hasura and Postgres, plus a reconciliation step to make sure data was received as expected and in order.<p>But would love to see if there are other tools that folks have come across!
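For anyone wondering what a reconciliation step like that might look like, here is a minimal Python sketch. It is not the actual Hasura script; `Report` and `reconcile` are hypothetical names, and it assumes every streamed row carries a sequence number that only increases at the source.

```python
from typing import Iterable, List, NamedTuple


class Report(NamedTuple):
    missing: List[int]       # sent but never received
    duplicates: List[int]    # received more than once
    out_of_order: List[int]  # received after a higher sequence number


def reconcile(sent: Iterable[int], received: Iterable[int]) -> Report:
    """Compare sequence numbers written to the source with those a client saw.

    Assumption: sequence numbers are unique and increase at the source.
    """
    seen = set()
    duplicates: List[int] = []
    out_of_order: List[int] = []
    highest = None

    for seq in received:
        if seq in seen:
            duplicates.append(seq)
            continue
        seen.add(seq)
        if highest is not None and seq < highest:
            # Arrived after a later row, so ordering was violated.
            out_of_order.append(seq)
        else:
            highest = seq

    missing = [seq for seq in sent if seq not in seen]
    return Report(missing, duplicates, out_of_order)
```

Running this once per client after a load test gives a quick pass/fail signal: an empty report means that client saw every row exactly once and in order.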
It sounds like the stream must be an append-only table. This is awkward - I would expect to stream updated results for a query. If I want clients to refetch changed records in real time, do I still need to build that on top of this stream primitive? Like, I stream an audit-log-style table, and then refetch any IDs mentioned in the stream separately?
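To make the question concrete, the "stream an audit log, then refetch" pattern could be sketched like this in Python. Everything here is a stand-in: `stream_audit_log` and `apply_changes` are hypothetical helpers, the in-memory list and dict take the place of the Postgres tables, and nothing in it is Hasura's API.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical audit-log row: (log_id, record_id), where log_id only increases.
AuditRow = Tuple[int, int]


def stream_audit_log(
    log: List[AuditRow], cursor: int, batch_size: int
) -> Iterator[List[AuditRow]]:
    """Yield batches of audit rows with log_id > cursor, in log_id order."""
    while True:
        batch = sorted(row for row in log if row[0] > cursor)[:batch_size]
        if not batch:
            return
        cursor = batch[-1][0]  # advance the cursor past the last row seen
        yield batch


def apply_changes(
    cache: Dict[int, str],
    records: Dict[int, str],
    batches: Iterable[List[AuditRow]],
) -> None:
    """Refetch every record mentioned in the stream and update a local cache."""
    for batch in batches:
        changed_ids = {record_id for _, record_id in batch}
        for record_id in changed_ids:
            if record_id in records:
                # Stand-in for a separate "fetch by ID" query.
                cache[record_id] = records[record_id]
            else:
                # The record no longer exists, so drop it locally.
                cache.pop(record_id, None)
```

The extra round trip per batch is the cost of this approach: the stream only says which IDs changed, and the current state comes from the separate fetch.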