I recently had to write SQL query generation for AWS Athena, which is based on Presto 0.217.<p>It turns out that the dialect doesn't support LATERAL joins with a LIMIT in them. The query below only works if you remove the LIMIT clause.<p><a href="https://i.stack.imgur.com/rdB1s.png" rel="nofollow">https://i.stack.imgur.com/rdB1s.png</a><p>This makes saying things like "Fetch all artists where ..., for each artist fetch their first 3 albums where ..., and for each album fetch the top 10 tracks where ..." really difficult.<p>Out of curiosity, does Trino support this?
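For the "first 3 albums per artist" case above, one possible workaround is to rank rows with a window function instead of using LATERAL with LIMIT. Below is a minimal sketch of a query generator for one level of that nesting. The function name, table name, and column names (`albums`, `artist_id`, `release_date`, and so on) are hypothetical, and the ROW_NUMBER() rewrite is my assumption about what would work here, not something verified against Athena.

```python
def top_n_per_group_sql(table, group_col, order_col, n, columns):
    """Build a query returning the first `n` rows per `group_col` value.

    Ranks rows with ROW_NUMBER() in a subquery instead of LATERAL ... LIMIT.
    Identifiers are interpolated as-is, so pass only trusted names.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    select_list = ", ".join(columns)
    return (
        f"SELECT {select_list} FROM ("
        f"SELECT {select_list}, "
        f"ROW_NUMBER() OVER (PARTITION BY {group_col} ORDER BY {order_col}) AS rn "
        f"FROM {table}"
        f") ranked WHERE rn <= {int(n)}"
    )


# Hypothetical schema: first 3 albums per artist, ordered by release date.
query = top_n_per_group_sql(
    "albums", "artist_id", "release_date", 3,
    ["artist_id", "album_id", "title"],
)
print(query)
```

The same helper could be applied again for "top 10 tracks per album", with the outer query joining the ranked subqueries together. Any per-level WHERE filters would need to go inside the subquery, which this sketch leaves out.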
One of my favorite OSS projects! Probably the most flexible and fully featured distributed SQL query engine around. Congrats and looking forward to the next decade!
Big shout-out to Brian Olsen from the Trino community (and Starburst) for helping the community be successful.<p>- <a href="https://github.com/bitsondatadev" rel="nofollow">https://github.com/bitsondatadev</a><p>- <a href="https://www.linkedin.com/in/bitsondatadev/" rel="nofollow">https://www.linkedin.com/in/bitsondatadev/</a><p>I recommend the Trino Slack for people not already in it: <a href="https://trino.io/slack.html" rel="nofollow">https://trino.io/slack.html</a>
btw, if you want to know the backstory on why Presto is now called Trino, here's the article:<p><a href="https://trino.io/blog/2022/08/02/leaving-facebook-meta-best-for-trino.html" rel="nofollow">https://trino.io/blog/2022/08/02/leaving-facebook-meta-best-...</a>
One feature I'd like to see in Trino or Presto is workload management that can better handle different types of queries and allocate resources accordingly. This becomes important as more applications adopt Trino or Presto as a distributed SQL database/platform: the impact of different queries or workloads on one another can be mitigated, and dedicated resources (CPU, memory, etc.) can be allocated to high-priority workloads. I'm really wondering if/when such capabilities may be provided.<p>BTW, purely out of curiosity, I compared Trino with Presto from an OSS point of view (<a href="https://ossinsight.io/analyze/prestodb/presto?vs=trinodb%2Ftrino" rel="nofollow">https://ossinsight.io/analyze/prestodb/presto?vs=trinodb%2Ft...</a>). Both communities are still popular, but Trino seems more active than Presto now. I also wonder whether the two communities might reunite someday to really boost their impact (compared to the Spark community).
Also, just to note: I am currently working on a refresh of Trino: The Definitive Guide, and would love to see you all at Trino Summit in November.<p><a href="https://trino.io/blog/2022/06/30/trino-summit-call-for-speakers.html" rel="nofollow">https://trino.io/blog/2022/06/30/trino-summit-call-for-speak...</a>
We are building a SaaS BI tool.<p>To let users connect to their databases, we have a form that collects the database credentials from the user and saves them securely. When the user writes or runs an SQL query, we establish a database connection right away (from our server), execute the query, return the results, and keep the connection alive for about 15 minutes.<p>But with a serverless architecture, the first query could go to instance 1, so instance 1 will establish a DB connection, then the second query could go to instance 2, so instance 2 will establish another one. You could end up with a lot of unnecessary connections.<p>If you use AWS RDS (for yourself) alongside Lambda, for example, AWS has RDS Proxy to solve this problem.<p>So I was thinking about using Trino like RDS Proxy, but for more databases, and for our customers' databases, not ours. Is that doable with Trino?
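To make the connection problem above concrete, here is a rough sketch of a per-instance connection cache with an idle timeout. Everything in it is hypothetical (`ConnectionCache`, `fake_connect`, the `"customer-db"` key), and treating the 15 minutes as an idle timeout is an assumption. It only illustrates why two serverless instances each end up opening their own connection to the same customer database, and says nothing about how Trino or RDS Proxy work internally.

```python
import time


class ConnectionCache:
    """Per-instance cache: reuse a connection until it has been idle for `ttl` seconds."""

    def __init__(self, connect, ttl=15 * 60, clock=time.monotonic):
        self._connect = connect  # hypothetical factory: credentials key -> connection
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (connection, last_used)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            conn = entry[0]  # still fresh: reuse it
        else:
            if entry is not None:
                entry[0].close()  # idle too long: drop the stale connection
            conn = self._connect(key)
        self._entries[key] = (conn, now)
        return conn


class FakeConnection:
    """Stand-in for a real database connection."""

    def __init__(self, key):
        self.key = key
        self.closed = False

    def close(self):
        self.closed = True


opened = []


def fake_connect(key):
    conn = FakeConnection(key)
    opened.append(conn)
    return conn


# Two serverless instances, each with its own in-memory cache.
instance_1 = ConnectionCache(fake_connect)
instance_2 = ConnectionCache(fake_connect)

instance_1.get("customer-db")  # opens a connection
instance_1.get("customer-db")  # reuses it
instance_2.get("customer-db")  # cannot see instance 1's cache, opens another
print(len(opened))
```

Because each instance only sees its own cache, the connection count grows with the number of instances, which is the gap a shared proxy in front of the databases is meant to close.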
I do the support for my department's Trino cluster. We move ~1 TB (and growing) in ETL jobs and support interactive queries for the data scientists/analysts.<p>It would be super good if you guys added BigQuery write support. It's really annoying to have to run a Hive cluster in Google to act as a proxy for this.
The thing I wonder about with Presto, and to a lesser extent Spark, is how many of their users adopted these tools because they were an easy migration path from Hive, and how many of those users will eventually re-platform to something else.
Can anyone clarify the differences between Trino and SparkSQL? Our company has used SparkSQL to aggressively replace use-cases that were based on PrestoSQL in the past.