The SQL query engine Trino (formerly PrestoSQL) recaps a decade of innovation

96 points by simpligility almost 3 years ago

16 comments

gavinray almost 3 years ago

I recently had to write SQL query generation for AWS Athena, which is based on Presto 0.217.

It turns out that the dialect doesn't support LATERAL joins with a LIMIT in them. The query below only works if you remove the LIMIT clause.

https://i.stack.imgur.com/rdB1s.png

This makes saying things like "Fetch all artists where ..., for each artist fetch their first 3 albums where ..., and for each album fetch the top 10 tracks where ..." really difficult.

Does Trino support this, out of curiosity?
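
For readers generating this kind of "first N rows per parent" query, one widely used alternative to LATERAL with LIMIT is to rank rows with the standard ROW_NUMBER() window function and filter on the rank. The sketch below is only an illustration of that swapped-in technique, not the commenter's query and not something tested against Athena; the function name and the table and column names (albums, artist_id, release_date, album_id, title) are hypothetical.

```python
from typing import Sequence


def top_n_per_group_sql(
    table: str,
    group_col: str,
    order_col: str,
    n: int,
    columns: Sequence[str],
) -> str:
    """Build a top-N-per-group query using ROW_NUMBER() instead of LATERAL + LIMIT.

    Identifiers are interpolated as-is, so they are assumed to be trusted
    (hypothetical) names, not user input.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    select_list = ", ".join(columns)
    return (
        f"SELECT {select_list}\n"
        f"FROM (\n"
        f"  SELECT {select_list},\n"
        f"         ROW_NUMBER() OVER (PARTITION BY {group_col} ORDER BY {order_col}) AS rn\n"
        f"  FROM {table}\n"
        f") AS ranked\n"
        f"WHERE rn <= {n}"
    )


if __name__ == "__main__":
    # Hypothetical schema: the first 3 albums per artist by release date.
    print(top_n_per_group_sql(
        "albums", "artist_id", "release_date", 3,
        ["artist_id", "album_id", "title"],
    ))
```

Nesting the same pattern one level deeper (tracks ranked within albums) would cover the "top 10 tracks per album" part of the example, at the cost of ranking every row before filtering.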

tekkertje almost 3 years ago

One of my favorite OSS projects! Probably the most flexible and fully featured distributed SQL query engine around. Congrats and looking forward to the next decade!

skadamat almost 3 years ago

Big shout out to Brian Olsen from the Trino community (and Starburst) for helping the Trino community be successful:

- https://github.com/bitsondatadev
- https://www.linkedin.com/in/bitsondatadev/

I recommend the Trino Slack for people not already in it: https://trino.io/slack.html

bitsondatadev almost 3 years ago

btw, if you want to know the backstory on why Presto is now called Trino, here's the article: https://trino.io/blog/2022/08/02/leaving-facebook-meta-best-for-trino.html

jerryjerryjerry almost 3 years ago

One of the features I'm interested in (or would like to have) from Trino or Presto is workload management that can better handle different types of queries and allocate resources accordingly. This becomes important as more applications adopt Trino or Presto as a distributed SQL database/platform: the impact of different queries or workloads on each other can be mitigated, and dedicated resources (CPU, memory, etc.) can be allocated to high-priority workloads. I'm really wondering if/when such capabilities may be provided.

BTW, purely out of curiosity, I compared Trino with Presto from an OSS point of view (https://ossinsight.io/analyze/prestodb/presto?vs=trinodb%2Ftrino). Both communities are still popular, but Trino seems more active than Presto now. I also wonder whether the two communities may reunite someday to really boost their impact (compared to the Spark community).

simpligility almost 3 years ago

Also just to note: I am currently working on a refresh of Trino: The Definitive Guide, and would love to see you all at Trino Summit in November. https://trino.io/blog/2022/06/30/trino-summit-call-for-speakers.html

mrwnmonm almost 3 years ago

We are building a SaaS BI tool.

To enable users to connect to their databases, we have a form that collects the database credentials from the user and saves them securely. When the user writes or runs an SQL query, we establish a database connection right away (from our server), execute the query, return the results, and keep the connection alive for about 15 minutes.

But with a serverless architecture, the first query could go to instance 1, so instance 1 will establish a DB connection; then the second query could go to instance 2, so instance 2 will establish another one. You could end up with a lot of unnecessary connections.

If you use AWS RDS (for yourself) alongside Lambda, for example, AWS has RDS Proxy to solve this problem.

So I was thinking about using Trino like RDS Proxy, but for more databases, and for our customers' databases, not ours. Is that doable with Trino?
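
The per-instance behaviour described above (open a connection on first use, reuse it, drop it after roughly 15 idle minutes) can be sketched as a small cache. This is a minimal illustration under stated assumptions, not the commenter's implementation and not how Trino or RDS Proxy work: the class name, the `connect` factory, and the injectable `clock` are all hypothetical, and the sketch ignores thread safety and credentials handling.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class IdleConnectionCache:
    """Reuse one connection per key until it has been idle longer than ttl_seconds."""

    def __init__(
        self,
        connect: Callable[[Hashable], Any],
        ttl_seconds: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._connect = connect      # hypothetical factory: key -> connection
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the idle timeout is testable
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            conn, last_used = entry
            if now - last_used <= self._ttl:
                # Still warm: refresh the idle timer and reuse the connection.
                self._entries[key] = (conn, now)
                return conn
            # Idle too long: close it if the object supports close().
            close = getattr(conn, "close", None)
            if callable(close):
                close()
        conn = self._connect(key)
        self._entries[key] = (conn, now)
        return conn
```

Because each serverless instance would hold its own cache of this kind, two instances serving the same customer each open their own connection, which is the duplication the comment is asking a shared proxy to avoid.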

dmead almost 3 years ago

I do the support for my department's Trino cluster. We move ~1 TB (and growing) in ETL jobs and support interactive queries for the data scientists/analysts.

It would be super good if you guys added BigQuery write support. It's really annoying to have to run a Hive cluster in Google to act as a proxy for this.

georgewfraser almost 3 years ago

The thing I wonder about with Presto and to a lesser extent Spark is, how many of their users adopted this tool because it was an easy migration path from Hive, and how many of those users will eventually re-platform to something else?

simpligility almost 3 years ago

Also working on a new edition for Trino: The Definitive Guide at the moment.

QuotedAtoms almost 3 years ago

Can anyone clarify the differences between Trino and SparkSQL? Our company has used SparkSQL to aggressively replace use cases that were based on PrestoSQL in the past.

simpligility almost 3 years ago

It's great to see how far the project has come, from its humble beginnings to the current rich open source ecosystem and community.

ck_one almost 3 years ago

Can Trino be used as a Snowflake replacement? How is the query speed compared to Snowflake?

tzury almost 3 years ago

Trino vs ClickHouse, can anyone tell from experience how those two compare?

kache_ almost 3 years ago

trino is awesome

can't believe this shit is free as in freedom

mrwnmonm almost 3 years ago

Can you use Trino as a database proxy?
