TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

SQLGlot: SQL parser, transpiler, optimizer – translate to Presto, Spark, Hive

120 pointsby blopeuralmost 3 years ago

6 comments

flakinessalmost 3 years ago
The parser.py [1] has only 1.6k lines. And it is hand-written parser. This size is amazing if it&#x27;s really capable, but I intuitively doubt it. For example, duckdb&#x27;s select.y [2] has 3700 lines, and this is only for SELECT. ZetaSQL&#x27;s grammar file [3] is almost 10k lines.<p>The SQL is a monstrous language. Is there any trick that keeps the code simple?<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;tobymao&#x2F;sqlglot&#x2F;blob&#x2F;main&#x2F;sqlglot&#x2F;parser.py" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;tobymao&#x2F;sqlglot&#x2F;blob&#x2F;main&#x2F;sqlglot&#x2F;parser....</a> [2] <a href="https:&#x2F;&#x2F;github.com&#x2F;duckdb&#x2F;duckdb&#x2F;blob&#x2F;master&#x2F;third_party&#x2F;libpg_query&#x2F;grammar&#x2F;statements&#x2F;select.y" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;duckdb&#x2F;duckdb&#x2F;blob&#x2F;master&#x2F;third_party&#x2F;lib...</a> [3] <a href="https:&#x2F;&#x2F;github.com&#x2F;google&#x2F;zetasql&#x2F;blob&#x2F;master&#x2F;zetasql&#x2F;parser&#x2F;bison_parser.y" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;google&#x2F;zetasql&#x2F;blob&#x2F;master&#x2F;zetasql&#x2F;parser...</a>
评论 #31988849 未加载
评论 #31985504 未加载
评论 #31986016 未加载
评论 #31985360 未加载
captaintobsalmost 3 years ago
Author here, feel free to ask me any questions!<p>Something that I&#x27;m working on is a pure python SQL engine <a href="https:&#x2F;&#x2F;github.com&#x2F;tobymao&#x2F;sqlglot&#x2F;blob&#x2F;main&#x2F;sqlglot&#x2F;executor&#x2F;python.py" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;tobymao&#x2F;sqlglot&#x2F;blob&#x2F;main&#x2F;sqlglot&#x2F;executo...</a>. It does the whole shebang, parsing, optimizations, logical planning, physical execution.
评论 #31983868 未加载
评论 #31985382 未加载
评论 #31986471 未加载
评论 #31985973 未加载
评论 #31984144 未加载
RobinLalmost 3 years ago
SQLGlot is great. We&#x27;ve used it to extend our FOSS probabilistic data linking library[1] so that it is now capable of executing against a variety of SQL backends (Spark, Presto, DuckDB, Sqlite), significantly widening our potential user base.<p>We implement the core statistical model in SQL, and then use SQLGlot to transpile to the target execution engine. One big motivation was to futureproof our work - we&#x27;re no longer tied down to Spark, and so when the &#x27;next big thing&#x27; (GPU accelerated SQL for analytics?) comes along, it should be relatively straightforward to support it by writing another adaptor.<p>Working on this has highlighted some of the really tricky problems associated with translating between SQL engines, and we haven&#x27;t hit any major problems, so kudos to the author!<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;moj-analytical-services&#x2F;splink&#x2F;tree&#x2F;splink3" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;moj-analytical-services&#x2F;splink&#x2F;tree&#x2F;splin...</a>
评论 #31986447 未加载
eatonphilalmost 3 years ago
Neat! I did an exploration of sql parsers in different languages [0] and couldn&#x27;t find much for python. But between this project itself and the couple it lists in the benchmarks I have a few more to look at.<p>[0] <a href="https:&#x2F;&#x2F;datastation.multiprocess.io&#x2F;blog&#x2F;2022-04-11-sql-parsers.html" rel="nofollow">https:&#x2F;&#x2F;datastation.multiprocess.io&#x2F;blog&#x2F;2022-04-11-sql-pars...</a>
评论 #31983430 未加载
评论 #31984223 未加载
评论 #31986397 未加载
Pandabobalmost 3 years ago
Could this be used in VSCode as plugin to autoformat&#x2F;lint my .sql files?
xiaodaialmost 3 years ago
nice one. do you feel that having it in pure python leaves some performance on the table? or is performance not so critical in this use case?
评论 #31984118 未加载