TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Semantic Diff for SQL

228 点作者 s0ck_r4w将近 3 年前

10 条评论

chrisjc将近 3 年前
Very nice&#x2F;interesting.<p>Somewhat related question and apologies if this is already stated in the documentation (it&#x27;s rather dense and I haven&#x27;t had the time to read through it completely)...<p>Can you use sqlglot to create custom DDL dialects that have custom first class objects? For instance, if I want to build a custom SQL&#x2F;DDL&#x2F;DML dialect that had a new kind of object such as a &quot;pipe&quot;, &quot;kitchensink&quot;, etc, would sqlglot be a good tool to use?<p>I&#x27;ve tried playing around with Apache Calcite, but it lost me pretty quickly since the examples to customize&#x2F;extend DDLs were quite lacking in my opinion.
评论 #32280433 未加载
stochtastic将近 3 年前
I&#x27;ve been very impressed with sqlglot, and am looking forward to trying this feature. The only issue I&#x27;ve had with sqlglot is transpiling for use with a specific spark version: in my experience Spark is not great about surfacing obvious &#x27;not registered&#x27; errors when a function isn&#x27;t supported (especially in &gt;=2.4). I ran into this with width_bucket, which is only in the most recent release. I am curious whether there&#x27;s a straightforward way to write with a specific release and catch the error in transpilation rather than execution.<p>Side note: Iaroslav (post author) and Toby (sqlglot creator) are both amazing, and I&#x27;m so glad that they&#x27;re working on open source projects like this.
评论 #32280001 未加载
评论 #32280280 未加载
lichtenberger将近 3 年前
Really awesome work :-)<p>I&#x27;ve implemented the Fast Match &#x2F; Simple Edit script algorithm almost 10 years ago for my Master&#x27;s thesis[1] for my database project[1][2] in order to import revisions of files with a hopefully minimal edit number of edit operations between the stored revision and a new one (back then it was for XML databases).<p>The diffing was only one aspect for the visual analytics approach to compare the revisions (tree structures) visually [4]. Internally the nodes are addressed through dense, ascending 64bit ints stored in a special trie index. Furthermore, during the import optionally changes are tracked as well as a rolling hash is stored for each node optionally. After the import you can query the changes or execute time travel queries easily.<p>Technically, a tree of tries is mapped to an append-only data file using a persistent data structure (in the functional sense), COW with path copying and a novel sliding snapshot algorithm for the leaf data pages itself.<p>I always have the vision to implement different visualizations to compare the revisions in a web frontend, but I&#x27;m currently spending my time on improving the latency of both writes and reads.<p>Thus, if someone would like to help, that would be awesome :-)<p>Kind regards<p>Johannes<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;JohannesLichtenberger&#x2F;master-thesis&#x2F;blob&#x2F;master&#x2F;Master&#x2F;Thesis&#x2F;thesis.pdf" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;JohannesLichtenberger&#x2F;master-thesis&#x2F;blob&#x2F;...</a><p>[2] <a href="https:&#x2F;&#x2F;github.com&#x2F;sirixdb&#x2F;sirix" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;sirixdb&#x2F;sirix</a><p>[3] <a href="https:&#x2F;&#x2F;github.com&#x2F;sirixdb&#x2F;sirix&#x2F;tree&#x2F;master&#x2F;bundles&#x2F;sirix-core&#x2F;src&#x2F;main&#x2F;java&#x2F;org&#x2F;sirix&#x2F;diff&#x2F;algorithm&#x2F;fmse" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;sirixdb&#x2F;sirix&#x2F;tree&#x2F;master&#x2F;bundles&#x2F;sirix-c...</a><p>[4] <a href="https:&#x2F;&#x2F;youtube.com&#x2F;watch?v=l9CXXBkl5vI" rel="nofollow">https:&#x2F;&#x2F;youtube.com&#x2F;watch?v=l9CXXBkl5vI</a>
评论 #32283928 未加载
karmakaze将近 3 年前
I thought this was going to be something else like being able to tell that a rewritten query returns the same set of rows, but with potentially a very different query plan. E.g. dependent EXISTS subquery vs IN subquery.
评论 #32279326 未加载
评论 #32279195 未加载
评论 #32279273 未加载
difflens将近 3 年前
Interesting, will give sqlglot a look when we get to adding SQL support in DiffLens [<a href="https:&#x2F;&#x2F;github.com&#x2F;marketplace&#x2F;difflens" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;marketplace&#x2F;difflens</a>]. Or perhaps DiffLens can just use sqlglot :) Either way we&#x27;re very happy to see another semantic diff tool.<p>P.S: We work on DiffLens. It currently supports TS, JS, CSS and text diffs. We&#x27;re working on making a VS Code extension currently
noisy_boy将近 3 年前
I wonder if this is a topical thread to check if anyone is aware of a Java based solution to parse a CREATE VIEW statement to get a mapping between the view columns and the corresponding source table columns. I checked out jsqlparser[0] and it does produce an AST which can be parsed using the visitor-pattern[1] but was wondering if there is a more &quot;out-of-the-box&quot; solution involving less work. Due to various reasons, querying the database information schema is not an option I can pursue.<p>[0]: <a href="https:&#x2F;&#x2F;github.com&#x2F;JSQLParser&#x2F;JSqlParser" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;JSQLParser&#x2F;JSqlParser</a><p>[1]: <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Visitor_pattern" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Visitor_pattern</a>
评论 #32285359 未加载
评论 #32288093 未加载
zasdffaa将近 3 年前
&gt; (a + b) =&gt; (b + a)<p>&gt; Semantically the query hasn’t changed<p>Now hang on a minute. Extend that to 3 and it can be (mssql but true in any I guess):<p><pre><code> declare @hi int = 2147483647; declare @lo int = -2147483648; declare @x int = @hi + @lo + @hi; -- ok declare @y int = @hi + @hi + @lo; -- &#x27;Arithmetic overflow error&#x27; </code></pre> Worse yet with floats. I see what you&#x27;re saying and good stuff, I&#x27;m thinking about this myself and I appreciate this article and will read it properly, but the edge cases have to be acknowledged.<p>Edit: this kind of thing is apparently something compiler writers keep rediscovering the hard way.
评论 #32304116 未加载
AeroNotix将近 3 年前
What about difftastic?
评论 #32278983 未加载
sk1pper将近 3 年前
Nitpick:<p>&gt; when a nested query is refactored into a common table expression (CTE), this kind of change doesn’t have any functional impact on either a query or its outcome<p>This isn’t quite true, at least in Postgres. It won’t affect the outcome, but it can affect the query plan.
评论 #32286184 未加载
评论 #32287501 未加载
tessierashpool将近 3 年前
this is very cool, but I believe this bit of the README is incorrect:<p><i>Text-based diff tools such as git diff, when applied to a code base, have certain limitations. First, they can only detect insertions and deletions, not movements or updates of individual pieces of code. </i><p>git diff can detect movements. looking at my .gitconfig, I think it&#x27;s the &quot;frag = magenta&quot; line.
评论 #32285754 未加载
评论 #32282546 未加载