科技回声

10 comments

chrisjc, almost 3 years ago

Very nice/interesting.

Somewhat related question and apologies if this is already stated in the documentation (it's rather dense and I haven't had the time to read through it completely)...

Can you use sqlglot to create custom DDL dialects that have custom first class objects? For instance, if I want to build a custom SQL/DDL/DML dialect that had a new kind of object such as a "pipe", "kitchensink", etc, would sqlglot be a good tool to use?

I've tried playing around with Apache Calcite, but it lost me pretty quickly since the examples to customize/extend DDLs were quite lacking in my opinion.

stochtastic, almost 3 years ago

I've been very impressed with sqlglot, and am looking forward to trying this feature. The only issue I've had with sqlglot is transpiling for use with a specific spark version: in my experience Spark is not great about surfacing obvious 'not registered' errors when a function isn't supported (especially in >=2.4). I ran into this with width_bucket, which is only in the most recent release. I am curious whether there's a straightforward way to write with a specific release and catch the error in transpilation rather than execution.

Side note: Iaroslav (post author) and Toby (sqlglot creator) are both amazing, and I'm so glad that they're working on open source projects like this.

lichtenberger, almost 3 years ago

Really awesome work :-)

I implemented the Fast Match / Simple Edit Script algorithm almost 10 years ago for my Master's thesis[1] for my database project[1][2], in order to import revisions of files with a hopefully minimal number of edit operations between the stored revision and a new one (back then it was for XML databases).

The diffing was only one aspect of the visual analytics approach to compare the revisions (tree structures) visually [4]. Internally, the nodes are addressed through dense, ascending 64-bit ints stored in a special trie index. Furthermore, during the import, changes are optionally tracked and a rolling hash is optionally stored for each node. After the import you can query the changes or execute time travel queries easily.

Technically, a tree of tries is mapped to an append-only data file using a persistent data structure (in the functional sense), COW with path copying and a novel sliding snapshot algorithm for the leaf data pages themselves.

I've always had the vision of implementing different visualizations to compare the revisions in a web frontend, but I'm currently spending my time on improving the latency of both writes and reads.

Thus, if someone would like to help, that would be awesome :-)

Kind regards
Johannes

[1] https://github.com/JohannesLichtenberger/master-thesis/blob/master/Master/Thesis/thesis.pdf
[2] https://github.com/sirixdb/sirix
[3] https://github.com/sirixdb/sirix/tree/master/bundles/sirix-core/src/main/java/org/sirix/diff/algorithm/fmse
[4] https://youtube.com/watch?v=l9CXXBkl5vI

karmakaze, almost 3 years ago

I thought this was going to be something else like being able to tell that a rewritten query returns the same set of rows, but with potentially a very different query plan. E.g. dependent EXISTS subquery vs IN subquery.
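
As an illustration of the EXISTS-versus-IN case mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables and their rows are made up for the example, and comparing results on one small dataset is only a spot check, not a proof that the two queries are equivalent in general. The plans SQLite reports may or may not differ, so the sketch fetches them without asserting anything about them.

```python
import sqlite3

# Hypothetical schema and data, invented purely for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
    """
)

# Dependent (correlated) EXISTS subquery.
exists_sql = """
    SELECT c.id, c.name FROM customers AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
    ORDER BY c.id
"""

# The same filter written as an IN subquery.
in_sql = """
    SELECT c.id, c.name FROM customers AS c
    WHERE c.id IN (SELECT o.customer_id FROM orders AS o)
    ORDER BY c.id
"""

exists_rows = conn.execute(exists_sql).fetchall()
in_rows = conn.execute(in_sql).fetchall()

# Same rows on this dataset; the plans are fetched only for inspection.
exists_plan = conn.execute("EXPLAIN QUERY PLAN " + exists_sql).fetchall()
in_plan = conn.execute("EXPLAIN QUERY PLAN " + in_sql).fetchall()

print(exists_rows == in_rows)
for label, plan in (("EXISTS", exists_plan), ("IN", in_plan)):
    print(label, [step[-1] for step in plan])
```

A structural diff would report these two queries as different trees, while a row-equivalence check like the one karmakaze describes would treat them as the same query.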

difflens, almost 3 years ago

Interesting, will give sqlglot a look when we get to adding SQL support in DiffLens [https://github.com/marketplace/difflens]. Or perhaps DiffLens can just use sqlglot :) Either way we're very happy to see another semantic diff tool.

P.S.: We work on DiffLens. It currently supports TS, JS, CSS and text diffs. We're currently working on a VS Code extension.

noisy_boy, almost 3 years ago

I wonder if this is a topical thread to check if anyone is aware of a Java based solution to parse a CREATE VIEW statement to get a mapping between the view columns and the corresponding source table columns. I checked out jsqlparser[0] and it does produce an AST which can be parsed using the visitor-pattern[1] but was wondering if there is a more "out-of-the-box" solution involving less work. Due to various reasons, querying the database information schema is not an option I can pursue.

[0]: https://github.com/JSQLParser/JSqlParser
[1]: https://en.wikipedia.org/wiki/Visitor_pattern

zasdffaa, almost 3 years ago

> (a + b) => (b + a)

> Semantically the query hasn’t changed

Now hang on a minute. Extend that to 3 and it can be (mssql but true in any I guess):

    declare @hi int = 2147483647;
    declare @lo int = -2147483648;
    declare @x int = @hi + @lo + @hi; -- ok
    declare @y int = @hi + @hi + @lo; -- 'Arithmetic overflow error'

Worse yet with floats. I see what you're saying and good stuff, I'm thinking about this myself and I appreciate this article and will read it properly, but the edge cases have to be acknowledged.

Edit: this kind of thing is apparently something compiler writers keep rediscovering the hard way.
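
The same effect can be reproduced outside a database. The sketch below is only an illustration in Python, not how any SQL engine is implemented: `checked_add` and `sum_left_to_right` are hypothetical helpers that mimic a 32-bit integer type that raises on overflow, evaluated left to right. It also shows the float case, where regrouping the same three terms changes the result.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def checked_add(a: int, b: int) -> int:
    """Add two ints, raising OverflowError if the sum leaves the int32 range."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a 32-bit int")
    return result


def sum_left_to_right(*terms: int) -> int:
    """Fold the terms with checked_add, strictly left to right."""
    total = terms[0]
    for term in terms[1:]:
        total = checked_add(total, term)
    return total


hi, lo = INT32_MAX, INT32_MIN

# (hi + lo) + hi: the intermediate value is -1, so nothing overflows.
x = sum_left_to_right(hi, lo, hi)

# (hi + hi) + lo: the first intermediate value already exceeds INT32_MAX.
try:
    y = sum_left_to_right(hi, hi, lo)
except OverflowError:
    y = None  # mirrors the 'Arithmetic overflow error' in the SQL above

# Floats do not raise, but regrouping the same terms changes the result.
grouped_left = (0.1 + 0.2) + 0.3
grouped_right = 0.1 + (0.2 + 0.3)

print(x, y, grouped_left == grouped_right)
```

The point of the sketch is that both orderings have the same mathematical sum, yet only one of them survives a type with a bounded range, which is why reordering operands is not always a purely cosmetic change.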

AeroNotix, almost 3 years ago

What about difftastic?

sk1pper, almost 3 years ago

Nitpick:

> when a nested query is refactored into a common table expression (CTE), this kind of change doesn’t have any functional impact on either a query or its outcome

This isn’t quite true, at least in Postgres. It won’t affect the outcome, but it can affect the query plan.

tessierashpool, almost 3 years ago

this is very cool, but I believe this bit of the README is incorrect:

> Text-based diff tools such as git diff, when applied to a code base, have certain limitations. First, they can only detect insertions and deletions, not movements or updates of individual pieces of code.

git diff can detect movements. looking at my .gitconfig, I think it's the "frag = magenta" line.

Semantic Diff for SQL
