Against SQL

499 points by deafcalculus, almost 4 years ago

58 comments

slx26, almost 4 years ago

I think the problem with this essay is that it's overly technical: only those versed well enough in SQL will really care to read the whole thing, and if they are already at that level, either they have accepted that "SQL will get the job done in the end", or they have learned to live with it and now even kinda embrace it, and are happy to write about how the examples are very poor and dismiss the critique based on that, when the essay actually explains its main point pretty well:

> The core message [...] is that there is potentially a huge amount of value to be unlocked by replacing SQL

To me, a lot of people defend SQL by saying that "perfect is the enemy of good" and that SQL simply works. It's nobody's favourite, but everyone kinda accepts it.

And yeah, it's true. People use SQL because it's good enough, and trying to reinvent the wheel would take more work (individually speaking) than just dealing with SQL as it is right now. For large organizations where the effort could be justified, all your engineers already know SQL anyway, so the payoff isn't so great either.

But for something as relevant as relational databases, perfect is not the enemy of good. We do deserve better. We generally agree that SQL has many pitfalls and that it's not great for any kind of user (for non-technical users, a visual programming language would work well here, more like what Airtable does, bridging the gap between spreadsheet and hardcore database; for technical users, it feels unwieldy and quirky). We should be more open to at least considering critiques and proposals for something better. We might find out that people, from time to time, are making some good points.

danbruc, almost 4 years ago

I wonder how many of the limitations are necessary for the query optimizer to have any chance of finding a good execution plan. As you add more and more abstractions and more and more general computations in the middle of your queries, it will probably become harder and harder for the query optimizer to understand what you are actually trying to do and figure out how to do it efficiently. Are you not running the risk that the database will have to blindly scan through entire tables, calling your user-defined functions as black boxes on each row?

I would also guess that we could have a better SQL, but I do not think it could or should look anything like a general-purpose programming language, because otherwise you might get in the way of efficient query execution. Maybe some kind of declarative data flow description with sufficient facilities to name and reuse bits and pieces.

And maybe you actually want two languages, which SQL with its procedural parts already kind of has: one limited but well-optimizable language to actually access the data, and one more general language to perform additional processing without having to leave the server. Maybe the real problem of SQL lies mostly in its procedural part and how it interfaces and interacts with the query part.

bob1029, almost 4 years ago

I feel like most frustrations with SQL boil down to fighting against a shitty schema.

When you are sitting in a properly normalized database, it is a lot easier to write joins and views such that you can compose higher order queries on top.

If you are doing any sort of self-joins or other recursive/case madness, the SQL itself is typically not the problem. Whoever sat down with the business experts on day 1 in that conference room probably got the relational model wrong and they are ultimately to blame for your suffering.

If you have an opportunity to start over on a schema, don't try to do it in the database the first few times. Build it in Excel and kick it around with the stakeholders for a few weeks. Once 100% of the participants are comfortable and understand why things are structured (related) the way they are, you can then proceed with the prototype implementation.

Achieving 3NF or better is usually a fundamental requirement for ensuring any meaningfully-complex schema doesn't go off the rails over time.

Only after you get it correct (facts/types/relations) should you even think about what performance issues might arise from what you just modeled. Premature optimization is how you end up screwing yourself really badly 99% of the time. Model it correctly, then consider an optimization pass if performance cannot be reconciled with basic indexing or application-level batching/caching.

NavinF, almost 4 years ago

> The usual response to complaints about the lack of union types in sql is that you should use an id column that joins against multiple tables, one for each possible type.

> create table json_value(id integer);
> create table json_bool(id integer, value bool);
> create table json_number(id integer, value double);

No, the usual response is "Don't do that!"

99% of the time you either know the data types (so each JSON object becomes a row in a table where the column names are keys) or you don't know the data types and store the whole object as a BLOB.

I'd be on board with adding unions to SQL, but I doubt I'd use that feature very often.
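A minimal sketch of the two cases described above, using Python's built-in `sqlite3` module. The table names `users` and `raw_events` and the sample objects are made up for illustration; they do not come from the article or the comment.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Known shape: each JSON key becomes a typed column.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, active INTEGER NOT NULL)"
)
# Unknown shape: keep the whole object as one opaque payload.
conn.execute("CREATE TABLE raw_events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")

known = {"id": 1, "name": "ada", "active": 1}
conn.execute("INSERT INTO users (id, name, active) VALUES (:id, :name, :active)", known)

unknown = {"kind": "click", "meta": {"x": 3, "y": [1, 2]}}
conn.execute("INSERT INTO raw_events (payload) VALUES (?)", (json.dumps(unknown),))

# The typed row can be queried column by column...
user_row = conn.execute("SELECT name, active FROM users WHERE id = 1").fetchone()
# ...while the opaque payload has to be decoded by the application.
event = json.loads(conn.execute("SELECT payload FROM raw_events").fetchone()[0])
```

The trade-off is the usual one: the typed table can be filtered and indexed per column, while the payload column accepts anything but leaves interpretation to the caller.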

progre, almost 4 years ago

I do love SQL, and at least where I live (MS SQL Server) it can be made to run amazingly fast if you take some care with your queries and indexes. It's not portable though: as far as I know not a single one of the big SQL vendors follows the standards 100% and, more importantly, spending some time with one vendor will give you some habits that are sure to not work as well with another (cursor constructs are generally a death blow to performance in T-SQL, but they are the way to do it on Oracle, for example). So I kind of agree with the author here.

But I also feel that maybe they are asking a bit much from SQL. The complaint that complex subqueries are complex... Well then don't use them? I would use WITH constructs in that situation because I find them easier to read, but that's beside the point. I think it's perfectly fine to pull out multiple result sets from simple queries and then do the complex stuff in your host language.
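To illustrate the point about WITH constructs, here is a small sketch that runs the same query twice against an in-memory SQLite database: once with nested subqueries and once with a common table expression. The `orders` table and its data are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('ann', 10), ('ann', 30), ('bob', 5), ('cy', 50);
""")

# Nested form: the per-customer totals subquery has to be written out twice.
nested = """
    SELECT customer, total
    FROM (SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer)
    WHERE total > (SELECT AVG(total)
                   FROM (SELECT SUM(amount) AS total FROM orders GROUP BY customer))
    ORDER BY customer
"""

# WITH form: the subquery is named once and reused.
with_cte = """
    WITH totals AS (
        SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > (SELECT AVG(total) FROM totals)
    ORDER BY customer
"""

nested_rows = conn.execute(nested).fetchall()
cte_rows = conn.execute(with_cte).fetchall()
```

Both queries return the customers whose total is above the average total; the CTE version only differs in how the intermediate result is named.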

james_woods, almost 4 years ago

SQL was not made for programmers alone. It was also invented for not-so-technical people, so verbosity and overhead are part of the deal.

> When Ray and I were designing Sequel in 1974, we thought that the predominant use of the language would be for ad-hoc queries by planners and other professionals whose domain of expertise was not primarily database management. We wanted the language to be simple enough that ordinary people could "walk up and use it" with a minimum of training.

<a href="https://ieeexplore.ieee.org/document/6359709" rel="nofollow">https://ieeexplore.ieee.org/document/6359709</a>

erezsh, almost 4 years ago

I share the author's point of view, which led me to start a new relational programming language that compiles to SQL. It's a way to build on existing databases, like Postgres or MySQL, with all of their advantages, but improve on many of SQL's limitations.

If that sounds interesting, you can find it here: <a href="https://github.com/erezsh/Preql" rel="nofollow">https://github.com/erezsh/Preql</a>

PudgePacket, almost 4 years ago

This is a great article and you can tell the author has deep experience with SQL from the way they speak and the other projects they're involved in.

I think many of the comments here are missing the point by saying "Oh you can get around that issue in that example snippet by doing X Y Z". Sure, there are workarounds for everything if you know the One Weird Trick with these 10 gotchas that I won't tell you about... but that just makes the author's point.

We can do better. We deserve better.

What could things look like if you could radically alter the SQL language, replace it altogether, or even move layers from databases or applications into each other?

Who knows if it will be better or worse, but I'd like to find out.

latte, almost 4 years ago

SQL and the relational model mostly work well, and it's probably not practical to redo the enormous amount of work that was invested in SQL and its implementations and extensions.

As someone who frequently used SQL for analytics and less frequently for app development, I would gladly use a language that would transparently translate to SQL while adding some syntactic niceties, like CoffeeScript did to JS:

- Join / subquery / CTE shortcuts for common use cases (e.g. for the FK lookups that are mentioned in the article)
- More flexible treatment of whitespace and syntax (e.g. allow trailing commas, allow reordering of clauses etc.)

And for the language to be usable, it would probably need:

- First-class support for some extended SQL syntax commonly used in practice (e.g. Postgres's additions)
- Integration with console tools (e.g. psql), common libraries (e.g. pandas, psycopg2) and schema introspection tools
- Editor support / syntax highlighting

It would probably be good to model the syntax of that language on some DSL-friendly general-purpose language (like Scala, Kotlin or Ruby).

js4ever, almost 4 years ago

After a little more than two decades of coding, SQL is nearly the only thing that has been constant in my career.

It's a skill I have used every working day, and I'm pretty sure I will still use it in 20 years.

On the other hand, it's very unlikely that the ORM 'du jour' will exist 3 years from now.

smitty1e, almost 4 years ago

> First, while SQL allows user-defined types, it doesn't have any concept of a union type.

Isn't a union type essentially a de-normalized field?

This seems like attacking arithmetic operators for their lousy character string support.

Weren't XML databases (briefly) a (marketing) thing some decades back?

One idea might be to have everyone integrate jq [1] into their database engines. My understanding is that one can make the JSON do back flips with jq. Then we can move to complaining about queries that appear to have been written in Klingon instead of boring ol' SQueaL.

[1] <a href="https://stedolan.github.io/jq/manual/" rel="nofollow">https://stedolan.github.io/jq/manual/</a>

SPBS, almost 4 years ago

SQL is a pretty warty implementation of relational databases (with non-composability being its primary sin IMO), but we're stuck with it at this point. A new querying DSL that fixes all of SQL's flaws is only half the story; getting enough programmers on the planet to buy into it is the other half. To do that you'd need this new piece of software to be at least as fast and as battle-tested as existing SQL databases. Even the new generation of massively scalable relational databases sticks with some form of SQL instead of inventing a new DSL, because of the sheer momentum behind this sorry syntax.

sizzler, almost 4 years ago

Anybody can criticise SQL, programming languages, etc. It isn't hard and it doesn't make you better than the people that wrote them. When someone says "this thing that has been working fine for decades needs to be completely replaced" and barely mentions any alternative, I don't think they understand the process involved in replacing things or the terrible (non) proposition they are offering.

Increment on SQL, write a translation layer, and see if people adopt it. Maybe 10 years from now your idea will be more popular than standard SQL. Most likely your idea sucks though, and you will stay in the easy land of criticising things.

The front-end is infinitely more complex than SQL on the backend. I write fairly common web applications and the SQL part is maybe 10% of my time, and very easy. React is where I spend most of my time. I don't have any problem that really needs to be solved. SQL works for me even though it isn't perfect. Any imperfections can most likely be incrementally fixed. I use tagged templates in JavaScript to deal with parameters, composability, and reusability.

The fact that the author highlights GraphQL as supposedly the great alternative shows how ridiculous the proposition is. GraphQL does basically nothing. It is 10% of the functionality of SQL.

pjungwir, almost 4 years ago

It would be really cool if databases had an Option<T> type. Then you could remove all the NULLs. Although you can mark a column as NOT NULL, that restriction doesn't "travel": it isn't present for function inputs/outputs, subquery results, etc. Adding it to the type system gives you a lot more mileage. And then joins could be option-aware: an inner join would have outputs matching the input types, but an outer join would have Option outputs (for at least one side).

I'm curious how much work has been done on optimizers for Tutorial D or other D variants. It looks way nicer to use, but I wonder if it is easier to stumble into pathological cases.
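A small sketch of the "NOT NULL doesn't travel" observation, run against an in-memory SQLite database. The `authors` and `books` tables are hypothetical. Every column is declared NOT NULL, yet the outer join still produces a NULL (Python `None`), which is exactly the value an option-aware join would type as `Option`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL,
        title TEXT NOT NULL
    );
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO books VALUES (1, 1, 'SQL Notes');
""")

# Inner join: output columns keep the non-null guarantee of their inputs.
inner_rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors a JOIN books b ON b.author_id = a.id
    ORDER BY a.id
""").fetchall()

# Outer join: title was declared NOT NULL, but it is NULL for authors without books.
outer_rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors a LEFT JOIN books b ON b.author_id = a.id
    ORDER BY a.id
""").fetchall()
```

In an application language with option types, the second result would naturally be typed as pairs of a string and an optional string, which is the distinction the schema itself cannot express.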

tome, almost 4 years ago

The only way out that I can see is to design embedded domain specific languages (EDSLs) that inherit the expressiveness, composability and type safety from the host language. That's what Opaleye [1] and Rel8 [2] (Postgres EDSLs for Haskell) do. Haskell is particularly good for this. The query language can be just a monad, and therefore users can carry all of their knowledge of monadic programming over to writing database queries.

This approach doesn't resolve all of the author's complaints but it does solve many.

Disclaimer: I'm the author of Opaleye. Rel8 is built on Opaleye. Other relational query EDSLs are available.

[1] <a href="https://github.com/tomjaguarpaw/haskell-opaleye/" rel="nofollow">https://github.com/tomjaguarpaw/haskell-opaleye/</a>
[2] <a href="https://github.com/circuithub/rel8/" rel="nofollow">https://github.com/circuithub/rel8/</a>

kthejoker2, almost 4 years ago

As someone from the analytics side who's been working with SQL for 30 years (First Choice, remember that?) (but who also wrote a fair share of ORM boilerplate), I find these debates fascinating... but also kind of trivial, in the sense that SQL has a lot of other pros and cons that app devs rarely consider.

Truly it is blind men evaluating an elephant.

Given SQL's roots as a human-friendly declarative interface, the only thing I see completely replacing it in the near future is a Copilot-style neural implant where you just think of the results you want.

ngrilly, almost 4 years ago

A more pragmatic view can be found in this article: <a href="https://blog.nelhage.com/post/some-opinionated-sql-takes/" rel="nofollow">https://blog.nelhage.com/post/some-opinionated-sql-takes/</a>

rawoke083600, almost 4 years ago

I've been thinking about this problem a lot: CRUDs, GraphQL, ORMs, models etc., mostly in the "CRUD-like" environment. I have been thinking about a "client-side SQL implementation".

In most CRUD apps we currently have layers and layers of software on the backend, with ORMs, frameworks etc., and it all boils down to "writing/generating the correct (good-enough) SQL".

We have now added stuff like GraphQL, which if you squint hard enough (ok, very hard) can be seen as a SQL alternative (a language to get the actual data).

Maybe SQL + "GraphQL-like" layers should "evolve" into ONE common "data scripting language"?

Maybe we have something like "ClientSide-SQL", which can be a subset of ServerSide-SQL?

We need the "TypeScript" of "data-querying", which can be run on the server, client, moon and my device, and where one only has to define any "Types" ONCE.

Anywhoo - I think there is still a lot to be done, researched and discovered in this section of CS :)

twodave, almost 4 years ago

We have a general rule on our team that complex SQL is a code smell. In our project complex queries are usually an indication of a poor design.

Anything SQL that can be made simpler via dynamic generation (which is safe as long as you use proper parameters for user inputs) is favored over creating logical branches in queries. Anything that can be processed further quickly in memory in the app (mapping operations, string ops, ordering/filtering predictably small data sets, etc.) we tend to offload from SQL into something more suitable.

And we tend to solve a class of problem in our data layer and reuse those generalized patterns heavily. This makes our codebase predictable even when dealing with unfamiliar subject matter.

Of course there are always places where some complex query is necessary (especially when building reports), but if it’s the status quo then you’re doing something wrong; it’s only a matter of time until you end up with a performance nightmare on your hands.
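As a rough sketch of what "dynamic generation with proper parameters" can look like (not the commenter's actual code), the helper below builds a WHERE clause only from the filters supplied. The function `build_search`, the `orders` table and the filter names are all hypothetical. User values only ever travel as bound parameters; the SQL text itself is assembled from a fixed whitelist.

```python
import sqlite3


def build_search(filters: dict) -> tuple:
    """Return (sql, params) with one WHERE condition per supplied filter."""
    # Whitelist of filter names mapped to fixed SQL fragments with placeholders.
    allowed = {"status": "status = ?", "min_total": "total >= ?"}
    clauses, params = [], []
    for key, value in filters.items():
        if key not in allowed:
            raise ValueError(f"unsupported filter: {key}")
        clauses.append(allowed[key])
        params.append(value)  # value is bound later, never spliced into the SQL
    sql = "SELECT id FROM orders"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY id", params


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total INTEGER);
    INSERT INTO orders VALUES (1, 'open', 10), (2, 'open', 99), (3, 'closed', 50);
""")

sql, params = build_search({"status": "open", "min_total": 50})
ids = [row[0] for row in conn.execute(sql, params)]
```

The alternative this replaces is a single static query full of branches such as `(? IS NULL OR status = ?)`, which tends to be harder to read and harder for the planner to optimize.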

thinkr42, almost 4 years ago

Though verbose and somewhat strange at times, one thing I love about SQL is that the query statements read like a set definition from set theory. That declarative nature is pretty powerful IMO. Sure, there are hiccups, but it is a different way of thinking.

benjiweber, almost 4 years ago

> By far the most common case for joins is following foreign keys. SQL has no special syntax for this

You can use NATURAL JOIN:

    select * from foo natural join bar

Works as long as the keys are named the same. However, a lot of people have a habit of naming keys differently in the two tables.
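A quick sketch of the naming caveat, using SQLite. The table and column names are invented for the example. NATURAL JOIN matches on every column name the two tables share, so it follows the foreign key only when the key columns have the same name and nothing else collides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Case 1: the key is called bar_id in both tables, so it is the only shared column.
conn.executescript("""
    CREATE TABLE bar (bar_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE foo (foo_id INTEGER PRIMARY KEY, bar_id INTEGER REFERENCES bar(bar_id));
    INSERT INTO bar VALUES (1, 'one'), (2, 'two');
    INSERT INTO foo VALUES (10, 1), (11, 2), (12, 2);
""")
matching_names = conn.execute(
    "SELECT foo_id, label FROM foo NATURAL JOIN bar ORDER BY foo_id"
).fetchall()

# Case 2: both tables call their primary key "id", so NATURAL JOIN compares
# foo2.id with bar2.id and ignores the actual foreign key column bar2_id.
conn.executescript("""
    CREATE TABLE bar2 (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE foo2 (id INTEGER PRIMARY KEY, bar2_id INTEGER REFERENCES bar2(id));
    INSERT INTO bar2 VALUES (1, 'one'), (2, 'two');
    INSERT INTO foo2 VALUES (10, 1), (11, 2), (12, 2);
""")
colliding_names = conn.execute("SELECT label FROM foo2 NATURAL JOIN bar2").fetchall()
```

With the common `id` / `<table>_id` convention, the second case is the one most schemas fall into, which is probably why NATURAL JOIN sees limited use for foreign keys.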

croes, almost 4 years ago

> what if we want to return the salary too?

> the only solution is to change half of the lines in the query

How about adding a second subquery for the salary?

Izkata, almost 4 years ago

The GROUP BY section is odd:

> You can use as to name scalar values anywhere they appear. Except in a group by.

<pre><code>-- can't name this value
> select x2 from foo group by x+1 as x2;
ERROR:  syntax error at or near "as"
LINE 1: select x2 from foo group by x+1 as x2;

-- sprinkle some more select on it
> select x2 from (select x+1 as x2 from foo) group by x2;
 ?column?
----------
(0 rows)
</code></pre>

Looking at that first one I'm just kinda like "well duh, there's nothing special there" - it doesn't work with ORDER BY either; you use AS to rename columns (on SELECT) or tables (on FROM and JOIN).

And then it goes on to show ways to work around that:

> Rather than fix this bizarre oversight, the SQL spec allows a novel form of variable naming - you can refer to a column by using an expression which produces the same parse tree as the one that produced the column.

Instead of just... using the renamed column?

<pre><code>select x+1 as x2 from foo group by x2;</code></pre>
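For what it's worth, both behaviours can be checked quickly in SQLite through Python. This is only a sketch of how one engine behaves; the article's output is from a different database, and engines are known to differ on whether GROUP BY may refer to a SELECT alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (x INTEGER);
    INSERT INTO foo VALUES (1), (1), (2);
""")

# Naming the expression in SELECT and grouping by that name is accepted here.
grouped = conn.execute(
    "SELECT x+1 AS x2, COUNT(*) FROM foo GROUP BY x2 ORDER BY x2"
).fetchall()

# Trying to introduce the name inside GROUP BY itself is a syntax error.
try:
    conn.execute("SELECT x2 FROM foo GROUP BY x+1 AS x2")
    alias_in_group_by = "accepted"
except sqlite3.OperationalError as exc:
    alias_in_group_by = f"rejected: {exc}"
```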

nojvek, almost 4 years ago

I agree with the author. SQL is not a great query language. For almost every decently sized app I have written, I have needed some sort of query compiler so I don’t have to deal with the nuances.

I also agree that GraphQL is a pretty fantastic language for working with graphs, and that relational databases are essentially graphs. Hasura is neat.

fbn79, almost 4 years ago

I admit I have not read the article, but from my personal experience I think developers' hostility toward SQL comes from a lack of fundamental training and experience in declarative programming, and from constant, everyday immersion in imperative programming.

thayne, almost 4 years ago

> So instead the best we can do is add json to the SQL spec and hope that all the databases implement it in a compatible way (they don't).

Of course they are incompatible. That's just par for the course when it comes to SQL.

chris_wot, almost 4 years ago

> This isn't just a matter of some constant programmer overhead, like SQL queries taking 20% longer to write.

20% longer to write than what alternative? And how is this being measured?

And... am I missing something?

> By far the most common case for joins is following foreign keys. SQL has no special syntax for this:

<pre><code>select foo.id, quux.value
from foo, bar, quux
where foo.bar_id = bar.id and bar.quux_id = quux.id
</code></pre>

Why can't this be expressed as an INNER JOIN?

And can't some of these subqueries be written using a WHERE EXISTS or a windowing function?
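To make the INNER JOIN question concrete, here is a sketch that runs the quoted comma-join and an explicit INNER JOIN rewrite against the same SQLite data and compares the results. The schema and rows are assumptions built around the column names in the quoted query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quux (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, quux_id INTEGER REFERENCES quux(id));
    CREATE TABLE foo (id INTEGER PRIMARY KEY, bar_id INTEGER REFERENCES bar(id));
    INSERT INTO quux VALUES (1, 'a'), (2, 'b');
    INSERT INTO bar VALUES (10, 1), (20, 2);
    INSERT INTO foo VALUES (100, 10), (101, 20), (102, NULL);
""")

# The form quoted from the article: comma-separated tables plus WHERE conditions.
implicit = """
    select foo.id, quux.value
    from foo, bar, quux
    where foo.bar_id = bar.id and bar.quux_id = quux.id
    order by foo.id
"""

# The same query with the join conditions moved into explicit INNER JOINs.
explicit = """
    select foo.id, quux.value
    from foo
    inner join bar on foo.bar_id = bar.id
    inner join quux on bar.quux_id = quux.id
    order by foo.id
"""

implicit_rows = conn.execute(implicit).fetchall()
explicit_rows = conn.execute(explicit).fetchall()
```

Either way, each hop still has to spell out both key columns by hand; the rewrite changes where the conditions live, not how much has to be written.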

_the_inflator, almost 4 years ago

I feel the pain.

As someone who only uses SQL a couple of times a year, I feel that SQL shares the same fate as everything in IT: invented almost 50 years ago, not with today's world in mind, it has been blown up somewhat. Reminds me a bit of JavaScript: everything that can be done in JavaScript, will be done in JavaScript.

Just as C was followed by C++, Java and others, there will be new DSLs and techniques on top of SQL.

The article has its merits. Better abstractions for different use cases.

jackbravo, almost 4 years ago

This article about the story of SQL's biggest rival, QUEL, was posted here on Hacker News and is pretty related: <a href="https://www.holistics.io/blog/quel-vs-sql/" rel="nofollow">https://www.holistics.io/blog/quel-vs-sql/</a>

It is not that we didn't try to replace it, but just as other comments have said, SQL was good enough, and already has the biggest mind share.

asavinov, almost 4 years ago

One alternative to SQL (as a type of thinking) is Column-SQL [1], which is based on a new data model. This model relies on two equal constructs: sets (tables) and functions (columns). It is opposed to relational algebra, which is based only on sets and set operations. One benefit of Column-SQL is that it does not use joins and group-by for connectivity and aggregation, respectively, which are known to be quite difficult to understand and error-prone in use. Instead, many typical data processing patterns are implemented by defining new columns: link columns instead of join, and aggregate columns instead of group-by.

More details about "Why functions and column-orientation" (as opposed to sets) can be found in [2]. In short, the problems with set-orientation and SQL arise because producing sets is frequently not what we need: we need new columns, not new tables. Hence applying set operations is a kind of workaround for the absence of column operations.

This approach is implemented in the Prosto data processing toolkit [0], and Column-SQL [1] is a syntactic way to define its operations.

[0] <a href="https://github.com/asavinov/prosto" rel="nofollow">https://github.com/asavinov/prosto</a> Prosto is a data processing toolkit - an alternative to map-reduce and join-groupby
[1] <a href="https://prosto.readthedocs.io/en/latest/text/column-sql.html" rel="nofollow">https://prosto.readthedocs.io/en/latest/text/column-sql.html</a> Column-SQL (work in progress)
[2] <a href="https://prosto.readthedocs.io/en/latest/text/why.html" rel="nofollow">https://prosto.readthedocs.io/en/latest/text/why.html</a> Why functions and column-orientation?

mlinksva, almost 4 years ago

Inspiring article (I love SQL, but it's also frustrating). My only wish would be to see the criticisms used as a checklist to evaluate SQL improvements or new query languages.

EdgeQL, indirectly linked at the end of the article, looks at a glance like it might score well. EdgeDB's blog post [1] criticizing SQL and introducing EdgeQL seems to cover the same concepts (inexpressive, incompressible, non-porous) with slightly differing language in some cases (e.g. system cohesion for porousness).

Noticed after posting this comment that there's a post today about EdgeQL. [2]

[1] <a href="https://www.edgedb.com/blog/we-can-do-better-than-sql" rel="nofollow">https://www.edgedb.com/blog/we-can-do-better-than-sql</a>
[2] <a href="https://news.ycombinator.com/item?id=27793398" rel="nofollow">https://news.ycombinator.com/item?id=27793398</a>

historyloop, almost 4 years ago

SQL isn't immutable, it's always evolving. I find it awkward that some of the arguments in the article read like "you couldn't do that before CTEs were added". But they WERE added, so?

If you want to fix SQL, contribute to the next version of the standard, or provide an example by implementing what you want to see out there.

jmull, almost 4 years ago

So what?

Complaining about SQL is the easy part. Actually, it's the first skill most new SQL developers truly master.

I'm waiting for the viable alternative. There are a lot (a LOT) of solutions that handle some cases, but inevitably you need to get into the SQL anyway because that's the DBMS' native API (and now you also need to fight your way through the abstraction, oh and since there are a LOT of solutions a different one is used every chance someone gets, so you need to relearn how to fight through the abstraction all the time).

I doubt it's going to change. There's actually no significant reason. SQL (actually, the set of mutually incompatible SQL variants) is thoroughly entrenched and a small problem... that is, it's rarely the dominant reason a project/product succeeds or fails, or takes too long, or becomes unmaintainable, etc.

jandrewrogers, almost 4 years ago

Another subtle issue with SQL is that it tacitly assumes a great deal about the internal architecture of the database engine implementing it. SQL is designed to be easy to implement for the way SQL databases worked in the 1990s. Unfortunately, modern high-end databases have radically different internal architectures, are capable of much greater internal expressivity as a minimum, and are designed to support data models as first-class citizens that weren't even on the radar in the 1990s. Patching the first-class capabilities of modern database kernels into the SQL language, such as generalized recursion, can often be awkward or require non-standard syntax or behaviors that defeat easy optimization. The DDL has similar issues, particularly around its concept of what an "index" can look like under the hood or the myriad ways in which data can be organized.

I've used and even written SQL databases for much of my career. SQL is pretty satisfactory for what it was designed to do. I view SQL like classic inheritance-based OOP; it works well for the problem domains for which it was originally designed, but is poor for efficiently expressing problem domains that are better expressed in a composition-based or functional way. Yet it worked so well in its original domain that we try to apply it everywhere. The diversity of data models and the kinds of operations we want to do with them today is far greater than was considered when SQL crystallized into its current form.

The limitation of most nominal SQL replacements I've seen is that they commit the same sin SQL originally did: overfitting for the problem domain that the designer was most interested in. There is an appetite for a really good SQL replacement if done well, and in principle anything SQL can do could be directly translated into a new language for compatibility.

ComodoHacker, almost 4 years ago

> By far the most common case for joins is following foreign keys. SQL has no special syntax for this

That's because there can be more than one FK relationship between the same two tables. For example, if we model a binary tree, there could be references to left, right and parent nodes.
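The binary tree case can be sketched in SQLite as follows. The `node` table and its column names are made up for illustration. Because two foreign keys point from `node` to `node`, "join along the foreign key" is ambiguous, and each join has to say which column it follows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        label TEXT NOT NULL,
        left_id INTEGER REFERENCES node(id),   -- first FK to the same table
        right_id INTEGER REFERENCES node(id)   -- second FK to the same table
    );
    INSERT INTO node VALUES (2, 'L', NULL, NULL), (3, 'R', NULL, NULL), (1, 'root', 2, 3);
""")

# Each join names the specific foreign key column it follows.
children = conn.execute("""
    SELECT p.label, l.label, r.label
    FROM node AS p
    LEFT JOIN node AS l ON l.id = p.left_id
    LEFT JOIN node AS r ON r.id = p.right_id
    WHERE p.id = 1
""").fetchone()
```

Any shorthand for foreign key joins would therefore need a way to pick a particular constraint or column, not just a pair of tables.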

KingOfCoders, almost 4 years ago

We're currently moving in a different direction, removing Spark code to move most of the stuff into BigQuery SQL (which can use structs, one of the points of the article), because it's easier for Data (Engineers|Analysts|Scientists) to write SQL than e.g. Scala.

LeonB, almost 4 years ago

I would like a TypeScript-style transpiler toolchain for testing out new language features that are seamlessly transpiled down to existing SQL.

Once that’s in place I don’t know which features I’d want first… but there are a lot of them!

mcv, almost 4 years ago

Having worked a lot with Neo4j, a graph DB, over the past two years, I must say I'm surprised how rigid and inexpressive SQL is by comparison. We started our project with a SQL database, but some queries would be 10 or more lines with multiple joins. Very hard to read. Once we switched to Neo4j, the same query was a single, easily readable line.

SQL is very well-established, but it's also old, and it shows its age. It's kinda weird how easily we jump from one programming language to another, and yet we can't seem to move on from our main relational query language.

cm2187, almost 4 years ago

One thing I don't understand in SQL is why creating a tmp table is so verbose, and why we can't use type inference.

There is an internal software where I work where, to create a tmp table, you just assign the result of the query to a variable. It is so much nicer. So for instance creating a tmp table becomes as simple as the below, with no need to declare each column, to do an insert, or to drop the table in the end:

<pre><code>@t = select colA, colB from tbl
select top 10 * from @t order by colB</code></pre>
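Some engines get reasonably close to this with a "create table as select" form, where column names come from the query. Below is a sketch in SQLite; the table `tbl` mirrors the snippet above and the data is invented. T-SQL's `SELECT ... INTO #t` and PostgreSQL's `CREATE TEMP TABLE ... AS` are similar in spirit, though details vary by engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (colA TEXT, colB INTEGER, colC REAL);
    INSERT INTO tbl VALUES ('a', 3, 0.5), ('b', 1, 1.5), ('c', 2, 2.5);

    -- No column declarations: the temp table takes its columns from the query.
    CREATE TEMP TABLE t AS SELECT colA, colB FROM tbl;
""")

cursor = conn.execute("SELECT * FROM t ORDER BY colB LIMIT 10")
columns = [description[0] for description in cursor.description]
top_rows = cursor.fetchall()
```

The temp table disappears when the connection closes, so there is no explicit drop either.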

AtNightWeCode, almost 4 years ago

The workarounds aren't that bad. The N + 1 problem is usually not a big issue with ORMs, but one should check the generated code, I think. I've seen far worse code. (Well, if you don't do SELECT *...)

I have other issues with SQL:

- Resource needs grow linearly with the amount of data, but there is no built-in way to handle it.
- Integer ids are way overused, basically locking every database to a specific environment.
- The index tweaking.
- The workarounds for write speed.
- The fact that you can do anything in SQL, and people know it.

lenkite, almost 4 years ago

Been coding for over a decade and written thousands of simple and complex queries and I have always thought SQL sucked but was too afraid to ever express that opinion since everyone else believes it is the best thing since sliced bread. Quite relieved that some experts feel the same way.

zug_zug, almost 4 years ago

I guess I don't get it. It uses a bunch of big-sounding technical terms ("inexpressive", "non-porous") to criticize SQL, but when I actually read it, this seems to be mostly minuscule details that could be added trivially to an SQL engine if there was demand. For example, joining natively on a foreign key seems like a trivial convenience. I'm not sure it proves any larger point to me; many people prefer code that is more verbose and clear about what it does over code that is magical/implicit.

Another example complaint hidden behind an ominous-sounding word boils down to "Using a table expression inside a scalar expression is generally not possible, unless the table expression returns only 1 column and either a) the table expression is guaranteed to return at most 1 row or b) your usage fits into one of the hard-coded patterns such as exists."

Uh, great, I've never needed to do that in my career, so if you care so much make a PR, but suggesting that SQL itself is somehow the problem is laughable. It would be orders of magnitude more effort to try to standardize the industry on a new query language than to patch table expressions. I can scarcely imagine what a productivity loss it would be to the industry if SQL standardization were dropped; it would be much worse than the Python 2/3 debacle.

Also "incompressible" - sounds like the author doesn't use views/materialized views.

Finally, the "fragile" example is just the author writing a bad query. The example here is performant and less fragile: <a href="https://stackoverflow.com/questions/612231/how-can-i-select-rows-with-maxcolumn-value-partition-by-another-column-in-mys" rel="nofollow">https://stackoverflow.com/questions/612231/how-can-i-select-...</a>

etc.

bvrmn, almost 4 years ago

> fk_join(foo, 'bar_id', bar, 'quux_id', quux)

This example has the same number of semantic entities as the SQL version. Also, there is USING. And why does the author need strict modeling of JSON when one can model in native types? It's a very strange article.

Crash0v3rid3, almost 4 years ago

I’m always asked how I am so good at SQL. I laugh, given I know how crappy my SQL skills are. It’s really that I just know our schema so well I can formulate a decent enough query to extract what I need.

Knowing your schema design is just as important as knowing SQL.

JoelJacobson, almost 4 years ago

I suggest using the fact that foreign keys are constraints with unique names, and using these names to explicitly specify which column(s) to join on between the two tables.

In PostgreSQL [2], foreign key constraint names only need to be unique per table, which allows using the foreign table's name "as is" as the constraint name, which makes for nice short names. In other databases, the names will just need to be a little longer.

Given this schema:

<pre><code>CREATE TABLE baz (
    id integer NOT NULL,
    PRIMARY KEY (id)
);
CREATE TABLE bar (
    id integer NOT NULL,
    baz_id integer,
    PRIMARY KEY (id),
    CONSTRAINT baz FOREIGN KEY (baz_id) REFERENCES baz
);
CREATE TABLE foo (
    id integer NOT NULL,
    bar_id integer,
    PRIMARY KEY (id),
    CONSTRAINT bar FOREIGN KEY (bar_id) REFERENCES bar
);
</code></pre>

We could write a normal SQL query like this:

<pre><code>SELECT bar.id AS bar_id, baz.id AS baz_id
FROM foo
JOIN bar ON bar.id = foo.bar_id
LEFT JOIN baz ON baz.id = bar.baz_id
WHERE foo.id = 123
</code></pre>

I suggest adding a new binary operator, allowed anywhere a table name is expected, taking the table alias to join from as the left operand and the name of the foreign key constraint to follow as the right operand.

Perhaps "->" could be used for this purpose, since it's currently not used by the SQL spec in the FROM clause.

This would allow rewriting the above query into this:

<pre><code>SELECT bar.id AS bar_id, baz.id AS baz_id
FROM foo
JOIN foo->bar
LEFT JOIN bar->baz
WHERE foo.id = 123
</code></pre>

Where e.g. "foo->bar" means:

<pre><code>follow the foreign key constraint named "bar" on the table/alias "foo"
</code></pre>

If the same join type is desired for multiple joins, another idea is to allow chaining the operator:

<pre><code>SELECT bar.id AS bar_id, baz.id AS baz_id
FROM foo
LEFT JOIN foo->bar->baz
WHERE foo.id = 123
</code></pre>

Which would cause both joins to be left joins.

[1] https://scattered-thoughts.net/writing/against-sql/
[2] https://www.postgresql.org/

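For readers who want to try the idea, here is a minimal sketch of how a preprocessor might expand the proposed `foo->bar` into an ordinary join condition. It is not part of the proposal: the helper name `fk_join_condition` is hypothetical, it runs against SQLite rather than PostgreSQL, and because SQLite's `PRAGMA foreign_key_list` does not report constraint names, it assumes the convention above that the constraint is named after the referenced table and looks the key up by that table.

```python
import sqlite3

# Schema from the comment above, with the referenced column written out
# explicitly so that SQLite reports it in PRAGMA foreign_key_list.
SCHEMA = """
CREATE TABLE baz (id integer NOT NULL, PRIMARY KEY (id));
CREATE TABLE bar (
    id integer NOT NULL, baz_id integer, PRIMARY KEY (id),
    CONSTRAINT baz FOREIGN KEY (baz_id) REFERENCES baz (id)
);
CREATE TABLE foo (
    id integer NOT NULL, bar_id integer, PRIMARY KEY (id),
    CONSTRAINT bar FOREIGN KEY (bar_id) REFERENCES bar (id)
);
INSERT INTO bar (id, baz_id) VALUES (7, NULL);
INSERT INTO foo (id, bar_id) VALUES (123, 7);
"""


def fk_join_condition(conn, source, target):
    """Return the ON condition that a hypothetical `source->target` stands for.

    Assumption: the constraint is named after the referenced table, so the
    foreign key is found by referenced table rather than by constraint name.
    Table names are trusted identifiers in this sketch.
    """
    # Each row is (id, seq, table, from, to, on_update, on_delete, match).
    rows = conn.execute(f"PRAGMA foreign_key_list({source})").fetchall()
    matching = [r for r in rows if r[2] == target]
    # Require exactly one foreign key to the target; otherwise it is ambiguous.
    if len({r[0] for r in matching}) != 1:
        raise LookupError(f"expected exactly one foreign key {source}->{target}")
    return " AND ".join(f"{target}.{r[4]} = {source}.{r[3]}" for r in matching)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

on_bar = fk_join_condition(conn, "foo", "bar")
on_baz = fk_join_condition(conn, "bar", "baz")
query = (
    "SELECT bar.id AS bar_id, baz.id AS baz_id FROM foo "
    f"JOIN bar ON {on_bar} LEFT JOIN baz ON {on_baz} WHERE foo.id = 123"
)
rows = conn.execute(query).fetchall()
print(query)
print(rows)
```

A table with two foreign keys to the same target is rejected as ambiguous here, which is exactly the case where naming the constraint, as the proposal does, would be needed.
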
gumby, almost 4 years ago

SQL is a COBOL-era language; though there are 15 years between them, language theory was quite rudimentary at that time.

But it exists and is adequate. And, as Gabriel’s famous essay says, Worse is Better.

tritiy, almost 4 years ago

Was this written by a nnet? I found it so hard to read, as if the author had written it in another language and then used some weird translation engine.

cletus, almost 4 years ago

The complexity of the SQL spec is a fair point. The point about inconsistencies between implementations has some merit but in practice doesn't really matter (e.g. how often do you really replace your database?).

A lot of the rest of it reads like the author started with this conclusion and then went looking for justification.

Example: the author states it's hard to return more than one column with a correlated subquery. That's what WITH clauses or joins against subqueries are for. The author later mentions WITH statements, so is aware of them.

As for JSON, I honestly don't think anybody needs that. Either return a JSON blob (generally a bad idea IMHO) or construct it in code.

The example of join verbosity has issues too. First, an abbreviated syntax would need to express what kind of join to do (e.g. inner vs outer). Second, I find this fairly natural:

<pre><code>SELECT ...
FROM a
JOIN b ON a.id = b.a_id
LEFT OUTER JOIN c ON b.id = c.b_id
</code></pre>

The author instead used this syntax:

<pre><code>SELECT ...
FROM a, b, c
WHERE a.id = b.a_id
AND b.id = c.b_id
</code></pre>

This also leaves the join type unexpressed. In some SQL dialects you say:

<pre><code>AND b.id = c.b_id (+)
</code></pre>

But that's kind of ugly and old-fashioned. The first syntax is preferable and clear.

On "compressibility", SQL has this. They're called views. GraphQL has a notion called fragments that SQL doesn't. This is one of those things that sounds like a good idea but probably isn't. It makes queries much harder to read, and I've seen this reach the point where a fragment is so widely used that changing it is expensive (e.g. generated code) and removing anything is impossible. Plus a lot of users end up querying things they don't need.

Poor optimization and error messages for WITH clauses aren't really an argument against SQL. They're an argument against particular implementations. Extracting an anonymous query into a WITH clause should be a no-op for performance in any half-decent query optimizer/executor.

Writing extensions (e.g. functions) should be discouraged. It's harder to deploy and debug, and the last thing you want is a badly written C function crashing your database.

Years ago we also had stored procedures (e.g. Oracle PL/SQL) and nobody does that anymore because it's terrible. You don't want that.

There's a lot in there about pathological corner cases that I honestly don't really care about.

I do agree that ORMs are generally a disaster.

Lastly, it's worth noting that SQL, unlike a lot of alternatives, has a solid theoretical basis, and that is relational algebra. SQL wasn't created in a vacuum. SQL is just a way to express those constructs.

I will say that SQL got the order of clauses wrong whereas LINQ got this right. SQL should actually look more like this:

<pre><code>FROM a
WHERE a.foo = 'bar'
SELECT id, col1, col2
</code></pre>

Honestly though, SQL just isn't "broken". That's why it's endured so long despite the NoSQL fad and various efforts to replace it.
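
To make the WITH-clause point concrete, here is a small sketch of returning more than one column per group by joining against a named subquery instead of a scalar correlated subquery. The `employee` table, its columns, and the sample rows are all made up for illustration, and it runs on SQLite through Python's standard `sqlite3` module, so treat it as one possible shape rather than the query from the essay.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary INTEGER
);
INSERT INTO employee (name, department, salary) VALUES
    ('ann', 'eng', 120), ('bob', 'eng', 100),
    ('cy', 'ops', 90), ('di', 'ops', 95);
""")

# The CTE computes the top salary per department once; the join then pulls
# as many columns as needed from the matching employee rows.
# Note: employees tied for the top salary would each produce a row.
QUERY = """
WITH top_salary AS (
    SELECT department, MAX(salary) AS salary
    FROM employee
    GROUP BY department
)
SELECT e.department, e.name, e.salary
FROM employee AS e
JOIN top_salary AS t
  ON t.department = e.department AND t.salary = e.salary
ORDER BY e.department
"""

top_paid = conn.execute(QUERY).fetchall()
print(top_paid)
```

Adding another output column is a one-word change in the final SELECT, which is the contrast being drawn with the scalar-subquery version.
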

patkai, almost 4 years ago

I am surprised that in such a long thread nobody has mentioned RethinkDB.

chubot, almost 4 years ago

Amazing critique! It has a wealth of examples -- I liked the "N+1 query bugs" and "feral concurrency" links (stuff I've experienced but didn't have a name for).

----

The comparison of SQL vs. Flink windowing ("kernel space" vs "user space") reminds me of this 2013 call to change the design of browser features:

https://extensiblewebmanifesto.org/

Basically there's a lot of stuff implemented in the C++ layer of the browser that's impossible to emulate in JavaScript, and that's a bad design.

It is indeed alarming how much syntax SQL has. It reminds me of shell, where every string manipulation function like stripping a prefix has custom syntax like ${x//pat/replace} or ${x%%prefix}. Oil (https://www.oilshell.org/) will simply have functions for this, like x.sub('pat', 'replace').

----

I also wonder if the author has worked with dplyr and the tidyverse at all? He mentions Pandas, but IMO it's a clunkier imitation of those ideas (and I'm saying that as a Python programmer).

Tidy data was my intro to the design of dplyr: http://vita.had.co.nz/papers/tidy-data.html

It's very inspired by the relational model, but it has a few more operations like "gather" and "spread" which turn "long" format into "wide" format and vice versa.

It has a clean and expressive API: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf

It composes like regular code, so you can write stuff like:

<pre><code>bin_sizes %>%
  select(c(host_label, path, num_bytes)) %>%
  left_join(bytecode_size, by = c('host_label')) %>%
  mutate(native_code_size = num_bytes - bytecode_size) ->
  sizes
</code></pre>

Good comparison of the relational model and data frames: Is a Dataframe Just a Table? https://plateau-workshop.org/assets/papers-2019/10.pdf

I link all of these in What is a Data Frame? (In Python, R, and SQL) https://www.oilshell.org/blog/2018/11/30.html

pjmlp, almost 4 years ago

> Why did SQL have to add it to the language spec?

Most likely because there isn't a cargo for SQL, everyone has to make do with what a default install offers, and most big boys' databases offer FFI to Java, .NET and C.

> This works for data modelling (although it's still clunky because you must try joins against each of the tables at every use site rather than just ask the value which table it refers to)

Only if one never learned what views are for, and the various flavours they come in.

> By far the most common case for joins is following foreign keys. SQL has no special syntax for this:

<pre><code>select foo.id, quux."value"
from foo
inner join bar on foo.bar_id = bar.id
inner join quux on bar.quux_id = quux.id
</code></pre>

Really, how much time was spent learning SQL before complaining?

trapatsas, almost 4 years ago

There are two types of people in the world: the ones that are pro-SQL and the ones that don’t understand how SQL works.

ltbarcly3, almost 4 years ago

Lots of the examples here are the author writing very poor, non-idiomatic SQL and then criticizing it.

I could write a point-by-point rebuttal but I'll just pick one point, compressibility: VIEWs.
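
As a rough illustration of the view argument, the sketch below wraps the foo/bar/quux join discussed elsewhere in the thread in a view, so later queries can reuse it by name. The view name `foo_quux`, the column alias `quux_value`, and the sample rows are assumptions made up for this example; it runs on SQLite via Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE quux (id INTEGER PRIMARY KEY, "value" TEXT);
CREATE TABLE bar (id INTEGER PRIMARY KEY, quux_id INTEGER REFERENCES quux (id));
CREATE TABLE foo (id INTEGER PRIMARY KEY, bar_id INTEGER REFERENCES bar (id));
INSERT INTO quux VALUES (10, 'hello');
INSERT INTO bar VALUES (5, 10);
INSERT INTO foo VALUES (1, 5);

-- The join is written once and given a name.
CREATE VIEW foo_quux AS
SELECT foo.id AS foo_id, quux."value" AS quux_value
FROM foo
JOIN bar ON foo.bar_id = bar.id
JOIN quux ON bar.quux_id = quux.id;
""")

# Callers query the view like a table and never repeat the join.
values = conn.execute(
    "SELECT quux_value FROM foo_quux WHERE foo_id = ?", (1,)
).fetchall()
print(values)
```

A view names one fixed query; it takes no parameters, so whether this counts as enough "compression" is the point the thread is arguing about.
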

bullen, almost 4 years ago

I thought they were talking about the data not being able to compress; the actual queries don't need to be compressed.

But you need to separate the data and the index so you can compress the data while still searching the index, and none of the SQL databases do that because they don't have one file per value (for obvious disk-size reasons).

We need to approach the database as files, and even add features to our filesystems to accommodate that.

In my distributed HTTP/JSON database I use ext4 type small so as not to run out of inodes before disk space.

historyloop, almost 4 years ago

Did the author forget that we had this entire "NoSQL" period that lasted well over a decade, where SQL was the worst thing ever, and everyone kept coming up with superior alternatives to SQL?

What happened?

What happened is that many of those NoSQL products started adding SQL syntax and features to their databases, others disappeared, and yet others specialized into niches where they don't compete at all with SQL RDBMSs, which remain the primary database paradigm and language.

So those are the facts. If someone still believes they know better, put up or shut up.

justshowpost, almost 4 years ago

It's all about the background. For HLL and even basic programmers, grasping SQL poses little to no challenge. Some even fall in love with SQL, despite some minor inconsistencies and its prolix verbosity, and ask to write more SQL.

In contrast, users of, for example, the lingo where object minus object equals NaN are terrified when suddenly exposed to a type zoo like https://www.postgresql.org/docs/9.5/datatype.html (disclaimer: a relatively randomly chosen example, neither an endorsement of nor a preference for a particular RDBMS/dialect). And let's keep in mind that the types above form structures, and these structures get manipulated en masse as intrinsically unordered sets (which are data types too!). That is, the leap from a barely existing concept of data types to circa 30% of DDL/DML keeps scripters out of SQL.

So the reason behind that endless «SQL bad» teeth gnashing turns out to be very simple.

tonymet, almost 4 years ago

Relational tables & SQL should be just one storage mechanism for your app.

What if someone told you: build an app, but only use b-trees? Then you would start complaining about all the shortcomings of b-trees.

The point is that you have relational tables / SQL, along with many other persistence, storage & indexing mechanisms: distributed hashtables, queues, lists, etc.

All the apps I've worked on have mixed SQL with all of the other data structures, with consistent or inconsistent replication among them depending on the use case.

One way to manage this is a key-value online tier and a relational offline tier, with inconsistent replication from online to offline.

SQL & RDBMS are very powerful, but like any tool, limited to the designated use case. Stop trying to make it do everything.

roenxi, almost 4 years ago

One of the elephants in the room with SQL is that it is one of a small number of popular languages that doesn't use

<pre><code>function(arg, arg, arg)
</code></pre>

It is strange that "SELECT a, b, c FROM schema.table" keeps any aura of respectability. That is legitimately outdated syntax; people don't write languages that way any more. It was a 70s-era experiment, and what was learned from that experiment is that the style has no upside and comes with downsides. It should be 2 or 3 functions, with brackets.

With full knowledge of SQL, the successful languages that followed it were C/Python/Java/Javascript, which use lots of functions and a smattering of special syntax for control structures.

ngrilly, almost 4 years ago

A more pragmatic view in this article: https://blog.nelhage.com/post/some-opinionated-sql-takes/

croes, almost 4 years ago

> what if we want to return the salary too?

> the only solution is to change half of the lines in the query

How about adding a second subquery for the salary?
