Nothing of essence in this post. I had a look at the other posts they all look similar. I greatly dislike this new trend of "doc-spaming" where everyone makes a blog writing walls of text explaining trivial things, crowding out the actual docs from search results.<p>I swear to god whenever I search for something in the python standard library I have to scroll past a bunch of sites like this before i find the actual docs.
There's no good reason for using two different keywords for filtering and it harms composability. But of course SQL isn't very composable anyhow.<p>Preferably, SQL would look more like:<p><pre><code> FROM Foo f
WHERE f.value < 10
JOIN Bar b on b.id = f.bar_id
WHERE b.other_value > 20
SELECT f.group, f.value
GROUP BY f.group the_group AGGREGATING SUM(f.value) the_sum
WHERE the_sum > 100
ORDER BY the_group
</code></pre>
And so on. FROM introduces a source of data which flows into the next line. SELECT does a projection, GROUP BY applies aggregation operators on groups and also does a projection. WHERE could be inserted pretty much anywhere.
In one interview I had, I was asked, "What are you good at?". I said, "Pretty good with SQL stuff". This was _literally_ the question they asked. It was a good jumping off point for them to probe how much I knew. I like this article's explanation. The thing that trips me up on specific databases, is whether aliases assigned in the "select" portion are available in the "group by" sections.
Huh, I always thought HAVING was the equivalent for GROUP BY that WHERE is for SELECT, i.e. that it happened during aggregation, just as WHERE happens during data retrieval from the table. But apparently you can use HAVING instead of WHERE and it would work the same even if you don't have a GROUP BY clause (and even though it would possibly be less efficient)? Seems a bit redundant.
For me the aha moment was to understand the logical order of SQL operations: <a href="https://blog.jooq.org/a-beginners-guide-to-the-true-order-of-sql-operations/" rel="nofollow">https://blog.jooq.org/a-beginners-guide-to-the-true-order-of...</a>. Never had a problem since.
SELECT department_id, count(*) AS employees_no
FROM employee
WHERE gender = 'F'
GROUP BY department_id
HAVING employees_no < 10;<p>This phrase returns departments where the number of <i>female</i> employees is <10 right, rather than listing female employees from departments that have <10 employees <i>in total</i>?
<p><pre><code> select category, sum(amount) as total
from mytable
group by category
HAVING total > 100
</code></pre>
is equal to (and potentially more performant than)<p><pre><code> select category, total
from (
select category, sum(amount) as total
from mytable
group by category
) as mysubquery
WHERE total > 100</code></pre>
The article does a good job of explaining the difference between WHERE and HAVING. The simplest resource I've found for this is Julia Evans' "SQL queries run in this order" [0], which points out for example that SELECTs are one of the last things to run in a query.<p>I've managed software teams and data engineering teams, and both teams get tripped up with even moderate SQL queries. To simplify, we encouraged teams to use a clearer subset of SQL. Most HAVING can be replaced with a WHERE inside a more readable and explicit subquery. Similarly, we got rid of most RIGHT JOINS.<p>[0] <a href="https://jvns.ca/blog/2019/10/03/sql-queries-don-t-start-with-select/" rel="nofollow">https://jvns.ca/blog/2019/10/03/sql-queries-don-t-start-with...</a>
WHERE folks HAVING difficulty with this concept? Sorry, couldn't resist.<p>But seriously, this concept doesn't seem to require as much attention as this post gives it.
Is there any book that has the exact specifications of this dumpster fire that is called sql?<p>With no other language I have faced so many gotchas with implicit conversions and unexpected behaviors.