Markus Winand is a great SQL expert. I'm really grateful for everything that he shares with us. He writes well, explains us the differences between the different SQL engines. And his books are fantastic references.<p>I highly recommend his "No to offset" tutorial <a href="http://use-the-index-luke.com/no-offset" rel="nofollow">http://use-the-index-luke.com/no-offset</a>
Should be common knowledge, but I still see offset being used way to often, even in core ORM frameworks :-(
If your query is complex enough to warrant a (non-recursive) CTE, it's probably complex enough to warrant breaking that CTE out into a separate view (or function, if it performs a write). Now you can independently test and reuse that view, which you can't do with a CTE.<p>(Bonus if you're in Postgres: Postgres optimizes across views, but not across CTEs. Go figure.)
Updatable CTEs are great, for example if you want to set a column to 1 where the date is the latest day for each value in a different column, you would do something like this<p><pre><code> ;WITH cte AS(SELECT
ROW_NUMBER() OVER(PARTITION BY SomeVal
ORDER BY SomeDate DESC) AS row,
* FROM SomeTable)
UPDATE cte
SET id = CASE WHEN row = 1 THEN 1 ELSE 0 END
</code></pre>
Here is the DDL and DML in case you want to play around with this<p><pre><code> CREATE TABLE SomeTable(id int,SomeVal char(1),SomeDate date)
INSERT SomeTable
SELECT null,'A','20110101'
UNION ALL
SELECT null,'A','20100101'
UNION ALL
SELECT null,'A','20090101'
UNION ALL
SELECT null,'B','20110101'
UNION ALL
SELECT null,'B','20100101'
UNION ALL
SELECT null,'C','20110101'
UNION ALL
SELECT null,'C','20100101'
UNION ALL
SELECT null,'C','20090101'</code></pre>
The optimization fence in PostgreSQL is usually bad for SELECTs but it has it's own benefits. See this thread <a href="http://www.postgresql.org/message-id/201209191305.44674.db@kavod.com" rel="nofollow">http://www.postgresql.org/message-id/201209191305.44674.db@k...</a>
What I really want from SQL is the ability to do a single select with one or more parts of each returned record being an array (defined by a one-to-many join) and others simple scalars.<p>It looks like Postgres can do it [1] with its JSON capabilities, and to be honest its being a little while since I've dug into it, so things might have changed (I'm actually not even sure I'm phrasing the question correctly), but the old way of doing a single select and then looping over it in whatever code is calling the SQL and doing N more selects, one for each row to get the sub-array is horrible.<p>[1] <a href="http://bender.io/2013/09/22/returning-hierarchical-data-in-a-single-sql-query/" rel="nofollow">http://bender.io/2013/09/22/returning-hierarchical-data-in-a...</a>
Very interesting site, but does anyone know why Oracle and mysql have a so poor support of the standard ? It looks as if postgres was the only one trying to comply.
with has completely changed my usage of PostgreSQL. I am a newbie and being able to split queries into logical small steps with this has made SQL much more fun.
A common pattern I still haven't solved is this: I have table <i>A</i>, and now I want to add table <i>B</i> and give it one row for each row in table <i>A</i>. Then I want to add a foreign key column to table <i>A</i> pointing at the row I just added.<p>Of course this is easy to do with a loop, but can you do it all in pure SQL? If you have a solution I would love to see it.<p>Here is my history of attempts:<p><pre><code> DROP TABLE IF EXISTS dogs;
DROP TABLE IF EXISTS doghouses;
DROP SEQUENCE IF EXISTS dogs_id_seq;
DROP SEQUENCE IF EXISTS doghouses_id_seq;
BEGIN;
CREATE SEQUENCE dogs_id_seq;
CREATE SEQUENCE doghouses_id_seq;
CREATE TABLE dogs (
id INTEGER PRIMARY KEY DEFAULT nextval('dogs_id_seq'),
name TEXT NOT NULL
);
INSERT INTO dogs
(name)
VALUES
('Sparky'),
('Spot')
;
-- Now we want to give each dog a doghouse:
CREATE TABLE doghouses (
id INTEGER PRIMARY KEY DEFAULT nextval('doghouses_id_seq'),
name TEXT NOT NULL
);
ALTER TABLE dogs ADD COLUMN doghouse_id INTEGER REFERENCES doghouses (id);
/*
-- ERROR: syntax error at or near "INTO"
UPDATE dogs AS d
SET doghouse_id = (
INSERT INTO doghouses
(name) VALUES (d.name)
RETURNING id
)
;
*/
/*
-- ERROR: WITH clause containing a data-modifying statement must be at the top level
UPDATE dogs AS d
SET doghouse_id = (
WITH x AS (
INSERT INTO doghouses
(name) VALUES (d.name)
RETURNING id
) SELECT * FROM x
)
;
*/
/*
-- ERROR: missing FROM-clause entry for table "dogs"
WITH homes AS (
INSERT INTO doghouses
(name)
SELECT name
FROM dogs
RETURNING doghouses.id AS doghouse_id, dogs.id AS dog_id
)
UPDATE dogs
SET doghouse_id = homes.doghouse_id
FROM homes
WHERE dogs.id = homes.dog_id
;
*/
-- ERROR: syntax error at or near "INTO"
UPDATE dogs AS d1
SET doghouse_id = h.id
FROM dogs d2
INNER JOIN LATERAL (
INSERT INTO doghouses
(name) VALUES (d2.name)
RETURNING id
)
ON true
WHERE d1.id = d2.id
;
COMMIT;</code></pre>
Wow, never knew this. I've been fine with just doing nested subqueries but this is definitely easier to read. Great to learn a helpful piece of everyday syntax before I've had my coffee.
One of the most useful things about CTEs is PostgreSQL's support of having DELETE ... RETURNING and UPDATE ... RETURNING in the the WITH clauses. This way you can easily implement moving rows from one table to another.