> We also use Postgres which doesn't create the obstacles that some other database systems have, for example with locking tables on schema changes.<p>Actually, schema changes still require exclusive table locks; they're just held very briefly, since Postgres doesn't need to rewrite each tuple. (With some exceptions, like adding a NOT NULL column, or reducing the size of a VARCHAR.)<p>I don't mean to be pedantic, but this can be important, because the exclusive lock request can itself be blocked by other long-running queries. Postgres will then grant the ALTER TABLE's exclusive lock ahead of other conflicting lock requests, so normal reads and writes will be locked out until the long query completes or one of the queries is killed.
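The usual defence is to cap how long the ALTER will wait for its lock, so it fails fast instead of camping in the lock queue in front of normal traffic. A rough sketch, with a made-up table name and an arbitrary timeout:

```sql
-- If the ACCESS EXCLUSIVE lock isn't granted within 2 seconds, the ALTER
-- aborts instead of queueing and blocking every reader/writer behind it.
SET lock_timeout = '2s';
ALTER TABLE items ADD COLUMN note text;  -- retry later if it times out
```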
pt-online-schema-change, part of the Percona Toolkit, is my go-to tool for making production changes in MySQL.<p>Improvely and W3Counter both have tons of multi-gigabyte tables. A regular ALTER TABLE statement could take hours to run, and would lock the table the entire time, essentially taking down the service.<p>Percona's tool makes a copy of the table, runs the ALTER TABLE statements on the copy, then sets up triggers on the original to temporarily mirror data-altering queries to the copy. It then intelligently batches the data migration from the old table to the new one to avoid overloading the server, while printing progress and an ETA on the console. When the copying's done, the tables are swapped with an atomic rename and the old table is dropped.<p>It takes just one command and has always worked flawlessly.<p><a href="http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html" rel="nofollow">http://www.percona.com/doc/percona-toolkit/2.2/pt-online-sch...</a>
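For anyone curious what that dance looks like mechanically, here's a toy re-creation in SQLite (purely illustrative, with invented table names; the real tool batches the backfill, throttles on replica lag, handles foreign keys, and swaps the tables with MySQL's atomic RENAME TABLE):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])

# 1. Create a copy of the table with the desired new schema.
conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# 2. A trigger mirrors new writes on the original into the copy.
conn.execute("""
    CREATE TRIGGER mirror_insert AFTER INSERT ON users
    BEGIN
        INSERT INTO users_new (id, name) VALUES (NEW.id, NEW.name);
    END
""")

# 3. Backfill existing rows (one batch here; OR IGNORE skips rows the
#    trigger already mirrored).
conn.execute("INSERT OR IGNORE INTO users_new (id, name) SELECT id, name FROM users")

# A write arriving mid-migration is captured by the trigger.
conn.execute("INSERT INTO users (name) VALUES ('eve')")

# 4. Swap: drop the original and rename the copy into place.
conn.execute("DROP TRIGGER mirror_insert")
conn.execute("DROP TABLE users")
conn.execute("ALTER TABLE users_new RENAME TO users")

print(conn.execute("SELECT count(*) FROM users").fetchone()[0])  # 3
```

Throughout, readers and writers only ever touch a live table; the only "outage" is the final rename.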
Phase 2 of the column change has a sneaky bug in it.<p>In the order presented:<p>1) Run migration<p>2) Apply trigger<p>There will be a point in time where the migration has been done but the trigger hasn't been applied.<p>Even sneakier, if the trigger is applied in application code rather than in the database, there will be a period of time where some of the code is running the trigger and some isn't.<p>The trigger needs to be in production and executing before the backfill runs.<p>I'm also loath to rely on "triggers" in application model logic. I've seen too many platforms where the operations engineers interact directly with the database, resulting in required logic not executing.
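To make the race concrete, here's a toy version in SQLite with hypothetical column names: a row written after the backfill but before the trigger exists is silently missed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER, total_dollars REAL)")
conn.executemany("INSERT INTO orders (total_cents) VALUES (?)", [(100,), (250,)])

# Wrong order: backfill the new column first...
conn.execute("UPDATE orders SET total_dollars = total_cents / 100.0")

# ...a write sneaks in before the trigger exists; total_dollars stays NULL...
conn.execute("INSERT INTO orders (total_cents) VALUES (999)")

# ...then the trigger goes live, too late for the row above.
conn.execute("""
    CREATE TRIGGER sync_dollars AFTER INSERT ON orders
    BEGIN
        UPDATE orders SET total_dollars = NEW.total_cents / 100.0 WHERE id = NEW.id;
    END
""")

missed = conn.execute(
    "SELECT count(*) FROM orders WHERE total_dollars IS NULL").fetchone()[0]
print(missed)  # 1 -- the row written in the gap was never backfilled
```

Creating the trigger first, then backfilling, closes the window: every row is covered by one mechanism or the other.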
Good idea for the fake delete column. I've always seen non-rollbackable column removal migrations as something we just have to live with; this totally solves the problem.<p>As for when the right time is to actually remove the column, I think this merges pretty well with my current thinking about migrations.<p>A new developer joined my company (we use Rails as well). When setting things up, he tried to run `rake db:migrate` (my fault, I mentioned it in my doc instead of `rake db:schema:load`; I wrote the doc as I was setting up the very first lines of the project). There were several years of migrations, some of them not working anymore since they used code that no longer exists (like `#add_hstore_index` from activerecord-postgres-hstore, which has been replaced by native postgres support).<p>This made me think that there is no need to keep migrations in the repository once we can't use them anymore. And thus, from time to time, migrations should be flushed. This flush is the perfect time to actually delete columns (well, it should be done just before). That way, we ensure our migrations can always be rolled back without breaking anything.<p>Edit: refining that, we probably don't want to flush all migrations, only the older ones. So, how do we insert the migration that actually removes the data? We can't just put it between the old migrations and the ones we'll keep, since it would be considered already migrated. We can't remove it directly in a sql console either, or we would have to do it on every machine, and that would cause desync among developers' machines.<p>I think the best way to do it is to append a new non-reversible (but that does not break on reverse) migration that uses `table_exists?` and `column_exists?` to ensure it only runs on databases that contain the deprecated columns / tables. Something like:<p><pre><code> class RemoveDeprecated < ActiveRecord::Migration
   def up
     remove_column :foos, :bar_deprecated if column_exists? :foos, :bar_deprecated
   end

   def down
   end
 end</code></pre>
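A rough SQLite analogue of that guard, with `column_exists?` replaced by a `PRAGMA table_info` lookup (table and column names are just the ones from the sketch above; DROP COLUMN assumes SQLite 3.35+):

```python
import sqlite3

def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; the name is field 1.
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foos (id INTEGER PRIMARY KEY, bar_deprecated TEXT)")

# The cleanup is safe to run on databases with or without the column,
# so databases of different vintages all converge on the same schema.
if column_exists(conn, "foos", "bar_deprecated"):
    conn.execute("ALTER TABLE foos DROP COLUMN bar_deprecated")

print(column_exists(conn, "foos", "bar_deprecated"))  # False
```

Running it a second time is a no-op, which is exactly the property the appended migration needs.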
When I saw this, I first thought it was an extension of <a href="https://news.ycombinator.com/item?id=5326159" rel="nofollow">https://news.ycombinator.com/item?id=5326159</a> ;)
I prefer our approach:<p><i>> Case #1: I want to add a new column</i><p>Issue an ALTER TABLE ADD COLUMN on the production database. Try not to make it NOT NULL.<p><i>> Case #2: I want to change a column type</i><p>Add another column (see Case #1) with the same name plus a suffix like "_2" or "_NUM" so you can tell them apart.<p><i>> Case #3: I want to rename a column</i><p>Don't risk it. Keep using the original column name. Very few of the column names reflect the reality of what they contain anyway. It's little bits of insider knowledge like this ("Product name is actually stored in ITEMNAME_LONG, ITEMNAME is only a CHAR(15) so we keep UPC in there") that contribute to our fast-paced, agile work environment.