Why Rails Migrations are wrong-headed

Ever since migrations were introduced to Rails I’ve heard nothing but praise for them, and truth be told, they are a far better way of setting up your database than the standard raw SQL import. But that’s where the goodness ends.

The problem is in the concept of going up or down in database versions. The core idea is great: being able to roll back to a previous version of the database. But the implementation is completely out of sync with the version control systems we use to manage the codebase that depends on that database. I’ll use Subversion as an example because (for those of you still stuck in CVS land) every check-in stamps the whole repository with a new revision number.

Everything starts out fine. The initial migration reflects the needs of the initial codebase. But after that, the migration numbers and the codebase revisions never line up again. There’s no way to know which migration corresponds to which version of the codebase. Say that in check-in 400 I add a new migration that changes the schema. A few weeks later I need to roll the codebase back to revision 380. Which migration number should I roll the database back to, if any? Unless you happen to remember what migration number you were on when the codebase was at 380, you’re screwed.
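
To make that bookkeeping concrete, here is a rough sketch of the lookup you would have to maintain by hand. The revision numbers and the MIGRATION_CHECKINS table are made up for illustration; nothing in Rails or Subversion records this mapping for you, which is exactly the problem.

```ruby
# Hypothetical bookkeeping: the Subversion revision in which each
# migration was checked in. Neither tool keeps this table for you.
MIGRATION_CHECKINS = {
  1 => 1,    # initial schema
  2 => 145,
  3 => 290,
  4 => 400   # the schema change from the example above
}

# Return the highest migration number that already existed at the
# given codebase revision (0 if no migration had been checked in yet).
def migration_for_revision(revision, checkins = MIGRATION_CHECKINS)
  checkins.select { |_migration, checked_in_at| checked_in_at <= revision }.keys.max || 0
end

puts migration_for_revision(380) # prints 3
```

With that answer in hand, rake db:migrate VERSION=3 would bring the database back in line with revision 380. Without the table, you are guessing.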

So what’s the solution?

Well, the solution starts with the schema.rb file that’s created every time you run a migration. If you’re like me, that gets checked in whenever it changes and the schema.rb file is always in sync with your codebase. So you’ve always got a representation of the appropriate database configuration for any revision of your codebase. That’s good. In fact that’s better than the actual migration scripts because it’s always in sync. But there’s one more problem…

Priming the database
Most webapps don’t function well with a completely empty database. There are usually some default settings, maybe an admin account to log in with the first time, things like that. There are a few rake tasks out there to bootstrap your database with the contents of some .yaml files. There are others to dump your database into the same files for later import, or to generate real data to run your tests against.
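
As a minimal sketch of what such a bootstrap step does under the hood, the snippet below reads fixture-style YAML and turns it into rows ready to insert. The file contents, the USERS_YAML constant, and the bootstrap_rows helper are all hypothetical; the real rake tasks differ in their details.

```ruby
require 'yaml'

# Hypothetical contents of a bootstrap file such as users.yaml:
# one labelled record per default row.
USERS_YAML = <<~YAML
  admin:
    login: admin
    email: admin@example.com
    admin: true
YAML

# Turn fixture-style YAML (label => attributes) into a list of
# attribute hashes, one per row. An empty file yields no rows.
def bootstrap_rows(yaml_text)
  records = YAML.safe_load(yaml_text) || {}
  records.values
end

bootstrap_rows(USERS_YAML).each do |row|
  # In a Rails app this is where you would insert the row,
  # for example with User.create!(row). Here we only print it.
  puts row.inspect
end
```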

Putting it all together
If you make sure that your .yaml files for bootstrapping are always in sync (you should be keeping them in sync anyway), then you can avoid the problem entirely: maintain your own schema.rb file (it’s just a migration file for all the tables at once) and never run rake db:migrate again. Instead, load your schema.rb (make sure it forces table creation) and then run rake db:bootstrap. When you need to modify your database schema, don’t make a new migration script. It’ll only confuse matters down the road. Instead, modify your schema.rb and make sure your .yaml files are in sync (you’d have to do that even if you were using migrations).
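
Here is a rough idea of what a hand-maintained schema.rb could look like. The table and column names are invented for illustration, and the syntax is the older hash-rocket style; treat it as a sketch rather than a drop-in file, since it only runs inside a Rails app.

```ruby
# db/schema.rb, edited by hand instead of regenerated by db:migrate.
# Table and column names below are hypothetical.
ActiveRecord::Schema.define do
  # :force => true drops the table first if it already exists,
  # so loading this file always rebuilds the schema from scratch.
  create_table :users, :force => true do |t|
    t.column :login, :string
    t.column :email, :string
    t.column :admin, :boolean, :default => false
  end

  create_table :settings, :force => true do |t|
    t.column :name,  :string
    t.column :value, :text
  end
end
```

Loading it with rake db:schema:load and then running rake db:bootstrap gives you a freshly primed database for whatever revision you have checked out. Keep in mind that forcing table creation throws away whatever data was in those tables.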

Damien Tanner points out that migrations are very useful when migrating a live site, and I have to agree. My proposed solution is seriously lacking in that department. What would most likely be best is a blend of both concepts. Don’t use migrations to muck about with your database during development. Instead, create migration scripts only for migrating data on live sites. Don’t title them “create_foo_table” or “add_foo_column”; name them something like “live_site_version_4” that makes it clear exactly what each one is for and how it pertains to your site.