How to clean up old Rails migration files

Get rid of broken database migration chains and improve the developer experience when working with Ruby on Rails

Ruby on Rails comes with a pretty good database schema management system. It lets developers easily change the database schema, add extensions, add functions, and roll back any changes.

However, if you've ever worked with older applications or in a team environment where multiple features (and migrations) are taking place at the same time, it's a fairly common occurrence that the migration files become "out of sync" with reality.

Sometimes this is due to one-off migrations that someone creates, perhaps to backfill data on a column. Other times it may be that migrations were added on two separate pull requests which were incompatible and a rollback was performed.

In any case, the migration chain breaks and nobody notices until the next time someone runs bin/rails db:migrate on a fresh database.

In this article, I want to talk about why the db/migrate folder should be periodically cleared out and how you can easily make that part of your regular code maintenance.

First, let's take a look at how Rails handles migrations...

How Rails migrations work

Migration files are typically added to the db/migrate directory of a project. The directory contains a sequential list of migrations, sorted by the date prefix in each filename:

db/migrate/20170113151609_add_email_to_users.rb

The first part of the filename indicates the date and time of the migration. Rails uses this time signature to keep track of "where along the path" its actual database schema is at.
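To make the filename convention concrete, here is a minimal sketch that splits a migration filename into its timestamp and name. The helper parse_migration_filename and the MigrationName struct are my own hypothetical names, not part of Rails, and the sketch assumes the 14-digit UTC timestamp format that the Rails generators produce by default.

```ruby
# Hypothetical helper (not a Rails API): split a migration filename into parts.
# Assumes a 14-digit YYYYMMDDHHMMSS prefix, as produced by the Rails generators.
MigrationName = Struct.new(:version, :created_at, :name)

FILENAME_PATTERN = /\A(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})_(\w+)\.rb\z/

def parse_migration_filename(path)
  match = File.basename(path).match(FILENAME_PATTERN)
  return nil unless match

  # The first six captures are year, month, day, hour, minute, second.
  date_parts = match.captures.first(6)
  MigrationName.new(
    date_parts.join,                      # version string, e.g. "20170113151609"
    Time.utc(*date_parts.map(&:to_i)),    # the same value as a Time
    match[7]                              # descriptive part of the filename
  )
end

parsed = parse_migration_filename('db/migrate/20170113151609_add_email_to_users.rb')
puts parsed.version
puts parsed.created_at
puts parsed.name
```

A file that does not follow the convention (a README, for example) simply returns nil here.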

When you run bin/rails db:migrate, Rails does not re-run all migrations. It compares the timestamp of each migration file against the versions it has already recorded as applied in the database (the schema_migrations table), and runs only the migrations that have not been applied yet.
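The selection logic can be sketched in plain Ruby. This is only an illustration of the idea, not the actual Rails implementation: pending_migrations is a hypothetical helper, and the applied_versions array stands in for the versions Rails keeps in the schema_migrations table.

```ruby
# Illustration only: pick the migration files whose version has not been applied.
# `applied_versions` is a stand-in for the rows in the schema_migrations table.
def pending_migrations(migration_files, applied_versions)
  migration_files
    .select { |file| File.basename(file).match?(/\A\d+_/) }                     # keep versioned files only
    .reject { |file| applied_versions.include?(File.basename(file)[/\A\d+/]) }  # drop already-applied versions
    .sort_by { |file| File.basename(file)[/\A\d+/].to_i }                       # oldest first
end

files = [
  'db/migrate/20180302101500_add_name_to_users.rb',
  'db/migrate/20160101000000_create_users.rb',
  'db/migrate/20170113151609_add_email_to_users.rb'
]
applied = %w[20160101000000 20170113151609]

puts pending_migrations(files, applied)
# prints db/migrate/20180302101500_add_name_to_users.rb
```

On a brand new database the applied list is empty, so every file in the folder is pending. That is exactly the situation where an old, broken migration will surface.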

The Rails guide has a lot of good information on the topic if you want to learn more.

Why migration files should be pruned

Periodically, these migration files should be either deleted or moved to another directory and out of the way.

The problem with keeping old database migration files is that they invite people to build the database by replaying every migration from scratch, which is not the right way to set up the schema. However, I see this quite often in CI setup files and in README documentation on various projects.

In fact, Rails itself tells developers (in the schema.rb file):

This file is the source Rails uses to define your schema when running rails db:schema:load. When creating a new database, rails db:schema:load tends to be faster and is potentially less error prone than running all of your migrations from scratch. Old migrations may fail to apply correctly if those migrations use external dependencies or application code.

As I mentioned above, the chain of migrations tends to break, especially in projects with large teams and in older codebases.

Over time, the old migration files can no longer be run from start to finish. The migration run errors out and frustrates developers across the team.

To prevent this, it is best to remove old migrations or move them into a separate directory, and to encourage developers to use bin/rails db:setup instead of bin/rails db:migrate when setting up the project's database. The Rails guide references this: guides.rubyonrails.org/active_record_migrat..

Make sure that bin/rails db:setup is also used when setting up the database on CI.

Some developers object to the idea of clearing out old migration files, but there really isn't a good reason to keep them. As developers, we find it comforting to keep things around "just in case". In this case, however, you already have the schema file, which serves the same function.

An exception to this advice

There is one exception to what I've written above, where it's probably a good idea to keep specific migration files.

Migrations that come from Rails engines should, in most cases, be preserved. As described by the Rails Guide, migrations that are generated from Rails engines (e.g. those from authentication, model tagging, or auditing gems) could potentially be regenerated with new timestamps by the gems themselves and rerun.

Because this is not desired, retaining these specific gem migrations avoids the problem.

These types of migrations will (usually) have comments like this in the migration file:

# This migration comes from blorgh (originally 20210621082949)

In my experience, however, this situation is typically avoided because these gem migrations require developers to run a command like bin/rails g some_gem upgrade, which then adds any migrations that (the gem thinks) do not yet exist. When upgrading gems, it's important to see what the gem is doing anyway, so any and all migrations it adds should be carefully examined.
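If you do want to keep engine migrations in place, the marker comment gives you something to look for. Here is a minimal sketch that separates engine migrations from your own; engine_migration? and ENGINE_MARKER are hypothetical names of mine, and the sketch assumes the comment appears within the first few lines of the file in the format shown above.

```ruby
# Hypothetical helper: detect migrations copied in from a Rails engine by
# looking for the marker comment near the top of the file.
ENGINE_MARKER = /\A\s*# This migration comes from \S+ \(originally \d+\)/

def engine_migration?(path)
  # Check only the first few lines so a magic comment above the marker is tolerated.
  File.foreach(path).first(5).any? { |line| line.match?(ENGINE_MARKER) }
end

# Split the migration folder into engine migrations and application migrations.
engine_files, app_files = Dir.glob('db/migrate/*.rb').partition do |file|
  engine_migration?(file)
end

puts "Engine migrations to keep: #{engine_files.size}"
puts "Application migrations that could be pruned: #{app_files.size}"
```

This only reports counts. Whether to act on the result is still a judgment call, since not every gem writes the marker comment.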

Pruning migration files

The simplest way to prune the db/migrate folder is to delete the files. However, my personal preference is to move migration files to an archive directory in case anyone wants to take a look at them at some later point in time.

Here, I've written a Rake task to do just this:

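# lib/tasks/migration_archive.rake
require 'fileutils'

namespace :db do
  namespace :migrate do
    desc 'Archives old DB migration files'
    task :archive do
      archive_dir = 'db/migrate/archive'
      FileUtils.mkdir_p(archive_dir)
      # Match only top-level migration files; anything already archived stays put.
      migrations = Dir.glob('db/migrate/*.rb')
      # Skip the move when there is nothing to archive.
      FileUtils.mv(migrations, archive_dir) unless migrations.empty?
    end
  end
end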

Then, whenever you want to archive your DB migration files, just run:

bin/rails db:migrate:archive

The above command will create a db/migrate/archive directory if it doesn't yet exist, and then it will move all migrations in the db/migrate folder to the new archive folder.

Conclusion

Rails has an extremely useful database schema management system. Although it is quite versatile on its own, in older codebases, and especially in repositories where multiple team members are modifying the schema, the database migration chain can end up broken.

It is not necessary to keep migration files around after they’ve been run. In general, it’s a good rule of thumb to treat database migrations as having a “forward” direction. That is, even if you need to perform a rollback, do so in a brand new migration file and never with a direct rollback.

Nonetheless, migration chains break somewhere along the line even with the best intentions. A migration that backfills data, for example, may no longer work because the underlying model has since changed in some way.

The database schema should always be restored from the db/schema.rb file (or the db/structure.sql file if you choose to use the SQL variant). As for the old DB migration files, we can either delete them or archive them in a separate directory.