Protecting Data Integrity with Database Level Constraints

The importance of database-level constraints

Even though Rails comes with a way to add validations on Models it’s still a great idea to add database-level constraints as a backup to the application-level constraints (rails model validations). This helps protect database integrity when validations somehow get skipped.

Scaled applications with multiple application instances can undermine validations due to race conditions. And developers can use methods that bypass ActiveRecord callbacks (ie. validations)

Common constraints

Making sure specified attributes on records are unique

Model validation

class User < ApplicationRecord
  validates :email, uniqueness: true
end

Database constraint

When creating the table for the first time

class CreateUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.string :email, null: false
      t.string :another_attribute, null: false

      # Single attribute email to be unique throughout the table
      t.index [:email], unique: true

      # Combination of attributes to be unique throughout the table
      t.index [:email, :another_attribute], unique: true
    end
  end
end

Adding to an existing table

class AddUniqueIndexToUsers < ActiveRecord::Migration[7.0]
  def change
    add_index :users, [:email, :another_attribute], unique: true
  end
end

Making sure a column is never null

Model validation

class User < ApplicationRecord
  validates :name, presence: true
end

Database constraint

When creating the table for the first time

class CreateUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.string :name, null: false
    end
  end
end

Adding to an existing table

class AddUniqueIndexToUsers < ActiveRecord::Migration[7.0]
  def change
    change_column :users, :email, :string, null: false
  end
end

Validate based on boolean expressions

Check constraints allow you to define boolean expressions that must return truthy before a record can be saved.

In the model, you can have a validation to protect the data integrity of your attributes

class Invoice < ApplicationRecord
  validates :amount, numericality: {
        greater_than_or_equal_to: 0
  }
end

And then back this up on the database level with a check constraint

class AddAmountCheckToInvoices < ActiveRecord::Migration[7.0]
  def change
    add_check_constraint :invoices, "amount >= 0",
      name: "amount_non_negative"
  end
end

Now a database error is thrown if somehow application level validations are skipped

# Skip validations (don't try this at home)
=> invoice.amount = -1200
=> invoice.save(validate: false)

PG::CheckViolation: ERROR:  new row for relation "invoices" violates check constraint "amount_non_negative" (ActiveRecord::StatementInvalid)

Make sure associated child records are deleted when the parent record is deleted

Model-level

class Users < ActiveRecord::Base
   has_many :items, dependent: :destroy_all
end

Database-level

When creating the table for the first time

class CreateUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.string :name
      t.references :organization, foreign_key: {on_delete: :cascade}
    end
  end
end

Adding to an existing table

class AddForeignKeyToUsers < ActiveRecord::Migration[7.0]
  def change
   add_reference :users, :organization, foreign_key: {on_delete: :cascade}
  end
end

Other options for add_reference in Rails migration

Tradeoff

Since the database will delete the child records automatically without reaching out to the application, any delete callbacks placed on child record models will be ignored.

If you have specified that child records also delete certain other records when they are deleted, you have to be careful to add those constraints at the db level as well.

Chris Oliver from GoRails recommends only relying on dependent: :destroy which is slower but will make sure application-level logic is run after deletions (since the db will call back to the application).

Even the official Rails guide argues that data integrity logic should belong in your models. Seems that each use case should decide whether a model only or model + db constaints approach is best.

Additional resources

rails, database