Understanding how CAPYSQUASH consolidates migrations using pgsquash-engine technology

HOW CAPYSQUASH WORKS

Understanding how CAPYSQUASH consolidates your migrations helps you use it confidently and safely.

💡 This guide explains the technology behind CAPYSQUASH Platform, capysquash-cli, and pgsquash-engine. All three use the same underlying consolidation approach.

THE PROBLEM

MIGRATION FILE EXPLOSION

Every schema change in rapid development creates a new migration file. Over time, this leads to:

100-300+ migration files cluttering your repository
Slow CI/CD pipelines running 150+ migrations
Merge conflicts in migration files
Difficult debugging - which file changed what?
Slow repository clones due to file count

THE SOLUTION

CAPYSQUASH consolidates your migrations while validating schema equivalence through Docker-based testing.

Core Principle

SCHEMA EQUIVALENCE, NOT FILE EQUIVALENCE

Two sets of migrations are equivalent if they produce the same database schema, even if the SQL differs.

ORIGINAL

-- 001.sql
CREATE TABLE users (id INT);

-- 002.sql
ALTER TABLE users ADD email TEXT;

-- 003.sql
ALTER TABLE users ADD name TEXT;

SQUASHED

-- 001_combined.sql
CREATE TABLE users (
id INT,
email TEXT,
name TEXT
);

☑ Different SQL, but identical final schema

THE VALIDATION PROCESS

CAPYSQUASH uses Docker to guarantee safety through a 6-step validation process:

ANALYZE MIGRATIONS

Scan your migration files to identify consolidation opportunities based on:

Operation types (CREATE, ALTER, DROP)
Table dependencies
Timestamp ordering
Compatibility rules

CREATE TEST CONTAINER

Spin up a fresh Postgres container in Docker:

docker run -d postgres:16

APPLY ORIGINAL MIGRATIONS

Run all your existing migrations in order and capture the final schema using pg_dump:

pg_dump --schema-only > original_schema.sql

CONSOLIDATE MIGRATIONS

Create new, consolidated migration files based on the analysis:

Merge compatible operations
Eliminate redundant changes
Preserve operation order
Maintain data integrity

VALIDATE SQUASHED MIGRATIONS

Create a new container, apply squashed migrations, and capture the schema:

pg_dump --schema-only > squashed_schema.sql

Then perform a byte-for-byte comparison of the schema dumps.

COMMIT OR ROLLBACK

If schemas match: ☑ Replace old migrations with squashed ones

If schemas differ: ☒ Abort and report the discrepancy

SAFETY GUARANTEES

WHAT capysquash GUARANTEES

☑ ALWAYS SAFE

Schema equivalence validation
Docker-based isolation
Atomic operations
Rollback on failure
No database changes

<div>
  <h4 className="font-black uppercase mb-3">☒ NEVER TOUCHES</h4>
  <ul className="list-disc list-inside font-bold text-sm space-y-2 pl-4">
    <li>Production databases</li>
    <li>Staging databases</li>
    <li>Development databases</li>
    <li>Any live data</li>
    <li>Deployed migrations</li>
  </ul>
</div>

CONSOLIDATION RULES

capysquash follows smart rules to determine what can be safely consolidated:

The 6-Phase Processing Pipeline

Under the Hood

CAPYSQUASH uses pgsquash-engine, which processes migrations through a sophisticated 6-phase pipeline. Understanding these phases helps you appreciate the rigor behind the "magic."

Plugin Initialization

Detect third-party patterns (Clerk, Supabase, Auth0, etc.) and activate plugins

Parse migrations for plugin detection
Extract config from pgsquash.config.json
Discover and activate matching plugins
Register plugin-specific consolidation rules

Parse & Track

Uses PostgreSQL's actual parser (pg_query_go) to build object lifecycles

Lexical analysis and AST generation
Extract DDL operations (CREATE, ALTER, DROP)
Track object lifecycles (one per database object)
Build dependency graph between objects

Analyze Dependencies

Build comprehensive dependency graph and detect cycles

Foreign key relationships
View dependencies on tables
Function dependencies on types
Detect circular dependencies

Consolidate

Apply safety-appropriate rules to merge operations

Sequential alterations on same table
Redundant operation elimination
Column rename chain simplification
Index consolidation

Generate & Transform

Build final SQL with post-processing

Generate consolidated SQL statements
Apply formatting and style rules
Add plugin-specific compatibility SQL
Create migration metadata

Validate (Docker)

Docker-based schema equivalence verification

Spin up fresh Postgres containers
Apply original migrations, capture schema
Apply squashed migrations, capture schema
Byte-for-byte schema comparison

Compatible Operations

OPERATIONS THAT CAN BE MERGED:

SAME TABLE OPERATIONS

-- Can merge into one CREATE TABLE
CREATE TABLE users (id INT);
ALTER TABLE users ADD email TEXT;
ALTER TABLE users ADD name TEXT;

<div>
  <p className="font-black text-sm mb-2">INDEX ADDITIONS</p>
  <div className="brutalist-code text-xs">

-- Can merge into single migration
CREATE INDEX idx_email ON users(email);
CREATE INDEX idx_name ON users(name);

  </div>
</div>

<div>
  <p className="font-black text-sm mb-2">SEQUENTIAL CONSTRAINTS</p>
  <div className="brutalist-code text-xs">

-- Can consolidate if no dependencies
ALTER TABLE orders ADD CONSTRAINT fk_user ...;
ALTER TABLE orders ADD CONSTRAINT chk_amount ...;

  </div>
</div>

Incompatible Operations

⚠️ OPERATIONS KEPT SEPARATE:

Data migrations with complex transformations
Breaking changes that require application updates
Time-sensitive operations that must run in order
External dependencies like extensions or functions
Deployment markers that indicate releases

WHEN SQUASHING HAPPENS

MANUAL

Run capysquash locally

capysquash squash

CI/CD

Automated in your pipeline

- run: capysquash analyze

GITHUB APP

Automatic PRs

Coming soon!

PERFORMANCE

TYPICAL SQUASH TIMES

50 MIGRATIONS

~30s

200 MIGRATIONS

~2min