How CAPYSQUASH Works
Understanding how CAPYSQUASH consolidates migrations using pgsquash-engine technology
HOW CAPYSQUASH WORKS
Understanding how CAPYSQUASH consolidates your migrations helps you use it confidently and safely.
š” This guide explains the technology behind CAPYSQUASH Platform, capysquash-cli, and pgsquash-engine. All three use the same underlying consolidation approach.
THE PROBLEM
MIGRATION FILE EXPLOSION
Every schema change in rapid development creates a new migration file. Over time, this leads to:
- 100-300+ migration files cluttering your repository
- Slow CI/CD pipelines running 150+ migrations
- Merge conflicts in migration files
- Difficult debugging - which file changed what?
- Slow repository clones due to file count
THE SOLUTION
CAPYSQUASH consolidates your migrations while validating schema equivalence through Docker-based testing.
Core Principle
SCHEMA EQUIVALENCE, NOT FILE EQUIVALENCE
Two sets of migrations are equivalent if they produce the same database schema, even if the SQL differs.
ORIGINAL
-- 001.sql CREATE TABLE users (id INT); -- 002.sql ALTER TABLE users ADD email TEXT; -- 003.sql ALTER TABLE users ADD name TEXT;
SQUASHED
-- 001_combined.sql CREATE TABLE users ( id INT, email TEXT, name TEXT );
ā Different SQL, but identical final schema
THE VALIDATION PROCESS
CAPYSQUASH uses Docker to guarantee safety through a 6-step validation process:
ANALYZE MIGRATIONS
Scan your migration files to identify consolidation opportunities based on:
- Operation types (CREATE, ALTER, DROP)
- Table dependencies
- Timestamp ordering
- Compatibility rules
CREATE TEST CONTAINER
Spin up a fresh Postgres container in Docker:
docker run -d postgres:16APPLY ORIGINAL MIGRATIONS
Run all your existing migrations in order and capture the final schema using pg_dump:
pg_dump --schema-only > original_schema.sqlCONSOLIDATE MIGRATIONS
Create new, consolidated migration files based on the analysis:
- Merge compatible operations
- Eliminate redundant changes
- Preserve operation order
- Maintain data integrity
VALIDATE SQUASHED MIGRATIONS
Create a new container, apply squashed migrations, and capture the schema:
pg_dump --schema-only > squashed_schema.sqlThen perform a byte-for-byte comparison of the schema dumps.
COMMIT OR ROLLBACK
If schemas match: ā Replace old migrations with squashed ones
If schemas differ: ā Abort and report the discrepancy
SAFETY GUARANTEES
WHAT capysquash GUARANTEES
ā ALWAYS SAFE
- Schema equivalence validation
- Docker-based isolation
- Atomic operations
- Rollback on failure
- No database changes
<div>
<h4 className="font-black uppercase mb-3">ā NEVER TOUCHES</h4>
<ul className="list-disc list-inside font-bold text-sm space-y-2 pl-4">
<li>Production databases</li>
<li>Staging databases</li>
<li>Development databases</li>
<li>Any live data</li>
<li>Deployed migrations</li>
</ul>
</div>CONSOLIDATION RULES
capysquash follows smart rules to determine what can be safely consolidated:
The 6-Phase Processing Pipeline
Under the Hood
CAPYSQUASH uses pgsquash-engine, which processes migrations through a sophisticated 6-phase pipeline. Understanding these phases helps you appreciate the rigor behind the "magic."
Plugin Initialization
Detect third-party patterns (Clerk, Supabase, Auth0, etc.) and activate plugins
- Parse migrations for plugin detection
- Extract config from
pgsquash.config.json - Discover and activate matching plugins
- Register plugin-specific consolidation rules
Parse & Track
Uses PostgreSQL's actual parser (pg_query_go) to build object lifecycles
- Lexical analysis and AST generation
- Extract DDL operations (CREATE, ALTER, DROP)
- Track object lifecycles (one per database object)
- Build dependency graph between objects
Analyze Dependencies
Build comprehensive dependency graph and detect cycles
- Foreign key relationships
- View dependencies on tables
- Function dependencies on types
- Detect circular dependencies
Consolidate
Apply safety-appropriate rules to merge operations
- Sequential alterations on same table
- Redundant operation elimination
- Column rename chain simplification
- Index consolidation
Generate & Transform
Build final SQL with post-processing
- Generate consolidated SQL statements
- Apply formatting and style rules
- Add plugin-specific compatibility SQL
- Create migration metadata
Validate (Docker)
Docker-based schema equivalence verification
- Spin up fresh Postgres containers
- Apply original migrations, capture schema
- Apply squashed migrations, capture schema
- Byte-for-byte schema comparison
Compatible Operations
OPERATIONS THAT CAN BE MERGED:
SAME TABLE OPERATIONS
-- Can merge into one CREATE TABLE
CREATE TABLE users (id INT);
ALTER TABLE users ADD email TEXT;
ALTER TABLE users ADD name TEXT;<div>
<p className="font-black text-sm mb-2">INDEX ADDITIONS</p>
<div className="brutalist-code text-xs">-- Can merge into single migration
CREATE INDEX idx_email ON users(email);
CREATE INDEX idx_name ON users(name); </div>
</div>
<div>
<p className="font-black text-sm mb-2">SEQUENTIAL CONSTRAINTS</p>
<div className="brutalist-code text-xs">-- Can consolidate if no dependencies
ALTER TABLE orders ADD CONSTRAINT fk_user ...;
ALTER TABLE orders ADD CONSTRAINT chk_amount ...; </div>
</div>Incompatible Operations
ā ļø OPERATIONS KEPT SEPARATE:
- Data migrations with complex transformations
- Breaking changes that require application updates
- Time-sensitive operations that must run in order
- External dependencies like extensions or functions
- Deployment markers that indicate releases
WHEN SQUASHING HAPPENS
MANUAL
Run capysquash locally
capysquash squashCI/CD
Automated in your pipeline
- run: capysquash analyzeGITHUB APP
Automatic PRs
Coming soon!
PERFORMANCE
TYPICAL SQUASH TIMES
50 MIGRATIONS
~30s
200 MIGRATIONS
~2min
200 MIGRATIONS
~5min
1000+ MIGRATIONS
~10min
ā” Most of the time is spent in Docker validation, ensuring 100% safety
NEXT STEPS
- Safety Levels - Choose the right safety level
- Docker Validation - Deep dive into validation
- Migration Analysis - How consolidation works
How is this guide?