Best Practices Guide
Expert strategies for using CAPYSQUASH effectively and writing production-grade PostgreSQL migrations
This comprehensive guide covers CAPYSQUASH-specific best practices alongside battle-tested PostgreSQL migration patterns to help you optimize safely and ship confidently.
Applies To
These best practices apply to CAPYSQUASH Platform, capysquash-cli, and pgsquash-engine integrations.
Table of Contents
- When to Use CAPYSQUASH
- Safety Level Selection
- Migration Organization
- PostgreSQL Best Practices
- Version Control Workflow
- Team Collaboration
- Validation & Testing
- Configuration Management
- Common Pitfalls
- Rollback Strategy
- ORM-Specific Patterns
- Monitoring & Maintenance
When to Use CAPYSQUASH
✓ Good Times to Squash
After Milestones
- ✓ After beta launch
- ✓ After 1.0 release
- ✓ After major refactor
Regular Cadence
- ✓ Monthly or quarterly
- ✓ When consolidation > 50%
- ✓ Before major releases
When Onboarding
- ✓ Before new developers join
- ✓ Before open-sourcing
- ✓ Before client handoff
✗ Bad Times to Squash
Avoid These Scenarios
During Active Development: Wait for feature branches to merge—don't squash mid-sprint.
Right Before Critical Deploy: Squash early in the release cycle and test thoroughly.
Without Validation: Always validate squashed migrations. Never skip Docker validation.
Safety Level Selection
Choose the appropriate safety level based on your environment, project stage, and team size:
By Environment
Development
Recommended: aggressive or standard
- Fast iteration is the priority
- Easy to rebuild if issues occur
- Higher risk tolerance is acceptable
capysquash squash migrations/*.sql --safety aggressive
Staging
Recommended: standard or conservative
- Balance between speed and safety
- Testing ground for production
- Should mirror production data
capysquash squash migrations/*.sql --safety standard
Production
Recommended: conservative or paranoid
- Maximum safety is critical
- Zero tolerance for data loss
- Thorough validation required
capysquash squash migrations/*.sql --safety paranoid
By Project Stage
| Project Stage | Recommended Safety Level | Rationale |
|---|---|---|
| Early startup (< 6 months) | aggressive | Rapid iteration, easy to rebuild |
| Growing (6-18 months) | standard | Balancing speed with stability |
| Established (18+ months) | conservative | Production data, customer impact |
| Enterprise / Critical Systems | paranoid | Zero-downtime requirements |
By Team Size
| Team Size | Recommended Safety Level |
|---|---|
| Solo developer | aggressive or standard |
| Small team (2-5) | standard |
| Medium team (6-20) | conservative |
| Large team (20+) | conservative or paranoid |
Migration Organization
File Naming Conventions
Recommended: Sequential with descriptive names
001_extensions_and_schemas.sql
002_user_authentication.sql
003_content_management.sql
004_analytics_tables.sql
005_indexes_and_constraints.sql
006_rls_policies.sql
007_functions_and_triggers.sql
008_seed_data.sql
Benefits:
- Clear ordering maintains dependency chain
- Self-documenting purpose
- Easy team navigation
- Git-friendly for conflict resolution
Migration Categories
Organize migrations by dependency order to ensure safe consolidation:
Extensions & Schemas
Enable PostgreSQL extensions and create custom schemas first
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE SCHEMA IF NOT EXISTS analytics;
Types & Enums
Define custom types before tables that use them
CREATE TYPE user_role AS ENUM ('admin', 'user', 'guest');
Tables & Core Structure
Create table definitions
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
email TEXT NOT NULL,
role user_role NOT NULL
);
Constraints & Foreign Keys
Add relationships after all tables exist
ALTER TABLE orders
ADD CONSTRAINT fk_orders_users
FOREIGN KEY (user_id) REFERENCES users(id);
Indexes
Create indexes for query performance
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
Functions & Triggers
Add procedural code
CREATE FUNCTION update_modified_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
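To show how such a function is typically wired up, a trigger can attach it to a table. This is a minimal sketch: the trigger name is illustrative, it assumes users has an updated_at column, and EXECUTE FUNCTION requires PostgreSQL 11+ (older versions use EXECUTE PROCEDURE):
-- Assumes users has an updated_at TIMESTAMPTZ column
CREATE TRIGGER set_updated_at
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_modified_column();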
Security & RLS
Apply Row Level Security policies and grants
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_isolation ON users
FOR ALL USING (auth.uid() = id);
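Since this category also covers grants, here is a minimal sketch of granting table privileges to an application role (app_user is a hypothetical role name):
-- Grant the application role only the access it needs
GRANT SELECT, INSERT, UPDATE ON users TO app_user;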
PostgreSQL Best Practices
Core Principles
Safety First
Never risk data loss or extended downtime
Minimize Locks
Keep operations non-blocking whenever possible (see the sketch after these principles)
Test Everything
Validate migrations against production-like data
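One way to make the Minimize Locks principle concrete is to cap how long a migration will wait for a conflicting lock, so a blocked statement fails fast instead of queueing behind live traffic. A minimal sketch; the timeout values and the ALTER statement are illustrative and should be tuned per environment:
-- Fail fast instead of waiting indefinitely for a conflicting lock
SET lock_timeout = '5s';
-- Abort any single statement that runs unexpectedly long
SET statement_timeout = '15min';
-- Example migration statement run under these limits
ALTER TABLE users ADD COLUMN last_login TIMESTAMPTZ;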
Migration Anti-Patterns
☒ Anti-Pattern 1: Adding Non-Nullable Columns
Don't Do This
ALTER TABLE users ADD COLUMN email VARCHAR(255) NOT NULL;
Why it's bad:
- Requires full table rewrite
- Locks table for duration (could be minutes on large tables)
- Fails if table has existing rows without a default
Do this instead:
-- Step 1: Add column as nullable
ALTER TABLE users ADD COLUMN email VARCHAR(255);
-- Step 2: Backfill data (in batches for large tables)
UPDATE users
SET email = 'placeholder@example.com'
WHERE email IS NULL;
-- Step 3: Add NOT NULL constraint
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
Alternative approach:
-- Step 1: Add column with default (PostgreSQL 11+ optimizes this!)
ALTER TABLE users
ADD COLUMN email VARCHAR(255)
DEFAULT 'placeholder@example.com';
-- Step 2: Remove default (doesn't affect existing rows)
ALTER TABLE users ALTER COLUMN email DROP DEFAULT;
-- Step 3: Add NOT NULL constraint
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
PostgreSQL 11+ optimizes adding columns with defaults—no table rewrite required!
☒ Anti-Pattern 2: Creating Indexes on Large Tables
Don't Do This
CREATE INDEX idx_users_email ON users(email);
Why it's bad:
- Acquires SHARE lock on table
- Blocks writes for entire index build
- On a 10M row table, could take 30+ seconds
Do this instead:
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
What CONCURRENTLY does:
- ✓ Builds index without blocking writes
- ✓ Takes longer but doesn't lock table
- ✓ Safe for production use
Important caveats:
- Cannot run inside a transaction
- Requires slightly more disk space during build
- May fail if canceled (leaves invalid index)
Check for failed concurrent indexes:
SELECT schemaname, tablename, indexname
FROM pg_indexes
WHERE indexname IN (
SELECT indexrelid::regclass::text
FROM pg_index
WHERE NOT indisvalid
);
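If that check surfaces an invalid index, a common recovery is to drop it concurrently and rebuild it. A sketch, reusing the index from the example above:
DROP INDEX CONCURRENTLY IF EXISTS idx_users_email;
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);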
☒ Anti-Pattern 3: Changing Column Types Carelessly
Dangerous Operation
ALTER TABLE users ALTER COLUMN id TYPE BIGINT;
Why it's risky:
- Requires full table rewrite
- Locks table during rewrite
- Can break foreign key constraints
- May require rebuilding all indexes
Safe type changes (no rewrite):
- VARCHAR(50) → VARCHAR(100) ✓
- NUMERIC(10,2) → NUMERIC(12,2) ✓
- VARCHAR → TEXT ✓
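For example, widening a length limit or converting VARCHAR to TEXT is a metadata-only change (table and column names here are illustrative):
ALTER TABLE products ALTER COLUMN sku TYPE VARCHAR(100);  -- was VARCHAR(50), no rewrite
ALTER TABLE products ALTER COLUMN description TYPE TEXT;  -- was VARCHAR, no rewrite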
For unsafe changes, use the shadow column pattern:
Add New Column
ALTER TABLE users ADD COLUMN id_new BIGINT;
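While the backfill below runs, new and updated rows can be kept in sync with a trigger so nothing is missed before the swap. A sketch; the function and trigger names are illustrative, and EXECUTE FUNCTION requires PostgreSQL 11+:
CREATE OR REPLACE FUNCTION sync_id_new()
RETURNS TRIGGER AS $$
BEGIN
NEW.id_new := NEW.id;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Drop this trigger and function after the swap and cleanup steps
CREATE TRIGGER users_sync_id_new
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_id_new();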
Backfill Data
-- In batches for large tables (UPDATE has no LIMIT clause, so batch via a subquery)
UPDATE users
SET id_new = id
WHERE id IN (
SELECT id FROM users
WHERE id_new IS NULL
LIMIT 10000
);
-- Repeat until complete
Swap Columns
BEGIN;
ALTER TABLE users RENAME COLUMN id TO id_old;
ALTER TABLE users RENAME COLUMN id_new TO id;
COMMIT;
Cleanup
ALTER TABLE users DROP COLUMN id_old;
☒ Anti-Pattern 4: Adding Foreign Keys Without NOT VALID
Blocking Operation
ALTER TABLE orders
ADD CONSTRAINT fk_orders_users
FOREIGN KEY (user_id) REFERENCES users(id);
Why it's bad: Holds a lock on both tables while scanning the entire referencing table to validate existing rows, blocking writes for the duration.
Do this instead:
-- Step 1: Add constraint as NOT VALID (doesn't check existing rows)
ALTER TABLE orders
ADD CONSTRAINT fk_orders_users
FOREIGN KEY (user_id) REFERENCES users(id) NOT VALID;
-- Step 2: Validate constraint (checks existing rows without exclusive lock)
ALTER TABLE orders VALIDATE CONSTRAINT fk_orders_users;
Why this works:
- NOT VALID adds the constraint instantly
- VALIDATE scans the table without blocking writes
- New rows are checked immediately
- Existing rows are validated in the background
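To see which constraints have been added NOT VALID but not yet validated, you can query the catalog (a quick check using standard pg_constraint columns):
SELECT conrelid::regclass AS table_name, conname AS constraint_name
FROM pg_constraint
WHERE NOT convalidated;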
Batching Large Updates
For updates affecting millions of rows, use batching to avoid long-running transactions:
DO $$
DECLARE
batch_size INT := 10000;
updated INT;
total_updated BIGINT := 0;
BEGIN
LOOP
UPDATE users
SET status = 'active'
WHERE status IS NULL
AND id IN (
SELECT id FROM users
WHERE status IS NULL
LIMIT batch_size
);
GET DIAGNOSTICS updated = ROW_COUNT;
total_updated := total_updated + updated;
RAISE NOTICE 'Updated % rows (total: %)', updated, total_updated;
EXIT WHEN updated = 0;
-- Commit each batch so locks are released (PostgreSQL 11+;
-- requires running the DO block outside an explicit transaction)
COMMIT;
-- Pause briefly to let other transactions through
PERFORM pg_sleep(0.1);
END LOOP;
RAISE NOTICE 'Migration complete. Total rows updated: %', total_updated;
END $$;
PostgreSQL Gotchas to Avoid
Quick Tips
TEXT vs VARCHAR: without a length modifier they behave identically; prefer TEXT (no arbitrary length limit)
SERIAL vs IDENTITY: prefer GENERATED ALWAYS AS IDENTITY (PostgreSQL 10+) for better SQL standard compliance (see the sketch below)
TIMESTAMP vs TIMESTAMPTZ: always use TIMESTAMPTZ for user-generated timestamps so values are stored normalized to UTC
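A minimal sketch tying these tips together (the events table is illustrative):
CREATE TABLE events (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,  -- instead of SERIAL/BIGSERIAL
payload TEXT,                                        -- TEXT rather than VARCHAR(n)
created_at TIMESTAMPTZ NOT NULL DEFAULT now()        -- TIMESTAMPTZ, not TIMESTAMP
);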
Version Control Workflow
Before Squashing
Create Backup Branch
git checkout -b backup-migrations-$(date +%Y%m%d)
git push origin backup-migrations-$(date +%Y%m%d)
Create Feature Branch
git checkout main
git checkout -b chore/consolidate-migrations
Run Squash
capysquash squash migrations/*.sql \
--output clean/ \
--safety conservative
Validate Results
capysquash validate migrations/ clean/
Replace Migrations
mv migrations/ migrations_original/
mv clean/ migrations/
Commit Changes
git add migrations/
git commit -m "chore: consolidate migrations (287 → 12 files)"Create Pull Request
gh pr create \
--title "Consolidate migrations" \
--body "Reduces migration count by 96%. See details in commit."Commit Message Template
chore: consolidate migrations
- Reduced from 287 to 12 files (96% reduction)
- Validated with pgsquash Docker validation
- Safety level: conservative
- All schemas identical
Consolidation details:
- CREATE + ALTER pairs: 89 consolidated
- Duplicate indexes removed: 23
- RLS policies optimized: 34
- Deployment time: 8min → 45sec
Team Collaboration
Communication Checklist
Before squashing:
- ✓ Announce in team chat
- ✓ Ensure no active migration work in progress
- ✓ Check for open PRs touching migrations
- ✓ Plan timing (end of sprint, between features)
After squashing:
- ✓ Share validation results
- ✓ Update team documentation
- ✓ Notify about file location changes
- ✓ Update migration trackers or wikis
Code Review Checklist
Reviewer Checklist
Reviewers should verify:
- Validation passed successfully
- Safety level is documented
- File count reduction is documented
- Commit message is clear and detailed
- No open PRs conflict with this change
- Deployment process updated (if needed)
- CI/CD pipelines still work correctly
- Team has been notified of changes
Validation & Testing
Always Validate
Never Skip Validation
# ALWAYS run validation before deploying
capysquash validate migrations/ clean/
Never skip validation for:
- Production migrations
- Shared branches (main, develop)
- Client projects
- First-time squashing
When Validation Fails
Don't Panic
Validation is designed to catch issues—this is expected behavior.
Review the Diff
capysquash validate --show-diff migrations/ clean/
Try More Conservative
capysquash squash migrations/*.sql --safety paranoid
Check for Edge Cases
Complex triggers, custom types, or ORM-specific patterns may need special handling.
Report Issues
If paranoid mode fails validation, file an issue with reproduction details.
Testing Migrations
Test against production-like data:
# 1. Clone production schema (without data)
pg_dump --schema-only production > schema.sql
psql test_db < schema.sql
# 2. Generate synthetic data
psql test_db <<EOF
INSERT INTO users (name, email)
SELECT
'User ' || generate_series,
'user' || generate_series || '@example.com'
FROM generate_series(1, 1000000);
EOF
# 3. Run migration
psql test_db < migration.sql
# 4. Measure performance
psql test_db -c "EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';"
Configuration Management
Environment-Specific Configs
Create different configurations for each environment:
File: pgsquash.dev.json
{
"safety_level": "aggressive",
"rules": {
"remove_dead_code": true,
"deduplicate_functions": true
},
"performance": {
"parallel_workers": 8
}
}
Usage:
capysquash squash migrations/*.sql --config pgsquash.dev.json
File: pgsquash.prod.json
{
"safety_level": "conservative",
"rules": {
"remove_dead_code": false,
"deduplicate_functions": false
},
"output": {
"preserve_comments": true,
"add_headers": true
}
}
Usage:
capysquash squash migrations/*.sql --config pgsquash.prod.json
Keep Config in Version Control
git add pgsquash.*.json
git commit -m "docs: add pgsquash configurations for each environment"Common Pitfalls
☒ Don't: Squash in Feature Branch
Bad Practice
# Bad: Squashing in feature branch
git checkout feature/new-table
capysquash squash migrations/*.sql --output clean/
# Now you'll have conflicts with main!
Do instead: Squash in a separate chore branch from main
☒ Don't: Skip Validation
Dangerous
# Bad: No validation
capysquash squash migrations/*.sql --output clean/
mv clean/ migrations/
git commit -m "squashed" # 😱 Hope and pray!Do instead: Always validate first—it takes seconds and prevents disasters
☒ Don't: Lose Original Migrations
Data Loss Risk
# Bad: Overwriting originals without backup
capysquash squash migrations/*.sql --output migrations/ --force
# Lost originals forever!
Do instead: Output to a new directory and keep backups
Rollback Strategy
Keep Backups
# Before consolidating
cp -r migrations/ migrations_backup_$(date +%Y%m%d)/
# Or use git tags
git tag backup-migrations-$(date +%Y%m%d)
git push origin --tags
Rollback if Needed
# If issues arise after consolidation
mv migrations/ migrations_consolidated/
mv migrations_backup_20240115/ migrations/
# Or with git
git revert <consolidation-commit>
ORM-Specific Patterns
Prisma
# Squash Prisma migrations
capysquash squash prisma/migrations/*/migration.sql --output consolidated/
# Keep Prisma metadata intact
# Don't delete _prisma_migrations table
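A quick post-consolidation sanity check that the tracking table mentioned above is still present and populated (a sketch; _prisma_migrations is Prisma's default table name):
-- Should return the number of previously applied migrations
SELECT count(*) AS applied_migrations FROM _prisma_migrations;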
Drizzle
# Squash Drizzle migrations
capysquash squash drizzle/migrations/*.sql --output consolidated/
# Preserve __drizzle_migrations table
Supabase
# Squash Supabase migrations
capysquash squash supabase/migrations/*.sql --output consolidated/
CAPYSQUASH automatically preserves Supabase-specific patterns (a spot-check query follows this list):
- Auth schema and functions
- Storage schema and buckets
- RLS policies
- Realtime functions
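To spot-check that RLS policies survived consolidation, per-table policy counts can be compared between the original and consolidated databases using the standard pg_policies view (a sketch):
SELECT schemaname, tablename, count(*) AS policy_count
FROM pg_policies
GROUP BY schemaname, tablename
ORDER BY schemaname, tablename;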
Monitoring & Maintenance
Track Migration Health
Create a monthly check script:
#!/bin/bash
# monthly-migration-check.sh
DATE=$(date +%Y-%m-%d)
mkdir -p reports
capysquash analyze migrations/*.sql \
--report-format json > "reports/$DATE.json"
# Extract metrics
FILE_COUNT=$(jq '.file_count' "reports/$DATE.json")
CONSOLIDATION=$(jq '.consolidation_percentage' "reports/$DATE.json")
echo "$DATE,$FILE_COUNT,$CONSOLIDATION" >> metrics.csv
# Alert if consolidation opportunity is high
if (( $(echo "$CONSOLIDATION > 60" | bc -l) )); then
echo "⚠️ Time to consolidate migrations ($CONSOLIDATION% opportunity)"
fi
Set Consolidation Thresholds
| Consolidation Opportunity | Action |
|---|---|
| < 30% | No action needed |
| 30-50% | Consider consolidating soon |
| 50-70% | Plan consolidation within sprint |
| > 70% | Consolidate ASAP (technical debt building) |