CAPYSQUASH

Migration Optimization Strategies: Advanced Patterns & Techniques

Master sophisticated migration consolidation patterns, safety levels, and optimization strategies for production PostgreSQL databases

Advanced Patterns & Techniques

Master the sophisticated patterns and algorithms that enable safe, aggressive migration consolidation while maintaining 100% schema equivalence.

pgsquash-engine employs multiple optimization strategies, each carefully designed to handle specific migration patterns while preserving semantic meaning. This guide explores these strategies in depth, helping you understand when and how to apply them for maximum impact.

OPTIMIZATION CATEGORIES

FOUR PILLARS OF OPTIMIZATION

STRUCTURAL

  • Table consolidation
  • Index optimization
  • Constraint merging
  • Type definitions

PERFORMANCE

  • Query optimization
  • Index strategy
  • Execution order
  • Parallel processing

SEMANTIC

  • Dead code removal
  • Function consolidation
  • Redundant elimination
  • Pattern recognition

SAFETY

  • Dependency preservation
  • Data integrity
  • Rollback capability
  • Validation guarantees

STRUCTURAL OPTIMIZATIONS

1. Table Lifecycle Consolidation

TABLE EVOLUTION PATTERNS

PATTERN: INCREMENTAL COLUMN ADDITION

BEFORE (12 migrations)

-- 001_create_users.sql
CREATE TABLE users (id SERIAL);

-- 002_add_email.sql
ALTER TABLE users ADD COLUMN email VARCHAR(255);

-- 003_add_name.sql
ALTER TABLE users ADD COLUMN name VARCHAR(100);

-- 004_add_created_at.sql
ALTER TABLE users ADD COLUMN created_at TIMESTAMP;

-- 005_add_updated_at.sql
ALTER TABLE users ADD COLUMN updated_at TIMESTAMP;

-- 006_make_email_not_null.sql
ALTER TABLE users ALTER COLUMN email SET NOT NULL;

-- 007_add_email_unique.sql
ALTER TABLE users ADD CONSTRAINT users_email_unique UNIQUE (email);

-- 008_add_status.sql
ALTER TABLE users ADD COLUMN status VARCHAR(50) DEFAULT 'active';

-- 009_add_phone.sql
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- 010_drop_phone.sql
ALTER TABLE users DROP COLUMN phone;

-- 011_add_settings.sql
ALTER TABLE users ADD COLUMN settings JSONB DEFAULT '{}';

-- 012_add_indexes.sql
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_status ON users(status);

AFTER (1 migration)

-- 001_users_optimized.sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    name VARCHAR(100),
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    status VARCHAR(50) DEFAULT 'active',
    settings JSONB DEFAULT '{}'
);

-- Indexes in same transaction
CREATE INDEX idx_users_status ON users(status);

📊 Results: 92% file reduction, 100% schema equivalent

PATTERN: TEMPORARY COLUMN CLEANUP

-- Original (3 migrations)
ALTER TABLE users ADD COLUMN temp_flag BOOLEAN;
UPDATE users SET temp_flag = TRUE WHERE status = 'active';
ALTER TABLE users DROP COLUMN temp_flag;

-- Optimized (0 migrations)
-- Entire lifecycle removed - temp_flag never existed in final schema

2. Index Strategy Optimization

INDEX CONSOLIDATION RULES

REDUNDANT INDEX REMOVAL

-- Redundant: Unique index already provides lookup
CREATE UNIQUE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_email_lookup ON users(email); -- Redundant

-- Optimized: Keep only unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);

COMPOSITE INDEX MERGING

-- Separate indexes (less efficient)
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_status ON users(status);
CREATE INDEX idx_users_created_at ON users(created_at);

-- Composite index (better for common queries)
CREATE INDEX idx_users_email_status_created ON users(email, status, created_at);

PARTIAL INDEX OPTIMIZATION

-- Full index (larger, slower)
CREATE INDEX idx_users_active ON users(status);

-- Partial index (smaller, faster for active users)
CREATE INDEX idx_users_active ON users(id) WHERE status = 'active';

PERFORMANCE OPTIMIZATIONS

1. Execution Order Optimization

DEPENDENCY-AWARE ORDERING

ORIGINAL: CHAOTIC ORDER

-- Wrong order that causes issues
CREATE INDEX idx_orders_user_id ON orders(user_id);  -- Fails: table doesn't exist
CREATE TABLE users (id SERIAL PRIMARY KEY);
CREATE TABLE orders (id SERIAL, user_id INTEGER REFERENCES users(id));
ALTER TABLE users ADD COLUMN email VARCHAR(255);

OPTIMIZED: DEPENDENCY RESOLVED

-- Correct dependency order
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255)
);

CREATE TABLE orders (
    id SERIAL,
    user_id INTEGER REFERENCES users(id)
);

CREATE INDEX idx_orders_user_id ON orders(user_id);

DEPENDENCY GRAPH ANALYSIS:

  • ► users (no dependencies) → executed first
  • ► orders (depends on users) → executed second
  • ► idx_orders_user_id (depends on orders) → executed last

2. Batch Operation Grouping

TRANSACTION OPTIMIZATION

INEFFICIENT: MULTIPLE TRANSACTIONS

-- Transaction 1
BEGIN;
CREATE TABLE users (id SERIAL);
COMMIT;

-- Transaction 2
BEGIN;
ALTER TABLE users ADD COLUMN email TEXT;
COMMIT;

-- Transaction 3
BEGIN;
ALTER TABLE users ADD COLUMN name TEXT;
COMMIT;

☒ Multiple round trips, slower deployment

OPTIMIZED: SINGLE TRANSACTION

-- Single optimized transaction
BEGIN;
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email TEXT,
    name TEXT
);
COMMIT;

☑ One round trip, atomic deployment

SEMANTIC OPTIMIZATIONS

1. Dead Code Elimination

SMART DEAD CODE DETECTION

UNUSED FUNCTION REMOVAL

-- Migration 023: Function created
CREATE OR REPLACE FUNCTION calculate_user_age(birth_date DATE)
RETURNS INTEGER AS $$
BEGIN
    RETURN EXTRACT(YEAR FROM AGE(NOW(), birth_date));
END;
$$ LANGUAGE plpgsql;

-- Migration 045: Function never used, replaced with different logic
CREATE OR REPLACE FUNCTION get_user_profile(user_id UUID)
RETURNS JSONB AS $$
BEGIN
    RETURN jsonb_build_object(
        'id', user_id,
        'age', EXTRACT(YEAR FROM AGE(NOW(), u.birth_date))
    ) FROM users u WHERE u.id = user_id;
END;
$$ LANGUAGE plpgsql;

-- Migration 089: Original function still exists but unused
-- capysquash detects this and removes the unused function

🧠 AI Analysis: calculate_user_age() never referenced in subsequent migrations

REDUNDANT CONSTRAINT CLEANUP

-- Original: Redundant constraints
ALTER TABLE users ADD CONSTRAINT users_email_not_null CHECK (email IS NOT NULL);
ALTER TABLE users ALTER COLUMN email SET NOT NULL;  -- Makes CHECK constraint redundant

-- Optimized: Keep only NOT NULL constraint
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
-- CHECK constraint automatically removed

2. Function Consolidation

FUNCTION EVOLUTION CONSOLIDATION

INCREMENTAL FUNCTION IMPROVEMENTS

-- Original (4 separate migrations)
CREATE FUNCTION send_notification(user_id UUID, message TEXT);
ALTER FUNCTION send_notification(user_id UUID, message TEXT) IMMUTABLE;
ALTER FUNCTION send_notification(user_id UUID, message TEXT) SECURITY DEFINER;
ALTER FUNCTION send_notification(user_id UUID, message TEXT) RETURNS BOOLEAN;

-- Optimized (single migration)
CREATE OR REPLACE FUNCTION send_notification(user_id UUID, message TEXT)
RETURNS BOOLEAN
IMMUTABLE
SECURITY DEFINER
AS $$
BEGIN
    -- Implementation
    RETURN TRUE;
END;
$$ LANGUAGE plpgsql;

SIMILAR FUNCTION MERGING

-- Original: Duplicate logic
CREATE FUNCTION get_admin_users() RETURNS TABLE(id UUID, email TEXT) AS $$...$$;
CREATE FUNCTION get_active_users() RETURNS TABLE(id UUID, email TEXT) AS $$...$$;

-- Optimized: Single parameterized function
CREATE FUNCTION get_users_by_status(status TEXT DEFAULT 'active')
RETURNS TABLE(id UUID, email TEXT) AS $$...$$;

ADVANCED PATTERNS

1. Multi-Table Relationship Optimization

RELATIONSHIP CONSOLIDATION

PATTERN: RELATED TABLES CREATED TOGETHER

-- Original: Scattered across 8 migrations
CREATE TABLE users (id SERIAL PRIMARY KEY);
CREATE TABLE profiles (user_id INTEGER REFERENCES users(id));
CREATE TABLE preferences (user_id INTEGER REFERENCES users(id));
CREATE TABLE settings (user_id INTEGER REFERENCES users(id));
ALTER TABLE users ADD COLUMN email VARCHAR(255);
ALTER TABLE profiles ADD COLUMN bio TEXT;
ALTER TABLE preferences ADD COLUMN theme VARCHAR(20);
ALTER TABLE settings ADD COLUMN notifications BOOLEAN DEFAULT TRUE;

-- Optimized: Logical grouping
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255)
);

CREATE TABLE profiles (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    bio TEXT
);

CREATE TABLE preferences (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    theme VARCHAR(20) DEFAULT 'light'
);

CREATE TABLE settings (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    notifications BOOLEAN DEFAULT TRUE
);

2. Schema Migration Patterns

SCHEMA ORGANIZATION STRATEGIES

FUNCTIONAL GROUPING

-- 001_core_tables.sql
CREATE TABLE users (...);
CREATE TABLE sessions (...);
CREATE TABLE organizations (...);

-- 002_content_tables.sql
CREATE TABLE posts (...);
CREATE TABLE comments (...);
CREATE TABLE likes (...);

-- 003_analytics_tables.sql
CREATE TABLE events (...);
CREATE TABLE metrics (...);
CREATE TABLE reports (...);

DEPENDENCY GROUPING

-- 001_foundation.sql
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TYPE user_status AS ENUM (...);
CREATE TYPE notification_type AS ENUM (...);

-- 002_core_schema.sql
CREATE TABLE users (...);
CREATE TABLE organizations (...);

-- 003_application_tables.sql
CREATE TABLE posts (...);
CREATE TABLE comments (...);

PLATFORM-SPECIFIC OPTIMIZATIONS

1. Supabase Optimization Patterns

SUPABASE-SPECIFIC STRATEGIES

AUTH SCHEMA OPTIMIZATION

-- Supabase auth integration
CREATE TABLE public.profiles (
    id UUID REFERENCES auth.users(id) PRIMARY KEY,
    email TEXT,
    full_name TEXT,
    avatar_url TEXT,
    website TEXT,
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Row Level Security optimized
ALTER TABLE public.profiles ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users can view own profile"
ON public.profiles FOR SELECT
USING (auth.uid() = id);

CREATE POLICY "Users can update own profile"
ON public.profiles FOR UPDATE
USING (auth.uid() = id);

-- Storage integration
CREATE TABLE public.attachments (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    user_id UUID REFERENCES auth.users(id),
    storage_path TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

REALTIME SUBSCRIPTIONS OPTIMIZATION

-- Optimized for Supabase Realtime
CREATE TABLE public.messages (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    room_id UUID NOT NULL,
    user_id UUID REFERENCES auth.users(id),
    content TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Realtime-optimized indexes
CREATE INDEX idx_messages_room_created ON public.messages(room_id, created_at DESC);
CREATE INDEX idx_messages_user_room ON public.messages(user_id, room_id);

-- RLS for realtime
CREATE POLICY "Users can view messages in their rooms"
ON public.messages FOR SELECT
USING (
    room_id IN (
        SELECT room_id FROM room_members
        WHERE user_id = auth.uid()
    )
);

2. Neon Branching Optimization

NEON BRANCH-AWARE MIGRATIONS

BRANCH-SAFE MIGRATIONS

-- Safe for branching: No external dependencies
CREATE TABLE feature_flags (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT UNIQUE NOT NULL,
    enabled BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Branch-optimized: Development-only features
CREATE TABLE debug_logs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    message TEXT,
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
) -- Only created in development branches

CI/CD INTEGRATION

# Neon-aware capysquash usage
capysquash squash migrations/ \
  --safety=standard \
  --neon-branch-aware \
  --output=optimized/

PERFORMANCE METRICS

Optimization Impact Analysis

OPTIMIZATION METRICS

87%

AVERAGE FILE REDUCTION

287 files → 37 files

73%

DEPLOYMENT TIME SAVED

12 minutes → 3 minutes

100%

SCHEMA ACCURACY

Docker-validated equivalence

REAL-WORLD EXAMPLE: E-COMMERCE PLATFORM

► Original: 523 migrations, 45MB, 18 minute deployment

► Optimized: 67 migrations, 8MB, 4 minute deployment

► Storage saved: 37MB (82% reduction)

► CI/CD time saved: 14 minutes per deploy

TROUBLESHOOTING

Common Optimization Issues

OPTIMIZATION CHALLENGES

CIRCULAR DEPENDENCIES

Problem:

Tables reference each other, preventing consolidation

Solution:

Defer foreign key constraints or use intermediate tables

-- capysquash automatically handles this
CREATE TABLE a (
    id SERIAL PRIMARY KEY,
    b_id INTEGER REFERENCES b(id)
);

CREATE TABLE b (
    id SERIAL PRIMARY KEY,
    a_id INTEGER REFERENCES a(id)
);

-- Becomes (with deferred constraints):
CREATE TABLE a (
    id SERIAL PRIMARY KEY,
    b_id INTEGER
);

CREATE TABLE b (
    id SERIAL PRIMARY KEY,
    a_id INTEGER
);

-- Constraints added after table creation
ALTER TABLE a ADD CONSTRAINT fk_a_b FOREIGN KEY (b_id) REFERENCES b(id) DEFERRABLE;
ALTER TABLE b ADD CONSTRAINT fk_b_a FOREIGN KEY (a_id) REFERENCES a(id) DEFERRABLE;

DATA MIGRATION CONFLICTS

Problem:

Schema changes require data transformation

Solution:

Separate schema and data migrations, use staging tables

BEST PRACTICES

Optimization Guidelines

OPTIMIZATION BEST PRACTICES

DO ☑
  • ► Group related changes together
  • ► Consider deployment order
  • ► Validate with Docker
  • ► Test on staging first
  • ► Document optimization decisions
  • ► Use appropriate safety levels
  • ► Monitor performance impact
DON'T ☒
  • ► Don't optimize prematurely
  • ► Don't break data dependencies
  • ► Don't ignore rollback scenarios
  • ► Don't mix schema and data changes
  • ► Don't skip validation
  • ► Don't optimize in production
  • ► Don't lose migration history

NEXT STEPS


OPTIMIZATION PHILOSOPHY

The goal isn't just fewer files - it's better organization, faster deployments, and maintainable schemas. Smart optimization improves developer experience while maintaining 100% safety.

How is this guide?

On this page