CAPYSQUASH

AI Integration

Leverage AI-powered features for smarter migration consolidation

AI INTEGRATION

capysquash includes optional AI-powered analysis capabilities that enhance migration consolidation through semantic understanding of SQL code.

OVERVIEW

AI-ENHANCED CAPABILITIES

☑ SEMANTIC ANALYSIS

Identify semantically equivalent functions despite different implementations

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ DEAD CODE DETECTION</h4>
  <p className="font-bold text-sm">Find unused functions and procedures across migrations</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ AUTH PATTERN RECOGNITION</h4>
  <p className="font-bold text-sm">Detect Supabase, Clerk, Auth0, NextAuth patterns</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ PERFORMANCE OPTIMIZATION</h4>
  <p className="font-bold text-sm">Suggest query and schema improvements</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ COMPLEXITY ANALYSIS</h4>
  <p className="font-bold text-sm">Assess function and migration complexity</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ CONSISTENCY VALIDATION</h4>
  <p className="font-bold text-sm">Verify logical consistency between schemas</p>
</div>

SUPPORTED AI PROVIDERS

CLAUDE (ANTHROPIC)

Model: claude-3-5-sonnet-20241022

Best for semantic analysis and code understanding

<div className="space-y-2 text-sm">
  <div className="flex justify-between font-bold">
    <span>Function equivalence</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Dead code detection</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Auth patterns</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Performance</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
</div>

OPENAI

Model: gpt-4

Best for general analysis and optimization

<div className="space-y-2 text-sm">
  <div className="flex justify-between font-bold">
    <span>Function equivalence</span>
    <span className="text-primary">⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Dead code detection</span>
    <span className="text-primary">⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Auth patterns</span>
    <span className="text-primary">⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Performance</span>
    <span className="text-primary">⭐⭐⭐⭐</span>
  </div>
</div>

SETUP

Step 1: Get API Key

CLAUDE (RECOMMENDED)

Get API Key: console.anthropic.com

OPENAI (ALTERNATIVE)

Get API Key: platform.openai.com

# Claude (Recommended)
export ANTHROPIC_API_KEY="sk-ant-api03-..."

# Or add to shell profile
echo 'export ANTHROPIC_API_KEY="sk-ant-api03-..."' >> ~/.zshrc
source ~/.zshrc

# OpenAI (Alternative)
export OPENAI_API_KEY="sk-..."
echo 'export OPENAI_API_KEY="sk-..."' >> ~/.zshrc
source ~/.zshrc

Step 2: Verify Setup

capysquash ai-test

Expected output:

🤖 Testing AI Provider Integrations

Testing Claude (Anthropic)...
☑ Provider: Claude
- Status: Available
- Model: claude-3-5-sonnet-20241022
- Capabilities:
  ► Function semantic equivalence
  ► Dead code detection
  ► Authentication pattern recognition
  ► Performance optimization suggestions
  ► Complexity analysis
  ► Code coverage analysis

AI Integration Status: 1 provider available

Step 3: Test AI Features

capysquash ai-demo

This demonstrates all AI capabilities with sample data.

AI FEATURES IN DETAIL

1. Function Semantic Equivalence

Identifies functions that are semantically equivalent despite different implementations.

FUNCTION 1

CREATE FUNCTION count_active_users()
RETURNS INTEGER AS $$
BEGIN
  RETURN (SELECT COUNT(*)
          FROM users
          WHERE status = 'active');
END;
$$ LANGUAGE plpgsql;

FUNCTION 2 (EQUIVALENT)

CREATE FUNCTION total_active_users()
RETURNS INTEGER AS $$
DECLARE
  user_count INTEGER;
BEGIN
  SELECT COUNT(*) INTO user_count
  FROM users
  WHERE status = 'active';
  RETURN user_count;
END;
$$ LANGUAGE plpgsql;

AI ANALYSIS:

"These functions are semantically equivalent - both return count of active users"

2. Dead Code Detection

Finds unused functions, triggers, and database objects.

Analysis types:

  • Static analysis (no references in migrations)
  • Usage pattern analysis (exists but never called)
  • Dependency analysis (no dependents)

EXAMPLE OUTPUT:

Dead Code Functions: 3
- old_user_cleanup (last used: never)
- legacy_email_validator (replaced by new version)
- unused_helper_function (no references)

3. Authentication Pattern Detection

Identifies auth/security patterns in your schema.

Detected patterns:

  • Supabase Auth (RLS policies, auth schema)
  • Clerk JWT validation
  • Auth0 integration
  • NextAuth database tables
  • Custom JWT handling
  • Firebase Auth
  • Storage policies

EXAMPLE DETECTION:

Authentication patterns detected:
► SUPABASE_AUTH: RLS policies on users table
► CLERK_JWT_V2: JWT validation function
► STORAGE_POLICY: File upload policies

Extra validation recommended for auth-related changes

4. Performance Optimization

Suggests performance improvements based on schema and query patterns.

Optimization types:

  • Missing or redundant indexes

  • Better query structures

  • Partitioning and materialized views

  • More efficient data types

  • Performance-impacting constraints

    EXAMPLE SUGGESTIONS:

    • Add composite index on (user_id, created_at) for events table
    • Consider BRIN index for timestamp columns on large tables
    • Use JSONB instead of JSON for better performance
    • Partition events table by date for faster queries
    • Add partial index on users(email) WHERE status = 'active'

USAGE IN WORKFLOWS

Analyze-Deep Workflow

# Comprehensive AI analysis
export ANTHROPIC_API_KEY="sk-ant-..."
capysquash analyze-deep migrations/*.sql

Provides:

  • Complete dependency analysis
  • AI semantic insights
  • Authentication patterns
  • Dead code identification
  • Performance suggestions
  • Risk assessment

SAFE Workflow

# Production-ready with AI safety validation
capysquash safe migrations/*.sql --output production/

AI checks:

  • Auth pattern detection (extra care needed)
  • Schema consistency validation
  • Dead code warnings
  • Risk assessment

FAST Workflow

# Development with AI optimization
capysquash fast migrations/*.sql --output dev/

AI optimizations:

  • Function deduplication
  • Performance suggestions auto-applied
  • Equivalent function consolidation

COST MANAGEMENT

⚠️ API USAGE COSTS

Pricing:

  • Claude: ~$3 per million tokens
  • OpenAI GPT-4: ~$30 per million tokens

Typical usage:

  • analyze-deep: 1000-5000 tokens (~$0.01-0.05 per run)
  • safe workflow: 500-2000 tokens (~$0.005-0.02 per run)
  • fast workflow: 1000-3000 tokens (~$0.01-0.03 per run)

Optimization tips:

  • Use AI only for deep analysis (infrequent)
  • Use standard workflow for regular squashing (no AI)
  • Reserve AI workflows for important cases

BEST PRACTICES

☑ DO

  • Use AI for insights and analysis
  • Validate AI-influenced changes
  • Start with analysis before squashing
  • Monitor API usage costs
  • Trust validation, not just AI

✗ DON'T

  • Blindly trust AI recommendations
  • Skip validation after AI squashing
  • Use AI for every operation
  • Deploy without testing
  • Ignore validation errors

TROUBLESHOOTING

ERROR: "AI analyzer not initialized"

Cause: No API keys configured

export ANTHROPIC_API_KEY="sk-ant-..."
capysquash ai-test

ERROR: "Provider unavailable"

Cause: Invalid or expired API key

Regenerate key at provider dashboard

SLOW AI ANALYSIS

Solution: Use AI selectively

# No AI for routine operations
capysquash squash migrations/*.sql --safety standard

# AI only for analysis
capysquash analyze-deep migrations/*.sql

NEXT STEPS

How is this guide?

On this page