AI INTEGRATION

capysquash includes optional AI-powered analysis capabilities that enhance migration consolidation through semantic understanding of SQL code.

OVERVIEW

AI-ENHANCED CAPABILITIES

☑ SEMANTIC ANALYSIS

Identify semantically equivalent functions despite different implementations

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ DEAD CODE DETECTION</h4>
  <p className="font-bold text-sm">Find unused functions and procedures across migrations</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ AUTH PATTERN RECOGNITION</h4>
  <p className="font-bold text-sm">Detect Supabase, Clerk, Auth0, NextAuth patterns</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ PERFORMANCE OPTIMIZATION</h4>
  <p className="font-bold text-sm">Suggest query and schema improvements</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ COMPLEXITY ANALYSIS</h4>
  <p className="font-bold text-sm">Assess function and migration complexity</p>
</div>

<div className="brutalist-card-sm p-4">
  <h4 className="font-black uppercase mb-2">☑ CONSISTENCY VALIDATION</h4>
  <p className="font-bold text-sm">Verify logical consistency between schemas</p>
</div>

SUPPORTED AI PROVIDERS

CLAUDE (ANTHROPIC)

Model: claude-3-5-sonnet-20241022

Best for semantic analysis and code understanding

<div className="space-y-2 text-sm">
  <div className="flex justify-between font-bold">
    <span>Function equivalence</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Dead code detection</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Auth patterns</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Performance</span>
    <span className="text-primary">⭐⭐⭐⭐⭐</span>
  </div>
</div>

OPENAI

Model: gpt-4

Best for general analysis and optimization

<div className="space-y-2 text-sm">
  <div className="flex justify-between font-bold">
    <span>Function equivalence</span>
    <span className="text-primary">⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Dead code detection</span>
    <span className="text-primary">⭐⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Auth patterns</span>
    <span className="text-primary">⭐⭐⭐</span>
  </div>
  <div className="flex justify-between font-bold">
    <span>Performance</span>
    <span className="text-primary">⭐⭐⭐⭐</span>
  </div>
</div>

SETUP

Step 1: Get API Key

CLAUDE (RECOMMENDED)

Get API Key: console.anthropic.com

OPENAI (ALTERNATIVE)

Get API Key: platform.openai.com

# Claude (Recommended)
export ANTHROPIC_API_KEY="sk-ant-api03-..."

# Or add to shell profile
echo 'export ANTHROPIC_API_KEY="sk-ant-api03-..."' >> ~/.zshrc
source ~/.zshrc

# OpenAI (Alternative)
export OPENAI_API_KEY="sk-..."
echo 'export OPENAI_API_KEY="sk-..."' >> ~/.zshrc
source ~/.zshrc

Step 2: Verify Setup

capysquash ai-test

Expected output:

🤖 Testing AI Provider Integrations

Testing Claude (Anthropic)...
☑ Provider: Claude
- Status: Available
- Model: claude-3-5-sonnet-20241022
- Capabilities:
  ► Function semantic equivalence
  ► Dead code detection
  ► Authentication pattern recognition
  ► Performance optimization suggestions
  ► Complexity analysis
  ► Code coverage analysis

AI Integration Status: 1 provider available

Step 3: Test AI Features

capysquash ai-demo

This demonstrates all AI capabilities with sample data.

AI FEATURES IN DETAIL

1. Function Semantic Equivalence

Identifies functions that are semantically equivalent despite different implementations.

FUNCTION 1

CREATE FUNCTION count_active_users()
RETURNS INTEGER AS $$
BEGIN
  RETURN (SELECT COUNT(*)
          FROM users
          WHERE status = 'active');
END;
$$ LANGUAGE plpgsql;

FUNCTION 2 (EQUIVALENT)

CREATE FUNCTION total_active_users()
RETURNS INTEGER AS $$
DECLARE
  user_count INTEGER;
BEGIN
  SELECT COUNT(*) INTO user_count
  FROM users
  WHERE status = 'active';
  RETURN user_count;
END;
$$ LANGUAGE plpgsql;

AI ANALYSIS:

"These functions are semantically equivalent - both return count of active users"

2. Dead Code Detection

Finds unused functions, triggers, and database objects.

Analysis types:

Static analysis (no references in migrations)
Usage pattern analysis (exists but never called)
Dependency analysis (no dependents)

EXAMPLE OUTPUT:

Dead Code Functions: 3
- old_user_cleanup (last used: never)
- legacy_email_validator (replaced by new version)
- unused_helper_function (no references)

3. Authentication Pattern Detection

Identifies auth/security patterns in your schema.

Detected patterns:

Supabase Auth (RLS policies, auth schema)
Clerk JWT validation
Auth0 integration
NextAuth database tables
Custom JWT handling
Firebase Auth
Storage policies

EXAMPLE DETECTION:

Authentication patterns detected:
► SUPABASE_AUTH: RLS policies on users table
► CLERK_JWT_V2: JWT validation function
► STORAGE_POLICY: File upload policies

Extra validation recommended for auth-related changes

4. Performance Optimization

Suggests performance improvements based on schema and query patterns.

Optimization types:

Missing or redundant indexes
Better query structures
Partitioning and materialized views
More efficient data types
Performance-impacting constraints
EXAMPLE SUGGESTIONS:
- Add composite index on (user_id, created_at) for events table
- Consider BRIN index for timestamp columns on large tables
- Use JSONB instead of JSON for better performance
- Partition events table by date for faster queries
- Add partial index on users(email) WHERE status = 'active'

USAGE IN WORKFLOWS

Analyze-Deep Workflow

# Comprehensive AI analysis
export ANTHROPIC_API_KEY="sk-ant-..."
capysquash analyze-deep migrations/*.sql

Provides:

Complete dependency analysis
AI semantic insights
Authentication patterns
Dead code identification
Performance suggestions
Risk assessment

SAFE Workflow

# Production-ready with AI safety validation
capysquash safe migrations/*.sql --output production/

AI checks:

Auth pattern detection (extra care needed)
Schema consistency validation
Dead code warnings
Risk assessment

FAST Workflow

# Development with AI optimization
capysquash fast migrations/*.sql --output dev/

AI optimizations:

Function deduplication
Performance suggestions auto-applied
Equivalent function consolidation

COST MANAGEMENT

⚠️ API USAGE COSTS

Pricing:

Claude: ~$3 per million tokens
OpenAI GPT-4: ~$30 per million tokens

Typical usage:

analyze-deep: 1000-5000 tokens (~$0.01-0.05 per run)
safe workflow: 500-2000 tokens (~$0.005-0.02 per run)
fast workflow: 1000-3000 tokens (~$0.01-0.03 per run)

Optimization tips:

Use AI only for deep analysis (infrequent)
Use standard workflow for regular squashing (no AI)
Reserve AI workflows for important cases

BEST PRACTICES

☑ DO

Use AI for insights and analysis
Validate AI-influenced changes
Start with analysis before squashing
Monitor API usage costs
Trust validation, not just AI

✗ DON'T

Blindly trust AI recommendations
Skip validation after AI squashing
Use AI for every operation
Deploy without testing
Ignore validation errors

TROUBLESHOOTING

ERROR: "AI analyzer not initialized"

Cause: No API keys configured

export ANTHROPIC_API_KEY="sk-ant-..."
capysquash ai-test

ERROR: "Provider unavailable"

Cause: Invalid or expired API key

Regenerate key at provider dashboard

SLOW AI ANALYSIS

Solution: Use AI selectively

# No AI for routine operations
capysquash squash migrations/*.sql --safety standard

# AI only for analysis
capysquash analyze-deep migrations/*.sql

NEXT STEPS

pgsquash-engine Commands - Using AI-enhanced commands
Safety Levels - AI in different safety contexts
Configuration - Advanced AI settings

AI Integration