Optional AI capabilities for advanced migration analysis in CAPYSQUASH and capysquash-cli

AI-POWERED FEATURES

CAPYSQUASH and capysquash-cli offer optional AI capabilities for advanced migration analysis. AI features are completely optional - the core engine works great without them.

OPTIONAL ENHANCEMENT

AI features enhance analysis with semantic understanding, but aren't required. The core engine uses PostgreSQL's parser and Docker validation - AI adds automatic fixing and deeper insights on top.

SUPPORTED AI PROVIDERS

capysquash supports three AI providers. Configure at least one to use AI features.

ANTHROPIC (CLAUDE)

☑ Recommended

► Best for SQL analysis
► Excellent reasoning
► Fast responses
► Reliable output format

OPENAI (GPT-4)

☑ Supported

► Good for general tasks
► Wide model selection
► Consistent quality
► Large context window

AZURE OPENAI

☑ Supported

► Enterprise compliance
► Data residency control
► Custom deployments
► SLA guarantees

SETUP

Configure Anthropic (Claude) - Recommended

# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Test connection
capysquash ai-test

Get API key: console.anthropic.com

Configure OpenAI (GPT-4)

# Set API key
export OPENAI_API_KEY="sk-..."

# Test connection
capysquash ai-test

Get API key: platform.openai.com

Configure Azure OpenAI

# Set endpoint and key
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="..."

# Test connection
capysquash ai-test

Get endpoint and key: Azure Portal
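Before running capysquash ai-test, it can help to confirm which provider variables are present in the current shell. The following is a minimal sketch and not part of capysquash: the function name list_configured_providers is hypothetical, and it only inspects the environment variables named above without printing their values.

```shell
# Hypothetical helper: report which AI providers have credentials set.
# It checks only the variable names documented above and never echoes keys.
list_configured_providers() {
  found=0
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "anthropic"; found=1
  fi
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"; found=1
  fi
  # Azure needs both the endpoint and the key.
  if [ -n "${AZURE_OPENAI_ENDPOINT:-}" ] && [ -n "${AZURE_OPENAI_API_KEY:-}" ]; then
    echo "azure-openai"; found=1
  fi
  # Non-zero exit status when nothing is configured.
  [ "$found" -eq 1 ]
}
```

One way to use it: list_configured_providers && capysquash ai-test, so the connection test only runs when at least one provider looks configured.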

AI COMMANDS

ai-test - Test Provider Connection

CONNECTION TESTING

Verify AI provider setup before using other AI features.

capysquash ai-test

What it checks:

► API key validity
► Network connectivity
► Provider availability
► Basic query functionality

Example output:

Testing AI Providers...

☑ Anthropic (Claude):
   Model: claude-3-5-sonnet-20241022
   Status: Connected
   Response time: 342ms

☒ OpenAI:
   Status: Not configured
   Set OPENAI_API_KEY to enable

☒ Azure OpenAI:
   Status: Not configured

Summary: 1 provider ready (Anthropic)

ai-demo - See AI in Action

DEMONSTRATION MODE

See what AI can do with sample migrations.

capysquash ai-demo

Demonstrates:

► Semantic function similarity detection
► Dead code identification
► Complexity scoring
► Platform pattern recognition

Example demonstrations:

Function Similarity

► Shows two functions that do the same thing differently
► AI identifies semantic equivalence
► Suggests consolidation

Dead Code Detection

► Identifies unused functions and triggers
► Shows why they're unused (no callers)
► Suggests safe removal

Complexity Analysis

► Scores function complexity (1-10)
► Identifies refactoring candidates
► Suggests optimization strategies

ai-fix - Automatic Migration Fixing

⚠️ EXPERIMENTAL: AUTO-FIX

AI-assisted fixing of migration errors with automatic retry.

capysquash ai-fix [migration-directory] [options]

⚠️ EXPERIMENTAL STATUS

This feature is experimental. Always review fixes before deploying. ai-fix creates backups automatically, but try it in a development environment first.

How it works:

1. Analyze - Detects errors in migrations
2. AI Suggest - Uses AI to propose fixes
3. Apply - Applies fixes (with confirmation unless --auto-apply)
4. Validate - Runs Docker validation
5. Repeat - If validation fails, tries again (up to --max-attempts)
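The cycle above can be sketched as a small shell function. This only illustrates the bounded apply, validate, repeat pattern and is not capysquash's implementation: fix_loop and the two commands passed to it are hypothetical stand-ins.

```shell
# Sketch of a bounded fix-and-validate loop (illustrative only).
# Usage: fix_loop MAX_ATTEMPTS APPLY_CMD VALIDATE_CMD
#   APPLY_CMD    - hypothetical command that proposes and applies one fix
#   VALIDATE_CMD - hypothetical command that exits 0 when migrations validate
fix_loop() {
  max_attempts=$1
  apply_cmd=$2
  validate_cmd=$3
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    echo "attempt $attempt of $max_attempts"
    "$apply_cmd" || return 1        # stop if a fix cannot be applied
    if "$validate_cmd"; then
      echo "validation passed"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  echo "giving up after $max_attempts attempts" >&2
  return 1
}
```

The attempt cap plays the role that --max-attempts plays for ai-fix: without it, a fix that never validates would loop forever.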

Options:

--max-attempts <n>

Maximum fix attempts (default: 5)

--auto-apply

Apply fixes without confirmation

--verbose

Show AI reasoning and detailed process

Usage examples:

# Interactive fixing (recommended)
capysquash ai-fix migrations/

# See AI reasoning
capysquash ai-fix migrations/ --verbose

# Automatic fixing with more attempts
capysquash ai-fix migrations/ --max-attempts=10 --auto-apply

What it can fix:

☑ Syntax errors
☑ Missing semicolons
☑ Invalid SQL statements
☑ Type mismatches
☑ Dependency ordering issues
☑ Schema inconsistencies

Limitations:

☒ Cannot fix logic errors
☒ Cannot fix data-dependent issues
☒ May struggle with complex migrations
☒ Requires manual review of fixes
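Because fixes need manual review, one option is to take your own snapshot of the migration directory before running ai-fix and diff it afterwards. The sketch below is an illustration under assumptions: snapshot_migrations is a hypothetical helper, and it is independent of the backups that ai-fix creates on its own.

```shell
# Hypothetical helper: copy a migrations directory to a timestamped sibling
# so changes made by a later tool run can be reviewed with diff.
snapshot_migrations() {
  src=${1%/}                                   # strip a trailing slash
  dest="${src}.snapshot.$(date +%Y%m%d%H%M%S)"
  cp -R "$src" "$dest" || return 1
  echo "$dest"                                 # print the snapshot path
}

# Example flow (commented out so the block has no side effects when sourced):
# snap=$(snapshot_migrations migrations/)
# capysquash ai-fix migrations/
# diff -ru "$snap" migrations/    # review every change before committing
```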

analyze-deep - AI-Enhanced Analysis

Available as a pre-configured workflow. See Commands Reference.

capysquash analyze-deep migrations/

Provides:

► Full dependency graph analysis
► AI-powered semantic analysis
► Dead code detection
► Function complexity scoring
► Platform pattern detection
► Optimization suggestions
► Risk assessment

WHAT AI PROVIDES

Semantic Analysis

BEYOND SYNTAX

AI analysis considers what code does, not just how it is written.

-- These two functions are semantically equivalent
-- AI can detect this even with different implementations
-- (COUNT(*) returns BIGINT, so both functions declare that return type)

CREATE FUNCTION count_active_users_v1()
RETURNS BIGINT AS $$
  SELECT COUNT(*) FROM users WHERE status = 'active';
$$ LANGUAGE SQL;

CREATE FUNCTION count_active_users_v2()
RETURNS BIGINT AS $$
DECLARE
  result BIGINT;
BEGIN
  SELECT COUNT(*) INTO result FROM users WHERE status = 'active';
  RETURN result;
END;
$$ LANGUAGE PLPGSQL;

Dead Code Detection

UNUSED CODE IDENTIFICATION

AI traces function calls and trigger usage to find dead code.

► Functions that are never called
► Triggers on non-existent tables
► Views that are never queried
► Procedures without callers

Complexity Scoring

REFACTORING CANDIDATES

AI scores function complexity on a scale of 1-10.

► 1-3: SIMPLE - Clean, maintainable
► 4-7: MODERATE - Could be simplified
► 8-10: COMPLEX - Refactor recommended
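If you post-process scores in a script, the bands above map to a simple lookup. This is a sketch only: complexity_band is a hypothetical helper, and how capysquash reports scores is not covered here.

```shell
# Map a 1-10 complexity score to the band names used above.
complexity_band() {
  score=$1
  case $score in
    1|2|3)    echo "SIMPLE" ;;
    4|5|6|7)  echo "MODERATE" ;;
    8|9|10)   echo "COMPLEX" ;;
    *)        echo "invalid score: $score" >&2; return 1 ;;
  esac
}
```

For example, complexity_band 8 prints COMPLEX, which marks the function as a refactoring candidate.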

Platform Pattern Recognition

FRAMEWORK DETECTION

AI identifies platform-specific patterns and best practices.

► Supabase: auth.users, storage.buckets, RLS policies
► Clerk: JWT v2 schemas, authentication tables
► Auth0: Identity management patterns
► NextAuth: Session and account schemas
► Neon: Serverless optimizations

COST CONSIDERATIONS

AI features use paid API services. Be mindful of usage.

💰 API COSTS

ANTHROPIC PRICING

► Input: ~$3 per 1M tokens
► Output: ~$15 per 1M tokens
► Typical analysis: $0.05-$0.20

OPENAI PRICING

► GPT-4: ~$30 input / ~$60 output per 1M tokens
► GPT-4-turbo: ~$10 input / ~$30 output per 1M tokens
► Typical analysis: $0.10-$0.50
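To see how per-token rates turn into a per-run figure, the arithmetic can be sketched with awk. The helper name estimate_cost and the token counts used in the example are illustrative assumptions; real usage depends on the size of your migrations and the provider's current prices.

```shell
# Estimate the cost of one request in USD from token counts and per-1M rates.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_RATE OUTPUT_RATE
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_rate="$3" -v out_rate="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000 }'
}
```

With the approximate Anthropic rates listed above, estimate_cost 20000 2000 3 15 works out to (20000 × 3 + 2000 × 15) / 1,000,000, or about $0.09, which falls inside the typical range quoted for an analysis.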

💡 Tip: Use AI features selectively. Run analyze first (free), then use analyze-deep (paid) only when you need deeper insights.

PRIVACY & SECURITY

🔒 DATA HANDLING

WHAT GETS SENT TO AI

► SQL migration content (schema definitions)
► Function bodies and logic
► Table structures and relationships
► Table data is never sent: only migration SQL is shared, not the rows in your database

PROVIDER POLICIES

► Anthropic: Does not train on your data
► OpenAI: API data not used for training (per API terms)
► Azure OpenAI: Full enterprise compliance and data residency

RECOMMENDATIONS

► Review provider terms before use
► Use Azure OpenAI for strict compliance needs
► Don't include sensitive comments in migrations
► Consider scrubbing proprietary naming if needed
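To act on the recommendation about comments, one approach is to list every SQL comment in the migration directory and review them before enabling AI features. This is a minimal sketch using standard find and grep; list_sql_comments is a hypothetical helper, and the simple pattern will also flag "--" inside string literals, which is acceptable for a manual review list.

```shell
# Hypothetical helper: list SQL comments in a migrations directory so they can
# be reviewed for sensitive content before the files are sent to a provider.
# Matches "--" line comments and the start of "/* ... */" block comments.
# /dev/null is passed as an extra file so grep always prints file names.
list_sql_comments() {
  find "$1" -type f -name '*.sql' -exec grep -nE -e '--|/\*' /dev/null {} +
}
```

Each output line has the form path:line:text, so the result can be scanned by eye or piped into another filter.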

WHEN TO USE AI FEATURES

💡 Using CAPYSQUASH Platform? AI features are available in Pro and Enterprise plans with built-in provider management. See subscription plans for details.

☑ GOOD USE CASES

► Large codebases - Finding dead code in 1000+ functions
► Legacy migrations - Understanding old undocumented code
► Debugging - Quick fixes for syntax errors
► Complexity audit - Identifying refactoring candidates
► Learning - Understanding what migrations do

⚠️ SKIP AI FOR

► Small projects - 10-20 migrations don't need AI
► Simple schemas - Basic CRUD doesn't benefit much
► CI/CD pipelines - Adds cost and latency
► Sensitive schemas - If migration SQL cannot leave your environment, use core features only
► Cost constraints - Core engine is free

BEST PRACTICES

💡 OPTIMIZATION TIPS

1. START FREE, ADD AI WHEN NEEDED

Run capysquash analyze first (free). Only use analyze-deep (paid) when you need semantic insights.

2. USE AI-FIX IN DEVELOPMENT

Test ai-fix on dev migrations first. Review all fixes before applying to production schemas.

3. CACHE ANALYSIS RESULTS

Save analyze-deep output to avoid re-running expensive analysis on unchanged migrations.
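One way to know whether migrations have changed is to fingerprint the directory and key the saved output on that value. The sketch below assumes sha256sum is installed (GNU coreutils; on macOS, shasum -a 256 is the usual substitute). The helper name, the .cache path, and the assumption that analyze-deep writes its report to standard output are all illustrative.

```shell
# Hypothetical helper: print one checksum that changes whenever any .sql file
# under the given directory changes.
migrations_fingerprint() {
  (
    cd "$1" || exit 1
    # Sort for a stable order, hash each file, then hash the list of hashes.
    find . -type f -name '*.sql' | LC_ALL=C sort | while IFS= read -r f; do
      sha256sum "$f"
    done | sha256sum | cut -d ' ' -f 1
  )
}

# Example flow (commented out; paths are illustrative):
# mkdir -p .cache
# fp=$(migrations_fingerprint migrations/)
# [ -f ".cache/analyze-deep.$fp.txt" ] ||
#   capysquash analyze-deep migrations/ > ".cache/analyze-deep.$fp.txt"
```

Because the per-file hash lines include relative paths, renaming a migration also changes the fingerprint, not only editing its contents.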

4. MONITOR API COSTS

Check your AI provider dashboard regularly. Set billing alerts to avoid surprises.

TROUBLESHOOTING

API Key Not Working

# Verify the key is set without printing it
echo "${ANTHROPIC_API_KEY:+ANTHROPIC_API_KEY is set}"

# If nothing prints, re-export the key (check for typos and stray spaces)
export ANTHROPIC_API_KEY="sk-ant-..."

# Test connection
capysquash ai-test

Rate Limiting

Error: "Rate limit exceeded"

► Wait a few minutes and retry
► Check your API plan limits
► Process migrations in smaller batches
► Upgrade your API plan if needed
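The "wait and retry" advice can be automated with a generic backoff wrapper. This is a sketch, not a capysquash feature: retry_with_backoff is a hypothetical helper, it retries on any non-zero exit status rather than on rate limits specifically, and it assumes the wrapped command exits non-zero when it fails.

```shell
# Sketch: retry a command with exponential backoff (illustrative only).
# Usage: retry_with_backoff MAX_TRIES BASE_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  max_tries=$1
  delay=$2
  shift 2
  try=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$try" -ge "$max_tries" ]; then
      echo "failed after $try tries" >&2
      return 1
    fi
    echo "try $try failed; waiting ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))          # double the wait after each failure
    try=$((try + 1))
  done
}

# Example (commented out): waits 30s, then 60s, then 120s between tries.
# retry_with_backoff 4 30 capysquash analyze-deep migrations/
```

Doubling the delay gives the provider's rate window time to reset instead of hitting it again immediately.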

AI Provides Poor Results

If AI suggestions seem off:

► Try a different provider (Claude vs GPT-4)
► Use --verbose to see AI reasoning
► Verify migrations are well-formed SQL
► Remember: AI is optional, core features still work

NEXT STEPS

► CAPYSQUASH Platform - AI features in Pro and Enterprise plans
► capysquash-cli Commands - CLI commands including AI features
► Configuration Guide - Configure AI providers
► Troubleshooting - Common issues and solutions

Remember: AI features are completely optional. CAPYSQUASH and the underlying engine provide excellent migration consolidation without any AI.
