Testing Architecture
Overview
Claude Code uses a multi-layered testing strategy that combines unit tests, integration tests, "VCR" tests that replay recorded API interactions, and a small number of end-to-end tests. Given the complexity of LLM interactions, traditional mocking isn't enough: we need to record and replay actual API responses.
┌─────────────────────────────────────────────────────────────────────────────┐
│                               TESTING PYRAMID                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│               ┌─────────┐                                                   │
│               │   E2E   │                Full CLI interactions              │
│               │  Tests  │                (expensive, full API calls)        │
│               └────┬────┘                                                   │
│                    │                                                        │
│              ┌─────▼─────┐                                                  │
│              │    VCR    │               Recorded API interactions          │
│              │   Tests   │               (deterministic, fast)              │
│              └─────┬─────┘                                                  │
│                    │                                                        │
│         ┌──────────▼──────────┐                                             │
│         │     Integration     │          Services, tools with mocks         │
│         │        Tests        │                                             │
│         └──────────┬──────────┘                                             │
│                    │                                                        │
│  ┌─────────────────▼─────────────────┐                                      │
│  │            Unit Tests             │   Pure functions, utils              │
│  │          (jest/bun test)          │                                      │
│  └───────────────────────────────────┘                                      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
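The record-and-replay idea behind the VCR layer in the pyramid above can be sketched roughly as follows. This is a minimal illustration, not Claude Code's actual implementation: the names `Cassette`, `stableKey`, and `withVcr` are hypothetical, the cassette is kept in memory rather than in a fixture file, and matching requests by a key-sorted serialization is an assumption made for the example.

```typescript
// Hypothetical sketch of a VCR-style wrapper. None of these names come from
// the Claude Code codebase; they exist only to illustrate the technique.

type Mode = "record" | "replay";

// In-memory stand-in for a recorded fixture: request key -> recorded response.
type Cassette<Res> = Map<string, Res>;

// Serialize a request with object keys sorted, so that logically equal
// requests map to the same cassette entry regardless of key order.
function stableKey(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(stableKey).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + stableKey(obj[k]));
    return "{" + parts.join(",") + "}";
  }
  return String(JSON.stringify(value));
}

// Wrap an API call. In "record" mode the live call runs and its response is
// stored; in "replay" mode the stored response is returned and the live call
// is never made.
function withVcr<Req, Res>(
  call: (req: Req) => Promise<Res>,
  cassette: Cassette<Res>,
  mode: Mode,
): (req: Req) => Promise<Res> {
  return async (req: Req): Promise<Res> => {
    const key = stableKey(req);
    if (mode === "replay") {
      if (!cassette.has(key)) {
        // Fail loudly instead of silently falling back to the network.
        throw new Error(`No recording for request: ${key}`);
      }
      return cassette.get(key) as Res;
    }
    const res = await call(req);
    cassette.set(key, res);
    return res;
  };
}
```

In a test, the wrapper would typically run once in `"record"` mode against the live API and in `"replay"` mode afterwards, which is what makes this layer deterministic and fast. Throwing on a missing recording is a design choice in this sketch: it keeps a replayed test from quietly making a live call.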