The Problem
Different models have different strengths and different blind spots. Claude is strong on architecture and reasoning but can gloss over function-level bugs and missed optimisations, particularly in file-by-file review. We built Spotter to bring open code models into the loop as a dedicated audit layer, reviewing file by file and catching what Claude missed. The goal was to make the human reviewer a true final pass: instead of doing a full review from scratch, you resolve flagged issues that have already been surfaced and cross-validated. The 'you should have caught this' category of bugs mostly disappears before the review even starts.
Spotter watches file changes and routes diffs to multiple LLMs in parallel: Ollama, OpenWebUI, Claude Code, or any OpenAI-compatible endpoint. It cross-validates findings before surfacing them. The goal was a zero-interruption review loop that runs while you work, not after you push.
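A minimal sketch of the fan-out step, assuming each backend exposes an OpenAI-compatible chat endpoint. The endpoint URLs, model names, and prompt here are illustrative placeholders, not Spotter's actual wiring:

```python
import concurrent.futures
import json
import urllib.request

BACKENDS = [  # hypothetical endpoints and models
    ("ollama", "http://localhost:11434/v1/chat/completions", "qwen2.5-coder"),
    ("openwebui", "http://localhost:3000/api/chat/completions", "llama3.1"),
]

def review_diff(name, url, model, diff):
    """Send one diff to one backend and return its raw review text."""
    payload = json.dumps({
        "model": model,
        "messages": [
            {"role": "system", "content": "Review this diff for bugs."},
            {"role": "user", "content": diff},
        ],
    }).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return name, body["choices"][0]["message"]["content"]

def fan_out(diff):
    """Query every backend in parallel; a slow backend never blocks the rest."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(review_diff, n, u, m, diff)
                   for n, u, m in BACKENDS]
        return [f.result() for f in concurrent.futures.as_completed(futures)]
```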
What's interesting
Multi-model consensus engine
Findings from each model are clustered using Jaccard token similarity and union-find grouping. Only findings that survive a configurable quorum make it through, reducing noise without losing signal. A judge strategy lets one model act as a second pass over the surviving candidates.
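A sketch of the consensus pass under the assumption that each finding is a short text snippet tagged with the model that produced it. The similarity threshold and quorum values are illustrative; Spotter makes both configurable:

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two finding descriptions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def cluster(findings, threshold=0.5, quorum=2):
    """Union-find grouping: findings above the similarity threshold merge
    into one cluster; clusters backed by fewer than `quorum` distinct
    models are dropped as noise."""
    parent = list(range(len(findings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i in range(len(findings)):
        for j in range(i + 1, len(findings)):
            if jaccard(findings[i]["text"], findings[j]["text"]) >= threshold:
                union(i, j)

    clusters = {}
    for i, f in enumerate(findings):
        clusters.setdefault(find(i), []).append(f)

    # Quorum counts distinct models per cluster, not raw finding count.
    return [c for c in clusters.values()
            if len({f["model"] for f in c}) >= quorum]
```

Counting distinct models rather than raw findings means a single verbose model can't reach quorum on its own by reporting the same issue twice.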
Four review modes
Watch mode debounces file changes and reviews diffs incrementally. One-shot works as a pre-commit hook. Deep audit batches the full codebase with resumable checkpoints so long runs can be interrupted and continued. Git range mode reviews a commit range retroactively.
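A minimal sketch of watch-mode debouncing, assuming a stream of per-file change events: rapid saves to the same file collapse into a single review once the file goes quiet. The two-second window and callback shape are illustrative:

```python
import threading

class Debouncer:
    def __init__(self, delay=2.0, on_settle=print):
        self.delay, self.on_settle = delay, on_settle
        self.timers = {}  # path -> pending Timer

    def notify(self, path):
        """Each new event cancels the pending review and restarts the
        clock, so a burst of saves triggers exactly one review."""
        if path in self.timers:
            self.timers[path].cancel()
        timer = threading.Timer(self.delay, self.on_settle, args=(path,))
        self.timers[path] = timer
        timer.start()
```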
Layered config with per-project overrides
Config loads from hardcoded defaults → global XDG config → project .spotter.yml → CLI flags. Environment variables interpolate inline. Per-model system prompts let you give different instructions to each backend: one model can watch for runtime bugs while another looks for structural patterns.
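A sketch of the precedence chain, assuming each layer arrives as a flat dict and later layers win key by key; the `${VAR}` interpolation syntax and helper names here are assumptions for illustration:

```python
import os
import re

def interpolate(value):
    """Expand ${VAR} references in string values from the environment."""
    if isinstance(value, str):
        return re.sub(r"\$\{(\w+)\}",
                      lambda m: os.environ.get(m.group(1), ""), value)
    return value

def load_config(defaults, global_cfg, project_cfg, cli_flags):
    """Merge layers in ascending precedence:
    defaults < global XDG < project .spotter.yml < CLI flags."""
    merged = {}
    for layer in (defaults, global_cfg, project_cfg, cli_flags):
        for key, val in layer.items():
            merged[key] = interpolate(val)
    return merged
```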
Context isolation for Claude Code
The Claude Code backend runs from a temp directory to keep the project's CLAUDE.md from bleeding into the review prompt. Without this isolation, the reviewer inherits project-specific instructions that bias its findings.
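A sketch of the isolation trick, assuming the `claude` CLI is on PATH and accepts a non-interactive prompt via `-p`; the prompt text and timeout are illustrative:

```python
import subprocess
import tempfile

def review_with_claude(diff):
    with tempfile.TemporaryDirectory() as scratch:
        # cwd is an empty scratch dir, not the project root, so Claude Code
        # finds no CLAUDE.md and no project-level instructions to inherit.
        result = subprocess.run(
            ["claude", "-p", f"Review this diff for bugs:\n{diff}"],
            cwd=scratch, capture_output=True, text=True, timeout=300,
        )
    return result.stdout
```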