Two human gates, cross-vendor review
The plan and each build phase are critiqued by two different model families. Exactly two reviews per gate — no R3+, no drift, no rubber-stamping.
Anvil is a single, self-contained AI coding agent that reads, writes, edits, and runs your project itself — wrapped in a disciplined two-gate review workflow so a long vibe-coding session can't quietly drift off the rails. One Rust binary. Model-agnostic. Installs in one line and updates itself.
Anvil is feature-complete for its core workflow and installs/updates cleanly — but it's early. It's been field-tested mostly on Windows (the macOS/Linux builds pass CI but haven't been run in anger yet); several big pieces landed recently; and the review-gate flow is verified by hand rather than an automated suite. Expect rough edges, and please file issues — that feedback is exactly what the beta is for. For the best experience, use a capable tool-calling model for the coder role.
What makes Anvil different isn't the agent — everyone has one of those now. It's the discipline around it: at two human gates, two different model families review the work — once on the plan, once on each phase's git diff — in a sequential review → fix → re-review loop.
R1 reviews → the coder applies fixes → you review → R2 reviews → the coder applies fixes → summary. R2 re-reviews after R1's fixes, catching bugs those fixes introduce.
The coder reads, edits, and runs the repo through tools, including a reliable apply_patch diff format. No attaching files, no copy-pasting output back to disk.
A persistent task anchor, automatic compaction into working memory, and a lightweight repo map keep the coder on-goal across long, tool-heavy sessions.
Ollama / local, OpenAI, xAI, Anthropic, Groq, Azure, AWS Bedrock, Google, OpenRouter, and any OpenAI-compatible gateway. Mix vendors freely — that's the whole point.
Prebuilt binaries for macOS, Linux, and Windows. anvil update swaps the binary in place — no reinstall, no toolchain.
Free-form agentic chat by default; structure only at the two gates. You approve; the coder does the work; a genuine cross-vendor second opinion keeps it honest.
Talk with the coder about goals, constraints, and architecture, then ask it to write plan.md itself — phased, with goals and acceptance criteria.
/lock-plan runs the review loop on the plan; /accept-plan quenches it — records its hash and unlocks phase work.
The coder implements a phase directly — reads, edits, runs tests. /accept-phase runs the same review loop on the actual git diff.
/ship-phase quenches the phase. Re-run /accept-phase any time to re-review. Repeat per phase until you're done.
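To make the "records its hash" step concrete, here is a minimal sketch of fingerprinting plan.md at acceptance and later checking whether the file has changed. The SHA-256 choice and both function names are assumptions for illustration; Anvil's actual hashing scheme and storage location are not described here.

```python
import hashlib
from pathlib import Path


def plan_fingerprint(plan_path: Path) -> str:
    """Return a hex digest of the plan file's bytes.

    SHA-256 is an assumption; the real tool may use a different algorithm.
    """
    return hashlib.sha256(plan_path.read_bytes()).hexdigest()


def plan_unchanged(plan_path: Path, accepted_hash: str) -> bool:
    """True if the plan on disk still matches the hash recorded at acceptance."""
    return plan_fingerprint(plan_path) == accepted_hash
```

A recorded hash of this kind lets a tool notice when an accepted plan has been edited after the gate, which is one plausible reason to store it.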
The review loop, at every gate
REVIEW_*.md at the repo root. The human approves at each gate; the coder does the creative work.
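The sequential loop (R1 reviews, the coder fixes, you review, R2 reviews, the coder fixes, summary) can be sketched roughly as follows. Every name here (run_gate, GateResult, and the callables passed in) is hypothetical and stands in for model calls and the human checkpoint; this is an illustration of the ordering, not Anvil's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-ins: a reviewer maps work -> critique,
# a fixer maps (work, critique) -> revised work.
Reviewer = Callable[[str], str]
Fixer = Callable[[str, str], str]
HumanCheck = Callable[[str], None]


@dataclass
class GateResult:
    work: str
    critiques: List[str] = field(default_factory=list)


def run_gate(work: str, r1: Reviewer, r2: Reviewer,
             coder_fix: Fixer, human_check: HumanCheck) -> GateResult:
    """Run exactly two review rounds, in order, with a human look in between."""
    result = GateResult(work)

    critique_1 = r1(work)                   # R1 reviews the original work
    result.critiques.append(critique_1)
    work = coder_fix(work, critique_1)      # coder applies R1's fixes

    human_check(work)                       # you review before R2 runs

    critique_2 = r2(work)                   # R2 sees the post-fix work
    result.critiques.append(critique_2)
    work = coder_fix(work, critique_2)      # coder applies R2's fixes

    result.work = work                      # summary: final work + both critiques
    return result
```

The point of the ordering is that R2 receives the output of the first fix pass, so regressions introduced by R1's fixes are in scope for the second review.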
Between the gates it's an ordinary, capable coding agent — driven by tools, scoped to the repo root, and engineered not to flail across long sessions.
The coder core
read_file, write_file, apply_patch (a validated, context-located multi-file diff format), edit_file, list_dir, grep, run_command, project_state — paths that escape the root are rejected.
.anvil/session.json) survives restarts, alongside a live reality snapshot and user-editable context files: decisions.md, assumptions.md, scratch.md, ARCHITECTURE.md.
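Rejecting paths that escape the repo root is a standard containment check. The sketch below shows one common way to do it; resolve_in_root is a hypothetical helper, and Anvil's own (Rust) implementation may differ in detail.

```python
from pathlib import Path


def resolve_in_root(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root`; raise ValueError if it escapes.

    Resolving first collapses `..` segments and follows symlinks, so the
    comparison is made on the real location rather than the spelled path.
    """
    root = root.resolve()
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes repo root: {requested}")
    return candidate
```

An absolute `requested` path replaces the root when joined, so it is rejected by the same check unless it already points inside the root.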
Set up providers, models, and roles once in a global config, and every future repo is configured the moment you cd in. Keys live in the OS keyring or a shared .env, and model lists are fetched live where the provider exposes them.
OpenAI, xAI (Grok), Anthropic, Groq, Google, Together, Fireworks, OpenRouter, Mistral, DeepSeek, Perplexity, Cohere, and more.
Ollama (http://localhost:11434/v1) and LM Studio. A first-run quick-setup picks live local models for CODER / R1 / R2.
Azure OpenAI, AWS Bedrock, Google Vertex, Gradient — plus any OpenAI-compatible endpoint via a custom provider and base URL.
Only the coder needs a strong, tool-calling model. The two reviewers do single-shot critique, so local models handle them fine, and keeping them in a different family from the coder is exactly the cross-vendor second opinion Anvil is built around. A typical setup: a capable cloud model as the coder, with local Ollama models as R1 / R2.
The coder drives itself through tool calls, so it needs a model that supports tool/function calling. Anvil reads each model's capabilities from models.dev and warns you at startup if your coder model can't do tool calls. /models shows each role's window, tool support, and price.
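The role guidance above (a tool-calling coder, reviewers from different families) amounts to a small validation pass over the role assignment. The sketch below is illustrative only: ModelInfo, its fields, and role_warnings are hypothetical names, and the capability data would come from wherever the tool looks it up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelInfo:
    name: str
    family: str         # hypothetical label for the vendor / model family
    tool_calling: bool  # whether the model supports tool/function calls


def role_warnings(roles: Dict[str, ModelInfo]) -> List[str]:
    """Return warnings for a CODER / R1 / R2 assignment (empty list if fine)."""
    warnings: List[str] = []
    coder, r1, r2 = roles["CODER"], roles["R1"], roles["R2"]

    if not coder.tool_calling:
        warnings.append(f"CODER model {coder.name} does not support tool calling")
    for role, model in (("R1", r1), ("R2", r2)):
        if model.family == coder.family:
            warnings.append(f"{role} shares the coder's model family ({coder.family})")
    if r1.family == r2.family:
        warnings.append(f"R1 and R2 share a model family ({r1.family})")
    return warnings
```

Reviewers are deliberately not checked for tool support here, since they only produce single-shot critiques.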
No Rust toolchain required — the installer downloads a prebuilt binary for your OS and puts it on your PATH. Prebuilt binaries cover macOS (Intel + Apple Silicon), Linux (x86_64 + arm64, static musl), and Windows (x86_64).
macOS / Linux
$ curl -fsSL https://raw.githubusercontent.com/\
ai-nhancement/Anvil/master/install.sh | sh
Windows (PowerShell)
> irm https://raw.githubusercontent.com/ai-nhancement/Anvil/master/install.ps1 | iex
In any repo, create Anvil's per-project state dir and launch the TUI — the setup wizard runs automatically on first use, then just type to chat with your coder.
$ anvil init   # create .anvil/ state dir
$ anvil        # launch the full TUI
On launch Anvil does a quiet, cached check for a newer release and shows a pulsing update badge when one exists. To update, run anvil update from the shell or /update from inside the TUI — it downloads the right binary and replaces the running one in place. Pin a build with ANVIL_VERSION=v0.2.0 on the install command.
The forge-themed TUI
run_command is gated with a selectable prompt — run once, allow that program for the session, or decline. Break in anytime with Ctrl+B.
/ for workflow, config, model, and memory commands — /status, /models, /config, /memory, /view-plan, /view-reviews — plus a headless CLI (init, setup, status, update) for scripting.
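The three-way run_command prompt (run once, allow that program for the session, or decline) can be modelled as a small per-session allowlist. This is a sketch under assumptions: CommandGate, Choice, and the "first token is the program" rule are hypothetical, not taken from Anvil's source.

```python
from enum import Enum
from typing import Set


class Choice(Enum):
    RUN_ONCE = "run once"
    ALLOW_SESSION = "allow this program for the session"
    DECLINE = "decline"


def program_of(command: str) -> str:
    """Naive program name: the first whitespace-separated token (an assumption)."""
    tokens = command.split()
    return tokens[0] if tokens else ""


class CommandGate:
    """Tracks which programs the user has allowed for the current session."""

    def __init__(self) -> None:
        self._allowed: Set[str] = set()

    def needs_prompt(self, command: str) -> bool:
        """True unless the command's program was allowed for this session."""
        return program_of(command) not in self._allowed

    def decide(self, command: str, choice: Choice) -> bool:
        """Apply the user's choice; return True if the command may run."""
        if choice is Choice.DECLINE:
            return False
        if choice is Choice.ALLOW_SESSION:
            self._allowed.add(program_of(command))
        return True
```

With this shape, approving "run once" leaves the allowlist untouched, so the next invocation of the same program prompts again.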