Start Here
Crucis is an autonomy scaffold for code-generating agents. It provides structured automated feedback — tests, adversarial review, constraints, holdout verification — so agents can iterate longer without human intervention. The result: generated code that actually works beyond the examples you gave it.
Is Crucis right for your project?
| Situation | Path |
|---|---|
| Starting a new function or module from scratch | New Project Quickstart |
| Adding verified behavior to an existing codebase | Existing Codebase Quickstart |
| Want to understand the approach first | Why Crucis |
If you work inside Claude Code, OpenCode, or Codex, you can use Crucis as an MCP server instead of the CLI. The MCP server exposes every CLI command as a tool your agent can call directly.
Prerequisites
- Python 3.10+ (3.12+ recommended)
- An agent CLI on your PATH: either claude or codex
- An API key: ANTHROPIC_API_KEY for Claude, or OPENAI_API_KEY / codex login for Codex
Verify your environment
This checks Python version, agent binaries, API keys, Docker availability, and runtime settings. Fix any [FAIL] items before proceeding.
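For readers who want to see what such checks amount to, here is a minimal hand-rolled sketch in Python. It is not Crucis's own checker and does not call Crucis at all. The function name check_prerequisites and the [OK] label are made up for illustration, and Docker is treated as a simple "is it on PATH" lookup. A codex login session cannot be detected this way, so only the two API key variables are inspected.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return (label, ok) pairs for the prerequisites listed on this page.

    Hypothetical helper for illustration; the parameters exist only so the
    checks can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else version

    has_claude = which("claude") is not None
    has_codex = which("codex") is not None
    has_key = bool(env.get("ANTHROPIC_API_KEY") or env.get("OPENAI_API_KEY"))

    return [
        ("Python 3.10+", version >= (3, 10)),
        ("agent CLI on PATH (claude or codex)", has_claude or has_codex),
        # Assumption: a `codex login` session is not visible here,
        # so this only looks at the API key environment variables.
        ("API key set", has_key),
        ("Docker on PATH", which("docker") is not None),
    ]


if __name__ == "__main__":
    for label, ok in check_prerequisites():
        print(f"[{'OK' if ok else 'FAIL'}] {label}")
```

Passing env, which, and version explicitly makes the checks easy to try against a fake environment before trusting them on a real one.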
Install
Also available via pip:
Or run directly from the source tree (no install):
What crucis init creates
By default, crucis init creates only two files: objective.yaml and src/solution.py. Profiles and settings are optional — add them with --with-profiles or --with-settings. Settings are auto-created on first crucis run if they don't exist.
You don't need to manually split examples into train and holdout sets. Just list your examples under examples: and Crucis automatically holds out the last ~20% for verification. If you need manual control, you can still use an explicit holdout: field. Use holdout: [] to opt out of holdout verification entirely.
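To make the holdout behavior concrete, the following Python sketch mimics the three cases described above: automatic split, explicit holdout, and opt-out. It is an illustration only, not Crucis's implementation. The function name split_examples is hypothetical, and the rounding rule (floor of 20%, with at least one holdout once there are two or more examples) is an assumption, since the text only says "the last ~20%". Treating every entry under examples as training data when an explicit holdout is given is likewise an assumption.

```python
def split_examples(examples, holdout=None):
    """Split examples into (train, holdout).

    holdout=None  -> hold out the last ~20% automatically
    holdout=[]    -> opt out: nothing is held out
    holdout=[...] -> use the explicit list as given
    """
    examples = list(examples)

    if holdout is not None:
        # An explicit holdout (including the empty list) overrides the
        # automatic split.
        return examples, list(holdout)

    # Assumed rounding: floor of 20%, but never zero when a split is possible.
    n_holdout = max(1, len(examples) // 5) if len(examples) >= 2 else 0
    cut = len(examples) - n_holdout
    return examples[:cut], examples[cut:]


train, held_out = split_examples(["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])
print(train)     # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print(held_out)  # ['i', 'j']
```

Because the held-out examples come from the end of the list, the order in which examples are written determines which ones are reserved for verification.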
Next step
Pick a quickstart:
- New Project Quickstart — build a function from scratch with full verification
- Existing Codebase Quickstart — add verified behavior to your current project