Skip to content

MCP Server

~4 min read

Crucis ships an MCP (Model Context Protocol) server that exposes its autonomy scaffold as tools, resources, and prompts. AI agents running inside Claude Code, OpenCode, Codex, or any MCP-compatible host can use Crucis natively without shelling out to the CLI.

Two modes of operation

Pipeline mode — Crucis spawns its own subprocess agents (Claude, Codex) to handle generation, adversarial review, and implementation. You call a single tool like crucis_run and get results when done. Best for: automated workflows, CI-style runs, hands-off execution.

Step-by-step mode — Your agent acts as the generator, critic, and implementer. Crucis provides the prompts, validates your work, and verifies correctness. Best for: interactive development, learning the TDD flow, fine-grained control.

Quick start

# 1. Authorize the workspace
mkdir -p .crucis && touch .crucis/mcp_enabled
# or: export CRUCIS_MCP_AUTHORIZED=1

# 2. Configure your MCP host
# .mcp.json
{
  "mcpServers": {
    "crucis": {
      "command": "crucis-mcp",
      "env": {
        "CRUCIS_WORKSPACE": "/path/to/your/project",
        "CRUCIS_MCP_AUTHORIZED": "1"
      }
    }
  }
}
# 3. Use the tools (19 available)
crucis_validate -> crucis_doctor -> crucis_run -> crucis_summary

Tools (19)

Tool Annotation Description
crucis_run llm Full pipeline: fit + evaluate (spawns agents)
crucis_run_fit llm Fit phase only: generate and harden tests
crucis_run_evaluate llm Evaluate phase only: implement and verify
crucis_get_prompt read-only Get the system prompt for any pipeline step
crucis_submit_test_suite mutating Save and validate an agent-written test suite
crucis_validate read-only Validate objective.yaml structure and semantics
crucis_summary read-only Pipeline status and per-task progress
crucis_run_probe llm Run cheating probe against test suite
crucis_doctor read-only Environment and workspace diagnostics
crucis_init mutating Scaffold a workspace with starter files
crucis_run_plan llm Generate a structured plan.md
crucis_dry_run read-only Preview pipeline config without API calls
crucis_reset destructive Reset checkpoint (all tasks or specific ones)
crucis_submit_adversarial_report mutating Save adversarial review findings
crucis_write_tests mutating Materialize checkpoint tests to disk
crucis_verify_implementation read-only Run tests + holdout evals against code
crucis_promote mutating Promote an optimizer candidate to active (requires optimizer enabled)
crucis_optimizer_worker llm Run background optimizer (requires optimizer enabled)
crucis_check_constraints read-only Check source code against required/advisory constraints

Resources (7)

Resource URI Description
crucis://objective Parsed objective definition (JSON)
crucis://checkpoint Full checkpoint state (JSON)
crucis://task/{name}/test-suite Generated test suite source for a task
crucis://task/{name}/adversarial-report Adversarial report for a task (JSON)
crucis://constraints/{profile} Constraint profile definition (JSON)
crucis://plan Generated plan.md content
crucis://curriculum Implementation brief markdown

Security

Workspace authorization — Every tool call verifies the workspace is authorized before proceeding. A workspace must opt in via a marker file (.crucis/mcp_enabled) or the environment variable CRUCIS_MCP_AUTHORIZED=1. Unauthorized workspaces receive a WorkspaceNotAuthorizedError.

Path traversal prevention — All file path arguments are resolved (including symlinks) and checked against the workspace boundary. Null bytes in paths are rejected. Paths exceeding 4,096 characters are rejected. Any resolved path outside the workspace is blocked with a PathTraversalError.

Input size limits — Source code inputs (for crucis_check_constraints and crucis_submit_test_suite) are limited to 1 MB to prevent resource exhaustion.

Credential handling — API keys and secrets are read from environment variables, never from tool parameters. The server itself stores no credentials.

STDIO isolation — The server redirects Rich console output to stderr to prevent UI text from corrupting the JSON-RPC protocol on stdout.

Prompts (5)

Canned workflow templates your agent can invoke.

Prompt Description
setup-crucis Guide: scaffold workspace and configure objective (uses auto-holdout)
tdd-workflow Guide: full pipeline run with subprocess agents
verify-code-quality Guide: check a source file against required/advisory constraints
harden-tests Guide: run fit phase and review adversarial findings
agent-tdd-workflow Guide: step-by-step TDD where the agent does everything

Agent-friendly design

The MCP server is designed to minimize round-trips and guide agents through workflows.

Tool annotations — Every tool declares its safety profile via MCP tool annotations. MCP clients that support annotations can auto-approve read-only tools and prompt for destructive ones.

Next-step hints — Every tool response includes a next_steps array that tells the agent exactly what to do next. The hints are contextual: success and failure produce different guidance.

Structured errors — Error responses include the exception type and an actionable hint.

Pre-validation — Long-running tools (crucis_run, crucis_run_fit) pre-validate the objective and profiles before starting. This catches typos and missing files immediately instead of failing minutes later.

Smart output truncationcrucis_verify_implementation keeps the tail of pytest output (where failures and the summary appear) rather than the head, so agents can see what actually failed.