MCP agent sessions

botctl mcp exposes a small MCP-compatible JSON-RPC API for persistent agent TUI sessions backed by tmux. Claude, Codex, and Agy are supported through provider-specific spawn tools.

Start a transport

botctl mcp stdio
botctl mcp http --bind 127.0.0.1:8787
botctl mcp http --bind 0.0.0.0:8787 --allow-non-loopback

The HTTP transport is a stateless, Streamable-HTTP-compatible JSON request/response server (single endpoint POST /mcp):

  • POST /mcp returns application/json (one JSON-RPC response per request — no SSE). A request body containing only notifications/responses returns 202 Accepted with an empty body.
  • GET /mcp returns 405 Method Not Allowed with Allow: POST, DELETE, OPTIONS; there is no server-initiated SSE stream (botctl emits no server notifications).
  • DELETE /mcp returns 204 No Content — a no-op, because the server is stateless (no Mcp-Session-Id; agent state lives in SQLite keyed by tool args).
  • OPTIONS /mcp returns 204 with Allow: POST, GET, DELETE, OPTIONS (no CORS/Access-Control-* headers — this is not a browser API).
  • Protocol version: HTTP advertises 2025-03-26. The MCP-Protocol-Version header is optional, but if present on a post-initialize request it must equal 2025-03-26 or the request is rejected with 400. (The stdio transport still advertises 2024-11-05.)
  • Security: A missing Origin is allowed (native MCP clients omit it); if present, Origin must be loopback — a literal Origin: null (sandboxed/opaque browser context) is rejected with 403. The host check parses the authority strictly (only a loopback IP or exactly localhost), so loopback-prefixed names like 127.0.0.1.evil.com and userinfo smuggling like http://127.0.0.1@evil.com are rejected. When bound to loopback, the Host header must also be loopback (DNS-rebinding protection). Content-Type/Accept handling is lenient.
  • Non-loopback bind: binding a non-loopback address (e.g. 0.0.0.0) is hard-rejected unless you pass --allow-non-loopback, because the MCP control plane has no authentication and can spawn/kill agents. With the flag, a prominent warning is printed at startup.
  • A top-level JSON-RPC batch (array) is rejected with a JSON-RPC invalid_request error.
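As a rough illustration of the request/response behavior listed above, the following Python sketch builds a single JSON-RPC POST for /mcp using only the standard library. The helper names build_mcp_request and call are hypothetical and not part of botctl; the URL assumes the loopback bind from the start commands, and tools/list is the standard MCP method for listing tools. The Accept value is a guess, since the text only says Content-Type/Accept handling is lenient.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:8787/mcp"  # assumes the loopback bind shown earlier
PROTOCOL_VERSION = "2025-03-26"        # version the HTTP transport advertises


def build_mcp_request(method, params=None, request_id=1, url=MCP_URL):
    """Build (but do not send) one JSON-RPC 2.0 POST for the /mcp endpoint.

    Hypothetical helper: one request object per call, never a top-level
    batch array, because batches are rejected.
    """
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if method != "initialize":
        # Optional header; if sent after initialize it must match exactly.
        headers["MCP-Protocol-Version"] = PROTOCOL_VERSION
    data = json.dumps(body).encode("utf-8")
    # No Origin header is set, which the server treats as a native client.
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def call(method, params=None, request_id=1, timeout=30):
    """Send one request and decode the single application/json response."""
    req = build_mcp_request(method, params, request_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, with a server running on 127.0.0.1:8787:
#   call("initialize", {"protocolVersion": PROTOCOL_VERSION,
#                       "capabilities": {},
#                       "clientInfo": {"name": "example", "version": "0"}})
#   call("tools/list", request_id=2)
```

Because the server is stateless, the sketch keeps no session ID between calls; each request stands alone.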

Explicitly out of scope: SSE response bodies, GET SSE streams, SSE resumability / Last-Event-ID, Mcp-Session-Id / stateful sessions, authentication/OAuth, and browser CORS.
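The Origin rules in the Security bullet above can be illustrated with a small Python sketch. This is not botctl's implementation, only one way to parse the authority strictly under the stated rules; the function names are hypothetical. Note that urlsplit lowercases the host, so this sketch is slightly more permissive about the case of localhost than the text implies.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def is_loopback_host(host: str) -> bool:
    """True only for the exact name 'localhost' or a literal loopback IP."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. '127.0.0.1.evil.com'.
        return False


def origin_allowed(origin: Optional[str]) -> bool:
    """Apply the documented Origin policy to one header value."""
    if origin is None:
        return True   # native MCP clients omit Origin
    if origin == "null":
        return False  # sandboxed/opaque browser context
    try:
        parts = urlsplit(origin)
        host = parts.hostname
        _ = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if not host or "@" in parts.netloc:
        # Reject empty authorities and userinfo smuggling.
        return False
    return is_loopback_host(host)
```

Checking the parsed host rather than a string prefix is what defeats names such as 127.0.0.1.evil.com.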

Tools

Tool names are unprefixed; MCP clients typically expose them under the server name (e.g. mcp__botctl__spawn_claude). Provider-specific spawn tools are advertised only when their provider binary is available on PATH.

  • spawn_claude — Claude spawn tool. Required: cwd. Optional: model_preset (best/balanced/fast/cheap, defaults to best), advanced raw model, effort (low/medium/high/xhigh/max), agent, permission_mode (acceptEdits/auto/bypassPermissions/default/dontAsk/plan), settings, timeout_ms, policy.
  • spawn_codex — Codex spawn tool. Required: cwd. Optional: model_preset (best/balanced/fast/cheap, defaults to best), advanced raw model, effort, timeout_ms, policy.
  • spawn_agy — Agy/Antigravity spawn tool. Required: cwd. Optional: timeout_ms, policy.
  • Per-provider argument support:
    • Claude: model → --model, effort → --effort, agent → --agent, permission_mode → --permission-mode, settings → --settings.
    • Codex: model → -m, effort → -c model_reasoning_effort=<v>. agent, permission_mode, and settings are rejected.
    • Agy: model/effort/agent/permission_mode/settings are all rejected (no matching CLI flags).
  • prompt — submit one prompt to that managed ID, wait for completion, and keep the tmux window alive. If the registry row is killed/dead or its recorded pane is missing, prompt can resurrect the same ID by starting a replacement pane from the persisted launch configuration before submitting the prompt. If the recorded pane ID exists but belongs to a different tmux window, the call blocks as ambiguous_target instead of overwriting identity.
  • wait — wait for an existing managed session to reach a terminal outcome. It does not resurrect missing panes.
  • kill — kill only the verified managed tmux window; idempotent when already gone.
  • snapshot — capture raw pane text and classifier state for diagnostics. pane_text is the raw tmux capture, while recent_lines and structured outcome.snapshot drop only trailing blank terminal padding and then return the last useful lines from that same capture. Calling snapshot also refreshes the persisted agent.state from the live classified state. It does not resurrect missing panes.
  • send_keys — unsafe operator escape hatch scoped to the managed ID. It does not resurrect missing panes.
  • one_shot — create a temporary managed session, run exactly one prompt to a terminal outcome, then always attempt to kill the window (best-effort cleanup). Preferred arg: prompt (non-empty); aliases text, message, input, and initial_prompt are accepted. Optional: cwd (defaults to the MCP server current directory), provider (defaults to the first available provider binary in claude, codex, agy order), model_preset (best by default for Claude/Codex), advanced raw model, effort/agent/permission_mode/settings/timeout_ms/policy (same provider validation as persistent spawns; permission_mode/settings are claude-only). It is implemented by composing session creation + prompt + kill, so the prompt is submitted exactly once.
    • Timeout (per-phase): timeout_ms applies independently to the spawn-ready wait and the prompt turn; kill uses its own default. Worst case ≈ 2×timeout_ms + kill.
    • Auto-approval policy: uses managed auto-approval (no_yolo=false) — only folder-trust and gated agy command-permission prompts auto-advance; all other approvals block. A caller MAY set policy.no_yolo:true to be more conservative; it can never broaden beyond the existing gate.
    • Error vs result: argument-validation failures (missing/blank prompt, bad cwd, invalid optional args) surface as JSON-RPC errors. Spawn, turn, and kill failures are NOT errors — they are encoded in the result fields below and the call returns a normal result (isError:false). A failure after the window is created still always attempts the best-effort kill (no leaked tmux window).
    • Connection slot (operator note): one_shot holds its HTTP connection slot for the full duration of the call (spawn-ready wait + prompt turn + kill, worst case ≈ 2×timeout_ms + kill), unlike the split spawn/prompt/wait flow. Effective one_shot concurrency is therefore bounded by MAX_ACTIVE_CONNECTIONS (64) — long-running one-shots can saturate the slot pool, so size timeout_ms and client concurrency accordingly.
    • Result shape: the call returns a normal result whose fields encode the outcome. Fields: agent (or null on spawn failure), spawn_outcome, outcome (ready/needs_user_input/provider_error/blocked/timeout/dead/busy/unknown/spawn_failed), message (set on ready/needs_user_input), fresh_message, killed, kill ({status: ok|error|busy|skipped, …}), and error (present on spawn failure, a post-creation failure, or kill-error detail). provider_error includes error_excerpt when a visible Codex provider/API error is detected in the pane.
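Since one_shot reports spawn, turn, and kill failures inside a normal result, a caller has to inspect the fields rather than rely on JSON-RPC errors. The Python sketch below shows one way a client might do that. It assumes the result fields have already been decoded into a dict (how they are wrapped in the MCP tool result is not specified here), and the helper names and the returned summary keys are hypothetical.

```python
from typing import Any, Dict

# Outcomes for which the text says `message` is set.
REPLY_OUTCOMES = {"ready", "needs_user_input"}


def worst_case_ms(timeout_ms: int, kill_ms: int) -> int:
    """Per-phase timeout: spawn-ready wait + prompt turn, plus the kill phase."""
    return 2 * timeout_ms + kill_ms


def summarize_one_shot(result: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a decoded one_shot result to the fields a caller usually branches on."""
    outcome = result.get("outcome", "unknown")
    kill_status = (result.get("kill") or {}).get("status")
    has_reply = outcome in REPLY_OUTCOMES
    return {
        "outcome": outcome,
        "has_reply": has_reply,
        "message": result.get("message") if has_reply else None,
        "spawned": result.get("agent") is not None,  # agent is null on spawn failure
        # Assumption: an error/busy kill status means the window may still exist.
        "check_cleanup": kill_status in ("error", "busy"),
        "error": result.get("error"),
    }
```

For example, with timeout_ms set to 60000 and a kill phase of 5000 ms (an illustrative figure, not a documented default), worst_case_ms gives 125000 ms, which is the time a single call can hold its connection slot.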

Sessions are addressed by generated public IDs, not tmux names. The registry stores exact tmux session/window/pane IDs (and the chosen provider) under the botctl state directory and uses SQLite lock rows so concurrent prompt/wait/kill/snapshot/send operations on the same ID return busy instead of racing.
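The lock-row idea can be sketched in a few lines of Python with the standard sqlite3 module. The table name agent_locks, its columns, and the Busy exception are assumptions made for illustration; botctl's actual schema is not described here. The point is only that a primary-key insert either succeeds or fails at once, so a second operation on the same ID gets busy instead of waiting.

```python
import sqlite3
from contextlib import contextmanager


class Busy(Exception):
    """Raised when another operation already holds the lock for an agent ID."""


def open_registry(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_locks ("
        "agent_id TEXT PRIMARY KEY, op TEXT NOT NULL)"
    )
    return conn


@contextmanager
def agent_lock(conn, agent_id, op):
    """Hold the lock row for agent_id while the body runs; fail fast if taken."""
    try:
        conn.execute(
            "INSERT INTO agent_locks (agent_id, op) VALUES (?, ?)", (agent_id, op)
        )
    except sqlite3.IntegrityError:
        # The primary key already exists: someone else is operating on this ID.
        raise Busy(agent_id) from None
    try:
        yield
    finally:
        conn.execute("DELETE FROM agent_locks WHERE agent_id = ?", (agent_id,))
```

Operations on different IDs insert different rows, so they do not block each other. A real implementation would also need a way to reclaim rows left behind by a crashed process, which this sketch omits.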

Lifecycle resilience

Blocked outcomes use stable lower-snake-case reasons in outcome.warnings, outcome.blocked_reason, and the registry's current blocked fields. Current reasons are startup_choice_prompt, agy_folder_trust_prompt, agy_settings_persist_prompt, agy_command_permission_prompt, folder_trust_prompt, permission_dialog, survey_prompt, plan_approval_prompt, diff_dialog, provider_error, ambiguous_target, and unknown_state.

For blocked/provider-error/unknown outcomes, the registry persists only current evidence: blocked_reason, blocked_at_ms, and a bounded blocked_snapshot excerpt equivalent to the last 20 useful lines. It does not store full pane captures in blocked fields. Blocked fields are cleared when the session transitions back to ready/running/dead/killed, except cleanup-killed rows preserve the evidence that justified the automatic kill.
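A bounded excerpt of this kind could be derived from a raw capture roughly as follows. This is a minimal Python sketch under the assumption that "useful lines" means everything left after trailing blank padding is dropped; the function name recent_lines mirrors the snapshot field, but the code is illustrative rather than botctl's own.

```python
from typing import List


def recent_lines(pane_text: str, limit: int = 20) -> List[str]:
    """Drop trailing blank padding, then keep at most the last `limit` lines."""
    if limit <= 0:
        return []
    lines = pane_text.splitlines()
    # Only trailing blank lines are removed; blank lines inside the text stay.
    while lines and not lines[-1].strip():
        lines.pop()
    return lines[-limit:]
```

Trimming before slicing matters: a terminal capture often ends in many empty rows, and slicing first would return mostly padding.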

Normal lifecycle calls run conservative best-effort cleanup for stale managed rows. Cleanup may kill only aged, unlocked panes whose live recapture still shows a known blocker/provider error. It does not kill persistent ready sessions, running sessions, active response/editing states, unknown states, ambiguous targets, or newly-created rows.

Managed Codex responses include agent.command_health where botctl can verify registry identity. ok_codex means the verified pane command is codex; ok_node_codex_managed means the verified registry-backed Codex pane is currently running node; unexpected_command means the verified pane has another command; dead and ambiguous_target report missing or mismatched identity; not_applicable is used for non-Codex providers or when no live lookup was performed.
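The mapping above can be read as a small decision function. The Python sketch below is one interpretation, not botctl's code: the identity argument and its values ('verified', 'missing', 'mismatched') are hypothetical names for the registry check, and it assumes dead corresponds to a missing identity and ambiguous_target to a mismatched one.

```python
from typing import Optional


def command_health(provider: str, identity: Optional[str],
                   pane_command: Optional[str]) -> str:
    """Classify a managed pane.

    identity is 'verified', 'missing', or 'mismatched', or None when no
    live lookup was performed (hypothetical values for this sketch).
    """
    if provider != "codex" or identity is None:
        return "not_applicable"
    if identity == "missing":
        return "dead"
    if identity == "mismatched":
        return "ambiguous_target"
    # From here on the registry identity is verified.
    if pane_command == "codex":
        return "ok_codex"
    if pane_command == "node":
        return "ok_node_codex_managed"
    return "unexpected_command"
```

Identity is checked before the command, so a pane that merely happens to run codex in the wrong window is never reported as healthy.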

Safety notes

  • prompt CLI behavior is unchanged: one-shot botctl prompt still cleans up its managed window after success.
  • MCP sessions are persistent: prompt does not auto-kill after a reply.
  • Only prompt resurrects killed/dead/missing managed IDs; wait, snapshot, and send_keys report current state instead.
  • Automation resolves the managed ID back to one exact tmux pane before acting.
  • Unknown and unsupported blocker states return structured outcomes instead of guessing.
  • Visible Codex provider/API errors return structured provider_error outcomes with error_excerpt; this is a tool result, not a JSON-RPC transport error.
  • FolderTrustPrompt may be advanced with raw Enter; ordinary permission dialogs require supported policy and otherwise block.
  • Agy command-permission prompts are auto-approved with raw Enter only when the pane process is agy and the default cursor is still on the "1. Yes" option; auto-approval is skipped when policy.no_yolo is set. Agy folder-trust and settings-persist prompts always block for manual review; the settings-persist prompt in particular can mutate settings.json.
  • For non-Claude providers, prompt submission pastes the text and presses Enter; Claude-only keybindings (ClearInput, ExternalEditor) are not consulted.
  • For Agy, fresh-message extraction is pane-scrape based: when a new message is detected, prompt returns the latest assistant text with fresh_message: true; otherwise it keeps polling until the timeout and reports stale_transcript. Use snapshot to inspect the pane directly.
  • send_keys only reports that keys/text were sent. It does not imply the agent made progress.

Smoke and torture testing

Manual smoke flow with real tmux and the chosen agent CLI on PATH:

  1. Start botctl mcp stdio or botctl mcp http --bind 127.0.0.1:8787.
  2. Call the provider-specific spawn tool, such as spawn_claude, with a temp repo cwd.
  3. Call prompt twice for the same ID and verify the second response uses the same persistent window.
  4. Call snapshot and inspect the exact tmux IDs.
  5. Call kill and verify the managed window is gone.

Torture scenarios to run manually before widening client use: concurrent prompts for one ID should yield one active operation and busy for the rest; concurrent prompts across different IDs should progress independently; killing the tmux window externally should make wait return dead, while prompt may instead resurrect the same ID from its persisted launch configuration; killing while a wait holds the lock should return busy.