Architecture

botctl is built around one constraint: terminal automation is only safe when transport, observation, classification, and policy stay separate.

Current module map

  • src/tmux.rs — tmux transport, pane discovery, capture, key sending, and control-mode session management
  • src/runtime.rs — central local runtime, Unix-socket RPC, shared pane snapshots, event fanout, and yolo/action ownership
  • src/observe.rs — bounded observation, control-line parsing, and capture-backed reports
  • src/serve.rs — legacy observation helpers reused by tests while the runtime owns the live loop
  • src/screen_model.rs — best-effort stream reconstruction helper used by the runtime
  • src/classifier.rs — frame-to-state classification and recap metadata detection
  • src/automation.rs — action definitions, keybinding resolution, and guarded workflow rules
  • src/fixtures.rs — fixture recording, loading, and replay support
  • src/prompt.rs — prompt staging and external-editor handoff helpers
  • src/yolo.rs — state persistence for the yolo permission loop
  • src/agy.rs — Antigravity (agy) pane discovery, state classification, conversation id resolution, and pane-scrape last-message extraction
  • src/proc_fd.rs — shared /proc/<pid>/fd walking helpers used by agy, Pi, and last-message resolution
  • src/app.rs — command execution, status/doctor output, and top-level workflow orchestration
  • src/cli.rs — argument parsing and command definitions
  • src/main.rs — process entry point and error printing
  • src/lib.rs — crate module exports

Safety boundaries

Transport

The tmux layer should do tmux things only:

  • resolve panes
  • capture panes
  • send keys
  • open and hold control-mode connections

It should not decide whether an action is safe.

Observation

Observation is responsible for gathering terminal evidence:

  • control-mode stream lines
  • %output and %extended-output
  • tmux notifications
  • capture-pane snapshots for reconciliation
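As a rough illustration of the control-line parsing mentioned above, the sketch below splits a tmux control-mode line into an output event or a generic notification. It relies on the tmux convention that `%output` carries a pane id followed by data in which non-printable bytes and backslashes are octal-escaped. The `ControlLine` type and `parse_control_line` function are hypothetical names, not the actual API in src/observe.rs, and `%extended-output` is deliberately left as a plain notification here.

```rust
/// Hypothetical event type for this sketch; not the real botctl type.
#[derive(Debug, PartialEq, Eq)]
pub enum ControlLine {
    /// `%output <pane-id> <escaped data>`
    Output { pane_id: String, data: Vec<u8> },
    /// Any other `%name ...` line, kept unparsed.
    Notification { name: String, rest: String },
    /// Lines that do not start with `%` (for example command replies).
    Other(String),
}

/// Decode tmux's `\ooo` octal escapes; anything else is copied through.
fn unescape_octal(s: &str) -> Vec<u8> {
    let b = s.as_bytes();
    let mut out = Vec::with_capacity(b.len());
    let mut i = 0;
    while i < b.len() {
        if b[i] == b'\\' && i + 3 < b.len() {
            let digits = &b[i + 1..i + 4];
            if digits.iter().all(|d| (b'0'..=b'7').contains(d)) {
                let value = digits
                    .iter()
                    .fold(0u16, |acc, d| acc * 8 + u16::from(d - b'0'));
                if value <= 255 {
                    out.push(value as u8);
                    i += 4;
                    continue;
                }
            }
        }
        out.push(b[i]);
        i += 1;
    }
    out
}

pub fn parse_control_line(line: &str) -> ControlLine {
    if let Some(rest) = line.strip_prefix("%output ") {
        if let Some((pane, data)) = rest.split_once(' ') {
            return ControlLine::Output {
                pane_id: pane.to_string(),
                data: unescape_octal(data),
            };
        }
    }
    if let Some(rest) = line.strip_prefix('%') {
        let (name, tail) = rest.split_once(' ').unwrap_or((rest, ""));
        return ControlLine::Notification {
            name: name.to_string(),
            rest: tail.to_string(),
        };
    }
    ControlLine::Other(line.to_string())
}
```

Keeping the parser free of any state or policy decisions matches the separation described at the top of this page: it only turns lines into evidence.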

capture-pane is still the primary source for classification. In the central runtime, the live stream model is a best-effort helper: it can break ties when stream-driven reconciliation would otherwise stay Unknown, but capture-backed snapshots remain the ground truth.
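The precedence rule can be written down in a few lines. This is a minimal sketch that assumes the stream model produces an optional hint; the `State` subset and the `reconcile` function are illustrative names, not the runtime's actual code.

```rust
/// Reduced state set for this sketch only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    ChatReady,
    BusyResponding,
    Unknown,
}

/// The capture-backed classification wins whenever it is not Unknown.
/// The stream-derived hint is consulted only to break an Unknown tie.
pub fn reconcile(from_capture: State, stream_hint: Option<State>) -> State {
    match (from_capture, stream_hint) {
        (State::Unknown, Some(hint)) => hint,
        (settled, _) => settled,
    }
}
```

With this shape the stream model can never override a definite capture result, which is the property the paragraph above describes.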

Classification

The classifier turns a frame into an explicit state.

Current states:

  • ChatReady
  • PromptEditing
  • UserQuestionPrompt
  • BusyResponding
  • PermissionDialog
  • PlanApprovalPrompt
  • FolderTrustPrompt
  • SurveyPrompt
  • ExternalEditorActive
  • DiffDialog
  • Unknown

Unknown is preferred over a false positive.
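One way to picture "Unknown over a false positive" is a classifier that refuses to pick a state when its evidence conflicts. The sketch below uses the state names listed above, but the `PaneState` name, the `MARKERS` table, and its marker strings are placeholders invented for illustration; they are not the real detection rules in src/classifier.rs.

```rust
/// The states listed in this document; the enum name is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PaneState {
    ChatReady,
    PromptEditing,
    UserQuestionPrompt,
    BusyResponding,
    PermissionDialog,
    PlanApprovalPrompt,
    FolderTrustPrompt,
    SurveyPrompt,
    ExternalEditorActive,
    DiffDialog,
    Unknown,
}

/// Hypothetical markers: placeholder strings, not real screen text.
const MARKERS: &[(&str, PaneState)] = &[
    ("[permission]", PaneState::PermissionDialog),
    ("[trust-folder]", PaneState::FolderTrustPrompt),
    ("[ready]", PaneState::ChatReady),
];

/// Returns a state only when exactly one kind of evidence is present.
/// No evidence, or conflicting evidence, both yield Unknown.
pub fn classify(frame: &str) -> PaneState {
    let mut matched: Option<PaneState> = None;
    for (marker, state) in MARKERS {
        if frame.contains(*marker) {
            match matched {
                None => matched = Some(*state),
                Some(prev) if prev != *state => return PaneState::Unknown,
                Some(_) => {}
            }
        }
    }
    matched.unwrap_or(PaneState::Unknown)
}
```

The design choice worth noting is that ambiguity is treated the same as absence: downstream policy then only has to handle one "do nothing" case.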

Automation and policy

Automation should only run after:

  1. the target is resolved to an explicit pane id
  2. the pane's owner matches the workflow: confirmed Claude-owned for guarded automation, classified as Codex for command permission approval, or passively resolved as OpenCode, Pi, or Antigravity for dashboard visibility
  3. the current classified state permits the workflow

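This is why a successful raw send-keys call is never enough on its own: it shows that tmux accepted the keys, not that the target or its state was the right one.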
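The three preconditions can be sketched as a single guard that runs before any key is sent. Everything here is illustrative: the `Owner`, `State`, `Workflow`, and `Refusal` types, the two example workflows, and the mapping from workflow to allowed owner and state are assumptions, not the rules in src/automation.rs. The only external fact used is that tmux pane ids have the form `%` followed by digits. Passively resolved panes are folded into `Other` and never automated in this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Owner {
    Claude,
    Codex,
    Other,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    ChatReady,
    PermissionDialog,
    Unknown,
}

/// Hypothetical workflows used only for this example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Workflow {
    SubmitPrompt,
    ApprovePermission,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Refusal {
    AmbiguousTarget,
    WrongOwner,
    StateForbids,
}

/// True only for explicit tmux pane ids such as `%7`.
pub fn is_explicit_pane_id(target: &str) -> bool {
    match target.strip_prefix('%') {
        Some(rest) => !rest.is_empty() && rest.bytes().all(|b| b.is_ascii_digit()),
        None => false,
    }
}

/// Checks the preconditions in the order listed above.
pub fn guard(
    target: &str,
    owner: Owner,
    state: State,
    workflow: Workflow,
) -> Result<(), Refusal> {
    if !is_explicit_pane_id(target) {
        return Err(Refusal::AmbiguousTarget);
    }
    // Assumed mapping: which owner and state each workflow accepts.
    let (owner_ok, state_ok) = match workflow {
        Workflow::SubmitPrompt => (owner == Owner::Claude, state == State::ChatReady),
        Workflow::ApprovePermission => (
            matches!(owner, Owner::Claude | Owner::Codex),
            state == State::PermissionDialog,
        ),
    };
    if !owner_ok {
        return Err(Refusal::WrongOwner);
    }
    if !state_ok {
        return Err(Refusal::StateForbids);
    }
    Ok(())
}
```

Because `Unknown` never satisfies any workflow's state check, a classifier that declines to guess automatically blocks automation.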

Runtime model

Today botctl uses one authoritative live owner and several clients:

  • runtime owns tmux control mode, pane discovery, classification snapshots, guarded actions, and yolo supervision
  • dashboard is a runtime client
  • serve is a runtime-backed stream and HTTP facade
  • yolo is a runtime policy client
  • observe remains a bounded one-shot diagnostic command

SQLite remains the durable store for workspace identity, pending prompts, desired yolo state, and cached runtime metadata.

Observation model

Today botctl uses two observation paths:

  • bounded one-shot observation through observe
  • long-lived observation through runtime

The current live model is still a compromise:

  • stream events give low latency
  • capture-pane gives authoritative snapshots
  • classification still runs on captured pane text, not a full reconstructed terminal screen

That means the central runtime is a foundation, not the finished screen model.

Central runtime architecture today

The current runtime implementation is intentionally local and explicit:

  • one foreground runtime process
  • one Unix socket per state root at <state-root>/runtime.sock
  • one tmux control-mode session per observed tmux session
  • per-pane buffering of recent streamed output
  • screen_model reconstruction as a helper layer, not the source of truth
  • debounced reconciliation via capture-pane
  • structured events fanned out to local clients
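Two items from the list above lend themselves to a small sketch: the socket location and debounced reconciliation. The socket path follows directly from `<state-root>/runtime.sock`. The `Debouncer` type is a hypothetical illustration of one common debouncing approach (wait for a quiet period after the last stream event, then capture once); the actual runtime may time its captures differently.

```rust
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Socket location as described above: `<state-root>/runtime.sock`.
pub fn runtime_socket_path(state_root: &Path) -> PathBuf {
    state_root.join("runtime.sock")
}

/// Hypothetical per-pane debouncer: a capture becomes due once no stream
/// event has arrived for `quiet`, and is reported exactly once.
pub struct Debouncer {
    quiet: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    pub fn new(quiet: Duration) -> Self {
        Debouncer { quiet, last_event: None }
    }

    /// Record that streamed output arrived at `now`.
    pub fn note_event(&mut self, now: Instant) {
        self.last_event = Some(now);
    }

    /// Returns true when a capture-pane reconciliation should run.
    /// The pending event is cleared so the same burst is captured once.
    pub fn should_capture(&mut self, now: Instant) -> bool {
        match self.last_event {
            Some(t) if now.saturating_duration_since(t) >= self.quiet => {
                self.last_event = None;
                true
            }
            _ => false,
        }
    }
}
```

Passing `now` in explicitly, rather than reading the clock inside the type, keeps the timing logic testable without sleeping.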

serve sits on top of that runtime and filters the shared state to one session for stdout and HTTP responses.

For operator convenience, dashboard, yolo, and serve can auto-start that runtime in a hidden tmux session. Managed clients keep lightweight runtime leases so the runtime is not torn down while another managed listener is still using it.
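The lease rule amounts to reference counting: the runtime may be torn down only when no lease is outstanding. The sketch below shows that rule in-process with hypothetical `LeaseTable` and `Lease` types. Real leases have to work across separate client processes, so this is only a model of the counting behavior, not of how botctl stores or expires leases.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical lease registry; counts outstanding leases.
#[derive(Clone, Default)]
pub struct LeaseTable {
    active: Arc<AtomicUsize>,
}

/// Held by a managed client for as long as it needs the runtime.
/// Dropping the lease releases it.
pub struct Lease {
    active: Arc<AtomicUsize>,
}

impl LeaseTable {
    pub fn acquire(&self) -> Lease {
        self.active.fetch_add(1, Ordering::SeqCst);
        Lease { active: Arc::clone(&self.active) }
    }

    /// The runtime may shut down only when nobody holds a lease.
    pub fn may_tear_down(&self) -> bool {
        self.active.load(Ordering::SeqCst) == 0
    }
}

impl Drop for Lease {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}
```

For example, if dashboard and serve each hold a lease and dashboard exits first, `may_tear_down` stays false until serve releases its lease as well.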

This is the first slice of the larger serve-mode plan described in PLANS-Serve-Mode.md.

Design rules

  • prefer explicit pane ids over names or indexes
  • never automate ambiguous targets
  • keep observation and policy separate
  • preserve the user's Claude keybindings as the source of truth
  • keep fixture-based regression coverage close to classifier behavior
  • update guarded workflows and tests in the same change when classifier states change