The runtime spawns, reuses, and reaps the containers that agents (and meeting bots) run in. It is the only component that touches the orchestrator; the control plane only asks it to run a dispatch.

Why it exists: safety

An agent is an untrusted, tool-using process operating on sensitive data. So every dispatch runs isolated (its own container), sandboxed (no egress except through brokered tools), and scoped to only the workspaces and tools it was granted. Isolation is what makes the governance real rather than advisory — which is why agents never run in the control plane.

One lifecycle, one substrate

  • TTL-on-idle — a container lives while it works and is reaped when idle. No warm/oneshot bookkeeping; continuity is the session file in the workspace.
  • Sub-second, ephemeral, thousands in parallel — the single-machine coding-agent model, made multi-tenant and cheap.
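The TTL-on-idle rule can be sketched as follows. This is a minimal illustration, not the runtime's code: IdleReaper, Workload, touch, reap, and idle_ttl_s are hypothetical names, and the sketch only tracks records in memory rather than stopping real containers.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Workload:
    """Hypothetical per-container record; not the runtime's real type."""
    workload_id: str
    last_active: float  # timestamp of the last observed activity


@dataclass
class IdleReaper:
    idle_ttl_s: float  # assumed setting name for the idle TTL, in seconds
    workloads: Dict[str, Workload] = field(default_factory=dict)

    def touch(self, workload_id: str, now: Optional[float] = None) -> None:
        """Record activity: create the record on first use, otherwise reuse it."""
        now = time.monotonic() if now is None else now
        wl = self.workloads.get(workload_id)
        if wl is None:
            self.workloads[workload_id] = Workload(workload_id, now)
        else:
            wl.last_active = now

    def reap(self, now: Optional[float] = None) -> List[str]:
        """Remove and return the ids of workloads idle for longer than the TTL."""
        now = time.monotonic() if now is None else now
        expired = [
            wid for wid, wl in self.workloads.items()
            if now - wl.last_active > self.idle_ttl_s
        ]
        for wid in expired:
            del self.workloads[wid]
        return expired
```

The only state is a last-activity timestamp per workload, which matches the point above: there is no separate warm or oneshot mode to track, and anything that must survive a reap has to live in the workspace.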

Where it runs

The runtime is orchestration-agnostic. The kernel owns the runtime.v1 lifecycle — starting → running → stopping → stopped → destroyed, emitting an event on every transition — and delegates the one substrate-specific question (how do I start, observe, and stop a workload?) to a pluggable Backend with a five-method port: start · exit_code · terminate · kill · cleanup. The same control plane and the same unit.v1 dispatch drive every backend; only the implementation behind that port differs:
  • Process — agents and bots are spawned as child processes, no Docker socket required.
  • Docker — each workload is its own container via the Docker socket. This is what the open core ships, brought up with Docker Compose (make all).
  • Kubernetes — the same workload model scheduled as a Pod across a cluster.
The backend is selected per deployment by RUNTIME_BACKEND (default docker). Because all three honour the same port, the lifecycle a caller observes is identical across substrates — a bot and an agent are the same runtime.v1 workload, differing only by profile and env.
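As a rough sketch of how a kernel and a five-method port could fit together: the method names and the state order come from the description above, but every signature, the Kernel and FakeBackend classes, the emit callback, and the terminate-then-kill escalation are assumptions made for illustration, not the project's actual code.

```python
import os
from typing import Callable, Dict, Mapping, Optional, Protocol

# Lifecycle order as documented for runtime.v1.
STATES = ("starting", "running", "stopping", "stopped", "destroyed")


class Backend(Protocol):
    """The five-method port; argument and return types are assumptions."""
    def start(self, workload_id: str) -> None: ...
    def exit_code(self, workload_id: str) -> Optional[int]: ...  # None = still running
    def terminate(self, workload_id: str) -> None: ...           # graceful stop
    def kill(self, workload_id: str) -> None: ...                # forced stop
    def cleanup(self, workload_id: str) -> None: ...


class FakeBackend:
    """In-memory stand-in for a real substrate, for illustration only."""
    def __init__(self) -> None:
        self.codes: Dict[str, Optional[int]] = {}

    def start(self, workload_id: str) -> None:
        self.codes[workload_id] = None

    def exit_code(self, workload_id: str) -> Optional[int]:
        return self.codes[workload_id]

    def terminate(self, workload_id: str) -> None:
        self.codes[workload_id] = 0

    def kill(self, workload_id: str) -> None:
        self.codes[workload_id] = 137

    def cleanup(self, workload_id: str) -> None:
        self.codes.pop(workload_id, None)


class Kernel:
    """Owns the state machine; the backend only starts, observes, and stops."""
    def __init__(self, backend: Backend, emit: Callable[[str, str], None]) -> None:
        self.backend = backend
        self.emit = emit
        self.state: Dict[str, str] = {}

    def _move(self, workload_id: str, state: str) -> None:
        self.state[workload_id] = state
        self.emit(workload_id, state)  # one event per transition

    def run(self, workload_id: str) -> None:
        self._move(workload_id, "starting")
        self.backend.start(workload_id)
        self._move(workload_id, "running")

    def stop(self, workload_id: str) -> None:
        self._move(workload_id, "stopping")
        self.backend.terminate(workload_id)
        if self.backend.exit_code(workload_id) is None:
            self.backend.kill(workload_id)  # assumed escalation if still running
        self._move(workload_id, "stopped")

    def destroy(self, workload_id: str) -> None:
        self.backend.cleanup(workload_id)
        self._move(workload_id, "destroyed")


def backend_name(env: Mapping[str, str] = os.environ) -> str:
    """Read RUNTIME_BACKEND, falling back to the documented default."""
    return env.get("RUNTIME_BACKEND", "docker")
```

Because Kernel calls only the five port methods, swapping FakeBackend for a process, Docker, or Kubernetes implementation would leave the emitted event sequence unchanged, which is the property the paragraph above describes.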

On Kubernetes

With RUNTIME_BACKEND=k8s, a workload is a bare Pod, created with kubectl run … --restart=Never (the kernel shells out to kubectl — no client library, mirroring the Docker backend). Two choices are deliberate:
  • --restart=Never — the kernel owns restart and reaping (TTL-on-idle, max-lifetime, per-owner quotas). A Pod that resurrected itself would defeat the kernel’s “has it stopped?” detection, so the Pod must stay dead once it exits.
  • A bare Pod, not a Deployment/Job — a dispatch is a single ephemeral run, not a replicated service. The Pod is named vexa-<workloadId> (DNS-1123) in the namespace the runtime reads from the downward API (POD_NAMESPACE). Pod phase drives the backend’s exit check (Pending/Running → still running; Succeeded → exit 0; Failed → the container’s terminated exit code), which the kernel turns into the terminal state stopped with reason completed (exit 0) or failed (nonzero).
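The Pod naming, the kubectl invocation, and the phase-to-exit mapping described above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the image argument is a placeholder, the real backend passes further arguments that are elided in the text above, and the fallback for a Failed Pod with no recorded exit code and the handling of other phases are choices made here, not documented behaviour.

```python
from typing import List, Optional


def pod_name(workload_id: str) -> str:
    """Pod name as documented: vexa-<workloadId>."""
    return f"vexa-{workload_id}"


def kubectl_run_argv(workload_id: str, image: str, namespace: str) -> List[str]:
    """Build an argv for a bare Pod that never restarts itself."""
    return [
        "kubectl", "run", pod_name(workload_id),
        f"--image={image}",
        "--restart=Never",           # the kernel, not the kubelet, owns restarts
        f"--namespace={namespace}",  # e.g. the value of POD_NAMESPACE
    ]


def exit_code_from_phase(phase: str,
                         terminated_exit_code: Optional[int] = None) -> Optional[int]:
    """Map a Pod phase to an exit code; None means still running."""
    if phase in ("Pending", "Running"):
        return None
    if phase == "Succeeded":
        return 0
    if phase == "Failed":
        # Assumption: report 1 when no terminated exit code is available.
        return 1 if terminated_exit_code is None else terminated_exit_code
    # Other phases (such as Unknown) are not covered by the text above.
    raise ValueError(f"unhandled Pod phase: {phase}")


def terminal_reason(exit_code: int) -> str:
    """Reason attached to the terminal 'stopped' state."""
    return "completed" if exit_code == 0 else "failed"
```

For example, kubectl_run_argv("abc123", "example/agent:latest", "vexa") yields an argv suitable for subprocess.run, and a Pod observed in phase Failed with a terminated exit code of 2 maps to stopped with reason failed.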
Current state (open core): the Kubernetes backend implements the lifecycle only. It spawns, observes, and stops the Pod, but it does not yet do what the Docker backend does: bind-mount the workspace at /workspace and inject brokered model credentials as env (never in the dispatch envelope). The in-cluster substrate it would run under (a ServiceAccount with RBAC to create Pods, and a volume for the workspace) is intended to ship as a Helm chart and is not in this open-core tree. The shipped, self-hostable path is Docker Compose; see Deployment. Code: core/runtime/src/runtime_kernel/k8s_backend.py (lifecycle) vs core/runtime/src/runtime_kernel/docker_backend.py (the reference mount + credential path).

Already in production

Vexa’s meeting bots are browser containers spawned by this exact runtime. Running agents this way uses the same machinery with a different workload type; it is not a new system to stand up.