Why it exists: safety
An agent is an untrusted, tool-using process operating on sensitive data. So every dispatch runs isolated (its own container), sandboxed (no egress except through brokered tools), and scoped to only the workspaces and tools it was granted. Isolation is what makes the governance real rather than advisory, which is why agents never run in the control plane.
One lifecycle, one substrate
- TTL-on-idle — a container lives while it works and is reaped when idle. No warm/oneshot bookkeeping; continuity is the session file in the workspace.
- Sub-second, ephemeral, thousands in parallel — the single-machine coding-agent model, made multi-tenant and cheap.
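As a rough illustration of TTL-on-idle, the reaping decision reduces to comparing each container's last activity against an idle budget. The sketch below is an assumption-laden toy: the `Container` record, the `idle_to_reap` helper, and the TTL value are hypothetical names chosen for this example, not the kernel's actual code.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """Hypothetical record kept per running container (names are assumptions)."""
    workload_id: str
    last_activity: float  # seconds, on the same clock as `now`


def idle_to_reap(containers: list[Container], now: float, idle_ttl: float) -> list[str]:
    """Return the ids of containers that have been idle for at least idle_ttl seconds."""
    return [c.workload_id for c in containers if now - c.last_activity >= idle_ttl]


# Example: with a 60-second idle TTL, only the container idle since t=0 is selected.
fleet = [Container("a", last_activity=0.0), Container("b", last_activity=90.0)]
to_reap = idle_to_reap(fleet, now=100.0, idle_ttl=60.0)
```

Because continuity lives in the session file in the workspace, a reaped container carries no state that the next dispatch would need.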
Where it runs
The runtime is orchestration-agnostic. The kernel owns the runtime.v1
lifecycle — starting → running → stopping → stopped → destroyed, emitting an event on every
transition — and delegates the one substrate-specific question (how do I start, observe, and stop a
workload?) to a pluggable Backend with a five-method port: start · exit_code · terminate ·
kill · cleanup. The same control plane and the same unit.v1 dispatch drive every backend; only
the implementation behind that port differs:
- Process: agents and bots are spawned as child processes, no Docker socket required.
- Docker: each workload is its own container via the Docker socket. This is what the open core ships, brought up with Docker Compose (make all).
- Kubernetes: the same workload model scheduled as a Pod across a cluster.
The backend is selected with RUNTIME_BACKEND (default docker). Because all three
honour the same port, the lifecycle a caller observes is identical across substrates — a bot and an
agent are the same runtime.v1 workload, differing only by profile and env.
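The five-method port can be pictured as a small Python protocol. This is a minimal sketch, not the code in core/runtime: the method names and lifecycle states come from the description above, but the signatures, the `WorkloadSpec` type, the `FakeBackend`, and `run_once` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass
class WorkloadSpec:
    """Hypothetical workload description; the field names are assumptions."""
    workload_id: str
    image: str
    env: dict[str, str] = field(default_factory=dict)


class Backend(Protocol):
    """The five-method port; signatures here are illustrative guesses."""

    def start(self, spec: WorkloadSpec) -> None: ...
    def exit_code(self, workload_id: str) -> Optional[int]: ...  # None while running
    def terminate(self, workload_id: str) -> None: ...  # graceful stop
    def kill(self, workload_id: str) -> None: ...  # forced stop
    def cleanup(self, workload_id: str) -> None: ...  # release substrate resources


class FakeBackend:
    """In-memory stand-in, used only to show that callers see one lifecycle."""

    def __init__(self) -> None:
        self._codes: dict[str, Optional[int]] = {}

    def start(self, spec: WorkloadSpec) -> None:
        self._codes[spec.workload_id] = None

    def exit_code(self, workload_id: str) -> Optional[int]:
        return self._codes[workload_id]

    def terminate(self, workload_id: str) -> None:
        self._codes[workload_id] = 0

    def kill(self, workload_id: str) -> None:
        self._codes[workload_id] = 137

    def cleanup(self, workload_id: str) -> None:
        self._codes.pop(workload_id, None)


def run_once(backend: Backend, spec: WorkloadSpec) -> list[str]:
    """Walk one workload through the lifecycle, recording each state entered."""
    states = ["starting"]
    backend.start(spec)
    states.append("running")
    states.append("stopping")
    backend.terminate(spec.workload_id)
    if backend.exit_code(spec.workload_id) is None:
        backend.kill(spec.workload_id)  # escalate if the graceful stop did not land
    states.append("stopped")
    backend.cleanup(spec.workload_id)
    states.append("destroyed")
    return states


states = run_once(FakeBackend(), WorkloadSpec("w1", "example-image"))
```

Any class with these five methods satisfies the protocol structurally, so a Process, Docker, or Kubernetes implementation could be swapped in without changing `run_once`.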
On Kubernetes
With RUNTIME_BACKEND=k8s, a workload is a bare Pod, created with kubectl run … --restart=Never
(the kernel shells out to kubectl — no client library, mirroring the Docker backend). Two choices are
deliberate:
- --restart=Never: the kernel owns restart and reaping (TTL-on-idle, max-lifetime, per-owner quotas). A Pod that resurrected itself would defeat the kernel’s “has it stopped?” detection, so the Pod must stay dead once it exits.
- A bare Pod, not a Deployment/Job: a dispatch is a single ephemeral run, not a replicated service.
The Pod is named vexa-<workloadId> (DNS-1123) in the namespace the runtime reads from the downward API (POD_NAMESPACE). Pod phase drives the backend’s exit check (Pending/Running → still running; Succeeded → exit 0; Failed → the container’s terminated exit code), which the kernel turns into the terminal state stopped with reason completed (exit 0) or failed (nonzero).
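The naming rule and the phase-to-exit-code mapping can be sketched in a few lines of Python. The function names are hypothetical, and three behaviours are assumptions the text above does not state: rejecting (rather than normalising) ids that are not valid DNS-1123 labels, falling back to exit code 1 when a Failed Pod reports no container exit code, and raising on any other phase.

```python
import re
from typing import Optional, Tuple

# DNS-1123 label: lowercase alphanumerics and '-', alphanumeric at both ends, max 63 chars.
_DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def pod_name(workload_id: str) -> str:
    """Build the vexa-<workloadId> name; this sketch rejects invalid ids (an assumption)."""
    name = f"vexa-{workload_id}"
    if len(name) > 63 or not _DNS1123_LABEL.fullmatch(name):
        raise ValueError(f"not a DNS-1123 label: {name!r}")
    return name


def exit_code_from_phase(phase: str, terminated_exit_code: Optional[int] = None) -> Optional[int]:
    """Return None while the Pod is still running, otherwise its exit code."""
    if phase in ("Pending", "Running"):
        return None
    if phase == "Succeeded":
        return 0
    if phase == "Failed":
        # Assumption: fall back to 1 if the container status carries no exit code.
        return terminated_exit_code if terminated_exit_code is not None else 1
    # Assumption: any other phase (for example Unknown) is surfaced as an error.
    raise ValueError(f"unhandled Pod phase: {phase!r}")


def terminal_state(exit_code: int) -> Tuple[str, str]:
    """Map an exit code to the (state, reason) pair described above."""
    return ("stopped", "completed" if exit_code == 0 else "failed")
```

For instance, a Pod in phase Failed whose container terminated with code 137 would yield `("stopped", "failed")`, while a Pending Pod yields `None` and the kernel keeps polling.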
Current state (open core): the Kubernetes backend implements the lifecycle — it spawns,
observes, and stops the Pod — but not yet the workspace mount or credential brokering the Docker
backend does (bind-mounting the workspace at
/workspace and injecting brokered
model credentials as env, never in the dispatch envelope). The in-cluster substrate it would run under —
a ServiceAccount + RBAC to create Pods, and a volume for the workspace — is intended to ship as a Helm
chart and is not in this open-core tree. The shipped, self-hostable path is Docker Compose — see
Deployment. Code: core/runtime/src/runtime_kernel/k8s_backend.py (lifecycle) vs
core/runtime/src/runtime_kernel/docker_backend.py (the reference mount + credential path).
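To make the Docker reference path more concrete, here is one way the mount and credential injection could be expressed as a `docker run` command line. This is a sketch under assumptions, not the contents of docker_backend.py: the function name, its parameters, and the choice to pass credentials by name only (so the values travel in the child process environment rather than in the argument list) are all hypothetical.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def docker_run_command(
    name: str,
    image: str,
    workspace_dir: str,
    credentials: Mapping[str, str],
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for launching one workload container."""
    argv = [
        "docker", "run", "--detach",
        "--name", name,
        "--volume", f"{workspace_dir}:/workspace",  # bind-mount the workspace
    ]
    for key in sorted(credentials):
        # Name only: docker copies the value from the CLI's own environment,
        # so the secret never appears in the argument list.
        argv += ["--env", key]
    argv.append(image)
    env = dict(os.environ if base_env is None else base_env)
    env.update(credentials)
    return argv, env


argv, env = docker_run_command(
    name="example-workload",
    image="example-image:latest",
    workspace_dir="/srv/workspaces/w1",
    credentials={"MODEL_API_KEY": "secret-value"},
    base_env={},
)
```

The pair would be handed to something like `subprocess.run(argv, env=env)`. A Kubernetes backend that reached parity would need an equivalent for both halves: a volume for the workspace and a brokered source for the credentials.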