Base URL & auth
| Deployment | Base URL |
|---|---|
| Hosted | https://api.cloud.vexa.ai |
| Self-hosted | http://localhost:18056 (the gateway; API_GATEWAY_HOST_PORT, default 18056) |
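As a minimal sketch, a client could pick its base URL like this. The helper names `base_url` and `endpoint` are hypothetical. Reading API_GATEWAY_HOST_PORT on the client side is an assumption (it is the gateway's port setting), so adjust this if your client runs elsewhere.

```python
import os

HOSTED_BASE_URL = "https://api.cloud.vexa.ai"
DEFAULT_GATEWAY_PORT = 18056  # documented default for API_GATEWAY_HOST_PORT


def base_url(self_hosted, env=None):
    """Return the API base URL for a hosted or self-hosted deployment."""
    if not self_hosted:
        return HOSTED_BASE_URL
    # Assumption: the client can see the same variable the gateway uses.
    env = os.environ if env is None else env
    port = int(env.get("API_GATEWAY_HOST_PORT", DEFAULT_GATEWAY_PORT))
    return f"http://localhost:{port}"


def endpoint(base, path):
    """Join a base URL and an API path such as /bots."""
    return base.rstrip("/") + "/" + path.lstrip("/")
```

For example, `endpoint(base_url(True, {}), "/bots")` yields the self-hosted URL for the send-a-bot call.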
Platforms
Pass one of these as platform:
| Platform | platform value |
|---|---|
| Google Meet | google_meet |
| Zoom | zoom |
| Microsoft Teams | teams |
Send a bot to a meeting
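POST /bots
| Field | Description |
|---|---|
| platform | google_meet · zoom · teams |
| native_meeting_id | The meeting id from the join URL (e.g. abc-defg-hij). |
| (bot display name) | Display name the bot uses in the call. Defaults to Vexa. |
| language | ISO code (e.g. en); omit to auto-detect. |
| task | transcribe (default) or translate. |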
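The request described above might be assembled as follows. This is a sketch under stated assumptions: the JSON body keys mirror the parameter names used elsewhere on this page, the `X-API-Key` header name is an assumption (this section does not spell out the auth header), and the display-name field is left out because its key is not given here. The function only builds the request and sends nothing.

```python
import json
import urllib.request

PLATFORMS = {"google_meet", "zoom", "teams"}
TASKS = {"transcribe", "translate"}


def build_bot_request(base_url, api_key, platform, native_meeting_id,
                      language=None, task="transcribe"):
    """Build (but do not send) a POST /bots request."""
    if platform not in PLATFORMS:
        raise ValueError(f"unsupported platform: {platform!r}")
    if task not in TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    body = {
        "platform": platform,
        "native_meeting_id": native_meeting_id,
        "task": task,
    }
    if language is not None:  # omit to let the service auto-detect
        body["language"] = language
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/bots",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # assumed header name
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen`.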
Get the transcript
GET /transcripts/{platform}/{native_meeting_id}
Response
Pending drafts arrive with completed: false and are replaced by completed: true confirmations.
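A small sketch of how a client might build the transcript URL and keep only confirmed segments. The `completed` flag comes from the response described above. Treating the segments as a plain list of dicts is an assumption, and the helper names are hypothetical.

```python
from urllib.parse import quote


def transcript_url(base_url, platform, native_meeting_id):
    """Build the GET /transcripts/{platform}/{native_meeting_id} URL."""
    return "{}/transcripts/{}/{}".format(
        base_url.rstrip("/"),
        quote(platform, safe=""),
        quote(native_meeting_id, safe=""),
    )


def confirmed_only(segments):
    """Drop pending drafts; keep segments marked completed: true."""
    return [seg for seg in segments if seg.get("completed") is True]
```

Filtering this way is useful when you store or summarize a transcript and do not want draft text that may still change.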
Manage the bot
Update config — PUT /bots/{platform}/{native_meeting_id}/config
Stop / leave — DELETE /bots/{platform}/{native_meeting_id}
Running bots — GET /bots/status
Make the bot speak — POST /bots/{platform}/{native_meeting_id}/speak
Config (PUT …/config, change language/task mid-call) and speak (POST …/speak, TTS into the call) ride the live bot-control plane and are not yet wired in the v0.12 open-core stack; they currently return 404. Send-a-bot, stop, running bots (GET /bots/status), list, and transcripts are live.
List meetings — GET /meetings
Single meeting — GET /meetings/{meeting_id}
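The bot-management endpoints listed under Manage the bot can be collected in a small route table that also records which calls currently return 404 on the v0.12 open-core stack. This is an illustrative sketch; the action names and the `resolve` helper are hypothetical.

```python
# action -> (HTTP method, path template, live on v0.12 open-core?)
BOT_ROUTES = {
    "update_config": ("PUT", "/bots/{platform}/{native_meeting_id}/config", False),
    "stop": ("DELETE", "/bots/{platform}/{native_meeting_id}", True),
    "running_bots": ("GET", "/bots/status", True),
    "speak": ("POST", "/bots/{platform}/{native_meeting_id}/speak", False),
}


def resolve(action, **params):
    """Return (method, path, live) for a bot-management action."""
    method, template, live = BOT_ROUTES[action]
    return method, template.format(**params), live
```

A caller can check the `live` flag before sending, and treat a 404 from `update_config` or `speak` as "not wired yet" instead of "meeting not found".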
Speaker-attributed transcripts
Each segment is diarized — attributed to a speaker (a bound display name, or a provisional label until it binds) — with word-level timestamps (words[]) and a confidence. Speaker attribution is
text-level (who said what), via speaker binding / clustering / captions — not separate audio tracks.
Times are seconds from session start; absolute_start_time / absolute_end_time give wall-clock.
The same segments arrive live as pending drafts (completed: false) and are then confirmed (completed: true). As the meeting runs, the gateway forwards the confirmed-plus-pending bundle to subscribers.
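Because attribution is text-level, grouping a transcript by speaker is plain list processing. The sketch below merges consecutive segments from the same speaker into turns. The segment keys `speaker`, `text`, `start`, and `end` are assumed names for illustration, so check them against the actual response.

```python
def speaker_turns(segments):
    """Merge consecutive same-speaker segments into turns.

    Assumed segment keys: speaker, text, start, end (seconds from
    session start).
    """
    turns = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + seg["text"]
            turns[-1]["end"] = seg["end"]
        else:
            turns.append({
                "speaker": seg["speaker"],
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"],
            })
    return turns
```

A provisional speaker label that later binds to a display name would show up here as a different speaker, so you may want to re-run the grouping once the meeting has ended.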
Recordings
The meeting’s audio recording is uploaded to object storage. On a self-host this is your own MinIO bucket, so it never leaves your environment. This is the meeting audio, stored separately from the diarized transcript above (there is no “per-speaker audio”; speaker separation lives in the transcript as text).
List recordings — GET /recordings
Recording detail — GET /recordings/{recording_id}
Master metadata (finalize-on-read) — GET /recordings/{recording_id}/master?type=audio
The master metadata includes a raw_url pointing at the byte stream, GET /recordings/{recording_id}/media/{media_file_id}/raw, which the player loads.
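The recording endpoints above compose into URLs as sketched here. Whether raw_url comes back absolute or relative to the gateway is not stated in this section, so `resolve_raw_url` (a hypothetical helper) handles both cases.

```python
from urllib.parse import urlencode, urljoin


def master_url(base_url, recording_id, media_type="audio"):
    """Build GET /recordings/{recording_id}/master?type=audio."""
    query = urlencode({"type": media_type})
    return f"{base_url.rstrip('/')}/recordings/{recording_id}/master?{query}"


def raw_media_url(base_url, recording_id, media_file_id):
    """Build GET /recordings/{recording_id}/media/{media_file_id}/raw."""
    return (f"{base_url.rstrip('/')}/recordings/{recording_id}"
            f"/media/{media_file_id}/raw")


def resolve_raw_url(base_url, raw_url):
    """Resolve a raw_url that may be absolute or gateway-relative."""
    return urljoin(base_url.rstrip("/") + "/", raw_url)
```

A player would typically request the master metadata first and then load whatever `resolve_raw_url` returns for its raw_url.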
In Vexa’s runtime terms, a bot is a browser container; the transcript it produces compiles into the workspace, where agents act on it. See Meetings.