The Meetings API sends a bot into a call and returns its transcript in real time. It is the public Vexa API — the same surface whether you use the hosted service or self-host.

Base URL & auth

Hosted: https://api.cloud.vexa.ai
Self-hosted: http://localhost:18056 (the gateway; API_GATEWAY_HOST_PORT, default 18056)
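Every request carries your key in the X-API-Key header:
-H "X-API-Key: <API_KEY>"
The curl examples on this page read the base URL from $API_BASE and the key from $API_KEY.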

Platforms

Pass one of these as platform:
Platform | platform value
Google Meet | google_meet
Zoom | zoom
Microsoft Teams | teams
The bot joins like any participant — no plugins, no host configuration.

Send a bot to a meeting

platform (string, required): google_meet · zoom · teams
native_meeting_id (string, required): The meeting id from the join URL (e.g. abc-defg-hij).
bot_name (string): Display name the bot uses in the call. Defaults to Vexa.
language (string): ISO code (e.g. en); omit to auto-detect.
task (string): transcribe (default) or translate.
POST /bots
curl -X POST "$API_BASE/bots" \
  -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
  -d '{"platform":"google_meet","native_meeting_id":"abc-defg-hij","bot_name":"Vexa","language":"en"}'
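The same request can be assembled from Python. The following is a minimal sketch using only the standard library; the helper name build_send_bot_request and the API_BASE / API_KEY environment variables are illustrative assumptions, not part of the API. It builds the request without sending it, so the payload can be inspected first.

```python
import json
import os
import urllib.request

# Assumption: same environment variables as the curl examples on this page.
API_BASE = os.environ.get("API_BASE", "http://localhost:18056")  # self-hosted default
API_KEY = os.environ.get("API_KEY", "")

PLATFORMS = {"google_meet", "zoom", "teams"}


def build_send_bot_request(platform, native_meeting_id, bot_name=None, language=None, task=None):
    """Build (but do not send) a POST /bots request. Hypothetical helper."""
    if platform not in PLATFORMS:
        raise ValueError(f"unsupported platform: {platform!r}")
    body = {"platform": platform, "native_meeting_id": native_meeting_id}
    # Optional fields are omitted when unset so the documented defaults apply.
    for key, value in (("bot_name", bot_name), ("language", language), ("task", task)):
        if value is not None:
            body[key] = value
    return urllib.request.Request(
        f"{API_BASE}/bots",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )


req = build_send_bot_request("google_meet", "abc-defg-hij", bot_name="Vexa", language="en")
# To actually send it: urllib.request.urlopen(req)
```

Leaving language unset keeps auto-detection, and leaving task unset keeps the transcribe default described above.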

Get the transcript

GET /transcripts/{platform}/{native_meeting_id}
curl -H "X-API-Key: $API_KEY" \
  "$API_BASE/transcripts/google_meet/abc-defg-hij"
Response
{
  "segments": [
    {
      "segment_id": "sess_9c2a:spk1:12400",
      "speaker": "Jane Liu",
      "text": "Let's lock the renewal pricing by July 1.",
      "start": 12.4, "end": 15.1,
      "language": "en",
      "completed": true,
      "confidence": 0.93,
      "words": [{ "word": "Let's", "start": 12.4, "end": 12.6, "probability": 0.98 }]
    }
  ]
}
Segments stream in while the meeting runs — poll this endpoint, or subscribe over WebSocket for live, per-segment updates. Live drafts arrive as completed: false and are replaced by completed: true confirmations.
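Whether you poll or subscribe, the client has to fold drafts and confirmations into one view. The sketch below shows one way to do that, under the assumption (not stated on this page) that a confirmation carries the same segment_id as the draft it replaces. merge_segments and ordered are hypothetical helper names.

```python
def merge_segments(current, incoming):
    """Fold a batch of segments into a dict keyed by segment_id.

    Assumption: a draft and its confirmation share one segment_id.
    A confirmed segment is never overwritten by a later draft.
    """
    for seg in incoming:
        seg_id = seg["segment_id"]
        existing = current.get(seg_id)
        if existing is not None and existing.get("completed") and not seg.get("completed"):
            continue  # keep the confirmation, ignore the stale draft
        current[seg_id] = seg
    return current


def ordered(current):
    """Return the merged segments sorted by start time."""
    return sorted(current.values(), key=lambda s: s["start"])


state = {}
merge_segments(state, [{"segment_id": "a", "text": "Let's lock", "start": 12.4, "completed": False}])
merge_segments(state, [{"segment_id": "a", "text": "Let's lock the renewal pricing by July 1.",
                        "start": 12.4, "completed": True}])
merge_segments(state, [{"segment_id": "a", "text": "stale draft", "start": 12.4, "completed": False}])
```

Keying on segment_id makes repeated polls idempotent: fetching the same transcript twice leaves the state unchanged.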

Manage the bot

Update config — PUT /bots/{platform}/{native_meeting_id}/config
curl -X PUT "$API_BASE/bots/google_meet/abc-defg-hij/config" \
  -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
  -d '{"language":"es","task":"translate"}'
Stop / leave — DELETE /bots/{platform}/{native_meeting_id}
curl -X DELETE "$API_BASE/bots/google_meet/abc-defg-hij" -H "X-API-Key: $API_KEY"
Running bots — GET /bots/status
curl -H "X-API-Key: $API_KEY" "$API_BASE/bots/status"
Make the bot speak — POST /bots/{platform}/{native_meeting_id}/speak
curl -X POST "$API_BASE/bots/google_meet/abc-defg-hij/speak" \
  -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
  -d '{"text":"Thanks everyone, wrapping up."}'
Update config (PUT …/config, which changes language or task mid-call) and speak (POST …/speak, text-to-speech into the call) depend on the live bot-control plane, which is not yet wired into the v0.12 open-core stack; both currently return 404. Send a bot, stop, running bots (GET /bots/status), list, and transcripts are live.
List meetings — GET /meetings
curl -H "X-API-Key: $API_KEY" "$API_BASE/meetings"
Single meeting — GET /meetings/{meeting_id}
curl -H "X-API-Key: $API_KEY" "$API_BASE/meetings/12345"

Speaker-attributed transcripts

Each segment is diarized — attributed to a speaker (a bound display name, or a provisional label until it binds) — with word-level timestamps (words[]) and a confidence. Speaker attribution is text-level (who said what), via speaker binding / clustering / captions — not separate audio tracks. Times are seconds from session start; absolute_start_time / absolute_end_time give wall-clock. The same segments arrive live (completed: false, a pending draft) and then confirmed (completed: true) — the gateway forwards the confirmed-plus-pending bundle to subscribers as the meeting runs.
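As an illustration of consuming these fields, the sketch below renders segments as speaker-attributed lines and flags low-probability words. It relies only on the fields shown in the response example above; the function names and the 0.5 threshold are illustrative assumptions.

```python
def format_timestamp(seconds):
    """Render seconds from session start as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render_transcript(segments, confirmed_only=True):
    """One '[MM:SS] speaker: text' line per segment, in start order."""
    lines = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if confirmed_only and not seg.get("completed"):
            continue  # skip pending drafts
        lines.append(f'[{format_timestamp(seg["start"])}] {seg["speaker"]}: {seg["text"]}')
    return lines


def low_probability_words(segment, threshold=0.5):
    """Words whose probability falls below an (arbitrary) threshold."""
    return [w["word"] for w in segment.get("words", []) if w["probability"] < threshold]


segments = [{
    "segment_id": "sess_9c2a:spk1:12400",
    "speaker": "Jane Liu",
    "text": "Let's lock the renewal pricing by July 1.",
    "start": 12.4, "end": 15.1,
    "completed": True,
    "confidence": 0.93,
    "words": [{"word": "Let's", "start": 12.4, "end": 12.6, "probability": 0.98}],
}]
lines = render_transcript(segments)
# lines == ["[00:12] Jane Liu: Let's lock the renewal pricing by July 1."]
```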

Recordings

The meeting’s audio recording is uploaded to object storage — on a self-host, your own MinIO bucket, so it never leaves your environment. This is the meeting audio, stored separately from the diarized transcript above (there is no “per-speaker audio” — speaker separation lives in the transcript as text).
List recordings — GET /recordings
curl -H "X-API-Key: $API_KEY" "$API_BASE/recordings"
Recording detail — GET /recordings/{recording_id}
curl -H "X-API-Key: $API_KEY" "$API_BASE/recordings/42"
Master metadata (finalize-on-read) — GET /recordings/{recording_id}/master?type=audio
curl -H "X-API-Key: $API_KEY" "$API_BASE/recordings/42/master?type=audio"
The master metadata returns a raw_url pointing at the byte stream GET /recordings/{recording_id}/media/{media_file_id}/raw, which the player loads.
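A client that wants the bytes itself would fetch the master metadata, then follow raw_url. The sketch below only builds the URLs involved. It assumes raw_url may be either absolute or a path relative to the API root, which this page does not specify; the helper names and the media file id 7 are hypothetical.

```python
from urllib.parse import urlencode, urljoin


def master_url(api_base, recording_id, media_type="audio"):
    """URL for GET /recordings/{recording_id}/master?type=..."""
    query = urlencode({"type": media_type})
    return f"{api_base.rstrip('/')}/recordings/{recording_id}/master?{query}"


def resolve_raw_url(api_base, raw_url):
    """Turn raw_url into an absolute URL.

    Assumption: raw_url is either already absolute or a path relative
    to the API root; an absolute URL is returned unchanged.
    """
    return urljoin(api_base.rstrip("/") + "/", raw_url)


base = "http://localhost:18056"
meta_url = master_url(base, 42)
media_url = resolve_raw_url(base, "/recordings/42/media/7/raw")  # 7 is a made-up media_file_id
```

If your gateway is served under a path prefix, a leading-slash raw_url would resolve against the host root, so check the value your deployment actually returns.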
In Vexa’s runtime terms, a bot is a browser container; the transcript it produces compiles into the workspace, where agents act on it. See Meetings.