Create Transcription

POST /v1/recordings/{id}/transcribe

Starts an asynchronous transcription job for a recording. The recording must be in `uploaded` status. To learn when the job finishes, subscribe to webhooks for real-time notifications or poll the Get Transcription endpoint.

Authentication

Requires an API key with the `transcriptions:write` scope.

curl -X POST https://api.bota.dev/v1/recordings/rec_abc123/transcribe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "language": "en",
    "diarization": true,
    "provider": "whisper"
  }'

Path Parameters

id

string

required

The recording’s unique identifier (e.g., rec_abc123).

Request Body

language

string

Language code for the audio content. If not specified, the language is auto-detected. ISO 639-1 two-letter codes (e.g., en, es, zh) work across all providers. Some providers also accept regional variants; see Language Code Formats below.

diarization

boolean

default: true

Enable speaker diarization to identify different speakers in the transcript.

provider

string

ASR provider to use for transcription. If not specified, uses the system default.

Provider	Description
`whisper`	OpenAI Whisper (default) - 99 languages, word timestamps
`deepgram`	Deepgram Nova-2 - Real-time capable, speaker diarization
`assemblyai`	AssemblyAI - Best/Nano models, async API
`elevenlabs`	ElevenLabs - High accuracy, language detection
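
The request body options above could be assembled in JavaScript roughly as follows. This is a minimal sketch, not an official client: `buildTranscribeRequest`, its argument shape, and the `BOTA_API_KEY` environment variable in the usage comment are hypothetical names chosen for illustration. Only the URL, headers, and body fields come from this page. Unset fields are left out of the body so the documented defaults apply.

```javascript
// Hypothetical helper: builds the fetch() arguments for the transcribe endpoint.
const KNOWN_PROVIDERS = ['whisper', 'deepgram', 'assemblyai', 'elevenlabs'];

function buildTranscribeRequest(recordingId, apiKey, options = {}) {
  const { language, diarization, provider } = options;

  // Reject providers that are not in the table above before making a request.
  if (provider !== undefined && !KNOWN_PROVIDERS.includes(provider)) {
    throw new Error(`Unknown provider: ${provider}`);
  }

  // Omit unset fields so the server-side defaults apply
  // (auto-detected language, diarization on, system default provider).
  const body = {};
  if (language !== undefined) body.language = language;
  if (diarization !== undefined) body.diarization = diarization;
  if (provider !== undefined) body.provider = provider;

  return {
    url: `https://api.bota.dev/v1/recordings/${encodeURIComponent(recordingId)}/transcribe`,
    init: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (sends a real request, so it is left commented out):
// const { url, init } = buildTranscribeRequest('rec_abc123', process.env.BOTA_API_KEY, { provider: 'deepgram' });
// const transcription = await (await fetch(url, init)).json();
```

Separating request construction from the network call keeps the helper easy to test without hitting the API.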

Response

Returns the newly created transcription object with pending status.

{
  "id": "txn_abc123",
  "recording_id": "rec_abc123",
  "status": "pending",
  "language": "en",
  "provider": "whisper",
  "duration_ms": null,
  "segments": null,
  "text": null,
  "error": null,
  "created_at": "2025-01-15T10:10:00Z",
  "completed_at": null
}

Response Fields

Field	Type	Description
`id`	string	The transcription’s unique identifier (e.g., `txn_abc123`)
`recording_id`	string	The recording this transcription belongs to
`status`	string	Current status: `pending`, `processing`, `completed`, or `failed`
`language`	string \| null	Language code used for transcription
`provider`	string	ASR provider used (e.g., `whisper`, `deepgram`)
`duration_ms`	integer \| null	Audio duration in milliseconds (populated on completion)
`segments`	array \| null	Array of transcript segments with speaker labels and timestamps (populated on completion)
`text`	string \| null	Full transcript text (populated on completion)
`error`	object \| null	Error details if the transcription failed
`created_at`	string	ISO 8601 timestamp when the transcription was created
`completed_at`	string \| null	ISO 8601 timestamp when the transcription completed or failed

Transcription Status

Status	Description
`pending`	Job queued, waiting to start
`processing`	Transcription in progress
`completed`	Transcription finished successfully
`failed`	Transcription failed (check `error` field)
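
As an illustration of how client code might branch on these four statuses, here is a small sketch. The helper names `isTerminal` and `resultOrNull` are hypothetical. The `error.message` field is taken from the polling example on this page; any other field inside `error` is not documented here, so the code falls back to a generic message.

```javascript
// Statuses after which the job will not change state again.
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

// True once polling can stop.
function isTerminal(transcription) {
  return TERMINAL_STATUSES.has(transcription.status);
}

// Returns the transcript text when completed, null while the job is still
// queued or running, and throws when the job failed.
function resultOrNull(transcription) {
  switch (transcription.status) {
    case 'completed':
      return transcription.text;
    case 'failed': {
      const detail = transcription.error && transcription.error.message;
      throw new Error(detail || 'Transcription failed');
    }
    case 'pending':
    case 'processing':
      return null;
    default:
      // Guard against statuses this client does not know about.
      throw new Error(`Unexpected status: ${transcription.status}`);
  }
}
```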

Polling for Results

After starting a transcription job, poll the Get Transcription endpoint:

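// intervalMs and maxAttempts are illustrative defaults (about 30 minutes in total); tune them for your recordings.
async function waitForTranscription(transcriptionId, { intervalMs = 2000, maxAttempts = 900 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(
      `https://api.bota.dev/v1/transcriptions/${transcriptionId}`,
      { headers: { 'Authorization': 'Bearer sk_live_...' } }
    );

    // Surface HTTP errors instead of treating an error body as a transcription
    if (!response.ok) {
      throw new Error(`Get Transcription failed with HTTP ${response.status}`);
    }

    const transcription = await response.json();

    if (transcription.status === 'completed') {
      return transcription;
    }

    if (transcription.status === 'failed') {
      // Fall back to a generic message if the error object has no message
      throw new Error(transcription.error?.message ?? 'Transcription failed');
    }

    // Still pending or processing: wait before polling again
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }

  // Give up rather than polling forever
  throw new Error(`Transcription ${transcriptionId} did not finish after ${maxAttempts} polls`);
}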

Transcription typically takes 10-30% of the audio duration. A 30-minute recording usually completes in 3-10 minutes.
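
The rule of thumb above can be turned into a rough time budget for a polling loop. The sketch below assumes you already know the audio length on the client side, since `duration_ms` is only populated on completion. `estimateProcessingMs` and `pollingTimeoutMs` are hypothetical helpers, and the "twice the upper estimate, at least one minute" policy is an arbitrary illustration rather than a documented recommendation.

```javascript
// Rough processing-time window from the 10-30% rule of thumb.
function estimateProcessingMs(audioDurationMs) {
  return {
    minMs: Math.round(audioDurationMs * 0.10),
    maxMs: Math.round(audioDurationMs * 0.30),
  };
}

// Hypothetical policy: stop polling after twice the upper estimate,
// but never sooner than one minute.
function pollingTimeoutMs(audioDurationMs) {
  const { maxMs } = estimateProcessingMs(audioDurationMs);
  return Math.max(60_000, 2 * maxMs);
}

const thirtyMinutes = 30 * 60 * 1000;
console.log(estimateProcessingMs(thirtyMinutes)); // { minMs: 180000, maxMs: 540000 }
console.log(pollingTimeoutMs(thirtyMinutes));     // 1080000
```

For a 30-minute recording this gives a window of 3 to 9 minutes, which the figure quoted above rounds up to 10.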

Auto-Transcription

Instead of calling this endpoint manually, you can enable auto-transcription to automatically start a transcription job whenever a recording upload completes. Configure via the hierarchical config system:

curl -X PUT https://api.bota.dev/v1/projects/proj_xxx/config/processing \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "auto_transcription": { "enabled": true, "provider": "whisper" } }'

See the Auto-Processing Guide for full details.

Webhooks (Recommended)

For production use, subscribe to webhook events instead of polling:

transcription.started - Processing begins
transcription.completed - Transcription finished successfully
transcription.failed - Transcription failed

See Webhooks Overview for setup instructions.
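
A receiver for these three events might route them along the following lines. This is only a sketch. The event envelope is an assumption: the code expects `{ type, data }` with `data` holding the transcription object, which this page does not specify. `dispatchTranscriptionEvent` and the handler names are hypothetical. Check the Webhooks Overview for the actual payload shape and for signature verification, which is not shown here.

```javascript
// Maps the event names listed above to handler names (handler names are illustrative).
const EVENT_TO_HANDLER = new Map([
  ['transcription.started', 'onStarted'],
  ['transcription.completed', 'onCompleted'],
  ['transcription.failed', 'onFailed'],
]);

// ASSUMPTION: `event` looks like { type: 'transcription.completed', data: {...} }.
// Returns true if a registered handler was called, false otherwise.
function dispatchTranscriptionEvent(event, handlers) {
  const name = EVENT_TO_HANDLER.get(event.type);
  if (!name || typeof handlers[name] !== 'function') {
    return false; // unrelated event, or no handler registered for it
  }
  handlers[name](event.data);
  return true;
}

// Usage: react only to completed and failed jobs.
const handlers = {
  onCompleted: (t) => console.log(`Transcript ready for ${t.id}`),
  onFailed: (t) => console.error(`Transcription ${t.id} failed`),
};
dispatchTranscriptionEvent(
  { type: 'transcription.completed', data: { id: 'txn_abc123' } },
  handlers
);
```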

Language Code Formats

Each ASR provider accepts different language code formats. ISO 639-1 two-letter codes are recommended as they work across all providers.

Provider	Accepted Format	Example	Notes
`whisper`	ISO 639-1	`en`, `zh`, `ja`	99 languages supported
`deepgram`	BCP-47	`en`, `en-US`, `zh-CN`	Supports regional variants with hyphen
`assemblyai`	ISO 639-1 / underscore regional	`en`, `en_us`, `es`	Regional variants use underscore
`elevenlabs`	ISO 639-1 or ISO 639-3	`en` or `eng`, `ja` or `jpn`	Accepts both 2-letter and 3-letter codes

If you need regional accuracy (e.g., US English vs British English), use the provider-specific format:

// Deepgram - BCP-47 with hyphen
{ "language": "en-US", "provider": "deepgram" }

// AssemblyAI - underscore regional
{ "language": "en_us", "provider": "assemblyai" }
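
If your application stores language tags in one format, a small conversion step can produce the form each provider expects. The sketch below follows the format table above. `toProviderLanguage` is a hypothetical helper, and it only rewrites the separator and casing. It does not check that a given regional variant is actually supported by the provider, so consult the lists under Supported Languages per Provider for that.

```javascript
// Converts a tag such as "en-US" or "en_us" to the format a provider accepts.
function toProviderLanguage(tag, provider) {
  // Accept either separator on input, then split into language and optional region.
  const [base, region] = tag.replace('_', '-').split('-');
  const lang = base.toLowerCase();

  switch (provider) {
    case 'deepgram':
      // BCP-47 with hyphen: lowercase language, uppercase region.
      return region ? `${lang}-${region.toUpperCase()}` : lang;
    case 'assemblyai':
      // Underscore-separated and all lowercase.
      return region ? `${lang}_${region.toLowerCase()}` : lang;
    case 'whisper':
    case 'elevenlabs':
      // No regional variants are listed for these: use the base language code.
      return lang;
    default:
      throw new Error(`Unknown provider: ${provider}`);
  }
}

console.log(toProviderLanguage('en-US', 'deepgram'));   // en-US
console.log(toProviderLanguage('en-US', 'assemblyai')); // en_us
console.log(toProviderLanguage('en-US', 'whisper'));    // en
```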

Supported Languages per Provider

Whisper — 99 languages (ISO 639-1)

af Afrikaans, am Amharic, ar Arabic, as Assamese, az Azerbaijani, ba Bashkir, be Belarusian, bg Bulgarian, bn Bengali, bo Tibetan, br Breton, bs Bosnian, ca Catalan, cs Czech, cy Welsh, da Danish, de German, el Greek, en English, es Spanish, et Estonian, eu Basque, fa Persian, fi Finnish, fo Faroese, fr French, gl Galician, gu Gujarati, ha Hausa, haw Hawaiian, he Hebrew, hi Hindi, hr Croatian, ht Haitian Creole, hu Hungarian, hy Armenian, id Indonesian, is Icelandic, it Italian, ja Japanese, jw Javanese, ka Georgian, kk Kazakh, km Khmer, kn Kannada, ko Korean, la Latin, lb Luxembourgish, ln Lingala, lo Lao, lt Lithuanian, lv Latvian, mg Malagasy, mi Maori, mk Macedonian, ml Malayalam, mn Mongolian, mr Marathi, ms Malay, mt Maltese, my Myanmar, ne Nepali, nl Dutch, nn Nynorsk, no Norwegian, oc Occitan, pa Punjabi, pl Polish, ps Pashto, pt Portuguese, ro Romanian, ru Russian, sa Sanskrit, sd Sindhi, si Sinhala, sk Slovak, sl Slovenian, sn Shona, so Somali, sq Albanian, sr Serbian, su Sundanese, sv Swedish, sw Swahili, ta Tamil, te Telugu, tg Tajik, th Thai, tk Turkmen, tl Tagalog, tr Turkish, tt Tatar, uk Ukrainian, ur Urdu, uz Uzbek, vi Vietnamese, yi Yiddish, yo Yoruba, zh Chinese

Deepgram — 36+ languages (BCP-47)

Language	Code(s)
English	`en`, `en-US`, `en-GB`, `en-AU`, `en-IN`, `en-NZ`
Chinese	`zh`, `zh-CN`, `zh-TW`
Spanish	`es`, `es-419`, `es-ES`
French	`fr`, `fr-CA`
Portuguese	`pt`, `pt-BR`, `pt-PT`
German	`de`
Italian	`it`
Dutch	`nl`
Japanese	`ja`
Korean	`ko`
Russian	`ru`
Hindi	`hi`
Turkish	`tr`
Polish	`pl`
Ukrainian	`uk`
Swedish	`sv`
Norwegian	`no`
Danish	`da`
Finnish	`fi`
Indonesian	`id`
Malay	`ms`
Thai	`th`
Vietnamese	`vi`
Czech	`cs`
Romanian	`ro`
Hungarian	`hu`
Greek	`el`
Bulgarian	`bg`
Croatian	`hr`
Slovak	`sk`
Tamil	`ta`
Telugu	`te`
Kannada	`kn`
Malayalam	`ml`
Bengali	`bn`
Gujarati	`gu`
Marathi	`mr`

AssemblyAI — 17+ languages (ISO 639-1 / underscore)

Language	Code(s)
English	`en`, `en_us`, `en_gb`, `en_au`
Spanish	`es`
French	`fr`
German	`de`
Italian	`it`
Portuguese	`pt`
Dutch	`nl`
Japanese	`ja`
Korean	`ko`
Chinese	`zh`
Hindi	`hi`
Turkish	`tr`
Russian	`ru`
Polish	`pl`
Ukrainian	`uk`
Vietnamese	`vi`
Finnish	`fi`

ElevenLabs — 99 languages (ISO 639-1 / ISO 639-3)

Language	ISO 639-1	ISO 639-3	Language	ISO 639-1	ISO 639-3
English	`en`	`eng`	Italian	`it`	`ita`
Chinese	`zh`	`zho`	Indonesian	`id`	`ind`
German	`de`	`deu`	Hindi	`hi`	`hin`
Spanish	`es`	`spa`	Finnish	`fi`	`fin`
Russian	`ru`	`rus`	Vietnamese	`vi`	`vie`
Korean	`ko`	`kor`	Hebrew	`he`	`heb`
French	`fr`	`fra`	Ukrainian	`uk`	`ukr`
Japanese	`ja`	`jpn`	Greek	`el`	`ell`
Portuguese	`pt`	`por`	Thai	`th`	`tha`
Turkish	`tr`	`tur`	Arabic	`ar`	`ara`
Polish	`pl`	`pol`	Czech	`cs`	`ces`
Dutch	`nl`	`nld`	Danish	`da`	`dan`
Swedish	`sv`	`swe`	Hungarian	`hu`	`hun`

Full list: afr, amh, ara, asm, aze, bak, bel, bul, ben, bod, bre, bos, cat, ces, cym, dan, deu, ell, eng, spa, est, eus, fas, fin, fao, fra, glg, guj, hau, haw, heb, hin, hrv, hat, hun, hye, ind, isl, ita, jpn, jav, kat, kaz, khm, kan, kor, lat, ltz, lin, lao, lit, lav, mlg, mri, mkd, mal, mon, mar, msa, mlt, mya, nep, nld, nno, nor, oci, pan, pol, pus, por, ron, rus, san, snd, sin, slk, slv, sna, som, sqi, srp, sun, swe, swa, tam, tel, tgk, tha, tuk, tgl, tur, tat, ukr, urd, uzb, vie, yid, yor, zho
