VocaBusta Voice API (1.0.0)

URL: https://developers.vocabusta.satryx.ai

Lifelike African voices — text-to-speech, transcription, voice cloning and video dubbing. The same engine that powers VocaBusta Studio.

All endpoints live under /voice on the Satryx API. Every call needs a Bearer API key (satryx_live_… / satryx_test_…); generation endpoints also require an active VocaBusta subscription on the account.
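A minimal sketch of the authentication rule above. It assumes the conventional `Authorization: Bearer <key>` header form, which this page implies but does not spell out; the helper name `auth_headers` and the key value are placeholders for illustration.

```python
def auth_headers(api_key: str) -> dict:
    """Return headers that carry the API key as a Bearer token."""
    # Keys are described as starting with satryx_live_ or satryx_test_.
    if not api_key.startswith(("satryx_live_", "satryx_test_")):
        raise ValueError("expected a satryx_live_ or satryx_test_ key")
    return {"Authorization": f"Bearer {api_key}"}


headers = auth_headers("satryx_test_EXAMPLE")  # placeholder key, not a real one
```

Rejecting malformed keys on the client side is a convenience only; the server remains the authority on whether a key is valid.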

Speech

Text-to-speech synthesis

Synthesize speech

Synthesize text to a WAV audio file. The response body is raw audio/wav; synthesis metadata is returned in the X-Vox-Metadata response header as a JSON string.

Authorizations:

apiKey

Request Body schema: application/json
required

text required	string [ 1 .. 5000 ] characters
voice_id	string Default: "af_heart"
speed	number <float> [ 0.5 .. 2 ] Default: 1
language	string or null
exaggeration	number or null [ 0 .. 1 ] Chatterbox emphasis/emotion. Unset uses the voice default.
cfg_weight	number or null [ 0 .. 1 ] How tightly synthesis follows the reference voice.
stability	number <float> [ 0 .. 1 ] Default: 0.5 Applies to non-Chatterbox (Kokoro) voices.
similarity	number <float> [ 0 .. 1 ] Default: 0.75 Applies to non-Chatterbox (Kokoro) voices.

Responses

Request samples

Payload

Content type

application/json

{
  "text": "How far? Welcome to VocaBusta.",
  "voice_id": "vocabusta_pcm_female",
  "speed": 1,
  "language": "pcm",
  "exaggeration": 1,
  "cfg_weight": 1,
  "stability": 0.5,
  "similarity": 0.75
}

Response samples

401
403
429
502

Content type

application/json

{
  "detail": "VocaBusta is not active for this account. Manage it in Billing."
}
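The following sketch shows one way to call the synthesis endpoint with Python's standard library, save the WAV body, and read the X-Vox-Metadata header. `BASE_URL` is a placeholder because the API host is not stated here, and the function names are hypothetical. Only the `/voice/tts` path, the field ranges, and the header name come from this reference.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder: the real host is not given here


def build_tts_request(api_key: str, text: str, voice_id: str = "af_heart",
                      speed: float = 1.0) -> urllib.request.Request:
    """Build (but do not send) a POST /voice/tts request."""
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1 to 5000 characters")
    if not 0.5 <= speed <= 2:
        raise ValueError("speed must be between 0.5 and 2")
    body = json.dumps({"text": text, "voice_id": voice_id, "speed": speed})
    return urllib.request.Request(
        BASE_URL + "/voice/tts",
        data=body.encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def parse_metadata(header_value):
    """Decode the X-Vox-Metadata header, which carries a JSON string."""
    return json.loads(header_value) if header_value else {}


def synthesize(api_key: str, text: str, out_path: str) -> dict:
    """Send the request, save the WAV body, and return the metadata."""
    req = build_tts_request(api_key, text)
    with urllib.request.urlopen(req) as resp:  # raises HTTPError on 4xx/5xx
        with open(out_path, "wb") as f:
            f.write(resp.read())
        return parse_metadata(resp.headers.get("X-Vox-Metadata"))
```

Separating request construction from sending keeps the validation testable without network access. The keys inside the metadata are not documented in this section, so the sketch returns the decoded object as-is.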

Stream speech synthesis

Same request as /voice/tts, but streams WAV audio chunks as they are synthesized for low-latency playback.

Authorizations:

apiKey

Request Body schema: application/json
required

text required	string [ 1 .. 5000 ] characters
voice_id	string Default: "af_heart"
speed	number <float> [ 0.5 .. 2 ] Default: 1
language	string or null
exaggeration	number or null [ 0 .. 1 ] Chatterbox emphasis/emotion. Unset uses the voice default.
cfg_weight	number or null [ 0 .. 1 ] How tightly synthesis follows the reference voice.
stability	number <float> [ 0 .. 1 ] Default: 0.5 Applies to non-Chatterbox (Kokoro) voices.
similarity	number <float> [ 0 .. 1 ] Default: 0.75 Applies to non-Chatterbox (Kokoro) voices.

Responses

Request samples

Payload

Content type

application/json

{
  "text": "How far? Welcome to VocaBusta.",
  "voice_id": "vocabusta_pcm_female",
  "speed": 1,
  "language": "pcm",
  "exaggeration": 1,
  "cfg_weight": 1,
  "stability": 0.5,
  "similarity": 0.75
}

Response samples

401
403

Content type

application/json

{
  "detail": "VocaBusta is not active for this account. Manage it in Billing."
}
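A hedged sketch of consuming a streamed response chunk by chunk rather than waiting for the whole file. The helper names are hypothetical, and the streaming endpoint's path is not named in this section, so the example works on any file-like response object and uses an in-memory stand-in for the demo.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(resp: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield audio bytes from a response until the stream ends."""
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:  # an empty read means the stream has ended
            break
        yield chunk


def save_stream(resp: BinaryIO, sink: BinaryIO) -> int:
    """Copy streamed audio into sink and return the number of bytes copied."""
    total = 0
    for chunk in iter_audio_chunks(resp):
        sink.write(chunk)  # a player could consume each chunk here instead
        total += len(chunk)
    return total


# Demo with an in-memory stand-in for the HTTP response body.
demo = io.BytesIO(b"RIFF" + bytes(10000))
out = io.BytesIO()
copied = save_stream(demo, out)
```

With `urllib.request.urlopen`, the returned response object can be passed as `resp`. For lower latency, its `read1` method may be preferable to `read`, since `read` can wait until the requested number of bytes has arrived.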

Transcription

Speech-to-text

Transcribe audio

Transcribe an uploaded audio file. African languages are routed to the Vocabanga ASR model; other languages fall back to Whisper.

Authorizations:

apiKey

Request Body schema: multipart/form-data
required

file required	string <binary> The audio file to transcribe.
language	string VocaBusta language code (e.g. `yo`, `pcm`) or `auto`.
word_timestamps	boolean Default: true Include per-word start/end times.

Responses

Response samples

200
400
401
403
502

Content type

application/json

{
  "id": "string",
  "transcript": "string",
  "language": "pcm",
  "duration_seconds": 0,
  "segments": [
    {
      "id": 0,
      "start": 0,
      "end": 0,
      "text": "string",
      "words": [
        {
          "word": "string",
          "start": 0,
          "end": 0
        }
      ]
    }
  ],
  "engine": "string",
  "model": "string"
}
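To illustrate the response shape above, this sketch flattens per-word timings out of a transcription result. The function name and the sample values are made up for illustration. It also assumes `words` may be absent when `word_timestamps` is false, which this reference does not confirm.

```python
def iter_words(transcription: dict):
    """Yield (word, start, end) for every word in every segment."""
    for segment in transcription.get("segments", []):
        # Assumption: "words" may be missing or null without word timestamps.
        for w in segment.get("words") or []:
            yield w["word"], w["start"], w["end"]


# Illustrative data shaped like the 200 response sample.
sample = {
    "transcript": "how far",
    "language": "pcm",
    "segments": [{
        "id": 0, "start": 0.0, "end": 0.9, "text": "how far",
        "words": [{"word": "how", "start": 0.0, "end": 0.4},
                  {"word": "far", "start": 0.4, "end": 0.9}],
    }],
}
words = list(iter_words(sample))
```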

Voices

Voice catalog and cloning

List voices

Return every available voice — VocaBusta African-language voices plus any voices you have cloned. This endpoint is ungated.

Responses

Response samples

200

Content type

application/json

[
  {
    "id": "vocabusta_yo_female",
    "name": "Adunni",
    "description": "string",
    "accent": "Yoruba",
    "gender": "female",
    "category": "vocabusta",
    "language": "yo",
    "language_name": "Yoruba",
    "tags": [
      "string"
    ],
    "preview_url": "string",
    "engine": "vocabusta"
  }
]
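As a small illustration of working with the catalog, the sketch below groups voice ids by their `language` field. The function name is hypothetical, and the second and third entries in the demo list are invented to show the grouping. Only the first entry mirrors the response sample.

```python
from collections import defaultdict


def voices_by_language(voices: list) -> dict:
    """Map each language code to the ids of the voices that speak it."""
    grouped = defaultdict(list)
    for voice in voices:
        grouped[voice.get("language")].append(voice["id"])
    return dict(grouped)


# Illustrative catalog; only the first entry mirrors the sample above.
catalog = [
    {"id": "vocabusta_yo_female", "name": "Adunni", "language": "yo"},
    {"id": "vocabusta_pcm_female", "language": "pcm"},
    {"id": "cloned_a1b2c3", "language": "yo"},
]
by_language = voices_by_language(catalog)
```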

Delete a cloned voice

Delete a cloned voice. Only cloned_… voices can be deleted.

Authorizations:

apiKey

path Parameters

voice_id required	string Example: cloned_a1b2c3

Responses

Response samples

200
400

Content type

application/json

{
  "status": "deleted",
  "voice_id": "cloned_a1b2c3"
}

Clone a voice

Clone a voice from a short reference clip (~10–30s, one speaker). Cloning is zero-shot — the clone is ready in seconds.

Authorizations:

apiKey

Request Body schema: multipart/form-data
required

file required	string <binary> Clean reference clip.
name required	string Display name for the voice.
description	string Default: ""

Responses

Response samples

200
400
401
403

Content type

application/json

{
  "voice_id": "cloned_a1b2c3",
  "name": "string",
  "description": "string",
  "status": "processing",
  "preview_url": "string",
  "created_at": "2019-08-24T14:15:22Z"
}
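The clone request is a multipart/form-data upload. As a sketch, the helper below encodes the `file`, `name` and `description` fields by hand with the standard library. The function name is hypothetical, and the `application/octet-stream` part type is an assumption, since accepted audio types are not listed here.

```python
import uuid


def encode_clone_form(clip: bytes, filename: str, name: str,
                      description: str = "", boundary: str = "") -> tuple:
    """Return (body, content_type) for a multipart/form-data clone upload."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain text fields first.
    for field, value in (("name", name), ("description", description)):
        text_part = (f"--{boundary}\r\n"
                     f'Content-Disposition: form-data; name="{field}"\r\n\r\n'
                     f"{value}\r\n")
        parts.append(text_part.encode("utf-8"))
    # Then the binary reference clip.
    file_header = (f"--{boundary}\r\n"
                   f'Content-Disposition: form-data; name="file"; '
                   f'filename="{filename}"\r\n'
                   "Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_header.encode("utf-8") + clip + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned content type goes in the request's `Content-Type` header alongside the Bearer key. The sample response reports `status` as `processing`, so a client may want to look the voice up again before using it.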

Dubbing

Video analysis, translation and re-voicing

Analyze a video

Analyze a video into a transcript with word timing and speaker diarization. Returns a job_id; poll /voice/dub/jobs/{job_id} until done. The finished result is a DubAnalysis.

Authorizations:

apiKey

Request Body schema: multipart/form-data
required

file required	string <binary>
language	string Spoken language hint.
diarize	boolean Default: true

Responses

Response samples

200
400
403
503

Content type

application/json

{
  "job_id": "string"
}

Translate dub segments

Translate every segment's text into the target language.

Authorizations:

apiKey

Request Body schema: application/json
required

segments required	Array of objects (DubSegment)
target_language required	string One of `en`, `yo`, `ig`, `ha`, `sw`, `zu`.

Responses

Request samples

Payload

Content type

application/json

{
  "segments": [
    {
      "id": 0,
      "start": 0,
      "end": 0,
      "text": "string",
      "speaker": "SPEAKER_00"
    }
  ],
  "target_language": "yo"
}

Response samples

200
400
403

Content type

application/json

{
  "segments": [
    {
      "id": 0,
      "start": 0,
      "end": 0,
      "text": "string",
      "speaker": "SPEAKER_00"
    }
  ],
  "target_language": "string"
}
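A minimal sketch of preparing the translate payload, checking `target_language` against the codes listed in the schema before sending. The function and constant names are hypothetical, and the example segment text is invented.

```python
import json

# Target languages listed in the request schema.
TARGET_LANGUAGES = {"en", "yo", "ig", "ha", "sw", "zu"}


def build_translate_payload(segments: list, target_language: str) -> str:
    """Return the JSON request body for translating dub segments."""
    if target_language not in TARGET_LANGUAGES:
        raise ValueError(f"unsupported target_language: {target_language!r}")
    return json.dumps({"segments": segments, "target_language": target_language})


payload = build_translate_payload(
    [{"id": 0, "start": 0.0, "end": 1.5, "text": "Welcome", "speaker": "SPEAKER_00"}],
    "yo",
)
```

Failing early on an unsupported code avoids a round trip that would presumably end in a 400 response.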

Render a dubbed video

Render a dubbed video from translated segments and per-speaker voice assignments. Returns a job_id; poll /voice/dub/jobs/{job_id}. The finished result is { video_base64, format }.

Authorizations:

apiKey

Request Body schema: multipart/form-data
required

file required	string <binary>
segments required	string JSON-encoded array of translated DubSegment.
voice_map	string Default: "{}" JSON object mapping speaker → voice_id or "preserve".
exaggeration	number <float> [ 0 .. 1 ]
cfg_weight	number <float> [ 0 .. 1 ]

Responses

Response samples

200
400
403
503

Content type

application/json

{
  "job_id": "string"
}
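Two details of the render call are easy to get wrong: `segments` and `voice_map` travel as JSON-encoded strings inside the form, and the finished result carries the video as base64. The sketch below covers both, with hypothetical helper names; attaching the video file itself is left to whatever multipart client is in use.

```python
import base64
import json


def build_render_fields(segments: list, voice_map: dict = None) -> dict:
    """Return the text form fields for a render request."""
    return {
        "segments": json.dumps(segments),
        # Maps speaker label to a voice_id or "preserve"; defaults to "{}".
        "voice_map": json.dumps(voice_map or {}),
    }


def decode_render_result(result: dict) -> tuple:
    """Return (video_bytes, format) from a finished render job result."""
    return base64.b64decode(result["video_base64"]), result["format"]


fields = build_render_fields(
    [{"id": 0, "start": 0.0, "end": 1.5, "text": "Welcome", "speaker": "SPEAKER_00"}],
    {"SPEAKER_00": "preserve"},
)
```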

Poll a dubbing job

Authorizations:

apiKey

path Parameters

job_id required	string

Responses

Response samples

200
403

Content type

application/json

{
  "status": "queued",
  "progress": 0.45,
  "result": {},
  "error": "string"
}
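The analyze and render endpoints both return a job_id to poll, so a client needs a wait loop. The sketch below takes the status lookup as a callable so it stays independent of any HTTP client. Only `queued` appears in the sample above; the terminal status names `done` and `failed` are assumptions and are exposed as parameters for that reason.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[], dict], interval: float = 2.0,
                 timeout: float = 600.0, done: str = "done",
                 failed: str = "failed") -> dict:
    """Poll until the job finishes and return its result.

    fetch_status should GET /voice/dub/jobs/{job_id} and return the parsed JSON.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        if job.get("status") == done:  # assumed terminal status name
            return job.get("result") or {}
        if job.get("status") == failed:  # assumed terminal status name
            raise RuntimeError(job.get("error") or "dubbing job failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish in time")
        time.sleep(interval)
```

A fixed interval keeps the example short; a real client might back off between polls or report the `progress` value to the user.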

History

Per-account generation history

List generation history

Authorizations:

apiKey

query Parameters

limit	integer Default: 50
offset	integer Default: 0
feature	string Enum: "tts" "stt" "clone" "dub" Filter by feature.

Responses

Response samples

200
401

Content type

application/json

[
  {
    "id": "string",
    "feature": "tts",
    "title": "string",
    "voice_name": "string",
    "text": "string",
    "transcript": "string",
    "audio_url": "string",
    "duration_seconds": 0,
    "character_count": 0,
    "created_at": "2019-08-24T14:15:22Z"
  }
]
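The `limit` and `offset` parameters suggest ordinary offset pagination. This sketch builds the query string and walks the pages through a caller-supplied fetch function. All names are hypothetical, and it assumes that a page shorter than `limit` marks the end of the history, which this reference does not state.

```python
from urllib.parse import parse_qs, urlencode

FEATURES = {"tts", "stt", "clone", "dub"}


def history_query(limit: int = 50, offset: int = 0, feature: str = None) -> str:
    """Build the query string for the history listing."""
    params = {"limit": limit, "offset": offset}
    if feature is not None:
        if feature not in FEATURES:
            raise ValueError(f"unknown feature: {feature!r}")
        params["feature"] = feature
    return urlencode(params)


def iter_history(fetch_page, limit: int = 50, feature: str = None):
    """Yield history items page by page until a short page is returned."""
    offset = 0
    while True:
        page = fetch_page(history_query(limit, offset, feature))
        yield from page
        if len(page) < limit:  # assumption: a short page means no more items
            break
        offset += limit


# Demo with an in-memory stand-in for the HTTP call.
_items = [{"id": str(i)} for i in range(5)]


def _fake_fetch(query: str) -> list:
    q = parse_qs(query)
    start, size = int(q["offset"][0]), int(q["limit"][0])
    return _items[start:start + size]


ids = [item["id"] for item in iter_history(_fake_fetch, limit=2)]
```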

Delete a history item

Authorizations:

apiKey

path Parameters

item_id required	string

Responses

Response samples

200
404

Content type

application/json

{
  "status": "deleted",
  "id": "string"
}
