Text to speech

Turn text into natural African-language speech with POST /voice/tts.

Request

POST https://api.satryx.ai/voice/tts with a JSON body and an Authorization: Bearer header carrying your API key.

Field         Type           Default   Notes
text          string         -         Required. 1–5000 characters.
voice_id      string         af_heart  A voice from GET /voice/voices, e.g. vocabusta_yo_female.
speed         number         1.0       Playback speed, 0.5–2.0.
language      string | null  null      Language hint, e.g. yo, pcm. Usually inferred from the voice.
exaggeration  number | null  null      Chatterbox emphasis/emotion, 0.0–1.0. Unset uses the voice's registry default.
cfg_weight    number | null  null      How tightly synthesis follows the reference voice, 0.0–1.0. Unset uses the voice's default.
stability     number         0.5       Range 0.0–1.0. Applies to non-Chatterbox (Kokoro) voices.
similarity    number         0.75      Range 0.0–1.0. Applies to non-Chatterbox (Kokoro) voices.
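
The documented limits can also be checked client-side before a request is sent. The following is a minimal sketch, not part of the API: build_tts_body and UNIT_RANGE_FIELDS are hypothetical names, the ranges are copied from the request fields above, and measuring text length with len() (Unicode code points) is an assumption about how the service counts characters.

```python
from typing import Any, Dict, Optional

# Fields documented with a 0.0-1.0 range (names taken from the request fields).
UNIT_RANGE_FIELDS = ("exaggeration", "cfg_weight", "stability", "similarity")


def build_tts_body(
    text: str,
    voice_id: Optional[str] = None,
    speed: Optional[float] = None,
    language: Optional[str] = None,
    **tuning: Optional[float],
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return a JSON-ready body.

    Arguments left as None are omitted so the server-side defaults apply.
    """
    # Assumption: the 1-5000 limit is counted in code points, as len() does.
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be between 1 and 5000 characters")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    body: Dict[str, Any] = {"text": text}
    if voice_id is not None:
        body["voice_id"] = voice_id
    if speed is not None:
        body["speed"] = speed
    if language is not None:
        body["language"] = language

    for name, value in tuning.items():
        if name not in UNIT_RANGE_FIELDS:
            raise ValueError(f"unknown field: {name}")
        if value is None:
            continue  # unset: let the voice's default apply
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
        body[name] = value
    return body
```

The returned dict can be passed directly as the json= argument of requests.post, for example build_tts_body("Ndewo", voice_id="vocabusta_ig_male", exaggeration=0.6).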

Response

200 OK with the raw WAV audio as the body (Content-Type: audio/wav).

Synthesis metadata is returned in the X-Vox-Metadata response header as a JSON string:

{
  "id": "0f3c…",
  "voice_id": "vocabusta_yo_female",
  "voice_name": "Adunni",
  "text": "Ẹ káàbọ̀.",
  "duration_seconds": 1.42,
  "sample_rate": 24000,
  "character_count": 9,
  "created_at": "2026-06-27T10:00:00Z"
}
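
Because the metadata travels in a header rather than the body, it has to be decoded separately from the audio. The sketch below shows one way to do that; parse_vox_metadata is a hypothetical helper name, and the Latin-1 round trip is an assumption about how non-ASCII text (such as the Yoruba sample above) might arrive, not documented behaviour of the API.

```python
import json
from typing import Mapping, Optional


def parse_vox_metadata(headers: Mapping[str, str]) -> Optional[dict]:
    """Return the decoded X-Vox-Metadata header, or None if it is absent."""
    raw = headers.get("X-Vox-Metadata")
    if raw is None:
        return None
    # Assumption: if the server sends raw UTF-8 bytes, Python's HTTP stack
    # will have decoded them as Latin-1, so undo that here. For pure-ASCII
    # values (e.g. \u-escaped JSON) this round trip changes nothing.
    try:
        raw = raw.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        pass  # already proper text; parse it as-is
    return json.loads(raw)
```

With the requests library, passing res.headers works as-is, for example parse_vox_metadata(res.headers)["duration_seconds"].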

Examples

cURL

curl https://api.satryx.ai/voice/tts \
  -H "Authorization: Bearer $SATRYX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Ẹ káàbọ̀ sí VocaBusta.",
    "voice_id": "vocabusta_yo_female",
    "speed": 1.0,
    "exaggeration": 0.6
  }' \
  --output yoruba.wav

Python

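import os

import requests

res = requests.post(
    "https://api.satryx.ai/voice/tts",
    headers={"Authorization": f"Bearer {os.environ['SATRYX_API_KEY']}"},
    json={
        "text": "Ndewo, nnọọ na VocaBusta.",
        "voice_id": "vocabusta_ig_male",
        "speed": 1.0,
    },
    timeout=60,  # example value; requests does not time out by default
)
res.raise_for_status()  # fail on 4xx/5xx instead of saving an error body

# Write the WAV bytes; the with-block closes the file even if writing fails.
with open("igbo.wav", "wb") as f:
    f.write(res.content)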

Streaming

For low-latency playback, POST /voice/tts/stream takes the same body and streams WAV audio chunks as they're synthesized (Transfer-Encoding: chunked, Content-Type: audio/wav). Use it when you're piping audio straight to a player rather than saving a file.

curl -N https://api.satryx.ai/voice/tts/stream \
  -H "Authorization: Bearer $SATRYX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Streaming live...", "voice_id": "vocabusta_en_ng_female"}' \
  --output stream.wav
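
A Python counterpart to the cURL call might look like the sketch below. It assumes the requests library used in the earlier example; pipe_chunks and stream_tts are hypothetical helper names, and the chunk size and timeout are arbitrary choices rather than values the API prescribes.

```python
import os
from typing import BinaryIO, Iterable


def pipe_chunks(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write each non-empty chunk to sink as it arrives; return bytes written."""
    total = 0
    for chunk in chunks:
        if chunk:
            sink.write(chunk)
            total += len(chunk)
    return total


def stream_tts(body: dict, sink: BinaryIO) -> int:
    """Hypothetical helper: POST body to /voice/tts/stream, forward audio to sink."""
    import requests  # third-party; imported here so pipe_chunks has no dependency

    with requests.post(
        "https://api.satryx.ai/voice/tts/stream",
        headers={"Authorization": f"Bearer {os.environ['SATRYX_API_KEY']}"},
        json=body,
        stream=True,  # do not buffer the whole response in memory
        timeout=30,   # assumed value; tune for your network
    ) as res:
        res.raise_for_status()
        # Forward audio as soon as each chunk is received.
        return pipe_chunks(res.iter_content(chunk_size=4096), sink)
```

To save the stream, open a file in binary mode and pass it as sink, for example stream_tts({"text": "Streaming live...", "voice_id": "vocabusta_en_ng_female"}, f). For live playback, sink can instead be the stdin pipe of a player process.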

Tips

  • Expressiveness: raise exaggeration for livelier delivery; raise cfg_weight to stay closer to the reference timbre. Leave both unset to use each voice's tuned default.
  • Tone matters: for Yoruba and Igbo, include the correct diacritics in text. The model is tone-aware, and a wrong tone mark changes the word.
  • Chunk long text: for very long passages, split on sentence boundaries, synthesize each piece separately, and join the WAVs client-side so the first audio arrives sooner. Each WAV file carries its own header, so merge the audio frames rather than appending the raw files.
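
The chunking tip can be sketched with the standard library alone. This is an illustration under stated assumptions, not the service's own method: split_sentences and concat_wavs are hypothetical names, the max_chars default is an arbitrary choice, the sentence regex only knows ., ! and ?, and concat_wavs assumes every response is PCM WAV (readable by Python's wave module) with the same channel count, sample width, and sample rate.

```python
import io
import re
import wave
from typing import List


def split_sentences(text: str, max_chars: int = 500) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def concat_wavs(wav_blobs: List[bytes]) -> bytes:
    """Join WAV files by merging their audio frames under a single header."""
    if not wav_blobs:
        raise ValueError("need at least one WAV to concatenate")
    out = io.BytesIO()
    fmt = None
    with wave.open(out, "wb") as writer:
        for blob in wav_blobs:
            with wave.open(io.BytesIO(blob), "rb") as reader:
                this_fmt = (
                    reader.getnchannels(),
                    reader.getsampwidth(),
                    reader.getframerate(),
                )
                if fmt is None:
                    # The first file decides the output format.
                    fmt = this_fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif this_fmt != fmt:
                    raise ValueError("WAV formats differ; cannot concatenate")
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()
```

Each string from split_sentences would be sent as the text of its own POST /voice/tts request, and the response bodies collected in order and passed to concat_wavs. Playback of the first piece can begin while later pieces are still being synthesized.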
