Voice cloning

Clone a voice from a short reference clip, then synthesize new speech in that voice. Cloning is zero-shot: there is no training step, and the clone is ready in seconds.

Clone a voice

POST https://api.satryx.ai/voice/clone — multipart/form-data.

Field	Type	Default	Notes
`file`	file	—	Required. A clean reference clip (~10–30s of clear speech, one speaker).
`name`	string	—	Required. Display name for the voice.
`description`	string	`""`	Optional description.
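The ~10–30s guidance for `file` can be checked locally before uploading. The sketch below is one way to do that for uncompressed WAV files using Python's standard `wave` module. The helper names and the hard 10/30-second bounds are illustrative assumptions drawn from the note above, not limits the API is known to enforce.

```python
import wave

# Assumed bounds, taken from the "~10–30s" guidance; the API may accept other lengths.
MIN_SECONDS = 10.0
MAX_SECONDS = 30.0


def clip_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference_clip(path: str) -> list[str]:
    """Return warnings about the clip; an empty list means it fits the guidance."""
    warnings = []
    duration = clip_duration_seconds(path)
    if duration < MIN_SECONDS:
        warnings.append(f"clip is {duration:.1f}s, shorter than {MIN_SECONDS:.0f}s")
    elif duration > MAX_SECONDS:
        warnings.append(f"clip is {duration:.1f}s, longer than {MAX_SECONDS:.0f}s")
    return warnings
```

Calling `check_reference_clip("my-voice.wav")` before the upload gives a quick sanity check on length. It says nothing about noise or the number of speakers, which still need a listen.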

Response

200 OK — JSON:

{
  "voice_id": "cloned_a1b2c3",
  "name": "Chidi",
  "description": "My narration voice",
  "status": "ready",
  "preview_url": "data:audio/wav;base64,UklGR... ",
  "created_at": "2026-06-27T10:00:00Z"
}

voice_id always starts with cloned_.
status is one of processing | ready | failed.
preview_url holds an instant sample of the cloned voice, encoded as a data URL, when the engine could generate one.
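Because preview_url is a data URL rather than a link, the audio has to be decoded from the string itself. A minimal sketch of that step follows. The helper names are hypothetical, and it assumes the field is missing or empty when no preview was generated, which the response description does not spell out.

```python
import base64


def decode_preview(preview_url: str) -> bytes:
    """Decode a base64 data URL such as 'data:audio/wav;base64,...' into raw bytes."""
    header, sep, payload = preview_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(payload.strip())


def save_preview(response: dict, path: str) -> bool:
    """Write the preview audio to `path`; return False if there is nothing to save."""
    # Only a voice whose status is "ready" is treated as having a usable preview.
    if response.get("status") != "ready":
        return False
    # Assumption: the field is absent or empty when no preview could be generated.
    preview_url = response.get("preview_url")
    if not preview_url:
        return False
    with open(path, "wb") as f:
        f.write(decode_preview(preview_url))
    return True
```

With the parsed JSON response in `body`, `save_preview(body, "preview.wav")` writes the sample to disk when one is available.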

Example

curl https://api.satryx.ai/voice/clone \
  -H "Authorization: Bearer $SATRYX_API_KEY" \
  -F "file=@my-voice.wav" \
  -F "name=Chidi" \
  -F "description=My narration voice"

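import os

import requests

# Upload the clip as multipart/form-data: `files` carries the audio, `data` the text fields.
with open("my-voice.wav", "rb") as f:
    res = requests.post(
        "https://api.satryx.ai/voice/clone",
        headers={"Authorization": f"Bearer {os.environ['SATRYX_API_KEY']}"},
        files={"file": f},
        data={"name": "Chidi", "description": "My narration voice"},
        timeout=60,  # requests has no default timeout
    )
res.raise_for_status()  # raise on any 4xx/5xx response
voice_id = res.json()["voice_id"]  # e.g. "cloned_a1b2c3"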

Speak in the cloned voice

Pass the cloned_… id straight to /voice/tts as voice_id:

curl https://api.satryx.ai/voice/tts \
  -H "Authorization: Bearer $SATRYX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is my cloned voice.", "voice_id": "cloned_a1b2c3"}' \
  --output cloned.wav
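The same call can be made from Python. The sketch below mirrors the curl command using only the standard library, so it runs without extra dependencies. The function names are illustrative, and it assumes the response body is the raw audio, as the `--output cloned.wav` flag above suggests.

```python
import json
import os
import urllib.request

TTS_URL = "https://api.satryx.ai/voice/tts"


def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text: str, voice_id: str, out_path: str) -> None:
    """Send the request and write the response body to `out_path`."""
    req = build_tts_request(text, voice_id, os.environ["SATRYX_API_KEY"])
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    # Assumption: the body is the audio itself, not a JSON wrapper.
    with urllib.request.urlopen(req, timeout=60) as res, open(out_path, "wb") as f:
        f.write(res.read())
```

For example, `synthesize_to_file("This is my cloned voice.", "cloned_a1b2c3", "cloned.wav")` reproduces the curl call.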

Delete a voice

DELETE https://api.satryx.ai/voice/voices/{voice_id} — removes a cloned voice. Only cloned_… voices can be deleted; premade and VocaBusta catalog voices cannot.

curl -X DELETE https://api.satryx.ai/voice/voices/cloned_a1b2c3 \
  -H "Authorization: Bearer $SATRYX_API_KEY"

Returns { "status": "deleted", "voice_id": "cloned_a1b2c3" }.
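Since only cloned_… voices can be deleted, a client can reject other ids before making the call. The sketch below shows one way to do that with the standard library. The function names and the client-side prefix check are illustrative assumptions; the server remains the authority on what can be deleted.

```python
import json
import os
import urllib.request

VOICES_URL = "https://api.satryx.ai/voice/voices"


def build_delete_request(voice_id: str, api_key: str) -> urllib.request.Request:
    """Build the DELETE request, refusing ids that are not cloned voices."""
    if not voice_id.startswith("cloned_"):
        raise ValueError(f"{voice_id!r} is not a cloned voice and cannot be deleted")
    return urllib.request.Request(
        f"{VOICES_URL}/{voice_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )


def delete_voice(voice_id: str) -> dict:
    """Delete a cloned voice and return the parsed JSON response."""
    req = build_delete_request(voice_id, os.environ["SATRYX_API_KEY"])
    with urllib.request.urlopen(req, timeout=30) as res:
        return json.load(res)
```

A successful `delete_voice("cloned_a1b2c3")` should return the JSON object shown above.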

Best practices

Quality in, quality out — use a clean, dry recording with no background music or overlapping speakers.
Consent — only clone voices you own or have explicit permission to clone.
Tune at synthesis time — adjust exaggeration and cfg_weight on /voice/tts to dial expressiveness vs. fidelity for the clone.
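As an illustration of tuning at synthesis time, the sketch below builds a /voice/tts request body that carries exaggeration and cfg_weight. It assumes both are sent as top-level JSON fields next to text and voice_id. The values shown are placeholders rather than recommendations, and the accepted ranges should be taken from the /voice/tts reference.

```python
import json
from typing import Optional


def tts_payload(
    text: str,
    voice_id: str,
    exaggeration: Optional[float] = None,
    cfg_weight: Optional[float] = None,
) -> dict:
    """Build a /voice/tts JSON body, adding tuning fields only when given."""
    payload = {"text": text, "voice_id": voice_id}
    # Assumption: both parameters are top-level fields of the JSON body.
    if exaggeration is not None:
        payload["exaggeration"] = exaggeration
    if cfg_weight is not None:
        payload["cfg_weight"] = cfg_weight
    return payload


# Placeholder values for illustration only.
body = tts_payload("This is my cloned voice.", "cloned_a1b2c3",
                   exaggeration=0.7, cfg_weight=0.3)
print(json.dumps(body))
```

Leaving either argument as `None` omits that field, so the server's own defaults apply.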
