API Documentation

Complete reference for the VoiceAI API. All endpoints require authentication via API key.

Authentication

All API requests require an API key. Get yours from the API Keys page and send it in the xi-api-key header of every request:

header
xi-api-key: YOUR_API_KEY

Quick example using the xi-api-key header:

curl -X GET "https://api.voiceai.com/v1/credits" \
  -H "xi-api-key: sk-voic_your_api_key_here"
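The same header can be attached from Python. The sketch below only builds the request and does not send it; build_request is a hypothetical helper name, and the key is the placeholder from the curl example above.

```python
import urllib.request

BASE_URL = "https://api.voiceai.com/v1"


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request that carries the xi-api-key header."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        method=method,
        headers={"xi-api-key": api_key},
    )


# Placeholder key; replace with your own.
req = build_request("/credits", "sk-voic_your_api_key_here")
# To actually send it: urllib.request.urlopen(req)
```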

Base URL

https://api.voiceai.com/v1

All endpoint paths below are relative to this base URL.

Credits System

Each API call costs credits. Credits are deducted when a task is created. If a task fails, credits are refunded automatically.

  • Text to Speech: 0.5-1 credits / char
  • Dubbing: per task
  • Speech to Text: per task
  • Sound Effects: 50-200 credits
  • Voice Changer: per duration
  • Music Gen: per task

Async Tasks

Most endpoints create an async task. You get a task_id immediately. Get results via:

  • Webhook -- pass receive_url and we'll POST the result when done
  • Polling -- call GET /v1/task/{task_id} until status is "done" or "error"
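A polling loop might look like the sketch below. fetch_task is a hypothetical callable that wraps GET /v1/task/{task_id} and returns the parsed JSON; the 2-second interval and 30-attempt limit are arbitrary choices, not documented limits.

```python
import time
from typing import Callable


def poll_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Call fetch_task(task_id) until the task status is "done" or "error"."""
    for _ in range(max_attempts):
        task = fetch_task(task_id)
        if task.get("status") in ("done", "error"):
            return task
        time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} attempts")
```

Passing the fetch function in keeps the loop independent of any HTTP client and makes it easy to exercise with a stub.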

Text to Speech

Create Speech (ElevenLabs)

Converts text into speech using a voice of your choice and returns audio.

POST /v1/text-to-speech/{voice_id}
curl -X POST "https://api.voiceai.com/v1/text-to-speech/$VOICE_ID?output_format=mp3_44100_128" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "text": "The first move is what sets everything in motion.",
  "model_id": "eleven_multilingual_v2",
  "with_transcript": false,
  "receive_url": "https://your-server.com/webhook"
}'
  • text (string, required): Text to convert to speech
  • model_id (string): TTS model. Default: eleven_multilingual_v2. Models: eleven_multilingual_v2 (1x), eleven_v3 (1x), eleven_flash_v2_5 (0.5x), eleven_turbo_v2_5 (0.5x), eleven_turbo_v2 (0.5x), eleven_flash_v2 (0.5x)
  • voice_settings.stability (number): Voice stability, 0-1. Default: 0.5
  • voice_settings.similarity_boost (number): Similarity boost, 0-1. Default: 0.75
  • voice_settings.speed (number): Speech speed, 0.25-4.0. Default: 1.0
  • with_transcript (boolean): Return a transcript of the audio. Default: false. Adds 15% to the credit cost
  • loudness_normalization (boolean): Normalize volume for consistent loudness. Default: false. Adds 15% to the credit cost
  • receive_url (string): Webhook URL to receive the audio when done (optional)
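To budget a request, the multipliers and surcharges above can be combined into a rough estimate. The sketch below rests on three assumptions the reference does not state: that 1x means 1 credit per character, that the two 15% surcharges add rather than compound, and that no rounding is applied. Treat the result as an approximation and compare it with the credit_cost field of the finished task.

```python
# Multipliers as listed for model_id above.
MODEL_MULTIPLIER = {
    "eleven_multilingual_v2": 1.0,
    "eleven_v3": 1.0,
    "eleven_flash_v2_5": 0.5,
    "eleven_turbo_v2_5": 0.5,
    "eleven_turbo_v2": 0.5,
    "eleven_flash_v2": 0.5,
}


def estimate_tts_credits(
    text: str,
    model_id: str = "eleven_multilingual_v2",
    with_transcript: bool = False,
    loudness_normalization: bool = False,
) -> float:
    """Rough credit estimate; assumes 1 credit/char at 1x and additive surcharges."""
    base = len(text) * MODEL_MULTIPLIER[model_id]
    surcharge = 0.0
    if with_transcript:
        surcharge += 0.15  # assumption: surcharges add, they do not compound
    if loudness_normalization:
        surcharge += 0.15
    return base * (1 + surcharge)
```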
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/audio.mp3",
    "srt_url": "https://cdn.voiceai.com/audio.srt",
    "json_url": "https://cdn.voiceai.com/audio.json"
  },
  "type": "tts"
}
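Whether the result arrives by webhook or by polling, the body has the shape shown above. A minimal sketch of reading it follows; extract_tts_result is a hypothetical helper, and it assumes a failed task carries status "error" with a non-null error_message.

```python
import json


def extract_tts_result(raw_body: str) -> dict:
    """Parse a task result body and return the task id and output URLs."""
    payload = json.loads(raw_body)
    status = payload.get("status")
    if status == "error":
        raise RuntimeError(payload.get("error_message") or "task failed")
    if status != "done":
        raise ValueError(f"task {payload.get('id')} is not finished (status={status!r})")
    metadata = payload.get("metadata") or {}
    return {
        "task_id": payload["id"],
        "audio_url": metadata.get("audio_url"),
        # srt_url is expected only when with_transcript was requested (assumption).
        "srt_url": metadata.get("srt_url"),
    }
```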

Create Speech (Minimax)

Converts text to speech using Minimax TTS service with voice tuning.

POST /v1m/task/text-to-speech
curl -X POST "https://api.voiceai.com/v1m/task/text-to-speech" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "text": "Hello, this is a test message.",
  "model": "speech-2.6-hd",
  "voice_setting": {
    "voice_id": "209533299589184",
    "vol": 1,
    "pitch": 0,
    "speed": 1
  },
  "language_boost": "Auto",
  "with_transcript": false,
  "receive_url": "https://your-server.com/webhook"
}'
  • text (string, required): Text to convert to speech
  • model (string): TTS model. Default: speech-2.6-hd. Models: speech-2.8-hd (1x), speech-2.8-turbo (0.5x), speech-2.6-hd (1x), speech-2.6-turbo (0.5x), speech-02-hd (1x), speech-02-turbo (0.5x)
  • voice_setting.voice_id (string, required): Voice ID
  • voice_setting.vol (number): Volume, 0.5-2.0. Default: 1
  • voice_setting.pitch (number): Pitch, -12 to 12. Default: 0
  • voice_setting.speed (number): Speed, 0.01-10.0. Default: 1
  • language_boost (string): Language boost. Default: "Auto". See common/config
  • with_transcript (boolean): Return a transcript. Default: false. Adds 15% to the credit cost
  • loudness_normalization (boolean): Normalize volume for consistent loudness. Default: false. Adds 15% to the credit cost
  • receive_url (string): Webhook URL (optional)
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/audio.mp3",
    "srt_url": "https://cdn.voiceai.com/audio.srt"
  },
  "type": "minimax_tts"
}

Voice Cloning

Clone Voice

Uploads an audio file to create a custom voice clone.

POST /v1m/voice/clone
curl -X POST "https://api.voiceai.com/v1m/voice/clone" \
  -H "xi-api-key: $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file=@audio_sample.mp3 \
  -F voice_name="My Custom Voice" \
  -F language_tag="English" \
  -F gender_tag="female"
  • file (file, required): Audio file (.mp3). Max: 20MB or 5 minutes
  • voice_name (string): Name for the cloned voice
  • preview_text (string): Preview text for the voice sample
  • language_tag (string): Language (e.g. "English"). See common/config
  • need_noise_reduction (boolean): Remove noise from the audio. Default: false
  • gender_tag (string): Gender: "male" or "female"
Response Example
json
{
  "success": true,
  "cloned_voice_id": 123456789
}

Delete Voice Clone

Deletes a voice clone and all its references.

DELETE /v1m/voice/clone/{voice_clone_id}
bash
curl -X DELETE "https://api.voiceai.com/v1m/voice/clone/$VOICE_CLONE_ID" \
  -H "xi-api-key: $API_KEY"
Response Example
json
{
  "success": true
}

List Voice Clones

Retrieves all voice clones owned by the authenticated user.

GET /v1m/voice/clone
bash
curl "https://api.voiceai.com/v1m/voice/clone" \
  -H "xi-api-key: $API_KEY"
Response Example
json
{
  "success": true,
  "data": [
    {
      "voice_id": "12345",
      "parent_voice_id": "0",
      "voice_name": "Graceful Lady",
      "tag_list": ["English", "Clone"],
      "cover_url": "https://...",
      "create_time": 1736947070593,
      "voice_status": 2,
      "sample_audio": "https://..."
    }
  ]
}

Speech to Text

Transcribe Audio

Transcribes audio and returns transcript as JSON and SRT.

POST /v1/speech-to-text
curl -X POST "https://api.voiceai.com/v1/task/speech-to-text" \
  -H "xi-api-key: $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file=@recording.mp3 \
  -F receive_url="https://your-server.com/webhook"
  • file (file, required): Audio file: mp3, aac, aiff, ogg, opus, wav, webm, flac, m4a. Max: 200MB
  • receive_url (string): Webhook URL (optional)
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "json_url": "https://cdn.voiceai.com/transcript.json",
    "srt_url": "https://cdn.voiceai.com/transcript.srt"
  },
  "type": "speech_to_text"
}

Dubbing

Dub an Audio File

Dubs an audio file into the target language. Returns the dubbed audio and an SRT transcript.

POST /v1/dubbing
curl -X POST "https://api.voiceai.com/v1/task/dubbing" \
  -H "xi-api-key: $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file=@audio.mp3 \
  -F num_speakers="0" \
  -F disable_voice_cloning="false" \
  -F source_lang="auto" \
  -F target_lang="es" \
  -F receive_url="https://your-server.com/webhook"
  • file (file, required): Audio file (.m4a, .mp3). Max: 20MB or 5 minutes
  • num_speakers (integer): Number of speakers. Default: 0 (auto-detect)
  • disable_voice_cloning (boolean): [BETA] Use a similar voice from the library. Default: true
  • source_lang (string): Source language. Default: "auto"
  • target_lang (string, required): Target language to dub into
  • receive_url (string): Webhook URL (optional)
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/dubbed.mp3",
    "srt_url": "https://cdn.voiceai.com/dubbed.srt",
    "json_url": "https://cdn.voiceai.com/dubbed.json"
  },
  "type": "dubbing"
}

Sound Effects

Generate Sound Effects

Generate sound effects from text prompts using AI.

POST /v1/sound-effects
curl -X POST "https://api.voiceai.com/v1/task/sound-effect" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "text": "Thunder rolling with heavy rain",
  "duration_seconds": 5.0,
  "prompt_influence": 0.3,
  "loop": false,
  "receive_url": "https://your-server.com/webhook"
}'
  • text (string, required): Sound effect prompt (max 450 characters)
  • duration_seconds (number | null): Duration, 0.5-30 s. Default: null (auto)
  • prompt_influence (number): Prompt adherence, 0-1. Default: 0.3
  • loop (boolean): Create a looping sound. Default: false
  • model_id (string): Model. Default: "eleven_text_to_sound_v2"
  • receive_url (string): Webhook URL (optional)
Credit cost: 200 credits when duration_seconds is null (auto); otherwise 50 credits per second, with a minimum of 50 credits.
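The pricing rule for sound effects can be written as a small function. This is a literal reading of the rule stated here (200 credits for auto duration, otherwise 50 credits per second with a 50-credit floor); whether fractional seconds are rounded is not documented, so the sketch applies no rounding. sound_effect_credits is a hypothetical name.

```python
from typing import Optional


def sound_effect_credits(duration_seconds: Optional[float]) -> float:
    """Estimate the credit cost of one sound-effect task."""
    if duration_seconds is None:
        return 200  # auto duration is a flat 200 credits
    if not 0.5 <= duration_seconds <= 30:
        raise ValueError("duration_seconds must be between 0.5 and 30")
    # 50 credits per second, never less than 50 (no rounding: assumption)
    return max(50, 50 * duration_seconds)
```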
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "credit_cost": 200,
  "metadata": {
    "output_uri": "https://cdn.voiceai.com/sound.mp3",
    "character_cost": 200,
    "duration_seconds": null
  },
  "progress": 100,
  "type": "sound_effect"
}

Music Generation

Generate Music

Generates music using Minimax AI. Supports both idea-based and lyrics-based generation.

POST /v1m/task/music
curl -X POST "https://api.voiceai.com/v1m/task/music-generation" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "title": "My Song Title",
  "idea": "A relaxing jazz piece for a quiet evening",
  "n": 1,
  "style_id": "8",
  "mood_id": "5",
  "scenario_id": "5",
  "rewrite_idea_switch": false,
  "receive_url": "https://your-server.com/webhook"
}'
  • idea (string): Music idea/description (20-300 characters)
  • lyrics (string): Song lyrics (50-3000 characters). Alternative to idea
  • title (string): Song title (max 40 characters)
  • n (number): Number of tracks to generate, 1-3. Default: 1
  • style_id (string): Music style (see reference below)
  • mood_id (string): Music mood (see reference below)
  • scenario_id (string): Music scenario (see reference below)
  • rewrite_idea_switch (boolean): Rewrite the idea for better results. Default: false
  • receive_url (string): Webhook URL (optional)

Styles (style_id)

1 Pop
2 Urban
3 Rock
4 Hip Hop
5 Electronic
6 Reggae
7 Blues
8 Jazz
9 Folk
10 Country
11 Classical
12 R&B
13 Disco
15 Experimental
17 World
18 Ethnic

Moods (mood_id)

1 Relaxed
2 Happy
3 Energetic
4 Romantic
5 Sad
6 Angry
7 Inspired
8 Warm
9 Passionate
10 Joyful
11 Longing

Scenarios (scenario_id)

1 Coffee shop
2 Solitary walk
3 Travel
4 Sunset by the sea
5 Quiet evening
6 Late-night bar
7 Urban romance
8 City nightlife
9 Rainy night
10 Sunlit Shores
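The length and range limits for music generation can be checked client-side before spending credits. The sketch below is one way to do that; build_music_payload is a hypothetical helper, and requiring at least one of idea or lyrics is an assumption drawn from lyrics being described as an alternative to idea.

```python
from typing import Optional


def build_music_payload(
    idea: Optional[str] = None,
    lyrics: Optional[str] = None,
    title: Optional[str] = None,
    n: int = 1,
    style_id: Optional[str] = None,
    mood_id: Optional[str] = None,
    scenario_id: Optional[str] = None,
) -> dict:
    """Validate the documented limits and return a JSON-ready request body."""
    if idea is None and lyrics is None:
        raise ValueError("provide idea or lyrics")  # assumption: one is needed
    if idea is not None and not 20 <= len(idea) <= 300:
        raise ValueError("idea must be 20-300 characters")
    if lyrics is not None and not 50 <= len(lyrics) <= 3000:
        raise ValueError("lyrics must be 50-3000 characters")
    if title is not None and len(title) > 40:
        raise ValueError("title must be at most 40 characters")
    if not 1 <= n <= 3:
        raise ValueError("n must be 1-3")
    payload = {"n": n}
    optional = {
        "idea": idea, "lyrics": lyrics, "title": title,
        "style_id": style_id, "mood_id": mood_id, "scenario_id": scenario_id,
    }
    # Only send the fields that were actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, build_music_payload(idea="A relaxing jazz piece for a quiet evening", style_id="8", mood_id="1", scenario_id="5") requests a Jazz track with a Relaxed mood and the Quiet evening scenario.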
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": [
      "https://cdn.voiceai.com/music_track_1.mp3"
    ],
    "cover_url": [
      "https://cdn.voiceai.com/cover_1.jpg"
    ]
  },
  "type": "minimax_music"
}

Voices & Models

List Models

Retrieve available voice synthesis models.

GET /v1/models
bash
curl "https://api.voiceai.com/v1/models" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY"

List Recommended Voices

Gets all available voices with search, filtering and pagination.

GET /v2/voices
bash
curl "https://api.voiceai.com/v2/voices" \
  -H "xi-api-key: $API_KEY"

Shared Voices

Retrieves a list of shared voices from the community library.

GET /v1/shared-voices
bash
curl "https://api.voiceai.com/v1/shared-voices" \
  -H "xi-api-key: $API_KEY"

Voice List (Minimax)

Retrieves available voices from Minimax service with pagination and tag filtering.

POST /v1m/voice/list
curl -X POST "https://api.voiceai.com/v1m/voice/list" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "page": 1,
  "page_size": 30,
  "tag_list": []
}'
  • page (number): Page number. Default: 1
  • page_size (number): Items per page. Default: 30
  • tag_list (string[]): Filter by tags (language, accent, gender, age). See common/config
Response Example
json
{
  "success": true,
  "data": {
    "has_more": true,
    "total": "454",
    "voice_list": [
      {
        "voice_id": "226893671006276",
        "voice_name": "Graceful Lady",
        "tag_list": ["English", "Female", "Middle Age", "Elegant"],
        "sample_audio": "https://...",
        "cover_url": "https://..."
      }
    ]
  }
}
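Because the response reports has_more, a client can walk every page until it turns false. In the sketch below, fetch_page is a hypothetical callable that POSTs page and page_size to /v1m/voice/list and returns the parsed JSON body.

```python
from typing import Callable, Iterator


def iter_voices(
    fetch_page: Callable[[int, int], dict],
    page_size: int = 30,
) -> Iterator[dict]:
    """Yield every voice, requesting pages until has_more is false."""
    page = 1
    while True:
        data = fetch_page(page, page_size)["data"]
        yield from data["voice_list"]
        if not data.get("has_more"):
            break
        page += 1
```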

Utilities

Voice Changer (Speech-to-Speech)

Transform voice in an audio file to another voice using AI.

POST /v1/task/voice-changer
curl -X POST "https://api.voiceai.com/v1/task/voice-changer" \
  -H "xi-api-key: $API_KEY" \
  -F 'file=@audio.mp3' \
  -F 'voice_id=21m00Tcm4TlvDq8ikWAM' \
  -F 'model_id=eleven_multilingual_sts_v2' \
  -F 'voice_settings={"stability": 0.5, "similarity_boost": 0.75}' \
  -F 'remove_background_noise=true'
  • file (file, required): Audio file (MP3, M4A, WAV). Max: 300MB, 5 hours
  • voice_id (string, required): Target voice ID
  • model_id (string): AI model. Default: "eleven_multilingual_sts_v2"
  • voice_settings (JSON string): Voice tuning: stability (0-1), similarity_boost (0-1), style (0-1), use_speaker_boost (bool)
  • remove_background_noise (boolean): Remove background noise. Default: false
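Since voice_settings travels as a JSON string inside a multipart form, it has to be serialized before it is placed in the form. A minimal sketch of preparing the text fields follows; voice_changer_fields is a hypothetical helper, the file part is left to your HTTP client, and sending booleans as "true"/"false" mirrors the curl example rather than a documented requirement.

```python
import json


def voice_changer_fields(
    voice_id: str,
    stability: float = 0.5,
    similarity_boost: float = 0.75,
    remove_background_noise: bool = False,
) -> dict:
    """Return the non-file form fields, with voice_settings encoded as JSON."""
    for name, value in (("stability", stability), ("similarity_boost", similarity_boost)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return {
        "voice_id": voice_id,
        "voice_settings": json.dumps(
            {"stability": stability, "similarity_boost": similarity_boost}
        ),
        "remove_background_noise": "true" if remove_background_noise else "false",
    }
```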
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "duration": 120.5,
  "credit_cost": 5000,
  "ec_remain_credits": "9500"
}
Webhook / Polling Result
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "credit_cost": 5000,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/transformed.mp3"
  },
  "type": "voice_changer"
}

Voice Isolate

Isolate voice from background noise in an audio file.

POST /v1/task/voice-isolate
bash
curl -X POST "https://api.voiceai.com/v1/task/voice-isolate" \
  -H "xi-api-key: $API_KEY" \
  -F 'file=@audio.mp3'
  • file (file, required): Audio file (MP3, M4A, WAV). Max: 300MB, 5 hours. Min: 5 seconds
Response Example
json
{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}
Webhook / Polling Result
json
{
  "id": "uuid",
  "status": "done",
  "progress": 100,
  "output_uri": "https://cdn.voiceai.com/isolated.mp3",
  "metadata": {
    "file_name": "audio.mp3",
    "duration": 10.5
  }
}

Get Task

Retrieves task details by task ID. Use for polling if you don't use webhooks.

GET /v1/task/{task_id}
curl "https://api.voiceai.com/v1/task/$TASK_ID" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY"
Response Example
json
{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/audio.mp3",
    "srt_url": "https://cdn.voiceai.com/audio.srt"
  },
  "type": "tts"
}

Get Credits Balance

Returns your current credit balance.

GET /v1/credits
curl "https://api.voiceai.com/v1/credits" \
  -H "xi-api-key: $API_KEY"
Response Example
json
{
  "success": true,
  "credits": 95000
}