API Documentation

Complete reference for the VoiceAI API. All endpoints require authentication via API key.

Authentication

All API requests require an API key. Get yours from the API Keys page. Include it in every request header:

xi-api-key: YOUR_API_KEY

Quick example using the xi-api-key header:

curl -X GET "https://api.voiceai.com/v1/credits" \
  -H "xi-api-key: sk-voic_your_api_key_here"
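
For a Python client, the same header can be attached with the standard library alone. This is a minimal sketch: the helper name `build_request` and the `API_ROOT` constant are illustrative and not part of the API, and the host is taken from the curl example above.

```python
import urllib.request

API_ROOT = "https://api.voiceai.com"  # host copied from the curl example


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Return a GET request for `path` that carries the xi-api-key header."""
    return urllib.request.Request(
        API_ROOT + path,
        headers={"xi-api-key": api_key},
        method="GET",
    )


# Building the request does not touch the network;
# urllib.request.urlopen(request) would actually send it.
request = build_request("/v1/credits", "YOUR_API_KEY")
```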

Base URL

https://api.voiceai.com/v1

Endpoint paths below are written in full, including their version prefix (/v1, /v1m, or /v2), and are appended to the host https://api.voiceai.com.

Credits System

Each API call costs credits. Credits are deducted when a task is created. If a task fails, credits are refunded automatically.

Text to Speech: 0.5-1 credits / char
Dubbing: per task
Speech to Text: per task
Sound Effects: 50-200 credits
Voice Changer: per duration
Music Generation: per task

Async Tasks

Most endpoints create an async task. You get a task_id immediately. Get results via:

Webhook: pass receive_url and we'll POST the result to that URL when the task finishes
Polling: call GET /v1/task/{task_id} until status is "done" or "error"
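
The polling option can be sketched as a small loop. This is only an illustration under stated assumptions: `wait_for_task`, its `interval` and `timeout` parameters, and the caller-supplied `fetch_task` function (which is expected to perform GET /v1/task/{task_id} and return the decoded JSON) are hypothetical names. Intermediate status values are not documented here, so the loop treats anything other than "done" or "error" as still pending.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Poll until the task status is "done" or "error", or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in ("done", "error"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        time.sleep(interval)
```

Passing the fetch function in keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.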

Text to Speech

Create Speech (ElevenLabs)

Converts text into speech using a voice of your choice and returns audio.

POST /v1/text-to-speech/{voice_id}

curl -X POST "https://api.voiceai.com/v1/text-to-speech/$VOICE_ID?output_format=mp3_44100_128" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "text": "The first move is what sets everything in motion.",
  "model_id": "eleven_multilingual_v2",
  "with_transcript": false,
  "receive_url": "https://your-server.com/webhook"
}'

text (string, required): Text to convert to speech
model_id (string): TTS model. Default: eleven_multilingual_v2. Models and credit multipliers: eleven_multilingual_v2 (1x), eleven_v3 (1x), eleven_flash_v2_5 (0.5x), eleven_turbo_v2_5 (0.5x), eleven_turbo_v2 (0.5x), eleven_flash_v2 (0.5x)
voice_settings.stability (number): Voice stability, 0-1. Default: 0.5
voice_settings.similarity_boost (number): Similarity boost, 0-1. Default: 0.75
voice_settings.speed (number): Speech speed, 0.25-4.0. Default: 1.0
with_transcript (boolean): Return a transcript of the audio. Default: false. +15% credit cost
loudness_normalization (boolean): Normalize volume for consistent loudness. Default: false. +15% credit cost
receive_url (string): Webhook URL to receive the audio when done (optional)
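
The model multipliers and the two +15% options suggest a rough cost estimate before submitting text. The sketch below rests on assumptions the reference does not confirm: that a multiplier of 1x means one credit per character, that characters are counted as Python `len(text)`, and that the two surcharges add up (+30% when both are enabled). The function name `estimate_tts_credits` is hypothetical; the `credit_cost` returned by the API remains the authoritative figure.

```python
# Multipliers as listed for model_id above.
MODEL_MULTIPLIER = {
    "eleven_multilingual_v2": 1.0,
    "eleven_v3": 1.0,
    "eleven_flash_v2_5": 0.5,
    "eleven_turbo_v2_5": 0.5,
    "eleven_turbo_v2": 0.5,
    "eleven_flash_v2": 0.5,
}
SURCHARGE_PERCENT = 15  # per enabled option (with_transcript, loudness_normalization)


def estimate_tts_credits(
    text: str,
    model_id: str = "eleven_multilingual_v2",
    with_transcript: bool = False,
    loudness_normalization: bool = False,
) -> float:
    """Estimate credits for a request; assumes additive surcharges."""
    base = len(text) * MODEL_MULTIPLIER[model_id]
    enabled_options = int(with_transcript) + int(loudness_normalization)
    return base * (100 + SURCHARGE_PERCENT * enabled_options) / 100
```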

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/audio.mp3",
    "srt_url": "https://cdn.voiceai.com/audio.srt",
    "json_url": "https://cdn.voiceai.com/audio.json"
  },
  "type": "tts"
}
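
A result payload like the one above can be handled the same way whether it arrives by webhook or by polling. The following sketch uses a hypothetical helper, `result_urls`, that returns the `*_url` entries from `metadata` for a finished task and raises otherwise; which keys are present varies by task type, so nothing is assumed beyond the fields shown in the example.

```python
def result_urls(task: dict) -> dict:
    """Return the metadata URLs of a finished task, or raise if it is not done."""
    status = task.get("status")
    if status == "error":
        raise RuntimeError(task.get("error_message") or "task failed")
    if status != "done":
        raise ValueError(f"task not finished (status={status!r})")
    metadata = task.get("metadata") or {}
    return {key: value for key, value in metadata.items() if key.endswith("_url")}
```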

Create Speech (Minimax)

Converts text to speech using Minimax TTS service with voice tuning.

POST /v1m/task/text-to-speech

curl -X POST "https://api.voiceai.com/v1m/task/text-to-speech" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "text": "Hello, this is a test message.",
  "model": "speech-2.6-hd",
  "voice_setting": {
    "voice_id": "209533299589184",
    "vol": 1,
    "pitch": 0,
    "speed": 1
  },
  "language_boost": "Auto",
  "with_transcript": false,
  "receive_url": "https://your-server.com/webhook"
}'

text (string, required): Text to convert to speech
model (string): TTS model. Default: speech-2.6-hd. Models and credit multipliers: speech-2.8-hd (1x), speech-2.8-turbo (0.5x), speech-2.6-hd (1x), speech-2.6-turbo (0.5x), speech-02-hd (1x), speech-02-turbo (0.5x)
voice_setting.voice_id (string, required): Voice ID
voice_setting.vol (number): Volume, 0.5-2.0. Default: 1
voice_setting.pitch (number): Pitch, -12 to 12. Default: 0
voice_setting.speed (number): Speed, 0.01-10.0. Default: 1
language_boost (string): Language boost. Default: "Auto". See common/config
with_transcript (boolean): Return transcript. Default: false. +15% credit cost
loudness_normalization (boolean): Normalize volume for consistent loudness. Default: false. +15% credit cost
receive_url (string): Webhook URL (optional)
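
Checking the voice_setting ranges on the client side avoids spending a round trip on a request that would be rejected. This is a sketch only: `make_voice_setting` is a hypothetical helper, and it assumes the documented ranges are inclusive at both ends.

```python
def make_voice_setting(
    voice_id: str, vol: float = 1, pitch: float = 0, speed: float = 1
) -> dict:
    """Build the voice_setting object after checking the documented ranges."""
    if not voice_id:
        raise ValueError("voice_id is required")
    ranges = {
        "vol": (vol, 0.5, 2.0),
        "pitch": (pitch, -12, 12),
        "speed": (speed, 0.01, 10.0),
    }
    for name, (value, low, high) in ranges.items():
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    return {"voice_id": voice_id, "vol": vol, "pitch": pitch, "speed": speed}
```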

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/audio.mp3",
    "srt_url": "https://cdn.voiceai.com/audio.srt"
  },
  "type": "minimax_tts"
}

Voice Cloning

Clone Voice

Uploads an audio file to create a custom voice clone.

POST /v1m/voice/clone

curl -X POST "https://api.voiceai.com/v1m/voice/clone" \
  -H "xi-api-key: $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file=@audio_sample.mp3 \
  -F voice_name="My Custom Voice" \
  -F language_tag="English" \
  -F gender_tag="female"

file (file, required): Audio file (.mp3). Max: 20MB or 5 minutes
voice_name (string): Name for the cloned voice
preview_text (string): Preview text for the voice sample
language_tag (string): Language (e.g. "English"). See common/config
need_noise_reduction (boolean): Remove noise from audio. Default: false
gender_tag (string): Gender: "male" or "female"

Response Example

{
  "success": true,
  "cloned_voice_id": 123456789
}

Delete Voice Clone

Deletes a voice clone and all its references.

DELETE /v1m/voice/clone/{voice_clone_id}

curl -X DELETE "https://api.voiceai.com/v1m/voice/clone/$VOICE_CLONE_ID" \
  -H "xi-api-key: $API_KEY"

Response Example

{
  "success": true
}

List Voice Clones

Retrieves all voice clones owned by the authenticated user.

GET /v1m/voice/clone

curl "https://api.voiceai.com/v1m/voice/clone" \
  -H "xi-api-key: $API_KEY"

Response Example

{
  "success": true,
  "data": [
    {
      "voice_id": "12345",
      "parent_voice_id": "0",
      "voice_name": "Graceful Lady",
      "tag_list": ["English", "Clone"],
      "cover_url": "https://...",
      "create_time": 1736947070593,
      "voice_status": 2,
      "sample_audio": "https://..."
    }
  ]
}

Speech to Text

Transcribe Audio

Transcribes audio and returns transcript as JSON and SRT.

POST /v1/speech-to-text

curl -X POST "https://api.voiceai.com/v1/task/speech-to-text" \
  -H "xi-api-key: $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file=@recording.mp3 \
  -F receive_url="https://your-server.com/webhook"

file (file, required): Audio file: mp3, aac, aiff, ogg, opus, wav, webm, flac, m4a. Max: 200MB
receive_url (string): Webhook URL (optional)
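
A quick pre-upload check against the listed formats and size limit can catch obvious problems locally. The helper below, `check_upload`, is hypothetical. It assumes the format is judged by file extension and that "200MB" means 200 × 1024 × 1024 bytes; the reference states neither.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {
    ".mp3", ".aac", ".aiff", ".ogg", ".opus", ".wav", ".webm", ".flac", ".m4a",
}
MAX_BYTES = 200 * 1024 * 1024  # assumption: MB interpreted as 2**20 bytes


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file looks unsuitable for speech-to-text."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
```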

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "json_url": "https://cdn.voiceai.com/transcript.json",
    "srt_url": "https://cdn.voiceai.com/transcript.srt"
  },
  "type": "speech_to_text"
}

Dubbing

Dub an Audio File

Dubs a provided audio file into a given language. Returns dubbed audio & transcript (srt).

POST /v1/dubbing

curl -X POST "https://api.voiceai.com/v1/task/dubbing" \
  -H "xi-api-key: $API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file=@audio.mp3 \
  -F num_speakers="0" \
  -F disable_voice_cloning="false" \
  -F source_lang="auto" \
  -F target_lang="es" \
  -F receive_url="https://your-server.com/webhook"

file (file, required): Audio file (.m4a, .mp3). Max: 20MB or 5 minutes
num_speakers (integer): Number of speakers. Default: 0 (auto-detect)
disable_voice_cloning (boolean): [BETA] Use a similar voice from the library instead of cloning. Default: true
source_lang (string): Source language. Default: "auto"
target_lang (string, required): Target language to dub into
receive_url (string): Webhook URL (optional)

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/dubbed.mp3",
    "srt_url": "https://cdn.voiceai.com/dubbed.srt",
    "json_url": "https://cdn.voiceai.com/dubbed.json"
  },
  "type": "dubbing"
}

Sound Effects

Generate Sound Effects

Generate sound effects from text prompts using AI.

POST /v1/sound-effects

curl -X POST "https://api.voiceai.com/v1/task/sound-effect" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "text": "Thunder rolling with heavy rain",
  "duration_seconds": 5.0,
  "prompt_influence": 0.3,
  "loop": false,
  "receive_url": "https://your-server.com/webhook"
}'

text (string, required): Sound effect prompt (max 450 characters)
duration_seconds (number | null): Duration, 0.5-30 s. Default: null (auto)
prompt_influence (number): Prompt adherence, 0-1. Default: 0.3
loop (boolean): Create looping sound. Default: false
model_id (string): Model. Default: "eleven_text_to_sound_v2"
receive_url (string): Webhook URL (optional)

Credit Cost: duration_seconds: null (auto) = 200 credits. Specified duration = 50 credits/sec. Minimum: 50 credits.
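
The credit rule above translates into a short function. This sketch applies the per-second rule literally, with no upper cap. Note that the Credits System summary lists 50-200 credits for sound effects, so longer durations may be billed differently than computed here. The name `sound_effect_credits` is hypothetical, and the 0.5-30 s range is assumed to be inclusive.

```python
from typing import Optional


def sound_effect_credits(duration_seconds: Optional[float]) -> float:
    """Credits for one sound effect: 200 for auto duration, else 50/sec (min 50)."""
    if duration_seconds is None:
        return 200.0
    if not 0.5 <= duration_seconds <= 30:
        raise ValueError("duration_seconds must be between 0.5 and 30")
    return max(50.0, 50.0 * duration_seconds)
```

For example, `sound_effect_credits(0.5)` hits the 50-credit minimum, because 0.5 s at 50 credits/sec would otherwise be 25.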

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "credit_cost": 200,
  "metadata": {
    "output_uri": "https://cdn.voiceai.com/sound.mp3",
    "character_cost": 200,
    "duration_seconds": null
  },
  "progress": 100,
  "type": "sound_effect"
}

Music Generation

Generate Music

Generates music using Minimax AI. Supports both idea-based and lyrics-based generation.

POST /v1m/task/music

curl -X POST "https://api.voiceai.com/v1m/task/music-generation" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "title": "My Song Title",
  "idea": "A relaxing jazz piece for a quiet evening",
  "n": 1,
  "style_id": "8",
  "mood_id": "5",
  "scenario_id": "5",
  "rewrite_idea_switch": false,
  "receive_url": "https://your-server.com/webhook"
}'

idea (string): Music idea/description (20-300 characters)
lyrics (string): Song lyrics (50-3000 characters). Alternative to idea
title (string): Song title (max 40 characters)
n (number): Number of tracks to generate, 1-3. Default: 1
style_id (string): Music style (see reference below)
mood_id (string): Music mood (see reference below)
scenario_id (string): Music scenario (see reference below)
rewrite_idea_switch (boolean): Rewrite idea for better results. Default: false
receive_url (string): Webhook URL (optional)

Styles (style_id)

1 Pop

2 Urban

3 Rock

4 Hip Hop

5 Electronic

6 Reggae

7 Blues

8 Jazz

9 Folk

10 Country

11 Classical

12 R&B

13 Disco

15 Experimental

17 World

18 Ethnic

Moods (mood_id)

1 Relaxed

2 Happy

3 Energetic

4 Romantic

5 Sad

6 Angry

7 Inspired

8 Warm

9 Passionate

10 Joyful

11 Longing

Scenarios (scenario_id)

1 Coffee shop

2 Solitary walk

3 Travel

4 Sunset by the sea

5 Quiet evening

6 Late-night bar

7 Urban romance

8 City nightlife

9 Rainy night

10 Sunlit Shores
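
The length limits and ID tables for music generation can be checked before a request is sent. The helper below, `build_music_payload`, is a hypothetical sketch. It assumes that at least one of idea or lyrics must be supplied (the reference only calls lyrics an "alternative to idea") and that IDs are passed as strings, as in the curl example.

```python
from typing import Optional

# ID sets copied from the Styles, Moods and Scenarios tables (14 and 16 are not listed).
STYLE_IDS = {str(i) for i in range(1, 14)} | {"15", "17", "18"}
MOOD_IDS = {str(i) for i in range(1, 12)}
SCENARIO_IDS = {str(i) for i in range(1, 11)}


def build_music_payload(
    idea: Optional[str] = None,
    lyrics: Optional[str] = None,
    title: Optional[str] = None,
    n: int = 1,
    style_id: Optional[str] = None,
    mood_id: Optional[str] = None,
    scenario_id: Optional[str] = None,
) -> dict:
    """Validate the documented limits and return a JSON-ready request body."""
    if idea is None and lyrics is None:
        raise ValueError("provide idea or lyrics")  # assumption, see lead-in
    if idea is not None and not 20 <= len(idea) <= 300:
        raise ValueError("idea must be 20-300 characters")
    if lyrics is not None and not 50 <= len(lyrics) <= 3000:
        raise ValueError("lyrics must be 50-3000 characters")
    if title is not None and len(title) > 40:
        raise ValueError("title must be at most 40 characters")
    if not 1 <= n <= 3:
        raise ValueError("n must be 1-3")
    for name, value, allowed in (
        ("style_id", style_id, STYLE_IDS),
        ("mood_id", mood_id, MOOD_IDS),
        ("scenario_id", scenario_id, SCENARIO_IDS),
    ):
        if value is not None and value not in allowed:
            raise ValueError(f"unknown {name}: {value!r}")
    fields = {
        "idea": idea, "lyrics": lyrics, "title": title, "n": n,
        "style_id": style_id, "mood_id": mood_id, "scenario_id": scenario_id,
    }
    # Omit unset optional fields rather than sending nulls.
    return {key: value for key, value in fields.items() if value is not None}
```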

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": [
      "https://cdn.voiceai.com/music_track_1.mp3"
    ],
    "cover_url": [
      "https://cdn.voiceai.com/cover_1.jpg"
    ]
  },
  "type": "minimax_music"
}

Voices & Models

List Models

Retrieve available voice synthesis models.

GET /v1/models

curl "https://api.voiceai.com/v1/models" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY"

List Recommended Voices

Gets all available voices with search, filtering and pagination.

GET /v2/voices

curl "https://api.voiceai.com/v2/voices" \
  -H "xi-api-key: $API_KEY"

Shared Voices

Retrieves a list of shared voices from the community library.

GET /v1/shared-voices

curl "https://api.voiceai.com/v1/shared-voices" \
  -H "xi-api-key: $API_KEY"

Voice List (Minimax)

Retrieves available voices from Minimax service with pagination and tag filtering.

POST /v1m/voice/list

curl -X POST "https://api.voiceai.com/v1m/voice/list" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY" \
  -d '{
  "page": 1,
  "page_size": 30,
  "tag_list": []
}'

page (number): Page number. Default: 1
page_size (number): Items per page. Default: 30
tag_list (string[]): Filter by tags (language, accent, gender, age). See common/config

Response Example

{
  "success": true,
  "data": {
    "has_more": true,
    "total": "454",
    "voice_list": [
      {
        "voice_id": "226893671006276",
        "voice_name": "Graceful Lady",
        "tag_list": ["English", "Female", "Middle Age", "Elegant"],
        "sample_audio": "https://...",
        "cover_url": "https://..."
      }
    ]
  }
}
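
The has_more flag in this response allows a client to walk through every page. The generator below is a sketch under stated assumptions: `iter_voices` is a hypothetical name, and `fetch_page(page, page_size)` stands for a caller-supplied function that POSTs to /v1m/voice/list and returns the decoded JSON body.

```python
from typing import Callable, Iterator


def iter_voices(
    fetch_page: Callable[[int, int], dict], page_size: int = 30
) -> Iterator[dict]:
    """Yield every voice, requesting pages until has_more is false."""
    page = 1
    while True:
        data = fetch_page(page, page_size)["data"]
        yield from data.get("voice_list", [])
        if not data.get("has_more"):
            return
        page += 1
```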

Utilities

Voice Changer (Speech-to-Speech)

Transforms the voice in an audio file into another voice using AI.

POST /v1/task/voice-changer

curl -X POST "https://api.voiceai.com/v1/task/voice-changer" \
  -H "xi-api-key: $API_KEY" \
  -F 'file=@audio.mp3' \
  -F 'voice_id=21m00Tcm4TlvDq8ikWAM' \
  -F 'model_id=eleven_multilingual_sts_v2' \
  -F 'voice_settings={"stability": 0.5, "similarity_boost": 0.75}' \
  -F 'remove_background_noise=true'

file (file, required): Audio file (MP3, M4A, WAV). Max: 300MB, 5 hours
voice_id (string, required): Target voice ID
model_id (string): AI model. Default: "eleven_multilingual_sts_v2"
voice_settings (JSON string): Voice tuning: stability (0-1), similarity_boost (0-1), style (0-1), use_speaker_boost (bool)
remove_background_noise (boolean): Remove background noise. Default: false
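
Because voice_settings travels as a JSON string inside a multipart form, it has to be serialized separately from the other fields. The sketch below shows one way to prepare the text fields; the file part is left to whichever HTTP client is used. The helper name `voice_changer_fields` is hypothetical, and sending booleans as the strings "true"/"false" follows the curl example rather than any stated rule.

```python
import json


def voice_changer_fields(
    voice_id: str,
    stability: float = 0.5,
    similarity_boost: float = 0.75,
    remove_background_noise: bool = False,
    model_id: str = "eleven_multilingual_sts_v2",
) -> dict:
    """Return the non-file form fields, with voice_settings encoded as JSON text."""
    settings = {"stability": stability, "similarity_boost": similarity_boost}
    for name, value in settings.items():
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return {
        "voice_id": voice_id,
        "model_id": model_id,
        "voice_settings": json.dumps(settings),
        "remove_background_noise": "true" if remove_background_noise else "false",
    }
```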

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "duration": 120.5,
  "credit_cost": 5000,
  "ec_remain_credits": "9500"
}

Webhook / Polling Result

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "credit_cost": 5000,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/transformed.mp3"
  },
  "type": "voice_changer"
}

Voice Isolate

Isolates the voice from background noise in an audio file.

POST /v1/task/voice-isolate

curl -X POST "https://api.voiceai.com/v1/task/voice-isolate" \
  -H "xi-api-key: $API_KEY" \
  -F 'file=@audio.mp3'

file (file, required): Audio file (MP3, M4A, WAV). Max: 300MB, 5 hours. Min: 5 seconds

Response Example

{
  "success": true,
  "task_id": "uuid_task_id",
  "ec_remain_credits": 95000
}

Webhook / Polling Result

{
  "id": "uuid",
  "status": "done",
  "progress": 100,
  "output_uri": "https://cdn.voiceai.com/isolated.mp3",
  "metadata": {
    "file_name": "audio.mp3",
    "duration": 10.5
  }
}

Get Task

Retrieves task details by task ID. Use for polling if you don't use webhooks.

GET /v1/task/{task_id}

curl "https://api.voiceai.com/v1/task/$TASK_ID" \
  -H "Content-Type: application/json" \
  -H "xi-api-key: $API_KEY"

Response Example

{
  "id": "uuid_task_id",
  "created_at": "2025-01-01T00:00:00.000Z",
  "status": "done",
  "error_message": null,
  "credit_cost": 1,
  "metadata": {
    "audio_url": "https://cdn.voiceai.com/audio.mp3",
    "srt_url": "https://cdn.voiceai.com/audio.srt"
  },
  "type": "tts"
}

Get Credits Balance

Returns your current credit balance.

GET /v1/credits

curl "https://api.voiceai.com/v1/credits" \
  -H "xi-api-key: $API_KEY"

Response Example

{
  "success": true,
  "credits": 95000
}