Speech to Text API

Transcribe uploaded audio with task polling, subtitle output, and async delivery.

What is the Speech to Text API?

Upload supported audio formats, create a speech-to-text task, and retrieve transcript artifacts such as JSON and SRT once the task completes.

Submit audio and check completion through the common task endpoint

JSON and subtitle outputs are returned when the task is done

Use common uploaded audio formats within the supported limits

Bash

api.cheapaiapi.com/v1

curl -X POST "https://api.cheapaiapi.com/v1/task/speech-to-text" \
  -H "Authorization: Bearer sk_your_api_key" \
  -F file=@meeting.mp3


Designed for asynchronous use: submit a task and retrieve the result later, rather than receiving the output synchronously in the response.


Receive JSON and subtitle outputs when available.


Submit common audio file types using multipart form data.


You can provide a callback URL instead of polling manually.

Flow: Async

Output: JSON + SRT

Upload: Multipart

Task Polling: /v1/task/:id

How do I get the final transcript?

Poll the common task endpoint and read the transcript URLs from the returned metadata once the task is done.
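The polling loop described above might look roughly like the following sketch. It is not taken from the CheapAI documentation: the "status" field, its "done" and "failed" values, and the CHEAPAI_API_KEY variable name are assumptions made for illustration, and the helper names (fetch_task, json_string_field, poll_task) are hypothetical. Check the task endpoint's actual response shape before relying on it.

```shell
# Sketch only. Assumed (not confirmed by the source): the task JSON carries a
# top-level "status" string that becomes "done" or "failed".
API_BASE="${API_BASE:-https://api.cheapaiapi.com/v1}"

# Fetch the current task record as JSON from the common task endpoint.
fetch_task() {
  curl -sS "$API_BASE/task/$1" \
    -H "Authorization: Bearer $CHEAPAI_API_KEY"
}

# Print the value of a string field read from JSON on stdin.
# Naive sed-based extraction; a real JSON parser such as jq is more robust.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# poll_task TASK_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Prints the final task JSON on success.
# Returns 0 when done, 1 on failure, 2 when attempts run out.
poll_task() {
  local task_id="$1"
  local interval="${2:-5}"
  local max_attempts="${3:-60}"
  local attempt=0
  local body
  local status
  while [ "$attempt" -lt "$max_attempts" ]; do
    body=$(fetch_task "$task_id") || return 1
    status=$(printf '%s' "$body" | json_string_field status)
    case "$status" in
      done)   printf '%s\n' "$body"; return 0 ;;
      failed) printf '%s\n' "$body" >&2; return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "task $task_id did not finish after $max_attempts attempts" >&2
  return 2
}
```

Calling poll_task "$TASK_ID" 5 60 would check every 5 seconds for up to 5 minutes and print the finished task record, from which the transcript URLs in the metadata can be read. A fixed interval keeps the sketch short; a longer or growing interval may suit long recordings better.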

What file formats are supported?

Use the file types documented on the speech-to-text docs page for the current route.

Can I skip polling?

Yes. If the route supports receive_url, you can ask CheapAI to POST the result to your webhook.
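As a rough illustration, a submission that asks for a webhook callback could be wrapped as below. The source names the receive_url parameter but does not say how it is passed, so sending it as a multipart form field alongside the file is an assumption, as are the hypothetical submit_audio helper, the CHEAPAI_API_KEY variable name, and the example callback URL.

```shell
# Sketch only. Assumed (not confirmed by the source): receive_url is sent as a
# multipart form field next to the uploaded file.
API_BASE="${API_BASE:-https://api.cheapaiapi.com/v1}"

# submit_audio FILE [RECEIVE_URL]
# Creates a speech-to-text task; adds receive_url only when one is given.
submit_audio() {
  local file="$1"
  local receive_url="${2:-}"
  # Build the curl argument list in the positional parameters.
  set -- -sS -X POST "$API_BASE/task/speech-to-text" \
    -H "Authorization: Bearer $CHEAPAI_API_KEY" \
    -F "file=@$file"
  if [ -n "$receive_url" ]; then
    set -- "$@" -F "receive_url=$receive_url"
  fi
  curl "$@"
}
```

For example, submit_audio meeting.mp3 "https://example.com/hooks/transcripts" would upload the file and request a callback, while submit_audio meeting.mp3 alone falls back to the polling flow. The webhook receiver should be reachable from the public internet and accept POST requests.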

Get $2 in free credits. Create an API key, set spend controls if needed, and start testing the current task-based endpoints.