API Documentation
Overview
POST /api/synthesize
Convert plain text or Markdown into spoken audio and WebVTT subtitles.
Request
Send JSON to generate audio
Headers
Content-Type: application/json
Accept: application/json
Body
{
  "text": "Your text or markdown content here"
}
Send a JSON payload with a single text field containing the text or Markdown content to synthesize. For Markdown inputs, front matter is parsed automatically.
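As a minimal sketch of the request described above, the following Python snippet builds (but does not send) a POST request carrying the documented headers and the single text field. The helper name build_synthesize_request is hypothetical, and the URL is taken from the cURL example in this document; only the standard library is used.

```python
import json
import urllib.request

# Endpoint as it appears in this document's cURL example.
API_URL = "https://voxify-labs.gaidot.net/api/synthesize"


def build_synthesize_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build (without sending) a POST request with the documented headers and body.

    Hypothetical helper for illustration; not part of the API itself.
    """
    # The body is a JSON object with a single "text" field.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Markdown input with front matter; json.dumps escapes the newlines for us.
markdown = "---\ntitle: Demo\n---\n\n# Hello\n\nThis is a test."
request = build_synthesize_request(markdown)
```

Passing the resulting object to urllib.request.urlopen would perform the call. Serializing with json.dumps rather than hand-writing the body avoids escaping mistakes when the Markdown contains quotes or newlines.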
Response
Receive audio and subtitles
JSON Payload
{
  "audio_base64": "SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjYx...",
  "vtt": "WEBVTT\n\n00:00.000 --> 00:01.000\nHello"
}
The audio_base64 field contains MP3 audio encoded as base64. The vtt field contains subtitle cues in WebVTT format.
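The two response fields can be unpacked along these lines. This is a sketch that assumes the response body is exactly the JSON object shown above; the function names decode_synthesis and save_synthesis and the default file names are hypothetical.

```python
import base64
import json
from pathlib import Path


def decode_synthesis(response_text: str) -> tuple[bytes, str]:
    """Return (mp3_bytes, vtt_text) from a response body in the documented shape."""
    payload = json.loads(response_text)
    # audio_base64 holds the MP3 data as base64 text; decode it to raw bytes.
    audio = base64.b64decode(payload["audio_base64"])
    # vtt is already plain WebVTT text.
    return audio, payload["vtt"]


def save_synthesis(response_text: str, out_dir: Path, stem: str = "speech") -> tuple[Path, Path]:
    """Write the decoded audio and subtitles to <stem>.mp3 and <stem>.vtt in out_dir."""
    audio, vtt = decode_synthesis(response_text)
    out_dir.mkdir(parents=True, exist_ok=True)
    mp3_path = out_dir / f"{stem}.mp3"
    vtt_path = out_dir / f"{stem}.vtt"
    mp3_path.write_bytes(audio)
    vtt_path.write_text(vtt, encoding="utf-8")
    return mp3_path, vtt_path
```

Giving both files the same stem keeps them easy to pair when loading the subtitles alongside the audio in a player.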
Notes
The API currently returns the full audio payload in a single JSON response. For large inputs, expect larger response sizes and longer processing times.
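Part of the response-size cost comes from base64 itself: every 3 bytes of audio become 4 characters of text, so the encoded audio is about one third larger than the MP3 it carries. The sketch below estimates that overhead. It assumes standard padded base64, which the source does not state explicitly, and the function name encoded_size is hypothetical.

```python
import base64
import math


def encoded_size(raw_bytes: int) -> int:
    """Length in characters of padded base64 for raw_bytes bytes of input."""
    # Each group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * math.ceil(raw_bytes / 3)


# Cross-check the formula against the standard library for a small buffer.
sample_len = len(base64.b64encode(bytes(1000)))

# Estimated encoded size of a 1,000,000-byte MP3 (audio field only).
one_megabyte_encoded = encoded_size(1_000_000)
```

The estimate covers the audio_base64 field only. The vtt text and the JSON wrapper add to the total response size.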
Example
cURL
curl -X POST https://voxify-labs.gaidot.net/api/synthesize \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "text": "Hello, this is a text-to-speech synthesis test."
  }'