Text to speech

Convert text into a spoken audio file. Only voices with supportsDirectToolExecution set to true can be used. Optionally choose a voice, language, speed, and pronunciation overrides.
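The voice restriction above can be applied client-side before submitting a request. The following is a minimal sketch, assuming the voice list from GET /v1/resources/tts-voices has already been fetched and decoded into a list of dictionaries. The `id` field name and the helper `usable_voice_ids` are assumptions for illustration; only `supportsDirectToolExecution` is named in this reference.

```python
from typing import Any


def usable_voice_ids(voices: list[dict[str, Any]]) -> list[str]:
    """Return ids of voices whose supportsDirectToolExecution flag is true.

    Assumes each voice is a dict with an "id" key (hypothetical field name).
    Voices that lack the flag are treated as not usable.
    """
    return [
        voice["id"]
        for voice in voices
        if voice.get("supportsDirectToolExecution") is True
    ]


# Illustrative data only; real voice records may carry other fields.
sample_voices = [
    {"id": "voice_a", "supportsDirectToolExecution": True},
    {"id": "voice_b", "supportsDirectToolExecution": False},
    {"id": "voice_c"},
]
eligible = usable_voice_ids(sample_voices)
```

Filtering up front avoids submitting a voiceId that the endpoint would reject.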

Authentication

Authorization: Bearer
API key from [app.videogen.io/developers](https://app.videogen.io/developers). The full key is only shown once when you create it.
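As an illustration of the Bearer scheme, the sketch below builds the request headers from an API key. The helper name `build_auth_headers` and the environment variable `VIDEOGEN_API_KEY` are hypothetical, and the JSON `Content-Type` is an assumption based on the endpoint expecting an object.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers that carry the API key as a Bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }


# Hypothetical environment variable; falls back to a placeholder for the demo.
headers = build_auth_headers(os.environ.get("VIDEOGEN_API_KEY", "example-key"))
```

Because the full key is shown only once, reading it from the environment or a secret store is usually safer than hard-coding it.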

Request

This endpoint expects an object.
ttsText (string, required)

voiceId (string or null, optional)

Voice id from GET /v1/resources/tts-voices. A default voice is used when null. Only voices with supportsDirectToolExecution set to true are accepted.

speechLanguageCode (string or null, optional)

ISO-639-1 language hint for pronunciation (e.g. en, es, zh).

pronunciationReplacements (list of objects, optional)

autoExpandPronunciationReplacements (boolean, optional)

When true, automatically expands numbers, symbols, acronyms, and other non-word tokens into their spoken forms before synthesis so the voice pronounces them correctly (e.g. $100 → one hundred dollars, NASA → nasa, 3rd → third). Defaults to false when omitted.

voiceSpeed (double, optional)

Speech rate multiplier. Defaults to the voice's default speed.

numResults (integer, optional, 1-100, defaults to 1)

Number of results to generate.

isOutputTemporary (boolean, optional, defaults to false)

When true, generated files are temporary. Temporary files are guaranteed to be available for 24 hours, after which they may be archived at any time. Temporary files are not analyzed (no description, transcript, or embedding will be generated), so they will not appear in search results. Defaults to false.
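The request fields above can be assembled and checked locally before sending. This is a minimal sketch: the helper `build_tts_request` is hypothetical, it validates only the constraints stated in this reference (ttsText required, numResults between 1 and 100), and it leaves out pronunciationReplacements because the shape of those objects is not described here.

```python
from typing import Any, Optional


def build_tts_request(
    tts_text: str,
    voice_id: Optional[str] = None,
    speech_language_code: Optional[str] = None,
    auto_expand: Optional[bool] = None,
    voice_speed: Optional[float] = None,
    num_results: int = 1,
    is_output_temporary: bool = False,
) -> dict[str, Any]:
    """Build the request object, omitting optional fields that are unset."""
    if not tts_text:
        raise ValueError("ttsText is required")
    if not 1 <= num_results <= 100:
        raise ValueError("numResults must be between 1 and 100")

    body: dict[str, Any] = {
        "ttsText": tts_text,
        "numResults": num_results,
        "isOutputTemporary": is_output_temporary,
    }
    optional_fields = {
        "voiceId": voice_id,
        "speechLanguageCode": speech_language_code,
        "autoExpandPronunciationReplacements": auto_expand,
        "voiceSpeed": voice_speed,
    }
    # Leave unset fields out so the documented server defaults apply.
    body.update({k: v for k, v in optional_fields.items() if v is not None})
    return body


payload = build_tts_request(
    "The total is $100.",
    speech_language_code="en",
    auto_expand=True,
    is_output_temporary=True,
)
```

Omitting unset optional fields, rather than sending explicit nulls, keeps the payload small and defers to the defaults listed above.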

Response

Execution accepted; poll until complete.

toolExecutionId (string)

Execution id (e.g. vg_exec_...).
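Since the response only confirms that the execution was accepted, the caller has to poll for completion. The status endpoint and its response shape are not described in this section, so the sketch below is deliberately generic: `fetch_status` and `is_done` are caller-supplied callables, and the helper name, interval, timeout, and the `status` values in the usage example are all assumptions.

```python
import time
from typing import Any, Callable


def poll_until_complete(
    fetch_status: Callable[[], Any],
    is_done: Callable[[Any], bool],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Call fetch_status repeatedly until is_done accepts the result.

    Raises TimeoutError if the deadline passes before completion.
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError("execution did not complete in time")
        sleep(interval_s)


# Simulated status responses; a real fetch_status would query the API
# with the toolExecutionId. The "status" field is hypothetical.
_responses = iter([{"status": "running"}, {"status": "done"}])
final = poll_until_complete(
    fetch_status=lambda: next(_responses),
    is_done=lambda r: r["status"] == "done",
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; in production the defaults apply.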