Text to speech

POST https://api.videogen.io/v1/tools/text-to-speech

import { VideoGenClient } from "@videogen/sdk";

const client = new VideoGenClient({ token: "YOUR_TOKEN" });
await client.tools.textToSpeech({
  ttsText: "ttsText"
});

202 Accepted

{
  "toolExecutionId": "string"
}

Convert text into a spoken audio file. Only voices with supportsDirectToolExecution set to true can be used. Optionally choose a voice, language, speed, and pronunciation overrides.


Authentication

Authorization: Bearer

API key from app.videogen.io/developers. The full key is only shown once when you create it.
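For callers who are not using the SDK, the bearer scheme above can be sketched as a small header builder. This is a minimal sketch: `buildAuthHeaders` is a hypothetical helper name, not part of the SDK, and the JSON content type is an assumption based on the endpoint expecting an object.

```typescript
// Hypothetical helper: builds headers for a raw HTTP request to the API.
// Assumes the API key is sent as a standard Bearer token and the body is JSON.
function buildAuthHeaders(apiKey: string): Record<string, string> {
  const trimmed = apiKey.trim();
  if (trimmed.length === 0) {
    throw new Error("API key must not be empty");
  }
  return {
    Authorization: `Bearer ${trimmed}`,
    "Content-Type": "application/json",
  };
}
```

Because the full key is only shown once, it is usually read from an environment variable or secret store rather than hard-coded.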

Request

This endpoint expects an object.
ttsText (string, required)
voiceId (string or null, optional)

Voice id from GET /v1/resources/tts-voices. A default voice is used when null. Only voices with supportsDirectToolExecution set to true are accepted.
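The voice restriction above could be applied client-side before a request is sent. The sketch below assumes each entry returned by GET /v1/resources/tts-voices carries an `id` field next to `supportsDirectToolExecution`; only the latter is named on this page, so `id`, `TtsVoice`, `usableVoices`, and `chooseVoiceId` are illustrative names.

```typescript
// Assumed shape of one entry from GET /v1/resources/tts-voices.
// Only `supportsDirectToolExecution` is named in this reference; `id` is a guess.
interface TtsVoice {
  id: string;
  supportsDirectToolExecution: boolean;
}

// Keeps only the voices this endpoint accepts.
function usableVoices(voices: TtsVoice[]): TtsVoice[] {
  return voices.filter((v) => v.supportsDirectToolExecution === true);
}

// Returns a voiceId for the request, or null to fall back to the default voice
// when the preferred voice is missing or cannot be used with this endpoint.
function chooseVoiceId(voices: TtsVoice[], preferredId?: string): string | null {
  const match = usableVoices(voices).find((v) => v.id === preferredId);
  return match ? match.id : null;
}
```

Falling back to null rather than throwing relies on the documented behaviour that a default voice is used when voiceId is null.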

speechLanguageCodestring or nullOptional

ISO-639-1 language hint for pronunciation (e.g. en, es, zh).

pronunciationReplacements (list of objects, optional)
autoExpandPronunciationReplacements (boolean, optional)

When true, automatically expands numbers, symbols, acronyms, and other non-word tokens into their spoken forms before synthesis so the voice pronounces them correctly (e.g. $100 → one hundred dollars, NASA → nasa, 3rd → third). Defaults to false when omitted.

voiceSpeed (double, optional)

Speech rate multiplier. Defaults to the voice's default speed.

numResults (integer, optional, 1-100, defaults to 1)

Number of output results to generate.

isOutputTemporary (boolean, optional, defaults to false)

When true, generated files are temporary. Temporary files are guaranteed to be available for 24 hours, after which they may be archived at any time. Temporary files are not analyzed (no description, transcript, or embedding will be generated), so they will not appear in search results. Defaults to false.
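The request fields above can be captured in a type with a light client-side check before sending. This is a sketch under stated assumptions, not the SDK's own types: `TextToSpeechRequest` and `validateRequest` are hypothetical names, rejecting blank text is an assumption, and pronunciationReplacements is left out because the shape of its objects is not described here.

```typescript
// Request fields as listed above. pronunciationReplacements is omitted
// because this reference does not describe the shape of its objects.
interface TextToSpeechRequest {
  ttsText: string;
  voiceId?: string | null;
  speechLanguageCode?: string | null;
  autoExpandPronunciationReplacements?: boolean;
  voiceSpeed?: number;
  numResults?: number;
  isOutputTemporary?: boolean;
}

// Hypothetical pre-flight check; the server remains the source of truth.
function validateRequest(req: TextToSpeechRequest): string[] {
  const problems: string[] = [];
  // Assumption: blank text is never a useful request.
  if (req.ttsText.trim().length === 0) {
    problems.push("ttsText is required");
  }
  const n = req.numResults;
  if (n !== undefined && (!Number.isInteger(n) || n < 1 || n > 100)) {
    problems.push("numResults must be an integer from 1 to 100");
  }
  // ISO-639-1 codes are two letters (e.g. en, es, zh).
  const lang = req.speechLanguageCode;
  if (lang != null && !/^[a-z]{2}$/i.test(lang)) {
    problems.push("speechLanguageCode should be a two-letter ISO-639-1 code");
  }
  return problems;
}

// Example body: a short-lived preview clip with automatic expansion enabled.
const exampleBody: TextToSpeechRequest = {
  ttsText: "Your total is $100.",
  speechLanguageCode: "en",
  autoExpandPronunciationReplacements: true,
  isOutputTemporary: true,
};
```

An empty array from `validateRequest` means none of these local checks failed; it does not guarantee the server will accept the request.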

Response

Execution accepted; poll until complete.

toolExecutionId (string)

Execution id (e.g. vg_exec_...).
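
Since the 202 response only returns an execution id, the caller has to poll for completion. The loop below is a generic sketch: `pollUntilDone` is a hypothetical helper, and both the lookup call and the completion test are passed in by the caller because this page does not document the "Get tool execution info" response. The interval and attempt limits are arbitrary defaults.

```typescript
// Generic polling loop. `fetchInfo` would wrap "Get tool execution info" for a
// given toolExecutionId; `isDone` inspects whatever that endpoint returns.
// Both are supplied by the caller because this page does not document them.
async function pollUntilDone<T>(
  fetchInfo: () => Promise<T>,
  isDone: (info: T) => boolean,
  options: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<T> {
  const intervalMs = options.intervalMs ?? 2000;
  const maxAttempts = options.maxAttempts ?? 150;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const info = await fetchInfo();
    if (isDone(info)) {
      return info;
    }
    // Wait between attempts, but not after the final one.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Execution not complete after ${maxAttempts} attempts`);
}
```

Bounding the number of attempts keeps a stuck execution from blocking the caller indefinitely; a webhook may be a better fit for long-running jobs.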