VideoGen API

v1.1.5 — Quality tiers and new remix actions

This release renames the generate-text model field to quality, makes the text-to-speech voice required, adds an image quality tier to the script and voiceover workflows, and introduces three new remix actions. Update your generate-text and text-to-speech calls before upgrading.

Breaking: generate-text renames model to quality and adds a MAX tier. Replace the model field with quality in every request. Accepted values are now LOW, STANDARD, HIGH, and MAX (previously LOW, STANDARD, and HIGH); the default is still STANDARD.
Breaking: text-to-speech now requires voiceId. Pass a voiceId from GET /v1/resources/tts-voices (only voices with supportsDirectToolExecution set to true are accepted). The previous default-voice fallback has been removed, so requests without voiceId now fail.
Image quality tier on the script and voiceover workflows: POST /v1/workflows/script-to-video and POST /v1/workflows/voiceover-to-video now accept a quality field that sets the tier used for AI-generated visuals. Accepted values are LOW, STANDARD, and HIGH; the default is STANDARD. The field only applies when visualStyle.type is AI_IMAGE or ENTITY; STOCK footage is unaffected.
New remix action RESIZE_PROJECT: change a project’s output aspect ratio and re-flow the video to the new ratio (for example, to a vertical 9:16 social format). See Remix actions.
New remix action CLEAN_UP_TRANSCRIPT: tighten every transcript in a project by removing filler words and silent pauses, with optional removeFillers, removePauses, fillerWords, and minPauseSeconds controls.
New remix action CONVERT_IMAGES_TO_VIDEOS: animate every eligible still image into a short AI video clip in place, with optional motionPrompt, muteOutputVideos, and quality controls. The action runs asynchronously, generating one clip per image, and is skipped without error when a project has no eligible images.
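
The two breaking changes and the workflow quality rule above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the VideoGen API or any official SDK: the helper names are hypothetical, the prompt field in the usage example is a placeholder, and the code assumes that old model values carry over unchanged as quality values and that each entry returned by GET /v1/resources/tts-voices exposes supportsDirectToolExecution as a boolean. It only reshapes and validates request bodies; it makes no network calls.

```python
from __future__ import annotations

# Values taken from the release notes above.
GENERATE_TEXT_QUALITIES = {"LOW", "STANDARD", "HIGH", "MAX"}
WORKFLOW_QUALITIES = {"LOW", "STANDARD", "HIGH"}
AI_VISUAL_TYPES = {"AI_IMAGE", "ENTITY"}


def migrate_generate_text(payload: dict) -> dict:
    """Return a copy of a pre-1.1.5 generate-text body with model renamed to quality."""
    body = dict(payload)
    if "model" in body:
        # Assumption: the old model value is reused as the quality value.
        old_value = body.pop("model")
        body.setdefault("quality", old_value)
    # quality is optional; the notes say it defaults to STANDARD when omitted.
    if "quality" in body and body["quality"] not in GENERATE_TEXT_QUALITIES:
        raise ValueError(f"unsupported generate-text quality: {body['quality']!r}")
    return body


def check_text_to_speech(payload: dict) -> dict:
    """Fail early when voiceId is missing, since the default voice was removed."""
    if not payload.get("voiceId"):
        raise ValueError("text-to-speech requires voiceId as of v1.1.5")
    return payload


def eligible_voices(voices: list[dict]) -> list[dict]:
    """Keep only voices flagged with supportsDirectToolExecution set to true."""
    return [v for v in voices if v.get("supportsDirectToolExecution") is True]


def workflow_quality_applies(payload: dict) -> bool:
    """Report whether quality affects a script or voiceover workflow body."""
    quality = payload.get("quality", "STANDARD")
    if quality not in WORKFLOW_QUALITIES:
        raise ValueError(f"unsupported workflow quality: {quality!r}")
    style_type = payload.get("visualStyle", {}).get("type")
    return style_type in AI_VISUAL_TYPES


if __name__ == "__main__":
    # "prompt" is a placeholder field name used only for illustration.
    print(migrate_generate_text({"prompt": "Write a tagline", "model": "HIGH"}))
    print(workflow_quality_applies({"quality": "HIGH", "visualStyle": {"type": "STOCK"}}))
```

Running the migration helper over stored request bodies before upgrading is one way to find calls that still send model or omit voiceId; the server remains the source of truth for which values and voices are accepted.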