Voiceover to video
Creates a project from an uploaded voiceover file and generates a video with matching b-roll. Upload the voiceover via the files API first.
Authentication
Request
Opaque file id of an uploaded voiceover audio file (e.g. vg_file_...). Upload the file first via POST /v1/files/upload.
Visual treatment for the generated b-roll.
Aspect ratio as a width:height pair (e.g. 16 and 9 for 16:9). Not pixel dimensions.
How quickly visuals change. FAST shows more, shorter shots; SLOW holds each visual longer. Defaults to MEDIUM.
Output language as a BCP-47 code (e.g. en, es, fr). Defaults to English.
Caption styling. Omit to use the default style with captions shown. Pass an object to override individual style fields (any omitted field uses the default). Pass null to hide captions entirely.
Optional file id of an uploaded logo image to overlay on the video (e.g. vg_file_...). Upload the image first via POST /v1/files/upload. Only image files are accepted.
Optional production notes for the AI that builds the video: visual direction for how to illustrate the voiceover (e.g. on-screen code or text to display, specific b-roll to feature, or scene-by-scene staging). The notes are never spoken and do not change the uploaded voiceover audio or its transcript.
Optional edits applied to the project, in order, after the video is built. Each action runs asynchronously, and the response returns one remix action id per action. Captions and a logo are set with the captionStyle and logoFileId request fields above. Recommended remix actions here are SET_BACKGROUND_MUSIC for a music bed, ADD_TRANSITIONS to insert transitions between sections and assets, and EDIT_WITH_AGENT for open-ended natural-language edits. See the Remix actions guide.
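The request fields above can be assembled roughly as follows. This is a minimal sketch, not the official client: only captionStyle and remixActions are field names confirmed by this reference. The keys voiceoverFileId, aspectRatio, width, height, and the per-action type key are hypothetical stand-ins, since the real names are not shown here. The sketch illustrates two points from the descriptions: the aspect ratio is a reduced width:height pair rather than pixel dimensions, and captionStyle has three distinct states (omitted, an override object, or null).

```python
import json
from math import gcd

# Sentinel meaning "leave captionStyle out of the body" (default style, captions shown).
# It is distinct from None, which is sent as JSON null to hide captions.
_OMIT = object()


def aspect_ratio_from_pixels(width_px: int, height_px: int) -> dict:
    """Reduce pixel dimensions to the smallest integer width:height pair."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("dimensions must be positive")
    divisor = gcd(width_px, height_px)
    # "width" and "height" are hypothetical key names.
    return {"width": width_px // divisor, "height": height_px // divisor}


def build_request_body(voiceover_file_id, width_px, height_px,
                       caption_style=_OMIT, remix_actions=()):
    """Build a request body; keys marked hypothetical are assumptions."""
    body = {
        "voiceoverFileId": voiceover_file_id,  # hypothetical key name
        "aspectRatio": aspect_ratio_from_pixels(width_px, height_px),  # hypothetical key name
    }
    if caption_style is not _OMIT:
        # A dict overrides individual style fields; None hides captions.
        body["captionStyle"] = caption_style
    if remix_actions:
        # Order matters: actions are applied in the order given.
        body["remixActions"] = list(remix_actions)
    return body


def encode(body: dict) -> str:
    """Serialize the body; Python None becomes JSON null."""
    return json.dumps(body)
```

Calling build_request_body without caption_style leaves the captionStyle key out entirely, while caption_style=None serializes to "captionStyle": null. Keeping those two cases separate is the reason for the sentinel instead of a plain None default.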
Response
Opaque workflow run id (e.g. vg_work_...).
Opaque remix action ids (e.g. vg_rmix_...), one per remixActions entry in request order. Empty when no remix actions were requested. Each runs after the video is built; poll GET /v1/projects/{projectId}/remix-actions.
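Because the remix action ids come back in request order, a client can match each id to the action that produced it by position. A minimal sketch under stated assumptions: the base URL is a placeholder, the per-action type key is hypothetical, and how the project id is obtained is not covered in this section. Only the polling path GET /v1/projects/{projectId}/remix-actions is taken from the reference.

```python
def pair_actions_with_ids(remix_actions: list, remix_action_ids: list) -> list:
    """Pair each returned id with the requested action at the same position."""
    if len(remix_actions) != len(remix_action_ids):
        # The reference states one id per remixActions entry.
        raise ValueError("expected one remix action id per requested action")
    return list(zip(remix_action_ids, remix_actions))


def remix_actions_url(base_url: str, project_id: str) -> str:
    """Build the polling URL from the documented path."""
    return f"{base_url.rstrip('/')}/v1/projects/{project_id}/remix-actions"
```

When no remix actions were requested, both lists are empty and pair_actions_with_ids returns an empty list, so there is nothing to poll.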