Introduction

Turn a script into a finished video with a single API call.

The VideoGen API turns a script, voiceover, or slideshow into a finished video with visuals, narration, and captions. Start a workflow with a single API call, then poll for the result or subscribe to webhooks. The same API also exposes standalone tools for generating individual images, video clips, voiceovers, and more.

Base URL

https://api.videogen.io

What you can build

The API has two layers: full video workflows that produce a finished video, and standalone tools that generate a single asset.

Workflows

Workflows are the core of the API. Each one creates a VideoGen project and runs the full generation pipeline (visuals, narration, captions) from a single input. Start a workflow with one POST /v1/workflows/* call, then poll the run or wait for a webhook. When it finishes, you get a projectUrl and can export an MP4 through the Projects API.

Available workflows: script to video, voiceover to video, and slideshow to video.

See the Workflows guide for inputs, options, and the run lifecycle.
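
The start-then-poll pattern described above might be sketched in Python as follows. This is a minimal illustration, not the official client: the "status" field and the "completed" and "failed" values are assumptions made for the example, and fetch_run is a hypothetical callable that you would implement to call whichever run-status endpoint the Workflows guide documents.

```python
import time
from typing import Any, Callable, Dict

# Assumption: placeholder status values, not taken from the VideoGen docs.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_run(
    fetch_run: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_run repeatedly until the run reports a terminal status.

    fetch_run is a hypothetical callable that returns the run as a dict.
    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = fetch_run()
        # Assumption: the run dict exposes its state under a "status" key.
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError("run did not finish before the timeout")
        sleep(interval_seconds)
```

Passing the fetch function in as a parameter keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses. If you subscribe to webhooks instead, no polling loop is needed.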

Tools

Tools generate one asset at a time, with no project or pipeline. Each POST /v1/tools/* call is asynchronous and returns a toolExecutionId to poll. Available tools:

  • Generate images from text or an existing image.
  • Generate video clips from text, an image, or a video.
  • Convert text to speech with 100+ voices.
  • Generate sound effects and music from a prompt.
  • Create avatar videos with a presenter.
  • Upscale images and video, remove image and video backgrounds, vectorize images, and add 3D motion to a still image.

See the REST API reference for every tool, its request fields, and response shape.
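
As a rough illustration of the tool call shape, the Python sketch below builds (but does not send) a POST request against the base URL and reads a toolExecutionId out of a response body. Several details are assumptions rather than documented behavior: the bearer-token Authorization header, the JSON request body, and toolExecutionId being a top-level response field. The tool path is left as a parameter because the concrete paths and request fields live in the REST API reference.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "https://api.videogen.io"


def build_tool_request(
    tool_path: str, payload: Dict[str, Any], api_key: str
) -> urllib.request.Request:
    """Build a POST request for a tool endpoint without sending it."""
    if not tool_path.startswith("/v1/tools/"):
        raise ValueError("tool_path must start with /v1/tools/")
    return urllib.request.Request(
        url=BASE_URL + tool_path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; confirm in the API reference.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_execution_id(response_body: bytes) -> str:
    """Return the toolExecutionId from a JSON response body.

    Assumption: the ID is a top-level field of the response object.
    """
    return json.loads(response_body)["toolExecutionId"]
```

The returned request can be passed to urllib.request.urlopen to send it. The resulting toolExecutionId is then what you poll until the asset is ready.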

Conventions

Timestamps

Every numeric timestamp field in the API (expiresAt, occurredAt, createdAt, and any future additions) is an integer representing seconds since the Unix epoch (UTC, no milliseconds). For example, 1745409600 corresponds to 2025-04-23T12:00:00Z.
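
A short Python sketch of converting these values, using only the standard library. The helper names from_api_timestamp and to_api_timestamp are hypothetical and chosen for this example.

```python
from datetime import datetime, timezone


def from_api_timestamp(seconds: int) -> datetime:
    """Convert epoch seconds from the API into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def to_api_timestamp(moment: datetime) -> int:
    """Convert a timezone-aware datetime into whole epoch seconds."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # int() drops any fractional seconds, matching the no-milliseconds rule.
    return int(moment.timestamp())


print(from_api_timestamp(1745409600).isoformat())  # 2025-04-23T12:00:00+00:00
```

Because the values are in seconds, runtimes whose date types expect milliseconds (JavaScript's Date, for example) need the value multiplied by 1000 first.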

Next steps