Introduction

Turn a script, voiceover, or slideshow into a finished video with a single API call.

The VideoGen API turns a script, voiceover, or slideshow into a finished video with visuals, narration, and captions. Workflows are the heart of the API: a single call runs the whole generation pipeline and creates a VideoGen project you can edit and export.

Base URL

https://api.videogen.io

The flow

Every video follows the same three steps:

  1. Run a workflow. POST /v1/workflows/* starts the generation pipeline from your script, voiceover, or slideshow, and returns a workflowRunId and a projectId.
  2. Apply remix actions (optional). Layer on edits such as background music, a logo overlay, caption changes, or open-ended natural-language edits.
  3. Export the project. Render the project to an MP4 and download it.

The Getting started guide walks through this flow end to end.
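
Step 1 of the flow can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official client: the `send` transport and the `start_workflow` helper are hypothetical names introduced for illustration, and the concrete workflow path and request body are left to the caller because they are defined in the Workflows reference. Only the `POST /v1/workflows/*` pattern and the `workflowRunId` and `projectId` response fields come from the steps above.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: takes (method, path, json_body) and returns the
# decoded JSON response. Authentication and HTTP details live inside it.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def start_workflow(send: Send, workflow_path: str, payload: Dict[str, Any]) -> Dict[str, str]:
    """Step 1: POST to a /v1/workflows/* endpoint and keep the two ids it returns."""
    if not workflow_path.startswith("/v1/workflows/"):
        raise ValueError("workflow_path must start with /v1/workflows/")
    response = send("POST", workflow_path, payload)
    # Keep both ids: the run id identifies this generation run, and the
    # project id identifies the project that later steps edit and export.
    return {
        "workflowRunId": response["workflowRunId"],
        "projectId": response["projectId"],
    }
```

Injecting the transport keeps the sketch independent of any particular HTTP library and makes it easy to exercise with a fake in tests. Steps 2 and 3 are not shown because their endpoints are not described in this section.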

Workflows

Each workflow takes a single input and builds the full video (visuals, narration, captions):

  • Script to video: write a script, get a narrated video with matching visuals.
  • Voiceover to video: upload an audio file, get visuals matched to the narration.
  • Slideshow to video: upload a PDF or slideshow, get a narrated walkthrough.

See the Workflows reference for inputs, options, and the run lifecycle.

Async by default

A finished video takes anywhere from a few seconds to several minutes to produce. Rather than holding a connection open, every workflow and export request returns an ID immediately. You then retrieve the result by polling with that ID or by subscribing to a webhook.
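
The polling option might look like the sketch below. It is illustrative only and assumes nothing about the real response shape: `fetch` and `is_done` are hypothetical callables that you supply, and the exponential backoff and timeout values are arbitrary defaults rather than documented recommendations. The official SDKs include their own polling helpers.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    is_done: Callable[[Dict[str, Any]], bool],
    interval_s: float = 2.0,
    max_interval_s: float = 30.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch() until is_done(result) is true, backing off between attempts."""
    deadline = clock() + timeout_s
    delay = interval_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        # Give up instead of sleeping past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError("resource did not finish before the timeout")
        sleep(delay)
        # Double the wait after each attempt, up to the cap.
        delay = min(delay * 2, max_interval_s)
```

Here `fetch` would wrap whatever status request applies to the returned ID, and `is_done` would inspect the response for a terminal state. A webhook avoids the repeated requests entirely and is usually the better fit for long-running jobs.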

Tools

Beyond workflows, the API exposes standalone tools that generate a single asset with no project or pipeline: images, video clips, voiceovers, sound effects, music, avatar clips, and transforms like upscaling and background removal. Each POST /v1/tools/* call is asynchronous and returns a toolExecutionId to poll. See the REST API reference for every tool and its request shape.

Libraries & SDKs

Official TypeScript (@videogen/sdk) and Python (videogen) clients stay in sync with the API and include helpers for polling, file uploads, and webhook verification. See Libraries & SDKs for install instructions and per-language guides.

Use with AI agents

VideoGen ships an agent skill for Cursor and other AI coding assistants, plus integration guides for the OpenAI Agents SDK, the Vercel AI SDK, and LangChain. See Use with AI agents for setup, tool definitions, and an AGENTS.md snippet you can drop into your repo.

Conventions

IDs

All IDs are prefixed strings (e.g. vg_work_..., vg_tool_..., vg_file_...). Treat them as opaque: store them as-is and do not parse them.

Timestamps

Every numeric timestamp field in the API (expiresAt, occurredAt, createdAt, and any future additions) is an integer number of seconds since the Unix epoch (UTC, no milliseconds). For example, 1745409600 corresponds to 2025-04-23T12:00:00Z.
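
Converting these values in Python needs only the standard library. The helper names `from_api_timestamp` and `to_api_timestamp` below are hypothetical and not part of any SDK; the sketch simply shows the seconds-based, UTC convention described above.

```python
from datetime import datetime, timezone


def from_api_timestamp(seconds: int) -> datetime:
    """Convert integer seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def to_api_timestamp(moment: datetime) -> int:
    """Convert an aware datetime to integer epoch seconds, dropping sub-second precision."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


print(from_api_timestamp(1745409600).isoformat())  # 2025-04-23T12:00:00+00:00
```

Because the values are in seconds, passing them to an API that expects milliseconds (such as JavaScript's Date constructor) requires multiplying by 1000 first.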

Next steps