Introduction

Programmatically generate images, videos, voiceovers, sound effects, and avatar clips.

The VideoGen API lets you integrate AI media generation into your product, pipeline, or workflow. Generate assets with a single API call and retrieve them when they’re ready.

Base URL

https://api.videogen.io

What you can build

| Capability | Endpoint |
| --- | --- |
| Generate images (from text or image) | POST /v1/tools/generate-image |
| Generate video clips (from text, image, or video) | POST /v1/tools/generate-video-clip |
| Convert text to speech with 100+ voices | POST /v1/tools/text-to-speech |
| Generate sound effects from a prompt | POST /v1/tools/generate-sound-effect |
| Create avatar videos with a presenter | POST /v1/tools/generate-avatar |
| Vectorize images | POST /v1/tools/vectorize-image |
| Remove image backgrounds | POST /v1/tools/remove-image-background |
| Remove video backgrounds | POST /v1/tools/remove-video-background |
| Upscale images | POST /v1/tools/upscale-image |
| Upscale video | POST /v1/tools/upscale-video |
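
Every endpoint listed here lives under the same base URL and the same /v1/tools/ prefix, so a client can build the full request URL from the tool name alone. The following is a minimal sketch of that idea, not an official client: the names BASE_URL, TOOLS, and tool_url are hypothetical helpers introduced for illustration. Authentication and request bodies are not described in this section, so the sketch stops at constructing the URL and sends nothing.

```python
# Hypothetical helper: builds full endpoint URLs from the base URL and
# the tool names listed in this section. It performs no network calls.
BASE_URL = "https://api.videogen.io"

# Tool names taken from the endpoint paths in this section.
TOOLS = (
    "generate-image",
    "generate-video-clip",
    "text-to-speech",
    "generate-sound-effect",
    "generate-avatar",
    "vectorize-image",
    "remove-image-background",
    "remove-video-background",
    "upscale-image",
    "upscale-video",
)


def tool_url(tool: str) -> str:
    """Return the full URL to POST to for the given tool name."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return f"{BASE_URL}/v1/tools/{tool}"


if __name__ == "__main__":
    for name in TOOLS:
        print("POST", tool_url(name))
```

Rejecting unknown tool names up front catches a mistyped name before any request is made.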

Conventions

Timestamps

Every numeric timestamp field in the API (expiresAt, occurredAt, createdAt, and any future additions) is an integer number of whole seconds since the Unix epoch (UTC), not milliseconds. For example, 1745409600 corresponds to 2025-04-23T12:00:00Z.
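
As a minimal sketch of this convention, the snippet below converts between an API timestamp and a timezone-aware datetime using only the Python standard library. The function names from_api_timestamp and to_api_timestamp are hypothetical and not part of the API.

```python
from datetime import datetime, timezone


def from_api_timestamp(seconds: int) -> datetime:
    """Convert whole seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def to_api_timestamp(moment: datetime) -> int:
    """Convert an aware datetime to whole seconds since the Unix epoch."""
    if moment.tzinfo is None:
        # Naive datetimes would be read as local time, which is ambiguous.
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


if __name__ == "__main__":
    moment = from_api_timestamp(1745409600)
    print(moment.isoformat())        # 2025-04-23T12:00:00+00:00
    print(to_api_timestamp(moment))  # 1745409600
```

Clients whose date types count milliseconds, such as JavaScript's Date, need to multiply the value by 1000 before constructing a date.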

Next steps