
Text-to-Video Generation

Last updated February 22, 2026

Generate videos from text prompts. Describe what you want to see and the model creates a video matching your description.

Google's Veo models generate high-quality videos with optional audio.

| Model | Description |
| --- | --- |
| google/veo-3.1-generate-001 | Latest model with audio generation |
| google/veo-3.1-fast-generate-001 | Fast generation |
| google/veo-3.0-generate-001 | Previous generation, 1080p max |
| google/veo-3.0-fast-generate-001 | Faster generation, lower quality |

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Text description of the video to generate |
| aspectRatio | string | No | Aspect ratio ('16:9', '9:16'). Defaults to '16:9' |
| duration | 4 \| 6 \| 8 | No | Video length in seconds. Defaults to 8 |
| resolution | string | No | Resolution ('720p', '1080p'). Defaults to '720p' |
| providerOptions.vertex.generateAudio | boolean | No | Generate audio alongside the video. Required for Veo 3 models |
| providerOptions.vertex.enhancePrompt | boolean | No | Use Gemini to enhance prompts. Defaults to true |
| providerOptions.vertex.negativePrompt | string | No | What to discourage in the generated video |
| providerOptions.vertex.personGeneration | 'dont_allow' \| 'allow_adult' \| 'allow_all' | No | Whether to allow person generation. Defaults to 'allow_adult' |
| providerOptions.vertex.compressionQuality | 'optimized' \| 'lossless' | No | Compression quality. Defaults to 'optimized' |
| providerOptions.vertex.sampleCount | number | No | Number of output videos (1-4) |
| providerOptions.vertex.seed | number | No | Seed for deterministic generation (0-4,294,967,295) |
| providerOptions.vertex.gcsOutputDirectory | string | No | Cloud Storage URI to store the generated videos |
| providerOptions.vertex.referenceImages | array | No | Reference images for style or asset guidance |
| providerOptions.vertex.pollIntervalMs | number | No | How often to check task status. Defaults to 5000 |
| providerOptions.vertex.pollTimeoutMs | number | No | Maximum wait time. Defaults to 600000 (10 minutes) |

veo-text-to-video.ts
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'node:fs';
 
const result = await generateVideo({
  model: 'google/veo-3.1-generate-001',
  prompt: 'A pangolin curled on a mossy stone in a glowing bioluminescent forest',
  aspectRatio: '16:9',
  resolution: '1920x1080',
  providerOptions: {
    vertex: {
      generateAudio: true,
    },
  },
});
 
fs.writeFileSync('output.mp4', result.videos[0].uint8Array);
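
Several Veo parameters in the table above accept only a narrow set of values, so it can help to check them before starting a long-running generation. The following is a minimal sketch, not part of the AI SDK: `validateVeoOptions` and `VeoVertexOptions` are hypothetical names, and the whole-number checks on `sampleCount` and `seed` are an assumption rather than something the table states.

```typescript
// Hypothetical pre-flight check mirroring the ranges in the Veo parameter table.
interface VeoVertexOptions {
  generateAudio?: boolean;
  negativePrompt?: string;
  sampleCount?: number; // documented range: 1-4
  seed?: number; // documented range: 0-4,294,967,295
}

const MAX_SEED = 4294967295;

function validateVeoOptions(duration: number, vertex: VeoVertexOptions): string[] {
  const problems: string[] = [];

  // The table lists duration as 4 | 6 | 8.
  if (![4, 6, 8].includes(duration)) {
    problems.push(`duration must be 4, 6, or 8 seconds (got ${duration})`);
  }

  const { sampleCount, seed } = vertex;

  // Assumption: sampleCount and seed must be whole numbers.
  if (
    sampleCount !== undefined &&
    (!Number.isInteger(sampleCount) || sampleCount < 1 || sampleCount > 4)
  ) {
    problems.push(`sampleCount must be between 1 and 4 (got ${sampleCount})`);
  }
  if (seed !== undefined && (!Number.isInteger(seed) || seed < 0 || seed > MAX_SEED)) {
    problems.push(`seed must be between 0 and ${MAX_SEED} (got ${seed})`);
  }

  return problems;
}
```

An empty array means the values fall inside the documented ranges; any returned strings can be logged or thrown before calling generateVideo.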

KlingAI offers text-to-video with standard and professional quality modes. Audio generation requires v2.6+ models. Duration is 5 or 10 seconds on v2.x models and 3-15 seconds on v3.0.

| Model | Description |
| --- | --- |
| klingai/kling-v3.0-t2v | Multi-shot generation, 15s clips, enhanced consistency |
| klingai/kling-v2.6-t2v | Audio-visual co-generation, cinematic motion |
| klingai/kling-v2.5-turbo-t2v | Faster generation, lower cost |

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Text description of the video to generate. Max 2500 characters. |
| aspectRatio | string | No | Aspect ratio ('16:9', '9:16', '1:1'). Defaults to '16:9'. |
| duration | number | No | Video length in seconds. 5 or 10 for v2.x, 3-15 for v3.0. Defaults to 5. |
| providerOptions.klingai.mode | 'std' \| 'pro' | No | 'std' for standard quality. 'pro' for professional quality. Defaults to 'std'. |
| providerOptions.klingai.negativePrompt | string | No | What to avoid in the video. Max 2500 characters. |
| providerOptions.klingai.sound | 'on' \| 'off' | No | Generate audio. Defaults to 'off'. Requires v2.6+. |
| providerOptions.klingai.cfgScale | number | No | Prompt adherence (0-1). Higher = stricter. Defaults to 0.5. Not supported on v2.x. |
| providerOptions.klingai.voiceList | array | No | Voice IDs for speech. Max 2 voices. Requires v3.0+ with sound: 'on'. |
| providerOptions.klingai.multiShot | boolean | No | Enable multi-shot generation. Requires v3.0+. See KlingAI multi-shot. |
| providerOptions.klingai.watermarkInfo | object | No | Set { enabled: true } to generate watermarked result. |
| providerOptions.klingai.pollIntervalMs | number | No | How often to check task status. Defaults to 5000. |
| providerOptions.klingai.pollTimeoutMs | number | No | Maximum wait time. Defaults to 600000 (10 minutes). |

klingai-text-to-video.ts
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'node:fs';
 
const result = await generateVideo({
  model: 'klingai/kling-v2.6-t2v',
  prompt: 'A chicken flying into the sunset in the style of 90s anime',
  aspectRatio: '16:9',
  duration: 5,
  providerOptions: {
    klingai: {
      mode: 'std',
    },
  },
});
 
fs.writeFileSync('output.mp4', result.videos[0].uint8Array);
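
Because the allowed duration and audio support depend on the KlingAI model version, a small check before the request can catch mismatches early. This is a sketch under stated assumptions: `parseKlingVersion` and `checkKlingRequest` are hypothetical helpers, and the version is read from the model ID on the assumption that IDs follow the `kling-v<major>.<minor>` pattern seen in the model table.

```typescript
// Hypothetical helpers; they encode only the version rules from the KlingAI table.
interface KlingVersion {
  major: number;
  minor: number;
}

// Assumption: model IDs look like 'klingai/kling-v2.6-t2v'.
function parseKlingVersion(model: string): KlingVersion {
  const match = /kling-v(\d+)\.(\d+)/.exec(model);
  if (!match) {
    throw new Error(`Unrecognized KlingAI model ID: ${model}`);
  }
  return { major: Number(match[1]), minor: Number(match[2]) };
}

function isAtLeast(v: KlingVersion, major: number, minor: number): boolean {
  return v.major > major || (v.major === major && v.minor >= minor);
}

function checkKlingRequest(
  model: string,
  duration: number,
  sound: 'on' | 'off' = 'off',
): string[] {
  const version = parseKlingVersion(model);
  const problems: string[] = [];

  if (version.major >= 3) {
    // v3.0 accepts 3-15 seconds.
    if (duration < 3 || duration > 15) {
      problems.push('v3.0 models accept durations of 3-15 seconds');
    }
  } else if (duration !== 5 && duration !== 10) {
    // v2.x accepts exactly 5 or 10 seconds.
    problems.push('v2.x models accept durations of 5 or 10 seconds');
  }

  if (sound === 'on' && !isAtLeast(version, 2, 6)) {
    problems.push("sound: 'on' requires a v2.6+ model");
  }

  return problems;
}
```

For example, `checkKlingRequest('klingai/kling-v2.5-turbo-t2v', 5, 'on')` reports the audio restriction, while the same call with `'klingai/kling-v2.6-t2v'` returns an empty array.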

Control camera movement during video generation.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| providerOptions.klingai.cameraControl.type | string | Yes | Camera movement type. See options below. |
| providerOptions.klingai.cameraControl.config | object | No | Movement configuration. Required when type is 'simple'. |

Camera movement types:

| Type | Description | Config required |
| --- | --- | --- |
| 'simple' | Basic movement with one axis | Yes |
| 'down_back' | Camera descends and moves backward | No |
| 'forward_up' | Camera moves forward and tilts up | No |
| 'right_turn_forward' | Rotate right then move forward | No |
| 'left_turn_forward' | Rotate left then move forward | No |

Simple camera config options (use only one, set others to 0):

| Config | Range | Description |
| --- | --- | --- |
| horizontal | [-10, 10] | Camera translation along x-axis. Negative = left. |
| vertical | [-10, 10] | Camera translation along y-axis. Negative = down. |
| pan | [-10, 10] | Camera rotation around y-axis. Negative = left. |
| tilt | [-10, 10] | Camera rotation around x-axis. Negative = down. |
| roll | [-10, 10] | Camera rotation around z-axis. Negative = counter-clockwise. |
| zoom | [-10, 10] | Focal length change. Negative = narrower FOV. |

camera-control.ts
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'node:fs';
 
const result = await generateVideo({
  model: 'klingai/kling-v2.6-t2v',
  prompt: 'A serene mountain landscape at sunset',
  aspectRatio: '16:9',
  providerOptions: {
    klingai: {
      mode: 'std',
      cameraControl: {
        type: 'simple',
        config: {
          zoom: 5,
          horizontal: 0,
          vertical: 0,
          pan: 0,
          tilt: 0,
          roll: 0,
        },
      },
    },
  },
});
 
fs.writeFileSync('output.mp4', result.videos[0].uint8Array);
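
The "use only one, set others to 0" rule for simple camera configs is easy to get wrong by hand. One way to enforce it is a small builder that takes a single axis and value; `simpleCameraControl` below is a hypothetical helper written for illustration, not an SDK function.

```typescript
// Hypothetical builder for a 'simple' cameraControl object with one active axis.
type CameraAxis = 'horizontal' | 'vertical' | 'pan' | 'tilt' | 'roll' | 'zoom';
type SimpleCameraConfig = Record<CameraAxis, number>;

function simpleCameraControl(
  axis: CameraAxis,
  value: number,
): { type: 'simple'; config: SimpleCameraConfig } {
  // Each axis is documented with the range [-10, 10].
  if (!Number.isFinite(value) || value < -10 || value > 10) {
    throw new RangeError(`${axis} must be within [-10, 10] (got ${value})`);
  }

  // Start with every axis at 0, then set only the requested one.
  const config: SimpleCameraConfig = {
    horizontal: 0,
    vertical: 0,
    pan: 0,
    tilt: 0,
    roll: 0,
    zoom: 0,
  };
  config[axis] = value;

  return { type: 'simple', config };
}
```

The returned object has the same shape as the `cameraControl` value in the example above, so `simpleCameraControl('zoom', 5)` could be passed as `providerOptions.klingai.cameraControl`.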

Generate videos with multiple storyboard shots, each with its own prompt and duration. Requires Kling v3.0+ models.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| providerOptions.klingai.multiShot | boolean | Yes | Set to true to enable multi-shot generation |
| providerOptions.klingai.shotType | string | No | Set to 'customize' for custom shot durations |
| providerOptions.klingai.multiPrompt | array | Yes | Array of shot configurations |
| providerOptions.klingai.multiPrompt[].index | number | Yes | Shot order (starting from 1) |
| providerOptions.klingai.multiPrompt[].prompt | string | Yes | Text description for this shot |
| providerOptions.klingai.multiPrompt[].duration | string | Yes | Duration in seconds for this shot |

multi-shot.ts
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'node:fs';
 
const result = await generateVideo({
  model: 'klingai/kling-v3.0-t2v',
  prompt: '',
  aspectRatio: '16:9',
  duration: 10,
  providerOptions: {
    klingai: {
      mode: 'pro',
      multiShot: true,
      shotType: 'customize',
      multiPrompt: [
        {
          index: 1,
          prompt: 'A sunrise over a calm ocean, warm golden light.',
          duration: '4',
        },
        {
          index: 2,
          prompt: 'A flock of seagulls take flight from the beach.',
          duration: '3',
        },
        {
          index: 3,
          prompt: 'Waves crash against rocky cliffs at sunset.',
          duration: '3',
        },
      ],
      sound: 'on',
    },
  },
});
 
fs.writeFileSync('output.mp4', result.videos[0].uint8Array);
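
In the multi-shot example, the shot indexes start at 1, each shot duration is a string, and the top-level duration (10) equals the sum of the shot durations (4 + 3 + 3). The sketch below builds the array from a simpler list so those details stay consistent. `buildMultiPrompt` and `Shot` are hypothetical names, and matching the total to the sum is inferred from the example rather than stated as a requirement.

```typescript
// Hypothetical helper that derives multiPrompt entries from a plain shot list.
interface Shot {
  prompt: string;
  seconds: number;
}

interface MultiPromptEntry {
  index: number; // shot order, starting from 1
  prompt: string;
  duration: string; // the parameter table lists duration as a string
}

function buildMultiPrompt(shots: Shot[]): {
  multiPrompt: MultiPromptEntry[];
  totalSeconds: number;
} {
  const multiPrompt = shots.map((shot, i) => ({
    index: i + 1,
    prompt: shot.prompt,
    duration: String(shot.seconds),
  }));

  // Assumption: the top-level duration should equal the sum of all shots.
  const totalSeconds = shots.reduce((sum, shot) => sum + shot.seconds, 0);

  return { multiPrompt, totalSeconds };
}
```

With the three shots from the example, `totalSeconds` comes out to 10, which could be passed as the top-level duration alongside `multiPrompt`.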

Wan (by Alibaba) offers text-to-video with native audio generation and prompt enhancement. Set the output size with the resolution parameter (e.g., '1280x720') rather than aspectRatio.

| Model | Description |
| --- | --- |
| alibaba/wan-v2.6-t2v | Latest model with native audio |
| alibaba/wan-v2.5-t2v-preview | Preview model |

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Text description of the video to generate |
| resolution | string | No | v2.6: '1280x720' or '1920x1080'. v2.5: also supports '848x480' |
| duration | number | No | v2.6: 2-15s. v2.5: 5s or 10s only. Defaults to 5 |
| providerOptions.alibaba.promptExtend | boolean | No | Enhance prompt for better quality. Defaults to true |
| providerOptions.alibaba.negativePrompt | string | No | What to avoid in the video. Max 500 characters |
| providerOptions.alibaba.audioUrl | string | No | URL to audio file for audio-video sync (WAV/MP3, 3-30s, max 15MB). v2.5 only |
| providerOptions.alibaba.shotType | 'single' \| 'multi' | No | 'multi' enables multi-shot cinematic narrative. v2.6 only |
| providerOptions.alibaba.watermark | boolean | No | Add watermark to the video. Defaults to false |
| providerOptions.alibaba.pollIntervalMs | number | No | How often to check task status. Defaults to 5000 |
| providerOptions.alibaba.pollTimeoutMs | number | No | Maximum wait time. Defaults to 600000 (10 minutes) |

wan-text-to-video.ts
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'node:fs';
 
const result = await generateVideo({
  model: 'alibaba/wan-v2.6-t2v',
  prompt: 'A chicken flying into the sunset in the style of 90s anime',
  resolution: '1280x720',
  duration: 5,
  providerOptions: {
    alibaba: {
      promptExtend: true,
    },
  },
});
 
fs.writeFileSync('output.mp4', result.videos[0].uint8Array);
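
The two Wan models differ in which resolutions and durations they accept. A lookup table keyed by model ID is one way to express those differences. The sketch below is illustrative only: `WAN_LIMITS` and `isWanRequestSupported` are hypothetical names, and the values are copied from the parameter table above.

```typescript
// Hypothetical lookup of per-model limits, taken from the Wan parameter table.
interface WanLimits {
  resolutions: readonly string[];
  isDurationAllowed: (seconds: number) => boolean;
}

const WAN_LIMITS: Record<string, WanLimits> = {
  'alibaba/wan-v2.6-t2v': {
    resolutions: ['1280x720', '1920x1080'],
    isDurationAllowed: (s) => s >= 2 && s <= 15, // 2-15 seconds
  },
  'alibaba/wan-v2.5-t2v-preview': {
    resolutions: ['848x480', '1280x720', '1920x1080'],
    isDurationAllowed: (s) => s === 5 || s === 10, // 5 or 10 seconds only
  },
};

function isWanRequestSupported(
  model: string,
  resolution: string,
  seconds: number,
): boolean {
  const limits = WAN_LIMITS[model];
  if (!limits) {
    return false; // unknown model ID
  }
  return limits.resolutions.includes(resolution) && limits.isDurationAllowed(seconds);
}
```

Keeping the limits in data rather than in branching code means a new model only needs a new table entry.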

Grok Imagine Video (by xAI) generates videos from text prompts with support for multiple aspect ratios and resolutions. Duration ranges from 1 to 15 seconds.

| Model | Duration | Aspect Ratios | Resolution |
| --- | --- | --- | --- |
| xai/grok-imagine-video | 1-15s | 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3 | 480p, 720p |

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Text description of the video to generate |
| aspectRatio | string | No | Aspect ratio ('16:9', '9:16', '1:1', '4:3', '3:4', '3:2', '2:3'). Defaults to '16:9' |
| duration | number | No | Video length in seconds (1-15) |
| resolution | string | No | Resolution ('854x480' for 480p, '1280x720' for 720p). Defaults to 480p |
| providerOptions.xai.resolution | '480p' \| '720p' | No | Native resolution format. Alternative to standard resolution parameter |
| providerOptions.xai.pollIntervalMs | number | No | How often to check task status. Defaults to 5000 |
| providerOptions.xai.pollTimeoutMs | number | No | Maximum wait time. Defaults to 600000 (10 minutes) |

grok-imagine-video.ts
import { experimental_generateVideo as generateVideo } from 'ai';
import fs from 'node:fs';
 
const result = await generateVideo({
  model: 'xai/grok-imagine-video',
  prompt: 'A chicken flying into the sunset in the style of 90s anime',
  aspectRatio: '16:9',
  duration: 5,
  providerOptions: {
    xai: {
      pollTimeoutMs: 600000,
    },
  },
});
 
fs.writeFileSync('output.mp4', result.videos[0].uint8Array);
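
Grok Imagine Video accepts the resolution either in the standard form ('854x480', '1280x720') or in the native form ('480p', '720p'). The mapping between the two, as listed in the parameter table, can be sketched as a small helper. `grokVideoSettings` is a hypothetical name used only for this illustration.

```typescript
// Mapping from xAI's native resolution names to the standard resolution strings.
const XAI_RESOLUTIONS = {
  '480p': '854x480',
  '720p': '1280x720',
} as const;

type XaiNativeResolution = keyof typeof XAI_RESOLUTIONS;

// Hypothetical helper: returns standard settings for a native resolution name.
function grokVideoSettings(
  native: XaiNativeResolution,
  seconds: number,
): { resolution: string; duration: number } {
  // The table documents a duration of 1-15 seconds.
  if (!Number.isFinite(seconds) || seconds < 1 || seconds > 15) {
    throw new RangeError(`duration must be 1-15 seconds (got ${seconds})`);
  }
  return { resolution: XAI_RESOLUTIONS[native], duration: seconds };
}
```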

Video generation can take several minutes. Set pollTimeoutMs to at least 10 minutes (600000ms) for reliable operation. Generated video URLs are ephemeral and should be downloaded promptly.
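
Each provider section above lists pollIntervalMs and pollTimeoutMs under its own providerOptions key. As a rough sketch, the helper below builds that fragment from a timeout in minutes and reports how many status checks fit in the window. `pollingOptions` and `maxStatusChecks` are hypothetical names, and the check count is only an upper-bound estimate that assumes each status check returns immediately.

```typescript
// Provider keys that document pollIntervalMs and pollTimeoutMs above.
type VideoProvider = 'vertex' | 'klingai' | 'alibaba' | 'xai';

interface PollSettings {
  pollIntervalMs: number;
  pollTimeoutMs: number;
}

// Hypothetical helper: builds a providerOptions fragment for polling.
function pollingOptions(
  provider: VideoProvider,
  timeoutMinutes: number,
  intervalMs: number = 5000, // documented default interval
): { providerOptions: Record<string, PollSettings>; maxStatusChecks: number } {
  if (timeoutMinutes <= 0 || intervalMs <= 0) {
    throw new RangeError('timeout and interval must be positive');
  }

  const pollTimeoutMs = timeoutMinutes * 60000;

  return {
    providerOptions: {
      [provider]: { pollIntervalMs: intervalMs, pollTimeoutMs },
    },
    // Estimate only: ignores the time each status request takes.
    maxStatusChecks: Math.ceil(pollTimeoutMs / intervalMs),
  };
}
```

With the documented defaults, `pollingOptions('xai', 10)` yields a pollTimeoutMs of 600000 and roughly 120 status checks before the wait gives up.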


