How Seedance modes differ

Last updated · April 19, 2026

The eight modes at a glance

Text → Video (T2V). Prompt-only. The default. Use when you have a shot description and no reference.
Image → Video (I2V). Upload a still; Seedance animates it. Great for product hero shots.
First-Last Frame. Two stills; Seedance interpolates a coherent motion arc. Use for precise begin/end framing.
Multi-Frame. Up to four reference stills used as motion anchors. Best for choreographed cuts.
Extend Video. Supply an existing Seedance clip; Seedance generates a continuation. Good for 8-second clips that need to run to ten seconds.
Mimic Motion. Reference video + a still; the still moves with the reference's motion. The TikTok-dance-transfer mode.
Camera Control. T2V with explicit camera direction (dolly, pan, orbit, push-in).
Reference-to-Video. Upload a style reference; Seedance applies its look to a fresh prompt. Useful for matching a brand aesthetic.
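
The reference inputs each mode takes can be summarized as a small lookup table. The sketch below is illustrative only: the mode keys, the ModeInputs type, and check_inputs are hypothetical names, not part of any Seedance API. Two details are assumptions: the one-still minimum for Multi-Frame (the list above gives only the maximum of four), and counting the style reference as its own input type, since the list does not say whether it is a still or a clip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeInputs:
    """Reference assets a mode takes, read off the mode list above."""
    min_stills: int = 0
    max_stills: int = 0
    videos: int = 0
    style_refs: int = 0


# Hypothetical keys; the counts mirror the eight mode descriptions.
MODE_INPUTS = {
    "text_to_video": ModeInputs(),                              # prompt only
    "image_to_video": ModeInputs(min_stills=1, max_stills=1),
    "first_last_frame": ModeInputs(min_stills=2, max_stills=2),
    "multi_frame": ModeInputs(min_stills=1, max_stills=4),      # minimum of 1 is assumed
    "extend_video": ModeInputs(videos=1),
    "mimic_motion": ModeInputs(min_stills=1, max_stills=1, videos=1),
    "camera_control": ModeInputs(),                             # T2V plus camera direction
    "reference_to_video": ModeInputs(style_refs=1),
}


def check_inputs(mode, stills=0, videos=0, style_refs=0):
    """Return a list of problems; an empty list means the inputs fit the mode."""
    spec = MODE_INPUTS[mode]
    problems = []
    if not (spec.min_stills <= stills <= spec.max_stills):
        problems.append(
            f"{mode} takes {spec.min_stills} to {spec.max_stills} stills, got {stills}"
        )
    if videos != spec.videos:
        problems.append(f"{mode} takes {spec.videos} videos, got {videos}")
    if style_refs != spec.style_refs:
        problems.append(f"{mode} takes {spec.style_refs} style references, got {style_refs}")
    return problems
```

For example, check_inputs("first_last_frame", stills=2) returns an empty list, while check_inputs("multi_frame", stills=5) reports one problem.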

Which mode for which job

UGC-style ad hook: T2V with a handheld prompt, 9:16, 720p Fast for iteration, 1080p Pro for the hero.
Product on a surface: I2V with a well-lit still of the product.
Brand-consistent B-roll: Reference-to-Video using a prior output as the style anchor.
Before/after transformation: First-Last Frame.
Lip-sync dance or workout transfer: Mimic Motion. Note that no audio is generated; use /app/voiceover for the audio track.
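
The job-to-mode pairings above can be written as a plain lookup. This is a minimal sketch for planning purposes: the job keys, pick_mode, and ugc_hook_settings are hypothetical names, and the settings dictionary is not a real request format. The fallback to T2V reflects its description as the default mode.

```python
# Hypothetical job keys mapped to the modes recommended above.
JOB_TO_MODE = {
    "ugc_ad_hook": "T2V",
    "product_on_surface": "I2V",
    "brand_b_roll": "Reference-to-Video",
    "before_after": "First-Last Frame",
    "motion_transfer": "Mimic Motion",
}


def pick_mode(job: str) -> str:
    """Return the recommended mode; unknown jobs fall back to T2V, the default."""
    return JOB_TO_MODE.get(job, "T2V")


def ugc_hook_settings(final: bool) -> dict:
    """Settings for a UGC-style hook: 720p Fast while iterating, 1080p Pro for the hero."""
    return {
        "mode": "T2V",
        "aspect_ratio": "9:16",
        "resolution": "1080p" if final else "720p",
        "tier": "Pro" if final else "Fast",
    }
```

Calling ugc_hook_settings(final=False) gives the cheap iteration settings; switch to final=True only once the prompt is locked.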

Resolution and duration

720p is the sweet spot for social-format video: it renders faster, costs less, and TikTok, Reels, and Shorts re-encode uploads anyway. 1080p is worth it for YouTube thumbnails, stills pulled from the clip, or anything going to a larger screen.

Durations are 4s, 6s, or 8s. The per-second price is the same at every length, so a longer clip costs more only because it contains more seconds. Pick the shortest length that tells the story.
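
The arithmetic is linear, which a short sketch makes concrete. The function name and the rate values are hypothetical; no actual prices appear on this page, so the rate is passed in rather than hard-coded.

```python
ALLOWED_DURATIONS_S = (4, 6, 8)  # the three clip lengths listed above


def clip_cost(duration_s: int, rate_per_second: float) -> float:
    """Cost of one clip at a flat per-second rate (the rate itself is an assumed input)."""
    if duration_s not in ALLOWED_DURATIONS_S:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS_S}, got {duration_s}")
    return duration_s * rate_per_second


# With any fixed rate, an 8s clip costs exactly twice a 4s clip.
rate = 0.5  # placeholder units per second, for illustration only
costs = {d: clip_cost(d, rate) for d in ALLOWED_DURATIONS_S}
```

With the placeholder rate of 0.5, costs comes out as 2.0, 3.0, and 4.0 for the 4s, 6s, and 8s clips, so dropping from 8s to 4s halves the spend per take.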

Reference-image tips

Use clean, well-lit references; Seedance will faithfully reproduce noise, compression artifacts, and weird color casts.
Keep subject placement consistent with the prompt: if you upload a centered product shot and prompt for a 'low-angle pan', the still and the prompt disagree and you'll fight the model.
For Mimic Motion, the reference video should be 3-8 seconds; longer references confuse the motion extractor.
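
A pre-upload check for the Mimic Motion reference length could look like the sketch below. The function name is hypothetical, and treating both ends of the 3 to 8 second range as inclusive is an assumption. The duration is supplied by the caller; how it is measured is left out.

```python
MIMIC_REF_MIN_S = 3.0  # lower bound from the tip above
MIMIC_REF_MAX_S = 8.0  # upper bound from the tip above


def check_mimic_reference(duration_s: float) -> str:
    """Classify a reference video length as 'ok', 'too_short', or 'too_long'."""
    if duration_s < MIMIC_REF_MIN_S:
        return "too_short"
    if duration_s > MIMIC_REF_MAX_S:
        return "too_long"
    return "ok"
```

A 12-second reference returns "too_long"; trimming it to the 3 to 8 seconds that contain the motion you want is the fix suggested by the tip.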