Seedance 2.0 is officially launched!
Since the days when we could only tell stories with text and first/last frames, we've dreamed of building a video model that truly understands your expression. Today, it's finally here!
Seedance 2.0 now supports four input modalities: image, video, audio, and text, offering richer expression and more controllable generation.
You can set the visual style with a single image, specify character movements and camera changes with a video, and set the rhythm and atmosphere with a few seconds of audio... Combined with prompts, the creation process becomes more natural, efficient, and truly like being a "director".
Precise Image Reference Reproduction
Accurate reproduction of composition, character details
Reference Video Replication
Supports replication of camera language, complex action rhythms, and creative effects
Smooth Extension & Continuity
Generate continuous shots from prompts: not just generating, but "keep filming"
Enhanced Editing
Supports character replacement, removal, and addition in existing videos
Video creation has never been just about "generation"; it's about controlling expression. 2.0 is not just multimodal; it's a truly controllable way to create.
Seedance 2.0: multimodal creation starts here. Dare to imagine; leave the rest to it.
1. Parameter Overview
| Core Dimension | Seedance 2.0 |
|---|---|
| Image Input | Up to 9 images |
| Video Input | Up to 3 videos, total duration no more than 15s (reference videos cost a bit more) |
| Audio Input | Supports MP3 upload, up to 3 files, total duration no more than 15s |
| Text Input | Natural language |
| Generation Duration | Up to 15s, freely choose between 4-15s |
| Audio Output | Built-in sound effects/background music |
<strong>Interaction Limit:</strong> The current maximum for mixed inputs is <strong>12 files</strong>. We recommend prioritizing materials that have the greatest impact on visuals or rhythm, and allocating file counts wisely across modalities.
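The limits in the table above combine in a non-obvious way: each modality has its own cap, but there is also a shared 12-file ceiling across all inputs. The sketch below is purely illustrative (`check_inputs` is a hypothetical helper, not part of any official Seedance API); it just restates the documented limits as one validation pass.

```python
# Illustrative check of Seedance 2.0's documented input limits (hypothetical
# helper, not an official API): up to 9 images, up to 3 videos totaling <=15s,
# up to 3 MP3 audio files totaling <=15s, and at most 12 files in total.

def check_inputs(image_count, video_durations, audio_durations):
    """video_durations/audio_durations are lists of clip lengths in seconds.
    Returns a list of violated limits; an empty list means the mix is valid."""
    problems = []
    if image_count > 9:
        problems.append("more than 9 images")
    if len(video_durations) > 3:
        problems.append("more than 3 videos")
    if sum(video_durations) > 15:
        problems.append("videos exceed 15s total")
    if len(audio_durations) > 3:
        problems.append("more than 3 audio files")
    if sum(audio_durations) > 15:
        problems.append("audio exceeds 15s total")
    if image_count + len(video_durations) + len(audio_durations) > 12:
        problems.append("more than 12 files combined")
    return problems

# 9 images + 2 videos + 2 audio clips is 13 files: each modality is within
# its own cap, but the shared 12-file ceiling is exceeded.
print(check_inputs(9, [7, 6], [5, 5]))
```

Note how a mix can pass every per-modality cap yet still exceed the combined limit, which is why the manual recommends allocating file counts across modalities deliberately.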
2. Interaction Methods
Note: Seedance 2.0 supports "First/Last Frame" and "Universal Reference" entry points. Smart multi-frame and subject reference are not selectable. If you only upload a first frame image + prompt, use the First/Last Frame entry; for multimodal (image, video, audio, text) combined input, enter through the Universal Reference entry.
The current interaction method uses <code>@material_name</code> to specify the purpose of each image, video, and audio, for example:
- @Image1 as first frame
- @Video1 reference camera language
- @Audio1 for background music
Main Interface

Entry: Seedance 2.0 - Universal Reference / First-Last Frame

Open local file dialog

Select files, add to input box
Universal Reference Mode - Method 1: Type "@" to invoke reference

Type "@"

Select reference, drops into input box

Enter prompt
Universal Reference Mode - Method 2: Click the "@" tool to invoke reference

Click "@"

Select reference, drops into input box

Enter prompt
After uploading materials, images, videos, and audio all support hover preview:



Below are some usage examples and creative approaches for different scenarios to help you better understand Seedance 2.0's improvements in generation quality, control capability, and creative expression. If you don't know where to start, check out these examples for inspiration!
Seedance 2.0 Capabilities / Improvement Preview
1. Significantly Enhanced Basic Capabilities: More Stable, Smoother, More Realistic!
Beyond multimodality, Seedance 2.0 is significantly enhanced at the foundational level: <strong>more realistic physics</strong>, <strong>more natural and fluid motion</strong>, <strong>more precise instruction understanding</strong>, and <strong>more stable style consistency</strong>. It can reliably handle complex actions, continuous motion, and other challenging generation tasks, making overall video output more realistic and smooth. This is a comprehensive evolution of core capabilities!
A girl elegantly hanging clothes to dry, after finishing she picks up another piece from the bucket and vigorously shakes it out.
First frame
The character in the painting has a guilty expression, eyes darting left and right peeking out of the frame, quickly reaches out to grab a cola and takes a sip, showing a satisfied expression. Then footsteps are heard, the character hurriedly puts the cola back. A cowboy picks up the cola and walks away. Finally the camera pushes forward as the screen fades to black with only a top-lit cola can, with artistic subtitles at the bottom: "YiKou Cola, a must-try!"
First frame
Camera pulls back slightly (revealing the full street view) and follows the heroine as she walks. Wind blows her skirt hem as she walks through a 19th-century London street. A steam-powered car drives by quickly from the right side, its wind blowing up her skirt as she frantically presses it down with both hands in shock. Background sounds include footsteps, crowd noise, and vehicle sounds.
First frame
Camera follows a man in black fleeing rapidly with a crowd chasing behind. Camera switches to a side tracking shot as the panicked character knocks over a fruit stand, gets up and continues running, with sounds of the chaotic crowd.
First frame
2. Comprehensive Multimodal Upgrade: Video Creation Enters the "Free Combination" Era!
2.1 Multimodal Introduction
Supports uploading text, images, videos, and audio, all of which can be used as source or reference materials. You can reference any content's actions, effects, style, camera movement, characters, scenes, and sounds. As long as your prompt is clear, the model can understand it.
Seedance 2.0 = Multimodal Reference (reference anything) + Strong Creative Generation + Precise Instruction Response (excellent comprehension)
Just describe the visuals and actions you want in natural language, and make clear whether each material is a reference or an edit target. When using multiple materials, double-check that each @reference is clearly labeled so you don't mix up images, videos, and characters!
2.2 Special Usage Methods (No Limits, Just Suggestions)
Have first/last frame images? Also want to reference video actions?
Write clearly in the prompt, e.g.: "@Image1 as first frame, reference @Video1's fighting actions"
Want to extend an existing video?
Specify the extension duration, e.g. "Extend @Video1 by 5s". Note: The selected generation duration should be the duration of the "new portion" (e.g., extend 5s, also select 5s generation length)
Want to merge multiple videos?
Explain the composition logic in the prompt, e.g.: "I want to add a scene between @Video1 and @Video2, content is xxx"
No audio materials?
You can directly reference audio from a video
Want to generate continuous actions?
Add continuity descriptions in the prompt, e.g.: "Character transitions directly from jumping to rolling, maintaining fluid and coherent motion" @Image1@Image2@Image3...
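All of the tips above reduce to the same @material_name convention: name a material, then state its purpose. A tiny string-building sketch may make that pattern concrete. This is purely illustrative (`build_prompt` is a made-up helper; in practice prompts are typed directly into the input box, not built in code):

```python
# Hypothetical helper that assembles a prompt from the @-reference convention
# described above; each material name is paired with its stated purpose.

def build_prompt(instruction, roles):
    """roles maps material names to purposes, e.g. {"Image1": "as first frame"}."""
    clauses = ", ".join(f"@{name} {purpose}" for name, purpose in roles.items())
    return f"{instruction}. {clauses}"

print(build_prompt(
    "Generate a 10s fight scene",
    {"Image1": "as first frame",
     "Video1": "reference the fighting actions",
     "Audio1": "for background music"},
))
# -> Generate a 10s fight scene. @Image1 as first frame, @Video1 reference the fighting actions, @Audio1 for background music
```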
2.3 Those Long-Standing Video Challenges Can Now Actually Be Solved!
Video creation always has its pain points: faces changing between shots, actions not matching, unnatural video extensions, rhythm going off during edits... This multimodal upgrade tackles all these "persistent headaches" at once. Below are specific use cases.
2.3.1 Comprehensive Consistency Improvement
You may have encountered these frustrations: characters looking different from shot to shot, product details getting lost, small text becoming blurry, scene jumps, inconsistent camera styles... These common consistency issues in creation can now all be resolved in 2.0. From faces to clothing to font details, overall consistency is more stable and accurate.
Man @Image1 walking tiredly through the corridor after work, his pace slowing, finally stopping at the front door. Close-up on face, the man takes a deep breath, adjusts his emotions, puts away the negativity, becomes relaxed. Then close-up of finding keys, inserting into the lock. After entering the home, his little daughter and a pet dog joyfully run over to greet and hug him. The interior is very warm and cozy, with natural dialogue throughout.
Character reference
Replace the girl in @Video1 with a Chinese opera actress (Hua Dan), on an exquisite stage. Reference @Video1's camera movements and transition effects, using camera angles to match the character's actions, achieving ultimate stage aesthetics and enhanced visual impact.
Using the reference image character's appearance, generate a teaser trailer for a period time-travel drama. 0-3 seconds: The male lead with the appearance from reference image 1 holds up a basketball, looking up at the camera. Saying "I just wanted a drink, am I really about to time travel?..." ...
Character reference
Reference all transitions and camera movements from @Video1, one continuous shot. Starting with a chess board, camera pans left to reveal yellow sand on the floor, camera moves up to a beach...
0-2 seconds: Quick four-panel flash cuts, red, pink, purple, and leopard print bow ties shown in sequence, close-up on satin sheen and "chΓ©ri" brand text... (Korean voiceover ad)
Product image
Create a commercial-style showcase of the bag in @Image2, with the side view referencing @Image1, the surface texture referencing @Image3. All details of the bag should be showcased, with grand and majestic background music.
Side reference
Bag main body
Texture reference
Use @Image1 as the first frame, first-person perspective, reference @Video1's camera effects. Upper scene references @Image2, left scene references @Image3, right scene references @Image4.
First frame
Upper scene
Left scene
Right scene
2.3.2 Precise Replication of Advanced, Controllable Camera Movement and Actions
Previously, to make a model mimic movie-style blocking, camera work, or complex actions, you either had to write tons of detailed prompts or simply couldn't do it. Now, just upload a reference video and you're good to go.
Reference @Image1's male character, he's in the elevator from @Image2. Fully reference @Video1's camera effects and the protagonist's facial expressions. Hitchcock zoom during the panic, then several orbiting shots showing the elevator interior...
Character
Elevator scene
Scene reference
Reference @Image1's male character, he's in the corridor from @Image2. Fully reference @Video1's camera effects and the protagonist's facial expressions. Camera follows the protagonist running around corners in @Image2...
Character
Corridor
Long hallway
Fork in the road
Scene
@Image1's tablet as the main subject, camera referencing @Video1, pushing into a screen close-up, camera rotates as the tablet flips to show its full appearance. Data streams on screen keep changing, surroundings gradually transform into a sci-fi data space.
Tablet
@Image1's actress as the main subject, reference @Video1's camera techniques for rhythmic push-pull-pan movements. The actress's movements also reference the dance moves of the woman in @Video1, performing energetically on stage.
Actress
Reference @Image1@Image2 spear-wielding character, @Image3@Image4 dual-blade character, imitate @Video1's actions, fighting in the maple leaf forest from @Image5.
Spear character 1
Spear character 2
Dual-blade character 1
Dual-blade character 2
Maple leaf forest
Reference Video1's character actions, reference Video2's orbiting camera language. Generate a fight scene between Character 1 and Character 2. The fight takes place under a starry night, with white dust rising during the battle. The fight scene is spectacular and the atmosphere is very tense.
Character 1
Character 2
Reference Video1's camera work and shot transition rhythm, replicate using Image1's red supercar.
Red supercar
2.3.3 Precise Replication of Creative Templates and Complex Effects
More than just generating images and writing stories, Seedance 2.0 also supports "copying from reference": creative transitions, finished ads, movie clips, complex edits. As long as you have reference images or videos, the model can identify action rhythms, camera language, and visual structure, and precisely replicate them. Don't worry if you don't know professional terminology; just describe what you want to reference, and the model will generate a high-quality version for you. Be bold! It can really do it.
Replace the character in @Video1 with @Image1, @Image1 as first frame. Character puts on virtual sci-fi glasses, reference @Video1's camera work, close orbiting shots, transitioning from third-person to the character's subjective perspective, traveling through the AI virtual glasses...




Reference the model's facial features from the first image. The model wears outfits from reference images 2-6 and approaches the camera, striking playful, cool, cute, surprised, and stylish poses...
Model
Outfit 1
Outfit 2
Outfit 3
Outfit 4
Outfit 5
Reference the video's advertising concept, use the provided down jacket images, with the following ad copy: "This is goose down, this is the warm swan, this is the wearable polar swan-down jacket. Stay warm for the new year, live warm." Generate a new down jacket ad video.
Down jacket
Goose down
Swan
Black and white ink wash style, @Image1's character references @Video1's effects and actions, performing a segment of ink-wash style Tai Chi kung fu.
Character
Replace @Video1's opening character with @Image1, fully reference Video1's effects and actions. Rose petals grow from the flower stamen in hand, cracks extend upward on the face...
Character 1
Character 2
Starting from @Image1's ceiling, reference @Video1's puzzle-shatter effect for transition. Replace "BELIEVE" text with "Seedance", reference @Image2's font.
Ceiling
Font reference
Opening with a black screen, reference Video1's particle effects and texture. Golden gilded sand drifts from the left side of the frame and covers to the right, reference @Video1's particle scatter effect. @Image1's text gradually appears in the center of the frame.
Text
@Image1's character references the actions and expression changes from @Video1, showcasing the abstract behavior of eating instant noodles.
Character
2.3.4 Model Creativity & Storyline Completion
Animate @Image1 in left-to-right, top-to-bottom order as a comic performance, keeping character dialogue consistent with the image. Add special sound effects for panel transitions and key plot moments. Overall style should be humorous and witty; performance style references @Video1.
Comic image
Reference @Image1's documentary-style storyboard, referencing @Image1's shot divisions, framing, camera movements, visuals, and copy. Create a 15s healing-style opening about "The Four Seasons of Childhood".
Storyboard
Reference Video1's audio, using Images 1-5 as inspiration, create an emotion-driven video. Background music references @Video1.





2.3.5 Video Extension
Extend 15s video, reference @Image1, @Image2's donkey-riding-motorcycle character. Add a creative ad segment: Scene 1: Fixed side camera, donkey rides motorcycle out of the barn... Scene 3: ...ad slogan "Inspire Creativity, Enrich Life"
Donkey look 1
Donkey look 2
Extend video by 6s, electric guitar music kicks in, "JUST DO IT" ad text appears mid-screen then gradually fades, camera moves up to the ceiling...
Athletic wear
Logo
Extend @Video1 by 15 seconds. 1-5s: Light and shadow slowly slide through blinds across the wooden table and cup... 11-15s: Text gradually appears: "Lucky Coffee", "Breakfast", "AM 7:00-10:00".
Extend forward by 10s. In warm afternoon light, the camera starts from a row of awnings fluttering in the breeze at the street corner, slowly panning down to a few small daisies peeking out at the base of the wall...
2.3.6 More Accurate Audio, More Realistic Sound
Fixed camera, central fisheye lens looking down through a circular opening. Reference Video1's fisheye lens, have the horse from @Video2 look at the fisheye lens, reference @Video1's speaking actions, background BGM references audio from @Video3.
Based on the provided office building promotional photos, generate a 15-second cinematic realistic-style real estate documentary in 2.35:1 widescreen, 24fps. The narrator's voice tone references @Video1...



A roasting dialogue in a "Cat & Dog Roast Room", with rich emotions fitting a stand-up performance: Meow-chan (cat host): "Who understands this, family?...", Wangzai (dog host): "You have the nerve to talk about me?..."
Scene reference
The opening music of the classic Yu Opera segment "The Case of Chen Shimei" begins. The black-robed Judge Bao on the left points at the red-robed Chen Shimei on the right, singing Yu Opera through gritted teeth...
Scene reference
Generate a 15-second music video. Keywords: steady composition / gentle push-pull / low-angle heroic feel / documentary but premium... Sunset side-backlight volumetric rays through dust particles, cinematic composition, real film grain, gentle breeze moving coat hems.
Scene reference
The girl in the center wearing a hat gently sings "I'm so proud of my family!"... Latin music starts in the background... The whole family forms a circle, dancing to lively music, skirts swirling.
Scene reference
Fixed camera. The standing muscular man (captain) clenches his fist and says in Spanish: "Raid in three minutes!"... Everyone stands at attention, completing tactical hand signals amid the sound of equipment clashing.
Scene reference
0-3s: Opening alarm clock rings... 3-10s: Quick pan shot, cutting to the opposite side with a close-up of the man's face. The man reluctantly wakes the girl, voice tone and timbre reference @Video1... 12-15s: Cut to full body of the male lead, he sighs: "I really can't do anything about you!"
Girl
Man
@Image1's monkey walks to the bubble tea shop counter... The monkey orders from the server in a Sichuan accent: "Hey sis, do you have 'Farewell My Concubine'?"
Monkey
Bichon server
Bubble tea shop
In a popular science style and voice, narrate the content from Image 1, which includes the story of Sun Wukong borrowing the Banana Fan from Princess Iron Fan to cross the Flaming Mountains...
Journey to the West illustration
2.3.7 Stronger Shot Continuity (One-Take)
@Image1@Image2@Image3@Image4@Image5, one-take tracking shot, following a runner from the street up stairs, through a corridor, onto a rooftop, and finally overlooking the city.





Starting with @Image1 as the first frame, the view zooms out to an airplane window. Clouds slowly drift into frame, one of them adorned with colorful candy beans... gradually transforming into @Image2's ice cream...
Window
Ice cream
Character
Spy thriller style, @Image1 as the opening frame. Camera follows a female agent in a red coat walking forward... No cuts throughout, one continuous take.
First frame
Corner building
Masked girl
Mansion
From @Image1's exterior shot, first-person perspective quick push into the cabin interior close-up. A little deer @Image2 and a sheep @Image3 are drinking tea and chatting by the fireplace. Camera pushes in for a close-up of the teacup, style referencing @Image4.
Exterior
Deer
Sheep
Teacup
@Image1@Image2@Image3@Image4@Image5, first-person one-take thrilling roller coaster shot, with the coaster going faster and faster.





2.3.8 Highly Usable Video Editing
Sometimes you already have a video and don't want to find new images or redo everything from scratch; you just want to adjust a small segment of action, extend a few seconds, or make a character's performance closer to your vision. Now you can use an existing video as input and make targeted modifications to specific segments, actions, or rhythms without changing anything else.
Subvert @Video1's storyline. The man's eyes shift instantly from tender to cold and ruthless. In a moment when the heroine is completely off guard, he forcefully pushes her off the bridge...
Subvert @Video1's entire storyline. 0-3s: Man in suit sitting at a bar... 6-9s: Suddenly the suited man pulls out an absurdly large snack gift package from under the table...
Replace the female lead singer in Video1 with Image1's male lead singer. Actions completely imitate the original video, no cuts, band performance music.
Male lead singer
Change the woman's hairstyle in Video1 to long red hair. Image1's great white shark slowly surfaces halfway, behind her.
Great white shark
Video1 camera pans right, the fried chicken shop owner busily hands fried chicken to customers in line... Close-up of the owner holding a paper bag printed with Image1's logo...
Paper bag logo
2.3.9 Music Beat Sync
The girl in the poster keeps changing outfits, clothing style referencing @Image1@Image2, holding @Image3's bag, video rhythm references @Video.




Images @Image1-7 sync to @Video's keyframe positions and overall rhythm for beat matching. Characters in the frames are more dynamic...






@Image1-6 landscape scenes, reference @Video's visual rhythm, transitions match scene style and music rhythm for beat sync.






2.3.10 Better Emotional Performance
@Image1's woman walks to the mirror, looks at herself. Pose references @Image2. After a moment of contemplation, she suddenly starts screaming in breakdown. The grabbing motion and breakdown screaming emotions and expressions fully reference @Video1.
Woman
Pose reference
This is a range hood ad. @Image1 as the opening frame, woman elegantly cooking with no smoke. Camera quickly pans right to @Image2 man sweating profusely, face red, cooking...
Woman cooking
Man cooking
Range hood
@Image1 as the first frame, camera rotates and pushes closer. Character suddenly looks up, facial appearance references @Image2. Starts roaring loudly, excited with some comedic flair, referencing @Image3's expression. Then the character transforms into a bear, referencing @Image4.
First frame
Face reference
Expression reference
Bear reference
A Final Word
Seedance 2.0's multimodal capabilities are constantly evolving. We will continue to update features and support more input combinations. We hope this user manual helps you unleash your creativity more freely!
If you encounter bugs, have usage suggestions, or need specific scenarios, feel free to leave a message or DM us! We'll keep optimizing to make Jimeng a truly enjoyable and convenient productivity tool for you.
Frequently Asked Questions (FAQ)
What input modalities does Seedance 2.0 support?
Seedance 2.0 supports four input modalities: images (up to 9), videos (up to 3, total duration ≤15s), audio (MP3, up to 3, total duration ≤15s), and text (natural language). The combined input limit is 12 files.
How long of a video can Seedance 2.0 generate?
It can generate videos up to 15 seconds, with free selection between 4-15 seconds. It also supports video extension, allowing you to continue generating from an existing video.
How do I use the multimodal reference feature?
In Universal Reference mode, use "@material_name" to specify the purpose of each image, video, and audio. For example: @Image1 as first frame, @Video1 for camera reference, @Audio1 for background music. You can type "@" directly in the input box or click the "@" button in the toolbar.
What are Seedance 2.0's core capability improvements?
Core capabilities include: multimodal reference (reference anything), precise camera and action replication, creative effect replication, video extension and continuity, video editing (character replacement/removal/addition), music beat sync, one-take continuity, emotional performance, and voice generation. Physics are more realistic, motion is more natural and fluid, instruction understanding is more precise, and style consistency is more stable.
How do I extend an existing video?
After uploading a video, specify the extension duration in the prompt, e.g. "Extend @Video1 by 5s". Note: The generation duration should be set to the duration of the "new portion": e.g., if extending by 5s, also select 5s generation length. Both forward and backward extension are supported.
What's the difference between First/Last Frame and Universal Reference?
If you only upload a first frame image + prompt, you can use the First/Last Frame entry for a simpler workflow. For multimodal (image, video, audio, text) combined input, you need to use the Universal Reference entry. Universal Reference mode is more powerful and supports more complex creative needs.