Lip Sync

Published 31/08/2025


About Lip Sync

Lip Sync is a workspace within Workroom where you bring together video and audio to create a final video of a talking character.

You can use a video created in Cinema Motion or upload your own, then pair it with a voiceover from Voice Studio or an audio file you upload manually. The system automatically synchronizes the lip movements with the voice.

How It Works

  1. Select or Upload a Video.
    • Select a video previously created in Cinema Motion.
    • Or, click + Add video to upload your own file (.mp4, .mov; up to 500MB).
  2. Select or Upload a Voiceover.
    • Use a voiceover from Voice Studio.
    • Or, click + Add Voice (in the upper-right corner of the block) to upload your audio track (.wav, .mp3; up to 500MB).
  3. Choose a Synchronization Model.
    • The default is Workroom — a versatile and fast model.
    • For more detailed articulation, you can select Sync.io — an enhanced model for complex synchronization and realism (coming soon).
  4. Click Generate.
    • After selecting your video, voiceover, and model, click Generate.
    • Within 30–90 seconds, your final video is ready. It is saved in History and My Assets, where you can preview and download it.
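
The file limits in steps 1 and 2 can be checked locally before uploading. The following is a minimal sketch, not a Workroom API: the helper name `check_upload` is hypothetical, and reading "500MB" as 500 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Formats taken from steps 1 and 2 above.
VIDEO_EXTENSIONS = {".mp4", ".mov"}
AUDIO_EXTENSIONS = {".wav", ".mp3"}
# Assumption: "500MB" is read as 500 * 1024 * 1024 bytes; the exact cutoff may differ.
MAX_SIZE_BYTES = 500 * 1024 * 1024


def check_upload(filename: str, size_bytes: int, kind: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable.

    kind is "video" or "audio". Hypothetical helper, not part of Workroom.
    """
    allowed = {"video": VIDEO_EXTENSIONS, "audio": AUDIO_EXTENSIONS}[kind]
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in allowed:
        problems.append(f"unsupported {kind} format: {suffix or 'none'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 500MB")
    return problems
```

For example, `check_upload("clip.MP4", 10_000, "video")` returns an empty list, because the extension check is case-insensitive.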

Model Comparison

| Model | Generation Time | Price | Best For | Pros |
|---|---|---|---|---|
| Workroom | | | Lip sync for avatars with moderate dynamics in the frame | A practical, versatile model suitable for most tasks |
| Sync.io | | | Custom videos and complex scenes | Highly accurate and versatile, suitable for custom videos and faces |
| Multitalk | | | Working with images: quick lip-syncing and simple gestures | High-quality results in a minimal amount of time. Generates facial expressions and hand movements |

Key Features

Combining Video and Voiceover

Create videos with talking avatars by synchronizing video with voiceovers. This works with videos created in Cinema Motion, as well as with videos you upload yourself. Voiceovers can be created in Voice Studio or uploaded separately.

Lip Sync Model Selection

Use different sync engines based on your goals:

  • Workroom: Precise synchronization for AI avatars.
  • Sync.io: A versatile engine, suitable for real faces and custom characters.
  • Multitalk: Fast lip sync generation from an image. Suitable for videos with facial expressions and basic gestures.
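
The engine descriptions above can be read as a simple decision rule. The sketch below only encodes that reading. The function name `suggest_model` and the `source` and `subject` labels are hypothetical, and real projects may call for a different choice.

```python
def suggest_model(source: str, subject: str = "avatar") -> str:
    """Suggest a sync engine based on the descriptions in this section.

    source:  "image" or "video"
    subject: "avatar" for an AI avatar, "custom" for real faces or custom characters
    Both labels are illustrative; they are not Workroom settings.
    """
    if source == "image":
        # Multitalk is the engine described as working from an image.
        return "Multitalk"
    if source != "video":
        raise ValueError(f"unknown source: {source!r}")
    if subject == "avatar":
        return "Workroom"
    if subject == "custom":
        return "Sync.io"
    raise ValueError(f"unknown subject: {subject!r}")
```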

Getting Started

For Beginners

  • To create a video with a talking avatar, generate a video in Cinema Motion, add a voiceover from Voice Studio, and combine them in Lip Sync.
  • Use short lines (up to ~125 characters), which work best for videos lasting 5–10 seconds.
  • For the most stable results, use videos and voiceovers created within Workroom, rather than those uploaded manually.
  • Use Multitalk to quickly create a talking avatar from an image.
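
The character guideline above implies a rough speaking rate of 125 characters per 10 seconds, or about 12.5 characters per second. The sketch below turns that into a quick estimate. The function names are hypothetical, and the actual duration depends on the voice, language, and punctuation, so treat the result as approximate.

```python
# Rule of thumb from this guide: ~125 characters is about 10 seconds of speech.
CHARS_PER_SECOND = 125 / 10


def estimate_speech_seconds(text: str) -> float:
    """Approximate voiceover length in seconds from the character count."""
    return len(text.strip()) / CHARS_PER_SECOND


def fits_video(text: str, video_seconds: float) -> bool:
    """True if the estimated voiceover is no longer than the video."""
    return estimate_speech_seconds(text) <= video_seconds
```

A 50-character line comes out at roughly 4 seconds by this estimate.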

For Experienced Users

  • Upload your own videos or visual scenes from external sources.
  • Choose Sync.io if high articulation accuracy and emotional expressiveness are important.

Best Practices

  1. Match video and audio lengths.
    As a rule of thumb, 125 characters ≈ 10 seconds of speech. If the video is shorter than the voiceover, the voiceover will be cut off.
  2. Use simple and natural sentences.
    This helps the models synchronize speech and facial expressions better.
  3. If uploading a video, pay attention to the quality of the face.
    The face should be well-lit, unobstructed, and with minimal distortion.
  4. Choose videos with a frontal view of the face.
    Good lighting, positioning, and the absence of extreme emotions are critical for good lip sync.
  5. Break speech into short blocks (up to 10 seconds).
    This improves synchronization accuracy and makes editing easier.
  6. Avoid the following in your lip sync videos, as they degrade the accuracy of speech and facial expression synchronization:
    • Obstructed faces (by hands, hair, objects): obstructions hinder the algorithms' ability to read lip movements correctly.
    • Extreme emotions (e.g., shouting, yawning): they create excessive facial distortions, which reduce lip sync quality.
    • Strong head turns or leaving the frame: the face is partially or completely lost, and synchronization is disrupted.
  7. Use Workroom for avatars with closed mouths and minimal movement, Sync.io for dynamic scenes with open mouths and active expressions, and Multitalk for images emphasizing speech synchronization and simple gestures.
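
Practice 5 can be applied to a longer script before creating voiceovers. The sketch below splits text into blocks of at most 125 characters, the length this guide equates with about 10 seconds of speech. It is one possible approach under stated assumptions: the function name `split_into_blocks` is hypothetical, sentences are detected by simple punctuation, and an overlong sentence is wrapped at word boundaries.

```python
import re
import textwrap

# ~125 characters is about 10 seconds of speech, per the guideline above.
MAX_BLOCK_CHARS = 125


def split_into_blocks(script: str, max_chars: int = MAX_BLOCK_CHARS) -> list[str]:
    """Greedily pack sentences into blocks of at most max_chars characters."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    blocks = []
    current = ""
    for sentence in sentences:
        # A sentence longer than max_chars is wrapped at word boundaries.
        for piece in textwrap.wrap(sentence, width=max_chars):
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                blocks.append(current)
                current = piece
    if current:
        blocks.append(current)
    return blocks
```

Each returned block can then be turned into its own voiceover and paired with a short video.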

Use Cases

  1. Lip sync for avatars — synchronizing speech and facial expressions for videos.
  2. Video re-dubbing — adding new audio over existing video.
  3. Localization — translation without losing authenticity.
  4. Dubbing corrections — replacing individual lines without re-recording the video.

