AI Celebrity Lip Sync Videos

Upload a face and audio — Kling 3.0 Omni syncs them into a video.

How do I make an AI celebrity lip sync video?

Upload a reference face photo and an audio track into Renoise Canvas, write a prompt describing the performance, and render on Kling 3.0 Omni — its native lipsync maps the audio to mouth movement in one step. For any real public figure's likeness, FacePass clearance with their written consent is required before the model will generate.

Want a still photo to speak a script instead of sing? See the AI talking photo guide

What the lip sync generator covers

Kling 3.0 Omni native lipsync — key facts before you start.

Native lipsync — no editing

Kling 3.0 Omni syncs audio to mouth movement in a single generation pass — no manual keyframing or post-editing required.

3–15 second clips

Each lip sync generation produces a 3–15 second clip. Chain multiple clips in the Canvas Timeline for longer performances.

Audio: voice or music

Attach a spoken voice track or a music audio file — Kling 3.0 Omni reads phonemes and drives mouth shapes for both.

FacePass consent required for real faces

To sync a real person's likeness — including any public figure — FacePass clearance with their written consent is required first.

Make a lip sync video in 3 steps

From a reference face and audio to a synced performance — all inside Renoise Canvas. Demonstrated here with original fictional characters.

Step 1
Upload face and audio
Drag a clear front-facing reference photo and your audio track onto Canvas. For a real person, complete FacePass clearance first.
Step 2
Write the performance prompt
Describe the setting, emotion, and style — "singer on a neon stage, energetic pop performance, wide shot".
Step 3
Select Kling 3.0 Omni and generate
Pick Kling 3.0 Omni from the model menu — it handles lipsync natively. Generate and export the synced clip.

Lip sync performance styles

Examples made with original fictional characters in Renoise — the same pipeline works for any authorized face and audio.

Stage performance

Original 3D-style singer character performing synced to music on a neon-lit stage

Studio presenter

Original fictional presenter in cream blazer delivering synced speech in a studio setting

Chef narration

Original fictional chef character narrating in the kitchen with synced mouth movement

Fantasy monologue

Original dark-fantasy warrior character delivering a synced speech in fire-lit atmosphere

How AI lip sync works — and what FacePass means for real faces

Lip sync maps audio to mouth shapes. Every distinct sound in speech or song — a phoneme — has a corresponding visible mouth position, called a viseme. Kling 3.0 Omni, developed by Kuaishou, reads the audio you provide frame by frame and drives the jaw, lips, and cheeks to match those phoneme shapes. The result is a synchronized talking or singing clip generated directly from your reference face and audio in a single pass — no manual animation or post-processing step required.

The reference face anchors the identity: the model keeps the face consistent across the clip while only the speech animation changes. This is why a still portrait can become a performing singer or presenter — the face stays recognizable, the mouth and expression move.

For fictional characters, illustrated avatars, or AI-generated faces, this pipeline requires no extra steps. For any real, identifiable person — a public figure, a performer, anyone — Renoise requires FacePass clearance before the model will generate. FacePass is the authorized-likeness system: it records that you hold the rights or written consent to use that face for AI-generated video. Without FacePass clearance, the model blocks real human faces by default.

This means you cannot generate a real celebrity's lip sync video in Renoise simply by uploading their photo — that requires FacePass, which requires their written consent. You can however generate a lip sync video with your own face, a consenting subject you represent, or entirely original fictional characters. The demo examples on this page all use original fictional characters to illustrate the pipeline.

Generate your first lip sync video

Watermark-free exports on any paid plan.

Generate See FacePass

Frequently asked questions

1.Can I use a real celebrity's face for lip sync?

Not without their written consent. Renoise's FacePass system requires likeness clearance for any real, identifiable person — including public figures and celebrities. Without clearance the model blocks real human faces by default. You can use your own face, a consenting subject you represent, or fully original fictional characters without FacePass.

2.What audio formats work for lip sync?

Kling 3.0 Omni accepts standard audio tracks — spoken voice recordings and music files both work. Attach the audio file directly in Renoise Canvas alongside your reference face photo. The model reads phonemes from the audio regardless of language or whether the source is speech or song.

3.How long can the lip sync video be?

A single Kling 3.0 Omni generation produces a 3–15 second clip. For a longer performance — a full song chorus, a multi-minute monologue — generate multiple clips and stitch them together in the Canvas Timeline.

4.Does the face need to be perfectly still in the reference photo?

A clear, front-facing portrait gives the best results. The face should be well-lit, unobstructed, and looking roughly toward the camera. Side angles, heavy shadows, or partial occlusion reduce lipsync quality because the model has less face geometry to anchor the mouth animation to.

5.Is this the same as a deepfake?

No. Renoise's lip sync pipeline is a consented AI video generation tool — not an identity-fraud or impersonation tool. Real-face generation requires FacePass clearance with explicit written consent, making unauthorized impersonation of anyone not permitted by the platform.

6.Can I use an AI-generated face instead of a real photo?

Yes. Fictional or AI-generated faces have no FacePass requirement. Generate an original character, then use that same character as the reference face for lip sync. All demo examples on this page use this approach — original fictional characters only.

7.Which model handles lip sync in Renoise?

Kling 3.0 Omni, developed by Kuaishou, is the model that provides native lipsync in Renoise. "Native" means lipsync is built into the generation pipeline — you supply the face and audio and the synced video comes out in one pass, without a separate processing step.

8.What is the difference between lip sync and a talking photo?

The pipeline is identical — both use Kling 3.0 Omni native lipsync with a reference face and audio. The intent differs: a talking photo is typically a presenter or speaker delivering a script; celebrity lip sync refers to the performance / singing entertainment use case. See the AI talking photo guide for the presenter framing.