SadTalker - Talking head videos

Introducing SadTalker and talking photos

Imagine generating lifelike talking head videos from just a facial image and an audio clip. SadTalker makes that possible.

How does SadTalker do it?

SadTalker uses two learned networks, ExpNet and PoseVAE, to predict 3D motion coefficients directly from your audio input.

ExpNet captures the facial expressions, while PoseVAE generates the head poses.

These 3D motion coefficients then drive the face render that animates your source image. The result is that SadTalker (Stylized Audio-Driven Talking-head video) crafts videos with more natural motion and better image quality than previous methods.

The best part? SadTalker is available as an extension for stable-diffusion-webui, and it comes pre-installed on ThinkDiffusion.

SadTalker is like sticking your hand up a photo, puppet-style, to make it say whatever you want.

This integration simplifies the process for designers and other creatives. Whether you're a seasoned pro or just starting out, you'll find it a breeze to use: SadTalker runs as a tab inside stable-diffusion-webui, so creating a talking head video takes only an image, an audio clip, and a few settings.

We are loving this demo by Olivia Sarkias

Using SadTalker

(1) Click the SadTalker tab
(2) Upload an image of a face
(3) Upload a voice recording
(I use the Sound Recorder app built into Windows. WAV and M4A files seem to work well.)

Just plug your source image and audio file into SadTalker on ThinkDiffusion, and let AI take care of the rest.

Hit Generate!
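
For readers who prefer scripting, the same two inputs can also be handed to the open-source SadTalker repository from the command line. The sketch below only builds the command and does not run it. It assumes a local checkout of that repository with its `inference.py` script and the `--source_image`, `--driven_audio`, and `--result_dir` options, which is a separate setup from the ThinkDiffusion tab. The helper name `build_command` and the file names are made up for illustration, and the extension check simply mirrors the audio formats mentioned above.

```python
from pathlib import Path

# Audio formats reported to work well in this guide; others may work too.
KNOWN_GOOD_AUDIO = {".wav", ".m4a"}


def build_command(image_path, audio_path, result_dir="results"):
    """Return the argument list for a SadTalker run (hypothetical helper)."""
    audio_ext = Path(audio_path).suffix.lower()
    if audio_ext not in KNOWN_GOOD_AUDIO:
        raise ValueError(f"Untested audio format: {audio_ext or 'none'}")
    # Assumption: a standalone SadTalker checkout, run from its root folder.
    return [
        "python", "inference.py",
        "--source_image", str(image_path),
        "--driven_audio", str(audio_path),
        "--result_dir", str(result_dir),
    ]


# Placeholder file names; the command is printed rather than executed.
print(" ".join(build_command("face.png", "voice.wav")))
```

Returning a list rather than a single string means the command can later be passed to `subprocess.run` without worrying about spaces in file names.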

SadTalker Settings

Pose Style

This setting affects the head movement. In my experience, a value of 45 tends to give the best results.

Face model resolution and GFPGAN face enhancer

You can set this to 256 or 512, which is the resolution of the generated face.
Choose 512 for a higher resolution, and select GFPGAN as the face enhancer for a much clearer image.

You can change the face resolution and use the GFPGAN option to enhance the final video's quality.

Hit generate!
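
As a rough sketch, the settings in this section could be expressed as command-line flags for the standalone SadTalker repository. This assumes its `inference.py` script accepts `--pose_style`, `--size`, and `--enhancer gfpgan`. The helper `settings_to_flags` is hypothetical, and the upper bound of 45 for pose style is an assumption based on the value recommended above, not a confirmed limit.

```python
def settings_to_flags(pose_style=0, size=256, use_gfpgan=False):
    """Translate the settings from this section into CLI flags (hypothetical helper)."""
    # Assumption: pose styles are whole numbers from 0 to 45.
    if not 0 <= pose_style <= 45:
        raise ValueError("pose_style must be between 0 and 45")
    # The guide offers two face model resolutions.
    if size not in (256, 512):
        raise ValueError("size must be 256 or 512")
    flags = ["--pose_style", str(pose_style), "--size", str(size)]
    if use_gfpgan:
        # GFPGAN sharpens the face in the final video.
        flags += ["--enhancer", "gfpgan"]
    return flags


# The combination recommended in this section.
print(settings_to_flags(pose_style=45, size=512, use_gfpgan=True))
```

Validating the values up front gives a clear error message before a long render starts, rather than a failure partway through.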

PreProcess Options Explained

These settings control how the source image is cropped or resized to match the target face model resolution.

Crop:
If we select a different image that includes the upper body and set the preprocess option to crop, the image is cropped down to the face before it is animated.

Adjust the preprocess options to control how the composition is cropped or resized to match the target face model resolution.
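
To make the idea of cropping to a target resolution more concrete, here is a small arithmetic sketch. It is not SadTalker's own code, which locates the face automatically. Here the face box is assumed to be known already, and the function names, the `margin` parameter, and the sample numbers are all invented for illustration.

```python
def square_crop_box(face_box, image_size, margin=0.5):
    """Return a square (left, top, right, bottom) crop around a face box."""
    left, top, right, bottom = face_box
    width, height = image_size
    # Grow the larger face dimension by the margin, but stay within the image.
    side = int(min(max(right - left, bottom - top) * (1 + margin), width, height))
    # Centre the square on the face, then clamp it to the image bounds.
    crop_left = int((left + right) / 2 - side / 2)
    crop_top = int((top + bottom) / 2 - side / 2)
    crop_left = min(max(crop_left, 0), width - side)
    crop_top = min(max(crop_top, 0), height - side)
    return (crop_left, crop_top, crop_left + side, crop_top + side)


def scale_factor(crop_box, target=256):
    """Return how much the crop must be scaled to reach the target resolution."""
    return target / (crop_box[2] - crop_box[0])


# An upper-body portrait, 1024 x 1536 pixels, with the face near the top.
box = square_crop_box((400, 200, 600, 450), (1024, 1536))
print(box, scale_factor(box, target=512))
```

A larger `margin` keeps more hair and shoulders in frame, while a smaller one fills the frame with the face.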