0:00
/0:03

Are you new to creating videos with AI? If so, you're in the right place! In this guide, we'll introduce you to CogVideoX. It is perfect for beginners who want to start making amazing videos from ANY still image. This Image2Video and Text2Video model can be used inside ComfyUI, a powerful interface that makes working with image-to-video model CogVideoX easy and fun. Let's get started!

What is CogVideoX AI?

0:00
/0:23

CogVideoX AI is a custom model and node that has the ability to generate videos based on input using text or images. This makes it perfect for beginners who want to make videos without learning complex video editing software. There are several models for CogVideoX that can be used. But in this guide, we're going to use CogVideoX1.5-5B-I2V , an open-source model created by THUDM which is the recommended model for image to video generation. The I2V stands for Image2Video.

0:00
/0:22

ComfyUI simplifies the process by providing ready made workflows that you can drag & drop. You can focus on being creative without worrying about building complicated workflows, making the whole experience of using ComfyUI and CogVideoX much more accessible and fun.

How to Use CogVideoX in ComfyUI

Installation Guide

💡
Download the workflow and drag & drop it into your ComfyUI window, whether locally or on ThinkDiffusion. If you're using ThinkDiffusion, it's necessary to use at minimum the Turbo 24gb machine, but we do recommend the Ultra 48gb machine.

Custom Nodes

If there are red nodes in the workflow, it means that the workflow lacks the certain required nodes. Install the custom nodes in order for the workflow to work.

  1. Go to ComfyUI Manager  > Click Install Missing Custom Nodes
ThinkDiffusion StableDiffusion ComfyUI img2vid with cogvideox showing where to click install missing nodes.
  1. Check the list below if there's a list of custom nodes that needs to be installed and click the install.
ThinkDiffusion StableDiffusion ComfyUI img2vid with cogvideox shows where to check the list of missing custom nodes and install it

Models

Download the recommended models (see list below) using the ComfyUI manager and go to Install models. Refresh or restart the machine after the files have downloaded.

  1. Go to ComfyUI Manager  > Click Install Models
ThinkDiffusion StableDiffusion ComfyUI img2vid with cogvideox shows where to click install models
  1. When you find the exact model that you're looking for, click install and make sure to press refresh when you are finished.
ThinkDiffusion StableDiffusion ComfyUI img2vid with cogvideox shows where to search the missing models and install it.

Use the models a shown below. Some of the models are already available (pre-loaded) in the CogVideoX custom node and with the model manager. If not available selection, installed it using the information below.

Model Name Where to Get?
CogVideoX-5b-1.5-I2V Pre-loaded in Download CogVideo Model node
google_t5-v1_1-xxl_encoderonly-fp8_e4m3fn.safetensros available in Model Manager of Comfy Manager
t5xxl_fp8_e4m3fn.safetensors available in Model Manager of Comfy Manager
Reminder
💡
CogVideoX1.5-5B-I2V, new released model can now handle the following:
- up to 81 frames of video
- 5-10 secs of video length
- 1360 x 1360 of resolution
- up to 200 words of prompt

Procedures

Now that the hard work is out of the way, let's get creative. You need to follow the steps from top to bottom. The workflow is a one-click process after everything has been set up.

Steps Default Nodes
1. Load an Image ThinkDiffusion-StableDiffusion-ComfyUI-CogVideoX-load-an-image.png
2. Set the Image Size ThinkDiffusion-StableDiffusion-ComfyUI-CogVideoX-resize-image.png
3. Set the Models

Use the t5xxl_fp8_e4m3fn clip model instead with encoderonly if you want a accurante implementation of text prompt.
ThinkDiffusion-StableDiffusion-ComfyUI-CogVideoX-set-models.png
4. Write a Prompt ThinkDiffusion-StableDiffusion-ComfyUI-CogVideoX-write-prompt.png
5. Set the Settings as seen on the Image. Run the Prompt

You can gradually test the frames count while running the prompt. Higher frames sometimes can display a weird video frames
ThinkDiffusion-StableDiffusion-ComfyUI-CogVideoX-set-the-settings.png
6. Check the Video ThinkDiffusion-StableDiffusion-ComfyUI-CogVideoX-check-the-video.png

Reminders

💡
If you have less than 32GB of System RAM, use the t5xxl_fp8_e4m3fn text encoder instead of the t5xxl_fp16 version.

CogVideoX Examples

0:00
/0:02
0:00
/0:03
0:00
/0:01
0:00
/0:01

Shoes Display

Prompt: A quick 360-degree rotate view of shoes on a white background, capturing its unique design.

  • base model - CogVideoX1.5-5B-I2V
  • resolution - 1360 x 768
  • precision - bf16
  • seed - 92162988405408
  • frame - 30
  • steps - 25, cfg - 6, denoise - 1
  • scheduler - CogVideoXDDIM
  • clip - google_t5-v1_1-xxl_encoderonly-fp8_e4m3fn.safetensors

Animate the Anime

Prompt: An anime woman dressed in traditional Japanese attire holds a glowing sparkler, the soft light illuminating her serene expression and the delicate details of her kimono.

  • base model - CogVideoX1.5-5B-I2V
  • resolution - 1360 x 768
  • precision - bf16
  • seed - 184363993861098
  • frame - 50
  • steps - 25, cfg - 6, denoise - 1
  • scheduler - CogVideoXDDIM
  • clip - google_t5-v1_1-xxl_encoderonly-fp8_e4m3fn.safetensors

Food Presentation

Prompt: A zoom out view of a newly cooked ramen, featuring rich, savory broth, perfectly tender noodles, slices of succulent pork, a soft-boiled egg, fresh green onions, and a sprinkle of sesame seeds.

  • base model - CogVideoX1.5-5B-I2V
  • resolution - 1360 x 768
  • precision - bf16
  • seed - 474954467584424
  • frame - 25
  • steps - 25, cfg - 6, denoise - 1
  • scheduler - CogVideoXDDIM
  • clip - t5xxl_fp8_e4m3fn.safetensors

Coffee Demo

Prompt: A barista carefully pour a coffee into a blue cup.

  • base model - CogVideoX1.5-5B-I2V
  • resolution - 1360 x 768
  • precision - bf16
  • seed - 323367415049274
  • frame - 25
  • steps - 25, cfg - 6, denoise - 1
  • scheduler - CogVideoXDDIM
  • clip - t5xxl_fp8_e4m3fn.safetensors

If you’re having issues with installation or slow hardware, you can try any of these workflows on a more powerful GPU in your browser with ThinkDiffusion.

If you enjoy ComfyUI and you want to test out creating awesome animations, then feel free to check out this AnimateDiff tutorial here. Happy creating!