4-bit (UINT4 with SVD rank 32) quantization of nvidia/ChronoEdit-14B-Diffusers using SDNQ.

Usage:

Install SDNQ from the shell:

pip install git+https://github.com/Disty0/sdnq

Then run the following Python script:

import math
import torch
from PIL import Image
from diffusers.utils import load_image
from chronoedit_diffusers.pipeline_chronoedit import ChronoEditPipeline
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers

pipe = ChronoEditPipeline.from_pretrained("Disty0/ChronoEdit-14B-SDNQ-uint4-svd-r32", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")
# Target pixel budget; the output keeps the input's aspect ratio within this area.
max_area = 480 * 832
aspect_ratio = input_image.height / input_image.width
# Height and width must be multiples of the VAE spatial scale factor times the transformer patch size.
mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
height = round(math.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
width = round(math.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
input_image = input_image.resize((width, height))

output = pipe(
    image=input_image,
    prompt="Add a hat to the cat",
    height=height,
    width=width,
    num_frames=5,
    guidance_scale=2.5,
    generator=torch.manual_seed(0),
).frames[0]
# The edited image is the last generated frame; convert it from float values to 8-bit.
image = Image.fromarray((output[-1] * 255).clip(0, 255).astype("uint8"))
image.save("chrono-edit-sdnq-uint4-svd-r32.png")
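
The resizing arithmetic in the script above can be checked without loading the model. The following is a minimal sketch: the helper name `fit_to_area` is hypothetical, and the default `mod_value=16` is an assumption (a spatial VAE scale factor of 8 times a patch size of 2). Read the actual value from the pipeline as shown above.

```python
import math


def fit_to_area(width, height, max_area=480 * 832, mod_value=16):
    """Return (width, height) that keeps the aspect ratio within max_area.

    mod_value=16 is an assumed default; in the usage script it comes from
    pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1].
    """
    aspect_ratio = height / width
    # Solve new_h * new_w = max_area with new_h / new_w = aspect_ratio,
    # then round each side down to a multiple of mod_value.
    new_height = round(math.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
    new_width = round(math.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
    return new_width, new_height


print(fit_to_area(832, 480))    # (832, 480): already fits the budget
print(fit_to_area(1024, 1024))  # (624, 624): square input is scaled down
```

Because both sides are rounded down to a multiple of `mod_value`, the resulting area never exceeds `max_area` by more than rounding error, and the aspect ratio may shift slightly for inputs whose sides are not already aligned.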

Original BF16 vs SDNQ quantization comparison:

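| Quantization  | Model Size | Visualization          |
|---------------|------------|------------------------|
| Input Image   | -          | (image: Input Image)   |
| Original BF16 | 28.6 GB    | (image: Original BF16) |
| SDNQ UINT4    | 9.5 GB     | (image: SDNQ UINT4)    |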
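
As a quick check on the sizes listed above, the reduction can be computed directly. This sketch uses only the two reported figures (28.6 GB and 9.5 GB); the variable names are illustrative.

```python
bf16_gb = 28.6   # reported size of the original BF16 model
uint4_gb = 9.5   # reported size of the SDNQ UINT4 quantization

ratio = bf16_gb / uint4_gb        # how many times smaller the quantized model is
saved = 1 - uint4_gb / bf16_gb    # fraction of storage saved

print(f"{ratio:.2f}x smaller, {saved:.1%} saved")  # 3.01x smaller, 66.8% saved
```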