LTX-Video in Rust (Candle)

This repository provides a high-performance, native Rust implementation of LTX-Video using the Candle ML framework.

Demonstration

Video	Prompt
	A man walks towards a window, looks out, and then turns around. He has short, dark hair, dark skin, and is wearing a brown coat over a red and gray scarf. He walks from left to right towards a window, his gaze fixed on something outside. The camera follows him from behind at a medium distance. The room is brightly lit, with white walls and a large window covered by a white curtain. As he approaches the window, he turns his head slightly to the left, then back to the right. He then turns his entire body to the right, facing the window. The camera remains stationary as he stands in front of the window. The scene is captured in real-life footage.
	The camera pans across a cityscape of tall buildings with a circular building in the center. The camera moves from left to right, showing the tops of the buildings and the circular building in the center. The buildings are various shades of gray and white, and the circular building has a green roof. The camera angle is high, looking down at the city. The lighting is bright, with the sun shining from the upper left, casting shadows from the buildings. The scene is computer-generated imagery.
	The camera pans over a snow-covered mountain range, revealing a vast expanse of snow-capped peaks and valleys. The mountains are covered in a thick layer of snow, with some areas appearing almost white while others have a slightly darker, almost grayish hue. The peaks are jagged and irregular, with some rising sharply into the sky while others are more rounded. The valleys are deep and narrow, with steep slopes that are also covered in snow. The trees in the foreground are mostly bare, with only a few leaves remaining on their branches. The sky is overcast, with thick clouds obscuring the sun. The overall impression is one of peace and tranquility, with the snow-covered mountains standing as a testament to the power and beauty of nature.
	A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show.

Features

🦀 Native Rust: No Python dependency required for inference.
🚀 Performance: Optimized for NVIDIA GPUs with Flash Attention v2 and cuDNN.
💾 Memory Efficient: Supports GGUF quantization for the T5-XXL text encoder, plus VAE tiling/slicing, so that 720p+ videos can be generated on consumer GPUs.
🛠 Flexible: Easy-to-use CLI for video generation and a library for custom integration.
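
To make the VAE tiling idea above more concrete, here is a minimal sketch of how a frame axis could be split into overlapping tiles so that each tile is decoded separately and peak memory stays bounded. This is an illustration only, not the code used in this repository: the `TileRange` type, the `tile_ranges` function, and the tile size and overlap values are all hypothetical.

```rust
/// Half-open pixel range [start, end) covered by one tile along one axis.
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TileRange {
    start: usize,
    end: usize,
}

/// Split an axis of `len` pixels into tiles of at most `tile` pixels.
/// Neighbouring tiles share up to `overlap` pixels so seams can be blended.
fn tile_ranges(len: usize, tile: usize, overlap: usize) -> Vec<TileRange> {
    assert!(tile > 0 && overlap < tile, "overlap must be smaller than the tile size");
    if len <= tile {
        // The whole axis fits into a single tile.
        return vec![TileRange { start: 0, end: len }];
    }
    let stride = tile - overlap;
    let mut ranges = Vec::new();
    let mut start = 0;
    loop {
        // Clamp the last tile to the end of the axis.
        let end = (start + tile).min(len);
        ranges.push(TileRange { start, end });
        if end == len {
            break;
        }
        start += stride;
    }
    ranges
}

fn main() {
    // Assumed example values: a 1280x720 frame, 512-pixel tiles, 64-pixel overlap.
    let cols = tile_ranges(1280, 512, 64);
    let rows = tile_ranges(720, 512, 64);
    println!("{} column tiles x {} row tiles", cols.len(), rows.len());
    for row in &rows {
        for col in &cols {
            println!("tile x=[{}, {}) y=[{}, {})", col.start, col.end, row.start, row.end);
        }
    }
}
```

A real decoder would additionally blend the overlapping regions when stitching tiles back together; that step is omitted here.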

Quick Start

Installation

Ensure you have Rust and the CUDA Toolkit installed, then:

git clone https://github.com/FerrisMind/candle-video
cd candle-video
cargo build --release --features flash-attn,cudnn

Video Generation

cargo run --example ltx-video --release --features flash-attn,cudnn -- \
    --local-weights "c:\model\models\ltxv-2b-0.9.8-distilled" \
    --unified-weights "c:\model\models\ltxv-2b-0.9.8-distilled" \
    --ltxv-version 0.9.8-2b-distilled \
    --prompt "A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks."
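
If you want to launch the same command from a Rust program (for example, to batch several prompts), the sketch below assembles the argument list with `std::process::Command`. It only reuses the flags shown in the command above; the helper name `ltx_video_args` and the placeholder weights path are assumptions made for this example, and the program prints the command instead of running it.

```rust
use std::process::Command;

/// Build the `cargo` arguments for the `ltx-video` example.
/// Hypothetical helper: it passes the same directory to both weight flags,
/// mirroring the command shown above.
fn ltx_video_args(weights_dir: &str, prompt: &str) -> Vec<String> {
    let args = vec![
        "run", "--example", "ltx-video", "--release",
        "--features", "flash-attn,cudnn", "--",
        "--local-weights", weights_dir,
        "--unified-weights", weights_dir,
        "--ltxv-version", "0.9.8-2b-distilled",
        "--prompt", prompt,
    ];
    args.into_iter().map(String::from).collect()
}

fn main() {
    // Placeholder path; replace it with the directory holding your weights.
    let args = ltx_video_args(
        "/path/to/ltxv-2b-0.9.8-distilled",
        "A man walks towards a window, looks out, and then turns around.",
    );
    let mut cmd = Command::new("cargo");
    cmd.args(&args);
    // Print the command for inspection. Calling `cmd.status()` from inside
    // the candle-video checkout would actually run it.
    println!("{:?}", cmd);
}
```

Passing each argument separately avoids shell quoting issues with long prompts and with Windows-style paths.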

Performance & Memory

Resolution	Frames	VRAM (BF16)	VRAM (VAE Tiling)
512x768	97	~8-12 GB	~8 GB

Note: Using the GGUF T5 encoder saves an additional ~8-12 GB of VRAM.
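
As a rough worked example of why the 512x768, 97-frame setting in the table is comparatively cheap, the sketch below computes the latent grid the transformer operates on. It assumes the compression factors described for the upstream LTX-Video VAE (32x per spatial axis, 8x in time, with frame counts of the form 8n + 1); these factors are not stated in this README, the `LatentShape` type and `latent_shape` function are hypothetical, and "512x768" is read as width x height (the token count is the same either way).

```rust
/// Latent grid dimensions after VAE compression (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
struct LatentShape {
    frames: usize,
    height: usize,
    width: usize,
}

impl LatentShape {
    /// Number of latent positions the transformer attends over.
    fn tokens(&self) -> usize {
        self.frames * self.height * self.width
    }
}

/// Map a pixel-space request to a latent grid.
/// Assumed factors: 32x spatial and 8x temporal compression, frames = 8n + 1.
/// Returns `None` when the request does not fit those constraints.
fn latent_shape(width: usize, height: usize, frames: usize) -> Option<LatentShape> {
    const SPATIAL: usize = 32;
    const TEMPORAL: usize = 8;
    if width == 0 || height == 0 || frames == 0 {
        return None;
    }
    if width % SPATIAL != 0 || height % SPATIAL != 0 || (frames - 1) % TEMPORAL != 0 {
        return None;
    }
    Some(LatentShape {
        frames: (frames - 1) / TEMPORAL + 1,
        height: height / SPATIAL,
        width: width / SPATIAL,
    })
}

fn main() {
    // The configuration from the table above: 512x768 at 97 frames.
    match latent_shape(512, 768, 97) {
        Some(shape) => println!("{:?} -> {} tokens", shape, shape.tokens()),
        None => println!("unsupported resolution or frame count"),
    }
}
```

Under these assumptions the request maps to a 13 x 24 x 16 latent grid, which is 4992 positions. Actual VRAM use also depends on model weights, activations, and the VAE decode, so this is not a memory estimate.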

Credits

Original Model: Lightricks/LTX-Video
Framework: HuggingFace Candle
Inspiration: city96/LTX-Video-gguf (for GGUF support patterns)

For more details, visit the main GitHub repository: https://github.com/FerrisMind/candle-video

GGUF

Model size: 5B params
Architecture: t5encoder
Hardware compatibility: 5-bit
