Instructions for using nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw with libraries and local apps.

Libraries

How to use nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw")
model = AutoModelForImageTextToText.from_pretrained("nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw

SGLang

How to use nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw" \
        --host 0.0.0.0 \
        --port 30000

Razorback 12B v0.2 ExLlamaV2 4.0bpw Quant

UnslopNemo with Vision!

A more robust attempt at merging TheDrummer's UnslopNemo v3 into Pixtral 12B.

It has been really stable in my testing so far, though it needs more testing to see which samplers it does and doesn't like.

It seems to be the best of both worlds: less sloppy, more engaging content along with decent intelligence and visual understanding.

Merging Approach

First, I loaded Pixtral 12B Base and Mistral Nemo Base and compared their parameters. Looking at the L2 norm and relative difference of each parameter, I was able to isolate which parts of Pixtral 12B deviate significantly from Mistral Nemo. This matters because, while the language model architecture is the same in both, a lot of vision understanding has been trained into Pixtral's language model, and it can break very easily.
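As a rough illustration of that comparison, here is a minimal sketch of a per-parameter relative L2 difference. It is not the script used for this merge: the function names, the flat-list representation of each parameter, and the `eps` guard are all assumptions made for the example, and a real run would operate on the checkpoint tensors instead.

```python
import math


def l2_norm(values):
    """Euclidean (L2) norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def relative_difference(base, other, eps=1e-12):
    """Return ||other - base|| / ||base|| for two equally sized parameters.

    `eps` (an assumption for this sketch) avoids division by zero when the
    base parameter is all zeros.
    """
    if len(base) != len(other):
        raise ValueError("parameters must have the same number of elements")
    diff = [o - b for b, o in zip(base, other)]
    return l2_norm(diff) / (l2_norm(base) + eps)


def compare_parameters(nemo, pixtral):
    """Map each shared parameter name to its relative difference.

    Both arguments are hypothetical {name: flat list of floats} dicts.
    Names present in only one model (e.g. vision-only weights) are skipped.
    """
    return {
        name: relative_difference(nemo[name], pixtral[name])
        for name in nemo
        if name in pixtral
    }
```

Parameters with a large relative difference are the ones where Pixtral has drifted furthest from Mistral Nemo, so they are the ones most likely to carry vision-related behavior.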

Then I calculated merging weights for each parameter using an exponential falloff. The smaller the difference, the higher the weight.
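The exact falloff function is not given here, so the following is only a sketch of one way an exponential falloff could turn a relative difference into a merge weight. The `falloff` constant, the `max_weight` cap, and the linear blend are illustrative assumptions, not the actual recipe.

```python
import math


def merge_weight(rel_diff, falloff=5.0, max_weight=1.0):
    """Weight given to the donor model for one parameter.

    Assumed form: max_weight * exp(-falloff * rel_diff). A relative
    difference of 0 yields max_weight, and the weight decays toward 0
    as the parameter deviates more.
    """
    return max_weight * math.exp(-falloff * rel_diff)


def blend(pixtral_param, donor_param, weight):
    """Linear interpolation between two flat parameter lists.

    weight = 0 keeps the Pixtral values, weight = 1 takes the donor values.
    """
    return [
        (1.0 - weight) * p + weight * d
        for p, d in zip(pixtral_param, donor_param)
    ]
```

With this shape, parameters that barely differ between the two bases take most of their values from the donor, while heavily changed parameters stay close to Pixtral.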

I applied this recipe to Pixtral Instruct (Pixtral-12B-2409) and TheDrummer's UnslopNemo-12B-v3. The goal is to infuse as much Drummer goodness as possible without breaking vision input, and it looks like it has worked!

Usage

More testing is needed to identify the best sampling parameters, but so far a temperature of about 0.7 with a min p of 0.03 has been rock solid.

Use the included chat template (Mistral). ChatML is not supported yet.
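For readers unfamiliar with min p, here is a small sketch of what these two settings do to a token distribution. It assumes temperature is applied before the min p cutoff; the order varies between inference backends, and `min_p_filter` is a hypothetical helper, not part of any library.

```python
import math


def min_p_filter(logits, temperature=0.7, min_p=0.03):
    """Return renormalized token probabilities after temperature and min p.

    Tokens whose probability is below min_p times the probability of the
    most likely token are dropped (set to 0.0).
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtract the peak before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]
```

Because the cutoff scales with the top token's probability, confident predictions prune the tail aggressively while flatter distributions keep more candidates.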

Credits

Mistral for mistralai/Pixtral-12B-2409
Unsloth for unsloth/Pixtral-12B-2409 transformers conversion
TheDrummer for TheDrummer/UnslopNemo-12B-v3

Available Sizes

Repo	Bits	Head Bits	Size
nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw	4.0	6.0	8.19 GB
nintwentydo/Razorback-12B-v0.2-exl2-5.0bpw	5.0	6.0	9.54 GB
nintwentydo/Razorback-12B-v0.2-exl2-6.0bpw	6.0	8.0	11.1 GB
nintwentydo/Razorback-12B-v0.2-exl2-8.0bpw	8.0	8.0	13.7 GB

Base model: nintwentydo/Razorback-12B-v0.2
Collection: Razorback v0.2 (UnslopNemo: Now with Vision)