Qwen3-30B-A3B-YOYO-Thinking-Chimera-qx86-hi-mlx

Comparing Thinking-Chimera against the earlier Qwen3-30B-A3B-YOYO models. The seven columns appear to be, in the order these cards usually use, arc_challenge, arc_easy, boolq, hellaswag, openbookqa, piqa, and winogrande:

AutoThink    0.454  0.481  0.869  0.673  0.404  0.777  0.643
V2           0.531  0.690  0.885  0.685  0.448  0.785  0.646
V3           0.472  0.550  0.880  0.698  0.442  0.789  0.650
V4           0.511  0.674  0.885  0.649  0.442  0.769  0.618
V5           0.511  0.669  0.885  0.653  0.440  0.772  0.619
Chimera      0.416  0.449  0.685  0.639  0.390  0.770  0.652

Chimera's drop on arc (0.416/0.449) and boolq (0.685), versus roughly 0.51/0.67 and 0.88 for V2 through V5, explains some of its behavior.

For comparison, three Nightmedia models:

Element5-1M  0.560  0.709  0.883  0.756  0.448  0.807  0.713
Element6-1M  0.568  0.737  0.880  0.760  0.450  0.803  0.714
Architect18  0.577  0.760  0.879  0.760  0.446  0.803  0.702
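
To make the gaps concrete, here is a minimal Python sketch (scores copied from the tables above) that prints each model's mean score and its delta versus Chimera:

scores = {
    "AutoThink":   [0.454, 0.481, 0.869, 0.673, 0.404, 0.777, 0.643],
    "V2":          [0.531, 0.690, 0.885, 0.685, 0.448, 0.785, 0.646],
    "V3":          [0.472, 0.550, 0.880, 0.698, 0.442, 0.789, 0.650],
    "V4":          [0.511, 0.674, 0.885, 0.649, 0.442, 0.769, 0.618],
    "V5":          [0.511, 0.669, 0.885, 0.653, 0.440, 0.772, 0.619],
    "Chimera":     [0.416, 0.449, 0.685, 0.639, 0.390, 0.770, 0.652],
    "Element5-1M": [0.560, 0.709, 0.883, 0.756, 0.448, 0.807, 0.713],
    "Element6-1M": [0.568, 0.737, 0.880, 0.760, 0.450, 0.803, 0.714],
    "Architect18": [0.577, 0.760, 0.879, 0.760, 0.446, 0.803, 0.702],
}

chimera = sum(scores["Chimera"]) / len(scores["Chimera"])
for name, row in scores.items():
    mean = sum(row) / len(row)
    # A positive delta means the model outscores Chimera on average.
    print(f"{name:<12} mean={mean:.3f} delta={mean - chimera:+.3f}")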

This model, Qwen3-30B-A3B-YOYO-Thinking-Chimera-qx86-hi-mlx, was converted to MLX format from YOYO-AI/Qwen3-30B-A3B-YOYO-Thinking-Chimera using mlx-lm version 0.30.0.
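
For reference, a plain conversion with mlx-lm's Python API looks like the minimal sketch below. The qx86-hi tag denotes nightmedia's mixed-precision quantization recipe; the default quantize=True shown here is only a stand-in for that recipe, and the output directory name is hypothetical.

from mlx_lm import convert

# Hedged sketch: default quantization settings, not the actual qx86-hi recipe.
convert(
    "YOYO-AI/Qwen3-30B-A3B-YOYO-Thinking-Chimera",       # source Hugging Face repo
    mlx_path="Qwen3-30B-A3B-YOYO-Thinking-Chimera-mlx",  # hypothetical output path
    quantize=True,
)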

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Load the quantized model and its tokenizer.
model, tokenizer = load("Qwen3-30B-A3B-YOYO-Thinking-Chimera-qx86-hi-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
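
The snippet above uses mlx-lm's default sampling. A hedged variant that caps the output length and sets an explicit sampler is sketched below; make_sampler lives in mlx_lm.sample_utils in recent mlx-lm releases, and the temperature/top_p values are illustrative, not tuned for this model.

from mlx_lm.sample_utils import make_sampler

# Reuses model, tokenizer, and the templated prompt from the snippet above.
sampler = make_sampler(temp=0.7, top_p=0.95)  # illustrative values
response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,  # upper bound on generated tokens
    sampler=sampler,
    verbose=True,
)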
Model size: 31B params (Safetensors, tensor types BF16 and U32).

Model tree for nightmedia/Qwen3-30B-A3B-YOYO-Thinking-Chimera-qx86-hi-mlx: this model is one of 8 quantized variants of YOYO-AI/Qwen3-30B-A3B-YOYO-Thinking-Chimera.