Qwen3-4B playtesting the second draft of an RLVR environment conceptualized by Mira.

The focus is on one-shot roleplaying scenarios, split evenly between silly and serious, covering both narrative and problem-solving.

100 steps, cosine decay, batch size 4, learning rate 1e-5, rank 128, alpha 128.
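For a feel of what these settings imply, here is a minimal sketch of the learning-rate curve and the adapter scaling. It assumes no warmup, decay all the way to zero, and that "rank" and "alpha" refer to a LoRA adapter; none of that is stated above, and `cosine_lr` is a hypothetical helper, not the actual training code.

```python
import math

# Values taken from the settings listed above.
TOTAL_STEPS = 100
PEAK_LR = 1e-5
LORA_RANK = 128
LORA_ALPHA = 128


def cosine_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR, min_lr=0.0):
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps.

    Assumption: no warmup phase and a floor of min_lr = 0.
    """
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Standard LoRA scaling factor alpha / r (assuming plain LoRA, not rsLoRA).
lora_scale = LORA_ALPHA / LORA_RANK

for step in (0, 25, 50, 75, 100):
    print(step, f"{cosine_lr(step):.2e}")
print("LoRA scale:", lora_scale)
```

With rank and alpha equal, the scaling factor is 1, so the low-rank update would be added to the base weights unscaled.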

They seemed fun, so I'm releasing the merged model as well, not just the adapter :)
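As a rough illustration of what "merged" means compared with shipping only the adapter: assuming the adapter is a standard LoRA, merging folds the low-rank update into each base weight as W + (alpha / rank) · B · A. The toy matrices and the `merge_lora` helper below are hypothetical, written in plain Python for readability, and are not the tooling used for this release.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix.

    w:      (out_features, in_features) base weight
    lora_a: (rank, in_features) adapter matrix A
    lora_b: (out_features, rank) adapter matrix B
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example: a 2x2 identity weight with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[0.5, -0.5]]
lora_b = [[2.0], [4.0]]

merged = merge_lora(w, lora_a, lora_b, alpha=1, rank=1)
print(merged)
```

The merged weights can be loaded like any ordinary checkpoint, while the adapter alone needs the base model plus an adapter-aware loader.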

Safetensors · Model size: 4B params · Tensor type: BF16

Model: Lambent/Qwen3-4B-Bard