Qwen3-4B playtesting the second draft of an RLVR environment conceptualized by Mira.

The environment focuses on one-shot roleplaying scenarios, evenly divided between silly and serious, covering both narrative and problem-solving.

Training: 100 steps, cosine learning-rate decay, batch size 4, learning rate 1e-5, LoRA rank 128, alpha 128.
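For a rough feel of what these hyperparameters imply, here is a minimal sketch in plain Python. It assumes a cosine schedule with no warmup that decays to zero, and the standard LoRA convention of scaling the low-rank update by alpha / rank; the card confirms neither detail, and the helper name `cosine_lr` is hypothetical.

```python
import math

# Values stated in the card.
TOTAL_STEPS = 100
PEAK_LR = 1e-5
LORA_RANK = 128
LORA_ALPHA = 128


def cosine_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr.

    Assumptions (not stated in the card): no warmup phase, min_lr = 0.
    """
    # Clamp the step into [0, total_steps] and convert to a 0..1 progress value.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Under the standard LoRA convention, the low-rank update is scaled by alpha / rank.
lora_scale = LORA_ALPHA / LORA_RANK

if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_lr(step))
    print("LoRA scale:", lora_scale)
```

With rank and alpha both at 128, the scaling factor under that convention is 1, so the adapter update is applied unscaled.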

They seemed fun, so I'm releasing the merged model as well, not just the adapter :)

Model size: 4B params (Safetensors, tensor type BF16)

Model tree for Lambent/Qwen3-4B-Bard

Base model: Qwen/Qwen3-4B-Base
Finetuned: Qwen/Qwen3-4B
Finetuned (376): this model
Quantizations: 2 models