Uploaded model

Developed by: student-abdullah
License: apache-2.0
Quantized from model: Qwen2.5-Coder-0.5B
Created on: 06th July, 2025

Acknowledgement

Quantization Description

This model is quantized using selective quantization from the Qwen2.5-Coder-0.5B base model to increase its speed while preserving the capabilities in generating relevant and accurate responses related python programming. The quantization method included 32-bit quantization of the following Layers:

q_proj
v_proj
o_proj
down_proj
lm_head

Rest of the remaining layers were quantized to q3_k_l

Model Description

Layer Name	Role (Short)	Type
`q_proj`, `k_proj`, `v_proj`	Compute query, key, and value for attention mechanism	Attention Proj
`o_proj`	Projects attention output back to model hidden size	Attention Proj
`down_proj`	Projects MLP output down to hidden size	MLP
`gate_proj`	First part of Gated MLP, controls info flow	MLP
`up_proj`	Expands hidden size in MLP	MLP
`lm_head`	Final linear layer for logits	Output Head
`embed_tokens`	Token embedding layer	Input Embed
`norm`	Final layernorm	Normalization
`*_layernorm`	Normalize inputs to layers	Normalization

Model Architect

Qwen2ForCausalLM(
  (model): Qwen2Model(
    (embed_tokens): Embedding(151936, 896, padding_idx=151665)
    (layers): ModuleList(
      (0-23): 24 x Qwen2DecoderLayer(
        (self_attn): Qwen2Attention(
          (q_proj): Linear(in_features=896, out_features=896, bias=True)
          (k_proj): Linear(in_features=896, out_features=128, bias=True)
          (v_proj): Linear(in_features=896, out_features=128, bias=True)
          (o_proj): Linear(in_features=896, out_features=896, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear(in_features=896, out_features=4864, bias=False)
          (up_proj): Linear(in_features=896, out_features=4864, bias=False)
          (down_proj): Linear(in_features=4864, out_features=896, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
        (post_attention_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
      )
    )
    (norm): Qwen2RMSNorm((896,), eps=1e-06)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=896, out_features=151936, bias=False)
)

Performance & Limitations

YET TO BE EXAMINED

Model Performace Evaluation:

YET TO BE EVALUATED

Downloads last month: 4

GGUF

Model size

0.5B params

Architecture

qwen2

Hardware compatibility

32-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for student-abdullah/Quantized_Qwen-2.5-Coding-0.5B_mixed_selective

Base model

Qwen/Qwen2.5-0.5B

Finetuned

Qwen/Qwen2.5-Coder-0.5B

Quantized

(31)

this model

Collection including student-abdullah/Quantized_Qwen-2.5-Coding-0.5B_mixed_selective

Quantized Qwen 2.5 Coder 0.5B

Collection

Qwen 2.5 Coder 0.5B Model is approx 990 Mb in size. This model collections are quantize versions of the model, created through selective quantization. • 1 item • Updated Oct 9, 2025 • 1