An experimental ablation of Gemma-3-27B-it, using the Heretic tool.

Compared to the standard configuration of Heretic, there are a few changes:

  1. The training and test datasets were extended beyond the default subset used by Heretic
  2. A version of Magnitude-Preserving Orthogonal Ablation (MPOA) is used
  3. To stay faithful to MPOA, the harmful direction to ablate is chosen from between 2 layers (Heretic's "global" direction scope)
  4. To stay faithful to MPOA, a 99% winsorization is applied to the residuals
  5. Some additional refusal markers were added so that refusals with unusual punctuation do not slip past the refusal detection
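
As a rough illustration of items 2 and 4, the sketch below shows one way winsorization and a magnitude-preserving orthogonal ablation can be written. It is not Heretic's code: the function names, the nearest-rank quantile rule, and the choice to treat a single vector (for example one column of a weight matrix that writes into the residual stream) are all assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def winsorize(values, quantile=0.99):
    """Clip each value's magnitude at the given quantile of the absolute values."""
    magnitudes = sorted(abs(v) for v in values)
    # Nearest-rank quantile (an assumption; other interpolation rules also work).
    index = min(len(magnitudes) - 1, max(0, math.ceil(quantile * len(magnitudes)) - 1))
    limit = magnitudes[index]
    return [max(-limit, min(limit, v)) for v in values]

def ablate_preserving_magnitude(vector, direction, eps=1e-12):
    """Remove the component of `vector` along `direction`, then restore its original norm."""
    length = norm(direction)
    unit = [d / length for d in direction]
    coefficient = dot(vector, unit)
    # Orthogonal projection: subtract the part of `vector` that points along `unit`.
    projected = [v - coefficient * u for v, u in zip(vector, unit)]
    # Rescale so the result has the same magnitude as the input vector.
    scale = norm(vector) / max(norm(projected), eps)
    return [p * scale for p in projected]
```

Plain orthogonal ablation would stop after the projection step and shrink the vector; the final rescaling is what keeps the magnitude unchanged while the result stays orthogonal to the ablated direction.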

To achieve strong results:

  1. Parameter ranges were iteratively refined based on the resulting refusal counts and KL divergence scores
  2. The scoring function was adjusted to prioritize low-refusal results
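
The exact scoring function is not given here, so the following is only a hypothetical sketch of what "prioritizing low-refusal results" could look like: a single score in which the refusal rate is weighted more heavily than the KL divergence. The function name, the `refusal_weight` value, and the candidate numbers are all invented for illustration.

```python
def score(kl_divergence, refusals, total_prompts, refusal_weight=4.0):
    """Lower is better; the refusal rate counts more than the divergence."""
    if total_prompts <= 0:
        raise ValueError("total_prompts must be positive")
    refusal_rate = refusals / total_prompts
    return refusal_weight * refusal_rate + kl_divergence

# Hypothetical trial results: (KL divergence, refusals out of 100 prompts).
candidates = [(0.02, 5), (0.08, 1), (0.05, 3)]
best = min(candidates, key=lambda c: score(c[0], c[1], total_prompts=100))
```

With this weighting, the candidate with the fewest refusals wins even though its divergence is the highest of the three; a smaller `refusal_weight` would shift the balance back toward low divergence.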

The model name contains the properties of the ablation:

  1. MPOA for the usage of Magnitude-Preserving Orthogonal Ablation
  2. G for the usage of global direction scope
  3. W for the usage of winsorization
  4. D for the measured KL divergence
  5. R for the number of refusals
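
The naming scheme above can be read back mechanically. The snippet below is a small sketch that assumes the name always follows the pattern seen in the links on this page; `parse_model_name` and its field names are made up for the example.

```python
import re

# Matches the "MPOA-G-W99-D0.0394-R03" portion of the model name.
NAME_PATTERN = re.compile(
    r"(?P<method>MPOA)-(?P<scope>G)-W(?P<winsor>\d+)-D(?P<kl>\d+\.\d+)-R(?P<refusals>\d+)"
)

def parse_model_name(name):
    """Extract the ablation properties encoded in a model name."""
    match = NAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return {
        "method": match["method"],
        "global_scope": match["scope"] == "G",
        "winsorization_percent": int(match["winsor"]),
        "kl_divergence": float(match["kl"]),
        "refusals": int(match["refusals"]),
    }

info = parse_model_name("G3-Heresy-MPOA-G-W99-D0.0394-R03")
```

Because the pattern is searched for rather than anchored, the same function also accepts names with a trailing suffix such as `-GGUF` or `-MLX`.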

Original: https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03
GGUF (standard): https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03-GGUF
GGUF (imatrix): https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03-i1-GGUF
MLX: https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03-MLX
