An experimental ablation of Gemma-3-27B-it, using the Heretic tool.
Compared to the standard configuration of Heretic, there are a few changes:
- The training and test datasets used were extended compared to the default subset used by Heretic
- A version of Magnitude-Preserving Orthogonal Ablation (MPOA) is used
- To stay faithful to MPOA, a single harmful direction, chosen from between two layers, is ablated (Heretic's "global" direction scope)
- To stay faithful to MPOA, a 99% winsorization is applied to the residuals
- Some additional refusal markers were added so that refusals written with unusual punctuation do not slip past the refusal detection
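As a rough illustration of the two MPOA-related changes above, here is a minimal sketch in plain Python. It is not Heretic's or MPOA's actual code: the function names are made up for this example, vectors are plain lists rather than tensors, and the winsorization is assumed to be a symmetric clamp at the given quantile of the absolute values. The real tools may use a different convention.

```python
import math


def winsorize(values, quantile=0.99):
    """Clamp values to [-t, t], where t is the `quantile` of their magnitudes.

    Assumption: symmetric clamping on absolute values with a nearest-rank
    quantile. A real implementation may interpolate or clamp per dimension.
    """
    magnitudes = sorted(abs(v) for v in values)
    rank = max(1, math.ceil(quantile * len(magnitudes)))
    threshold = magnitudes[rank - 1]
    return [max(-threshold, min(threshold, v)) for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def ablate_preserving_magnitude(weight, direction):
    """Remove the component of `weight` along `direction`, then restore its norm."""
    length = norm(direction)
    unit = [x / length for x in direction]
    projection = dot(weight, unit)
    # Orthogonal ablation: subtract the part of the vector along the direction.
    ablated = [w - projection * u for w, u in zip(weight, unit)]
    ablated_norm = norm(ablated)
    if ablated_norm == 0.0:
        # The vector was parallel to the direction; nothing is left to rescale.
        return ablated
    # Magnitude preservation: scale back to the original length.
    scale = norm(weight) / ablated_norm
    return [a * scale for a in ablated]
```

For example, `ablate_preserving_magnitude([3.0, 4.0], [1.0, 0.0])` removes the first component and rescales the remainder, so the result is orthogonal to the direction but still has length 5.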
To achieve strong results:
- Parameter ranges were iteratively refined by looking at the resulting refusal and KL divergence scores
- The scoring function was adjusted to prioritize low-refusal results
The model name contains the properties of the ablation:
- MPOA for the usage of Magnitude-Preserving Orthogonal Ablation
- G for the usage of global direction scope
- W for the usage of winsorization
- D for the measured KL divergence
- R for the number of refusals
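The naming scheme can be decoded mechanically. The sketch below is only an illustration: the function name, the regular expression and the dictionary keys are all invented for this example. It assumes the number after W is the winsorization percentage, the number after D is the KL divergence, and the number after R is the refusal count.

```python
import re

# Hypothetical pattern for names such as "G3-Heresy-MPOA-G-W99-D0.0394-R03".
NAME_PATTERN = re.compile(
    r"MPOA-G-W(?P<winsor>\d+)-D(?P<kl>\d+\.\d+)-R(?P<refusals>\d+)"
)


def parse_ablation_name(name):
    """Extract the ablation properties encoded in a model name."""
    match = NAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"name does not follow the expected scheme: {name!r}")
    return {
        "method": "MPOA",
        "direction_scope": "global",
        "winsorization_percent": int(match["winsor"]),
        "kl_divergence": float(match["kl"]),
        "refusals": int(match["refusals"]),
    }


print(parse_ablation_name("G3-Heresy-MPOA-G-W99-D0.0394-R03"))
```

Because the pattern is searched rather than matched against the whole string, names with a suffix such as -GGUF or -MLX parse the same way.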
Original: https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03
GGUF (standard): https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03-GGUF
GGUF (imatrix): https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03-i1-GGUF
MLX: https://huggingface.co/spikymoth/G3-Heresy-MPOA-G-W99-D0.0394-R03-MLX