https://huggingface.co/DavidAU/Qwen3.5-13B-Deckard-Heretic-Uncensored-Thinking

#2098
by VaLtEc-BoY - opened

Qwen3.5-13B-Deckard-Heretic-Uncensored-Thinking
A brutally smart model: a merge (and upscale) of a HERETIC/UNCENSORED Qwen 3.5 (9B), followed by a final Unsloth fine-tune over it with the DECKARD dataset (5) on local hardware.

Ultra fine detail in both thinking and output generation.

Extra strong at following all instructions, including complex system prompts.

48 layers, 639 tensors.

A 13B monster.

Uses an upgraded (tool support) and debugged (fixes looping, repeats, and other issues) Jinja template in both training and the final model source.

This is also a HERETIC model, trained post "Heretic'ing" -> this model does what you want, no questions asked.

Fully uncensored.

Vision (images) tested -> working with new training.

@RichardErkhov
Can you help me with something? I want to apply hybrid layer quantization to a good model later on, but I don't know how to make my quantized layers look like the ones in the screenshot:

(image)

I think you attached the wrong screenshot lol. And what is hybrid layer quantization? You mean GGUF? Just ask me, I will quant it


  1. I mean that when I create a new repository for a quantized model, it doesn't automatically appear in the "Quantizations" section on the original model page. I'd like to understand how to ensure my quants are listed there.
  2. By "hybrid quantization," I'm referring to a technique where, for a target quantization level (e.g., Q5_K_M), I use higher precision for specific components—for instance, keeping embeddings in a higher precision like Q8 and applying higher-bit quantization to the Self-Attention blocks. I also know that keeping the first and last layers at higher precision improves the quality of responses, as these layers are more critical than the intermediate ones. I am already capable of creating such quantized models.
    P.S. English isn't my native language, so I'm asking Gemini to translate this text into English.
  1. ah, it's something to do with the top of the readme. Explore the source code of any model from mradermacher and you will find what you are searching for. I don't remember it myself, as I didn't bother doing it; when I started it wasn't even a feature and I was too lazy to edit the code lol
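For reference, the piece of the readme being hinted at is the YAML front matter of the quantized repo's model card: Hugging Face uses the `base_model` field (with `base_model_relation: quantized`) to list a repo under the original model's "Quantizations" section. A minimal sketch, assuming the quantized repo targets the model from this thread:

```yaml
---
# README.md front matter of the *quantized* repo
base_model: DavidAU/Qwen3.5-13B-Deckard-Heretic-Uncensored-Thinking
base_model_relation: quantized
---
```

With this in place, the hub links the quant repo back to the source model automatically; no change to the original repo is needed.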

  2. uhm, not sure about that, as I haven't done that for a while. I think googling and checking the llama.cpp repository would be of better help. I will also ask nico if he knows how to do that
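The hybrid scheme described above (higher-precision embeddings, extra bits for the first and last blocks, a mid-level default for the rest) boils down to a per-tensor quant-type choice. A minimal sketch of that selection logic in Python, using GGUF-style tensor names; this is illustrative only, not the llama.cpp API:

```python
def pick_quant(tensor_name: str, n_blocks: int, default: str = "Q5_K") -> str:
    """Choose a quant type per tensor for a hybrid scheme.

    Embeddings/output stay near-lossless (Q8_0), the first and last
    transformer blocks get extra precision (Q6_K), and everything else
    is quantized at the target level. Illustrative, not llama.cpp API.
    """
    if tensor_name in ("token_embd.weight", "output.weight"):
        return "Q8_0"  # embeddings/output are most sensitive to quant error
    if tensor_name.startswith("blk."):
        blk = int(tensor_name.split(".")[1])
        if blk == 0 or blk == n_blocks - 1:
            return "Q6_K"  # first/last blocks matter more than the middle
    return default  # intermediate layers: target quant level

# Example: plan for a 48-layer model like the one in this thread
plan = {name: pick_quant(name, 48)
        for name in ["token_embd.weight", "blk.0.attn_q.weight",
                     "blk.24.ffn_up.weight", "blk.47.attn_v.weight"]}
```

In practice the llama.cpp `llama-quantize` tool exposes flags for overriding the types of specific tensor groups (e.g. token embeddings and the output tensor), which is the usual way to realize such a scheme without custom code.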

What is your native language?


I'm from Ukraine; I speak both Ukrainian and Russian, but I'm chatting with the model in Russian.
For some reason, the model doesn't seem to be loading.

(image)

@RichardErkhov, I managed to upload the model and write a description. Now I need to figure out how to get my quantization displayed on the model's home page, but I'll get to that later.

I think I've figured it out, but for some reason the files aren't showing up here:
(image)

You can for sure speak to me in russian if it helps you understand better =)

Not sure why Hugging Face doesn't know how to figure the size out; perhaps you want to rename it to something like modelname.IQ4_XS.gguf
Notice the "." after the model name instead of "-"
I don't think it matters, but you never know
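The hub appears to parse the quant type out of a `model.QUANT.gguf`-style filename; the exact rule it applies is an assumption, but the dot-delimited convention suggested above can be sanity-checked with a quick sketch:

```python
import re

# Assumed convention: quant type is a dot-delimited token before ".gguf",
# e.g. "mymodel.IQ4_XS.gguf" (this regex is a guess, not the hub's rule)
GGUF_NAME = re.compile(r"^(?P<model>.+)\.(?P<quant>[A-Z0-9_]+)\.gguf$")

def quant_of(filename: str):
    """Return the quant-type token if the filename follows the dot convention."""
    m = GGUF_NAME.match(filename)
    return m.group("quant") if m else None
```

A hyphen-separated name like `mymodel-IQ4_XS.gguf` would not match this pattern, which may be why the size badge fails to appear.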


I made it. My quantization has already been downloaded 471 times. I didn’t think anyone would really need it.

@RichardErkhov
I just created a quantized version of Google/Gemma-4-E4B and only later realized that it’s a base model. Does that mean I didn’t need to quantize it?

You can do any quantization you want =)
But if your goal was to quantize your own model, then no, you don't need to quantize the base model; you only need to quantize your model
