Instructions for using EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with libraries and local apps.

Libraries

How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with llama-cpp-python:

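# !pip install llama-cpp-python
# Note: Llama.from_pretrained downloads from the Hub and also needs the huggingface-hub package.

from llama_cpp import Llama

# Download (on first use) and load the BF16 GGUF file from this repository.
llm = Llama.from_pretrained(
    repo_id="EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF",
    filename="Ministral-3-3B-Instruct-2512-BF16.gguf",
)

# Send one user turn that combines a text prompt with an image URL.
# Depending on your llama-cpp-python version, image input may need extra
# multimodal setup (for example a chat handler); text-only prompts do not.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image in one sentence."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                    }
                }
            ]
        }
    ]
)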
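
`create_chat_completion` returns an OpenAI-style completion dictionary rather than a bare string. A minimal sketch of pulling the reply text out of such a dictionary follows. `extract_reply` and the hand-written `sample` dictionary are illustrative names and data; they are not part of llama-cpp-python and not output from this model.

```python
from typing import Any


def extract_reply(response: dict[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("chat completion contains no choices")
    # Each choice carries a message with "role" and "content" keys.
    return choices[0]["message"]["content"]


# Hand-written stand-in shaped like a chat completion; not real model output.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "A statue on an island."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 6, "total_tokens": 18},
}

print(extract_reply(sample))
```

With a real model loaded, the same helper could be applied to the value returned by `llm.create_chat_completion(...)`.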

Local Apps

llama.cpp

How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
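
Once `llama-server` is running, any OpenAI-compatible client can talk to it. The sketch below builds a text-only chat request with the Python standard library. It assumes the server listens on `http://localhost:8080`, which is the llama.cpp default when no `--host` or `--port` is given. `build_chat_request` and `ask` are hypothetical helper names, and the system prompt is only an example.

```python
import json
import urllib.request

# Assumption: llama-server was started without --host/--port overrides.
BASE_URL = "http://localhost:8080"


def build_chat_request(
    prompt: str, system: str = "You are a concise assistant."
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(ask("Summarize what a GGUF file is in one sentence."))` would print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.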

Use Docker

docker model run hf.co/EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M

vLLM

How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
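
The request above references a remote image URL. For a local file, OpenAI-compatible servers generally also accept a base64 `data:` URL in the same `image_url` field. Below is a minimal sketch of building such a message under that assumption. `image_part` and `vision_message` are illustrative helper names, and the bytes used at the end are a placeholder rather than a real image.

```python
import base64
import mimetypes
from pathlib import Path


def image_part(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as an OpenAI-style image_url content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(prompt: str, image_path: str) -> dict:
    """Build a user message combining a text prompt with a local image file."""
    path = Path(image_path)
    # Fall back to a generic type when the extension is not recognized.
    mime_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_part(path.read_bytes(), mime_type),
        ],
    }


# Placeholder bytes only, to show the shape of the encoded part.
part = image_part(b"\xff\xd8\xff", "image/jpeg")
print(part["image_url"]["url"])
```

The dictionary returned by `vision_message` can be placed in the `messages` array of the JSON body in place of the URL-based message.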

Use Docker

docker model run hf.co/EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M

Ollama
How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with Ollama:
```
ollama run hf.co/EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
```
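
Besides the interactive `ollama run` session, a running Ollama instance also exposes a local REST API. The sketch below builds a non-streaming chat request for it using only the standard library. It assumes Ollama's default address `http://localhost:11434` and that the model has already been pulled by the command above. `build_ollama_request` and `chat` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M"


def build_ollama_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming request for Ollama's chat API."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_ollama_request(prompt), timeout=300) as resp:
        return json.load(resp)["message"]["content"]
```

With Ollama running, `print(chat("Hello"))` would print the reply.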

Unsloth Studio

How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF to start chatting

Docker Model Runner
How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with Docker Model Runner:
```
docker model run hf.co/EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M
```

Lemonade

How to use EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.Ministral-3-3B-Instruct-2512-GGUF-Q4_K_M

List all available models

lemonade list

------------------------------------------------
- Model Details and Specifications: -
------------------------------------------------

Ministral-3 3B Instruct 2512 (GGUF)

This release contains:
GGUF model files, converted and quantized for use with both llama.cpp and Ollama.

Quantized GGUF version of:

Ministral-3-3B-Instruct-2512-BF16
(by MistralAI)

Original Model Link:

mistralai/Ministral-3-3B-Instruct-2512-BF16

-------------------------------------------------------------
- GGUF Conversion and Quantization Details: -
-------------------------------------------------------------

Software used to convert Safetensors to GGUF:

llama.cpp

Software used to create Quantized GGUF Files:

llama.cpp

Specific llama.cpp Release Tag (Build):

b7540

Converted to GGUF and Quantized by:

EnlistedGhost

--------------------------
---- Original Info ----
--------------------------

(Cross-posted from the model card linked in the "Model Details" section above.)

Ministral 3 14B Instruct 2512 BF16

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. It is a powerful and efficient language model with vision capabilities.

This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction-based use cases.

The Ministral 3 family is designed for edge deployment and can run on a wide range of hardware. Ministral 3 14B can even be deployed locally: it fits in 32GB of VRAM in BF16, and in less than 24GB of RAM/VRAM when quantized.

A no-loss FP8 version is also provided; other formats and quantizations are available in the Ministral 3 - Additional Checkpoints collection.

Key Features

Ministral 3 14B consists of two main architectural components:

13.5B Language Model
0.4B Vision Encoder
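
As a rough check on the memory figures quoted for this model, weight storage can be estimated as parameter count times bits per weight, divided by eight. The sketch below applies that to the two component sizes listed. It counts weights only (no KV cache, activations, or runtime overhead), and treating each quantization level as exactly its nominal bit width is a simplifying assumption, since GGUF quantization types typically use somewhat more bits per weight on average. `weight_gb` is an illustrative helper name.

```python
def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# Component sizes as listed: 13.5B language model + 0.4B vision encoder.
TOTAL_PARAMS = 13.5e9 + 0.4e9

# Nominal bit widths; real quantized files are usually a little larger.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_gb(TOTAL_PARAMS, bits):.1f} GB of weights")
```

At 16 bits per weight this comes to about 27.8 GB for the weights alone, which is consistent with the model fitting in 32GB of VRAM in BF16 with some headroom left for context.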

The Ministral 3 14B Instruct model offers the following capabilities:

Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
System Prompt: Maintains strong adherence and support for system prompts.
Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
Large Context Window: Supports a 256k context window.
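
To illustrate the native function calling listed above, here is a sketch of a tool-enabled request body in the OpenAI-compatible chat format, together with a helper that decodes tool calls from a reply. The `get_weather` tool, its parameters, and the helper names `build_tool_request` and `parse_tool_calls` are hypothetical, and whether a given server honours `tools` depends on its version and launch options.

```python
import json


def build_tool_request(prompt: str) -> dict:
    """Build an OpenAI-style chat payload that offers one callable tool."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool name
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
    return {
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool],
        "tool_choice": "auto",
    }


def parse_tool_calls(message: dict) -> list[tuple[str, dict]]:
    """Decode (name, arguments) pairs from an assistant message, if any."""
    calls = message.get("tool_calls") or []
    # In this format, arguments arrive as a JSON-encoded string.
    return [
        (c["function"]["name"], json.loads(c["function"]["arguments"]))
        for c in calls
    ]


# Hand-written example of the message shape a server may return; not real output.
example = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
        }
    ],
}
print(parse_tool_calls(example))
```

The application is responsible for actually running the named function and sending its result back as a follow-up message; the model only proposes the call.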

Use Cases

Private AI deployments where advanced capabilities meet practical hardware constraints:

Private/custom chat and AI assistant deployments in constrained environments
Advanced local agentic use cases
Fine-tuning and specialization
And more...

Bringing advanced AI capabilities to most environments.

Ministral 3 Family

Model Name	Type	Precision	Link
Ministral 3 3B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 3B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 3B Reasoning 2512	Reasoning capable	BF16	Hugging Face
Ministral 3 8B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 8B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 8B Reasoning 2512	Reasoning capable	BF16	Hugging Face
Ministral 3 14B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 14B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 14B Reasoning 2512	Reasoning capable	BF16	Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similarly sized models.

Reasoning

Model	AIME25	AIME24	GPQA Diamond	LiveCodeBench
Ministral 3 14B	0.850	0.898	0.712	0.646
Qwen3-14B (Thinking)	0.737	0.837	0.663	0.593

Ministral 3 8B	0.787	0.860	0.668	0.616
Qwen3-VL-8B-Thinking	0.798	0.860	0.671	0.580

Ministral 3 3B	0.721	0.775	0.534	0.548
Qwen3-VL-4B-Thinking	0.697	0.729	0.601	0.513

Instruct

Model	Arena Hard	WildBench	MATH Maj@1	MM MTBench
Ministral 3 14B	0.551	68.5	0.904	8.49
Qwen3 14B (Non-Thinking)	0.427	65.1	0.870	NOT MULTIMODAL
Gemma3-12B-Instruct	0.436	63.2	0.854	6.70

Ministral 3 8B	0.509	66.8	0.876	8.08
Qwen3-VL-8B-Instruct	0.528	66.3	0.946	8.00

Ministral 3 3B	0.305	56.8	0.830	7.83
Qwen3-VL-4B-Instruct	0.438	56.8	0.900	8.01
Qwen3-VL-2B-Instruct	0.163	42.2	0.786	6.36
Gemma3-4B-Instruct	0.318	49.1	0.759	5.23

Base

Model	Multilingual MMLU	MATH CoT 2-Shot	AGIEval 5-shot	MMLU Redux 5-shot	MMLU 5-shot	TriviaQA 5-shot
Ministral 3 14B	0.742	0.676	0.648	0.820	0.794	0.749
Qwen3 14B Base	0.754	0.620	0.661	0.837	0.804	0.703
Gemma 3 12B Base	0.690	0.487	0.587	0.766	0.745	0.788

Ministral 3 8B	0.706	0.626	0.591	0.793	0.761	0.681
Qwen 3 8B Base	0.700	0.576	0.596	0.794	0.760	0.639

Ministral 3 3B	0.652	0.601	0.511	0.735	0.707	0.592
Qwen 3 4B Base	0.677	0.405	0.570	0.759	0.713	0.530
Gemma 3 4B Base	0.516	0.294	0.430	0.626	0.589	0.640

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

GGUF

Model size: 3B params
Architecture: mistral3
Hardware compatibility: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

Model tree for EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF

Base model: mistralai/Ministral-3-3B-Base-2512
Finetuned: mistralai/Ministral-3-3B-Instruct-2512-BF16
Quantized (16): this model (EnlistedGhost/Ministral-3-3B-Instruct-2512-GGUF)
