duckdb-nsql-7b-mlx-4bit
This repository contains an MLX-optimized 4-bit quantized variant of motherduckdb/DuckDB-NSQL-7B-v0.1, intended for maximum efficiency (lowest memory, fastest decoding) on Apple Silicon (M1/M2/M3/M4).
Model description
DuckDB-NSQL-7B is a 7B-parameter language model fine-tuned to translate natural-language questions into DuckDB SQL. The 4-bit MLX conversion targets minimal memory usage and high throughput, at a larger quality trade-off than FP16 or 8-bit, especially for long schemas and complex queries.
Conversion details
- Base model: motherduckdb/DuckDB-NSQL-7B-v0.1 (fine-tuned from Llama 2 7B)
- Format: MLX
- Precision: 4-bit quantized
- Typical memory footprint: ~4–5 GB (varies by MLX quantization / runtime)
- Recommended for: laptops / demos / constrained RAM; when speed matters more than perfect SQL fidelity
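The ~4–5 GB figure above can be sanity-checked with back-of-envelope arithmetic. A minimal sketch, assuming typical MLX quantization defaults (group size 64, fp16 scale and bias per group); the constants here are assumptions, not measured values:

```python
# Rough memory estimate for 4-bit quantized 7B weights.
PARAMS = 7e9
BITS_PER_PARAM = 4
GROUP_SIZE = 64                       # assumed MLX default
OVERHEAD_BITS = 2 * 16 / GROUP_SIZE   # fp16 scale + bias per group

weight_gb = PARAMS * (BITS_PER_PARAM + OVERHEAD_BITS) / 8 / 1e9
print(f"quantized weights: ~{weight_gb:.1f} GB")  # → quantized weights: ~3.9 GB
```

The remainder up to ~4–5 GB comes from the KV cache, activations, and runtime buffers, which grow with context length.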
Installation
```shell
pip install mlx-lm
```
Usage
Python

```python
from mlx_lm import load, generate

model, tokenizer = load("Nuxera/duckdb-nsql-7b-mlx-4bit")

schema = """
CREATE TABLE hospitals (
  hospital_id BIGINT,
  hospital_name VARCHAR,
  region VARCHAR,
  bed_capacity INTEGER
);
CREATE TABLE encounters (
  encounter_id BIGINT,
  hospital_id BIGINT,
  encounter_datetime TIMESTAMP,
  encounter_type VARCHAR
);
"""

question = "For each hospital region, how many encounters happened this month?"

prompt = f"""You are an assistant that writes valid DuckDB SQL queries.
### Schema:
{schema}
### Question:
{question}
### Response (DuckDB SQL only):"""

# Greedy decoding is the default in recent mlx-lm releases; older releases
# accept an explicit temp=0.0 keyword argument instead.
out = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(out)
```
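Even with an SQL-only instruction, model output can carry markdown fences or trailing prose. A small post-processing sketch (the `extract_sql` helper is illustrative, not part of this repository):

```python
import re

def extract_sql(text: str) -> str:
    """Strip markdown fences and keep text up to the first statement terminator."""
    # Drop ``` / ```sql fence markers if the model emitted them.
    text = re.sub(r"```(?:sql)?", "", text).strip()
    # Keep everything up to (and including) the first semicolon, if present.
    end = text.find(";")
    return text[: end + 1] if end != -1 else text

print(extract_sql("```sql\nSELECT region FROM hospitals;\n```"))
# → SELECT region FROM hospitals;
```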
Run as a local server
```shell
mlx_lm.server --model Nuxera/duckdb-nsql-7b-mlx-4bit --port 8080
```
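Once the server is running, you can query it over HTTP. A minimal client sketch using only the standard library; it assumes the server exposes an OpenAI-compatible `/v1/completions` endpoint on port 8080 (true for recent mlx-lm releases, but endpoint names and fields are assumptions, not guarantees):

```python
import json
import urllib.request

def completion_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) a completion request for the local server."""
    payload = {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.0}
    return urllib.request.Request(
        "http://localhost:8080/v1/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = completion_request("SELECT")
# Sending requires a live server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```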
Prompt format
This model works best when you provide:
- Clear schema (tables + columns)
- One question
- Explicit instruction to output SQL only
Example:
```text
You are an assistant that writes valid DuckDB SQL queries.
### Schema:
CREATE TABLE ...
### Question:
...
### Response (DuckDB SQL only):
```
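The template above can be wrapped in a small helper so prompts stay consistent across calls. A sketch (the `build_prompt` name and exact section markers are a convention from this card, not a hard requirement of the model):

```python
def build_prompt(schema: str, question: str) -> str:
    """Assemble the recommended layout: schema, one question, SQL-only instruction."""
    return (
        "You are an assistant that writes valid DuckDB SQL queries.\n"
        "### Schema:\n"
        f"{schema.strip()}\n"
        "### Question:\n"
        f"{question.strip()}\n"
        "### Response (DuckDB SQL only):"
    )

p = build_prompt("CREATE TABLE t (x INTEGER);", "How many rows are in t?")
print(p)
```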
Quality notes (4-bit)
4-bit quantization can degrade output quality more than 8-bit or FP16 when prompts include:
- very long schemas with many similarly named columns
- multi-join / nested subqueries
- ambiguous questions requiring stronger reasoning
- strict formatting constraints
If you need maximum reliability, prefer FP16 or 8-bit.
License
This model inherits the Llama 2 license from the base model.
Citation
```bibtex
@misc{nuxera_duckdb_nsql_mlx_4bit,
  title={DuckDB-NSQL-7B MLX 4-bit Quantized Conversion},
  author={Nuxera AI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Nuxera/duckdb-nsql-7b-mlx-4bit}}
}
```
Base model:
```bibtex
@misc{duckdb_nsql,
  title={DuckDB-NSQL-7B: Natural Language to SQL for DuckDB},
  author={MotherDuck},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/motherduckdb/DuckDB-NSQL-7B-v0.1}}
}
```
Acknowledgments
- Original model by MotherDuck
- MLX framework by Apple ML Research
- The mlx-lm library
- Nuxera AI
Model tree for Nuxera/duckdb-nsql-7b-mlx-4bit
- Base model: meta-llama/Llama-2-7b
- Fine-tuned from: motherduckdb/DuckDB-NSQL-7B-v0.1