AI-MO/NuminaMath-CoT
Viewer • Updated • 860k • 61.3k • 578
How to use AITRADER/Amsi-fin-o1-MLX-8bit with MLX:
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
# Load the model
model, processor = load("AITRADER/Amsi-fin-o1-MLX-8bit")
config = load_config("AITRADER/Amsi-fin-o1-MLX-8bit")
# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."
# Apply chat template
formatted_prompt = apply_chat_template(
processor, config, prompt, num_images=1
)
# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)How to use AITRADER/Amsi-fin-o1-MLX-8bit with Pi:
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "AITRADER/Amsi-fin-o1-MLX-8bit"
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
"providers": {
"mlx-lm": {
"baseUrl": "http://localhost:8080/v1",
"api": "openai-completions",
"apiKey": "none",
"models": [
{
"id": "AITRADER/Amsi-fin-o1-MLX-8bit"
}
]
}
}
}# Start Pi in your project directory: pi
How to use AITRADER/Amsi-fin-o1-MLX-8bit with Hermes Agent:
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "AITRADER/Amsi-fin-o1-MLX-8bit"
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default AITRADER/Amsi-fin-o1-MLX-8bit
hermes
This is the 8-bit quantized MLX conversion of AITRADER/Amsi-fin-o1, a specialized financial vision-language model. The 8-bit quantization reduces memory usage by ~50% while maintaining excellent performance for financial analysis tasks.
| Parameter | Value |
|---|---|
| Quantization Type | 8-bit Integer |
| Group Size | 64 |
| Memory Reduction | ~50% |
| Quality Retention | ~98%+ |
| Component | Specification |
|---|---|
| Base Architecture | Qwen3-VL (4B parameters) |
| Text Model | 36 layers, 2560 hidden size |
| Vision Encoder | 24 layers, 1024 hidden size |
| Attention Heads | 32 (8 KV heads) |
| Context Length | Up to 131,072 tokens |
| Precision | 8-bit Quantized |
| Model Size | ~4GB |
# Install mlx-vlm
pip install -U mlx-vlm
# Basic image analysis
python -m mlx_vlm.generate \
--model AITRADER/Amsi-fin-o1-MLX-8bit \
--max-tokens 512 \
--temperature 0.7 \
--prompt "Analyze this financial chart and explain the trends." \
--image path/to/financial_chart.png
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
# Load model
model_path = "AITRADER/Amsi-fin-o1-MLX-8bit"
model, processor = load(model_path)
config = load_config(model_path)
# Prepare prompt
prompt = apply_chat_template(
processor,
config,
"Analyze the financial performance shown in this quarterly report.",
num_images=1
)
# Generate response
output = generate(
model,
processor,
prompt,
image="path/to/report.png",
max_tokens=512,
temperature=0.7
)
print(output)
from mlx_vlm import load, stream_generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
model_path = "AITRADER/Amsi-fin-o1-MLX-8bit"
model, processor = load(model_path)
config = load_config(model_path)
prompt = apply_chat_template(
processor,
config,
"What are the key insights from this financial statement?",
num_images=1
)
# Stream the response
for token in stream_generate(
model,
processor,
prompt,
image="financial_statement.png",
max_tokens=512
):
print(token, end="", flush=True)
prompt = """Analyze this stock chart and provide:
1. The overall trend (bullish/bearish/neutral)
2. Key support and resistance levels
3. Volume analysis
4. Trading recommendations"""
prompt = """Review this income statement and:
1. Calculate key financial ratios
2. Compare YoY performance
3. Identify areas of concern
4. Highlight positive indicators"""
prompt = """Extract all numerical data from this financial document
and organize it in a structured format."""
prompt = """Based on this quarterly report:
1. Summarize the company's financial health
2. Calculate growth metrics
3. Provide an investment thesis"""
| Metric | 8-bit (This) | BF16 |
|---|---|---|
| Memory Usage | ~4GB | ~8GB |
| Inference Speed | Faster | Baseline |
| Quality | Very Good | Highest |
| Recommended For | Limited RAM / Speed | Maximum Quality |
| Apple Silicon | Performance |
|---|---|
| M1 (8GB) | Good |
| M1 Pro/Max (16GB+) | Very Good |
| M2/M2 Pro/Max | Excellent |
| M3/M3 Pro/Max | Excellent |
| M4/M4 Pro/Max | Best |
Minimum: 8GB unified memory Recommended: 16GB+ for larger batch sizes
| Variant | Precision | Size | Speed | Quality |
|---|---|---|---|---|
| bf16 | BFloat16 | ~8GB | Baseline | Highest |
| 8bit | 8-bit Quantized | ~4GB | Faster | Very Good |
Choose the 8-bit version if:
Choose the bf16 version if:
This model was fine-tuned on specialized financial datasets:
@misc{amsi-fin-o1-mlx-8bit,
title={Amsi-fin-o1 MLX 8-bit: Quantized Financial Vision-Language Model for Apple Silicon},
author={AITRADER},
year={2025},
url={https://huggingface.co/AITRADER/Amsi-fin-o1-MLX-8bit}
}
This model is released under the Apache 2.0 License.
8-bit
Base model
Qwen/Qwen3-VL-4B-Thinking