Instructions to use WiroAI/WiroAI-Finance-Llama-8B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use WiroAI/WiroAI-Finance-Llama-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="WiroAI/WiroAI-Finance-Llama-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("WiroAI/WiroAI-Finance-Llama-8B")
model = AutoModelForCausalLM.from_pretrained("WiroAI/WiroAI-Finance-Llama-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use WiroAI/WiroAI-Finance-Llama-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "WiroAI/WiroAI-Finance-Llama-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "WiroAI/WiroAI-Finance-Llama-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
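
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM, and the sketch assumes the server started above is reachable at the base URL you pass in.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, user_message, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers place the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("http://localhost:8000", "WiroAI/WiroAI-Finance-Llama-8B", "What is the capital of France?")` should return the reply as a string. Because the base URL is a parameter, the same helper can be pointed at any other OpenAI-compatible endpoint.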

Use Docker

docker model run hf.co/WiroAI/WiroAI-Finance-Llama-8B

SGLang

How to use WiroAI/WiroAI-Finance-Llama-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "WiroAI/WiroAI-Finance-Llama-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "WiroAI/WiroAI-Finance-Llama-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "WiroAI/WiroAI-Finance-Llama-8B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "WiroAI/WiroAI-Finance-Llama-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use WiroAI/WiroAI-Finance-Llama-8B with Docker Model Runner:
```
docker model run hf.co/WiroAI/WiroAI-Finance-Llama-8B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

WiroAI-Finance-Llama-8B

🚀 Meet WiroAI/WiroAI-Finance-Llama-8B! A robust language model with stronger finance knowledge! 🚀

🌟 Key Features

Fine-tuned on 500,000+ high-quality finance instructions (Josephgflowers/Finance-Instruct-500k).
Fine-tuned with LoRA, without quantization.
Adapted for finance expertise.
Built on Meta's LLaMA architecture.
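
To give a feel for why LoRA is a lightweight way to fine-tune, the sketch below compares the number of trainable parameters in a full weight matrix with the number in its low-rank LoRA factors. The matrix size (4096 x 4096) and the rank (16) are assumptions chosen for illustration; the model card does not state the layer dimensions or the LoRA rank used for this model.

```python
def full_param_count(d_out, d_in):
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumed values for illustration only; not taken from the model card.
d_out = d_in = 4096
rank = 16

full = full_param_count(d_out, d_in)
lora = lora_param_count(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

LoRA keeps the original matrix frozen and trains only the two small factors, so the trainable parameter count grows linearly with the rank rather than with the product of the two dimensions.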

📝 Model Details

This model is a finance fine-tuned version of a model from Meta's LLaMA family. It was trained with Supervised Fine-Tuning (SFT) on carefully curated, high-quality finance instructions.

Usage

Transformers Pipeline

import transformers
import torch


model_id = "WiroAI/WiroAI-Finance-Llama-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline.model.eval()

messages = [
    {"role": "system", "content": "You are a finance chatbot developed by Wiro AI"},
    {
        "role": "user",
        "content": "How can central banks balance the trade-off between controlling inflation and maintaining economic growth, especially in an environment of high public debt and geopolitical uncertainty?",
    },
]

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.9,
)

print(outputs[0]["generated_text"][-1]['content'])

Example output (sampling is enabled, so responses will vary):

Central banks face a challenging trade-off between controlling inflation and maintaining economic growth, particularly in the face of high public debt and geopolitical uncertainty. To balance these concerns, central banks can consider implementing a combination of monetary and fiscal policies, as well as adjusting the policy mix to address these challenges. For example, central banks can consider:

1. Implementing a gradual tightening of monetary policy to help control inflation without stifling economic growth.
2. Encouraging fiscal authorities to implement fiscal consolidation measures to reduce public debt and improve public finances.
3. Implementing targeted support measures for specific sectors or industries that are most affected by geopolitical uncertainty or other external shocks.
4. Developing and implementing a forward guidance policy to provide clarity and stability to financial markets and economic actors.
5. Collaborating with other central banks and international organizations to address the global nature of these challenges and develop coordinated policy responses.

It is important to note that the appropriate policy response will depend on a variety of factors, including the specific country's economic conditions, monetary and fiscal policy frameworks, and the extent of geopolitical uncertainty. Central banks will need to weigh the benefits and risks of different policy options carefully and adapt their policies accordingly.
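
The `print` call in the pipeline example indexes into a nested structure: for chat-style input, the pipeline returns a list with one entry per prompt, and each entry's `generated_text` holds the whole conversation with the new assistant message appended last. The sketch below wraps that lookup in a small helper and runs it on a hand-written stand-in for the pipeline output; `last_assistant_reply` and `mock_outputs` are illustrative names, not part of Transformers.

```python
def last_assistant_reply(outputs):
    """Return the content of the final assistant message of the first result."""
    conversation = outputs[0]["generated_text"]
    last_message = conversation[-1]
    if last_message.get("role") != "assistant":
        raise ValueError("the last message is not an assistant reply")
    return last_message["content"]


# Hand-written stand-in with the same shape as the pipeline's chat output.
mock_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a finance chatbot developed by Wiro AI"},
            {"role": "user", "content": "What is a bond?"},
            {"role": "assistant", "content": "A bond is a debt security."},
        ]
    }
]

print(last_assistant_reply(mock_outputs))
```

The role check is a small safeguard: if generation produced nothing, or a plain string prompt was passed instead of a message list, the helper fails loudly rather than returning the user's own text.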

🤝 License and Usage

This model is provided under the Apache 2.0 license. Please review the license terms before use.

📫 Contact and Support

For questions, suggestions, and feedback, please open an issue on Hugging Face or contact us directly through our website.

Citation

@article{WiroAI,
  title={WiroAI/WiroAI-Finance-Llama-8B},
  author={Abdullah Bezir and Furkan Burhan Türkay and Cengiz Asmazoğlu},
  year={2025},
  url={https://huggingface.co/WiroAI/WiroAI-Finance-Llama-8B}
}