---
language:
- en
- it
pipeline_tag: text-generation
library_name: transformers
tags:
- llama
- code
- coding-assistant
- gguf
- instruct
- 1b
---

# PINDARO AI CODE

PINDARO AI CODE is the code-specialized release of the Pindaro model family.

## Model At A Glance

- Architecture: `LlamaForCausalLM`
- Model type: `llama`
- Approx. parameters: **~1.1B**
- Precision: `float16`
- Context length: `2048`
- Vocabulary size: `32002`
- Languages: English, Italian
- Primary use: code generation and coding assistance

## Included Artifacts

Hugging Face format:
- `model.safetensors`
- `config.json`
- `generation_config.json`
- `tokenizer.json`
- `tokenizer.model`
- `tokenizer_config.json`
- `special_tokens_map.json`
- `added_tokens.json`

GGUF format:
- `pindaro-f16.gguf`
- `pindaro-q4_k_m.gguf`

Release docs:
- `release/RELEASE_MANIFEST.json`
- `release/RELEASE_NOTES.md`
- `release/SHA256SUMS.txt`

## Prompt Format

Special tokens:
- `<|noesis|>` (id `32000`)
- `<|end|>` (id `32001`)

The configured chat template wraps each message in a role section and, when `add_generation_prompt` is set, appends a code-fence prefix to the generation prompt:

````jinja
{{ bos_token }}{% for message in messages %}<|noesis|>
{% if message['role'] == 'system' %}### System
{{ message['content'] }}
{% elif message['role'] == 'user' %}### Question
{{ message['content'] }}
{% elif message['role'] == 'assistant' %}### Answer
{{ message['content'] }}
{% endif %}<|end|>
{% endfor %}{% if add_generation_prompt %}<|noesis|>
### Answer
```
{% endif %}
````

Minimal manual prompt example:

````text
<|noesis|>
### Question
Write a Python function add(a, b).
<|end|>
<|noesis|>
### Answer
```
````
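The template above can also be mirrored in plain Python when `apply_chat_template` is not available (for example, when building prompts for a GGUF runtime). The following is a minimal sketch, not part of the release: `build_prompt` and its parameters are hypothetical names, and the newline placement is an assumption read off the template text.

```python
# Hypothetical helper that mirrors the chat template shown above.
NOESIS = "<|noesis|>"
END = "<|end|>"
FENCE = "`" * 3  # the code-fence prefix appended to the generation prompt

ROLE_HEADERS = {
    "system": "### System",
    "user": "### Question",
    "assistant": "### Answer",
}


def build_prompt(messages, add_generation_prompt=True, bos_token=""):
    """Render a list of {'role', 'content'} dicts into the prompt format."""
    parts = [bos_token]
    for message in messages:
        parts.append(f"{NOESIS}\n")
        header = ROLE_HEADERS.get(message["role"])
        if header is not None:
            # Unknown roles produce an empty section, as in the template.
            parts.append(f"{header}\n{message['content']}\n")
        parts.append(f"{END}\n")
    if add_generation_prompt:
        # Open an answer section and a code fence for the model to continue.
        parts.append(f"{NOESIS}\n### Answer\n{FENCE}\n")
    return "".join(parts)
```

Calling `build_prompt([{"role": "user", "content": "Write a Python function add(a, b)."}])` yields the same text as the manual example, with a trailing newline after the opening fence.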

## Quickstart (Transformers)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RthItalia/PINDARO-AI-CODE"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
)

messages = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Write a Python function add(a, b)."},
]

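# Render the chat template and tokenize; return_dict=True also yields the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

# Greedy decoding (do_sample=False) keeps the output deterministic.
outputs = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=False,
)

# Keep special tokens visible so the <|noesis|> / <|end|> markers can be inspected.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))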
```
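Because the generation prompt ends with an opening code fence, a completion is expected to contain code followed by a closing fence or the `<|end|>` token. The sketch below trims a completion at whichever marker comes first. `extract_code` is a hypothetical helper, and it assumes it receives only the newly generated text, not the echoed prompt.

```python
# Hypothetical post-processing helper; not part of the released artifacts.
FENCE = "`" * 3
END_TOKEN = "<|end|>"


def extract_code(completion: str) -> str:
    """Return the text before the first closing fence or <|end|> token."""
    cut = len(completion)
    for marker in (FENCE, END_TOKEN):
        index = completion.find(marker)
        if index != -1:
            cut = min(cut, index)
    # Strip surrounding newlines only, so indentation is preserved.
    return completion[:cut].strip("\n")
```

If the model emits neither marker within `max_new_tokens`, the helper returns the whole completion, which may be truncated code.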

## Quickstart (GGUF / llama.cpp)

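````bash
# Single quotes keep the shell from treating the backticks in the prompt as command substitution.
./llama-cli -m pindaro-q4_k_m.gguf -p '<|noesis|>
### Question
Write a Python function add(a, b).
<|end|>
<|noesis|>
### Answer
```' -n 120
````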

## Validation Snapshot

Last internal validation snapshot: **2026-03-02**

- HF smoke tests: PASS
- HF mini-eval coding quality: **1.00**
- GGUF F16 quality gate: PASS
- GGUF Q4_K_M quality gate: PASS
- Release verdict: **publishable: true**

Notes:
- Results are from internal sanity checks, not a full public benchmark suite.

## Known Limitations

- Generated code can be syntactically correct but logically wrong.
- May emit verbose outputs or repeated scaffolding.
- Always run tests and static checks on generated code.
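As a first line of defense, generated Python can be parsed without being executed. The following is a minimal sketch using the standard-library `ast` module. `syntax_ok` and `imported_modules` are hypothetical helper names, and neither check detects logic errors, so they complement tests rather than replace them.

```python
import ast


def syntax_ok(source: str) -> bool:
    """Return True if the source parses as Python; nothing is executed."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def imported_modules(source: str) -> list[str]:
    """List top-level modules the source imports, for manual review."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return sorted(modules)
```

A snippet that passes `syntax_ok` but imports modules such as `subprocess` or `socket` may deserve a closer look before it is run.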

## Safety

- Do not execute generated code in privileged environments without review.
- Use sandboxing for untrusted snippets.
- Add dependency and secret scanning in deployment workflows.

## Artifact Checksums (SHA256)

- `model.safetensors`: `f77c27b8babf9fcab83a7dc68ba58934e8c8c031c9f10b4b73e802d4fbfe0cec`
- `config.json`: `b37c45060f3e2f5f9b91903c9ccb32f3c21076e809954fda6c01d987cd8f25cc`
- `generation_config.json`: `6ff47e725c0ec6d0f1895670de7ee68e61a4f99703f6c8e89aea6ab14ea02dc3`
- `tokenizer.json`: `51433f06369ac3e597dfa23a811215e3511b8f86588a830ded72344b76a193ee`
- `tokenizer.model`: `9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347`
- `tokenizer_config.json`: `a0567c49a117af9af332874cfd333ddd622a09c5e9765131ceee6344cb22a3de`
- `special_tokens_map.json`: `d7805e093432afcde852968cdeba3de08a6fe66e77609f4701decb87fc492f33`
- `added_tokens.json`: `ece349d292e246eac9a9072c1730f023e61567984a828fb0d25dccb14e3b7592`
- `pindaro-f16.gguf`: `bdaaeb6fb712e9a4d952082cf415b05c7d076b33786d39063bbfb3a7e5db2031`
- `pindaro-q4_k_m.gguf`: `5f98cc3454774ed5ed80d71a71adfd0daff760fc9eef0900ddd4f7eda2e20fef`
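
Downloaded files can be compared against the checksums listed above with the standard-library `hashlib` module. This is a minimal sketch: `sha256_of_file` and `verify_artifacts` are hypothetical helper names, and it assumes the artifacts sit in one local directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so large model files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifacts(expected: dict, root=".") -> dict:
    """Map each file name to True if its SHA256 matches the expected value."""
    results = {}
    for name, expected_hash in expected.items():
        path = Path(root) / name
        # A missing file counts as a failed check.
        results[name] = (
            path.is_file() and sha256_of_file(path) == expected_hash.lower()
        )
    return results
```

Passing a dictionary of file names and the hashes from the list above, together with the download directory, returns a per-file pass/fail map.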