Strix-XSS-4B-RL (PoC)
⚠️ Proof of Concept: This model is an early research prototype demonstrating RL-trained XSS detection. It is not production-ready and requires a modified version of Strix to function properly.
Model Description
Strix-XSS-Qwen3-4B-RL is a specialized language model fine-tuned using reinforcement learning to detect Cross-Site Scripting (XSS) vulnerabilities. Built on Qwen3-4B-Thinking-2507, this model is designed to work as a sub-agent within Strix, an AI-powered penetration testing framework.
This model demonstrates the viability of using RL to train specialized security testing agents that can autonomously identify web application vulnerabilities.
Key Features
- RL-Trained: Fine-tuned using reinforcement learning on simulated penetration testing environments
- Specialized Focus: Optimized specifically for XSS vulnerability detection
- Agent Architecture: Designed for multi-agent systems where different models handle different vulnerability types
- Lightweight: 4B parameters, suitable for deployment on consumer hardware
Performance
Strix-XSS Evaluation: 0.79 (measured on strix-xss from the Prime Intellect environment hub)
This evaluation measures the model's ability to correctly identify XSS vulnerabilities in simulated web applications using the Strix testing framework.
Training Details
Training Data
- Dataset Size: 135 examples
- Environment: Simulated web application environment with Strix tooling
- Training Method: Reinforcement Learning
- Training Platform: Prime Intellect hosted training beta
Special thanks to Prime Intellect for providing the infrastructure that made this research possible! Their hosted training platform enabled efficient RL training for this security-focused model.
Base Model
- Architecture: Qwen3-4B-Thinking-2507
- Parameters: 4 billion
- Context Length: [inherited from base model]
Intended Use
This model is a proof of concept intended to demonstrate:
- The feasibility of RL-trained agents for vulnerability detection
- Integration patterns for specialized sub-agents in security testing frameworks
- Research directions for AI-powered penetration testing
Integration with Strix
This model is designed to work with a modified version of Strix that supports configurable sub-agent models for specific vulnerability types
Note: The public version of Strix does not yet support custom sub-agent models. This integration is currently experimental.
Limitations
- PoC Status: Not production-ready; requires additional testing and validation
- Specialized Scope: Trained only on XSS detection, not general security testing
- Small Dataset: 135 training examples limits generalization
- Simulated Training: Performance on real-world targets may vary
- Framework Dependency: Designed specifically for Strix; may not work well in isolation
Model Versions
- Full Precision: This repository
- GGUF Quantized: Available at kusonooyasumi/strix-xss-qwen3-4b-rl-gguf
- Q4_K_M (recommended for most users)
- Q5_K_M (better quality)
- Q8_0 (high quality)
- FP16 (full quality)
License
This model is released under the MIT License. See LICENSE file for details.
Citation
If you use this model in your research, please cite:
@misc{strix-xss-rl,
author = {oyasumi},
title = {Strix-XSS-4B-RL: An RL-Trained Model for XSS Detection},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/kusonooyasumi/strix-xss-qwen3-4b-rl}}
}
Acknowledgments
- Prime Intellect for providing hosted training infrastructure
- Qwen Team for the excellent base model
- Strix Project for the penetration testing framework
Related Projects
- Strix - AI-powered penetration testing framework
- Downloads last month
- 16