Model Card for psp-dada/Qwen2-VL-2B-Instruct-SENTINEL | ICCV 2025 | SENTINEL: Mitigating Object Hallucinations via Sentence-Level Early Intervention
🎊 News
- [2025.07.30] 🔍 Our work has been featured and explained by 52CV. Check it out here.
- [2025.07.21] 📖 All code, data, and models are released!
- [2025.06.26] 🎉 SENTINEL has been accepted to ICCV 2025!
🚀 Overview
SENTINEL introduces an automatic, sentence‑level early intervention strategy to prevent and mitigate object hallucinations in multimodal large language models (MLLMs). Key advantages:
- Annotation‑free: No human labeling required.
- Model-agnostic: Compatible with any MLLM architecture.
- Efficient: Lightweight LoRA fine‑tuning.
🔑 Key Features
🧠 Early intervention halts hallucination propagation. We find that hallucinations of MLLMs predominantly arise in early sentences and propagate through the rest of the output. SENTINEL interrupts this chain early to maximize mitigation.
🔍 In-domain contextual preference learning without human labels. SENTINEL constructs hallucinated/factual samples via detector cross-validation and builds context-aware preference data without relying on proprietary LLMs or manual annotations.
💡 Context matters: rich coherence drives robustness. By prioritizing context-coherent positive samples over hallucinated ones, SENTINEL significantly boosts generalization.
♻️ Iterative contextual bootstrapping for diverse hallucination-free contexts. Our pipeline dynamically grows non-hallucinated contexts and expands coverage across varied scenes, improving robustness across generations.
📊 State-of-the-art results across benchmarks. SENTINEL achieves up to 92% reduction in hallucinations and outperforms prior SOTA methods across Object HalBench, AMBER, and HallusionBench, while maintaining or improving general task performance.
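To make the preference-data idea above more concrete, here is a toy sketch of how detector cross-validation could feed a context-aware preference pair. It is an illustration under assumptions, not the released pipeline: the agreement rule (both detectors confirm means factual, both deny means hallucinated, disagreement means uncertain), the `PreferencePair` fields, and the function names `cross_validate` and `build_pair` are all hypothetical, and each detector is stood in for by a plain set of object names.

```python
from dataclasses import dataclass


def cross_validate(obj, detector_a, detector_b):
    """Label one object mention by the agreement of two detectors.

    Assumed rule (not taken from the paper): each detector is represented
    by the set of object names it found in the image.
    """
    in_a, in_b = obj in detector_a, obj in detector_b
    if in_a and in_b:
        return "factual"
    if not in_a and not in_b:
        return "hallucinated"
    return "uncertain"  # detectors disagree, so the mention is not used


@dataclass
class PreferencePair:
    context: str   # hallucination-free sentences generated so far
    chosen: str    # next sentence whose objects are all confirmed
    rejected: str  # next sentence containing a hallucinated object


def build_pair(context, candidates, detector_a, detector_b):
    """Pick one factual and one hallucinated candidate for the same context.

    `candidates` is a list of (sentence, mentioned_objects) tuples.
    Returns None when either side of the pair is missing.
    """
    chosen = rejected = None
    for sentence, objects in candidates:
        labels = {cross_validate(o, detector_a, detector_b) for o in objects}
        if labels == {"factual"} and chosen is None:
            chosen = sentence
        elif "hallucinated" in labels and rejected is None:
            rejected = sentence
    if chosen is None or rejected is None:
        return None
    return PreferencePair(context, chosen, rejected)
```

In this sketch, the chosen sentence could then be appended to the context for the next round, which is one way to picture the iterative bootstrapping described above.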
How to use
This model is a PEFT (LoRA) adapter. You first need to load the base model (Qwen/Qwen2-VL-2B-Instruct) and then load this adapter on top of it.
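The two loading steps can be sketched as follows. This is a minimal example, assuming the `transformers` and `peft` packages are installed and that the adapter repository loads with `PeftModel.from_pretrained`; the helper names `load_sentinel`, `build_messages`, and `describe` are hypothetical and exist only for this illustration.

```python
BASE_ID = "Qwen/Qwen2-VL-2B-Instruct"
ADAPTER_ID = "psp-dada/Qwen2-VL-2B-Instruct-SENTINEL"


def load_sentinel(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Imported lazily so the heavy dependencies are only needed at load time.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    base = Qwen2VLForConditionalGeneration.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    # Assumption: the processor is taken from the base model repository.
    processor = AutoProcessor.from_pretrained(base_id)
    return model.eval(), processor


def build_messages(prompt):
    """Build a single-turn chat with one image placeholder and a text prompt."""
    return [
        {
            "role": "user",
            "content": [{"type": "image"}, {"type": "text", "text": prompt}],
        }
    ]


def describe(model, processor, image, prompt, max_new_tokens=128):
    """Generate a reply for one PIL image and one text prompt."""
    text = processor.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = processor(text=[text], images=[image], return_tensors="pt")
    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
```

A typical call would be `model, processor = load_sentinel()` followed by `describe(model, processor, image, "Describe this image.")`, where `image` is a `PIL.Image` you have opened yourself.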
For more details on this model, please refer to the documentation in the GitHub repo.
📝 Citation
If you find our model/code/data/paper helpful, please consider citing our paper 📝 and starring us ⭐️!
@inproceedings{peng2025mitigating,
title={Mitigating object hallucinations via sentence-level early intervention},
author={Peng, Shangpin and Yang, Senqiao and Jiang, Li and Tian, Zhuotao},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={635--646},
year={2025}
}
📧 Contact us
If you have any questions, comments, or suggestions, please do not hesitate to submit an issue or PR to help advance research in this area.