DeAR-8B-Reranker-RankNet-v1
Model Description
DeAR-8B-Reranker-RankNet-v1 is an 8B-parameter neural reranker trained with RankNet loss and knowledge distillation. It is part of the DeAR model family. On the benchmarks reported in this card it slightly exceeds its 13B teacher (74.5 vs. 73.8 NDCG@10 on TREC DL19) while reranking faster (2.2 s vs. 5.8 s in the comparison table).
Model Details
- Model Type: Pointwise Reranker (Sequence Classification)
- Base Model: LLaMA-3.1-8B
- Parameters: 8 billion
- Training Method: Knowledge Distillation + RankNet Loss
- Teacher Model: LLaMA2-13B-RankLLaMA
- Training Data: MS MARCO
- Precision: BFloat16
Key Features
✅ High Performance: BEIR average NDCG@10 of 45.2, versus 44.8 for the 13B teacher
✅ Fast Inference: about 2.2 s to rerank 100 documents (~45 docs/sec)
✅ Memory Efficient: about 18 GB of GPU memory at inference, so it fits on a single 24 GB GPU
✅ Knowledge Distillation: Enhanced with Chain-of-Thought reasoning
Performance
| Benchmark | NDCG@10 |
|---|---|
| TREC DL19 | 74.5 |
| TREC DL20 | 72.8 |
| BEIR (Avg) | 45.2 |
| MS MARCO Dev | 68.9 |
Usage
Quick Start
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model
model_path = "abdoelsayed/dear-8b-reranker-ranknet-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16
)
model.eval().cuda()

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence..."
inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=228,  # q_max_len(32) + p_max_len(196)
    padding="max_length"
)
inputs = {k: v.cuda() for k, v in inputs.items()}
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score}")
Batch Reranking
def rerank_documents(query, documents, model, tokenizer, batch_size=64):
    """
    Rerank a list of documents for a query.

    Args:
        query: Search query string
        documents: List of (title, text) tuples
        model: Loaded reranker model
        tokenizer: Loaded tokenizer
        batch_size: Batch size for inference

    Returns:
        List of (index, score) tuples sorted by relevance
    """
    scores = []
    for i in range(0, len(documents), batch_size):
        batch_docs = documents[i:i + batch_size]
        # Prepare inputs
        queries = [f"query: {query}"] * len(batch_docs)
        docs = [f"document: {title} {text}" for title, text in batch_docs]
        inputs = tokenizer(
            queries,
            docs,
            return_tensors="pt",
            truncation=True,
            max_length=228,
            padding=True
        )
        inputs = {k: v.to(model.device) for k, v in inputs.items()}
        # Get scores
        with torch.no_grad():
            logits = model(**inputs).logits.squeeze(-1)
        scores.extend(logits.cpu().tolist())
    # Sort by score (descending)
    ranked = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
    return ranked
# Example usage
query = "When was the Eiffel Tower built?"
documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]
ranking = rerank_documents(query, documents, model, tokenizer)
print(ranking)
# Example output (illustrative; exact scores will vary):
# [(0, 8.23), (1, 2.45), (2, -1.87)]
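The `(index, score)` pairs returned by `rerank_documents` refer to positions in the input list. A minimal sketch of mapping them back to the documents and keeping the top k follows. `top_k_documents` is a hypothetical helper, not part of the DeAR code, and the scores are the illustrative values from the example above.

```python
def top_k_documents(ranking, documents, k=2):
    """Map (index, score) pairs back to their (title, text) tuples.

    `ranking` is assumed to be sorted by score, descending, as returned
    by rerank_documents. Hypothetical helper for illustration only.
    """
    return [(documents[idx], score) for idx, score in ranking[:k]]


documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]
# Illustrative ranking; real scores depend on the model.
ranking = [(0, 8.23), (1, 2.45), (2, -1.87)]

for (title, text), score in top_k_documents(ranking, documents, k=2):
    print(f"{score:6.2f}  {title}: {text}")
```

Because the scores are raw logits, only their relative order within one query is meaningful here; comparing them across queries would need separate calibration.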
Training Details
Training Data
- Primary Dataset: MS MARCO Passage Ranking
Hardware
- GPUs: 4x NVIDIA A100 (40GB)
- Training Time: ~36 hours
- DeepSpeed: ZeRO Stage 2
Loss Function
RankNet Loss with Knowledge Distillation:
L_total = (1 - α) * L_RankNet + α * L_KD
where:
- L_RankNet: Pairwise ranking loss
- L_KD: KL divergence with teacher (temperature=2)
- α: 0.1 (distillation weight)
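The combined objective can be illustrated in plain Python. This is a sketch under stated assumptions, not the training code: the card only says that L_RankNet is a pairwise ranking loss and that L_KD is a KL divergence with the teacher at temperature 2, so the exact pair sampling, the direction of the KL term (teacher to student here), and the softmax over a candidate list are all assumptions. The function names are hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ranknet_loss(scores, labels):
    """Mean pairwise logistic loss over pairs with labels[i] > labels[j]."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                diff = scores[i] - scores[j]
                # log(1 + exp(-diff)), written in a numerically stable form
                losses.append(max(-diff, 0.0) + math.log1p(math.exp(-abs(diff))))
    return sum(losses) / len(losses) if losses else 0.0


def kd_loss(student_scores, teacher_scores, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions (assumed form)."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def total_loss(student_scores, teacher_scores, labels, alpha=0.1, temperature=2.0):
    """L_total = (1 - alpha) * L_RankNet + alpha * L_KD."""
    return ((1 - alpha) * ranknet_loss(student_scores, labels)
            + alpha * kd_loss(student_scores, teacher_scores, temperature))


# Hypothetical scores for one query with three candidates (first is relevant).
student = [2.0, 0.5, -1.0]
teacher = [3.0, 1.0, -2.0]
labels = [1, 0, 0]
print(total_loss(student, teacher, labels))
```

With α = 0.1, the pairwise term carries 90% of the weight, so the teacher signal acts as a light regularizer on top of the ranking objective.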
Evaluation Results
TREC Deep Learning
| Dataset | NDCG@10 | NDCG@20 | MAP |
|---|---|---|---|
| DL19 | 74.50 | 70.23 | 45.67 |
| DL20 | 72.80 | 69.15 | 43.21 |
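For readers unfamiliar with the metric, the sketch below shows one common way to compute NDCG@k from graded relevance labels. It assumes linear gain and that the ranked list contains every judged document for the query; the official evaluation tooling may differ in these details, and the function names and labels are hypothetical. Scores in the tables are reported on a 0 to 100 scale, i.e. this value multiplied by 100.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain with linear gain and log2 rank discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the given order divided by DCG of the ideal order."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical graded labels in the order the reranker returned the documents.
ranked_relevances = [3, 0, 2, 1]
print(f"NDCG@10 = {100 * ndcg_at_k(ranked_relevances):.2f}")
```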
BEIR Benchmark
| Dataset | NDCG@10 |
|---|---|
| MS MARCO | 68.9 |
| NQ | 52.3 |
| HotpotQA | 61.8 |
| FiQA | 47.2 |
| ArguAna | 59.4 |
| SciFact | 73.6 |
| TREC-COVID | 85.2 |
| NFCorpus | 39.8 |
Efficiency
| Metric | Value |
|---|---|
| Inference Time (100 docs) | 2.2s |
| GPU Memory (inference) | 18GB |
| Throughput | ~45 docs/sec |
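The throughput row follows directly from the inference time: 100 documents in 2.2 s is roughly 45 documents per second. The short sketch below reproduces that arithmetic and extrapolates to other candidate counts. The extrapolation assumes latency scales linearly with the number of documents, which is an assumption and not something the table states; `estimate_seconds` is a hypothetical helper.

```python
DOCS = 100      # documents reranked per query (from the table)
SECONDS = 2.2   # reported inference time for those documents

throughput = DOCS / SECONDS


def estimate_seconds(n_docs):
    """Estimate latency for n_docs, assuming time scales linearly (an assumption)."""
    return n_docs / throughput


print(f"{throughput:.1f} docs/sec")                      # 45.5 docs/sec
print(f"{estimate_seconds(1000):.1f} s for 1000 docs")   # 22.0 s for 1000 docs
```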
Comparison with Other Models
| Model | Size | TREC DL19 | BEIR Avg | Inference (s) |
|---|---|---|---|---|
| MonoT5-3B | 3B | 71.8 | 43.5 | 3.5 |
| DeAR-P-8B-RL | 8B | 74.5 | 45.2 | 2.2 |
| Teacher (13B) | 13B | 73.8 | 44.8 | 5.8 |
Model Architecture
Input: "query: [Q]" paired with "document: [D]"
↓
LLaMA-3.1-8B (decoder-only Transformer)
↓
Last non-padding token representation
↓
Linear Classification Head
↓
Relevance Score (scalar)
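LLaMA is a decoder-only model and its tokenizer defines no dedicated [CLS] token; in the Transformers library, `LlamaForSequenceClassification` feeds the hidden state of the last non-padding token to the classification head. The sketch below illustrates how that position can be located. `last_non_pad_index` and the token ids are hypothetical and only mirror the idea, not the library's actual tensor code.

```python
def last_non_pad_index(input_ids, pad_token_id):
    """Return the index of the rightmost token that is not padding."""
    for pos in range(len(input_ids) - 1, -1, -1):
        if input_ids[pos] != pad_token_id:
            return pos
    raise ValueError("sequence contains only padding")


# Hypothetical token ids; 0 stands in for the pad token id.
right_padded = [101, 7, 42, 9, 0, 0]
left_padded = [0, 0, 101, 7, 42, 9]
print(last_non_pad_index(right_padded, pad_token_id=0))  # 3
print(last_non_pad_index(left_padded, pad_token_id=0))   # 5
```

This is why a pad token must be defined when scoring batches: without one, the model cannot tell which position holds the last real token of each sequence.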
Limitations
- Domain Adaptation: Trained primarily on MS MARCO; may require fine-tuning for specialized domains
- Query Length: Optimized for queries up to 32 tokens
- Document Length: Truncated to 196 tokens; longer documents lose information
- Language: English only
- Numerical Reasoning: Limited capability for queries requiring calculations
Bias and Fairness
This model inherits biases present in:
- Base LLaMA-3.1-8B model
- MS MARCO training data
- Teacher model annotations
Users should evaluate fairness for their specific use cases.
Ethical Considerations
- Search Ranking: Can influence information access and visibility
- Training Data: May contain biased or sensitive content
- Misuse Potential: Should not be used for surveillance or discriminatory ranking
Related Models
DeAR Family:
- DeAR-8B-CE - Binary Cross-Entropy variant
- DeAR-8B-Listwise - Listwise reranking
- DeAR-8B-RankNet-LoRA - LoRA adapter
Teacher: LLaMA2-13B-RankLLaMA
Dataset:
Citation
@article{abdallah2025dear,
title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
journal={arXiv preprint arXiv:2508.16998},
year={2025}
}
License
MIT License
Contact
- GitHub: DataScienceUIBK/DeAR-Reranking
- Paper: arXiv:2508.16998
- Collection: DeAR Models