| --- |
| license: mit |
| --- |
| # Cross-Encoder for MS Marco |
|
|
| This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task. |
|
|
| The model can be used for information retrieval: given a query, encode the query together with each candidate passage (e.g. retrieved with ElasticSearch) to obtain a relevance score, then sort the passages by score in decreasing order. See our paper [R2ANKER](https://arxiv.org/pdf/2206.08063.pdf) for more details. |
|
|
| ## Usage with Transformers |
|
|
| ```python |
| from transformers import AutoTokenizer, AutoModelForSequenceClassification |
| import torch |
| tokenizer = AutoTokenizer.from_pretrained("YCZhou/R2ANKER") |
| model = AutoModelForSequenceClassification.from_pretrained("YCZhou/R2ANKER") |
| features = tokenizer( |
|     ['How many people live in Berlin?', 'How many people live in Berlin?'], |
|     ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', |
|      'New York City is famous for the Metropolitan Museum of Art.'], |
|     padding=True, truncation=True, return_tensors="pt", |
| ) |
| model.eval() |
| with torch.no_grad(): |
|     scores = model(**features).logits |
| print(scores) |
| ``` |
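
The sorting step described above can be sketched as follows. This is a minimal illustration rather than part of the released code: `rank_passages` is a hypothetical helper, and the score values are made up for the example. It assumes the model emits one logit per query-passage pair, so that the logits have been flattened to a plain list of floats first (for instance with `scores.squeeze(-1).tolist()`).

```python
from typing import List, Tuple


def rank_passages(passages: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Pair each passage with its score and sort by score, highest first.

    Hypothetical helper for illustration; not part of the model's API.
    """
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


passages = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
]
# Made-up scores for illustration only; real values come from the model's logits.
example_scores = [-4.2, 8.7]

ranked = rank_passages(passages, example_scores)
for passage, score in ranked:
    print(f"{score:.2f}\t{passage}")
```

Keeping the ranking separate from the scoring makes it straightforward to score passages in batches and sort once at the end.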
|
|
| ## Citation |
| ```bibtex |
| @inproceedings{DBLP:conf/acl/Zhou0GTXLJJ23, |
| author = {Yucheng Zhou and |
| Tao Shen and |
| Xiubo Geng and |
| Chongyang Tao and |
| Can Xu and |
| Guodong Long and |
| Binxing Jiao and |
| Daxin Jiang}, |
| title = {Towards Robust Ranker for Text Retrieval}, |
| booktitle = {Findings of the Association for Computational Linguistics: {ACL} 2023, |
| Toronto, Canada, July 9-14, 2023}, |
| pages = {5387--5401}, |
| publisher = {Association for Computational Linguistics}, |
| year = {2023}, |
| url = {https://doi.org/10.18653/v1/2023.findings-acl.332}, |
| doi = {10.18653/V1/2023.FINDINGS-ACL.332}, |
| timestamp = {Sat, 30 Sep 2023 09:33:34 +0200}, |
| biburl = {https://dblp.org/rec/conf/acl/Zhou0GTXLJJ23.bib}, |
| bibsource = {dblp computer science bibliography, https://dblp.org} |
| } |
| ``` |