Yongbin Choi

whybe-choi

AI & ML interests

LLM, RAG, Information Retrieval

Recent Activity

liked a model 1 day ago

LGAI-EXAONE/K-EXAONE-236B-A23B

updated a dataset 2 days ago

whybe-choi/kovidore-v2-financial-beir-test

updated a dataset 2 days ago

whybe-choi/kovidore-v2-energy-beir-test

upvoted a collection 15 days ago

ViDoRe Benchmark V3

ViDoRe V3 is our latest benchmark, engineered to set a new industry gold standard for multi-modal, enterprise document retrieval evaluation. • 8 items • Updated Nov 5, 2025 • 16

upvoted a paper 15 days ago

Pre-training Small Base LMs with Fewer Tokens

Paper • 2404.08634 • Published Apr 12, 2024 • 36

upvoted an article 20 days ago

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

Article • Nov 5, 2025 • 57

upvoted an article 23 days ago

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

Article • 24 days ago • 46

upvoted an article 24 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 29 days ago • 552

upvoted a paper 29 days ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 111

upvoted a collection about 2 months ago

BiCA

6 items • Updated 10 days ago • 3

upvoted a paper 2 months ago

OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

Paper • 2510.07743 • Published Oct 9, 2025 • 8

upvoted an article 2 months ago

Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text

Article • Oct 20, 2025 • 34

upvoted 2 articles 3 months ago

Vocabulary is the most important element of Sparse Retrieval

Article • Oct 4, 2025 • 9

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Article • Jun 3, 2025 • 96

upvoted an article 4 months ago

Welcome EmbeddingGemma, Google's new efficient embedding model

Article • Sep 4, 2025 • 267

upvoted a paper 4 months ago

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 70

upvoted an article 4 months ago

Reinforcement Learning for Large Language Models: Beyond the Agent Paradigm

Article • Mar 19, 2025 • 8

upvoted 2 collections 4 months ago

ViDoRe Benchmark v2

8 items • Updated May 23, 2025 • 6

RaDeR training datasets

These are some of the retrieval training datasets used for training RaDeR models, consisting of different types of query combinations. • 3 items • Updated Jun 12, 2025 • 1

upvoted 2 papers 5 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21, 2025 • 67

upvoted a collection 5 months ago

JinaVDR (Visual Document Retrieval)

max. ~1000 images and OCR text included • 42 items • Updated Jul 20, 2025 • 8

upvoted a paper 6 months ago

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 10