The Markovian Thinker

Updated Oct 9, 2025

Reformulating reinforcement learning (RL) for reasoning LLMs through the Markovian Thinking paradigm.

  • McGill-NLP/delethink-24k-1.5b

    2B params • Updated Oct 9, 2025 • 195 downloads • 5 likes

  • McGill-NLP/longcot-24k-1.5b

    2B params • Updated Oct 9, 2025 • 18 downloads • 2 likes

  • McGill-NLP/longcot-8k-1.5b

    2B params • Updated Oct 9, 2025 • 10 downloads • 1 like

  • McGill-NLP/delethink-96k-base-1.5b

    2B params • Updated Oct 3, 2025 • 10 downloads • 1 like

  • McGill-NLP/openmath-filtered

    Dataset • Updated Oct 3, 2025 • 200k rows • 38 downloads • 1 like

  • McGill-NLP/delethink-96k-1.5b

    2B params • Updated Oct 9, 2025 • 10 downloads • 3 likes

  • The Markovian Thinker

    Paper • arXiv 2510.06557 • Published Oct 8, 2025 • 30 upvotes