CKeibel's Collections

Data

Updated Aug 18, 2025

  • HuggingFaceFW/fineweb-2

    Updated Oct 27, 2025 • 4.48B rows • 65.5k downloads • 710 likes

  • allenai/c4

    Updated Jan 9, 2024 • 10.4B rows • 590k downloads • 504 likes

  • ServiceNow-AI/R1-Distill-SFT

    Updated Feb 8, 2025 • 1.85M rows • 1.95k downloads • 313 likes

  • PrimeIntellect/INTELLECT-2-RL-Dataset

    Updated May 13, 2025 • 285k rows • 113 downloads • 65 likes

  • togethercomputer/RedPajama-Data-V2

    Updated Nov 21, 2024 • 3.61k downloads • 390 likes

  • wikimedia/wikipedia

    Updated Jan 9, 2024 • 61.6M rows • 72k downloads • 1.12k likes

  • avemio/German-RAG-EMBEDDING-TRIPLES-HESSIAN-AI

    Updated Oct 16, 2024 • 294k rows • 18 downloads • 1 like

  • urchade/synthetic-pii-ner-mistral-v1

    Updated Apr 20, 2024 • 206 downloads • 15 likes