🪔 Doha Generation Model v3

Encoder-Decoder Transformer trained to generate Hindi Dohas conditioned on theme and context.

What's New in v3 (Stage 2 v5)

  • Gate regularization: balances attention between the encoder and meaning streams
  • Meaning-decoder pre-training phase: 10 epochs before joint training
  • Enriched meaning input with a theme prefix: `विषय: {theme} | {meaning}`
  • Encoder unfreeze bug fixed: the unfreeze epoch is now computed relative to `START_EPOCH`
  • Early stopping: halts training before the model overfits past peak validation performance
  • Matra best-of-N generation: samples N candidates and keeps the one with the best syllable (matra) structure
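The matra best-of-N idea can be sketched as follows. This is a hedged illustration, not the released scorer: `matra_count` here is a deliberately simplified Devanagari syllable-weight counter (short vowels weigh 1 matra, long vowels 2), and `best_of_n` picks the candidate whose four charans best match the 13-11-13-11 doha meter.

```python
# Hypothetical sketch of matra-based best-of-N selection.
# Both helper names and the scoring rule are assumptions for illustration.

SHORT = set("अइउऋ\u093f\u0941\u0943")                       # short vowels + their matra signs
LONG = set("आईऊएऐओऔ\u093e\u0940\u0942\u0947\u0948\u094b\u094c")  # long vowels + matra signs

def matra_count(text):
    """Simplified matra (syllable-weight) count: short = 1, long = 2."""
    total = 0
    chars = list(text)
    for i, ch in enumerate(chars):
        if ch in SHORT:
            total += 1
        elif ch in LONG:
            total += 2
        elif "\u0915" <= ch <= "\u0939":  # consonant range क..ह
            nxt = chars[i + 1] if i + 1 < len(chars) else ""
            # inherent short 'a' unless a vowel sign or virama follows
            if nxt not in SHORT and nxt not in LONG and nxt != "\u094d":
                total += 1
    return total

def best_of_n(candidates, targets=(13, 11, 13, 11)):
    """Pick the candidate whose charan matra counts best match the doha meter."""
    def score(lines):
        return sum(abs(matra_count(line) - t) for line, t in zip(lines, targets))
    return min(candidates, key=score)
```

In practice the generator would produce the candidate charans; here `best_of_n` simply takes a list of already-generated candidates and returns the closest fit to the meter.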

Architecture

  • Shared encoder pretrained on 58k Kavitas via T5-style span corruption
  • Meaning decoder generates the semantic meaning first
  • Doha decoder uses dual cross-attention (encoder + meaning) with a learnable gate
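The dual cross-attention with a learnable gate can be sketched in PyTorch. The module name, dimensions, and the exact gate placement are assumptions for illustration, not the released implementation:

```python
import torch
import torch.nn as nn

class DualCrossAttention(nn.Module):
    """Sketch of the doha decoder's mixing step: attend separately to the
    encoder states and the meaning-decoder states, then combine the two
    attention outputs with a learnable sigmoid gate."""

    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.enc_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mean_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, 1)  # learnable per-position gate

    def forward(self, x, enc_out, meaning_out):
        a_enc, _ = self.enc_attn(x, enc_out, enc_out)        # attend to encoder
        a_mean, _ = self.mean_attn(x, meaning_out, meaning_out)  # attend to meaning
        g = torch.sigmoid(self.gate(torch.cat([a_enc, a_mean], dim=-1)))
        return g * a_enc + (1 - g) * a_mean
```

The gate regularization mentioned above would then be an auxiliary loss keeping `g` away from saturating at 0 or 1, so neither stream is ignored.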

Usage

```python
generate_doha(model, sp, theme='शृंगार', context='मोरपंखी बाल')

# Or with matra scoring, keeping the best of n samples:
generate_doha_best_of_n(model, sp, theme='शृंगार', context='मोरपंखी बाल', n=5)
```