openai/whisper-large-v3 Automatic Speech Recognition β’ 2B β’ Updated Aug 12, 2024 β’ 6.67M β’ β’ 5.25k
google/pix2struct-widget-captioning-large Visual Question Answering β’ 1B β’ Updated Apr 10, 2024 β’ 52 β’ 20
Salesforce/blip-image-captioning-large Image-to-Text β’ 0.5B β’ Updated Feb 3, 2025 β’ 1.42M β’ 1.44k
mistralai/Mistral-7B-Instruct-v0.1 Text Generation β’ 7B β’ Updated Jul 24, 2025 β’ 427k β’ 1.82k
Runtime error Featured 272 Edit Video By Editing Text β 272 Audio-based video editing using AI-generated transcription
Runtime error Featured 307 AudioLDM2 Text2Audio Text2Music Generation π 307 Generate audio and waveform video from text