Whisper large-v3 fine-tuning using our own dataset
#189
by rifasca - opened
I encountered issues while fine-tuning the Whisper-large-v3 model on a 100-hour Arabic dataset using the LoRA-PEFT approach. The resulting transcriptions were highly inaccurate, with excessive hallucinations and frequent duplication of characters.
Hello, I think you're using LoRA and fine-tuning only the query and value projections (named q_proj and v_proj in the Transformers Whisper implementation). You could try fine-tuning all linear layers instead. Also, I believe the Whisper-large-v3 tokenizer performs poorly for low-resource languages.
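To make the suggestion above concrete, here is a minimal sketch of the two LoRA targeting choices using PEFT. The module names are the linear-layer names used by the Transformers Whisper implementation. The helper names (`build_lora_config`, `load_lora_whisper`) and the hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are illustrative assumptions, not settings taken from this thread, and there is no guarantee that widening the targets resolves the hallucinations described.

```python
# Linear layers inside each Whisper encoder/decoder block (Transformers naming).
ATTENTION_QV_ONLY = ["q_proj", "v_proj"]
ALL_LINEAR = ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"]


def build_lora_config(target_modules, r=32, lora_alpha=64, lora_dropout=0.05):
    """Build a PEFT LoraConfig for the given module-name suffixes.

    The default hyperparameters are placeholders (assumptions), not tuned values.
    """
    from peft import LoraConfig  # imported lazily so the constants work without peft

    return LoraConfig(
        r=r,
        lora_alpha=lora_alpha,
        target_modules=list(target_modules),
        lora_dropout=lora_dropout,
        bias="none",
    )


def load_lora_whisper(target_modules=ALL_LINEAR):
    """Load openai/whisper-large-v3 and wrap it with LoRA adapters.

    Hypothetical helper: downloads the full checkpoint when called.
    """
    from peft import get_peft_model
    from transformers import WhisperForConditionalGeneration

    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
    peft_model = get_peft_model(model, build_lora_config(target_modules))
    peft_model.print_trainable_parameters()  # shows how many weights LoRA trains
    return peft_model
```

Calling `load_lora_whisper(ATTENTION_QV_ONLY)` and then `load_lora_whisper(ALL_LINEAR)` lets you compare the trainable-parameter counts of the two setups before committing to a training run.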