Automatic Speech Recognition
Transformers
PyTorch
JAX
Safetensors
whisper
audio
hf-asr-leaderboard
Eval Results
Instructions to use openai/whisper-large-v3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openai/whisper-large-v3 with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")# Load model directly from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq processor = AutoProcessor.from_pretrained("openai/whisper-large-v3") model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3") - Inference
- Notebooks
- Google Colab
- Kaggle
Fix example usage
#202
by bezzam HF Staff - opened
For the librispeech_long example, as it is longer than 30 s, there is the following error:
ValueError: You have passed more than 3000 mel input features (> 30 seconds) which automatically enables long-form generation which requires the model to predict timestamp tokens. Please either pass `return_timestamps=True` or make sure to pass no more than 3000 mel input features.