Audio-Text-to-Text
Transformers
Safetensors
English
audioflamingo3
text2text-generation
audio
reasoning
audio understanding
ASR
Instructions to use nvidia/audio-flamingo-3-hf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use nvidia/audio-flamingo-3-hf with Transformers:
    # Load model directly
    from transformers import AutoProcessor, AutoModelForSeq2SeqLM

    processor = AutoProcessor.from_pretrained("nvidia/audio-flamingo-3-hf")
    model = AutoModelForSeq2SeqLM.from_pretrained("nvidia/audio-flamingo-3-hf")
- Notebooks
- Google Colab
- Kaggle
{
  "audio_token": "<sound>",
  "feature_extractor": {
    "chunk_length": 30,
    "dither": 0.0,
    "feature_extractor_type": "WhisperFeatureExtractor",
    "feature_size": 128,
    "hop_length": 160,
    "n_fft": 400,
    "n_samples": 480000,
    "nb_max_frames": 3000,
    "padding_side": "right",
    "padding_value": 0.0,
    "processor_class": "AudioFlamingo3Processor",
    "return_attention_mask": true,
    "sampling_rate": 16000
  },
  "processor_class": "AudioFlamingo3Processor"
}
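
Several of the feature extractor fields above appear to be derived from the others rather than set independently. The sketch below recomputes them from `chunk_length`, `sampling_rate`, `hop_length`, and `n_fft`, using values copied from the config. The helper name `derived_values` is hypothetical and not part of Transformers, and the formulas are the usual Whisper-style relationships, assumed here rather than taken from the model's source.

```python
import json

# Values copied from the "feature_extractor" section of the config above.
CONFIG_JSON = """
{
  "chunk_length": 30,
  "hop_length": 160,
  "n_fft": 400,
  "n_samples": 480000,
  "nb_max_frames": 3000,
  "sampling_rate": 16000
}
"""


def derived_values(cfg):
    """Hypothetical helper: recompute derived fields from the primary ones."""
    # Samples in one chunk = chunk duration (seconds) * samples per second.
    n_samples = cfg["chunk_length"] * cfg["sampling_rate"]
    # Spectrogram frames in one chunk = samples per chunk / hop size.
    nb_max_frames = n_samples // cfg["hop_length"]
    # Time step between frames and analysis window length, in milliseconds.
    hop_ms = 1000 * cfg["hop_length"] / cfg["sampling_rate"]
    window_ms = 1000 * cfg["n_fft"] / cfg["sampling_rate"]
    return {
        "n_samples": n_samples,
        "nb_max_frames": nb_max_frames,
        "hop_ms": hop_ms,
        "window_ms": window_ms,
    }


cfg = json.loads(CONFIG_JSON)
derived = derived_values(cfg)

# The recomputed values should match the ones stored in the config.
assert derived["n_samples"] == cfg["n_samples"]
assert derived["nb_max_frames"] == cfg["nb_max_frames"]

print(derived)
```

Under these assumptions, each 30-second chunk at 16 kHz holds 480000 samples and yields 3000 frames, one every 10 ms, each computed over a 25 ms window.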