Automatic Speech Recognition
Transformers
PyTorch
TensorBoard
Ukrainian
wav2vec2
Generated from Trainer
hf-asr-leaderboard
mozilla-foundation/common_voice_8_0
robust-speech-event
Eval Results (legacy)
How to use arampacha/wav2vec2-xls-r-1b-uk with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="arampacha/wav2vec2-xls-r-1b-uk")

# Load model directly
from transformers import AutoProcessor, AutoModelForCTC

processor = AutoProcessor.from_pretrained("arampacha/wav2vec2-xls-r-1b-uk")
model = AutoModelForCTC.from_pretrained("arampacha/wav2vec2-xls-r-1b-uk")
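A model loaded through AutoModelForCTC emits one label per audio frame, so producing text needs a decoding step. The sketch below illustrates greedy CTC decoding on made-up frame labels: consecutive repeats are merged, then blank tokens are dropped. The helper name `ctc_greedy_collapse`, the toy vocabulary, and the blank id are hypothetical and are not taken from the model repository.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame label ids: merge consecutive repeats, then drop blanks."""
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return [token_id for token_id in merged if token_id != blank_id]


# Toy vocabulary (hypothetical); id 0 plays the role of the CTC blank token.
vocab = {0: "<pad>", 1: "т", 2: "а", 3: "к"}

# Made-up per-frame argmax ids, standing in for logits.argmax(-1) of a real model.
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]

decoded = "".join(vocab[i] for i in ctc_greedy_collapse(frames))
print(decoded)
```

With the real model, the pipeline helper is expected to handle this step internally, and `processor.batch_decode` applied to the argmax of the logits should perform the equivalent collapse.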
python run_speech_recognition_ctc.py \
    --dataset_name /workspace/data/uk/composed_dataset/ \
    --train_split_name train \
    --model_name_or_path="facebook/wav2vec2-xls-r-1b" \
    --output_dir ./ \
    --overwrite_output_dir \
    --max_steps 12000 \
    --per_device_train_batch_size="16" \
    --per_device_eval_batch_size="64" \
    --gradient_accumulation_steps="8" \
    --dataloader_num_workers 8 \
    --learning_rate="5e-5" \
    --adam_beta2 0.98 \
    --lr_scheduler_type cosine \
    --warmup_ratio 0.1 \
    --evaluation_strategy="steps" \
    --text_column_name="sentence" \
    --chars_to_ignore \, \? \. \! \- \; \: \" \“ \% \‘ \” \� \' « » \( \) ՝ ՛ ՚ \– \— \… ý \
    --save_steps="500" \
    --eval_steps="500" \
    --logging_steps="100" \
    --save_total_limit 10 \
    --freeze_feature_encoder \
    --layerdrop="0.1" \
    --activation_dropout="0.1" \
    --feat_proj_dropout="0.0" \
    --mask_time_prob="0.7" \
    --mask_time_length="10" \
    --mask_feature_prob="0.25" \
    --mask_feature_length="64" \
    --gradient_checkpointing \
    --use_auth_token \
    --fp16 \
    --group_by_length \
    --do_train --do_eval \
    --load_best_model_at_end \
    --report_to all \
    --run_name="xlsr-uk-1b-1" \
    --wandb_project="xlsr-uk" \
    --bnb --tristage_sched
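
The batch size and step flags in the command above combine into a few derived quantities that are easy to misread. The following sketch works through that arithmetic. The number of GPUs is not stated in the source, so `num_devices = 1` is an assumption, and how the `--tristage_sched` option interacts with `--warmup_ratio` is not described in the source, so the warmup figure only reflects the plain ratio.

```python
import math

# Values copied from the training command.
per_device_train_batch_size = 16
gradient_accumulation_steps = 8
max_steps = 12000
warmup_ratio = 0.1
eval_steps = 500

# Assumption: the device count is not given in the source; adjust as needed.
num_devices = 1

# Samples contributing to each optimizer update.
samples_per_update = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Warmup length implied by the ratio alone (scheduler specifics not modelled).
warmup_steps = math.ceil(max_steps * warmup_ratio)

# Number of evaluation runs, and of checkpoints written before pruning,
# since eval_steps and save_steps are both 500.
num_evaluations = max_steps // eval_steps

# Upper bound on training samples processed over the whole run.
samples_seen = samples_per_update * max_steps

print(f"samples per optimizer update: {samples_per_update}")
print(f"warmup steps: {warmup_steps}")
print(f"evaluation runs: {num_evaluations}")
print(f"samples processed: {samples_seen}")
```

Under the single-device assumption, each optimizer update sees 128 samples; with more devices, the effective batch size scales linearly.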