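This model is a fine-tuned version of facebook/wav2vec2-base-960h on the speech_commands dataset. After the final epoch (15) it reaches a validation loss of 1.0690 and an accuracy of 0.8053 on the evaluation set; per-epoch results are listed in the table below.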
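A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and loaded through the `transformers` `audio-classification` pipeline. The repository ID and audio path below are hypothetical placeholders, since the card does not state them; `classify` and `top_label` are illustrative helper names, not part of any library.

```python
def classify(audio_path, model_id, top_k=5):
    """Run the fine-tuned classifier on one audio file.

    `model_id` is a placeholder for wherever this checkpoint is hosted.
    Returns a list of {"label": str, "score": float} dicts.
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return classifier(audio_path, top_k=top_k)


def top_label(predictions):
    """Return the label with the highest score from pipeline-style output."""
    return max(predictions, key=lambda p: p["score"])["label"]


if __name__ == "__main__":
    # Hypothetical values: replace with the real repository ID and a real clip.
    preds = classify("clip.wav", "your-username/your-speech-commands-model")
    print(top_label(preds))
```

The base model, facebook/wav2vec2-base-960h, was trained on 16 kHz audio, so clips at other sampling rates should be resampled before classification (the pipeline does this itself when given a file path).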
The hyperparameters used during training are not reproduced here. Training results by epoch:

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 1.745 | 1.0 | 824 | 1.9237 | 0.7648 |
| 0.5664 | 2.0 | 1648 | 1.1424 | 0.7878 |
| 0.4337 | 3.0 | 2472 | 1.1234 | 0.8013 |
| 0.3346 | 4.0 | 3296 | 1.1040 | 0.8035 |
| 0.2683 | 5.0 | 4120 | 1.3128 | 0.7905 |
| 0.3498 | 6.0 | 4944 | 1.2172 | 0.7972 |
| 0.2556 | 7.0 | 5768 | 1.1906 | 0.7986 |
| 0.226 | 8.0 | 6592 | 1.1081 | 0.8044 |
| 0.2317 | 9.0 | 7416 | 1.1068 | 0.8049 |
| 0.1144 | 10.0 | 8240 | 1.1612 | 0.8067 |
| 0.2143 | 11.0 | 9064 | 1.1577 | 0.8031 |
| 0.1668 | 12.0 | 9888 | 1.1343 | 0.8058 |
| 0.2504 | 13.0 | 10712 | 1.0583 | 0.8067 |
| 0.218 | 14.0 | 11536 | 1.0677 | 0.8026 |
| 0.1025 | 15.0 | 12360 | 1.0690 | 0.8053 |
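
Validation loss and accuracy do not peak at the same epoch, so the choice of checkpoint depends on the metric. The sketch below copies the rows of the table above into a list and picks the best epoch under each criterion; `ROWS` and the helper names are illustrative only.

```python
# (epoch, step, training_loss, validation_loss, accuracy), copied from the table above
ROWS = [
    (1, 824, 1.745, 1.9237, 0.7648),
    (2, 1648, 0.5664, 1.1424, 0.7878),
    (3, 2472, 0.4337, 1.1234, 0.8013),
    (4, 3296, 0.3346, 1.1040, 0.8035),
    (5, 4120, 0.2683, 1.3128, 0.7905),
    (6, 4944, 0.3498, 1.2172, 0.7972),
    (7, 5768, 0.2556, 1.1906, 0.7986),
    (8, 6592, 0.226, 1.1081, 0.8044),
    (9, 7416, 0.2317, 1.1068, 0.8049),
    (10, 8240, 0.1144, 1.1612, 0.8067),
    (11, 9064, 0.2143, 1.1577, 0.8031),
    (12, 9888, 0.1668, 1.1343, 0.8058),
    (13, 10712, 0.2504, 1.0583, 0.8067),
    (14, 11536, 0.218, 1.0677, 0.8026),
    (15, 12360, 0.1025, 1.0690, 0.8053),
]


def best_by_val_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def best_by_accuracy(rows):
    """All rows that tie for the highest accuracy."""
    top = max(r[4] for r in rows)
    return [r for r in rows if r[4] == top]


# Optimizer steps per epoch, derived from the first row.
steps_per_epoch = ROWS[0][1] // ROWS[0][0]

if __name__ == "__main__":
    print(best_by_val_loss(ROWS)[0])                  # 13
    print([r[0] for r in best_by_accuracy(ROWS)])     # [10, 13]
    print(steps_per_epoch)                            # 824
```

By both measures epoch 13 is at least as good as the final epoch: it has the lowest validation loss (1.0583) and ties with epoch 10 for the highest accuracy (0.8067).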