microsoft
/
VibeVoice-Realtime-0.5B

Text-to-Speech
Transformers
Safetensors
English
vibevoice_streaming
Realtime TTS
Streaming text input
Long-form speech generation
Community (22)

Working on DGX Spark (ARM64 + CUDA 13) - Setup Notes

#23 opened 23 minutes ago by
logos-flux

great model

❤️ 2
#22 opened 13 days ago by
omnixeno

For those who need a simplified execution on NVIDIA GPU

🔥 1
#21 opened 14 days ago by
ghostplant

How can we access the acoustic encoder and semantics encoder?

1
#20 opened 18 days ago by
hebangwen

the stream input works great

❤️ 3
#18 opened 23 days ago by
mzbac

Local implementation (tested on macOS M4)

1
#17 opened 23 days ago by
Expressin

Terrible Quality!

1
#16 opened 25 days ago by
qpqpqpqpqpqp

How can we get the position of text in the generated audio?

1
#12 opened 27 days ago by
maifeeulasad

finetune guide

➕ 2
3
#10 opened 28 days ago by
devops724

Music played at the start of the 0.5B model

3
#9 opened 29 days ago by
acoloss

Tried to use this to generate Chinese, sounds very foreign...

2
#8 opened 29 days ago by
id0o0bi

Safety or the joke therein

🔥 2
4
#7 opened 30 days ago by
Tom-Neverwinter

No example code?

3
#6 opened about 1 month ago by
LeroyDyer

Update README.md

#5 opened about 1 month ago by
gghfez

English only?

2
#2 opened about 1 month ago by
PeacePeacepPeace