microsoft/VibeVoice-Realtime-0.5B

Text-to-Speech
Transformers
Safetensors
English
vibevoice_streaming
Realtime TTS
Streaming text input
Long-form speech generation
Community (21)

great model

❤️ 1
#22 opened 3 days ago by
omnixeno

For those who need a simplified execution on NVIDIA GPU

🔥 1
#21 opened 4 days ago by
ghostplant

How can we access the acoustic encoder and semantics encoder?

1
#20 opened 8 days ago by
hebangwen

the stream input works great

❤️ 3
#18 opened 13 days ago by
mzbac

Local implementation (tested on macOS M4)

1
#17 opened 13 days ago by
Expressin

Terrible Quality!

1
#16 opened 15 days ago by
qpqpqpqpqpqp

How can we get the position of text in the generated audio?

1
#12 opened 17 days ago by
maifeeulasad

finetune guide

➕ 2
3
#10 opened 18 days ago by
devops724

Music played at the start of the 0.5B model

2
#9 opened 19 days ago by
acoloss

Tried to use this to generate Chinese, sounds very foreign...

2
#8 opened 19 days ago by
id0o0bi

Safety or the joke therein

🔥 2
3
#7 opened 20 days ago by
Tom-Neverwinter

No example code?

3
#6 opened 20 days ago by
LeroyDyer

Update README.md

#5 opened 21 days ago by
gghfez

English only?

2
#2 opened 21 days ago by
PeacePeacepPeace