Text-to-Audio
Large-Scale Automatic Audiobook Creation
Paper
• 2309.03926
• Published
• 56
FoleyGen: Visually-Guided Audio Generation
Paper
• 2309.10537
• Published
• 8
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper
• 2310.11954
• Published
• 25
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper
• 2310.00704
• Published
• 21
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper
• 2311.00945
• Published
• 16
In-Context Prompt Editing For Conditional Audio Generation
Paper
• 2311.00895
• Published
• 11
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Paper
• 2312.03491
• Published
• 34
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
Paper
• 2407.02869
• Published
• 21
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Paper
• 2407.04051
• Published
• 40
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
Paper
• 2407.15060
• Published
• 9
Improving Text-To-Audio Models with Synthetic Captions
Paper
• 2406.15487
• Published