State of Text-to-Speech

7 papers · avg viability 6.9

Top papers

Fish Audio S2 Technical Report (9.0)
DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice (8.0)
Causal Prosody Mediation for Text-to-Speech: Counterfactual Training of Duration, Pitch, and Energy in FastSpeech2 (7.0)
Bolbosh: Script-Aware Flow Matching for Kashmiri Text-to-Speech (7.0)
Learning-free L2-Accented Speech Generation using Phonological Rules (7.0)
ZeSTA: Zero-Shot TTS Augmentation with Domain-Conditioned Training for Data-Efficient Personalized Speech Synthesis (6.0)
When Fine-Tuning Fails and when it Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS (4.0)