3 papers across 2 sessions
We introduce Streaming Flow Matching, a novel streaming generative model for real-time audio generation from discrete tokens.
We introduce CoreaSpeech, a publicly released 700-hour Korean TTS corpus built with a Korean-specific text normalizer and a novel jamo-based coreset selection pipeline.