Speech and Singing Voice Generation
Research on expressive voice, music intelligence, neural audio, and human-centered multimodal generation.
My work studies controllable generation systems that preserve expression, style, and human intent across sound and movement.
I study generative AI for voice, music, and human motion, with an emphasis on controllable systems that support expressive creation.
I am a Ph.D. student at the School of Computing, National University of Singapore, advised by Prof. Ye Wang in the Sound and Music Computing Lab.
My research covers speech and singing voice synthesis, neural audio codecs, music generation, talking head generation, co-speech gesture generation, and affective multimodal learning.
I am especially interested in models and interfaces that make generated media easier to guide, evaluate, and use in creative or communicative settings.
Before and alongside my research, I have spent many years studying piano and singing. This musical background shapes the questions I care about: how generated sound can remain controllable, expressive, and useful for real creative workflows.
For selected performances and music projects, see my music page.
Three threads connect most of my work: expressive voice generation, music and audio intelligence, and multimodal human signal modeling.
Speech synthesis, singing voice synthesis, voice conversion, singing technique control, and zero-shot vocal modeling.
Neural audio codecs, controllable audio generation, piano transcription/rendering, music style transfer, and creative ML systems.
Talking head generation, co-speech gesture generation, multimodal emotion analysis, and human-centered generation.
A compact timeline of papers, internships, teaching, and milestones.
Filter papers by research thread.
Teaching, academic service, honours, and contact links collected in one place.
For research discussions, collaboration, code, CV, and professional profiles.