Ph.D. Student / NUS Sound and Music Computing Lab

Junchuan Zhao

Research on expressive voice, music intelligence, neural audio, and human-centered multimodal generation.

My work studies controllable generation systems that preserve expression, style, and human intent across sound and movement.

About

I study generative AI for voice, music, and human motion, with an emphasis on controllable systems that support expressive creation.

I am a Ph.D. student at the School of Computing, National University of Singapore, advised by Prof. Ye Wang in the Sound and Music Computing Lab.

My research covers speech and singing voice synthesis, neural audio codecs, music generation, talking head generation, co-speech gesture generation, and affective multimodal learning.

I am especially interested in models and interfaces that make generated media easier to guide, evaluate, and use in creative or communicative settings.

Before and alongside my research, I have spent many years studying piano and singing. This musical background shapes the questions I care about: how generated sound can remain controllable, expressive, and useful for real creative workflows.

For selected performances and music projects, check my music page.

Research Focus

Three threads connect most of my work: expressive voice generation, music and audio intelligence, and multimodal human signal modeling.

Speech and Singing Voice Generation

Speech synthesis, singing voice synthesis, voice conversion, singing technique control, and zero-shot vocal modeling.

Music and Audio Intelligence

Neural audio codecs, controllable audio generation, piano transcription/rendering, music style transfer, and creative ML systems.

Audio-Driven Human Generation

Talking head generation, co-speech gesture generation, multimodal emotion analysis, and human-centered generation.

Recent News

A compact timeline for papers, internships, teaching, and milestones.

  • One paper has been accepted by Interspeech 2026.
  • I am joining Tencent as a Research Intern in Speech Synthesis in Singapore.
  • One paper has been accepted by the ACL 2026 Main Conference.
  • One paper has been accepted by IEEE/ACM TASLP 2026.
  • One paper has been accepted by ICLR 2026.
  • Two papers have been accepted by ICASSP 2026.
  • One paper has been accepted by Interspeech 2025.
  • I was selected for the SoC Teaching Fellowship Scheme.
  • I started my Ph.D. journey at NUS in the SMC Lab.

Publications

Filter papers by research thread.

Academic Record

Teaching, academic service, honours, and contact links collected in one place.

Teaching

  • Teaching Assistant, CS3244 Machine Learning, Sem2 AY2025/2026.
  • Teaching Assistant, CS4347/CS5647 Sound and Music Computing, Sem1 AY2025/2026.
  • Teaching Assistant, CS4347/CS5647 Sound and Music Computing, Sem1 AY2024/2025.

Service

  • Reviewer for ToMM, TASLP, INTERSPEECH, ICASSP, ISMIR, and ACM MM.
  • Event chair and performer at the NUS Sound and Music Computing Concert 2025.
  • Featured performer at NUS SMC Concert and NUS Jazz Band events.

Honours

  • SoC Teaching Fellowship Scheme, 2025-2026.
  • NUS Research Scholarship, 2023-2027.
  • Outstanding College Student in Beijing, 2022.
  • Queen Mary University of London Undergraduate College Prize, 2022.

Contact

For research discussions, collaboration, code, CV, and professional profiles.
