CS SEMINAR

Talk 1: Shaping the Future of Music Innovation: From Pop Music Transformer to Efficient and Controllable AI Models

Talk 2: Human-AI Interaction in Music: Ensemble Performance, LLM-Powered Music Annotation and Retrieval, and Beyond

Speakers
1: Dr. Yi-Hsuan Yang, Professor, College of Electrical Engineering and Computer Science, National Taiwan University
2: Dr. Juhan Nam, Professor, Korea Advanced Institute of Science and Technology (KAIST)

Chaired by
Dr. WANG Ye, Associate Professor, School of Computing
wangye@comp.nus.edu.sg

24 Oct 2025, Friday, 3:30 PM to 5:30 PM

MR92, COM3 B1-15

Talk 1 by Dr. Yi-Hsuan Yang, 3:30 PM to 4:30 PM
Abstract:
This talk traces the journey of AI-driven music generation, spotlighting key innovations that have transformed music creation. We'll revisit foundational works: Pop Music Transformer, which pioneered expressive pop piano generation; Theme Transformer, which enabled theme-driven composition; and MuseMorphose, which advanced style transfer for piano music. The discussion will also highlight recent advances in text-to-music generation, such as SiMBA-LDM, a model that drastically reduces training costs while delivering high-quality music, and MuseControlLite, which enhances control over melody and rhythm through lightweight fine-tuning. Additionally, we'll explore audio effect modeling for guitar tones, showcasing practical applications. Geared toward tech enthusiasts, this talk will illustrate how these models make music creation more accessible, precise, and creative.

Bio:
Dr. Yi-Hsuan Yang received the Ph.D. degree in Communication Engineering from National Taiwan University. Since February 2023, he has been with the College of Electrical Engineering and Computer Science, National Taiwan University, where he is a Full Professor. Prior to that, he was Chief Music Scientist at Taiwan AI Labs, an industrial research lab, from 2019 to 2023, and an Associate/Assistant Research Fellow at the Research Center for IT Innovation, Academia Sinica, from 2011 to 2023. His research interests include automatic music generation, music information retrieval, artificial intelligence, and machine learning. His team developed music generation models such as MidiNet, MuseGAN, Pop Music Transformer, and KaraSinger. He served as an Associate Editor for the IEEE Transactions on Multimedia and the IEEE Transactions on Affective Computing, both from 2016 to 2019. Dr. Yang is a Senior Member of the IEEE.

Talk 2 by Dr. Juhan Nam, 4:30 PM to 5:30 PM
Abstract:
This talk presents recent research from the Music and Audio Computing Lab at KAIST, focusing on topics that emphasize human-AI interaction in music. The first topic is human-AI ensemble performance, which requires multimodal, real-time, and robust processing for effective musical communication. This umbrella theme encompasses a wide range of music information retrieval (MIR) tasks, such as piano transcription and score following, as well as visual processing tasks like musical cue detection and performance visualization. We will showcase the outcomes through various demos and stage performances. The second topic is LLM-powered music annotation and retrieval, which enables the use of rich textual descriptions and multi-turn conversations for fine-grained music research. This part explores how music-text understanding has evolved from traditional music classification models to multimodal musical LLMs. Finally, if time permits, I will briefly introduce other ongoing research topics, including Foley sound synthesis, audio enhancement, traditional Korean music analysis, and piano music arrangement.

Bio:
Juhan Nam is a Professor at the Korea Advanced Institute of Science and Technology (KAIST), South Korea. He leads the Music and Audio Computing Lab at the Graduate School of Culture Technology, where his research focuses on music information retrieval, audio signal processing, and AI-based music applications. He also serves as the Director of the Sumi Jo Performing Arts Research Center, fostering collaborations with artists to develop innovative technologies for music performance and education. He received his Ph.D. in Music from Stanford University, where he studied at the Center for Computer Research in Music and Acoustics (CCRMA). Before his academic career, he worked at Young Chang (Kurzweil), developing synthesizers and digital pianos.