On Mon, March 30, 2026

Speaker

Kazuhiro Nakadai


Title

Robot Audition in the Wild: Toward an Inclusive Society


Abstract

Robot Audition is a concept originally proposed by Nakadai and colleagues to enable robots to perceive and understand complex acoustic scenes in real-world environments where noise, reverberation, and multiple sound sources coexist. In this invited talk, I revisit the development of robot audition from the perspective of “in the wild” sensing, highlighting how auditory and multimodal perception must evolve when robots operate beyond controlled laboratory settings. The talk begins by introducing the core technologies of robot audition, including sound source localization and separation, and by discussing the fundamental challenges that arise when these techniques are deployed in real environments. Building on this foundation, I present a series of research efforts that extend robot audition toward real-world applications, such as locating humans using sound in search-and-rescue scenarios, inferring environmental and surface properties from acoustic signals, and analyzing bird songs for ecological monitoring in outdoor environments. I also discuss how the same technical principles can be extended toward human-centered interaction, particularly sign-language-based human–robot interaction, where communication relies on non-verbal and multimodal signals rather than speech alone. Although these research topics address different application domains, they are unified by a common technical direction: expanding the signals, agents, and environments that intelligent systems are designed to perceive and reason about. By framing robot audition as a foundation for multimodal perception and interaction in the wild, this talk presents a technical pathway through which such expansions can lead toward the realization of an inclusive society, enabling intelligent systems to engage not only with diverse humans, but also with challenging environments and even non-human entities.


Bio

Kazuhiro Nakadai received a B.E. in electrical engineering in 1993, an M.E. in information engineering in 1995, and a Ph.D. in electrical engineering in 2003 from the University of Tokyo. He worked at Nippon Telegraph and Telephone as a system engineer from 1995 to 1999, at the Kitano Symbiotic Systems Project, ERATO, JST as a researcher from 1999 to 2003, and at Honda Research Institute Japan, Co., Ltd. as a principal scientist from 2003 to 2022. He is currently a professor in the Department of Systems and Control Engineering, School of Engineering, Institute of Science Tokyo (formerly Tokyo Institute of Technology). He concurrently served as a visiting associate professor at Tokyo Institute of Technology from 2006 to 2010, a visiting professor from 2011 to 2017, and a specially appointed professor from 2017 to 2022. He also held a concurrent position as a guest professor at Waseda University from 2011 to 2018. His research interests include artificial intelligence, robotics, signal processing, computational auditory scene analysis, multimodal integration, and robot audition. He served as an executive board member of the Japanese Society for Artificial Intelligence (JSAI) from 2015 to 2016 and from 2024 to 2025, and of the Robotics Society of Japan (RSJ) from 2017 to 2018. He is a Fellow of both the IEEE and the RSJ.


Language

English (Offline)