
Wednesday, Feb. 23, 2011
11 a.m., TI Auditorium
(ECSS 2.102)

EE Seminar Series

“Analysis, Recognition and Synthesis of Human Behaviors:
A Multimodal Approach”

Dr. Carlos Busso, UT Dallas

Sponsored by the Dallas Chapter of the IEEE Signal Processing Society

Abstract
During interpersonal human interaction, speech and gestures are intricately coordinated to express and emphasize ideas and to provide suitable feedback to the listener. The tone and intensity of speech, spoken language patterns, facial expressions, head motion, and hand movements are all woven together in a nontrivial manner to convey intent and desire in natural human communication. A joint analysis of these modalities is necessary to fully decode human communication. Such analysis is critically needed, among other applications, in designing next-generation information technology that attempts to mimic and emulate how humans process and produce communication signals. This talk will summarize our ongoing research in recognizing and synthesizing paralinguistic information conveyed through multiple communication channels during human interaction, with emphasis on social emotional behaviors.

Bio
An assistant professor of electrical engineering, Carlos Busso received his doctorate in electrical engineering from the University of Southern California in 2008 and was a postdoctoral research associate in USC's Signal Analysis and Interpretation Laboratory from 2008 to 2009. His research interests are in digital signal processing, speech and video processing, and multimodal interfaces. His current research includes modeling and understanding human communication and interaction, with applications in automated recognition and synthesis for enhanced human-machine interfaces. He has worked on audiovisual emotion recognition, analysis of emotional modulation in gestures and speech, design of realistic human-like virtual characters, speech source detection using microphone arrays, speaker localization and identification in intelligent environments, and sensing human interaction in meetings.