Petros Maragos received the Diploma in E.E. from the National Technical University of Athens (NTUA) in 1980 and the M.Sc. and Ph.D. degrees from Georgia Tech, Atlanta, in 1982 and 1985. In 1985, he joined the faculty of the Division of Applied Sciences at Harvard University, where he worked for eight years as professor of electrical engineering affiliated with the Harvard Robotics Lab. In 1993, he joined the faculty of the School of ECE at Georgia Tech, affiliated with its Center for Signal and Image Processing. During parts of 1996-98, he held a joint appointment as director of research at the Institute of Language and Speech Processing in Athens. Since 1999, he has been working as professor at the NTUA School of ECE, where he is currently the director of the Intelligent Robotics and Automation Lab.
He is also the coordinator of a robotics perception & interaction research unit at the Athena Research and Innovation Center. He has held visiting positions at MIT in 2012 and at UPenn in 2016. His research and teaching interests include signal processing, systems theory, machine learning, image processing and computer vision, audio-speech & language processing, and robotics. He has served as: member of IEEE SPS technical committees; associate editor for the IEEE Trans. on ASSP and IEEE Trans. on PAMI; editorial board member and guest editor for several journals on signal processing, image analysis and vision; and co-organizer of several conferences and workshops on image processing, computer vision, multimedia and robotics (including recently EUSIPCO 2017 as general chair).
He has also served on the Greek National Council for Research and Technology. He is the recipient or co-recipient of several awards for his academic work, including a 1987-1992 National Science Foundation Presidential Young Investigator Award, a 1988 IEEE SPS Young Author Best Paper Award, a 1994 IEEE SPS Senior Best Paper Award, the 1995 IEEE W.R.G. Baker Prize Award for the most outstanding original paper, the 1996 Pattern Recognition Society’s Honorable Mention Award, the EURASIP 2007 Technical Achievement Award for contributions to nonlinear signal, image and speech processing, and the Best Paper Award of the IEEE CVPR-2011 Gesture Recognition Workshop. He was elected a Fellow of IEEE in 1995 and a Fellow of EURASIP in 2010 for his research contributions. He has been elected IEEE SPS Distinguished Lecturer for 2017-2018.
Synergy between Computer Vision and Language Processing for Multimodal Perception and Understanding
Abstract:
In this talk we will present an overview of ideas, methods and research results in multimodal spatio-temporal sensory processing with emphasis on audio-visual processing, fusion and learning as applied to problems of attention, speech & language processing coupled with video understanding, and human-robot interaction. We shall begin with a brief synopsis of related findings from audio-visual perception. Afterwards, emphasis will be given to problems of attention, where we will present improved computational saliency estimation algorithms for multimodal salient event detection, followed by efficient video summarization based on audio, visual, and text modalities. Then, we will present methods for extracting improved weak labels from text for visual recognition. Finally, we will outline the application of some of the above ideas and methods to intelligent assistive and social robotics by focusing on one main goal: to provide multimodal sensory processing capabilities for detecting, analyzing and recognizing the actions of the human user. A major challenge on which we show advances is the distant recognition of gestural and verbal commands in the considered human-robot interaction context, as well as their cross-modal integration for improved performance.
More information and related papers can be found at http://cvsp.cs.ntua.gr and http://robotics.ntua.gr.