This talk is divided into two parts. In the first part, we tackle a fundamental problem in computer vision: detecting humans and understanding their pose in natural images. Unlike previous work, which manually decomposes the human body into a set of anatomical parts (e.g. head, torso, arms, legs), we propose a novel technique that models humans with a flexible mixture of small, non-oriented parts. We present the intuition behind this model and demonstrate its superior performance on human pose estimation, improving on past work across several benchmark datasets while running orders of magnitude faster.
In the second part, we introduce deep learning techniques, specifically deep convolutional neural networks, as a further way of tackling computer vision problems, overcoming the limitations of feature engineering and manually designed model representations. We show the exciting improvements they bring to a variety of computer vision tasks, including image recognition, object detection, human pose estimation, and image segmentation, along with a demonstration of web and mobile applications from Baidu China.
Yi Yang joined the Baidu Institute of Deep Learning as a researcher in October 2013. He received his PhD in computer science from the University of California, Irvine in 2013, advised by Prof. Deva Ramanan. He completed a bachelor's degree at Tsinghua University in 2006 and a master's degree at the Hong Kong University of Science and Technology in 2008. He held summer internships at Microsoft Research in 2011 and at Google in 2012. His research interests include object detection, image recognition, human pose estimation, image segmentation, action recognition, multi-person interaction recognition, and deep learning.