Faculty Candidate Seminar

Large-Scale Visual Recognition Powered by Big Data

Jia Deng

Post Doc
Thursday, April 04, 2013
4:00pm - 5:00pm
1670 Beyster

About the Event

Having machines recognize everything in our visual world is one of the grand challenges of computer vision. This entails building a system capable of distinguishing tens of thousands, if not millions, of fine-grained visual classes across a wide range of domains (for example, distinguishing different breeds of terriers or different Toyota models). The past several decades of computer vision research have mostly focused on recognizing a few dozen basic-level object categories (such as distinguishing dogs from cars). However, the problem of visual recognition is necessarily large-scale, and algorithms must tackle an entirely new level of scale and complexity in both visual and semantic space. The key challenges include harvesting data, incorporating domain knowledge that enables fine-grained distinctions, and handling the large, richly structured output space.

In this talk I will present my research that takes a big data approach to scaling up recognition. I will start with an overview of the ImageNet project, which harvests big visual data -- tens of millions of images for tens of thousands of visual classes -- through large-scale crowdsourcing. Next, I will demonstrate how to recognize fine-grained, subordinate categories via a human-machine collaboration framework that couples a new image representation with a computer game that collects novel forms of data from the crowd. Third, I will explore ways to tackle the large label space, one with tens of thousands of visual categories organized in a large taxonomy. Here I will present a provably optimal approach to optimizing the trade-off between accuracy and specificity, which leads to a reliable recognition engine operating on 10K+ object categories. Finally, I will discuss future directions that hold promise for unleashing the full power of big data toward large-scale, real-world computer vision.


Jia Deng is a postdoctoral scholar in the Vision Lab at Stanford University. He received his PhD in Computer Science from Princeton University in 2012. His research centers on harvesting, understanding, and harnessing big visual data. He has built datasets and tools used by more than 1,000 researchers worldwide, and his work has appeared in the popular press, including the New York Times. He has been the lead student organizer of the ImageNet Large Scale Visual Recognition Challenges since 2010. He is also the lead organizer of the first BigVision workshop at NIPS 2012.

Additional Information

Contact: Cindy Estell

Phone: 734-764-6744

Email: cestell@umich.edu

Sponsor(s): CSE

Open to: Public