Supercomputing speeds advances in deep neural networks

12/18/2014 Mike Koon, Engineering Communications Office

Professor Thomas Huang's team is developing image-recognition algorithms for unlabeled images, with research spilling over into multimedia.

When it comes to facial and object recognition, companies like Google and Facebook have a leg up. Those organizations have massive libraries of images that are labeled by users around the world. However, most of the world’s images and videos carry no such labels, and the need for a way to identify them automatically is growing rapidly in the ever-changing world of Big Data.

That’s where Professor Emeritus and Research Professor Thomas S. Huang comes in.

His team is developing algorithms for image recognition for those unlabeled images, with research spilling over into multimedia. The focus is on biometrics (face recognition, emotion recognition), human-computer interfaces (avatars), and multimedia retrieval (search, ranking). The work is made possible by recent advances in supercomputing. In short, Huang’s research lies at the intersection of deep learning, Big Data, and high-performance computing.

Huang has gained a reputation for being ahead of his time when it comes to using convolutional neural networks, a type of feed-forward neural network (FFNN) loosely modeled on the operation of the human brain, for image recognition. His influence in the space has helped place students at giants like Baidu and Google and drawn the attention of a number of researchers.

“Applied learning and deep learning technology require high-performance computing,” Huang said. “If we don’t have the labels for this huge amount of data, we can’t train the classifier and use it to apply our algorithms. Crowdsourcing or high performance computing is the key to getting the label here.”

“What we are finding is that these problems really benefit from having that data and computer power,” said Tom Paine, a PhD candidate assisting Huang in this effort. “Not so long ago, our lab really didn’t go after Big Data projects; however, we have found that if you can use a few GPUs (graphics processing units) and speed things up by a factor of 10 and multiple GPUs for another factor of 10, you can get accurate results on all of these tasks.”

For instance, Huang’s group is teaming with Robert Brunner, an associate professor in astronomy, to identify images of galaxies. Astronomers have captured images of about a quarter of the more than 170 billion galaxies in the universe. The ultimate goal is to use deep learning technology, taking a set of architectures to form a set of algorithms, to classify images from the many wavelengths of light captured by telescopes.

Toward that end, the team took 60,000 labeled images, using the first half to train the algorithm and the second half to test it. The resulting algorithm proved 92.5 percent accurate. Because supercomputers such as Blue Waters can process that many images in about five minutes, the team believes it can quickly make the algorithm even more accurate.
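
As a rough illustration of that split-and-score procedure, here is a minimal Python sketch. The galaxy labels, the `split_in_half` and `accuracy` helpers, and the synthetic predictions are all hypothetical stand-ins, not the team's code or data; only the 60,000-image count, the half-and-half split, and the 92.5 percent figure come from the account above.

```python
def split_in_half(items):
    """Return (first half, second half) of a sequence, in original order."""
    mid = len(items) // 2
    return items[:mid], items[mid:]


def accuracy(predicted, actual):
    """Fraction of predicted labels that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def flip(label):
    """Swap one hypothetical label for the other to simulate a mistake."""
    return "spiral" if label == "elliptical" else "elliptical"


# Hypothetical stand-in for 60,000 labeled images: (image_id, label) pairs.
labeled = [(i, "spiral" if i % 2 else "elliptical") for i in range(60_000)]

# First half trains the model, second half is held out for testing.
train, test = split_in_half(labeled)

# No real model here: simulate predictions that are wrong on 2,250 of the
# 30,000 held-out images, which corresponds to 92.5 percent accuracy.
actual = [label for _, label in test]
predicted = [flip(label) if i < 2_250 else label
             for i, label in enumerate(actual)]

print(f"{accuracy(predicted, actual):.1%}")  # prints 92.5%
```

Scoring only on the held-out half matters because it measures how the algorithm handles images it never saw during training.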

ECE graduate student Kuangxiao Gu collaborates with a program called I-KID, in which psychologists use photos of babies’ eyes to detect how well the babies are learning. To help collect data, they are taking photographs of babies looking left, right, or straight ahead. The psychologists can classify the data of each subject and match them to images of eyes. The research group is developing an algorithm that can help those psychologists draw informed conclusions about a subject just by looking at the subject’s eyes.

“We are using a machine learning technique to classify those images then using a deep neural network to map which way they are looking,” Gu said. “We need a lot of training data to cover the possibilities.”

Another member of the team, ECE graduate student Pooya Khorrami, collaborates with MIT Lincoln Laboratory on a project on emotions. Using video, audio and text, they are developing an algorithm to recognize emotional cues.

The application accurately detects whether a person is surprised, positive, negative, or neutral. They plan to broaden the list to detect happiness, sadness, fear, anger, and disgust. Using classroom video, it could also tell an instructor whether students are engaged or bored.

Of course, one issue in labeling is noise: the neural network must be programmed to recognize an image through object or facial recognition despite the presence of other graphical elements.

ECE graduate student Xianming Liu is helping train a neural network to identify an object quickly and accurately despite the presence of “noise.”

“We can train these neural networks to be good at a task, but it is more difficult to have control over what the neurons are learning,” Liu said. “We are using feedback to train the neural network to get better results.”

The results of the projects are reciprocal. Huang’s group is helping outside researchers by developing deep learning techniques using Big Data, but the sum of the projects is in turn helping refine the techniques.

“We are interested in developing general techniques, but we need the real applications to test the evidence,” Huang said.

He emphasized again that the advances have only been possible recently because of the ability to use supercomputing to organize the data.

“If we want to have good algorithms, which are person independent, then the more of the training data, the better,” Huang said. “In developing neural networks, you have so many parameters to adjust that you need quite a bit of data. In short, developing these networks without Big Data and supercomputers is not doable.”

So what does the future look like? For starters, it means applying deep learning to biometric tasks such as age and face recognition. Businesses are also interested in quicker ways of getting feedback to make more effective products and services. Google and Facebook recently bought companies that specialize in deep learning, and they have hired deep-learning experts to lead those efforts.

“These things have ushered in a real revolution in terms of object recognition as well as object detection, knowing not only the image has the object in it, but pinpointing where the object is,” Paine said. “To date, they have been using simple machine-learning models like linear classifiers. Deep learning is a way to make much more sophisticated models that use this data much better. If you give a linear model millions of images, it does a little bit better. If you apply the same images to a deep learning model, it gets you a significant performance boost.”

“We think that the next big thing in this area is distributed computing using systems like Blue Waters and unsupervised learning,” Khorrami said. “No one has really pushed that yet. We’re vying to make that work well.” 


This story was published December 18, 2014.