The Eyes Have It… But What Is It?

What does it mean to "see"? The answer might help us train artificial systems to analyze the staggering volume of videos online.

“We’re trying to understand, from a fundamental principles point of view, what it means to see.”

Computational vision scientist Richard Wildes, professor of electrical engineering and computer science at York University, studies computer systems as well as biological systems in formulating his answer.

With a biological system, the eyes collect visual information that is sent to the brain for processing. In an artificial system, a camera might record video that is sent to a computer for processing. But both inhabit the same physical spaces, and in many cases both have the same endgame.

“In either case, you’re trying to recognize faces, you’re trying to recognize actions, you’re trying to recognize the scene in which the actions are occurring. It makes sense that the processing in between is the same at a certain level of abstraction,” adds Wildes.

From historic moments to everyday activities, a plethora of personal videos are now shared online. An incredible 300 hours of video are uploaded to YouTube every minute, and that’s just one of many video repositories. These collections contain a wealth of information, but it would be nearly impossible for any one person to sift through them all.

Through a combination of image processing and artificial intelligence, Wildes is trying to train computers to recognize and categorize video content automatically.

“For example, if we’re taking a video of somebody being interviewed, how can we actually analyze that video to be able to say, ‘Here’s a man, he’s answering questions, he’s telling a story about his research,’” explains Wildes.

Wildes believes we are very close to deploying tools that can automatically analyze videos for the actions and environments they capture. This true video understanding will make it easier to extract information from the vast collections of shared videos from around the world.

Richard Wildes received his PhD from the Massachusetts Institute of Technology in 1989. Subsequently, he joined Sarnoff Corporation (now SRI) as a Member of the Technical Staff in the Vision Technologies Lab. In 2001, he joined the Department of Electrical Engineering and Computer Science (EECS) at York University, Toronto, as an Associate Professor. He is currently Associate Director of York’s Vision: Science to Applications (VISTA) program, funded for $33.3 million by the Canada First Research Excellence Fund. He previously served as Chair of EECS and as Associate Director of the York Centre for Vision Research (CVR).

Wildes’s research interests are in computational vision, especially video understanding and machine vision applications, as well as artificial intelligence. While in industry, he led teams developing vision technologies for various real-world applications, including iris recognition, where he is widely considered a pioneer. His awards and honours include the IEEE D.G. Fink Prize Paper Award and the Sarnoff Corporation Technical Achievement Award. He has twice given invited presentations at the USA National Academy of Sciences and currently holds a York Research Chair, Tier I.

Research2Reality is a groundbreaking initiative that shines a spotlight on world-class scientists engaged in innovative and leading edge research in Canada. Our video series is continually updated to celebrate the success of researchers who are establishing the new frontiers of science and to share the impact of their discoveries with the public.