A team led by HHMI investigator Terrence Sejnowski reported the feat in the March 1999 issue of the journal Psychophysiology.
The automated system, which has been improved since the article appeared, could be a boon for behavioral studies. Scientists have already found ways, for example, to distinguish false facial expressions of emotion from genuine ones. In depressed
individuals, they've also discovered differences between the facial signals of suicidal and nonsuicidal patients. Such research relies on a coding system developed in the 1970s by Paul Ekman of the University of California, San Francisco, a coauthor of the Psychophysiology paper. Ekman's Facial Action Coding System (FACS) breaks down facial expressions into 46 individual motions, or action units.
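The AU numbers and names below are standard FACS labels; the dictionary encoding itself is purely illustrative, a minimal sketch of how a coding scheme like Ekman's might be represented in software. The combination shown, a cheek raise plus a lip-corner pull, is the classic marker of a felt "Duchenne" smile, one of the cues coders use to separate genuine expressions from posed ones.

```python
# Illustrative only: a handful of well-known FACS action units (AUs).
# The full system catalogs 46 of these individual facial motions.
ACTION_UNITS = {
    1: "Inner Brow Raiser",
    2: "Outer Brow Raiser",
    4: "Brow Lowerer",
    6: "Cheek Raiser",
    12: "Lip Corner Puller",
    15: "Lip Corner Depressor",
}

def describe(aus):
    """Render a set of AU numbers as a readable FACS code."""
    return " + ".join(f"AU{n} ({ACTION_UNITS[n]})" for n in sorted(aus))

# A felt ("Duchenne") smile engages the cheeks as well as the mouth;
# a posed smile often shows only the lip-corner pull.
duchenne_smile = {6, 12}
posed_smile = {12}
print(describe(duchenne_smile))  # AU6 (Cheek Raiser) + AU12 (Lip Corner Puller)
```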
Sejnowski's team designed the computer program to use the same coding system. Their challenge was to enable the program to recognize the minute facial movements upon which the coding system is based. Other researchers had come up with different computerized approaches for analyzing facial motion, but all had limitations, says Sejnowski, who is director
of the Computational Neurobiology Laboratory at The Salk Institute for Biological Studies in La Jolla, California, and a professor of biology at the University of California, San Diego (UCSD). A technique called feature-based analysis, for example, measures variables such as the degree of skin wrinkling at various points on the face. "The trouble," Sejnowski explains, "is that some people don't wrinkle at all and some wrinkle a lot. It depends on age and a lot of other factors, so it's not always reliable."
His team, which included Ekman, Marian Stewart Bartlett of UCSD, and Joseph Hager of Network Information Research Corp. in Salt Lake City, took the best parts of three existing facial-motion-analysis systems and combined them.
"We discovered that although each of the methods was imperfect, when we combined them the hybrid method performed about as well as the human expert, which is at an accuracy of around 91 percent," Sejnowski says. The computer program did much better than human nonexperts, who performed with only 73.7 percent accuracy after receiving less than an hour of practice in recognizing and coding action units. The coding process involves identifying and marking sequences of frames in which an individual facial expression begins, peaks and ends. A minute of video can contain several hundred action units to recognize and code.
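The article does not spell out how the three methods' outputs were fused, so the sketch below shows only one common, generic approach: average each method's per-action-unit confidence score and threshold the result, so that detectors with complementary weaknesses can outvote each other's mistakes. The function and detector names are hypothetical, not drawn from the published system.

```python
# Hypothetical sketch of classifier fusion: average the per-AU confidence
# scores from several imperfect detectors, then threshold. The hybrid in
# the Psychophysiology paper may well combine its methods differently.

def fuse_scores(method_scores, threshold=0.5):
    """method_scores: list of dicts mapping AU number -> confidence in [0, 1].
    Returns the set of AUs whose mean confidence clears the threshold."""
    all_aus = set().union(*(scores.keys() for scores in method_scores))
    detected = set()
    for au in all_aus:
        mean = sum(s.get(au, 0.0) for s in method_scores) / len(method_scores)
        if mean >= threshold:
            detected.add(au)
    return detected

# Three hypothetical detectors score one video frame. They disagree on
# AU6 (mean 0.3, below threshold) but agree on AU12 (mean 0.8).
feature_based = {6: 0.2, 12: 0.9}
flow_based = {6: 0.4, 12: 0.7}
template_based = {6: 0.3, 12: 0.8}
print(fuse_scores([feature_based, flow_based, template_based]))  # {12}
```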
In the work reported in Psychophysiology, the researchers taught the computer program to recognize 6 of the 46 action units. Since then, the program has mastered six more and, by incorporating new image-analysis methods developed in Sejnowski's lab, the system's performance has risen to 95 percent accuracy. The additional work was published in the October 1999 issue of IEEE Transactions on Pattern Analysis and Machine Intelligence.
Now the team is engaged in a friendly "cooperative competition" with researchers from Carnegie Mellon University and the University of Pittsburgh who have developed a similar system. The two systems will be tested on the same images to allow direct comparisons of performance on individual images as well as overall accuracy. The teams will then collaborate on a new
system that incorporates the best features of each.
A computer that accurately reads facial expressions could result in a better lie detector, which is why the CIA is funding the joint project. But Sejnowski sees other possible commercial applications as well.
"This software could very well end up being part of everybody's computer," he says. "One of the goals of computer science is to have computers interact with us in the same way we interact with other human beings. We're beginning to see programs that can recognize speech." But humans use more than speech recognition when they communicate with each other, he explains. In face-to-face conversation, "you watch how a person reacts to know whether they've understood what you've said and how they feel about it." Your desktop computer can't do that, so it doesn't know when it has correctly interpreted your words or when it has bungled the meaning. With this software and a video camera mounted on your monitor, Sejnowski thinks your computer might someday read you as well as your best friend does.
And speaking of best friends, the software could conceivably give robotic pets a leg up on the furry kind. In a project at UCSD, the researchers are integrating their system into the popular robotic dog AIBO, developed by Sony Corp., with the goal of training the robo-pet to recognize individual people and respond to their emotions. For example, says Sejnowski, "AIBO might comfort you if you are upset or play with you if you are restless." The eventual product could be more than just a high-tech toy. With recent research showing health benefits from interacting with pets, an empathic AIBO might be good medicine for people too ill or frail to care for living animals.
Exciting as the commercial prospects may be, Sejnowski says he's most interested in using the system to explore information processing in the human brain. "In the recent work, we find that the best performance comes from a method based on the way that single neurons filter visual images in the very first stage of processing in the visual cortex," he says. "The next step is to see whether or not some of the subsequent stages of processing in the visual system line up with the methods that we've developed."
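The filters Sejnowski alludes to, models of how V1 simple cells respond to images, are conventionally described by Gabor functions: a sinusoid windowed by a Gaussian, tuned to a particular orientation and spatial frequency. The sketch below builds such a kernel from scratch; the parameter choices are illustrative, not taken from the published system.

```python
import math

def gabor_kernel(size=7, wavelength=4.0, theta=0.0, sigma=2.0):
    """Build a small 2-D Gabor kernel: a sinusoid windowed by a Gaussian,
    the standard model of a V1 simple-cell receptive field."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid runs along orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            gauss = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            row.append(gauss * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def convolve_at(image, kernel, cy, cx):
    """Filter response centered on pixel (cy, cx); like a single simple
    cell reporting how strongly its preferred pattern appears there."""
    half = len(kernel) // 2
    return sum(kernel[j + half][i + half] * image[cy + j][cx + i]
               for j in range(-half, half + 1)
               for i in range(-half, half + 1))
```

A bank of these kernels at several orientations and scales, applied across the face, yields the kind of representation the lab's image-analysis methods build on.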
The new methods are also proving useful in analyzing brain images obtained through functional magnetic resonance imaging. Sejnowski likes to think of the brain's changing activity patterns as "brain expressions," similar in many ways to facial expressions. The goal of his new work is to understand, by analyzing sequences of brain images, how different patterns of neural activity relate to particular tasks the brain is tackling or thoughts that are flickering through it.
"The face expresses what's going on in the brain, but it's only a pale reflection," Sejnowski says. "Now we have the capability, with brain imaging, to actually look inside and see what's going on in the person's mind while they're experiencing an emotion and making the facial expression."
Photo: Mark Harmel
Reprinted from the HHMI Bulletin, May 2001, pages 12-17.
©2001 Howard Hughes Medical Institute