My lab is developing high-throughput tools to help biologists characterize the behavior of genetic model organisms. These tools use techniques from machine vision and learning to automate the majority of the behavior analysis. They are critical to obtaining the throughput necessary for the large-scale neuroethology projects at JFRC, such as the Fly Olympiad, in which the behaviors of thousands of transgenic lines must be quantitatively measured. They will also collect more detailed, accurate measurements of larger numbers of animals behaving in more natural environments, resulting in the discovery of more subtle behavioral effects, even in smaller scale experiments. Similar impact can be expected on both large- and small-scale behavioral genetics studies. Animal behavior analysis is also of interest to the machine vision and learning communities, as it is an ideal testbed for our algorithms. Work on this problem will contribute to our understanding of what good solutions to the tracking and behavior analysis problems are, and to our understanding of what "behavior" means.
Our research into animal behavior analysis can be divided into three areas: tracking, behavior detection, and behavior mining, summarized in the figure.
Animal-tracking software inputs video of animals behaving and returns their positions in each frame. We are developing novel tracking algorithms tailored to the constraints of biological science. Our tracking algorithms must find a balance between the following:
- usability by computer vision novices
- the robustness and reliability necessary for high-throughput, long-duration experiments
- the accuracy and predictability required for scientific results
- adaptability to different experimental setups
- the flexibility to handle difficult, complex experimental setups, in particular, large numbers of interacting animals
Our goal is to create open-source, freely available software that can "solve" the tracking problem for model organisms in controlled lab environments. To this end, we have developed a freely available, open-source software package called Ctrax (http://ctrax.sourceforge.net/) for multiple fly tracking. It is currently being used by biologists in a number of labs. We will continue to develop this and new software, an endeavor that requires both academic computer vision research and practical software development. Besides adapting existing tracking algorithms to the animal-tracking problem, we are developing new algorithms for multitarget tracking and high-resolution pose estimation, and experimenting with semisupervised methods for making tracking software more adaptable and easier to use for those without knowledge of the underlying tracking algorithm.
Statistics of an animal's position over time, as well as easily computed per-frame properties such as speed, change in velocity direction, and distance to another animal, provide a quantitative description of the animal's behavior. We could end our research with these per-frame statistics of behavior and certainly observe many interesting behavioral effects. However, we may be able to find more concise or subtle behavioral effects by incorporating biologists' taxonomies of behavior classes observed in the animal. For a given behavior identified by a biologist (e.g., walking, wing extension, lunging), we can create higher level descriptions of behavior by segmenting trajectories into sequences in which the animal is and is not performing this behavior. For example, one can quantify the frequency with which a fly lunges, the mean speed of a fly while walking, or the mean total change in orientation during saccades.
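To make the per-frame properties and bout-level summaries concrete, here is a minimal Python sketch. It is an illustration only and not the code used in Ctrax or in our analysis pipeline: the function names, the input format (a list of (x, y) positions per frame), and the frame rate parameter `fps` are all assumptions made for this example.

```python
import math

def per_frame_features(xy, other_xy, fps):
    """Speed, change in velocity direction, and distance to another animal.

    xy, other_xy: lists of (x, y) positions, one per frame (assumed format).
    Returns three lists with one value per frame transition (len(xy) - 1).
    """
    speeds, turns, dists = [], [], []
    prev_heading = None
    for t in range(1, len(xy)):
        dx = xy[t][0] - xy[t - 1][0]
        dy = xy[t][1] - xy[t - 1][1]
        speeds.append(math.hypot(dx, dy) * fps)  # distance units per second
        heading = math.atan2(dy, dx)  # velocity direction; noisy when nearly still
        if prev_heading is None:
            turns.append(0.0)
        else:
            d = heading - prev_heading
            turns.append(math.atan2(math.sin(d), math.cos(d)))  # wrap to [-pi, pi]
        prev_heading = heading
        dists.append(math.hypot(xy[t][0] - other_xy[t][0],
                                xy[t][1] - other_xy[t][1]))
    return speeds, turns, dists

def bouts(labels):
    """Return (start, end) frame indices, end exclusive, for each run of True."""
    runs, start = [], None
    for i, performing in enumerate(labels):
        if performing and start is None:
            start = i
        elif not performing and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(labels)))
    return runs

def behavior_summary(speeds, labels, fps):
    """Bout frequency and mean speed during the labeled behavior."""
    in_bout = [s for s, performing in zip(speeds, labels) if performing]
    return {
        "bouts_per_second": len(bouts(labels)) * fps / len(labels),
        "mean_speed_in_bout": sum(in_bout) / len(in_bout) if in_bout else float("nan"),
    }
```

Given per-frame labels for, say, walking, `behavior_summary` yields the kind of higher level statistics described above: how often the behavior occurs and how fast the animal moves while performing it.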
We are developing a "vocabulary" of Drosophila behaviors, and tools for automatically detecting these behaviors. Using machine learning, we can easily create detectors for many behaviors. To define a behavior detector, a biologist manually segments sample trajectories into sequences in which the animal is and is not performing the behavior of interest. Our software then learns a detector that replicates these labels. We are making this learning software usable by computer science novices so that the vocabulary of behaviors can be expanded and adapted to different experimental setups by biologists at JFRC and around the world.
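The idea of learning a detector from a biologist's labels can be sketched with a deliberately simple stand-in. The Python example below fits a decision stump (a single threshold on a single per-frame feature) that best reproduces the manual labels. This is not the learning algorithm our software uses; `fit_stump`, `predict`, and the feature layout are hypothetical names chosen for illustration.

```python
def fit_stump(features, labels):
    """Fit a one-feature threshold detector to labeled frames.

    features: list of per-frame feature vectors (lists of numbers).
    labels: list of bool, True where the biologist marked the behavior.
    Returns (feature_index, threshold, polarity) with the fewest
    disagreements with the labels on the training frames.
    """
    best = None
    for j in range(len(features[0])):
        values = sorted(set(f[j] for f in features))
        # Candidate thresholds: midpoints between consecutive distinct values.
        candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for thr in candidates:
            for polarity in (1, -1):  # behavior above (+1) or below (-1) threshold
                errors = sum(
                    (polarity * (f[j] - thr) > 0) != y
                    for f, y in zip(features, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, thr, polarity)
    _, j, thr, polarity = best
    return j, thr, polarity

def predict(stump, feature_vector):
    """Apply a fitted stump to one frame's feature vector."""
    j, thr, polarity = stump
    return polarity * (feature_vector[j] - thr) > 0
```

For example, with features [speed, distance to nearest fly] and frames labeled as walking or not walking, the stump would most likely pick a speed threshold. A practical detector would combine many such features and should be evaluated on frames that were held out from training.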
The algorithms described above automate tasks that human experts would be able to do easily, given unlimited time and patience. However, the data sets computed by the tracking and behavior detection software contain more information than can easily be visualized by a human expert, and a human may not be able to easily extract all the scientifically "interesting" behavioral effects. The third focus of my lab is using data-mining techniques to find such structure in the automatically computed behavioral data sets. One potential quantitative definition of "interesting" involves looking for statistical differences between representations of the trajectories of animals whose behaviors are hypothesized to be different: for example, different transgenic lines, different genotypes, or different individuals, or the same animals at different times of day, in different locations in the arena, after different durations of food deprivation, or in response to different visual or chemical stimuli. The analysis can be performed to find behavioral statistics differentiating a pair of discrete types, e.g., wild-type flies versus a genetic mutant, or varying with a continuous type, e.g., the time of day or distance to a landmark in the environment.
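As one simple instance of testing for a statistical difference between two discrete types, the Python sketch below runs a permutation test on a single behavioral statistic, such as per-fly mean walking speed in a wild-type and a mutant group. The choice of a permutation test, the function name, and its parameters are assumptions for illustration, not a description of the data-mining methods we are developing.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    group_a, group_b: one behavioral statistic per animal (assumed input).
    Returns (observed absolute difference in means, estimated p-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign animals to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

When many behavioral statistics are screened this way, for instance across thousands of transgenic lines, the resulting p-values would need a correction for multiple comparisons before any single difference is treated as meaningful.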
Using both biologists' knowledge of behaviors to train detectors and data-mining techniques, we endeavor to discover a concise yet rich quantitative language for describing the behavior of genetic model organisms. Linking this language with manipulations of the organisms' environment, neural circuitry, and genetics will give us insight into the brain and biology of the organism.
As of March 17, 2010