Electron microscopy (EM) is still the best imaging method for producing data from which one can unambiguously determine the complete synaptic connectivity of neuronal assemblies. Although recent innovations have made imaging more efficient, the subsequent step of image analysis remains a persistent bottleneck. Many years of human effort solved the image analysis problem for Caenorhabditis elegans, but the manual approach is too costly for routine and rapid production of wiring diagrams, especially for circuits with tens or hundreds of thousands of neurons. Automation of image analysis is therefore a critical problem for neurobiology, and it forms the core of my research program.
Machine Learning for Image Analysis
To infer a connectivity matrix from images of dense neuropil, two image analysis tasks must be performed: (1) identifying synapses, and (2) tracing neurites to their parent cell bodies. The first task can be posed as a visual object recognition problem, in which the goal is to determine whether an image contains an example of an object category. The second task can be posed as an image segmentation problem, in which the goal is to group pixels into distinct partitions that correspond to physical objects. To produce useful wiring diagrams, image segmentation must be performed with extraordinary accuracy. A mistake in synapse identification will cause at most a single error in a connectivity matrix, but a single error in neurite tracing could misassign thousands of synapses.
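The asymmetry between the two error types can be made concrete with a toy calculation (the neuron names and synapse counts below are invented for illustration):

```python
# Toy illustration: a single tracing error corrupts many more connectivity
# entries than a single synapse-detection error. All numbers are invented.

# Ground truth: each synapse is a (presynaptic, postsynaptic) neuron pair.
synapses = [("A", "C")] * 40 + [("B", "C")] * 60  # 100 synapses

def connectivity(pairs):
    """Count synapses per (pre, post) neuron pair."""
    matrix = {}
    for pre, post in pairs:
        matrix[(pre, post)] = matrix.get((pre, post), 0) + 1
    return matrix

# One synapse-detection error (a false positive) changes one entry by one.
detect_error = connectivity(synapses + [("A", "C")])

# One tracing error (neurite B merged into A) credits every synapse made by
# B to A instead.
trace_error = connectivity([("A" if pre == "B" else pre, post)
                            for pre, post in synapses])

print(detect_error)  # {('A', 'C'): 41, ('B', 'C'): 60}
print(trace_error)   # {('A', 'C'): 100} -- 60 synapses misassigned at once
```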
In previous work, we introduced new image segmentation algorithms based on a machine learning approach to image analysis. By developing novel classifiers and learning algorithms specialized for the problem of segmentation, our approach has yielded new levels of performance in automated EM reconstruction. However, these methods have largely focused on a relatively local analysis of the image volume; further improvements in accuracy are likely to require automated methods that can reason about much larger amounts of image context. We are pursuing such improvements through research in several directions:
Segmentation is an example of a structured prediction problem. Probabilistic graphical models and non-probabilistic energy-based models are powerful representations for structured prediction but generally define very difficult learning and inference problems. For EM reconstruction, we will eventually encounter models with millions of objects and a practically infinite combinatorial space of configurations of those objects. What learning and inference algorithms are effective for such problems? There is a large and growing literature on efficient (approximate) methods for inference and learning in these models, but that research has focused on other domains and problems. EM reconstruction may require a fresh approach to these issues.
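To fix ideas, here is a minimal invented sketch of an energy-based formulation, not our actual method: supervoxels form a graph whose edges carry affinities, the energy charges a cost for cutting a high-affinity edge and for merging across a low-affinity one, and greedy agglomeration serves as one crude approximate inference strategy.

```python
# Minimal energy-based segmentation sketch on a supervoxel graph (invented
# example). Affinities a_ij in [0, 1]: cutting an edge costs a_ij, merging
# across it costs (1 - a_ij).

edges = {("s1", "s2"): 0.9, ("s2", "s3"): 0.8,
         ("s3", "s4"): 0.2, ("s4", "s5"): 0.95}

def energy(labels):
    e = 0.0
    for (u, v), a in edges.items():
        e += (1.0 - a) if labels[u] == labels[v] else a
    return e

# Start with every supervoxel in its own segment.
labels = {n: n for edge in edges for n in edge}

# Greedy approximate inference: consider edges in order of decreasing
# affinity and accept a merge whenever it lowers the energy.
for (u, v), a in sorted(edges.items(), key=lambda kv: -kv[1]):
    merged = {n: (labels[u] if lab == labels[v] else lab)
              for n, lab in labels.items()}
    if energy(merged) < energy(labels):
        labels = merged

print(labels)  # s1, s2, s3 end up in one segment; s4, s5 in another
```

Even this toy shows why the problem is hard: greedy moves explore a vanishing fraction of the combinatorial label space, and real volumes have millions of supervoxels rather than five.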
Neurites are highly specialized three-dimensional structures, so there should exist effective low-dimensional descriptors of their shapes. What is the best approach for devising such descriptors (e.g., unsupervised learning, human specification, or end-to-end discriminative learning), and how can they best be used to improve EM reconstruction accuracy?
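As one hedged example of the unsupervised route (synthetic data, not a claim about our pipeline), the eigenvalue spectrum of a point cloud's covariance is a classic low-dimensional shape descriptor: a tube-like neurite fragment yields one dominant eigenvalue, while a blob-like soma fragment spreads variance across all three.

```python
# Sketch: a 3-number shape descriptor for a neurite fragment via PCA.
# The point cloud here is synthetic (a jittered line standing in for a
# tube-like neurite).
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "neurite": points along one axis with small transverse jitter.
t = rng.uniform(0, 50, size=(500, 1))
points = np.hstack([t, 0.3 * rng.normal(size=(500, 2))])

def shape_descriptor(pts):
    """Covariance eigenvalues, largest first, normalized to sum to 1."""
    cov = np.cov(pts, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return eig / eig.sum()

desc = shape_descriptor(points)
print(desc)  # first component dominates for a tube-like structure
```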
Semiautomated Reconstruction by Integrating Machine Learning, Computer Vision, and Human–Computer Interaction
In addition to advancing the basic computer vision and machine learning research involved in automated reconstruction, we are solving the science and engineering challenges of capitalizing on existing automated techniques to reconstruct wiring diagrams. In particular, using state-of-the-art but imperfect automated techniques to produce accurate descriptions of connectivity requires sophisticated software that enables humans to proofread computer reconstructions by identifying and correcting errors. Our approach integrates machine learning algorithms directly into the proofreading process, which will enable progress on several important problems:
The true metric of interest in a reconstruction project is the amount of human effort required to proofread and correct errors in an automated segmentation (the nuisance metric). However, existing machine learning approaches to image segmentation are designed to optimize some other, more convenient metric (such as pixel error). Can a machine learning strategy be devised that directly optimizes the nuisance metric?
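The gap between the two metrics can be shown with a small invented example: representing segmentations as one-dimensional boundary maps, a slightly misplaced boundary costs more pixel error than a missed boundary, yet only the missed boundary (a merge) demands a proofreading edit.

```python
# Toy example (invented) of pixel error ranking segmentations opposite to
# proofreading effort. A segmentation is a 1-D boundary map: 1 marks a cut
# between neurites, and runs of 0s are the reconstructed segments.

def pixel_error(pred, gt):
    return sum(p != g for p, g in zip(pred, gt)) / len(gt)

def n_segments(boundary):
    """Count runs of 0s separated by boundary pixels."""
    segs, inside = 0, False
    for b in boundary:
        if b == 0 and not inside:
            segs, inside = segs + 1, True
        elif b == 1:
            inside = False
    return segs

truth   = [0]*10 + [1] + [0]*9   # two neurites separated at position 10
shifted = [0]*13 + [1] + [0]*6   # boundary misplaced: 2 wrong pixels
merged  = [0]*20                 # boundary missed: 1 wrong pixel, neurites fused

# Pixel error prefers `merged`...
print(pixel_error(merged, truth), pixel_error(shifted, truth))  # 0.05 0.1
# ...but `merged` needs a split edit while `shifted` needs no edit at all.
print(n_segments(truth), n_segments(shifted), n_segments(merged))  # 2 2 1
```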
Machine learning requires ground truth, which in EM reconstruction is typically provided by humans and is thus expensive to acquire. How can human input be selectively acquired so that it is most informative for a particular learning algorithm? The general form of this problem is well studied as active learning but has yet to be rigorously pursued in the context of image segmentation. A successful solution will increase the performance of automated reconstruction by maximizing the information content of each human label.
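The flavor of active learning is easy to convey with a deliberately simplified sketch (invented, and far simpler than the segmentation setting): when labels are determined by an unknown 1-D threshold, repeatedly querying the annotator at the learner's most uncertain point reduces to binary search, so a handful of queries replaces exhaustive labeling.

```python
# Pool-based uncertainty sampling on a toy 1-D task (invented example).
# Samples with value above an unknown threshold are positive; the learner
# only pays for labels it explicitly requests.

def oracle(x):           # stands in for the human annotator;
    return x >= 0.62     # the threshold is unknown to the learner

pool = sorted(i / 100 for i in range(100))
labeled = {pool[0]: oracle(pool[0]), pool[-1]: oracle(pool[-1])}

queries = 0
while True:
    lo = max(x for x, y in labeled.items() if not y)  # largest known negative
    hi = min(x for x, y in labeled.items() if y)      # smallest known positive
    candidates = [x for x in pool if lo < x < hi]
    if not candidates:
        break
    # Most uncertain unlabeled sample: the one nearest the current
    # decision boundary estimate.
    query = min(candidates, key=lambda x: abs(x - (lo + hi) / 2))
    labeled[query] = oracle(query)
    queries += 1

estimate = (lo + hi) / 2
print(queries, estimate)  # roughly log2(100) queries instead of 100 labels
```

The same principle, querying where the current model is least certain, is what a segmentation-specific active learner would need to apply over image regions rather than scalar thresholds.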
Reconstructing Sensory Circuits
In collaboration with labs at Janelia and elsewhere, we are applying our reconstruction algorithms and tools to answer fundamental biological questions about how local circuitry in sensory systems processes information. We are excited about combining measurements of neuronal activity and connectivity to understand the computational principles that govern the nervous system.
As of March 10, 2011