Computational Biology, Molecular Biology
University of California, Santa Cruz
Dr. Haussler is also a professor of biomolecular engineering and director of the Center for Biomolecular Science and Engineering and of the Training Program in Stem Cell Biology at the University of California, Santa Cruz; scientific codirector of the California Institute for Quantitative Biosciences (QB3); and a consulting professor at both Stanford Medical School and the UC San Francisco Biopharmaceutical Sciences Department.
David Haussler is developing new statistical and algorithmic methods to explore the molecular evolution of the human and other vertebrate genomes, integrating cross-species comparative and high-throughput genomics data to study gene structure, function, and regulation. He applies genome-scale evolutionary analysis to the study of cancer and other diseases.
The human genome, with its 3 billion chemical bases that together spell out the recipe for life, is still largely unfamiliar territory. While most scientists have focused their attention on the stretches of DNA responsible for building proteins, which make up only a minute fraction of the human genome, David Haussler studies the vast DNA regions that do not code for proteins, often called "junk" DNA. His research strongly suggests that this "junk" has biological relevance, including the regulation of genes that build proteins. Errors in this type of DNA also have been linked to genetic diseases. An expert in using computers to understand the enormous amount of data in the genome—a field called bioinformatics—Haussler wants to learn how noncoding regions control protein-coding genes. Additionally, his studies comparing the genomes of animal species have revealed new insights into the evolutionary forces at work over millions of years.
Haussler studied art and then psychotherapy before settling on mathematics as a college major. During summers, he worked for his brother, Mark, a biochemist at the University of Arizona studying vitamin D metabolism. David helped carry out experiments on chicks deprived of vitamin D and then analyzed the results. The experience helped him realize he was far more interested in mathematics, his passion since childhood, than laboratory work. Haussler became interested in the mathematical analysis of DNA while pursuing a doctorate at the University of Colorado. At this time, in the early 1980s, the field of bioinformatics was virtually unknown, but Haussler wanted to be a part of it.
Early in his career, Haussler introduced the use of powerful statistical models to find genes that code for proteins in long stretches of DNA sequences. Based on this work, his group was recruited to the Human Genome Project to develop computer algorithms to locate the protein-coding genes in the human genome. A graduate student in the laboratory, James Kent, developed a computer program in just a few weeks that enabled the team to assemble into a coherent sequence the 400,000 fragments of DNA that had been decoded by sequencing laboratories around the world. On July 7, 2000, Haussler and his colleagues posted the working draft of the human sequence on the Internet. "That moment, on July 7, 2000, when the flood of As, Cs, Ts, and Gs of the human genome sequence came across my computer screen on its way to the Internet to reach thousands of scientists all over the world, was the most exciting moment of my career," he recalls.
Haussler and his team then developed the UCSC Genome Browser, an interactive Web-based microscope that allows scientists to view annotated genome sequences of humans and other organisms at any level, from a complete chromosome to a single nucleotide. In recent years, they created algorithms and software to analyze and display the genetic differences among organisms during the course of evolution. Haussler's group reconstructed with an estimated 98 percent accuracy part of the genome of the common ancestor of most placental mammals—a small shrew-like creature that lived about 100 million years ago. They also have begun assembling a database that can trace the changes in any given nucleotide from the common placental ancestor to humans or other living mammals. They are using this tool to study various parts of the human genome, including the noncoding regions of DNA, to understand how they function to regulate the genes involved in building proteins.
Despite current uncertainties about the genome's full meaning, Haussler is committed to the exploration. "We didn't understand the human genome sequence the day the draft was posted on the Internet, and we still don't understand it today," he says. "What drives me to keep exploring the genome is the same thing that drives most scientists: curiosity and the excitement of the unknown. I want to know what is actually in our genome, how it works, and how it evolved to be the way it is."