Haussler's epiphany occurred when his team posted the Human Genome Project's first draft on the Internet. "I turned on my computer and there it was, humanity's first nearly complete look at its own recipe, an almost boundless waterfall of As, Cs, Gs, and Ts cascading across my screen, chosen and ordered by the individual life struggles of our ancestors through billions of years of evolution. It was an indescribable moment."
Only weeks before, however, there seemed little possibility that the vast amounts of the project's data could soon be stitched together for ready display. Haussler, a computer scientist who directs the Center for Biomolecular Science and Engineering at the University of California, Santa Cruz (UCSC), had worried about this "genome assembly" problem since first being asked to join the effort by Eric S. Lander, director of the Whitehead Institute/MIT Center for Genome Research, one of the major laboratories contributing to the genome-draft data. Lander wanted Haussler to apply his computer algorithms to find the genes in the genome sequence, but given the fragmented nature of the data, this task proved exceedingly difficult. "Bits and pieces of information were coming to us from 20 different sequencing centers around the world," Haussler says. "It was a jumble." Enter James Kent, then a UCSC graduate student, whom Haussler and growing legions of other admiring researchers consider a computer-programming genius. In a tour de force lasting only a matter of weeks, Kent crafted a large and sophisticated program, known as GigAssembler, which put together the first draft of the genome. "Jim saved the day and got the genome into the public domain, where it was available for free," Haussler says. And the public responded, downloading the draft in record numbers. Indeed, in the first 24 hours alone, Haussler's computers at UCSC churned out half a trillion bytes of data.
GOLDEN PATH
But that awesome day was only the beginning. "Jim and I had already started thinking about the next step," Haussler says. "We had the genome, but now we needed a way for scientists to interactively explore it." This was a challenge that Haussler's group was particularly well equipped to address, especially on a campus where mathematicians and computer scientists were fast converging around two fundamental tools of the information age: Web-browsing technology and bioinformatics. Harnessing these new electronic technologies would ultimately produce the UCSC genome browser, dubbed the Golden Path, as well as its cousins at the European Bioinformatics Institute and the National Center for Biotechnology Information. Over the past few years, researchers at laboratories around the world have become habitual users of the Haussler team's technology as they search for new clues about disease and potential therapies. Famed breast cancer researcher Mary-Claire King, at the University of Washington, is one of them. "The Golden Path is extraordinarily easy, and fun, to use," she says. "Our undergraduate students learn how to work with it in less than an hour and are hooked on genomics as a consequence." Moreover, it has already proven its worth as a premier investigative tool. "Using the UCSC browser over the past two years," says King, "we have identified a novel gene critical to breast cancer development. One of my postdocs identified a gene responsible for age-related hearing loss. And systemic lupus erythematosus, an extremely complex disease we have struggled with for 20 years, is now approachable because we can consider multiple genomic events simultaneously."
The popularity of the UCSC genome browser is no surprise to Haussler, simply because it helps scientists trek their way through areas that would otherwise be unnavigable. Regions of interest are not limited to the estimated 30,000 to 35,000 genes, which constitute only about 1.5 percent of the human genome. The other 98.5 or so percent, the non-protein-coding DNA that used to be dismissed as "junk," is known to possess gene-regulation and other cellular-control zones. Although areas devoted to these functions were once thought to constitute only a tiny fraction of the genome compared to protein-coding DNA, the opposite appears to be true. "We estimate that an additional 3 to 4 percent of the human genome, beyond the protein-coding DNA, may have an equally vital functional role," says Haussler.
His conclusion stems from analyses of the molecular evolution of the human genome that included comparisons to the genome sequences of the laboratory mouse and other mammals. By noting similarities, researchers ultimately determined the DNA regions that evolved from a common ancestor (called "orthologous" regions) and then measured the degree to which the DNA in those regions is conserved among the different mammalian species. Haussler refers to highly conserved areas of the human genome as "the living record of the difficult struggle of our ancestors to preserve the most critical parts of the message of life." Such regions constitute approximately 5 percent of the bases in the genome, according to his team. This result suggests that there are many non-protein-coding regions that perform vital functions, says Haussler, and it "defines a clear task for genomics: Find out what this selected 5 percent of the human genome does."
Mathematics, Haussler's first love, remains at the core of his genomic enterprise, which began 20 years ago at the University of Colorado. There, under the tutelage of his mentor Andrzej Ehrenfeucht, Haussler became interested in the mathematical analysis of DNA sequences. This was a time when there was considerable excitement about recombinant DNA methods and when the complete DNA sequences for selected bacteriophages were first being deciphered, along with parts of the Escherichia coli genome. The goal of finding sites that most strongly drive protein expression motivated Haussler and two fellow graduate students, Gary D. Stormo and Eugene W. Myers (whom he counts among the "founding pioneers" of bioinformatics), to ponder computational methods for finding patterns in the sequences of nucleotides then being discovered. Few could have imagined the blossoming of technology since then that nowadays makes those early efforts seem clumsy and unsophisticated.
Nor could many people have envisioned the increasing numbers of students now happily poking their fingers into the very essence of life itself. "After all," Haussler asks, "isn't life basically just a complicated, self-perpetuating DNA program?" Such questions, tossed into the air with rhetorical flourish, have held a career-long fascination for this father of two, whose Hawaiian shirt and easy demeanor belie an almost flammable curiosity. "I think anyone [who trains to be a mathematician] strives for maximum simplicity. I'm a Platonist, looking for proof of the necessary structure inside the beauty and form of life." Yet Haussler is the first to admit that simplicity is not always attainable. And he should know, having spent a decade trying to create a mathematical model of the ways by which synaptic junctions in a network of neurons are altered by experience. "I wanted to know how the brain adapts and learns at the most fundamental mathematical level." What he learned instead was that some complex natural systems do not break down cleanly into subparts. "When we subjected simulated neural networks to simple learning challenges, they naturally formed exceedingly complex, overlapping sets of interactions that successfully addressed these challenges," Haussler says. Their solutions were unanticipated and difficult to deconstruct. "I'm afraid that molecular evolution has addressed the challenge of creating self-perpetuating life in a similar manner, and that it will be difficult to fully unravel the solution it has found."
MATHEMATICIAN IN PARADISE
But on a more workaday level, Haussler believes that numerous discoveries of biomolecular processes with great medical importance are on the way. These discoveries will result from collaborations of math and computer-science experts with researchers such as microbiologists, cell biologists, and chemists, all of whose worlds once seemed parallel at best. "When I started in mathematics, I had no clue that I would be interacting with biology in this way," says Haussler, staring up at the towering redwoods outside his office window. "Who would have thought? I'm just a mathematician living in paradise."
Photo: Timothy Archibald
Reprinted from the HHMI Bulletin,
September 2003, pages 28-31.
©2003 Howard Hughes Medical Institute