Right Before Your Eyes

Coupling protein sequence to function, thousands of variants at a time.

Even with a load of new data, presenting information to lab-mates at a group meeting can be an uncomfortable experience. Worse still is reporting a total lack of progress. But for postdoctoral fellow Doug Fowler, publicly detailing the latest dead ends in his project turned out to be pivotal.

His ambitious project had been dogged by technical problems since its inception a year and a half earlier. "That experience of laying out all these things that weren't working made me realize there was another way," says Fowler, who works in the lab of HHMI investigator Stanley Fields at the University of Washington (UW) in Seattle.

Fowler and Fields wanted to comprehensively survey—on a larger scale than had ever been done—how genetic changes influence protein function. Typical approaches produce one mutant version of a protein at a time—for example, by converting some amino acids in a sequence to alanine. Instead, they wanted to track thousands of versions of a protein that systematically varied in its sequence. The result would be a high-resolution map of how each component of a protein contributes to its function, bringing insight to basic biology and drug design alike.

"As you think of a complex, three-dimensional surface of a protein, every position there has its own unique profile," says Fields, whose research into protein function has spawned new technologies. "But you could never predict it with what we know."

Fowler and Fields can now obtain these unique profiles experimentally, but not with their initial strategy. They had been trying to develop a protein microarray on which hundreds of thousands of proteins would be generated from corresponding spots of DNA. As he mulled over his discouraging lab group meeting, it dawned on Fowler that his main problem with the microarray—linking DNA sequence to protein function—had been solved more than 20 years before with a procedure called phage display.

With phage display, DNA encoding a protein of interest is introduced into phage viruses, which display the protein on their outer shells. With each virus containing a different protein, a large collection of variant proteins can be studied. Screening the viruses for the ability to bind to other molecules retains the phage that bind and washes away those that don't. Repeating this step enriches the phage population for those carrying the few proteins that are the best binders, and DNA recovered from these phage "winners" identifies them.

But Fowler and Fields were interested in some of the losers, too, because that would allow them to track the fate of far more variants. So they tried phage display with a new twist: doing it under conditions that increased the chances of binding, so that more phage would make it through each round of selection. They then used high-throughput sequencing to identify and quantify the DNA of the surviving phage, with the DNA counts acting as a proxy for protein function: those sequences increasing in abundance encoded better binders, and those that were depleted encoded faulty ones.
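The counting idea can be sketched in a few lines of Python. This is only an illustration of using read counts as a proxy for function, not the researchers' actual analysis pipeline: the function name `enrichment_scores`, the pseudocount, and the toy reads are all assumptions made for the example.

```python
import math
from collections import Counter

def enrichment_scores(input_reads, selected_reads, pseudocount=0.5):
    """Return a log2 enrichment score for every variant seen in either library.

    A positive score means the variant's share of reads grew during
    selection; a negative score means it shrank. The pseudocount
    (an assumption of this sketch) avoids division by zero for
    variants missing from one library.
    """
    before = Counter(input_reads)
    after = Counter(selected_reads)
    variants = set(before) | set(after)
    total_before = sum(before.values()) + pseudocount * len(variants)
    total_after = sum(after.values()) + pseudocount * len(variants)

    scores = {}
    for variant in variants:
        freq_before = (before[variant] + pseudocount) / total_before
        freq_after = (after[variant] + pseudocount) / total_after
        scores[variant] = math.log2(freq_after / freq_before)
    return scores

# Toy data (made up): three variants start at equal abundance.
input_reads = ["AAA"] * 10 + ["AAT"] * 10 + ["ATA"] * 10
selected_reads = ["AAA"] * 20 + ["AAT"] * 9 + ["ATA"] * 1

scores = enrichment_scores(input_reads, selected_reads)
for variant, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{variant}: {score:+.2f}")
```

In this toy run the variant that gains reads scores above zero and the one that nearly vanishes scores well below it, mirroring the "winners" and "losers" described above.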

This innovative combination of sequencing and permissive phage display produced a detailed map of how specific amino acids influence the binding ability of a small and well-characterized region of a protein called the human Yes-associated protein 65 WW domain. As reported in the September 2010 issue of Nature Methods, the researchers' approach winnowed more than 600,000 variants of this protein down to 94,606 survivors, a group large enough to reveal how well each amino acid position in the protein tolerated mutation. The process pinpointed the positions most critical for protein function and showed that both the exact position and the precise amino acid in that position mattered.
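One way to picture such a map is to average the scores of all single-mutant variants at each position. The sketch below assumes per-variant scores are already available (for example, log enrichment values); the function name `position_tolerance`, the three-residue "wild type," and the scores are invented for illustration and are not data from the study.

```python
from collections import defaultdict

def position_tolerance(wild_type, variant_scores):
    """Average the scores of single-substitution variants at each position.

    Variants with zero or several substitutions, or with a different
    length than the wild type, are skipped in this simplified sketch.
    """
    per_position = defaultdict(list)
    for variant, score in variant_scores.items():
        if len(variant) != len(wild_type):
            continue
        changed = [i for i, (w, v) in enumerate(zip(wild_type, variant)) if w != v]
        if len(changed) == 1:
            per_position[changed[0]].append(score)
    return {pos: sum(vals) / len(vals) for pos, vals in sorted(per_position.items())}

# Toy scores (made up) for a three-residue "protein".
toy_scores = {
    "GWE": 0.0,   # wild type, ignored
    "AWE": -0.2,  # position 0 mutated
    "GAE": -3.0,  # position 1 mutated
    "GFE": -2.0,  # position 1 mutated
    "GWD": 0.1,   # position 2 mutated
    "AAE": -4.0,  # double mutant, ignored
}

tolerance = position_tolerance("GWE", toy_scores)
least_tolerant = min(tolerance, key=tolerance.get)
print(tolerance, "least tolerant position:", least_tolerant)
```

Averaging hides the point that the specific replacement amino acid matters as well; a fuller map would keep one score per position and per substituted residue rather than one per position.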

"The map you get is a much denser kind of landscape than you would ever get by standard mutagenesis, because with standard mutagenesis you're not able to look at so many variants," Fields says.

To check how their proxy measure of protein function related to protein structure, Fowler and Fields teamed up with David Baker, a UW colleague and HHMI investigator. Using a computational tool developed by Baker called Rosetta, they simulated the stability of their surviving protein variants and found that the relative abundances correlated with the calculated stabilities, with the depleted variants likely to be less stable than the original wild-type protein.
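A comparison of this kind can be illustrated with a rank correlation between two lists of numbers, one holding each variant's abundance score and one holding its computed stability. The sketch below does not call Rosetta and does not reproduce the study's statistics; the helper names, the choice of Spearman correlation, and the values are assumptions made for the example, with higher stability values taken to mean a more stable variant.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length lists with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy values (made up): abundance scores and hypothetical stability estimates.
abundance = [1.2, 0.4, -0.3, -1.8, -2.5]
stability = [0.9, 0.5, 0.6, -1.0, -2.2]

rho = spearman(abundance, stability)
print(f"rank correlation: {rho:.2f}")
```

A value near 1 would indicate that variants ranked as more stable also tend to be the ones that gained ground during selection.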

Fields says that because the method is fairly generic, it can be tailored to many biological questions. For example, it could catalog the effects of mutations in disease genes, distinguishing deleterious variants from harmless ones. In drug development, the method could assay the range of protein variants vulnerable to a drug—something important to know for diseases such as cancer, in which a runaway mutation can allow proteins to develop drug resistance.

It can even provide a reality check for proteins designed by computers and humans, says Baker. "A brand new protein is born out of nowhere, you know nothing about it. This technique offers the potential to immediately know what the role of every amino acid is."

And the time it took to get from a frustrating group meeting to publication of a new method? One year. "It was one of these magical experiences in science that doesn't happen very often," Fowler says.
