When a cell needs a certain protein, the gene that encodes it is turned into RNA and then fed into a molecular machine called a ribosome. This machine translates the genetic code into a chain of amino acids, and as the chain emerges from it, something magical happens: it spontaneously folds up into a complex, ball-of-spaghetti-like shape—a functional protein.
The protein folds because the atoms in its amino acids attract or repel one another and the water molecules around them. In a smallish protein (about 300 amino acids long), these trillions of interactions allow an astronomically large number of possible shapes. But the protein always folds to the same one. Since David Baker came to the University of Washington nearly 15 years ago, he's been trying to understand how that process works.
His interest in the problem began long ago, when he was a senior at Harvard University. "I took a biochemistry course and the textbook talked briefly about the protein-folding problem; it said that amino acid sequences encoded three-dimensional structures," remembers Baker. "We had to write a term paper on something to do with biochemistry, and I asked the TA and the professors if I could look at protein folding. They discouraged me, saying 'no one really knows how it works.'"
Baker left the problem alone, working on different questions as a graduate student in the lab of HHMI investigator Randy Schekman. But the question nagged him. "When I was thinking about what to do for a postdoctoral research project, I thought 'well, maybe I'll look at protein folding.'"
A protein's final shape is mostly determined by physics. In the same way that a ball rolled along a bumpy landscape will eventually end up at the lowest—and hence most stable—point, the atomic interactions in chains of amino acids lead to some shapes that are more stable than others. The more stable shapes are lower in energy, and "almost always, the final structure of a protein is the lowest energy state for its unique amino acid sequence," says Baker.
But while figuring out a ball's most stable location is easy, predicting the single correct structure of a protein based solely on its amino acid sequence is not; it requires accurately calculating and balancing the strength of all the countless atomic interactions.
The calculations are far too time-consuming to do by hand, so Baker uses computers. Over a decade ago, he and his group created the first version of a structure prediction program called Rosetta. Rosetta constantly evolves as Baker's group compares its predictions to real-life structures obtained using techniques like x-ray crystallography.
Baker's team originally ran Rosetta on computers at the University of Washington. But several years ago, he realized he could never accrue enough computing capacity to answer the questions that intrigued him. New technology, though, meant he didn't have to. Baker adapted software developed for SETI@home, a project of the Search for Extraterrestrial Intelligence (SETI). SETI@home harnesses unused time on ordinary people's home computers to analyze data from radio telescopes for patterns that might reflect sentient alien life. With help from the creators of SETI@home, Baker and his team created Rosetta@home. By early 2008, his extended protein-folding network had nearly 200,000 members.
The continued efforts to improve Rosetta and the computing power of Rosetta@home have led to advances in protein-structure prediction. Rosetta structure prediction has been shown not only to be capable of predicting the structures of some small proteins very accurately from their amino acid sequences, but also to dramatically speed up traditional experimental methods of determining protein structures.
Baker attributes his success to his insistence that hypotheses always be validated and improved by results from real-world experiments. He says this insistence stems from his experience at Harvard, where he first studied philosophy but became frustrated by arguments that seemed primarily about language and lacked substantive content.
Although computing the structures of naturally occurring proteins and RNAs and their interactions remains a major focus of Baker's group, he now works just as hard on creating proteins that do not yet exist. "We've been doing the structure prediction work, trying to get from amino acid sequence to the structure of naturally occurring proteins," he says. "From there, it's not such a leap to start thinking about making new amino acid sequences that encode new structures."
The first result of this effort was Top7, an artificial protein that Baker's group created in 2003. Engineering Top7 won Baker the 2005 AAAS Newcomb Cleveland Prize, the organization's most prestigious award for research reported in its journal, Science, where Baker published the results.
After showing with Top7 that new protein structures could be designed with very high accuracy, Baker moved on to the next challenge: creating proteins that do new and useful things that no naturally occurring protein can do. Baker's group is working now to create new proteins to help repair mutations that cause disease, new protein inhibitors to block pathogens, and vaccines for HIV.
In the past year, Baker's group made a breakthrough in creating new enzymes. Enzymes are the class of proteins that catalyze, or speed up, all the chemical reactions that happen inside living things. Biological organisms have evolved enzymes for speeding up the chemical reactions important for life. However, says Baker, "There is a very wide range of reactions of interest for which there are no naturally occurring enzymes—for example, breaking down toxic compounds in the environment, creating new fuel molecules, and creating new therapeutics."
In early 2008, Baker's group reported successfully creating two brand-new enzymes from scratch. They're not quite as fast as natural enzymes, but as Baker points out, nature has had much longer to get things just right. As his group's techniques improve and other scientists join the effort, Baker envisions a world of potential applications.
Though Rosetta@home began as a way to automate the process of structure prediction, recently Baker added an interactive component.
The Rosetta algorithm uses what computer scientists call the "Monte Carlo method" to run through possible structures: basically, it flails the amino acids around randomly until it finds something that works. The Rosetta@home interface has a built-in screensaver that displays the protein-folding action. Baker remembers that as his volunteers watched their proteins flail, they wrote in, saying "Hey! The computer is doing silly things! It would be great if we could help guide it."
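For readers curious what "flailing randomly until something works" looks like in practice, here is a minimal Python sketch of a generic Metropolis-style Monte Carlo search. It is not Rosetta's actual code: the energy function, the list of "angles" standing in for a protein chain, and every name and parameter value below are hypothetical choices made only for illustration.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical stand-in for a real energy function.

    Lowest (zero) when every angle equals 1.0 radian. A real program
    would instead score the interactions between atoms.
    """
    return sum((a - 1.0) ** 2 for a in angles)


def monte_carlo_fold(n_angles=10, steps=5000, temperature=0.1, seed=0):
    """Randomly perturb a toy 'chain' and keep the lowest-energy shape seen."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    energy = toy_energy(angles)
    start_energy = energy
    best_angles, best_energy = list(angles), energy

    for _ in range(steps):
        # "Flail": nudge one randomly chosen angle by a small random amount.
        i = rng.randrange(n_angles)
        old_value = angles[i]
        angles[i] = old_value + rng.gauss(0.0, 0.3)
        new_energy = toy_energy(angles)
        delta = new_energy - energy

        # Metropolis rule: always accept a downhill move; accept an uphill
        # move only occasionally, so the search can escape shallow dips.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
        else:
            angles[i] = old_value  # reject the move and restore the angle

    return start_energy, best_energy, best_angles


if __name__ == "__main__":
    start, best, _ = monte_carlo_fold()
    print(f"starting energy: {start:.3f}")
    print(f"lowest energy found: {best:.3f}")
```

Occasionally accepting an uphill move is what separates this from simply rolling downhill: on a bumpy landscape, it gives the search a chance to climb out of a shallow valley and find a deeper one.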
So Baker teamed up with a group of University of Washington computer scientists to create "Foldit," a multiplayer online protein-folding game, in the hope that a little friendly competition will reap scientific dividends. Eventually, the group plans to add a design component that will enable players to build new proteins of their own.
All a player needs is an internet hookup and a computer. According to Baker, "With Rosetta@home, we channeled the computer power of people all around the world into solving biomedical research problems. Now, with Foldit, we are attempting to channel their brain power as well. The dream is that people working together all around the world can make a significant contribution to science and global health. For example, I imagine a 12-year-old in Indonesia who can visualize proteins in his head and can build a cure for HIV."
Besides the game players and the 200,000 Rosetta@home participants who donate their computer time to folding proteins, Baker works with a large team, both in his lab and around the globe. For him, making such a massive investment in solving a problem is the only way to go. "My philosophy is, if you're going to tackle a problem, you should really go all out," he says. "Most interesting problems, you aren't going to be able to solve if you go at them half-heartedly."