Our laboratory is interested in developing biological technologies, especially for analyzing protein function. Even when the function of a protein is known, assessing the consequences of amino acid changes for that function is a formidable challenge. Yet physicians and patients urgently need this information to interpret the rapidly growing data on human genetic variation.
Often we use the unicellular eukaryote Saccharomyces cerevisiae (baker's yeast) as the host organism for carrying out protein assays. Yeast, the first eukaryote to have its genome sequenced, has a relatively small number of genes and is highly tractable for experimentation. In addition, yeast is a convenient host for expressing proteins from other organisms, a property we take advantage of in analyzing heterologous proteins, including human ones.
Over the past few years, we have focused on a method, termed deep mutational scanning, that couples protein display technology to high-throughput DNA sequencing. Protein display methods physically link proteins to the DNA sequences that encode them. When protein variants in such a system are put under selection for function, those with beneficial features become enriched in the population and those with deleterious features become depleted. These changes in frequency can be determined by sequencing the encoding DNAs. By comparing the frequency of a given variant in the input population with its frequency in the selected population, we obtain a ratio that serves as an estimate of the variant's function.
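The frequency comparison described above can be sketched in a few lines of Python. This is only an illustration, not the laboratory's analysis pipeline: the function name, the variant labels, the read counts, the use of a log2 scale, and the pseudocount (added so that variants absent from one population do not cause division by zero) are all assumptions made for the example.

```python
import math


def enrichment_scores(input_counts, selected_counts, pseudocount=0.5):
    """Return log2(selected frequency / input frequency) for each variant.

    input_counts and selected_counts map a variant label to its read count.
    The pseudocount is an illustrative assumption, not part of the method
    as described in the text.
    """
    variants = set(input_counts) | set(selected_counts)
    # Totals include the pseudocount added to every variant.
    total_in = sum(input_counts.values()) + pseudocount * len(variants)
    total_sel = sum(selected_counts.values()) + pseudocount * len(variants)

    scores = {}
    for variant in variants:
        freq_in = (input_counts.get(variant, 0) + pseudocount) / total_in
        freq_sel = (selected_counts.get(variant, 0) + pseudocount) / total_sel
        scores[variant] = math.log2(freq_sel / freq_in)
    return scores


# Hypothetical read counts for a wild-type sequence and two variants.
input_counts = {"WT": 1000, "A5G": 1000, "L9P": 1000}
selected_counts = {"WT": 1000, "A5G": 2000, "L9P": 10}

scores = enrichment_scores(input_counts, selected_counts)
for variant in sorted(scores, key=scores.get, reverse=True):
    print(variant, round(scores[variant], 2))
```

In this made-up example the variant that doubles in read count after selection receives the highest score and the variant that nearly disappears receives the lowest. Scores are often expressed relative to the wild-type score, which would be one further subtraction.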
Deep mutational scanning provides a quantitative measure of the function of hundreds of thousands of variants of a protein in a single experiment. The key ingredients of this approach (protein display, low-intensity selection, and high-throughput sequencing) are simple and widely available. Data from this approach can be used to construct protein sequence–function maps and to reveal fundamental protein properties. We have applied this approach to a WW domain, viral and yeast RNA-binding domains, E3 ubiquitin ligases, a degradation signal known as a degron, and a G-protein-coupled receptor.
Organisms use a relatively small repertoire of RNA-binding domains, with specificity achieved by the spatial organization of these domains within a protein and by the small sequence variations among structurally related domains. In one application of deep mutational scanning, we studied the effect of mutations on the function of a common RNA-binding domain called the RNA recognition motif (RRM), which is present in the poly(A)-binding protein of yeast. Data on the ability of variants of this protein to function in yeast allowed us to identify critical residues. We combined the mutational data with the set of natural sequences corresponding to evolutionarily divergent variants of this domain to identify a site of protein-protein interaction at single amino acid resolution.
In another application, we scored the activity of variants of the human tumor suppressor protein BRCA1 for their ability to bind to a partner protein, BARD1, and to act as an E3 ubiquitin ligase. We used these scores to generate a model for predicting tumor suppressor activity of BRCA1 variants, and found that this model performs better than current computational approaches. Thus, massively parallel functional assays can facilitate the prospective interpretation of variants observed in clinical sequencing.
Although synonymous codons encode identical amino acids, variation in these codons can lead to subtle alterations in protein production. In collaboration with the laboratory of Elizabeth Grayhack (University of Rochester), we analyzed an insert of three random amino acids in green fluorescent protein to identify a set of adjacent codon pairs in yeast that mediate profound reductions in translation efficiency. We showed that the inhibitory codon pairs affect translation elongation and efficiency in a manner distinct from the effects of the individual codons. Our results suggest that the mechanisms of codon-mediated regulation depend extensively on wobble decoding and synergistic interactions between adjacent sites in the ribosome.
Transfer RNAs (tRNAs), critical for translating genetic information into cellular function, must adopt a specific three-dimensional structure to interact with the ribosome, with elongation factors, and with the appropriate aminoacyl tRNA synthetases. To define the set of functional tRNAs and the extent to which a particular tRNA can tolerate mutation, we collaborated with the laboratories of Eric Phizicky and David Mathews (University of Rochester) to adapt deep mutational scanning to the study of tRNA function. The large set of tRNA mutants that we analyzed is also allowing us to model the determinants by which a tRNA becomes temperature-sensitive for function.
With the laboratory of Maitreya Dunham (University of Washington), we analyzed the activity of a yeast promoter using large-scale mutagenesis and selection of yeast variants in a chemostat. We identified promoter mutations that could increase fitness, in some cases by creating potential binding sites for transcription factors, but we showed that even combinations of beneficial point mutations could not induce a fitness increase comparable to that due to gene amplification.
Uncovering the genetic underpinnings of complex traits has proven difficult. In collaboration with the laboratory of Christine Queitsch (University of Washington), we are using the mating pathway of S. cerevisiae to develop a model for testing hypotheses about complex trait genetics. We use deep mutational scanning to identify many small-effect mutations in individual genes. These can then be combined to show how additive genetic variation and epistasis, along with genetic modifiers like the chaperone Hsp90, contribute to a complex trait.
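The distinction between additive effects and epistasis can be made concrete with a small worked example. The sketch below assumes that functional scores are on a log scale with wild type set to zero, so that independent mutations are expected to add; the function name and the numerical scores are hypothetical and are not data from this work.

```python
def epistasis(score_a, score_b, score_ab):
    """Deviation of a double mutant from the additive expectation.

    All scores are assumed to be log-scale and relative to wild type (0).
    A result of zero means the two mutations combine additively; a
    nonzero result indicates epistasis.
    """
    expected = score_a + score_b
    return score_ab - expected


# Hypothetical scores for two small-effect mutations and their combination.
print(epistasis(-0.5, -0.25, -0.75))  # additive: prints 0.0
print(epistasis(-0.5, -0.25, -2.0))   # worse than expected: prints -1.25
```

A genetic modifier could be examined in the same framework by computing these deviations separately under each modifier condition and comparing them.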
Biosensors bind a small molecule and change a measurable property to signal that this binding has occurred. In collaboration with the laboratories of George Church (Harvard) and David Baker (HHMI, University of Washington), we developed a method to make biosensors based on the principle of ligand-dependent protein stability. We fuse a ligand-binding domain to a fluorescent protein or a transcription factor, and then destabilize the ligand-binding domain by mutation. We identify mutant versions that are degraded inside cells unless the target small molecule is bound. These biosensors can be used to optimize the production of a desired compound or to regulate cellular behavior.
In other work, we have begun efforts in genome engineering in yeast. We generated a set of small DNA elements that lead to variable levels of transcription, and showed that these could regulate a set of genes for lycopene production and be used to identify colonies with a high level of this compound. We have initiated projects focused on using bacteriophages to generate large amounts of sequence diversity, with the goal of evolving novel protein activities. Bacteriophages also provide sufficient natural diversity to analyze the sequence constraints on the function of their proteins. In collaboration with the laboratory of Christine Queitsch (University of Washington), we are carrying out genome-wide analyses in plants, including the identification of enhancers and the construction of a library of mutations that target the complete set of Arabidopsis thaliana genes.
Some of this research was supported by grants from the National Institutes of Health.
As of February 25, 2016