Genomic Analysis of Protein Function

Summary: Stanley Fields analyzes the function of proteins from the yeast Saccharomyces cerevisiae on a genome-wide basis and uses this yeast to develop assays that can be applied to proteins from any organism.
The last decade has seen a profusion of whole-genome sequences, with total DNA sequence accumulation in GenBank now more than 100 billion bases. Genome sequences have led to the prediction of large complements of proteins, ranging from a few thousand in bacterial species to more than 20,000 for humans and other mammalian species. However, the determination of protein function remains a difficult task, given the tremendous range of biochemical activities that proteins display, the diverse modifications that a protein can undergo during its lifetime, the multiplicity of proteins potentially encoded by a single gene, and the use of proteins for more than a single function.
Our laboratory is interested in developing technologies, especially those for analyzing protein function. For many of our efforts, we use the unicellular eukaryote Saccharomyces cerevisiae (baker's yeast) as the host organism for carrying out protein assays. Yeast, the first eukaryote to be sequenced, has a relatively small number of genes, is highly tractable for experimentation, and has been used to derive numerous sets of reagents and high-throughput data. As a consequence, the set of yeast proteins is particularly advantageous for testing new technologies. In addition, yeast is a convenient host for expressing proteins from many other organisms, and we have taken advantage of this property to analyze several sets of such heterologous proteins.
We have also used S. cerevisiae for the analysis of proteins relevant to human disease. Past studies have focused on a human polyglutamine-containing protein implicated in neurodegenerative disease, the human Toll-like receptors that mediate innate immunity, the proteins of the malaria parasite Plasmodium falciparum, and yeast proteins that play a role in aging.
Protein Interactions
We developed an array format in which nearly all of the predicted open reading frames of S. cerevisiae were generated as fusions with the activation domain of the yeast Gal4 transcription factor. An automated procedure is used to screen the array via the two-hybrid method. More than 1,000 proteins have been screened so far, with the resultant identification of thousands of putative interactions.
In collaboration with Prolexys Pharmaceuticals Inc. (Salt Lake City), we carried out a high-throughput two-hybrid analysis of P. falciparum, the parasite responsible for the most virulent form of malaria. More than 30,000 random DNA-binding domain hybrids were assayed to yield a core dataset of ~2,800 interactions. With Min-Hao Kuo (Michigan State University), we modified the two-hybrid method to detect interactions that depend on a post-translational modification, identifying acetylation-dependent and phosphorylation-dependent interactions.
With Jing Huang (University of California, Los Angeles) and Invitrogen Inc. (Branford, Connecticut), we developed a pooling strategy that can dramatically decrease the effort required to generate large-scale datasets. This method identified protein interactions as well as drug-resistant yeast mutants. With Invitrogen, we used protein microarray technology to generate a protein interaction map for 12 of the 13 WW domains of S. cerevisiae. We observed nearly 600 interactions between these 12 domains and ~200 proteins.
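To see why pooling can reduce screening effort, consider a generic row-and-column pooling design. This is a minimal sketch and not necessarily the strategy used in the work described above; the grid layout and the function names `build_pools` and `decode` are assumptions made for illustration. Samples arranged in a grid are tested as one pool per row and one pool per column, and a positive sample is located at the intersection of the positive row and column pools.

```python
from itertools import product


def build_pools(n_rows, n_cols):
    """Map each pool name to the set of grid positions it contains.

    Hypothetical layout: one pool per row and one pool per column.
    """
    rows = {("row", r): {(r, c) for c in range(n_cols)} for r in range(n_rows)}
    cols = {("col", c): {(r, c) for r in range(n_rows)} for c in range(n_cols)}
    return {**rows, **cols}


def decode(positive_pools):
    """Return grid positions consistent with the positive pools.

    A position is a candidate if both its row pool and its column pool
    tested positive. With several true positives, this can also return
    false candidates that need individual retesting.
    """
    pos_rows = [i for kind, i in positive_pools if kind == "row"]
    pos_cols = [i for kind, i in positive_pools if kind == "col"]
    return sorted(product(pos_rows, pos_cols))


# Example: a 96-position grid (8 x 12) needs 20 pooled assays
# instead of 96 individual ones.
pools = build_pools(8, 12)
hit = (3, 7)  # hypothetical true positive
positives = {name for name, members in pools.items() if hit in members}
print(len(pools), decode(positives))
```

The saving grows with grid size, since the number of pools scales with rows plus columns rather than rows times columns. The trade-off is ambiguity when many positions are positive at once.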
Protein Display Technology
The increased availability of gene sequences requires technologies to enable screens of the encoded polypeptides for diverse binding and catalytic activities. We are exploring technologies to couple DNA fragments, the mRNA transcribed in vitro from this DNA, and the protein translated in vitro from this RNA to provide the basis for activity screens and binding selections. This technology could be applied to complex mixtures of DNA, such as those present in the gut microbiota.
Substrate-Enzyme Relationships in Ubiquitination
Ubiquitin is a highly conserved protein whose attachment to a target protein can alter its fate. The E1 ubiquitin-activating enzyme, E2 ubiquitin-conjugating enzymes, and E3 ubiquitin ligases act in concert to covalently attach ubiquitin to proteins. There are 35 E3 ligases in S. cerevisiae and nearly 1,000 putative ones in humans, many of which, such as the breast cancer-specific tumor suppressor BRCA1 and the early-onset Parkinson's disease protein Parkin, are important in disease. We are using several approaches to identify on a genome-wide basis the specific protein substrates of individual E3 enzymes.
Intron Detection and Degradation
One hallmark of eukaryotic gene structure is the presence of introns, which are spliced out of pre-mRNAs prior to translation. Introns are released in the form of lariats, in which the 5' end of the intron RNA links to the 2' hydroxyl of an internal adenosine. The lariat must be debranched prior to intron turnover, and in the absence of the debranching enzyme, lariat RNAs accumulate. We compared yeast RNA derived from a wild-type strain and one deficient in debranching enzyme to identify more than half of the known yeast introns, as well as novel ones. We are adapting this approach for genome-wide identification of introns in Drosophila and human cells.
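The logic of the strain comparison can be illustrated with a short sketch: regions whose RNA signal is much higher in the debranching-deficient strain than in wild type are candidate introns, because lariats accumulate there. The input format (a per-region signal value for each strain), the pseudocount, the threshold, and the function names are all assumptions made for illustration; they do not describe the measurement platform or the analysis actually used.

```python
import math


def lariat_enrichment(wild_type, dbr1_deficient, pseudocount=1.0):
    """Return log2(deficient / wild-type) signal for each shared region.

    The pseudocount (an assumed value) avoids division by zero and damps
    ratios computed from very low signals.
    """
    regions = wild_type.keys() & dbr1_deficient.keys()
    return {
        region: math.log2(
            (dbr1_deficient[region] + pseudocount)
            / (wild_type[region] + pseudocount)
        )
        for region in regions
    }


def candidate_introns(wild_type, dbr1_deficient, min_log2_ratio=2.0):
    """List regions enriched in the deficient strain above an assumed cutoff."""
    scores = lariat_enrichment(wild_type, dbr1_deficient)
    return sorted(r for r, s in scores.items() if s >= min_log2_ratio)


# Made-up signal values for three hypothetical regions.
wild_type = {"regionA": 10, "regionB": 50, "regionC": 3}
dbr1_deficient = {"regionA": 200, "regionB": 55, "regionC": 40}
print(candidate_introns(wild_type, dbr1_deficient))
```

In practice the cutoff would need to be calibrated, for example against the set of already-annotated introns.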
We identified a yeast protein conserved in other organisms that could potentially link the spliceosome to the degradation of lariat introns. This protein, YGR093w, interacts with the debranching enzyme Dbr1 and with Syf1, a component of the spliceosome. We are performing biochemical experiments with model RNA substrates to establish a catalytic role.
Metabolites in Yeast
Metabolism encompasses all the processes by which a cell generates energy and other essential molecules from nutrients. These pathways rely on hundreds of genes and involve thousands of small molecules. Using fluorescent reagents and capillary electrophoresis, we are profiling metabolites in S. cerevisiae (which shares all major metabolic pathways with humans) to determine the effects of genetic and environmental stimulation on metabolism. We hope to apply these data to understanding human mutations and polymorphisms affecting metabolic pathways, yielding insights into the biology of metabolic processes in human health and disease.
Chromosome Structure
Evidence from work in several organisms indicates that specific interchromosomal interactions occur and can affect gene expression and other processes. We are analyzing yeast chromosomes to identify specific contacts between genomic loci. We are also analyzing DNase I hypersensitivity in yeast in order to understand chromatin structure and to determine how nucleosome positioning may change under different environmental conditions.
Recombination Genes
We are interested in exploiting the process of recombination, ultimately to promote gene therapy and genome engineering. Toward this end, we are using an assay in which targeted recombination results in a genetically selectable effect to identify proteins that promote recombination in yeast. Using this targeted recombination assay, we are also seeking to identify genes that influence recombination by assaying strains of the yeast deletion set and by testing proteins that have been implicated in recombination.
Yeast Aging
S. cerevisiae is a useful organism for studying factors that determine cellular longevity. The aging of mitotically active cells in higher eukaryotes can be modeled by the replicative life span of yeast mother cells, whereas the aging of postmitotic cells resembles the chronological survival of quiescent yeast during stationary phase. We used high-throughput technologies to identify and characterize genes that modify both aspects of the cellular life span. With Brian Kennedy (University of Washington), we determined replicative life span for a significant fraction of the strains in the S. cerevisiae deletion collection and chronological life span for all of the strains. Among the genes identified in both the replicative and chronological assays, several are known components of the TOR signaling pathway. The TOR pathway may be a primary conduit through which excess nutrient availability promotes aging in eukaryotic cells.
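Finding genes that score in both life-span assays amounts to intersecting two hit lists and then tallying the shared genes by pathway annotation. The following is a minimal sketch under stated assumptions: the gene identifiers are placeholders rather than real results, and the function names and the annotation lookup are hypothetical.

```python
from collections import Counter


def shared_hits(replicative_hits, chronological_hits):
    """Return genes present in both hit lists, sorted by name."""
    return sorted(set(replicative_hits) & set(chronological_hits))


def pathway_counts(genes, pathway_of):
    """Count genes per annotated pathway; unknown genes are 'unannotated'."""
    return Counter(pathway_of.get(gene, "unannotated") for gene in genes)


# Placeholder identifiers, not actual screen results.
replicative = ["geneA", "geneB", "geneC"]
chronological = ["geneB", "geneC", "geneD"]
pathway_of = {"geneB": "TOR signaling", "geneD": "TOR signaling"}

both = shared_hits(replicative, chronological)
print(both, pathway_counts(both, pathway_of))
```

A tally like this only suggests which pathways are overrepresented; judging significance would require comparison against the annotation frequencies in the full deletion collection.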
Some of this research was supported by grants from the National Institutes of Health.
Last updated: August 24, 2007