We study small RNAs that regulate gene expression. Our main focus is on microRNAs (miRNAs), which are ~22-nt RNAs that specify gene repression by base-pairing to messages of protein-coding genes. Our lab is uncovering both the widespread influence of miRNAs on metazoan gene expression and the roles that miRNAs play during growth and development of plants and animals. For example, our work indicates that more than half of human protein-coding genes are conserved regulatory targets of miRNAs and that the miRNA regulation of one of these genes is important for preventing human cancers.
Genomics of MicroRNAs
Our lab was among those that discovered the abundance of miRNAs. These tiny endogenous RNAs are encoded by unusual genes whose primary transcripts form distinctive hairpin structures from which the miRNAs are processed. Each hairpin is processed to generate the mature ~22-nt miRNA, which is then incorporated into a silencing complex, where it can specify post-transcriptional gene repression by base-pairing to messages of protein-coding genes.
We previously developed cloning and computational strategies for miRNA gene discovery in animals and plants and found, for example, that humans have hundreds of miRNA genes. We have also shown that many miRNAs are present at levels of more than 1,000 molecules per cell, with some exceeding 50,000 molecules per cell, indicating that the miRNAs, together with their associated proteins, are among the more abundant ribonucleoprotein complexes found in animal cells. Furthermore, many of the miRNAs from nematodes, insects, and mammals are related to each other, suggesting important functions since the common ancestor of these diverse animal lineages.
To explore more thoroughly the genomics and evolution of miRNAs and other endogenous silencing RNAs, we are using high-throughput sequencing to obtain millions of sequencing reads representing small RNAs from diverse plants and animals. For example, we find both miRNAs and another class of small regulatory RNAs known as piRNAs (described below) in the most deeply branching animals, implying that these two classes of regulatory RNAs have been available to shape gene expression since the beginning of multicellular animal life. Recent analyses of more nematode, fly, and mammalian sequences have revealed additional miRNAs, including some that had been missed earlier because their initial processing involves splicing rather than the specialized machinery that generates canonical miRNAs.
Other Small Regulatory RNAs
In addition to new miRNAs, our sequencing of small RNAs has also revealed previously unknown types of RNAs, including heterochromatic siRNAs (small interfering RNAs) and other types of endogenous siRNAs. Particularly intriguing have been the 21U-RNAs found in nematodes. Precisely 21 nucleotides long, they begin with a uridine but are diverse in their remaining 20 nucleotides. 21U-RNAs originate from >15,000 genomic loci dispersed in two broad regions of one chromosome—primarily between protein-coding genes or within their introns. These loci share a large upstream motif that is conserved in other nematodes, presumably because of its importance for producing this class of small RNAs. In a collaboration with Craig Mello (HHMI, University of Massachusetts Medical School), we recently found that 21U-RNAs are expressed specifically in the germline and are associated with a Piwi-related protein, which is needed for normal germ-line development and fertility. In flies and mammals, Piwi-interacting RNAs (piRNAs) are also found in the germline, where they direct transposon silencing and can have other roles in gamete production. Although the 21U-RNAs differ from the piRNAs found in other animals with respect to their size and biogenesis, the 21U-RNAs appear to function as the piRNAs of nematodes.
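The two sequence criteria given above for 21U-RNAs (exactly 21 nucleotides, beginning with a uridine) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes reads arrive as plain strings in which a sequencer may have written U as T. It checks only length and 5' nucleotide; it does not test for the upstream motif or the genomic location, so it would only flag candidates and is not a description of our actual pipeline.

```python
def is_candidate_21u(read: str) -> bool:
    """True if the read is exactly 21 nt long and begins with a uridine."""
    # Normalize case and convert DNA-style reads (T) to RNA (U).
    seq = read.strip().upper().replace("T", "U")
    return len(seq) == 21 and seq.startswith("U")


def filter_21u(reads):
    """Keep only reads that satisfy the length and 5'-U criteria."""
    return [r for r in reads if is_candidate_21u(r)]
```

For example, `filter_21u(["U" + "ACGU" * 5, "A" + "ACGU" * 5])` keeps only the first read, because the second begins with an adenosine.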
Regulatory Targets of MicroRNAs
The discovery of hundreds of miRNA genes raised the question of what all these tiny RNAs are doing. To address this question, we have developed methods that predict miRNA targets while keeping false-positive predictions to a minimum. In plants, the miRNAs have extensive pairing to their targets, and the evolutionarily conserved targets are mostly genes that play important roles during development. In animals, the miRNAs usually recognize shorter sites (typically 7 or 8 nucleotides in length) that match a short region of the miRNA containing the "seed" sequence. Animal miRNAs have a great abundance and diversity of targets. In collaboration with Christopher Burge (Massachusetts Institute of Technology), we recently showed that more than half of the human protein-coding genes have been under selective pressure to maintain pairing to miRNAs. When nonconserved targeting is considered, the fraction of human genes regulated by miRNAs grows even higher.
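The short seed-matched sites described above can be illustrated with a minimal Python sketch. It assumes the commonly used definitions, which are not spelled out in this summary: the seed is miRNA nucleotides 2-7, a 7mer-m8 site adds a match to nucleotide 8, a 7mer-A1 site adds an A opposite nucleotide 1, and an 8mer site has both. The function names are hypothetical, and the sketch scans a single sequence for perfect matches only; it is not the TargetScan implementation and ignores conservation and site context.

```python
def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (A, C, G, U only)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(rna))


def find_seed_sites(mirna: str, utr: str):
    """Return (start, site_type) for each 7mer or 8mer seed-matched site.

    `start` is the 0-based index in `utr` of the 6-nt match to the seed.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    # mRNA sequence (5'->3') that pairs with miRNA nucleotides 2-7.
    seed_match = reverse_complement(mirna[1:7])
    # mRNA base that pairs with miRNA nucleotide 8; it sits just 5' of the match.
    m8_base = reverse_complement(mirna[7])
    sites = []
    start = utr.find(seed_match)
    while start != -1:
        end = start + 6
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 and has_a1:
            sites.append((start, "8mer"))
        elif has_m8:
            sites.append((start, "7mer-m8"))
        elif has_a1:
            sites.append((start, "7mer-A1"))
        # A bare 6-nt seed match is not reported as a site in this sketch.
        start = utr.find(seed_match, start + 1)
    return sites
```

With the let-7a sequence `UGAGGUAGUAGGUUGUAUAGUU`, calling `find_seed_sites` on `AAAACUACCUCAGGG` reports a single 8mer site, since `CUACCUCA` pairs with miRNA nucleotides 2-8 and places an A opposite nucleotide 1.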
Although a 7mer site matching a miRNA often mediates some repression, it is not always sufficient, indicating that other characteristics help specify targeting. Using both computational and experimental approaches, we uncovered five general features of site context that boost site efficacy. Combining these features, we constructed a model of target recognition that successfully predicts site performance, thereby providing an important resource for choosing which of the many miRNA-target relationships are most promising for experimental follow-up. Because our approach distinguishes effective from ineffective sites without recourse to evolutionary conservation, it also identifies effective nonconserved sites and siRNA off-targets. Our target predictions for mammals, flies, and nematodes can be viewed at www.TargetScan.org, with ranking of mammalian predictions available according to their predicted efficacy or their preferential conservation.
To complement our ongoing studies of the influence of metazoan miRNAs on mRNA abundance and evolution, we have been collaborating with Steven Gygi's lab (Harvard Medical School) to use quantitative mass spectrometry to examine the impact of introducing or deleting individual miRNAs on the output of thousands of proteins. The identities of the responsive proteins and the extent of their response correspond well with our previous predictions. For most targets, mRNA destabilization explains more of the repression than does translational repression. Hundreds of genes are directly repressed by individual miRNAs, albeit each to a modest degree, indicating that for most interactions, miRNAs act as rheostats to make fine-scale adjustments to protein output.
Biological Functions of MicroRNAs
By disrupting the regulation of particular targets, several groups, including ours, have demonstrated the importance of miRNA-directed regulation during each stage of plant development. Our efforts, often in collaboration with Bonnie Bartel's lab (Rice University), have focused on the repression of the messages of NAC-domain transcription factors, auxin-response transcription factors, HD-Zip transcription factors, and ARGONAUTE1, a protein crucial for plant miRNA function. With regard to miRNA function in mammals, we showed that miR-196 directs the cleavage of HoxB8 mRNA during mouse embryonic development and also appears to repress paralogous Hox genes. More recently, we worked with Michael Hemann (Massachusetts Institute of Technology) to show that disrupting the miRNA regulation of the Hmga2 oncogene enhances oncogenic transformation. Because many human tumors possess defective HMGA2 genes that lack the miRNA-complementary sites, our work indicates that losing miRNA regulation of this oncogene contributes to human cancers.
RNA-Catalyzed RNA Polymerization
The RNA world hypothesis states that early life forms lacked protein enzymes and depended instead on enzymes composed of RNA. This hypothesis relies on the premise that some RNA sequences can catalyze RNA replication. In support of this idea, we have created an RNA molecule that catalyzes the type of polymerization reaction needed for RNA replication. The ribozyme uses nucleoside triphosphates and the coding information of an RNA template to extend an RNA primer by the successive addition of up to 14 nucleotides—more than a complete turn of an RNA helix. Its polymerization is quite accurate, and most importantly, it is general in terms of the sequence and length of the primer and template RNAs, as would be needed for self-replication and evolution. We are examining the catalytic and structural features of this ribozyme and its predecessors.