In my lab, we are studying how genes can be selected for silencing and then turned off. Genes contain information encoded in deoxyribonucleic acid (DNA), which is a relatively stable molecule excellent for storing the information. However, for the information to be made useful for conducting the work of the cell, it needs to be converted, or transcribed, into a slightly different chemical form: ribonucleic acid (RNA). The enzymes that read the DNA code and use it to transcribe corresponding RNA polymers are known as DNA-dependent RNA polymerases (RNAPs, for short).
In bacteria and archaea, a single form of RNAP accomplishes all cellular transcription. But in eukaryotes, there are three essential nuclear RNAPs required for cell survival. The first of these enzymes, RNA polymerase I (Pol I), has a very specific job: the transcription of hundreds of copies of nearly identical ribosomal RNA genes (rRNA genes). Each rRNA gene can generate a transcript that is subsequently cleaved into three smaller RNAs that assemble with ~80 proteins and one other RNA to form ribosomes, the cellular machines that make proteins.
RNA polymerase II (Pol II) is incredibly versatile in comparison to Pol I, transcribing thousands of protein-coding genes, long noncoding RNAs (RNAs that do not encode proteins), and small RNAs that modify or process messenger RNAs or ribosomal RNAs. Eukaryotic Pol III is also versatile, although not as gregarious as Pol II, transcribing several classes of smaller RNAs (typically shorter than 500 nucleotides) that include 5S ribosomal RNA, transfer RNAs, small regulatory RNAs, and short interspersed nuclear elements.
In 2000, while investigating the newly sequenced genome of the model plant Arabidopsis thaliana, I stumbled upon the fact that plants are unique among eukaryotes in having catalytic subunits for two more multisubunit RNA polymerases, in addition to the ubiquitous polymerases I, II, and III present in all eukaryotes. These novel polymerases, which we named Pol IV and Pol V, are not essential for viability under laboratory conditions. However, Pol IV and Pol V are important enzymes that play distinct roles in gene-silencing pathways that help keep transposable elements, viruses, and a subset of genes under control.
The functions of Pol IV and Pol V are best understood with respect to the RNA-directed DNA methylation pathway (Figure 1). In this process, production of 24-nucleotide small interfering RNAs (siRNAs) brings about the methylation of cytosines within matching DNA sequences, accompanied by chemical modifications of the histone proteins that interact with the methylated DNA. These changes in chromatin (the term used to describe the combination of DNA and tightly associated proteins) can block the association of proteins required for gene activation by helping make the DNA inaccessible to Pols I, II, or III. Our studies have shown that Pol IV acts at the beginning of the RNA-directed DNA methylation pathway, in physical association with an RNA-dependent RNA polymerase, RDR2. Together, Pol IV and RDR2 synthesize short double-stranded RNAs that are cut by DCL3 into 24-nucleotide siRNAs. Pol V acts at a different stage of the pathway, synthesizing transcripts to which the siRNAs bind, in association with an Argonaute protein (primarily AGO4) required to recruit the silencing machinery.
We were the first to determine the complete subunit compositions (12 subunits each) of Pols II, IV, and V in Arabidopsis and maize, showing definitively that Pol IV and Pol V evolved as specialized forms of Pol II. This fits with the fact that non-plant eukaryotes also carry out RNA-mediated transcriptional silencing using small RNAs that bind to long noncoding RNAs; but in these organisms, the RNAs are made by Pol II. During plant evolution, it appears that duplication of Pol II subunit genes allowed for the diversification of enzyme functions, with silencing functions taken over by Pol IV and Pol V and essential functions, such as synthesis of messenger RNAs, retained by Pol II.
We, and others, have shown that Pol IV and Pol V transcription does not appear to involve conventional promoters to specify transcription start sites. Instead, Pols IV and V are apparently recruited to chromatin that is marked by heritable DNA methylation patterns perpetuated in crosstalk with chemical modifications of the associated histones. But we do not understand how Pol IV and Pol V initiate transcription once they are recruited. Toward this end, we have developed the first in vitro systems in which to study details of Pol IV and Pol V transcription, the coupling of Pol IV and RDR2 activities, and the roles of helper proteins. These studies are a current focus that will keep us busy for some time.
The second major focus of my laboratory is nucleolar dominance, an epigenetic phenomenon that occurs in genetic hybrids and describes the preferential transcription of rRNA genes inherited from only one of the two parents. For many years, it was thought that the basis for nucleolar dominance was the preferential activation of one set of rRNA genes due to their (hypothetical) higher binding affinity for one or more limiting transcription factors. However, we showed instead that the molecular basis for nucleolar dominance is the preferential silencing of one set of rRNA genes via repressive chromatin modifications that include cytosine methylation and a series of repressive histone modifications. Using genetic approaches, we identified many of the players required for preferential rRNA gene silencing, including specific DNA methyltransferases, methylcytosine-binding domain proteins, histone deacetylases, and histone methyltransferases.
The rationale for nucleolar dominance is dosage control. Eukaryotes have hundreds, and sometimes thousands, of rRNA genes, not all of which are needed in every cell at every time in development. As a result, excess rRNA genes are shut off, and this occurs in non-hybrids as well as in hybrids that display nucleolar dominance. We have shown that similar chromatin modifications occur in hybrids and non-hybrids, indicating that nucleolar dominance is just one manifestation of the rRNA gene dosage control system that operates in all eukaryotes.
A question unanswered for several decades is: how are specific rRNA genes chosen for silencing when all rRNA genes are nearly identical in sequence? We recently achieved a breakthrough, showing that selective silencing is not achieved one rRNA gene at a time based on their sequences. Instead, selective silencing occurs on a much larger scale via the inactivation of entire nucleolus organizer regions (NORs), the chromosomal loci where rRNA genes are clustered in hundreds of copies. In the Arabidopsis thaliana strain Col-0, the NOR on chromosome 2 is selectively silenced during development whereas the rRNA genes on chromosome 4 stay active (Figure 2). Identifying the basis for chromosome-specific NOR inactivation is now our challenge.
Our work is supported, in part, by funding from the National Institutes of Health, the Gordon and Betty Moore Foundation, and the Carlos O. Miller endowment at Indiana University.
As of March 9, 2016