In my lab, we are interested in the ways genes are turned on and off. Genes contain information encoded in deoxyribonucleic acid (DNA), which is a relatively stable molecule that is good for storing the information. However, for the information to be made useful to conduct the work of the cell, it needs to be copied, or transcribed, into a slightly different chemical form—ribonucleic acid (RNA). The enzymes that “read” the DNA and use it to synthesize corresponding RNA polymers are known as DNA-dependent RNA polymerases (RNAPs).
In bacteria and archaea, a single form of RNAP accomplishes all cellular transcription. But in eukaryotes, there are three essential nuclear RNA polymerases that are required for cell survival. The first of these, RNA polymerase I (Pol I), has a specific job, focusing its efforts on the transcription of hundreds of copies of nearly identical ribosomal RNA genes (rRNA genes). Each rRNA gene can generate a primary transcript (initial RNA) that is subsequently cleaved into three smaller RNAs that assemble with numerous proteins to form the catalytic core of ribosomes, the cellular machines that are responsible for protein synthesis.
RNA polymerase II (Pol II) is incredibly versatile in comparison to Pol I, transcribing thousands of protein-coding genes, long noncoding RNAs (RNAs that do not encode proteins), and small RNAs that modify or process messenger RNAs or ribosomal RNAs. Eukaryotic Pol III is also versatile, although not as gregarious as Pol II, transcribing several classes of smaller RNAs (typically under 500 nucleotides) that include 5S ribosomal RNA, transfer RNAs, small regulatory RNAs, and short interspersed nuclear elements.
In 2000, while investigating the newly sequenced genome of the model plant Arabidopsis thaliana, I stumbled upon the fact that plants are unique among eukaryotes in having catalytic subunits for two multisubunit RNA polymerases in addition to the ubiquitous eukaryotic polymerases I, II, and III. These novel polymerases, which we have since named Pol IV and Pol V, are not essential for viability under laboratory conditions, unlike Pols I, II, and III. However, Pol IV and Pol V turn out to be important enzymes that play distinct roles in gene-silencing pathways that help keep transposable elements, viruses, and other genomic repeats under control.
The functions of Pol IV and Pol V are best understood with respect to the RNA-directed DNA methylation pathway, in which production of 24-nucleotide small interfering RNAs (siRNAs) brings about the methylation of cytosines within complementary DNA sequences, as well as chemical changes within the histone proteins that wrap and organize the DNA. These changes in chromatin, the term used to describe the combination of DNA and its tightly associated proteins, can block the association of proteins required for gene activation and/or they can recruit additional proteins that make the DNA inaccessible to RNA polymerases. Available evidence indicates that Pol IV acts at the beginning of the RNA-directed DNA methylation pathway, generating the initial transcripts that serve as precursors for siRNA biogenesis. By contrast, Pol V acts late in the pathway, with our work suggesting that Pol V’s role is to generate transcripts to which siRNAs bind, in association with an Argonaute protein (primarily AGO4), thereby targeting the silencing machinery to these loci (Figure 1).
We recently determined the complete subunit compositions (12 subunits each) of Pols II, IV, and V in Arabidopsis, showing definitively that Pol IV and Pol V evolved as specialized forms of Pol II. This finding fits with observations in fission yeast that show that RNA-mediated transcriptional silencing involves small RNAs that bind to long noncoding RNAs at silenced loci, with all of the requisite RNAs made by Pol II. During plant evolution, it appears that duplication of Pol II subunit genes allowed for the diversification of Pol II functions, such that silencing functions were eventually taken over by Pol IV and Pol V but essential functions, such as synthesis of RNAs encoding proteins, were retained by Pol II.
Identifying the subunits and sequence changes responsible for the unique functions of Pols II, IV, and V is one of several long-term goals for my lab. We do not understand how Pol IV and Pol V find their target sites in the genome, and we do not know if their transcription is programmed by conventional promoters and transcription factors or possibly by distinctive chromatin structures. We need to develop in vitro systems in which to define the templates and products of Pol IV and Pol V transcription. We also need to understand how the activity of Pol IV and Pol V is physically coupled with other proteins that are required for RNA-directed DNA methylation. These goals will occupy the attention of my laboratory in the years to come.
The second major focus of my laboratory is nucleolar dominance, an epigenetic phenomenon that occurs in genetic hybrids and describes the preferential transcription of rRNA genes inherited from only one of the two parents. For many years, it was thought that the basis for nucleolar dominance was the preferential activation of one set of rRNA genes due to their (hypothetical) higher binding affinity for one or more limiting transcription factors. However, we showed instead that the molecular basis for nucleolar dominance is the preferential silencing of one set of rRNA genes via a mechanism that involves repressive chromatin modifications, including cytosine methylation and a series of repressive histone modifications. Using genetic approaches, we have identified many of the players required for the preferential silencing of rRNA genes in nucleolar dominance, including specific DNA methyltransferases, methylcytosine-binding domain proteins, histone deacetylases, and histone methyltransferases (Figure 2).
The rationale for nucleolar dominance appears to be dosage control. Eukaryotes have hundreds to thousands of rRNA genes, not all of which are needed in every cell at every time in development. As a result, excess rRNA genes are shut off, and this occurs in nonhybrids as well as in hybrids that display nucleolar dominance. We find that similar chromatin modifications occur in hybrids and nonhybrids alike, indicating that nucleolar dominance is just one manifestation of the dosage control system that operates in all eukaryotes.
A major unanswered question is how the two sets of rRNA genes of a hybrid are discriminated from one another within the nucleus, and how one set is chosen for inactivation. It is still not clear whether the mechanisms responsible for rRNA gene silencing act at the level of individual rRNA genes or on a larger scale, perhaps on the scale of tens or hundreds of rRNA genes that are clustered together at specific chromosomal loci known as nucleolus organizer regions. We have published evidence that rRNA gene silencing is mediated by siRNAs that program RNA-directed DNA methylation. Knocking out the activity of several proteins required for RNA-directed DNA methylation is sufficient to abolish nucleolar dominance in hybrids. Moreover, precise RNA base-pairing interactions could provide the specificity required to discriminate between rRNA genes that are extremely similar in sequence but not quite identical. For these reasons, we are eager to learn whether we can manipulate the levels of regulatory RNAs, and to understand their origins and developmental regulation, thereby elucidating the choice mechanism responsible for differential rRNA gene silencing.
Grants from the National Institutes of Health provided partial support for these projects.
As of May 30, 2012