Gene expression is a fundamental process in which DNA is transcribed to RNA, the direct template for protein synthesis. The regulation of the amount and types of protein made is critical to the cell's viability and its identity. During development of multicellular organisms, the scheme of events that determines exactly what type of cell will form becomes the status quo in the adult, so that adult cell division gives rise to the identical cell type, preserving tissue identity. Since all cells of a given organism contain the same genetic material, what controls the expression of the different proteins that specify the attributes of the different cell types? This complex and fascinating process depends functionally on the proteins that structure the DNA. My laboratory's long-term goal is to determine how and when a gene is transcribed, and what controls this process. To do this, we set out to determine the criteria that enable or disable transcription as a function of increasingly complex DNA structure.
We focused initially on naked DNA and the requirements for the formation of a transcription complex that can synthesize RNA. We first established a fully reconstituted transcription system in vitro. With all the components defined, we discovered how different families of protein factors regulate the transcription process so that the large RNA polymerase II (RNAPII) molecule can perform the actual work of migrating rapidly along the gene's DNA to produce RNA.
The milieu of cellular genes is considerably more elaborate. Two meters of DNA are packed in each human cell nucleus, in an ordered hierarchy. First, the DNA itself is wrapped around proteins called histones, forming spools called nucleosomes. The resulting fiber, which has an 11-nanometer (nm) diameter, resembles beads on a string. This fiber is then coiled further into a 30-nm fiber, which in turn is structured into higher-order formations, up to the scale of the chromosomes. This remarkable consolidation poses a problem: how are the genes that are to be transcribed to RNA untangled? To learn how these structures are made amenable to transcription, we reconstituted the 11- and 30-nm fibers in vitro.
We found that nucleosomes composing the 11-nm fiber present a barrier to RNAPII, and we discovered two novel activities that allow the polymerase to overcome this barrier. RSF (remodeling and spacing factor), which acts at the stage of transcription initiation, interacts with transcriptional activators and renders the promoter sequence accessible to the transcription complex. It does this by mobilizing nucleosomes in an ATP-dependent manner and then depositing them back via its chaperone activity. The other activity, FACT (facilitates chromatin transcription), allows the polymerase to read the DNA while it is still spooled on the nucleosomes. FACT removes half of the histones ahead of the polymerase, allowing the polymerase molecule to burrow its way through the spool, transcribing the DNA as it goes.
We recently established in vitro conditions to reconstitute the more complicated, higher-order 30-nm fiber. Although this structure is also refractory to transcription, the model inducible promoter that we inserted was accessible for binding by some specific transcription factors. It was not until transcription was induced, however, that the full dynamics elicited by this process were revealed. The physical positions of the nucleosomes were altered, and the full cohort of transcription factors could now access the promoter. The repositioning of the nucleosomes occurred in a patterned and cyclical manner, and the presence of the transcription factors also oscillated. We observed these same dynamics at the endogenous promoter in vivo.
As we try to understand how transcription occurs on increasingly complex DNA structures by recapitulating this process in vitro, we also investigate the proteins that dynamically alter the composition of the nucleosomes. It is the amalgam of distinct nucleosomal features that controls DNA accessibility. In recent years, we and our colleagues in the transcription field have found that the histone tails (unstructured, hook-like projections that extend outward from each nucleosome) are the key to the extent of DNA folding. The post-translational modifications that these tails receive determine whether they lock together tightly or loosely and thus whether the genes spooled around them are silent or active, respectively. One such modification is methylation of lysine residues in the histone tails.
Depending on which lysine of which histone protein is methylated, the resulting modified nucleosomes help set the transcriptional status of the underlying genes. For example, our studies revealed two distinct activities, L3MBTL1 and Ezh1, each of which binds to and actively compacts chromatin when the nucleosomes carry certain methylated histones. Although these proteins compact chromatin by different means, the end result in both cases is an altered structure and gene repression. The proteins responsible for histone lysine methylation (methylases) and their regulation are pivotal to this process.
Transmitting cellular identity during cell division may require that methylations on specifically positioned nucleosomes be propagated along with the DNA so that the patterning of gene expression is duplicated. A specific methylase would somehow recognize already methylated parental histone residues, and then target and methylate the similarly positioned histone lysine residues on newly synthesized daughter chromatin.
A big step toward our goal of understanding how dividing cells can retain their proper identity was our recent isolation of the human protein PR-Set7, which methylates the histone H4 polypeptide at a specific residue, lysine 20 (H4K20). We discovered that nucleosomes carrying this mark correlate with repressive chromatin in vivo and that the expression of PR-Set7 is regulated during cell division, occurring only during early mitosis. PR-Set7 binds to mitotic chromosomes during prometaphase, prior to their separation. The enzyme is critical for development in the mouse, as we found that a homozygous knockout of PR-Set7 resulted in embryonic lethality.
We postulate that binding of PR-Set7 to mitotic chromosomes establishes the basis for propagation of this mark through cell divisions. However, PR-Set7 does not by itself recognize methylated H4K20; instead, it interacts with L3MBTL1, which does bind methylated H4K20. In this scenario, PR-Set7 recognizes methylated H4K20 in the mother chromosomes through its interaction with L3MBTL1 and then transmits this mark to the daughter chromosomes before separation. This is an exciting possibility, because the modification can be passed down to daughter cells, telling them which genes are silent and which are active and, in other words, determining the cell's identity.
A further advance on this front involves Ezh2, a member of the Polycomb group of proteins that maintain Hox gene repression during development. Ezh1 and Ezh2 are homologs, and we found that Ezh1 compacts chromatin directly. Ezh2, in contrast, methylates histones and participates in a dynamic interplay between histone methylation and acetylation. The methylation activity of Ezh2 is modulated by its association with the other proteins with which it forms the Polycomb repressive complex 2 (PRC2). Ezh2 can methylate H3K27 and H1K26, both of which are associated with silenced chromatin.
Our most recent findings show that another PRC2 component, Eed, possesses a unique aromatic cage that binds to several methylated histone residues. Only upon Eed binding to methylated H3K27, however, is the Ezh2 component of PRC2 stimulated in its methylation activity toward H3K27. This pinpoints how the PRC2 complex recognizes already methylated H3K27 and how another subunit can modify Ezh2 activity to methylate new H3K27 candidates. Studies from other labs established that the expression of Ezh2 and some of its partner proteins is regulated by the cell cycle and coordinated with DNA replication. Thus, although histone methylation by PR-Set7 may establish heritable cellular identity during mitosis, histone methylation by Ezh2 may do so during DNA replication.
Since the nucleosomes may embody some of the cues that mandate specific gene expression, we are now trying to understand exactly how the component histones are deposited on newly synthesized DNA such that the parental cues are inherited faithfully and the cells that arise after division exhibit the same identity. This is a fundamental issue in our studies of how gene expression is regulated and its integrity preserved among subsequent generations of cells.