The ability of B lymphocytes to generate diverse B cell antigen receptor (BCR) repertoires and effector antibodies relies on two different programmed genomic rearrangements of immunoglobulin (Ig) genes that fuse widely separated DNA double-strand breaks (DSBs). During primary B cell development, exons that encode antigen-binding Ig variable (V) regions are assembled from Ig V, D (diversity), and J (joining) gene segments via V(D)J recombination. Antigen-dependent activation of mature B cells leads to IgH class-switch recombination (CSR), a process that fuses DSBs in two different switch (S) regions to replace the initially expressed IgH constant-region exons (CHs) with a different CH and thereby change antibody class and pathogen elimination functions.
Chromosomal translocations are aberrant genomic rearrangements that fuse DSBs at different chromosomal locations, producing interchromosomal translocations or intrachromosomal translocations that manifest as deletions or inversions. V(D)J recombination and CSR share mechanistic features with translocations, including a requirement for physical juxtaposition (synapsis) of DSBs within two widely separated sequences, followed by their fusion via end joining. In this regard, V(D)J recombination or CSR DSBs are occasionally joined to DSBs near cellular oncogenes to generate translocations that contribute to lymphoid cancers.
Early on, we discovered that Ig and T cell receptor (TCR) gene V(D)J recombination is carried out by a common V(D)J recombinase. Subsequent studies by others showed that the recombination-activating gene (RAG) endonuclease introduces the specific DSBs at Vs, Ds, or Js; our studies demonstrated that these broken gene segments are fused by the classical nonhomologous DNA end-joining DSB repair pathway (C-NHEJ). We also discovered that developmentally ordered, lineage-specific, and feedback-regulated antigen receptor locus V(D)J recombination is mediated epigenetically by modulating substrate gene segment accessibility to V(D)J recombinase. With others, we documented accessibility correlates, including noncoding transcription and chromatin modifications. However, key cis-regulatory elements, beyond known Ig and TCR gene enhancers and promoters, remained elusive.
The mouse IgH locus spans 2.7 Mb, with the Vs embedded in a several-megabase distal portion separated from Ds and Js by a 100 kb intergenic region. We identified a key V(D)J recombination regulatory locus, intergenic control region-1 (IGCR1), within this 100 kb interval. Functionally, IGCR1 employs CTCF looping/insulator factor-binding elements and, correspondingly, mediates IgH loops containing distant enhancers. IGCR1 diversifies antibody repertoires by suppressing transcription and rearrangement of proximal IgH Vs, thereby promoting rearrangement of distal Vs. IGCR1 also maintains various known regulatory aspects of IgH V(D)J recombination by suppressing joining of IgH Vs to Ds not already joined to Js and by suppressing IgH V-to-DJ joins in thymocytes. Beyond providing insights into IgH V(D)J recombination control, elucidation of IGCR1 also pointed to a new role for the generally expressed CTCF protein.
We also implicated additional factors in V(D)J recombination joining. We found that deficiency for XLF, a putative C-NHEJ factor identified based on its mutation in immunodeficient human patients, did not affect V(D)J recombination in mice. Our search for potential compensatory factors implicated multiple factors of the ataxia-telangiectasia-mutated protein (ATM)-dependent DSB response (DSBR), including ATM itself and the downstream factors histone H2AX and 53BP1. Thus, while deficiency for any of these DSBR factors alone has a modest impact on V(D)J recombination, combined deficiency with XLF abrogates V(D)J joining. These findings indicate that XLF shares fundamental roles in V(D)J recombination with the broader ATM-dependent DSBR, but that these roles were masked by functional redundancy. Such roles, which potentially include end tethering, are still under investigation.
V(D)J exons are assembled upstream of a CH (Cμ) that encodes IgM antibodies. CSR replaces Cμ with a CH lying 100–200 kb downstream that encodes another IgH class (e.g., IgG, IgE, IgA) via a DSB generation and end-joining process. CSR is initiated by activation-induced cytidine deaminase (AID), a single-strand DNA (ssDNA)-specific cytidine deaminase that generates S-region lesions that are processed into DSBs. We had shown that transcription targets AID, in association with modifications/cofactors, to the nontemplate strand of duplex DNA substrates. However, given that AID acts on both DNA strands, the mechanism by which it targets the template strand, which could be shielded by nascent RNA transcripts, was enigmatic. Through biochemical and other approaches, we found, in collaboration with former trainee Uttiya Basu (Columbia University), that the RNA exosome, a cellular RNA degradation/processing machine, provides template-strand AID access, revealing a role for noncoding RNA surveillance machinery in antibody diversity.
Beyond Ig genes, AID acts on a limited set of off-target genes, generating substrates for oncogenic translocations and mutations that contribute to B cell lymphoma. Our genomic translocation cloning work, as well as work of others, identified a small set of AID off-targets in CSR-activated B cells. How AID is recruited to such off-targets was another mystery. Based on deep genome-wide nuclear run-on sequencing of peripheral B cells activated for CSR or somatic hypermutation (SHM), we discovered that most robust AID off-target translocations occurred within highly focal regions of genes in which sense and antisense transcription converge. We found that this convergent transcription arises via antisense transcription emanating from "super enhancers" within sense-transcribed gene bodies. This work, done collaboratively with Shirley Liu and James Bradner (Dana-Farber Cancer Institute), explains why AID off-targeting is directed to a small subset of mostly lineage-specific genes in activated B cells, a gene subset highly enriched in B cell lymphoma oncogenes.
We found that DSBs induced by the yeast I-SceI endonuclease at I-SceI target sites in place of an acceptor IgH S region allowed I-SceI-dependent CSR in B cells at nearly physiological levels via a translocation-like process in which I-SceI DSBs were joined to AID-initiated Sμ DSBs. Based on this, we developed high-throughput genome-wide translocation sequencing (HTGTS) to map, at the nucleotide level, translocation junctions between I-SceI or custom endonuclease "bait" DSBs generated in a specific genomic location and other endogenous "prey" DSBs genome-wide. HTGTS revealed that major endogenous translocation hot spots in CSR-activated B cells and progenitor B cell lines were recurrent DSBs generated, respectively, by AID and RAG.
We employed HTGTS to test our hypothesis that the frequency at which the ends of two separate DSBs join to each other in the three-dimensional (3D) genome is a function of the frequency at which DSBs are present at each site and the frequency with which the two sites are synapsed in a cell population. Consistent with this notion, our collaborative studies with Job Dekker (University of Massachusetts) revealed that two high-frequency DSBs can dominate genome-wide translocation landscapes regardless of chromosomal location because cellular heterogeneity in 3D genome organization allows most loci to be synapsed in a subset of cells. Conversely, in the absence of dominant DSBs, synapsis frequency of two broken sites contributes more strongly to translocation frequency. Thus, when cells are treated with ionizing radiation to generate large numbers of random DSBs, the entire length of any given chromosome becomes a translocation hot-spot region for DSBs within it, because two DSBs are more likely to be proximal on the same chromosome (in cis) than on different chromosomes.
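The hypothesis above can be illustrated with a toy calculation. The sketch below assumes a simple multiplicative form (DSB frequency at each site times synapsis frequency); the text says only that joining frequency is "a function of" these quantities, so the product, the function name, and all numbers here are hypothetical and chosen purely for illustration, not measured values.

```python
def relative_joining_frequency(dsb_freq_a, dsb_freq_b, synapsis_freq):
    """Toy score for how often two DSB sites join.

    Assumption (not from the source): the three contributions
    combine as a simple product.
    """
    return dsb_freq_a * dsb_freq_b * synapsis_freq


# Hypothetical fraction of cells carrying the "bait" DSB.
bait_dsb_freq = 0.5

# Hypothetical "prey" sites: (DSB frequency, synapsis frequency with bait).
prey_sites = {
    "recurrent DSB, other chromosome": (0.2, 0.01),
    "rare DSB, same chromosome, nearby": (0.001, 0.2),
    "rare DSB, other chromosome": (0.001, 0.01),
}

scores = {
    name: relative_joining_frequency(bait_dsb_freq, dsb, syn)
    for name, (dsb, syn) in prey_sites.items()
}

# Rank prey sites from most to least frequent predicted joining.
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.6f}")
```

With these made-up inputs, the recurrent DSB ranks first even though it is rarely synapsed with the bait, while among equally rare DSBs the nearby site on the same chromosome outranks the site on another chromosome. This mirrors the two regimes described above, under the stated multiplicative assumption.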
Within a single chromosome (in cis), we found that translocation frequency is further enhanced between sequences lying within megabase or sub-megabase distances, which may be related to recently characterized topological domains that contribute to increased interaction frequencies between sequences within them. Our findings suggest how physiological levels of CSR joining are achieved between S-region DSBs linearly separated by hundreds of kilobases. First, location of S regions within megabase topological chromatin domains leads to their frequent synapsis in activated B cells, perhaps enhanced by specialized IgH-looping features. Second, S regions, as specialized AID targets, generate DSBs frequently enough that the probability of DSBs being present in two synapsed S regions is sufficient to drive their physiological joining. We are testing this model and related hypotheses that such general chromatin topological factors contribute to synapsis of V, D, and J segments for RAG cleavage and promote recurrent interstitial chromosomal deletions found in certain cancers.
Grants from the National Institutes of Health and the Leukemia and Lymphoma Society of America provided support for some of these studies.
As of March 22, 2016