Molecular and Cellular Biology. 2009 Mar 23;29(11):3124–3133. doi: 10.1128/MCB.00139-09

G Clustering Is Important for the Initiation of Transcription-Induced R-Loops In Vitro, whereas High G Density without Clustering Is Sufficient Thereafter

Deepankar Roy 1, Michael R Lieber 1,*
PMCID: PMC2682002  PMID: 19307304

Abstract

R-loops form cotranscriptionally in vitro and in vivo at transcribed duplex DNA regions when the nascent RNA is G-rich, particularly with G clusters. This is the case for phage polymerases, as used here (T7 RNA polymerase), as well as RNA polymerases in bacteria, Saccharomyces cerevisiae, avians, mice, and humans. The nontemplate strand is left in a single-stranded configuration within the R-loop region. These structures are known to form at mammalian immunoglobulin class switch regions, thus exposing regions of single-stranded DNA for the action of AID, a single-strand-specific cytidine deaminase. R-loops form by thread-back of the RNA onto the template DNA strand, and here we report that G clusters are extremely important for the initiation phase of R-loop formation. Even very short regions with one GGGG sequence can initiate R-loops much more efficiently than random sequences. The high efficiencies observed with G clusters cannot be achieved by having a very high G density alone. Annealing of the transcript, which is otherwise disadvantaged relative to the nontemplate DNA strand because of unfavorable proximity while exiting the RNA polymerase, can offer greater stability if it occurs at the G clusters, thereby initiating an R-loop. R-loop elongation beyond the initiation zone occurs in a manner that is not as reliant on G clusters as it is on a high G density. These results lead to a model in which G clusters are important to nucleate the thread-back of RNA for R-loop initiation and, once initiated, the elongation of R-loops is primarily determined by the density of G on the nontemplate DNA strand. Without both a favorable R-loop initiation zone and elongation zone, R-loop formation is inefficient.


Immunoglobulin (Ig) class switch recombination (CSR) is the process in which IgM is changed to IgG, IgA, or IgE by rearranging the Ig heavy chain from IgHμ to IgHγ, IgHα, or IgHɛ (7, 11, 40). This DNA recombination process occurs at class switch sequences located upstream of the corresponding constant domain exons. The class switch sequences are long (1 to 12 kb); repetitive, with unit repeat lengths of 25 to 80 bp; transcribed by a promoter immediately upstream of each switch region; and strikingly G-rich on the nontemplate strand, with G densities reaching 40 to 50% (48). Despite conservation of these features, the actual primary switch repeat sequences themselves are not conserved across species or even among the switch sequences of the different isotypes (e.g., Igμ, Igγ, Igα, and Igɛ) (15). Even the unit repeats within any one switch region of a given species vary from one repeat to the next, such that not one of the individual repeats precisely matches the average sequence of that switch region.

Activation-induced deaminase (AID) is a cytidine deaminase that is expressed in activated B cells and is essential for Ig CSR and Ig somatic hypermutation (SHM) (31). AID only acts on cytosines located in single-stranded DNA (ssDNA) (6). This raises the question of how the DNA becomes single-stranded so that AID can act on these genomic regions (48). The promoters upstream of the switch region are critical for CSR, indicating that transcription is critical (5, 16, 21, 46, 49). Transcription is also critically important for SHM, suggesting that some level of ssDNA is somehow exposed to AID during transcription (28). Indeed, transcription by both mammalian RNA polymerase II and prokaryotic or phage polymerases can generate ssDNA upstream of the polymerase to some extent in a manner that is not very well characterized (29). The eukaryotic ssDNA binding protein, RPA, appears to contribute to this exposure of ssDNA in vivo, perhaps by stabilizing the single-stranded state transiently induced by transcription (7, 8, 32, 33). Other proteins that bind either the nascent RNA or the nontemplate DNA strand may modify the efficiency of R-loop formation in vivo (14, 20, 22, 26).

Ig switch regions evolved several hundred million years after SHM already existed (4, 19, 40, 42, 50). Although both SHM and CSR require AID, the processes of CSR and SHM are quite different. SHM is a point mutagenesis process, whereas CSR is a double-strand break recombination process. CSR regions in Xenopus are rich in palindromic forms of preferred sites of AID action (WRC, where W = A or T and R = A or G). Interestingly, upon hyperimmunization, amphibians do not switch nearly as efficiently as mammals (13). At least part of the basis of this may be due to the high asymmetric G density in mammals. Mammalian CSR regions achieve G densities of nearly 50% on the nontemplate strand in some repeats, in contrast to amphibian switch regions, which are 21% G, like random vertebrate DNA.

An RNA-DNA hybrid forms at a 140-bp subregion of the Igα switch region upon in vitro transcription with T7 RNA polymerase (34, 35). We showed that RNA-DNA hybrids form at all of the tested murine switch regions anywhere within the length of their repetitive regions (9). The RNA-DNA hybrids are stable for days and stable to phenol-chloroform extraction. The RNA-DNA hybrids are also stable to shorter term exposure to temperatures of 65 to 75°C. We showed that the structure of these RNA-DNA hybrids is an R-loop, with the G-rich DNA strand displaced by the G-rich nascent RNA of the same sequence (47). We and others have shown that the number of hydrogen bonds between the RNA and template DNA strand is important for the stability of the R-loop, based on failure of R-loops to form when inosine is substituted for guanine (I:C base pairs share only two hydrogen bonds rather than the three of G·C) (12; T. E. Wilson and M. R. Lieber, unpublished data). Upon RNase H treatment, some misalignment of top strand and bottom strand DNA repeats occurs in vitro, resulting in displaced loops of ssDNA on both strands, and we have proposed that this may occur in vivo as a way to expose single-strandedness on the template DNA strand for AID action (47, 48).

Our in vitro studies do not show any evidence of secondary structure on the nontemplate DNA strand of the R-loops (e.g., G-quartets), and in vitro experiments in which no Na+ or K+ are present (only lithium ion or cesium ion) show unaltered levels of R-loop formation, indicating that R-loops are stable under conditions where G-quartets do not form (37). Therefore, if G-quartets form in vitro on the G-rich nontemplate DNA strand, they are not essential for R-loop formation. Moreover, at chromosomal R-loops, the regions of single-strandedness are continuous (47, 48). One would expect G-quartets to trap intervening Cs, thereby making them resistant to bisulfite, and we do not see this (17). For these in vitro and in vivo reasons, we do not favor the view that G-quartets exist at switch region R-loops; however, one cannot rule out the possibility that G-quartets exist but that physical methods to detect them are limited (12).

We and others have demonstrated that R-loops form at Ig CSR regions in the mouse chromosome in activated B cells at Igμ, Igγ3, Igγ2b, and Igγ1, but not in resting B cells (36, 47). In accord with an R-loop model, inversion of switch regions reduces their efficiency (39). In vivo R-loops can have heterogeneous initiation sites, are continuous until their termination, typically terminate within or shortly downstream of the switch region sequences, and can be removed by RNase H.

R-loops have also been demonstrated in vivo at some other genomic and mitochondrial locations in vertebrate cells and in yeast cells (18, 24-27, 45). R-loops are found at G-rich regions in the chicken cell line DT40, as well as in HeLa cells, when the cells were depleted of ASF/SF2 proteins that might aid in ribonucleoparticle formation (26). Saccharomyces cerevisiae mutants for a component of the THO complex (which is involved in ribonucleoparticle biogenesis and has functional roles in maintaining genomic stability during transcription and integrates transcriptional elongation with mRNA export) also exhibit apparent R-loop formation in GC-rich sequences upstream of the RNA polymerase elongation complex. Importantly, this location is a hot spot for mitotic recombination in wild-type yeast, and the hot spot is suppressed with overexpression of RNase H (1, 14, 18).

More recently, we showed that in vitro R-loops form by a thread-back mechanism and form less efficiently as the number of switch repeats decreases and as the G cluster size of the repeats decreases (37). More specifically, we showed that a sequence of 50% G where the Gs are alternating with A, C, or T does not form R-loops nearly as well as a sequence of 50% G in which the Gs are clustered. Even a longer R-loop zone with a high G density but no clustering cannot match the R-loop formation efficiency of shorter G-clustered sequences of equivalent G density.
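The distinction between G density and G clustering can be made concrete with a minimal sketch. The two 16-nt sequences below are hypothetical illustrations (they are not substrates from this study), and the function names are our own; both sequences are 50% G, but only one contains GGGG clusters.

```python
def g_density(seq: str) -> float:
    """Fraction of positions in seq that are G."""
    seq = seq.upper()
    return seq.count("G") / len(seq)


def longest_g_run(seq: str) -> int:
    """Length of the longest uninterrupted run of Gs (the largest G cluster)."""
    longest = current = 0
    for base in seq.upper():
        current = current + 1 if base == "G" else 0
        longest = max(longest, current)
    return longest


# Hypothetical example sequences, chosen only to have equal G density.
clustered = "GGGGACTTGGGGACTT"  # Gs grouped into two GGGG clusters
dispersed = "GAGTGCGAGTGCGAGT"  # Gs alternating with A, C, or T

for name, seq in (("clustered", clustered), ("dispersed", dispersed)):
    print(name, g_density(seq), longest_g_run(seq))
```

Both sequences report the same density, so density alone cannot distinguish them; the longest G run is what separates the clustered from the dispersed arrangement.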

Here, we find that the region of initiation of R-loops, termed the R-loop initiation zone (RIZ), relies on one or two clusters of a few Gs in the RIZ and, without these, R-loop initiation and R-loop formation are inefficient. We believe that this is because clusters of Gs have higher thermodynamic stability in an RNA-DNA hybrid. This added stability offsets the steric disadvantage that makes the RNA strand inefficient at competing with the nontemplate DNA strand for annealing to the template DNA strand. Downstream of the RIZ, the R-loop elongation zone (REZ) can support extension of the R-loop with merely a high G density and does not require G clustering. The distinction between sequence and stability requirements for the R-loop initiation and elongation zones is essential to permit predictions of where, within transcription units, R-loops may initiate or extend in the genome. The in vitro findings here provide a better understanding of the mechanistic aspects of R-loop formation and fit very well with the in vivo observations of where R-loops initiate and extend.

MATERIALS AND METHODS

Oligonucleotides and plasmid substrates.

All of the substrates were constructed by cloning a small double-stranded insert downstream of a T7 promoter sequence and immediately upstream of the wild-type or modified Ig switch region. For constructing substrates shown in Fig. 1, we digested the parent substrate (pDR18, pDR22, pDR26, or pDR54) with SacI and blunted the 3′ overhangs with T4 DNA polymerase before ligating the short double-stranded inserts containing sequences for motif A, C, or B. Motif A was made by annealing oligonucleotides DR122 (5′-GGTGCTGGGGTAGG-3′) and DR123 (5′-CCTACCCCAGCACC-3′). Motif C was made by annealing DR126 (5′-GGTGCTCGACTACA-3′) and DR127 (5′-TGTAGTCGAGCACC-3′), while motif B was made by annealing DR124 (5′-TGCACTCGATCTAT-3′) and DR125 (5′-ATAGATCGAGTGCA-3′). pDR70 was made by cloning the annealed double-stranded product of oligonucleotides DR118 (5′-TCGAGCGTGCGAGCGCGAGAGCGTGAGTGCGTGAGCGAGCGCGTGAGCGC-3′) and DR119 (5′-TCGAGCGCTCACGCGCTCGCTCACGCACTCACGCTCTCGCGCTCGCACGC-3′) in the XhoI site immediately upstream of the first of the three Sγ3 repeats in pDR51. The clones containing DR122, DR126, DR124, or DR118 sequences in their respective nontemplate DNA strands were selected by sequencing and used for in vitro transcription experiments. DNA purification on CsCl gradients and subsequent procedures were followed as described previously (37).
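As a consistency check on the oligonucleotide pairs listed above, the following sketch confirms that each 14-nt bottom-strand oligonucleotide is the reverse complement of its top-strand partner. The sequences are copied from the text; the function and variable names are our own. The DR118/DR119 pair is omitted because those oligonucleotides carry single-stranded overhangs for cloning into the XhoI site and therefore are not full-length reverse complements of each other.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# (top strand, bottom strand) pairs as given in the text, both written 5' to 3'.
OLIGO_PAIRS = {
    "motif A (DR122/DR123)": ("GGTGCTGGGGTAGG", "CCTACCCCAGCACC"),
    "motif C (DR126/DR127)": ("GGTGCTCGACTACA", "TGTAGTCGAGCACC"),
    "motif B (DR124/DR125)": ("TGCACTCGATCTAT", "ATAGATCGAGTGCA"),
}


def pairs_are_complementary(pairs: dict) -> dict:
    """Map each pair name to True if the bottom strand is the reverse complement of the top."""
    return {
        name: reverse_complement(top) == bottom
        for name, (top, bottom) in pairs.items()
    }


for name, ok in pairs_are_complementary(OLIGO_PAIRS).items():
    print(name, "anneals blunt-ended:", ok)
```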

FIG. 1.

Locations of G in the substrates used for studying the effects of G clusters in the RIZ. As reflected by the substrate names on the left, the substrates are organized in groups of three (A, C, and B). The positions of G on the nontemplate strand are displayed as solid circles. The first set of three substrates (pDR18 set) has clusters of mostly GGGGs in the REZ. The top molecule (pDR18A) has two additional GGGG clusters in the RIZ, which is upstream of the REZ. The second substrate (pDR18C) has one additional GGGG-cluster motif in the RIZ, and the last molecule in the set (pDR18B) has a random sequence in the RIZ of the same length as the other substrates. The pDR22 set, the pDR26 set, and the pDR54 set are represented similarly. The REZ of the pDR22 set contains G clusters, but none with a size more than GGG. The pDR26 substrate only contains GG clusters in the REZ. In the pDR54 set, the REZ contains 49.7% Gs on the nontemplate strand (same as the G density in the REZ of the pDR18 set) but no G clusters. The Gs are distributed over the length of the REZ, with Gs alternating with A, C, or T. The RIZs are of the same length in the different sets of substrates, and the same applies to the REZs. Each G is represented as a solid circle, and other nucleotides are indicated as open circles. The nucleotide positions have been noted below the pDR54B representation.

In vitro transcription.

In vitro transcription, sodium bisulfite treatment, and frequency determination experiments using a colony lift hybridization assay were performed as previously described (37). Briefly, SalI-linearized substrates were mock transcribed or transcribed with T7 RNA polymerase in the presence of [α-32P]UTP for 1 h at 37°C. Transcribed samples were treated with RNase A or RNase A and RNase H1 for one additional hour at 37°C, organically extracted, and electrophoresed. Ethidium bromide staining was done afterward to locate restriction fragments containing the switch regions and the shifted bands. The gels were exposed to phosphorimager screens after pressing and drying as described previously.

Colony lift hybridization for the determination of frequency of R-loop formation.

To calculate the frequencies of R-loop formation, samples of T7 transcribed and RNase A-treated substrates were incubated overnight with sodium bisulfite at 37°C without any denaturation step. This would convert Cs on the single-stranded regions on nontemplate and template DNA in the R-loop conformation. PCR amplification was done on bisulfite-modified DNA with native primers DR050 and DR051 that would anneal outside of the region of interest (37). The PCR fragments were cloned with a TOPO-TA cloning kit, and the white recombinant bacterial colonies were restreaked and lifted onto nylon membranes. The membranes were then transferred to a denaturing solution (0.5 M NaOH, 1.5 M NaCl) for 15 min and then transferred to 1 M Tris (pH 7.5) for 15 min. This was followed by transfer to 1 M Tris (pH 7.5)-1.5 M NaCl for 15 min and a rinse with 2× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate). The DNA was then fixed on the membrane by UV cross-linking. The membranes were then rinsed with 2× SSC and incubated for hybridization with end-labeled oligonucleotide probes. The oligonucleotide probes were designed to anneal to regions of C-to-T conversions but not to unconverted regions in the nontemplate strand derivative clones within the switch regions. DR075 (5′-CAAAACTATCCAACCTGATTCCCATACTC-3′) was used as a probe for detecting nontemplate strand C-to-T converted regions in TA clones of pDR18A, pDR18C, pDR18B, pDR22A, pDR22C, and pDR22B, while DR077 (5′-CCAAAACTATCCAACCTGATTACCATACT-3′) was used for probing pDR26A, pDR26C, and pDR26B. The clones corresponding to positive signals were confirmed by DNA sequencing of the whole PCR insert, and molecules with ≥25-nucleotide (nt) stretches with at least three consecutive C-to-T conversions were considered in the present study to be informative R-loop derivatives. The total number of nontemplate derivatives was determined by sequencing 16 white clones picked randomly. These were then scored for C-to-T conversions on the nontemplate strand. To calculate the frequency, the number of confirmed R-loop clones (picked up by probing the membranes) was divided by the total number of nontemplate derivative clones.
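The scoring criterion for an informative R-loop derivative can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure actually used for scoring: it assumes the clone and the unconverted reference are already aligned without gaps, and it interprets the criterion as a run of consecutive reference Cs that are all read as T in the clone, containing at least three conversions and spanning at least 25 nt. The reference sequence and all names are hypothetical.

```python
def conversion_tracts(reference: str, clone: str) -> list:
    """Return runs of consecutive reference C positions that read as T in the clone.

    Assumes reference and clone are the same length and aligned without gaps.
    A run is broken by any reference C that is not read as T in the clone.
    """
    if len(reference) != len(clone):
        raise ValueError("reference and clone must be aligned to equal length")
    tracts, current = [], []
    for pos, (ref_base, clone_base) in enumerate(zip(reference.upper(), clone.upper())):
        if ref_base != "C":
            continue  # only reference Cs are informative for bisulfite conversion
        if clone_base == "T":
            current.append(pos)
        else:
            if current:
                tracts.append(current)
            current = []
    if current:
        tracts.append(current)
    return tracts


def is_informative(reference: str, clone: str,
                   min_span: int = 25, min_conversions: int = 3) -> bool:
    """True if any tract has >= min_conversions conversions spanning >= min_span nt."""
    for tract in conversion_tracts(reference, clone):
        span = tract[-1] - tract[0] + 1
        if len(tract) >= min_conversions and span >= min_span:
            return True
    return False


# Hypothetical 40-nt nontemplate-strand reference (not a sequence from this study).
REFERENCE = "GGCAGGTAGGCTGGGAGGCA" * 2
fully_converted = REFERENCE.replace("C", "T")  # every C read as T
unconverted = REFERENCE                        # duplex DNA, no conversion

print(is_informative(REFERENCE, fully_converted))
print(is_informative(REFERENCE, unconverted))
```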

RESULTS

G clustering is important for R-loop initiation.

In our previous work, we determined that the density of Gs is not the only parameter that is important for R-loop formation because clustering of Gs yielded more efficient R-looping (37). For example, we noted that for the same G density, clustered Gs supported more R-loop formation than dispersed G-rich regions (7% of molecules with R-loops versus 0.3% [see Fig. 6A in reference 37]). Given this, we wondered whether the addition of one or two additional GGGG sequences at the very beginning of an RNA transcript might increase the percentage of DNA templates that form R-loops. Therefore, we varied the number and location of Gs in the sequence in the manner diagrammed in Fig. 1. Nearly all of the sequence shown in the top line (substrate pDR18A) consists of four ∼49-bp repeats of the Sγ3 switch region, but with the addition of a 16-bp sequence before these four repeats. We use three variations of this 16-bp sequence, called A, C, and B. The A variant of the 16-bp sequence contains two GGGG stretches on the nontemplate strand. The C variant has only one cluster of GGGG, and the B variant has none. We examined how these variants influence R-loop formation upon transcription from the T7 promoter with extension through the 4 repeats immediately downstream.

FIG. 6.

Role of G clusters versus high G density in the RIZ in R-loop formation efficiency. Linearized pDR51 (three Sγ3 G-clustered repeats), pDR54 (four-repeat long and dispersed high G density region without any G clustering), pDR70 (one repeat long dispersed and high G density region followed by three Sγ3 repeats) and pDR18 (four Sγ3 G-clustered repeats) were either mock transcribed (lanes 1, 6, 11, and 16 for pDR51, pDR54, pDR70, and pDR18, respectively) or transcribed with T7 RNA polymerase in the presence of [α-32P]UTP and treated with RNase A afterward (lanes 2 to 4, lanes 7 to 9, lanes 12 to 14, and lanes 17 to 19 in triplicates for pDR51, pDR54, pDR70, and pDR18, respectively). The fifth lane in each set is transcribed sample treated with RNase A and RNase H1 (lanes 5, 10, 15, and 20 for pDR51, pDR54, pDR70, and pDR18, respectively). The first repeat of these substrates (representing the RIZ) has approximately similar G density with dispersed (50% Gs in pDR54 and pDR70) or clustered (45.8% Gs in pDR51 and pDR18) distribution on the nontemplate strand. (A) The top panel is the ethidium-stained gel profile. The switch region containing linear fragment of pDR51 contains three repeats and therefore has a faster gel mobility than the switch region/modified switch region containing fragments of pDR54, pDR70, and pDR18, which have four repeat long regions. The positions of the linearized restriction fragments are marked “L.” The positions of R-loop-induced shifts are marked as “Shift.” (B) [α-32P]UTP radiolabel profile of the same gel shown in panel A after phosphorimager exposure. The shifted positions and the linearized restriction fragments have been marked as “Shift” and “L,” respectively. A concise description of the substrate switch regions is shown below panel B.

For the experiments, we incubated T7 RNA polymerase with purified, linearized plasmid DNA for 1 h at 37°C (see Fig. 1 in reference 37). RNase A is added to degrade free RNA after transcription. The reactions are then organically extracted and analyzed on agarose gels. The RNA generated remains associated with the template DNA in the form of an R-loop in a subset of molecules when the RNA strand is sufficiently G-rich. For these linear templates, the fraction of molecules in an R-loop conformation can be seen as a shifted species due to its slower gel mobility (9, 10, 30, 34, 47, 48).

Based on the ethidium-stained gels (Fig. 2A), we found that upon T7 transcription using pDR18A, pDR18C, and pDR18B, the pDR18A variant exhibits the greatest amount of shift (i.e., R-loop formation), whereas the shift using pDR18C (with one GGGG cluster, abbreviated 1x4G) is comparatively weaker. The amount of shift drops even further for the pDR18B variant, which contains no G clusters. The transcription is done with [α-32P]UTP present. Therefore, after RNase A treatment, the shifted species can be visualized by phosphorimaging (Fig. 2B), whereas the linearized fragment containing the transcribed sequence, but not associated with RNA, remains largely unlabeled. The densitometric analysis of the radiolabeled shifted species also reveals a similar trend in that the radiolabeled shift in the substrate with the A motif (2x4G) is severalfold greater than the shift in the substrate with the B motif (random sequence; no G clusters, or 0x4G). The substrate with the C motif (1x4G) also shows an increase over the B variant. RNase H1 digests the RNA in RNA-DNA hybrids, and treatment of a fraction of transcribed sample with E. coli RNase H1 confirmed the RNA-DNA hybrid nature of the shifted species because these were not observed in the samples so treated (lanes 5, 10, and 15 in Fig. 2A and 2B).

FIG. 2.

Effect of G clusters in the RIZ on R-loop formation. Analysis and map of R-loop molecules in pDR18A, pDR18C, and pDR18B with RIZ motif A (2 × GGGG), C (1 × GGGG), or B (no G clusters) are shown. These plasmids have identical REZ regions. (A) Linearized pDR18A, pDR18C, and pDR18B substrates were either mock transcribed (lanes 1, 6, and 11 for pDR18A, pDR18C, and pDR18B, respectively) or transcribed with T7 RNA polymerase in the presence of [α-32P]UTP and treated with RNase A afterward (lanes 2 to 4, lanes 7 to 9, and lanes 12 to 14 in triplicate for pDR18A, pDR18C, and pDR18B, respectively). The fifth lane in each set is a transcribed sample treated with RNase A and RNase H1 (lanes 5, 10, and 15 for pDR18A, pDR18C, and pDR18B, respectively). The top panel is the ethidium-stained gel profile. The position of the linear fragment containing the switch region is designated “L.” R-loop molecules run slower than “L” and are seen as a shifted band designated as “Shift.” The shifted band is not present in the RNase H-treated lane, confirming the RNA-DNA hybrid nature of the shifted species. (B) [α-32P]UTP radiolabel profile of the same gel shown in panel A after phosphorimager exposure. Most of the radiolabel localizes with the “Shift” bands, but not with the “L” fragments, and is not seen in the mock-transcribed lanes or in the RNase H-treated samples at either position. (C) Representation of single-stranded regions in the DNA nontemplate strand. Transcribed substrates were treated with sodium bisulfite to convert Cs in the single-stranded regions to Us. PCR amplification, cloning, and colony lift hybridization were done to calculate R-loop frequency (also see Table 1) and detect regions of single-strandedness (read as stretches of C-to-T conversions with sequencing). The top line is a diagram of the linearized substrate, showing the T7 promoter, followed by the RIZ sequence A, C, or B upstream (shown as an inverted triangle) of the REZ switch repeats represented as thick arrows. In each set, the first line shows all Cs on the nontemplate strand as vertical lines. Each of the following lines represents an independent nontemplate strand derivative molecule, with vertical lines representing observed C-to-T conversions. Some molecules with R-loop-induced single-stranded stretches of conversion were incomplete for the conversion information on the nontemplate strand, and only the length to which the molecule was informative for the nontemplate strand has been shown. The asterisks mark the position of the internal C in the CCmetA/TGG sequence that gets methylated by bacterial dcm (DNA cytosine methylase) enzyme and therefore remains unconverted upon sodium bisulfite treatment.
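The dcm sites marked by asterisks in the figure can be located in any sequence with a simple pattern search. The sketch below is illustrative only: the 16-nt test sequence is hypothetical, the function name is our own, and positions are reported as 0-based indices of the internal (second) C of each CC(A/T)GG site, which is the cytosine expected to resist bisulfite conversion.

```python
import re

# dcm recognition sequence: CC(A/T)GG, with the internal (second) C methylated.
DCM_SITE = re.compile(r"CC[AT]GG")


def dcm_protected_positions(seq: str) -> list:
    """Return 0-based positions of the internal C in every CC(A/T)GG site."""
    return [match.start() + 1 for match in DCM_SITE.finditer(seq.upper())]


# Hypothetical sequence containing one CCAGG and one CCTGG site.
example = "ATCCAGGTTACCTGGA"
print(dcm_protected_positions(example))
```

When scoring a clone for continuous C-to-T conversion, these positions can be excluded so that a methylated, unconverted C is not mistaken for a break in single-strandedness.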

To calculate the frequency of R-looped molecules and map the sequence location of the R-loops generated in these substrates, we treated the transcribed samples with sodium bisulfite (as described in Materials and Methods) followed by a colony lift hybridization assay (see Materials and Methods) to screen and sequence many molecules for R-loops. We observed the same trend in R-loop frequencies: 12% for the A variant (2x4G), 6% for the C variant (1x4G), and 0.8% for the B variant (0x4G), as measured by colony lift hybridization. We find that nearly all of the R-loops of the A variants begin within a narrow region 10 to 26 nt downstream of the transcription initiation site within the G clusters of the RIZ (Fig. 2C).

The sequence downstream of the RIZ is exactly the same in the three variants (A, C, and B); therefore, the difference in R-loop frequencies is a direct consequence of the composition of the RIZ. Having two G clusters in the RIZ imparts better efficiency than having one, which in turn is better than none. The thermodynamic stability of an RNA-DNA hybrid that initiates at the start of the RIZ and extends to the end of the four-switch region repeats is affected by <1.05-fold in the A or C variants compared to the B variant (see Tables S2 to S4, column 6, in the supplemental material), and yet we see R-looping variation over a 15-fold range between these three substrates. Therefore, the difference in R-loop formation efficiency is due to the differences only in the RIZ rather than the sequences downstream.

G clustering in the initiation zone is important for a wide range of substrates.

We wanted to extend analysis of the RIZ effect on R-loop formation to a range of substrates where the sequence downstream, termed the R-loop elongation zone or REZ, has been mutated. In these mutated forms, all clusters of 4 Gs are reduced to either clusters of 3 Gs (pDR22A, pDR22C, and pDR22B), clusters of 2 Gs (pDR26A, pDR26C, and pDR26B), or only dispersed Gs with no G clusters but relatively high G density (pDR54A, pDR54C, and pDR54B). The RIZ region is one of the 3 sequences described above (A, C, and B). The substrates thus constructed were linearized, transcribed with T7 RNA polymerase and treated with RNase A, or treated with RNase A and RNase H1.

In the substrates with only 3G clusters in the REZ (pDR22A, pDR22C, and pDR22B), we observed that the A variant exhibited a marked improvement in R-loop formation, as seen on the ethidium-stained gel, on the radiolabeled shift, and in the colony lift assay. In the ethidium-stained gel, the shifted species was ca. 5.3% of the total transcribed substrate (Fig. 3A, lanes 2 to 4). Although a distinct shifted species could not be detected for the C or the B variants in the ethidium-stained gel, in a more sensitive assay using the [α-32P]UTP label incorporation in transcribed RNA, the shifted species in the pDR22A variant showed a strong radiolabeled band much greater than the radiolabel level in the shifted region for the C variant, which in turn was greater than that of the B variant, which was nearly at background levels (Fig. 3B). R-loop frequencies calculated by colony lift hybridization assay also showed that the pDR22A variant was the most efficient, with 5.1% of molecules in an R-loop conformation. In pDR22C, 1.2% of molecules had R-loops, which was an improvement over the B variant, in which only 0.6% of molecules were R-looped. The substantial difference (5.1% versus 0.6%) between pDR22A and pDR22B clearly indicates that the G clusters in the RIZ are important for R-loop formation efficiency.

FIG. 3.

Effect of reducing the REZ G clusters from GGGG to GGG. Analysis and maps of R-loop molecules in pDR22A, pDR22C, and pDR22B with RIZ motif A (2 × GGGG), C (1 × GGGG), or B (no G clusters) and an identical REZ with maximum G-cluster size of GGG are shown. (A) Representation is similar to Fig. 2A. Linearized pDR22A, pDR22C, and pDR22B substrates were either mock transcribed (lanes 1, 6, and 11 for pDR22A, pDR22C, and pDR22B, respectively) or transcribed with T7 RNA polymerase in the presence of [α-32P]UTP and treated with RNase A afterward (lanes 2 to 4, lanes 7 to 9, and lanes 12 to 14 in triplicate for pDR22A, pDR22C, and pDR22B, respectively). The fifth lane in each set is a transcribed sample treated with RNase A and RNase H1 (lanes 5, 10, and 15 for pDR22A, pDR22C, and pDR22B, respectively). The top panel is the ethidium-stained gel profile. The position of the linear fragment containing the switch region is designated “L.” The R-loop-induced shift is designated as “Shift.” (B) [α-32P]UTP radiolabel profile of the same gel shown in panel A after phosphorimager exposure. The positions of shifted species and the linearized restriction fragment have been marked as “Shift” and “L,” respectively. (C) Representation of single-stranded regions in the DNA nontemplate strand detected by colony lift hybridization and sequencing after sodium bisulfite treatment. Similar to the description in Fig. 2C, the top line represents the linearized substrate, showing the T7 promoter, followed by the RIZ sequence A, C, or B upstream (shown as an inverted triangle) of the REZ that contains the modified switch repeats (GGG clusters) represented as thick arrows. The first line in each set shows all Cs on the nontemplate strand. Each of the following lines is an independent nontemplate strand derivative molecule, with vertical lines representing observed C-to-T conversions. Some molecules with R-loop-induced single-stranded stretches of conversions were incomplete for the conversion information on the nontemplate strand, and only the length to which the molecule was informative for the nontemplate strand has been shown. The asterisks mark the position of the methylated C in bacterial dcm methylation sites (CCmetA/TGG) that remain unconverted.

Even in the substrate with only 2G clusters in the REZ (pDR26A, pDR26C, and pDR26B), there was a small but discernible indication that G clustering in the RIZ improves R-loop formation. We could not detect any significant shifted species in the A, C, or B variants by ethidium staining or by radiolabel incorporation analysis, indicating that the REZ is not sufficiently capable of extending any R-loops initiated in the RIZ, presumably because of decreased G density of the transcript in the REZ (Fig. 4A and B). However, by colony lift hybridization, we detected one R-loop molecule out of 432 nontemplate strand derivative molecules for pDR26A (Table 1 and Fig. 4C). Also, the pDR54 set of substrates (Fig. 1) further supports a role for G clusters in the RIZ in R-loop efficiency (Fig. 5). These substrates have a high G density in the REZ but no clustering. Compared to the substrate with motif B (0x4G) in the RIZ, motif A (2x4G) or motif C (1x4G) made R-loop formation more efficient (Fig. 5). Therefore, several sets of substrates (pDR22, pDR26, and pDR54 sets) suggest a role for G clustering in the RIZ.

FIG. 4.

Effect of reducing the REZ G clusters from GGGG to GG. Experiments to detect the presence of transcription-induced R-loops in pDR26A, pDR26C, and pDR26B with RIZ motif A (2 × GGGG), C (1 × GGGG), or B (no G clusters) and an identical REZ with maximum G-cluster size of GG are shown. (A) Representation is similar to Fig. 2A. Linearized pDR26A, pDR26C, and pDR26B substrates were either mock transcribed (lanes 1, 6, and 11 for pDR26A, pDR26C, and pDR26B, respectively) or transcribed with T7 RNA polymerase in the presence of [α-32P]UTP and treated with RNase A afterward (lanes 2 to 4, lanes 7 to 9, and lanes 12 to 14 in triplicate for pDR26A, pDR26C, and pDR26B, respectively). The fifth lane in each set is a transcribed sample treated with RNase A and RNase H1 (lanes 5, 10, and 15 for pDR26A, pDR26C, and pDR26B, respectively). The top panel is the ethidium-stained gel profile. The position of the linear fragment containing the switch region is designated “L.” No discernible R-loop-induced shifted species could be located above the linear fragment “L” for pDR26A, pDR26C, or pDR26B. (B) [α-32P]UTP radiolabel profile of the same gel shown in panel A after phosphorimager exposure. No shifted species was observed for the transcribed samples. Similar to panel A, the expected position of the linearized restriction fragment has been marked “L.” (C) Representation of a molecule with a long stretch of single-strandedness in the DNA nontemplate strand detected by colony lift hybridization and sequencing after sodium bisulfite treatment. Similar to the description in Fig. 2C, the top line represents the linearized substrate, showing the T7 promoter, followed by the RIZ sequence A (shown as an inverted triangle) and the REZ with modified switch repeats (GG clusters) represented as thick arrows. The next line in the set shows all Cs on the nontemplate strand. The following line shows the only nontemplate strand-derived molecule detected in an R-loop conformation (of 432 molecules screened; see also Table 1), with vertical lines representing observed C-to-T conversions. The asterisks mark the position of the methylated C in bacterial dcm methylation sites (CCmetA/TGG) that remain unconverted.

TABLE 1.

R-loop formation frequencies in various substrates calculated by colony lift hybridization assay

| Substrate | R-loop initiation zone (RIZ) | R-loop elongation zone (REZ) | No. of molecules with nontemplate strand information | No. of molecules with long stretches of single strandedness on nontemplate strand | Frequency of R-loop formation (%)a |
| --- | --- | --- | --- | --- | --- |
| pDR18A | 2×4G clusters | Four Sγ3 repeats (mostly GGGG clusters) | 154 | 19 | 12.3 |
| pDR18C | 1×4G cluster | Four Sγ3 repeats (mostly GGGG clusters) | 220 | 13 | 5.9 |
| pDR18B | No clusters | Four Sγ3 repeats (mostly GGGG clusters) | 371 | 3 | 0.8 |
| pDR22A | 2×4G clusters | Modified four Sγ3 repeats (maximum cluster size is GGG) | 334 | 17 | 5.1 |
| pDR22C | 1×4G cluster | Modified four Sγ3 repeats (maximum cluster size is GGG) | 411 | 5 | 1.2 |
| pDR22B | No clusters | Modified four Sγ3 repeats (maximum cluster size is GGG) | 1,051 | 6 | 0.6 |
| pDR26A | 2×4G clusters | Modified four Sγ3 repeats (maximum cluster size is GG) | 432 | 1 | 0.2 |
| pDR26C | 1×4G cluster | Modified four Sγ3 repeats (maximum cluster size is GG) | 754 | NDb | <0.1 |
| pDR26B | No clusters | Modified four Sγ3 repeats (maximum cluster size is GG) | 1,066 | ND | <0.1 |

a That is, the percentage of molecules in the R-loop conformation.

b ND, none detected.
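
The frequencies in Table 1 follow directly from the two count columns. The short sketch below recomputes them and the fold difference between pDR18A and pDR18B; the counts are transcribed from the table, while the function names (`rloop_frequency`, `fold_difference`) are hypothetical helpers introduced here for illustration and are not part of the original analysis.

```python
# Counts transcribed from Table 1: (molecules with nontemplate strand
# information, molecules with long single-stranded stretches).
# None stands for "ND" (none detected).
TABLE1_COUNTS = {
    "pDR18A": (154, 19),
    "pDR18C": (220, 13),
    "pDR18B": (371, 3),
    "pDR22A": (334, 17),
    "pDR22C": (411, 5),
    "pDR22B": (1051, 6),
    "pDR26A": (432, 1),
    "pDR26C": (754, None),
    "pDR26B": (1066, None),
}


def rloop_frequency(total, rloops):
    """Return the percentage of molecules in the R-loop conformation."""
    if rloops is None:  # ND: report no numeric frequency
        return None
    return 100.0 * rloops / total


def fold_difference(substrate_a, substrate_b, counts=TABLE1_COUNTS):
    """Ratio of R-loop frequencies for two substrates (hypothetical helper)."""
    freq_a = rloop_frequency(*counts[substrate_a])
    freq_b = rloop_frequency(*counts[substrate_b])
    return freq_a / freq_b


if __name__ == "__main__":
    for name, (total, rloops) in TABLE1_COUNTS.items():
        freq = rloop_frequency(total, rloops)
        shown = "ND" if freq is None else f"{freq:.1f}"
        print(f"{name}: {shown}")
    ratio = fold_difference("pDR18A", "pDR18B")
    print(f"pDR18A vs pDR18B: {ratio:.1f}-fold")
```

Working from the raw counts rather than the rounded percentages avoids compounding rounding error when substrates are compared.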

FIG. 5.

Effect of reducing the REZ G clusters from GGGG to G while maintaining a high overall REZ G density. Experiments are shown analyzing transcription-induced R-loops in pDR54A, pDR54C, and pDR54B with RIZ motif A (2 × GGGG), C (1 × GGGG), or B (no G clusters) and an identical REZ with a high nontemplate strand G density (49.7% Gs, organized as GNGNGN…) with no G clusters. (A) Linearized pDR54A, pDR54C, and pDR54B substrates were either mock transcribed (lanes 1, 6, and 11 for pDR54A, pDR54C, and pDR54B, respectively) or transcribed with T7 RNA polymerase in the presence of [α-32P]UTP and treated with RNase A afterward (lanes 2 to 4, lanes 7 to 9, and lanes 12 to 14 in triplicate for pDR54A, pDR54C, and pDR54B, respectively). The fifth lane in each set is transcribed sample treated with RNase A and RNase H1 (lanes 5, 10, and 15 for pDR54A, pDR54C, and pDR54B, respectively). The top panel is the ethidium-stained gel profile. The position of the linear fragment containing the switch region is designated “L.” The position of the R-loop-induced shift is marked with a bracket and designated “Shift.” (B) [α-32P]UTP radiolabel profile of the same gel shown in panel A after phosphorimager exposure. The shifted position and the linearized restriction fragment have been marked “Shift” and “L,” respectively.

Comparison of roles of G clusters versus G density in the R-loop initiation zone.

In the studies described above, we observed that the addition of G clusters to the RIZ improves R-loop formation. We wondered whether initiation is strictly a function of the number of G clusters in the region immediately downstream of the promoter but upstream of the REZ, or whether a sufficiently high G density in the RIZ can substitute for G clusters. The short motifs used in the previous section were too small to effectively disrupt the G clusters while maintaining the total G content. Therefore, we made a new substrate called pDR70 by inserting one repeat length of 50% dispersed Gs with no G clusters on the nontemplate strand between the T7 promoter and the three Sγ3 repeats in pDR51 (37). For this experiment, we call this region the RIZ simply because it replaces the usual RIZ. The overall length of the switch substrate was kept equivalent to four repeats of Sγ3, but with the first repeat region containing no G clusters and instead having a high G density due to alternating GNGNGN. Thus, pDR70 contains 24 Gs in the 48 nt (50%) that make up the one repeat length of dispersed G region. For comparison, in pDR18 or pDR51, the nontemplate strand G density in the first Sγ3 repeat is 45.8% (22 Gs in the 48-nt repeat), most of which are in G clusters.
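
The distinction between G density and G clustering drawn above can be made concrete with a small sketch. The two 48-nt sequences below are hypothetical toy sequences chosen only to have the same G content with different arrangements; they are not the actual pDR70 or Sγ3 sequences, and the helper names are introduced here for illustration.

```python
import re


def g_density(seq):
    """Fraction of positions that are G."""
    seq = seq.upper()
    return seq.count("G") / len(seq)


def g_run_lengths(seq):
    """Lengths of all maximal runs of consecutive Gs, in order."""
    return [len(run) for run in re.findall(r"G+", seq.upper())]


def count_g_clusters(seq, min_len=4):
    """Number of maximal G runs that are at least min_len long."""
    return sum(1 for n in g_run_lengths(seq) if n >= min_len)


# Hypothetical 48-nt toy sequences (NOT the actual substrates).
DISPERSED = "GA" * 24        # alternating GNGNGN..., 24 Gs, no clusters
CLUSTERED = "GGGGAAAA" * 6   # same G count, every G in a GGGG cluster

if __name__ == "__main__":
    for label, seq in (("dispersed", DISPERSED), ("clustered", CLUSTERED)):
        print(label,
              f"density={g_density(seq):.3f}",
              f"longest run={max(g_run_lengths(seq))}",
              f"GGGG clusters={count_g_clusters(seq)}")
```

Both toy sequences have the same density (24 Gs in 48 nt), so any difference between them in such a comparison reflects the arrangement of the Gs rather than their number.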

In the ethidium-stained gel, transcription-induced R-loop formation was observed as a mobility-shifted species in substrates pDR51 and pDR18 (Fig. 6A and B, lanes 2, 3, and 4 and lanes 17, 18, and 19, respectively), which have G clusters in the Sγ3 repeats immediately downstream of the promoter, but not in pDR70, which has the alternating G region followed by three Sγ3 repeats (Fig. 6, lanes 13, 14, and 15). Comparison of the radiolabel densities at the shifted species shows that pDR70 has ∼17-fold less label intensity than pDR18 (four Sγ3 repeats) and about 10-fold less than pDR51 (three Sγ3 repeats). This illustrates the effect of G clustering, even though there is a smaller effect of distance from the promoter. The values at the shifted position for pDR70 are comparable to those for pDR54, which has a four-repeat-long G-rich region without G clusters (Fig. 6, lanes 7, 8, and 9). Both pDR70 and pDR18 have very similar and high G densities in the first repeat region and differ only in the distribution of the Gs within this region where R-loops would initiate.

These results show that G clustering is extremely important for efficient R-loop initiation. That is, replacing G clusters with unclustered sequences of equivalent G density is inadequate and drastically reduces the efficiency of R-loop formation, even in the presence of downstream G cluster-containing switch regions. Hence, we conclude that high G density without clustering cannot replace the stronger effect of G clusters in R-loop initiation.

A high G density can compensate for an REZ with no G clustering.

Although our studies above focused primarily on the RIZ, our substrates also have significant implications for clustering versus density in the REZ. Comparison of pDR54A with pDR26A is informative in this regard. Both substrates have the same RIZ. The REZ in pDR54A has only isolated Gs (no clusters of even 2 Gs), but it has a relatively high G density. The REZ in pDR26A has many GG clusters (Fig. 1).

We found efficient R-loop formation in pDR54A (4.2%) based on percent shift values from the respective ethidium-stained gels. In contrast, pDR26A has no detectable R-loop formation (compare pDR54A in Fig. 5A and B, lanes 2 to 4, with pDR26A in Fig. 4A and B, lanes 2 to 4). In fact, pDR54A is nearly as efficient in R-loop formation (4.2%) as pDR22A (5.3%), which has many GGG clusters throughout the REZ (Fig. 3A and B, lanes 2 to 4 for pDR22A and Fig. 5A and B, lanes 2 to 4 for pDR54A).

This clearly shows that a high G density in the REZ can support R-loop formation without depending upon G clusters. Therefore, a high G density in the REZ compensates for an REZ with no G clustering.

DISCUSSION

R-loops have been studied in a wide range of in vitro and in vivo prokaryotic and eukaryotic (including mitochondrial) systems. However, mechanistic understanding of where and how they form has been lacking. In the previous in vitro or in vivo work on R-loops from our lab and others, no distinction was made between the zone where R-loops initiate (RIZ) and the remaining zone where the R-loop is maintained (REZ) (37). Although we previously learned a great deal about R-loops, including the threading back of the RNA for R-loop formation, we feel that the distinction between the RIZ and the REZ is critical for understanding where R-loops form and the length over which they extend. The sequence basis for the RIZ and REZ can be reduced to a thermodynamic level, which was not possible previously. In addition, definition of parameters for R-loop initiation and elongation now permits much more specific genome-wide searches for potential R-loop forming regions. Assaying for R-loop forming regions requires estimates not only of where they might initiate but also of how far downstream they might extend.

Mechanism of R-loop initiation.

The studies here markedly improve our understanding of the mechanism of R-loop formation (Fig. 7). During transcription of random sequence by all RNA polymerases, the nontemplate DNA strand is separated from the template DNA strand. The nontemplate DNA strand appears to track along the outside of the RNA polymerase for a length because it is solvent exposed, is susceptible to nucleases, and can be recognized by single-stranded binding proteins (2, 3, 43). The two DNA strands appear to reanneal outside of the upstream side of the RNA polymerase or on the surface of the polymerase in a region where both DNA strands are solvent exposed (23, 44). It is clear that the nascent RNA that exits the RNA polymerase is single stranded, based on its susceptibility to RNase A and T1 (37). R-loop formation upon transcription by phage and mammalian polymerases indicates that the nascent RNA strand can compete with the nontemplate DNA strand for annealing to the template DNA strand as it emerges from the RNA polymerase.

FIG. 7.

Model of R-loop initiation by nucleation at G clusters. (A) This diagram depicts events prior to or without R-loop formation. The two DNA strands separated by the RNA polymerase are reannealing to form a duplex. The drawing is not to scale, and the reannealing of the two DNA strands may occur anywhere between (or on) the surface of the RNA polymerase and some uncertain number of base pairs upstream of the polymerase. The black downward arrow (DNA on) represents the duplex formation propensity of the two DNA strands. Thermodynamically, all RNA-DNA duplexes are more stable than the DNA-DNA duplexes, but the DNA-DNA duplex ultimately prevails because of more favorable proximity of the two DNA strands. The dashed black arrow pointing upwards (DNA off) is the propensity of the DNA duplex to separate into template and nontemplate strands (breathing). The red arrows are the propensities of the transcript to associate (RNA on; red upward arrow) or dissociate (RNA off; red downward arrow) with the template strand, and the red arrows are thinner than the black arrows because the RNA transcript exits the RNA polymerase away from the DNA and consequently is relatively disadvantaged sterically for association with the template DNA strand. (B) Model of initiation of R-loop formation when G clusters are present in the transcript. The association of the RNA with the DNA template strand is strengthened at the RNA-DNA hybrid regions containing G clusters (thereby weakening the RNA-DNA dissociation propensity; dashed red arrow). This happens because of a considerable increase in the local thermodynamic stability of the RNA-DNA hybrid (see Tables S1 to S4 and additional discussion in the supplemental material). This initial hybridization or stable nucleation event provides an increased opportunity for the rest of the transcript to hybridize with the DNA template (depending on the downstream G-density in the REZ). 
The presence of G clusters on the nontemplate strand also increases the breathing of the DNA duplex (now depicted as a solid upward black arrow). Therefore, the RNA-DNA nucleation event may occur after the two DNA strands anneal (on the surface of the upstream side of the RNA polymerase) but then breathe open, thereby allowing the RNA transcript to “invade” and anneal to the template DNA strand. The increased RNA-DNA hybrid length (extension of the R-loop downstream [i.e., REZ]) and the presence of G-richness or other G clusters in the transcript impart greater stability to the R-loop structure because of increased difference between the RNA-DNA thermodynamic stability over the DNA-DNA duplex stability in favor of the RNA-DNA hybrid. Once formed, the R-loop terminates downstream when the difference between the RNA-DNA and DNA-DNA stability is smaller, such that the proximity advantage of the DNA-DNA annealing prevails.

The competition between the RNA and the nontemplate DNA strand for annealing to the template DNA does not have to occur at the instant that all three strands exit the RNA polymerase. Rather, the two DNA strands might anneal to one another at or close to the surface of the RNA polymerase and then breathe open, thereby providing an opportunity for a G cluster of the RNA strand to invade the DNA. Such breathing of the newly reformed DNA duplex is particularly likely because as the two DNA strands anneal to one another on the upstream side of the RNA polymerase, the newly reannealed DNA duplex at this position is effectively a DNA end, and DNA ends are known to undergo substantial breathing (41). Interestingly, breathing of G-rich sequences is maximal when one strand is composed of consecutive Gs and the other is composed of consecutive Cs, just as is the case for the G clusters in the RIZ (41). This may be because such GGG/CCC (or longer) regions adopt a DNA conformation that is intermediate between the B form and the A form, called B/A-intermediate DNA, and the B/A-intermediate conformation is favorable for DNA breathing (41).

All RNA-DNA duplexes are more stable than the DNA-DNA duplexes of the corresponding sequence. However, the DNA-DNA duplex is likely to re-form in random DNA sequences because the nontemplate strand is closer to the template strand than the RNA strand is. Even for linear Ig switch region sequences with 50% G density in clusters, only a small minority of substrates assume an R-loop conformation. Therefore, despite the thermodynamic advantage of RNA-DNA over DNA-DNA, the proximity advantage of the nontemplate DNA strand is a dominant factor in favoring DNA-DNA over RNA-DNA. Consequently, the initial nucleation site of the thread-back RNA must have maximum stability via as many Gs as possible in a short length (because there is not sufficient length of DNA strand separation to permit a long segment of RNA to bind). It is interesting in this regard that most of the R-loops in substrates with two G clusters initiate within that two-G-cluster RIZ, whereas all but one R-loop molecule in substrates with zero RIZ G clusters started downstream of the RIZ, mostly at or inside the switch regions (REZ). Although the one-G-cluster motif in the RIZ results in increased R-loop formation compared to the zero-G-cluster substrates, the R-loop start sites on these substrates are more varied, pointing to the intermediate stability of R-loops initiated at a one-G-cluster RIZ.

When we compare the sequences of pDR18A and pDR18B, the two substrates differ by two extra G clusters in the RIZ of pDR18A that are not present in pDR18B. This is a comparatively small change in the overall sequence. However, the R-loop formation efficiency of pDR18A is ∼15-fold greater than that of pDR18B. This disproportionate increase in pDR18A is observed because the 2x4G clusters in pDR18A are closer to the 5′ end of the transcript, whereas the position of the first G cluster in pDR18B is internal in the transcript. The transcript length is the same in the two substrates. Due to the higher mobility of the nucleotide positions near the 5′ terminus compared to the internal positions, the 5′ ends of the transcripts have a higher collision frequency with the template DNA strand. Thus, G clusters present toward the free 5′ end of the RNA transcript have a higher probability of nucleating an R-loop initiation event. Once initiated, the overall stability of the R-loop is also higher in the A variant because of a higher local G content around this region (more G clusters in pDR18A than in pDR18B; see Fig. 1), thereby further reducing the RNA dissociation propensity (Fig. 7). Thus, whereas the stability factors (total G content) are additive in nature, the R-loop initiation factors (increased molecular mobility driving higher collision frequency and RNA-DNA nucleation upstream of the polymerase) are much more than additive.

Thermodynamic considerations.

It is useful to consider the thermodynamic aspects to fully appreciate the mechanism of RNA-DNA hybrid formation in R-loops. We observe a manyfold improvement in R-looping when the RIZ contains one or two clusters of GGGG. This improvement correlates much better with the stability of annealing in the RIZ than with that in the entire R-loop region (RIZ + REZ). The efficiency of R-looping increases as the length and number of G clusters increase. As mentioned earlier, all RNA-DNA duplexes are stronger than DNA-DNA duplexes (38; http://ozone3.chem.wayne.edu). The addition of a motif containing two GGGG clusters or one GGGG cluster can be calculated to improve the local strength of annealing (ΔG) of the RIZ substantially (∼50% or 1.5-fold with the 2x4G motif and ∼25% or 1.25-fold with the 1x4G motif; see Table S1, column 4, in the supplemental material). The RIZ is so small relative to the REZ that the strength of annealing for the entire RNA-DNA hybrid is increased relatively little (<5% or <1.05-fold; see Tables S2 to S4, column 6, in the supplemental material) compared to a random sequence in the RIZ. Therefore, the stability of the RNA-DNA hybrid in the RIZ is the key factor for R-loop initiation and a key factor for overall R-loop formation efficiency (Fig. 7).
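
The relation between the local (∼50%) and overall (<5%) improvements can be checked with a rough calculation. This is only an illustrative sketch: it assumes that annealing free energies add along the hybrid, as in nearest-neighbor models, and the symbols f (the fraction of the total annealing free energy contributed by the RIZ) and a (the fractional gain in RIZ annealing strength) are introduced here and do not appear in the original analysis.

```latex
\begin{align}
\Delta G_{\mathrm{total}} &= \Delta G_{\mathrm{RIZ}} + \Delta G_{\mathrm{REZ}},
  \qquad f = \frac{\Delta G_{\mathrm{RIZ}}}{\Delta G_{\mathrm{total}}} \\
\Delta G_{\mathrm{total}}' &= (1 + a)\,\Delta G_{\mathrm{RIZ}} + \Delta G_{\mathrm{REZ}}
  = \Delta G_{\mathrm{total}}\,(1 + a f) \\
a f &< 0.05 \;\text{ and }\; a \approx 0.5 \;\Longrightarrow\; f < 0.1
\end{align}
```

Under this assumption, the RIZ would contribute less than about one-tenth of the total annealing free energy, which is in line with the statement that the RIZ is small relative to the REZ.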

In the R-loop initiation, as well as elongation zones, the thermodynamic stability of DNA-DNA that has clusters of Gs is weaker than for dispersed Gs (see discussion and tables in the supplemental material). This is a reflection of the B/A-intermediate DNA structure mentioned above for regions of GGG/CCC (41). As mentioned above for R-loop initiation, this would favor DNA-DNA strand separation in the G-clustered regions, thereby making it somewhat easier for the proximity-disadvantaged nascent RNA to anneal to the template DNA strand and thereby initiate a nucleation event.

Mechanism of R-loop elongation.

Once an initial RNA-DNA nucleation site forms (typically in the RIZ in the present study), local RNA-DNA is more stable than DNA-DNA, as usual. Then, elongation is purely a reflection of the stability of RNA-DNA being substantially stronger than the stability of DNA-DNA. The R-loop terminates as the difference in stability of RNA-DNA and DNA-DNA gets smaller. At this point, the proximity advantage for the annealing of the nontemplate DNA to the template DNA eventually prevails, and the R-loop elongation ends.

The thermodynamic stability of clustered G duplexes in a DNA duplex is weaker than that of dispersed sequences such as GNGNGN. The clustered G duplexes may have better intrastrand base stacking at the expense of interstrand interaction. A greater tendency of these G-clustered sequences to breathe (as mentioned above) may also make the DNA-DNA component weaker. The majority of the difference between the RNA-DNA and DNA-DNA components is contributed by this DNA-DNA interaction effect rather than the RNA-DNA interaction within zones that contain G clusters.

Relevance of R-loop initiation and elongation zones in vivo.

In light of our inferences here, one might wonder why mammalian switch regions evolved to have G clustering throughout, rather than only at the beginning. R-loop initiation is a stochastic process and is not 100% efficient at the first G cluster. In fact, on linear substrates with four Sγ3 repeats, less than 10% of the molecules are in an R-loop conformation at any one time. Hence, additional G clusters further downstream improve the R-loop formation efficiency overall. Therefore, it is not surprising that the Ig switch regions contain G-clustered repeats throughout their repetitive zone. Mapping of R-loop positions in vivo shows that the initiation point varies considerably (47), a finding consistent with what we see in vitro for partial switch regions (37).

The findings here provide a basis for understanding the R-loop initiation and extension seen upstream of the Igμ switch region in vivo (17). A strong RIZ followed downstream by a very weak REZ may be insufficient to remain stable as an R-loop. An in vivo example of this may be the 50-bp G-clustered (50% G-dense on the nontemplate strand) peak upstream of the Sμ repetitive region. This might be insufficient to initiate stable R-looping in the wild-type allele or an allele that deletes the core Sμ repeats (called ΔSμTR in reference 17) if not for the region downstream of it (REZ), which is G-dense but relatively unclustered. Therefore, the observations here for R-loop initiation and elongation are likely to have predictive value in assessing transcription units for their propensity for R-loop initiation and elongation.
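
As a loose illustration of how the RIZ and REZ criteria might be combined when assessing a transcription unit, the sketch below flags positions on a nontemplate strand sequence where a window containing at least one G cluster is followed by a G-dense window. Every window length and threshold here is a hypothetical placeholder chosen for the example, not a parameter derived from this study, and the function `find_candidate_rloop_regions` is an illustrative name only.

```python
import re


def g_density(seq):
    """Fraction of positions that are G (0.0 for an empty sequence)."""
    return seq.count("G") / len(seq) if seq else 0.0


def find_candidate_rloop_regions(seq, riz_len=50, min_clusters=1,
                                 cluster_len=4, rez_len=200,
                                 min_rez_density=0.4, step=10):
    """Return start positions where a clustered window (RIZ-like) is
    followed by a G-dense window (REZ-like) on the nontemplate strand.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    seq = seq.upper()
    cluster_re = re.compile("G{%d,}" % cluster_len)
    hits = []
    for start in range(0, len(seq) - riz_len - rez_len + 1, step):
        riz = seq[start:start + riz_len]
        rez = seq[start + riz_len:start + riz_len + rez_len]
        n_clusters = len(cluster_re.findall(riz))
        if n_clusters >= min_clusters and g_density(rez) >= min_rez_density:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy sequences only: one GGGG cluster in a 50-nt window, followed by
    # either a G-dense (50% G, unclustered) or a G-free 200-nt window.
    riz_like = "GGGGAT" + "AT" * 22
    print(find_candidate_rloop_regions(riz_like + "GA" * 100))
    print(find_candidate_rloop_regions(riz_like + "AT" * 100))
```

Requiring both windows mirrors the conclusion that neither a favorable initiation zone nor a favorable elongation zone is sufficient on its own; a real search would need thresholds calibrated against experimental data.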

Acknowledgments

We thank Chih-Lin Hsieh and members of the Lieber laboratory for discussion. We thank Albert Tsai of our lab for the use of the bisulfite plotter for the R-loop maps.

This study was supported by the National Institutes of Health (M.R.L.). D.R. is a recipient of a USC Heidelberger Award and a CBM Training Grant Award.

Footnotes

Published ahead of print on 23 March 2009.

Supplemental material for this article may be found at http://mcb.asm.org/.

REFERENCES

1. Aguilera, A., and B. Gomez-Gonzalez. 2008. Genome instability: a mechanistic view of its causes and consequences. Nat. Rev. Genet. 9:204-217.
2. Artsimovitch, I., and R. Landick. 2002. The transcriptional regulator RfaH stimulates RNA chain synthesis after recruitment to elongation complexes by the exposed nontemplate DNA strand. Cell 109:193-203.
3. Bandwar, R. P., N. Ma, S. A. Emanuel, M. Anikin, D. G. Vassylyev, S. S. Patel, and W. T. McAllister. 2007. The transition to an elongation complex by T7 RNA polymerase is a multistep process. J. Biol. Chem. 282:22879-22886.
4. Barreto, V. M., Q. Pan-Hammarstrom, Y. Zhao, L. Hammarstrom, Z. Misulovin, and M. C. Nussenzweig. 2005. AID from bony fish catalyzes class switch recombination. J. Exp. Med. 202:733-738.
5. Bottaro, A., R. Lansford, L. Xu, J. Zhang, P. Rothman, and F. W. Alt. 1994. S region transcription per se promotes basal IgE class switch recombination but additional factors regulate the efficiency of the process. EMBO J. 13:665-674.
6. Bransteitter, R., P. Pham, M. D. Scharff, and M. F. Goodman. 2003. Activation-induced cytidine deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of RNase. Proc. Natl. Acad. Sci. USA 100:4102-4107.
7. Chaudhuri, J., U. Basu, A. Zarrin, C. Yan, S. Franco, T. Perlot, B. Vuong, J. Wang, R. T. Phan, A. Datta, J. Manis, and F. W. Alt. 2007. Evolution of the immunoglobulin heavy chain class switch recombination mechanism. Adv. Immunol. 94:157-214.
8. Chaudhuri, J., M. Tian, C. Khuong, K. Chua, E. Pinaud, and F. W. Alt. 2003. Transcription-targeted DNA deamination by the AID antibody diversification enzyme. Nature 422:726-730.
9. Daniels, G. A., and M. R. Lieber. 1995. RNA:DNA complex formation upon transcription of immunoglobulin switch regions: implications for the mechanism and regulation of class switch recombination. Nucleic Acids Res. 23:5006-5011.
10. Drolet, M., S. Broccoli, F. Rallu, C. Hraiky, C. Fortin, E. Masse, and I. Baaklini. 2003. The problem of hypernegative supercoiling and R-loop formation in transcription. Front. Biosci. 8:d210-221.
11. Dunnick, W. A., G. Z. Hertz, L. Scappino, and C. Gritzmacher. 1993. DNA sequence at immunoglobulin switch region recombination sites. Nucleic Acids Res. 21:365-372.
12. Duquette, M. L., P. Handa, J. A. Vincent, A. F. Taylor, and N. Maizels. 2004. Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev. 18:1618-1629.
13. Flajnik, M. F., K. Miller, and L. D. Pasquier. 2003. Evolution of the immune system, p. 519-570. In W. E. Paul (ed.), Fundamental immunology. Lippincott, Philadelphia, PA.
14. Gonzalez-Aguilera, C., C. Tous, B. Gomez-Gonzalez, P. Huertas, R. Luna, and A. Aguilera. 2008. The THP1-SAC3-SUS1-CDC31 complex works in transcription elongation-mRNA export preventing RNA-mediated genome instability. Mol. Biol. Cell 19:4310-4318.
15. Gritzmacher, C. A. 1989. Molecular aspects of heavy-chain class switching. Crit. Rev. Immunol. 9:173-200.
16. Harriman, G. R., A. Bradley, S. Das, P. Rogers-Fani, and A. C. Davis. 1996. IgA class switch in Iα exon-deficient mice. J. Clin. Investig. 97:477-485.
17. Huang, F.-T., K. Yu, B. B. Balter, E. Selsing, Z. Oruc, A. A. Khamlichi, C.-L. Hsieh, and M. R. Lieber. 2007. Sequence dependence of chromosomal R-loops at the immunoglobulin heavy-chain Sμ class switch region. Mol. Cell. Biol. 27:5921-5932.
18. Huertas, P., and A. Aguilera. 2003. Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination. Mol. Cell 12:711-721.
19. Ichikawa, H. T., M. P. Sowden, A. T. Torelli, J. Bachl, P. Huang, G. S. Dance, S. H. Marr, J. Robert, J. E. Wedekind, H. C. Smith, and A. Bottaro. 2006. Structural phylogenetic analysis of activation-induced deaminase function. J. Immunol. 177:355-361.
20. Jimeno, S., A. G. Rondon, R. Luna, and A. Aguilera. 2002. The yeast THO complex and mRNA export factors link RNA metabolism with transcription and genome instability. EMBO J. 21:3526-3535.
21. Jung, S., K. Rajewsky, and A. Radbruch. 1993. Shutdown of class switch recombination by deletion of a switch region control element. Science 259:984-987.
22. Kaneko, S., C. Chu, A. J. Shatkin, and J. L. Manley. 2007. Human capping enzyme promotes formation of transcriptional R loops in vitro. Proc. Natl. Acad. Sci. USA 104:17620-17625.
23. Korzheva, N., A. Mustaev, M. Kozlov, A. Malhotra, V. Nikiforov, A. Goldfarb, and S. A. Darst. 2000. A structural model of transcription elongation. Science 289:619-625.
24. Lee, D. Y., and D. A. Clayton. 1998. Initiation of mitochondrial DNA replication by transcription and R-loop processing. J. Biol. Chem. 273:30614-30621.
25. Li, X., and J. L. Manley. 2006. Cotranscriptional processes and their influence on genome stability. Genes Dev. 20:1838-1847.
26. Li, X., and J. L. Manley. 2005. Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell 122:365-378.
27. Li, X., and J. L. Manley. 2005. New talents for an old acquaintance: the SR protein splicing factor ASF/SF2 functions in the maintenance of genome stability. Cell Cycle 4:1706-1708.
28. Liu, M., J. L. Duke, D. J. Richter, C. G. Vinuesa, C. C. Goodnow, S. H. Kleinstein, and D. G. Schatz. 2008. Two levels of protection for the B-cell genome during somatic hypermutation. Nature 451:841-845.
29. Longerrich, S., U. Basu, F. Alt, and U. Storb. 2006. AID in somatic hypermutation and class switch recombination. Curr. Opin. Immunol. 18:164-174.
30. Masse, E., P. Phoenix, and M. Drolet. 1997. DNA topoisomerases regulate R-loop formation during transcription of the rrnB operon in Escherichia coli. J. Biol. Chem. 272:12816-12823.
31. Muramatsu, M., H. Nagaoka, R. Shinkura, N. A. Begum, and T. Honjo. 2007. Discovery of activation-induced cytidine deaminase, the engraver of antibody memory. Adv. Immunol. 94:1-36.
32. Pham, P., R. Bransteitter, J. Petruska, and M. F. Goodman. 2003. Processive AID-catalyzed cytosine deamination on single-stranded DNA stimulates somatic hypermutation. Nature 424:103-107.
33. Ramiro, A. R., P. Stavropoulos, M. Jankovic, and M. C. Nussenzweig. 2003. Transcription enhances AID-mediated cytidine deamination by exposing single-stranded DNA on the nontemplate strand. Nat. Immunol. 4:452-456.
34. Reaban, M. E., and J. A. Griffin. 1990. Induction of RNA-stabilized DNA conformers by transcription of an immunoglobulin switch region. Nature 348:342-344.
35. Reaban, M. E., J. Lebowitz, and J. A. Griffin. 1994. Transcription induces the formation of a stable RNA.DNA hybrid in the immunoglobulin alpha switch region. J. Biol. Chem. 269:21850-21857.
36. Ronai, D., M. D. Iglesias-Ussel, M. Fan, Z. Li, A. Martin, and M. D. Scharff. 2007. Detection of chromatin-associated single-stranded DNA in regions targeted for somatic hypermutation. J. Exp. Med. 204:181-190.
37. Roy, D., K. Yu, and M. R. Lieber. 2008. Mechanism of R-loop formation at immunoglobulin class switch sequences. Mol. Cell. Biol. 28:50-60.
38. Santa Lucia, J. 1998. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl. Acad. Sci. USA 95:1460-1465.
39. Shinkura, R., M. Tian, C. Khuong, K. Chua, E. Pinaud, and F. W. Alt. 2003. The influence of transcriptional orientation on endogenous switch region function. Nat. Immunol. 4:435-441.
40. Stavnezer, J., and C. T. Amemiya. 2004. Evolution of isotype switching. Semin. Immunol. 16:257-275.
41. Tsai, A. G., A. E. Engelhart, M. M. Hatmal, S. I. Houston, N. V. Hud, I. S. Haworth, and M. R. Lieber. 2009. Conformational variants of duplex DNA correlated with cytosine-rich chromosomal fragile sites. J. Biol. Chem. 284:7157-7164.
42. Wakae, K., B. G. Magor, H. Saunders, H. Nagaoka, A. Kawamura, K. Kinoshita, T. Honjo, and M. Muramatsu. 2006. Evolution of class switch recombination function in fish activation-induced cytidine deaminase, AID. Int. Immunol. 18:41-47.
43. Wang, D., and R. Landick. 1997. Nuclease cleavage of the upstream half of the nontemplate strand DNA in an Escherichia coli transcription elongation complex causes upstream translocation and transcriptional arrest. J. Biol. Chem. 272:5989-5994.
44. Westover, K. D., D. A. Bushnell, and R. D. Kornberg. 2004. Structural basis of transcription: separation of RNA from DNA by RNA polymerase II. Science 303:1014-1016.
45. Xu, B., and D. A. Clayton. 1995. A persistent RNA-DNA hybrid is formed during transcription at a phylogenetically conserved mitochondrial DNA sequence. Mol. Cell. Biol. 15:580-589.
46. Xu, L., B. Gorham, S. C. Li, A. Bottaro, F. W. Alt, and P. Rothman. 1993. Replacement of germ-line ɛ promoter by gene targeting alters control of immunoglobulin heavy chain class switching. Proc. Natl. Acad. Sci. USA 90:3705-3709.
47. Yu, K., F. Chedin, C.-L. Hsieh, T. E. Wilson, and M. R. Lieber. 2003. R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells. Nat. Immunol. 4:442-451.
48. Yu, K., and M. R. Lieber. 2003. Nucleic acid structures and enzymes in the immunoglobulin class switch recombination mechanism. DNA Repair 2:1163-1174.
49. Zhang, J., A. Bottaro, S. Li, V. Stewart, and F. W. Alt. 1993. A selective defect in IgG2b switching as a result of targeted mutation of the I gamma 2b promoter and exon. EMBO J. 12:3529-3537.
50. Zhao, Y., Q. Pan-Hammarstrom, Z. Zhao, and L. Hammarstrom. 2005. Identification of the activation-induced cytidine deaminase gene from zebrafish: an evolutionary analysis. Dev. Comp. Immunol. 29:61-71.

Articles from Molecular and Cellular Biology are provided here courtesy of Taylor & Francis
