Non-standard events during the elongation phase of transcription can either enrich gene expression or contribute to erroneous and wasteful expression. An example of the former is selection for reverse transcriptase-mediated multiple alternative base substitutions to lead to pathogen surface variability to evade host defenses (1). A different type of productive non-standard polymerase action involves realignment of the template:product hybrid at a slippage-prone sequence to yield product with extra or fewer base(s) than present in the corresponding template sequence (2). This has been studied with DNA-dependent DNA polymerases, DNA-dependent RNA polymerases, RNA-dependent RNA polymerases and RNA-dependent DNA polymerases (reverse transcriptases, RTs).
Evolutionarily selected transcription slippage is utilized in the expression of viruses such as the Paramyxoviruses, Sendai virus and Parainfluenza virus (3,4), the Filovirus, Ebola virus (5–7), the large Potyviridae family (8–10), chromosomal genes such as Thermus thermophilus dnaX (11), numerous genes in an endosymbiont (12), a variety of bacterial Insertion Sequences (13–16), several medically important plasmid genes of Shigella flexneri (17–19), and counterpart chromosomal toxin secretion genes in Citrobacter rodentium and Yersinia pseudotuberculosis (20). Further, an extensive bioinformatic analysis of bacterial genomes has revealed many candidates that have yet to be experimentally explored (13,15,16). Transcriptional indel errors are relevant to certain disease states (21–25) and may be significant for aging (26,27).
One common bacterial type of transcriptional slippage-prone sequence involves 9 or more A’s or T’s (28); other repeats have also been analyzed (29). Dissociation of the nascent RNA from its template hybrid complement allows realigned pairing in either direction. A well known Paramyxovirus heteropolymeric slippage motif is composed of A’s followed by G’s with the identity of the mis-paired base in the new re-aligned hybrid being important in determining slippage directionality (30,31). Nearly all work has focused on slippage involving a linear (unstructured) template. However, there is evidence that a protein roadblock or template structure ahead of a DNA-dependent RNA polymerase transcribing a slippage motif can stimulate realignment (2,32,33). Also there is one report of roadblock-mediated RT slippage where a polymerase bypasses an RNA-structure forming sequence prior to resumption of synthesis (34). The present work does not explore RT generation of product lacking sequence complementary to template sequence present in RNA structure.
Despite these studies of transcription slippage, and several studies of reverse transcriptase fidelity including (34–37), significant issues concerning RT-mediated indel formation merit investigation. The use of RTs as lab reagents is one of the reasons why their slippage propensity is of interest (38). However, RTs do not contain 3′ exonuclease proofreading activity and their templates are prone to form structures at the ambient temperature at which these enzymes act. Better reagent polymerases have been developed from thermophilic DNA-dependent DNA polymerases by adapting their catalytic activity to function with RNA templates (39–41), or derived from existing RT polymerases by genetic engineering (42). Though the derived enzymes have the beneficial quality of lower base mis-incorporation due to higher accuracy for substrate selection (39), their indel fidelity remains to be explored.
The natural functional utilization of reverse transcriptase activities also enhances interest in deeper understanding of their propensity for indel formation. Reverse transcriptase activities are naturally essential for retroviruses and retrotransposons, CRISPR spacer acquisition from RNA as a defense mechanism (43), and maintenance of chromosome ends (44). Further, the retron reverse transcriptase that yields msDNA (45) is significant for bacterial pathogenicity and colonization (46).
Here, we analyze the slippage propensity of different retroviral RTs as well as a retrotransposon counterpart. This study involves utilization of identical test sequences encompassing relevant stimulatory features and specific slippage-prone motifs. In addition, specific slippage candidate cassettes for natural RT slippage were also tested with their relevant RT enzymes.
The starting point for the present work was an unexpected result from a control for experiments in which reverse transcriptase slippage would confound the issue being addressed. A 6bp-stem 4nt-loop nascent transcript structure (here named ‘model’ stem–loop) stimulates E. coli DNA-dependent RNA polymerase transcriptional realignment at a 3′-A5G5-5′ motif which on its own is an inefficient slippage site (47). Analysis of the product RNA generated in that study involved reverse transcription by SuperScript™ III (derived from Moloney Murine Leukemia Virus RT). For the experiments included in that publication, the controls to distinguish whether indels in its product DNA derived from the initial DNA-dependent RNA polymerase step, or subsequently from product cDNA reverse transcriptase slippage, revealed no reverse transcriptase slippage. Follow-up work tested potential nascent RNA stem–loop structure stimulation of slippage at runs of A shorter than 9, the minimal needed for efficient slippage at such motifs. In this unpublished work, a significant proportion of the reverse transcriptase product of one 75 nt chemically synthesized RNA template with an inverted repeat with potential to form the ‘model’ stem–loop 5′ adjacent to 8 A’s, had an extra T. This control experiment prompted the present investigation of RNA template stem–loop structure-mediated reverse transcriptase realignment.
RNA template constructs
Preparation of RNA templates (quadruplex cassettes) with T7 RNA polymerase is described in Supplementary Methods. Chemically synthesized RNA templates and DNA oligonucleotides were from IDT-DNA (Supplementary Table S1).
Retroviral reverse transcriptase enzymes were purchased as follows: SuperScript™ III (Invitrogen), AMV (Biolabs), M-MuLV (Biolabs), HIV-1 RT and HIV-2 RT (Abcam) and the retrotransposon TGIRT enzyme (InGex). Unless otherwise indicated in the main text, RT reactions with the SuperScript™ III, HIV-1 and HIV-2 RT enzymes were in 1× SuperScript™ III buffer (50 mM Tris–HCl, pH 8.3 at 25°C, 75 mM KCl, 3 mM MgCl2, 5 mM DTT). The AMV buffer was 50 mM Tris–HCl pH 8.3 at 25°C, 75 mM KOAc, 8 mM Mg(OAc)2, 10 mM DTT, and the M-MuLV buffer was 50 mM Tris–HCl pH 8.3 at 25°C, 75 mM KCl, 3 mM MgCl2, 10 mM DTT. For the TGIRT reactions two alternative buffers were used. The buffer for the template switching reaction contained 450 mM NaCl, 5 mM MgCl2, 20 mM Tris–HCl pH 7.5 (48). The buffer for testing retrotransposon slippage candidate cassettes contained 75 mM KCl, 10 mM MgCl2, 20 mM Tris–HCl pH 7.5 (49).
RT reactions with retroviral reverse transcriptases involved a pre-annealing step of the RNA template (100 ng) with the DNA primer (2 pmol) (Supplementary Table S1), in the presence of the dNTP substrates (with the specific concentration of each indicated in the main text), and with or without antisense where indicated (2, 20 or 200 pmol), in 10 μl reaction volumes. With wtSL or MUTsl RNA templates, incubation was at 65°C for 5 min before chilling on ice. For G-rich RNA templates with potential to form structures larger than the model stem–loop wtSL, the annealing mix contained in addition 10 mM KCl, and the annealing step was at 95°C for 30 s followed by a 1°C temperature decrease every 30 s (from 95°C to 16°C). On completion, one of several different 10 μl reaction mixes was added and incubated for 50 min at the temperature indicated. The SuperScript™ III reaction mix contained 100 units of enzyme, 1× SuperScript™ III buffer and 20 mM DTT; incubation was at 52°C. The AMV reaction mix contained 10 units of enzyme and 1× AMV buffer; incubation was at 37°C. The MuLV reaction mix contained 10 units of enzyme and 1× MuLV buffer; incubation was at 37°C. The HIV-1 reaction mix contained 4 units of enzyme (1.7 pmol), 1× SuperScript™ III buffer and 20 mM DTT; incubation was at 37°C. The HIV-2 reaction mix contained 0.2 units of enzyme (1.7 pmol), 1× SuperScript™ III buffer and 20 mM DTT; incubation was at 37°C. On completion, a further incubation at 85°C for 5 min followed.
TGIRT RT reactions involving template switching are described in Supplementary Methods. Analysis of the candidate retrotransposon slippage cassettes was performed using a specific DNA primer complementary to the 3′ end segment of the test RNA. A mix of 100 ng RNA with 4 μl 10 μM specific primer and 10 μl 2× TGIRT ‘low salt’ buffer in a total volume of 18 μl, was incubated at 65°C for 5 min and chilled on ice. Then 1 μl 10 μM TGIRT enzyme was added. The premix was pre-incubated at room temperature for 30 min. Reaction was initiated by adding substrate dNTPs as indicated in the text and incubated for 10 min at room temperature. In the final 20 μl reaction mix, the final concentration of primer was 2 μM and of TGIRT enzyme was 500 nM. Then, 1 μl 5 M NaOH was added and incubated at 95°C for 3 min. It was neutralized with 1 μl 5 M HCl. cDNA was then purified with a silica-based column following the procedure described in Supplementary Methods. Elution was with 20 μl RNase free water (Supplementary Table S1).
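The final concentrations quoted above follow from simple dilution arithmetic (C1·V1 = C2·V2). The short Python sketch below is only a consistency check on the stated volumes and stock concentrations; the helper name `final_conc_uM` is hypothetical and not part of the published protocol.

```python
def final_conc_uM(stock_uM: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution (C1*V1 = C2*V2), in micromolar."""
    return stock_uM * added_ul / total_ul


TOTAL_UL = 20.0  # final reaction volume stated in the text

# 4 ul of 10 uM primer and 1 ul of 10 uM TGIRT enzyme in a 20 ul reaction
primer_uM = final_conc_uM(stock_uM=10.0, added_ul=4.0, total_ul=TOTAL_UL)
tgirt_nM = final_conc_uM(stock_uM=10.0, added_ul=1.0, total_ul=TOTAL_UL) * 1000.0

print(f"primer: {primer_uM:.1f} uM")
print(f"TGIRT:  {tgirt_nM:.0f} nM")
```

Both values agree with the 2 μM primer and 500 nM enzyme concentrations given in the text.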
Polymerase chain reaction
Each specific cDNA was amplified using the corresponding set of forward and reverse primers (Supplementary Table S1). Standard PCR reactions were 50 μl volume and contained: 1× Thermo buffer (Biolabs), 2 μl cDNA or 4 nM DNA oligo, 200 μM each dNTP (Biolabs), 500 nM each specific primer, and 0.8 unit Taq DNA polymerase (Biolabs). The PCR cycle was: denaturation at 94°C for 5 min, then 25 cycles of denaturation at 94°C for 30 s, annealing at 52°C for 30 s and elongation at 72°C for 30 s. This was followed by a final elongation at 72°C for 1 min.
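For readers who want to transfer this program to a thermocycler, the cycle can be written down as a small data structure. The Python sketch below is one possible representation, not a vendor format; the `Step` class and variable names are assumptions made for illustration. It also sums the programmed hold times, which exclude instrument ramp time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the PCR cycle described in the text
INITIAL = Step("initial denaturation", 94, 5 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 52, 30),
    Step("elongation", 72, 30),
)
FINAL = Step("final elongation", 72, 60)
N_CYCLES = 25

# Total programmed hold time in seconds (ramping between temperatures not included)
hold_seconds = INITIAL.seconds + N_CYCLES * sum(s.seconds for s in CYCLE) + FINAL.seconds
print(f"programmed hold time: {hold_seconds / 60:.1f} min (ramp time excluded)")
```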
Limited primer extension
IRD700 fluorescent 5′-labeled oligonucleotides were from IDT-DNA. The standard limited primer extension reaction was in 12.5 μl volume with 1× Thermo buffer (Biolabs), 12 nM of a specific IRD700-labeled fluorescent primer (IDT-DNA, Supplementary Table S1), a mix of 1 μM of three dNTPs with the missing dNTP replaced by the corresponding acyclic chain terminator (acyNTP) (Biolabs) at 50 μM, and 0.6 unit of Vent exo− polymerase (Biolabs). The quantity of (RT)-PCR template was about three times lower than that of the fluorescent primer. On average each primer molecule is utilized on 20 occasions for chain extension during the 60-cycle PCR reactions. The PCR cycle was: denaturation at 94°C for 2 min, then 60 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s and elongation at 72°C for 30 s. The final elongation was at 72°C for 2 min. In all cases each RT reaction and its subsequent analysis was repeated at least twice. Reaction products were analyzed on 15% sequencing gels. Image capture was performed with a LiCor Sequencer.
RNA template stem–loop is a key factor for reverse transcriptase slippage directionality on 7A’s and 6A’s
Initial experiments investigating a possible role for stem–loops in stimulating indel formation utilized SuperScript™ III, the widely used genetically engineered RT, and two chemically synthesized 75 nt RNA constructs containing 7A’s. These specified the WT, or a variant, of the ‘model’ RNA stem–loop structure 5′ adjacent to 7A’s. The first construct ‘wtSL-A7’ has the WT sequence specifying the ‘model’ stem–loop 5′-GCGGGCgcaaGCCCGC-3′, with the potential of base pairing indicated in upper case. The second ‘MUTsl-A7’ has the 5′ side sequence of the stem substituted by complementary bases, i.e. from 5′-GCGGGC-3′ to 5′-CGCCCG-3′ to prevent potential formation of the model stem–loop structure (Figure 1A and B). RT reactions were performed with all dNTPs equimolar at 500 μM. The cDNAs were then amplified by PCR with Taq polymerase to yield the ‘RT-PCR products’. The controls for Taq polymerase slippage used two chemically synthesized 75 nt DNAs, whose sequence corresponds to that of the test RNA sequence, as templates for PCR amplification. This yields the ‘PCR products’ referred to below. Next, the two RT-PCR and the two PCR products were used as templates for Limited Primer Extension (LPE) analysis for detecting the addition or omission of a base(s) in the T/A-tract derived sequence. LPE reactions were performed with one primer whose sequence is complementary to the template sequence adjacent to the T-tract present in one of the two strands of the RT-PCR and PCR products [the other DNA strand has the corresponding A-tract]. The conditions of the LPE reaction enable the primer to be extended to the first template base position at which termination was arranged to occur by incorporation of an acyclic dGTP (acyGTP) base. This leads to efficient termination at the first base C of the template encountered by the polymerase during extension of the primer as the corresponding dGTP standard substrate is absent from the reaction (see Materials and Methods).
The C at which LPE termination occurs is 5′ adjacent to the T-tract (other sites and acyclic dNTPs are used as controls in Supplementary Data). The length of the LPE product also depends on the occurrence of any indel in the T-tract motif. In the absence of slippage of the DNA polymerases used for amplification of the chemically synthesized DNA (Taq polymerase control) and subsequently for generation of the LPE product (Vent exo− polymerase), a homogeneous length LPE product is expected. This is used as a length marker. Comparison of the pattern of the LPE product(s) generated from RT-PCR with the marker reveals specific RT polymerase slippage-mediated base indels (Figure 1C).
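The logic of the LPE readout can be illustrated with a small simulation. The Python sketch below is not the analysis used in this study; the function name `lpe_product_length`, the 20 nt primer length and the toy downstream sequence are assumptions chosen for illustration. It extends a primer along a template until the first base that calls for the omitted dNTP, where the acyclic terminator is incorporated, so that a one-base change in the T-tract shifts the product length by one nucleotide.

```python
def lpe_product_length(primer_len: int, template_ahead: str, stop_base: str = "C") -> int:
    """Length of the primer-extension product when synthesis terminates at the
    first template base equal to `stop_base` (the terminator is incorporated there).

    `template_ahead` lists template bases in the order the polymerase meets them.
    """
    for extended, base in enumerate(template_ahead, start=1):
        if base == stop_base:
            return primer_len + extended  # the terminator counts as one added nt
    raise ValueError("no termination site in the template")


PRIMER_LEN = 20  # hypothetical primer length

for tract_len in (6, 7, 8):  # one T omitted, no slippage, one T added
    template = "T" * tract_len + "C" + "GATTACA"  # toy downstream sequence
    print(tract_len, lpe_product_length(PRIMER_LEN, template))
```

Under these assumptions an unslipped T7 tract gives a 28 nt product, and products of 27 or 29 nt indicate omission or addition of one base.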
With the wtSL-A7 construct, reverse transcription using SuperScript™ III enzyme and all dNTPs present at 500 μM showed strong realignment-mediated addition of an extra A (Figure 1D, lane 9) but no addition with the MUTsl-A7 construct, where the potential for base-pair formation is greatly diminished (Figure 1D, lane 18). The corresponding control LPE marker showed no slippage addition for either the WT (Figure 1C and D, lane 19) or mutant constructs (Figure 1C and D, lane 20). These LPE markers indicate that the DNA polymerase reagents (i.e. Taq and Vent exo− polymerases) are not responsible for the base addition. However, the wtSL-A7 and MUTsl-A7 constructs do show some omission of an A base (Figure 1D, lanes 9 and 18) with a similar signal detection level as the corresponding LPE markers (Figure 1D, lanes 19 and 20).
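The difference in pairing potential between the two constructs can be checked directly from the stem sequences given for wtSL-A7 and MUTsl-A7. The Python sketch below counts Watson-Crick pairs only (G·U wobble pairs and stacking energies are ignored, which is a simplifying assumption), and the function name `stem_pairs` is hypothetical.

```python
WC_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def stem_pairs(five_prime_side: str, three_prime_side: str) -> int:
    """Count Watson-Crick pairs when the two stem sides fold back antiparallel."""
    return sum(
        (a, b) in WC_PAIRS
        for a, b in zip(five_prime_side, reversed(three_prime_side))
    )


THREE_PRIME_SIDE = "GCCCGC"  # 3' side of the stem, the same in both constructs

print("wtSL :", stem_pairs("GCGGGC", THREE_PRIME_SIDE))   # WT 5' side
print("MUTsl:", stem_pairs("CGCCCG", THREE_PRIME_SIDE))   # substituted 5' side
```

With the WT 5′ side all six stem positions can pair, whereas with the substituted 5′ side none can.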
DNA-dependent RNA polymerase realignment is sensitive to the relative concentration of the substrate specified by the slippage site and by the DNA template base 5′ adjacent to it (47). To assess whether this also pertains to reverse transcriptase realignment, different dNTP concentration ratios were assayed. Nine dNTP ratio combinations with 5, 50 or 500 μM for the dTTP (specified by the A-tract slippage motif) and 5, 50 or 500 μM for the dGTP (specified 5′ adjacent to the template motif) were tested; the dATP and dCTP substrates were each present at 500 μM. Presence of the ‘model’ WT stem–loop stimulated addition of T (Figure 1D, lanes 4–9); this stimulation was increased with higher dTTP concentrations and higher ratios of [dTTP]:[dGTP] (Figure 1D, compare lane sets 1–3, 4–6 and 7–9). In the absence of the ‘model’ stem–loop, base omission of T was predominant. The most stimulatory dNTP condition was the lowest ratio, 1:100, of [dTTP]:[dGTP] (Figure 1D, lane 12). At the highest dTTP concentration tested, a modest level of base addition is also observed but only at the highest ratio of [dTTP]:[dGTP] (lane 16). These results indicate that the RNA stem–loop is a strong stimulator for RT SuperScript™ III realignment directionality, promoting addition of an extra T in the cDNA, but not the omission of a T. Interestingly, in the absence of the RNA stem–loop structure, realignment directionality is the inverse. This directionality difference indicates that the RNA stem–loop is also a strong inhibitor for omission of a base complementary to a template base. In summary, realignment efficiency and directionality are influenced by the relative dNTP concentrations and by RNA template structure.
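The nine-condition design described above is a 3 × 3 grid of dTTP and dGTP concentrations. As a minimal sketch, the Python snippet below enumerates the grid and the resulting [dTTP]:[dGTP] ratios; the variable names are illustrative, and no mapping of conditions to gel lanes is implied.

```python
from itertools import product

LEVELS_UM = (5, 50, 500)  # concentrations tested for dTTP and for dGTP

# Every (dTTP, dGTP) pair with its [dTTP]:[dGTP] ratio.
# dATP and dCTP are held at 500 uM in all conditions.
grid = [
    {"dTTP_uM": t, "dGTP_uM": g, "ratio": t / g}
    for t, g in product(LEVELS_UM, repeat=2)
]

for cond in sorted(grid, key=lambda c: c["ratio"]):
    print(f'dTTP {cond["dTTP_uM"]:>3} uM : dGTP {cond["dGTP_uM"]:>3} uM  ratio {cond["ratio"]:g}')
```

The ratios span 1:100 to 100:1, with three equimolar conditions that differ in absolute concentration.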
As an alternative to the MUTsl-A7 construct whose potential for ‘model’ stem–loop structure formation is abolished by base substitution, we employed an RNA antisense strategy to decrease the potential for model stem–loop structure formation in the wtSL-A7 template. The result showed that presence of a 10 nt antisense RNA (anti-5′stem), complementary to the RNA sequence 9 nt 5′ to the 7A’s, modestly decreases one base addition and enhances base omission. These experiments also showed that the efficiency and/or the directionality of the realignment are affected depending on the dNTP ratios (Supplementary Data and Figure S1).
To assess potential intermolecular stem–loop structure stimulatory action, we used an RNA antisense (anti-3′stem) complementary to the 10 nt sequence 5′ adjacent to the A7 motif in the ‘MUTsl-A7’ RNA construct (Figure 2A). The results, Figure 2B, show a strong effect of the antisense RNA on slippage directionality and efficiency. This antisense result with ‘MUTsl-A7’ is similar to the RT realignment without antisense with the ‘wtSL-A7’ construct (Figure 1). Increasing relative concentration of the antisense ‘anti-3′stem’ correlates: (i) at equimolar dNTP, with increasing base addition (Figure 2, lanes 9–12); (ii) at the lowest dNTP ratio (i.e. [dTTP]5μM:[dGTP]500μM), with dramatically decreasing base omission (lanes 5–8); (iii) at the highest dNTP ratio (i.e. [dTTP]500μM:[dGTP]5μM), with slightly increasing base addition (lanes 1–4).
In conclusion, formation of an antisense RNA: template RNA hybrid 5′ adjacent to the motif, mimics the presence of an intramolecular RNA stem–loop structure with a similar effect on realignment directionality and efficiency.
RT catalytic center positioning
Formation of the RNA model stem–loop 5′ to the slippage motif should act as a physical roadblock for the transcribing RT polymerase on the A-tract. We first identified the minimal number of nucleotides 5′ of an A7 motif at which formation of the model stem–loop could stimulate slippage. Base C 5′ adjacent to the motif was maintained in all sequences. Derivatives of the ‘wtSL-A7’ construct were made with 1, 2 or 3 nt insertions between the stem–loop and the A-tract motif (Supplementary Figure S2, panel A). The LPE results showed that by increasing the distance between the model stem–loop and the A7 tract by just one nt, the stimulatory effect of the RNA stem–loop structure on base addition is abolished. The results also show that the inhibitory effect of the stem–loop on base omission is abolished as well. Base omission is now more sensitive to dNTP concentration ratio variation. This is most evident with a higher concentration of the dGTP substrate (specified by the template base 5′ adjacent to the A-tract motif), than that of the dTTP substrate (specified by the slippage motif) (Supplementary Figure S2A and C). The results with E. coli RNA polymerase generated RNA, showed that the model stem–loop 5′ adjacent to an A5 motif does not, at equimolar dNTP, stimulate SuperScript™ III-mediated base addition (data not shown). Interestingly, similar experiments using derivative constructs specifying the model stem–loop 0, 1, 2, or 3 nt 5′ to A5 motif, showed that though the RNA stem–loop does not stimulate base addition on A5, its inhibitory effect on base omission is present when the model stem–loop is 5′ adjacent to the A5 (Supplementary Figure S2B and C). The distance between the ‘road-blocking’ structure formation and the A7 slippage motif was also explored by antisense RNA experiments (Supplementary Results and Supplementary Figure S3).
To summarize, intra- or intermolecular ‘stem’ structures need to be 5′ adjacent to the re-alignment motif for optimal stimulation of base addition. They also need to be 5′ adjacent for maximal inhibition of base omission. Taken together, the results show that at the time of productive realignment, the catalytic center of RT is mostly located at the template position 3′ adjacent to the ‘stem’ structure.
G-rich sequences are also strong stimulators for slippage
To explore potentially relevant properties of G-rich sequences, four dsDNA constructs were made (Supplementary Methods). RNA generated from these with T7 RNA polymerase had the sequence GGCGGCGGCGG 5′ adjacent to the A7 motif or separated from it by 1, 2 or 3 nt (C, UC or UUC) (Figure 3A). In the 5′ leader (UTR) of eukaryotic initiation factor-4A (eIF4A) mRNA this sequence forms an RNA quadruplex (50). However, the structure potentially formed in the transcripts utilized here could be different due to the potential for pairing involving the U and C, where present, in the spacer, and was not explored.
LPE analysis was performed with primer R_821, complementary to the sequence immediately adjacent to the A-tract in one DNA strand of the (RT)-PCR product. An acyC terminator mediates LPE termination at the site specified by the template base position underlined in the sequence 5′-G-spacer-A’s-3′ (Figure 3B, left). The varied spacer lengths (0, 1, 2 or 3 nt) determine the staggered sizes of the LPE product markers seen on the gel. The shift is related to the 1 nt length difference of the spacers involved (Figure 3B, PCR). With SuperScript™ III RT, at equimolar dNTP the RT-PCR derived LPE products from all four constructs contain detectable base addition (Figure 3B, lanes 1–4). With dNTP ratio conditions that favor base addition, both the efficiency of addition and the number of bases added increase with spacer length extensions (Figure 3B, lanes 5–8). In contrast, with dNTP ratio conditions that favor base omission, both the efficiency of base absence and the number of bases missing decrease with spacer length extensions (Figure 3B, lanes 9–12).
RT experiments have also been performed with RNA template variants of the eIF4A-derived G-rich sequence, two other G-rich sequences and their derivatives. With a subset, stimulatory effects are evident at specific relative dNTP concentration conditions (Supplementary Results and Supplementary Figure S4).
In conclusion, G-rich sequences can have a major impact on slippage and its directionality.
WT retroviral reverse transcriptases exhibit similar realignment
The reverse transcriptase from WT Moloney Murine Leukemia virus (MuLV), the parent of SuperScript™ III, plus the RTs from Avian Myeloblastosis virus (AMV) and from HIV-1 and HIV-2 were similarly tested with wtSL-A7 and MUTsl-A7 under the nine dNTP concentration conditions. In addition, these RTs were tested with the WT, or mutated, ‘model’ stem–loop 5′ adjacent to an A6 motif (wtSL-A6 and MUTsl-A6) under equimolar dNTP concentration conditions. The results show a similar LPE product pattern, indicating that with identical RNA template and dNTP concentration conditions, the different RT polymerases tested share a clearly similar response in slippage directionality. However, for specific reaction conditions, they can show marked differences in their slippage propensity (Supplementary Results and Supplementary Figure S5).
A retrotransposon RT mediates efficient slippage
Thermostable RTs encoded by group II introns from thermophilic bacteria are proving very useful for next generation RNA sequencing (49) and one of them, TGIRT, is commercially available and becoming widely used because of its thermostability (60°C) and advantageous template switching. TGIRT was first tested using the constructs specifying the WT or mutated ‘model’ stem–loop 5′ to the A7 slippage motif. As described more fully in Methods, the experimental conditions involved attachment of a preformed 41 bp DNA:RNA hybrid that is utilized as primer for reverse transcription of the test template by the TGIRT enzyme. The hybrid contained a one base overhang at the 3′ end of the DNA. It is complementary to the base at the 3′ end of the RNA test construct. The overhang base is utilized by the RT enzyme to switch from the RNA of the hybrid to the RNA test template. Such template switching (48) is utilized in preparation of samples for deep sequencing. The buffer conditions used for the preparation of the cDNA for deep sequencing were the same as used here for the study of TGIRT reagent slippage.
The first set of experiments was with the wtSL-A7 and MUTsl-A7 constructs. RT reactions were performed with three dNTP concentration conditions: (i) all 4 dNTPs at 1.25 mM, (ii) dTTP at 12.5 μM and the other 3 at 1.25 mM, (iii) dATP at 12.5 μM and the other 3 at 1.25 mM. The results show that TGIRT also responds to RNA template structure and specific dNTP concentration ratio. However, for its slippage-mediated base addition the range of the number of extra nucleotides was much greater and was from 1 to 50 nt (Supplementary Results and Supplementary Figure S6, panels A–C).
To determine the potential importance of the identity of the RNA template base, C, 5′ adjacent to the A7 motif in wtSL-A7 ( = wtSL/C-A7) and MUTsl-A7 ( = MUTsl/C-A7), the C was substituted by G to give the constructs ‘wtSL/G-A7’ and ‘MUTsl/G-A7’. Also in the WT construct a compensatory base substitution was made in the sequence specifying the 5′ base of the 5′ side of the stem to maintain base pairing (Figure 4A). In the MUT construct a corresponding substitution to preclude base pairing was not necessary as its potential partner is already G (Figure 4B). The second set of constructs ‘wtSL/U-A7’ and ‘MUTsl/U-A7’ is as the first set except for the base adjacent to the motif being U, with corresponding compensatory base substitutions (A and U respectively) to maintain (wt), or to abolish (MUT), stem–loop structure formation (Figure 4C and D). RT reactions were performed using three specific dNTP concentration ratios (1:1, 100:1 and 1:100) for the dNTP substrate specified by the slippage motif and the RNA base adjacent to the motif. LPE analysis showed a similar result as obtained with the WT and mut ‘stem–loop’ model structure where the last base of the sequence specifying the 3′ side of its stem has the base C 5′ adjacent to the A7 motif. The RT slippage followed the dNTP ratio ‘rules’ where higher substrate concentration specified by the slippage motif stimulates base addition (Figure 4E, lanes 2, 5, 8 and 11), and where higher substrate concentration specified by the template base 5′ adjacent to the motif stimulates base omission (Figure 4E, lanes 3, 6, 9 and 12). The RT slippage also followed the rules of slippage directionality involving potential formation of the RNA structure 5′ of the motif. With the wtSL constructs, base addition is stimulated, and with the MUTsl constructs it is inhibited (Figure 4E, compare lane sets 1–2 with 7–8, and 4–5 with 10–11).
In contrast, slippage omission of at least one base is favored in the absence of potential for RNA template stem–loop formation (Figure 4E, compare lane 9 with 3, and lane 12 with 6). In conclusion, the above result shows that the realignment for the TGIRT enzyme is independent of identity of the base located 5′ adjacent to the motif.
Next, we analyzed the stimulatory effect of the RNA road-blocking ‘model’ structure 5′ to A6 and to the U6 motifs (Supplementary Figure S6A and B). RT reactions were also performed using three specific dNTP ratios (1:1, 100:1 and 1:100) for the dNTP substrate specified by the slippage motif and the RNA base adjacent to the motif. LPE analysis showed a similar slippage pattern for A6 as shown for the A7 motif and it also followed the slippage rules involving dNTP ratio and potential RNA template structure formation (Supplementary Figure S6, panels D and E with A6 motif). Interestingly, slippage occurs with the U6 motif and follows the ‘slippage rules’ (Supplementary Figure S6, panels D and E with U6 motif).
In conclusion, these results show that the non-retroviral RT enzyme behaves similarly to the retroviral RT enzyme in terms of template structure and dNTP influences, although the number of bases inserted by TGIRT enzyme slippage is dramatically higher, ranging up to more than 50 bases instead of just 1.
Retrotransposon gag-pol slippage candidates
A bioinformatic analysis of LTR retrotransposons revealed several that may utilize recoding in synthesis of their GagPol, with some being candidates for utilization of transcription slippage (51). We selected three of these candidates for in vitro testing of TGIRT enzyme slippage during reverse transcription of cassettes. In the two Drosophila melanogaster candidates tested, pol was in the –1 frame with respect to gag, whereas in the third candidate, which was from maize (Zea mays), pol was in the +1 frame with respect to gag. Drosophila candidate Dme1_ChrX_2630566 has the motif 5′-AU6-3′ and was tested with a chemically synthesized RNA containing 22 nt 5′ and 26 nt 3′ to the motif (Figure 5A). Candidate Dme1_Chr3_26087113 has the motif 5′-GA4U4-3′ and the chemically synthesized RNA to test it contained 18 nt 5′ and 32 nt 3′ of the motif (Supplementary Figure S7A and C). To more closely resemble physiological conditions, in these reactions the TGIRT-mediated reverse transcription was performed at room temperature and in low salt buffer (this differed from the 60°C and higher salt conditions utilized in the switching template experiment above). The RT reaction was performed with a sequence specific primer for each test candidate cassette. Each candidate was tested with 3 specific dNTP ratios (1:1, 100:1 and 1:100) for the dNTP substrate specified by the slippage motif and the RNA base 5′ adjacent to the motif. LPE analysis showed slippage for the candidate Dme1_ChrX_2630566 having the U6 tract in the RNA template: efficiency and distribution follow the dNTP rule for slippage (Figure 5B). Candidate Dme1_Chr3_26087113 showed no slippage (Supplementary Figure S7C).
The maize candidate (gi_7262818_71383_R) has the motif 5′-UA4C3-3′. This candidate contains a conserved RNA template structure-forming sequence specified by 25 nt 5′ to the A4C3 motif (51), which is a candidate cis-acting RNA road-blocking element for stimulation of RT slippage. LPE analysis showed no relevant slippage for the (sub)-motif AC3 (Supplementary Figure S7B and D, with acyT LPE reaction) but showed marginal slippage-mediated addition of one base under all dNTP conditions, indicating that the A4 motif in the sequence UA4C3 is a poor but ‘active’ slippage-prone motif (Supplementary Figure S7B and D with acyA LPE reaction).
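The candidate motifs are written in a compact run-length notation (for example, AU6 denotes an A followed by six U’s). For readers who want to search sequence data for such motifs, the Python sketch below expands the notation into an explicit RNA string; the helper name `expand_motif` is hypothetical and the parser handles only the simple base-plus-count form used here.

```python
import re


def expand_motif(compact: str) -> str:
    """Expand run-length motif notation, e.g. 'AU6' -> 'AUUUUUU'.

    A base with no trailing number is taken to occur once.
    """
    return "".join(
        base * int(count or 1)
        for base, count in re.findall(r"([ACGUT])(\d*)", compact)
    )


for motif in ("AU6", "GA4U4", "UA4C3"):
    print(motif, "->", expand_motif(motif))
```

The expanded strings can then be used with ordinary substring or regular-expression searches against a template sequence.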
Potential utilization of structure stimulated slippage
The results here show that a cassette from Drosophila retrotransposon Dme1_ChrX_2630566 containing an AU6 motif exhibits strong slippage with sensitivity to relative dNTP concentration conditions. In addition, a cassette with a maize retrotransposon sequence that has conserved potential for template stem–loop structure formation 5′ to a motif A4C3 showed marginal slippage-mediated addition of T. Given the widespread occurrence and importance of retrotransposons, these results highlight the need for systematic studies to reveal the extent of their functional utilization of RT slippage.
Replication of the single-stranded, positive sense, RNA genome of SARS Coronavirus involves a viral-encoded RNA-dependent RNA polymerase. Polymerase expression involves –1 ribosomal frameshifting at a U-UUA-AAC sequence (55–57). Together with 5′ bases, it is part of a GU5A3C sequence. Interestingly, a potential 10 bp-stem 4 nt-loop structure forms 2 nt 5′ to the GU5A3C sequence and causes reduced frameshift-derived product (58). During replication of the (+) strand such a stem–loop would be ahead (5′) of the U5A3 motif. This raises the possibility of it leading to road-block-induced slippage at the U5A3 motif, and so being a counterpart of the situation shown for the HIV frameshift site. The potential for HIV functional utilization of RT slippage and its implications are considered in the accompanying manuscript (59).
The finding of RNA G-rich sequence stimulated RT slippage is of interest and its possible extension to DNA-dependent RNA polymerase slippage merits investigation. The widespread distribution of G-rich sequence in RNA has implications for the common use of reverse transcriptase in generating cDNA for deep-sequencing.
Interest in the potential of synthetic compensatory frameshifting near the sites of frameshift mutations to ameliorate a subset of genetic disease, prompted the testing of complementary oligonucleotides for frameshift stimulatory effects (60–63). Whether sequences that can bind to DNA, such as CRISPR-cas nickase mutants (64,65), would create a counterpart partial ‘roadblock’ structure for slippage stimulation, merits future work.
The results highlight the need for caution before assuming that RT products faithfully reflect template sequence. This caution extends to TGIRT. Though it is known to cause a very low rate of base substitution errors, in the present work it nevertheless exhibits the highest level of slippage errors.
Extrapolating from the polymerase properties identified here to other polymerases, the recent increase in the modest number of known occurrences of productive utilization of transcription slippage for enriching gene expression, seems set to further increase. More generally, it extends awareness of the potential for template structure to stimulate slippage by diverse types of polymerase, and permits further parallels between context features that promote ribosomal frameshifting and transcription slippage.