G-quadruplex structures bind to EZ-Tn5 transposase
Introduction
Next generation DNA sequencing has contributed to rapid growth in genetic information [1]. Combined with spectacular improvements in sequencing devices and data processing, innovation in sample preparation has been a key element of this success. A major improvement in library construction was achieved with the introduction of the Nextera XT kit (Illumina), which has gained popularity owing to its speed, simplicity and low input DNA requirements. The Nextera XT one step library “tagmentation” protocol utilizes a hyperactive Tn5 transposase [2] to simultaneously perform DNA fragmentation and the addition of barcodes, thereby eliminating the need for additional repair, dA tailing and ligation [3,4]. This tagmentation protocol has found application in DNA and RNA sequencing methods [3,5], bisulfite sequencing [6], chromatin profiling [7], and haplotype resolution [8].
Nextera “tagmentation” depends upon the function of a DNA transposition enzyme (transposase) derived from the bacterial Tn5 transposon [2]. The wild type Tn5 transposase inserts a 19 bp DNA fragment into target DNA by a cut and paste mechanism [4,9]. The transposase used in Nextera procedures is a hyperactive Tn5 transposase derivative supplied by Illumina (and previously by Epicentre, USA) and catalyzes integration of synthetic oligonucleotides into target DNA at high efficiency [2,9].
We recently evaluated the accuracy of available next generation sequencing tools for genotyping the CYP2D6 gene [10], which encodes an important cytochrome P450 liver enzyme involved in drug metabolism. During this analysis, we observed marked variability in sequence coverage across the 6.6 kb CYP2D6 polymerase chain reaction (PCR) amplicon tagmented using the Nextera XT kit and sequenced on the MiSeq (Illumina) platform, in contrast to data generated for the same gene with Ion Torrent sequencing, which does not use Nextera tagmentation. The striking pattern of low sequencing coverage in the MiSeq analysis led us to hypothesize that these GC rich, low-coverage regions reflected preferential integration sites for the Nextera transposase. To test this hypothesis, we examined sequence and structural motifs in these regions, and tested their ability to bind to EZ-Tn5 transposase.
Section snippets
Computational analyses
The genomic coordinates (hg19) of regions observed to form quadruplexes in the human genome [11] were obtained from the NCBI Gene Expression Omnibus under accession GSE63874. They comprised G4-seq experiments carried out in K+ buffer (Na_K_plus_hits_intersect.bed.gz, Na_K_minus_hits_intersect.bed.gz) and in the presence of the G4 stabilizing molecule pyridostatin (PDS) (Na_PDS_plus_hits_intersect.bed.gz, Na_PDS_minus_hits_intersect.bed.gz). PCR amplification of CYP2D6, library construction, and DNA sequencing on
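As a rough illustration of how G4-seq intervals in BED format could be intersected with an amplicon, the sketch below reads a plain or gzip-compressed BED file and keeps the intervals that overlap a region of interest. It is not the authors' pipeline: the names `Interval`, `read_bed` and `overlapping` are hypothetical, and the region coordinates in the demo are placeholders rather than the real CYP2D6 amplicon position. It assumes only the standard BED convention of 0-based, half-open coordinates.

```python
import gzip
from typing import Iterable, Iterator, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive


def read_bed(path: str) -> Iterator[Interval]:
    """Yield the first three columns of a plain or gzip-compressed BED file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and the optional header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            yield Interval(chrom, int(start), int(end))


def overlapping(intervals: Iterable[Interval], region: Interval) -> List[Interval]:
    """Return the intervals that share at least one base with `region`."""
    return [
        iv for iv in intervals
        if iv.chrom == region.chrom
        and iv.start < region.end
        and iv.end > region.start
    ]


if __name__ == "__main__":
    # Placeholder coordinates, NOT the actual amplicon location.
    amplicon = Interval("chr22", 1_000, 7_600)
    toy_hits = [
        Interval("chr22", 900, 1_050),    # overlaps the start of the region
        Interval("chr22", 7_600, 7_700),  # starts where the region ends: no overlap
        Interval("chr1", 1_200, 1_300),   # different chromosome
    ]
    for hit in overlapping(toy_hits, amplicon):
        print(hit)
```

With real data, `read_bed("Na_K_plus_hits_intersect.bed.gz")` could be passed directly as the `intervals` argument. For genome-wide comparisons a dedicated interval tool would be more efficient than this linear scan.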
Results
In a previous study, alignment of MiSeq data for CYP2D6 amplicons generated by Nextera XT libraries revealed specific regions of low coverage within the long PCR products [10]. These “dips” in average read depth across the CYP2D6 amplicon were observed only on the MiSeq platform, and were not apparent in the Ion Torrent data (Supplementary Fig. 1). The low coverage regions (Table 2, Fig. 1) corresponded to previously predicted G-quadruplex (G4) forming sites in the GC rich regions of
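One simple way to flag coverage “dips” of this kind from a per-base depth profile is to report runs of positions whose depth falls below a fraction of the median depth. The sketch below illustrates that idea only. The function name `low_coverage_regions` and the default `fraction` and `min_length` values are assumptions chosen for the example, not thresholds taken from the study.

```python
from statistics import median
from typing import List, Sequence, Tuple


def low_coverage_regions(
    depth: Sequence[int],
    fraction: float = 0.25,  # assumed cut-off: below 25% of the median depth
    min_length: int = 20,    # assumed minimum run length, in bases
) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) runs where depth < fraction * median."""
    if not depth:
        return []
    threshold = fraction * median(depth)
    regions: List[Tuple[int, int]] = []
    run_start = None
    for pos, d in enumerate(depth):
        if d < threshold:
            if run_start is None:
                run_start = pos  # a new low-coverage run begins here
        elif run_start is not None:
            if pos - run_start >= min_length:
                regions.append((run_start, pos))
            run_start = None
    # Close a run that extends to the end of the profile.
    if run_start is not None and len(depth) - run_start >= min_length:
        regions.append((run_start, len(depth)))
    return regions


if __name__ == "__main__":
    # Toy profile: uniform depth of 100 with one 30-base dip.
    profile = [100] * 50 + [5] * 30 + [100] * 50
    print(low_coverage_regions(profile))
```

Using the median rather than the mean keeps the threshold stable when a few positions have very high depth. The per-base depths themselves would come from an alignment tool and are outside the scope of this sketch.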
Discussion
Non-B DNA structures such as hairpin DNA and G-quadruplexes have been implicated in cellular processes such as gene regulation, DNA replication and genomic stability [21,22], including transposition [23,24]. Using the high-resolution sequencing-based G4-seq method, G4 secondary DNA structures were recently found in the human genome in greater numbers than had been computationally predicted [11]. Coverage of human genome libraries derived from Nextera Tn5 transposase based fragmentation has been shown
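For context on what a computational G4 prediction can look like, the sketch below searches a sequence for the widely used canonical motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides (G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+), and for the complementary C-rich pattern that marks a G4 on the opposite strand. This is a generic pattern match, not the scoring algorithm of QGRS Mapper or the method used in this study, and the function name `find_g4_motifs` is hypothetical. Motif searches of this type miss non-canonical structures, which is consistent with experimental methods reporting more G4s than such predictions.

```python
import re
from typing import List, Tuple

# Canonical motif: four G-runs (>= 3 G) separated by three loops of 1-7 nt.
G_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")
# Same motif on the reverse strand appears as C-runs on the given strand.
C_PATTERN = re.compile(r"C{3,}(?:[ACGT]{1,7}C{3,}){3}")


def find_g4_motifs(seq: str) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, strand) hits, 0-based and half-open.

    Matches are non-overlapping per strand; strand is "+" for G-rich hits
    on the given sequence and "-" for C-rich hits (G4 on the complement).
    """
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G_PATTERN.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in C_PATTERN.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing GGGTGGGTGGGTGGG flanked by A's.
    print(find_g4_motifs("AAGGGTGGGTGGGTGGGAA"))
```

Because the search is a plain regular expression, the loop-length bounds can be adjusted in the pattern to explore more permissive definitions.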
Conclusion
G4 secondary structures were detected in the CYP2D6 pharmacogene at sites of low DNA sequence coverage from data obtained using the Nextera XT library protocol on the MiSeq platform. Further investigation of these CYP2D6 G4s revealed a high affinity interaction with the EZ-Tn5 transposase. Since the Nextera transposase and EZ-Tn5 are derivatives of Tn5 transposase, binding of the Nextera transposase to the G4 sites within the CYP2D6 amplicons during the library preparation step possibly resulted
Author contributions
SLC carried out the experiments and data analysis, JC and EWC assisted with parts of experiments and data analysis, RCJD assisted with data analyses. SLC drafted the manuscript with support from EWC, JC, RD and MAK. SLC and MAK conceived the experiments. All authors have read and approved the final article.
Declaration of competing interest
The authors declare no conflict of interest.
Acknowledgements
We are grateful to the Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand for access to CD spectroscopy, BLItz system and SPR. This work was supported by the Marsden Fund Council from New Zealand Government funding, administered by the Royal Society of New Zealand.
References (46)
- The next-generation sequencing revolution and its impact on genomics. Cell (2013)
- Tn5 in vitro transposition. J. Biol. Chem. (1998)
- In vitro reconstitution of a single-stranded transposition mechanism of IS608. Mol. Cell. (2008)
- DNA gyrase is a host factor required for transposition of Tn5. Cell (1982)
- Making and breaking nucleic acids: two-Mg2+-ion catalysis and substrate specificity. Mol. Cell. (2006)
- Exploring the binding of d(GGGT)4 to the HIV-1 integrase: an approach to investigate G-quadruplex aptamer/target protein interactions. Biochimie (2016)
- Lentivector integration sites in ependymal cells from a model of metachromatic leukodystrophy: non-B DNA as a new factor influencing integration. Mol. Ther. Nucleic Acids (2014)
- Evaluation of a transposase protocol for rapid generation of shotgun high-throughput sequencing libraries from nanogram quantities of DNA. Appl. Environ. Microbiol. (2011)
- Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. (2010)
- Transposase mediated construction of RNA-seq libraries. Genome Res. (2012)
- Ultra-low-input, tagmentation-based whole-genome bisulfite sequencing. Genome Res.
- Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods
- Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing. Nat. Genet.
- Tn5 as a model for understanding DNA transposition. Mol. Microbiol.
- Cross-comparison of exome analysis, next-generation sequencing of amplicons, and the iPLEX® ADME PGx panel for pharmacogenomic profiling. Front. Pharmacol.
- High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol.
- Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings Bioinf.
- A toolbox for predicting g-quadruplex formation and stability. J. Nucleic Acids
- QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences. Nucleic Acids Res.
- DNA G-quadruplexes show strong interaction with DNA methyltransferases in vitro. FEBS Lett.
- Label-Free, Real-Time Interaction and Adsorption Analysis 1: Surface Plasmon Resonance
- Relevance of G-quadruplex structures to pharmacogenetics. Front. Pharmacol.
- CD study of the G-quadruplex conformation. Methods Mol. Biol.
1 Current address: Faculty of Pharmacy, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
2 Current address: The Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, United Kingdom