Elsevier

Biochimie

Volume 177, October 2020, Pages 190-197
G-quadruplex structures bind to EZ-Tn5 transposase

https://doi.org/10.1016/j.biochi.2020.07.022

Highlights

  • Regions of low read depth were observed in sequencing data generated from Nextera libraries.

  • We showed DNA in these regions could form G-quadruplexes.

  • Sequences from these regions bind to the EZ-Tn5 transposase with high affinity.

  • G-quadruplexes may represent target sites for Tn5 and other transposases.

Abstract

Next generation DNA sequencing and analysis of amplicons spanning the pharmacogene CYP2D6 suggested that the Nextera transposase used for fragmenting and providing sequencing priming sites displayed a targeting bias. This manifested as dramatically lower sequencing coverage at sites in the amplicon that appeared likely to form G-quadruplex structures. Since secondary DNA structures such as G-quadruplexes are abundant in the human genome, and are known to interact with many other proteins, we further investigated these sites of low coverage. Our investigation revealed that G-quadruplex structures are formed in vitro within the CYP2D6 pharmacogene at these sites, and G-quadruplexes can interact with the hyperactive Tn5 transposase (EZ-Tn5) with high affinity. These findings indicate that secondary DNA structures such as G-quadruplexes may represent preferential transposon integration sites and provide additional evidence for the role of G-quadruplex structures in transposition or viral integration processes.

Introduction

Next generation DNA sequencing has contributed to rapid growth in genetic information [1]. Combined with spectacular improvements in sequencing devices and data processing, innovation in sample preparation has been a key element of this success. A major improvement in library construction was achieved with the introduction of the Nextera XT kit (Illumina), which has gained popularity owing to its speed, simplicity and low input DNA requirements. The Nextera XT one-step library “tagmentation” protocol utilizes a hyperactive Tn5 transposase [2] to simultaneously perform DNA fragmentation and the addition of barcodes, thereby eliminating the need for additional repair, dA tailing and ligation [3,4]. This tagmentation protocol has found application in DNA and RNA sequencing methods [3,5], bisulfite sequencing [6], chromatin profiling [7], and haplotype resolution [8].

Nextera “tagmentation” depends upon the function of a DNA transposition enzyme (transposase) derived from the bacterial Tn5 transposon [2]. The wild type Tn5 transposase inserts a 19 bp DNA fragment into target DNA by a cut and paste mechanism [4,9]. The transposase used in Nextera procedures is a hyperactive Tn5 transposase derivative supplied by Illumina (and previously by Epicentre, USA) and catalyzes integration of synthetic oligonucleotides into target DNA at high efficiency [2,9].

We recently evaluated the accuracy of available next generation sequencing tools for genotyping the CYP2D6 gene [10], which encodes an important cytochrome P450 liver enzyme involved in drug metabolism. During this analysis, we observed marked variability in sequence coverage across the 6.6 kb CYP2D6 polymerase chain reaction (PCR) amplicon tagmented using the Nextera XT kit and sequenced on the MiSeq (Illumina) platform, in contrast to data generated for the same gene with Ion Torrent sequencing, which does not use Nextera tagmentation. The striking pattern of regions with low sequencing coverage in the MiSeq analysis led us to hypothesize that these GC rich, low-coverage regions reflected preferential integration sites for the Nextera transposase. To test this hypothesis, we examined sequence and structural motifs in these regions, and tested their ability to bind to EZ-Tn5 transposase.

Section snippets

Computational analyses

The genomic co-ordinates (hg19) of regions that were observed to form quadruplexes in the human genome [11] were obtained from NCBI under accession GSE63874 and consisted of G4-seq experiments carried out in K+ buffer (Na_K_plus_hits_intersect.bed.gz, Na_K_minus_hits_intersect.bed.gz), and in the presence of G4 stabilizing molecule PDS (pyridostatin) (Na_PDS_plus_hits_intersect.bed.gz, Na_PDS_minus_hits_intersect.bed.gz). PCR amplification of CYP2D6, library construction, and DNA sequencing on
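The comparison of published G4-seq intervals with an amplicon can be illustrated with a minimal sketch. It assumes only that the downloaded files follow the standard BED layout (chromosome, 0-based start, exclusive end) and that the amplicon is expressed in the same hg19 coordinate system. The function names and the example coordinates are hypothetical placeholders, not the authors' pipeline or the real CYP2D6 positions.

```python
import gzip
from typing import Iterable, Iterator, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def parse_bed_lines(lines: Iterable[str]) -> Iterator[Interval]:
    """Yield intervals from BED-formatted lines, skipping blank and header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        yield Interval(fields[0], int(fields[1]), int(fields[2]))


def read_bed_gz(path: str) -> Iterator[Interval]:
    """Stream intervals from a gzip-compressed BED file such as a *.bed.gz download."""
    with gzip.open(path, "rt") as handle:
        yield from parse_bed_lines(handle)


def overlapping(intervals: Iterable[Interval], region: Interval) -> List[Interval]:
    """Return the intervals that share at least one base with `region`."""
    return [
        iv
        for iv in intervals
        if iv.chrom == region.chrom and iv.start < region.end and iv.end > region.start
    ]


# Hypothetical coordinates, for illustration only.
example_lines = ["chr22\t100\t150", "chr22\t400\t450", "chr1\t120\t130"]
amplicon = Interval("chr22", 120, 410)
hits = overlapping(parse_bed_lines(example_lines), amplicon)
print(hits)
```

For a real file, `overlapping(read_bed_gz(path), amplicon)` would apply the same filter to one strand-specific download at a time.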

Results

In a previous study, alignment of MiSeq data for CYP2D6 amplicons generated by Nextera XT libraries revealed specific regions of low coverage within the long PCR products [10]. These “dips” in average read depth across the CYP2D6 amplicon were observed only on the MiSeq platform, and were not apparent in the Ion Torrent data (Supplementary Fig. 1). The low coverage regions (Table 2, Fig. 1) corresponded to previously predicted G-quadruplex (G4) forming sites in the GC rich regions of
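One simple way to flag such “dips” from a per-base read depth profile is sketched below. The criterion (depth below a fixed fraction of the median depth) and the default values of `fraction` and `min_length` are assumptions chosen for illustration; the source does not state how the low coverage regions were delimited.

```python
from statistics import median
from typing import List, Sequence, Tuple


def low_coverage_regions(
    depth: Sequence[float], fraction: float = 0.25, min_length: int = 1
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) runs where depth < fraction * median depth.

    `fraction` and `min_length` are illustrative thresholds, not published values.
    """
    if not depth:
        return []
    threshold = fraction * median(depth)
    regions = []
    run_start = None
    for pos, d in enumerate(depth):
        if d < threshold:
            if run_start is None:
                run_start = pos  # a new low-coverage run begins here
        elif run_start is not None:
            if pos - run_start >= min_length:
                regions.append((run_start, pos))
            run_start = None
    # Close a run that extends to the end of the profile.
    if run_start is not None and len(depth) - run_start >= min_length:
        regions.append((run_start, len(depth)))
    return regions


# Hypothetical depth values, for illustration only.
example_depth = [100, 98, 102, 20, 15, 18, 99, 101, 100, 10]
dips = low_coverage_regions(example_depth, fraction=0.25)
print(dips)
```

Using the median rather than the mean keeps the threshold stable when a few positions have extreme depth.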

Discussion

Non-B DNA structures such as hairpin DNA and G-quadruplexes have been implicated in cellular processes such as gene regulation, DNA replication and genomic stability [21,22], including transposition [23,24]. A high-resolution sequencing-based method (G4-seq) recently detected more G4 secondary DNA structures in the human genome than had been computationally predicted [11]. Coverage of human genome libraries derived from Nextera Tn5 transposase based fragmentation has been shown
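For context, computational prediction of G4 sites is often based on a sequence pattern of four runs of at least three guanines separated by loops of one to seven nucleotides. The sketch below scans both strands for that widely used pattern. It is an illustration of the general approach, not the prediction tool used in this study, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Canonical motif: four G-runs (>= 3 G) separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq: str) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, strand) hits in forward-strand, half-open coordinates.

    Only non-overlapping matches are reported, so nested or overlapping
    candidate motifs are not enumerated.
    """
    seq = seq.upper()
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for match in G4_PATTERN.finditer(text):
            if strand == "+":
                start, end = match.start(), match.end()
            else:
                # Map reverse-strand positions back onto the forward strand.
                start, end = len(seq) - match.end(), len(seq) - match.start()
            hits.append((start, end, strand))
    return sorted(hits)


# Human telomeric repeat, a well-known G4-forming sequence, flanked by AA.
print(find_g4_motifs("AAGGGTTAGGGTTAGGGTTAGGGAA"))
```

A pattern match indicates only potential to fold; structure formation in vitro still has to be tested experimentally, as the article describes.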

Conclusion

G4 secondary structures were detected in the CYP2D6 pharmacogene at sites of low DNA sequence coverage from data obtained using the Nextera XT library protocol on the MiSeq platform. Further investigation of these CYP2D6 G4s revealed a high affinity interaction with the EZ-Tn5 transposase. Since the Nextera transposase and EZ-Tn5 are derivatives of Tn5 transposase, binding of the Nextera transposase to the G4 sites within the CYP2D6 amplicons during the library preparation step possibly resulted

Author contributions

SLC carried out the experiments and data analysis, JC and EWC assisted with parts of experiments and data analysis, RCJD assisted with data analyses. SLC drafted the manuscript with support from EWC, JC, RD and MAK. SLC and MAK conceived the experiments. All authors have read and approved the final article.

Declaration of competing interest

The authors declare no conflict of interest.

Acknowledgements

We are grateful to the Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand for access to CD spectroscopy, BLItz system and SPR. This work was supported by the Marsden Fund Council from New Zealand Government funding, administered by the Royal Society of New Zealand.

References (46)

  • A. Adey et al. Ultra-low-input, tagmentation-based whole-genome bisulfite sequencing. Genome Res. (2012)

  • J.D. Buenrostro et al. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods (2013)

  • S. Amini et al. Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing. Nat. Genet. (2014)

  • W.S. Reznikoff. Tn5 as a model for understanding DNA transposition. Mol. Microbiol. (2003)

  • E.W. Chua et al. Cross-comparison of exome analysis, next-generation sequencing of amplicons, and the iPLEX® ADME PGx panel for pharmacogenomic profiling. Front. Pharmacol. (2016)

  • V.S. Chambers et al. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol. (2015)

  • H. Thorvaldsdottir et al. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings Bioinf. (2013)

  • H.M. Wong et al. A toolbox for predicting g-quadruplex formation and stability. J. Nucleic Acids (2010)

  • O. Kikin et al. QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences. Nucleic Acids Res. (2006)

  • S.L. Cree et al. DNA G-quadruplexes show strong interaction with DNA methyltransferases in vitro. FEBS Lett. (2016)

  • C.J. Fee. Label-Free, Real-Time Interaction and Adsorption Analysis 1: Surface Plasmon Resonance

  • S.L. Cree et al. Relevance of G-quadruplex structures to pharmacogenetics. Front. Pharmacol. (2014)

  • I. Kejnovska et al. CD study of the G-quadruplex conformation. Methods Mol. Biol. (2019)

1 Current address: Faculty of Pharmacy, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia

2 Current address: The Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, United Kingdom