Introduction

Glycosyltransferases (GTs) (E.C. 2.4.x.x) constitute a large class of enzymes involved in the synthesis of the abundant complex glycosidic structures expressed in cells1. These glycoconjugates are of utmost importance for the interaction between cells and for infection by pathogenic species2. Amongst these GTs, galactofuranosyltransferases (GalfTs) are poorly described enzymes that catalyse the transfer of a galactofuranosyl moiety (Galf) from UDP-α-D-Galf to specific acceptor molecules (Fig. 1)3,4. Notably, Galf is a five-membered ring sugar that is mainly found on the surface of many pathogenic species such as Mycobacterium tuberculosis, Aspergillus and Leishmania species5,6. The peculiarity of this sugar arises from its high immunogenicity7,8 and the fact that it is totally absent from mammals5. The detection of Galf can therefore be of interest for the diagnosis of infection9,10. Even more interestingly, as Galf can be classified as a virulence factor, inhibiting the biosynthesis of Galf-containing conjugates could yield promising inhibitors from which powerful drugs may arise, for example against multidrug-resistant tuberculosis11,12. The unique structure of Galf has therefore drawn the attention of many researchers in the glycosciences because of its considerable therapeutic potential.

Figure 1

Enzymatic incorporation of a Galf unit from UDP-α-D-Galf onto an acceptor catalysed by a GalfT.

Despite this interest, very few GalfTs have been studied to date. The most investigated microbial organism expressing GalfTs is M. tuberculosis, whose GalfTs involved in cell wall arabinogalactan biosynthesis have been cloned and characterized13,14,15. In the galactan chain, Galf residues are linked alternately by β-(1 → 5) and β-(1 → 6) glycosidic bonds. The two GalfTs involved in galactan chain initiation and elongation are bifunctional: each is able to catalyse the formation of two different glycosidic bonds. Besides those two, much less is known about WbbI from Klebsiella pneumoniae16, which is another bifunctional GalfT, and about GfsA from Aspergillus17. In Trypanosoma rangeli, genes were identified as coding for GalfTs; however, the enzymatic activity of the corresponding proteins could not be demonstrated18.

Leishmaniasis belongs to the group of Neglected Tropical Diseases, as defined by the World Health Organization (WHO), which includes diseases that are endemic to Third World countries. More than 20 species of parasites are responsible for the yearly infection of two million people, threatening about 350 million people worldwide. It is estimated that more than 12 million people are infected by the parasite in nearly 100 countries19. This disease can exhibit several clinical forms, from the common cutaneous form, which is generally self-healing, to the visceral form (Kala-azar), which is the most severe and usually fatal manifestation of Leishmania infection. Moreover, leishmaniasis has emerged as one of the most important opportunistic infections associated with HIV. In southern Europe, 70% of visceral leishmaniasis cases are associated with HIV infection20. For these reasons, leishmaniasis is considered one of the most fatal Neglected Tropical Diseases, and has drawn the WHO's attention concerning its diagnosis and treatment21. Efficient prophylactic measures, including safe vaccines, are not available, and effective and affordable chemotherapy is lacking. Current treatments, which rely on toxic antimony-containing compounds or diamidines, require strict medical supervision and are threatened by the spread of drug resistance22. Other medications such as amphotericin B or miltefosine offer an alternative for treatment but are also toxic and expensive. In addition, the emergence of resistant strains is expected for miltefosine because of its long half-life23. There is thus an urgent need to identify new chemotherapeutic agents for the treatment of this disease.

The discovery of new antiparasitic compounds exhibiting low toxicity and high specificity for Leishmania relies on the identification of new therapeutic targets. The first pyranose–furanose mutase from a eukaryotic organism has recently been characterized in Leishmania24. Targeted gene deletion of this enzyme in Leishmania major led to attenuated virulence, establishing that Galf contributes significantly to L. major pathogenesis7. Investigations on GalfTs started with the expression and isolation of the M. tuberculosis enzymes25. The nature of the reactions catalyzed (processing transferases) and the structure of the Galf-containing glycans in M. tuberculosis are very different from those of the Leishmania system. However, studies of M. tuberculosis GalfTs have provided data relevant for the design of antituberculosis agents25,26. In Leishmania, galactofuranose-containing glycoconjugates such as lipophosphoglycans (LPGs), glycosylinositolphospholipids (GIPLs) and glycoproteins were reported to play an important role in the parasite infection process (Fig. 2)27,28,29,30. Moreover, all these structures have important functions in the parasite lifecycle31. They play a key role in growth and in communication between the parasite and mammalian cells32. In addition, they are essential for the binding of the parasite to, and its detachment from, the midgut of the insect vector, and therefore for the transmission of the parasite to the mammalian host33,34,35. They also confer resistance to complement-mediated lysis and inhibit phagolysosomal fusion36. Therefore, to better understand the implication of each Galf-containing glycoconjugate in cell wall pathogenesis, it is important to identify and characterize the GalfTs that are involved in their assembly37.

Figure 2

(A) Putative biological roles of Leishmania major GalfTs. (B) Phylogeny of GalfTs among the main pathogenic Leishmania species (L. major, L. infantum, L. donovani, L. mexicana, and L. braziliensis). The bottom bar scales the genetic change (i.e. the ratio of substitutions per site) along horizontal branches. Starting from a MUSCLE alignment of LPG1, LPG1G, LPG1L, and LPG1R sequences from the TriTrypDB database in L. major (resp. LmjF.25.0010, LmjF.32.3990, LmjF.26.0550, LmjF.33.0300), L. infantum (LinJ.25.0010, LinJ.32.4140, LinJ.26.0520, LinJ.33.0330), L. donovani (LdBPK_250010.1, LdBPK_324140.1, LdBPK_260520.1, LdBPK_330330.1), L. mexicana (LmxM.25.0010, LmxM.31.3990, LmxM.26.0550, LmxM.32.0300), and L. braziliensis (LbrM.25.0010, LbrM.32.4230, LbrM.26.0650), the tree was built with the PhyML software using the Neighbour Joining algorithm in the Phylogeny.fr web server37.

The genome of Leishmania major was previously screened by Zhang and co-workers, and the analysis led to the identification of four putative genes (lpg1, lpg1L, lpg1R, and lpg1G) that could encode GalfTs, but that share less than 20% sequence similarity with known GalfTs from other species30,38. Knock-out studies enabled the authors to identify the enzyme LPG1 as the GalfT involved in Galf attachment during LPG biosynthesis39. The generation of knock-out mutants enabled subsequent studies to identify LPG as necessary for insect infection, as well as critical for parasite virulence and survival in the early stages of human infection, although some controversy still remains depending on the species31,39,40. Deletion of lpg1, lpg1L, or lpg1R (single or multiple deletions) helped the authors to attribute functional roles to these three enzymes (Fig. 2A)30, although no information on LPG1G, nor any study of the four proteins at the molecular level, was available. Moreover, homologous enzymes are also present in other Leishmania species (infantum, donovani, braziliensis, mexicana) (Fig. 2B). LPG1R is only present as a truncated protein in L. braziliensis; this protein was therefore not used for alignment and tree generation. All four proteins share 52–92% homology (27–30% identity) with L. major orthologs. Studying these GalfTs is therefore crucial to understand the chemistry of the GalfT enzymatic reaction and their biological role in L. major. Moreover, galactofuranose residues were also detected in glycoconjugates of Trypanosoma species5; however, only genes coding for putative galactofuranosyltransferases were described in T. rangeli and in T. cruzi18,41. Biochemical tools and knowledge of these enzymes developed in L. major could thus serve as a template for other pathogenic systems, especially trypanosomatids for which no satisfactory treatments are available to date.
Here we report the first cloning, overexpression, purification and biochemical characterization of these four proteins from L. major as well as the identification of their enzymatic function.

Materials and Methods

Chemical and biological reagents

Chemical reagents including buffer, salts, sugars, NAD+, pyruvate kinase and lactic dehydrogenase enzymes from rabbit muscle were purchased from Sigma-Aldrich. UDP-pyranoses and pNP-sugars were purchased from Carbosynth (Compton, UK), pET-vectors from Novagen and pMal vectors from New England Biolabs. UDP-α-D-Galf was enzymatically prepared, purified and characterized following the procedure previously developed by Prof. Field’s team42.

Cloning and expression of the four putative genes

The four genes lpg1, lpg1L, lpg1R and lpg1G were amplified by PCR using L. major genomic DNA, which was kindly provided by Dr Françoise Routier (Hannover Medical School), as template. The specific primers described in Table S1 were designed to amplify genes that encode proteins without the transmembrane domain (according to the TMHMM prediction server)43. The lpg1, lpg1L and lpg1R loci (TriTrypDB accession numbers LmjF.25.0010, LmjF.26.0550, LmjF.33.0300), and the three identical lpg1G gene copies located in three distinct loci (TriTrypDB accession numbers LmjF.32.3990, LmjF.05.1230, LmjF.19.1650) were used as templates for primer design. The amplified regions for lpg1, lpg1L, lpg1R and lpg1G excluded the transmembrane domains (resp. nucleotides 1–117, 1–120, 1–120, and 1–139). Amplicons were cloned into the pMAL-c2X vector to generate the lpg1-pMAL, lpg1G-pMAL, lpg1L-pMAL, and lpg1R-pMAL plasmids (Figs S1–S4). DNA sequencing performed by Eurofins Genomics validated the cloned constructs. The corresponding MBP-fused proteins thus contained residues 40–396, 41–421, 41–592, and 47–599, respectively.

Plasmids were transformed into the E. coli Rosetta (DE3) strain. Clones were cultivated in LB broth with the appropriate antibiotics (chloramphenicol 30 μg/mL and ampicillin 50 μg/mL) at 37 °C until the optical density at 600 nm reached 0.6. Overexpression was then induced with 100 µM IPTG and the cultures were incubated overnight at 30 °C. Two liters of culture were harvested and the cells were resuspended in 50 mM Tris buffer pH 8.0 containing 25 mM NaCl and 1 mg/mL of lysozyme. The suspension was then incubated for 30 min with stirring at 4 °C before freeze–thaw lysis, followed by sonication. After centrifugation (40000 g, 20 min, 4 °C), the supernatant was filtered and loaded on a Maltose Binding Protein (MBP) affinity column (MBP Trap HP-1mL, GE Healthcare), and MBP-tagged proteins were then eluted according to the manufacturer's instructions. LPG1x recombinant proteins were finally purified by size exclusion chromatography (Superdex™ 200 10/300 GL, GE Healthcare). According to SDS-PAGE, the proteins of interest were considered pure enough (>95%) to perform enzymatic assays. The concentration of contaminants was too low to enable their identification by mass fingerprinting after gel excision. The molecular weight, purity, and concentration were determined by MS, SDS-PAGE, and the Bradford assay44, respectively.

Coupled spectrophotometric assay

Enzymatic assays were performed in 96-well microtiter plates following a previously described protocol45. A few parameters were modified: the final volume of the reaction was 200 µL and the medium contained 0.2–10 µg of enzyme. Commercially available acceptors were tested at a final concentration of 1 mM. Twenty-nine commercially available carbohydrates were tested as acceptors, including hexoses (D-Glc, D-Man, D-GlcNH2, D-GlcNAc), monosaccharides (Me-α-D-Glc, Me-α-D-Man and Oct-α-D-Man), pNP-furanoses (pNP-α-L-Araf, pNP-β-D-Galf and pNP-β-D-Ribf), pNP-hexoses (pNP-α- and β-D-Glc, pNP-α- and β-D-Gal, pNP-α- and β-D-Man, pNP-α- and β-D-GlcNAc, pNP-α- and β-D-Xyl, pNP-α- and β-L-Ara, pNP-α- and β-L-Fuc, pNP-β-D-Fuc and pNP-α-L-Rha) and disaccharides (D-Maltose, D-Lactose and D-Melibiose). Controls, i.e. reaction mixtures lacking the donor, the acceptor or the enzyme, were performed in parallel. The reactions were monitored at 340 nm using a Multiskan™ GO (Thermo Scientific) microplate reader for up to 20 min at 10 s intervals. UDP formation rates were assumed to be equal to NADH consumption rates, and kinetic parameters were calculated by fitting saturation curves (obtained from the average of triplicate measurements) with the standard Michaelis–Menten equation (Eq. 1)46, using Prism 6 (GraphPad) (see Fig. S5).

$$v_{0}=\frac{V_{\max}\,[S]}{K_{\rm M}+[S]}\qquad (1)$$

Michaelis–Menten equation. [S] is the substrate concentration, v0 and Vmax are, respectively, the initial velocity and the apparent maximum velocity, and KM is the apparent Michaelis constant.
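As an illustration of how a saturation curve can be fitted with Eq. 1, the following is a minimal sketch in plain Python. It is not the Prism 6 procedure used in this work: the function name `fit_michaelis_menten`, the search bounds and the synthetic data are assumptions made for the example. The sketch relies on the fact that, for a fixed KM, the least-squares Vmax has a closed form, so that only KM needs to be searched (here by golden-section search on log KM, which assumes a single minimum).

```python
import math

def fit_michaelis_menten(s, v, iters=200):
    """Least-squares fit of v0 = Vmax*[S]/(KM+[S]); returns (Vmax, KM).

    Hypothetical helper, not the authors' Prism workflow.
    """
    def vmax_for(km):
        # For a fixed KM the model is linear in Vmax: closed-form optimum.
        f = [x / (km + x) for x in s]
        return sum(vi * fi for vi, fi in zip(v, f)) / sum(fi * fi for fi in f)

    def rss(km):
        # Residual sum of squares at the best Vmax for this KM.
        vmax = vmax_for(km)
        return sum((vi - vmax * x / (km + x)) ** 2 for x, vi in zip(s, v))

    # Golden-section search over ln(KM); bounds are an arbitrary assumption.
    inv_phi = (math.sqrt(5) - 1) / 2
    a = math.log(min(s) / 100.0)
    b = math.log(max(s) * 100.0)
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if rss(math.exp(c)) < rss(math.exp(d)):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return vmax_for(km), km

# Synthetic, noise-free data (illustrative values only, not measured data).
substrate_mM = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0]
true_vmax, true_km = 10.0, 0.05
rates = [true_vmax * x / (true_km + x) for x in substrate_mM]
vmax_fit, km_fit = fit_michaelis_menten(substrate_mM, rates)
```

On noise-free data the fit should recover the parameters used to generate the curve; with real triplicate averages, a dedicated nonlinear regression tool also provides confidence intervals, which this sketch does not.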

Glycosylation reaction assay and High-Resolution Mass Spectrometry (HRMS) analysis

A magnetically stirred 3 mL solution containing methyl α-D-mannopyranoside (25 μmol, 5 mg), the UDP-sugar (5 μmol, 2.8 mg), 50 mM Tris pH 8, 20 mM MgCl2 and 0.5 mg of the GalfT to be tested was prepared. The reaction was incubated at 37 °C for 24 h. Analytical thin layer chromatography of the reaction mixture was performed on silica gel aluminium supported plates. The eluent was composed of ethyl acetate, methanol and water with the ratio of 7/2/1. Sugars were detected with orcinol solution (95% ethanol 100%, 5% H2SO4 and orcinol 200 mg) after heating at 100 °C (see Fig. S6). After reaction solvent evaporation, 1 mL of acetic anhydride and 1 mL of pyridine were added and the reaction was left at room temperature for 48 h. Then, the residue was concentrated through co-evaporation with toluene. The reaction mixture was then resuspended and the peracetylated sugar was isolated by extraction into CH2Cl2. Finally, high-resolution accurate mass measurements were performed in positive mode with an ESI source on a Q-TOF mass spectrometer (Bruker MaXis) with an accuracy tolerance of 2 ppm by the “Fédération de Recherche” ICOA/CBM (FR2708) analytical platform (see Fig. S7).
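The expected m/z of the peracetylated product can be estimated from monoisotopic masses, which is a convenient check against the 2 ppm tolerance. The sketch below assumes that the product is a methyl hexosyl-hexoside (C13H24O11) in which all seven free hydroxyl groups are acetylated, giving C27H38O18, and that it is detected as the singly charged sodium adduct; the function names are hypothetical and the isotope masses are standard tabulated values.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard tabulated values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON_MASS = 0.00054858

def sodium_adduct_mz(formula):
    """m/z of the singly charged [M+Na]+ ion for a neutral formula given as {element: count}."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + MONOISOTOPIC["Na"] - ELECTRON_MASS

def ppm_error(observed, theoretical):
    """Mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

# Assumed formula: methyl disaccharide C13H24O11 + 7 acetyl groups (each adds C2H2O).
PERACETYLATED_PRODUCT = {"C": 13 + 7 * 2, "H": 24 + 7 * 2, "O": 11 + 7}

theoretical_mz = sodium_adduct_mz(PERACETYLATED_PRODUCT)
error_ppm = ppm_error(673.1950, theoretical_mz)  # 673.1950 is the peak reported in the Results
```

Under these assumptions the computed value agrees with the peak at m/z = 673.1950 reported in the Results well within the stated tolerance.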

Results and Discussion

Expression of 4 lpg1X family genes as recombinant soluble proteins

The genes lpg1, lpg1L, lpg1R and lpg1G were amplified by PCR from Leishmania major genomic DNA, using primers designed to remove the N-terminal transmembrane domain. Interestingly, unlike the other lpg1X genes, lpg1G has a particular genetic context. In the L. major genome, three identical copies of the lpg1G gene are found in three distinct loci, all located near the telomeric end of the corresponding chromosomes. An increase in the copy number of genes near the telomeric end of chromosomes in Leishmania has been related to drug resistance mechanisms47,48. However, in the case of the lpg1G gene, the copies are located on separate chromosomes, making this gene a unique example for which the biological function and significance of amplification have still to be understood. For all four genes, the amplified fragments were initially cloned into different pET expression vectors (containing an N-terminal or both N- and C-terminal His-tags) such as pET-24b(+), pET-28a(+) and pET-32a(+) and transformed into various strains of E. coli. Each His-tagged recombinant protein was over-expressed, but the proteins obtained were very difficult to purify, as they formed strong complexes with the GroEL chaperonin (as identified by HRMS), and very low yields were obtained. Different literature protocols were tested to remove the contaminant but none was successful49. Thus, in our hands, pET vectors were found to be unsuitable for expressing the LPG1x proteins. To overcome this issue, the lpg1X genes were cloned into the pMAL-c2X expression vector to obtain MBP-tagged proteins. Constructs were transformed into the E. coli Rosetta (DE3) expression strain and proteins were over-expressed at 30 °C. Chemical and physical lysis techniques using lysozyme, heat shock and sonication were applied, followed by affinity and size exclusion chromatography. Finally, the pure desired recombinant proteins (Fig. 3) were obtained with a high yield of 5 mg/L of culture for LPG1 and LPG1R and 10 mg/L of culture for LPG1G and LPG1L.

Figure 3

Evaluation of the expression and purity of L. major GalfTs after the Superdex elution step by 1-D 8% SDS-PAGE with a standard mixture of marker proteins. Full gels are displayed in SI in Figs S25c.

Enzymatic assays

We used a coupled spectrophotometric assay to assess GalfT activity45. This assay correlates the formation of UDP with NADH consumption by coupling the activity of the GalfT to two enzymes, pyruvate kinase (PK) and lactate dehydrogenase (LDH). Of the twenty-nine commercially available carbohydrates that were tested as acceptors, including hexoses, monosaccharides, pNP-furanoses, pNP-hexoses and disaccharides, only methyl α-D-mannopyranoside (Me-Manp) reacted efficiently as an acceptor when used at 1 mM. It was therefore used as a simple acceptor, instead of synthesizing the complex natural acceptor. The acceptor ability of Me-Manp was anticipated, as a Galf–Manp linkage is present in both LPG and GIPL (see Fig. 2).
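The conversion from an absorbance trace to a reaction rate in such a coupled assay follows the Beer–Lambert law. The sketch below is only illustrative: it uses the standard molar absorption coefficient of NADH at 340 nm (6220 M−1 cm−1) and assumes a 1:1 stoichiometry between UDP formed and NADH consumed, as stated in the Methods. The optical path length of a 200 µL well, the slope and the enzyme concentration are hypothetical values, not parameters reported here.

```python
NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1, standard value for NADH at 340 nm

def udp_rate_uM_per_min(slope_a340_per_min, path_length_cm):
    """UDP formation rate (µM/min) from the A340 slope, assuming 1 UDP per NADH consumed.

    The slope is negative while NADH is being consumed, hence the sign change.
    """
    molar_per_min = -slope_a340_per_min / (NADH_EPSILON_340 * path_length_cm)
    return molar_per_min * 1e6

def turnover_per_min(rate_uM_per_min, enzyme_uM):
    """Rate per enzyme molecule (min^-1); equals kcat only at saturating substrate."""
    return rate_uM_per_min / enzyme_uM

# Hypothetical example: slope of -0.0311 A/min in a well with an assumed 0.5 cm path length.
rate = udp_rate_uM_per_min(-0.0311, path_length_cm=0.5)
turnover = turnover_per_min(rate, enzyme_uM=0.001)
```

In practice the path length depends on the plate and fill volume, so it should be measured or taken from the reader's path-length correction rather than assumed.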

LPG1, LPG1L, LPG1R and LPG1G are galactofuranosyltransferases

Using Me-Manp as the acceptor, kinetic data were obtained (three replicate experiments) for each enzyme. The Michaelis–Menten analysis of the data is shown in Table 1. All four enzymes recognized UDP-Galf with apparent KM values ranging from 0.02 to 0.55 mM. Based on that parameter, the proteins can be clustered into two groups: LPG1, LPG1G, and LPG1L, which bind UDP-Galf strongly with KM values in the low μM range, and LPG1R, which binds UDP-Galf more weakly. The UDP-Galf KM value for LPG1R is comparable to the one reported for mycobacterial GlfT2 (0.38 mM). Moreover, the kcat values of LPG1, LPG1G and LPG1L range from 5,000 to 30,000 min−1 and are thus higher than the catalytic rate of LPG1R, which is at least 10-fold lower with a value of 636 min−1 (and comparable to the catalytic rate reported for GlfT2). The low enzymatic activity observed for LPG1R, in comparison with the three other GalfTs, might be explained by incorrect folding of the enzyme caused by the removal of the N-terminal transmembrane domain. However, this hypothesis could be assessed, as no loss of activity was observed for any of the other expressed LPG1x proteins. Moreover, as all four proteins were soluble and no preliminary structural data (e.g. circular dichroism) on the full-length proteins are available, proper folding could not be confirmed. These marked differences in the Michaelis constant and turnover rate are also found when estimating the catalytic efficiency (kcat/KM): LPG1, LPG1G and LPG1L yield values higher than 300,000 min−1 mM−1, as opposed to 1,145 min−1 mM−1 and 1,131 min−1 mM−1 for LPG1R and GlfT2, respectively. In summary, LPG1, LPG1G and LPG1L demonstrated strong in vitro GalfT activity, with catalytic efficiencies at least 300 times higher than those of LPG1R and of the previously reported GlfT2 from M. tuberculosis. However, in M. tuberculosis GlfT2 is a polymerizing enzyme that adds around 30 Galf units, linked by alternating β-(1 → 5) and β-(1 → 6) glycosidic bonds50. GlfT2 belongs to family 2 of the CAZy glycosyltransferases (GT2)51,52, which contains mainly polymerizing enzymes such as cellulose or chitin synthases. By contrast, the LPG1x enzymes belong to the CAZy GT family 40, which only contains putative GalfTs from trypanosomatids. In Leishmania species, GalfTs introduce only one Galf residue at the 3-OH position of the mannosyl acceptor with β-selectivity, for example in LPG and GIPL (see Fig. 2).
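The relationship between the three kinetic parameters quoted in the text can be checked with simple arithmetic. The sketch below uses only values stated in this article for LPG1R (kcat = 636 min−1; kcat/KM = 1,145 min−1 mM−1 with UDP-Galf and 7,994 min−1 mM−1 with UDP-α-D-Glcp, the latter quoted in the donor specificity section); the function names are hypothetical.

```python
def implied_km_mM(kcat_per_min, efficiency_per_min_per_mM):
    """KM (mM) implied by a turnover number and a catalytic efficiency kcat/KM."""
    return kcat_per_min / efficiency_per_min_per_mM

def fold_change(value, reference):
    """Ratio of two catalytic efficiencies expressed in the same units."""
    return value / reference

# LPG1R with UDP-Galf: values quoted in the text.
km_lpg1r = implied_km_mM(636.0, 1145.0)

# LPG1R: UDP-Glcp efficiency relative to UDP-Galf efficiency.
glcp_vs_galf = fold_change(7994.0, 1145.0)
```

The implied KM of about 0.56 mM matches the upper end of the 0.02–0.55 mM range given above, and the ratio of about 7 corresponds to the "close to a seven-fold increase" mentioned for UDP-α-D-Glcp.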

Table 1 Kinetic parameters of leishmanial GalfTs LPG1, LPG1G, LPG1L, LPG1R compared with mycobacterial GlfT2 for UDP-α-D-Galf.

In addition, after peracetylation of the concentrated reaction mixture, HRMS analysis demonstrated the presence of the corresponding disaccharide, indicated by a peak at m/z = 673.1950 corresponding to the exact mass of the sodium adduct of the peracetylated, glycosylated methyl α-D-mannopyranoside product (see Fig. S7). Unfortunately, 1H, HMBC and HMQC NMR experiments led to weak signals, probably due (i) to the low amount of disaccharide or, most probably, (ii) to the presence of a mixture of (1–2, 1–3, 1–4 and/or 1–6) regioisomers. It is noteworthy that Galf-Manp-containing structures can also be found in other pathogenic microorganisms such as Cryphonectria parasitica (1–2 linkage), Aspergillus (1–3 or 1–6 linkages) or Paracoccidioides brasiliensis (1–6 linkage)5. Still, these HRMS data unambiguously confirm that the LPG1x family can catalyse the transfer of a Galf residue to Me-Manp. These GalfT activities are unique both (i) in terms of their high catalytic efficiency toward UDP-Galf and (ii) because they are the first, and to date only, kinetically characterized enzymes from the CAZy GT family 40. Especially given their high turnover values, comparable to those of sucrose or glycogen phosphorylases, and although more studies will be required to discover efficient acceptors for these enzymes, they constitute original biocatalytic tools that will be useful for the chemoenzymatic synthesis of galactofuranosyl-containing conjugates. Such compounds are expected to be useful biological probes for studying cytosolic mutases or eukaryotic transporters53,54,55 present in the Golgi membranes of Leishmania.

LPG1x family can also use UDP-pyranoses as sugar donors

Five NDP-pyranoses were also tested with each LPG1x GalfT (Table 2) in order to probe the substrate specificity of these enzymes with artificial donors. Unexpectedly, all four enzymes were able to use UDP-pyranoses as donor substrates. Preparative reactions were incubated at 37 °C for 24 h and the reactions were followed by TLC (see Fig. S6). The product of these reactions exhibited an Rf = 0.16, similar to that of the maltose used as a reference. HRMS analysis of the peracetylated sugar products enabled the identification of the corresponding disaccharide (see Fig. S7). None of the four recombinant proteins was able to use UDP-α-D-glucuronic acid as a substrate. LPG1, LPG1G and LPG1R were the least promiscuous, as they were able to recognize only one UDP-pyranose. LPG1 and LPG1G recognize UDP-α-D-Galp with respective kcat/KM values of 1,400 min−1 mM−1 and 27,978 min−1 mM−1. This is far lower than for UDP-Galf, mainly because of a much lower kcat. Indeed, the apparent KM is still in the sub-mM range, even as low as 5 μM for LPG1G. LPG1R only recognized UDP-α-D-Glcp, with kinetic properties similar to those of the two previous GalfTs for UDP-Galp, i.e. a low kcat/KM (7,994 min−1 mM−1). Still, this is close to a seven-fold increase as compared to UDP-Galf, and LPG1R therefore exhibits a better glucopyranosyltransferase than galactofuranosyltransferase activity, at least in vitro. LPG1L was the most promiscuous enzyme in this respect, as it was able to catalyse the reaction not only with UDP-α-D-Galp and UDP-α-D-Glcp but also, very surprisingly, with GDP-α-D-Manp and GDP-α-D-Glcp, albeit with lower specificity. Once again, with the LPG1L GalfT the apparent KM values were in the tens of millimolar range and the kcat values were as low as 24 min−1. The NDP-pyranoses were all recognized in a similar manner, but UDP nucleotide sugars led to reactions 50- to 100-fold faster than their GDP counterparts. Among the few characterized GalfTs, only M. tuberculosis GlfT2 has been reported to be able to use and incorporate analogues of galactofuranose (deoxy and fluoro derivatives)26. However, the LPG1x enzymes exhibit higher substrate promiscuity, as they can utilize UDP-pyranose donors.

Table 2 Kinetic values of LPG1, LPG1G, LPG1L and LPG1R for UDP-D-pyranoses (n.d: no enzymatic activity detected).

The ability of LPG1L to recognize and utilize a diversity of nucleotide-sugar donors places this enzyme among the most promiscuous natural, characterized glycosyltransferases reported to date in terms of the donor. This finding also highlights this protein as a promising tool for glycorandomization, at least for transferring carbohydrate residues to α-Manp-containing acceptors56,57. The hypothesis that this activity results from the residual activity of a contaminant mutase from E. coli followed by classical GalfT activity was ruled out for three reasons: (i) UDP-α-D-Glcp, GDP-α-D-Manp and GDP-α-D-Glcp are not reported substrates of the mutase58, (ii) no residual band at 42 kDa, corresponding to the mass of the mutase, was observed on SDS-PAGE (see Fig. 3)59, and (iii) the kinetic parameters obtained are incompatible with those observed for the mutase as a contaminant60.

Conclusion

Despite the natural occurrence of galactofuranose in many pathogenic microorganisms, the pathways involved in its biosynthesis remain poorly understood. This is due, in part, to a lack of knowledge of the enzymes involved in its incorporation into glycoconjugates (mutases, transferases, transporters, hydrolases). The tedious synthesis of the required donor substrate, UDP-α-D-Galf, is another barrier4,61,62. This work provides the first enzymatic characterization of eukaryotic GalfTs and substantially increases our knowledge of these rare enzymes from the CAZy GT family 40. In addition to UDP-α-D-Galf, these enzymes proved able in vitro to use some NDP-pyranoses as substrates, thus indicating that they are among the most promiscuous natural glycosyltransferases reported to date. These unique biocatalysts also proved to be stable and robust for days and can now serve for the chemo-enzymatic incorporation of Galf moieties into complex glycoconjugates.