Introduction

Klebsiella pneumoniae is one of the major causes of nosocomial infections worldwide (Podschun and Ullmann 1998). K. pneumoniae colonizes the human gastrointestinal tract and can cause infection if it gains entry to the blood and other tissues (Piperaki et al. 2017). It can cause meningitis and sepsis in neonates and infants in pediatric wards, as well as serious infections in immunocompromised children and adults. K. pneumoniae is also one of the causative agents of community-acquired pneumonia (Piperaki et al. 2017). The hospitalization and mortality rates for patients infected by K. pneumoniae vary among countries. For instance, the overall hospitalization and mortality rates for patients with pneumonia caused by K. pneumoniae in the United States are 3.6 cases/100,000 population and 14.3%, respectively, and the hospitalization rate increased from 2002 to 2011 (Wuerth et al. 2016). The worldwide pooled mortality rates of patients infected with carbapenem-resistant and carbapenem-susceptible K. pneumoniae were reported to be 42.14% and 21.16%, respectively (Xu et al. 2017).

A significant increase of more than 40% in antibiotic resistance has been reported for K. pneumoniae, mainly for the following antibiotics: cefazolin, cefuroxime, amoxicillin-clavulanic acid, ceftriaxone, cefepime, and ceftazidime (Kahraman and Çiftcia 2017). The global dissemination of hypervirulent strains of K. pneumoniae and the emergence of antibiotic-resistant isolates of this pathogen narrow the treatment options and have renewed interest in vaccines against it. A number of vaccine candidates, including inactivated whole cell vaccines, capsular polysaccharides, outer membrane vesicles, and highly conserved immunogenic proteins, have been proposed for the prevention of K. pneumoniae infections. A clinical trial aimed at K. pneumoniae prevention evaluated a multicomponent vaccine called Immunovac in patients with chronic pulmonary diseases (Egorova et al. 2010). This vaccine contained antigenic complexes derived from K. pneumoniae, P. vulgaris, S. aureus, and E. coli and was observed to be immunogenic in elderly patients and preschool children (Egorova et al. 2010). However, despite numerous attempts, no vaccine candidate for K. pneumoniae prevention is yet fully protective, safe, and globally available, which demands new strategies to find more efficient antigens for developing an effective vaccine (Pletz et al. 2016; Choi et al. 2019).

In vivo and in vitro vaccine studies are time-consuming and costly. Bioinformatics approaches such as in silico epitope prediction can ease these processes by introducing novel vaccine candidates or characterizing previously found ones (He et al. 2010).

Fimbriae proteins of K. pneumoniae have been shown to be effective protein carriers and immunogens (Li et al. 2010; Staniszewska et al. 2000; Lavender et al. 2005; Witkowska et al. 2005; Babu et al. 2017). However, vaccine candidates based on these proteins are at the early stages of development and require more study (Choi et al. 2019). Type 1 fimbriae are composed of about 1000 units of FimA (the major subunit) and single copies of FimF (the adaptor subunit), FimG (the tip subunit), and FimH (the adhesin) (Volkan et al. 2014).

It is accepted that humoral (antibody) responses play a critical role in controlling K. pneumoniae infections. Anti-K. pneumoniae antibodies evoked in experimental animals have been shown to confer protection (Libon et al. 2002; Lee et al. 2015; Hegerle et al. 2018). B cell-specific epitopes are linear or discontinuous amino acid sequences that induce specific, desirable, and broad-spectrum humoral immunity (Chyau Liang 1998). Defining these epitopes will help to find and characterize novel vaccine candidates for this gram-negative pathogen.

In our previous work, we predicted T cell epitopes of four type 1 fimbriae antigens, namely FimA, FimF, FimG, and FimH, using immunoinformatics, docking, and molecular dynamics (Rostamian et al. 2020). Given the importance of humoral immune responses against K. pneumoniae, here we performed an in silico study on these four Fim antigens as vaccine candidates to find and characterize their B-cell epitopes. Furthermore, the selected B cell epitopes were modeled and their physicochemical properties were estimated to assess them as potential vaccine candidates.

Methods

Sequences Preparation

The sequences of FimA (accession number: CDO12578.1), FimF (accession number: CDO12574.1), FimG (accession number: CDO12573.1), and FimH (accession number: CDO12572.1) were retrieved from the NCBI protein database (https://www.ncbi.nlm.nih.gov/protein). Note that K. pneumoniae type 1 fimbriae contain about 1000 identical FimA subunits, but here we used only a single FimA subunit.
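The retrieval step can also be scripted. The following is a minimal sketch, assuming the NCBI E-utilities EFetch endpoint is used in place of the web interface cited above; the helper names `efetch_url` and `parse_fasta` are illustrative and are not part of the original workflow. No network request is made by the block itself.

```python
from urllib.parse import urlencode

# NCBI E-utilities EFetch endpoint (assumed here as a scripted alternative
# to the web interface used in the study).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers as listed in the Methods.
ACCESSIONS = {
    "FimA": "CDO12578.1",
    "FimF": "CDO12574.1",
    "FimG": "CDO12573.1",
    "FimH": "CDO12572.1",
}


def efetch_url(accession):
    """Build an EFetch URL that returns one protein record as FASTA text."""
    query = urlencode(
        {"db": "protein", "id": accession, "rettype": "fasta", "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}.

    The identifier is the first whitespace-delimited token after '>'.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}
```

To download a record, the URL from `efetch_url(ACCESSIONS["FimA"])` could be passed to `urllib.request.urlopen` and the decoded response handed to `parse_fasta`.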

Linear B-Cell Epitope Prediction

For prediction of linear epitopes, the Fim protein sequences were submitted to the IEDB (https://tools.iedb.org/main/bcell/) (Fleri et al. 2017) and ABCpred (https://crdd.osdd.net/raghava/abcpred/) (Saha and Raghava 2006) servers. The IEDB server contains five tools to predict linear B-cell epitopes that are based on the following scales: hydrophilicity (Parker et al. 1986), beta turn (Chou and Fasman 2006), surface accessibility (Emini et al. 1985), flexibility (Karplus and Schulz 1985), and antigenicity (Kolaskar and Tongaonkar 1990). In the first step, the linear B-cell epitopes of the Fim proteins were predicted by these five tools. For each tool, as recommended by the tool, the window size was set to seven amino acid residues and the threshold was left at the tool's default. Each tool provided a chart in which regions with scores above the threshold are more likely to be B-cell epitopes. All charts were compared, and the regions predicted to be epitopes by at least three (out of five) tools were selected for further analysis.
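The windowing and voting logic described above can be illustrated as follows. This is a simplified sketch and not the IEDB implementation: it assumes each tool reduces to a per-residue propensity scale averaged over a seven-residue window (some of the cited methods combine values differently), and the function names are hypothetical.

```python
def window_scores(sequence, scale, window=7):
    """Average a per-residue scale over a sliding window.

    The score is assigned to the centre residue; positions too close to
    either end to hold a full window are left as None.
    """
    half = window // 2
    scores = [None] * len(sequence)
    for centre in range(half, len(sequence) - half):
        segment = sequence[centre - half:centre + half + 1]
        scores[centre] = sum(scale[aa] for aa in segment) / window
    return scores


def consensus_positions(score_tracks, thresholds, min_votes=3):
    """Return positions scoring above threshold in at least min_votes tracks."""
    selected = []
    for pos in range(len(score_tracks[0])):
        votes = sum(
            1
            for track, threshold in zip(score_tracks, thresholds)
            if track[pos] is not None and track[pos] > threshold
        )
        if votes >= min_votes:
            selected.append(pos)
    return selected


def merge_runs(positions):
    """Collapse sorted positions into inclusive (start, end) runs."""
    runs = []
    for pos in positions:
        if runs and pos == runs[-1][1] + 1:
            runs[-1][1] = pos
        else:
            runs.append([pos, pos])
    return [(start, end) for start, end in runs]
```

With five score tracks and `min_votes=3`, `merge_runs(consensus_positions(...))` yields the candidate regions agreed on by at least three of the five tools.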

Two other linear B-cell epitope prediction programs, BepiPred 1.0 (Larsen et al. 2006) and BepiPred 2.0 (Jespersen et al. 2017), are also available on the IEDB server. BepiPred 1.0 combines a hidden Markov model with a propensity scale method, while BepiPred 2.0 uses a Random Forest algorithm trained on epitope and non-epitope amino acid residues determined from crystal structures (Larsen et al. 2006; Jespersen et al. 2017). Like the tools mentioned above, these tools report their results as charts in which regions with scores above the threshold are defined as epitopes.

To find further probable linear B-cell epitopes, the ABCpred server was also applied. This server predicts epitopes using an artificial neural network (Saha and Raghava 2006). The protein sequences were submitted and the other parameters were left at their defaults. The server ranks the predicted linear epitopes by score, where a higher score means a higher probability of being an epitope.

Finally, to find similar predicted epitopes, all epitopes obtained from the IEDB (seven tools) and ABCpred were compared using the Epitope Cluster Analysis tool (https://tools.iedb.org/cluster/), and the common epitopes were selected as the final linear B-cell epitopes.

Modelling of Fim proteins

Prediction of conformational B-cell epitopes requires the tertiary structures of the proteins. Since the 3D structures of our proteins were not available in the Protein Data Bank (PDB), the structures of the Fim proteins were obtained by modeling with the SWISS-MODEL server (https://swissmodel.expasy.org/; Waterhouse et al. 2018).

The amino acid sequences of the Fim proteins were submitted to the server. Following target-template alignment, the server found and used the available 3D structures of proteins similar to Fim (usually Fim proteins of other bacteria) as templates. Model accuracy and modeling errors were estimated by the qualitative model energy analysis (QMEAN) scoring function supplied by the server (Waterhouse et al. 2018). The quality of the models was also assessed by generating Ramachandran plots with the RAMPAGE server (https://mordred.bioc.cam.ac.uk/~rapper/rampage.php; Lovell et al. 2003). This server uses Ramachandran plots to evaluate whether each amino acid residue adopts a backbone conformation compatible with secondary structure, and reports model quality as the percentage of residues in the favoured, allowed, and outlier regions (Lovell et al. 2003).

Conformational B-Cell Epitope Prediction

The 3D models of the Fim proteins were submitted to DiscoTope 2.0 (Kringelum et al. 2012) and ElliPro (Ponomarenko et al. 2008) to predict conformational epitopes. The parameters were left at the tools' defaults. Each tool predicts one or more regions as epitopes. The regions common to the predicted epitopes of both tools were considered the final conformational epitopes.

Toxicity, Human Similarity, and Experimental Records

The linear epitopes and the segment sequences of conformational epitopes were checked for potential toxicity using the ToxinPred server (Gupta et al. 2013). The epitope sequences were entered in FASTA format, the support-vector machine (SVM) algorithm was selected, and other parameters were left at the server defaults.

Selected linear and conformational epitopes were screened for similarity with the human proteome (taxid 9606) using BLASTP (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins). Epitopes with ≥ 90% similarity to human peptides (in both coverage and identity) were excluded from the study.
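The exclusion rule can be expressed as a small filter over BLASTP results. The sketch below assumes the hits have already been exported into simple records holding percent identity and percent query coverage; the `BlastHit` class and function names are hypothetical, and parsing of actual BLAST output is not shown.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One BLASTP hit against the human proteome (illustrative record)."""
    epitope: str          # query peptide
    identity_pct: float   # percent identity of the alignment
    coverage_pct: float   # percent of the query covered by the alignment


def human_like_epitopes(hits, cutoff=90.0):
    """Epitopes with at least one hit at or above the cutoff in BOTH measures."""
    return {
        hit.epitope
        for hit in hits
        if hit.identity_pct >= cutoff and hit.coverage_pct >= cutoff
    }


def keep_non_human(epitopes, hits, cutoff=90.0):
    """Drop epitopes flagged as human-like, preserving input order."""
    flagged = human_like_epitopes(hits, cutoff)
    return [epitope for epitope in epitopes if epitope not in flagged]
```

An epitope whose best hit reaches the cutoff in only one of the two measures (for example, 100% coverage but 85% identity) is retained, since the criterion requires both.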

The shortlisted epitopes were checked for existing experimental records on the IEDB home page (https://www.iedb.org/) by selecting human as the host and using each epitope as the query.

Antigenicity and Allergenicity

Antigenic regions of the Fim proteins were predicted using the antigenic program of the European Molecular Biology Open Software Suite (EMBOSS) (https://www.bioinformatics.nl/cgi-bin/emboss/antigenic), which predicts antigenicity using the method of Kolaskar and Tongaonkar (1990). The amino acid sequences of the Fim proteins were entered into the program separately and other parameters were left at their defaults. The output lists amino acid sequences of different lengths and scores that are potentially antigenic regions. These predicted regions were clustered with the previously predicted linear and conformational epitopes using the Epitope Cluster Analysis tool (https://tools.iedb.org/cluster/). The Fim regions predicted by both the linear/conformational epitope prediction and the antigenicity prediction were selected as the best B-cell epitopes.

The presence of antigenic sequences in these selected epitopes was also confirmed with the Antigenic Peptide Prediction (APP) software (https://imed.med.ucm.es/Tools/antigenic.pl). APP uses the method of Kolaskar and Tongaonkar (1990) to predict the segments within a protein or peptide sequence that are likely to be antigenic, that is, to elicit an antibody response.

The allergenicity of the final peptides was checked with the Structural Database of Allergenic Proteins (SDAP) (Ivanciuc 2003) and AllerCatPro (Maurer-Stroh et al. 2019) servers.

3D Structure and Other Physicochemical Properties of Final Epitopes

The PEP-Fold3 server (https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/2018) was used to predict the tertiary structure of the selected epitopes (Lamiable et al. 2016). To obtain the other physicochemical properties of the epitopes, ProtParam from Expasy (https://web.expasy.org/protparam/) was used (Adhikari et al. 2018). These properties included molecular weight, half-life, instability index, net charge, hydrophobicity, grand average of hydropathicity index (GRAVY), and isoelectric point (pI).
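Of the properties listed, GRAVY is simple enough to reproduce by hand: it is the sum of the Kyte and Doolittle hydropathy values of all residues divided by the peptide length. The sketch below is an independent illustration of that definition, not ProtParam's code; the function name `gravy` is chosen here for convenience.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue.

    Positive values indicate a more hydrophobic peptide, negative values a
    more hydrophilic one.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    unknown = set(peptide) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)
```

For a seven-residue peptide rich in serine and threonine, such as TAGSTSS, the value is negative, which corresponds to a hydrophilic peptide.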

Moreover, helical wheel projection was carried out to predict the position of amino acids in peptides by Heliquest server (https://heliquest.ipmc.cnrs.fr/cgi-bin/ComputParams.py) (Gautier et al. 2008).
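The geometry behind a helical wheel is straightforward: in an ideal α-helix with 3.6 residues per turn, each successive residue is rotated by 100° around the helix axis. The sketch below computes these angular positions. It is not the Heliquest implementation, and the polar/non-polar grouping used for labelling is one common convention assumed here for illustration.

```python
# Assumed grouping of residues treated as non-polar for labelling purposes.
NONPOLAR = set("AVLIMFWPG")


def helical_wheel(peptide, degrees_per_residue=100.0):
    """Return (residue, angle in degrees, class) for each position.

    The angle is the rotation around the helix axis relative to the first
    residue, assuming an ideal alpha-helix (3.6 residues per turn).
    """
    wheel = []
    for index, residue in enumerate(peptide.upper()):
        angle = (index * degrees_per_residue) % 360.0
        kind = "non-polar" if residue in NONPOLAR else "polar"
        wheel.append((residue, angle, kind))
    return wheel
```

Sorting the result by angle shows which residues sit next to each other on the wheel and therefore whether polar and non-polar residues form continuous faces or interrupt one another.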

Results

Linear B-Cell Epitope Prediction

A linear B-cell epitope search was performed for all Fim proteins using the IEDB (seven tools) and ABCpred. Each IEDB tool provided a chart in which regions with scores above the threshold were more likely to be B-cell epitopes. These charts are presented in Figs. S1, S2, S3, and S4. After comparison of all charts, the regions predicted to be epitopes by at least five (out of seven) tools were selected. These selected epitopes were then compared with the ABCpred results and clustered using the cluster analysis tool with the cut-off set to ≥ 70% similarity. Only one epitope from each cluster was selected as a final linear B-cell epitope, namely the one with the highest prediction scores or the one predicted by the most tools. The number of epitopes predicted by each tool and the selected linear B-cell epitopes of each antigen are presented in Table 1.
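The cluster-then-pick-one step can be sketched as a greedy procedure. This is a stand-in for illustration only: the algorithm used by the IEDB cluster analysis tool is not described here, similarity is assumed to mean ungapped position-wise identity between equal-length peptides, and the scores in any example are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_representatives(scored_epitopes, cutoff=0.70):
    """Greedy clustering keyed by the highest-scoring member of each cluster.

    scored_epitopes maps peptide -> prediction score. Peptides are visited
    from highest to lowest score; each joins the first cluster whose
    representative it matches at >= cutoff identity, otherwise it starts a
    new cluster and becomes its representative.
    """
    ranked = sorted(scored_epitopes.items(), key=lambda item: item[1], reverse=True)
    clusters = {}
    for peptide, _score in ranked:
        for representative in clusters:
            if identity(peptide, representative) >= cutoff:
                clusters[representative].append(peptide)
                break
        else:
            clusters[peptide] = [peptide]
    return clusters
```

For seven-residue peptides, a 70% cut-off means that at least five of the seven positions must match for two peptides to fall into the same cluster. The keys of the returned dictionary are the representatives that would be carried forward.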

Table 1 The number of predicted and the selected linear B-cell epitopes

Modelling of Fim Proteins

The 3D structures of the Fim proteins were modeled by the SWISS-MODEL server based on target-template alignment. According to the quality estimates provided by the server, the models were of high quality and accuracy (Fig. 1a–c).

Fig. 1 Evaluation of the quality of the models. After homology modeling of the Fim proteins, model quality was evaluated by Local Quality Estimate (LQE) plots, Comparison with Non-redundant Set of PDB Structures plots, and Global Quality Estimate (GQE) plots prepared by the SWISS-MODEL server, as well as by Ramachandran plots from the RAMPAGE server. In LQE plots (a), residues with a score above 0.6 are expected to be of acceptable quality. In comparison plots (b), the closer the model (indicated by a red star) is to the dark grey region, the higher its quality. In GQE plots (c), for each term, a Z-score near zero indicates better agreement between the model and experimental structures of similar size. The results of the Ramachandran plots (d) are summarized in a table that shows the percentage of amino acid residues in favored, allowed, and outlier regions

Furthermore, the quality of the models was confirmed by Ramachandran plots, which showed a large proportion of amino acid residues in the favored regions (Fig. 1d).

Conformational B-Cell Epitope Prediction

The models of the Fim proteins were submitted to DiscoTope 2.0 and ElliPro to predict the conformational B-cell epitopes of each antigen. The tools predicted epitopes with different scores. The outputs were compared, and the regions that were predicted as epitopes by both tools or that received the highest scores were considered the final conformational epitopes. The number of epitopes predicted by each tool and the selected conformational epitopes are presented in Table 2. The locations of the selected conformational epitope residues for each Fim protein are illustrated in Fig. 2.

Table 2 The number of predicted and the selected conformational B-cell epitopes
Fig. 2 The locations of selected conformational epitope residues in Fim proteins. The locations of the selected epitopes are shown in yellow. The FimA, FimF, and FimG epitope locations were extracted from the ElliPro server, while the FimH epitope location was extracted from the DiscoTope 2.0 server

Toxicity, Human Similarity, and Experimental Records

The toxicity, human similarity, and experimental record investigations were performed for the selected linear epitopes and the segment sequences of conformational epitopes. For toxicity prediction, the ToxinPred server compared our peptides with 1805 toxic peptides to find any homology between them. The results showed that none of our selected epitopes was toxic to humans. Comparison of the epitopes with the human proteome showed that only two epitopes (TAGSTSS of FimA and TADSTGY of FimG) were identical (100% identity and 100% coverage) to human peptides; hence, they were excluded. No experimental record was found for any of our selected epitopes.

Antigenicity Prediction

The antigenicity prediction yielded sequences of different lengths and scores that were potentially antigenic regions (Table 3). These regions were clustered with our previously predicted linear and conformational epitopes, and the regions common to both were selected as the best B-cell epitopes. Where more than one common region existed, only the region that had been predicted by more tools or had higher prediction scores was selected (Table 4). Furthermore, the APP software predicted antigenic determinants in all of these final peptides, supporting their immunogenicity (Table 4).

Table 3 The antigenicity prediction of Fim proteins
Table 4 Final B-cell epitopes of Fim proteins

The allergenicity prediction results showed that none of our final epitopes is an allergen.

Structural and Physicochemical Properties Prediction

To simplify the description of the physicochemical and structural prediction results, the epitope symbols given in Table 4 are used. The physicochemical properties of the final epitopes, including molecular weight, half-life, instability index, net charge, hydrophobicity, GRAVY, and pI, are presented in Table 5. These results showed that BEF-G and BEF-H were basic, stable, and hydrophilic, while BEF-A was acidic, stable, and less hydrophilic. BEF-F was acidic and hydrophilic but unstable. The hydrophobic and hydrophilic faces of each peptide were drawn by helical wheel projection (Fig. 3).

Table 5 Predicted physicochemical properties of epitopes
Fig. 3 The helical wheel projection of the final epitopes. The helical wheel projections of BEF-A (a), BEF-F (b), BEF-G (c), and BEF-H (d) are shown. In all epitopes, polar and non-polar amino acids are distributed irregularly, disrupting the uniformity of the hydrophobic and hydrophilic faces. The amino acid residues are shown by standard single letters

The 3D structure predictions revealed that BEF-A and BEF-F are a mixture of helix and coil, while BEF-G and BEF-H are a combination of β-sheet, turn, and coil (Fig. 4).

Fig. 4 Structure prediction. The 3D structures of the epitopes were predicted by the PEP-Fold3 server. The 3D structures of BEF-A, BEF-F, BEF-G, and BEF-H are shown in parts A, B, C, and D, respectively

Discussion

In the present study, we used immunoinformatics approaches to find B cell epitopes of four Fim proteins as vaccine candidates against K. pneumoniae. B cell epitopes may be linear or conformational. Linear epitopes are short sequences that elicit humoral responses even if the antigen is administered in denatured form. Conversely, conformational epitopes consist of atoms from distant amino acid residues brought together on the surface of the antigen. These epitopes may be bound by either an immunoglobulin or a B cell receptor and trigger humoral responses (Chyau Liang 1998).

B-cell epitope prediction yielded 5, 3, 5, and 4 final linear epitopes, each seven amino acids long. Since we set the window size of the prediction tools to seven, all predicted epitopes are seven-amino acid sequences. This does not mean that B-cell epitopes are restricted to seven amino acids in length; rather, these seven amino acids form the core of the B-cell epitopes and are almost always present in antibody-epitope interactions.

In contrast, conformational epitope prediction led to different numbers of epitopes of various sequence lengths for each Fim antigen. For FimA, a conformational epitope was found (amino acids 29 to 35) that had also been found by linear epitope prediction, emphasizing the importance of this region as an epitope. For each of the other Fim proteins, a conformational epitope composed of discontinuous parts was found. There was no relationship between these discontinuous parts and the final predicted linear epitopes. However, this is not a problem, since conformational epitopes are not necessarily associated with linear ones.

We applied several other analyses to choose the best epitopes, including prediction of antigenicity, allergenicity, toxicity, and human similarity, and a search for experimental records. These analyses resulted in four final epitopes (one for each Fim protein) that were immunogenic, antigenic, not similar to human peptides, non-allergenic, and non-toxic.

To find conformational epitopes, prediction tools need the 3D structure of the antigens. If the tertiary structure of the antigens is known, the prediction tools can find the conformational epitopes more efficiently. Otherwise, in silico approaches provide a way to model the structure of the antigens and map their conformational epitopes. Accordingly, in the present study we modeled the Fim antigens using in silico approaches. In general, the quality estimates showed that our models are acceptable, although some quality scores were low for some models.

Although no study was found on our Fim antigens, there are some bioinformatics studies on B cell epitope mapping of other K. pneumoniae antigens. In a study by Dar et al., 222 genomes of K. pneumoniae were explored, and the core proteome was identified and used for in-depth reverse vaccinology. They identified four antigens and predicted their B cell epitopes using ABCpred (Dar et al. 2019). By applying several bioinformatics approaches, and based on conformational B cell and linear CD4+ T cell epitopes, Farhadi et al. engineered a polytopic vaccine using five Omps from K. pneumoniae (Farhadi et al. 2015). These studies and ours emphasize the potential of prediction tools for identifying B cell epitope-based vaccine candidates for K. pneumoniae.

If an epitopic peptide is to be used as a vaccine candidate, it is essential to elucidate its structure and physicochemical properties. Therefore, in the present study, the physicochemical and structural properties of the four final epitopes were investigated. These properties include molecular weight, half-life, instability index, net charge, hydrophobicity, GRAVY, pI, and 3D structure, all of which are valuable and increase the accuracy and precision of epitope design (Adhikari et al. 2018). The isoelectric point (pI) is an important factor in selecting a good epitope (Campos-Pinto et al. 2019). Peptides at their isoelectric point carry no net electrical charge, and their immunogenicity is reduced because antibodies and other specific immune receptors such as the B cell receptor prefer charged and polar antigen residues (Wang et al. 2018). The pH of the blood and body tissues is near neutral (pH 7.2–7.6); therefore, it is preferable that the pI of the epitopes not be close to this range. In this study, the pI of all epitopes except BEF-G is far from the physiological pH of the body.

GRAVY indicates the degree of hydrophilicity or hydrophobicity of a protein. Peptides with a positive GRAVY value are more hydrophobic, while peptides with a negative GRAVY value are more hydrophilic and tend to be more water soluble (Kyte and Doolittle 1982; Adhikari et al. 2018). All of our final epitopes were in the suitable range in this regard. Increasing the proportion of hydrophobic residues on the non-polar face of a peptide, and thereby the symmetry between its hydrophobic and hydrophilic faces, can enhance its toxicity and hemolytic activity (Boland and Separovic 2006; Madanchi et al. 2019). Helical wheel projection showed that in all epitopes polar and non-polar residues interrupt one another, so no symmetric hydrophobic and hydrophilic faces were seen. This arrangement reduces the possibility of cytotoxicity and hemolytic activity of these peptides in humans. This finding is consistent with the toxicity prediction by the ToxinPred server, which indicated that our four final epitopic peptides are non-toxic.

Altogether, we found four B cell epitopes (one for each Fim antigen) that are immunogenic, antigenic, not similar to human peptides, non-allergenic, and non-toxic. They also have physicochemical properties suitable for vaccine candidates, although their actual efficacy should be investigated by in vitro and in vivo testing.

It should be noted that several foreseeable challenges in developing K. pneumoniae vaccines were not considered in our study. First, vaccines developed against K. pneumoniae should not cross-react with the normal flora of the intestine, where this bacterium colonizes (Ahmad et al. 2012). Second, patients vulnerable to K. pneumoniae are often elderly and suffer from multiple comorbidities. Consequently, the immune responses of these populations to a vaccine may be poor and may not induce protective immunity (Weinberger et al. 2008).