Abstract
Predicting a protein’s 3D structure from its sequence is one of the most challenging tasks in computational biology, and a confident interresidue contact map is the main driver of ab initio protein structure prediction. Benefiting from ever-growing sequence databases, residue contact prediction has recently been revolutionized by the introduction of direct coupling analysis and deep learning techniques. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. Here, we introduce a novel contact prediction method based on fully convolutional neural networks and evolutionary features extracted extensively from multi-sequence alignments. The results show that our deep learning model, built on a highly optimized feature extraction mechanism, is very effective in interresidue contact prediction.
Acknowledgment
This work is partly supported by the Strategic Priority CAS Project XDB38000000, the National Science Foundation of China under Grant No. U1813203, the Shenzhen Basic Research Fund under Grant Nos. JCYJ20180507182818013, JCYJ20200109114818703 and JCYJ20170413093358429, the Hong Kong Research Grant Council under Grant No. GRF-17208019, and the CAS Key Lab under Grant No. 2011DP173015. We would also like to thank the Outstanding Youth Innovation Fund (CAS-SIAT to Huiling Zhang).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, H., Wu, H., Ting, HF., Wei, Y. (2021). Protein Interresidue Contact Prediction Based on Deep Learning and Massive Features from Multi-sequence Alignment. In: Zhang, Y., Xu, Y., Tian, H. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_19
DOI: https://doi.org/10.1007/978-3-030-69244-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69243-8
Online ISBN: 978-3-030-69244-5
eBook Packages: Computer Science (R0)