Protein Interresidue Contact Prediction Based on Deep Learning and Massive Features from Multi-sequence Alignment

  • Conference paper
Parallel and Distributed Computing, Applications and Technologies (PDCAT 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12606)

Abstract

Predicting a protein’s 3D structure from its sequence is one of the most challenging tasks in computational biology, and a confident interresidue contact map serves as the main driver towards ab initio protein structure prediction. Benefiting from ever-growing sequence databases, residue contact prediction has recently been revolutionized by the introduction of direct coupling analysis and deep learning techniques. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. Here, we introduce a novel contact prediction method based on fully convolutional neural networks and extensively extracted evolutionary features from multi-sequence alignment. The results show that our deep learning model, built on a highly optimized feature extraction mechanism, is very effective in interresidue contact prediction.
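
As an illustration only of what an evolutionary feature derived from a multi-sequence alignment can look like, the sketch below computes mutual information between alignment columns for a toy alignment. This is a classical covariation statistic, not the feature set or the network described in this paper; the toy alignment and the helper names `column_mutual_information` and `mi_matrix` are assumptions made for this example.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences (hypothetical input format).
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)            # residue counts in column i
    col_j = Counter(seq[j] for seq in msa)            # residue counts in column j
    pairs = Counter((seq[i], seq[j]) for seq in msa)  # joint residue-pair counts
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


def mi_matrix(msa):
    """Symmetric L x L matrix of pairwise column mutual information."""
    length = len(msa[0])
    matrix = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i + 1, length):
            matrix[i][j] = matrix[j][i] = column_mutual_information(msa, i, j)
    return matrix


# Toy alignment (made up for illustration): columns 0 and 2 covary,
# column 1 is fully conserved, column 3 varies independently of column 0.
toy_msa = ["ACDE", "ACDF", "GCHE", "GCHF"]
features = mi_matrix(toy_msa)
```

In this toy case the covarying columns 0 and 2 receive a mutual information of ln 2, while the conserved column and the independently varying column score 0 against column 0. A matrix of this shape (sequence length by sequence length) is the kind of pairwise input a 2D convolutional network can consume, though real pipelines typically add sequence weighting and corrections that are omitted here.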

Acknowledgment

This work is partly supported by Strategic Priority CAS Project XDB38000000, National Science Foundation of China under Grant No. U1813203, the Shenzhen Basic Research Fund under Grant Nos. JCYJ20180507182818013, JCYJ20200109114818703 and JCYJ20170413093358429, Hong Kong Research Grant Council under Grant No. GRF-17208019 and CAS Key Lab under Grant No. 2011DP173015. We would also like to thank the Outstanding Youth Innovation Fund (CAS-SIAT to Huiling Zhang).

Author information

Corresponding author

Correspondence to Yanjie Wei.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, H., Wu, H., Ting, HF., Wei, Y. (2021). Protein Interresidue Contact Prediction Based on Deep Learning and Massive Features from Multi-sequence Alignment. In: Zhang, Y., Xu, Y., Tian, H. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_19

  • DOI: https://doi.org/10.1007/978-3-030-69244-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69243-8

  • Online ISBN: 978-3-030-69244-5

  • eBook Packages: Computer Science, Computer Science (R0)
