Computational strategies for protein conformational ensemble detection
Introduction
Let us dream the dream of the protein scientist: knowing everything about protein function from sequence, including how the protein responds to intrinsic and extrinsic perturbations such as mutations and changes in the environment, would be the key to solving many biological and biotechnological problems. The questions would arrive in stages of increasing complexity. Our scientist would first wish to know what the protein structure looks like under a condition in which it is functional. Then would come the question of how the function is facilitated, and whether alternative conformations can achieve it. How would the protein behave in a given environment? In an alternative environment? How would it respond to arising mutations? To molecular crowding? What role would kinetics play, for example, while switching between conformations or when the environment is altered? A plethora of computational strategies has been developed to date, tackling each of these questions.
Making the dream come true needs unification. We begin by collecting all these inquiries into a ‘dynamical conformational search problem of proteins.’ The hypothetical free energy surface of a protein (Figure 1) contains one or more minima corresponding to the regions defined by the various three-dimensional structures the protein assumes. The surface also contains other low-lying minima that represent misfolded states, and higher-energy local minima for on-pathway, partially folded states. Because the folding free energy difference is typically on the order of a few kcal/mol for a globular protein [1], an order of magnitude smaller than that of the simplest pairwise interaction, the covalent bond, a physics-based solution to the protein folding problem has proven extremely difficult. Small deviations at the level of bonds would completely mislead the global solution for determining the correct folded state(s) of the polypeptide chain. Recent breakthroughs in applying artificial intelligence to place the structure in the vicinity of the functional region from sequence information alone allow us to set that part of the problem aside [2]. Here, we address advances and open problems in strategies to navigate the neighborhood of the functionally relevant regions of the dynamical conformational space (Figure 1b–d). We refer the reader to the toy model of Figure 2a while we discuss strategies for efficient modeling of the conformational surface (CS) of proteins.
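To make the scale of "a few kcal/mol" concrete, the following minimal sketch converts free energies of a handful of states into equilibrium Boltzmann populations. The three-state surface and its free energy values are hypothetical and chosen only for illustration; they are not taken from Figure 1 or from any cited work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(free_energies_kcal, temperature_k=298.15):
    """Equilibrium populations of states given their free energies (kcal/mol)."""
    beta = 1.0 / (R_KCAL * temperature_k)
    g_min = min(free_energies_kcal)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (g - g_min)) for g in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical surface: functional state, alternative conformer, misfolded state.
example_free_energies = [0.0, 1.0, 3.0]
populations = boltzmann_populations(example_free_energies)

for g, p in zip(example_free_energies, populations):
    print(f"G = {g:.1f} kcal/mol -> population {p:.3f}")
```

Because the populations depend exponentially on the free energy, an error of about 1 kcal/mol in a computed minimum is enough to change the predicted population ratio several-fold at room temperature, which is one way to read the sensitivity described above.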
Section snippets
Atomistic simulations are reliable and predictive, albeit with limited time horizon and system size
Force fields for atomistic molecular dynamics (MD) simulations have reached a level of accuracy that allows navigating the CS (Figure 2b). For example, the kinetics of protein–protein association can now be predicted in atomistic detail with experimentally consistent precision [3], using MD aided by enhanced sampling techniques and Markov state models [4]. The range of application systems is also ever-growing, including soluble, membrane, intrinsically disordered proteins, and those with
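As an illustration of the Markov state model idea mentioned above, here is a minimal sketch that estimates a transition matrix from a discretized trajectory and derives a stationary distribution and implied timescales from it. It assumes NumPy, a synthetic two-state trajectory, and a naive count-symmetrization step to impose detailed balance; all function names are hypothetical, and this is not the estimator used in the cited studies or in dedicated MSM packages.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition matrix from a discrete trajectory at a given lag."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    # Assumption: symmetrizing the counts is a crude way to impose detailed balance.
    counts = 0.5 * (counts + counts.T)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state must be visited at least once")
    return counts / row_sums


def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    top = np.argmax(evals.real)
    pi = np.abs(evecs[:, top].real)
    return pi / pi.sum()


def implied_timescales(T, lag=1):
    """Relaxation timescales (in trajectory steps) from the non-unit eigenvalues."""
    evals = np.sort(np.linalg.eigvals(T).real)[::-1][1:]
    evals = evals[(evals > 0) & (evals < 1)]
    return -lag / np.log(evals)


# Synthetic two-state trajectory generated from a made-up transition matrix.
rng = np.random.default_rng(0)
true_T = np.array([[0.95, 0.05], [0.10, 0.90]])
dtraj = [0]
for _ in range(5000):
    dtraj.append(rng.choice(2, p=true_T[dtraj[-1]]))
dtraj = np.array(dtraj)

T = estimate_transition_matrix(dtraj, n_states=2, lag=1)
pi = stationary_distribution(T)
timescales = implied_timescales(T, lag=1)
print(T, pi, timescales)
```

In practice the discrete states would come from clustering MD frames in a suitable feature space, and the lag time would be chosen by checking that the implied timescales have converged; both steps are omitted here.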
Coarse-grained approaches provide physics-based insight on protein equilibrium and dynamics
One obvious extension of full atomistic MD is to carry out simulations on coarse-grained beads. To date, the Martini force field has proven to approximate experimental data well, especially for membrane proteins and, with its latest update Martini 3, also for soluble proteins [13]. However, for the latter, the properties reproduced at this time are limited to general protein association problems and the salting in/out behavior of soluble proteins. It has also been demonstrated
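The basic operation behind simulating on coarse-grained beads is a mapping from groups of atoms to single interaction sites. The sketch below places each bead at the center of mass of its atoms. It assumes NumPy and a hypothetical atom-to-bead assignment invented for illustration; it does not reproduce the Martini mapping rules or bead types.

```python
import numpy as np


def map_to_beads(coords, masses, bead_index):
    """Center-of-mass coarse-graining.

    coords: (n_atoms, 3) atomic positions
    masses: (n_atoms,) atomic masses
    bead_index: (n_atoms,) integer bead label for each atom, starting at 0
    Returns bead positions (n_beads, 3) and bead masses (n_beads,).
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    bead_index = np.asarray(bead_index)
    n_beads = int(bead_index.max()) + 1

    # Total mass per bead.
    bead_mass = np.bincount(bead_index, weights=masses, minlength=n_beads)

    # Mass-weighted sum of coordinates per bead (np.add.at handles repeated indices).
    weighted = np.zeros((n_beads, 3))
    np.add.at(weighted, bead_index, coords * masses[:, None])

    return weighted / bead_mass[:, None], bead_mass


# Hypothetical four-atom fragment mapped onto two beads.
atom_coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 4.0, 2.0]]
atom_masses = [12.0, 12.0, 14.0, 14.0]
assignment = [0, 0, 1, 1]

bead_coords, bead_masses = map_to_beads(atom_coords, atom_masses, assignment)
print(bead_coords)
print(bead_masses)
```

A mapping like this only defines the reduced representation; the effective interactions between beads still have to be parameterized, which is where a force field such as Martini comes in.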
Hybrid all-atom–coarse grained methodologies extend horizon for navigating the dynamical landscape
Our understanding of exploring a dynamically changing CS has been shaped by the search for allosteric positions to engineer control into the protein structures [32] and more recently by the search for cryptic sites [33]. The former relies on finding distal positions affecting function at a known active site. The latter indicates dynamically transient and functionally relevant states that remain completely inaccessible in PDB structures and might exist unnoticed in long MD trajectories. Applying
Complete navigation of the dynamical landscape is practicable by integrative computational-experimental models
Historically, experimental validation of predictions has been crucial for the computational biology community. It is now clear that by charting the CS of proteins, computational techniques will increasingly guide experimental approaches, for example, suggesting spin labeling positions [41] and optimal positions for enzyme redesign [42]. A holistic approach towards mapping the CS of a protein has already initiated applications in determining allosteric and cryptic sites by complementing
Conflict of interest statement
Nothing declared.
Acknowledgements
The authors gratefully acknowledge support from the Scientific and Technological Research Council of Turkey (Grant Number 116F229).
References (55)
- et al. Comparative perturbation-based modeling of the SARS-CoV-2 spike protein binding with host receptor and neutralizing antibodies: structurally adaptable allosteric communication hotspots define spike sites targeted by global circulating mutations. Biochemistry (2021)
- et al. In silico mutational studies of Hsp70 disclose sites with distinct functional attributes. Protein-Struct Funct Bioinf (2015)
- et al. Engineering an allosteric control of protein function. J Phys Chem B (2021)
- et al. Mechanical unfolding of proteins-A comparative nonequilibrium molecular dynamics study. Biophys J (2020)
- et al. Structure-based analysis of cryptic-site opening. Structure (2020)
- et al. Predicting optimal DEER label positions to study protein conformational heterogeneity. J Phys Chem B (2017)
- et al. Determining protein structures using deep mutagenesis. Nat Genet (2019)
- et al. Altered expression of a quality control protease in E. coli reshapes the in vivo mutational landscape of a model enzyme. eLife (2020)
- et al. New computational protein design methods for de novo small molecule binding sites. PLoS Comput Biol (2020)
- et al. Solution of Levinthal's paradox and a physical theory of protein folding times. Biomolecules (2020)
- The breakthrough in protein structure prediction. Biochem J
- Complete protein-protein association kinetics in atomic detail revealed by molecular dynamics simulations and Markov modelling. Nat Chem
- What Markov state models can and cannot do: correlation versus path-based observables in protein-folding models. J Chem Theor Comput
- Additive CHARMM36 force field for nonstandard amino acids. J Chem Theor Comput
- TorchMD: a deep learning framework for molecular simulations. J Chem Theor Comput
- Unsupervised learning methods for molecular simulation data. Chem Rev
- Computational methods for exploring protein conformations. Biochem Soc Trans
- Discovering collective variables of molecular transitions via genetic algorithms and neural networks. J Chem Theor Comput
- Discovery of a hidden transient state in all bromodomain families. Proc Natl Acad Sci U S A
- Delineating the conformational landscape of the adenosine A(2A) receptor during G protein coupling. Cell
- Reply to: insufficient evidence for ageing in protein dynamics. Nat Phys
- Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat Methods
- Protein-ligand binding with the coarse-grained Martini model. Nat Commun
- Coarse-grained modeling of multiple pathways in conformational transitions of multi-domain proteins. J Chem Inf Model
- Conformational change of proteins arising from normal mode calculations. Protein Eng
- Predicting cryptic ligand binding sites based on normal modes guided conformational sampling. Protein-Struct Funct Bioinf
- Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys J