PHAST: Protein-like heteropolymer analysis by statistical thermodynamics☆
Introduction
Linear heteropolymeric chains of amino acids are known as polypeptides. Proteins, on the other hand, are a large class of biological polymers containing at least one long polypeptide. Together, they constitute a set of macromolecules performing a vast array of functions within organisms, whose specificities strongly depend on the detailed intramolecular interactions that lead to three-dimensional (native) structures through a folding mechanism. The thermodynamic hypothesis [1] states that the native structure of a protein is a unique, stable and kinetically accessible minimum of the free energy, determined solely by its (primary) sequence of amino acids. However, occasional misfolding may produce dysfunctional proteins and induce the formation of cytotoxic amyloid aggregates, which can culminate in degenerative diseases such as type 2 diabetes mellitus (DM2) [2] or Alzheimer's disease (AD) [3].
Since these biopolymers are typical representatives of so-called small systems, for which the equivalence of statistical ensembles does not hold in general (see [4] and references therein), direct computation of their density of states is a well-suited route to the microcanonical thermodynamics of the system [5]. Nevertheless, performing such calculations with sufficient accuracy requires accumulating large amounts of statistical data through Monte Carlo methods such as Wang–Landau sampling [6] or the multicanonical (MUCA) ensemble [7]. This poses a huge computational challenge, which can be reasonably alleviated by combining efficient parallel algorithms with coarse-grained molecular force fields.
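As a minimal sketch of how a density of states leads to microcanonical quantities, the snippet below takes the entropy as S(E) = ln Ω(E) (with k_B = 1) and estimates the inverse temperature β(E) = dS/dE by central finite differences. The function name `microcanonical_beta` and the toy density of states Ω(E) = E^a are assumptions made for illustration only; they are not taken from PHAST.

```python
import math


def microcanonical_beta(energies, log_dos):
    """Return (E, beta) pairs with beta(E) = dS/dE and S(E) = ln Omega(E), k_B = 1.

    Central finite differences are used on interior grid points;
    the two endpoints are dropped.
    """
    if len(energies) != len(log_dos) or len(energies) < 3:
        raise ValueError("need matching sequences with at least three points")
    result = []
    for i in range(1, len(energies) - 1):
        d_entropy = log_dos[i + 1] - log_dos[i - 1]
        d_energy = energies[i + 1] - energies[i - 1]
        result.append((energies[i], d_entropy / d_energy))
    return result


# Toy input (an assumption for illustration): Omega(E) = E**a,
# so S(E) = a * ln(E) and the exact result is beta(E) = a / E.
a = 1.5
energies = [1.0 + 0.01 * k for k in range(201)]
log_dos = [a * math.log(e) for e in energies]
curve = microcanonical_beta(energies, log_dos)
```

In practice ln Ω(E) would come from the sampled multicanonical weights rather than from a closed form, and noisy estimates usually need smoothing before differentiation.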
To meet these demands we designed PHAST, a simple and easy-to-use open-source package that enables users to simulate and analyze the microcanonical thermostatistics of general models for semi-flexible linear polymers [8], [9], such as coarse-grained proteins [10], [11], [12], [13]. The program provides fully reconfigurable built-in force fields that can be readily modified and extended, which makes it especially apt for prototyping many-body interacting systems, such as polymers embedded in crowded solutions [14]. It is written in standard Fortran and has MPI and CUDA extensions [15], [16], so it can run efficiently on most modern parallel computer architectures. Distribution for commercial or military purposes is forbidden, but we hope that PHAST proves useful and may eventually be incorporated into other programs, in which case we kindly request to be informed. It is also expected to be acknowledged in all resulting publications.
The article is organized as follows. In Section 2 we review some simulation background, including coarse-grained modeling and the derivation of microcanonical thermostatistics from parallel multicanonical simulations. Section 3 gives an in-depth overview of the application modules. In Section 4 two concrete case studies involving heteropolymers bioinspired by amyloidogenic proteins are presented: the aggregation of fragments and their cross-seeding with isoforms are investigated in the context of a minimal coarse-grained model. The parallel scaling of such simulations is rigorously addressed, and their physical significance is outlined, where possible, by comparison with other results from the literature. Section 5 presents our conclusions and perspectives for further developments.
Section snippets
Modeling proteins and polymers
Coarse-graining is a widely employed modeling strategy to reduce the degrees of freedom of many-body macromolecular systems such as heteropolymers [9], [17]. Moreover, the study of protein folding and aggregation, especially in crowded media [18], has also largely benefited from such an approach [17]. In this vein, PHAST incorporates a configurable force field able to describe semi-flexible linear heteropolymers
Software framework
The PHAST package consists of the following modules:
- SET_INPUT: prepares input protein sequences from PDB files.
- PHAST: the main engine to run parallel multicanonical simulations.
- ANALYST: computes the microcanonical thermostatistics from MUCA simulations.
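To make the role of an input-preparation step concrete, here is a hedged sketch of reading the SEQRES records of a PDB file and mapping each residue to a two-letter hydrophobic/polar (A/B) lexicon. It is not the SET_INPUT code: the function names, the `HYDROPHOBIC` set and the toy residue fragment are all assumptions chosen for illustration, and the hydrophobicity scale actually used by PHAST may differ.

```python
# Hypothetical hydrophobic set (an assumption); PHAST may use another scale.
HYDROPHOBIC = {"ALA", "CYS", "GLY", "ILE", "LEU", "MET", "PRO", "VAL"}
POLAR = {"ARG", "ASN", "ASP", "GLN", "GLU", "HIS",
         "LYS", "PHE", "SER", "THR", "TRP", "TYR"}


def seqres_to_ab(pdb_lines, chain="A"):
    """Map the SEQRES residues of one chain to a string of 'A'/'B' letters.

    Uses the fixed-column PDB layout: chain ID in column 12,
    residue names from column 20 onwards.
    """
    letters = []
    for line in pdb_lines:
        if not line.startswith("SEQRES") or len(line) < 20:
            continue
        if line[11] != chain:
            continue
        for residue in line[19:70].split():
            if residue in HYDROPHOBIC:
                letters.append("A")
            elif residue in POLAR:
                letters.append("B")
            else:
                raise ValueError("non-standard residue: " + residue)
    return "".join(letters)


def make_seqres(serial, chain, residues):
    """Build one SEQRES line in fixed-column format (illustration helper)."""
    return "SEQRES {:>3} {} {:>4}  {}".format(
        serial, chain, len(residues), " ".join(residues))


# Toy eight-residue fragment, chosen arbitrarily for the example.
fragment = ["LYS", "CYS", "ASN", "THR", "ALA", "THR", "CYS", "ALA"]
ab_sequence = seqres_to_ab([make_seqres(1, "A", fragment)])
print(ab_sequence)  # BABBABAA
```

Raising on unknown residue names, rather than silently skipping them, keeps the mapped chain length equal to the original sequence length.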
It is not possible to give here a highly detailed description of all program routines. Instead, we address the main features and data workflows of the aforementioned modules in the following subsections, which
Example runs
As case studies we have investigated two protein systems by using the AB-model limit [10] of the Hamiltonian Eq. (3). Such modeling does not allow for the prediction of protein structures, but may provide a useful method to learn about some general thermodynamic mechanisms underlying biological structural phase transitions [12].
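Eq. (3) is not reproduced in this excerpt, so the following is only a sketch of one commonly quoted form of the off-lattice AB model: a bending term (1/4) Σ (1 − cos θ_k) over consecutive bond vectors plus a Lennard-Jones-like term 4 Σ (r_ij^-12 − C(σ_i, σ_j) r_ij^-6) over non-bonded pairs, with C = 1 for AA, 1/2 for BB and −1/2 for AB. Whether this matches the AB limit of the paper's Hamiltonian term by term is an assumption, and the function names are hypothetical.

```python
import math


def coupling(a, b):
    """Species-dependent factor C: AA -> 1, BB -> 1/2, AB or BA -> -1/2."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5


def ab_energy(coords, sequence):
    """Energy of a chain in the assumed AB-model form (reduced units)."""
    n = len(coords)
    if n != len(sequence):
        raise ValueError("coords and sequence must have the same length")
    # Bond vectors between consecutive monomers.
    bonds = [tuple(coords[i + 1][d] - coords[i][d] for d in range(3))
             for i in range(n - 1)]
    bending = 0.0
    for k in range(n - 2):
        u, v = bonds[k], bonds[k + 1]
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        bending += 0.25 * (1.0 - dot / norm)
    # Pair term over monomers that are not directly bonded (j >= i + 2).
    pair = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            pair += 4.0 * (r ** -12 - coupling(sequence[i], sequence[j]) * r ** -6)
    return bending + pair


# Straight trimer with unit bonds: no bending cost, one non-bonded pair at r = 2.
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
e_aaa = ab_energy(straight, "AAA")  # 4 * (2**-12 - 2**-6) = -63/1024
e_aab = ab_energy(straight, "AAB")  # 4 * (2**-12 + 0.5 * 2**-6) = 33/1024
```

The sign change of C for AB pairs makes unlike monomers purely repulsive, which is what drives the hydrophobic-core formation usually associated with this class of models.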
Conclusions
We have presented PHAST, a package for simulating coarse-grained models of proteins and bioinspired polymers in the multicanonical ensemble. Despite its simple structure, the software comprises a reconfigurable force field that allows for an excellent parallel speedup. Additionally, its modules enable users to quickly map proteins downloaded from the PDB to an internal AB-lexicon (or to generalizations of it such as [29]), as well as to automatically compute most of the system's microcanonical thermostatistics.
Acknowledgments
This is a long-term project whose author has benefited from enlightening discussions with Leandro G. Rizzi, Lieverton H. Queiroz, Mathias S. Costa, Mikhael C. Chrum, and Nelson A. Alves. This manuscript has been substantially improved by constructive suggestions from both anonymous reviewers. The Brazilian laboratories CENAPAD-SP and LNCC-RJ are acknowledged for providing machine time and helpful technical support on, respectively, their IBM P750 [34] and Santos Dumont [38] supercomputing
References (41)
- et al., Phys. A (2016)
- et al., Eur. Phys. J. B (2010)
- et al., Phys. Rev. E (2011)
- et al., Phys. Rev. Lett. (2006)
- et al., J. Chem. Phys. (2014)
- J. Mol. Biol. (1988)
- et al., Dynamics of Polymeric Liquids (1987)
- et al., J. Chem. Phys. (2003)
- et al., ACS Chem. Neurosci. (2013)
- et al., PLoS ONE (2014)
- et al., Phys. Proc. (2014)
- Diabetes Care (2003)
- Science
- Williams Textbook of Endocrinology
- BMJ
- Microcanonical Thermodynamics
- Phys. Rev. Lett.
- Phys. Rev. E
- Comput. Phys. Comm.
- Phys. Lett. B
- J. Stat. Phys.
- Fields Inst. Commun.
- Comput. Phys. Comm.
- Soft Matter
- Phys. Rev. E
- Phys. Rev. E
- J. Chem. Phys.
- Phys. Rev. E
☆ This paper and its associated computer program are available via the Computer Physics Communications homepage on ScienceDirect (http://www.sciencedirect.com/science/journal/00104655).