Python Materials Genomics (pymatgen): A robust, open-source python library for materials analysis
Highlights
► Python Materials Genomics (pymatgen) is a robust, open-source library for materials analysis.
► Well-tested set of structure and thermodynamic analyses relevant to many applications.
► Open platform for researchers to collaboratively develop sophisticated analyses of materials.
► Convenient tools to obtain useful materials data via the Materials Project’s RESTful API.
► Evaluated phase and electrochemical stability of recently synthesized material Li4SnS4.
Introduction
First principles calculations have the potential to greatly accelerate the design and optimization of new materials. In the past decade, electronic structure calculation codes [1], [2], [3], [4] have reached a level of maturity such that it is now possible to reliably automate and scale first principles calculations across any number of compounds, subject only to the limits of available computing resources. Indeed, there are currently several parallel initiatives that employ high-throughput first principles calculations in materials design. For example, the Materials Project [5] (http://www.materialsproject.org) aims to calculate the properties of all known inorganic materials and make this data publicly available to the materials community to accelerate innovation in materials research. The Materials Project is based on the high-throughput framework developed by Jain et al. [6] and subsequently extended by collaborators at the Lawrence Berkeley Laboratory and National Energy Research Scientific Computing Center (NERSC). This framework has been used to screen over 80,000 inorganic compounds for a variety of applications, including Li-ion and Na-ion batteries [7], [8], [9], [10], [11]. Similarly, Curtarolo et al. [12] have developed the AFLOW (Automatic Flow) software framework for high-throughput calculation of crystal structure properties of alloys, intermetallics and inorganic compounds and applied it to the investigation of the effect of structure on the stability of binary alloys [13] and superconductors [14], and the search for topological insulators [15]. Yet another example of high-throughput materials design can be found in the CatApp developed by Hummelshoj et al. [16] which provides a web application to access activation energies of elementary surface reactions and is part of a larger database of surface reaction data being developed under the Quantum Materials Informatics Project (http://www.qmip.org). 
On the molecular front, the Clean Energy Project [17] uses high-throughput computational chemistry to look for the best organic molecules for various applications, including organic semiconductors [18] and polymers for the membranes used in fuel cells for electricity generation.
In this paper, we describe the Python Materials Genomics (pymatgen) library, a robust, open-source Python library for materials analysis. A key enabler in high-throughput computational materials science efforts is a robust set of software tools to perform initial setup for calculations (e.g., generation of structures and necessary input files) and post-calculation analysis to derive useful material properties from raw calculated data. The aims of pymatgen are as follows:
1. Define core Python objects for materials data representation.
2. Provide a well-tested set of structure and thermodynamic analysis tools relevant to many applications.
3. Establish an open platform for researchers to collaboratively develop sophisticated analyses of materials data obtained both from first principles calculations and experiments.
The pymatgen library is currently used in the Materials Project for structure generation, manipulation and thermodynamic analysis. As such, it has been tested extensively against the large set of compounds in the Materials Project database. However, while the pymatgen library supports the Materials Project, it is designed to be a standalone library, and most of its analysis tools are flexible enough to be used by any materials researcher with other electronic structure codes and sources of data. The latest stable version of pymatgen (version 2.2.4 as of this paper) can be obtained via the Python Package Index at http://pypi.python.org/pypi/pymatgen, while the “bleeding edge” developmental version can be obtained from the official GitHub repo at http://github.com/materialsproject/pymatgen.
Section snippets
Overview of pymatgen
The pymatgen library is written in the Python programming language, and leverages the large number of available standard and scientific programming libraries, including the widely used numpy and scipy libraries [19]. It is compatible with Python version 2.7.x, but a transition to Python 3 is planned when the necessary libraries become available. It is primarily based on the object-oriented programming paradigm to facilitate code reuse and ensure modularity in design. In terms of development, we
Compound generation and structure transformations
Pymatgen provides a powerful framework for performing compound generation and structure transformations via the transformations package. A transformation is essentially a well-defined algorithm for generating new compounds and structures from existing structures. For example, a common approach to developing new materials from existing materials involves substituting other species for existing species in the structure. Users can, for instance, use the data-mined substitution rules developed by
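The idea of a species-substitution transformation can be illustrated with a minimal sketch in plain Python. This does not use pymatgen's transformations package; the `Site` class, the `substitute_species` function and the two-site example structure are hypothetical names and data introduced here only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Site:
    """A species label and fractional coordinates (hypothetical helper class)."""
    species: str
    frac_coords: Tuple[float, float, float]


def substitute_species(sites: List[Site], mapping: Dict[str, str]) -> List[Site]:
    """Return a new site list with species replaced according to `mapping`.

    Species that do not appear in `mapping` are kept unchanged, and the
    input list is left unmodified.
    """
    return [Site(mapping.get(s.species, s.species), s.frac_coords) for s in sites]


# Illustrative two-site motif; coordinates are chosen for the example only.
nacl = [Site("Na", (0.0, 0.0, 0.0)), Site("Cl", (0.5, 0.5, 0.5))]
kbr = substitute_species(nacl, {"Na": "K", "Cl": "Br"})
print([s.species for s in kbr])  # ['K', 'Br']
```

Returning a new list rather than editing sites in place mirrors the description of a transformation as an algorithm that generates a new structure from an existing one.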
Analysis tools
The pymatgen library provides many tools for high-throughput, automated assimilation of data from electronic structure calculations, and for subsequent analysis of the assimilated data.
Integration with the Materials Project RESTful API
One of the key impediments to materials design is the availability of materials information. The Materials Project aims to meet this need by providing open, public access to a large database of calculated data on known materials. Currently, there are several user-friendly “apps” available on the Materials Project that use this data. In order to reach a broader materials community, we have created an application programming interface (API) based on a subset of the principles of REpresentational
Application example – phase stability of a new material
To illustrate the power of the pymatgen library, we will present a practical example of how it can be used to determine the phase stability of a new material. One of the main obstacles to performing phase stability analyses on new materials is that the phase stability of a particular material depends on its energy relative to that of competing phases. Without relying on an external database of pre-computed materials data, such an effort requires the materials researcher to identify all
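The dependence of stability on competing phases can be sketched for a two-component system as a convex hull construction: a candidate is compared against the lowest-energy combination of known phases at the same composition. The code below is a plain-Python illustration under stated assumptions, not pymatgen's own implementation. The function names are hypothetical, and the formation energies are made-up numbers for a hypothetical A-B system.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (fraction of B, formation energy per atom in eV)


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex hull of (x, energy) points, ordered by x (monotone chain)."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull: List[Point], x: float) -> float:
    """Linearly interpolate the hull energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2 and x2 > x1:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("x lies outside the composition range of the hull")


# Made-up formation energies for a hypothetical A-B system (illustration only).
competing = [(0.0, 0.0), (0.25, -0.30), (0.5, -0.40), (1.0, 0.0)]
candidate = (0.75, -0.15)

hull = lower_hull(competing)
e_above_hull = candidate[1] - hull_energy(hull, candidate[0])
print(f"energy above hull: {e_above_hull:.3f} eV/atom")
```

A positive energy above the hull means the candidate is predicted to be unstable with respect to decomposition into the neighboring hull phases. A candidate that lies on the hull has a value of zero.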
Conclusion
The Python Materials Genomics (pymatgen) library is a robust, open-source Python library for materials data analysis. By defining core Python objects for materials data representation, providing a well-tested set of structure and thermodynamic analysis tools relevant to many materials applications, and establishing an open platform for researchers to collaboratively develop sophisticated analyses of materials data, the pymatgen library is a key enabler of the Materials Project, powering several
Acknowledgments
This work was supported in part by the Department of Energy’s Basic Energy Sciences program under Grant No. DE-FG02-96ER45571. We also thank the National Energy Research Scientific Computing Center (NERSC) for providing invaluable computing resources and IT support for this project. A. Jain acknowledges funding from the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under US Department of Energy Contract No. DE-AC02-05CH1123.
References (35)
- et al., Computer Physics Communications (2009)
- et al., Computational Materials Science (2011)
- et al., Computational Materials Science (2012)
- et al., Computer Coupling of Phase Diagrams and Thermochemistry (2005)
- et al., Electrochemistry Communications (2010)
- et al., Physical Review B (1996)
- M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, J.R. Cheeseman, J.A. Montgomery Jr., T. Vreven, K.N....
- et al., Zeitschrift für Kristallographie (2005)
- S.P. Ong, A. Jain, G. Hautier, M. Kocher, S. Cholia, D. Gunter, D. Bailey, D. Skinner, K. Persson, G. Ceder, The...
- et al., Chemistry of Materials (2011)
- Journal of Materials Chemistry
- MRS Bulletin
- Energy & Environmental Science
- Chemistry of Materials
- Physical Review B
- Nature Materials
- Angewandte Chemie (International Ed. in English)