Elsevier

Computational Materials Science

Volume 68, February 2013, Pages 314-319
Python Materials Genomics (pymatgen): A robust, open-source python library for materials analysis

https://doi.org/10.1016/j.commatsci.2012.10.028

Abstract

We present the Python Materials Genomics (pymatgen) library, a robust, open-source Python library for materials analysis. A key enabler in high-throughput computational materials science efforts is a robust set of software tools to perform initial setup for the calculations (e.g., generation of structures and necessary input files) and post-calculation analysis to derive useful material properties from raw calculated data. The pymatgen library aims to meet these needs by (1) defining core Python objects for materials data representation, (2) providing a well-tested set of structure and thermodynamic analyses relevant to many applications, and (3) establishing an open platform for researchers to collaboratively develop sophisticated analyses of materials data obtained both from first principles calculations and experiments. The pymatgen library also provides convenient tools to obtain useful materials data via the Materials Project’s REpresentational State Transfer (REST) Application Programming Interface (API). As an example, using pymatgen’s interface to the Materials Project’s RESTful API and phasediagram package, we demonstrate how the phase and electrochemical stability of a recently synthesized material, Li4SnS4, can be analyzed using a minimum of computing resources. We find that Li4SnS4 is a stable phase in the Li–Sn–S phase diagram (consistent with the fact that it can be synthesized), but the narrow range of lithium chemical potentials for which it is predicted to be stable would suggest that it is not intrinsically stable against typical electrodes used in lithium-ion batteries.

Highlights

► Python Materials Genomics (pymatgen) is a robust, open-source library for materials analysis.
► Well-tested set of structure and thermodynamic analyses relevant to many applications.
► Open platform for researchers to collaboratively develop sophisticated analyses of materials.
► Convenient tools to obtain useful materials data via the Materials Project’s RESTful API.
► Evaluated phase and electrochemical stability of recently synthesized material Li4SnS4.

Introduction

First principles calculations have the potential to greatly accelerate the design and optimization of new materials. In the past decade, electronic structure calculation codes [1], [2], [3], [4] have reached a level of maturity such that it is now possible to reliably automate and scale first principles calculations across any number of compounds, subject only to the limits of available computing resources. Indeed, there are currently several parallel initiatives that employ high-throughput first principles calculations in materials design. For example, the Materials Project [5] (http://www.materialsproject.org) aims to calculate the properties of all known inorganic materials and make this data publicly available to the materials community to accelerate innovation in materials research. The Materials Project is based on the high-throughput framework developed by Jain et al. [6] and subsequently extended by collaborators at the Lawrence Berkeley Laboratory and National Energy Research Scientific Computing Center (NERSC). This framework has been used to screen over 80,000 inorganic compounds for a variety of applications, including Li-ion and Na-ion batteries [7], [8], [9], [10], [11]. Similarly, Curtarolo et al. [12] have developed the AFLOW (Automatic Flow) software framework for high-throughput calculation of crystal structure properties of alloys, intermetallics and inorganic compounds and applied it to the investigation of the effect of structure on the stability of binary alloys [13] and superconductors [14], and the search for topological insulators [15]. Yet another example of high-throughput materials design can be found in the CatApp developed by Hummelshoj et al. [16] which provides a web application to access activation energies of elementary surface reactions and is part of a larger database of surface reaction data being developed under the Quantum Materials Informatics Project (http://www.qmip.org). 
On the molecular front, the Clean Energy Project [17] uses high-throughput computational chemistry to look for the best organic molecules for various applications, including organic semiconductors [18] and polymers for the membranes used in fuel cells for electricity generation.

In this paper, we describe the Python Materials Genomics (pymatgen) library, a robust, open-source Python library for materials analysis. A key enabler in high-throughput computational materials science efforts is a robust set of software tools to perform initial setup for calculations (e.g., generation of structures and necessary input files) and post-calculation analysis to derive useful material properties from raw calculated data. The aims of pymatgen are as follows:

  1. Define core Python objects for materials data representation.
  2. Provide a well-tested set of structure and thermodynamic analysis tools relevant to many applications.
  3. Establish an open platform for researchers to collaboratively develop sophisticated analyses of materials data obtained both from first principles calculations and experiments.

The pymatgen library is currently used in the Materials Project for structure generation, manipulation and thermodynamic analysis. As such, it has been robustly tested over the large database of compounds in the Materials Project. However, it should be noted that while the pymatgen library supports the Materials Project, it is designed to be a standalone library, and most of its analysis tools are flexible enough to be used by any materials researcher with other electronic structure codes and sources of data. The latest stable version of pymatgen (version 2.2.4 as of this paper) can be obtained via the Python Package Index at http://pypi.python.org/pypi/pymatgen, while the “bleeding edge” developmental version can be obtained from the official GitHub repo at http://github.com/materialsproject/pymatgen.

Section snippets

Overview of pymatgen

The pymatgen library is written in the Python programming language, and leverages the large number of available standard and scientific programming libraries, including the widely used numpy and scipy libraries [19]. It is compatible with Python version 2.7.x, but a transition to Python 3 is planned when the necessary libraries become available. It is primarily based on the object-oriented programming paradigm to facilitate code reuse and ensure modularity in design. In terms of development, we

Compound generation and structure transformations

Pymatgen provides a powerful framework for performing compound generation and structure transformations via the transformations package. A transformation is essentially a well-defined algorithm for generating new compounds and structures from existing structures. For example, a common approach to developing new materials from existing materials involves the substitution of existing species in the structure with others. Users can, for instance, use the data-mined substitution rules developed by
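
As an illustration of the idea of a transformation as a well-defined algorithm that maps an existing structure to a new one, the following is a minimal sketch of species substitution in plain Python. It does not use pymatgen; the names `Site`, `SimpleStructure` and `SubstituteSpecies` are hypothetical and chosen for this example only, and the two-site cell is a toy input rather than a real crystal structure.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Coords = Tuple[float, float, float]


@dataclass(frozen=True)
class Site:
    """A single atomic site: a species label and fractional coordinates."""
    species: str
    frac_coords: Coords


@dataclass(frozen=True)
class SimpleStructure:
    """A toy, immutable structure (hypothetical; not a pymatgen object)."""
    sites: Tuple[Site, ...]

    def species_counts(self) -> Dict[str, int]:
        """Count how many sites each species occupies."""
        counts: Dict[str, int] = {}
        for site in self.sites:
            counts[site.species] = counts.get(site.species, 0) + 1
        return counts


class SubstituteSpecies:
    """Transformation that replaces species according to a mapping."""

    def __init__(self, species_map: Dict[str, str]) -> None:
        self.species_map = dict(species_map)

    def apply(self, structure: SimpleStructure) -> SimpleStructure:
        # Build a new structure; the input is left unchanged.
        new_sites = tuple(
            Site(self.species_map.get(s.species, s.species), s.frac_coords)
            for s in structure.sites
        )
        return SimpleStructure(new_sites)


# Toy two-site cell used only to exercise the transformation.
nacl = SimpleStructure((
    Site("Na", (0.0, 0.0, 0.0)),
    Site("Cl", (0.5, 0.5, 0.5)),
))
kcl = SubstituteSpecies({"Na": "K"}).apply(nacl)
print(kcl.species_counts())
```

Returning a new object instead of mutating the input keeps each transformation composable, so that several of them can be chained and the history of a generated structure remains traceable.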

Analysis tools

The pymatgen library provides many tools for high-throughput, automated assimilation of data from electronic structure calculations, and for subsequent analysis of the assimilated data.

Integration with the Materials Project RESTful API

One of the key impediments to materials design is the availability of materials information. The Materials Project aims to meet this need by providing open, public access to a large database of calculated data on known materials. Currently, there are several user-friendly “apps” available on the Materials Project that use this data. In order to reach a broader materials community, we have created an application programming interface (API) based on a subset of the principles of REpresentational

Application example – phase stability of a new material

To illustrate the power of the pymatgen library, we will present a practical example of how it can be used to determine the phase stability of a new material. One of the main obstacles to performing phase stability analyses on new materials is that the phase stability of a particular material depends on its energy relative to that of competing phases. Without relying on an external database of pre-computed materials data, such an effort requires the materials researcher to identify all
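
The dependence of phase stability on the energies of competing phases can be sketched with a convex hull construction. The example below is a minimal illustration in plain Python for a binary A–B system only; it is not the pymatgen phasediagram package, the function names are hypothetical, and the formation energies are made-up numbers chosen for illustration rather than calculated data.

```python
from typing import Dict, List, Tuple

# (atomic fraction of B, formation energy per atom in eV)
Point = Tuple[float, float]


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex hull of (x, E) points (Andrew's monotone chain)."""
    # Keep only the lowest-energy entry at each composition.
    lowest: Dict[float, float] = {}
    for x, e in points:
        lowest[x] = min(e, lowest.get(x, e))
    hull: List[Point] = []
    for p in sorted(lowest.items()):
        # Drop the last vertex while it lies on or above the line to p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull: List[Point], x: float) -> float:
    """Linearly interpolate the hull energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the range spanned by the hull")


def energy_above_hull(entries: List[Point], x: float, e: float) -> float:
    """Energy of a phase relative to the hull of competing phases."""
    return e - hull_energy(lower_hull(entries), x)


# Hypothetical entries: the two elements plus three compounds.
entries = [(0.0, 0.0), (0.25, -0.10), (0.5, -0.40), (0.75, -0.30), (1.0, 0.0)]
print(lower_hull(entries))
print(energy_above_hull(entries, 0.25, -0.10))
```

A phase with zero energy above the hull is predicted to be stable, while a positive value indicates that decomposition into the neighbouring hull phases is energetically favourable. In this toy data set, the phase at x = 0.25 lies above the tie-line between the element A and the phase at x = 0.5, so it is predicted to be unstable.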

Conclusion

The Python Materials Genomics (pymatgen) library is a robust, open-source Python library for materials data analysis. By defining core Python objects for materials data representation, providing a well-tested set of structure and thermodynamic analysis tools relevant to many materials applications, and establishing an open platform for researchers to collaboratively develop sophisticated analyses of materials data, the pymatgen library is a key enabler of the Materials Project, powering several

Acknowledgments

This work was supported in part by the Department of Energy’s Basic Energy Sciences program under Grant No. DE-FG02-96ER45571. We also thank the National Energy Research Scientific Computing Center (NERSC) for providing invaluable computing resources and IT support for this project. A. Jain acknowledges funding from the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under US Department of Energy Contract No. DE-AC02-05CH1123.

References (35)

  • X. Gonze et al., Computer Physics Communications (2009)
  • A. Jain et al., Computational Materials Science (2011)
  • S. Curtarolo et al., Computational Materials Science (2012)
  • S. Curtarolo et al., Computer Coupling of Phase Diagrams and Thermochemistry (2005)
  • S.P. Ong et al., Electrochemistry Communications (2010)
  • G. Kresse et al., Physical Review B (1996)
  • M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, J.R. Cheeseman, J.A. Montgomery Jr., T. Vreven, K.N....
  • X. Gonze et al., Zeitschrift für Kristallographie (2005)
  • S.P. Ong, A. Jain, G. Hautier, M. Kocher, S. Cholia, D. Gunter, D. Bailey, D. Skinner, K. Persson, G. Ceder, The...
  • G. Hautier et al., Chemistry of Materials (2011)
  • G. Hautier et al., Journal of Materials Chemistry (2011)
  • G. Ceder et al., MRS Bulletin (2011)
  • S.P. Ong et al., Energy & Environmental Science (2011)
  • Y. Mo et al., Chemistry of Materials (2012)
  • A. Kolmogorov et al., Physical Review B (2008)
  • K. Yang et al., Nature Materials (2012)
  • J.S. Hummelshøj et al., Angewandte Chemie (International Ed. in English) (2012)