research papers

STRUCTURAL BIOLOGY
ISSN: 2059-7983

Real-space refinement in PHENIX for cryo-EM and crystallography

aMolecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, bDepartment of Physics and International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai 200444, People's Republic of China, cCambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 0XY, England, dBioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA, eNew Mexico Consortium, Los Alamos, NM 87545, USA, fFaculté des Sciences et Technologies, Université de Lorraine, BP 239, 54506 Vandoeuvre-les-Nancy, France, gCentre for Integrative Biology, IGBMC, CNRS–INSERM–UdS, 1 Rue Laurent Fries, BP 10142, 67404 Illkirch, France, and hDepartment of Bioengineering, University of California Berkeley, Berkeley, California, USA
*Correspondence e-mail: pafonine@lbl.gov

(Received 10 January 2018; accepted 27 April 2018; online 30 May 2018)

This article describes the implementation of real-space refinement in the phenix.real_space_refine program from the PHENIX suite. The use of a simplified refinement target function enables very fast calculation, which in turn makes it possible to identify optimal data-restraint weights as part of routine refinements with little runtime cost. Refinement of atomic models against low-resolution data benefits from the inclusion of as much additional information as is available. In addition to standard restraints on covalent geometry, phenix.real_space_refine makes use of extra information such as secondary-structure and rotamer-specific restraints, as well as restraints or constraints on internal molecular symmetry. The re-refinement of 385 cryo-EM-derived models available in the Protein Data Bank at resolutions of 6 Å or better shows significant improvement of the models and of the fit of these models to the target maps.

1. Introduction

Improvements in the cryo-electron microscopy (cryo-EM) technique have led to a rapid increase in the number of high-resolution three-dimensional reconstructions that can be interpreted with atomic models (Fig. 1[link]). This has prompted a number of new developments in PHENIX (Adams et al., 2010[Adams, P. D. et al. (2010). Acta Cryst. D66, 213-221.]) to support the method, from model building (Terwilliger, Adams et al., 2018[Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. (2018). bioRxiv, 267138. https://doi.org/10.1101/267138.]), map improvement (Terwilliger, Sobolev et al., 2018[Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. (2018). Acta Cryst. D74, 545-559.]) and refinement (Afonine et al., 2013[Afonine, P. V., Headd, J. J., Terwilliger, T. C. & Adams, P. D. (2013). Comput. Crystallogr. Newsl. 4, 43-44. https://www.phenix-online.org/newsletter/CCN_2013_07.pdf.]) to model validation (Afonine et al., 2018[Afonine, P. V., Klaholz, B. K., Moriarty, N. W., Poon, B. K., Sobolev, O. V., Terwilliger, T. C., Adams, P. D. & Urzhumtsev, A. (2018). bioRxiv. https://doi.org/10.1101/249607.]). In this manuscript, we focus on atomic model refinement using a map (primarily cryo-EM, but the same algorithms and software are also applicable to crystallographic maps).

[Figure 1]
Figure 1
Number of cryo-EM-derived models in the PDB at resolutions of 6 Å or better.

Model refinement is an optimization problem and as such it requires the definition of three entities (for reviews, see Tronrud, 2004[Tronrud, D. E. (2004). Acta Cryst. D60, 2156-2168.]; Watkin, 2008[Watkin, D. (2008). J. Appl. Cryst. 41, 491-522.]; Afonine et al., 2012[Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352-367.], 2015[Afonine, P., Urzhumtsev, A. & Adams, P. D. (2015). Arbor, 191, a219. https://doi.org/10.3989/arbor.2015.772n2005.]). Firstly, the model, i.e. a mathematical construct that explains the experimental data, with an associated set of refinable parameters: in this case an atomic model with coordinates whose positions can be varied to improve the fit to the data. Secondly, the target function that links the model parameters to the experimental data: this function scores model-to-data fit and therefore guides refinement. Finally, an optimization method that changes the values of refinable model parameters such that the model agreement with the experimental data is improved. In PHENIX, gradient-driven minimization with the L-BFGS algorithm (Liu & Nocedal, 1989[Liu, D. C. & Nocedal, J. (1989). Math. Program. 45, 503-528.]) is used for this purpose. If the target function is expressed through diffraction intensities or structure factors, refinement is usually referred to as reciprocal-space, or Fourier-space, refinement (FSR). Alternatively, a target function may be formulated in terms of a map: a Fourier synthesis in the case of crystallography or a three-dimensional reconstruction from projections in the case of cryo-EM. Such refinement is referred to as real-space refinement (RSR). In both cases the targets are sums over a large number of similar terms corresponding to either reflections (FSR) or map grid points (RSR). A key methodological difference is that for RSR each term depends on only a few atoms, while for FSR each term depends on all model parameters. Most modern macromolecular refinement programs were developed for crystallographic data and therefore perform refinement in reciprocal space, at least as their main mode of operation (see Table 1 in Afonine et al., 2015[Afonine, P., Urzhumtsev, A. & Adams, P. D. (2015). Arbor, 191, a219. https://doi.org/10.3989/arbor.2015.772n2005.]). This work focuses on the real-space refinement of coordinates of atomic models.

In cryo-EM studies real-space refinement is a natural choice because a three-dimensional map is the output of the single-particle image-reconstruction method (see, for example, Frank, 2006[Frank, J. (2006). Three-Dimensional Electron Microscopy of Macromolecular Assemblies. Oxford University Press.]) and does not change in a fundamental way as the atomic model is improved. This is not the case for crystallography, where the experimental data are diffraction intensities, and the associated and vital phase information has to be obtained indirectly. In crystallography, obtaining the best phases typically involves their calculation from atomic models, in turn making the resulting maps model-biased (see, for example, Hodel et al., 1992[Hodel, A., Kim, S.-H. & Brünger, A. T. (1992). Acta Cryst. A48, 851-858.]). Although FSR methods are predominant in crystallographic refinement, RSR is attractive in some contexts as it makes it possible to refine parts of the model locally and fast, and model incompleteness does not influence refinement as it does for FSR (Lunin et al., 2002[Lunin, V. Y., Afonine, P. V. & Urzhumtsev, A. G. (2002). Acta Cryst. A58, 270-282.]). For this reason RSR has been particularly popular in the context of interactive model-building software such as FRODO, O (Jones, 1978[Jones, T. A. (1978). J. Appl. Cryst. 11, 268-272.]; Jones et al., 1991[Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Acta Cryst. A47, 110-119.]), MAIN (Turk, 2013[Turk, D. (2013). Acta Cryst. D69, 1342-1357.]) and Coot (Emsley & Cowtan, 2004[Emsley, P. & Cowtan, K. (2004). Acta Cryst. D60, 2126-2132.]; Emsley et al., 2010[Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486-501.]).

In the case of cryo-EM an atomic model can also be refined using a reciprocal-space target. This can be achieved by converting the map into Fourier coefficients. These Fourier coefficients can then be used in reciprocal-space refinement using standard refinement protocols that are well established for crystallographic structure refinement (see, for example, Cheng et al., 2011[Cheng, L., Sun, J., Zhang, K., Mou, Z., Huang, X., Ji, G., Sun, F., Zhang, J. & Zhu, P. (2011). Proc. Natl Acad. Sci. USA, 108, 1373-1378.]; Baker et al., 2013[Baker, M. L., Hryc, C. F., Zhang, Q., Wu, W., Jakana, J., Haase-Pettingell, C., Afonine, P. V., Adams, P. D., King, J. A., Jiang, W. & Chiu, W. (2013). Proc. Natl Acad. Sci. USA, 110, 12301-12306.]; Brown et al., 2015[Brown, A., Long, F., Nicholls, R. A., Toots, J., Emsley, P. & Murshudov, G. (2015). Acta Cryst. D71, 136-153.]). We note, however, that unless the map is converted to the full corresponding set of Fourier coefficients (and not a subset containing only a sphere limited to the stated resolution) this conversion may not be lossless.

To address the emerging structure-refinement needs of the rapidly growing field of cryo-EM, the phenix.real_space_refine program (Afonine et al., 2013[Afonine, P. V., Headd, J. J., Terwilliger, T. C. & Adams, P. D. (2013). Comput. Crystallogr. Newsl. 4, 43-44. https://www.phenix-online.org/newsletter/CCN_2013_07.pdf.]), which is capable of the refinement of atomic models against maps, has been introduced into the PHENIX suite. It is not limited to cryo-EM and can also be used in crystallographic refinement (X-ray, electron or neutron). In this paper, we describe the implementation of the phenix.real_space_refine program and demonstrate its performance by applications to simulated data and to cryo-EM models in the PDB (Bernstein et al., 1977[Bernstein, F. C., Koetzle, T. F., Williams, G. J., Meyer, E. F. Jr, Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T. & Tasumi, M. (1977). J. Mol. Biol. 112, 535-542.]; Berman et al., 2000[Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235-242.]) and corresponding maps in the EMDB (Henrick et al., 2003[Henrick, K., Newman, R., Tagari, M. & Chagoyen, M. (2003). J. Struct. Biol. 144, 228-237.]). This is a work in progress, and further details and advances will be reported as the program evolves. To date, phenix.real_space_refine has been used in a number of documented structural studies (see, for example, Fischer et al., 2015[Fischer, N., Neumann, P., Konevega, A. L., Bock, L. V., Ficner, R., Rodnina, M. V. & Stark, H. (2015). Nature (London), 520, 567-570.]; Shalev-Benami et al., 2016[Shalev-Benami, M., Zhang, Y., Matzov, D., Halfon, Y., Zackay, A., Rozenberg, H., Zimmerman, E., Bashan, A., Jaffe, C. L., Yonath, A. & Skiniotis, G. (2016). Cell. Rep. 16, 288-294.]; Chua et al., 2016[Chua, E. Y. D., Vogirala, V. K., Inian, O., Wong, A. S. W., Nordenskiöld, L., Plitzko, J. M., Danev, R. & Sandin, S. (2016). Nucleic Acids Res. 44, 8013-8019.]; Ahmed et al., 2016[Ahmed, T., Yin, Z. & Bhushan, S. (2016). Sci Rep. 6, 35793.]; Yang et al., 2016[Yang, H., Wang, J., Liu, M., Chen, X., Huang, M., Tan, D., Dong, M.-Q., Wong, C. C. L., Wang, J., Xu, Y. & Wang, H.-W. (2016). Protein Cell, 7, 878-887.]; Gao et al., 2016[Gao, Y., Cao, E., Julius, D. & Cheng, Y. (2016). Nature (London), 534, 347-351.]; Chen et al., 2016[Chen, Y. et al. (2016). Science, 353, aad8266.]; Bhardwaj et al., 2016[Bhardwaj, A., Sankhala, R. S., Olia, A. S., Brooke, D., Casjens, S. R., Taylor, D. J., Prevelige, P. E. Jr & Cingolani, G. (2016). J. Biol. Chem. 291, 215-226.]; Lokareddy et al., 2017[Lokareddy, R. K., Sankhala, R. S., Roy, A., Afonine, P. V., Motwani, T., Teschke, C. M., Parent, K. N. & Cingolani, G. (2017). Nature Commun. 8, 14310.]; Hryc et al., 2017[Hryc, C. F., Chen, D.-H., Afonine, P. V., Jakana, J., Wang, Z., Haase-Pettingell, C., Jiang, W., Adams, P. D., King, J. A., Schmid, M. F. & Chiu, W. (2017). Proc. Natl Acad. Sci. USA, 114, 3103-3108.]; Ahmed et al., 2017[Ahmed, T., Shi, J. & Bhushan, S. (2017). Nucleic Acids Res. 45, 8581-8595.]; Demo et al., 2017[Demo, G., Svidritskiy, E., Madireddy, R., Diaz-Avalos, R., Grant, T., Grigorieff, N., Sousa, D. & Korostelev, A. A. (2017). Elife, 6, e23687.]; Paulino et al., 2017[Paulino, C., Neldner, Y., Lam, A. K. M., Kalienkova, V., Brunner, J. D., Schenck, S. & Dutzler, R. (2017). Elife, 6, e26232.]; Liu et al., 2017[Liu, Y., Pan, J., Jenni, S., Raymond, D. D., Caradonna, T., Do, K. T., Schmidt, A. G., Harrison, S. C. & Grigorieff, N. (2017). J. Mol. Biol. 429, 1829-1839.]).

2. Methods

2.1. Refinement flowchart

Fig. 2[link] shows the model-refinement flowchart as it is implemented in phenix.real_space_refine. This is very similar to the reciprocal-space refinement workflow implemented in phenix.refine (see Fig. 1 in Afonine et al., 2012[Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352-367.]).

[Figure 2]
Figure 2
Flowchart for phenix.real_space_refine.

The program begins by reading a model file, in PDB or mmCIF format, map data (as an actual map in MRC/CCP4 format or as Fourier map coefficients in MTZ format) and other parameters, such as resolution (if a map is provided) or additional restraint definitions for novel ligands, internal molecular symmetry (e.g. NCS in crystallography) or secondary structure. Once inputs have been read, the program proceeds to calculations that constitute a set of tasks repeated multiple times (macro-cycles). Tasks to be performed during the refinement are defined by the program automatically and/or by the user. In its default mode the program will only perform gradient-driven minimization of the entire model. Other nondefault tasks allow optimization using simulated annealing (SA; Brünger et al., 1987[Brünger, A. T., Kuriyan, J. & Karplus, M. (1987). Science, 235, 458-460.]), morphing (Terwilliger et al., 2013[Terwilliger, T. C., Read, R. J., Adams, P. D., Brunger, A. T., Afonine, P. V. & Hung, L.-W. (2013). Acta Cryst. D69, 2244-2250.]), rigid-body refinement (see Afonine et al., 2009[Afonine, P. V., Grosse-Kunstleve, R. W., Urzhumtsev, A. & Adams, P. D. (2009). J. Appl. Cryst. 42, 607-615.] and references therein) and systematic residue side-chain optimizations using grid searches in torsion χ-angle space (Oldfield, 2001[Oldfield, T. J. (2001). Acta Cryst. D57, 82-94.]). Parts of the model related by internal symmetry are determined automatically, if available, or can be defined by the user. In the presence of such internal symmetry, restraints or constraints can be applied between the coordinates of related molecules. The operators relating molecules can also be refined. The result of refinement, i.e. the refined model, is output as a file in PDB or mmCIF format.

Central to almost all tasks performed within a refinement macro-cycle is the target function. Its choice is key to the success of refinement, i.e. efficient convergence to an improved model. Equally important is the assessment of refinement progress by quantifying model quality and the goodness of model-to-map fit throughout the entire process. Some relevant points are discussed below.

2.2. Refinement target function

Macromolecular cryo-EM or crystallographic experimental data are almost always of insufficient quality to refine parameters of atomic models individually. To make refinement practical, restraints or constraints are almost always used in order to incorporate extra information into refinement, and the corresponding procedures are called restrained or constrained refinement. In restrained refinement the target function is a sum of data-based and restraints-based components:

[T = {T}_{\rm data}+{w}_{\rm restraints} \times {T}_{\rm restraints}. \eqno(1)]

The first term scores the model-to-data fit and the second term incorporates a priori information about the model. The weight wrestraints balances the contribution of restraints to maximize the model-to-data fit while also obeying the a priori information, and an optimal choice of its value is crucial. Constrained refinement does not change the target function but rather changes (reduces) the set of independent parameters that can vary. Examples include rigid-body refinement, the use of a riding model (Sheldrick & Schneider, 1997[Sheldrick, G. M. & Schneider, T. R. (1997). Methods Enzymol. 277, 319-343.]) to parameterize the positions of H atoms in refinement or the implementation of RSR by Diamond (1971[Diamond, R. (1971). Acta Cryst. A27, 436-452.]) using torsion angles as variables.

2.2.1. Model-to-map target (Tdata)

In RSR, the Tdata term scores the fit of the model being refined to a target map. In cryo-EM the map is a three-dimensional reconstruction, while in crystallography it may be, for example, a 2mFobs − DFmodel map (Read, 1986[Read, R. J. (1986). Acta Cryst. A42, 140-149.]).

It is possible to express the difference between the two maps in the integral form (see, for example, Diamond, 1971[Diamond, R. (1971). Acta Cryst. A27, 436-452.])1

[T_{\rm data} = \textstyle \int \limits_{V}[\rho_{\rm calc}({\bf r})-\rho_{\rm tar}({\bf r})]^{2}\, {\rm d}{\bf r}. \eqno(2)]

For (2)[link] we suppose that the original target map is optimally scaled to the model map (Diamond, 1971[Diamond, R. (1971). Acta Cryst. A27, 436-452.]; Chapman, 1995[Chapman, M. S. (1995). Acta Cryst. A51, 69-80.]). In the following, we will consider the target to be essentially unchanged by manipulations that shift its value by a constant or a scale factor, as such manipulations do not change the position of the minimum of the target. If the Euclidean norms of ρtar(r) and ρcalc(r) are conserved during refinement [i.e. if [\textstyle \int_{V}\rho^{2}_{\rm tar}({\bf r})\,{\rm d}{\bf r}] = constant, as will be the case when the target map itself does not change, and if [\textstyle \int_{V}\rho^{2}_{\rm calc}({\bf r})\,{\rm d}{\bf r}] = constant, which will be true if the overlap of atomic densities does not change] then minimization of (2)[link] is equivalent to minimization of the anticorrelation target, which does not need the maps to be optimally scaled,

[T_{\rm data} = -\textstyle \int \limits_{V} \rho_{\rm calc}({\bf r})\rho_{\rm tar}({\bf r})\,{\rm d}{\bf r}. \eqno(3)]
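
The step from (2) to (3) can be made explicit by expanding the square. The following is a short worked derivation under the two norm-conservation assumptions stated above; the symbol C for the constant is introduced here for illustration only.

```latex
\begin{aligned}
\int_{V}[\rho_{\rm calc}({\bf r})-\rho_{\rm tar}({\bf r})]^{2}\,{\rm d}{\bf r}
 &= \int_{V}\rho_{\rm calc}^{2}({\bf r})\,{\rm d}{\bf r}
  + \int_{V}\rho_{\rm tar}^{2}({\bf r})\,{\rm d}{\bf r}
  - 2\int_{V}\rho_{\rm calc}({\bf r})\rho_{\rm tar}({\bf r})\,{\rm d}{\bf r} \\
 &= C - 2\int_{V}\rho_{\rm calc}({\bf r})\rho_{\rm tar}({\bf r})\,{\rm d}{\bf r}
\end{aligned}
```

With C constant, target (2) equals C plus twice target (3), so both targets have their minimum at the same model.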

Assuming the target ρtar and model-calculated ρcalc maps are provided on the same grid, a continuous integration in (2)[link] and (3)[link] can be replaced with a numeric integration over the regular grid on which the maps are available (see, for example, Diamond, 1971[Diamond, R. (1971). Acta Cryst. A27, 436-452.]),

[T_{\rm data} = \textstyle \sum \limits_{{\bf n}\in G}[\rho_{\rm calc}({\bf n})-\rho_{\rm tar}({\bf n})]^{2} \eqno(4)]

or

[T_{\rm data} = -\textstyle \sum \limits_{{\bf n}\in G}\rho_{\rm calc}({\bf n})\rho_{\rm tar}({\bf n}), \eqno(5)]

respectively. The set G of grid nodes used to calculate the targets (i.e. the integration volume) is either the whole map or an envelope (mask) surrounding the whole atomic model or its part that is subject to refinement.
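
As an illustration of how the grid sums (4) and (5) are restricted to a set G of nodes, here is a minimal Python sketch. It assumes maps stored as dictionaries keyed by integer grid indices, and the helper `nodes_near_atoms` is a hypothetical, non-periodic mask builder; neither reflects how phenix.real_space_refine stores maps or builds masks.

```python
def least_squares_target(rho_calc, rho_tar, nodes):
    """Equation (4): sum of squared map differences over the grid nodes in G."""
    return sum((rho_calc[n] - rho_tar[n]) ** 2 for n in nodes)


def anticorrelation_target(rho_calc, rho_tar, nodes):
    """Equation (5): negative sum of map products over the grid nodes in G."""
    return -sum(rho_calc[n] * rho_tar[n] for n in nodes)


def nodes_near_atoms(shape, atom_nodes, radius):
    """Hypothetical mask: nodes within `radius` grid steps of any atom node.

    Assumption: no periodic wrap-around and atoms sitting on grid nodes,
    which keeps the sketch short.
    """
    nx, ny, nz = shape
    r2 = radius * radius
    return [
        (i, j, k)
        for i in range(nx)
        for j in range(ny)
        for k in range(nz)
        if any((i - a) ** 2 + (j - b) ** 2 + (k - c) ** 2 <= r2
               for a, b, c in atom_nodes)
    ]
```

Passing every node of the map as `nodes` corresponds to integrating over the whole map, while a mask around the atoms being refined corresponds to the envelope case.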

To match the finite resolution of the target map in (5)[link] accurately, several steps are required to compute the model map. Firstly, the model map distribution is calculated using one of the available approximations (Sears, 1992[Sears, V. F. (1992). Neutron News, 3(3), 26-37.]; Maslen et al., 1992[Maslen, E. N., Fox, A. G. & O'Keefe, M. A. (1992). International Tables for Crystallography, Vol. C, edited by A. J. C. Wilson, pp. 476-516. Dordrecht: Kluwer Academic Publishers.]; Waasmaier & Kirfel, 1995[Waasmaier, D. & Kirfel, A. (1995). Acta Cryst. A51, 416-431.]; Grosse-Kunstleve et al., 2004[Grosse-Kunstleve, R. W., Sauter, N. K. & Adams, P. D. (2004). IUCr Comput. Comm. Newsl. 3, 22-31. https://www.iucr.org/resources/commissions/crystallographic-computing/newsletters/3.]; Peng et al., 1996[Peng, L.-M., Ren, G., Dudarev, S. L. & Whelan, M. J. (1996). Acta Cryst. A52, 257-276.]; Peng, 1998[Peng, L.-M. (1998). Acta Cryst. A54, 481-485.]). A set of Fourier coefficients is then calculated from the distribution up to the resolution limit specified by the target map.2 Finally, a subset of these coefficients is used to calculate the model Fourier synthesis ρcalc that can then be used in (5)[link]. This synthesis is a representation of a model image at a given resolution. A typical refinement may require hundreds or even thousands of such model-image calculations, which are computationally expensive, involving two Fourier transforms.

Alternatively, a model map may be calculated from the atomic model directly as a sum of individual contributions of M atoms, with each contribution being a Fourier image (or its approximation) of the corresponding atom at a given resolution (see, for example, Diamond, 1971[Diamond, R. (1971). Acta Cryst. A27, 436-452.]; Lunin & Urzhumtsev, 1984[Lunin, V. Y. & Urzhumtsev, A. G. (1984). Acta Cryst. A40, 269-277.]; Chapman, 1995[Chapman, M. S. (1995). Acta Cryst. A51, 69-80.]; Mooij et al., 2006[Mooij, W. T. M., Hartshorn, M. J., Tickle, I. J., Sharff, A. J., Verdonk, M. L. & Jhoti, H. (2006). ChemMedChem, 1, 827-838.]; Sorzano et al., 2015[Sorzano, C. O. S., Vargas, J., Otón, J., Abrishami, V., de la Rosa-Trevín, J. M., del Riego, S., Fernández-Alderete, A., Martínez-Rey, C., Marabini, R. & Carazo, J. M. (2015). AIMS Biophys. 2, 8-20.]). While this is much faster than the previous method, it may be less accurate and still be computationally expensive, especially for large models.

A numeric integration over the whole map (5)[link] can be simplified by restricting the integration to the immediate vicinity of the atomic centers rm, m = 1, …, M:

[T_{\rm data} = -\textstyle \sum \limits_{m = 1}^{M}\rho_{\rm calc}({\bf r}_m){\tilde{\rho}}_{\rm tar}({\bf r}_m). \eqno(6)]

Here, [{\tilde{\rho}}_{\rm tar}({\bf r}_m)] are the values interpolated from the nearby grid node values ρtar(n) to the atomic centers rm (Appendices A[link] and B[link]). Neglecting the local variation of the model map at the atomic centers (e.g. at low resolution) and thus supposing ρcalc(rm) ≃ constant for all m, the target simplifies further as (Rossmann, 2000[Rossmann, M. G. (2000). Acta Cryst. D56, 1341-1349.]; Rossmann et al., 2001[Rossmann, M. G., Bernal, R. & Pletnev, S. V. (2001). J. Struct. Biol. 136, 190-200.])

[T_{\rm data} = -\textstyle \sum \limits_{m = 1}^{M}{\tilde{\rho}}_{\rm tar}({\bf r}_m). \eqno(7)]

The hypothesis ρcalc(rm) ≃ constant seems to be reasonable at low resolution, when a calculated map can be considered to be rather flat. On the other hand, minimization of (7)[link] is essentially a fitting of atoms to the nearest peaks of the target map, which seems to be appropriate at high resolution as well. We show below (§3[link]) that indeed this target function is efficient over a large resolution range; Appendix B[link] supports this observation through the equivalence of targets (7)[link] and (5)[link] when taking map blurring/sharpening into account. If the difference in atomic size cannot be neglected, this target function can be modified to

[T_{\rm data} = -\textstyle \sum \limits_{m = 1}^{M}w_m{\tilde{\rho}}_{\rm tar}({\bf r}_m), \eqno(8)]

where wm is an atom-specific weight. For example, wm can be the number of electrons of the corresponding atom, or it can be set negative for O atoms of Asp and Glu residues in the case of cryo-EM or for atoms that have a negative scattering length (such as hydrogen) in the case of neutron diffraction data. For most of the macromolecular structures under consideration here the atom-centered targets (7)[link] and (8)[link] are nearly the same, and for simplicity in the following we refer only to (7)[link] unless otherwise stated. The computational cost of (7)[link] is proportional, with a very small coefficient, to the number of atoms, and therefore this target is much faster to calculate than (5)[link], which makes it advantageous for the refinement of large models. Unlike (4)[link] or (5)[link], the computational cost of (7)[link] or (8)[link] does not depend on the resolution or map-sampling rate. Essentially, target (5)[link] optimizes the fit of the shape between model-calculated and experimental maps, while target (7)[link] simply guides atoms to the nearest peaks in the experimental map. Therefore, refinement using (5)[link] can produce a more accurate model-to-map fit. An optimal refinement protocol may consist of using target (7)[link] for routine refinements and using (5)[link] for the final refinement.
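
The atom-centered targets (7) and (8) can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes trilinear interpolation, a periodic map stored as nested lists, and atomic positions given in fractional grid units; the interpolation scheme actually used by the program is the one described in Appendix A.

```python
import math


def trilinear(rho, x, y, z):
    """Interpolate rho[i][j][k] at the grid-unit position (x, y, z).

    Assumption: the map is periodic (as for a map filling a P1 cell),
    so indices wrap around at the map edges.
    """
    nx, ny, nz = len(rho), len(rho[0]), len(rho[0][0])
    i0, j0, k0 = math.floor(x), math.floor(y), math.floor(z)
    fx, fy, fz = x - i0, y - j0, z - k0
    value = 0.0
    # Weighted sum over the eight grid nodes surrounding the position.
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                node = rho[(i0 + di) % nx][(j0 + dj) % ny][(k0 + dk) % nz]
                value += wx * wy * wz * node
    return value


def atom_centered_target(rho_tar, sites, weights=None):
    """Target (8): -sum_m w_m * rho_tar(r_m); unit weights give target (7)."""
    if weights is None:
        weights = [1.0] * len(sites)
    return -sum(w * trilinear(rho_tar, *r) for w, r in zip(weights, sites))
```

Each atom touches only eight grid nodes, so the cost grows with the number of atoms and not with the number of grid points, which is the property the text relies on.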

2.2.2. Restraints (Trestraints)

In restrained refinement, extra information is introduced through the term Trestraints with some weight (1)[link]. This extra term restrains model parameters to be similar, but not necessarily identical, to some reference values. At high to medium resolutions of approximately 3 Å or better, a standard set of restraints as implemented in PHENIX includes (Grosse-Kunstleve & Adams, 2004[Grosse-Kunstleve, R. W. & Adams, P. D. (2004). IUCr Comput. Comm. Newsl. 4, 19-36. https://www.iucr.org/resources/commissions/crystallographic-computing/newsletters/4.]) restraints on covalent bond lengths and angles, dihedral angles, planarity and chirality restraints, and a nonbonded repulsion term. However, at lower resolutions the amount of experimental data is insufficient to preserve the geometry characteristics of a higher level of structural organization (such as secondary structure), and therefore including extra information (restraints or constraints) to help to produce a chemically meaningful model is desirable. These extra restraints or constraints may include similarity of related copies (NCS in the case of crystallography), restraints on secondary structure and restraints to one or more external reference models (for implementation details in PHENIX, see Headd et al., 2012[Headd, J. J., Echols, N., Afonine, P. V., Grosse-Kunstleve, R. W., Chen, V. B., Moriarty, N. W., Richardson, D. C., Richardson, J. S. & Adams, P. D. (2012). Acta Cryst. D68, 381-390.], 2014[Headd, J. J., Echols, N., Afonine, P. V., Moriarty, N. W., Gildea, R. J. & Adams, P. D. (2014). Acta Cryst. D70, 1346-1356.]; Sobolev et al., 2015[Sobolev, O. V., Afonine, P. V., Adams, P. D. & Urzhumtsev, A. (2015). J. Appl. Cryst. 48, 1130-1141.]). phenix.real_space_refine can use the following extra restraints and constraints.

  • (i) Distance and angle restraints on hydrogen-bond patterns for protein helices and sheets and DNA/RNA base pairs.

  • (ii) Torsion-angle restraints on idealized protein secondary-structure fragments.

  • (iii) Restraints to maintain stacking bases in RNA/DNA parallel.

  • (iv) Ramachandran plot restraints.

  • (v) Amino-acid side-chain rotamer-specific restraints.

  • (vi) Cβ deviation restraints.

  • (vii) Reference-model restraints, where a reference model may be a similar structure of better quality or the initial position of the model being refined.

  • (viii) Similarity restraints in torsion or Cartesian space.

  • (ix) NCS constraints.

2.2.3. Relative weight

The relative weight wrestraints is chosen such that the model fits the map as well as possible while maintaining reasonable deviations from ideal covalent bond lengths and angles. In PHENIX, wrestraints for RSR is determined by systematically trying a range of plausible values and performing a short refinement for each trial value. A similar procedure in FSR would be very computationally expensive because for each trial value of wrestraints the whole structure would need to be used. In RSR this is computationally feasible using (7)[link] but not (5)[link]. The weight-calculation procedure implemented in phenix.real_space_refine splits the model into a set of randomly chosen segments, each one a few residues long. After trial refinements of each segment with different weights, the best weight for that segment is defined as the one that results in a model possessing reasonable bond and angle root-mean-square deviations (r.m.s.d.s) and that has the best model-to-map fit among all trial weights. The resulting array of best weights for all segments is filtered for outliers, and the average of the remaining weights is used as the weight for the final refinement. This calculation typically takes less than a minute on an ordinary computer and is independent of the size of the structure or map. Instead of computing a single average weight for the entire model, this protocol can be extended (work in progress) to calculate and use different weights for different parts of the map, accounting for variations in local map quality.
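
The selection logic described above can be sketched as follows. This is a hedged illustration, not the program's code: the r.m.s.d. thresholds (0.01 Å and 1.0°), the outlier filter based on the median absolute deviation and all function names are assumptions, and the trial refinements themselves are represented only by their results.

```python
from statistics import mean, median


def best_weight_for_segment(trials, max_bond_rmsd=0.01, max_angle_rmsd=1.0):
    """Pick the best trial weight for one segment.

    `trials` is a list of (weight, bond_rmsd, angle_rmsd, map_fit) tuples,
    one per trial refinement. The thresholds are illustrative assumptions.
    """
    acceptable = [t for t in trials
                  if t[1] <= max_bond_rmsd and t[2] <= max_angle_rmsd]
    if not acceptable:
        return None  # no trial kept reasonable geometry for this segment
    # Among geometrically acceptable trials, take the best model-to-map fit.
    return max(acceptable, key=lambda t: t[3])[0]


def filter_outliers(values, k=3.0):
    """Assumed filter: keep values within k median absolute deviations."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)
    return [v for v in values if abs(v - med) <= k * mad]


def overall_weight(segment_trials):
    """Average the per-segment best weights after outlier filtering."""
    best = [best_weight_for_segment(t) for t in segment_trials]
    best = [w for w in best if w is not None]
    if not best:
        raise ValueError("no segment produced an acceptable weight")
    return mean(filter_outliers(best))
```

Because each trial involves only a short segment, the number of trial refinements depends on the number of segments and weights tried, not on the size of the model.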

2.3. Evaluation of refinement progress and results

It is recognized that model validation (see, for example, Brändén & Jones, 1990[Brändén, C.-I. & Jones, T. A. (1990). Nature (London), 343, 687-689.]; Read et al., 2011[Read, R. J. et al. (2011). Structure, 19, 1395-1412.]; Wlodawer & Dauter, 2017[Wlodawer, A. & Dauter, Z. (2017). Acta Cryst. D73, 379-380.]) is a critical step in structure determination, and a number of corresponding tools have been developed in crystallography (see, for example, Chen et al., 2010[Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12-21.]; Read et al., 2011[Read, R. J. et al. (2011). Structure, 19, 1395-1412.]; Gore et al., 2017[Gore, S. et al. (2017). Structure, 25, 1916-1927.]; Williams et al., 2018[Williams, C. J. et al. (2018). Protein Sci. 27, 193-315.] and references therein) and some in cryo-EM studies (see, for example, Henderson et al., 2012[Henderson, R. et al. (2012). Structure, 20, 205-214.]; Tickle, 2012[Tickle, I. J. (2012). Acta Cryst. D68, 454-467.]; Lagerstedt et al., 2013[Lagerstedt, I., Moore, W. J., Patwardhan, A., Sanz-García, E., Best, C., Swedlow, J. R. & Kleywegt, G. J. (2013). J. Struct. Biol. 184, 173-181.]; Barad et al., 2015[Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943-946.]; Pintilie et al., 2016[Pintilie, G., Chen, D.-H., Haase-Pettingell, C. A., King, J. A. & Chiu, W. (2016). Biophys. J. 110, 827-839.]; Joseph et al., 2017[Joseph, A. P., Lagerstedt, I., Patwardhan, A., Topf, M. & Winn, M. (2017). J. Struct. Biol. 199, 12-26.]; Afonine et al., 2018[Afonine, P. V., Klaholz, B. K., Moriarty, N. W., Poon, B. K., Sobolev, O. V., Terwilliger, T. C., Adams, P. D. & Urzhumtsev, A. (2018). bioRxiv. https://doi.org/10.1101/249607.]). Generally, the process consists of assessing data, model quality and model-to-data fit quality, and is performed locally and globally. At the stage of refining a model we assume that the intrinsic data quality has already been evaluated, and only model quality and model-to-data fit need to be monitored.

The methods and tools to evaluate the geometric quality of a model are the same in crystallography and in cryo-EM. For example, the PHENIX comprehensive validation program provides a detailed report on model quality, making extensive use of the MolProbity validation algorithms (Chen et al., 2010[Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12-21.]; Williams et al., 2018[Williams, C. J. et al. (2018). Protein Sci. 27, 193-315.]). In crystallography, the model-to-data fit is quantified by crystallographic R and Rfree (Brünger, 1992[Brünger, A. T. (1992). Nature (London), 355, 472-475.]) factors, which are global reciprocal-space metrics. In cryo-EM, model and data validation is currently performed by the comparison of complex Fourier coefficients in resolution shells; these coefficients are calculated from the model and from the full map or half-maps; different masks can be applied prior to calculation of these coefficients. In real space, the model-to-data fit can also be evaluated locally or globally by various correlation coefficients between a model-calculated map and the experimentally derived map (Urzhumtsev et al., 2014[Urzhumtsev, A., Afonine, P. V., Lunin, V. Y., Terwilliger, T. C. & Adams, P. D. (2014). Acta Cryst. D70, 2593-2606.]; Afonine et al., 2018[Afonine, P. V., Klaholz, B. K., Moriarty, N. W., Poon, B. K., Sobolev, O. V., Terwilliger, T. C., Adams, P. D. & Urzhumtsev, A. (2018). bioRxiv. https://doi.org/10.1101/249607.]). Some of these tools are used in §3.2[link], where models extracted from the PDB are refined against experimental cryo-EM maps.

3. Results

3.1. Test refinements with simulated data

Below, we illustrate the performance of refinement at different resolutions and map sharpnesses and using atomic models with various amounts of error in the coordinates. All refinements were performed using refinement target (1)[link] with geometry restraints included with optimal weights and data term (7)[link]. We begin with several numerical tests using simulated data. The advantage of such tests is that one can study individual effects in a setting where the answer is known.

3.1.1. Preparing simulated data

A model from the PDB (PDB entry 3vb1) was chosen as a test model. The following manipulations were made to this model prior to test calculations: (i) the model was placed in a sufficiently large P1 unit cell, (ii) alternative conformations were replaced with a single conformation and (iii) model geometry was regularized using the phenix.geometry_minimization tool until convergence. In the following, we refer to this model as a reference model. Several Fourier maps at different resolutions dhigh (1, 2, 3, 4, 5 and 6 Å) were calculated from the reference model considering three different overall B factors of 0, 100 and 200 Å2; these maps mimic ρtar (18 maps in total). The maps were calculated on a grid with the step equal to dhigh/4. Additionally, we calculated the same maps on a much finer grid with a step of 0.2 Å; the same step was used for all maps independent of their resolution.
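The set of simulated maps described above can be enumerated as in the following sketch. The dictionary keys are hypothetical names chosen for this illustration; the actual map calculation (performed with PHENIX tools) is not shown.

```python
from itertools import product

resolutions = [1, 2, 3, 4, 5, 6]   # d_high in Angstrom
b_factors = [0, 100, 200]          # overall B factors in Angstrom^2
fine_grid_step = 0.2               # Angstrom, the same for every map

# One entry per (resolution, B factor) combination.
maps = [
    {
        "d_high": d_high,
        "b_overall": b_overall,
        "grid_step": d_high / 4.0,          # default grid step d_high/4
        "fine_grid_step": fine_grid_step,   # additional finer sampling
    }
    for d_high, b_overall in product(resolutions, b_factors)
]
```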

3.1.2. Refinement of the exact reference model

Firstly, we refined the reference model against finite-resolution maps calculated from this model, as described in §[link]3.1.1. While the reference model corresponds to the minimum of (5)[link], this is not the case for (7)[link] because map peaks in finite resolution Fourier images do not necessarily correspond to atomic centers. Therefore, it is expected that refinement using (7)[link] may shift the model from its original, correct, position. The goal of this test is to provide an estimate of the magnitude of these shifts after refinement. For each refined model we calculated the root-mean-square deviation (r.m.s.d.) from the reference model. Fig. 3[link] summarizes the result of this test. We observe the following.

  • (i) Refinement using a finer grid does not have any significant effect compared with using a dhigh/4 grid step (compare the green full circles and the open circles in Fig. 3[link], both of which correspond to B = 100 Å2).

  • (ii) The r.m.s.d. increases as the resolution worsens and ranges from as low as 0.01 Å at 1 Å resolution to as high as 0.48 Å at 6 Å resolution. These r.m.s.d.s are small compared with the details that can be resolved in maps at these resolutions. This justifies the use of a target (7)[link] that is less accurate but much faster to calculate than (5).

  • (iii) Map sharpness has a mixed effect. At high resolution (1–2 Å) maps corresponding to the lowest B of 0 Å2 produce more accurate results. At intermediate resolutions (3–5 Å) maps corresponding to both the lowest and the largest B perform worse compared with those corresponding to an intermediate value (B = 100 Å2). Maps with the largest B of 200 Å2 result in overall less accurate models. These observations suggest that depending on resolution some attenuation of map sharpness may be useful.

[Figure 3]
Figure 3
Refinement of the exact model against 18 maps computed as described in §[link]3.1.1. Each circle shows the root-mean-square deviation between the refined model and the reference model. Blue, green and orange full circles correspond to maps with overall B factors of 0, 100 and 200 Å2, respectively. Open circles correspond to the map with an overall B factor of 100 Å2 computed on the finer grid with a step of 0.2 Å. See §[link]3.1.2 for details.

3.1.3. Refinement of perturbed reference models

Here, we describe tests that are similar to those in §[link]3.1.2 except that instead of refining the reference model we refined perturbed reference models. These perturbed models were obtained by running molecular-dynamics (MD) simulations using the phenix.dynamics tool until a prescribed r.m.s.d. compared with the reference model was achieved. Given the stochastic nature of MD, it is possible to obtain many different models with the same r.m.s.d. from the reference model. Owing to the limited convergence radius of refinement and the finite resolution of the data, refinement of these models will not produce exactly the same refined models. Therefore, to ensure more robust statistics, for each chosen r.m.s.d. we generated an ensemble of 100 models. The r.m.s.d. values between the perturbed and reference models were chosen to be 0.5, 1.0, 1.5 and 2.0 Å. We then refined each of these 100 × 4 = 400 models against each of 18 maps (§[link]3.1.1) calculated on a grid with a spacing of dhigh/4. For each refined model (from 100 × 4 × 6 × 3 = 7200 refined models) we calculated the r.m.s.d. from the reference model and then the average r.m.s.d. over the corresponding ensemble of 100 models. Fig. 4[link] summarizes the results of this test. We observe the following.

  • (i) In most cases refinement was able to significantly reduce the difference between the reference and starting perturbed models. The refinement of models with a starting r.m.s.d. of 0.5 Å gives results similar to those from the refinement of the nonperturbed reference model (similar final r.m.s.d.).

  • (ii) In almost all cases using a blurred map results in less accurate refined models.

  • (iii) In the case of large errors (1.5–2 Å) refinement against a 1 Å resolution map corresponding to an overall B of 0 Å2 performs the worst compared with blurrier maps. This can be rationalized as follows: the peaks in a very sharp map are narrow, and sufficiently large displacements of atoms away from these peaks place the atoms outside the convergence radius of minimization.

  • (iv) At resolutions of 3–5 Å using neither very sharp nor very blurred maps produces the best results, although the effect is rather small. This suggests that there exists an optimal sharpening B value that is most suitable for refinement at a given resolution.
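The bookkeeping behind this test can be sketched as follows: an r.m.s.d. between two models and its average over an ensemble. This is a minimal illustration that assumes both models list the same atoms in the same order and are already in the same frame (no superposition is performed); the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """R.m.s.d. between two equally ordered lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("models must contain the same atoms")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def ensemble_mean_rmsd(reference, refined_models):
    """Average r.m.s.d. from the reference over an ensemble of refined models."""
    values = [rmsd(reference, model) for model in refined_models]
    return sum(values) / len(values)


# 100 models x 4 starting r.m.s.d. values x 6 resolutions x 3 B factors
n_refinements = 100 * 4 * 6 * 3
```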

[Figure 4]
Figure 4
Refinement of perturbed models against maps computed as described in §[link]3.1.1. The horizontal axis shows the r.m.s.d. between the reference model and perturbed models: 0.5, 1.0, 1.5 and 2.0 Å. The vertical axis shows the r.m.s.d. between the reference model and the refined models. Blue, green and orange full circles correspond to maps with overall B factors of 0, 100 and 200 Å2, respectively. See §[link]3.1.3 for details.

3.2. Refinement using data from the PDB and EMDB

3.2.1. Cryo-EM maps

Three-dimensional reconstructions (cryo-EM maps) represent the electric potential of the sample. Therefore, these maps are expected to have negative features around negatively charged moieties such as aspartate and glutamate (see, for example, Hryc et al., 2017[Hryc, C. F., Chen, D.-H., Afonine, P. V., Jakana, J., Wang, Z., Haase-Pettingell, C., Jiang, W., Adams, P. D., King, J. A., Schmid, M. F. & Chiu, W. (2017). Proc. Natl Acad. Sci. USA, 114, 3103-3108.]). Furthermore, such moieties may be susceptible to radiation damage and therefore may have a weaker footprint in the reconstructions. This may have an implication for real-space refinement that uses target (7)[link] [or (5)[link] if the form factors do not reproduce the negative features] because this target favors atomic shifts towards positive map peaks. To investigate this effect, we surveyed map values at atomic positions considering reconstructions at 3 Å or better and map–model correlation better than 0.8. This selected nine (map, model) pairs. Prior to calculations, we normalized all selected maps to have zero mean value and a standard deviation of 1. Fig. 5[link](a) shows the distribution of map values for four groups of atoms: main-chain atoms, side-chain O atoms of Asp and Glu residues that may be negatively charged (OD1, OD2, OE1 and OE2), side-chain atoms of Arg and Lys residues that may be positively charged (NH1, NH2 and NZ) and all other side-chain atoms. We observe that side-chain O atoms of Asp and Glu residues indeed have systematically weaker map values, with about 8% of atoms having values below a threshold of −1 times the r.m.s. of the map. Negative map values for all other kinds of atoms are greater than −0.5 r.m.s. and may be considered as noise. We note that the size and flexibility of Asp, Glu, Arg and Lys side chains are likely to contribute to systematically weaker densities for these side chains. We repeated the same analysis for maps of lower resolution (3–4 Å; Fig. 5[link]b). 
Here, the number of reliably observed atoms with negative features in the map is less than 1%.
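The normalization and thresholding used in this survey can be sketched as below. The sketch assumes that the map is available as a flat list of grid values and that the map values at atomic positions have already been obtained by interpolation; both function names are hypothetical.

```python
import math


def normalize_map(values):
    """Rescale map values to zero mean and unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def fraction_below(values_at_atoms, threshold=-1.0):
    """Fraction of atoms whose (normalized) map value lies below the threshold."""
    count = sum(1 for v in values_at_atoms if v < threshold)
    return count / len(values_at_atoms)
```

After normalization the r.m.s. of the map equals its standard deviation, so a threshold of -1 corresponds to -1 times the r.m.s. of the map.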

[Figure 5]
Figure 5
Distribution of cryo-EM map values (scaled in r.m.s.) for selected groups of atoms, considering maps at 3 Å or better (a) and 3–4 Å (b) resolution. See §[link]3.2.1 for details.

This analysis shows that for the majority of cryo-EM models (resolution of 3 Å or worse) the concern about negative features in the map is rather small and is unlikely to affect the results of refinement using (7)[link] significantly. On the other hand, the rapidly increasing number of higher resolution cryo-EM maps (better than 3 Å) is likely to highlight the limitation of (7)[link] and to demand further improvements of the refinement target [such as using (8)[link] with properly chosen weights].

3.2.2. Default refinement

In order to test the suggested methods and demonstrate their utility, we re-refined 385 cryo-EM models from the PDB that are reported at a resolution of 6 Å or better, that have model–map correlation greater than 0.3 and that contain only residues and ligands that are known to the PHENIX restraint library. A number of metrics were analyzed: the model-to-map correlation coefficient CCmask calculated in the map region around the model (for an exact definition, see Afonine et al., 2018[Afonine, P. V., Klaholz, B. K., Moriarty, N. W., Poon, B. K., Sobolev, O. V., Terwilliger, T. C., Adams, P. D. & Urzhumtsev, A. (2018). bioRxiv. https://doi.org/10.1101/249607.]), the number of Ramachandran plot and rotamer outliers, excessive Cβ deviations, the MolProbity clashscore (Chen et al., 2010[Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12-21.]) and the EMRinger score (Barad et al., 2015[Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943-946.]; calculated for 277 entries with maps at 4.5 Å resolution or better), all calculated for the initial models from the PDB and for the models after refinement. Default parameters were used in all refinements that, in addition to standard restraints, also include rotamer, Cβ deviations and Ramachandran plot restraints, as well as NCS constraints where applicable (see §[link]2.2.2). The program ran successfully, generating a refined model for all cases and highlighting the robustness of the algorithms and their implementation. In all cases we observe a substantial overall improvement of geometry metrics, such as reduced or fully eliminated Ramachandran plot and rotamer outliers, Cβ deviations and MolProbity clashscore, as well as improvement of the model-to-data (map) fit (Fig. 6[link]). 
Clearly, the removal of some outliers can be attributed to the use of rotamer, Cβ deviations and Ramachandran plot restraints. Therefore, we also used an orthogonal validation metric to assess model improvement: EMRinger (Barad et al., 2015[Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943-946.]). We observe that the overall average EMRinger score for the initial models is 1.73 and that for the refined models is 2.26. The improvement of the EMRinger score for the refined models indicates that the amino-acid side chains are more chemically realistic and better fit the map. Detailed validation or analysis of individual refinement results is outside the scope of this work, but will be important in the future to assess the impact of stereochemical restraints on models, particularly when the starting models are of very poor quality.

[Figure 6]
Figure 6
Model statistics before (brown) and after (blue) refinement using phenix.real_space_refine, showing Ramachandran plot and residue side-chain rotamer outliers, Cβ deviations, MolProbity clashscore and model–map correlation coefficient (CCmask). The scatter plot shows the EMRinger score for the original and refined models (resolution better than 4.5 Å).

3.2.3. Refinement against sharpened maps

Our tests using simulated data (§[link]3.1) have indicated that map sharpening or blurring may be useful in refinement. To investigate this with real experimental data we performed the following test. We selected models as described in §[link]3.2.2, additionally requiring that independent half-maps had also been deposited by the researcher. This resulted in 76 entries. We performed test refinements against the first of the two half-maps and evaluated the refined model-to-data fit using the original second half-map that had not been used in any calculations. In two independent refinements, the first half-map was taken either as deposited or modified with phenix.auto_sharpen (Terwilliger, Sobolev et al., 2018[Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. (2018). Acta Cryst. D74, 545-559.]), which automatically sharpens or blurs the map to an optimal level. Fig. 7[link] shows the model–map correlation CCmask for models refined against the original and sharpened first half-maps; the original second half-maps were used to compute the correlations. Overall, the CCs across all 76 cases are similar for refinement against the original first half-map and the sharpened first half-map. The refined models fit slightly but systematically better when using sharpened maps if the original model–map CC is low (<0.5) and systematically slightly worse if the original model–map correlation is higher (CC > 0.5). This agrees with the observation that target (7)[link] allows the removal of large errors but may slightly distort exact models (§[link]3.1.2). Also, we note that the MolProbity scores for models refined against sharpened maps are systematically better, but the difference is small.

[Figure 7]
Figure 7
Left, correlation coefficient CCmask calculated using the original second half-maps and maps calculated from models refined against the first half-maps: original (x axis) versus sharpened (y axis). Right, MolProbity scores for models using original first half-maps versus sharpened first half-maps.

3.2.4. Re-refinement of the TRPV1 structure

The structure of the TRPV1 ion channel (PDB entry 3j5p; EMDB code EMD-5778) was determined by single-particle cryo-EM (Liao et al., 2013[Liao, M., Cao, E., Julius, D. & Cheng, Y. (2013). Nature (London), 504, 107-112.]) at a resolution of 3.28 Å. The model was built manually and was not subjected to refinement. As the model was not refined it contains substantial geometry violations: the clashscore is high (∼100) and about one third of the side chains are identified as rotamer outliers (Table 1[link]). More recently, the better resolved part of this structure has been re-evaluated using the same data (Barad et al., 2015[Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943-946.]; PDB entry 3j9j; ankyrin domain not included). This involved some rebuilding and refinement using algorithms implemented in the Rosetta suite (DiMaio et al., 2015[DiMaio, F., Song, Y., Li, X., Brunner, M. J., Xu, C., Conticello, V., Egelman, E., Marlovits, T., Cheng, Y. & Baker, D. (2015). Nature Methods, 12, 361-365.]). The resulting model has a much improved clashscore and EMRinger score (Barad et al., 2015[Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943-946.]) and no rotamer outliers, yet the number of Ramachandran plot outliers has increased compared with the original model (Table 1[link]). We performed a refinement of PDB entry 3j5p (the portion that matches PDB entry 3j9j) using phenix.real_space_refine with all default settings and automatically, with no manual intervention, using the original, deposited map. The refinement took about 3 min on a Macintosh laptop.3 Overall, the refined model is similar to PDB entry 3j9j (virtually no rotamer or Ramachandran plot outliers), the EMRinger score is improved further and the model-to-map correlation (CCmask) is increased compared with both PDB entries 3j5p and 3j9j. 
Notably, the MolProbity clashscore decreased from 100.8 to 5.6 as a result of the resolution of numerous steric clashes (Fig. 8[link]).

Table 1
Summary of statistics for the original model (PDB entry 3j5p), the model re-refined by Barad et al. (2015[Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943-946.]) (PDB entry 3j9j) and the model re-refined using phenix.real_space_refine

Metric 3j5p 3j9j 3j5p (phenix.real_space_refine)
CCmask 0.65 0.59 0.82
EMRinger score 1.2 2.6 3.3
R.m.s.d.
 Bonds (Å) 0.01 0.02 0.01
 Angles (°) 1.50 1.10 1.44
Ramachandran plot (%)
 Favored 95.8 94.5 93.3
 Allowed 4.2 3.3 6.7
 Outliers 0 2.2 0
Rotamer outliers (%) 32.3 0 <1
Clashscore 100.8 2.7 5.6
Cβ deviations 0 0 0
†No ankyrin domain.
[Figure 8]
Figure 8
Backbone of the 3j5p model before (a) and after (b) refinement shown in black. The model before refinement contains a substantial number of steric clashes (indicated by red dots) and many side-chain rotamer outliers (blue side chains). Most clashes and rotamer outliers are resolved by phenix.real_space_refine. The images were created using the KiNG program (Chen et al., 2009[Chen, V. B., Davis, I. W. & Richardson, D. C. (2009). Protein Sci. 18, 2403-2409.]) from within PHENIX.

Modeling experimental data at resolutions worse than atomic (atomic resolution being around 1–1.5 Å or better) may not be unambiguous (Terwilliger et al., 2007[Terwilliger, T. C., Grosse-Kunstleve, R. W., Afonine, P. V., Adams, P. D., Moriarty, N. W., Zwart, P., Read, R. J., Turk, D. & Hung, L.-W. (2007). Acta Cryst. D63, 597-610.]). Therefore, it may be instructive to perform several trial refinements, each using exactly the same settings but different (perturbed) input models. Here, we generated an ensemble of 100 perturbed models by running molecular-dynamics simulation (using the phenix.dynamics tool) until the r.m.s. deviation between the starting and simulated models reached 3 Å (Fig. 9[link]a). We then refined all models using phenix.real_space_refine until convergence. This resulted in 100 refined models that are overall similar but vary locally (Fig. 9[link]b). This highlights the fact that a single-model representation of experimental data is an approximation and should not be taken too literally (for example, when it comes to measuring and reporting distances between atoms). Also, this test demonstrates the rather large convergence radius of phenix.real_space_refine: the average map–model correlation (CCmask) across all 100 refined models is 0.80, with the smallest and largest values being 0.79 and 0.81.

[Figure 9]
Figure 9
(a) Ensemble of perturbed 3j5p models; the r.m.s. deviation of each model from the initial model is 3 Å, showing chain A only. (b) Ensemble of refined models in the experimental map. The largest variation is observed in regions that lack density. The images were created using the ChimeraX program (Goddard et al., 2018[Goddard, T. D., Huang, C. C., Meng, E. C., Pettersen, E. F., Couch, G. S., Morris, J. H. & Ferrin, T. E. (2018). Protein Sci. 27, 14-25.]).

4. Conclusions

Refinement of an atomic model against a map is increasingly important as the technique of cryo-EM rapidly develops. We have described the algorithms implemented in a new PHENIX tool, phenix.real_space_refine, that was specifically designed to perform such real-space refinements. RSR is a natural choice for cryo-EM, unlike crystallography, where real-space methods are complementary to Fourier-space refinement and are somewhat limited since crystallographic maps are almost always model-biased. Nevertheless, while this work was inspired by rapid advances in the field of cryo-EM and the increasing number of three-dimensional reconstructions that allow atomic models to be refined (as opposed to rigid-body docked), the implementation is not limited to cryo-EM and crystallographic maps can also be used.

The proposed real-space refinement procedure is fast owing to the use of an atom-centered refinement target function that has been shown to be efficient at all tested resolutions from 1 to 6 Å. Several options for key calculation steps, such as map interpolation, gradient calculation and preliminary processing of the target (experimental) map, are available with the default choices selected on the basis of extensive test calculations. The real-space refinement algorithm includes a fast and efficient search for the optimal relative weight of restraints, a procedure that is extremely challenging for reciprocal-space refinement. The refinement algorithm is robust, with no failures for any of the cryo-EM maps tested. For all test model refinements improvements are observed; in some cases these improvements are significant. Future developments of the algorithms will include methods to account for local variation in map resolution, a fast and accurate calculation of (5)[link] for the final refinement cycles and efficient modeling of atomic displacements.

APPENDIX A

Real-space targets and convolution

We show here that if the atoms all have the same shape, sampling a map at the positions of atomic centers, as in (7)[link], can be made equivalent to the correlation function obtained by integrating or summing over the product of calculated and target densities, as in (3)[link] or (5)[link]. Consider a simplified structure composed of a single atom. Looking for its best position according to (3)[link] or (5)[link] corresponds to seeking the position where the weighted average of the target map values (weighted by the atomic shape) inside a sphere centered at the trial atomic position is maximal. This calculation and check for the maximal value could be performed point by point. Alternatively, one can first calculate such averages for all grid points, replace the initial map values by these sums and then simply choose the maximum. From a mathematical point of view this averaging can be considered as a convolution and, if calculated simultaneously for the whole map, can be performed rapidly (Leslie, 1987[Leslie, A. G. W. (1987). Acta Cryst. A43, 134-136.]; Urzhumtsev et al., 1989[Urzhumtsev, A. G., Lunin, V. Y. & Luzyanina, T. B. (1989). Acta Cryst. A45, 34-39.]). Checking the values of the averaged, i.e. blurred, map for their maximum corresponds to using targets (7)[link] or (8)[link]. Below, we give a formal interpretation of these real-space targets.

Let Z0f0(|s|; B0) be a scattering factor of some isotropic atom characterized by a B0 value and the electron number Z0. Let Z0ρ0(r; B0) be an image of this atom in the corresponding model map if it is placed at the origin. Both Z0f0(|s|; B0) and Z0ρ0(r; B0) are spherically symmetric and related by Fourier transformation. If a hypothetical structure is composed of a single atom positioned at r0, the corresponding model map is

[\rho_{{\rm calc},0}({\bf r}) = Z_0\rho_0({\bf r}-{\bf r}_{0}\semi B_{0}), \eqno (9)]

which can be seen as a convolution of a point scatterer at position r0 with the atomic shape. Owing to the spherical symmetry of ρ0(r; B0), the target function (3)[link]

[\eqalignno {T_{\rm data} &= -\textstyle\int\limits_{V} \rho_{\rm tar}({\bf r})\rho_{{\rm calc},0}({\bf r})\,{\rm d}{\bf r} = -Z_{\rm 0}\textstyle \int\limits_{V}\rho_{\rm tar}({\bf r})\rho_{0}({\bf r}-{\bf r}_0\semi B_{0})\,{\rm d}{\bf r} \cr & = -Z_{0}\textstyle\int\limits_{V}\rho_{\rm tar}({\bf r})\rho_{0}({\bf r}_{0}-{\bf r}\semi B_{\rm 0})\,{\rm d}{\bf r} & (10)}]

can be interpreted as a convolution of the target map with ρ0(r; B0) taken at point r0. Let {Ftar(s)} be the set of Fourier coefficients corresponding to the target map ρtar(r). By the convolution theorem, (10)[link] is equal to the Fourier series calculated with the product of the corresponding Fourier coefficients,

[\eqalignno {-Z_{0}\textstyle \sum \limits_{\bf s}&{\bf F}_{\rm tar}({\bf s})f_{0}(|{\bf s}|\semi B_{0})\exp(-2\pi i{\bf r}_0{\bf s}) \cr & = -Z_{0}\textstyle \sum \limits_{\bf s}[{\bf F}_{\rm tar}({\bf s})\cdot f_{0}(|{\bf s}|\semi B_{0})]\exp(-2\pi i{\bf r}_0{\bf s}) \cr &= -Z_{0}\rho_{{\rm tar}\_0}({\bf r}_{0}\semi B_{0}). &(11)}]

Here, the map ρtar_0(r; B0) is a Fourier series calculated with the coefficients Ftar(s)f0(|s|; B0). In other words, instead of blurring the model map with the atomic shape and calculating the point-by-point product of the two maps, one may blur the experimental map and leave the model map unblurred, i.e. as a point map.
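This equivalence can be checked numerically in a simplified setting. The sketch below is a one-dimensional, periodic, discrete analogue, assuming a symmetric Gaussian atomic shape; all names are hypothetical. For brevity the blurred map is computed by a direct circular convolution rather than through Fourier coefficients, which would be the faster route for a whole map.

```python
import math


def gaussian_shape(n, width):
    """Symmetric periodic 'atomic shape' centered at grid point 0."""
    return [math.exp(-0.5 * (min(i, n - i) / width) ** 2) for i in range(n)]


def overlap_target(target, shape, center):
    """Discrete analogue of (10): minus the overlap of the target map
    with the atomic shape placed at grid point `center`."""
    n = len(target)
    return -sum(target[r] * shape[(r - center) % n] for r in range(n))


def blur(target, shape):
    """Blurred target map: circular convolution of the target map with the
    atomic shape, computed once for all grid points."""
    n = len(target)
    return [
        sum(target[r] * shape[(c - r) % n] for r in range(n))
        for c in range(n)
    ]
```

With these definitions, `overlap_target(target, shape, c)` agrees with `-blur(target, shape)[c]` at every grid point `c`, so searching for the best atomic position reduces to looking up values of the blurred map.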

For a multi-atom model

[\eqalignno {T_{\rm data} & = -\textstyle \int \limits_{V}\rho_{\rm tar}({\bf r})\rho_{\rm calc}({\bf r})\,{\rm d}{\bf r} = {-\textstyle \int \limits_{V}}\rho_{\rm tar}({\bf r}) \left [\textstyle \sum \limits_{m=1}^{M}\rho_{{\rm calc},m}({\bf r})\right]\, {\rm d}{\bf r} \cr & = -\textstyle \sum \limits_{m=1}^{M}\textstyle \int \limits_{V}\rho_{\rm tar}({\bf r})\rho_{{\rm calc},m}({\bf r})\,{\rm d}{\bf r}. & (12)}]

At resolutions typical for bio-crystallography the shapes of macromolecular atoms are similar. If we additionally suppose that all of the atoms of the structure have the same (or similar) atomic displacement parameters Bm = B0, then

[T_{\rm data} \simeq - \textstyle\sum\limits_{m=1}^{M}Z_m\rho_{{\rm tar}\_0}({\bf r}_m\semi B_0)\eqno (13)]

using the function ρtar_0(r; B0) calculated once in advance. This shows that in calculating (8) we in fact implicitly sharpen the target map using ρtar(r) instead of ρtar_0(r; B0). Even when using (8)[link] as the target, it is likely to be beneficial to choose an optimal sharpening factor, just as the signal in map correlations can be improved.
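For reference, the step from (12) to (13) can be written out term by term. This is a sketch under the stated assumption that every atom m has the common shape ρ0(r; B0) scaled by Zm, so that ρcalc,m(r) = Zmρ0(r − rm; B0); applying (10) and (11) to a single term of (12) then gives

[-\textstyle\int\limits_{V}\rho_{\rm tar}({\bf r})\rho_{{\rm calc},m}({\bf r})\,{\rm d}{\bf r} = -Z_m\textstyle\int\limits_{V}\rho_{\rm tar}({\bf r})\rho_{0}({\bf r}_m-{\bf r}\semi B_0)\,{\rm d}{\bf r} = -Z_m\rho_{{\rm tar}\_0}({\bf r}_m\semi B_0).]

Summing over m = 1, …, M reproduces (13); the approximation sign in (13) reflects the fact that the atomic shapes and B values are only approximately equal.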

If the difference in atomic B values cannot be neglected, one can calculate in advance a few maps ρtar_0(r; Bk) for a range of B-factor values Bk, k = 1, …, K, and use the appropriate ρtar_0(rm; Bk) for a particular atom m,

[R_{Z{\hbox {-}}{\rm atoms}} = -\textstyle\sum\limits_{m=1}^{M}Z_m\rho_{{\rm tar}\_0}[{\bf r}_m\semi B_k(m)]. \eqno (14)]

If the atomic shapes are significantly different, as is the case for H atoms in neutron maps or negatively charged side chains in cryo-EM maps at high resolution, the approximation (13)[link] can be used with Zm being a negative value, or the target map can be convoluted with the respective atomic shape (which can be negative) before the sum over the relevant atoms is calculated.

APPENDIX B

Three-dimensional interpolation used

B1. General remarks

Using the atom-centered targets (7)[link] and (8)[link] requires an efficient and accurate interpolation of the maps calculated on three-dimensional regular grids. Both the interpolated function values and the gradient are needed. In this work, two options have been considered: trilinear (https://en.wikipedia.org/wiki/Trilinear_interpolation) and tricubic (https://en.wikipedia.org/wiki/Tricubic_interpolation). Both interpolation procedures, including the gradient calculation, are available through the cctbx software library (Grosse-Kunstleve et al., 2002[Grosse-Kunstleve, R. W., Sauter, N. K., Moriarty, N. W. & Adams, P. D. (2002). J. Appl. Cryst. 35, 126-136.]). Trilinear interpolation is the simplest and the easiest to understand. Its major disadvantage is that, by construction, the minimum of the interpolated function is always at one of the corners of the box of interpolation. Since the map grid step is usually larger than the accuracy of atomic positions required, this can impact the optimization procedure and results. For this reason, the tricubic interpolation has been chosen as the default method. Other interpolations have also been tried but are not discussed in this work. In the following, we first describe the interpolation procedures inside the unit cube and then adapt the results and the procedures to an arbitrary regular three-dimensional grid.

B2. Tricubic interpolation inside a unit cube

Let us consider an interpolation inside a unit cube, 0 ≤ x < 1, 0 ≤ y < 1, 0 ≤ z < 1. We search for a function in the form

[\tilde{f}(x,y,z) = \textstyle\sum \limits_{k,l,m = 0}^{3}a_{klm}x^{k}y^{l}z^{m}. \eqno (15)]

This function is cubic with respect to any of its three variables, giving expressions for the partial derivatives

[\eqalignno {{{\partial \tilde{f}(x,y,z)}\over{\partial x}} & = \textstyle\sum\limits_{l,m = 0,k = 1}^{3}ka_{klm}x^{k-1}y^{l}z^{m}, \cr {{\partial \tilde{f}(x,y,z)}\over{\partial y}} & = \textstyle \sum \limits_{k,m = 0,l = 1}^{3}la_{klm}x^{k}y^{l-1}z^{m}, \cr {{\partial \tilde{f}(x,y,z)}\over{\partial z}} &= \textstyle\sum \limits_{k,l = 0, m = 1}^{3}ma_{klm}x^{k}y^{l}z^{m-1}. & (16)}]

One can calculate all 64 coefficients in advance and use them for further calculations (Lekien & Marsden, 2005[Lekien, F. & Marsden, J. (2005). Int. J. Numer. Methods Eng. 63, 455-471.]). Alternatively, one can build an interpolation for the coordinate x, then for the coordinate y and finally for the coordinate z (in any order of variables). To build interpolation (15)[link], the eight values from the cube corners are insufficient, and either values from the neighboring grid points (the corners of the neighboring cubes) or derivatives at the corners of the unit cube are required. In the following, fpqr with integers p, q, r stands for the grid function value f(p, q, r).

Firstly, we define a cubic interpolation

[\tilde{f}(x) = a_{0}+a_{1}x+a_{2}x^{2}+a_{3}x^{3} \eqno (17)]

of a function f(x) of one variable in the interval (0, 1) for which its values are known at the integer grid nodes, f−1 = f(−1), f0 = f(0), f1 = f(1), f2 = f(2). We denote this interpolation by int3(x; f−1, f0, f1, f2) and its derivative by gint3(x; f−1, f0, f1, f2), as they are called in cctbx:

[{{{\rm d}\tilde{f}(x)}\over{{\rm d}x}} = a_{1}+2a_{2}x+3a_{3}x^{2}. \eqno (18)]

The coefficients of this approximation are derived below. The procedure of the tricubic interpolation then becomes a suite of operations:

[\eqalignno {\tilde{f}_{xpq} &= {\rm int3}[x\semi f_{(-1)pq}, f_{0pq}, f_{1pq}, f_{2pq}], \cr \tilde{f}_{qyp} &= {\rm int3}[y\semi f_{q(-1)p}, f_{q0p}, f_{q1p}, f_{q2p}], \cr \tilde{f}_{pqz} &= {\rm int3}[z\semi f_{pq(-1)}, f_{pq0}, f_{pq1}, f_{pq2}], &(19)}]

where p and q are integers −1, 0, 1 or 2, then

[\eqalignno {\tilde{f}_{xyq} &= {\rm int3}[y \semi \tilde{f}_{x(-1)q}, \tilde{f}_{x0q},\tilde{f}_{x1q}, \tilde{f}_{x2q}], \cr \tilde{f}_{qyz} &= {\rm int3}[z\semi\tilde{f}_{qy(-1)}, \tilde{f}_{qy0}, \tilde{f}_{qy1}, \tilde{f}_{qy2}], \cr \tilde{f}_{xqz} &= {\rm int3}[x\semi\tilde{f}_{(-1)qz}, \tilde{f}_{0qz}, \tilde{f}_{1qz}, \tilde{f}_{2qz}] & (20)}]

and finally

[\eqalignno {\tilde{f}_{xyz} &= {\rm int3}[z\semi \tilde{f}_{xy(-1)}, \tilde{f}_{xy0}, \tilde{f}_{xy1}, \tilde{f}_{xy2}], \cr \tilde{f}_{xyz} &= {\rm int3}[x\semi \tilde{f}_{(-1)yz}, \tilde{f}_{0yz}, \tilde{f}_{1yz}, \tilde{f}_{2yz}], \cr \tilde{f}_{xyz} &= {\rm int3}[y \semi\tilde{f}_{x(-1)z}, \tilde{f}_{x0z}, \tilde{f}_{x1z}, \tilde{f}_{x2z}]. & (21)}]

The last three expressions are redundant and only one of them needs to be calculated. However, the expressions preceding them are necessary to calculate the partial derivatives as

[\eqalignno {{{\partial \tilde{f}(x,y,z)}\over{\partial x}} & = {\rm gint3}[x\semi\tilde{f}_{(-1)yz}, \tilde{f}_{0yz}, \tilde{f}_{1yz}, \tilde{f}_{2yz}], \cr {{\partial \tilde{f}(x,y,z)}\over{\partial y}} & = {\rm gint3}[y\semi\tilde{f}_{x(-1)z}, \tilde{f}_{x0z}, \tilde{f}_{x1z}, \tilde{f}_{x2z}], \cr {{\partial \tilde{f}(x,y,z)}\over{\partial z}} &= {\rm gint3}[z\semi\tilde{f}_{xy(-1)}, \tilde{f}_{xy0},\tilde{f}_{xy1}, \tilde{f}_{xy2}]. &(22)}]

The coefficients of the one-dimensional cubic interpolation (17) can be chosen in various ways. The default choice in the current software version is to build a cubic function [\tilde{f}(x)] such that it and its first derivative coincide with f(x) and f′(x), respectively, at points 0 and 1. Since the values f′(0) and f′(1) are unknown, they are estimated by central differences as

[f'(0)\simeq {{1}\over{2}}(f_{1}-f_{-1}), \quad f'(1)\simeq {{1}\over{2}}(f_{2}-f_{0}). \eqno (23)]

This gives the coefficients of (17) in the form

[\eqalignno {a_{0} & = f_{0}, \cr a_{1} &= {{1}\over{2}}(f_{1}-f_{-1}), \cr a_{2} & = {{1}\over{2}}(-f_{2}+4f_{1}-5f_{0}+2f_{-1}), \cr a_{3} & = {{1}\over{2}}(f_{2}-3f_{1}+3f_{0}-f_{-1}). & (24)}]
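The step from the four conditions to (24) can be checked numerically. The sketch below assumes that (17) is the cubic a0 + a1x + a2x² + a3x³ (consistent with the derivative in (18)), writes the conditions on the value and slope at x = 0 and x = 1 as a 4 × 4 linear system, and compares its solution with the closed form. The function names are hypothetical and are not part of cctbx.

```python
import numpy as np

def coefficients_closed_form(fm1, f0, f1, f2):
    """Coefficients (a0, a1, a2, a3) as written in (24)."""
    return np.array([
        f0,
        0.5 * (f1 - fm1),
        0.5 * (-f2 + 4.0 * f1 - 5.0 * f0 + 2.0 * fm1),
        0.5 * (f2 - 3.0 * f1 + 3.0 * f0 - fm1),
    ])

def coefficients_from_conditions(fm1, f0, f1, f2):
    """Solve the four interpolation conditions for (a0, a1, a2, a3)."""
    # Rows: value at 0, slope at 0, value at 1, slope at 1,
    # each expressed as a linear combination of a0..a3.
    conditions = np.array([
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 1.0, 1.0],
        [0.0, 1.0, 2.0, 3.0],
    ])
    # Right-hand side: f(0), estimated f'(0), f(1), estimated f'(1) from (23).
    targets = np.array([f0, 0.5 * (f1 - fm1), f1, 0.5 * (f2 - f0)])
    return np.linalg.solve(conditions, targets)
```

For any four node values the two functions should agree to rounding error, for example `np.allclose(coefficients_closed_form(1.0, 2.0, 0.5, 3.0), coefficients_from_conditions(1.0, 2.0, 0.5, 3.0))`.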

B3. Tricubic interpolation on a regular grid

Now let a function f(x, y, z) be defined in fractional coordinates on a grid with steps dx = 1/Nx, dy = 1/Ny, dz = 1/Nz. Let us consider a point (xg, yg, zg) and the grid box that contains it,

[\eqalignno {&n_{x}d_{x} \leq x_{g}\,\lt\, (n_{x}+1)d_{x}, \cr &n_{y}d_{y} \leq y_{g}\,\lt\, (n_{y}+1)d_{y}, \cr & n_{z}d_{z} \leq z_{g}\,\lt\, (n_{z}+1)d_{z} & (25)}]

where nx, ny and nz are integers. We introduce intermediate variables that rescale this `box' to a unit cube,

[\eqalignno {0 \le x & = x_gd_x^{-1} - n_x \,\lt\, 1, \cr 0 \le y & = y_gd_y^{-1} - n_y \,\lt \,1, \cr 0 \le z & = z_gd_z^{-1} - n_z \,\lt\, 1 &(26)}]

and apply the procedure (19)–(21) described above. According to (26), the respective derivatives are

[\eqalignno {{{\partial {\tilde f}(x_g,y_g,z_g)} \over {\partial x_g}} & = d_x^{-1} {{\partial {\tilde f}(x,y,z)} \over {\partial x}}, \cr {{\partial {\tilde f} (x_g,y_g,z_g)} \over {\partial y_g}} & = d_y^{-1}{{\partial {\tilde f}(x,y,z)} \over {\partial y}}, \cr {{\partial {\tilde f}(x_g,y_g,z_g)} \over {\partial z_g}} & = d_z^{-1}{{\partial {\tilde f}(x,y,z)} \over {\partial z}}. & (27)}]

Footnotes

1 It is a widely known consequence of Parseval's theorem [see, for example, Diamond (1971) or Arnold & Rossmann (1988)] that this is equivalent to a least-squares target between a full set of the corresponding complex Fourier coefficients; CNS (Brünger et al., 1998) describes this as a `vector LS target'.

2 In crystallography, the set of the calculated Fourier coefficients usually coincides with that of the experimentally measured intensities.

3 For comparison of the CPU required by the two methods, we refer to Kim & Sanbonmatsu (2017).

Funding information

This work was supported by the NIH (grant GM063210 to PDA, RJR and TT) and the PHENIX Industrial Consortium. This work was supported in part by the US Department of Energy under Contract No. DE-AC02-05CH11231. AU acknowledges the support and the use of resources of the French Infrastructure for Integrated Structural Biology FRISBI ANR-10-INBS-05 and of Instruct-ERIC. RJR is supported by a Principal Research Fellowship funded by the Wellcome Trust (Grant 082961/Z/07/Z).

References

Adams, P. D. et al. (2010). Acta Cryst. D66, 213–221.
Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352–367.
Afonine, P. V., Grosse-Kunstleve, R. W., Urzhumtsev, A. & Adams, P. D. (2009). J. Appl. Cryst. 42, 607–615.
Afonine, P. V., Headd, J. J., Terwilliger, T. C. & Adams, P. D. (2013). Comput. Crystallogr. Newsl. 4, 43–44. https://www.phenix-online.org/newsletter/CCN_2013_07.pdf
Afonine, P. V., Klaholz, B. K., Moriarty, N. W., Poon, B. K., Sobolev, O. V., Terwilliger, T. C., Adams, P. D. & Urzhumtsev, A. (2018). bioRxiv. https://doi.org/10.1101/249607
Afonine, P., Urzhumtsev, A. & Adams, P. D. (2015). Arbor, 191, a219. https://doi.org/10.3989/arbor.2015.772n2005
Ahmed, T., Shi, J. & Bhushan, S. (2017). Nucleic Acids Res. 45, 8581–8595.
Ahmed, T., Yin, Z. & Bhushan, S. (2016). Sci. Rep. 6, 35793.
Arnold, E. & Rossmann, M. G. (1988). Acta Cryst. A44, 270–283.
Baker, M. L., Hryc, C. F., Zhang, Q., Wu, W., Jakana, J., Haase-Pettingell, C., Afonine, P. V., Adams, P. D., King, J. A., Jiang, W. & Chiu, W. (2013). Proc. Natl Acad. Sci. USA, 110, 12301–12306.
Barad, B. A., Echols, N., Wang, R. Y.-R., Cheng, Y., DiMaio, F., Adams, P. D. & Fraser, J. S. (2015). Nature Methods, 12, 943–946.
Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235–242.
Bernstein, F. C., Koetzle, T. F., Williams, G. J., Meyer, E. F. Jr, Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T. & Tasumi, M. (1977). J. Mol. Biol. 112, 535–542.
Bhardwaj, A., Sankhala, R. S., Olia, A. S., Brooke, D., Casjens, S. R., Taylor, D. J., Prevelige, P. E. Jr & Cingolani, G. (2016). J. Biol. Chem. 291, 215–226.
Brändén, C.-I. & Jones, T. A. (1990). Nature (London), 343, 687–689.
Brown, A., Long, F., Nicholls, R. A., Toots, J., Emsley, P. & Murshudov, G. (2015). Acta Cryst. D71, 136–153.
Brünger, A. T. (1992). Nature (London), 355, 472–475.
Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Acta Cryst. D54, 905–921.
Brünger, A. T., Kuriyan, J. & Karplus, M. (1987). Science, 235, 458–460.
Chapman, M. S. (1995). Acta Cryst. A51, 69–80.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
Chen, V. B., Davis, I. W. & Richardson, D. C. (2009). Protein Sci. 18, 2403–2409.
Chen, Y. et al. (2016). Science, 353, aad8266.
Cheng, L., Sun, J., Zhang, K., Mou, Z., Huang, X., Ji, G., Sun, F., Zhang, J. & Zhu, P. (2011). Proc. Natl Acad. Sci. USA, 108, 1373–1378.
Chua, E. Y. D., Vogirala, V. K., Inian, O., Wong, A. S. W., Nordenskiöld, L., Plitzko, J. M., Danev, R. & Sandin, S. (2016). Nucleic Acids Res. 44, 8013–8019.
Demo, G., Svidritskiy, E., Madireddy, R., Diaz-Avalos, R., Grant, T., Grigorieff, N., Sousa, D. & Korostelev, A. A. (2017). Elife, 6, e23687.
Diamond, R. (1971). Acta Cryst. A27, 436–452.
DiMaio, F., Song, Y., Li, X., Brunner, M. J., Xu, C., Conticello, V., Egelman, E., Marlovits, T., Cheng, Y. & Baker, D. (2015). Nature Methods, 12, 361–365.
Emsley, P. & Cowtan, K. (2004). Acta Cryst. D60, 2126–2132.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Fischer, N., Neumann, P., Konevega, A. L., Bock, L. V., Ficner, R., Rodnina, M. V. & Stark, H. (2015). Nature (London), 520, 567–570.
Frank, J. (2006). Three-Dimensional Electron Microscopy of Macromolecular Assemblies. Oxford University Press.
Gao, Y., Cao, E., Julius, D. & Cheng, Y. (2016). Nature (London), 534, 347–351.
Goddard, T. D., Huang, C. C., Meng, E. C., Pettersen, E. F., Couch, G. S., Morris, J. H. & Ferrin, T. E. (2018). Protein Sci. 27, 14–25.
Gore, S. et al. (2017). Structure, 25, 1916–1927.
Grosse-Kunstleve, R. W. & Adams, P. D. (2004). IUCr Comput. Comm. Newsl. 4, 19–36. https://www.iucr.org/resources/commissions/crystallographic-computing/newsletters/4
Grosse-Kunstleve, R. W., Sauter, N. K., Moriarty, N. W. & Adams, P. D. (2002). J. Appl. Cryst. 35, 126–136.
Grosse-Kunstleve, R. W., Sauter, N. K. & Adams, P. D. (2004). IUCr Comput. Comm. Newsl. 3, 22–31. https://www.iucr.org/resources/commissions/crystallographic-computing/newsletters/3
Headd, J. J., Echols, N., Afonine, P. V., Grosse-Kunstleve, R. W., Chen, V. B., Moriarty, N. W., Richardson, D. C., Richardson, J. S. & Adams, P. D. (2012). Acta Cryst. D68, 381–390.
Headd, J. J., Echols, N., Afonine, P. V., Moriarty, N. W., Gildea, R. J. & Adams, P. D. (2014). Acta Cryst. D70, 1346–1356.
Henderson, R. et al. (2012). Structure, 20, 205–214.
Henrick, K., Newman, R., Tagari, M. & Chagoyen, M. (2003). J. Struct. Biol. 144, 228–237.
Hodel, A., Kim, S.-H. & Brünger, A. T. (1992). Acta Cryst. A48, 851–858.
Hryc, C. F., Chen, D.-H., Afonine, P. V., Jakana, J., Wang, Z., Haase-Pettingell, C., Jiang, W., Adams, P. D., King, J. A., Schmid, M. F. & Chiu, W. (2017). Proc. Natl Acad. Sci. USA, 114, 3103–3108.
Jones, T. A. (1978). J. Appl. Cryst. 11, 268–272.
Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Acta Cryst. A47, 110–119.
Joseph, A. P., Lagerstedt, I., Patwardhan, A., Topf, M. & Winn, M. (2017). J. Struct. Biol. 199, 12–26.
Kim, D. N. & Sanbonmatsu, K. Y. (2017). Biosci. Rep. 37, BSR20170072.
Lagerstedt, I., Moore, W. J., Patwardhan, A., Sanz-García, E., Best, C., Swedlow, J. R. & Kleywegt, G. J. (2013). J. Struct. Biol. 184, 173–181.
Lekien, F. & Marsden, J. (2005). Int. J. Numer. Methods Eng. 63, 455–471.
Leslie, A. G. W. (1987). Acta Cryst. A43, 134–136.
Liao, M., Cao, E., Julius, D. & Cheng, Y. (2013). Nature (London), 504, 107–112.
Liu, D. C. & Nocedal, J. (1989). Math. Program. 45, 503–528.
Liu, Y., Pan, J., Jenni, S., Raymond, D. D., Caradonna, T., Do, K. T., Schmidt, A. G., Harrison, S. C. & Grigorieff, N. (2017). J. Mol. Biol. 429, 1829–1839.
Lokareddy, R. K., Sankhala, R. S., Roy, A., Afonine, P. V., Motwani, T., Teschke, C. M., Parent, K. N. & Cingolani, G. (2017). Nature Commun. 8, 14310.
Lunin, V. Y., Afonine, P. V. & Urzhumtsev, A. G. (2002). Acta Cryst. A58, 270–282.
Lunin, V. Y. & Urzhumtsev, A. G. (1984). Acta Cryst. A40, 269–277.
Maslen, E. N., Fox, A. G. & O'Keefe, M. A. (1992). International Tables for Crystallography, Vol. C, edited by A. J. C. Wilson, pp. 476–516. Dordrecht: Kluwer Academic Publishers.
Mooij, W. T. M., Hartshorn, M. J., Tickle, I. J., Sharff, A. J., Verdonk, M. L. & Jhoti, H. (2006). ChemMedChem, 1, 827–838.
Oldfield, T. J. (2001). Acta Cryst. D57, 82–94.
Paulino, C., Neldner, Y., Lam, A. K. M., Kalienkova, V., Brunner, J. D., Schenck, S. & Dutzler, R. (2017). Elife, 6, e26232.
Peng, L.-M. (1998). Acta Cryst. A54, 481–485.
Peng, L.-M., Ren, G., Dudarev, S. L. & Whelan, M. J. (1996). Acta Cryst. A52, 257–276.
Pintilie, G., Chen, D.-H., Haase-Pettingell, C. A., King, J. A. & Chiu, W. (2016). Biophys. J. 110, 827–839.
Read, R. J. (1986). Acta Cryst. A42, 140–149.
Read, R. J. et al. (2011). Structure, 19, 1395–1412.
Rossmann, M. G. (2000). Acta Cryst. D56, 1341–1349.
Rossmann, M. G., Bernal, R. & Pletnev, S. V. (2001). J. Struct. Biol. 136, 190–200.
Sears, V. F. (1992). Neutron News, 3(3), 26–37.
Shalev-Benami, M., Zhang, Y., Matzov, D., Halfon, Y., Zackay, A., Rozenberg, H., Zimmerman, E., Bashan, A., Jaffe, C. L., Yonath, A. & Skiniotis, G. (2016). Cell Rep. 16, 288–294.
Sheldrick, G. M. & Schneider, T. R. (1997). Methods Enzymol. 277, 319–343.
Sobolev, O. V., Afonine, P. V., Adams, P. D. & Urzhumtsev, A. (2015). J. Appl. Cryst. 48, 1130–1141.
Sorzano, C. O. S., Vargas, J., Otón, J., Abrishami, V., de la Rosa-Trevín, J. M., del Riego, S., Fernández-Alderete, A., Martínez-Rey, C., Marabini, R. & Carazo, J. M. (2015). AIMS Biophys. 2, 8–20.
Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. (2018). bioRxiv, 267138. https://doi.org/10.1101/267138
Terwilliger, T. C., Grosse-Kunstleve, R. W., Afonine, P. V., Adams, P. D., Moriarty, N. W., Zwart, P., Read, R. J., Turk, D. & Hung, L.-W. (2007). Acta Cryst. D63, 597–610.
Terwilliger, T. C., Read, R. J., Adams, P. D., Brunger, A. T., Afonine, P. V. & Hung, L.-W. (2013). Acta Cryst. D69, 2244–2250.
Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. (2018). Acta Cryst. D74, 545–559.
Tickle, I. J. (2012). Acta Cryst. D68, 454–467.
Tronrud, D. E. (2004). Acta Cryst. D60, 2156–2168.
Turk, D. (2013). Acta Cryst. D69, 1342–1357.
Urzhumtsev, A., Afonine, P. V., Lunin, V. Y., Terwilliger, T. C. & Adams, P. D. (2014). Acta Cryst. D70, 2593–2606.
Urzhumtsev, A. G., Lunin, V. Y. & Luzyanina, T. B. (1989). Acta Cryst. A45, 34–39.
Waasmaier, D. & Kirfel, A. (1995). Acta Cryst. A51, 416–431.
Watkin, D. (2008). J. Appl. Cryst. 41, 491–522.
Williams, C. J. et al. (2018). Protein Sci. 27, 193–315.
Wlodawer, A. & Dauter, Z. (2017). Acta Cryst. D73, 379–380.
Yang, H., Wang, J., Liu, M., Chen, X., Huang, M., Tan, D., Dong, M.-Q., Wong, C. C. L., Wang, J., Xu, Y. & Wang, H.-W. (2016). Protein Cell, 7, 878–887.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
