DeepEMhancer: a deep learning solution for cryo-EM volume post-processing

Sanchez-Garcia, Ruben; Gomez-Blanco, Josue; Cuervo, Ana; Carazo, Jose Maria; Sorzano, Carlos Oscar S.; Vargas, Javier

doi:10.1038/s42003-021-02399-1

Article
Open access
Published: 15 July 2021

Communications Biology volume 4, Article number: 874 (2021)

Abstract

Cryo-EM maps are valuable sources of information for protein structure modeling. However, due to the loss of contrast at high frequencies, they generally need to be post-processed to improve their interpretability. Most popular approaches, based on global B-factor correction, suffer from limitations. For instance, they ignore the heterogeneity in the map local quality that reconstructions tend to exhibit. Aiming to overcome these problems, we present DeepEMhancer, a deep learning approach designed to perform automatic post-processing of cryo-EM maps. Trained on a dataset of pairs of experimental maps and maps sharpened using their respective atomic models, DeepEMhancer has learned how to post-process experimental maps performing masking-like and sharpening-like operations in a single step. DeepEMhancer was evaluated on a testing set of 20 different experimental maps, showing its ability to reduce noise levels and obtain more detailed versions of the experimental maps. Additionally, we illustrated the benefits of DeepEMhancer on the structure of the SARS-CoV-2 RNA polymerase.

Introduction

Almost one decade after the beginning of the so-called “resolution revolution”, cryogenic electron microscopy (cryo-EM) has become one of the most versatile tools in the field of structural biology. Beginning from thousands of single-particle projection images, cryo-EM workflows are capable of obtaining three-dimensional (3D) reconstructions of many macromolecules at “near-atomic” resolution levels. However, the ultimate goal of cryo-EM single-particle analysis is not to obtain 3D maps but to reach a detailed atomic understanding through the derivation of atomic models.

During the atomic model building process, raw 3D maps are rarely employed, as they suffer from a loss of contrast at high resolution¹ that hinders the detection and interpretation of residues and secondary structure. Fortunately, loss of contrast can be alleviated using different contrast restoration algorithms, which are usually known as sharpening methods. The first sharpening approach for cryo-EM maps was introduced by Rosenthal and Henderson¹, and their formulation, based on global B-factor correction, still underlies the most commonly employed sharpening methods, including RELION postprocessing^2,3 and Phenix AutoSharpen⁴. The principle behind these algorithms consists of correcting the raw maps by boosting the amplitude of their high-frequency Fourier components. The strength of the amplitude boost at each frequency depends on the frequency itself and on a single number, the B-factor, that measures the global loss of contrast. Thus, although the different global B-factor-based methods differ in the procedures employed to determine the B-factor that is applied, they modify the volume globally in a similar manner.
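The global correction described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the implementation used by RELION or Phenix: it assumes the common convention in which Fourier amplitudes are multiplied by exp(-B s²/4), with s the spatial frequency in 1/Å, so that a negative B boosts high frequencies. The function name and parameters are hypothetical, and the low-pass filtering that real tools apply alongside sharpening is omitted.

```python
import numpy as np

def global_bfactor_sharpen(volume, voxel_size, b_factor):
    """Multiply Fourier amplitudes by exp(-B * s^2 / 4).

    volume: 3D array; voxel_size: Å per voxel; b_factor: Å^2.
    Assumed convention: a negative b_factor sharpens the map.
    """
    # Spatial frequency (1/Å) along each axis of the FFT grid.
    freqs = [np.fft.fftfreq(n, d=voxel_size) for n in volume.shape]
    fx, fy, fz = np.meshgrid(*freqs, indexing="ij")
    s2 = fx**2 + fy**2 + fz**2

    ft = np.fft.fftn(volume)
    # One number (B) sets the boost at every frequency: a global correction.
    ft = ft * np.exp(-b_factor * s2 / 4.0)
    return np.fft.ifftn(ft).real
```

Because the same B is applied to every voxel, regions of differing local quality all receive the same boost, which is the limitation that motivates local methods.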

Despite being widely used, global B-factor-based approaches present an important limitation: they do not consider the differences in quality that different parts of the map may present and they produce density maps that do not correspond to the scattering properties of biological macromolecules⁵. Consequently, for maps that exhibit heterogeneous local resolution, some regions could be undersharpened whereas others could be oversharpened. Recently, local sharpening algorithms that alleviate this shortcoming have been proposed. For example, the LocScale⁶ algorithm uses the information contained in an atomic model to locally scale up a map. This transformation is achieved by means of a sliding window approach in which the amplitudes of the map region that lies inside the window are scaled up to agree with the atomic model provided. Following a totally different strategy, the LocalDeblur⁷ algorithm employs a Wiener filtering approach that performs local deblurring with a strength proportional to an estimation of the local resolution, which has to be pre-computed. Similarly, LocSpiral⁸ employs the spiral phase transformation to factorize the volume and then perform a local enhancement based on the normalization and thresholding of the amplitudes.

Despite their benefits, current local sharpening approaches present some drawbacks. Both LocSpiral and LocalDeblur depend on masks to distinguish the macromolecule from the noise, and LocalDeblur also requires an estimation of the local resolution of the map. On the other hand, the main strength of LocScale, its ability to employ the structural information of atomic models, could also be regarded as its main weakness since the availability of atomic models limits its applicability.

With the aim of overcoming these shortcomings, in this work, we present Deep cryo-EM Map Enhancer (DeepEMhancer), a fully automatic deep learning-based approach that performs cryo-EM volume post-processing. Deep learning has revolutionized the field of artificial intelligence and its impact has been felt in many others including cryo-EM. Deep learning in cryo-EM was first applied to the problem of particle picking^9,10,11 and since then, it has evolved to deal with other questions such as map reconstruction^12,13, map segmentation^14,15, or local resolution determination^16,17. As in most of those methods, our approach relies on a convolutional neural network (CNN) that is trained on massive quantities of data. Particularly, our development, which follows a simple image super-resolution setup¹⁸, exploits the vast amount of structural information that is contained in the Electron Microscopy Data Bank (EMDB) database¹⁹ in order to mimic the local sharpening effect of the LocScale algorithm. However, DeepEMhancer does not require any atomic model to function and, contrary to previous methods, it also performs automatic (tight) masking of input maps. Our results show that DeepEMhancer, which works in a fully automatic manner, is able to largely improve the interpretability of the maps contained in our benchmark, performing better than classical global B-factor approaches.

Results

DeepEMhancer is based on an end-to-end U-net architecture²⁰ trained in a supervised manner. Particularly, we implemented a 3D U-net consisting of three downsampling blocks and three upsampling blocks that process cubic chunks of the input map (see Supplementary Table 1 for more details). Training was performed using pairs of input maps and target maps, consisting of experimental cryo-EM maps and tightly masked LocScale post-processed maps. Among the possible alternatives (e.g., LocalDeblur), LocScale was chosen as the method to produce targets because it makes use of atomic model information, which tends to produce high-quality results. For a complete description of the data preparation, training, and evaluation processes see the “Methods” section.

DeepEMhancer performance on the testing set

In order to assess the quality of DeepEMhancer predictions, we first compared them against the target maps generated by LocScale. Thus, for DeepEMhancer maps, we measured a median correlation coefficient of 0.9 against LocScale maps in contrast to 0.6 for input maps (see Supplementary Fig. 1). Such an important increase in the correlation coefficient implies that DeepEMhancer has learned to accurately reproduce the effect of LocScale sharpening with one important advantage: no atomic models are required to employ DeepEMhancer.

Although reproducing the LocScale-sharpening effect was our main objective, the ultimate goal of map post-processing is to simplify the process of atomic model building. With the aim of studying if DeepEMhancer also contributes to that purpose, we next explored whether DeepEMhancer post-processed maps were more similar to the actual atomic models. To do so, we computed, for all the maps included in the testing set, the Fourier shell correlation coefficient (FSC) resolution between the input (half maps average) and post-processed maps against the reference maps obtained from the atomic models. As it is shown in Fig. 1, for all the examples included in the testing set, the application of DeepEMhancer increased the similarity of the input maps with respect to the references (blue and green bars). Particularly, the median improvement achieved by DeepEMhancer was ~0.6 Å (~14% in the frequency domain). Such an important improvement confirms that the maps computed by DeepEMhancer are more similar to the target maps.

**Fig. 1: DeepEMhancer produces maps that are more similar to the atomic models.**

The DeepEMhancer post-processing operation performs a non-linear transformation of the experimental volume that produces a set of effects that could be broadly classified as masking/denoising and sharpening-like feature enhancement. In order to disentangle the contribution of the different effects, we have also computed the FSC of the input and post-processed maps using a tight mask derived from the atomic model. As can be observed in Fig. 1, the FSC resolution obtained for the post-processed maps tends to be better than the values computed for the input independently of the mask application (green and red bars vs orange bar), which implies that the masking effect is of high quality, as the resolutions for the unmasked DeepEMhancer results tend to be better than the ones for the masked input maps.

Similarly, although the trend is not as strong as in the previous experiment, DeepEMhancer also tends to improve the resolution of the masked regions (Fig. 1, orange vs. red bars), which implies an enhancement of the map features. Leaving aside some problematic examples such as EMD-7055²¹, which are discussed in Supplementary Note 1 and Supplementary Fig. 2, most of the evaluated maps exhibit a non-negligible improvement in resolution, especially notable when compared to B-factor-based results (see next section), with a median value of ~0.3 Å.

Alternatively, with the aim of obtaining a complementary measurement of improvement, we computed the DeepRes local resolution for the input and post-processed maps. As can be appreciated in Fig. 2, all test cases treated with DeepEMhancer improved in terms of DeepRes local resolution, with dramatic improvements of more than 0.7 Å and a median improvement of ~0.4 Å. Again, those figures, consistent with the FSC-based measurements, indicate that DeepEMhancer is improving the interpretability of the maps.

Comparison with other methods

With the aim of comparing DeepEMhancer with the commonly employed global B-factor-based sharpening methods, we repeated the same experiments using the post-processed maps obtained with the Relion postprocessing algorithm^2,3. Before proceeding, it is important to note that, contrary to DeepEMhancer, Relion automatic masking is a simple process, and thus, in order to make the comparison more interesting, we used instead the masks derived from the atomic models.

Still, when we evaluated the FSC for the masked regions, only a few maps improved, while many others worsened, leading to a median improvement that was negligible (<0.05 Å) for both FSC and median DeepRes resolution (see Figs. 2 and 3).

**Fig. 2: DeepEMhancer produces better quality maps.**

**Fig. 3: DeepEMhancer produces better results than global B-factor-based methods.**

We acknowledge that the automatic determination of the B-factor can lead to less accurate results than manual selection, which may be the reason behind the poor performance observed. Thus, we have also included in the comparison the post-processed maps deposited in EMDB, for which the estimation of the B-factor was carried out by the authors. In this case, the improvements in resolution, with median values of ~0.15 and ~0.1 Å for DeepRes and FSC, respectively, although closer to the values obtained using DeepEMhancer, are still considerably smaller (see Figs. 2 and 3). Such a difference in performance can be partially explained by the ability of local sharpening methods to deal better with low-quality regions of input maps, as shown in Supplementary Figure 3 and discussed in Supplementary Note 2.

In the light of these results, we can state that DeepEMhancer maps tend to be more similar to the atomic models than the ones obtained using global B-factor-based methods and are thus more useful for the process of model building. Finally, for the sake of completeness, we also computed FSC curves to compare our approach with other state-of-the-art sharpening approaches, showing that our fully automatic approach produces competitive, if not better, results in many cases (see Supplementary Note 3 and Supplementary Figs. 4–9).

Visual inspection of testing maps

The purpose of this section is to further explore the results obtained with DeepEMhancer for some of the maps included in the testing set with the aim of illustrating how the improvements in global quality measurements translate to tangible improvements in the quality of the maps.

EMD-7099

The EMD-7099²² is a high-resolution volume (global resolution 3.1 Å) of a multidrug resistance ATP-driven pump. EMD-7099 presents 17 transmembrane helices and, although the overall quality of the map is excellent, visualizing the transmembrane regions is challenging because of the signal that comes from the lipids. As a result, important parts of the protein are not traced. Because DeepEMhancer was trained to ignore the signal coming from lipidic layers, this example illustrates the unique characteristics of DeepEMhancer when applied to membrane proteins. Thus, as can be observed in Fig. 4a–d, DeepEMhancer has been able to suppress the signal coming from the lipid layer in a much simpler and more effective way than diminishing the threshold in the raw map or the B-factor-based sharpened maps. The noise suppression effect simplifies the process of model building, as researchers do not have to deal with masks or larger thresholds that make the visualization of near-to-noise level features more difficult. Yet DeepEMhancer not only produces a noise reduction effect but is also able to enhance some parts of the map that seem noisy and disconnected under B-factor-based sharpening. Such improvement, although observed in several regions of the map, is more noticeable in the transmembrane region. Thus, the most important enhancement is depicted in Fig. 4e, f, in which an important part of the backbone of the protein has been de novo traced thanks to DeepEMhancer enhancement, which has restored the densities corresponding to residues A195 to I203 in chain A of PDB 6bhu. Although this region was present in the raw data map, its intensity range was so close to that of the lipidic layers that, after conventional B-factor post-processing, the region was so damaged that modeling was not possible. In contrast, DeepEMhancer was able not only to suppress most of the signal coming from the lipid layer but also to restore the density of the region so that it looks smooth and continuous.

EMD-4997

The EMD-4997²³ is a medium-high resolution volume (4.0 Å) of a murine epithelial anion transporter. As in the previous example, the overall quality of the map is quite good, yet it presents lower quality regions. Figure 5a shows an overview of the published map, displayed at the recommended threshold, and the map obtained with DeepEMhancer. Although the published map and the post-processed map look very similar, there are important differences. Firstly, the map processed with DeepEMhancer is cleaner than the published one. One example is the removal of the artifacts that the published map presents near the elbow of the complex (see Fig. 5a, red box). More importantly, there are also many regions in which the DeepEMhancer post-processed volume better resolves the individual residues. One such example can be found near the N-terminal end of the protein complex. Thus, as shown in Fig. 5b, the densities that correspond to the strands of the β-sheet are better separated than in the published volume. It is important to note that this better separation is not a consequence of the employed thresholds, as proven by the fact that raising the threshold makes the densities corresponding to the backbone discontinuous before the densities for the two strands separate (see Fig. 5b). As a result, we can affirm that the quality of this region has been improved by the usage of DeepEMhancer.

Another similar example is displayed in Fig. 5c. In this case, two non-contiguous aromatic residues, Y361 and H121, seem connected in the published map. However, when DeepEMhancer is applied, the densities corresponding to the two residues look separated while the backbone remains continuous.

Use case EMD-30178 from SARS-CoV-2 RNA-dependent RNA polymerase

In order to further explore the benefits of the DeepEMhancer algorithm, we analyzed in more depth the post-processing of the EMD-30178 map from Gao et al.²⁴, corresponding to the SARS-CoV-2 RNA-dependent RNA polymerase. The published map presents detailed structure up to 2.9 Å resolution; however, as is often the case in cryo-EM, the resolution of the map is highly heterogeneous. We have chosen this map not only for the current importance of this structure but also because the heterogeneous quality of the map density presents an ideal case for the DeepEMhancer software. As shown in Fig. 6a, the application of the algorithm reduces the noise and improves the consistency and depiction of the map. To better illustrate these differences, we have chosen two different regions in chains A and D where the differences between the published and the DeepEMhancer map can be appreciated (Fig. 6b and c). While the density in the published map looks noisy or discontinuous depending on the displayed threshold (Fig. 6b and c, left and middle panel), the application of the DeepEMhancer software results in a well-defined continuous density where the side chains are nicely depicted (Fig. 6b and c, right panel). This improvement in the map density allowed us to close the loop between residues in the β-sheet V115 to I132 from chain D, tracing three new residues that were not traced in the published structure (Fig. 6b). The improvement of the density is not only applicable to the edges of the map but can also be appreciated in its core. Residues H362–L366 in chain A, traced on the published map, were positioned more accurately on the density after map post-processing (Fig. 6c).

Discussion

The number of deposited high-resolution cryo-EM maps has soared since the beginning of the ‘resolution revolution’. As a result, there is an increasing number of atomic models that are being built using cryo-EM as the primary source of information. However, building atomic models directly from the raw maps is generally not possible. Instead, maps are post-processed in order to enhance the contrast of their high-resolution features.

In this work, we have presented DeepEMhancer, a map post-processing method based on deep learning. Trained on pairs of experimental cryo-EM maps and post-processed maps constructed with LocScale using atomic models, DeepEMhancer has learned how to perform a high-quality post-processing operation that reproduces the effects of masking and local sharpening in an automatic fashion.

Although DeepEMhancer could have been trained on other targets, for instance, the simulated maps obtained directly from the atomic models, we discarded this alternative for two reasons. The first reason is that we wanted to reproduce the state-of-the-art local sharpening effect and not a new type of post-processing that might not be compatible with downstream atomic modeling tools. The other one is empirical: we obtained better results when targets were produced with LocScale than when the targets were directly obtained from the atomic models. As discussed in Supplementary Note 4 and illustrated in Supplementary Figs. 10–13, our neural network tends to suffer from underfitting when trained on maps derived from atomic models and thus, the results are blurrier than the ones obtained when using LocScale maps as a target. One possible explanation for such behavior could be the fact that, when using LocScale, the input and target maps, although different, still share some similar properties such as intensity ranges or local quality, which are not necessarily preserved when using simulated maps as targets. As a consequence, it is reasonable to believe that as the input and target maps become more similar, the training process should also become easier. For these reasons, we expect that super-resolution approaches trained on maps derived from atomic models will only be possible when more powerful models are employed, at the cost of more powerful computational resources and larger datasets.

The performance of our algorithm has been assessed using a testing set of 20 experimental maps that were used neither for training nor during the trial-and-error process required for its implementation. In all cases, the similarity between the maps obtained from the atomic models and the experimental maps improved after the application of DeepEMhancer. Additionally, we evaluated in detail the performance of DeepEMhancer on two of those maps, showing that DeepEMhancer not only facilitates the visualization of cryo-EM maps but can also unveil some details that are not easily recognizable in the raw maps.

Nevertheless, it is important to highlight that DeepEMhancer is not the ultimate solution and that different examples will benefit from considering simultaneously different post-processing techniques. This is of special importance for some of the cases in which DeepEMhancer, owing to data scarcity, presents limitations, for instance, when dealing with uncommon posttranslational modifications (see Supplementary Note 5 and Supplementary Figs. 14 and 15).

Another important caveat that all methods intended to enhance maps need to face is the problem of model validation. Although the results presented here have been validated using the published models as ground truth, in real-world scenarios such ground truth models are not available, and thus, the quality of the results should be assessed by the users. To that end, we recommend trying and comparing different approaches since orthogonal methods should reveal inconsistencies. On the contrary, we discourage users from trying to estimate the resolution of post-processed maps, as there is no obvious way of doing it without ground truth and, even in those cases, masking effects could be challenging (see Supplementary Note 3).

Finally, with the aim of illustrating how beneficial DeepEMhancer could be in real-world scenarios, we have employed it on a map of the RNA polymerase of the SARS-CoV-2 virus, improving the quality of the map and the quality of the associated atomic model.

Methods

Raw data collection

DeepEMhancer has been trained and evaluated using as input a subset of cryo-EM maps obtained from the EMDB¹⁹ that meet the following requirements: (1) resolution better than 7 Å; (2) have one and only one atomic model associated; (3) correlation between the atomic model and the map better than 0.6; and (4) half maps available. As a result, an original list of 415 maps was compiled. However, this initial list is highly redundant and, in order to avoid biases in both the training and evaluation procedures, it was further filtered to reduce its redundancy (see subsection “Redundancy control”). Finally, after a visual inspection aimed at removing problematic cases that survived the automatic filtering procedure, a total of 147 maps, with an average reported resolution of 3.8 Å, were selected.
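The four selection requirements can be read as a simple predicate over database records. The sketch below is only illustrative: the `MapEntry` fields, the entry identifiers, and the example values are made up for the demonstration and do not correspond to the actual EMDB schema or to any real entry.

```python
from dataclasses import dataclass

@dataclass
class MapEntry:
    emdb_id: str                  # hypothetical identifier, not a real entry
    resolution: float             # reported resolution in Å (lower is better)
    n_atomic_models: int          # number of associated atomic models
    model_map_correlation: float  # correlation between model and map
    has_half_maps: bool

def meets_requirements(entry):
    """Apply requirements (1) to (4) listed in the text."""
    return (
        entry.resolution < 7.0
        and entry.n_atomic_models == 1
        and entry.model_map_correlation > 0.6
        and entry.has_half_maps
    )

# Made-up records, each failing at most one requirement.
candidates = [
    MapEntry("entry-a", 3.2, 1, 0.82, True),
    MapEntry("entry-b", 8.5, 1, 0.75, True),   # resolution worse than 7 Å
    MapEntry("entry-c", 3.9, 2, 0.80, True),   # more than one atomic model
    MapEntry("entry-d", 4.1, 1, 0.55, True),   # correlation too low
    MapEntry("entry-e", 2.9, 1, 0.90, False),  # no half maps
]
selected = [e.emdb_id for e in candidates if meets_requirements(e)]
```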

Since the main objective of DeepEMhancer is to perform a sharpening-like post-processing transformation, it is important to ensure that the maps used in this study were not previously sharpened. Given the fact that most of the maps deposited in EMDB are sharpened and many are also masked, we decided to employ only the half-maps available in EMDB (condition number 4). Due to the lack of an appropriate searching tool in EMDB and a file name convention, we had to analyze all the map file names included in the database looking for the substring “half” to recover the half maps. Full maps were obtained averaging respective half maps.

As learning targets, we employed the output generated by LocScale using as input the aforementioned maps and their associated atomic models. Additionally, the output maps were tightly masked using as masks the maps simulated from the atomic models after a thresholding operation (see Supplementary Note 6 and Supplementary Fig. 16).

Data preparation

Due to the fact that the monomers (amino acids, nucleotides, etc.) that compose the macromolecules have fixed size but the deposited maps vary in voxel size, both the input and the target maps were resampled to 1 Å/voxel size with the aim of facilitating the learning process. After that, the intensity of each volume was normalized using the classical cryo-EM approach by which the map noise statistics are forced to adopt a fixed mean and standard deviation (0 and 0.1, respectively). Finally, due to GPU memory limitations, the maps were chunked into 64 × 64 × 64 cubes, the maximum size that our computing systems were able to efficiently manage. As a result, more than 70k volume cubes, including both signal cubes and noise-only cubes were used for training.
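The intensity normalization step can be sketched as follows. The text does not specify how the noise region is delimited, so this example assumes, as is common in cryo-EM software, that voxels outside the sphere inscribed in the box (the corners) contain only noise. The helper names `corner_mask` and `normalize_by_noise` are hypothetical, and resampling to 1 Å/voxel is not shown.

```python
import numpy as np

def corner_mask(shape):
    """Boolean mask of voxels outside the inscribed sphere (assumed noise-only)."""
    grids = np.ogrid[tuple(slice(0, n) for n in shape)]
    # Squared radius normalized so that the box faces sit at roughly 1.
    r2 = sum(((g - (n - 1) / 2.0) / (n / 2.0)) ** 2 for g, n in zip(grids, shape))
    return r2 > 1.0

def normalize_by_noise(volume, target_mean=0.0, target_std=0.1):
    """Shift and scale the map so its noise has the requested mean and std."""
    noise = volume[corner_mask(volume.shape)]
    return (volume - noise.mean()) / noise.std() * target_std + target_mean
```

Normalizing against noise statistics rather than whole-map statistics keeps the scaling independent of how much of the box the macromolecule occupies.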

Redundancy control

In order to perform the train/test/validation split used to develop and evaluate our method, it is important to consider that the universe of proteins is highly redundant and that the EMDB entries are even more redundant. One example is the ribosome, which accounts for ~10% of all EMDB entries. Thus, in order to avoid an over-optimistic performance estimation, we have ensured that the train, test, and validation sets are mutually exclusive in the sense that their intersections are empty under a certain equivalence criterion. Particularly, we consider that two EMDB entries are equivalent if they share one sequence that belongs to the same 30% sequence identity cluster. Similarly, with the aim of eliminating potential bias in the evaluation, we have guaranteed that only one member per cluster is included in the testing and validation sets. On the contrary, we have relaxed our quite strict redundancy control policy in the training set, allowing up to five cluster representatives in an attempt to increase the size of this set. This decision is founded on the fact that even maps of the exact same protein may present different statistics due to the intrinsic variability of cryo-EM reconstruction workflows and thus, limiting their presence in the training set may hinder the generalization of the neural network.
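One way to realize this kind of redundancy control is to group entries that share any sequence cluster and then assign whole groups to a single set. The sketch below is an assumption-laden illustration: it takes the transitive closure of the equivalence criterion with a union-find structure, and it assigns groups to sets in sorted order, which is arbitrary since the text does not describe the assignment procedure. All names are hypothetical.

```python
from collections import defaultdict

def group_entries(entry_clusters):
    """entry_clusters maps entry id -> set of sequence-cluster ids.

    Entries sharing at least one cluster id end up in the same group.
    """
    parent = {e: e for e in entry_clusters}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}
    for entry, clusters in entry_clusters.items():
        for c in clusters:
            if c in first_seen:
                parent[find(entry)] = find(first_seen[c])
            else:
                first_seen[c] = entry

    groups = defaultdict(set)
    for e in entry_clusters:
        groups[find(e)].add(e)
    return list(groups.values())

def split_groups(groups, n_test, n_val, max_train_per_group=5):
    """One member per group for test/validation, up to five for training."""
    ordered = sorted((sorted(g) for g in groups), key=lambda g: g[0])
    test = [g[0] for g in ordered[:n_test]]
    val = [g[0] for g in ordered[n_test:n_test + n_val]]
    train = [e for g in ordered[n_test + n_val:] for e in g[:max_train_per_group]]
    return train, val, test
```

Since every group is assigned to exactly one set, no two sets can contain equivalent entries.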

As a result, 107, 20, and 20 maps were used for training, validation, and testing, respectively. The full list of the EMDB entries used can be found in Supplementary Note 7 and Supplementary Data 1.

Neural network architecture

We have employed a 3D U-net-like neural network²⁰ as a regression model for the estimation of post-processed maps. Our neural network consists of three downsampling blocks and three upsampling blocks with skip connections. Each block contains three convolutional layers followed by group normalization²⁵ and PReLU activation²⁶. The number of filters for each block is 3 × 32, 3 × 64, and 3 × 128, respectively. Downsampling is carried out using strided convolution and upsampling is performed via transposed convolution. See Supplementary Table 1 for additional details.

Neural network training

Our neural network was trained using stochastic gradient descent with a batch size of 8 cubes. The initial learning rate was set to 10⁻³ and decreased by a factor of 0.5 when the validation loss did not improve for 5 epochs. A mean absolute error was employed as the loss function. Data augmentation, consisting of random 90° rotations, Gaussian blurring, and patch corruption, was applied to the training data.
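Three of the ingredients above (the mean absolute error loss, the halving of the learning rate on a validation plateau, and random 90° rotations) are framework independent and can be sketched with NumPy alone. This is not the authors' training code: the class and function names are hypothetical, the exact plateau semantics of the framework they used may differ, and Gaussian blurring and patch corruption are left out.

```python
import numpy as np

def mean_absolute_error(pred, target):
    """L1 loss between predicted and target cubes."""
    return float(np.mean(np.abs(pred - target)))

class HalveOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs
    without improvement of the validation loss (assumed semantics)."""

    def __init__(self, lr=1e-3, factor=0.5, patience=5):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr *= self.factor
                self.stale_epochs = 0
        return self.lr

def random_rot90(cube_in, cube_target, rng):
    """Apply the same random 90-degree rotation to an input/target pair."""
    axes = [(0, 1), (0, 2), (1, 2)][int(rng.integers(3))]
    k = int(rng.integers(4))
    return np.rot90(cube_in, k, axes), np.rot90(cube_target, k, axes)
```

Rotating input and target together keeps the pair aligned, which a voxel-wise loss such as the mean absolute error requires.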

Neural network inference

In order to perform volume post-processing, the input volume is pre-processed as described in the “Data preparation” subsection. Then, the resized and normalized volume is chunked into overlapping cubes of size 64 × 64 × 64 with strides of 16 voxels. Each cube is individually processed by the trained neural network, yielding post-processed cubes. After that, the post-processed cubes are re-assembled into the final volume by averaging the overlapping parts. Finally, the processed volume is resized to the size of the original volume, thus restoring the original sampling rate.
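The chunk, predict, and re-assemble loop can be sketched as below, with `model` standing for any callable that maps a cube to a cube of the same shape. The treatment of the borders (zero padding of volumes smaller than one cube and an extra tile flush with the far edge) is an assumption of this example, not something stated in the text, and the sketch requires the stride to be no larger than the cube size so that every voxel is covered.

```python
import numpy as np

def tile_starts(length, cube, stride):
    """Start indices of tiles along one axis, always covering the far edge."""
    if length <= cube:
        return [0]
    starts = list(range(0, length - cube + 1, stride))
    if starts[-1] != length - cube:
        starts.append(length - cube)
    return starts

def predict_tiled(volume, model, cube=64, stride=16):
    """Process overlapping cubes and average the overlapping predictions."""
    pad = [(0, max(0, cube - s)) for s in volume.shape]
    padded = np.pad(volume, pad, mode="constant")
    acc = np.zeros(padded.shape, dtype=np.float64)
    counts = np.zeros(padded.shape, dtype=np.float64)
    for i in tile_starts(padded.shape[0], cube, stride):
        for j in tile_starts(padded.shape[1], cube, stride):
            for k in tile_starts(padded.shape[2], cube, stride):
                sl = (slice(i, i + cube), slice(j, j + cube), slice(k, k + cube))
                acc[sl] += model(padded[sl])
                counts[sl] += 1.0
    averaged = acc / counts
    # Crop away any padding added for small volumes.
    return averaged[tuple(slice(0, s) for s in volume.shape)]
```

With an identity `model` the function returns the input unchanged, which is a convenient sanity check on the averaging of the overlaps.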

Evaluation

With the aim of guiding the cross-validation process, we computed the correlation coefficient between the maps produced by DeepEMhancer and the maps used as learning targets (masked LocScale post-processed maps). Once the final model was selected, the quality of DeepEMhancer predictions was assessed comparing the input and processed maps against the reference maps obtained from the atomic models. Specifically, we computed the FSC between them and we estimated the resolution using 0.5 as the threshold. Due to the fact that DeepEMhancer performs a non-conventional post-processing operation, including masking and enhancement operations, in order to disentangle the two effects, the FSC was also computed after masking the maps to compare with a tight mask derived from the atomic model.
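For readers unfamiliar with the FSC, the following sketch computes the curve between two cubic maps and reads off the resolution at the 0.5 threshold. It is a simplified illustration under stated assumptions: shells are one Fourier voxel wide, the resolution is taken at the first shell that falls below the threshold without interpolation, and the function names are hypothetical rather than part of any cryo-EM package.

```python
import numpy as np

def fsc_curve(vol_a, vol_b, voxel_size=1.0):
    """Fourier shell correlation between two cubic volumes of equal size.

    Returns (frequencies in 1/Å, FSC value per shell).
    """
    n = vol_a.shape[0]
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)
    idx = np.fft.fftfreq(n) * n  # integer frequency index along each axis
    gx, gy, gz = np.meshgrid(idx, idx, idx, indexing="ij")
    shell = np.rint(np.sqrt(gx**2 + gy**2 + gz**2)).astype(int).ravel()
    n_shells = n // 2
    keep = shell < n_shells  # discard shells beyond the Nyquist radius

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep],
                           minlength=n_shells)

    num = shell_sum((fa * np.conj(fb)).real)
    denom = np.sqrt(shell_sum(np.abs(fa) ** 2) * shell_sum(np.abs(fb) ** 2))
    fsc = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)
    freqs = np.arange(n_shells) / (n * voxel_size)
    return freqs, fsc

def resolution_at_threshold(freqs, fsc, threshold=0.5):
    """Resolution (Å) of the first non-zero shell with FSC below threshold."""
    below = np.where(fsc[1:] < threshold)[0]
    if below.size == 0:
        return 1.0 / freqs[-1]  # curve never drops: report the last shell
    return 1.0 / freqs[below[0] + 1]
```

Applying a mask before the computation amounts to multiplying both volumes by the same mask array before calling `fsc_curve`.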

As a complementary metric, we also applied DeepRes¹⁷ over the input and processed maps. DeepRes is a deep learning-based local resolution method that, contrary to others, is sensitive to the sharpening process and thus, it can provide an alternative estimation of the post-processing effect.

Finally, for comparison purposes, we repeated the FSC and DeepRes experiments using the Relion postprocessing program^2,3. As Relion automatic masking is very simple, in order to make the comparison more interesting, we decided to execute the postprocessing algorithm using the mask derived from the atomic models. Similarly, since the automatic determination of the B-factor can produce worse results than a manually selected one, in addition to the maps computed using an automatically determined B-factor by Relion, we also considered the sharpened map deposited in EMDB.

EMD-30178 map evaluation and atomic model modification

DeepEMhancer was applied to the half maps deposited in EMDB entry EMD-30178. The published and post-processed maps were visually inspected using Coot²⁷ and Chimera²⁸, and chosen regions on the 7btf PDB were newly built or modified using Coot.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All training and testing examples used in this work can be found in the EMDB and PDB databases. Accession codes are included in Supplementary Note 7 and Supplementary Data 1. Post-processed map examples and trained models are freely available at http://campins.cnb.csic.es/deepEMhancer/examples. Data used during figure preparation are available in Supplementary Data 2 and 3. All other data are available from the corresponding authors upon reasonable request.

Code availability

DeepEMhancer is freely available at https://github.com/rsanchezgarc/deepEMhancer and as an Xmipp protocol for Scipion v3 (https://github.com/I2PC/scipion-em-xmipp).

References

1. Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745 (2003).
2. Kimanius, D., Forsberg, B. O., Scheres, S. H. & Lindahl, E. Accelerated cryo-EM structure determination with parallelisation using GPUs in RELION-2. Elife 5, e18722 (2016).
3. Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. Elife 7, e42166 (2018).
4. Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. Automated map sharpening by maximization of detail and connectivity. Acta Crystallogr. Sect. D 74, 545–559 (2018).
5. Vilas, J. L. et al. Re-examining the spectra of macromolecules. Current practice of spectral quasi B-factor flattening. J. Struct. Biol. 209, 107447 (2020).
6. Jakobi, A. J., Wilmanns, M. & Sachse, C. Model-based local density sharpening of cryo-EM maps. Elife 6, e27131 (2017).
7. Ramírez-Aportela, E. et al. Automatic local resolution-based sharpening of cryo-EM maps. Bioinformatics 36, 765–772 (2020).
8. Kaur, S. et al. Local computational methods to improve the interpretability and analysis of cryo-EM maps. Nat. Commun. 12, 1240 (2021).
9. Wagner, T. et al. SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM. Commun. Biol. 2, 218 (2019).
10. Wang, F. et al. DeepPicker: A deep learning approach for fully automated particle picking in cryo-EM. J. Struct. Biol. 195, 325–336 (2016).
11. Zhu, Y., Ouyang, Q. & Mao, Y. A deep convolutional neural network approach to single-particle recognition in cryo-electron microscopy. BMC Bioinforma. 18, 348 (2017).
12. Gupta, H., McCann, M. T., Donati, L. & Unser, M. CryoGAN: a new reconstruction paradigm for single-particle cryo-EM via deep adversarial learning. Preprint at bioRxiv https://doi.org/10.1101/2020.03.20.001016 (2020).
13. Zhong, E. D., Bepler, T., Davis, J. H. & Berger, B. Reconstructing continuous distributions of 3D protein structure from cryo-EM images. Preprint at arXiv https://arxiv.org/abs/1909.05215v3 (2019).
14. Maddhuri Venkata Subramaniya, S. R., Terashi, G. & Kihara, D. Protein secondary structure detection in intermediate-resolution cryo-EM maps using deep learning. Nat. Methods 16, 911–917 (2019).
15. Si, D. et al. Deep learning to predict protein backbone structure from high-resolution cryo-EM density maps. Sci. Rep. 10, 1–22 (2020).
16. Avramov, T. et al. Deep learning for validating and estimating resolution of cryo-electron microscopy density maps. Molecules 24, 1181 (2019).
17. Ramírez-Aportela, E., Mota, J., Conesa, P., Carazo, J. M. & Sorzano, C. O. S. DeepRes: a new deep-learning- and aspect-based local resolution method for electron-microscopy maps. IUCrJ 6, 1054–1063 (2019).
18. Yang, W. et al. Deep learning for single image super-resolution: a brief review. IEEE Trans. Multimed. 21, 3106–3121 (2019).
19. Lawson, C. L. et al. EMDataBank unified data resource for 3DEM. Nucleic Acids Res. 44, D396–D403 (2015).
20. Ronneberger, O., Fischer, P. & Brox, T. U-net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention-MICCAI, Vol. 9351 (eds Navab, N., Hornegger, J., Wells, W. M. & Frangi, A. F.) 234–241 (2015).
21. Tenthorey, J. L. et al. The structural basis of flagellin detection by NAIP5: a strategy to limit pathogen immune evasion. Science 358, 888–893 (2017).
22. Johnson, Z. L. & Chen, J. ATP binding enables substrate release from multidrug resistance protein 1. Cell 172, 81–89.e10 (2018).
23. Walter, J. D., Sawicka, M. & Dutzler, R. Cryo-EM structures and functional characterization of murine Slc26a9 reveal mechanism of uncoupled chloride transport. Elife 8, e46986 (2019).
24. Gao, Y. et al. Structure of the RNA-dependent RNA polymerase from COVID-19 virus. Science 368, 779–782 (2020).
25. Wu, Y. & He, K. Group normalization. Int. J. Comput. Vis. 128, 742–755 (2020).
26. He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proc. 2015 IEEE International Conference on Computer Vision, 1026–1034 (2015).
27. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. Sect. D 60, 2126–2132 (2004).
28. Pettersen, E. F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).

Acknowledgements

The authors would like to acknowledge financial support from: the Spanish Ministry of Science and Innovation through Grants: Proyectos de I+D+i - RTI Tipo A PID2019-108850RA-I00, SEV 2017-0712, PID2019-104757RB-I00/AEI/10.13039/501100011033; the “Comunidad Autónoma de Madrid” through Grant S2017/BMD-3817; CSIC: PIE/COVID-19 number 202020E079; European Union (EU) and Horizon 2020 through grants EOSC Life (INFRAEOSC-04-2018, Proposal: 824087) and HighResCells (ERC-2018-SyG, Proposal: 810057). J.V. acknowledges financial support from the Ramón y Cajal 2018 program (RYC2018-024087-I).

Author information

Ruben Sanchez-Garcia
Present address: Department of Statistics, University of Oxford, Oxford, UK

Authors and Affiliations

Biocomputing Unit, Centro Nacional de Biotecnología-CSIC, Madrid, Spain
Ruben Sanchez-Garcia, Ana Cuervo, Jose Maria Carazo & Carlos Oscar S. Sorzano
Department of Anatomy and Cell Biology, McGill University, Montréal, QC, Canada
Josue Gomez-Blanco & Javier Vargas
Departamento de Óptica, Universidad Complutense de Madrid, Madrid, Spain
Josue Gomez-Blanco & Javier Vargas

Authors

Ruben Sanchez-Garcia
Josue Gomez-Blanco
Ana Cuervo
Jose Maria Carazo
Carlos Oscar S. Sorzano
Javier Vargas

Contributions

Conceptualization: J.V., C.O.S.S., R.S.-G.; Methodology: R.S.-G., J.V., and C.O.S.S.; Software implementation: R.S.-G., J.G.; Evaluation: R.S.-G., J.V., A.C.; Writing: R.S.-G., A.C., J.V., C.O.S.S. and J.M.C.; Supervision: J.V., C.O.S.S.; Funding acquisition: J.M.C., J.V.

Corresponding authors

Correspondence to Carlos Oscar S. Sorzano or Javier Vargas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available. Primary handling editors: Jung-Eun Lee, Christina Karlsson Rosenthal, George Inglis.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Material

Description of Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sanchez-Garcia, R., Gomez-Blanco, J., Cuervo, A. et al. DeepEMhancer: a deep learning solution for cryo-EM volume post-processing. Commun Biol 4, 874 (2021). https://doi.org/10.1038/s42003-021-02399-1

Received: 20 August 2020
Accepted: 17 June 2021
Published: 15 July 2021
DOI: https://doi.org/10.1038/s42003-021-02399-1
