Introduction

Electron and scanning probe microscopy techniques have become a mainstay of research in materials science, condensed-matter physics, biology, and nanotechnology.1 Electron microscopy (EM) has made atomic resolution imaging,2 the mapping of plasmon and phonon excitations,3 the probing of the chemical states of individual atoms,4 and atomic assembly5 a reality. Similarly, scanning probe microscopy has enabled mapping of individual chemical bonds, probing of quasiparticles and superconducting order parameters,6 as well as the exploration of tip-induced reactions and atomic manipulation.7 These instrumental capabilities have provided insight into atomic structures and functionalities of materials ranging from superconductors and ferroelectrics to macromolecules. However, the reams of data generated by modern imaging and hyperspectral imaging tools surpass the current infrastructure for storage and analysis, limiting our ability to derive actionable physics, chemistry, and materials insights.

Until recently, most imaging techniques have relied on semiqualitative analysis, where human experts interpret two-dimensional (2D) images or individual one-dimensional (1D) spectra. However, data collected by electron and probe microscopes are often intrinsically quantitative but encoded across various modalities and dimensionalities. This characteristic requires subsequent extraction of features and correlations at various length and time scales. This includes information regarding the relative positions of atoms and bonding networks directly related to the thermodynamics and kinetics of materials synthesis. Minute symmetry-breaking distortions contain information concerning structural and polar-order parameters. Spatial maps of plasmonic and phonon interactions contain information about the dielectric function, convoluted with shape effects. Correspondingly, the second challenge for microscopy is learning the fundamental physics and chemistry of the studied materials from imaging and spectral data.

Finally, the third challenge is inherently linked to the capability of electron and scanning probe microscopes to modify materials in a controllable fashion. The last note left by Richard Feynman on his blackboard was “What I cannot create, I do not understand.” From this perspective, the capability of local probes to visualize and manipulate matter on the atomic level is the next frontier for nanoscience and nanotechnology. Active learning methods under the umbrella of machine learning offer a pathway to harness imaging data streams, derive physical insights into the chemistry and physics of materials, and assemble them at the nano- and atomic scales.

Learning more from data

Since the early days of van Leeuwenhoek, the power of microscopy has lain in its ability to visualize objects on progressively smaller length scales. For electron and scanning probe microscopies, an additional step involves the conversion of interactions between the electron or scanning probe and the object to form images. However, data interpretation has been largely driven by human insights. This includes identifying features in images or spectra and qualitative interpretation, in some cases followed by quantitative analysis, ultimately connecting results to physical models and prior knowledge. This approach is inherently limited by human perception and bias. For example, the human eye is remarkably good at identifying well-localized objects, but struggles to detect the emergence of correlated signatures in different parts of the image field or to detect small or gradual changes in periodicity. Furthermore, the eye is extremely sensitive to color scales and can regularly be deceived by the perception of contrast. Even more importantly, interpretation of the data in terms of relevant physics is highly dependent on prior knowledge.

Machine learning (ML) methods offer an opportunity to change this paradigm. While neural networks have been known since the early 1950s, the lack of training algorithms and sufficiently powerful computational tools led to slow progress, with ebbs and flows. This situation has changed in the last decade.8 The explosive development of ML tools in computer vision, medicine, and robotics has created the knowledge base and enabling infrastructure that allows its extension to the physical sciences. Next, we identify some of the applications enabled by classical supervised ML methods and new opportunities enabled by unsupervised and physics-driven ML.

What is in the image: Supervised learning

Artificial intelligence-guided knowledge extraction from raw experimental data streams is a challenge across all materials science, chemistry, and physics disciplines.9,10 In electron and scanning probe microscopy specifically, we are often concerned with determining key microstructural descriptors that encode atomic-scale properties and processes.11 These descriptors are challenging to define rigorously but are essential as a key step in transforming raw data into actionable metrics for high-throughput and automated decision-making. The most common characterization modalities for scanning transmission electron microscopy (STEM), scanning electron microscopy (SEM), and scanning probe microscopy (SPM) are various forms of 2D imaging, as shown in Figure 1. For TEM, most measurements compress the signals measured from a three-dimensional (3D) sample volume into a projected 2D image. SEM and SPM are primarily sensitive to material surfaces and, thus, are also well suited for 2D imaging. Fortunately, 2D image analysis is also the sub-domain of deep learning (DL) that has received the most attention, due to immense online image databases in medicine and biology, astronomy, and social media.

Figure 1

Information channels available in (a) scanning transmission electron microscopy (STEM) and (b) scanning probe microscopy (SPM) experiments. Measurement techniques are labeled in pink. Figure courtesy of Colin Ophus.

Most DL applications in image processing target classification, with the most common subproblem being segmentation. Semantic segmentation is the process of labeling every pixel (or voxel) into discrete classes and is the desired output of many deep learning routines.12 Segmentation is beneficial for many microscopy applications in both TEM13 and SPM studies.14 Image segmentation examples in electron microscopy include separating atomic resolution phases15 and rapid determination of microstructural features.16 Segmentation has also been applied to SPM data sets, where it has been used to detect and avoid atomic surface defects before patterning17 and to detect arbitrarily complex features over many length scales.18 Deep learning can also produce more profound insights into microscopy experiments by directly extracting relevant properties. SPM examples include extracting mechanical properties of copolymers19 and automated molecular structure discovery in AFM.20 The field of TEM also contains many examples, including measurement of carbon nanotube chiral indices21 and automated classification and symmetry determination of atomic structures.22

Given limited prior knowledge of a system or experiment, we require models that can generalize to novel situations in real-world scenarios. For example, structure determination in the microscope heavily depends on imaging conditions, aberrations, sample orientation, and detector. Existing libraries often poorly capture the resulting variety of potential image types, spectra, and diffraction patterns. Although high-throughput simulations can generate synthetic data for network training, these trained networks are typically material-specific. They cannot yet approximate the vast number of possible real-world imaging conditions. Recent advances in few-shot ML may be used for triaging and classification in scenarios in which we have very limited prior knowledge.23 This approach is almost entirely unexplored in the context of electron microscopy and analytical characterization more broadly. It can allow us to do more with less and inform otherwise intractable novel situations. Motivated by the ability of humans, especially children, to rapidly learn novel visual concepts by utilizing what they learned in the past, one-shot or few-shot approaches allow human-level performance with fewer and less intensively labeled data (i.e., shots). Studies leveraging these methods in materials science and microscopy are limited,16,24 but are a critical step toward handling large data streams with few to no annotations. This concept also has significant implications for studying transient behavior and unfolding experiments, where decisions must be made quickly given limited prior knowledge. Old ways of training and retraining large models on extensively annotated data are not well suited to a rapid “discovery” cycle.
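As a rough illustration of the few-shot idea, the sketch below classifies unlabeled feature vectors by their distance to class "prototypes" computed from only three labeled examples per class. This nearest-prototype scheme is one common few-shot strategy and is not necessarily the method used in the cited studies. The function name `prototype_classify` is hypothetical, and the synthetic 2D feature vectors are an assumption standing in for embeddings that a pretrained network might produce from image patches.

```python
import numpy as np

def prototype_classify(support_x, support_y, query_x):
    """Nearest-prototype few-shot classifier.

    support_x: (n_support, n_features) embeddings of the labeled examples ("shots")
    support_y: (n_support,) integer class labels
    query_x:   (n_query, n_features) embeddings of unlabeled patches
    Returns the predicted class label for each query embedding.
    """
    classes = np.unique(support_y)
    # One prototype per class: the mean embedding of its few labeled shots.
    prototypes = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every query to every prototype.
    d2 = ((query_x[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[np.argmin(d2, axis=1)]

# Synthetic stand-in for patch embeddings: three well-separated classes (assumption).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
support_y = np.repeat(np.arange(3), 3)            # three shots per class
support_x = centers[support_y] + 0.3 * rng.standard_normal((9, 2))
query_y = np.repeat(np.arange(3), 20)
query_x = centers[query_y] + 0.3 * rng.standard_normal((60, 2))

pred = prototype_classify(support_x, support_y, query_x)
accuracy = (pred == query_y).mean()
print(f"few-shot accuracy on synthetic embeddings: {accuracy:.2f}")
```

Because no weights are retrained when new classes appear, adding a class only requires a handful of new labeled shots, which is the property that makes such schemes attractive for unfolding experiments.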

Discovery via unsupervised learning

The alternative approach for analyzing imaging and spectroscopic data is based on unsupervised learning. In this case, an ML algorithm seeks to discover common traits and variabilities within the high-dimensional data. One way to do this is to use statistical methods of machine and deep learning to disentangle phenomena in high-dimensional space and bring important physics into focus. Techniques such as clustering, principal component analysis, nonnegative matrix factorization, and dictionary learning are highly effective in quickly probing structure–property relationships in multidimensional imaging, but they have an inherent flaw. Probing structure–property relationships in microscopy requires consideration of spatiotemporal relationships in data, whereas the aforementioned ML approaches treat each dimension as independent and uncorrelated. This situation becomes particularly problematic when comparing data translated in space or time.
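The limitation described above can be made concrete with a small numerical experiment. The sketch below builds a synthetic hyperspectral cube (an assumption, not data from the article), runs principal component analysis on it, and then repeats the analysis after randomly shuffling the pixel positions. The explained-variance spectrum is unchanged, which shows that this kind of decomposition never uses the spatial arrangement of the pixels.

```python
import numpy as np

rng = np.random.default_rng(0)
ny, nx, n_channels = 16, 16, 32

# Synthetic hyperspectral cube: two spectral signatures mixed by smooth spatial maps.
energy = np.linspace(0.0, 1.0, n_channels)
spectra = np.stack([np.exp(-((energy - 0.3) / 0.05) ** 2),
                    np.exp(-((energy - 0.7) / 0.05) ** 2)])     # (2, n_channels)
yy, xx = np.mgrid[0:ny, 0:nx]
weights = np.stack([xx / (nx - 1), yy / (ny - 1)], axis=-1)     # (ny, nx, 2)
cube = weights @ spectra + 0.01 * rng.standard_normal((ny, nx, n_channels))

def pca_explained_variance(cube):
    """Flatten pixels into rows, center, and return PCA explained-variance ratios."""
    X = cube.reshape(-1, cube.shape[-1])
    X = X - X.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)
    return s**2 / np.sum(s**2)

# Randomly permute the pixel positions, which destroys the spatial structure.
perm = rng.permutation(ny * nx)
shuffled = cube.reshape(-1, n_channels)[perm].reshape(ny, nx, n_channels)

evr_original = pca_explained_variance(cube)
evr_shuffled = pca_explained_variance(shuffled)
# PCA cannot tell the two cubes apart, because it never sees pixel positions.
print(np.allclose(evr_original, evr_shuffled))
```

The same argument applies to clustering and matrix factorization applied to flattened pixels, which is one motivation for the convolutional and shift-aware models discussed next.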

One partial solution to this problem can be achieved by using autoencoders. Autoencoders generally refer to a class of ML methods aiming to discover low-dimensional representations of the data and are composed of the encoder and decoder parts represented by neural networks. When trained, the encoder learns a compact statistical representation of the training data distribution that can be decoded back to the original data using the decoder. Because autoencoders are variants of deep neural networks, various neural network building blocks can be used. This allows for incorporation of local spatial dependencies using 2D convolutions and sequential dependencies using 1D convolution or recurrent neural networks.

One rapidly developing example is the variational autoencoder (VAE). In the encoder part of the network, a data set is compressed to a small number of latent variables (latent vector), which the decoder then aims to decode into original data. The training of VAEs balances the reconstruction quality and the proximity of the latent variable distributions to a chosen (typically Gaussian) prior. The unique aspect of autoencoders is that the latent representations of the system often allow one to disentangle the representation of the data (i.e., to discover the systematic traits and align them with the specific latent variables). For example, when disentangling a representation of the hand-written digit (MNIST) data set, one of the latent representations may correspond to the width of the digit, and the other to the tilt. Practically, VAE architectures can be modified to represent variables such as rotation, dilation, and shift as separate latent vectors. The remaining factors of variation can be used to establish order parameters, explore structures, and, in special cases, analyze relevant physical mechanisms.25
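The balance between reconstruction quality and proximity to the prior can be written down compactly. The following is a minimal NumPy sketch of the two terms of a VAE training objective, assuming a diagonal Gaussian encoder, a standard normal prior, and a squared-error reconstruction term. The function names and the weighting factor `beta` are illustrative assumptions. A practical implementation would use an automatic-differentiation framework and trained encoder and decoder networks, which are omitted here.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) as mu + sigma * eps, with eps ~ N(0, 1)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions, per sample."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def vae_loss(x, x_reconstructed, mu, log_var, beta=1.0):
    """Batch mean of (squared-error reconstruction term + beta * KL term)."""
    reconstruction = np.sum((x - x_reconstructed) ** 2, axis=-1)
    return np.mean(reconstruction + beta * kl_to_standard_normal(mu, log_var))

# Tiny usage example with placeholder encoder outputs (assumed values).
rng = np.random.default_rng(0)
mu = np.zeros((4, 2))        # latent means for a batch of 4 samples, 2 latent variables
log_var = np.zeros((4, 2))   # log-variances; zero corresponds to unit variance
z = reparameterize(mu, log_var, rng)
print(z.shape, kl_to_standard_normal(mu, log_var))
```

When the encoder output matches the prior exactly, the KL term vanishes and only the reconstruction error remains; increasing `beta` pushes the latent distribution toward the prior at the expense of reconstruction quality.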

Probing structure–property relations

Although mere representation of data is important, a central concept in materials science is that of mapping quantitative or semiquantitative structure–property relationships. Advances in spectroscopic modes such as electron energy-loss spectroscopy (EELS) and scanning tunneling spectroscopy (STS) allow us to generalize this concept to atomic- and nanometer-level studies. STS of individual defects, a variant of scanning tunneling microscopy (STM), has been known since the early 1990s. In electron microscopy, advances in EELS allowed the determination of the valence states of three- and fourfold-coordinated Si atoms in graphene,26 followed by P27 and other elements.28 In cases when structural descriptors are well defined, this analysis is straightforward. However, in many cases, it is not clear what relevant building blocks define the structure of a solid.

In these cases, ML can be used as a correlative tool to simultaneously simplify the structural and spectral descriptors and establish the relationship between the two. The linear method of this analysis is canonical correlation analysis (CCA), representing the generalization of principal component analysis to two data sources. CCA has been applied to establish the relationship between structural distortions and electron-scattering patterns in graphene with defects.7 However, in many cases, the image formation mechanisms are nonlinear. In this case, the encoder–decoder architectures allow building correlative relationships as described for the predictability of the plasmon spectra from high-angle annular dark-field (HAADF) STEM data29 (Figure 2) and decoding relationships between domain structure and functionality in ferroelectric materials.30
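For the linear case, CCA can be sketched in a few lines of NumPy by whitening each data source and taking a singular value decomposition of the whitened cross-covariance. The example below is a hedged illustration on synthetic data: the arrays `X` and `Y` are assumptions standing in for structural descriptors and spectral descriptors that share one hidden factor, and the small regularizer `reg` is an illustrative choice for numerical stability.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Linear CCA: canonical correlations and projection matrices for X and Y."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky, full_matrices=False)
    return s, Kx @ U, Ky @ Vt.T

# Synthetic stand-ins: 5 "structural" and 8 "spectral" descriptors sharing one factor.
rng = np.random.default_rng(0)
n = 500
shared = rng.standard_normal((n, 1))
X = shared @ rng.standard_normal((1, 5)) + 0.1 * rng.standard_normal((n, 5))
Y = shared @ rng.standard_normal((1, 8)) + 0.1 * rng.standard_normal((n, 8))

corrs, A, B = cca(X, Y)
print("leading canonical correlations:", np.round(corrs[:2], 3))
```

The first canonical pair recovers the shared factor, while the remaining correlations stay small; nonlinear image formation would require replacing the linear projections with an encoder–decoder model.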

Figure 2

(a) Illustration of the encoder–decoder model and its latent space. The inputs and outputs can be structural image patches and 1D spectra measured in those patches. In this example, the inputs are high-angle annular dark-field (HAADF) scanning transmission electron microscope (STEM) image patches, and the outputs are corresponding electron energy-loss spectroscopy (EELS) spectra. (b) Distinct plasmonic responses uncovered from the analysis of the latent space and (c) their spatial locations on the HAADF STEM image. (d, e) Prediction of EELS spectra from a structural image using a trained encoder–decoder model. Data from K. Roccapriore (Oak Ridge National Laboratory). Figure courtesy of M. Ziatdinov.

Physical discovery

Ultimately, atomic configurations and property maps reflect fundamental information and interactions for atomic assembly. In certain cases, information regarding the processes that lead to the formation of a material can be derived. Hence, a key task for microscopy is to find out whether these laws can be learned from the observational data. A parallel can be drawn to sciences such as astronomy, in which the physical laws of celestial mechanics and subsequently Newton’s laws were derived from the observation of planetary motion. Recently, it has been shown that for ferroic materials, including ferroelectrics and ferroelastics, mesoscopic order-parameter fields can be visualized.31,32 These fields can be fitted to the predictions of Ginzburg–Landau theory to learn the corresponding free energies, gradient terms, and even flexoelectric constants.33 This approach can be further extended to the atomistic level to learn the Hamiltonians describing atomic interactions.34
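A toy version of such a fit is sketched below, under stated assumptions. It assumes a one-dimensional free-energy density f = (a/2)P^2 + (b/4)P^4 + (g/2)(dP/dx)^2, whose Euler–Lagrange equation is g P'' = a P + b P^3, and a noise-free synthetic domain-wall profile. The functional form, the parameter values, and the regression-on-derivatives procedure are illustrative choices and are not taken from the cited studies.

```python
import numpy as np

# Assumed 1D Ginzburg-Landau free-energy density (illustrative form):
#   f = (a/2) P^2 + (b/4) P^4 + (g/2) (dP/dx)^2
# Its Euler-Lagrange equation is  g P'' = a P + b P^3.
a_true, b_true, g_true = -1.0, 1.0, 1.0

# Synthetic "observed" order-parameter profile across a domain wall.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
P0 = np.sqrt(-a_true / b_true)            # bulk order parameter
xi = np.sqrt(-2.0 * g_true / a_true)      # domain-wall width parameter
P = P0 * np.tanh(x / xi)

# Second derivative by central finite differences (interior points only).
P_xx = (P[2:] - 2.0 * P[1:-1] + P[:-2]) / dx**2
P_in = P[1:-1]

# Linear least squares for P'' = (a/g) P + (b/g) P^3.
design = np.column_stack([P_in, P_in**3])
(a_over_g, b_over_g), *_ = np.linalg.lstsq(design, P_xx, rcond=None)
print(f"a/g = {a_over_g:.4f}, b/g = {b_over_g:.4f}")
```

Only the ratios a/g and b/g are identifiable from a static profile; fixing the absolute scale requires additional information, such as a known energy or a response to an applied field. Experimental maps are noisy, so the finite-difference derivative would need smoothing or a more robust fitting strategy in practice.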

Toward automated experimentation

Transformative discoveries in domains such as energy storage, quantum information science, and advanced manufacturing require novel experimental paradigms beyond highly manual, disconnected, and inefficient experiment architectures. Electron microscopy, a cornerstone of the study of atomic structure, chemistry, and dynamics, exemplifies this challenge. Hardware advances have left us awash with multimodal, high-volume data, leading us to rethink how to make effective decisions in complex, fast-paced experiments.35 We are now data-rich but faced with increasingly difficult analysis tasks that leave us knowledge-poor. There is presently a transformative opportunity to leverage AI and emerging analytics approaches to accelerate scientific discovery in electron microscopy, laying the groundwork for autonomous experimentation.

Self-driving electron microscopy

Fully automated experiments range from simply repeating an (often-debated) “standard recipe” for a given experimental method (open-loop experimentation) to data-driven autonomous discovery platforms, which can, without intervention, optimize measurement or synthesis parameters. Examples of the latter may select samples or regions to probe or modify, or, in some cases, even which experiments to perform or which samples to fabricate next (closed-loop experimentation), balancing exploration and exploitation. One of the key ingredients for the implementation of full automation is tight integration between the experimental hardware, the software and algorithms, and the control hardware, as well as the ability to modify both hardware and software to suit experimental needs. Synchrotron beamline experiments are typically custom built, and therefore, many have pioneered full automation. These light source experiments are supported by a range of data science tools, including the Globus data transfer ecosystem36 and the Data and Learning Hub for ML-enabled modeling.37 These developments lead to the open question of how the scanning probe and electron microscopy communities will manage the growing volumes of data.

In electron microscopy, the most widespread automation tools are those optimized for biological studies, where samples are relatively uniform and imaging modalities are standardized. One example is the SerialEM platform, which automates EM experiments such as tomographic tilt series collection, as well as automating steps such as stage positioning and microscope defocus tuning.38 More complex examples include the hardware automation platforms developed for single-particle reconstructions in cryo-EM.39 These examples only scratch the surface of the possibilities for fully autonomous experiments, which could combine multiple signal channels and measurement modalities to perform arbitrarily complex experiments.

To make meaningful progress toward automation of complex experiments in materials EM and SPM, we must be able to address several gaps in current experimental AI practice. First, we require models capable of analytically enriching large, heterogeneous, and complex data sets in real time with minimal human intervention. Such multimodal data are highly varied in their dimensionality, feature types, and artifacts, motivating the need for more flexible and transferable approaches to classification and segmentation, particularly as we move toward self-driving instrumentation. Second, we require models that can generalize and adapt to novel scenarios, given limited prior knowledge of a system or experiment. In contrast to other experimental techniques, the reconstruction of a structure in the electron microscope heavily depends on imaging conditions, aberrations, orientation, and the detector. The wide variety of potential image types, spectra, and diffraction patterns is poorly captured by existing libraries. High-throughput simulations can help generate some synthetic data for network training, but it is still difficult to approximate the vast number of possible real-world imaging conditions. Third, we require physically grounded, interpretable models from which we can derive actionable metrics for control systems and automation. As described in the preceding section, domain knowledge must be integrated into these models from the outset to ensure that meaningful features and behaviors are detected, particularly if a control system is to anticipate and decide on the next steps of an experiment in a robust manner. Recent efforts in this direction include sparse-data-guided electron microscope platforms,40 which leverage a handful of user-provided examples to guide automated decision-making. These developments show the potential of a flexible and domain-aware approach to analytics and control unconstrained by a lack of prior knowledge. Finally, we need to consider how computational hardware can be co-designed to meet the latency and throughput requirements of an experiment, taking advantage of both cloud and edge-based computing systems. There is also an emerging opportunity to design automation for custom hardware solutions that can achieve inference times on the order of nanoseconds.41

Realization of autonomous experiments necessitates a definition of prior knowledge and goals. These considerations, in turn, determine the selection of enabling algorithms. For example, automated experimentation based on uniform imaging over a large grid of points assumes zero prior knowledge and explores material everywhere. Similarly, spectroscopic measurements at locations with a priori known structures of interest require ML methods at the segmentation stage. Conversely, the discovery of specific structural and spectral elements of interest and sparse image reconstructions requires more advanced Gaussian process-based techniques. Recently, it has been shown that deep kernel learning combines the expressive power of deep convolutional networks with the flexibility of Gaussian processes to enable discovery of structure–property relationships in scanning probe and electron microscopy.42,43 There is a tremendous opportunity for automated experiments that control perturbation energy and frequency to capture spatiotemporal phenomena associated with specific materials structures.
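A closed-loop measurement guided by a Gaussian process can be sketched as follows. This is a deliberately simplified stand-in: it uses a plain Gaussian process with a fixed radial-basis-function kernel on a 1D coordinate rather than deep kernel learning, and the function `measure` is a hypothetical placeholder for an expensive point measurement such as acquiring a spectrum at a chosen location. The kernel length scale, noise level, and the factor 2.0 in the upper-confidence-bound acquisition are arbitrary illustrative values.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=0.15, variance=1.0):
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP with an RBF kernel."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    prior_var = rbf_kernel(x_test, x_test).diagonal()
    var = np.clip(prior_var - np.sum(v**2, axis=0), 0.0, None)
    return mean, np.sqrt(var)

def measure(x):
    """Hypothetical stand-in for an expensive point measurement."""
    return np.exp(-((x - 0.7) / 0.1) ** 2) + 0.3 * np.exp(-((x - 0.2) / 0.05) ** 2)

grid = np.linspace(0.0, 1.0, 201)   # candidate measurement locations
measured_idx = [0, 100, 200]        # a few seed measurements
for step in range(15):
    x_train = grid[measured_idx]
    y_train = measure(x_train)
    mean, std = gp_posterior(x_train, y_train, grid)
    # Upper confidence bound: exploit a high predicted signal, explore high uncertainty.
    acquisition = mean + 2.0 * std
    acquisition[measured_idx] = -np.inf   # do not repeat measured points
    measured_idx.append(int(np.argmax(acquisition)))

visited = grid[measured_idx]
best_x = visited[np.argmax(measure(visited))]
print(f"{len(measured_idx)} measurements, best location found: x = {best_x:.3f}")
```

The loop measures 18 of the 201 candidate locations instead of scanning the full grid. In a deep kernel learning setting, the hand-chosen coordinate would be replaced by features that a neural network learns from local image patches.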

A unique opportunity opened by quantitative microscopies is the fundamental study of structure– and composition–property relationships. Even nominally homogeneous materials contain atomic-level fluctuations, thus encompassing a range of local chemical compositions and environments. Learning local structure–property relationships allows for probing broad regions of chemical spaces. This approach can be further expanded by combining the local characterization and macro- or mesoscale combinatorial libraries. Here, the key requirement is the instantiation of ML methods that contain information on possible physical mechanisms underlying materials properties and refinement of them throughout the experiment.

Mesoscopic and atomic fabrication

A unique niche for automated experimentation is the possibility to not only measure previously fabricated structures, but to also use the imaging modality itself as a manipulation tool. Although light-based lithography is widely used in semiconductor manufacturing and is reaching ever-smaller dimensions, the wavelength of light places a hard limit on the achievable feature size. Truly atomic-level manipulation can only be achieved with atomic-sized probes. Richard Feynman pointed out more than 60 years ago that nothing in the laws of physics prohibits this ultimate limit of materials engineering.44 It took several decades for technological developments to catch up, first with the invention of STM in the early 1980s,45 and then with the development of effective aberration correctors for STEM by the early 2000s.46

The power to manipulate individual atoms was established for the first time by pioneering STM experiments in the early 1990s.7 This soon led to the creation of atomic assemblies with controllable collective quantum properties.47,48 Although STM is primarily limited to surfaces held at cryogenic temperatures in ultrahigh vacuum, the technique of hydrogen depassivation lithography (HDL)49 has enabled heteroatom placement on semiconductor surfaces followed by crystal overgrowth to pattern subsurface donor atoms50 and to fabricate solid-state qubits.51 Although less widely used, atomic force microscopy (AFM) can also be used for atom manipulation and fabrication, notably in some cases even at room temperature.52 Despite many successes, the inherent limitation of SPM techniques to a slowly scanning physical tip in contact with a surface has prompted a search for alternatives.

The possibility to manipulate individual atoms using the Å-sized electron probe of an aberration-corrected STEM was discovered in 2014, alongside a mechanism for the nondestructive movement of Si impurities in graphene.53 Controlled experiments for Si in graphene5,54,55,56 and single-walled carbon nanotubes57 were soon reported, although this proved more challenging for P58 and the most common dopants in graphene, N and B.59 From 2016, increasing attention was drawn to the unique advantages of STEM in being able to address atoms within bulk crystals,60,61 alongside the first experimental demonstrations of directed crystallization and amorphization,62 dopant movement,63 and finally, the controlled manipulation of Bi dopants in bulk silicon,64 the latter exhibiting a novel nondestructive mechanism.65

Although SPM automation is already commonplace, the first efforts to apply ML and automation to STEM atomic fabrication are underway, as shown in Figure 3. Critical components, such as neural network tools to automatically locate the atoms of the lattice,66,67 including any impurities, are already operating with real-time integration into microscope control software,68 while automated electron-beam placement and various detector feedback schemes are being explored.56,69 Overall, the modern computerized STEM is already an excellent platform for atomic fabrication,70 and even more rapid progress is primarily held back by the difficulty of obtaining high-quality samples with large impurity concentrations and atomically clean surfaces, as well as the presence of competing processes including electron-beam damage71 and uncontrolled chemical modifications.57,58,72

Figure 3

Atomic fabrication with scanning probes and electron beams. (a) Hydrogen depassivation lithography (HDL) with a scanning tunneling microscope (STM), using error detection and correction, has removed single H atoms to form dangling bonds that print “150” and a Canadian maple leaf. Adapted from Reference 73. (b) HDL with an STM has written a 5 × 5 array of 20 × 20 atom squares at a pitch of 130 atoms on a Si (100) 2 × 1 H-passivated surface (with permission of J.H.G. Owen of Zyvex Labs). (c) Scanning transmission electron microscope (STEM) image sequence of the electron-beam manipulation of a silicon heteroatom substitution around one carbon hexagon in freestanding monolayer graphene. Adapted with permission from Reference 56. (d) Projected STEM image of a triangular pattern of bismuth impurities manipulated via STEM in a thin slab of bulk silicon. Adapted with permission from Reference 64.

In summary, broad adoption of ML methods in scanning probe and electron microscopy offers a clear pathway to facilitate and automate the analysis of data. These methods also show promise to guide learning of underpinning physical mechanisms. Harnessing ML as a part of the experiment holds promise for enabling automated microscopy experiments, targeting exploration of predefined objects of interest, discovering structure–property relationships, and ultimately testing multiple physical hypotheses. Finally, ML methods in microscopy open the pathway toward direct atomic fabrication via scanning probes and electron beams, which is a critical step toward quantum computing and other serendipitous developments.