Introduction

Self-driving laboratories combine automation and artificial intelligence to accelerate the discovery and optimization of materials1,2,3. The increasing flexibility of laboratory automation is enabling self-driving laboratories to manipulate and measure a broader set of experimental variables4. Consequently, a growing number of self-driving laboratories are being developed across a range of fields5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28. While many self-driving laboratories are able to test multiple experimental variables, most optimize for only a single objective (e.g., process parameter, material property)7,8,9,10,11,12,13,14,15,16. This situation is not consistent with most practical applications, where multiple objectives need to be simultaneously optimized29,30,31,32,33,34,35. Consider, for example, how a solar cell must be optimized for voltage, current, and fill factor to yield a high power conversion efficiency36,37; how an electrolyzer must form products at low voltages and high reaction rates and selectivities38; and how structural alloys are optimized for both strength and toughness31,35. These and other applications motivate the emerging use of self-driving laboratories for multiobjective optimization18,19,20,21,22,23,24,25,26,27,28.

The optimization of materials for multiple objectives can be challenging because improving one objective often compromises another (e.g., decreasing the bandgap of the light-absorbing material in a photovoltaic cell increases the photocurrent but decreases the voltage39). As a result, there is often no single champion material, but rather a set of materials exhibiting trade-offs between objectives (Fig. 1). The set of materials with the best possible trade-offs lies on the Pareto front. Materials on the Pareto front cannot be improved for one objective without compromising one or more other objectives. Most self-driving laboratories used for multiobjective optimization, however, identify only a single optimal material based on preferences specified in advance of the experiment18,19,20,21,22,23,24,25.

Fig. 1: A Pareto front.

No single optimal material exists when searching for materials that satisfy two or more conflicting objectives (e.g., film conductivity and processing temperature). Rather, there is a set of materials that offer the best possible trade-offs between the objectives (indicated by the blue curve). The state-of-the-art materials that offer the best known compromises between the two objectives form the experimentally observed Pareto front (black points).

Here, we use a self-driving laboratory to map out an entire Pareto front27,28. We apply this approach to thin film materials for the first time by mapping out a trade-off between film conductivity and processing temperature. In doing so, our self-driving laboratory identifies previously untested conditions that decrease the temperature required for the combustion synthesis of palladium films from 250 to 190 °C40. This finding expands the range of polymeric substrates onto which palladium can be deposited by combustion synthesis to include Nafion41, polyethersulfone42, and heat-stabilized polyethylene naphthalate42. Our self-driving laboratory also identifies conditions suitable for spray coating homogeneous films on larger substrates with conductivities approaching those of films made by vacuum deposition methods. The approach presented here is highly relevant to the materials sciences because it identifies optimal materials for every preferred trade-off between objectives.

Results

Autonomously discovering a Pareto front

We upgraded the hardware and software of our existing self-driving laboratory, Ada8 (Fig. 2), to study the combustion synthesis of conducting palladium films. This upgraded self-driving laboratory was designed to map out a Pareto front that shows the trade-off between the temperature at which the films are processed and the film conductivity. We selected combustion synthesis as an optimization problem because it is a solution-based method for making functional metal coatings. This method, however, has not yet been scaled and has not been proven for making high-quality, conductive metal films40,43,44. Combustion synthesis can form coatings at lower temperatures, enabling the potential use of inexpensive polymeric substrates45,46, but film conductivity typically decreases as the processing temperature is lowered43. This situation presents a trade-off: to what extent can the conductivity be maximized while the processing temperature is minimized? The answer to this question would enable researchers to determine, for example, what types of substrates could be layered with a metal coating of a certain conductivity. We therefore leveraged Ada to efficiently study the numerous compositional47,48 and processing variables43,49 that influence processing temperatures and the corresponding conductivities.

Fig. 2: The Ada self-driving laboratory and autonomous experimental workflow.

a Schematic of the Ada self-driving laboratory. Ada consists of two robots (N9 and UR5e) with overlapping work envelopes. These robots work together to synthesize and characterize thin-film samples. The N9 robot is a 4-axis arm equipped to mix, drop cast, and anneal precursors to create thin-film samples. The N9 also performs imaging and 4-point probe conductance measurements on the films it creates. The UR5e robot is a larger 6-axis arm equipped to transport samples to additional modules, including an XRF microscope. b Steps in the automated experimental workflow. Each iteration of the experiment produces a single, drop-cast, thin-film sample; images of the sample before and after annealing; an XRF map of the quantity of palladium in the film; and a map of the film conductance measured by the 4-point probe at different locations on the sample. After the sample is characterized, the film conductivity is calculated and the qEHVI algorithm is used to autonomously plan the next experiment. All scale bars are 5 mm.

For this study, we configured Ada to manipulate four variables: fuel identity, fuel-to-oxidizer ratio, precursor solution concentration, and annealing temperature (Fig. 3a). We confined the study to mixtures of two fuels, glycine and acetylacetone, which we independently identified to yield conductive films at temperatures below 300 °C. The fuel-to-oxidizer ratio was varied because it controls product oxidation in bulk combustion syntheses50,51. The precursor concentration influences the morphology of the drop-cast films. Finally, we varied the processing temperature, which may influence the conductivity through solvent removal, precursor decomposition, film densification, impurity removal, grain growth, oxidation, or cracking52,53,54.

Fig. 3: Trade-off between annealing temperature and conductivity for the combustion synthesized palladium films.

a Maps of the combustion synthesis conditions required to obtain experimental outcomes on the Pareto front. Sampled points not on the Pareto front are shown in gray. The combustion synthesis reaction, with the parameters manipulated during the optimization highlighted, is shown. In this reaction, x controls the fuel blend and φ is the fuel-to-oxidizer ratio, as calculated using Jain’s method (see Supplementary Methods). The fuel-to-oxidizer ratio used is scaled by k = 36/(5(8 − 5x)) to account for the differing reducing valences of the acetylacetone and glycine fuels. b The empirical Pareto fronts from each of the four campaigns (solid lines) reveal the observed trade-off between temperature and conductivity. The experimental points which define the fronts are shown with open markers. Sampled points not on the Pareto front are shown in gray.

Flexible automation4 enabled us to upgrade Ada (Fig. 2a) by coupling a larger, 6-axis robot to the existing smaller, 4-axis robot. The smaller robot deposited and characterized the thin films8, while the larger robot transported the samples to a commercial X-ray fluorescence (XRF) microscope for elemental analysis. These two robots jointly executed an 8-step experimental workflow (Fig. 2b, see “Methods” section). First, a combustion synthesis precursor solution was formulated from stock solutions and then drop-cast onto a glass microscope slide. The resulting precursor droplet was imaged and then annealed in a forced-convection oven to form a film. The film was subsequently characterized by XRF microscopy, imaging, and 4-point probe conductance mapping. The conductivity of each film was determined by combining the conductance with a film thickness estimated by XRF (see “Autonomous workflow step 7” in “Methods” section, Supplementary Fig. 2). Finally, the conductivity and processing temperature for each film were passed to a multiobjective Bayesian optimization algorithm55 to plan the next experiment based on all the available data (see “Autonomous workflow step 8” in “Methods” section). The algorithm we used is called q-expected hypervolume improvement (qEHVI)55.

All of the steps in the autonomous workflow were performed without human intervention at a typical rate of two samples per hour. Ada could run unattended for 40–60 experiments until the necessary consumables (e.g., pipette tips, mixing vials, glass substrates, and precursors; see “Methods” section) were exhausted. We used Ada to execute a total of 253 combustion synthesis experiments that explored a wide range of pertinent composition and processing variables.

The qEHVI algorithm is one of a number of a posteriori multiobjective optimization algorithms designed to identify the Pareto front55,56,57,58. These multiobjective optimization methods are known as a posteriori methods because preferred solutions are selected after the optimization. We chose an a posteriori method for this exploratory study because we sought to identify a range of Pareto-optimal outcomes rather than a single optimal point. We selected the qEHVI algorithm because previously reported benchmarks show that it often resolves the Pareto front in fewer experiments than other algorithms55.

The qEHVI algorithm directed our self-driving laboratory to quantify the trade-off between film conductivity and annealing temperature (Fig. 3). We manually selected eight synthesis conditions spanning most of the design space to provide initialization data for the qEHVI algorithm (Supplementary Table 1). After these initial experiments, Ada executed more than 50 iterative qEHVI-guided experiments to map the Pareto front of annealing temperature and conductivity. We performed this autonomous optimization campaign in quadruplicate. Each replicate generated a Pareto front showing a clear trade-off between temperature and conductivity (Fig. 3b).

The synthesis conditions tested during the optimization are shown in Fig. 3a; the conditions that created materials on the Pareto front are highlighted. The data revealed that the optimal precursors typically were those with concentrations near 6 mg mL⁻¹, fuel-to-oxidizer ratios below 1, and fuel blends consisting primarily of acetylacetone. Notably, our experiments did not reveal a single optimal synthesis condition. The conditions required to obtain the maximum conductivity depended in part on the annealing temperature. Specifically, conductive films created below 200 °C required precursors with predominantly acetylacetone fuel. At higher temperatures, however, glycine-rich fuel blends also yielded samples on the Pareto front. The data show how the fuel-to-oxidizer ratio could vary widely for fuels rich in acetylacetone yet still yield films on the Pareto front. The Pareto-optimal samples resulting from glycine-rich fuel blends, however, did not exhibit a wide range of fuel-to-oxidizer values. These observations highlight the richness of the data generated by the self-driving laboratory.

Quantification of algorithm performance

We used computer simulations to quantify the benefit of the qEHVI algorithm relative to random search (an open-loop sampling technique that does not use feedback from the experiment to determine which experiment to do next). These simulations were performed by running both the random and qEHVI sampling techniques on a response surface fit to the experimental data (see “Methods” section). Scenarios with and without experimental noise were simulated by adding synthetic noise to the response surface as appropriate (see “Models of the experimental response surface and noise” in “Methods” section). The hypervolume (i.e., the area under the Pareto front) was used to measure the progress of the optimization (Fig. 4a). We used the acceleration factor (Fig. 4b) and the enhancement factor (Fig. 4c) to compare closed-loop sampling with qEHVI to open-loop sampling with random search (see “Methods” section)59. In a noise-free scenario, qEHVI required fewer than 100 samples to outperform 10,000 random samples (Fig. 4a). The performance of the qEHVI algorithm degraded in the presence of simulated experimental noise (see “Methods” section) but still exceeded the performance of random search. This performance decrease due to noise emphasizes the importance of minimizing experimental noise when developing a self-driving experiment. Benchmarks comparing other closed-loop and open-loop sampling techniques yielded similar results (see Supplementary Fig. 11). These findings highlight how self-driving laboratories can effectively search large materials design spaces without requiring extremely high throughput.

Fig. 4: Quantification of the benefit provided by the qEHVI algorithm in simulated optimization campaigns.

a The hypervolumes achieved by the simulated qEHVI and random searches. The median (solid line) and interquartile range (shaded bands) of the results are shown for simulations with and without simulated experimental noise. b The acceleration factors for the qEHVI algorithm relative to random search. The geometric mean is also shown (dashed line). c The enhancement factor for the qEHVI algorithm relative to random search.

Translation of discovery to a scalable manufacturing process

The practical application of combustion synthesis would require the deposition of uniform films over large areas that are inaccessible to drop casting. On this basis, we set out to combine palladium combustion synthesis with ultrasonic spray coating60. We sprayed precursors directly onto a preheated glass substrate60 (Fig. 5a, see “Methods” section). The precursors decomposed in less than five minutes to yield reflective, conductive palladium films (Fig. 5b). An XRF map of the films (Fig. 5c) showed improved homogeneity relative to the drop-cast films (Fig. 2b).

Fig. 5: Low-temperature spray combustion synthesis of homogeneous palladium films over large areas.

a Spray coating apparatus. An ultrasonic nozzle attached to an overhead XYZ stage (not visible) is used to spray palladium combustion synthesis precursors onto glass substrates placed on a hot plate with an aluminum fixture. The precursors decompose to yield palladium films. Scale bar is 3 cm. b Photograph of a typical resulting film on a 3″ × 1″ glass substrate. Sharp edges were produced by masking the substrate using Kapton tape. c XRF map of the sample pictured in b. To aid visualization, the photograph in panel b has been flipped horizontally. Scale bars in b, c are 1 cm.

We performed additional spray coating experiments to verify that the trends observed in the autonomous optimization translate to spray coating (i.e., that conductive palladium films can be obtained below 200 °C and that the film conductivity increases with temperature). Specifically, we spray coated palladium films using three recipes from the autonomously identified Pareto front, with temperatures of 191, 200, and 226 °C (Fig. 6 and Supplementary Table 2). Triplicate samples were spray coated using each recipe. All three recipes yielded films approximately 50–60 nm thick, as measured by XRF microscopy (see “Methods” section). All the films were relatively uniform, with spatial variations in the film thickness less than 5% of the mean within an 8 mm × 20 mm region at the center of each sample (see “Methods” section and Supplementary Table 3). Spatial variations in the conductivity within the same region were measured using the robot and were less than 18% of the mean for all samples (see “Methods” section and Supplementary Table 3). The film conductivity of the lowest-temperature recipe (T = 191 °C) was 1.1 × 10⁵ S m⁻¹, which is approximately 1% of the bulk conductivity of palladium61. The film conductivity increased by more than an order of magnitude when the spray coating temperature was increased by 35 °C: the highest-temperature recipe tested (T = 226 °C) yielded palladium films with a conductivity of 2.0 × 10⁶ S m⁻¹, which is comparable to the conductivities of sputtered palladium films reported in the literature (2.0–5.8 × 10⁶ S m⁻¹; Fig. 6)62,63,64. These findings create new opportunities to deposit palladium films without vacuum onto large-area substrates, including an expanded range of temperature-sensitive polymers (e.g., Nafion41, polyethersulfone42, and heat-stabilized polyethylene naphthalate42). One application of this deposition process could be the fabrication of large, supported palladium membranes for more cost-effective electrocatalytic palladium membrane reactors65.

Fig. 6: Comparison between the conductivity of the spray-coated palladium films and sputtered films.

The conductivity values for sputtered films62,63,64 and bulk palladium61 are from previous literature. The spray coating recipes are taken directly from the Pareto front and are given in Supplementary Table 2. For the spray combustion data (see also Supplementary Table 3), each point shows the conductivity of one of the three replicate samples for each recipe. The bars show the average conductivity across all three replicates for each recipe.

Discussion

Here, we mapped out a Pareto front between film processing temperature and conductivity using a self-driving laboratory guided by the qEHVI multiobjective optimization algorithm. This trade-off is just one example of the conflicting objectives routinely faced by materials scientists to which our method could be applied. Our approach eliminates the need for the researcher to specify preferences between competing objectives in advance of the experiment, and it produces a richer, more valuable data set. In this case, the temperature–conductivity Pareto front is more useful than optimizing conductivity for a fixed temperature limit because processing temperature limits vary depending on the application. Our self-driving laboratory also identified synthesis conditions that translated to a scalable spray-coating method for depositing high-quality, high-conductivity palladium films at temperatures above 190 °C. This work shows how self-driving laboratories can potentially accelerate the translation of materials to industry, where satisfying multiple objectives is essential.

Methods

Materials

MeCN (CAS 75-05-8; high-performance liquid chromatography (HPLC) grade, ≥99.9% purity), glycine (CAS 56-40-6; ACS reagent grade, >98.5% purity) and acetylacetone (CAS 123-54-6; ≥99% purity) were purchased from Sigma-Aldrich. Urea (CAS 57-13-6; ultra-pure; heavy metal content 0.01 ppm) was purchased from Schwarz/Mann. Palladium(II) nitrate hydrate (Pd(NO₃)₂·H₂O; Pd ~40% m/m; 99.9% Pd purity; CAS 10102-05-3) was purchased from Strem Chemicals, Inc. All chemicals were used as received without further purification.

Manual preparation of stock solutions

The self-driving laboratory is provided with starting materials in the form of stock solutions, which are prepared manually and then placed in capped 2 mL HPLC vials in a tray where they can be accessed by the robots. All solutions were prepared at a concentration of 12 mg mL⁻¹. The Pd(NO₃)₂·H₂O solution was prepared using MeCN as a solvent, while all other solutions were prepared using deionized H₂O.

Preparation of glass substrates and other consumables

In addition to stock solutions, the self-driving laboratory uses consumable glass substrates (75 mm × 25 mm × 1 mm microscope slides; VWR catalog no. 16004-430), 2 mL HPLC vials (Canadian Life Science), and 200 µL pipette tips (Biotix, M-0200-BC). These are placed in appropriate racks and trays for access by the robotics.

The HPLC vials and pipettes were used as received, whereas the microscope slides were cleaned by sequential sonication in detergent, deionized water, acetone, and isopropanol for 10 min each8. Wells of 18 mm diameter were then created on the microscope slides using a sprayed enamel coating (DEM-KOTE enamel finish) and circular masks placed at the center of each slide (Supplementary Fig. 1). The wells serve to confine the precursor solution before it dries.

Self-driving laboratory

The self-driving laboratory consists of a precision 4-axis laboratory robot (N9, North Robotics) coupled with a 6-axis collaborative robot (UR5e, Universal Robots). The 4-axis robot is equipped to perform entire thin film deposition and characterization workflows and is described in our previous work8. The 6-axis robot enables samples to be transferred to a variety of additional modules, including the XRF microscope used here. Both robots are equipped with vacuum-based tools for substrate handling. All robots and instruments were controlled by a PC with software written in Python.

Overview of autonomous robotic workflow

The majority of operations in the autonomous robotic workflow are performed by the 4-axis laboratory robot. Samples are transported between the 4-axis robot and the XRF microscope by the 6-axis robot.

The 4-axis robot prepared each sample by combining stock solutions to form a precursor mixture, drop casting this precursor onto a glass slide, and then annealing the sample in a forced convection oven (Supplementary Fig. 1). The samples were characterized by white light photography before and after annealing, X-ray fluorescence microscopy, and 4-point-probe conductance measurements. The resulting data were then automatically analyzed using a custom data pipeline implemented in Python. Finally, the result of the experiment was fed to a Bayesian optimizer, which used an expected hypervolume improvement acquisition function to select the next experiment to be performed. Each of these steps is described in further detail below.

Autonomous workflow step 1: mix precursors

The 4-axis robot formulated each precursor by pipetting varying volumes of the stock solutions described above into a clean 2 mL HPLC vial. Gravimetric feedback from an analytical balance (ZSA120, Scientech) was used to minimize and record pipetting errors. The precursor was mixed by repeated aspiration and dispensing.

Autonomous workflow step 2: drop cast precursor

The 4-axis robot used a vacuum-based substrate handling tool to place a clean glass slide onto a tray. This robot then created a thin film sample by using a pipette to drop cast 98 µL of the precursor into a predefined well on the slide. The solution was ejected from the pipette at a rate of 5 µL s⁻¹ from a height of approximately 1.5 mm above the top surface of the substrate.

Autonomous workflow step 3: image precursor droplet

The 4-axis robot acquired visible-light photographs of each sample before annealing. This robot positioned samples 90 mm below a camera (FLIR Blackfly S USB3; BFS-U3-120S4C-CS) using a Sony 12.00 MP CMOS sensor (IMX226) and an Edmund Optics 25 mm C Series Fixed Focal Length Imaging Lens (#59–871). The C-mount lens was connected to the CS-mount camera using a Thorlabs CS- to C-Mount Extension Adapter, 1.00″-32 Threaded, 5 mm Length (CML05). The sample was illuminated from the direction of the camera using a MIC-209 3-W ring light. For imaging, the lens was opened to f/1.4, and black flocking paper (Thorlabs BFP1) was placed 10 cm behind the sample.

Autonomous workflow step 4: annealing

After drop casting, the 4-axis robot used the substrate handling tool to transport the precursor-coated slide into a purpose-built miniature convection oven for annealing at a variable temperature between 180 and 280 °C. The most important features of the oven are a low-thermal-mass construction (lightweight aluminum frame with glass-fiber insulation) and internal and external fans. These features enable rapid heating and cooling of the sample (Supplementary Fig. 12). A pneumatically actuated lid enables robotic access to the sample. The oven employs a ceramic heating element (P/N 3559K23, McMaster Carr) controlled by a PID temperature controller (P/N CN7523, Omega Engineering). A type-K thermocouple located in the oven air space provides temperature feedback to the controller. In the experiments performed here, the sample was inserted into the oven, which was then ramped at 40 °C per minute to the temperature set point; this set point was held for 450 s. Upon completion of the hold, the oven lid was opened and a cooling fan was turned on to blow ambient-temperature air through the oven and over the sample. The sample was removed from the oven after the temperature dropped below 60 °C. The oven was further cooled to below 40 °C prior to loading of the next sample.

Autonomous workflow step 5: XRF imaging and data analysis

The self-driving laboratory acquired hyperspectral X-ray fluorescence (XRF) images of each sample using a Bruker M4 TORNADO X-ray fluorescence microscope equipped with a customized sample fixture. Samples were transported to the XRF microscope by the UR5e 6-axis robotic arm equipped with a vacuum-based substrate handling tool similar to the one used by the 4-axis N9 robot. A dedicated exchange tray accessible to both robots enabled samples to be passed from one robot to the other.

The XRF microscope has a rhodium X-ray source operated at 50 kV/600 µA/30 W and polycapillary X-ray optics yielding a 25 µm spot size on the sample. The instrument employs twin 30 mm² silicon drift detectors and achieves an energy resolution of 10 eV. Hyperspectral images were taken over a 20 mm × 20 mm area at a resolution of 125 × 125 pixels. The XRF spectra obtained (reported in counts) were scaled by the integration time (50 ms) and the energy resolution (10 eV) to yield units of counts s⁻¹ eV⁻¹.

To quantify the relative amount of palladium in the film, the palladium Lα X-ray fluorescence line (2.837 keV) was integrated from 2.6 to 3.2 keV. The resulting counts were converted to film thickness estimates by applying a calibration factor obtained using reference samples (see below). Ninety-seven points of interest were defined within the XRF hypermap of each sample, as shown in Supplementary Fig. 3. For each point of interest, the average XRF counts per second were calculated over a 3 mm × 3 mm area.
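For illustration, this window integration is a few lines of NumPy. The sketch below assumes each pixel's spectrum is stored on a keV energy axis with 10 eV channels; the function and argument names are ours, not from the published pipeline.

```python
import numpy as np

def pd_l_alpha_rate(energy_kev, spectrum, bin_width_ev=10.0):
    """Integrate the Pd Lα window (2.6-3.2 keV) of one XRF spectrum.

    energy_kev: 1-D energy axis in keV; spectrum: counts s^-1 eV^-1 per channel.
    Returns the integrated count rate (counts s^-1).
    """
    window = (energy_kev >= 2.6) & (energy_kev <= 3.2)
    # Rectangle rule over the 10 eV channels inside the window
    return np.sum(spectrum[window]) * bin_width_ev
```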

Autonomous workflow step 6: image annealed film

The self-driving laboratory acquired visible light photographs (as described in step 3) of each sample after annealing.

Autonomous workflow step 7: film conductivity measurement

After hyperspectral XRF imaging, the sample was returned by the UR5e robot to the N9 robot for film conductance measurements. Four-point probe conductance measurements were performed with a Keithley Series K2636B System Source Meter instrument connected to a Signatone four-point probe head (part number SP4-40045TBN; 0.040-inch tip spacing, 45 g pressure, and tungsten carbide tips with 0.010-inch radii) by a Signatone triax to BNC feedthrough panel (part number TXBA-M160-M). The source current was stepped from 0 to 1 mA in 0.2 mA steps. After each current step, the source meter was stabilized for 0.1 s and the voltage across the inner probes was then averaged for three cycles of the 60 Hz power line (i.e., for 0.05 s) and recorded. Conductance measurements were made on the same 97 points of interest as analyzed in the XRF data, as defined in Supplementary Fig. 3.

The film conductivity was calculated using a custom data analysis pipeline implemented in Python using the open-source Luigi framework66. This pipeline combined conductance data and XRF data to estimate the film conductivity at each of the 97 points of interest on the sample.

For each set of current–voltage measurements at each position on each sample, the RANSAC robust linear fitting algorithm67 was used to extract the conductance (dI/dV). The voltage compliance limit of the K2636B was set to 10 V; voltage measurements greater than 10 V were therefore considered to have saturated the Source Meter instrument and were automatically discarded by the data analysis pipeline.

The conductivity of the thin films was then calculated by combining the 4-point-probe conductance data with the film thicknesses estimated by XRF:

$$\sigma =\frac{\ln 2}{\pi }\times \frac{\mathrm{d}I}{\mathrm{d}V}\times {t}^{-1},$$
(1)

where dI/dV is the conductance from the 4-point-probe measurement, t is the film thickness estimated from the XRF measurements, and σ is the conductivity.
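A minimal sketch of this position-wise estimate, using scikit-learn's RANSACRegressor for the robust fit and Eq. (1) for the conversion, is shown below; the function and array names are illustrative, not the published code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

def conductivity_from_iv(current_a, voltage_v, thickness_m):
    """Estimate sigma (S/m) at one probe position from a 4-point-probe I-V sweep (Eq. 1)."""
    # Discard points that hit the 10 V compliance limit of the source meter
    ok = voltage_v < 10.0
    # Robust linear fit of I(V); the slope is the conductance dI/dV
    ransac = RANSACRegressor(LinearRegression())
    ransac.fit(voltage_v[ok].reshape(-1, 1), current_a[ok])
    didv = ransac.estimator_.coef_[0]
    return (np.log(2) / np.pi) * didv / thickness_m
```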

Due to the poor morphology of the drop-cast films, a robust conductivity estimation scheme was employed. First, conductance data was excluded for any measurement positions with zero conductance. Next, outliers were excluded from the remaining conductance data using a kernel density exclusion method (see below). Outliers were also excluded from the XRF film thickness estimates using the same exclusion method. Conductivities were calculated for each position on the sample for which neither conductance nor XRF data was excluded. The mean of these conductivities was returned to the optimizer (see below). In cases where all points were discarded, a mean conductivity of 0 was reported.

The outlier kernel density exclusion method was performed by calculating Gaussian kernel density estimates for the conductance and XRF data, normalizing the density between 0 and 1, and rejecting data points with a normalized kernel density below 0.3. Bandwidths of 5 × 10⁻³ μΩ⁻¹ m⁻¹ and 5 × 10³ cps were used for the conductance and XRF data, respectively.
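A sketch of this exclusion rule is given below using scikit-learn's KernelDensity, which accepts an absolute bandwidth; it reflects our reading of the procedure rather than the original implementation.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def kde_inliers(values, bandwidth, threshold=0.3):
    """Return a boolean mask of points whose normalized Gaussian KDE density >= threshold."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(x)
    density = np.exp(kde.score_samples(x))  # score_samples returns the log-density
    density /= density.max()                # normalize the density to [0, 1]
    return density >= threshold

# e.g., mask = kde_inliers(conductances, bandwidth=5e-3) per the bandwidths quoted above
```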

Autonomous workflow step 8: algorithmic experiment planning

The experiment parameters for each optimization experiment performed on the autonomous laboratory were determined by the qEHVI55 multiobjective Bayesian optimization algorithm. In brief, this algorithm proposes experiments expected to increase the area underneath the Pareto front by the largest amount. More formally, the algorithm proposes a batch of q experiments (q = 1 here, but q could be increased to exploit parallelized experimentation), which are collectively expected to increase the hypervolume between the Pareto front and a reference point by the largest amount. The hypervolume is a generalization of volume to an arbitrary number of dimensions; this generalization supports optimization with more than two objectives. The reference point, which must be specified prior to the optimization, sets a minimum value of interest for each objective.

The algorithm involves two major conceptual steps: modeling the objectives from data and proposing the next experiment. In the configuration used here, each objective is assumed to be independent and is modeled with an independent Gaussian process (see Supplementary Figs. 5 and 6). Based on the models for each objective, the expectation value of the hypervolume improvement associated with any candidate experiment can be computed; the candidate experiment with the largest expected hypervolume improvement is selected. We ran the qEHVI algorithm using the implementation available in the open-source BoTorch Bayesian optimization library68,69. We used a temperature reference point at the upper limit of the experiment (280 °C) so that any outcome with a processing temperature below this upper limit would be targeted. We used a dynamic conductivity reference point set to 5% of the running observed maximum conductivity. This dynamic reference point ensured that the optimization would identify Pareto-optimal outcomes over a wide range of conductivity values and did not require prior knowledge of the scale of the conductivity values expected. We used heteroskedastic Gaussian processes to model both the conductivity and the temperature68. We assigned each conductivity point an uncertainty equal to 20% of its value, which is comparable to the repeatability of the experiment (Supplementary Fig. 13). Zero uncertainty was assigned to the temperature values, which were manipulated rather than responding variables and were therefore trivial to model.
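For concreteness, the sketch below shows one qEHVI planning step using the BoTorch API (BoTorch ≥ 0.7 assumed). It simplifies the published setup: a SingleTaskGP with placeholder data stands in for the heteroskedastic models, and the bounds and reference-point handling are illustrative.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from gpytorch.mlls import ExactMarginalLogLikelihood
from botorch.acquisition.multi_objective.monte_carlo import qExpectedHypervolumeImprovement
from botorch.utils.multi_objective.box_decompositions.non_dominated import NondominatedPartitioning
from botorch.optim import optimize_acqf

# Placeholder history of 20 completed experiments: 4 normalized inputs and
# 2 objectives to maximize (conductivity, negated annealing temperature).
X = torch.rand(20, 4, dtype=torch.double)
Y = torch.stack(
    [X[:, 0] + 0.1 * torch.rand(20, dtype=torch.double),  # stand-in "conductivity"
     -(180.0 + 100.0 * X[:, 3])],                          # negated temperature (deg C)
    dim=-1,
)

model = SingleTaskGP(X, Y)  # the study used heteroskedastic GPs; SingleTaskGP keeps the sketch short
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))

# Reference point: 5% of the running maximum conductivity and the negated
# 280 deg C upper temperature limit, mirroring the configuration described above.
ref_point = torch.tensor([0.05 * Y[:, 0].max().item(), -280.0], dtype=torch.double)

acqf = qExpectedHypervolumeImprovement(
    model=model,
    ref_point=ref_point.tolist(),
    partitioning=NondominatedPartitioning(ref_point=ref_point, Y=Y),
)
bounds = torch.stack([torch.zeros(4, dtype=torch.double), torch.ones(4, dtype=torch.double)])
candidate, _ = optimize_acqf(acqf, bounds=bounds, q=1, num_restarts=10, raw_samples=256)
print(candidate)  # the next (fuel blend, ratio, concentration, temperature) to run
```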

Calibration of XRF signal against reference samples

To enable palladium film thickness to be estimated from the XRF signal, a calibration procedure was performed on sputtered palladium reference samples having four different nominal thicknesses (10, 50, 100, and 250 nm). These samples were characterized by profilometry and XRF. A linear relationship between the film thickness and the XRF counts was observed (see Supplementary Fig. 2). This relationship was used to estimate the thickness of each sample from the XRF data.
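The calibration reduces to a one-dimensional linear fit. A sketch with placeholder count rates (the measured values appear in Supplementary Fig. 2 and are not reproduced here) is:

```python
import numpy as np

# Nominal reference thicknesses (nm) and their XRF count rates (placeholder values)
ref_thickness_nm = np.array([10.0, 50.0, 100.0, 250.0])
ref_counts = np.array([1.1e3, 5.6e3, 1.10e4, 2.80e4])  # illustrative only

slope, intercept = np.polyfit(ref_counts, ref_thickness_nm, deg=1)

def thickness_from_xrf(counts):
    """Estimate film thickness (nm) from an XRF count rate via the linear calibration."""
    return slope * counts + intercept
```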

The reference samples were sputtered onto clean glass microscope slides (see cleaning procedure above) using a Univex 250 sputter deposition system with a DC magnetron source at 100 W and an argon working pressure of 5 × 10⁻⁶ bar. The deposition chamber base pressure was 5 × 10⁻⁹ bar. Films were deposited after 1 min of pre-sputtering. The substrate holder rotated at 10 rpm. Nominal film thickness was monitored using a quartz crystal microbalance mounted in the sputter chamber. A 2-inch diameter palladium sputter target was used (99.99%, ACI Alloys). Step edges for profilometry were obtained by placing strips of Kapton™ tape onto the substrates prior to sputtering and removing these after sputtering. The substrates were rinsed with acetone and IPA to remove any Kapton™ tape residue prior to performing profilometry.

Profilometry was performed on the reference samples using a Bruker DektakXT stylus profilometer. XRF was performed on the reference samples using the same settings used for the drop-casted samples during the optimization campaigns (see above).

Deposition and characterization of spray-coated samples

The spray coater was built from an ultrasonic nozzle (Microspray, USA) mounted to a custom motorized XYZ gantry system (Zaber Technologies Inc., Canada) above a hot plate (PC-420D, Corning, USA). Precursor ink was fed to the nozzle by a syringe pump (Cavro Centris pump, PN 30098790-B; Tecan Trading AG, Switzerland). The ultrasonic spray nozzle was operated at 3 W and 120 kHz. For each recipe, a total of 700 µL of precursor was sprayed onto a glass substrate (75 mm × 25 mm × 1 mm microscope slides; VWR catalog no. 16004-430) placed on a custom aluminum fixture mounted to the hot plate. Approximate substrate temperatures were measured using a thermocouple attached to a glass substrate with thermal cement. This instrumented substrate was placed at a position on the hot plate fixture symmetrical to the position where the substrates to be coated were placed. The hot plate power was adjusted until the steady-state temperature of the instrumented substrate was within 4 °C of the desired temperature before spray coating each of the recipes reported here. To achieve consistent thermal contact between the substrates and the hot plate fixture, both the instrumented substrate and the substrate to be coated were affixed to the hot plate with thermal paste (TG-7, Thermaltake Technology Co., Taiwan). The spray coater nozzle speed was 5.1 mm s⁻¹, the nozzle-to-substrate distance was 15 mm, the spray flow rate was 2 µL s⁻¹, and the carrier gas flow rate was 7 L min⁻¹. When spraying, the nozzle moved in a serpentine pattern consisting of twelve 50 mm lines with 25 mm spacing (see an illustration of the pattern in Supplementary Fig. 14). The coating on each sample was produced by repeating this spray pattern three times with no delay between passes. After spray coating, the samples were left to anneal on the hot plate for 5 min.
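For reference, the serpentine pattern is straightforward to reproduce; the sketch below generates the XY waypoints under the assumption that the origin sits at one corner of the pattern.

```python
def serpentine_waypoints(n_lines=12, line_length=50.0, pitch=25.0):
    """XY waypoints (mm) for a serpentine of n_lines lines separated by pitch."""
    points = []
    for i in range(n_lines):
        y = i * pitch
        # Alternate the travel direction on every other line
        x_start, x_end = (0.0, line_length) if i % 2 == 0 else (line_length, 0.0)
        points.extend([(x_start, y), (x_end, y)])
    return points

# The coating was built up by running the pattern three times back to back
path = serpentine_waypoints() * 3
```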

The spray-coated palladium films were characterized at 26 locations on a 2 × 13 grid within an 8 mm × 20 mm region of interest at the center of the film (see Supplementary Fig. 14). The amount of palladium at each location was measured using the XRF microscope and converted to a film thickness estimate by applying the same calibration method used for the drop-cast films. The film conductance at each location was measured using the 4-point probe system on the robot described above. The film conductivity was calculated at each of the 26 measurement locations by combining the 4-point probe conductance and XRF film thickness values for that location using Eq. (1). The mean and standard deviations of the 26 resulting thickness and conductivity values are reported for each film in Supplementary Table 3.

Computer simulations of optimization algorithm performance

Computer simulations were used to study the performance of the qEHVI algorithm for optimizing the combustion synthesis experiments. A model of the experimental response surface was built from the experimental data. Experimental optimizations were then simulated by sampling the model using grid search, random search, Sobol sampling, and the qEHVI and qParEGO algorithms. Optimization performance was quantified with and without simulated experimental noise. The performance of qEHVI relative to random sampling was quantified using the acceleration factor (AF) and enhancement factor (EF) metrics59. These simulation procedures are described in more detail below.

Models of the experimental response surface and noise

Gaussian process regression was used to create a model of the experimental response surface using the combined data from all four optimization campaigns. This model predicts the experimental outputs (i.e., annealing temperature and conductivity) from the experimental inputs (i.e., fuel-to-oxidizer ratio, fuel blend, total concentration, and annealing temperature). Our model is composed of two separate Gaussian processes, as implemented by the scikit-learn Python package67. Each Gaussian process is regressed on a single experimental output (i.e., either temperature or conductivity) and all four of the experimental inputs. The kernels used for the conductivity (kcond) and temperature (ktemp) models are:

$${k}_{\mathrm{cond}}\left(x,{x}^{\prime}\right)={k}_{\mathrm{lin}}\left(x,{x}^{\prime}\right)\times {k}_{\mathrm{SE}}\left(x,{x}^{\prime}\right)+{k}_{\mathrm{noise}}\left(x,{x}^{\prime}\right),$$
(2)
$${k}_{\mathrm{temp}}\left(x,{x}^{\prime}\right)={k}_{\mathrm{lin}}{\left(x,{x}^{\prime}\right)}^{*}\times {k}_{\mathrm{SE}}{\left(x,{x}^{\prime}\right)}^{*},$$
(3)
$${k}_{\mathrm{noise}}\left(x,{x}^{\prime}\right)=\begin{cases}\mathrm{noise}, & x={x}^{\prime}\\ 0, & \mathrm{otherwise},\end{cases}$$
(4)

where klin is a constant kernel, kSE is a squared exponential kernel, and knoise is a white noise kernel; * indicates that the lengthscale of the kernel is fixed to 1.

The four types of input data and two types of output data were each normalized prior to training the model. To make the optimization strictly a maximization problem, the temperature values (which must be minimized) were multiplied by negative one. The leave-one-out cross-validation (LOOCV) residuals (Supplementary Figs. 4, 9, and 10 and Supplementary Table 4) are comparable to the measured experimental uncertainties (Supplementary Table 1). We also plotted the LOOCV residuals as a function of each input (Supplementary Fig. 6), each modeled output (Supplementary Fig. 7), and sampling order (Supplementary Fig. 8) and observed that the distribution of the residuals was largely random.
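A sketch of this two-model setup in scikit-learn, following Eqs. (2)-(4) with placeholder data, is shown below; hyperparameter bounds are left at their defaults, which is an assumption on our part.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF, WhiteKernel

# Placeholder normalized data: 4 inputs, 2 outputs (temperature negated for maximization)
rng = np.random.default_rng(0)
X = rng.random((50, 4))
y_cond, y_temp = rng.random(50), -X[:, 3]

k_cond = ConstantKernel() * RBF() + WhiteKernel()            # Eq. (2)
k_temp = ConstantKernel() * RBF(length_scale=1.0,
                                length_scale_bounds="fixed")  # Eq. (3), lengthscale fixed to 1

gp_cond = GaussianProcessRegressor(kernel=k_cond).fit(X, y_cond)
gp_temp = GaussianProcessRegressor(kernel=k_temp).fit(X, y_temp)

mean, std = gp_cond.predict(X[:1], return_std=True)  # posterior used by the simulations
```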

For the simulations with no noise, the model posterior means were used directly to represent the experiment. For simulations with experimental noise, noisy conductivity values were simulated by randomly sampling a modified Maxwell–Boltzmann distribution. First, the Maxwell–Boltzmann distribution was flipped across the y-axis by negating the x term. Second, the mean of this flipped Maxwell–Boltzmann distribution was set to the noiseless model posterior mean. Finally, the variance of the Maxwell–Boltzmann distribution was set equal to the noise level (or variance) of the white noise kernel. We chose to employ Maxwell–Boltzmann noise to model the experimental noise because of the tendency of drop-cast samples to exhibit a wide range of downward deviations in the apparent conductivity due to the poor sample morphology.

The Maxwell–Boltzmann probability density function is

$${P}_{\mathrm{B}}\left(x\right)=\sqrt{\frac{2}{\pi }}\,\frac{{x}^{2}\exp \left(\frac{-{x}^{2}}{2{a}^{2}}\right)}{{a}^{3}},$$
(5)

where a is the distribution parameter. Since we set the variance of the Maxwell–Boltzmann distribution (\({\sigma }_{\mathrm{B}}^{2}\)) equal to the noise level of the white noise kernel (\({\sigma }_{\mathrm{noise}}^{2}\)),

$${\sigma }_{\mathrm{noise}}^{2}={\sigma }_{\mathrm{B}}^{2}=\frac{{a}^{2}(3\pi -8)}{\pi }.$$
(6)

We computed the distribution parameter for the Maxwell–Boltzmann-distributed simulated experimental noise as:

$$a=\sqrt{\frac{\pi {\sigma }_{\mathrm{noise}}^{2}}{3\pi -8}}.$$
(7)

If subtracting the Maxwell–Boltzmann noise from the posterior mean resulted in a value less than zero, the noisy model value was set to zero.
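The noise model can be summarized in a short function; the sketch below uses scipy.stats.maxwell and follows Eqs. (5)-(7), with the clipping rule applied at the end.

```python
import numpy as np
from scipy.stats import maxwell

def noisy_conductivity(posterior_mean, sigma_noise, rng=None):
    """Simulate one noisy conductivity value per Eqs. (5)-(7)."""
    a = np.sqrt(np.pi * sigma_noise**2 / (3.0 * np.pi - 8.0))  # Eq. (7)
    m = maxwell.rvs(scale=a, random_state=rng)                 # draw from Maxwell-Boltzmann(a)
    mb_mean = 2.0 * a * np.sqrt(2.0 / np.pi)                   # analytic mean of the distribution
    # Flip across the y-axis and re-center so the noisy values average to the posterior mean
    value = posterior_mean + (mb_mean - m)
    return max(value, 0.0)                                     # negative conductivities set to zero
```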

Sampling strategies

To compare the performance of closed-loop and open-loop approaches, several sampling strategies (grid search, random search, Sobol sampling, and the qEHVI and qParEGO algorithms) were used to sample both the noise-free and noisy experimental models. Random sampling was performed by generating samples from a uniform distribution across the entire normalized input space. Sobol sampling was performed with a scrambling technique such that each Sobol sequence is unique, yielding a statistically meaningful distribution of optimizations70. qEHVI and qParEGO were initialized with ten scrambled Sobol points. The qEHVI algorithm was configured as it was for the physical experiments detailed above. The qParEGO algorithm was configured using the default settings, except for the reference point, which was configured in the same way as for qEHVI. The complete benchmarking results are shown in Supplementary Fig. 11.
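As an illustration, scrambled Sobol points can be drawn with SciPy's qmc module; the original implementation is not specified here, so this choice of library is an assumption.

```python
from scipy.stats import qmc

# One scrambled Sobol sequence over the 4-D normalized input space;
# a fresh seed per replicate makes each sequence unique.
sobol = qmc.Sobol(d=4, scramble=True, seed=42)
X_init = sobol.random(10)  # the ten points used to initialize qEHVI and qParEGO
```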

Simulated optimization campaigns

The performance of each sampling strategy (grid, random, Sobol, qParEGO, and qEHVI) was determined both with and without experimental noise. Each simulated optimization campaign was performed for 1000 replicates and 100 experimental iterations, except for random search, which was run for 100,000 iterations for use in the acceleration-factor calculations (shown in Fig. 4b). If an optimization algorithm produced an error during optimization, then that replicate was removed and repeated.

Pareto front and hypervolume

The Pareto front is defined by the set of samples for which no other sample simultaneously improves all the objectives. When assessing the performance of a simulation, the hypervolume was computed using the least desirable value of each objective function (the nadir objective vector) as the reference point (i.e., zero conductivity and a temperature of 280 °C). This reference point was held constant for evaluating all of the simulated optimization campaigns. We assessed the performance of the noisy optimizations such that the only difference between the noisy and noise-free optimizations was the information provided to the optimization algorithm. To perform this assessment, we followed the procedure described by Bakshy and coworkers58, wherein the hypervolumes for optimizations using the noisy experimental model were calculated from the equivalent points of the noiseless model. Since each objective of the model is normalized to the range [0, 1], the hypervolume of the model is in the range [0, 1]. For each simulated optimization campaign, the normalized hypervolume was calculated at each iteration. When calculating acceleration and enhancement factors, each of the 1000 simulated qEHVI campaigns was compared to each of the 1000 simulated random campaigns, resulting in 1,000,000 comparisons.
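For two objectives, the nondominated set and the nadir-referenced hypervolume can be computed with a simple sweep; the sketch below (maximization convention, our function names) mirrors the metric used to score the simulations, whereas higher-dimensional problems would call for a library implementation.

```python
import numpy as np

def pareto_mask(Y):
    """Boolean mask of nondominated rows of Y (both objectives maximized)."""
    mask = np.ones(len(Y), dtype=bool)
    for i in range(len(Y)):
        dominated = np.any(np.all(Y >= Y[i], axis=1) & np.any(Y > Y[i], axis=1))
        mask[i] = not dominated
    return mask

def hypervolume_2d(Y, ref):
    """Area dominated by the Pareto front of Y relative to the reference point ref."""
    front = Y[pareto_mask(Y)]
    front = front[np.argsort(-front[:, 0])]  # sort by first objective, descending
    xs = np.append(front[:, 0], ref[0])
    # Sum the strips between consecutive x values at the best y seen so far
    return float(np.sum((xs[:-1] - xs[1:]) * (front[:, 1] - ref[1])))
```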

Calculation of acceleration and enhancement factor

The acceleration factor quantifies how much faster one sampling technique is than another (Eq. 8). For example, if sampling technique B requires 40 samples to reach the performance attained by technique A after 20 samples, the acceleration factor of A relative to B at 20 samples is 2.

$$\mathrm{AF}_{\mathrm{A}:\mathrm{B}}\left({n}_{\mathrm{a}}\right)=\frac{{n}_{\mathrm{b}}}{{n}_{\mathrm{a}}},$$
$$\mathrm{s.t.}\ {P}_{\mathrm{B}}\left({n}_{\mathrm{b}}\right)\ge {P}_{\mathrm{A}}\left({n}_{\mathrm{a}}\right),\ \min {n}_{\mathrm{b}},$$
(8)

where \(\mathrm{AF}_{\mathrm{A}:\mathrm{B}}({n}_{\mathrm{a}})\) is the acceleration factor of technique A with respect to B at \({n}_{\mathrm{a}}\) samples, and \({P}_{i}(n)\) is the performance of technique i at n samples. Note that it is possible for sampling technique A to outperform B such that there exists no value of \({n}_{\mathrm{b}}\) where \({P}_{\mathrm{B}}\ge {P}_{\mathrm{A}}\). In these cases, more samples with technique B are required to make the comparison; otherwise, \(\mathrm{AF}_{\mathrm{A}:\mathrm{B}}\) is not calculable. If \(\mathrm{AF}_{\mathrm{A}:\mathrm{B}}\) is not calculable, then a lower-bound acceleration factor is calculated by assuming that the slower sampling technique would beat the faster sampling technique if it observed one more sample. The acceleration factor in Fig. 4b is reported up to the point where these lower-bound estimates composed more than 25% of all of the acceleration comparisons.

The enhancement factor of one sampling technique with respect to another for a given number of samples is defined as the ratio of their performance values for the same number of observations (Eq. 9). For example, if sampling technique A reaches a performance value of 7 after 20 samples, and technique B reaches a performance value of 2 after 20 samples, the enhancement factor of technique A is 3.5 at 20 samples.

$$\mathrm{EF}_{\mathrm{A}:\mathrm{B}}\left(n\right)=\frac{{P}_{\mathrm{A}}(n)}{{P}_{\mathrm{B}}(n)},$$
(9)

where \(\mathrm{EF}_{\mathrm{A}:\mathrm{B}}(n)\) is the enhancement factor of sampling technique A with respect to B after n samples. When \({P}_{\mathrm{A}}(n)=0\) and \({P}_{\mathrm{B}}(n)=0\), then \(\mathrm{EF}_{\mathrm{A}:\mathrm{B}}(n)=1\). When \({P}_{\mathrm{A}}(n) > 0\) and \({P}_{\mathrm{B}}(n)=0\), \(\mathrm{EF}_{\mathrm{A}:\mathrm{B}}(n)\) is not calculable. To compare the AF and EF from the repeated simulations, the median, geometric mean, and interquartile range were calculated.
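A direct transcription of Eqs. (8) and (9) into code, with the edge cases above, might look like the following sketch; the best-so-far performance traces passed in as arrays are an assumed input format.

```python
import numpy as np

def acceleration_factor(perf_a, perf_b, n_a):
    """AF of technique A over B at n_a samples (Eq. 8).

    perf_a, perf_b: best-so-far performance traces indexed by sample count (1-based).
    """
    hits = np.nonzero(perf_b >= perf_a[n_a - 1])[0]  # smallest n_b with P_B(n_b) >= P_A(n_a)
    if hits.size == 0:
        return None  # not calculable; a lower bound assumes B succeeds with one more sample
    return (hits[0] + 1) / n_a

def enhancement_factor(perf_a, perf_b, n):
    """EF of technique A over B at n samples (Eq. 9)."""
    pa, pb = perf_a[n - 1], perf_b[n - 1]
    if pa == 0 and pb == 0:
        return 1.0
    if pb == 0:
        return None  # not calculable
    return pa / pb
```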