Abstract:Physical imaging is a foundational characterization method in areas from condensed matter physics and chemistry to astronomy and spans length scales from atomic to universe. Images encapsulate crucial data regarding atomic bonding, materials microstructures, and dynamic phenomena such as microstructural evolution and turbulence, among other phenomena. The challenge lies in effectively extracting and interpreting this information. Variational Autoencoders (VAEs) have emerged as powerful tools for identifying underlying factors of variation in image data, providing a systematic approach to distilling meaningful patterns from complex datasets. However, a significant hurdle in their application is the definition and selection of appropriate descriptors reflecting local structure. Here we introduce the scale-invariant VAE approach (SI-VAE) based on the progressive training of the VAE with the descriptors sampled at different length scales. The SI-VAE allows the discovery of the length scale dependent factors of variation in the system. Here, we illustrate this approach using the ferroelectric domain images and generalize it to the movies of the electron-beam induced phenomena in graphene and topography evolution across combinatorial libraries. This approach can further be used to initialize the decision making in automated experiments including structure-property discovery and can be applied across a broad range of imaging methods. This approach is universal and can be applied to any spatially resolved data including both experimental imaging studies and simulations, and can be particularly useful for exploration of phenomena such as turbulence, scale-invariant transformation fronts, etc.
Abstract:Scientific advancement is universally based on the dynamic interplay between theoretical insights, modelling, and experimental discoveries. However, this feedback loop is often slow, including delayed community interactions and the gradual integration of experimental data into theoretical frameworks. This challenge is particularly exacerbated in domains dealing with high-dimensional object spaces, such as molecules and complex microstructures. Hence, the integration of theory within automated and autonomous experimental setups, or theory in the loop automated experiment, is emerging as a crucial objective for accelerating scientific research. The critical aspect is not only to use theory but also on-the-fly theory updates during the experiment. Here, we introduce a method for integrating theory into the loop through Bayesian co-navigation of theoretical model space and experimentation. Our approach leverages the concurrent development of surrogate models for both simulation and experimental domains at the rates determined by latencies and costs of experiments and computation, alongside the adjustment of control parameters within theoretical models to minimize epistemic uncertainty over the experimental object spaces. This methodology facilitates the creation of digital twins of material structures, encompassing both the surrogate model of behavior that includes the correlative part and the theoretical model itself. While demonstrated here within the context of functional responses in ferroelectric materials, our approach holds promise for broader applications, the exploration of optical properties in nanoclusters, microstructure-dependent properties in complex materials, and properties of molecular systems. The analysis code that supports the funding is publicly available at https://github.com/Slautin/2024_Co-navigation/tree/main
Abstract:The rapid growth of automated and autonomous instrumentations brings forth an opportunity for the co-orchestration of multimodal tools, equipped with multiple sequential detection methods, or several characterization tools to explore identical samples. This can be exemplified by the combinatorial libraries that can be explored in multiple locations by multiple tools simultaneously, or downstream characterization in automated synthesis systems. In the co-orchestration approaches, information gained in one modality should accelerate the discovery of other modalities. Correspondingly, the orchestrating agent should select the measurement modality based on the anticipated knowledge gain and measurement cost. Here, we propose and implement a co-orchestration approach for conducting measurements with complex observables such as spectra or images. The method relies on combining dimensionality reduction by variational autoencoders with representation learning for control over the latent space structure, and integrated into iterative workflow via multi-task Gaussian Processes (GP). This approach further allows for the native incorporation of the system's physics via a probabilistic model as a mean function of the GP. We illustrated this method for different modalities of piezoresponse force microscopy and micro-Raman on combinatorial $Sm-BiFeO_3$ library. However, the proposed framework is general and can be extended to multiple measurement modalities and arbitrary dimensionality of measured signals. The analysis code that supports the funding is publicly available at https://github.com/Slautin/2024_Co-orchestration.
Abstract:The current focus in Autonomous Experimentation (AE) is on developing robust workflows to conduct the AE effectively. This entails the need for well-defined approaches to guide the AE process, including strategies for hyperparameter tuning and high-level human interventions within the workflow loop. This paper presents a comprehensive analysis of the influence of initial experimental conditions and in-loop interventions on the learning dynamics of Deep Kernel Learning (DKL) within the realm of AE in Scanning Probe Microscopy. We explore the concept of 'seed effect', where the initial experiment setup has a substantial impact on the subsequent learning trajectory. Additionally, we introduce an approach of the seed point interventions in AE allowing the operator to influence the exploration process. Using a dataset from Piezoresponse Force Microscopy (PFM) on PbTiO3 thin films, we illustrate the impact of the 'seed effect' and in-loop seed interventions on the effectiveness of DKL in predicting material properties. The study highlights the importance of initial choices and adaptive interventions in optimizing learning rates and enhancing the efficiency of automated material characterization. This work offers valuable insights into designing more robust and effective AE workflows in microscopy with potential applications across various characterization techniques. The analysis code that supports the funding is publicly available at https://github.com/Slautin/2024_Seed_effect_DKL_BO.
Abstract:Machine learning methods are progressively gaining acceptance in the electron microscopy community for de-noising, semantic segmentation, and dimensionality reduction of data post-acquisition. The introduction of the APIs by major instrument manufacturers now allows the deployment of ML workflows in microscopes, not only for data analytics but also for real-time decision-making and feedback for microscope operation. However, the number of use cases for real-time ML remains remarkably small. Here, we discuss some considerations in designing ML-based active experiments and pose that the likely strategy for the next several years will be human-in-the-loop automated experiments (hAE). In this paradigm, the ML learning agent directly controls beam position and image and spectroscopy acquisition functions, and human operator monitors experiment progression in real- and feature space of the system and tunes the policies of the ML agent to steer the experiment towards specific objectives.
Abstract:Optimization of experimental materials synthesis and characterization through active learning methods has been growing over the last decade, with examples ranging from measurements of diffraction on combinatorial alloys at synchrotrons, to searches through chemical space with automated synthesis robots for perovskites. In virtually all cases, the target property of interest for optimization is defined apriori with limited human feedback during operation. In contrast, here we present the development of a new type of human in the loop experimental workflow, via a Bayesian optimized active recommender system (BOARS), to shape targets on the fly, employing human feedback. We showcase examples of this framework applied to pre-acquired piezoresponse force spectroscopy of a ferroelectric thin film, and then implement this in real time on an atomic force microscope, where the optimization proceeds to find symmetric piezoresponse amplitude hysteresis loops. It is found that such features appear more affected by subsurface defects than the local domain structure. This work shows the utility of human-augmented machine learning approaches for curiosity-driven exploration of systems across experimental domains. The analysis reported here is summarized in Colab Notebook for the purpose of tutorial and application to other data: https://github.com/arpanbiswas52/varTBO
Abstract:Electron, optical, and scanning probe microscopy methods are generating ever increasing volume of image data containing information on atomic and mesoscale structures and functionalities. This necessitates the development of the machine learning methods for discovery of physical and chemical phenomena from the data, such as manifestations of symmetry breaking in electron and scanning tunneling microscopy images, variability of the nanoparticles. Variational autoencoders (VAEs) are emerging as a powerful paradigm for the unsupervised data analysis, allowing to disentangle the factors of variability and discover optimal parsimonious representation. Here, we summarize recent developments in VAEs, covering the basic principles and intuition behind the VAEs. The invariant VAEs are introduced as an approach to accommodate scale and translation invariances present in imaging data and separate known factors of variations from the ones to be discovered. We further describe the opportunities enabled by the control over VAE architecture, including conditional, semi-supervised, and joint VAEs. Several case studies of VAE applications for toy models and experimental data sets in Scanning Transmission Electron Microscopy are discussed, emphasizing the deep connection between VAE and basic physical principles. All the codes used here are available at https://github.com/saimani5/VAE-tutorials and this article can be used as an application guide when applying these to own data sets.
Abstract:Through digital imaging, microscopy has evolved from primarily being a means for visual observation of life at the micro- and nano-scale, to a quantitative tool with ever-increasing resolution and throughput. Artificial intelligence, deep neural networks, and machine learning are all niche terms describing computational methods that have gained a pivotal role in microscopy-based research over the past decade. This Roadmap is written collectively by prominent researchers and encompasses selected aspects of how machine learning is applied to microscopy image data, with the aim of gaining scientific knowledge by improved image quality, automated detection, segmentation, classification and tracking of objects, and efficient merging of information from multiple imaging modalities. We aim to give the reader an overview of the key developments and an understanding of possibilities and limitations of machine learning for microscopy. It will be of interest to a wide cross-disciplinary audience in the physical sciences and life sciences.
Abstract:We pose that microscopy offers an ideal real-world experimental environment for the development and deployment of active Bayesian and reinforcement learning methods. Indeed, the tremendous progress achieved by machine learning (ML) and artificial intelligence over the last decade has been largely achieved via the utilization of static data sets, from the paradigmatic MNIST to the bespoke corpora of text and image data used to train large models such as GPT3, DALLE and others. However, it is now recognized that continuous, minute improvements to state-of-the-art do not necessarily translate to advances in real-world applications. We argue that a promising pathway for the development of ML methods is via the route of domain-specific deployable algorithms in areas such as electron and scanning probe microscopy and chemical imaging. This will benefit both fundamental physical studies and serve as a test bed for more complex autonomous systems such as robotics and manufacturing. Favorable environment characteristics of scanning and electron microscopy include low risk, extensive availability of domain-specific priors and rewards, relatively small effects of exogeneous variables, and often the presence of both upstream first principles as well as downstream learnable physical models for both statics and dynamics. Recent developments in programmable interfaces, edge computing, and access to APIs facilitating microscope control, all render the deployment of ML codes on operational microscopes straightforward. We discuss these considerations and hope that these arguments will lead to creating a novel set of development targets for the ML community by accelerating both real-world ML applications and scientific progress.
Abstract:Recent progress in machine learning methods, and the emerging availability of programmable interfaces for scanning probe microscopes (SPMs), have propelled automated and autonomous microscopies to the forefront of attention of the scientific community. However, enabling automated microscopy requires the development of task-specific machine learning methods, understanding the interplay between physics discovery and machine learning, and fully defined discovery workflows. This, in turn, requires balancing the physical intuition and prior knowledge of the domain scientist with rewards that define experimental goals and machine learning algorithms that can translate these to specific experimental protocols. Here, we discuss the basic principles of Bayesian active learning and illustrate its applications for SPM. We progress from the Gaussian Process as a simple data-driven method and Bayesian inference for physical models as an extension of physics-based functional fits to more complex deep kernel learning methods, structured Gaussian Processes, and hypothesis learning. These frameworks allow for the use of prior data, the discovery of specific functionalities as encoded in spectral data, and exploration of physical laws manifesting during the experiment. The discussed framework can be universally applied to all techniques combining imaging and spectroscopy, SPM methods, nanoindentation, electron microscopy and spectroscopy, and chemical imaging methods, and can be particularly impactful for destructive or irreversible measurements.