Abstract:In this work, we explore the potential of self-supervised learning from unlabeled electron microscopy datasets, taking a step toward building a foundation model in this field. We show how self-supervised pretraining facilitates efficient fine-tuning for a spectrum of downstream tasks, including semantic segmentation, denoising, noise & background removal, and super-resolution. Experimentation with varying model complexities and receptive field sizes reveals the remarkable phenomenon that fine-tuned models of lower complexity consistently outperform more complex models with random weight initialization. We demonstrate the versatility of self-supervised pretraining across various downstream tasks in the context of electron microscopy, allowing faster convergence and better performance. We conclude that self-supervised pretraining serves as a powerful catalyst, being especially advantageous when limited annotated data are available and efficient scaling of computational cost is important.
Abstract:Detecting and analyzing various defect types in semiconductor materials is an important prerequisite for understanding the underlying mechanisms as well as tailoring the production processes. Analysis of microscopy images that reveal defects typically requires image analysis tasks such as segmentation and object detection. With the steadily increasing amount of data produced by experiments, handling these tasks manually becomes increasingly infeasible. In this work, we combine various image analysis and data mining techniques to create a robust, accurate, and automated image analysis pipeline. This allows for extracting the type and position of all defects in a microscopy image of a KOH-etched 4H-SiC wafer that was stitched together from approximately 40,000 individual images.
Abstract:Crystalline materials, such as metals and semiconductors, nearly always contain a special defect type called dislocation. This defect decisively determines many important material properties, e.g., strength, fracture toughness, or ductility. Over the past years, significant effort has been put into understanding dislocation behavior across different length scales via experimental characterization techniques and simulations. This paper introduces the dislocation ontology (DISO), which defines the concepts and relationships related to linear defects in crystalline materials. We developed DISO using a top-down approach in which we start by defining the most general concepts in the dislocation domain and subsequently specialize them. DISO is published through a persistent URL following W3C best practices for publishing Linked Data. Two potential use cases for DISO are presented to illustrate its usefulness in the dislocation dynamics domain. The ontology is evaluated in two directions: its success in modeling a real-world domain and its richness.
Abstract:In recent years, there has been a growing interest in accelerated materials innovation in both research and industry. However, to truly add value to the development of new advanced materials, it is essential to take manufacturing processes into account and thereby tailor materials design approaches to support downstream process design approaches. As a major step in this direction, we present a holistic optimization approach that covers the entire materials process-structure-property chain. Our approach specifically employs machine learning techniques to address two critical identification problems. The first is to solve a materials design problem, which involves identifying near-optimal material structures that exhibit desired macroscopic properties. The second is to solve a process design problem, which is to find an optimal processing path to manufacture these material structures. Both identification problems are typically ill-posed, which presents a significant challenge for solution approaches. However, the non-unique nature of these problems also offers an important advantage for processing: By having several target structures that perform similarly well, the corresponding processes can be efficiently guided towards manufacturing the best reachable structure. In particular, we apply deep reinforcement learning for process design in combination with a multi-task learning-based optimization approach for materials design. The functionality of the approach is demonstrated by using it to manufacture crystallographic textures with desired properties in a metal forming process.
Abstract:Research in the field of Materials Science and Engineering focuses on the design, synthesis, properties, and performance of materials. Crystalline materials, including metals and semiconductors, are an important and widely investigated class of materials. Crystalline materials typically contain a distinct type of defect called a "dislocation". This defect significantly affects various material properties, including strength, fracture toughness, and ductility. Researchers have devoted significant effort in recent years to understanding dislocation behavior through experimental characterization techniques and simulations, e.g., dislocation dynamics simulations. This paper presents how data from dislocation dynamics simulations can be modeled using semantic web technologies through annotating data with ontologies. We extend the already existing Dislocation Ontology by adding missing concepts and aligning it with two other domain-related ontologies (i.e., the Elementary Multi-perspective Material Ontology and the Materials Design Ontology), allowing for representing the dislocation simulation data efficiently. Moreover, we show a real-world use case by representing the discrete dislocation dynamics data as a knowledge graph (DisLocKG) that illustrates the relationship between them. We also developed a SPARQL endpoint that brings extensive flexibility to query DisLocKG.
Abstract:In this study, Cu-Cr composites were studied by nanoindentation. Arrays of indents were placed over large areas of the samples resulting in datasets consisting of several hundred measurements of Young's modulus and hardness at varying indentation depths. The unsupervised learning technique, Gaussian mixture model, was employed to analyze the data, which helped to determine the number of "mechanical phases" and the respective mechanical properties. Additionally, a cross-validation approach was introduced to infer whether the data quantity was adequate and to suggest the amount of data required for reliable predictions -- one of the often encountered but difficult to resolve issues in machine learning of materials science problems.
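The kind of analysis described above can be illustrated with a minimal sketch: a one-dimensional Gaussian mixture fitted by expectation-maximization, followed by a learning curve that checks whether the held-out log-likelihood stops improving as more data are used. Everything here is an assumption for illustration: the function names, the synthetic "hardness" values, and the use of a single hold-out split as a simplified stand-in for cross-validation. The abstract does not specify the actual procedure, and in practice a library implementation such as scikit-learn's `GaussianMixture` would likely be preferred.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, k, n_iter=100):
    """Fit a 1-D Gaussian mixture with k components by expectation-maximization."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic initialization: means at evenly spaced quantiles.
    means = [xs[int((c + 0.5) * n / k)] for c in range(k)]
    spread = (xs[-1] - xs[0]) / (2 * k) or 1.0
    variances = [spread ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pi / total for pi in p])
        # M-step: update weights, means, and variances (variance floored for stability).
        for c in range(k):
            nc = sum(r[c] for r in resp) + 1e-12
            weights[c] = nc / n
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nc
            variances[c] = max(var, 1e-6)
    return weights, means, variances

def log_likelihood(data, model):
    """Total log-likelihood of data under a fitted mixture."""
    weights, means, variances = model
    total = 0.0
    for x in data:
        p = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total

def learning_curve(data, k, sizes, n_test=60):
    """Mean held-out log-likelihood per point for growing training-set sizes."""
    test, pool = data[:n_test], data[n_test:]
    return [(s, log_likelihood(test, fit_gmm_1d(pool[:s], k)) / len(test)) for s in sizes]

# Synthetic, made-up values standing in for two "mechanical phases".
rng = random.Random(0)
hardness = [rng.gauss(2.0, 0.2) for _ in range(150)] + [rng.gauss(5.0, 0.3) for _ in range(150)]
rng.shuffle(hardness)

weights, means, variances = fit_gmm_1d(hardness, k=2)
curve = learning_curve(hardness, k=2, sizes=[20, 60, 120, 240])
```

If the scores in `curve` flatten out as the training size grows, that would suggest, under these assumptions, that additional measurements add little to the fitted model.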
Abstract:Quantitative Transmission Electron Microscopy (TEM) during in-situ straining experiments can reveal the motion of dislocations -- linear defects in the crystal lattice of metals. In the domain of materials science, knowledge about the location and movement of dislocations is important for creating novel materials with superior properties. A long-standing problem, however, is to identify the position and extract the shape of dislocations, which would ultimately help to create a digital twin of such materials. In this work, we quantitatively compare state-of-the-art instance segmentation methods, including Mask R-CNN and YOLOv8. The dislocation masks resulting from the instance segmentation are converted to mathematical lines, enabling quantitative analysis of dislocation length and geometry -- important information for the domain scientist, which we then propose to include as a novel length-aware quality metric for estimating the network performance. Our segmentation pipeline shows a high accuracy suitable for further domain-specific post-processing. Additionally, our physics-based metric turns out to perform much more consistently than typically used pixel-wise metrics.
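To make the length-aware idea concrete, the following minimal sketch computes the length of a dislocation represented as an ordered polyline and compares a predicted line with its ground truth. The function names and the relative-error definition are hypothetical; the abstract does not state the formula of the proposed metric.

```python
import math

def polyline_length(points):
    """Sum of Euclidean segment lengths along an ordered list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def relative_length_error(pred_points, true_points):
    """Hypothetical length-aware score: |L_pred - L_true| / L_true (0 means equal length)."""
    true_len = polyline_length(true_points)
    if true_len == 0:
        raise ValueError("ground-truth polyline has zero length")
    return abs(polyline_length(pred_points) - true_len) / true_len

# Illustrative coordinates only (in pixels).
true_line = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
pred_line = [(0.0, 0.0), (3.0, 4.0)]
error = relative_length_error(pred_line, true_line)
```

Unlike a pixel-wise overlap score, a measure of this kind does not depend on how thick the predicted mask is, only on the extracted line.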
Abstract:Determining, understanding, and predicting the so-called structure-property relation is an important task in many scientific disciplines, such as chemistry, biology, meteorology, physics, engineering, and materials science. Structure refers to the spatial distribution of, e.g., substances, material, or matter in general, while property is a resulting characteristic that usually depends in a non-trivial way on spatial details of the structure. Traditionally, forward simulation models have been used for such tasks. Recently, several machine learning algorithms have been applied in these scientific fields to enhance and accelerate simulation models or as surrogate models. In this work, we develop and investigate the applications of six machine learning techniques based on two different datasets from the domain of materials science: data from a two-dimensional Ising model for predicting the formation of magnetic domains and data representing the evolution of dual-phase microstructures from the Cahn-Hilliard model. We analyze the accuracy and robustness of all models and elucidate the reasons for the differences in their performances. The impact of including domain knowledge through tailored features is studied, and general recommendations based on the availability and quality of training data are derived from this.
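For readers unfamiliar with the two-dimensional Ising model mentioned above, here is a minimal sketch of its energy function and of a single-spin-flip Metropolis sweep, one common way to sample spin configurations. The coupling `J = 1`, the absence of an external field, the periodic boundaries, and the choice of Metropolis sampling are all assumptions; the abstract does not say how the dataset was generated.

```python
import math
import random

def ising_energy(spins, J=1.0):
    """Energy E = -J * sum over nearest-neighbour pairs of s_i * s_j (periodic boundaries)."""
    n, m = len(spins), len(spins[0])
    energy = 0.0
    for i in range(n):
        for j in range(m):
            # Count each bond once by visiting only the right and lower neighbours.
            energy -= J * spins[i][j] * (spins[i][(j + 1) % m] + spins[(i + 1) % n][j])
    return energy

def metropolis_sweep(spins, temperature, rng, J=1.0):
    """One sweep of single-spin-flip Metropolis updates, applied in place."""
    n, m = len(spins), len(spins[0])
    for _ in range(n * m):
        i, j = rng.randrange(n), rng.randrange(m)
        neighbours = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                      + spins[i][(j - 1) % m] + spins[i][(j + 1) % m])
        delta_e = 2.0 * J * spins[i][j] * neighbours  # energy change if spin (i, j) flips
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i][j] = -spins[i][j]
    return spins

# Start from a fully aligned 4x4 lattice and run one sweep at a high temperature.
lattice = [[1] * 4 for _ in range(4)]
ground_state_energy = ising_energy(lattice)
metropolis_sweep(lattice, temperature=5.0, rng=random.Random(0))
```

Repeating such sweeps at different temperatures would yield configurations with larger or smaller magnetic domains, which is the kind of structure a surrogate model could be trained on.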
Abstract:Crystalline defects, such as line-like dislocations, play an important role in the performance and reliability of many metallic devices. Their interaction and evolution still pose a multitude of open questions to materials science and materials physics. In-situ TEM experiments can provide important insights into how dislocations behave and move. During such experiments, the dislocation microstructure is captured in the form of videos. The analysis of individual video frames can provide useful insights but is limited by the capabilities of automated identification, digitization, and quantitative extraction of the dislocations as curved objects. The vast amount of data also makes manual annotation very time-consuming, thereby limiting the use of Deep Learning-based, automated image analysis and segmentation of the dislocation microstructure. In this work, a parametric model for generating synthetic training data for segmentation of dislocations is developed. Even though domain scientists might sometimes dismiss synthetic training images as too artificial, our findings show that they can result in superior performance, particularly regarding the generalization of the Deep Learning models with respect to different microstructures and imaging conditions. Additionally, we propose an enhanced deep learning method optimized for segmenting overlapping or intersecting dislocation lines. Upon testing this framework on four distinct real datasets, we find that our synthetic training data are able to yield high-quality results on real images as well, even more so when the models are fine-tuned on a few real images.
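As a rough illustration of what parametric generation of synthetic segmentation masks can look like, the sketch below draws random curved lines as quadratic Bezier curves into a binary mask. This is a hypothetical toy generator, not the parametric model of the paper: the curve type, the function names, and all parameters are assumptions, and realistic TEM contrast, noise, and background are left out entirely.

```python
import random

def bezier_point(p0, p1, p2, t):
    """Point on a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    return (u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0],
            u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1])

def synthetic_mask(size, n_lines, rng, samples=200):
    """Binary size-by-size mask (list of rows) with n_lines random curved lines set to 1."""
    mask = [[0] * size for _ in range(size)]
    for _ in range(n_lines):
        # Three random control points define one curved line inside the image.
        p0, p1, p2 = [(rng.uniform(0, size - 1), rng.uniform(0, size - 1)) for _ in range(3)]
        for s in range(samples + 1):
            x, y = bezier_point(p0, p1, p2, s / samples)
            mask[round(y)][round(x)] = 1  # the curve stays within the control points' hull
    return mask

mask = synthetic_mask(size=32, n_lines=3, rng=random.Random(1))
```

Because each mask is generated from known curves, the ground-truth labels come for free, which is the main appeal of synthetic training data when manual annotation is expensive.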
Abstract:Atomistic simulations of the molecular dynamics/statics kind are regularly used to study small scale plasticity. Contemporary simulations are performed with tens to hundreds of millions of atoms, with snapshots of these configurations written out at regular intervals for further analysis. Continuum scale constitutive models for material behavior can benefit from information on the atomic scale, in particular in terms of the deformation mechanisms, the accommodation of the total strain, and the partitioning of stress and strain fields in individual grains. In this work, we develop a methodology using statistical data mining and machine learning algorithms to automate the analysis of continuum field variables in atomistic simulations. We focus on three important field variables: total strain, elastic strain, and microrotation. Our results show that the elastic strain in individual grains exhibits a unimodal log-normal distribution, whilst the total strain and microrotation fields exhibit multimodal distributions. The peaks in the distribution of total strain are identified with a Gaussian mixture model, and methods to circumvent overfitting problems are presented. Subsequently, we evaluate the identified peaks in terms of deformation mechanisms in a grain, which, e.g., helps to quantify the strain for which individual deformation mechanisms are responsible. The overall statistics of the distributions over all grains are an important input for higher scale models and ultimately enable a quantitative discussion of the implications for information transfer to phenomenological models.