The University of Tokyo
Abstract:In this study, we propose Explainable Multimodal Machine Learning (EMML), which integrates the analysis of diverse data types (multimodal data) using factor analysis for feature extraction with Explainable AI (XAI), for carbon nanotube (CNT) fibers prepared from aqueous dispersions. This method is a powerful approach to elucidate the mechanisms governing material properties, where multi-stage fabrication conditions and multiscale structures have complex influences. Thus, in our case, this approach helps us understand how different processing steps and structures at various scales impact the final properties of CNT fibers. The analysis targeted structures ranging from the nanoscale to the macroscale, including aggregation size distributions of CNT dispersions and the effective length of CNTs. Furthermore, because some types of data were difficult to interpret using standard methods, challenging-to-interpret distribution data were analyzed using Negative Matrix Factorization (NMF) for extracting key features that determine the outcome. Contribution analysis with SHapley Additive exPlanations (SHAP) demonstrated that small, uniformly distributed aggregates are crucial for improving fracture strength, while CNTs with long effective lengths are significant factors for enhancing electrical conductivity. The analysis also identified thresholds and trends for these key factors to assist in defining the conditions needed to optimize CNT fiber properties. EMML is not limited to CNT fibers but can be applied to the design of other materials derived from nanomaterials, making it a useful tool for developing a wide range of advanced materials. This approach provides a foundation for advancing data-driven materials research.
Abstract:The kernel method is a potential approach to analyzing structured data such as sequences, trees, and graphs; however, unordered trees have not been investigated extensively. Kimura et al. (2011) proposed a kernel function for unordered trees on the basis of their subpaths, which are vertical substructures of trees responsible for hierarchical information in them. Their kernel exhibits practically good performance in terms of accuracy and speed; however, linear-time computation is not guaranteed theoretically, unlike the case of the other unordered tree kernel proposed by Vishwanathan and Smola (2003). In this paper, we propose a theoretically guaranteed linear-time kernel computation algorithm that is practically fast, and we present an efficient prediction algorithm whose running time depends only on the size of the input tree. Experimental results show that the proposed algorithms are quite efficient in practice.