Abstract: Anomaly detection in manufacturing pipelines remains a critical challenge, intensified by the complexity and variability of industrial environments. This paper introduces AssemAI, an interpretable image-based anomaly detection system tailored for smart manufacturing pipelines. Our primary contributions include the creation of a tailored image dataset and the development of a custom object detection model, YOLO-FF, designed explicitly for anomaly detection in manufacturing assembly environments. Using a preprocessed image dataset derived from an industry-focused rocket assembly pipeline, we address the challenge of imbalanced image data and demonstrate the importance of image-based methods in anomaly detection. The proposed approach leverages domain knowledge in data preparation, model development, and reasoning. We compare our method against several baselines, including a simple CNN and a custom Vision Transformer (ViT) model, showcasing the effectiveness of our custom data preparation and pretrained CNN integration. Additionally, we incorporate explainability techniques at both the user and model levels, using an ontology for user-friendly explanations and Score-CAM for in-depth feature and model analysis. Finally, the model is deployed in a real-time setting. Our results include ablation studies on the baselines, providing a comprehensive evaluation of the proposed system. This work highlights the broader impact of advanced image-based anomaly detection in enhancing the reliability and efficiency of smart manufacturing processes.
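To make the detect-then-classify workflow described above concrete, the following is a minimal illustrative sketch, not the AssemAI implementation: it assumes a YOLO-style detector loaded through the ultralytics package and a pretrained torchvision ResNet-18 fine-tuned as a binary anomaly classifier. The weight file name "yolo_ff.pt", the image path, and the two-class labels are hypothetical placeholders.

```python
# Illustrative detect-then-classify anomaly sketch (not the paper's code).
# Assumes ultralytics YOLO for detection and a torchvision CNN for classification;
# "yolo_ff.pt" and the class labels are hypothetical placeholders.
import torch
from torchvision import models, transforms
from ultralytics import YOLO
from PIL import Image

detector = YOLO("yolo_ff.pt")  # hypothetical YOLO-FF-style detector weights
classifier = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
classifier.fc = torch.nn.Linear(classifier.fc.in_features, 2)  # normal vs. anomaly
classifier.eval()  # assumes the head was fine-tuned on the assembly dataset

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def classify_detections(image_path: str) -> list[str]:
    """Detect assembly regions, crop them, and label each crop."""
    image = Image.open(image_path).convert("RGB")
    results = detector(image)  # run object detection on the full frame
    labels = []
    for box in results[0].boxes.xyxy.tolist():  # [x1, y1, x2, y2] per detection
        crop = image.crop(tuple(int(v) for v in box))
        x = preprocess(crop).unsqueeze(0)
        with torch.no_grad():
            logits = classifier(x)
        labels.append("anomaly" if logits.argmax(1).item() == 1 else "normal")
    return labels
```

A post-hoc explanation method such as Score-CAM would then be applied to the classifier's final convolutional layer to highlight which image regions drove each anomaly prediction.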
Abstract: As people become more aware of their food choices, food computation models have become increasingly popular in assisting people in maintaining healthy eating habits. For example, food recommendation systems analyze recipe instructions to assess nutritional content and provide recipe recommendations. The recent and remarkable successes of generative AI methods, such as auto-regressive large language models, can lead to robust methods for a more comprehensive understanding of recipes for healthy food recommendations beyond surface-level nutrition content assessments. In this study, we explore the use of generative AI methods to extend current food computation models, which primarily analyze nutrition and ingredients, to also incorporate cooking actions (e.g., add salt, fry the meat, boil the vegetables). Cooking actions are notoriously hard to model using statistical learning methods due to irregular data patterns: significantly varying natural language descriptions of the same action (e.g., marinate the meat vs. marinate the meat and leave overnight) and infrequently occurring patterns (e.g., add salt occurs far more frequently than marinate the meat). The prototypical approach to handling irregular data patterns is to increase the volume of data that the model ingests by orders of magnitude. Unfortunately, in the cooking domain, these problems are further compounded by larger data volumes, presenting a unique challenge that is not easily handled by simply scaling up. In this work, we propose novel aggregation-based generative AI methods, Cook-Gen, that reliably generate cooking actions from recipes despite these irregular data patterns, while also outperforming large language models and other strong baselines.
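As a rough illustration of generating cooking actions from recipe text and then aggregating varying surface forms, here is a hedged sketch, not the Cook-Gen method itself: it prompts an off-the-shelf causal language model via Hugging Face transformers and collapses raw phrases onto canonical action labels with a toy lookup table. The model name, prompt format, and CANONICAL_ACTIONS map are illustrative assumptions.

```python
# Illustrative sketch: generate cooking actions with a causal LM, then aggregate
# surface variants into canonical actions. The model name, prompt, and the
# CANONICAL_ACTIONS map are assumptions, not the Cook-Gen implementation.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; any autoregressive LM serves for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Toy aggregation map: varying descriptions collapse onto one canonical action.
CANONICAL_ACTIONS = {
    "marinate the meat": "marinate",
    "marinate the meat and leave overnight": "marinate",
    "add salt": "add",
    "boil the vegetables": "boil",
}

def generate_actions(recipe_step: str, max_new_tokens: int = 40) -> list[str]:
    """Prompt the LM to list cooking actions, then normalize each phrase."""
    prompt = f"Recipe step: {recipe_step}\nCooking actions:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs,
                                max_new_tokens=max_new_tokens,
                                do_sample=False,
                                pad_token_id=tokenizer.eos_token_id)
    generated = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
                                 skip_special_tokens=True)
    actions = [a.strip().lower() for a in generated.split(",") if a.strip()]
    # Aggregation step: map each raw phrase to its canonical action if known.
    return [CANONICAL_ACTIONS.get(a, a) for a in actions]

print(generate_actions("Marinate the meat and leave overnight, then fry."))
```

The aggregation step is what addresses the irregular-pattern problem in this sketch: many low-frequency surface forms are pooled into a smaller set of well-supported canonical actions before any downstream modeling.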