Abstract: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors in COCO are hindering its utility in reliably benchmarking further progress. In search of an answer, we inspect thousands of masks from COCO (2017 version) and uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks. Due to the prevalence of COCO, we choose to correct these errors to maintain continuity with prior research. We develop COCO-ReM (Refined Masks), a cleaner set of annotations with visibly better mask quality than COCO-2017. We evaluate fifty object detectors and find that models that predict visually sharper masks score higher on COCO-ReM, affirming that they were being incorrectly penalized due to errors in COCO-2017. Moreover, our models trained using COCO-ReM converge faster and score higher than their larger variants trained using COCO-2017, highlighting the importance of data quality in improving object detectors. With these findings, we advocate using COCO-ReM for future object detection research. Our dataset is available at https://cocorem.xyz
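To make the claim about penalized sharp masks concrete, the following is a minimal sketch of how an imprecise annotation boundary can lower the intersection-over-union (IoU) of an accurate prediction. It is not the COCO-ReM evaluation code: the 20x20 scene, the one-pixel overshoot, and the names `mask_iou`, `true_object`, `coarse_annotation`, `sharp_prediction`, and `loose_prediction` are all assumptions made for illustration.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection-over-union of two boolean masks of equal shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(a, b).sum() / union)

# Hypothetical 20x20 scene: the true object is a 10x10 square.
true_object = np.zeros((20, 20), dtype=bool)
true_object[5:15, 5:15] = True

# Hypothetical coarse annotation that overshoots by one pixel on every side.
coarse_annotation = np.zeros((20, 20), dtype=bool)
coarse_annotation[4:16, 4:16] = True

# A sharp prediction matches the object exactly; a loose prediction
# happens to repeat the annotation's overshoot.
sharp_prediction = true_object.copy()
loose_prediction = coarse_annotation.copy()

iou_sharp_vs_coarse = mask_iou(sharp_prediction, coarse_annotation)
iou_loose_vs_coarse = mask_iou(loose_prediction, coarse_annotation)
iou_sharp_vs_refined = mask_iou(sharp_prediction, true_object)

print(f"sharp vs coarse annotation:  {iou_sharp_vs_coarse:.3f}")
print(f"loose vs coarse annotation:  {iou_loose_vs_coarse:.3f}")
print(f"sharp vs refined annotation: {iou_sharp_vs_refined:.3f}")
```

In this toy setting the exact prediction scores lower than the loose one against the coarse annotation, and the ranking flips once the annotation boundary is corrected. Mask AP in COCO-style evaluation thresholds IoU values of this kind, so the same effect could plausibly carry over to the benchmark score.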
Abstract: Dynamical analysis of manufacturing and natural systems provides critical information about the production of manufactured and natural resources, respectively, and thus plays an important role in assessing the sustainability of these systems. However, current dynamic models for these systems are mechanistic models, whose simulation is computationally intensive and does not provide a simplified understanding of the mechanisms driving the overall dynamics. For such systems, lower-order models can enable sustainability analysis through coupled dynamical analysis. There have been few attempts at finding low-order models of manufacturing and natural systems, and existing work has focused on model development at the level of individual mechanisms. This work seeks to fill this gap in the literature by developing reduced-order dynamical models for these systems using a machine learning (ML) approach. The approach is demonstrated on an entire soybean-oil to soybean-diesel process plant and on a lake system. We use a grey-box ML method with a standard nonlinear optimization approach to identify models of the governing dynamics as ODEs, using data simulated from mechanistic models. Results show that the method identifies high-accuracy linear ODE models for the process plant, reflecting the underlying linear stoichiometric mechanisms and mass balances that drive the dynamics. For the natural system, we modify the ML approach to include the effect of past dynamics, which yields a nonlinear ODE. While the modified approach provides a better match to the dynamics of stream flow, it falls short of completely recreating them. We conclude that the proposed ML approach works well for systems with smooth dynamics, such as a manufacturing plant, but does not work as well for chaotic dynamics such as water stream flow.
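As a rough illustration of identifying a linear ODE from simulated data, the sketch below fits dx/dt = A x by ordinary least squares on finite-difference derivatives. This substitutes plain linear least squares for the grey-box method and nonlinear optimization mentioned in the abstract, and it is not the authors' implementation. The two-state system, the matrix `A_true`, and every variable name are hypothetical.

```python
import numpy as np

# Hypothetical two-state linear system dx/dt = A_true @ x, loosely styled
# as a mass balance: state 1 decays and feeds state 2, which also decays.
A_true = np.array([[-0.5, 0.0],
                   [0.5, -0.2]])

# Closed-form solution for x(0) = [1, 0], standing in for simulated data.
t = np.linspace(0.0, 10.0, 1001)
x1 = np.exp(-0.5 * t)
x2 = (5.0 / 3.0) * (np.exp(-0.2 * t) - np.exp(-0.5 * t))
X = np.column_stack([x1, x2])

# Estimate time derivatives numerically, as one would from sampled data.
dX = np.gradient(X, t, axis=0)

# Drop the end points, where np.gradient falls back to one-sided differences.
X_in, dX_in = X[1:-1], dX[1:-1]

# Solve dX ~= X @ A.T in the least-squares sense, then transpose to get A.
coef, *_ = np.linalg.lstsq(X_in, dX_in, rcond=None)
A_est = coef.T

print(np.round(A_est, 4))
```

With clean, smoothly varying data such as this, the recovered matrix should closely match `A_true`. Noisy or chaotic signals make the finite-difference derivatives far less reliable, which is consistent with the limitation the abstract reports for stream flow.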
Abstract: Machine learning has recently been used to identify the governing equations for the dynamics of physical systems. The promising results from applications to systems such as fluid dynamics and chemical kinetics inspire further investigation of these methods on complex engineered systems. The dynamics of these systems play a crucial role in design and operations, so it would be advantageous to learn about the mechanisms that may be driving them. In this work, we address the open question of whether a novel machine learning approach is applicable and useful for identifying the governing dynamical equations of engineered systems. We focus on the distillation column, a ubiquitous unit operation in chemical engineering that exhibits complex dynamics, i.e., its dynamics are a combination of heuristics and fundamental physical laws. We test the method of Sparse Identification of Non-Linear Dynamics (SINDy) because of its ability to produce white-box models with terms that can be used for physical interpretation of the dynamics. Time series data for the dynamics were generated by simulating a distillation column in ASPEN Dynamics. One promising result is the reduction in the number of equations needed for dynamic simulation, from thousands in ASPEN to only 13, one for each state variable. Prediction accuracy was high on test data from the system within the perturbation range; outside the perturbation range, however, the equations did not perform well. In terms of physical law extraction, some terms were interpretable as related to Fick's law of diffusion (with concentration terms) and Henry's law (with terms for the ratio of concentration and pressure). While some terms were interpretable, we conclude that more research is needed on combining engineering systems with machine learning approaches to improve understanding of the governing laws for unknown dynamics.
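For readers unfamiliar with SINDy, the sketch below shows its core regression step, sequentially thresholded least squares over a library of candidate terms, on a toy two-state system. It is not the distillation-column model or the authors' code. The damped oscillator, the polynomial library, the threshold of 0.05, and the names `stlsq`, `theta`, and `xi` are assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, dxdt, threshold, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                # Refit state k using only the candidate terms still active.
                xi[keep, k] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, k], rcond=None)[0]
    return xi

# Hypothetical data: a damped oscillator with known closed-form solution,
#   dx/dt = -0.1 x + 2 y,   dy/dt = -2 x - 0.1 y,   x(0) = 2, y(0) = 0.
t = np.linspace(0.0, 25.0, 2501)
x = 2.0 * np.exp(-0.1 * t) * np.cos(2.0 * t)
y = -2.0 * np.exp(-0.1 * t) * np.sin(2.0 * t)
X = np.column_stack([x, y])

# Numerical derivatives; drop end points where differences are one-sided.
dX = np.gradient(X, t, axis=0)[1:-1]
x, y = x[1:-1], y[1:-1]

# Candidate library of polynomial terms up to degree 2.
names = ["1", "x", "y", "x^2", "x*y", "y^2"]
theta = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

xi = stlsq(theta, dX, threshold=0.05)

for k, state in enumerate(["dx/dt", "dy/dt"]):
    terms = [f"{xi[j, k]:+.3f}*{names[j]}"
             for j in range(len(names)) if xi[j, k] != 0.0]
    print(state, "=", " ".join(terms))
```

Each nonzero entry of `xi` corresponds to a named library term, which is what makes the resulting model white-box. The choice of library and threshold determines which terms can appear at all, so the interpretability of a fitted model depends on those choices.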