Abstract:Anomaly detection is an important problem in computer vision, with a variety of real-life applications. Yet the current solutions to this problem have known, systematic shortcomings. Specifically, the contemporary surface anomaly detection task assumes the presence of multiple specific anomaly classes, e.g. cracks and rusting, unlike the one-class classification model of the past. However, building a deep learning model in such a setup remains a challenge because anomalies arise rarely, and hence anomaly samples are scarce. Transfer learning has been the preferred paradigm in such situations. But the typical source domains with large datasets, e.g. ImageNet, JFT-300M and LAION-2B, do not correlate well with the domain of surfaces and materials, and such correlation is an important premise of transfer learning. In this paper, we hypothesize, and show by exhaustive experimentation, that the space of anomaly-free visual patterns of the normal samples correlates well with each of the spaces of anomalous patterns of the class-specific anomaly samples. The first results of using this hypothesis in transfer learning have been quite encouraging. We expect that finding such a simple, nearby domain, which readily provides a large number of samples and often also shows interclass separability, though with narrow margins, will be a useful discovery. In particular, it is expected to improve domain adaptation and few-shot learning for anomaly detection, making in-the-wild anomaly detection realistically possible in the future.
Abstract:Generative Adversarial Networks (GANs) have been the workhorse generative models for many years, especially in computer vision research. Accordingly, there have been many significant advancements in the theory and application of GAN models, which are notoriously hard to train but produce good results when trained well. There have been many surveys on GANs, organizing the vast GAN literature from various perspectives. However, none of these surveys brings out the important chronological aspect: how the multiple challenges of employing GAN models were solved one by one over time, across multiple landmark research works. This survey intends to bridge that gap by presenting some of the landmark research works on the theory and application of GANs in chronological order.
Abstract:Anomaly detection and localization is an important vision problem with multiple applications. Effective and generic semantic segmentation of anomalous regions on different surfaces, where most anomalous regions inherently have no obvious pattern, is still under active research. Periodic health monitoring and fault (anomaly) detection in vast infrastructures, an important safety-related task, is one such application area of vision-based anomaly segmentation. However, the task is quite challenging due to large variations in surface faults, texture-less construction material and background, lighting conditions, etc. Cracks are critical and frequent surface faults that manifest as thin, elongated, extremely zigzag-shaped regions. They are among the hardest faults to detect, even with deep learning. In this work, we address an open aspect of the automatic crack segmentation problem, that of generalizing and improving segmentation performance across a variety of scenarios, by modeling the problem differently. We carefully study and abstract the sub-problems involved and solve them in a broader context, making our solution generic. On a variety of datasets related to surveillance of different infrastructures, under varying conditions, our model consistently outperforms the state-of-the-art algorithms by a significant margin, without any bells and whistles. This performance advantage carried over to two deployments of our model, tested against industry-provided datasets. Further, we could establish our model's performance in two manufacturing quality inspection scenarios as well, where the defect types are not just crack equivalents but are more numerous and varied. Hence we hope that our model is indeed a truly generic defect segmentation model.
Abstract:When drones are used for remote surveillance missions, path planning for the vehicle is mandatory, since these are pilotless vehicles. Path planning, whether offline or online, entails setting up the path as a sequence of locations in 3D Euclidean space, whose coordinates are latitude, longitude and altitude. For the specific application of remote surveillance of long linear infrastructures in non-urban terrain, the continuous 3D Euclidean shortest path (3D-ESP) problem involves two important scalar costs. The first is the distance traveled along the planned path. Since drones are battery operated, the path length between the fixed start and goal locations of a mission should be as short as possible. The second is the cost of transmitting the video acquired during the surveillance mission by a camera mounted on the drone's belly. Because the surveillance target is a long linear infrastructure, the amount of video generated is very high and generally cannot be stored on board in its entirety. If connectivity is poor along certain segments of a naive path, the transmission power of the signal is kept high to boost the video transmission rate, which in turn drains more battery energy. Hence a path is desired that simultaneously also reduces what is known as the communication cost. These two costs trade off against each other, and hence Pareto optimization is needed for this 3D bi-objective Euclidean shortest path problem. In this report, we study the mono-objective offline path planning problem based on the distance cost, while posing the communication cost as an upper-bounded constraint. The bi-objective path planning solution is sketched out towards the end.
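To make the mono-objective formulation in the last abstract concrete, the following is a minimal sketch of a shortest-path search that minimizes distance subject to an upper bound on accumulated communication cost. It is not the report's method: it assumes the continuous 3D space has already been discretized into a waypoint graph, and the names `constrained_shortest_path`, `example_graph` and `comm_budget`, as well as all edge values, are hypothetical and chosen only for illustration.

```python
import heapq

def constrained_shortest_path(graph, start, goal, comm_budget):
    """Minimize total distance subject to total comm cost <= comm_budget.

    graph maps a node to a list of (neighbor, distance, comm_cost) edges.
    Both edge costs are assumed to be non-negative.
    Returns (distance, comm_cost, path) or None if no feasible path exists.
    """
    # Each heap entry is a label: (distance so far, comm cost so far, node, path).
    heap = [(0.0, 0.0, start, (start,))]
    # Labels already expanded at each node, used for dominance pruning.
    settled = {}
    while heap:
        dist, comm, node, path = heapq.heappop(heap)
        if node == goal:
            # Labels are popped in order of distance, so this one is optimal.
            return dist, comm, list(path)
        # Skip a label that is no better in both costs than an expanded one.
        if any(d <= dist and c <= comm for d, c in settled.get(node, [])):
            continue
        settled.setdefault(node, []).append((dist, comm))
        for neighbor, edge_dist, edge_comm in graph.get(node, []):
            new_comm = comm + edge_comm
            if new_comm > comm_budget:
                continue  # would violate the communication constraint
            heapq.heappush(
                heap, (dist + edge_dist, new_comm, neighbor, path + (neighbor,))
            )
    return None

# Hypothetical waypoint graph: the short route has poor connectivity
# (high comm cost), the longer route has good connectivity.
example_graph = {
    "A": [("B", 1.0, 5.0), ("C", 2.0, 1.0)],
    "B": [("D", 1.0, 5.0)],
    "C": [("D", 2.0, 1.0)],
}

print(constrained_shortest_path(example_graph, "A", "D", comm_budget=10.0))
print(constrained_shortest_path(example_graph, "A", "D", comm_budget=4.0))
```

With a loose budget the search returns the shorter route through `B`; with a tight budget it falls back to the longer but better-connected route through `C`. Sweeping `comm_budget` over a range of values is one simple way to trace out the distance versus communication trade-off that the bi-objective version of the problem addresses.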