Abstract:Object re-identification (ReID) in large camera networks faces many challenges. First, similar object appearances degrade ReID performance, a problem that existing appearance-based ReID methods cannot address. Second, most ReID studies are performed in laboratory settings and do not consider ReID problems in real-world scenarios. To overcome these challenges, we introduce a novel ReID framework that leverages a spatial-temporal fusion network and causal identity matching (CIM). The framework estimates the camera network topology using the proposed adaptive Parzen window and combines appearance features with spatial-temporal cues within the fusion network. It performed strongly across several datasets, including VeRi776, Vehicle-3I, and Market-1501, achieving up to 99.70% rank-1 accuracy and 95.5% mAP. Furthermore, the proposed CIM approach, which dynamically assigns gallery sets based on the camera network topology, further improved ReID accuracy and robustness in real-world settings, as evidenced by a 94.95% mAP and a 95.19% F1 score on the Vehicle-3I dataset. The experimental results support the effectiveness of incorporating spatial-temporal information and CIM for real-world ReID scenarios regardless of the data domain (e.g., vehicle, person).
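The idea of assigning a gallery set from the camera network topology can be illustrated with a minimal sketch. It is not the CIM procedure itself, whose details the abstract does not give. The function name `select_gallery`, the record fields (`id`, `cam`, `time`), and the representation of the topology as a per-link transit-time window are all assumptions made for illustration.

```python
def select_gallery(query, gallery, topology):
    """Keep only gallery entries that could plausibly precede the query.

    topology maps (from_cam, to_cam) to a (min_transit_s, max_transit_s)
    window; this representation is a hypothetical simplification.
    """
    selected = []
    for item in gallery:
        window = topology.get((item["cam"], query["cam"]))
        if window is None:
            continue  # no known link between the two cameras
        elapsed = query["time"] - item["time"]
        if window[0] <= elapsed <= window[1]:
            selected.append(item)
    return selected


# Hypothetical data: one link from camera A to camera B, 20 to 60 seconds.
topology = {("A", "B"): (20.0, 60.0)}
query = {"id": "q", "cam": "B", "time": 100.0}
gallery = [
    {"id": 1, "cam": "A", "time": 60.0},  # 40 s earlier, inside the window
    {"id": 2, "cam": "A", "time": 95.0},  # 5 s earlier, too recent
    {"id": 3, "cam": "C", "time": 70.0},  # camera C has no link to B
]
candidates = select_gallery(query, gallery, topology)
```

Restricting the gallery this way shrinks the set of candidates that appearance matching has to rank, which is one plausible reason a topology-aware gallery would help when many objects look alike.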
Abstract:The flatfish is a major farmed species consumed globally in large quantities. However, due to the densely populated farming environment, flatfish are susceptible to injuries and diseases, making early disease detection crucial. Traditionally, diseases were detected through visual inspection, but observing large numbers of fish is challenging. Automated approaches based on deep learning have been widely used to address this problem, but accurate detection remains difficult due to the diversity of the fish and the lack of fish disease datasets. This study augments fish disease images using generative adversarial networks and image harmonization methods. Next, disease detectors are trained separately for three body parts (head, fins, and body) to address individual diseases properly. In addition, a flatfish disease image dataset called \texttt{FlatIMG} is created, and the proposed methods are verified on it. A flash salmon disease dataset is also tested to validate the generalizability of the proposed methods. The proposed methods achieved 12\% higher performance than the baseline framework. This study is the first attempt to create a large-scale flatfish disease image dataset and propose an effective disease detection framework. Automatic disease monitoring could be achieved in farming environments based on the proposed methods and dataset.
Abstract:Vehicle re-identification (ReID) in a large-scale camera network is important for public safety, traffic control, and security. However, due to the appearance ambiguities of vehicles, previous appearance-based ReID methods often fail to track vehicles across multiple cameras. To overcome this challenge, we propose a spatial-temporal vehicle ReID framework that estimates a reliable camera network topology based on the adaptive Parzen window method and optimally combines the appearance and spatial-temporal similarities through the fusion network. With the proposed methods, we achieved superior performance on the public VeRi776 dataset, with 99.64% rank-1 accuracy. The experimental results support that utilizing spatial and temporal information for ReID can improve the accuracy of appearance-based methods and effectively deal with appearance ambiguities.
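A Parzen window (kernel density) estimate of the transition times between two cameras can be sketched as follows. This is a minimal illustration with a Gaussian kernel, not the adaptive method proposed in the paper: the bandwidth here comes from the standard normal reference rule, used only as a stand-in, and the function names and sample values are hypothetical.

```python
import math


def normal_reference_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * std * n^(-1/5).

    A stand-in only; the paper's adaptive rule is not reproduced here.
    """
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def parzen_density(samples, t, bandwidth):
    """Gaussian Parzen-window estimate of the density at transition time t."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in samples)
    return total / norm


# Hypothetical transition times (seconds) observed between two cameras.
transit_times = [28.0, 30.0, 31.0, 33.0, 35.0]
h = normal_reference_bandwidth(transit_times)
likely = parzen_density(transit_times, 31.0, h)    # near the observed times
unlikely = parzen_density(transit_times, 90.0, h)  # far from any observation
```

A density like this can serve as a spatial-temporal similarity: a candidate match whose time gap falls in a high-density region is more plausible than one that falls far outside the observed transition times.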
Abstract:In this paper, we propose a novel metric, wIoU, for evaluating the performance of semantic segmentation. In recent years, many studies have tried to train pixel-level classifiers on large-scale image datasets to perform accurate semantic segmentation. The goal of semantic segmentation is to assign a class label to each pixel in the scene. It has various potential applications in computer vision, such as object detection, classification, and scene understanding. To validate the proposed wIoU metric, we evaluated state-of-the-art methods on public benchmark datasets (e.g., KITTI) using wIoU and compared the results with those of conventional evaluation metrics.
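For context, the conventional intersection-over-union (IoU) metric against which such a proposal is compared can be computed as in the sketch below. The weighting used by wIoU is not specified in the abstract, so it is not reproduced here. Function names and the toy labels are illustrative assumptions.

```python
def per_class_iou(pred, target, num_classes):
    """Return {class: IoU} over flat sequences of per-pixel labels.

    Classes absent from both prediction and ground truth are skipped.
    """
    ious = {}
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious[c] = inter / union
    return ious


def mean_iou(pred, target, num_classes):
    """Unweighted mean of the per-class IoU values (0.0 if none exist)."""
    ious = per_class_iou(pred, target, num_classes)
    return sum(ious.values()) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Mean IoU treats every class equally and every pixel within a class equally, which is the kind of uniform treatment a weighted variant would change.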
Abstract:In this work, we present a framework for product quality inspection based on deep learning techniques. First, we categorize several deep learning models that can be applied to product inspection systems and explain in detail every step of building a deep learning-based inspection system. Second, we address connection schemes that efficiently link the deep learning models to the product inspection systems. Finally, we propose an effective method for maintaining and enhancing the deep learning models of the product inspection system, which makes the system easy to maintain and stable. All the proposed methods are integrated into a unified framework, and we provide detailed explanations of each. To verify the effectiveness of the proposed system, we compared and analyzed the performance of the methods in various test scenarios.
Abstract:In this paper, we propose a novel distance-based camera network topology inference method for efficient person re-identification. To this end, we first calibrate each camera and estimate the relative scales between cameras. Using the calibration results of multiple cameras, we calculate the speed of each person and infer the distance between cameras to generate a distance-based camera network topology. The proposed distance-based topology can be applied adaptively to each person according to their speed and can handle the diverse transition times of people between non-overlapping cameras. To validate the proposed method, we tested it on an open person re-identification dataset and compared it with state-of-the-art methods. The experimental results show that the proposed method is effective for person re-identification in large-scale camera networks where people's transition times vary.
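The relation behind a distance-based topology is simply distance = speed × transition time, as the sketch below illustrates. Taking the median over observations is an assumption chosen here for robustness, not a detail from the paper, and the function names and numbers are hypothetical.

```python
from statistics import median


def infer_camera_distance(observations):
    """Estimate the distance (m) between two cameras.

    observations: (speed_m_per_s, transition_time_s) pairs, one per person
    who moved between the two cameras. The median is an illustrative choice.
    """
    return median(speed * t for speed, t in observations)


def expected_transition_time(distance_m, speed_m_per_s):
    """Predict how long a person at the given speed takes to cover the link."""
    return distance_m / speed_m_per_s


# Hypothetical observations for one camera pair.
observations = [(1.0, 60.0), (1.5, 40.0), (2.0, 31.0)]
distance = infer_camera_distance(observations)
slow_walker = expected_transition_time(distance, 1.0)
fast_walker = expected_transition_time(distance, 2.0)
```

Storing a distance rather than a single transition time is what lets the topology adapt to each person: the same link yields a longer expected time for a slow walker and a shorter one for a fast walker.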
Abstract:Person re-identification is the task of recognizing or identifying a person across multiple views in multi-camera networks. Although there has been much progress, person re-identification in large-scale multi-camera networks remains a challenging task because of the large spatio-temporal uncertainty and the high complexity caused by the large number of cameras and people. To handle these difficulties, additional information such as the camera network topology should be provided, but this information is itself difficult to estimate automatically. In this study, we propose a unified framework that jointly solves both the person re-identification and camera network topology inference problems with minimal prior knowledge about the environment. The proposed framework takes general multi-camera network environments into account and can be applied to online person re-identification in large-scale multi-camera networks. In addition, to demonstrate the superiority of the proposed framework, we provide a new, fully annotated person re-identification dataset named SLP, captured in a multi-camera network consisting of nine non-overlapping cameras. Experimental results on our dataset and on public datasets show that the proposed methods are promising for both person re-identification and camera topology inference tasks.
Abstract:Person re-identification in large-scale multi-camera networks is a challenging task because of the spatio-temporal uncertainty and high complexity due to large numbers of cameras and people. To handle these difficulties, additional information such as camera network topology should be provided, which is also difficult to automatically estimate. In this paper, we propose a unified framework which jointly solves both person re-id and camera network topology inference problems. The proposed framework takes general multi-camera network environments into account. To effectively show the superiority of the proposed framework, we also provide a new person re-id dataset with full annotations, named SLP, captured in the synchronized multi-camera network. Experimental results show that the proposed methods are promising for both person re-id and camera topology inference tasks.
Abstract:Person re-identification is the problem of recognizing people across different images or videos with non-overlapping views. Although there has been much progress in person re-identification over the last decade, it remains a challenging task because appearances of people can seem extremely different across diverse camera viewpoints and person poses. In this paper, we propose a novel framework for person re-identification by analyzing camera viewpoints and person poses in a so-called Pose-aware Multi-shot Matching (PaMM), which robustly estimates people's poses and efficiently conducts multi-shot matching based on pose information. Experimental results using public person re-identification datasets show that the proposed methods outperform state-of-the-art methods and are promising for person re-identification from diverse viewpoints and pose variances.