Abstract:Despite the promising performance of current video segmentation models on existing benchmarks, these models still struggle with complex scenes. In this paper, we introduce the 6th Large-scale Video Object Segmentation (LSVOS) challenge, held in conjunction with the ECCV 2024 workshop. This year's challenge includes two tasks: Video Object Segmentation (VOS) and Referring Video Object Segmentation (RVOS). This year, we replace the classic YouTube-VOS and YouTube-RVOS benchmarks with the latest datasets, MOSE, LVOS, and MeViS, to assess VOS in more challenging, complex environments. The challenge attracted 129 registered teams from more than 20 institutes in over 8 countries. This report includes an introduction to the challenge and datasets, and the methods used by the top 7 teams in the two tracks. More details can be found on our homepage https://lsvos.github.io/.
Abstract:Video object segmentation is a challenging task that serves as the cornerstone of numerous downstream applications, including video editing and autonomous driving. In this technical report, we briefly introduce the solution of our team "yuanjie" for video object segmentation in the 6th LSVOS Challenge VOS Track at ECCV 2024. We believe that our proposed CSS-Segment performs better on videos with complex object motion and long-term presentation. In this report, we validate the effectiveness of CSS-Segment in video object segmentation. Our method achieved a J\&F score of 80.84 in the test phase and ultimately ranked 2nd in the 6th LSVOS Challenge VOS Track at ECCV 2024.
Abstract:This study proposes a method based on lightweight convolutional neural networks (CNNs) and generative adversarial networks (GANs) for apple ripeness and damage level detection. First, a lightweight CNN model is designed by optimizing the model's depth and width and by employing model compression techniques, which reduces the model's parameter count and computational requirements and thus improves real-time performance in practical applications. In addition, attention mechanisms are introduced to dynamically adjust the importance of different feature layers and improve performance in object detection tasks. To address sample imbalance and insufficient sample size, GANs are used to generate realistic apple images, expanding the training dataset and improving the model's ability to recognize apples of varying ripeness and damage levels. Furthermore, applying the object detection network to annotate damage locations on damaged apples improves the accuracy of damage level detection, providing a more precise basis for decision-making. Experimental results show that in apple ripeness grading detection, the proposed model achieves 95.6\%, 93.8\%, 95.0\%, and 56.5 in precision, recall, accuracy, and FPS, respectively. In apple damage level detection, the proposed model reaches 95.3\%, 93.7\%, and 94.5\% in precision, recall, and mAP, respectively. In both tasks, the proposed method outperforms other mainstream models, demonstrating its performance and practical value for apple ripeness and damage level detection.
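As a reminder of how the precision, recall, and accuracy figures quoted above are defined, the following minimal sketch computes them from confusion counts for a single class. The function name and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the paper, which reports results over multiple ripeness grades and damage levels.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion counts.

    tp, fp, fn, tn are true positives, false positives,
    false negatives, and true negatives for one class.
    """
    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all samples classified correctly.
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Illustrative counts only (not data from the paper).
precision, recall, accuracy = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

For a multi-class task such as ripeness grading, one common choice is to compute these per class and then average them, but which averaging the authors used is not stated in the abstract.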
Abstract:Support vector regression (SVR) is one of the most popular machine learning algorithms. It aims to generate the optimal regression curve by maximizing the minimal margin of selected training samples, i.e., the support vectors. Recent research reveals that maximizing the margin distribution of the whole training dataset, rather than the minimal margin of a few support vectors, tends to achieve better generalization performance. However, unlike the margin distribution strategy for support vector classification, margin distribution support vector regression requires solving a non-convex quadratic optimization problem, which makes training difficult. This paper first proposes a maximal margin distribution model for SVR (MMD-SVR), and then introduces a coupled constraint factor to convert the non-convex quadratic optimization into a convex problem with linear constraints, which improves the feasibility and efficiency of training an SVR derived from maximizing the margin distribution. Theoretical and empirical analysis illustrates the superiority of MMD-SVR. In addition, numerical experiments show that MMD-SVR can significantly improve prediction accuracy and generate a smoother regression curve with better generalization than the classic SVR.
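For context on the classic SVR baseline mentioned above, the sketch below computes the standard epsilon-insensitive loss, under which only samples whose residual exceeds the tube width epsilon contribute to the objective. This illustrates the classic formulation only, not the MMD-SVR model, whose objective is not given in the abstract; the function name, the epsilon value, and the sample values are assumptions made for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample epsilon-insensitive loss used by classic SVR.

    Residuals with absolute value at most epsilon cost nothing;
    larger residuals are penalized linearly beyond the tube.
    """
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Illustrative values only (not data from the paper).
y_true = [1.0, 2.0, 3.0]
y_pred = [1.05, 2.5, 2.0]
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)

# Samples with nonzero loss lie outside the epsilon tube.
outside_tube = [i for i, loss in enumerate(losses) if loss > 0.0]
print(losses, outside_tube)
```

Because samples strictly inside the tube incur zero loss, the classic solution depends only on the samples on or outside the tube boundary, the support vectors. This is the dependence on a few samples that the margin distribution approach described in the abstract moves away from.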