Abstract:Cross-Domain Few-Shot Object Detection (CD-FSOD) poses significant challenges to existing object detection and few-shot detection models when applied across domains. In conjunction with NTIRE 2025, we organized the 1st CD-FSOD Challenge, aiming to advance the performance of current object detectors on entirely novel target domains with only limited labeled data. The challenge attracted 152 registered participants, received submissions from 42 teams, and concluded with 13 teams making valid final submissions. Participants approached the task from diverse perspectives, proposing novel models that achieved new state-of-the-art (SOTA) results under both open-source and closed-source settings. In this report, we present an overview of the 1st NTIRE 2025 CD-FSOD Challenge, highlighting the proposed solutions and summarizing the results submitted by the participants.
Abstract:Learning-based 3D shape segmentation is usually formulated as a semantic labeling problem, assuming that all parts of training shapes are annotated with a given set of tags. This assumption, however, is impractical for learning fine-grained segmentation. Although most off-the-shelf CAD models are, by construction, composed of fine-grained parts, they usually miss semantic tags and labeling those fine-grained parts is extremely tedious. We approach the problem with deep clustering, where the key idea is to learn part priors from a shape dataset with fine-grained segmentation but no part labels. Given point sampled 3D shapes, we model the clustering priors of points with a similarity matrix and achieve part segmentation through minimizing a novel low rank loss. To handle highly densely sampled point sets, we adopt a divide-and-conquer strategy. We partition the large point set into a number of blocks. Each block is segmented using a deep-clustering-based part prior network trained in a category-agnostic manner. We then train a graph convolution network to merge the segments of all blocks to form the final segmentation result. Our method is evaluated with a challenging benchmark of fine-grained segmentation, showing state-of-the-art performance.
Abstract:Part mobility analysis is a significant aspect required to achieve a functional understanding of 3D objects. It would be natural to obtain part mobility from the continuous part motion of 3D objects. In this study, we introduce a self-supervised method for segmenting motion parts and predicting their motion attributes from a point cloud sequence representing a dynamic object. To sufficiently utilize spatiotemporal information from the point cloud sequence, we generate trajectories by using correlations among successive frames of the sequence instead of directly processing the point clouds. We propose a novel neural network architecture called PointRNN to learn feature representations of trajectories along with their part rigid motions. We evaluate our method on various tasks including motion part segmentation, motion axis prediction and motion range estimation. The results demonstrate that our method outperforms previous techniques on both synthetic and real datasets. Moreover, our method has the ability to generalize to new and unseen objects. It is important to emphasize that it is not required to know any prior shape structure, prior shape category information, or shape orientation. To the best of our knowledge, this is the first study on deep learning to extract part mobility from point cloud sequence of a dynamic object.