Abstract:Current remote-sensing interpretation models often focus on a single task such as detection, segmentation, or caption. However, the task-specific designed models are unattainable to achieve the comprehensive multi-level interpretation of images. The field also lacks support for multi-task joint interpretation datasets. In this paper, we propose Panoptic Perception, a novel task and a new fine-grained dataset (FineGrip) to achieve a more thorough and universal interpretation for RSIs. The new task, 1) integrates pixel-level, instance-level, and image-level information for universal image perception, 2) captures image information from coarse to fine granularity, achieving deeper scene understanding and description, and 3) enables various independent tasks to complement and enhance each other through multi-task learning. By emphasizing multi-task interactions and the consistency of perception results, this task enables the simultaneous processing of fine-grained foreground instance segmentation, background semantic segmentation, and global fine-grained image captioning. Concretely, the FineGrip dataset includes 2,649 remote sensing images, 12,054 fine-grained instance segmentation masks belonging to 20 foreground things categories, 7,599 background semantic masks for 5 stuff classes and 13,245 captioning sentences. Furthermore, we propose a joint optimization-based panoptic perception model. Experimental results on FineGrip demonstrate the feasibility of the panoptic perception task and the beneficial effect of multi-task joint optimization on individual tasks. The dataset will be publicly available.
Abstract:Accurate polyp delineation in colonoscopy is crucial for assisting in diagnosis, guiding interventions, and treatments. However, current deep-learning approaches fall short due to integrity deficiency, which often manifests as missing lesion parts. This paper introduces the integrity concept in polyp segmentation at both macro and micro levels, aiming to alleviate integrity deficiency. Specifically, the model should distinguish entire polyps at the macro level and identify all components within polyps at the micro level. Our Integrity Capturing Polyp Segmentation (IC-PolypSeg) network utilizes lightweight backbones and 3 key components for integrity ameliorating: 1) Pixel-wise feature redistribution (PFR) module captures global spatial correlations across channels in the final semantic-rich encoder features. 2) Cross-stage pixel-wise feature redistribution (CPFR) module dynamically fuses high-level semantics and low-level spatial features to capture contextual information. 3) Coarse-to-fine calibration module combines PFR and CPFR modules to achieve precise boundary detection. Extensive experiments on 5 public datasets demonstrate that the proposed IC-PolypSeg outperforms 8 state-of-the-art methods in terms of higher precision and significantly improved computational efficiency with lower computational consumption. IC-PolypSeg-EF0 employs 300 times fewer parameters than PraNet while achieving a real-time processing speed of 235 FPS. Importantly, IC-PolypSeg reduces the false negative ratio on five datasets, meeting clinical requirements.
Abstract:The implementation details of factorizing the 3x4 projection matrices of linear cameras into their left matrix factors and the 4x4 homogeneous central(also parallel for infinite center cases) projection factors are presented in this work. Any full row rank 3x4 real matrix can be factorized into such basic matrices which will be called LC factors. A further extension to multiple view midpoint triangulation, for both pinhole and affine camera cases, is also presented based on such camera factorizations.
Abstract:We present algebraic projective geometry definitions of 3D rotations so as to bridge a small gap between the applications and the definitions of 3D rotations in homogeneous matrix form. A general homogeneous matrix formulation to 3D rotation geometric transformations is proposed which suits for the cases when the rotation axis is unnecessarily through the coordinate system origin given their rotation axes and rotation angles.