Abstract: Rotation-equivariance is an essential yet challenging property in oriented object detection. While general object detectors naturally benefit from robustness to spatial shifts thanks to the translation-equivariance of conventional CNNs, achieving rotation-equivariance remains an elusive goal. Current detectors deploy various alignment techniques to derive rotation-invariant features, but they still rely on high-capacity models and heavy data augmentation covering all possible rotations. In this paper, we introduce a Fully Rotation-Equivariant Oriented Object Detector (FRED), whose entire pipeline from the image to the bounding-box prediction is strictly equivariant. Specifically, we decouple the invariant task (object classification) from the equivariant task (object localization) to achieve end-to-end equivariance. We represent the bounding box as a set of rotation-equivariant vectors to implement rotation-equivariant localization. Moreover, we utilize these rotation-equivariant vectors as offsets in the deformable convolution, further strengthening its spatial adaptability. Leveraging full rotation-equivariance, FRED demonstrates higher robustness to image-level rotation than existing methods. Furthermore, our experiments show that FRED moves one step closer to non-axis-aligned learning. Compared to state-of-the-art methods, the proposed method delivers comparable performance on DOTA-v1.0 and outperforms them by 1.5 mAP on DOTA-v1.5, all while reducing the model parameters to 16%.
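The idea of feeding predicted vectors into a deformable convolution as sampling offsets, as described above, can be sketched as follows. This is a minimal illustration, not FRED's implementation: the module name and channel sizes are hypothetical, and the truly rotation-equivariant feature extractor (e.g., a steerable CNN) that would produce equivariant offset vectors is omitted, with a plain convolution standing in for it.

```python
# Minimal sketch: per-location 2D vectors predicted from features and used as
# sampling offsets of a deformable convolution (torchvision.ops.deform_conv2d).
# "VectorOffsetDeformConv" and all sizes are illustrative assumptions, not FRED's code.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class VectorOffsetDeformConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        # Predicts a (dy, dx) vector for each of the k*k sampling locations.
        self.offset_pred = nn.Conv2d(in_ch, 2 * k * k, kernel_size=3, padding=1)
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)

    def forward(self, x):
        offsets = self.offset_pred(x)                   # (B, 2*k*k, H, W)
        return deform_conv2d(x, offsets, self.weight, padding=self.k // 2)

feat = torch.randn(2, 64, 32, 32)
out = VectorOffsetDeformConv(64, 64)(feat)              # (2, 64, 32, 32)
```

If the predicted vectors transform equivariantly with the input rotation, the deformable kernel's sampling pattern rotates together with the object, which is the behavior the abstract attributes to using rotation-equivariant vectors as offsets.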
Abstract: Understanding point clouds has recently gained considerable interest, following the development of 3D scanning devices and the accumulation of large-scale 3D data. Most point cloud processing algorithms can be classified as either point-based or voxel-based methods, both of which suffer severe limitations in processing time, memory, or both. To overcome these limitations, we propose Projection-based Point Convolution (PPConv), a point convolutional module that uses 2D convolutions and multi-layer perceptrons (MLPs) as its components. In PPConv, point features are processed through two branches: a point branch and a projection branch. The point branch consists of MLPs, while the projection branch transforms point features into a 2D feature map and then applies 2D convolutions. As PPConv does not use point-based or voxel-based convolutions, it has an advantage in fast point cloud processing. When combined with a learnable projection and an effective feature fusion strategy, PPConv achieves superior efficiency compared to state-of-the-art methods, even with a simple architecture based on PointNet++. We demonstrate the efficiency of PPConv in terms of the trade-off between inference time and segmentation performance. The experimental results on S3DIS and ShapeNetPart show that PPConv is the most efficient among the compared methods. The code is available at github.com/pahn04/PPConv.
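The two-branch structure described above can be illustrated with a toy sketch. This is an assumption-laden illustration rather than the released PPConv code (see the repository above): the grid resolution, the fixed xy-plane rasterization, and fusion by simple addition are placeholder choices, whereas the paper uses a learnable projection and a dedicated feature fusion strategy.

```python
# Toy sketch of a PPConv-like module: a point branch of shared MLPs plus a
# projection branch that rasterizes point features onto a 2D grid, applies a
# 2D convolution, and gathers the result back to the points. All names and
# hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class ToyPPConv(nn.Module):
    def __init__(self, in_ch, out_ch, grid=32):
        super().__init__()
        self.grid = grid
        self.point_branch = nn.Sequential(              # shared MLPs (1x1 convs)
            nn.Conv1d(in_ch, out_ch, 1), nn.ReLU(), nn.Conv1d(out_ch, out_ch, 1))
        self.proj_branch = nn.Conv2d(in_ch, out_ch, 3, padding=1)

    def forward(self, xyz, feats):
        # xyz: (B, N, 3) with coordinates normalized to [0, 1]; feats: (B, C, N)
        B, C, N = feats.shape
        G = self.grid
        # --- projection branch: rasterize features onto an xy grid ---
        ij = (xyz[..., :2] * (G - 1)).long().clamp(0, G - 1)      # (B, N, 2)
        flat = ij[..., 0] * G + ij[..., 1]                        # (B, N)
        plane = feats.new_zeros(B, C, G * G)
        plane.scatter_add_(2, flat.unsqueeze(1).expand(B, C, N), feats)
        plane = self.proj_branch(plane.view(B, C, G, G)).view(B, -1, G * G)
        proj_feats = torch.gather(
            plane, 2, flat.unsqueeze(1).expand(B, plane.size(1), N))
        # --- point branch + fusion (simple addition here) ---
        return self.point_branch(feats) + proj_feats

pts = torch.rand(2, 1024, 3)
f = torch.rand(2, 16, 1024)
out = ToyPPConv(16, 32)(pts, f)                                   # (2, 32, 1024)
```

Because both branches operate with 2D convolutions and pointwise MLPs only, such a module avoids neighborhood search and 3D voxel convolutions, which is where the claimed speed advantage comes from.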
Abstract: Following considerable development in 3D scanning technologies, many studies have recently been proposed with various approaches for 3D vision tasks, including some methods that utilize 2D convolutional neural networks (CNNs). However, even though 2D CNNs have achieved high performance in many 2D vision tasks, existing works have not effectively applied them to 3D vision tasks. In particular, segmentation has not been well studied because of the difficulty of dense per-point prediction, which requires rich feature representation. In this paper, we propose a simple and efficient architecture named point projection and back-projection network (PBP-Net), which leverages 2D CNNs for 3D point cloud segmentation. Three modules are introduced that project the 3D point cloud onto 2D planes, extract features using a 2D CNN backbone, and back-project the features onto the original 3D point cloud. To demonstrate effective 3D feature extraction using a 2D CNN, we perform various experiments, including comparisons with recent methods. We analyze the proposed modules through ablation studies and conduct experiments on object part segmentation (ShapeNet-Part dataset) and indoor scene semantic segmentation (S3DIS dataset). The experimental results show that the proposed PBP-Net achieves performance comparable to existing state-of-the-art methods.
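A rough sketch of the project, extract, and back-project flow is given below. It is not the paper's implementation: the three axis-aligned planes, the plane resolution, the tiny 2D CNN standing in for the backbone, and summation over planes are illustrative assumptions.

```python
# Sketch of a PBP-style flow: rasterize point features onto axis-aligned 2D
# planes, run a small 2D CNN, and back-project by bilinearly sampling the
# feature maps at each point's projected location. All choices are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyPBP(nn.Module):
    def __init__(self, in_ch, out_ch, res=64):
        super().__init__()
        self.res = res
        self.cnn = nn.Sequential(                       # stand-in for a 2D CNN backbone
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1))

    def forward(self, xyz, feats):
        # xyz: (B, N, 3) with coordinates normalized to [0, 1]; feats: (B, C, N)
        B, C, N = feats.shape
        R = self.res
        out = 0
        for axes in [(0, 1), (1, 2), (0, 2)]:           # xy, yz, xz planes
            uv = xyz[..., axes]                         # (B, N, 2)
            ij = (uv * (R - 1)).long().clamp(0, R - 1)
            flat = ij[..., 0] * R + ij[..., 1]
            plane = feats.new_zeros(B, C, R * R)
            plane.scatter_add_(2, flat.unsqueeze(1).expand(B, C, N), feats)
            fmap = self.cnn(plane.view(B, C, R, R))     # (B, C_out, R, R)
            # back-project: bilinear sampling at the projected coordinates;
            # flip to grid_sample's (x, y) order, matching row/col above
            grid = (uv.flip(-1) * 2 - 1).unsqueeze(2)   # (B, N, 1, 2) in [-1, 1]
            out = out + F.grid_sample(fmap, grid, align_corners=True).squeeze(-1)
        return out                                      # (B, C_out, N)

xyz = torch.rand(2, 2048, 3)
f = torch.rand(2, 8, 2048)
print(ToyPBP(8, 16)(xyz, f).shape)                      # torch.Size([2, 16, 2048])
```

The back-projected per-point features can then feed a pointwise classifier head for dense segmentation, which is the setting the abstract evaluates on ShapeNet-Part and S3DIS.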