Yueru Chen

Hierarchical Attention Networks for Lossless Point Cloud Attribute Compression

Apr 01, 2025

Yueru Chen, Wei Zhang, Dingquan Li, Jing Wang, Ge Li

Abstract: In this paper, we propose a deep hierarchical attention context model for lossless attribute compression of point clouds, leveraging a multi-resolution spatial structure and residual learning. A simple and effective Level of Detail (LoD) structure is introduced to yield a coarse-to-fine representation. To enhance efficiency, points within the same refinement level are encoded in parallel, sharing a common context point group. By hierarchically aggregating information from neighboring points, our attention model learns contextual dependencies across varying scales and densities, enabling comprehensive feature extraction. We also adopt normalization for position coordinates and attributes to achieve scale-invariant compression. Additionally, we segment the point cloud into multiple slices to facilitate parallel processing, further optimizing time complexity. Experimental results demonstrate that the proposed method offers better coding performance than the latest G-PCC for color and reflectance attributes while maintaining more efficient encoding and decoding runtimes.

* Accepted by DCC 2025
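
As a rough illustration of the coarse-to-fine, residual-based idea in the abstract above, the following Python sketch builds refinement levels, predicts every point of a level in one vectorized step from a shared context (all previously coded points), and keeps integer residuals so that decoding is exact. It is only a sketch under stated assumptions: the strided level split, the nearest-neighbor predictor (standing in for the attention context model), and all function names are hypothetical and are not the authors' method. Entropy coding of the residuals, normalization, and slicing are omitted.

```python
import numpy as np

def build_lods(n_points, n_levels):
    """Assumed LoD split: strided index subsets, coarsest level first."""
    levels = []
    assigned = np.zeros(n_points, dtype=bool)
    for k in range(n_levels - 1, -1, -1):
        idx = np.arange(0, n_points, 2 ** k)
        new = idx[~assigned[idx]]          # keep only points not in a coarser level
        assigned[new] = True
        levels.append(new)
    return levels

def predict(points, attrs, context, idx):
    """Predict all points in `idx` at once from the shared context group.

    Assumption: nearest-neighbor prediction replaces the learned model.
    """
    diff = points[idx][:, None, :] - points[context][None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    nearest = context[np.argmin(d2, axis=1)]
    return attrs[nearest]

def encode_residuals(points, attrs, levels):
    residuals = [attrs[levels[0]].copy()]  # base level is stored directly
    context = levels[0]
    for idx in levels[1:]:
        residuals.append(attrs[idx] - predict(points, attrs, context, idx))
        context = np.concatenate([context, idx])
    return residuals

def decode_residuals(points, residuals, levels, n_points):
    shape = (n_points,) + residuals[0].shape[1:]
    attrs = np.zeros(shape, dtype=residuals[0].dtype)
    attrs[levels[0]] = residuals[0]
    context = levels[0]
    for idx, res in zip(levels[1:], residuals[1:]):
        # Same prediction as the encoder, using already decoded attributes.
        attrs[idx] = predict(points, attrs, context, idx) + res
        context = np.concatenate([context, idx])
    return attrs

rng = np.random.default_rng(0)
pts = rng.random((50, 3))
vals = rng.integers(0, 256, size=50)       # e.g. reflectance as integers
lods = build_lods(50, 3)
res = encode_residuals(pts, vals, lods)
rec = decode_residuals(pts, res, lods, 50)
```

Because the encoder and decoder run the same predictor on the same context and the residuals are integers, `rec` matches `vals` exactly; a better predictor only makes the residuals smaller and cheaper to entropy-code.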

LSR: A Light-Weight Super-Resolution Method

Feb 27, 2023

Wei Wang, Xuejing Lei, Yueru Chen, Ming-Sui Lee, C.-C. Jay Kuo

Abstract: A light-weight super-resolution (LSR) method from a single image targeting mobile applications is proposed in this work. LSR predicts the residual image between the interpolated low-resolution (ILR) and high-resolution (HR) images using a self-supervised framework. To lower the computational complexity, LSR does not adopt end-to-end optimized deep networks. It consists of three modules: 1) generating a pool of rich and diversified representations in the neighborhood of a target pixel via unsupervised learning, 2) automatically selecting the subset of the representation pool that is most relevant to the underlying super-resolution task via supervised learning, and 3) predicting the residual of the target pixel via regression. LSR has low computational complexity and a reasonable model size, so it can be implemented conveniently on mobile/edge platforms. In addition, it offers better visual quality than classical exemplar-based methods in terms of PSNR/SSIM measures.

* 8 pages, 3 figures, 10 tables

Point Cloud Attribute Compression via Successive Subspace Graph Transform

Oct 29, 2020

Yueru Chen, Yiting Shao, Jing Wang, Ge Li, C.-C. Jay Kuo

Abstract: Inspired by the recently proposed successive subspace learning (SSL) principles, we develop a successive subspace graph transform (SSGT) to address point cloud attribute compression in this work. The octree geometry structure is utilized to partition the point cloud, where every node of the octree represents a point cloud subspace with a certain spatial size. We design a weighted graph with self-loops to describe the subspace and define a graph Fourier transform based on the normalized graph Laplacian. The transforms are applied to large point clouds recursively from the leaf nodes to the root node of the octree, while the represented subspace is expanded successively from the smallest one to the whole point cloud. Experimental results show that the proposed SSGT method offers better rate-distortion (R-D) performance than the previous Region Adaptive Haar Transform (RAHT) method.

* Accepted by VCIP 2020
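
To make the graph Fourier transform step in the abstract above more concrete, here is a minimal Python sketch for a single small subspace. It builds a weighted graph with self-loops over a handful of points, forms the normalized Laplacian L = I - D^{-1/2} W D^{-1/2}, and projects the attributes onto its eigenvectors. The Gaussian edge weights, the constant self-loop weight, the `sigma` parameter, and the function name are assumptions made for illustration; the abstract does not specify them, and the recursive octree traversal is not shown.

```python
import numpy as np

def normalized_laplacian_gft(points, attrs, sigma=1.0, self_loop=0.5):
    """Graph Fourier transform of `attrs` on a small point set (sketch).

    Assumptions: Gaussian distance weights and a constant self-loop weight.
    """
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)                  # pairwise squared distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))           # assumed edge weights
    np.fill_diagonal(W, self_loop)                 # self-loop on every node
    deg = W.sum(axis=1)                            # degrees include self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Normalized graph Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, U = np.linalg.eigh(L)                 # L is symmetric
    coeffs = U.T @ attrs                           # forward transform
    return eigvals, U, coeffs

pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]], dtype=float)
vals = np.array([10.0, 12.0, 11.0, 30.0])
eigvals, U, coeffs = normalized_laplacian_gft(pts, vals)
recon = U @ coeffs                                 # inverse transform
```

Since the eigenvector matrix `U` is orthogonal, `U @ coeffs` recovers the attributes; compression would come from quantizing and entropy-coding `coeffs`, which this sketch leaves out.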

PixelHop++: A Small Successive-Subspace-Learning-Based (SSL-based) Model for Image Classification

Feb 08, 2020

Yueru Chen, Mozhdeh Rouhsedaghat, Suya You, Raghuveer Rao, C.-C. Jay Kuo

Abstract: The successive subspace learning (SSL) principle was developed and used to design an interpretable learning model, known as the PixelHop method, for image classification in our prior work. Here, we propose an improved PixelHop method and call it PixelHop++. First, to make the PixelHop model size smaller, we decouple a joint spatial-spectral input tensor into multiple spatial tensors (one for each spectral component) under the spatial-spectral separability assumption and perform the Saab transform in a channel-wise manner, called the channel-wise (c/w) Saab transform. Second, by performing this operation from one hop to another successively, we construct a channel-decomposed feature tree whose leaf nodes contain features of one dimension (1D). Third, these 1D features are ranked according to their cross-entropy values, which allows us to select a subset of discriminant features for image classification. In PixelHop++, one can control the learning model size at fine granularity, offering a flexible tradeoff between the model size and the classification performance. We demonstrate the flexibility of PixelHop++ on three datasets: MNIST, Fashion MNIST, and CIFAR-10.

* 5 pages, 5 figures, 4 tables, Submitted to ICIP 2020

PixelHop: A Successive Subspace Learning (SSL) Method for Object Classification

Sep 17, 2019

Yueru Chen, C.-C. Jay Kuo

Abstract: A new machine learning methodology, called successive subspace learning (SSL), is introduced in this work. SSL contains four key ingredients: 1) successive near-to-far neighborhood expansion; 2) unsupervised dimension reduction via subspace approximation; 3) supervised dimension reduction via label-assisted regression (LAG); and 4) feature concatenation and decision making. An image-based object classification method, called PixelHop, is proposed to illustrate the SSL design. Experimental results show that the PixelHop method outperforms the classic CNN model of similar model complexity on three benchmark datasets (MNIST, Fashion MNIST, and CIFAR-10). Although SSL and deep learning (DL) share some high-level concepts, they are fundamentally different in model formulation, training process, and training complexity. An extensive comparison of SSL and DL is made to provide further insights into the potential of SSL.

* 17 pages, 11 figures, 11 tables

Semi-supervised learning via Feedforward-Designed Convolutional Neural Networks

Feb 06, 2019

Yueru Chen, Yijing Yang, Min Zhang, C.-C. Jay Kuo

Abstract: A semi-supervised learning framework using feedforward-designed convolutional neural networks (FF-CNNs) is proposed for image classification in this work. One unique property of FF-CNNs is that no backpropagation is used in determining the model parameters. Since unlabeled data may not always enhance semi-supervised learning, we define an effective quality score and use it to select a subset of unlabeled data in the training process. We conduct experiments on the MNIST, SVHN, and CIFAR-10 datasets, and show that the proposed semi-supervised FF-CNN solution outperforms the CNN trained by backpropagation (BP-CNN) when the amount of labeled data is reduced. Furthermore, we develop an ensemble system that combines the output decision vectors of different semi-supervised FF-CNNs to boost classification accuracy. The ensemble systems can achieve further performance gains on all three benchmark datasets.

* 5 pages, under review of ICIP 2019

Ensembles of feedforward-designed convolutional neural networks

Jan 08, 2019

Yueru Chen, Yijing Yang, Wei Wang, C.-C. Jay Kuo

Abstract: An ensemble method that fuses the output decision vectors of multiple feedforward-designed convolutional neural networks (FF-CNNs) to solve the image classification problem is proposed in this work. To enhance the performance of the ensemble system, it is critical to increase the diversity of FF-CNN models. To achieve this objective, we introduce diversity by adopting three strategies: 1) different parameter settings in convolutional layers, 2) flexible feature subsets fed into the fully-connected (FC) layers, and 3) multiple image embeddings of the same input source. Furthermore, we partition input samples into easy and hard ones based on their decision confidence scores. As a result, we can develop a new ensemble system tailored to hard samples to further boost classification accuracy. Experiments are conducted on the MNIST and CIFAR-10 datasets to demonstrate the effectiveness of the ensemble method.
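
The fusion and easy/hard partition described in the abstract above can be sketched in a few lines of Python. This is a simplified illustration, not the paper's procedure: averaging the decision vectors, using the largest fused score as the confidence, and the `threshold` value are all assumptions, and `fuse_and_split` is a hypothetical name.

```python
import numpy as np

def fuse_and_split(decision_vectors, threshold=0.6):
    """Fuse per-model decision vectors and flag low-confidence samples.

    decision_vectors: list of arrays, each (n_samples, n_classes).
    Assumption: simple averaging stands in for the paper's fusion step.
    """
    stacked = np.stack(decision_vectors)   # (n_models, n_samples, n_classes)
    fused = stacked.mean(axis=0)           # average over models
    labels = fused.argmax(axis=1)          # predicted class per sample
    confidence = fused.max(axis=1)         # assumed confidence score
    hard = confidence < threshold          # True marks a "hard" sample
    return labels, confidence, hard

model_a = np.array([[0.9, 0.1], [0.6, 0.4]])
model_b = np.array([[0.8, 0.2], [0.3, 0.7]])
labels, confidence, hard = fuse_and_split([model_a, model_b])
```

Here the first sample is confidently assigned to class 0, while the two models disagree on the second, so its fused confidence falls below the threshold and it is flagged as hard. Samples flagged this way could then be routed to a second ensemble specialized for them.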

Towards Visible and Thermal Drone Monitoring with Convolutional Neural Networks

Dec 19, 2018

Ye Wang, Yueru Chen, Jongmoo Choi, C.-C. Jay Kuo

Abstract: This paper reports a visible and thermal drone monitoring system that integrates deep-learning-based detection and tracking modules. The biggest challenge in adopting deep learning methods for drone detection is the paucity of training drone images, especially thermal drone images. To address this issue, we develop two data augmentation techniques. One is a model-based drone augmentation technique that automatically generates visible drone images with a bounding box label on the drone's location. The other exploits an adversarial data augmentation methodology to create thermal drone images. To track a small flying drone, we utilize the residual information between consecutive image frames. Finally, we present an integrated detection and tracking system that outperforms each individual module containing detection or tracking only. The experiments show that, even when trained on synthetic data, the proposed system performs well on real-world drone images with complex backgrounds. The USC drone detection and tracking dataset with user-labeled bounding boxes is available to the public.

* 12 pages, 18 figures, journal. arXiv admin note: substantial text overlap with arXiv:1712.00863

Unsupervised Video Object Segmentation with Distractor-Aware Online Adaptation

Dec 19, 2018

Ye Wang, Jongmoo Choi, Yueru Chen, Siyang Li, Qin Huang, Kaitai Zhang, Ming-Sui Lee, C.-C. Jay Kuo

Abstract: Unsupervised video object segmentation, which requires no prior information about the objects, is a crucial application in video analysis. It becomes tremendously challenging when multiple objects occur and interact in a given video clip. In this paper, a novel unsupervised video object segmentation approach via distractor-aware online adaptation (DOA) is proposed. DOA models spatial-temporal consistency in video sequences by capturing background dependencies from adjacent frames. Instance proposals are generated by the instance segmentation network for each frame and then selected by motion information as positives and, if they exist, hard negatives. To adopt high-quality hard negatives, the block matching algorithm is then applied to preceding frames to track the associated hard negatives. General negatives are also introduced in case there are no hard negatives in the sequence, and experiments demonstrate that both kinds of negatives (distractors) are complementary. Finally, we conduct DOA using the positive, negative, and hard negative masks to update the foreground/background segmentation. The proposed approach achieves state-of-the-art results on two benchmark datasets, DAVIS 2016 and FBMS-59.

* 11 pages, 6 figures, 4 tables, conference

Design Pseudo Ground Truth with Motion Cue for Unsupervised Video Object Segmentation

Dec 13, 2018

Ye Wang, Jongmoo Choi, Yueru Chen, Qin Huang, Siyang Li, Ming-Sui Lee, C.-C. Jay Kuo

Abstract: One major technical debt in video object segmentation is labeling the object masks for training instances. We therefore propose to prepare inexpensive yet high-quality pseudo ground truth, corrected with motion cues, for video object segmentation training. Our method conducts semantic segmentation using instance segmentation networks and then selects the segmented object of interest as the pseudo ground truth based on the motion information. Afterwards, the pseudo ground truth is exploited to finetune the pretrained objectness network to facilitate object segmentation in the remaining frames of the video. We show that the pseudo ground truth can effectively improve the segmentation performance. This straightforward unsupervised video object segmentation method is more efficient than existing methods. Experimental results on DAVIS and FBMS show that the proposed method outperforms state-of-the-art unsupervised segmentation methods on various benchmark datasets. Moreover, the category-agnostic pseudo ground truth has great potential to extend to tracking multiple arbitrary objects.

* 16 pages, 7 figures, 6 tables, conference
