Feiyu Chen

FCVSR: A Frequency-aware Method for Compressed Video Super-Resolution

Feb 10, 2025

Qiang Zhu, Fan Zhang, Feiyu Chen, Shuyuan Zhu, David Bull, Bing Zeng

Abstract: Compressed video super-resolution (SR) aims to generate high-resolution (HR) videos from the corresponding low-resolution (LR) compressed videos. Recently, some compressed video SR methods have attempted to exploit the spatio-temporal information in the frequency domain, showing great promise in super-resolution performance. However, these methods do not differentiate various frequency subbands spatially or capture the temporal frequency dynamics, potentially leading to suboptimal results. In this paper, we propose a deep frequency-based compressed video SR model (FCVSR) consisting of a motion-guided adaptive alignment (MGAA) network and a multi-frequency feature refinement (MFFR) module. Additionally, a frequency-aware contrastive loss is proposed for training FCVSR, in order to reconstruct finer spatial details. The proposed model has been evaluated on three public compressed video super-resolution datasets, with results demonstrating its effectiveness when compared to existing works in terms of super-resolution performance (up to a 0.14 dB gain in PSNR over the second-best model) and complexity.
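To put the reported PSNR gain in context, here is a minimal sketch of how PSNR is commonly computed and what a gain of a given size means for mean squared error. The function names and the 8-bit peak value of 255 are assumptions made for illustration; they are not taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10 ** (-gain_db / 10)


print(round(mse_ratio_for_gain(0.14), 4))  # 0.9683
```

Under this definition, a 0.14 dB gain corresponds to roughly 3% lower mean squared error at the same peak value.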

OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal

Aug 21, 2024

Qiao Mo, Yukang Ding, Jinhua Hao, Qiang Zhu, Ming Sun, Chao Zhou, Feiyu Chen, Shuyuan Zhu

Abstract: Deep learning-based methods have shown remarkable performance in the single JPEG artifacts removal task. However, existing methods tend to degrade on double JPEG images, which are prevalent in real-world scenarios. To address this issue, we propose the Offset-Aware Partition Transformer for double JPEG artifacts removal, termed OAPT. We conduct an analysis of double JPEG compression that results in up to four patterns within each 8x8 block and design our model to cluster the similar patterns to remedy the difficulty of restoration. Our OAPT consists of two components: a compression offset predictor and an image reconstructor. Specifically, the predictor estimates pixel offsets between the first and second compression, which are then utilized to divide different patterns. The reconstructor is mainly based on several Hybrid Partition Attention Blocks (HPAB), combining vanilla window-based self-attention and sparse attention for clustered pattern features. Extensive experiments demonstrate that OAPT outperforms the state-of-the-art method by more than 0.16 dB in the double JPEG image restoration task. Moreover, without increasing any computation cost, the pattern clustering module in HPAB can serve as a plugin to enhance other transformer-based image restoration methods. The code will be available at https://github.com/QMoQ/OAPT.git.

* 14 pages, 9 figures. Code and models are available at https://github.com/QMoQ/OAPT.git
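The "up to four patterns within each 8x8 block" observation can be illustrated with a small sketch. It assumes one plausible reading of the abstract: the second compression's 8x8 grid is shifted by a pixel offset (dx, dy) relative to the first, so each second-compression block overlaps up to four first-compression blocks. The function names are hypothetical and this is not the authors' predictor or clustering code.

```python
BLOCK = 8  # JPEG block size


def pattern_map(dx, dy):
    """Label each pixel of one second-compression block by the
    first-compression block it fell in, for an offset (dx, dy) in 0..7."""
    labels = []
    for y in range(BLOCK):
        row = []
        for x in range(BLOCK):
            bx = (x + dx) // BLOCK  # 0 or 1: which first-grid column
            by = (y + dy) // BLOCK  # 0 or 1: which first-grid row
            row.append(2 * by + bx)
        labels.append(row)
    return labels


def count_patterns(dx, dy):
    """Number of distinct regions inside one 8x8 block."""
    return len({label for row in pattern_map(dx, dy) for label in row})


print(count_patterns(0, 0))  # 1 (aligned grids)
print(count_patterns(3, 0))  # 2 (horizontal shift only)
print(count_patterns(3, 5))  # 4 (shift in both directions)
```

Pixels sharing a label went through the same first-compression block, which is one way to picture why grouping them could ease restoration.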

Decoupling Meta-Reinforcement Learning with Gaussian Task Contexts and Skills

Dec 11, 2023

Hongcai He, Anjie Zhu, Shuang Liang, Feiyu Chen, Jie Shao

Abstract: Offline meta-reinforcement learning (meta-RL) methods, which adapt to unseen target tasks with prior experience, are essential in robot control tasks. Current methods typically utilize task contexts and skills as prior experience, where task contexts are related to the information within each task and skills represent a set of temporally extended actions for solving subtasks. However, these methods still suffer from limited performance when adapting to unseen target tasks, mainly because the learned prior experience lacks generalization, i.e., they are unable to extract effective prior experience from meta-training tasks by exploration and learning of continuous latent spaces. We propose a framework called decoupled meta-reinforcement learning (DCMRL), which (1) contrastively restricts the learning of task contexts through pulling in similar task contexts within the same task and pushing away different task contexts of different tasks, and (2) utilizes a Gaussian quantization variational autoencoder (GQ-VAE) for clustering the Gaussian distributions of the task contexts and skills respectively, and decoupling the exploration and learning processes of their spaces. These cluster centers, which serve as representative and discrete distributions of task context and skill, are stored in a task context codebook and a skill codebook, respectively. DCMRL can acquire generalizable prior experience and achieve effective adaptation to unseen target tasks during the meta-testing phase. Experiments in the navigation and robot manipulation continuous control tasks show that DCMRL is more effective than previous meta-RL methods with more generalizable prior experience.

* Accepted by AAAI 2024 (this version includes appendix)
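The "pulling in" and "pushing away" of task contexts described in point (1) can be sketched with an InfoNCE-style loss, a common contrastive objective. The abstract does not state which loss DCMRL uses, so the loss form, the dot-product similarity, and the temperature value below are all assumptions.

```python
import math


def dot(a, b):
    """Dot-product similarity between two context vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: low when the anchor is closer to the positive
    (same task) than to the negatives (other tasks)."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a stable log-sum-exp
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


same_task = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
wrong_pair = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(same_task < wrong_pair)  # True
```

Minimizing such a loss moves contexts from the same task together and contexts from different tasks apart.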

SegT: A Novel Separated Edge-guidance Transformer Network for Polyp Segmentation

Jun 19, 2023

Feiyu Chen, Haiping Ma, Weijia Zhang

Abstract: Accurate segmentation of colonoscopic polyps is considered a fundamental step in medical image analysis and surgical interventions. Many recent studies have made improvements based on the encoder-decoder framework, which can effectively segment diverse polyps. Such improvements mainly aim to enhance local features by using global features and applying attention methods. However, relying only on the global information of the final encoder block can result in losing local regional features in the intermediate layer. In addition, determining the edges between benign regions and polyps can be challenging. To address the aforementioned issues, we propose a novel separated edge-guidance transformer (SegT) network that aims to build an effective polyp segmentation model. We apply a transformer encoder, which learns a more robust representation than existing CNN-based approaches. To determine the precise segmentation of polyps, we utilize a separated edge-guidance module consisting of separator and edge-guidance blocks. The separator block is a two-stream operator to highlight edges between the background and foreground, whereas the edge-guidance block lies behind both streams to strengthen the understanding of the edge. Lastly, an innovative cascade fusion module fuses the refined multi-level features. To evaluate the effectiveness of SegT, we conducted experiments with five challenging public datasets, and the proposed model achieved state-of-the-art performance.

Pedestrian Crossing Action Recognition and Trajectory Prediction with 3D Human Keypoints

Jun 01, 2023

Jiachen Li, Xinwei Shi, Feiyu Chen, Jonathan Stroud, Zhishuai Zhang, Tian Lan, Junhua Mao, Jeonhyung Kang, Khaled S. Refaat, Weilong Yang (+2 more)

Abstract: Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas. In this work, we aim to identify crossing pedestrians and predict their future trajectories. To achieve these goals, we not only need the context information of road geometry and other traffic participants but also need fine-grained information on human pose, motion and activity, which can be inferred from human keypoints. In this paper, we propose a novel multi-task learning framework for pedestrian crossing action recognition and trajectory prediction, which utilizes 3D human keypoints extracted from raw sensor data to capture rich information on human pose and activity. Moreover, we propose to apply two auxiliary tasks and contrastive learning to enable auxiliary supervision that improves the learned keypoints representation, which further enhances the performance of major tasks. We validate our approach on a large-scale in-house dataset, as well as a public benchmark dataset, and show that our approach achieves state-of-the-art performance on a wide range of evaluation metrics. The effectiveness of each model component is validated in a detailed ablation study.

* ICRA 2023

Big-Data Clustering: K-Means or K-Indicators?

Jun 03, 2019

Feiyu Chen, Yuchen Yang, Liwei Xu, Taiping Zhang, Yin Zhang

Abstract: The K-means algorithm is arguably the most popular data clustering method, commonly applied to processed datasets in some "feature spaces", as in spectral clustering. Highly sensitive to initializations, however, K-means encounters a scalability bottleneck with respect to the number of clusters K as this number grows in big data applications. In this work, we promote a closely related model called the K-indicators model and construct an efficient, semi-convex-relaxation algorithm that requires no randomized initializations. We present extensive empirical results to show the advantages of the new algorithm when K is large. In particular, using the new algorithm to start the K-means algorithm, without any replication, can significantly outperform the standard K-means with a large number of currently state-of-the-art random replications.
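The sensitivity of K-means to initialization can be seen on a toy example. The sketch below is a plain Lloyd iteration on one-dimensional data with hand-picked starting centers. It does not implement the K-indicators model, and the data and function name are made up for illustration.

```python
def lloyd(points, centers, max_iter=100):
    """Standard K-means (Lloyd) iteration on 1-D data.
    Returns the final centers and the sum of squared errors."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
            clusters[j].append(p)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


points = [0, 1, 10, 11, 20, 21]  # three well-separated pairs
print(lloyd(points, [0, 10, 20]))  # ([0.5, 10.5, 20.5], 1.5)
print(lloyd(points, [0, 1, 10]))   # ([0.0, 1.0, 15.5], 101.0)
```

Both runs converge, but the second start is trapped in a much worse local minimum, which is why standard practice relies on many random restarts.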

Compressed Sensing: From Research to Clinical Practice with Data-Driven Learning

Mar 19, 2019

Joseph Y. Cheng, Feiyu Chen, Christopher Sandino, Morteza Mardani, John M. Pauly, Shreyas S. Vasanawala

Abstract: Compressed sensing in MRI enables high subsampling factors while maintaining diagnostic image quality. This technique enables shortened scan durations and/or improved image resolution. Further, compressed sensing can increase the diagnostic information and value from each scan performed. Overall, compressed sensing has significant clinical impact in improving the diagnostic quality and patient experience for imaging exams. However, a number of challenges exist when moving compressed sensing from research to the clinic. These challenges include hand-crafted image priors, sensitive tuning parameters, and long reconstruction times. Data-driven learning provides a solution to address these challenges. As a result, compressed sensing can have greater clinical impact. In this tutorial, we will review the compressed sensing formulation and outline steps needed to transform this formulation to a deep learning framework. Supplementary open-source code in Python will be used to demonstrate this approach with open databases. Further, we will discuss considerations in applying data-driven compressed sensing in the clinical setting.

* Submitted to the Special Issue on Computational MRI: Compressed Sensing and Beyond in the IEEE Signal Processing Magazine
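As background for the "compressed sensing formulation" mentioned in the abstract, a classical L1-regularized reconstruction can be sketched with the iterative soft-thresholding algorithm (ISTA). This is not the tutorial's supplementary code: the generic measurement matrix A, the fixed step size, and the function names are assumptions, and a real MRI model would use a subsampled Fourier operator.

```python
import math


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def ista(A, y, lam, step, iters):
    """Approximately minimize 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y.
        r = [sum(a * xi for a, xi in zip(row, x)) - yi for row, yi in zip(A, y)]
        # Gradient of the data term: A^T r.
        grad = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]
        # Gradient step, then the sparsity-promoting shrinkage.
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
    return x


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(ista(identity, [3.0, -0.5, 0.2], lam=1.0, step=1.0, iters=5))
```

With an identity measurement matrix the result is simply the soft-thresholded measurements, so small entries are driven to zero. Each iteration is a fixed linear step followed by a simple nonlinearity, a structure that learned reconstruction methods often unroll into network layers.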

Highly Scalable Image Reconstruction using Deep Neural Networks with Bandpass Filtering

May 08, 2018

Joseph Y. Cheng, Feiyu Chen, Marcus T. Alley, John M. Pauly, Shreyas S. Vasanawala

Abstract: To increase the flexibility and scalability of deep neural networks for image reconstruction, a framework is proposed based on bandpass filtering. For many applications, sensing measurements are performed indirectly. For example, in magnetic resonance imaging, data are sampled in the frequency domain. The introduction of bandpass filtering enables leveraging known imaging physics while ensuring that the final reconstruction is consistent with actual measurements to maintain reconstruction accuracy. We demonstrate this flexible architecture for reconstructing subsampled datasets of MRI scans. The resulting high subsampling rates increase the speed of MRI acquisitions and enable the visualization of rapid hemodynamics.

* 9 pages, 10 figures
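The basic idea of bandpass filtering frequency-domain data can be sketched on a 1-D signal: split the spectrum into disjoint bands, reconstruct each band separately, and sum the results to recover the original. This only illustrates the linearity that makes band-wise processing possible. The function names and band layout are assumptions, and no neural network or MRI physics is modeled.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def split_bands(signal, n_bands):
    """Return one signal-domain component per contiguous frequency band."""
    spectrum = dft(signal)
    n = len(spectrum)
    edges = [round(i * n / n_bands) for i in range(n_bands + 1)]
    bands = []
    for lo, hi in zip(edges, edges[1:]):
        # Keep only the coefficients inside [lo, hi); zero the rest.
        masked = [c if lo <= k < hi else 0 for k, c in enumerate(spectrum)]
        bands.append(dft(masked, inverse=True))
    return bands


signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
bands = split_bands(signal, 4)
recombined = [sum(band[t] for band in bands) for t in range(len(signal))]
print(all(abs(r - s) < 1e-9 for r, s in zip(recombined, signal)))  # True
```

Because the bands partition the spectrum, each one can in principle be handled independently and in parallel before recombination.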
