Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Qiang Guo

1-Tb/s/λ Transmission over Record 10714-km AR-HCF

Apr 02, 2025

Dawei Ge, Siyuan Liu, Qiang Qiu, Peng Li, Qiang Guo, Yiqi Li, Dong Wang, Baoluo Yan, Mingqing Zuo, Lei Zhang(+5 more)

Abstract:We present the first single-channel 1.001-Tb/s DP-36QAM-PCS recirculating transmission over 73 loops of 146.77-km ultra-low-loss & low-IMI DNANF-5 fiber, achieving a record transmission distance of 10,714.28 km.

Via

Access Paper or Ask Questions

Medical Image Segmentation via Single-Source Domain Generalization with Random Amplitude Spectrum Synthesis

Sep 07, 2024

Qiang Qiao, Wenyu Wang, Meixia Qu, Kun Su, Bin Jiang, Qiang Guo

Abstract:The field of medical image segmentation is challenged by domain generalization (DG) due to domain shifts in clinical datasets. The DG challenge is exacerbated by the scarcity of medical data and privacy concerns. Traditional single-source domain generalization (SSDG) methods primarily rely on stacking data augmentation techniques to minimize domain discrepancies. In this paper, we propose Random Amplitude Spectrum Synthesis (RASS) as a training augmentation for medical images. RASS enhances model generalization by simulating distribution changes from a frequency perspective. This strategy introduces variability by applying amplitude-dependent perturbations to ensure broad coverage of potential domain variations. Furthermore, we propose random mask shuffle and reconstruction components, which can enhance the ability of the backbone to process structural information and increase resilience intra- and cross-domain changes. The proposed Random Amplitude Spectrum Synthesis for Single-Source Domain Generalization (RAS^4DG) is validated on 3D fetal brain images and 2D fundus photography, and achieves an improved DG segmentation performance compared to other SSDG models.

* 11 pages, 4 figures, Medical Image Computing and Computer Assisted Intervention 2024

Via

Access Paper or Ask Questions

Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?

Aug 20, 2024

Chen Liang, Qiang Guo, Xiaochao Qu, Luoqi Liu, Ting Liu

Abstract:Video segmentation aims at partitioning video sequences into meaningful segments based on objects or regions of interest within frames. Current video segmentation models are often derived from image segmentation techniques, which struggle to cope with small-scale or class-imbalanced video datasets. This leads to inconsistent segmentation results across frames. To address these issues, we propose a training strategy Masked Video Consistency, which enhances spatial and temporal feature aggregation. MVC introduces a training strategy that randomly masks image patches, compelling the network to predict the entire semantic segmentation, thus improving contextual information integration. Additionally, we introduce Object Masked Attention (OMA) to optimize the cross-attention mechanism by reducing the impact of irrelevant queries, thereby enhancing temporal modeling capabilities. Our approach, integrated into the latest decoupled universal video segmentation framework, achieves state-of-the-art performance across five datasets for three video segmentation tasks, demonstrating significant improvements over previous methods without increasing model parameters.

Via

Access Paper or Ask Questions

MCNet: A crowd denstity estimation network based on integrating multiscale attention module

Mar 29, 2024

Qiang Guo, Rubo Zhang, Di Zhao

Figure 1 for MCNet: A crowd denstity estimation network based on integrating multiscale attention module

Figure 2 for MCNet: A crowd denstity estimation network based on integrating multiscale attention module

Figure 3 for MCNet: A crowd denstity estimation network based on integrating multiscale attention module

Figure 4 for MCNet: A crowd denstity estimation network based on integrating multiscale attention module

Abstract:Aiming at the metro video surveillance system has not been able to effectively solve the metro crowd density estimation problem, a Metro Crowd density estimation Network (called MCNet) is proposed to automatically classify crowd density level of passengers. Firstly, an Integrating Multi-scale Attention (IMA) module is proposed to enhance the ability of the plain classifiers to extract semantic crowd texture features to accommodate to the characteristics of the crowd texture feature. The innovation of the IMA module is to fuse the dilation convolution, multiscale feature extraction and attention mechanism to obtain multi-scale crowd feature activation from a larger receptive field with lower computational cost, and to strengthen the crowds activation state of convolutional features in top layers. Secondly, a novel lightweight crowd texture feature extraction network is proposed, which can directly process video frames and automatically extract texture features for crowd density estimation, while its faster image processing speed and fewer network parameters make it flexible to be deployed on embedded platforms with limited hardware resources. Finally, this paper integrates IMA module and the lightweight crowd texture feature extraction network to construct the MCNet, and validate the feasibility of this network on image classification dataset: Cifar10 and four crowd density datasets: PETS2009, Mall, QUT and SH_METRO to validate the MCNet whether can be a suitable solution for crowd density estimation in metro video surveillance where there are image processing challenges such as high density, high occlusion, perspective distortion and limited hardware resources.

Via

Access Paper or Ask Questions

Tensor Robust PCA with Nonconvex and Nonlocal Regularization

Nov 04, 2022

Xiaoyu Geng, Qiang Guo, Shuaixiong Hui, Caiming Zhang

Abstract:Tensor robust principal component analysis (TRPCA) is a promising way for low-rank tensor recovery, which minimizes the convex surrogate of tensor rank by shrinking each tensor singular values equally. However, for real-world visual data, large singular values represent more signifiant information than small singular values. In this paper, we propose a nonconvex TRPCA (N-TRPCA) model based on the tensor adjustable logarithmic norm. Unlike TRPCA, our N-TRPCA can adaptively shrink small singular values more and shrink large singular values less. In addition, TRPCA assumes that the whole data tensor is of low rank. This assumption is hardly satisfied in practice for natural visual data, restricting the capability of TRPCA to recover the edges and texture details from noisy images and videos. To this end, we integrate nonlocal self-similarity into N-TRPCA, and further develop a nonconvex and nonlocal TRPCA (NN-TRPCA) model. Specifically, similar nonlocal patches are grouped as a tensor and then each group tensor is recovered by our N-TRPCA. Since the patches in one group are highly correlated, all group tensors have strong low-rank property, leading to an improvement of recovery performance. Experimental results demonstrate that the proposed NN-TRPCA outperforms some existing TRPCA methods in visual data recovery. The demo code is available at https://github.com/qguo2010/NN-TRPCA.

* 19 pages, 7 figures

Via

Access Paper or Ask Questions

NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

May 02, 2021

Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu(+62 more)

Figure 1 for NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

Figure 2 for NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

Figure 3 for NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

Figure 4 for NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

Abstract:This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at a fixed bit-rate. Besides, the quality enhancement of Tracks 1 and 3 targets at improving the fidelity (PSNR), and Track 2 targets at enhancing the perceptual quality. The three tracks totally attract 482 registrations. In the test phase, 12 teams, 8 teams and 11 teams submitted the final results of Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state-of-the-art of video quality enhancement. The homepage of the challenge: https://github.com/RenYang-home/NTIRE21_VEnh

* Corrected the MOS values in Table 2

Via

Access Paper or Ask Questions

TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator

May 08, 2020

Pengyao Zhao, Quanli Liu, Wei Wang, Qiang Guo

Figure 1 for TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator

Figure 2 for TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator

Figure 3 for TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator

Figure 4 for TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator

Abstract:In a generic object tracking, depth (D) information provides informative cues for foreground-background separation and target bounding box regression. However, so far, few trackers have used depth information to play the important role aforementioned due to the lack of a suitable model. In this paper, a RGB-D tracker named TSDM is proposed, which is composed of a Mask-generator (M-g), SiamRPN++ and a Depth-refiner (D-r). The M-g generates the background masks, and updates them as the target 3D position changes. The D-r optimizes the target bounding box estimated by SiamRPN++, based on the spatial depth distribution difference between the target and the surrounding background. Extensive evaluation on the Princeton Tracking Benchmark and the Visual Object Tracking challenge shows that our tracker outperforms the state-of-the-art by a large margin while achieving 23 FPS. In addition, a light-weight variant can run at 31 FPS and thus it is practical for real world applications. Code and models of TSDM are available at https://github.com/lql-team/TSDM.

* 6 Pages, 6 Figures, 2 Tables

Via

Access Paper or Ask Questions

Multi-task Learning for Macromolecule Classification, Segmentation and Coarse Structural Recovery in Cryo-Tomography

May 16, 2018

Chang Liu, Xiangrui Zeng, Kaiwen Wang, Qiang Guo, Min Xu

Figure 1 for Multi-task Learning for Macromolecule Classification, Segmentation and Coarse Structural Recovery in Cryo-Tomography

Figure 2 for Multi-task Learning for Macromolecule Classification, Segmentation and Coarse Structural Recovery in Cryo-Tomography

Figure 3 for Multi-task Learning for Macromolecule Classification, Segmentation and Coarse Structural Recovery in Cryo-Tomography

Figure 4 for Multi-task Learning for Macromolecule Classification, Segmentation and Coarse Structural Recovery in Cryo-Tomography

Abstract:Cellular Electron Cryo-Tomography (CECT) is a powerful 3D imaging tool for studying the native structure and organization of macromolecules inside single cells. For systematic recognition and recovery of macromolecular structures captured by CECT, methods for several important tasks such as subtomogram classification and semantic segmentation have been developed. However, the recognition and recovery of macromolecular structures are still very difficult due to high molecular structural diversity, crowding molecular environment, and the imaging limitations of CECT. In this paper, we propose a novel multi-task 3D convolutional neural network model for simultaneous classification, segmentation, and coarse structural recovery of macromolecules of interest in subtomograms. In our model, the learned image features of one task are shared and thereby mutually reinforce the learning of other tasks. Evaluated on realistically simulated and experimental CECT data, our multi-task learning model outperformed all single-task learning methods for classification and segmentation. In addition, we demonstrate that our model can generalize to discover, segment and recover novel structures that do not exist in the training data.

* British Machine Vision Conference (BMVC) 2018

Via

Access Paper or Ask Questions

An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification

Apr 04, 2018

Yixiu Zhao, Xiangrui Zeng, Qiang Guo, Min Xu

Figure 1 for An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification

Figure 2 for An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification

Figure 3 for An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification

Figure 4 for An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification

Abstract:Motivation: Cellular Electron CryoTomography (CECT) is an emerging 3D imaging technique that visualizes subcellular organization of single cells at submolecular resolution and in near-native state. CECT captures large numbers of macromolecular complexes of highly diverse structures and abundances. However, the structural complexity and imaging limits complicate the systematic de novo structural recovery and recognition of these macromolecular complexes. Efficient and accurate reference-free subtomogram averaging and classification represent the most critical tasks for such analysis. Existing subtomogram alignment based methods are prone to the missing wedge effects and low signal-to-noise ratio (SNR). Moreover, existing maximum-likelihood based methods rely on integration operations, which are in principle computationally infeasible for accurate calculation. Results: Built on existing works, we propose an integrated method, Fast Alignment Maximum Likelihood method (FAML), which uses fast subtomogram alignment to sample sub-optimal rigid transformations. The transformations are then used to approximate integrals for maximum-likelihood update of subtomogram averages through expectation-maximization algorithm. Our tests on simulated and experimental subtomograms showed that, compared to our previously developed fast alignment method (FA), FAML is significantly more robust to noise and missing wedge effects with moderate increases of computation cost.Besides, FAML performs well with significantly fewer input subtomograms when the FA method fails. Therefore, FAML can serve as a key component for improved construction of initial structural models from macromolecules captured by CECT.

* Intelligent Systems for Molecular Biology (ISMB) 2018, Bioinformatics
* 17 pages

Via

Access Paper or Ask Questions

Feature Decomposition Based Saliency Detection in Electron Cryo-Tomograms

Jan 31, 2018

Bo Zhou, Qiang Guo, Xiangrui Zeng, Min Xu

Figure 1 for Feature Decomposition Based Saliency Detection in Electron Cryo-Tomograms

Figure 2 for Feature Decomposition Based Saliency Detection in Electron Cryo-Tomograms

Figure 3 for Feature Decomposition Based Saliency Detection in Electron Cryo-Tomograms

Figure 4 for Feature Decomposition Based Saliency Detection in Electron Cryo-Tomograms

Abstract:Electron Cryo-Tomography (ECT) allows 3D visualization of subcellular structures at the submolecular resolution in close to the native state. However, due to the high degree of structural complexity and imaging limits, the automatic segmentation of cellular components from ECT images is very difficult. To complement and speed up existing segmentation methods, it is desirable to develop a generic cell component segmentation method that is 1) not specific to particular types of cellular components, 2) able to segment unknown cellular components, 3) fully unsupervised and does not rely on the availability of training data. As an important step towards this goal, in this paper, we propose a saliency detection method that computes the likelihood that a subregion in a tomogram stands out from the background. Our method consists of four steps: supervoxel over-segmentation, feature extraction, feature matrix decomposition, and computation of saliency. The method produces a distribution map that represents the regions' saliency in tomograms. Our experiments show that our method can successfully label most salient regions detected by a human observer, and able to filter out regions not containing cellular components. Therefore, our method can remove the majority of the background region, and significantly speed up the subsequent processing of segmentation and recognition of cellular components captured by ECT.

* 14 pages

Via

Access Paper or Ask Questions