Abstract:Photoacoustic computed tomography (PACT) is a non-invasive imaging modality with wide medical applications. Conventional PACT image reconstruction algorithms suffer from wavefront distortion caused by the heterogeneous speed of sound (SOS) in tissue, which leads to image degradation. Accounting for these effects improves image quality, but measuring the SOS distribution is experimentally expensive. An alternative approach is to perform joint reconstruction of the initial pressure image and the SOS using only the PA signals. Existing joint reconstruction methods come with limitations: high computational cost, inability to directly recover the SOS, and reliance on inaccurate simplifying assumptions. Implicit neural representations, or neural fields, are an emerging technique in computer vision for learning efficient, continuous representations of physical fields with a coordinate-based neural network. In this work, we introduce NF-APACT, an efficient self-supervised framework that uses a neural field to estimate the SOS in service of an accurate and robust multi-channel deconvolution. Our method removes SOS aberrations an order of magnitude faster and more accurately than existing methods. We demonstrate its success on a novel numerical phantom as well as experimentally collected phantom and in vivo data. Our code and numerical phantom are available at https://github.com/Lukeli0425/NF-APACT.
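To make the neural-field idea concrete, below is a minimal sketch of a coordinate-based MLP with Fourier-feature encoding that represents a continuous 2-D SOS map. The layer sizes, the encoding, and the 1500 m/s water-like background are illustrative assumptions, not the exact NF-APACT architecture (see the linked repository for that).

```python
import math
import torch
import torch.nn as nn

class SOSNeuralField(nn.Module):
    """Coordinate-based MLP representing a continuous 2-D SOS map (a generic
    neural-field sketch, not the exact NF-APACT model)."""
    def __init__(self, num_freqs=8, hidden=128):
        super().__init__()
        # Fixed Fourier-feature frequencies for positional encoding.
        freqs = 2.0 ** torch.arange(num_freqs, dtype=torch.float32) * math.pi
        self.register_buffer("freqs", freqs)
        in_dim = 2 * 2 * num_freqs  # (sin, cos) x (x, y) x num_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xy):                 # xy: (N, 2), coordinates in [-1, 1]
        proj = xy[..., None] * self.freqs  # (N, 2, num_freqs)
        enc = torch.cat([proj.sin(), proj.cos()], dim=-1).flatten(1)
        # Predict SOS as a perturbation around a water-like 1500 m/s background.
        return 1500.0 + self.mlp(enc)

field = SOSNeuralField()
sos = field(torch.rand(1024, 2) * 2 - 1)   # differentiable SOS samples, (1024, 1)
```

Because the field is queried at arbitrary coordinates and is differentiable end to end, its parameters can be optimized self-supervisedly against the measured PA signals.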
Abstract:Foundation models have evolved rapidly and achieved significant success in computer vision tasks. In particular, the prompt mechanism conveniently allows users to integrate prior information about the image into the model, making it possible to apply models without any training. We therefore propose a training-free method based on foundation models to solve photoacoustic (PA) image segmentation tasks. We employ the segment anything model (SAM), setting simple prompts and integrating the model's outputs with prior knowledge of the imaged objects to accomplish various tasks, including: (1) removing the skin signal in three-dimensional PA image rendering; (2) dual speed-of-sound reconstruction; and (3) segmentation of finger blood vessels. These demonstrations show that deep learning can be applied directly in PA imaging without network design or training, offering a convenient, hands-on route to efficient and accurate segmentation of PA images. This letter serves as a comprehensive tutorial, facilitating mastery of the technique through the provided code and sample datasets.
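As a concrete illustration of the prompt mechanism, the sketch below runs SAM with a single positive point prompt using the official segment-anything package. The checkpoint path, the stand-in image, and the click coordinates are placeholders; the letter's pipeline additionally combines the returned masks with prior knowledge of the imaged objects.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM checkpoint (file path is a placeholder).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image_rgb = np.zeros((256, 256, 3), dtype=np.uint8)  # stand-in PA image slice
predictor.set_image(image_rgb)

# One positive click roughly on the structure of interest (e.g., the skin layer).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[128, 64]]),
    point_labels=np.array([1]),          # 1 = foreground, 0 = background
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]     # keep the highest-scoring candidate
```

The selected mask can then be post-processed with object priors, e.g., retained per slice to strip the skin signal from a 3-D rendering or used to split the image into two speed-of-sound regions.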
Abstract:Lookup tables (LUTs) have shown their efficacy in low-level vision tasks thanks to their low computational cost and hardware independence. However, recent attempts to address single image super-resolution (SISR) with lookup tables are highly constrained by their small receptive field size. Moreover, their single-layer lookup table frameworks limit the extensibility and generalization capacity of the model. In this paper, we propose a framework of series-parallel lookup tables (SPLUT) to alleviate these issues and achieve efficient image super-resolution. On the one hand, we cascade multiple lookup tables to enlarge the receptive field of each extracted feature vector. On the other hand, we propose a parallel network with two branches of cascaded lookup tables that process different components of the input low-resolution images. The two branches thus collaborate with each other and compensate for the precision loss incurred by discretizing input pixels when establishing the lookup tables. Compared to previous lookup table-based methods, our framework has stronger representation ability and a more flexible architecture. Furthermore, we no longer need interpolation methods, which introduce redundant computation, so our method achieves faster inference. Extensive experimental results on five popular benchmark datasets show that our method delivers superior SISR performance more efficiently. The code is available at https://github.com/zhjy2016/SPLUT.
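To illustrate the retrieval step that LUT-based SR builds on, here is a toy single-layer sketch in which each quantized 2x2 low-resolution neighborhood indexes a precomputed table of high-resolution patches. The random table contents, quantization levels, and patch geometry are placeholders; SPLUT itself cascades such tables in two parallel branches rather than using this single-layer form.

```python
import numpy as np

levels, r = 16, 2                       # 4-bit quantized inputs, x2 upscaling
lut = np.random.rand(levels**4, r, r).astype(np.float32)  # stand-in table

def lut_upscale(lr):
    """lr: (H, W) float image in [0, 1]; returns a (H*r, W*r) image."""
    q = np.clip((lr * (levels - 1)).astype(int), 0, levels - 1)
    H, W = q.shape
    hr = np.zeros((H * r, W * r), np.float32)
    for i in range(H - 1):              # borders left unhandled for brevity
        for j in range(W - 1):
            # Flatten the quantized 2x2 neighborhood into one LUT index.
            idx = (((q[i, j] * levels + q[i, j + 1]) * levels
                    + q[i + 1, j]) * levels + q[i + 1, j + 1])
            hr[i * r:(i + 1) * r, j * r:(j + 1) * r] = lut[idx]
    return hr

hr = lut_upscale(np.random.rand(32, 32))
```

The table index grows as levels^(patch size), which is why small receptive fields and coarse quantization are the central constraints that cascading and parallel branches are designed to relax.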
Abstract:The growth of the online economy has created demand for generating images of fashion models wearing product clothes, in order to display new clothes and promote sales. However, the expense of proprietary model images challenges existing image-based virtual try-on methods in this scenario, as most of them must be trained on considerable numbers of model images paired with clothes images. In this paper, we propose a cheap yet scalable weakly-supervised method called Deep Generative Projection (DGP) to address this specific scenario. At the heart of the proposed method is imitating the process by which humans predict the wearing effect, an unsupervised act of imagination based on life experience rather than computation rules learned from supervision. Here a pretrained StyleGAN is used to capture the practical experience of wearing. Experiments show that projecting a rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results. Experiments on real-scene proprietary model images demonstrate the superiority of DGP over several state-of-the-art supervised methods when generating clothing model images.
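The following sketch shows generic GAN-space projection of the kind the abstract describes: optimizing a latent code so the generator reproduces a rough clothing-body alignment image. The `generator` callable, the 512-D latent, the plain MSE objective, and the step count are all illustrative assumptions; DGP's actual search space and losses differ.

```python
import torch
import torch.nn.functional as F

def project(generator, target, steps=500, lr=0.05):
    """Optimize a latent code so generator(w) matches the rough alignment."""
    w = torch.zeros(1, 512, requires_grad=True)   # latent dim is an assumption
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        img = generator(w)                        # (1, 3, H, W)
        loss = F.mse_loss(img, target)            # real systems add perceptual terms
        loss.backward()
        opt.step()
    return w.detach()

# Stand-in usage: a linear "generator" and a random target of matching shape.
g = torch.nn.Sequential(torch.nn.Linear(512, 3 * 16 * 16),
                        torch.nn.Unflatten(1, (3, 16, 16)))
w_star = project(g, torch.rand(1, 3, 16, 16))
```

The key property is that the frozen generator constrains the output to its manifold of realistic images, so even a crude alignment target is "corrected" into a plausible wearing result.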
Abstract:Structures matter in single image super-resolution (SISR). Benefiting from generative adversarial networks (GANs), recent studies have promoted the development of SISR by recovering photo-realistic images. However, there are still undesired structural distortions in the recovered images. In this paper, we propose a structure-preserving super-resolution (SPSR) method to alleviate this issue while maintaining the merits of GAN-based methods to generate perceptually pleasant details. Firstly, we propose SPSR with gradient guidance (SPSR-G), which exploits gradient maps of images to guide the recovery in two aspects. On the one hand, we restore high-resolution gradient maps with a gradient branch to provide additional structure priors for the SR process. On the other hand, we propose a gradient loss to impose a second-order restriction on the super-resolved images, which helps generative networks concentrate more on geometric structures. Secondly, since gradient maps are handcrafted and may capture only limited aspects of structural information, we further extend SPSR-G by introducing a learnable neural structure extractor (NSE) to unearth richer local structures and provide stronger supervision for SR. We propose two self-supervised structure learning methods, contrastive prediction and solving jigsaw puzzles, to train the NSEs. Our methods are model-agnostic and can potentially be used with off-the-shelf SR networks. Experimental results on five benchmark datasets show that the proposed methods outperform state-of-the-art perceptual-driven SR methods under the LPIPS, PSNR, and SSIM metrics. Visual results demonstrate the superiority of our methods in restoring structures while generating natural SR images. Code is available at https://github.com/Maclory/SPSR.
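As a pointer to how a structure extractor can be trained by contrastive prediction, here is a minimal InfoNCE-style objective in which embeddings of two augmented views of the same patch attract while different patches repel. The temperature and the use of in-batch negatives are illustrative assumptions; the exact NSE pretext setup lives in the linked repository.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """InfoNCE over a batch: matching rows of z1/z2 are positive pairs."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau               # (B, B) cosine similarities
    labels = torch.arange(z1.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Usage: embeddings of two augmented views of the same B patches.
loss = info_nce(torch.randn(32, 128), torch.randn(32, 128))
```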
Abstract:As hashing becomes an increasingly appealing technique for large-scale image retrieval, multi-label hashing is also attracting more attention for its ability to exploit multi-level semantic content. In this paper, we propose a novel deep hashing method for scalable multi-label image search. Unlike existing approaches with conventional objectives such as contrastive and triplet losses, we employ a rank list, rather than pairs or triplets, to provide sufficient global supervision for all the samples. Specifically, a new rank-consistency objective is applied to align the similarity orders of two spaces, the original space and the Hamming space. A loss function is designed to penalize samples whose semantic similarity and Hamming distance are mismatched between the two spaces. In addition, a multi-label softmax cross-entropy loss is presented to enhance discriminative power, with a concise formulation of its derivative. To shape the neighborhood structure of samples with different labels, we design a multi-label clustering loss that clusters the hashing vectors of samples with the same labels by reducing the distances between the samples and their multiple corresponding class centers. State-of-the-art experimental results on three public multi-label datasets, MIRFLICKR-25K, IAPRTC12 and NUS-WIDE, demonstrate the effectiveness of the proposed method.
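To sketch the multi-label clustering idea, the toy loss below pulls each hashing vector toward the centers of every class it belongs to. The learnable `centers` tensor and the unweighted average over positive labels are illustrative assumptions rather than the paper's exact formulation.

```python
import torch

def clustering_loss(h, labels, centers):
    """h: (B, K) hash vectors; labels: (B, C) multi-hot; centers: (C, K)."""
    d = torch.cdist(h, centers) ** 2          # (B, C) squared distances
    # Average distance from each sample to the centers of its own classes.
    per_sample = (d * labels).sum(1) / labels.sum(1).clamp(min=1)
    return per_sample.mean()

# Usage with random stand-ins: 8 samples, 48-bit codes, 5 classes.
loss = clustering_loss(torch.randn(8, 48),
                       torch.randint(0, 2, (8, 5)).float(),
                       torch.randn(5, 48))
```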
Abstract:Structures matter in single image super-resolution (SISR). Benefiting from generative adversarial networks (GANs), recent studies have promoted the development of SISR by recovering photo-realistic images. However, there are still undesired structural distortions in the recovered images. In this paper, we propose a structure-preserving super-resolution method to alleviate this issue while maintaining the merits of GAN-based methods to generate perceptually pleasant details. Specifically, we exploit gradient maps of images to guide the recovery in two aspects. On the one hand, we restore high-resolution gradient maps with a gradient branch to provide additional structure priors for the SR process. On the other hand, we propose a gradient loss that imposes a second-order restriction on the super-resolved images. Along with the previous image-space loss functions, the gradient-space objectives help generative networks concentrate more on geometric structures. Moreover, our method is model-agnostic and can potentially be used with off-the-shelf SR networks. Experimental results show that we achieve the best PI and LPIPS performance while maintaining PSNR and SSIM comparable to state-of-the-art perceptual-driven SR methods. Visual results demonstrate our superiority in restoring structures while generating natural SR images.
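A minimal PyTorch sketch of a gradient-space objective of this kind appears below: Sobel-style kernels extract gradient maps, and an L1 term penalizes their mismatch. The exact kernels, magnitude formulation, and loss weighting in the paper may differ.

```python
import torch
import torch.nn.functional as F

def gradient_map(img):
    """img: (B, 1, H, W); returns the per-pixel gradient magnitude."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t()
    gx = F.conv2d(img, kx.view(1, 1, 3, 3), padding=1)
    gy = F.conv2d(img, ky.view(1, 1, 3, 3), padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def gradient_loss(sr, hr):
    # Penalize mismatch between super-resolved and ground-truth gradients.
    return F.l1_loss(gradient_map(sr), gradient_map(hr))

loss = gradient_loss(torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32))
```

Because the gradient map is itself a first derivative of the image, matching gradients acts as the second-order restriction mentioned above, discouraging blurred or bent edges that plain image-space losses tolerate.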
Abstract:Recent works based on deep learning and facial priors have succeeded in super-resolving severely degraded facial images. However, prior knowledge is not fully exploited in existing methods, since facial priors such as landmark and component maps are usually estimated from low-resolution or coarsely super-resolved images, which may be inaccurate and thus hurt recovery performance. In this paper, we propose a deep face super-resolution (FSR) method with iterative collaboration between two recurrent networks which focus on facial image recovery and landmark estimation, respectively. In each recurrent step, the recovery branch utilizes the prior knowledge of landmarks to yield higher-quality images, which in turn facilitate more accurate landmark estimation. Hence, the iterative information interaction between the two processes progressively boosts the performance of each. Moreover, a new attentive fusion module is designed to strengthen the guidance of landmark maps, where facial components are generated individually and aggregated attentively for better restoration. Quantitative and qualitative experimental results show the proposed method significantly outperforms state-of-the-art FSR methods in recovering high-quality face images.
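Schematically, the iterative collaboration can be written as the alternating loop below, where a recovery callable and a landmark callable refine each other's inputs over a few steps. Both callables, the step count, and the trivial stand-ins in the usage line are placeholders for the paper's recurrent branches and attentive fusion module.

```python
import torch

def iterative_fsr(recovery_net, landmark_net, lr_img, steps=3):
    """Alternate recovery and landmark estimation for a few recurrent steps."""
    img, landmarks = lr_img, None
    for _ in range(steps):
        img = recovery_net(img, landmarks)   # landmarks guide the recovery
        landmarks = landmark_net(img)        # a better image refines landmarks
    return img, landmarks

# Stand-in usage with trivial callables (real recurrent branches replace these).
img, lm = iterative_fsr(lambda x, l: x, lambda x: None, torch.rand(1, 3, 32, 32))
```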
Abstract:Recently, deep neural networks (DNNs) have been successfully introduced to the field of lensless imaging through scattering media. By solving an inverse problem in computational imaging, DNNs can overcome several shortcomings of conventional lensless imaging through scattering media, namely high cost, poor quality, complex control, and poor anti-interference. However, training requires collecting a large number of samples on various datasets, and a DNN trained on one dataset generally performs poorly when recovering images from another dataset. The underlying reason is that lensless imaging through scattering media is a high-dimensional regression problem for which an analytical solution is difficult to obtain. In this work, transfer learning is proposed to address this issue. Our main idea is to train a DNN on a relatively complex dataset using a large number of samples and to fine-tune the last few layers using very few samples from other datasets. Compared with the thousands of samples required for training from scratch, transfer learning alleviates the problem of costly data acquisition. Specifically, considering the differences in sample size and similarity among datasets, we propose two DNN architectures, LISMU-FCN and LISMU-OCN, and a balance loss function designed to balance smoothness and sharpness. LISMU-FCN, with far fewer parameters, can achieve imaging across similar datasets, while LISMU-OCN can achieve imaging across significantly different datasets. Moreover, we establish a set of simulation algorithms that closely match real experiments, which is of practical value for research on lensless imaging through scattering media. In summary, this work provides a new solution for lensless imaging through scattering media using transfer learning in DNNs.
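The fine-tuning recipe itself is standard; a generic PyTorch sketch is shown below, freezing a pretrained backbone and retraining only the final layers on the new dataset. The tiny convolutional stack stands in for LISMU-FCN/LISMU-OCN, whose actual definitions are in the paper.

```python
import torch
import torch.nn as nn

# A tiny convolutional stack standing in for the pretrained network.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),           # the "last few layers" to retrain
)

for p in model.parameters():
    p.requires_grad = False                   # freeze everything...
for p in model[-1].parameters():
    p.requires_grad = True                    # ...except the final layer

optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4)
```

Only the unfrozen parameters receive gradient updates, which is why a handful of samples from the new dataset suffices instead of the thousands needed for training from scratch.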
Abstract:In this paper, we propose to incorporate convolutional neural networks with a multi-context attention mechanism into an end-to-end framework for human pose estimation. We adopt stacked hourglass networks to generate attention maps from features at multiple resolutions with various semantics. A Conditional Random Field (CRF) is utilized to model the correlations among neighboring regions in the attention map. We further combine the holistic attention model, which focuses on the global consistency of the full human body, with the body-part attention model, which focuses on detailed descriptions of different body parts. Hence, our model can attend at different granularities, from local salient regions to globally semantically consistent spaces. Additionally, we design novel Hourglass Residual Units (HRUs) to increase the receptive field of the network. These units extend residual units with a side branch incorporating filters with larger receptive fields, so features at various scales are learned and combined within the HRUs. The effectiveness of the proposed multi-context attention mechanism and the hourglass residual units is evaluated on two widely used human pose estimation benchmarks. Our approach outperforms all existing methods on both benchmarks over all body parts.
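To make the HRU idea concrete, here is a minimal sketch of a residual unit with an added side branch whose pooling-convolution-upsampling path sees a wider context than the main branch. The channel width, kernel sizes, and pooling factor are illustrative, not the exact units used in the paper.

```python
import torch
import torch.nn as nn

class HourglassResidualUnit(nn.Module):
    """Residual unit plus a side branch whose pooled path enlarges the
    receptive field; a sketch of the HRU idea, not the exact layer spec."""
    def __init__(self, ch):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        # Side branch: downsample -> conv -> upsample covers a wider context.
        self.side = nn.Sequential(
            nn.MaxPool2d(2),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="nearest"),
        )

    def forward(self, x):
        # Identity, local, and wide-context features are combined additively.
        return x + self.main(x) + self.side(x)

hru = HourglassResidualUnit(16)
y = hru(torch.rand(1, 16, 32, 32))      # same spatial size in and out
```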