Yuan Ma

TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving

May 14, 2025

Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, XianPeng Lang, Sheng Sun

Abstract: In recent years, diffusion models have shown their potential across diverse domains, from vision generation to language modeling. Transferring their capabilities to modern autonomous driving systems has also emerged as a promising direction. In this work, we propose TransDiffuser, an encoder-decoder based generative trajectory planning model for end-to-end autonomous driving. The encoded scene information serves as the multi-modal conditional input of the denoising decoder. To tackle the mode collapse dilemma in generating high-quality diverse trajectories, we introduce a simple yet effective multi-modal representation decorrelation optimization mechanism during the training process. TransDiffuser achieves a PDMS of 94.85 on the NAVSIM benchmark, surpassing previous state-of-the-art methods without any anchor-based prior trajectories.

* Under review
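
The TransDiffuser abstract mentions a representation decorrelation mechanism but does not spell out its form. As a rough illustration only, the sketch below shows one generic way to penalize correlated feature dimensions: the mean squared off-diagonal entry of the batch correlation matrix. The function name `decorrelation_penalty`, the `eps` parameter, and the choice of penalty are assumptions made for this example, not the authors' method.

```python
import numpy as np

def decorrelation_penalty(features: np.ndarray, eps: float = 1e-8) -> float:
    """Mean squared off-diagonal correlation of a (batch, dim) feature matrix.

    Hypothetical helper for illustration; assumes dim >= 2.
    """
    # Standardize each feature dimension across the batch.
    z = features - features.mean(axis=0, keepdims=True)
    z = z / (z.std(axis=0, keepdims=True) + eps)
    # Empirical correlation matrix of shape (dim, dim).
    corr = (z.T @ z) / features.shape[0]
    # Zero the diagonal so only cross-dimension correlations are penalized.
    off_diag = corr - np.diag(np.diag(corr))
    dim = corr.shape[0]
    return float((off_diag ** 2).sum() / (dim * (dim - 1)))

# Two uncorrelated dimensions versus two identical dimensions.
uncorrelated = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
duplicated = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]])
penalty_uncorrelated = decorrelation_penalty(uncorrelated)
penalty_duplicated = decorrelation_penalty(duplicated)
```

In a training loop, a term like this would typically be added to the main loss with a small weight, so that redundant feature dimensions are discouraged without dominating the objective.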

Implementing Trust in Non-Small Cell Lung Cancer Diagnosis with a Conformalized Uncertainty-Aware AI Framework in Whole-Slide Images

Dec 28, 2024

Xiaoge Zhang, Tao Wang, Chao Yan, Fedaa Najdawi, Kai Zhou, Yuan Ma, Yiu-ming Cheung, Bradley A. Malin

Abstract: Ensuring trustworthiness is fundamental to the development of artificial intelligence (AI) that is considered societally responsible, particularly in cancer diagnostics, where a misdiagnosis can have dire consequences. Current digital pathology AI models lack systematic solutions to address trustworthiness concerns arising from model limitations and data discrepancies between model deployment and development environments. To address this issue, we developed TRUECAM, a framework designed to ensure both data and model trustworthiness in non-small cell lung cancer subtyping with whole-slide images. TRUECAM integrates 1) a spectral-normalized neural Gaussian process for identifying out-of-scope inputs and 2) an ambiguity-guided elimination of tiles to filter out highly ambiguous regions, addressing data trustworthiness, as well as 3) conformal prediction to ensure controlled error rates. We systematically evaluated the framework across multiple large-scale cancer datasets, leveraging both task-specific and foundation models, and illustrate that an AI model wrapped with TRUECAM significantly outperforms models that lack such guidance in terms of classification accuracy, robustness, interpretability, and data efficiency, while also achieving improvements in fairness. These findings highlight TRUECAM as a versatile wrapper framework for digital pathology AI models with diverse architectural designs, promoting their responsible and effective application in real-world settings.
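
The TRUECAM abstract names conformal prediction as the component that controls error rates. The following is a minimal sketch of standard split conformal prediction for classification, not the TRUECAM implementation: the helper names `conformal_threshold` and `prediction_set`, the nonconformity score (one minus the probability of the true class), and the toy calibration data are all assumptions chosen for illustration.

```python
import math
import numpy as np

def conformal_threshold(cal_probs: np.ndarray, cal_labels: np.ndarray, alpha: float) -> float:
    """Score threshold from a held-out calibration set (hypothetical helper).

    cal_probs: (n, classes) predicted probabilities; cal_labels: (n,) true classes.
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Rank of the finite-sample-corrected (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points: every class is kept
    return float(np.sort(scores)[k - 1])

def prediction_set(probs, threshold: float) -> list:
    """All classes whose nonconformity score does not exceed the threshold."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= threshold]

# Toy calibration set: 7 samples, 2 classes, true class is always 0.
p_true = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3])
cal_probs = np.stack([p_true, 1.0 - p_true], axis=1)
cal_labels = np.zeros(7, dtype=int)
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

confident_set = prediction_set([0.95, 0.05], threshold)
ambiguous_set = prediction_set([0.55, 0.45], threshold)
```

Under the usual exchangeability assumption, sets built this way contain the true class with probability at least 1 - alpha. A confident prediction yields a single class, while an ambiguous one yields several, which is how the set size signals uncertainty.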

A Backdoor Attack Scheme with Invisible Triggers Based on Model Architecture Modification

Dec 22, 2024

Yuan Ma, Xu Ma, Jiankang Wei, Jinmeng Tang, Xiaoyu Zhang, Yilun Lyu, Kehao Chen, Jingtong Huang

Abstract: Machine learning systems are vulnerable to backdoor attacks, where attackers manipulate model behavior through data tampering or architectural modifications. Traditional backdoor attacks involve injecting malicious samples with specific triggers into the training data, causing the model to produce targeted incorrect outputs in the presence of the corresponding triggers. More sophisticated attacks modify the model's architecture directly, embedding backdoors that are harder to detect as they evade traditional data-based detection methods. However, the drawback of architectural-modification-based backdoor attacks is that the trigger must be visible in order to activate the backdoor. To further strengthen the invisibility of backdoor attacks, this paper presents a novel backdoor attack method. Specifically, this method embeds the backdoor within the model's architecture and can generate inconspicuous and stealthy triggers. The attack is implemented by modifying pre-trained models, which are then redistributed, thereby posing a potential threat to unsuspecting users. Comprehensive experiments conducted on standard computer vision benchmarks validate the effectiveness of this attack and highlight the stealthiness of its triggers, which remain undetectable through both manual visual inspection and advanced detection tools.

Deep learning and random light structuring ensure robust free-space communications

Jan 18, 2024

Xiaofei Li, Yu Wang, Xin Liu, Yuan Ma, Yangjian Cai, Sergey A. Ponomarenko, Xianlong Liu

Abstract: Having shown early promise, free-space optical communications (FSO) face formidable challenges in the age of information explosion. The ever-growing demand for greater channel communication capacity is one of the challenges. The inter-channel crosstalk, which severely degrades the quality of transmitted information, creates another roadblock in the way of efficient FSO implementation. Here we advance theoretically and realize experimentally a potentially high-capacity FSO protocol that enables high-fidelity transfer of an image, or a set of images, through a complex environment. In our protocol, we complement random light structuring at the transmitter with a deep learning image classification platform at the receiver. Multiplexing novel, independent, mutually orthogonal degrees of freedom available to structured random light can potentially significantly boost the channel communication capacity of our protocol without introducing any deleterious crosstalk. Specifically, we show how one can multiplex the degrees of freedom associated with the source coherence radius and the spatial position of a beamlet within an array of structured random beams to greatly enhance the capacity of our communication link. The superb resilience of structured random light to environmental noise, as well as the extreme efficiency of deep learning networks at classifying images, guarantees high-fidelity image transfer within the framework of our protocol.

* 18 pages, 13 figures

Self-supervised Learning for Electroencephalogram: A Systematic Survey

Jan 09, 2024

Weining Weng, Yang Gu, Shuai Guo, Yuan Ma, Zhaohua Yang, Yuchen Liu, Yiqiang Chen

Abstract: Electroencephalogram (EEG) is a non-invasive technique to record bioelectrical signals. Integrating supervised deep learning techniques with EEG signals has recently facilitated automatic analysis across diverse EEG-based tasks. However, the label issues of EEG signals have constrained the development of EEG-based deep models. Obtaining EEG annotations is difficult because it requires domain experts to guide collection and labeling, and the variability of EEG signals among different subjects causes significant label shifts. To solve the above challenges, self-supervised learning (SSL) has been proposed to extract representations from unlabeled samples through well-designed pretext tasks. This paper concentrates on integrating SSL frameworks with temporal EEG signals to achieve efficient representation and presents a systematic review of SSL for EEG signals. In this paper, 1) we introduce the concept and theory of self-supervised learning and typical SSL frameworks. 2) We provide a comprehensive review of SSL for EEG analysis, including the taxonomy, methodology, and technical details of existing EEG-based SSL frameworks, and discuss the differences between these methods. 3) We investigate the adaptation of the SSL approach to various downstream tasks, including the task description and related benchmark datasets. 4) Finally, we discuss potential directions for future SSL-EEG research.

* 35 pages, 12 figures

Infrared Small Target Detection Using Double-Weighted Multi-Granularity Patch Tensor Model With Tensor-Train Decomposition

Oct 09, 2023

Guiyu Zhang, Qunbo Lv, Zui Tao, Baoyu Zhu, Zheng Tan, Yuan Ma

Abstract: Infrared small target detection plays an important role in the remote sensing field. Many detection algorithms have therefore been proposed, among which the infrared patch-tensor (IPT) model has become a mainstream tool due to its excellent performance. However, most IPT-based methods face great challenges, such as inaccurate measurement of tensor low-rankness and poor robustness to complex scenes, which lead to poor detection performance. To solve these problems, this paper proposes a novel double-weighted multi-granularity infrared patch tensor (DWMGIPT) model. First, to capture information at different granularities from multiple modes of the tensor, a multi-granularity infrared patch tensor (MGIPT) model is constructed by collecting nonoverlapping patches and applying tensor augmentation based on the tensor train (TT) decomposition. Second, to explore the latent structure of the tensor more efficiently, we utilize an auto-weighted mechanism to balance the importance of information at different granularities. Then, the steering kernel (SK) is employed to extract a local structure prior, which suppresses background interference such as strong edges and noise. Finally, an efficient optimization algorithm based on the alternating direction method of multipliers (ADMM) is presented to solve the model. Extensive experiments in various challenging scenes show that the proposed algorithm is robust to noise and different scenes. Compared with eight other state-of-the-art methods, different evaluation metrics demonstrate that our method achieves better detection performance in various complex scenes.

Image Super-Resolution via Residual Blended Attention Generative Adversarial Network with Dual Discriminators

Nov 13, 2019

Yuan Ma, Kewen Liu, Hongxia Xiong, Panpan Fang, Xiaojun Li, Yalei Chen, Chaoyang Liu

Abstract: While reviewing the code, we found what appears to be an error in the training code that we had not noticed earlier. This article is temporarily withdrawn and will be revised and resubmitted after the code review.

Single Image Super-resolution via Dense Blended Attention Generative Adversarial Network for Clinical Diagnosis

Jun 15, 2019

Kewen Liu, Yuan Ma, Hongxia Xiong, Zejun Yan, Zhijun Zhou, Chaoyang Liu, Panpan Fang, Xiaojun Li, Yalei Chen

Abstract: In clinical diagnosis, doctors are able to see biological tissues and early lesions more clearly with the assistance of high-resolution (HR) medical images, which is of vital significance for improving diagnosis accuracy. To address the issue that medical images suffer from severe blurring caused by a lack of high-frequency details, this paper develops a novel image super-resolution (SR) algorithm called SR-DBAN via a dense neural network and a blended attention mechanism. Specifically, a novel blended attention block is proposed and introduced to the dense neural network (DenseNet), so that the network can adaptively concentrate more attention on the regions and channels with sufficient high-frequency details. In the framework of SR-DBAN, batch normalization layers in the original DenseNet are removed to avoid loss of high-frequency texture details, and final HR images are obtained by deconvolution at the very end of the network. Furthermore, inspired by the impressive performance of generative adversarial networks, this paper develops a novel image SR algorithm called SR-DBAGAN via a dense blended attention generative adversarial network. SR-DBAGAN consists of a generator and a discriminator: the generator uses our proposed SR-DBAN to generate HR images and tries to fool the discriminator, while the discriminator is designed based on Wasserstein GAN (WGAN) to discriminate generated images from real ones. We deployed our algorithms on blurry prostate MRI images, and experimental results showed that our proposed algorithms generate considerable sharpness and texture details and significantly improve the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) compared with mainstream interpolation-based and deep learning-based image SR algorithms, which demonstrates the effectiveness and superiority of our proposed algorithms.

* arXiv admin note: substantial text overlap with arXiv:1905.05084

Medical image super-resolution method based on dense blended attention network

May 13, 2019

Kewen Liu, Yuan Ma, Hongxia Xiong, Zejun Yan, Zhijun Zhou, Panpan Fang, Chaoyang Liu

Abstract: To address the issue that medical images suffer from severe blurring caused by the lack of high-frequency details in the process of image super-resolution reconstruction, a novel medical image super-resolution method based on a dense neural network and a blended attention mechanism is proposed. The proposed method adds blended attention blocks to the dense neural network (DenseNet), so that the network can concentrate more attention on the regions and channels with sufficient high-frequency details. Batch normalization layers are removed to avoid loss of high-frequency texture details. The final high-resolution medical images are obtained using deconvolutional layers at the very end of the network as up-sampling operators. Experimental results show that the proposed method achieves an improvement of 0.05 dB to 11.25 dB and 0.6% to 14.04% on the peak signal-to-noise ratio (PSNR) metric and structural similarity index (SSIM) metric, respectively, compared with mainstream image super-resolution methods. This work provides a new idea for theoretical studies of medical image super-resolution reconstruction.

* 12 pages, 4 figures, 32 references
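
The abstract above reports its gains in decibels of PSNR. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE). The function name `psnr`, the default `max_value` of 255 (8-bit images), and the toy arrays are assumptions for this example; the paper's own evaluation code is not shown in the source.

```python
import numpy as np

def psnr(reference, estimate, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    # Mean squared error over all pixels.
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Toy check: a uniform error of 25.5 on a 0..255 scale gives MAX^2 / MSE = 100.
reference = np.zeros((4, 4))
estimate = reference + 25.5
psnr_uniform_error = psnr(reference, estimate)
psnr_identical = psnr(reference, reference)
```

Because the scale is logarithmic, a gain of 10 dB corresponds to a tenfold reduction in mean squared error, which helps put the reported range of improvements in context.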

Vehicle Speed Prediction using Deep Learning

Oct 25, 2015

Joe Lemieux, Yuan Ma

Abstract: Global optimization of the energy consumption of dual power source vehicles such as hybrid electric vehicles, plug-in hybrid electric vehicles, and plug-in fuel cell electric vehicles requires knowledge of the complete route characteristics at the beginning of the trip. One of the main characteristics is the vehicle speed profile across the route. The profile will translate directly into energy requirements for a given vehicle. However, the vehicle speed that a given driver chooses will vary from driver to driver and from time to time, and may be slower than, equal to, or faster than the average traffic flow. If the specific driver speed profile can be predicted, the energy usage can be optimized across the route chosen. The purpose of this paper is to research the application of Deep Learning techniques to this problem, in order to identify, at the beginning of a drive cycle, the driver-specific vehicle speed profile for an individual driver's repeated drive cycle. This profile can then be used in an optimization algorithm to minimize the amount of fossil fuel energy used during the trip.
