Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jirui Yang

Concept Enhancement Engineering: A Lightweight and Efficient Robust Defense Against Jailbreak Attacks in Embodied AI

Apr 15, 2025

Jirui Yang, Zheyu Lin, Shuhan Yang, Zhihui Lu, Xin Du

Abstract:Embodied Intelligence (EI) systems integrated with large language models (LLMs) face significant security risks, particularly from jailbreak attacks that manipulate models into generating harmful outputs or executing unsafe physical actions. Traditional defense strategies, such as input filtering and output monitoring, often introduce high computational overhead or interfere with task performance in real-time embodied scenarios. To address these challenges, we propose Concept Enhancement Engineering (CEE), a novel defense framework that leverages representation engineering to enhance the safety of embodied LLMs by dynamically steering their internal activations. CEE operates by (1) extracting multilingual safety patterns from model activations, (2) constructing control directions based on safety-aligned concept subspaces, and (3) applying subspace concept rotation to reinforce safe behavior during inference. Our experiments demonstrate that CEE effectively mitigates jailbreak attacks while maintaining task performance, outperforming existing defense methods in both robustness and efficiency. This work contributes a scalable and interpretable safety mechanism for embodied AI, bridging the gap between theoretical representation engineering and practical security applications. Our findings highlight the potential of latent-space interventions as a viable defense paradigm against emerging adversarial threats in physically grounded AI systems.

Via

Access Paper or Ask Questions

Ambiguity-Free Broadband DOA Estimation Relying on Parameterized Time-Frequency Transform

Mar 05, 2025

Wei Wang, Shefeng Yan, Linlin Mao, Zeping Sui, Jirui Yang

Abstract:An ambiguity-free direction-of-arrival (DOA) estimation scheme is proposed for sparse uniform linear arrays under low signal-to-noise ratios (SNRs) and non-stationary broadband signals. First, for achieving better DOA estimation performance at low SNRs while using non-stationary signals compared to the conventional frequency-difference (FD) paradigms, we propose parameterized time-frequency transform-based FD processing. Then, the unambiguous compressive FD beamforming is conceived to compensate the resolution loss induced by difference operation. Finally, we further derive a coarse-to-fine histogram statistics scheme to alleviate the perturbation in compressive FD beamforming with good DOA estimation accuracy. Simulation results demonstrate the superior performance of our proposed algorithm regarding robustness, resolution, and DOA estimation accuracy.

* 6 figures

Via

Access Paper or Ask Questions

Backdoor Attack on Vertical Federated Graph Neural Network Learning

Oct 15, 2024

Jirui Yang, Peng Chen, Zhihui Lu, Ruijun Deng, Qiang Duan, Jianping Zeng

Abstract:Federated Graph Neural Network (FedGNN) is a privacy-preserving machine learning technology that combines federated learning (FL) and graph neural networks (GNNs). It offers a privacy-preserving solution for training GNNs using isolated graph data. Vertical Federated Graph Neural Network (VFGNN) is an important branch of FedGNN, where data features and labels are distributed among participants, and each participant has the same sample space. Due to the difficulty of accessing and modifying distributed data and labels, the vulnerability of VFGNN to backdoor attacks remains largely unexplored. In this context, we propose BVG, the first method for backdoor attacks in VFGNN. Without accessing or modifying labels, BVG uses multi-hop triggers and requires only four target class nodes for an effective backdoor attack. Experiments show that BVG achieves high attack success rates (ASR) across three datasets and three different GNN models, with minimal impact on main task accuracy (MTA). We also evaluate several defense methods, further validating the robustness and effectiveness of BVG. This finding also highlights the need for advanced defense mechanisms to counter sophisticated backdoor attacks in practical VFGNN applications.

Via

Access Paper or Ask Questions

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Jun 18, 2024

Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao

Figure 1 for UIFV: Data Reconstruction Attack in Vertical Federated Learning

Figure 2 for UIFV: Data Reconstruction Attack in Vertical Federated Learning

Figure 3 for UIFV: Data Reconstruction Attack in Vertical Federated Learning

Figure 4 for UIFV: Data Reconstruction Attack in Vertical Federated Learning

Abstract:Vertical Federated Learning (VFL) facilitates collaborative machine learning without the need for participants to share raw private data. However, recent studies have revealed privacy risks where adversaries might reconstruct sensitive features through data leakage during the learning process. Although data reconstruction methods based on gradient or model information are somewhat effective, they reveal limitations in VFL application scenarios. This is because these traditional methods heavily rely on specific model structures and/or have strict limitations on application scenarios. To address this, our study introduces the Unified InverNet Framework into VFL, which yields a novel and flexible approach (dubbed UIFV) that leverages intermediate feature data to reconstruct original data, instead of relying on gradients or model details. The intermediate feature data is the feature exchanged by different participants during the inference phase of VFL. Experiments on four datasets demonstrate that our methods significantly outperform state-of-the-art techniques in attack precision. Our work exposes severe privacy vulnerabilities within VFL systems that pose real threats to practical VFL applications and thus confirms the necessity of further enhancing privacy protection in the VFL architecture.

Via

Access Paper or Ask Questions

PatchDCT: Patch Refinement for High Quality Instance Segmentation

Feb 07, 2023

Qinrou Wen, Jirui Yang, Xue Yang, Kewei Liang

Abstract:High-quality instance segmentation has shown emerging importance in computer vision. Without any refinement, DCT-Mask directly generates high-resolution masks by compressed vectors. To further refine masks obtained by compressed vectors, we propose for the first time a compressed vector based multi-stage refinement framework. However, the vanilla combination does not bring significant gains, because changes in some elements of the DCT vector will affect the prediction of the entire mask. Thus, we propose a simple and novel method named PatchDCT, which separates the mask decoded from a DCT vector into several patches and refines each patch by the designed classifier and regressor. Specifically, the classifier is used to distinguish mixed patches from all patches, and to correct previously mispredicted foreground and background patches. In contrast, the regressor is used for DCT vector prediction of mixed patches, further refining the segmentation quality at boundary locations. Experiments on COCO show that our method achieves 2.0%, 3.2%, 4.5% AP and 3.4%, 5.3%, 7.0% Boundary AP improvements over Mask-RCNN on COCO, LVIS, and Cityscapes, respectively. It also surpasses DCT-Mask by 0.7%, 1.1%, 1.3% AP and 0.9%, 1.7%, 4.2% Boundary AP on COCO, LVIS and Cityscapes. Besides, the performance of PatchDCT is also competitive with other state-of-the-art methods.

* 15 pages, 7 figures, 13 tables, accepted by ICLR 2023, the source code is available at https://github.com/olivia-w12/PatchDCT

Via

Access Paper or Ask Questions

The KFIoU Loss for Rotated Object Detection

Feb 01, 2022

Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian

Figure 1 for The KFIoU Loss for Rotated Object Detection

Figure 2 for The KFIoU Loss for Rotated Object Detection

Figure 3 for The KFIoU Loss for Rotated Object Detection

Figure 4 for The KFIoU Loss for Rotated Object Detection

Abstract:Differing from the well-developed horizontal object detection area whereby the computing-friendly IoU based loss is readily adopted and well fits with the detection metrics. In contrast, rotation detectors often involve a more complicated loss based on SkewIoU which is unfriendly to gradient-based training. In this paper, we argue that one effective alternative is to devise an approximate loss who can achieve trend-level alignment with SkewIoU loss instead of the strict value-level identity. Specifically, we model the objects as Gaussian distribution and adopt Kalman filter to inherently mimic the mechanism of SkewIoU by its definition, and show its alignment with the SkewIoU at trend-level. This is in contrast to recent Gaussian modeling based rotation detectors e.g. GWD, KLD that involves a human-specified distribution distance metric which requires additional hyperparameter tuning. The resulting new loss called KFIoU is easier to implement and works better compared with exact SkewIoU, thanks to its full differentiability and ability to handle the non-overlapping cases. We further extend our technique to the 3-D case which also suffers from the same issues as 2-D detection. Extensive results on various public datasets (2-D/3-D, aerial/text/face images) with different base detectors show the effectiveness of our approach.

* 19 pages, 5 figures, 11 tables, tensorflow code: https://github.com/yangxue0827/RotationDetection, pytorch code: https://github.com/open-mmlab/mmrotate

Via

Access Paper or Ask Questions

Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Jun 04, 2021

Xue Yang, Xiaojiang Yang, Jirui Yang, Qi Ming, Wentao Wang, Qi Tian, Junchi Yan

Figure 1 for Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Figure 2 for Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Figure 3 for Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Figure 4 for Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Abstract:Existing rotated object detectors are mostly inherited from the horizontal detection paradigm, as the latter has evolved into a well-developed area. However, these detectors are difficult to perform prominently in high-precision detection due to the limitation of current regression loss design, especially for objects with large aspect ratios. Taking the perspective that horizontal detection is a special case for rotated object detection, in this paper, we are motivated to change the design of rotation regression loss from induction paradigm to deduction methodology, in terms of the relation between rotation and horizontal detection. We show that one essential challenge is how to modulate the coupled parameters in the rotation regression loss, as such the estimated parameters can influence to each other during the dynamic joint optimization, in an adaptive and synergetic way. Specifically, we first convert the rotated bounding box into a 2-D Gaussian distribution, and then calculate the Kullback-Leibler Divergence (KLD) between the Gaussian distributions as the regression loss. By analyzing the gradient of each parameter, we show that KLD (and its derivatives) can dynamically adjust the parameter gradients according to the characteristics of the object. It will adjust the importance (gradient weight) of the angle parameter according to the aspect ratio. This mechanism can be vital for high-precision detection as a slight angle error would cause a serious accuracy drop for large aspect ratios objects. More importantly, we have proved that KLD is scale invariant. We further show that the KLD loss can be degenerated into the popular $l_{n}$-norm loss for horizontal detection. Experimental results on seven datasets using different detectors show its consistent superiority, and codes are available at https://github.com/yangxue0827/RotationDetection.

* 15 pages, 5 figures, 7 tables

Via

Access Paper or Ask Questions

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Nov 19, 2020

Xing Shen, Jirui Yang, Chunbo Wei, Bing Deng, Jianqiang Huang, Xiansheng Hua, Xiaoliang Cheng, Kewei Liang

Figure 1 for DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Figure 2 for DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Figure 3 for DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Figure 4 for DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Abstract:Binary grid mask representation is broadly used in instance segmentation. A representative instantiation is Mask R-CNN which predicts masks on a $28\times 28$ binary grid. Generally, a low-resolution grid is not sufficient to capture the details, while a high-resolution grid dramatically increases the training complexity. In this paper, we propose a new mask representation by applying the discrete cosine transform(DCT) to encode the high-resolution binary grid mask into a compact vector. Our method, termed DCT-Mask, could be easily integrated into most pixel-based instance segmentation methods. Without any bells and whistles, DCT-Mask yields significant gains on different frameworks, backbones, datasets, and training schedules. It does not require any pre-processing or pre-training, and almost no harm to the running speed. Especially, for higher-quality annotations and more complex backbones, our method has a greater improvement. Moreover, we analyze the performance of our method from the perspective of the quality of mask representation. The main reason why DCT-Mask works well is that it obtains a high-quality mask representation with low complexity. Code will be made available.

Via

Access Paper or Ask Questions

R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

Nov 20, 2018

Xue Yang, Kun Fu, Hao Sun, Jirui Yang, Zhi Guo, Menglong Yan, Tengfei Zhang, Sun Xian

Figure 1 for R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

Figure 2 for R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

Figure 3 for R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

Figure 4 for R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

Abstract:Object detection plays a vital role in natural scene and aerial scene and is full of challenges. Although many advanced algorithms have succeeded in the natural scene, the progress in the aerial scene has been slow due to the complexity of the aerial image and the large degree of freedom of remote sensing objects in scale, orientation, and density. In this paper, a novel multi-category rotation detector is proposed, which can efficiently detect small objects, arbitrary direction objects, and dense objects in complex remote sensing images. Specifically, the proposed model adopts a targeted feature fusion strategy called inception fusion network, which fully considers factors such as feature fusion, anchor sampling, and receptive field to improve the ability to handle small objects. Then we combine the pixel attention network and the channel attention network to weaken the noise information and highlight the objects feature. Finally, the rotational object detection algorithm is realized by redefining the rotating bounding box. Experiments on public datasets including DOTA, NWPU VHR-10 demonstrate that the proposed algorithm significantly outperforms state-of-the-art methods. The code and models will be available at https://github.com/DetectionTeamUCAS/R2CNN-Plus-Plus_Tensorflow.

* 10 pages, 8 figures, 5 tables

Via

Access Paper or Ask Questions

Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Jun 12, 2018

Xue Yang, Hao Sun, Kun Fu, Jirui Yang, Xian Sun, Menglong Yan, Zhi Guo

Figure 1 for Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Figure 2 for Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Figure 3 for Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Figure 4 for Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Abstract:Ship detection has been playing a significant role in the field of remote sensing for a long time but it is still full of challenges. The main limitations of traditional ship detection methods usually lie in the complexity of application scenarios, the difficulty of intensive object detection and the redundancy of detection region. In order to solve such problems above, we propose a framework called Rotation Dense Feature Pyramid Networks (R-DFPN) which can effectively detect ship in different scenes including ocean and port. Specifically, we put forward the Dense Feature Pyramid Network (DFPN), which is aimed at solving the problem resulted from the narrow width of the ship. Compared with previous multi-scale detectors such as Feature Pyramid Network (FPN), DFPN builds the high-level semantic feature-maps for all scales by means of dense connections, through which enhances the feature propagation and encourages the feature reuse. Additionally, in the case of ship rotation and dense arrangement, we design a rotation anchor strategy to predict the minimum circumscribed rectangle of the object so as to reduce the redundant detection region and improve the recall. Furthermore, we also propose multi-scale ROI Align for the purpose of maintaining the completeness of semantic and spatial information. Experiments based on remote sensing images from Google Earth for ship detection show that our detection method based on R-DFPN representation has a state-of-the-art performance.

* Remote Sens. 2018, 10, 132
* 14 pages, 11 figures

Via

Access Paper or Ask Questions