Rui Ding

Joint Resource Optimization Over Licensed and Unlicensed Spectrum in Spectrum Sharing UAV Networks Against Jamming Attacks

Jul 23, 2025

Rui Ding, Fuhui Zhou, Yuhang Wu, Qihui Wu, Tony Q. S. Quek

Abstract: Unmanned aerial vehicle (UAV) communication is of crucial importance in realizing heterogeneous practical wireless application scenarios. However, densely populated users and diverse services with high data rate demands have made UAV spectrum increasingly scarce. To tackle this problem, it is promising to incorporate the underutilized unlicensed spectrum with the licensed spectrum to boost network capacity. However, the openness of unlicensed spectrum makes UAVs susceptible to security threats from potential jammers. Therefore, this paper considers a spectrum sharing UAV network that coexists with a licensed cellular network and an unlicensed Wi-Fi network and employs an anti-jamming technique. The sum rate maximization of the secondary network is studied by jointly optimizing the transmit power, subchannel allocation, and UAV trajectory. We first decompose the challenging non-convex problem into two subproblems: 1) joint power and subchannel allocation and 2) UAV trajectory design. A low-complexity iterative algorithm is proposed that alternates between these two subproblems to solve the formulated problem. Specifically, the Lagrange dual decomposition is exploited to jointly optimize the transmit power and subchannel allocation iteratively. Then, an efficient iterative algorithm capitalizing on successive convex approximation is designed to get a suboptimal solution for the UAV trajectory. Simulation results demonstrate that our proposed algorithm can significantly improve the sum transmission rate compared with the benchmark schemes.
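
The abstract does not spell out the power allocation step, so the following is only a generic illustration of how a Lagrange dual variable can drive power allocation: textbook water-filling for maximizing a sum of log rates under a total power budget. The function name, channel gains, and budget are hypothetical, and the sketch ignores subchannel assignment, jamming, and the UAV trajectory, so it should not be read as the paper's algorithm.

```python
import math

def water_filling(gains, total_power, tol=1e-9):
    """Maximize sum(log2(1 + g * p)) subject to sum(p) <= total_power, p >= 0.

    Generic textbook water-filling, NOT the algorithm from the paper.
    gains: positive effective channel gains (hypothetical values).
    """
    # The KKT conditions give p_k = max(0, mu - 1/g_k), where the water
    # level mu is determined by the dual variable of the power constraint.
    lo = 0.0
    hi = total_power + max(1.0 / g for g in gains)  # level that surely overspends
    while hi - lo > tol:
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > total_power:
            hi = mu   # water level too high, lower it
        else:
            lo = mu   # budget not exceeded, raise it
    return [max(0.0, lo - 1.0 / g) for g in gains]

gains = [2.0, 1.0, 0.1]  # hypothetical channel gains
powers = water_filling(gains, total_power=1.0)
rate = sum(math.log2(1 + g * p) for g, p in zip(gains, powers))
```

With these made-up gains, the weakest channel receives no power, which is the usual water-filling behavior: the budget is spent where the gain is highest.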

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Mar 20, 2025

Xuan Shen, Weize Ma, Jing Liu, Changdi Yang, Rui Ding, Quanyi Wang, Henghui Ding, Wei Niu, Yanzhi Wang, Pu Zhao (+2 more)

Abstract: Monocular Depth Estimation (MDE) has emerged as a pivotal task in computer vision, supporting numerous real-world applications. However, deploying accurate depth estimation models on resource-limited edge devices, especially Application-Specific Integrated Circuits (ASICs), is challenging due to the high computational and memory demands. Recent advancements in foundational depth estimation deliver impressive results but further amplify the difficulty of deployment on ASICs. To address this, we propose QuartDepth, which adopts post-training quantization to quantize MDE models with hardware accelerations for ASICs. Our approach involves quantizing both weights and activations to 4-bit precision, reducing the model size and computation cost. To mitigate the performance degradation, we introduce an activation polishing and compensation algorithm applied before and after activation quantization, as well as a weight reconstruction method for minimizing errors in weight quantization. Furthermore, we design a flexible and programmable hardware accelerator by supporting kernel fusion and customized instruction programmability, enhancing throughput and efficiency. Experimental results demonstrate that our framework achieves competitive accuracy while enabling fast inference and higher energy efficiency on ASICs, bridging the gap between high-performance depth estimation and practical edge-device applicability. Code: https://github.com/shawnricecake/quart-depth

* Accepted by CVPR 2025
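
To make "quantizing to 4-bit precision" concrete, here is a minimal sketch of plain uniform symmetric quantization, assuming a single per-tensor scale. It is a generic illustration, not QuartDepth's procedure: the activation polishing, compensation, and weight reconstruction steps mentioned in the abstract are not modeled, and the function names and weight values are hypothetical.

```python
def quantize_symmetric(values, num_bits=4):
    """Uniform symmetric quantization to signed num_bits integers.

    Generic illustration only; not QuartDepth's actual procedure.
    """
    qmax = 2 ** (num_bits - 1) - 1       # 7 for 4-bit
    qmin = -(2 ** (num_bits - 1))        # -8 for 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code, then clip to the representable range.
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]

weights = [0.42, -1.30, 0.07, 0.91, -0.55]   # hypothetical weights
codes, scale = quantize_symmetric(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Because the largest magnitude maps exactly onto the top code, no value is clipped here and the round-trip error stays within half a quantization step (`scale / 2`). That rounding error is the kind of degradation that reconstruction methods try to reduce.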

UAV Cognitive Semantic Communications Enabled by Knowledge Graph for Robust Object Detection

Feb 06, 2025

Xi Song, Fuhui Zhou, Rui Ding, Zhibo Qu, Yihao Li, Qihui Wu, Naofal Al-Dhahir

Abstract: Unmanned aerial vehicles (UAVs) are widely used for object detection. However, existing UAV-based object detection systems face severe challenges, namely their limited computation, energy, and communication resources, which limit the achievable detection performance. To overcome these challenges, a UAV cognitive semantic communication system is proposed by exploiting a knowledge graph. Moreover, we design a multi-scale codec for semantic compression to reduce data transmission volume while guaranteeing detection performance. Considering the complexity and dynamicity of UAV communication scenarios, a signal-to-noise ratio (SNR) adaptive module with robust channel adaptation capability is introduced. Furthermore, an object detection scheme is proposed by exploiting the knowledge graph to overcome channel noise interference and compression distortion. Simulations conducted on a practical aerial image dataset demonstrate that our proposed semantic communication system outperforms benchmark systems in terms of detection accuracy, communication robustness, and computation efficiency, especially in dealing with low bandwidth compression ratios and low SNR regimes.

Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers

Dec 19, 2024

Rui Ding, Liang Yong, Sihuan Zhao, Jing Nie, Lihui Chen, Haijun Liu, Xichuan Zhou

Abstract: Due to its efficiency, Post-Training Quantization (PTQ) has been widely adopted for compressing Vision Transformers (ViTs). However, when ViTs are quantized into low-bit representations, they often suffer a significant performance drop compared to their full-precision counterparts. To address this issue, reconstruction methods have been incorporated into the PTQ framework to improve performance in low-bit quantization settings. Nevertheless, existing related methods predefine the reconstruction granularity and seldom explore the progressive relationships between different reconstruction granularities, which leads to sub-optimal quantization results in ViTs. To this end, we propose a Progressive Fine-to-Coarse Reconstruction (PFCR) method for accurate PTQ, which significantly improves the performance of low-bit quantized vision transformers. Specifically, we define multi-head self-attention and multi-layer perceptron modules along with their shortcuts as the finest reconstruction units. After reconstructing these two fine-grained units, we combine them to form coarser blocks and reconstruct them at a coarser granularity level. We iteratively perform this combination and reconstruction process, achieving progressive fine-to-coarse reconstruction. Additionally, we introduce a Progressive Optimization Strategy (POS) for PFCR to alleviate the difficulty of training, thereby further enhancing model performance. Experimental results on the ImageNet dataset demonstrate that our proposed method achieves the best Top-1 accuracy among state-of-the-art methods, particularly attaining 75.61% for 3-bit quantized ViT-B in PTQ. Besides, quantization results on the COCO dataset reveal the effectiveness and generalization of our proposed method on other computer vision tasks like object detection and instance segmentation.

Relation Learning and Aggregate-attention for Multi-person Motion Prediction

Nov 06, 2024

Kehua Qu, Rui Ding, Jin Tang

Abstract: Multi-person motion prediction is an emerging and intricate task with broad real-world applications. Unlike single-person motion prediction, it considers not just the skeleton structures or human trajectories but also the interactions between individuals. Previous methods use various networks to achieve impressive predictions but often overlook that the joint relations within an individual (intra-relation) and interactions among groups (inter-relation) are distinct types of representations. These methods often lack explicit representation of inter- and intra-relations, and inevitably introduce undesired dependencies. To address this issue, we introduce a new collaborative framework for multi-person motion prediction that explicitly models these relations: a GCN-based network for intra-relations and a novel reasoning network for inter-relations. Moreover, we propose a novel plug-and-play aggregation module called the Interaction Aggregation Module (IAM), which employs an aggregate-attention mechanism to seamlessly integrate these relations. Experiments indicate that the module can also be applied to other dual-path models. Extensive experiments on the 3DPW, 3DPW-RC, CMU-Mocap, and MuPoTS-3D datasets, as well as the synthesized datasets Mix1 & Mix2 (9 to 15 persons), demonstrate that our method achieves state-of-the-art performance.

* Submitted to IEEE Transactions on Multimedia

UnityGraph: Unified Learning of Spatio-temporal features for Multi-person Motion Prediction

Nov 06, 2024

Kehua Qu, Rui Ding, Jin Tang

Abstract: Multi-person motion prediction is a complex and emerging field with significant real-world applications. Current state-of-the-art methods typically adopt dual-path networks to separately model spatial features and temporal features. However, the uncertain compatibility of the two networks poses a challenge for spatio-temporal feature fusion and violates the inherent spatio-temporal coherence and coupling of human motions. To address this issue, we propose a novel graph structure, UnityGraph, which treats spatio-temporal features as a whole, enhancing model coherence and coupling. Specifically, UnityGraph is a hypervariate graph-based network. The flexibility of the hypergraph allows us to consider the observed motions as graph nodes. We then leverage hyperedges to bridge these nodes for exploring spatio-temporal features. This perspective considers spatio-temporal dynamics unitedly and reformulates multi-person motion prediction into a problem on a single graph. Leveraging the dynamic message passing based on this hypergraph, our model dynamically learns from both types of relations to generate targeted messages that reflect the relevance among nodes. Extensive experiments on several datasets demonstrate that our method achieves state-of-the-art performance, confirming its effectiveness and innovative design.

* 13 pages, 12 figures. arXiv admin note: text overlap with arXiv:2411.03729

Control the GNN: Utilizing Neural Controller with Lyapunov Stability for Test-Time Feature Reconstruction

Oct 13, 2024

Jielong Yang, Rui Ding, Feng Ji, Hongbin Wang, Linbo Xie

Abstract: The performance of graph neural networks (GNNs) is susceptible to discrepancies between training and testing sample distributions. Prior studies have attempted to enhance GNN performance by reconstructing node features during the testing phase without modifying the model parameters. However, these approaches lack theoretical analysis of the proximity between predictions and ground truth at test time. In this paper, we propose a novel node feature reconstruction method grounded in Lyapunov stability theory. Specifically, we model the GNN as a control system during the testing phase, considering node features as control variables. A neural controller that adheres to the Lyapunov stability criterion is then employed to reconstruct these node features, ensuring that the predictions progressively approach the ground truth at test time. We validate the effectiveness of our approach through extensive experiments across multiple datasets, demonstrating significant performance improvements.

Leveraging Data Mining, Active Learning, and Domain Adaptation in a Multi-Stage, Machine Learning-Driven Approach for the Efficient Discovery of Advanced Acidic Oxygen Evolution Electrocatalysts

Jul 05, 2024

Rui Ding, Jianguo Liu, Kang Hua, Xuebin Wang, Xiaoben Zhang, Minhua Shao, Yuxin Chen, Junhong Chen

Abstract: Developing advanced catalysts for the acidic oxygen evolution reaction (OER) is crucial for sustainable hydrogen production. This study introduces a novel, multi-stage machine learning (ML) approach to streamline the discovery and optimization of complex multi-metallic catalysts. Our method integrates data mining, active learning, and domain adaptation throughout the materials discovery process. Unlike traditional trial-and-error methods, this approach systematically narrows the exploration space using domain knowledge with minimized reliance on subjective intuition. Then the active learning module efficiently refines element composition and synthesis conditions through iterative experimental feedback. The process culminated in the discovery of a promising Ru-Mn-Ca-Pr oxide catalyst. Our workflow also enhances theoretical simulations with a domain adaptation strategy, providing deeper mechanistic insights aligned with experimental findings. By leveraging diverse data sources and multiple ML strategies, we establish an efficient pathway for electrocatalyst discovery and optimization. This comprehensive, data-driven approach represents a paradigm shift and potentially a new benchmark in electrocatalyst research.

* 95 pages (main text 37 pages; supplementary materials 58 pages); 38 figures (main text 6 figures; supplementary materials 32 figures)

Scalable Differentiable Causal Discovery in the Presence of Latent Confounders with Skeleton Posterior (Extended Version)

Jun 15, 2024

Pingchuan Ma, Rui Ding, Qiang Fu, Jiaru Zhang, Shuai Wang, Shi Han, Dongmei Zhang

Abstract: Differentiable causal discovery has made significant advancements in the learning of directed acyclic graphs. However, its application to real-world datasets remains restricted due to the ubiquity of latent confounders and the requirement to learn maximal ancestral graphs (MAGs). To date, existing differentiable MAG learning algorithms have been limited to small datasets and failed to scale to larger ones (e.g., with more than 50 variables). The key insight in this paper is that the causal skeleton, which is the undirected version of the causal graph, has potential for improving accuracy and reducing the search space of the optimization procedure, thereby enhancing the performance of differentiable causal discovery. Therefore, we seek to address a two-fold challenge to harness the potential of the causal skeleton for differentiable causal discovery in the presence of latent confounders: (1) scalable and accurate estimation of the skeleton and (2) universal integration of skeleton estimation with differentiable causal discovery. To this end, we propose SPOT (Skeleton Posterior-guided OpTimization), a two-phase framework that harnesses the skeleton posterior for differentiable causal discovery in the presence of latent confounders. In contrast to a "point estimate", SPOT seeks to estimate the posterior distribution of skeletons given the dataset. It first formulates the posterior inference as an instance of an amortized inference problem and concretizes it with a supervised causal learning (SCL)-enabled solution to estimate the skeleton posterior. To incorporate the skeleton posterior with differentiable causal discovery, SPOT then features a skeleton posterior-guided stochastic optimization procedure to guide the optimization of MAGs. [abridged due to length limit]
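
As a small illustration of the term "causal skeleton" (the undirected version of the causal graph), the sketch below drops edge directions from a toy graph. The edge list and function name are hypothetical, and it says nothing about how SPOT estimates a posterior over skeletons.

```python
def skeleton(directed_edges):
    """Return the undirected skeleton of a graph as a set of frozensets.

    Each edge (u, v) is reduced to the unordered pair {u, v}, so edges that
    differ only in direction collapse into a single adjacency.
    """
    return {frozenset((u, v)) for u, v in directed_edges if u != v}

# Hypothetical toy graph; B <-> C is written as two opposite edges.
edges = [("A", "B"), ("B", "C"), ("C", "B"), ("A", "C")]
skel = skeleton(edges)
```

Here four directed entries reduce to three adjacencies. Knowing which pairs are adjacent leaves only the edge marks to decide, which is one way to read the abstract's remark about a smaller search space.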

Enhancing Text Authenticity: A Novel Hybrid Approach for AI-Generated Text Detection

Jun 01, 2024

Ye Zhang, Qian Leng, Mengran Zhu, Rui Ding, Yue Wu, Jintong Song, Yulu Gong

Abstract: The rapid advancement of Large Language Models (LLMs) has ushered in an era where AI-generated text is increasingly indistinguishable from human-generated content. Detecting AI-generated text has become imperative to combat misinformation, ensure content authenticity, and safeguard against malicious uses of AI. In this paper, we propose a novel hybrid approach that combines traditional TF-IDF techniques with advanced machine learning models, including Bayesian classifiers, Stochastic Gradient Descent (SGD), Categorical Gradient Boosting (CatBoost), and 12 instances of DeBERTa-v3-large models. Our approach aims to address the challenges associated with detecting AI-generated text by leveraging the strengths of both traditional feature extraction methods and state-of-the-art deep learning models. Through extensive experiments on a comprehensive dataset, we demonstrate the effectiveness of our proposed method in accurately distinguishing between human and AI-generated text. Our approach achieves superior performance compared to existing methods. This research contributes to the advancement of AI-generated text detection techniques and lays the foundation for developing robust solutions to mitigate the challenges posed by AI-generated content.
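
For readers unfamiliar with the TF-IDF features the abstract mentions, here is a minimal sketch using the basic weighting tf × ln(N / df) with whitespace tokenization. The documents and function name are hypothetical, and the paper's actual feature pipeline and classifiers are not described in the abstract, so none of that is reproduced here.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * ln(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical toy corpus.
docs = [
    "the model writes text",
    "the human writes prose",
    "the model predicts tokens",
]
vectors = tf_idf(docs)
```

A term that occurs in every document, such as "the", gets weight zero, while a term unique to one document, such as "human", gets the largest weight. Vectors like these could then be fed to a downstream classifier.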
