Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rong Gu

Mälardalen University

Chordless Structure: A Pathway to Simple and Expressive GNNs

May 25, 2025

Hongxu Pan, Shuxian Hu, Mo Zhou, Zhibin Wang, Rong Gu, Chen Tian, Kun Yang, Sheng Zhong

Abstract:Researchers have proposed various methods of incorporating more structured information into the design of Graph Neural Networks (GNNs) to enhance their expressiveness. However, these methods are either computationally expensive or lacking in provable expressiveness. In this paper, we observe that the chords increase the complexity of the graph structure while contributing little useful information in many cases. In contrast, chordless structures are more efficient and effective for representing the graph. Therefore, when leveraging the information of cycles, we choose to omit the chords. Accordingly, we propose a Chordless Structure-based Graph Neural Network (CSGNN) and prove that its expressiveness is strictly more powerful than the k-hop GNN (KPGNN) with polynomial complexity. Experimental results on real-world datasets demonstrate that CSGNN outperforms existing GNNs across various graph tasks while incurring lower computational costs and achieving better performance than the GNNs of 3-WL expressiveness.

Via

Access Paper or Ask Questions

FlashForge: Ultra-Efficient Prefix-Aware Attention for LLM Decoding

May 23, 2025

Zhibin Wang, Rui Ning, Chao Fang, Zhonghui Zhang, Xi Lin, Shaobo Ma, Mo Zhou, Xue Li, Zhongfeng Wang, Chengying Huan(+5 more)

Abstract:Prefix-sharing among multiple prompts presents opportunities to combine the operations of the shared prefix, while attention computation in the decode stage, which becomes a critical bottleneck with increasing context lengths, is a memory-intensive process requiring heavy memory access on the key-value (KV) cache of the prefixes. Therefore, in this paper, we explore the potential of prefix-sharing in the attention computation of the decode stage. However, the tree structure of the prefix-sharing mechanism presents significant challenges for attention computation in efficiently processing shared KV cache access patterns while managing complex dependencies and balancing irregular workloads. To address the above challenges, we propose a dedicated attention kernel to combine the memory access of shared prefixes in the decoding stage, namely FlashForge. FlashForge delivers two key innovations: a novel shared-prefix attention kernel that optimizes memory hierarchy and exploits both intra-block and inter-block parallelism, and a comprehensive workload balancing mechanism that efficiently estimates cost, divides tasks, and schedules execution. Experimental results show that FlashForge achieves an average 1.9x speedup and 120.9x memory access reduction compared to the state-of-the-art FlashDecoding kernel regarding attention computation in the decode stage and 3.8x end-to-end time per output token compared to the vLLM.

Via

Access Paper or Ask Questions

Composable Strategy Framework with Integrated Video-Text based Large Language Models for Heart Failure Assessment

Feb 23, 2025

Jianzhou Chen, Xiumei Wang, Jinyang Sun, Xi Chen, Heyu Chu, Guo Song, Yuji Luo, Xingping Zhou, Rong Gu

Abstract:Heart failure is one of the leading causes of death worldwide, with millons of deaths each year, according to data from the World Health Organization (WHO) and other public health agencies. While significant progress has been made in the field of heart failure, leading to improved survival rates and improvement of ejection fraction, there remains substantial unmet needs, due to the complexity and multifactorial characteristics. Therefore, we propose a composable strategy framework for assessment and treatment optimization in heart failure. This framework simulates the doctor-patient consultation process and leverages multi-modal algorithms to analyze a range of data, including video, physical examination, text results as well as medical history. By integrating these various data sources, our framework offers a more holistic evaluation and optimized treatment plan for patients. Our results demonstrate that this multi-modal approach outperforms single-modal artificial intelligence (AI) algorithms in terms of accuracy in heart failure (HF) prognosis prediction. Through this method, we can further evaluate the impact of various pathological indicators on HF prognosis,providing a more comprehensive evaluation.

Via

Access Paper or Ask Questions

Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

Nov 21, 2024

Rong Gu

Figure 1 for Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

Figure 2 for Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

Figure 3 for Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

Figure 4 for Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!

Abstract:Most reinforcement learning (RL) platforms use high-level programming languages, such as OpenAI Gymnasium using Python. These frameworks provide various API and benchmarks for testing RL algorithms in different domains, such as autonomous driving (AD) and robotics. These platforms often emphasise the design of RL algorithms and the training performance but neglect the correctness of models and reward functions, which can be crucial for the successful application of RL. This paper proposes using formal methods to model AD systems and demonstrates how model checking (MC) can be used in RL for AD. Most studies combining MC and RL focus on safety, such as safety shields. However, this paper shows different facets where MC can strengthen RL. First, an MC-based model pre-analysis can reveal bugs with respect to sensor accuracy and learning step size. This step serves as a preparation of RL, which saves time if bugs exist and deepens users' understanding of the target system. Second, reward automata can benefit the design of reward functions and greatly improve learning performance especially when the learning objectives are multiple. All these findings are supported by experiments.

* EPTCS 411, 2024, pp. 160-177
* In Proceedings FMAS2024, arXiv:2411.13215

Via

Access Paper or Ask Questions

Revisiting SLO and Goodput Metrics in LLM Serving

Oct 18, 2024

Zhibin Wang, Shipeng Li, Yuhang Zhou, Xue Li, Rong Gu, Nguyen Cam-Tu, Chen Tian, Sheng Zhong

Abstract:Large language models (LLMs) have achieved remarkable performance and are widely deployed in various applications, while the serving of LLM inference has raised concerns about user experience and serving throughput. Accordingly, service level objectives (SLOs) and goodput-the number of requests that meet SLOs per second-are introduced to evaluate the performance of LLM serving. However, existing metrics fail to capture the nature of user experience. We observe two ridiculous phenomena in existing metrics: 1) delaying token delivery can smooth the tail time between tokens (tail TBT) of a request and 2) dropping the request that fails to meet the SLOs midway can improve goodput. In this paper, we revisit SLO and goodput metrics in LLM serving and propose a unified metric framework smooth goodput including SLOs and goodput to reflect the nature of user experience in LLM serving. The framework can adapt to specific goals of different tasks by setting parameters. We re-evaluate the performance of different LLM serving systems under multiple workloads based on this unified framework and provide possible directions for future optimization of existing strategies. We hope that this framework can provide a unified standard for evaluating LLM serving and foster researches in the field of LLM serving optimization to move in a cohesive direction.

Via

Access Paper or Ask Questions

CommonUppRoad: A Framework of Formal Modelling, Verifying, Learning, and Visualisation of Autonomous Vehicles

Aug 02, 2024

Rong Gu, Kaige Tan, Andreas Holck Høeg-Petersen, Lei Feng, Kim Guldstrand Larsen

Figure 1 for CommonUppRoad: A Framework of Formal Modelling, Verifying, Learning, and Visualisation of Autonomous Vehicles

Figure 2 for CommonUppRoad: A Framework of Formal Modelling, Verifying, Learning, and Visualisation of Autonomous Vehicles

Figure 3 for CommonUppRoad: A Framework of Formal Modelling, Verifying, Learning, and Visualisation of Autonomous Vehicles

Figure 4 for CommonUppRoad: A Framework of Formal Modelling, Verifying, Learning, and Visualisation of Autonomous Vehicles

Abstract:Combining machine learning and formal methods (FMs) provides a possible solution to overcome the safety issue of autonomous driving (AD) vehicles. However, there are gaps to be bridged before this combination becomes practically applicable and useful. In an attempt to facilitate researchers in both FMs and AD areas, this paper proposes a framework that combines two well-known tools, namely CommonRoad and UPPAAL. On the one hand, CommonRoad can be enhanced by the rigorous semantics of models in UPPAAL, which enables a systematic and comprehensive understanding of the AD system's behaviour and thus strengthens the safety of the system. On the other hand, controllers synthesised by UPPAAL can be visualised by CommonRoad in real-world road networks, which facilitates AD vehicle designers greatly adopting formal models in system design. In this framework, we provide automatic model conversions between CommonRoad and UPPAAL. Therefore, users only need to program in Python and the framework takes care of the formal models, learning, and verification in the backend. We perform experiments to demonstrate the applicability of our framework in various AD scenarios, discuss the advantages of solving motion planning in our framework, and show the scalability limit and possible solutions.

* 20 pages, 5 figures, ISoLA 2024

Via

Access Paper or Ask Questions

Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization

May 29, 2020

Jingfan Chen, Guanghui Zhu, Rong Gu, Chunfeng Yuan, Yihua Huang

Figure 1 for Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization

Figure 2 for Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization

Figure 3 for Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization

Figure 4 for Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization

Abstract:Bayesian optimization is a broadly applied methodology to optimize the expensive black-box function. Despite its success, it still faces the challenge from the high-dimensional search space. To alleviate this problem, we propose a novel Bayesian optimization framework (termed SILBO), which finds a low-dimensional space to perform Bayesian optimization iteratively through semi-supervised dimension reduction. SILBO incorporates both labeled points and unlabeled points acquired from the acquisition function to guide the embedding space learning. To accelerate the learning procedure, we present a randomized method for generating the projection matrix. Furthermore, to map from the low-dimensional space to the high-dimensional original space, we propose two mapping strategies: $\text{SILBO}_{FZ}$ and $\text{SILBO}_{FX}$ according to the evaluation overhead of the objective function. Experimental results on both synthetic function and hyperparameter optimization tasks demonstrate that SILBO outperforms the existing state-of-the-art high-dimensional Bayesian optimization methods.

Via

Access Paper or Ask Questions