Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xuan Song

Connecting the Dots: A Chain-of-Collaboration Prompting Framework for LLM Agents

May 16, 2025

Jiaxing Zhao, Hongbin Xie, Yuzhen Lei, Xuan Song, Zhuoran Shi, Lianxin Li, Shuangxue Liu, Haoran Zhang

Abstract:Large Language Models (LLMs) have demonstrated impressive performance in executing complex reasoning tasks. Chain-of-thought effectively enhances reasoning capabilities by unlocking the potential of large models, while multi-agent systems provide more comprehensive solutions by integrating collective intelligence of multiple agents. However, both approaches face significant limitations. Single-agent with chain-of-thought, due to the inherent complexity of designing cross-domain prompts, faces collaboration challenges. Meanwhile, multi-agent systems consume substantial tokens and inevitably dilute the primary problem, which is particularly problematic in business workflow tasks. To address these challenges, we propose Cochain, a collaboration prompting framework that effectively solves business workflow collaboration problem by combining knowledge and prompts at a reduced cost. Specifically, we construct an integrated knowledge graph that incorporates knowledge from multiple stages. Furthermore, by maintaining and retrieving a prompts tree, we can obtain prompt information relevant to other stages of the business workflow. We perform extensive evaluations of Cochain across multiple datasets, demonstrating that Cochain outperforms all baselines in both prompt engineering and multi-agent LLMs. Additionally, expert evaluation results indicate that the use of a small model in combination with Cochain outperforms GPT-4.

* 34 pages, 20 figures

Via

Access Paper or Ask Questions

Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Nov 18, 2024

Hongjun Wang, Jiyuan Chen, Lingyu Zhang, Renhe Jiang, Xuan Song

Figure 1 for Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Figure 2 for Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Figure 3 for Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Figure 4 for Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Abstract:Spatiotemporal Graph Neural Networks (ST-GNNs) and Transformers have shown significant promise in traffic forecasting by effectively modeling temporal and spatial correlations. However, rapid urbanization in recent years has led to dynamic shifts in traffic patterns and travel demand, posing major challenges for accurate long-term traffic prediction. The generalization capability of ST-GNNs in extended temporal scenarios and cross-city applications remains largely unexplored. In this study, we evaluate state-of-the-art models on an extended traffic benchmark and observe substantial performance degradation in existing ST-GNNs over time, which we attribute to their limited inductive capabilities. Our analysis reveals that this degradation stems from an inability to adapt to evolving spatial relationships within urban environments. To address this limitation, we reconsider the design of adaptive embeddings and propose a Principal Component Analysis (PCA) embedding approach that enables models to adapt to new scenarios without retraining. We incorporate PCA embeddings into existing ST-GNN and Transformer architectures, achieving marked improvements in performance. Notably, PCA embeddings allow for flexibility in graph structures between training and testing, enabling models trained on one city to perform zero-shot predictions on other cities. This adaptability demonstrates the potential of PCA embeddings in enhancing the robustness and generalization of spatiotemporal models.

Via

Access Paper or Ask Questions

Learning to Balance: Diverse Normalization for Cloth-Changing Person Re-Identification

Oct 14, 2024

Hongjun Wang, Jiyuan Chen, Zhengwei Yin, Xuan Song, Yinqiang Zheng

Figure 1 for Learning to Balance: Diverse Normalization for Cloth-Changing Person Re-Identification

Figure 2 for Learning to Balance: Diverse Normalization for Cloth-Changing Person Re-Identification

Figure 3 for Learning to Balance: Diverse Normalization for Cloth-Changing Person Re-Identification

Figure 4 for Learning to Balance: Diverse Normalization for Cloth-Changing Person Re-Identification

Abstract:Cloth-Changing Person Re-Identification (CC-ReID) involves recognizing individuals in images regardless of clothing status. In this paper, we empirically and experimentally demonstrate that completely eliminating or fully retaining clothing features is detrimental to the task. Existing work, either relying on clothing labels, silhouettes, or other auxiliary data, fundamentally aim to balance the learning of clothing and identity features. However, we practically find that achieving this balance is challenging and nuanced. In this study, we introduce a novel module called Diverse Norm, which expands personal features into orthogonal spaces and employs channel attention to separate clothing and identity features. A sample re-weighting optimization strategy is also introduced to guarantee the opposite optimization direction. Diverse Norm presents a simple yet effective approach that does not require additional data. Furthermore, Diverse Norm can be seamlessly integrated ResNet50 and significantly outperforms the state-of-the-art methods.

Via

Access Paper or Ask Questions

On Feature Decorrelation in Cloth-Changing Person Re-identification

Oct 07, 2024

Hongjun Wang, Jiyuan Chen, Renhe Jiang, Xuan Song, Yinqiang Zheng

Figure 1 for On Feature Decorrelation in Cloth-Changing Person Re-identification

Figure 2 for On Feature Decorrelation in Cloth-Changing Person Re-identification

Figure 3 for On Feature Decorrelation in Cloth-Changing Person Re-identification

Figure 4 for On Feature Decorrelation in Cloth-Changing Person Re-identification

Abstract:Cloth-changing person re-identification (CC-ReID) poses a significant challenge in computer vision. A prevailing approach is to prompt models to concentrate on causal attributes, like facial features and hairstyles, rather than confounding elements such as clothing appearance. Traditional methods to achieve this involve integrating multi-modality data or employing manually annotated clothing labels, which tend to complicate the model and require extensive human effort. In our study, we demonstrate that simply reducing feature correlations during training can significantly enhance the baseline model's performance. We theoretically elucidate this effect and introduce a novel regularization technique based on density ratio estimation. This technique aims to minimize feature correlation in the training process of cloth-changing ReID baselines. Our approach is model-independent, offering broad enhancements without needing additional data or labels. We validate our method through comprehensive experiments on prevalent CC-ReID datasets, showing its effectiveness in improving baseline models' generalization capabilities.

Via

Access Paper or Ask Questions

Robust Traffic Forecasting against Spatial Shift over Years

Oct 01, 2024

Hongjun Wang, Jiyuan Chen, Tong Pan, Zheng Dong, Lingyu Zhang, Renhe Jiang, Xuan Song

Figure 1 for Robust Traffic Forecasting against Spatial Shift over Years

Figure 2 for Robust Traffic Forecasting against Spatial Shift over Years

Figure 3 for Robust Traffic Forecasting against Spatial Shift over Years

Figure 4 for Robust Traffic Forecasting against Spatial Shift over Years

Abstract:Recent advancements in Spatiotemporal Graph Neural Networks (ST-GNNs) and Transformers have demonstrated promising potential for traffic forecasting by effectively capturing both temporal and spatial correlations. The generalization ability of spatiotemporal models has received considerable attention in recent scholarly discourse. However, no substantive datasets specifically addressing traffic out-of-distribution (OOD) scenarios have been proposed. Existing ST-OOD methods are either constrained to testing on extant data or necessitate manual modifications to the dataset. Consequently, the generalization capacity of current spatiotemporal models in OOD scenarios remains largely underexplored. In this paper, we investigate state-of-the-art models using newly proposed traffic OOD benchmarks and, surprisingly, find that these models experience a significant decline in performance. Through meticulous analysis, we attribute this decline to the models' inability to adapt to previously unobserved spatial relationships. To address this challenge, we propose a novel Mixture of Experts (MoE) framework, which learns a set of graph generators (i.e., graphons) during training and adaptively combines them to generate new graphs based on novel environmental conditions to handle spatial distribution shifts during testing. We further extend this concept to the Transformer architecture, achieving substantial improvements. Our method is both parsimonious and efficacious, and can be seamlessly integrated into any spatiotemporal model, outperforming current state-of-the-art approaches in addressing spatial dynamics.

Via

Access Paper or Ask Questions

STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting

Oct 01, 2024

Hongjun Wang, Jiyuan Chen, Tong Pan, Zheng Dong, Lingyu Zhang, Renhe Jiang, Xuan Song

Figure 1 for STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting

Figure 2 for STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting

Figure 3 for STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting

Figure 4 for STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting

Abstract:Traffic forecasting is a cornerstone of smart city management, enabling efficient resource allocation and transportation planning. Deep learning, with its ability to capture complex nonlinear patterns in spatiotemporal (ST) data, has emerged as a powerful tool for traffic forecasting. While graph neural networks (GCNs) and transformer-based models have shown promise, their computational demands often hinder their application to real-world road networks, particularly those with large-scale spatiotemporal interactions. To address these challenges, we propose a novel spatiotemporal graph transformer (STGformer) architecture. STGformer effectively balances the strengths of GCNs and Transformers, enabling efficient modeling of both global and local traffic patterns while maintaining a manageable computational footprint. Unlike traditional approaches that require multiple attention layers, STG attention block captures high-order spatiotemporal interactions in a single layer, significantly reducing computational cost. In particular, STGformer achieves a 100x speedup and a 99.8\% reduction in GPU memory usage compared to STAEformer during batch inference on a California road graph with 8,600 sensors. We evaluate STGformer on the LargeST benchmark and demonstrate its superiority over state-of-the-art Transformer-based methods such as PDFormer and STAEformer, which underline STGformer's potential to revolutionize traffic forecasting by overcoming the computational and memory limitations of existing approaches, making it a promising foundation for future spatiotemporal modeling tasks.

Via

Access Paper or Ask Questions

Enhancing Spatio-temporal Quantile Forecasting with Curriculum Learning: Lessons Learned

Jun 18, 2024

Du Yin, Jinliang Deng, Shuang Ao, Zechen Li, Hao Xue, Arian Prabowo, Renhe Jiang, Xuan Song, Flora Salim

Figure 1 for Enhancing Spatio-temporal Quantile Forecasting with Curriculum Learning: Lessons Learned

Figure 2 for Enhancing Spatio-temporal Quantile Forecasting with Curriculum Learning: Lessons Learned

Figure 3 for Enhancing Spatio-temporal Quantile Forecasting with Curriculum Learning: Lessons Learned

Figure 4 for Enhancing Spatio-temporal Quantile Forecasting with Curriculum Learning: Lessons Learned

Abstract:Training models on spatio-temporal (ST) data poses an open problem due to the complicated and diverse nature of the data itself, and it is challenging to ensure the model's performance directly trained on the original ST data. While limiting the variety of training data can make training easier, it can also lead to a lack of knowledge and information for the model, resulting in a decrease in performance. To address this challenge, we presented an innovative paradigm that incorporates three separate forms of curriculum learning specifically targeting from spatial, temporal, and quantile perspectives. Furthermore, our framework incorporates a stacking fusion module to combine diverse information from three types of curriculum learning, resulting in a strong and thorough learning process. We demonstrated the effectiveness of this framework with extensive empirical evaluations, highlighting its better performance in addressing complex ST challenges. We provided thorough ablation studies to investigate the effectiveness of our curriculum and to explain how it contributes to the improvement of learning efficiency on ST data.

Via

Access Paper or Ask Questions

Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

Jun 08, 2024

Zhou Zhou, Guohang He, Zheng Zhang, Luziwei Leng, Qinghai Guo, Jianxing Liao, Xuan Song, Ran Cheng

Figure 1 for Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

Figure 2 for Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

Figure 3 for Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

Figure 4 for Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

Abstract:Traditional invasive Brain-Computer Interfaces (iBCIs) typically depend on neural decoding processes conducted on workstations within laboratory settings, which prevents their everyday usage. Implementing these decoding processes on edge devices, such as the wearables, introduces considerable challenges related to computational demands, processing speed, and maintaining accuracy. This study seeks to identify an optimal neural decoding backbone that boasts robust performance and swift inference capabilities suitable for edge deployment. We executed a series of neural decoding experiments involving nonhuman primates engaged in random reaching tasks, evaluating four prospective models, Gated Recurrent Unit (GRU), Transformer, Receptance Weighted Key Value (RWKV), and Selective State Space model (Mamba), across several metrics: single-session decoding, multi-session decoding, new session fine-tuning, inference speed, calibration speed, and scalability. The findings indicate that although the GRU model delivers sufficient accuracy, the RWKV and Mamba models are preferable due to their superior inference and calibration speeds. Additionally, RWKV and Mamba comply with the scaling law, demonstrating improved performance with larger data sets and increased model sizes, whereas GRU shows less pronounced scalability, and the Transformer model requires computational resources that scale prohibitively. This paper presents a thorough comparative analysis of the four models in various scenarios. The results are pivotal in pinpointing an optimal backbone that can handle increasing data volumes and is viable for edge implementation. This analysis provides essential insights for ongoing research and practical applications in the field.

Via

Access Paper or Ask Questions

Continuous Temporal Domain Generalization

May 25, 2024

Zekun Cai, Guangji Bai, Renhe Jiang, Xuan Song, Liang Zhao

Figure 1 for Continuous Temporal Domain Generalization

Figure 2 for Continuous Temporal Domain Generalization

Figure 3 for Continuous Temporal Domain Generalization

Figure 4 for Continuous Temporal Domain Generalization

Abstract:Temporal Domain Generalization (TDG) addresses the challenge of training predictive models under temporally varying data distributions. Traditional TDG approaches typically focus on domain data collected at fixed, discrete time intervals, which limits their capability to capture the inherent dynamics within continuous-evolving and irregularly-observed temporal domains. To overcome this, this work formalizes the concept of Continuous Temporal Domain Generalization (CTDG), where domain data are derived from continuous times and are collected at arbitrary times. CTDG tackles critical challenges including: 1) Characterizing the continuous dynamics of both data and models, 2) Learning complex high-dimensional nonlinear dynamics, and 3) Optimizing and controlling the generalization across continuous temporal domains. To address them, we propose a Koopman operator-driven continuous temporal domain generalization (Koodos) framework. We formulate the problem within a continuous dynamic system and leverage the Koopman theory to learn the underlying dynamics; the framework is further enhanced with a comprehensive optimization strategy equipped with analysis and control driven by prior knowledge of the dynamics patterns. Extensive experiments demonstrate the effectiveness and efficiency of our approach.

Via

Access Paper or Ask Questions

Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting

May 17, 2024

Zheng Dong, Renhe Jiang, Haotian Gao, Hangchen Liu, Jinliang Deng, Qingsong Wen, Xuan Song

Figure 1 for Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting

Figure 2 for Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting

Figure 3 for Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting

Figure 4 for Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting

Abstract:Spatiotemporal time series forecasting plays a key role in a wide range of real-world applications. While significant progress has been made in this area, fully capturing and leveraging spatiotemporal heterogeneity remains a fundamental challenge. Therefore, we propose a novel Heterogeneity-Informed Meta-Parameter Learning scheme. Specifically, our approach implicitly captures spatiotemporal heterogeneity through learning spatial and temporal embeddings, which can be viewed as a clustering process. Then, a novel spatiotemporal meta-parameter learning paradigm is proposed to learn spatiotemporal-specific parameters from meta-parameter pools, which is informed by the captured heterogeneity. Based on these ideas, we develop a Heterogeneity-Informed Spatiotemporal Meta-Network (HimNet) for spatiotemporal time series forecasting. Extensive experiments on five widely-used benchmarks demonstrate our method achieves state-of-the-art performance while exhibiting superior interpretability. Our code is available at https://github.com/XDZhelheim/HimNet.

* Accepted by KDD'24 Research Track

Via

Access Paper or Ask Questions