Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiao-Jun Zeng

Group-Agent Reinforcement Learning with Heterogeneous Agents

Jan 21, 2025

Kaiyue Wu, Xiao-Jun Zeng, Tingting Mu

Figure 1 for Group-Agent Reinforcement Learning with Heterogeneous Agents

Figure 2 for Group-Agent Reinforcement Learning with Heterogeneous Agents

Figure 3 for Group-Agent Reinforcement Learning with Heterogeneous Agents

Figure 4 for Group-Agent Reinforcement Learning with Heterogeneous Agents

Abstract:Group-agent reinforcement learning (GARL) is a newly arising learning scenario, where multiple reinforcement learning agents study together in a group, sharing knowledge in an asynchronous fashion. The goal is to improve the learning performance of each individual agent. Under a more general heterogeneous setting where different agents learn using different algorithms, we advance GARL by designing novel and effective group-learning mechanisms. They guide the agents on whether and how to learn from action choices from the others, and allow the agents to adopt available policy and value function models sent by another agent if they perform better. We have conducted extensive experiments on a total of 43 different Atari 2600 games to demonstrate the superior performance of the proposed method. After the group learning, among the 129 agents examined, 96% are able to achieve a learning speed-up, and 72% are able to learn over 100 times faster. Also, around 41% of those agents have achieved a higher accumulated reward score by learning in less than 5% of the time steps required by a single agent when learning on its own.

Via

Access Paper or Ask Questions

Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework

Sep 18, 2024

Yuping Wu, Hao Li, Hongbo Zhu, Goran Nenadic, Xiao-Jun Zeng

Figure 1 for Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework

Figure 2 for Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework

Figure 3 for Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework

Figure 4 for Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework

Abstract:Extract-then-Abstract is a naturally coherent paradigm to conduct abstractive summarization with the help of salient information identified by the extractive model. Previous works that adopt this paradigm train the extractor and abstractor separately and introduce extra parameters to highlight the extracted salients to the abstractor, which results in error accumulation and additional training costs. In this paper, we first introduce a parameter-free highlight method into the encoder-decoder framework: replacing the encoder attention mask with a saliency mask in the cross-attention module to force the decoder to focus only on salient parts of the input. A preliminary analysis compares different highlight methods, demonstrating the effectiveness of our saliency mask. We further propose the novel extract-and-abstract paradigm, ExtAbs, which jointly and seamlessly performs Extractive and Abstractive summarization tasks within single encoder-decoder model to reduce error accumulation. In ExtAbs, the vanilla encoder is augmented to extract salients, and the vanilla decoder is modified with the proposed saliency mask to generate summaries. Built upon BART and PEGASUS, experiments on three datasets show that ExtAbs can achieve superior performance than baselines on the extractive task and performs comparable, or even better than the vanilla models on the abstractive task.

Via

Access Paper or Ask Questions

Time Series Analysis for Education: Methods, Applications, and Future Directions

Aug 27, 2024

Shengzhong Mao, Chaoli Zhang, Yichi Song, Jindong Wang, Xiao-Jun Zeng, Zenglin Xu, Qingsong Wen

Figure 1 for Time Series Analysis for Education: Methods, Applications, and Future Directions

Figure 2 for Time Series Analysis for Education: Methods, Applications, and Future Directions

Figure 3 for Time Series Analysis for Education: Methods, Applications, and Future Directions

Figure 4 for Time Series Analysis for Education: Methods, Applications, and Future Directions

Abstract:Recent advancements in the collection and analysis of sequential educational data have brought time series analysis to a pivotal position in educational research, highlighting its essential role in facilitating data-driven decision-making. However, there is a lack of comprehensive summaries that consolidate these advancements. To the best of our knowledge, this paper is the first to provide a comprehensive review of time series analysis techniques specifically within the educational context. We begin by exploring the landscape of educational data analytics, categorizing various data sources and types relevant to education. We then review four prominent time series methods-forecasting, classification, clustering, and anomaly detection-illustrating their specific application points in educational settings. Subsequently, we present a range of educational scenarios and applications, focusing on how these methods are employed to address diverse educational tasks, which highlights the practical integration of multiple time series methods to solve complex educational problems. Finally, we conclude with a discussion on future directions, including personalized learning analytics, multimodal data fusion, and the role of large language models (LLMs) in educational time series. The contributions of this paper include a detailed taxonomy of educational data, a synthesis of time series techniques with specific educational applications, and a forward-looking perspective on emerging trends and future research opportunities in educational analysis. The related papers and resources are available and regularly updated at the project page.

* 24 pages, 3 figures, 6 tables, project page: see https://github.com/ai-for-edu/time-series-analysis-for-education

Via

Access Paper or Ask Questions

Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning

May 04, 2024

Tianyu Ren, Xiao-Jun Zeng

Figure 1 for Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning

Figure 2 for Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning

Figure 3 for Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning

Figure 4 for Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning

Abstract:The significance of network structures in promoting group cooperation within social dilemmas has been widely recognized. Prior studies attribute this facilitation to the assortment of strategies driven by spatial interactions. Although reinforcement learning has been employed to investigate the impact of dynamic interaction on the evolution of cooperation, there remains a lack of understanding about how agents develop neighbour selection behaviours and the formation of strategic assortment within an explicit interaction structure. To address this, our study introduces a computational framework based on multi-agent reinforcement learning in the spatial Prisoner's Dilemma game. This framework allows agents to select dilemma strategies and interacting neighbours based on their long-term experiences, differing from existing research that relies on preset social norms or external incentives. By modelling each agent using two distinct Q-networks, we disentangle the coevolutionary dynamics between cooperation and interaction. The results indicate that long-term experience enables agents to develop the ability to identify non-cooperative neighbours and exhibit a preference for interaction with cooperative ones. This emergent self-organizing behaviour leads to the clustering of agents with similar strategies, thereby increasing network reciprocity and enhancing group cooperation.

* Accepted at IJCAI 2024 (33nd International Joint Conference on Artificial Intelligence - Jeju)

Via

Access Paper or Ask Questions

Mapping-to-Parameter Nonlinear Functional Regression with Novel B-spline Free Knot Placement Algorithm

Jan 26, 2024

Chengdong Shi, Ching-Hsun Tseng, Wei Zhao, Xiao-Jun Zeng

Abstract:We propose a novel approach to nonlinear functional regression, called the Mapping-to-Parameter function model, which addresses complex and nonlinear functional regression problems in parameter space by employing any supervised learning technique. Central to this model is the mapping of function data from an infinite-dimensional function space to a finite-dimensional parameter space. This is accomplished by concurrently approximating multiple functions with a common set of B-spline basis functions by any chosen order, with their knot distribution determined by the Iterative Local Placement Algorithm, a newly proposed free knot placement algorithm. In contrast to the conventional equidistant knot placement strategy that uniformly distributes knot locations based on a predefined number of knots, our proposed algorithms determine knot location according to the local complexity of the input or output functions. The performance of our knot placement algorithms is shown to be robust in both single-function approximation and multiple-function approximation contexts. Furthermore, the effectiveness and advantage of the proposed prediction model in handling both function-on-scalar regression and function-on-function regression problems are demonstrated through several real data applications, in comparison with four groups of state-of-the-art methods.

Via

Access Paper or Ask Questions

To transfer or not transfer: Unified transferability metric and analysis

May 12, 2023

Qianshan Zhan, Xiao-Jun Zeng

Figure 1 for To transfer or not transfer: Unified transferability metric and analysis

Figure 2 for To transfer or not transfer: Unified transferability metric and analysis

Figure 3 for To transfer or not transfer: Unified transferability metric and analysis

Figure 4 for To transfer or not transfer: Unified transferability metric and analysis

Abstract:In transfer learning, transferability is one of the most fundamental problems, which aims to evaluate the effectiveness of arbitrary transfer tasks. Existing research focuses on classification tasks and neglects domain or task differences. More importantly, there is a lack of research to determine whether to transfer or not. To address these, we propose a new analytical approach and metric, Wasserstein Distance based Joint Estimation (WDJE), for transferability estimation and determination in a unified setting: classification and regression problems with domain and task differences. The WDJE facilitates decision-making on whether to transfer or not by comparing the target risk with and without transfer. To enable the comparison, we approximate the target transfer risk by proposing a non-symmetric, easy-to-understand and easy-to-calculate target risk bound that is workable even with limited target labels. The proposed bound relates the target risk to source model performance, domain and task differences based on Wasserstein distance. We also extend our bound into unsupervised settings and establish the generalization bound from finite empirical samples. Our experiments in image classification and remaining useful life regression prediction illustrate the effectiveness of the WDJE in determining whether to transfer or not, and the proposed bound in approximating the target transfer risk.

Via

Access Paper or Ask Questions

EDU-level Extractive Summarization with Varying Summary Lengths

Oct 08, 2022

Yuping Wu, Ching-Hsun Tseng, Jiayu Shang, Shengzhong Mao, Goran Nenadic, Xiao-Jun Zeng

Figure 1 for EDU-level Extractive Summarization with Varying Summary Lengths

Figure 2 for EDU-level Extractive Summarization with Varying Summary Lengths

Figure 3 for EDU-level Extractive Summarization with Varying Summary Lengths

Figure 4 for EDU-level Extractive Summarization with Varying Summary Lengths

Abstract:Extractive models usually formulate text summarization as extracting top-k important sentences from document as summary. Few work exploited extracting finer-grained Elementary Discourse Unit (EDU) and there is little analysis and justification for the extractive unit selection. To fill such a gap, this paper firstly conducts oracle analysis to compare the upper bound of performance for models based on EDUs and sentences. The analysis provides evidences from both theoretical and experimental perspectives to justify that EDUs make more concise and precise summary than sentences without losing salient information. Then, considering this merit of EDUs, this paper further proposes EDU-level extractive model with Varying summary Lengths (EDU-VL) and develops the corresponding learning algorithm. EDU-VL learns to encode and predict probabilities of EDUs in document, and encode EDU-level candidate summaries with different lengths based on various $k$ values and select the best candidate summary in an end-to-end training manner. Finally, the proposed and developed approach is experimented on single and multi-document benchmark datasets and shows the improved performances in comparison with the state-of-the-art models.

Via

Access Paper or Ask Questions

Towards Better Dermoscopic Image Feature Representation Learning for Melanoma Classification

Jul 15, 2022

ChengHui Yu, MingKang Tang, ShengGe Yang, MingQing Wang, Zhe Xu, JiangPeng Yan, HanMo Chen, Yu Yang, Xiao-Jun Zeng, Xiu Li

Abstract:Deep learning-based melanoma classification with dermoscopic images has recently shown great potential in automatic early-stage melanoma diagnosis. However, limited by the significant data imbalance and obvious extraneous artifacts, i.e., the hair and ruler markings, discriminative feature extraction from dermoscopic images is very challenging. In this study, we seek to resolve these problems respectively towards better representation learning for lesion features. Specifically, a GAN-based data augmentation (GDA) strategy is adapted to generate synthetic melanoma-positive images, in conjunction with the proposed implicit hair denoising (IHD) strategy. Wherein the hair-related representations are implicitly disentangled via an auxiliary classifier network and reversely sent to the melanoma-feature extraction backbone for better melanoma-specific representation learning. Furthermore, to train the IHD module, the hair noises are additionally labeled on the ISIC2020 dataset, making it the first large-scale dermoscopic dataset with annotation of hair-like artifacts. Extensive experiments demonstrate the superiority of the proposed framework as well as the effectiveness of each component. The improved dataset publicly avaliable at https://github.com/kirtsy/DermoscopicDataset.

* ICONIP 2021 conference

Via

Access Paper or Ask Questions

Group-Agent Reinforcement Learning with A Distributed Deep Solution

Mar 07, 2022

Kaiyue Wu, Xiao-Jun Zeng

Figure 1 for Group-Agent Reinforcement Learning with A Distributed Deep Solution

Figure 2 for Group-Agent Reinforcement Learning with A Distributed Deep Solution

Figure 3 for Group-Agent Reinforcement Learning with A Distributed Deep Solution

Figure 4 for Group-Agent Reinforcement Learning with A Distributed Deep Solution

Abstract:It can largely benefit the reinforcement learning process of each agent if multiple agents perform their separate reinforcement learning tasks cooperatively. These tasks can be not exactly the same but still benefit from the communication behaviour between agents due to task similarities. In fact, this learning scenario is not well understood yet and not well formulated. As the first effort, we provide a detailed discussion of this scenario, and propose group-agent reinforcement learning as a formulation of the reinforcement learning problem under this scenario and a third type of reinforcement learning problem with respect to single-agent and multi-agent reinforcement learning. We propose that it can be solved with the help of modern deep reinforcement learning techniques and provide a distributed deep reinforcement learning algorithm called DDA3C (Decentralised Distributed Asynchronous Advantage Actor-Critic) that is the first framework designed for group-agent reinforcement learning. We show through experiments in the CartPole-v0 game environment that DDA3C achieved desirable performance with very stable training and has good scalability.

Via

Access Paper or Ask Questions

Multiple Dynamic Pricing for Demand Response with Adaptive Clustering-based Customer Segmentation in Smart Grids

Jun 10, 2021

Fanlin Meng, Qian Ma, Zixu Liu, Xiao-Jun Zeng

Figure 1 for Multiple Dynamic Pricing for Demand Response with Adaptive Clustering-based Customer Segmentation in Smart Grids

Figure 2 for Multiple Dynamic Pricing for Demand Response with Adaptive Clustering-based Customer Segmentation in Smart Grids

Figure 3 for Multiple Dynamic Pricing for Demand Response with Adaptive Clustering-based Customer Segmentation in Smart Grids

Figure 4 for Multiple Dynamic Pricing for Demand Response with Adaptive Clustering-based Customer Segmentation in Smart Grids

Abstract:In this paper, we propose a realistic multiple dynamic pricing approach to demand response in the retail market. First, an adaptive clustering-based customer segmentation framework is proposed to categorize customers into different groups to enable the effective identification of usage patterns. Second, customized demand models with important market constraints which capture the price-demand relationship explicitly, are developed for each group of customers to improve the model accuracy and enable meaningful pricing. Third, the multiple pricing based demand response is formulated as a profit maximization problem subject to realistic market constraints. The overall aim of the proposed scalable and practical method aims to achieve 'right' prices for 'right' customers so as to benefit various stakeholders in the system such as grid operators, customers and retailers. The proposed multiple pricing framework is evaluated via simulations based on real-world datasets.

Via

Access Paper or Ask Questions