Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hua Lu

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future

Apr 12, 2025

Jialun Zhong, Wei Shen, Yanzeng Li, Songyang Gao, Hua Lu, Yicheng Chen, Yang Zhang, Wei Zhou, Jinjie Gu, Lei Zou

Abstract:Reward Model (RM) has demonstrated impressive potential for enhancing Large Language Models (LLM), as RM can serve as a proxy for human preferences, providing signals to guide LLMs' behavior in various tasks. In this paper, we provide a comprehensive overview of relevant research, exploring RMs from the perspectives of preference collection, reward modeling, and usage. Next, we introduce the applications of RMs and discuss the benchmarks for evaluation. Furthermore, we conduct an in-depth analysis of the challenges existing in the field and dive into the potential research directions. This paper is dedicated to providing beginners with a comprehensive introduction to RMs and facilitating future studies. The resources are publicly available at github\footnote{https://github.com/JLZhong23/awesome-reward-models}.

Via

Access Paper or Ask Questions

Not All Data are Good Labels: On the Self-supervised Labeling for Time Series Forecasting

Feb 21, 2025

Yuxuan Yang, Dalin Zhang, Yuxuan Liang, Hua Lu, Gang Chen, Huan Li

Abstract:Time Series Forecasting (TSF) is a crucial task in various domains, yet existing TSF models rely heavily on high-quality data and insufficiently exploit all available data. This paper explores a novel self-supervised approach to re-label time series datasets by inherently constructing candidate datasets. During the optimization of a simple reconstruction network, intermediates are used as pseudo labels in a self-supervised paradigm, improving generalization for any predictor. We introduce the Self-Correction with Adaptive Mask (SCAM), which discards overfitted components and selectively replaces them with pseudo labels generated from reconstructions. Additionally, we incorporate Spectral Norm Regularization (SNR) to further suppress overfitting from a loss landscape perspective. Our experiments on eleven real-world datasets demonstrate that SCAM consistently improves the performance of various backbone models. This work offers a new perspective on constructing datasets and enhancing the generalization of TSF models through self-supervised learning.

Via

Access Paper or Ask Questions

Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks

Aug 27, 2024

Maria Despoina Siampou, Jialiang Li, John Krumm, Cyrus Shahabi, Hua Lu

Figure 1 for Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks

Figure 2 for Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks

Figure 3 for Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks

Figure 4 for Poly2Vec: Polymorphic Encoding of Geospatial Objects for Spatial Reasoning with Deep Neural Networks

Abstract:Encoding geospatial data is crucial for enabling machine learning (ML) models to perform tasks that require spatial reasoning, such as identifying the topological relationships between two different geospatial objects. However, existing encoding methods are limited as they are typically customized to handle only specific types of spatial data, which impedes their applicability across different downstream tasks where multiple data types coexist. To address this, we introduce Poly2Vec, an encoding framework that unifies the modeling of different geospatial objects, including 2D points, polylines, and polygons, irrespective of the downstream task. We leverage the power of the 2D Fourier transform to encode useful spatial properties, such as shape and location, from geospatial objects into fixed-length vectors. These vectors are then inputted into neural network models for spatial reasoning tasks.This unified approach eliminates the need to develop and train separate models for each distinct spatial type. We evaluate Poly2Vec on both synthetic and real datasets of mixed geometry types and verify its consistent performance across several downstream spatial reasoning tasks.

Via

Access Paper or Ask Questions

Missing Value Imputation for Multi-attribute Sensor Data Streams via Message Propagation (Extended Version)

Nov 14, 2023

Xiao Li, Huan Li, Hua Lu, Christian S. Jensen, Varun Pandey, Volker Markl

Abstract:Sensor data streams occur widely in various real-time applications in the context of the Internet of Things (IoT). However, sensor data streams feature missing values due to factors such as sensor failures, communication errors, or depleted batteries. Missing values can compromise the quality of real-time analytics tasks and downstream applications. Existing imputation methods either make strong assumptions about streams or have low efficiency. In this study, we aim to accurately and efficiently impute missing values in data streams that satisfy only general characteristics in order to benefit real-time applications more widely. First, we propose a message propagation imputation network (MPIN) that is able to recover the missing values of data instances in a time window. We give a theoretical analysis of why MPIN is effective. Second, we present a continuous imputation framework that consists of data update and model update mechanisms to enable MPIN to perform continuous imputation both effectively and efficiently. Extensive experiments on multiple real datasets show that MPIN can outperform the existing data imputers by wide margins and that the continuous imputation framework is efficient and accurate.

* Accepted at VLDB 2024

Via

Access Paper or Ask Questions

LightCTS: A Lightweight Framework for Correlated Time Series Forecasting

Feb 27, 2023

Zhichen Lai, Dalin Zhang, Huan Li, Christian S. Jensen, Hua Lu, Yan Zhao

Figure 1 for LightCTS: A Lightweight Framework for Correlated Time Series Forecasting

Figure 2 for LightCTS: A Lightweight Framework for Correlated Time Series Forecasting

Figure 3 for LightCTS: A Lightweight Framework for Correlated Time Series Forecasting

Figure 4 for LightCTS: A Lightweight Framework for Correlated Time Series Forecasting

Abstract:Correlated time series (CTS) forecasting plays an essential role in many practical applications, such as traffic management and server load control. Many deep learning models have been proposed to improve the accuracy of CTS forecasting. However, while models have become increasingly complex and computationally intensive, they struggle to improve accuracy. Pursuing a different direction, this study aims instead to enable much more efficient, lightweight models that preserve accuracy while being able to be deployed on resource-constrained devices. To achieve this goal, we characterize popular CTS forecasting models and yield two observations that indicate directions for lightweight CTS forecasting. On this basis, we propose the LightCTS framework that adopts plain stacking of temporal and spatial operators instead of alternate stacking that is much more computationally expensive. Moreover, LightCTS features light temporal and spatial operator modules, called L-TCN and GL-Former, that offer improved computational efficiency without compromising their feature extraction capabilities. LightCTS also encompasses a last-shot compression scheme to reduce redundant temporal features and speed up subsequent computations. Experiments with single-step and multi-step forecasting benchmark datasets show that LightCTS is capable of nearly state-of-the-art accuracy at much reduced computational and storage overheads.

* accepted by ACM SIGMOD 2023

Via

Access Paper or Ask Questions

PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

Nov 02, 2022

Siqi Bao, Huang He, Jun Xu, Hua Lu, Fan Wang, Hua Wu, Han Zhou, Wenquan Wu, Zheng-Yu Niu, Haifeng Wang

Figure 1 for PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

Figure 2 for PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

Figure 3 for PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

Figure 4 for PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

Abstract:Recently, the practical deployment of open-domain dialogue systems has been plagued by the knowledge issue of information deficiency and factual inaccuracy. To this end, we introduce PLATO-K based on two-stage dialogic learning to strengthen internal knowledge memorization and external knowledge exploitation. In the first stage, PLATO-K learns through massive dialogue corpora and memorizes essential knowledge into model parameters. In the second stage, PLATO-K mimics human beings to search for external information and to leverage the knowledge in response generation. Extensive experiments reveal that the knowledge issue is alleviated significantly in PLATO-K with such comprehensive internal and external knowledge enhancement. Compared to the existing state-of-the-art Chinese dialogue model, the overall engagingness of PLATO-K is improved remarkably by 36.2% and 49.2% on chit-chat and knowledge-intensive conversations.

* First four authors contributed equally to this work

Via

Access Paper or Ask Questions

Towards Boosting the Open-Domain Chatbot with Human Feedback

Aug 30, 2022

Hua Lu, Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang

Figure 1 for Towards Boosting the Open-Domain Chatbot with Human Feedback

Figure 2 for Towards Boosting the Open-Domain Chatbot with Human Feedback

Figure 3 for Towards Boosting the Open-Domain Chatbot with Human Feedback

Figure 4 for Towards Boosting the Open-Domain Chatbot with Human Feedback

Abstract:Many open-domain dialogue models pre-trained with social media comments can generate coherent replies but have difficulties producing engaging responses when interacting with real users. This phenomenon might mainly result from the deficiency of annotated human-human conversations and the misalignment with human preference. In this paper, we propose a novel and efficient approach Diamante to boost the open-domain chatbot, where two kinds of human feedback (including explicit demonstration and implicit preference) are collected and leveraged. By asking annotators to select or amend the model-generated candidate responses, Diamante efficiently collects the human demonstrated responses and constructs a Chinese chit-chat dataset. To enhance the alignment with human preference, Diamante leverages the implicit preference in the data collection process and introduces the generation-evaluation joint training. Comprehensive experiments indicate that the Diamante dataset and joint training paradigm can significantly boost the performance of Chinese pre-trained dialogue models.

* First two authors contributed equally to this work

Via

Access Paper or Ask Questions

RFID-Based Indoor Spatial Query Evaluation with Bayesian Filtering Techniques

Apr 02, 2022

Bo Hui, Wenlu Wang, Jiao Yu, Zhitao Gong, Wei-Shinn Ku, Min-Te Sun, Hua Lu

Figure 1 for RFID-Based Indoor Spatial Query Evaluation with Bayesian Filtering Techniques

Figure 2 for RFID-Based Indoor Spatial Query Evaluation with Bayesian Filtering Techniques

Figure 3 for RFID-Based Indoor Spatial Query Evaluation with Bayesian Filtering Techniques

Figure 4 for RFID-Based Indoor Spatial Query Evaluation with Bayesian Filtering Techniques

Abstract:People spend a significant amount of time in indoor spaces (e.g., office buildings, subway systems, etc.) in their daily lives. Therefore, it is important to develop efficient indoor spatial query algorithms for supporting various location-based applications. However, indoor spaces differ from outdoor spaces because users have to follow the indoor floor plan for their movements. In addition, positioning in indoor environments is mainly based on sensing devices (e.g., RFID readers) rather than GPS devices. Consequently, we cannot apply existing spatial query evaluation techniques devised for outdoor environments for this new challenge. Because Bayesian filtering techniques can be employed to estimate the state of a system that changes over time using a sequence of noisy measurements made on the system, in this research, we propose the Bayesian filtering-based location inference methods as the basis for evaluating indoor spatial queries with noisy RFID raw data. Furthermore, two novel models, indoor walking graph model and anchor point indexing model, are created for tracking object locations in indoor environments. Based on the inference method and tracking models, we develop innovative indoor range and k nearest neighbor (kNN) query algorithms. We validate our solution through use of both synthetic data and real-world data. Our experimental results show that the proposed algorithms can evaluate indoor spatial queries effectively and efficiently. We open-source the code, data, and floor plan at https://github.com/DataScienceLab18/IndoorToolKit.

Via

Access Paper or Ask Questions

Towards Building an Open-Domain Dialogue System Incorporated with Internet Memes

Mar 08, 2022

Hua Lu, Zhen Guo, Chanjuan Li, Yunyi Yang, Huang He, Siqi Bao

Figure 1 for Towards Building an Open-Domain Dialogue System Incorporated with Internet Memes

Figure 2 for Towards Building an Open-Domain Dialogue System Incorporated with Internet Memes

Figure 3 for Towards Building an Open-Domain Dialogue System Incorporated with Internet Memes

Figure 4 for Towards Building an Open-Domain Dialogue System Incorporated with Internet Memes

Abstract:In recent years, Internet memes have been widely used in online chatting. Compared with text-based communication, conversations become more expressive and attractive when Internet memes are incorporated. This paper presents our solutions for the Meme incorporated Open-domain Dialogue (MOD) Challenge of DSTC10, where three tasks are involved: text response modeling, meme retrieval, and meme emotion classification. Firstly, we leverage a large-scale pre-trained dialogue model for coherent and informative response generation. Secondly, based on interaction-based text-matching, our approach can retrieve appropriate memes with good generalization ability. Thirdly, we propose to model the emotion flow (EF) in conversations and introduce an auxiliary task of emotion description prediction (EDP) to boost the performance of meme emotion classification. Experimental results on the MOD dataset demonstrate that our methods can incorporate Internet memes into dialogue systems effectively.

* First two authors contributed equally to this work

Via

Access Paper or Ask Questions

Relational Surrogate Loss Learning

Feb 26, 2022

Tao Huang, Zekang Li, Hua Lu, Yong Shan, Shusheng Yang, Yang Feng, Fei Wang, Shan You, Chang Xu

Figure 1 for Relational Surrogate Loss Learning

Figure 2 for Relational Surrogate Loss Learning

Figure 3 for Relational Surrogate Loss Learning

Figure 4 for Relational Surrogate Loss Learning

Abstract:Evaluation metrics in machine learning are often hardly taken as loss functions, as they could be non-differentiable and non-decomposable, e.g., average precision and F1 score. This paper aims to address this problem by revisiting the surrogate loss learning, where a deep neural network is employed to approximate the evaluation metrics. Instead of pursuing an exact recovery of the evaluation metric through a deep neural network, we are reminded of the purpose of the existence of these evaluation metrics, which is to distinguish whether one model is better or worse than another. In this paper, we show that directly maintaining the relation of models between surrogate losses and metrics suffices, and propose a rank correlation-based optimization method to maximize this relation and learn surrogate losses. Compared to previous works, our method is much easier to optimize and enjoys significant efficiency and performance gains. Extensive experiments show that our method achieves improvements on various tasks including image classification and neural machine translation, and even outperforms state-of-the-art methods on human pose estimation and machine reading comprehension tasks. Code is available at: https://github.com/hunto/ReLoss.

* Accepted to ICLR 2022

Via

Access Paper or Ask Questions