Abstract:We participated in the KDD CUP 2024 paper source tracing competition and achieved 3rd place. This competition tasked participants with identifying the reference sources (i.e., ref-sources, as termed by the competition organizers) of given academic papers. Unlike most teams, which addressed this challenge by fine-tuning pre-trained neural language models such as BERT or ChatGLM, our primary approach utilized closed-source large language models (LLMs). With recent advancements in LLM technology, closed-source LLMs have demonstrated the capability to tackle complex reasoning tasks in zero-shot or few-shot scenarios. Consequently, in the absence of GPUs, we employed closed-source LLMs to directly generate predicted reference sources from the provided papers, and further refined these predictions through ensemble learning. Notably, our method was the only award-winning approach that did not require GPUs for model training. Code is available at https://github.com/Cklwanfifa/KDDCUP2024-PST.
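A minimal sketch of the kind of ensemble step described above, assuming each LLM prompt variant returns a ranked list of candidate reference indices for one paper; the weighting scheme and function names are illustrative, not the team's actual pipeline.

```python
from collections import Counter

def ensemble_ref_sources(runs, top_k=3):
    """Combine reference-source predictions from several LLM runs by voting.

    `runs` is a list of prediction lists, one per LLM run / prompt variant;
    each prediction list contains candidate reference indices for one paper.
    """
    votes = Counter()
    for run in runs:
        for rank, ref in enumerate(run):
            # Earlier-ranked candidates get slightly more weight (assumed scheme).
            votes[ref] += 1.0 / (rank + 1)
    return [ref for ref, _ in votes.most_common(top_k)]

# Example: three LLM runs nominate candidate reference indices for one paper.
runs = [[2, 7, 5], [7, 2, 9], [2, 9, 5]]
print(ensemble_ref_sources(runs))  # [2, 7, 9]
```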
Abstract:In the application of Neural Architecture Search (NAS), it is of great significance to estimate the performance of a given model architecture without training it, since fully evaluating each architecture can take a long time. In this paper, a novel NAS framework called GP-NAS-ensemble is proposed to predict the performance of a neural network architecture from a small training dataset. We make several improvements to the GP-NAS model so that it also enjoys the advantages of ensemble learning methods. Our method ranks second in the performance prediction track of the CVPR 2022 second lightweight NAS challenge.
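A minimal sketch of an ensemble of Gaussian-process performance predictors in the spirit described above; the architecture encoding, kernels, and bootstrap resampling are assumptions for illustration, not the GP-NAS-ensemble formulation itself.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

# Toy data: each row encodes an architecture, each target is its accuracy.
rng = np.random.default_rng(0)
X = rng.random((40, 6))          # 40 architectures, 6 encoded design choices
y = X.sum(axis=1) + 0.05 * rng.standard_normal(40)

# Train several GPs on different kernels / bootstrap samples and average them.
kernels = [RBF(length_scale=1.0), Matern(length_scale=1.0, nu=1.5)]
models = []
for kernel in kernels:
    idx = rng.choice(len(X), size=len(X), replace=True)   # bootstrap resample
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(X[idx], y[idx])
    models.append(gp)

X_new = rng.random((5, 6))
pred = np.mean([gp.predict(X_new) for gp in models], axis=0)  # ensemble estimate
print(pred)
```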
Abstract:We took part in the City Brain Challenge competition and achieved 8th place. In this competition, the players are provided with a real-world city-scale road network and its traffic demand derived from real traffic data, and are asked to coordinate the traffic signals with a self-designed agent so as to maximize the number of vehicles served while maintaining an acceptable delay. In this abstract paper, we present an overall analysis and our detailed solution to this competition. Our approach is mainly based on adapting the deep Q-network (DQN) for real-time traffic signal control. From our perspective, the major challenge of this competition is how to extend the classical DQN framework to traffic signal control in a complex real-world road network with realistic traffic flow. After implementing and testing several classical reward functions, we finally chose to apply our newly designed reward in our agent. By applying this reward function and carefully tuning the control scheme, an agent based on a single DQN model can rank among the top 15 teams. We hope this paper can serve, to some extent, as a baseline solution for traffic signal control in real-world road networks and inspire further attempts and research.
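Since the abstract does not spell out the newly designed reward, the following is only an illustrative sketch of a reward that trades off served vehicles against delay; the weights and signal definitions are assumptions.

```python
def traffic_signal_reward(n_served, delays, alpha=1.0, beta=0.1):
    """Illustrative reward for one intersection at one control step.

    n_served: vehicles that passed the intersection since the last action.
    delays:   current delays (seconds) of vehicles still queued at the approaches.
    alpha, beta: placeholder trade-off weights, not the competition values.
    """
    avg_delay = sum(delays) / len(delays) if delays else 0.0
    return alpha * n_served - beta * avg_delay

# Example: 12 vehicles served, three queued vehicles with these delays.
print(traffic_signal_reward(12, [30.0, 45.0, 10.0]))
```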
Abstract:Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph.
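A minimal PyTorch sketch of question-directed attention over a context graph, i.e., message passing whose attention queries are conditioned on the question representation; the heterogeneous node types and the full multi-step reasoning of the paper's network are omitted.

```python
import torch

def question_directed_attention(node_h, question_h, adj, w_q, w_k, w_v):
    """One illustrative message-passing step over a context graph.

    node_h:     [N, d] node representations (e.g. numbers/entities in the passage)
    question_h: [d]    pooled question representation
    adj:        [N, N] 0/1 adjacency matrix of the context graph
    """
    q = (node_h + question_h) @ w_q           # queries conditioned on the question
    k = node_h @ w_k
    v = node_h @ w_v
    scores = q @ k.T / k.shape[-1] ** 0.5
    scores = scores.masked_fill(adj == 0, float("-inf"))  # attend to neighbors only
    attn = torch.softmax(scores, dim=-1)
    return attn @ v

d, n = 16, 5
torch.manual_seed(0)
node_h, question_h = torch.randn(n, d), torch.randn(d)
adj = torch.eye(n) + torch.diag(torch.ones(n - 1), 1)      # a simple chain graph
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
print(question_directed_attention(node_h, question_h, adj, w_q, w_k, w_v).shape)
```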
Abstract:Chinese Spelling Check (CSC) is a task to detect and correct spelling errors in Chinese natural language. Existing methods have attempted to incorporate similarity knowledge between Chinese characters. However, they treat this similarity knowledge either as an external input resource or merely as heuristic rules. This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). The model builds a graph over the characters, and SpellGCN is trained to map this graph into a set of inter-dependent character classifiers. These classifiers are applied to the representations extracted by another network, such as BERT, enabling the whole network to be trained end-to-end. Experiments (the dataset and all code for this paper are available at https://github.com/ACL2020SpellGCN/SpellGCN) are conducted on three human-annotated datasets. Our method outperforms previous models by a large margin.
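A minimal sketch of the core idea: a graph convolution over a character-similarity graph produces inter-dependent character classifiers that are applied to encoder representations; the dimensions, the similarity graph, and the fusion details are simplified relative to SpellGCN.

```python
import torch

def gcn_layer(node_feats, adj, weight):
    """Symmetrically normalized graph convolution: A_hat @ X @ W."""
    deg = adj.sum(dim=1)
    d_inv_sqrt = deg.pow(-0.5)
    a_hat = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return torch.relu(a_hat @ node_feats @ weight)

vocab, hidden = 100, 32
torch.manual_seed(0)
char_emb = torch.randn(vocab, hidden)        # initial character embeddings
sim_adj = torch.eye(vocab)                   # phonological/visual similarity graph
sim_adj[3, 7] = sim_adj[7, 3] = 1.0          # e.g. two confusable characters
w = torch.randn(hidden, hidden)

# Graph convolution turns the similarity graph into character classifier weights.
classifier_weights = gcn_layer(char_emb, sim_adj, w)      # [vocab, hidden]

# Apply the classifiers to encoder (e.g. BERT) token representations.
token_repr = torch.randn(8, hidden)                       # 8 tokens from the encoder
logits = token_repr @ classifier_weights.T                # [8, vocab] correction scores
print(logits.shape)
```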
Abstract:The ability to perform semantic reasoning over a sentence pair is essential for many natural language understanding tasks, e.g., natural language inference and machine reading comprehension. A recent significant improvement in these tasks comes from BERT. As reported, the next sentence prediction (NSP) task in BERT, which learns the contextual relationship between two sentences, is of great significance for downstream problems with sentence-pair input. Despite the effectiveness of NSP, we argue that NSP still lacks the essential signal to distinguish between entailment and shallow correlation. To remedy this, we propose to augment the NSP task into a 3-class categorization task, which includes a category for previous sentence prediction (PSP). The involvement of PSP encourages the model to focus on the informative semantics needed to determine the sentence order, thereby improving its semantic understanding. This simple modification yields remarkable improvement over vanilla BERT. To further incorporate document-level information, the scope of NSP and PSP is expanded to a broader range, i.e., NSP and PSP also include close but non-successive sentences, and the resulting label noise is mitigated by the label-smoothing technique. Both qualitative and quantitative experimental results demonstrate the effectiveness of the proposed method. Our method consistently improves the performance on NLI and MRC benchmarks, including the challenging HANS dataset~\cite{hans}, suggesting that document-level tasks remain promising for pre-training.
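A minimal sketch of how 3-class NSP/PSP training pairs and label-smoothed targets could be constructed; the sampling scheme and window size are assumptions, not the paper's exact recipe.

```python
import random

LABELS = {"next": 0, "previous": 1, "random": 2}

def make_sentence_pair(doc, sentences_pool, window=1):
    """Build one (anchor, candidate, label) example from a document.

    With `window` > 1, close but non-successive sentences are also labeled
    next/previous, which is where label smoothing softens the noisy targets.
    """
    i = random.randrange(len(doc))
    choice = random.choice(list(LABELS))
    if choice == "next" and i + window < len(doc):
        return doc[i], doc[i + random.randint(1, window)], LABELS["next"]
    if choice == "previous" and i - window >= 0:
        return doc[i], doc[i - random.randint(1, window)], LABELS["previous"]
    return doc[i], random.choice(sentences_pool), LABELS["random"]

def smoothed_target(label, n_classes=3, eps=0.1):
    """Label-smoothed target distribution for the 3-class NSP/PSP head."""
    return [1.0 - eps if c == label else eps / (n_classes - 1) for c in range(n_classes)]

doc = ["s1", "s2", "s3", "s4"]
pool = ["x1", "x2", "x3"]
a, b, label = make_sentence_pair(doc, pool, window=2)
print(a, b, label, smoothed_target(label))
```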
Abstract:Ambiguous annotation criteria lead to divergence among Chinese Word Segmentation (CWS) datasets with various granularities. Multi-criteria learning leverages the annotation style of individual datasets and mines their common basic knowledge. In this paper, we propose a domain-adaptive segmenter to capture the diverse criteria of these datasets. Our model is based on Bidirectional Encoder Representations from Transformers (BERT), which is responsible for introducing external knowledge. We also optimize its computational efficiency via model pruning, quantization, and compiler optimization. Experiments show that our segmenter outperforms previous results on 10 CWS datasets and is faster than the previous state-of-the-art Bi-LSTM-CRF model.
Abstract:A convolutional sequence-to-sequence non-intrusive load monitoring model is proposed in this paper. Gated linear unit convolutional layers are used to extract information from the sequences of aggregate electricity consumption. Residual blocks are also introduced to refine the output of the neural network. The partially overlapped output sequences of the network are averaged to produce the final output of the model. We apply the proposed model to the REDD dataset and compare it with the convolutional sequence-to-point model in the literature. Results show that the proposed model is able to give satisfactory disaggregation performance for appliances with varied characteristics.
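A minimal sketch of the final averaging step, assuming the network emits one output window per sliding input window taken with a fixed stride; the shapes and stride are illustrative.

```python
import numpy as np

def average_overlapping_windows(windows, stride):
    """Average partially overlapped output windows into one disaggregated sequence.

    windows: array of shape [n_windows, window_len], the network's outputs
             for sliding input windows taken with the given stride.
    """
    n_windows, window_len = windows.shape
    total_len = (n_windows - 1) * stride + window_len
    acc = np.zeros(total_len)
    counts = np.zeros(total_len)
    for i, win in enumerate(windows):
        start = i * stride
        acc[start:start + window_len] += win
        counts[start:start + window_len] += 1
    return acc / counts

# Toy example: 3 overlapping output windows of length 4, stride 2.
windows = np.array([[1., 1., 1., 1.],
                    [3., 3., 3., 3.],
                    [5., 5., 5., 5.]])
print(average_overlapping_windows(windows, stride=2))  # [1. 1. 2. 2. 4. 4. 5. 5.]
```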
Abstract:We present in this paper a model for forecasting short-term power loads based on deep residual networks. The proposed model is able to integrate domain knowledge and researchers' understanding of the task by virtue of different neural network building blocks. Specifically, a modified deep residual network is formulated to improve the forecast results. Further, a two-stage ensemble strategy is used to enhance the generalization capability of the proposed model. We also apply the proposed model to probabilistic load forecasting using Monte Carlo dropout. Three public datasets are used to prove the effectiveness of the proposed model. Multiple test cases and comparison with existing models show that the proposed model is able to provide accurate load forecasting results and has high generalization capability.
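A minimal sketch of Monte Carlo dropout for probabilistic forecasting, assuming a generic regression network with dropout layers; the actual model is a deep residual network and the input features here are placeholders.

```python
import torch
import torch.nn as nn

# A small regression net with dropout; the real model is a deep residual network.
model = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1))

def mc_dropout_forecast(model, x, n_samples=100):
    """Monte Carlo dropout: keep dropout active at test time and sample forward passes."""
    model.train()                  # keeps the Dropout layers stochastic
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)   # point forecast and spread

x = torch.randn(5, 24)             # e.g. the last 24 hourly loads for 5 forecast origins
mean, std = mc_dropout_forecast(model, x)
print(mean.shape, std.shape)
```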