Abstract:Large language models (LLMs), built on decoder-only transformers, excel in natural language generation and adapt to diverse tasks using zero-shot and few-shot prompting. However, these prompting methods often struggle on natural language understanding (NLU) tasks, where encoder-only models like BERT-base outperform LLMs on benchmarks like GLUE and SuperGLUE. This paper explores two approaches-supervised fine-tuning (SFT) and proximal policy optimization (PPO)-to enhance LLMs' NLU abilities. To reduce the cost of full-model fine-tuning, we integrate low-rank adaptation (LoRA) layers, limiting updates to these layers during both SFT and PPO. In SFT, task-specific prompts are concatenated with input queries and ground-truth labels, optimizing with next-token prediction. Despite this, LLMs still underperform compared to models like BERT-base on several NLU tasks. To close this gap, we apply PPO, a reinforcement learning technique that treats each token generation as an action and uses a reward function based on alignment with ground-truth answers. PPO then updates the model to maximize these rewards, aligning outputs with correct labels. Our experiments with LLAMA2-7B show that PPO improves performance, with a 6.3-point gain over SFT on GLUE. PPO exceeds zero-shot by 38.7 points and few-shot by 26.1 points on GLUE, while surpassing these by 28.8 and 28.5 points on SuperGLUE. Additionally, PPO outperforms BERT-large by 2.7 points on GLUE and 9.3 points on SuperGLUE. The improvements are consistent across models like Qwen2.5-7B and MPT-7B, highlighting PPO's robustness in enhancing LLMs' NLU capabilities.
Abstract:Image captured under low-light conditions presents unpleasing artifacts, which debilitate the performance of feature extraction for many upstream visual tasks. Low-light image enhancement aims at improving brightness and contrast, and further reducing noise that corrupts the visual quality. Recently, many image restoration methods based on Swin Transformer have been proposed and achieve impressive performance. However, On one hand, trivially employing Swin Transformer for low-light image enhancement would expose some artifacts, including over-exposure, brightness imbalance and noise corruption, etc. On the other hand, it is impractical to capture image pairs of low-light images and corresponding ground-truth, i.e. well-exposed image in same visual scene. In this paper, we propose a dual-branch network based on Swin Transformer, guided by a signal-to-noise ratio prior map which provides the spatial-varying information for low-light image enhancement. Moreover, we leverage unsupervised learning to construct the optimization objective based on Retinex model, to guide the training of proposed network. Experimental results demonstrate that the proposed model is competitive with the baseline models.
Abstract:At the age of big data, recommender systems have shown remarkable success as a key means of information filtering in our daily life. Recent years have witnessed the technical development of recommender systems, from perception learning to cognition reasoning which intuitively build the task of recommendation as the procedure of logical reasoning and have achieve significant improvement. However, the logical statement in reasoning implicitly admits irrelevance of ordering, even does not consider time information which plays an important role in many recommendation tasks. Furthermore, recommendation model incorporated with temporal context would tend to be self-attentive, i.e., automatically focus more (less) on the relevance (irrelevance), respectively. To address these issues, in this paper, we propose a Time-aware Self-Attention with Neural Collaborative Reasoning (TiSANCR) based recommendation model, which integrates temporal patterns and self-attention mechanism into reasoning-based recommendation. Specially, temporal patterns represented by relative time, provide context and auxiliary information to characterize the user's preference in recommendation, while self-attention is leveraged to distill informative patterns and suppress irrelevances. Therefore, the fusion of self-attentive temporal information provides deeper representation of user's preference. Extensive experiments on benchmark datasets demonstrate that the proposed TiSANCR achieves significant improvement and consistently outperforms the state-of-the-art recommendation methods.
Abstract:Lawyers and judges spend a large amount of time researching the proper legal authority to cite while drafting decisions. In this paper, we develop a citation recommendation tool that can help improve efficiency in the process of opinion drafting. We train four types of machine learning models, including a citation-list based method (collaborative filtering) and three context-based methods (text similarity, BiLSTM and RoBERTa classifiers). Our experiments show that leveraging local textual context improves recommendation, and that deep neural models achieve decent performance. We show that non-deep text-based methods benefit from access to structured case metadata, but deep models only benefit from such access when predicting from context of insufficient length. We also find that, even after extensive training, RoBERTa does not outperform a recurrent neural model, despite its benefits of pretraining. Our behavior analysis of the RoBERTa model further shows that predictive performance is stable across time and citation classes.
Abstract:Anomalous diffusion, which shows a deviation of transport dynamics from the framework of standard Brownian motion, is involved in the evolution of various physical, chemical, biological, and economic systems. The study of such random processes is of fundamental importance in unveiling the physical properties of random walkers and complex systems. However, classical methods to characterize anomalous diffusion are often disqualified for individual short trajectories, leading to the launch of the Anomalous Diffusion (AnDi) Challenge. This challenge aims at objectively assessing and comparing new approaches for single trajectory characterization, with respect to three different aspects: the inference of the anomalous diffusion exponent; the classification of the diffusion model; and the segmentation of trajectories. In this article, to address the inference and classification tasks in the challenge, we develop a WaveNet-based deep neural network (WADNet) by combining a modified WaveNet encoder with long short-term memory networks, without any prior knowledge of anomalous diffusion. As the performance of our model has surpassed the current 1st places in the challenge leaderboard on both two tasks for all dimensions (6 subtasks), WADNet could be the part of state-of-the-art techniques to decode the AnDi database. Our method presents a benchmark for future research, and could accelerate the development of a versatile tool for the characterization of anomalous diffusion.
Abstract:Newspaper headlines contribute severely and have an influence on the social media. This work studies the durability of impact of verbs and adjectives on headlines and determine the factors which are responsible for its nature of influence on the social media. Each headline has been categorized into positive, negative or neutral based on its sentiment score. Initial results show that intensity of a sentiment nature is positively correlated with the social media impression. Additionally, verbs and adjectives show a relation with the sentiment scores