Danhao Zhu

RevOrder: A Novel Method for Enhanced Arithmetic in Language Models

Feb 06, 2024

Si Shen, Peijun Shen, Danhao Zhu

Abstract: This paper presents RevOrder, a novel technique aimed at improving arithmetic operations in large language models (LLMs) by reversing the output digits in addition, subtraction, and n-digit by 1-digit (nD by 1D) multiplication tasks. Our method significantly reduces the Count of Sequential Intermediate Digits (CSID), a new metric we introduce to assess equation complexity, to $\mathcal{O}(1)$. Through comprehensive testing, RevOrder not only achieves perfect accuracy in basic arithmetic operations but also substantially boosts LLM performance in division tasks, particularly with large numbers where traditional models struggle. Implementation of RevOrder is cost-effective for both training and inference phases. Moreover, applying RevOrder to fine-tune the LLaMA2-7B model on the GSM8K math task results in a considerable improvement, reducing equation calculation errors by 46% and increasing overall scores from 41.6 to 44.4.
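
The idea of reversed output digits can be illustrated with plain addition. When a sum is written most-significant digit first, the first digit depends on carries from every lower position; written least-significant digit first, each digit depends only on the current digit pair and a single carry. The sketch below is an illustration of that point only: the function name `add_reversed` is hypothetical, and the exact output format and CSID definition used in the paper are not given in the abstract.

```python
def add_reversed(a: str, b: str) -> str:
    """Add two non-negative decimal integers given as strings.

    Returns the digits of the sum in reversed order (least-significant
    digit first). Hypothetical helper for illustration, not the paper's code.
    """
    i, j = len(a) - 1, len(b) - 1
    carry = 0
    out = []
    while i >= 0 or j >= 0 or carry:
        da = int(a[i]) if i >= 0 else 0
        db = int(b[j]) if j >= 0 else 0
        total = da + db + carry
        # Each emitted digit needs only the current pair and one carry.
        out.append(str(total % 10))
        carry = total // 10
        i -= 1
        j -= 1
    return "".join(out)


print(add_reversed("1958", "367"))        # 5232 (i.e. 2325 reversed)
print(add_reversed("1958", "367")[::-1])  # 2325
```

Reversing the returned string recovers the conventional notation, so the reversed form carries the same information while being producible one digit at a time.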

Pre-train and Learn: Preserve Global Information for Graph Neural Networks

Oct 27, 2019

Danhao Zhu, Xin-yu Dai, Jiajun Chen

Abstract: Graph neural networks (GNNs) have shown great power in learning on attributed graphs. However, it is still a challenge for GNNs to utilize information far away from the source node. Moreover, general GNNs require graph attributes as input, so they cannot be applied to plain graphs. In this paper, we propose new models named G-GNNs (Global information for GNNs) to address the above limitations. First, the global structure and attribute features for each node are obtained via unsupervised pre-training, which preserves the global information associated with the node. Then, using the global features and the raw network attributes, we propose a parallel framework of GNNs to learn different aspects from these features. The proposed learning methods can be applied to both plain graphs and attributed graphs. Extensive experiments have shown that G-GNNs can outperform other state-of-the-art models on three standard evaluation graphs. In particular, our methods establish new benchmark records on Cora (84.31%) and Pubmed (80.95%) when learning on attributed graphs.

Going Wider: Recurrent Neural Network With Parallel Cells

May 03, 2017

Danhao Zhu, Si Shen, Xin-Yu Dai, Jiajun Chen

Abstract: Recurrent Neural Networks (RNNs) have been widely applied to sequence modeling. In an RNN, the hidden states at the current step are fully connected to those at the previous step, so the influence of less related features from the previous step may decrease the model's learning ability. We propose a simple technique called parallel cells (PCs) to enhance the learning ability of RNNs. In each layer, we run multiple small RNN cells rather than one single large cell. In this paper, we evaluate PCs on two tasks. On the language modeling task on PTB (Penn Tree Bank), our model outperforms state-of-the-art models by decreasing perplexity from 78.6 to 75.3. On the Chinese-English translation task, our model improves the BLEU score by 0.39 points over the baseline model.
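
One way to read "multiple small RNN cells rather than one single large cell" is sketched below in plain Python. Everything beyond that sentence is an assumption made for illustration: the cells are vanilla tanh cells, each cell sees the full input but only its own previous hidden state, the hidden size is split evenly, and the names `make_cell`, `cell_step`, `parallel_step`, and `recurrent_params` are hypothetical. The abstract does not specify how the paper combines the cells' outputs.

```python
import math
import random


def make_cell(input_size, hidden_size, rng):
    """Random weights for one vanilla tanh RNN cell (assumed cell type)."""
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(input_size)]
            for _ in range(hidden_size)]
    w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    return w_in, w_rec


def cell_step(cell, x, h):
    """One step of a single cell: h' = tanh(W_in x + W_rec h)."""
    w_in, w_rec = cell
    return [
        math.tanh(sum(w * xi for w, xi in zip(row_in, x))
                  + sum(w * hi for w, hi in zip(row_rec, h)))
        for row_in, row_rec in zip(w_in, w_rec)
    ]


def parallel_step(cells, x, hs):
    """Run every small cell on the same input; hidden states do not mix."""
    return [cell_step(cell, x, h) for cell, h in zip(cells, hs)]


def recurrent_params(cells):
    """Total number of hidden-to-hidden weights across the given cells."""
    return sum(len(w_rec) * len(w_rec[0]) for _, w_rec in cells)


rng = random.Random(0)
small = [make_cell(3, 8, rng) for _ in range(4)]  # four cells of size 8
big = [make_cell(3, 32, rng)]                     # one cell of size 32
print(recurrent_params(small), recurrent_params(big))  # 256 1024
```

Under these assumptions, the same total hidden size uses a quarter of the hidden-to-hidden weights, because units in different cells are no longer connected across time steps.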
