Abstract: In this study, we introduce Orion-14B, a collection of multilingual large language models with 14 billion parameters. We utilize a data scheduling approach to train a foundational model on a diverse corpus of 2.5 trillion tokens, sourced from texts in English, Chinese, Japanese, Korean, and other languages. Additionally, we fine-tune a series of models tailored for conversational applications and other specific use cases. Our evaluation results demonstrate that Orion-14B achieves state-of-the-art performance across a broad spectrum of tasks. We make the Orion-14B model family and its associated code publicly accessible at https://github.com/OrionStarAI/Orion, aiming to inspire future research and practical applications in the field.
Abstract: Short-term load forecasting is of great significance to power systems. In this paper, we propose a new connection, the Dense Average connection, in which the outputs of all previous layers are averaged to form the input of the next layer in a feedforward manner. Compared with a fully connected layer, the Dense Average connection introduces no new training parameters. Based on the Dense Average connection, we build the Dense Average Network for load forecasting. We verify the validity of the model on two public datasets and one real-world dataset. Compared with an ANN, our proposed model achieves better convergence and prediction performance. Meanwhile, we use an ensemble method to further improve the prediction performance. To verify the reliability of the model, we also perturb the model input to varying degrees. Experimental results show that the proposed model is highly robust.
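The Dense Average connection described above can be sketched as follows. This is a minimal illustrative forward pass, not the authors' implementation: it assumes all layers share the same width (so their outputs can be averaged element-wise) and uses ReLU activations; the function name `dense_average_forward` and the parameter layout are hypothetical.

```python
import numpy as np

def dense_average_forward(x, weights, biases):
    """Forward pass with Dense Average connections.

    The input to each layer is the element-wise average of the outputs
    of all previous layers (treating the network input as layer 0).
    Averaging adds no trainable parameters beyond the ordinary layer
    weights, in contrast to a learned fully connected combination.

    Assumption: every layer has the same width as the input, so the
    averaged tensors are shape-compatible.
    """
    outputs = [x]  # layer-0 "output" is the network input itself
    for W, b in zip(weights, biases):
        layer_in = np.mean(outputs, axis=0)      # average all previous outputs
        h = np.maximum(0.0, layer_in @ W + b)    # ReLU hidden layer
        outputs.append(h)
    return outputs[-1]
```

In a load-forecasting setting, a final linear layer would typically map this last hidden output to the predicted load values.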