Abstract: While large language models have made significant progress in mathematical reasoning, they remain unreliable at judging the correctness of their own solutions. Existing approaches that equip models with self-verification typically treat solution generation and verification as two separate tasks, leading to substantially increased training time. In this paper, we propose FABSVer, which fuses these two tasks into a single generation pass, dramatically reducing training overhead while jointly optimizing both capabilities. We further identify a convergence bottleneck both theoretically and empirically: as training progresses, the reward reaches a plateau because the policy is constrained by a fixed reference model. To overcome this, we introduce Dynamic Reference Model Update (DRMU), which raises the reward ceiling and enables sustained reward growth. Extensive experiments on math benchmarks demonstrate that FABSVer achieves superior self-verification and reasoning performance across three model scales, while requiring only 51% to 71% of the training time of existing methods. Analysis further reveals distinct learning phases in how models acquire self-verification, and that the gap between verify and answer rewards shrinks noticeably as model size increases.
Abstract: In this study, we introduce Orion-14B, a collection of multilingual large language models with 14 billion parameters. We utilize a data scheduling approach to train a foundational model on a diverse corpus of 2.5 trillion tokens, sourced from texts in English, Chinese, Japanese, Korean, and other languages. Additionally, we fine-tune a series of models tailored for conversational applications and other specific use cases. Our evaluation results demonstrate that Orion-14B achieves state-of-the-art performance across a broad spectrum of tasks. We make the Orion-14B model family and its associated code publicly accessible at https://github.com/OrionStarAI/Orion, aiming to inspire future research and practical applications in the field.




Abstract: Short-term load forecasting is of great significance to power systems. In this paper, we propose a new connection, the Dense Average connection, in which the outputs of all previous layers are averaged to form the input of the next layer in a feedforward manner. Compared with a fully connected layer, the Dense Average connection introduces no new trainable parameters. Based on the Dense Average connection, we build the Dense Average Network for load forecasting. We verify the validity of the model on two public datasets and one real dataset. Compared with an ANN, our proposed model converges better and predicts more accurately. We also use an ensemble method to further improve prediction performance. To verify the reliability of the model, we perturb its input to different degrees. Experimental results show that the proposed model is robust to these perturbations.
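
The Dense Average connection described in this abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract does not state layer widths, activation, or whether the raw input takes part in the average. The sketch assumes that all hidden layers share one width, that ReLU is the activation, and that only hidden-layer outputs are averaged. All names (`dense_average_forward`, `init_layer`, `mean_of`) and sizes are hypothetical.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(weights, bias, v):
    # weights is a list of rows; each row has len(v) entries.
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def mean_of(vectors):
    # Element-wise average of equally sized vectors (no parameters involved).
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def init_layer(n_in, n_out, rng):
    # Assumed initialization: small uniform weights, zero bias.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense_average_forward(x, layers, head):
    outputs = []          # outputs of every hidden layer computed so far
    layer_input = x
    for weights, bias in layers:
        outputs.append(relu(linear(weights, bias, layer_input)))
        # Dense Average connection: the next layer receives the mean of
        # all previous layer outputs, which adds no trainable parameters.
        layer_input = mean_of(outputs)
    return linear(head[0], head[1], layer_input)

rng = random.Random(0)
n_features, width, depth = 24, 8, 3   # assumed sizes, e.g. 24 hourly loads
layers = [init_layer(n_features, width, rng)]
layers += [init_layer(width, width, rng) for _ in range(depth - 1)]
head = init_layer(width, 1, rng)      # single-value load forecast

x = [rng.random() for _ in range(n_features)]
forecast = dense_average_forward(x, layers, head)
```

Because the averaging step is a fixed element-wise mean, the only trainable parameters in this sketch are those of the ordinary linear layers, which matches the abstract's claim that the connection itself adds none.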