Aleksandr Nikolich

Sber AI

Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian

May 22, 2024

Aleksandr Nikolich, Konstantin Korolev, Artem Shelmanov

Abstract: There has been a surge in the development of various Large Language Models (LLMs). However, text generation for languages other than English often faces significant challenges, including poor generation quality and reduced computational performance due to the disproportionate representation of tokens in the model's vocabulary. In this work, we address these issues and introduce Vikhr, a new state-of-the-art open-source instruction-tuned LLM designed specifically for the Russian language. Unlike previous efforts for Russian that utilize computationally inexpensive LoRA adapters on top of English-oriented models, Vikhr features an adapted tokenizer vocabulary and undergoes continued pre-training and instruction tuning of all weights. This approach not only enhances the model's performance but also significantly improves its computational and contextual efficiency. The remarkable performance of Vikhr across various Russian-language benchmarks can also be attributed to our efforts in expanding instruction datasets and corpora for continued pre-training. Vikhr not only sets a new state of the art among open-source LLMs for Russian, but even outperforms some proprietary closed-source models on certain benchmarks. The model weights, instruction sets, and code are publicly available.

Emojich -- zero-shot emoji generation using Russian language: a technical report

Dec 04, 2021

Alex Shonenkov, Daria Bakshandaeva, Denis Dimitrov, Aleksandr Nikolich

Abstract: This technical report presents a text-to-image neural network, "Emojich", that generates emojis conditioned on captions in the Russian language. We aim to preserve the generalization ability of the large pretrained model ruDALL-E Malevich (XL), with 1.3B parameters, at the fine-tuning stage, while giving a special style to the generated images. We present some engineering methods, a code implementation, all hyperparameters for reproducing the results, and a Telegram bot where everyone can create their own customized sets of stickers. Some newly generated emojis obtained with the "Emojich" model are also demonstrated.

* 5 pages, 4 figures and a large figure in the appendix, technical report
