Amr Kayid

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model

Feb 12, 2024

Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, Wei-Yin Ko, Daniel D'souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid (+7 more)

Abstract: Recent breakthroughs in large language models (LLMs) have centered around a handful of data-rich languages. What does it take to broaden access to breakthroughs beyond first-class citizen languages? Our work introduces Aya, a massively multilingual generative language model that follows instructions in 101 languages of which over 50% are considered as lower-resourced. Aya outperforms mT0 and BLOOMZ on the majority of tasks while covering double the number of languages. We introduce extensive new evaluation suites that broaden the state-of-art for multilingual eval across 99 languages -- including discriminative and generative tasks, human evaluation, and simulated win rates that cover both held-out tasks and in-distribution performance. Furthermore, we conduct detailed investigations on the optimal finetuning mixture composition, data pruning, as well as the toxicity, bias, and safety of our models. We open-source our instruction datasets and our model at https://hf.co/CohereForAI/aya-101

Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change

Dec 01, 2020

Ayush Manish Agrawal, Atharva Tendle, Harshvardhan Sikka, Sahib Singh, Amr Kayid

Abstract: Understanding the per-layer learning dynamics of deep neural networks is of significant interest as it may provide insights into how neural networks learn and the potential for better training regimens. We investigate learning in Deep Convolutional Neural Networks (CNNs) by measuring the relative weight change of layers while training. Several interesting trends emerge in a variety of CNN architectures across various computer vision classification tasks, including the overall increase in relative weight change of later layers as compared to earlier ones.

* 14 pages, 20 figures
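
To make "relative weight change of layers" concrete, here is a minimal sketch in plain Python. The abstract does not give a formula, so this assumes one common definition: the L1 norm of the change in a layer's weights between two training snapshots, divided by the L1 norm of the earlier weights. The function name `relative_weight_change`, the layer names, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Assumed representation: layer name -> flattened list of that layer's weights.
Weights = Dict[str, List[float]]


def relative_weight_change(prev: Weights, curr: Weights, eps: float = 1e-12) -> Dict[str, float]:
    """Per-layer relative change, assumed here to be ||curr - prev||_1 / ||prev||_1."""
    changes = {}
    for name, w_prev in prev.items():
        w_curr = curr[name]
        # L1 norm of the element-wise difference between the two snapshots.
        diff = sum(abs(c - p) for p, c in zip(w_prev, w_curr))
        # L1 norm of the earlier snapshot; eps guards against an all-zero layer.
        norm = sum(abs(p) for p in w_prev)
        changes[name] = diff / (norm + eps)
    return changes


# Toy snapshots (hypothetical values): an early conv layer and a later dense layer.
before = {"conv1": [1.0, -1.0, 2.0], "fc": [0.5, 0.5]}
after = {"conv1": [1.1, -1.0, 2.0], "fc": [1.0, 0.0]}

changes = relative_weight_change(before, after)
for layer, value in changes.items():
    print(f"{layer}: {value:.3f}")
```

In this toy case the later layer (`fc`) changes far more, relative to its size, than the earlier one (`conv1`), which is the kind of trend the abstract describes. In a real experiment the snapshots would come from the network's parameters at successive epochs.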
