Elnaz Rahmati

CoCo-CoLa: Evaluating Language Adherence in Multilingual LLMs

Feb 18, 2025

Elnaz Rahmati, Alireza S. Ziabari, Morteza Dehghani

Abstract: Multilingual Large Language Models (LLMs) develop cross-lingual abilities despite being trained on limited parallel data. However, they often struggle to generate responses in the intended language, favoring high-resource languages such as English. In this work, we introduce CoCo-CoLa (Correct Concept - Correct Language), a novel metric to evaluate language adherence in multilingual LLMs. Using fine-tuning experiments on a closed-book QA task across seven languages, we analyze how training in one language affects performance in the others. Our findings reveal that multilingual models share task knowledge across languages but exhibit biases in the selection of output language. We identify language-specific layers, showing that final layers play a crucial role in determining output language. Accordingly, we propose a partial training strategy that selectively fine-tunes key layers, improving language adherence while significantly reducing computational cost. Our method achieves comparable or superior performance to full fine-tuning, particularly for low-resource languages, offering a more efficient approach to multilingual adaptation.

* 13 pages, 7 figures

Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash

Nov 15, 2024

Parsa Hejabi, Elnaz Rahmati, Alireza S. Ziabari, Preni Golazizian, Jesse Thomason, Morteza Dehghani

Abstract: Large Language Models (LLMs) have shown impressive capabilities in complex tasks and interactive environments, yet their creativity remains underexplored. This paper introduces a simulation framework utilizing the game Balderdash to evaluate both the creativity and logical reasoning of LLMs. In Balderdash, players generate fictitious definitions for obscure terms to deceive others while identifying correct definitions. Our framework enables multiple LLM agents to participate in this game, assessing their ability to produce plausible definitions and strategize based on game rules and history. We implemented a centralized game engine featuring various LLMs as participants and a judge LLM to evaluate semantic equivalence. Through a series of experiments, we analyzed the performance of different LLMs, examining metrics such as True Definition Ratio, Deception Ratio, and Correct Guess Ratio. The results provide insights into the creative and deceptive capabilities of LLMs, highlighting their strengths and areas for improvement. Specifically, the study reveals that infrequent vocabulary in LLMs' input leads to poor reasoning on game rules and historical context (https://github.com/ParsaHejabi/Simulation-Framework-for-Multi-Agent-Balderdash).

* Accepted at Wordplay: When Language Meets Games @ ACL 2024

naab: A ready-to-use plug-and-play corpus for Farsi

Aug 29, 2022

Sadra Sabouri, Elnaz Rahmati, Soroush Gooran, Hossein Sameti

Abstract: Huge corpora of textual data are known to be crucial for training deep models such as transformer-based ones. This need is felt more acutely in lower-resource languages such as Farsi. We propose naab, the biggest cleaned and ready-to-use open-source textual corpus in Farsi. It contains about 130GB of data, 250 million paragraphs, and 15 billion words. The project name is derived from the Farsi word ناب (naab), which means pure and high grade. We also provide the raw version of the corpus, called naab-raw, and an easy-to-use preprocessor that can be employed by those who want to make a customized corpus.

* 6 pages, 2 figures
