Yashashree Chandak

The Geometry of Multilingual Language Models: An Equality Lens

May 13, 2023

Cheril Shah, Yashashree Chandak, Manan Suri

Abstract: Understanding the representations of different languages in multilingual language models is essential for comprehending their cross-lingual properties, predicting their performance on downstream tasks, and identifying any biases across languages. In our study, we analyze the geometry of three multilingual language models in Euclidean space and find that all languages are represented by unique geometries. Using a geometric separability index, we find that although languages tend to be closer according to their linguistic family, they are almost separable from languages of other families. We also introduce a Cross-Lingual Similarity Index to measure the distance between languages in the semantic space. Our findings indicate that low-resource languages are not represented as well as high-resource languages in any of the models.

* 8 pages, 6 figures, 1st ICLR TinyPapers
