Sabino Miranda

Evaluating the Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams

Dec 28, 2023

Sabino Miranda, Obdulia Pichardo-Lagunas, Bella Martínez-Seis, Pierre Baldi

Abstract: This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by the Gemini Pro model), on undergraduate admissions exams proposed by the National Polytechnic Institute in Mexico. The exams cover Engineering/Mathematical and Physical Sciences, Biological and Medical Sciences, and Social and Administrative Sciences. Both models demonstrated proficiency, exceeding the minimum acceptance scores for respective academic programs to up to 75% for some academic programs. GPT-3.5 outperformed BARD in Mathematics and Physics, while BARD performed better in History and questions related to factual information. Overall, GPT-3.5 marginally surpassed BARD with scores of 60.94% and 60.42%, respectively.

* 11 pages, 1 figure. Submitted to a journal

A large scale lexical and semantic analysis of Spanish language variations in Twitter

Oct 12, 2021

Eric S. Tellez, Daniela Moctezuma, Sabino Miranda, Mario Graff

Abstract: Dialectometry is a discipline devoted to studying the variations of a language across a geographical region. One of its goals is the creation of linguistic atlases capturing the similarities and differences of the language under study across the area in question. For instance, Spanish is one of the most spoken languages in the world, but it is not necessarily written and spoken in the same way in different countries. This manuscript presents a broad analysis describing lexical and semantic relationships among 26 Spanish-speaking countries around the globe. For this study, we analyze four years of the Twitter geotagged public stream to provide an extensive survey of the Spanish language vocabularies of different countries, their distributions, semantic usage of terms, and emojis. We also offer open regional word-embedding resources for Spanish Twitter to help other researchers and practitioners take advantage of regionalized models.
