M Saiful Bari

ALLaM: Large Language Models for Arabic and English

Jul 22, 2024
Via arXiv

A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations

Jul 04, 2024
Via arXiv

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

Feb 01, 2024
Via arXiv

BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP

Sep 22, 2023
Via arXiv

A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets

Jun 08, 2023
Via arXiv

xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

Mar 06, 2023
Via arXiv

SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning

Dec 21, 2022
Via arXiv

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Dec 19, 2022
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv

What Language Model to Train if You Have One Million GPU Hours?

Nov 08, 2022
Via arXiv