
Yew Ken Chia

M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework

Nov 09, 2024
Via arXiv

Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths

Oct 07, 2024
Via arXiv

Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models

Sep 22, 2024
Via arXiv

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

Jul 29, 2024
Via arXiv

Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions

May 30, 2024
Via arXiv

PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns

Mar 20, 2024
Via arXiv

Contrastive Chain-of-Thought Prompting

Nov 15, 2023
Via arXiv

Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning

Jul 05, 2023
Via arXiv

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

Jun 15, 2023
Via arXiv

M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models

Jun 08, 2023
Via arXiv