Yinxu Pan

PyBench: Evaluating LLM Agent on various real-world coding tasks

Jul 23, 2024
Via arXiv

DebugBench: Evaluating Debugging Capability of Large Language Models

Jan 11, 2024
Via arXiv

Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages

Aug 23, 2023
Via arXiv

ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding

Oct 14, 2022
Via arXiv

ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding

Sep 18, 2022
Via arXiv