Zeyang Zhou

StrucText-Eval: An Autogenerated Benchmark for Evaluating Large Language Model's Ability in Structure-Rich Text Understanding

Jun 30, 2024

StructBench: An Autogenerated Benchmark for Evaluating Large Language Model's Ability in Structure-Rich Text Understanding

Jun 15, 2024

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Jun 11, 2024

Flames: Benchmarking Value Alignment of Chinese Large Language Models

Nov 12, 2023

A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models

Mar 18, 2023