Yan Teng

Reflection-Bench: probing AI intelligence with reflection

Oct 21, 2024
Via arXiv

MEOW: MEMOry Supervised LLM Unlearning Via Inverted Facts

Sep 18, 2024
Via arXiv

ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models

Jun 24, 2024
Via arXiv

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Jun 11, 2024
Via arXiv

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Jan 29, 2024
Via arXiv

Fake Alignment: Are LLMs Really Aligned Well?

Nov 14, 2023
Via arXiv

Flames: Benchmarking Value Alignment of Chinese Large Language Models

Nov 12, 2023
Via arXiv

Plug-and-Play Feature Generation for Few-Shot Medical Image Classification

Oct 14, 2023
Via arXiv