Yangqiu Song

Assessing the Robustness of Retrieval-Augmented Generation Systems in K-12 Educational Question Answering with Knowledge Discrepancies

Dec 12, 2024

Revolve: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization

Dec 04, 2024

Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks

Nov 24, 2024

What Really is Commonsense Knowledge?

Nov 06, 2024

EcomEdit: An Automated E-commerce Knowledge Editing Framework for Enhanced Product and Purchase Intention Understanding

Oct 18, 2024

Concept-Reversed Winograd Schema Challenge: Evaluating and Improving Robust Reasoning in Large Language Models via Abstraction

Oct 15, 2024

Persona Knowledge-Aligned Prompt Tuning Method for Online Debate

Oct 05, 2024

ECon: On the Detection and Resolution of Evidence Conflicts

Oct 05, 2024

ActPlan-1K: Benchmarking the Procedural Planning Ability of Visual Language Models in Household Activities

Oct 04, 2024

DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects

Oct 03, 2024