Yida Lu

Agent-SafetyBench: Evaluating the Safety of LLM Agents

Dec 19, 2024

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Dec 16, 2024

Global Challenge for Safe and Secure LLMs Track 1

Nov 21, 2024

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Jun 24, 2024

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

Feb 26, 2024

Rethinking Dense Retrieval's Few-Shot Ability

Apr 12, 2023