Xiaoshuai Song

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models

Oct 15, 2024

Toward General Instruction-Following Alignment for Retrieval-Augmented Generation

Oct 12, 2024

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

Jul 01, 2024

CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

Jun 12, 2024

Faceptor: A Generalist Model for Face Perception

Mar 14, 2024

Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-training for Noisy Slot Filling Task

Mar 06, 2024

Beyond the Known: Investigating LLMs Performance on Out-of-Domain Intent Detection

Mar 04, 2024

Knowledge Editing on Black-box Large Language Models

Feb 17, 2024

APP: Adaptive Prototypical Pseudo-Labeling for Few-shot OOD Detection

Oct 20, 2023

Continual Generalized Intent Discovery: Marching Towards Dynamic and Open-world Intent Recognition

Oct 16, 2023