Picture for Siwei Wu

Siwei Wu

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Add code
Apr 07, 2025
Viaarxiv icon

ContrastScore: Towards Higher Quality, Less Biased, More Efficient Evaluation Metrics with Contrastive Evaluation

Add code
Apr 02, 2025
Viaarxiv icon

LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm

Add code
Feb 26, 2025
Viaarxiv icon

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Add code
Feb 20, 2025
Viaarxiv icon

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

Add code
Oct 17, 2024
Figure 1 for A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Figure 2 for A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Figure 3 for A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Figure 4 for A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Viaarxiv icon

SongTrans: An unified song transcription and alignment method for lyrics and notes

Add code
Sep 22, 2024
Figure 1 for SongTrans: An unified song transcription and alignment method for lyrics and notes
Figure 2 for SongTrans: An unified song transcription and alignment method for lyrics and notes
Figure 3 for SongTrans: An unified song transcription and alignment method for lyrics and notes
Figure 4 for SongTrans: An unified song transcription and alignment method for lyrics and notes
Viaarxiv icon

LIME-M: Less Is More for Evaluation of MLLMs

Add code
Sep 10, 2024
Figure 1 for LIME-M: Less Is More for Evaluation of MLLMs
Figure 2 for LIME-M: Less Is More for Evaluation of MLLMs
Figure 3 for LIME-M: Less Is More for Evaluation of MLLMs
Figure 4 for LIME-M: Less Is More for Evaluation of MLLMs
Viaarxiv icon

Overview of the NLPCC 2024 Shared Task on Chinese Metaphor Generation

Add code
Aug 08, 2024
Viaarxiv icon

MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models

Add code
Aug 06, 2024
Figure 1 for MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
Figure 2 for MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
Figure 3 for MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
Figure 4 for MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
Viaarxiv icon

MMRA: A Benchmark for Multi-granularity Multi-image Relational Association

Add code
Jul 24, 2024
Figure 1 for MMRA: A Benchmark for Multi-granularity Multi-image Relational Association
Figure 2 for MMRA: A Benchmark for Multi-granularity Multi-image Relational Association
Figure 3 for MMRA: A Benchmark for Multi-granularity Multi-image Relational Association
Figure 4 for MMRA: A Benchmark for Multi-granularity Multi-image Relational Association
Viaarxiv icon