Qingyao Ai

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

Dec 10, 2024, via arXiv

CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges

Oct 20, 2024, via arXiv

LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models

Sep 30, 2024, via arXiv

LeKUBE: A Legal Knowledge Update BEnchmark

Jul 19, 2024, via arXiv

Mitigating Entity-Level Hallucination in Large Language Models

Jul 12, 2024, via arXiv

Prompt Refinement with Image Pivot for Text-to-Image Generation

Jun 28, 2024, via arXiv

STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals

Jun 21, 2024, via arXiv

EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels

Jun 11, 2024, via arXiv

Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study

Apr 04, 2024, via arXiv

Towards an In-Depth Comprehension of Case Relevance for Better Legal Retrieval

Apr 01, 2024, via arXiv