Haitao Li

Overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) Task

Mar 17, 2025

LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation

Feb 28, 2025

CaseGen: A Benchmark for Multi-Stage Legal Case Documents Generation

Feb 25, 2025

PoAct: Policy and Action Dual-Control Agent for Generalized Applications

Jan 13, 2025

LegalAgentBench: Evaluating LLM Agents in Legal Domain

Dec 23, 2024

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

Dec 10, 2024

De-biased Multimodal Electrocardiogram Analysis

Nov 22, 2024

Large-scale cross-modality pretrained model enhances cardiovascular state estimation and cardiomyopathy detection from electrocardiograms: An AI system development and multi-center validation study

Nov 19, 2024

CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges

Oct 20, 2024

An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation

Oct 16, 2024