Haitao Li

PoAct: Policy and Action Dual-Control Agent for Generalized Applications

Jan 13, 2025
Via arXiv

LegalAgentBench: Evaluating LLM Agents in Legal Domain

Dec 23, 2024
Via arXiv

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

Dec 10, 2024
Via arXiv

De-biased Multimodal Electrocardiogram Analysis

Nov 22, 2024
Via arXiv

Large-scale cross-modality pretrained model enhances cardiovascular state estimation and cardiomyopathy detection from electrocardiograms: An AI system development and multi-center validation study

Nov 19, 2024
Via arXiv

CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges

Oct 20, 2024
Via arXiv

An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation

Oct 16, 2024
Via arXiv

LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models

Sep 30, 2024
Via arXiv

Towards an In-Depth Comprehension of Case Relevance for Better Legal Retrieval

Apr 01, 2024
Via arXiv

BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models

Mar 27, 2024
Via arXiv