Xuchao Zhang

Unveiling Context-Aware Criteria in Self-Assessing LLMs

Oct 28, 2024
Via arXiv

CREAM: Consistency Regularized Self-Rewarding Language Models

Oct 17, 2024
Via arXiv

Building AI Agents for Autonomous Clouds: Challenges and Design Principles

Jul 16, 2024
Via arXiv

CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

Jun 10, 2024
Via arXiv

Exploring LLM-based Agents for Root Cause Analysis

Mar 07, 2024
Via arXiv

Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models

Feb 15, 2024
Via arXiv

Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4

Jan 24, 2024
Via arXiv

Open-ended Commonsense Reasoning with Unrestricted Answer Scope

Oct 27, 2023
Via arXiv

PACE-LM: Prompting and Augmentation for Calibrated Confidence Estimation with GPT-4 in Cloud Incident Root Cause Analysis

Sep 29, 2023
Via arXiv

Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty

Sep 07, 2023
Via arXiv