Chenguang Wang

Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

Nov 14, 2024

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Oct 16, 2024

An Efficient and Explainable Transformer-Based Few-Shot Learning for Modeling Electricity Consumption Profiles Across Thousands of Domains

Aug 15, 2024

Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning

Jul 05, 2024

RuleR: Improving LLM Controllability by Rule-based Data Recycling

Jun 22, 2024

Mosaic IT: Enhancing Instruction Tuning with Data Mosaics

May 22, 2024

A Flow-Based Model for Conditional and Probabilistic Electricity Consumption Profile Generation and Prediction

May 06, 2024

Measuring Social Norms of Large Language Models

Apr 07, 2024

RakutenAI-7B: Extending Large Language Models for Japanese

Mar 21, 2024

Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study

Mar 15, 2024