Jamin Shin

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

May 02, 2024
Via arXiv

HyperCLOVA X Technical Report

Apr 13, 2024
Via arXiv

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Oct 12, 2023
Via arXiv

EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria

Sep 24, 2023
Via arXiv

Who Wrote this Code? Watermarking for Code Generation

May 24, 2023
Via arXiv

Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

May 23, 2023
Via arXiv

Aligning Large Language Models through Synthetic Feedback

May 23, 2023
Via arXiv

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

May 23, 2023
Via arXiv

Towards Zero-Shot Functional Compositionality of Language Models

Mar 06, 2023
Via arXiv