Jamin Shin

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

May 02, 2024
Via arXiv

HyperCLOVA X Technical Report

Apr 13, 2024
Via arXiv

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Oct 12, 2023
Via arXiv

EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria

Sep 24, 2023
Via arXiv

Who Wrote this Code? Watermarking for Code Generation

May 24, 2023
Via arXiv

Aligning Large Language Models through Synthetic Feedback

May 23, 2023
Via arXiv

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

May 23, 2023
Via arXiv

Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

May 23, 2023
Via arXiv

Towards Zero-Shot Functional Compositionality of Language Models

Mar 06, 2023
Via arXiv