Ekaterina Artemova

U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs

Dec 04, 2024
Via arXiv

Hands-On Tutorial: Labeling with LLM and Human-in-the-Loop

Nov 07, 2024
Via arXiv

Beemo: Benchmark of Expert-edited Machine-generated Outputs

Nov 06, 2024
Via arXiv

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

Aug 08, 2024
Via arXiv

Papilusion at DAGPap24: Paper or Illusion? Detecting AI-generated Scientific Papers

Jul 24, 2024
Via arXiv

RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs

Jun 27, 2024
Via arXiv

AIpom at SemEval-2024 Task 8: Detecting AI-produced Outputs in M4

Mar 28, 2024
Via arXiv

RuBia: A Russian Language Bias Detection Dataset

Mar 26, 2024
Via arXiv

Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data

Mar 19, 2024
Via arXiv

Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties

Feb 03, 2024
Via arXiv