Mohit Iyyer

Bio-Inspired Self-Supervised Learning for Wrist-worn IMU Signals

Mar 11, 2026
Via arXiv

How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs

Feb 09, 2026
Via arXiv

OWL: Probing Cross-Lingual Recall of Memorized Texts via World Literature

May 28, 2025
Via arXiv

Does quantization affect models' performance on long-context tasks?

May 27, 2025
Via arXiv

Frankentext: Stitching random text fragments into long-form narratives

May 23, 2025
Via arXiv

VeriFastScore: Speeding up long-form factuality evaluation

May 22, 2025
Via arXiv

BLEUBERI: BLEU is a surprisingly effective reward for instruction following

May 16, 2025
Via arXiv

BEARCUBS: A benchmark for computer-using web agents

Mar 10, 2025
Via arXiv

One ruler to measure them all: Benchmarking multilingual long-context language models

Mar 03, 2025
Via arXiv

CLIPPER: Compression enables long-context synthetic data generation

Feb 20, 2025
Via arXiv