Alon Jacovi

CoverBench: A Challenging Benchmark for Complex Claim Verification

Aug 06, 2024

Data Contamination Report from the 2024 CONDA Shared Task

Jul 31, 2024

Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP

Jun 29, 2024

Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

Jun 19, 2024

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

Jun 05, 2024

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Feb 02, 2024

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Oct 16, 2023

Unpacking Human-AI Interaction in Safety-Critical Industries: A Systematic Literature Review

Oct 05, 2023

Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks

May 17, 2023

Neighboring Words Affect Human Interpretation of Saliency Explanations

May 06, 2023