Leonid Karlinsky

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Oct 15, 2024
Via arXiv

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

Oct 08, 2024
Via arXiv

Scaling Granite Code Models to 128K Context

Jul 18, 2024
Via arXiv

DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners

Jul 04, 2024
Via arXiv

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

Jun 21, 2024
Via arXiv

Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems

Jun 18, 2024
Via arXiv

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Jun 17, 2024
Via arXiv

Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

Jun 14, 2024
Via arXiv

Comparison Visual Instruction Tuning

Jun 13, 2024
Via arXiv

ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

Jun 12, 2024
Via arXiv