Omkar Thawakar

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding

Mar 13, 2025
Via arXiv

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

Feb 28, 2025
Via arXiv

Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts

Feb 20, 2025
Via arXiv

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Jan 10, 2025
Via arXiv

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Nov 25, 2024
Via arXiv

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark

Oct 24, 2024
Via arXiv

Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration

Apr 02, 2024
Via arXiv

Composed Video Retrieval via Enriched Context and Discriminative Embeddings

Mar 25, 2024
Via arXiv

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Feb 26, 2024
Via arXiv

Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM

Dec 14, 2023
Via arXiv