Sambit Sahu

Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection

Jan 14, 2026

Lessons from the Field: An Adaptable Lifecycle Approach to Applied Dialogue Summarization

Jan 13, 2026

Leveraging Parameter Space Symmetries for Reasoning Skill Transfer in LLMs

Nov 13, 2025

SPEAR-MM: Selective Parameter Evaluation and Restoration via Model Merging for Efficient Financial LLM Adaptation

Nov 11, 2025

Optimizing Reasoning Efficiency through Prompt Difficulty Prediction

Nov 05, 2025

T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning

May 22, 2025

Critique-Guided Distillation: Improving Supervised Fine-tuning via Better Distillation

May 16, 2025

Continual Pre-training of MoEs: How robust is your router?

Mar 06, 2025

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Oct 05, 2024

Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey

Sep 17, 2024