Kayhan Behdin

Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints

Mar 25, 2026

Semantic Search At LinkedIn

Feb 07, 2026

Reasoning Models Can be Accurately Pruned Via Chain-of-Thought Reconstruction

Sep 15, 2025

BP-Seg: A graphical model approach to unsupervised and non-contiguous text segmentation using belief propagation

May 22, 2025

An Optimization Framework for Differentially Private Sparse Fine-Tuning

Mar 17, 2025

Efficient AI in Practice: Training and Deployment of Efficient LLMs for Industry Applications

Feb 20, 2025

HASSLE-free: A unified Framework for Sparse plus Low-Rank Matrix Decomposition for LLMs

Feb 02, 2025

ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models

Jun 12, 2024

End-to-end Feature Selection Approach for Learning Skinny Trees

Oct 28, 2023

QuantEase: Optimization-based Quantization for Language Models -- An Efficient and Intuitive Algorithm

Sep 05, 2023