Kannan Achan

Segment and Matte Anything in a Unified Model

Jan 17, 2026
Via arXiv

Is More Context Always Better? Examining LLM Reasoning Capability for Time Interval Prediction

Jan 15, 2026
Via arXiv

To See or To Read: User Behavior Reasoning in Multimodal LLMs

Nov 05, 2025
Via arXiv

Spatial Reasoning in Foundation Models: Benchmarking Object-Centric Spatial Understanding

Sep 26, 2025
Via arXiv

VL-CLIP: Enhancing Multimodal Recommendations via Visual Grounding and LLM-Augmented CLIP Embeddings

Jul 22, 2025
Via arXiv

LLM-driven Constrained Copy Generation through Iterative Refinement

Apr 14, 2025
Via arXiv

Improving Sequential Recommender Systems with Online and In-store User Behavior

Dec 03, 2024
Via arXiv

Triple Modality Fusion: Aligning Visual, Textual, and Graph Data with Large Language Models for Multi-Behavior Recommendations

Oct 16, 2024
Via arXiv

Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference

Sep 18, 2024
Via arXiv

Leveraging User-Generated Reviews for Recommender Systems with Dynamic Headers

Sep 11, 2024
Via arXiv