Rongrong Ji

Xiamen University, Peng Cheng Laboratory

Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval

Dec 09, 2025

Polybasic Speculative Decoding Through a Theoretical Perspective

Oct 30, 2025

CIR-CoT: Towards Interpretable Composed Image Retrieval via End-to-End Chain-of-Thought Reasoning

Oct 09, 2025

CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling

Sep 11, 2025

Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval

Aug 27, 2025

VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference

Aug 25, 2025

DS$^2$Net: Detail-Semantic Deep Supervision Network for Medical Image Segmentation

Aug 06, 2025

MIHBench: Benchmarking and Mitigating Multi-Image Hallucinations in Multimodal Large Language Models

Aug 01, 2025

Towards Universal Modal Tracking with Online Dense Temporal Token Learning

Jul 27, 2025

GS-Bias: Global-Spatial Bias Learner for Single-Image Test-Time Adaptation of Vision-Language Models

Jul 16, 2025