Tokenization


QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension

Mar 11, 2025
Via arXiv

Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation

Mar 11, 2025
Via arXiv

DeepRAG: Building a Custom Hindi Embedding Model for Retrieval Augmented Generation from Scratch

Mar 11, 2025
Via arXiv

ProTeX: Structure-In-Context Reasoning and Editing of Proteins with Large Language Models

Mar 11, 2025
Via arXiv

HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views

Mar 11, 2025
Via arXiv

Uni$\textbf{F}^2$ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models

Mar 11, 2025
Via arXiv

Context-aware Biases for Length Extrapolation

Mar 11, 2025
Via arXiv

LongProLIP: A Probabilistic Vision-Language Model with Long Context Text

Mar 11, 2025
Via arXiv

Multi-Cue Adaptive Visual Token Pruning for Large Vision-Language Models

Mar 11, 2025
Via arXiv

CDI3D: Cross-guided Dense-view Interpolation for 3D Reconstruction

Mar 11, 2025
Via arXiv