Jungang Li

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Feb 04, 2026

BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models

Feb 04, 2026

A Visual Semantic Adaptive Watermark grounded by Prefix-Tuning for Large Vision-Language Model

Jan 12, 2026

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Dec 28, 2025

Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents

Aug 27, 2025

VSI: Visual Subtitle Integration for Keyframe Selection to enhance Long Video Understanding

Aug 09, 2025

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities

May 27, 2025

PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions

May 21, 2025

UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models

May 20, 2025

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

May 04, 2025