Xiang Chen

Adobe Research

Take What You Need: Flexible Multi-Task Semantic Communications with Channel Adaptation

Feb 12, 2025

RoboGrasp: A Universal Grasping Policy for Robust Robotic Control

Feb 05, 2025

HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding

Jan 25, 2025

Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness

Jan 14, 2025

Data and System Perspectives of Sustainable Artificial Intelligence

Jan 13, 2025

LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding

Jan 09, 2025

Less is More: Towards Green Code Large Language Models via Unified Structural Pruning

Dec 20, 2024

Threshold Neuron: A Brain-inspired Artificial Neuron for Efficient On-device Inference

Dec 18, 2024

Numerical Pruning for Efficient Autoregressive Models

Dec 17, 2024

Depth-Centric Dehazing and Depth-Estimation from Real-World Hazy Driving Video

Dec 16, 2024