Xiaotong Li

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Mar 29, 2026

Building Explicit World Model for Zero-Shot Open-World Object Manipulation

Mar 14, 2026

Hierarchical Direction Perception via Atomic Dot-Product Operators for Rotation-Invariant Point Clouds Learning

Nov 11, 2025

Beyond Entropy: Region Confidence Proxy for Wild Test-Time Adaptation

May 27, 2025

Composite Indicator-Guided Infilling Sampling for Expensive Multi-Objective Optimization

Mar 28, 2025

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Feb 10, 2025

InstructBioMol: Advancing Biomolecule Understanding and Design Following Human Instructions

Oct 10, 2024

Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

Oct 08, 2024

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Jul 11, 2024

Unveiling Encoder-Free Vision-Language Models

Jun 17, 2024