Jingyi Liao

Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

Oct 02, 2025
Via arXiv

Box-Level Class-Balanced Sampling for Active Object Detection

Aug 25, 2025
Via arXiv

AD-FM: Multimodal LLMs for Anomaly Detection via Multi-Stage Reasoning and Fine-Grained Reward Optimization

Aug 06, 2025
Via arXiv

MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval

May 26, 2025
Via arXiv

VORTA: Efficient Video Diffusion via Routing Sparse Attention

May 24, 2025
Via arXiv

Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift

Mar 19, 2025
Via arXiv

Multi-View Industrial Anomaly Detection with Epipolar Constrained Cross-View Fusion

Mar 14, 2025
Via arXiv

Ambient Backscatter Communication in LTE Uplink Sounding Reference Signal

Jan 19, 2025
Via arXiv

AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration

Dec 16, 2024
Via arXiv

SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing

Nov 28, 2024
Via arXiv