Ling Shao

Terminus Group, Beijing, China

FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation

Feb 06, 2025

Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields

Jan 31, 2025

Enhanced Multi-Scale Cross-Attention for Person Image Generation

Jan 15, 2025

Multimodal 3D Reasoning Segmentation with Complex Scenes

Nov 21, 2024

Novel View Extrapolation with Video Diffusion Priors

Nov 21, 2024

AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations

Nov 16, 2024

Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey

Nov 05, 2024

GWQ: Gradient-Aware Weight Quantization for Large Language Models

Oct 30, 2024

Historical Test-time Prompt Tuning for Vision Foundation Models

Oct 27, 2024

LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models

Oct 15, 2024