Lingqiao Liu

Let Your Video Listen to Your Music!

Jun 23, 2025
Via arXiv

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

Jun 11, 2025
Via arXiv

Enhancing Close-up Novel View Synthesis via Pseudo-labeling

Mar 20, 2025
Via arXiv

HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models

Mar 14, 2025
Via arXiv

Close-up-GS: Enhancing Close-Up View Synthesis in 3D Gaussian Splatting with Progressive Self-Training

Mar 12, 2025
Via arXiv

Efficient Response Generation Method Selection for Fine-Tuning Large Language Models

Feb 17, 2025
Via arXiv

Exploring Primitive Visual Measurement Understanding and the Role of Output Format in Learning in Vision-Language Models

Jan 25, 2025
Via arXiv

Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss

Jan 13, 2025
Via arXiv

Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning

Dec 14, 2024
Via arXiv

Categorical Keypoint Positional Embedding for Robust Animal Re-Identification

Dec 01, 2024
Via arXiv