Li Liu

Long-Video Audio Synthesis with Multi-Agent Collaboration

Mar 17, 2025
Via arXiv

HexPlane Representation for 3D Semantic Scene Understanding

Mar 07, 2025
Via arXiv

BioD2C: A Dual-level Semantic Consistency Constraint Framework for Biomedical VQA

Mar 04, 2025
Via arXiv

CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter

Feb 24, 2025
Via arXiv

Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition

Feb 03, 2025
Via arXiv

Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition

Jan 19, 2025
Via arXiv

Reliable Imputed-Sample Assisted Vertical Federated Learning

Jan 11, 2025
Via arXiv

Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection

Jan 08, 2025
Via arXiv

DepthMaster: Taming Diffusion Models for Monocular Depth Estimation

Jan 05, 2025
Via arXiv

Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey

Dec 09, 2024
Via arXiv