Kai Li

Department of Computer Science and Technology, Tsinghua University, Beijing, China

Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images

Nov 20, 2024

Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning

Oct 21, 2024

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

Oct 02, 2024

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Oct 02, 2024

A Novel Framework of Horizontal-Vertical Hybrid Federated Learning for EdgeIoT

Oct 02, 2024

Subjective and Objective Quality-of-Experience Evaluation Study for Live Video Streaming

Sep 26, 2024

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment

Sep 22, 2024

SafeEar: Content Privacy-Preserving Audio Deepfake Detection

Sep 14, 2024

Apollo: Band-sequence Modeling for High-Quality Audio Restoration

Sep 13, 2024

Machine Anomalous Sound Detection Using Spectral-temporal Modulation Representations Derived from Machine-specific Filterbanks

Sep 09, 2024