Ke Zhang

Senior Member, IEEE

A Survey on Foundation-Model-Based Industrial Defect Detection

Feb 26, 2025

IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning

Feb 04, 2025

Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction

Jan 19, 2025

Distributed satellite information networks: Architecture, enabling technologies, and trends

Dec 17, 2024

V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations

Dec 16, 2024

MoMuSE: Momentum Multi-modal Target Speaker Extraction for Real-time Scenarios with Impaired Visual Cues

Dec 11, 2024

TL-CLIP: A Power-specific Multimodal Pre-trained Visual Foundation Model for Transmission Line Defect Recognition

Nov 18, 2024

MA^2: A Self-Supervised and Motion Augmenting Autoencoder for Gait-Based Automatic Disease Detection

Nov 05, 2024

Multi-Level Speaker Representation for Target Speaker Extraction

Oct 21, 2024

Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient

Oct 11, 2024