Jianqin Yin

L2HCount: Generalizing Crowd Counting from Low to High Crowd Density via Density Simulation

Mar 17, 2025
Via arXiv

Exploring Position Encoding in Diffusion U-Net for Training-free High-resolution Image Generation

Mar 12, 2025
Via arXiv

GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting

Mar 07, 2025
Via arXiv

MamKPD: A Simple Mamba Baseline for Real-Time 2D Keypoint Detection

Dec 02, 2024
Via arXiv

InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models

Nov 18, 2024
Via arXiv

Towards Physically-Realizable Adversarial Attacks in Embodied Vision Navigation

Sep 16, 2024
Via arXiv

MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection

Sep 11, 2024
Via arXiv

SELD-Mamba: Selective State-Space Model for Sound Event Localization and Detection with Source Distance Estimation

Aug 09, 2024
Via arXiv

ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality

Jul 29, 2024
Via arXiv

Micro-expression recognition based on depth map to point cloud

Jun 12, 2024
Via arXiv