Zhiyong Wang

B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens

Dec 13, 2024
Via arXiv

DuoCast: Duo-Probabilistic Meteorology-Aware Model for Extended Precipitation Nowcasting

Dec 03, 2024
Via arXiv

Language-Assisted Human Part Motion Learning for Skeleton-Based Temporal Action Segmentation

Oct 08, 2024
Via arXiv

When Graph Neural Networks Meet Dynamic Mode Decomposition

Oct 08, 2024
Via arXiv

Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning

Oct 08, 2024
Via arXiv

Intelligent Fish Detection System with Similarity-Aware Transformer

Sep 28, 2024
Via arXiv

Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0

Sep 18, 2024
Via arXiv

DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech

Sep 18, 2024
Via arXiv

HiSC4D: Human-centered interaction and 4D Scene Capture in Large-scale Space Using Wearable IMUs and LiDAR

Sep 09, 2024
Via arXiv

Sight View Constraint for Robust Point Cloud Registration

Sep 08, 2024
Via arXiv