Chiuman Ho

Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

May 04, 2023

VideoXum: Cross-modal Visual and Textural Summarization of Videos

Mar 21, 2023

ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient Self-Supervised Monocular Depth Estimation

Dec 16, 2022

A Dual Modality Approach For Multi-Label Classification

Aug 19, 2022

Dual-Flattening Transformers through Decomposed Row and Column Queries for Semantic Segmentation

Jan 22, 2022

GCF-Net: Gated Clip Fusion Network for Video Action Recognition

Feb 02, 2021