Zhenhai Zhu

Wavelet-Based Image Tokenizer for Vision Transformers

May 28, 2024

H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences

Jul 25, 2021

Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube

Apr 29, 2020

A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions

Oct 07, 2019

Improved Image Captioning via Policy Gradient optimization of SPIDEr

Mar 12, 2018