Duy-Kien Nguyen

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Jun 13, 2024

SimPLR: A Simple and Plain Transformer for Object Detection and Segmentation

Oct 09, 2023

R-MAE: Regions Meet Masked Autoencoders

Jun 08, 2023

BoxeR: Box-Attention for 2D and 3D Transformers

Nov 25, 2021

Revisiting Modulated Convolutions for Visual Counting and Beyond

Apr 24, 2020

UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision

Jan 20, 2020

Multi-task Learning of Hierarchical Vision-Language Representation

Dec 03, 2018

Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering

Apr 03, 2018