Haiping Wu

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Dec 05, 2024

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Nov 10, 2023

Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency

May 13, 2021

CvT: Introducing Convolutions to Vision Transformers

Mar 29, 2021

Sequence Level Semantics Aggregation for Video Object Detection

Aug 20, 2019

Simple Baselines for Human Pose Estimation and Tracking

Aug 21, 2018