Zixu Zhao

VideoSAM: Open-World Video Segmentation

Oct 11, 2024
Via arXiv

Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation

Sep 07, 2024
Via arXiv

PeFoMed: Parameter Efficient Fine-tuning on Multimodal Large Language Models for Medical Visual Question Answering

Jan 05, 2024
Via arXiv

Unsupervised Open-Vocabulary Object Localization in Videos

Sep 18, 2023
Via arXiv

Object-Centric Multiple Object Tracking

Sep 05, 2023
Via arXiv

Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering

Jul 11, 2023
Via arXiv

PointPatchMix: Point Cloud Mixing with Patch Scoring

Mar 12, 2023
Via arXiv

Pseudo-label Guided Cross-video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations

Jul 20, 2022
Via arXiv

Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene Segmentation

Mar 29, 2022
Via arXiv

TraSeTR: Track-to-Segment Transformer with Contrastive Query for Instance-level Instrument Segmentation in Robotic Surgery

Feb 17, 2022
Via arXiv