Jiale Cao

CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation

Nov 21, 2024
Via arXiv

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Nov 07, 2024
Via arXiv

DB-SAM: Delving into High Quality Universal Medical Image Segmentation

Oct 05, 2024
Via arXiv

iSeg: An Iterative Refinement-based Framework for Training-free Segmentation

Sep 05, 2024
Via arXiv

Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective

Jul 24, 2024
Via arXiv

Multi-Granularity Language-Guided Multi-Object Tracking

Jun 07, 2024
Via arXiv

VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection

Apr 15, 2024
Via arXiv

Implicit and Explicit Language Guidance for Diffusion-based Visual Perception

Apr 11, 2024
Via arXiv

SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior

Mar 29, 2024
Via arXiv

CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation

Mar 19, 2024
Via arXiv