Mengmeng Wang

Visual Object Tracking across Diverse Data Modalities: A Review

Dec 13, 2024
Via arXiv

GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping

Nov 19, 2024
Via arXiv

Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing

Oct 24, 2024
Via arXiv

Flipped Classroom: Aligning Teacher Attention with Student in Generalized Category Discovery

Sep 29, 2024
Via arXiv

SpotActor: Training-Free Layout-Controlled Consistent Image Generation

Sep 07, 2024
Via arXiv

LV-UNet: A Lightweight and Vanilla Model for Medical Image Segmentation

Aug 29, 2024
Via arXiv

LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments

Jun 24, 2024
Via arXiv

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

Mar 28, 2024
Via arXiv

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

Mar 28, 2024
Via arXiv

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

Jan 22, 2024
Via arXiv