Ser-Nam Lim

Facebook Research, New York, NY, USA

Fast Encoding and Decoding for Implicit Video Representation

Sep 28, 2024

Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning

Sep 16, 2024

DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks

Sep 10, 2024

AirSketch: Generative Motion to Sketch

Jul 12, 2024

Composing Object Relations and Attributes for Image-Text Matching

Jun 17, 2024

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

May 01, 2024

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

Apr 23, 2024

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Apr 08, 2024

Mitigating Dialogue Hallucination for Large Multi-modal Models via Adversarial Instruction Tuning

Mar 15, 2024

Universal Pyramid Adversarial Training for Improved ViT Performance

Dec 26, 2023