Erkut Erdem

Shammie

A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features

Jan 17, 2025

GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting

Jan 08, 2025

HyperGAN-CLIP: A Unified Framework for Domain Adaptation, Image Synthesis and Manipulation

Nov 19, 2024

HUE Dataset: High-Resolution Event and Frame Sequences for Low-Light Vision

Oct 24, 2024

Evaluating Linguistic Capabilities of Multimodal LLMs in the Lens of Few-Shot Learning

Jul 17, 2024

CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models

Jun 13, 2024

SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models

May 01, 2024

Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare

Apr 25, 2024

Sequential Compositional Generalization in Multimodal Models

Apr 18, 2024

ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models

Nov 13, 2023