Junmo Kim

Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance

Oct 17, 2024
Via arXiv

StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models

Oct 10, 2024
Via arXiv

Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality

Oct 07, 2024
Via arXiv

Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models using Stepwise Spectral Analysis

Jul 16, 2024
Via arXiv

AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning

Jul 11, 2024
Via arXiv

Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition

Jun 13, 2024
Via arXiv

Towards Understanding Dual BN In Hybrid Adversarial Training

Mar 28, 2024
Via arXiv

ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object

Mar 27, 2024
Via arXiv

EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning

Mar 14, 2024
Via arXiv

Stereo-Matching Knowledge Distilled Monocular Depth Estimation Filtered by Multiple Disparity Consistency

Jan 23, 2024
Via arXiv