Longyin Wen

Multi-Reward as Condition for Instruction-based Image Editing

Nov 06, 2024

DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models

Nov 05, 2024

AIPO: Improving Training Objective for Iterative Preference Optimization

Sep 13, 2024

Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model

Jun 15, 2024

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

May 09, 2024

Edit3K: Universal Representation Learning for Video Editing Components

Mar 24, 2024

Accurate and Fast Compressed Video Captioning

Sep 22, 2023

Exploring the Role of Audio in Video Captioning

Jun 21, 2023

Text with Knowledge Graph Augmented Transformer for Video Captioning

Mar 25, 2023

DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training

Mar 06, 2023