Sijie Zhu

Multi-Reward as Condition for Instruction-based Image Editing

Nov 06, 2024
Via arXiv

Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model

Jun 15, 2024
Via arXiv

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

May 09, 2024
Via arXiv

Edit3K: Universal Representation Learning for Video Editing Components

Mar 24, 2024
Via arXiv

$R^{2}$Former: Unified $R$etrieval and $R$eranking Transformer for Place Recognition

Apr 06, 2023
Via arXiv

TopNet: Transformer-based Object Placement Network for Image Compositing

Apr 06, 2023
Via arXiv

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing

Mar 31, 2022
Via arXiv

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization

Mar 31, 2022
Via arXiv

BDANet: Multiscale Convolutional Neural Network with Cross-directional Attention for Building Damage Assessment from Satellite Images

May 16, 2021
Via arXiv

MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations

May 14, 2021
Via arXiv