Yan Shu

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Oct 14, 2024

First Creating Backgrounds Then Rendering Texts: A New Paradigm for Visual Text Blending

Oct 14, 2024

Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding

Sep 24, 2024

MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

Jun 06, 2024

Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing

Feb 05, 2024

Depth-agnostic Single Image Dehazing

Jan 14, 2024

CLiF-VQA: Enhancing Video Quality Assessment by Incorporating High-Level Semantic Information related to Human Feelings

Nov 13, 2023

Read Pointer Meters in complex environments based on a Human-like Alignment and Recognition Algorithm

Feb 28, 2023

Condensing a Sequence to One Informative Frame for Video Recognition

Jan 11, 2022

A Characterization of Mean Squared Error for Estimator with Bagging

Aug 07, 2019