Yanjie Wang

Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM

Dec 12, 2024

Perceptual-Distortion Balanced Image Super-Resolution is a Multi-Objective Optimization Problem

Sep 05, 2024

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

Jul 02, 2024

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

May 20, 2024

Elysium: Exploring Object-level Perception in Videos via MLLM

Mar 29, 2024

PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition

Feb 15, 2024

GloTSFormer: Global Video Text Spotting Transformer

Jan 08, 2024

Rethinking Skip Connections in Encoder-decoder Networks for Monocular Depth Estimation

Aug 29, 2022

Learning Oriented Remote Sensing Object Detection via Naive Geometric Computing

Dec 01, 2021

A Relational Tucker Decomposition for Multi-Relational Link Prediction

Feb 03, 2019