Weihan Wang

CogVLM2: Visual Language Models for Image and Video Understanding

Aug 29, 2024
Via arXiv

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Aug 12, 2024
Via arXiv

VIPeR: Visual Incremental Place Recognition with Adaptive Mining and Lifelong Learning

Jul 31, 2024
Via arXiv

LVBench: An Extreme Long Video Understanding Benchmark

Jun 12, 2024
Via arXiv

Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview

May 09, 2024
Via arXiv

Stereo-NEC: Enhancing Stereo Visual-Inertial SLAM Initialization with Normal Epipolar Constraints

Mar 12, 2024
Via arXiv

CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion

Mar 08, 2024
Via arXiv

CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations

Feb 06, 2024
Via arXiv

PlanarNeRF: Online Learning of Planar Primitives with Neural Radiance Fields

Dec 30, 2023
Via arXiv

CogAgent: A Visual Language Model for GUI Agents

Dec 21, 2023
Via arXiv