Boqing Gong

Moiré Video Authentication: A Physical Signature Against AI Video Generation

Apr 02, 2026
Via arXiv

Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

Mar 23, 2026
Via arXiv

Image Diffusion Preview with Consistency Solver

Dec 15, 2025
Via arXiv

BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models

Dec 11, 2025
Via arXiv

Culture in Action: Evaluating Text-to-Image Models through Social Activities

Nov 07, 2025
Via arXiv

Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models

Jun 12, 2025
Via arXiv

Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck

May 30, 2025
Via arXiv

SITE: towards Spatial Intelligence Thorough Evaluation

May 08, 2025
Via arXiv

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning

Apr 13, 2025
Via arXiv

VideoAds for Fast-Paced Video Understanding: Where Opensource Foundation Models Beat GPT-4o & Gemini-1.5 Pro

Apr 12, 2025
Via arXiv