Shiyu Huang

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?

Feb 17, 2025
Via arXiv

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Jan 06, 2025
Via arXiv

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Dec 30, 2024
Via arXiv

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Nov 22, 2024
Via arXiv

DreamPolish: Domain Score Distillation With Progressive Geometry Generation

Nov 03, 2024
Via arXiv

CogVLM2: Visual Language Models for Image and Video Understanding

Aug 29, 2024
Via arXiv

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Aug 12, 2024
Via arXiv

A Survey on Self-play Methods in Reinforcement Learning

Aug 02, 2024
Via arXiv

Priorformer: A UGC-VQA Method with content and distortion priors

Jun 24, 2024
Via arXiv