Shiyu Huang

A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models

Aug 12, 2025
Via arXiv

Omni-SafetyBench: A Benchmark for Safety Evaluation of Audio-Visual Large Language Models

Aug 10, 2025
Via arXiv

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?

Feb 17, 2025
Via arXiv

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Jan 06, 2025
Via arXiv

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Dec 30, 2024
Via arXiv

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Nov 22, 2024
Via arXiv

DreamPolish: Domain Score Distillation With Progressive Geometry Generation

Nov 03, 2024
Via arXiv

CogVLM2: Visual Language Models for Image and Video Understanding

Aug 29, 2024
Via arXiv

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Aug 12, 2024
Via arXiv