Zehao Li

OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards

Mar 19, 2026
Via arXiv

FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning

Feb 26, 2026
Via arXiv

PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding

Feb 24, 2026
Via arXiv

CamReasoner: Reinforcing Camera Movement Understanding via Structured Spatial Reasoning

Jan 30, 2026
Via arXiv

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

Jan 12, 2026
Via arXiv

OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models

Dec 18, 2025
Via arXiv

A 96pJ/Frame/Pixel and 61pJ/Event Anti-UAV System with Hybrid Object Tracking Modes

Dec 12, 2025
Via arXiv

Effective Code Membership Inference for Code Completion Models via Adversarial Prompts

Nov 19, 2025
Via arXiv

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Sep 18, 2025
Via arXiv

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025
Via arXiv