Jiaqi Wang

Michael Pokorny

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

Jun 17, 2026
Via arXiv

PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory

Jun 16, 2026
Via arXiv

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Jun 10, 2026
Via arXiv

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

Jun 08, 2026
Via arXiv

OmniCap-IF: Benchmarking and Improving Instruction Following Abilities for Omni-Video Captioning

Jun 07, 2026
Via arXiv

Harnessing Streaming Video in the Wild

Jun 07, 2026
Via arXiv

Light-WAM: Efficient World Action Models with State-Fusion Action Decoding

Jun 06, 2026
Via arXiv

AdaGRPO: A Capability-Aware Adaptive Enhancement for Flow-based GRPO

Jun 05, 2026
Via arXiv

Right Makes Might: Aligning Verified Hidden States Empowers RL Reasoning

Jun 02, 2026
Via arXiv

AdaCodec: A Predictive Visual Code for Video MLLMs

Jun 01, 2026
Via arXiv