Yuanchun Shi

MOAT: Evaluating LMMs for Capability Integration and Instruction Grounding

Mar 12, 2025

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

Mar 11, 2025

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

Nov 29, 2024

Summit Vitals: Multi-Camera and Multi-Signal Biosensing at High Altitudes

Sep 28, 2024

PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture

Sep 21, 2024

G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios

May 13, 2024

Time2Stop: Adaptive and Explainable Human-AI Loop for Smartphone Overuse Intervention

Mar 03, 2024

MindShift: Leveraging Large Language Models for Mental-States-Based Problematic Smartphone Use Intervention

Sep 28, 2023

Modeling the Trade-off of Privacy Preservation and Activity Recognition on Low-Resolution Images

Mar 18, 2023

GazeReader: Detecting Unknown Word Using Webcam for English as a Second Language (ESL) Learners

Mar 18, 2023