Yifei Huang

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant

Mar 06, 2025
Via arXiv

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

Mar 02, 2025
Via arXiv

SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training

Feb 21, 2025
Via arXiv

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Dec 30, 2024
Via arXiv

CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding

Dec 16, 2024
Via arXiv

Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers

Oct 10, 2024
Via arXiv

Pre-Training for 3D Hand Pose Estimation with Contrastive Learning on Large-Scale Hand Images in the Wild

Sep 15, 2024
Via arXiv

ActionVOS: Actions as Prompts for Video Object Segmentation

Jul 10, 2024
Via arXiv

Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition

Jul 09, 2024
Via arXiv

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation

Jun 27, 2024
Via arXiv