Kumar Ashutosh

IIT Bombay

FIction: 4D Future Interaction Prediction from Video

Dec 01, 2024

ExpertAF: Expert Actionable Feedback from Video

Aug 01, 2024

SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos

Apr 08, 2024

Detours for Navigating Instructional Videos

Jan 03, 2024

Learning Object State Changes in Videos: An Open-World Perspective

Dec 19, 2023

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

Jul 17, 2023

What You Say Is What You Show: Visual Narration Detection in Instructional Videos

Jan 05, 2023

HierVL: Learning Hierarchical Video-Language Embeddings

Jan 05, 2023

RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

Oct 15, 2022