Sudheendra Vijayanarasimhan

$IC^3$: Image Captioning by Committee Consensus

Feb 16, 2023
Via arXiv

Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features

Dec 20, 2022
Via arXiv

Distribution Aware Metrics for Conditional Natural Language Generation

Sep 29, 2022
Via arXiv

What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics

May 12, 2022
Via arXiv

Active Learning for Video Description With Cluster-Regularized Ensemble Ranking

Jul 29, 2020
Via arXiv

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

Apr 30, 2018
Via arXiv

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

Apr 20, 2018
Via arXiv

End-to-End Learning of Semantic Grasping

Nov 09, 2017
Via arXiv

The Kinetics Human Action Video Dataset

May 19, 2017
Via arXiv

Motion Prediction Under Multimodality with Conditional Stochastic Networks

May 05, 2017
Via arXiv