Yash Jain

DH-Bench: Probing Depth and Height Perception of Large Visual-Language Models

Aug 21, 2024
Via arXiv

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

Mar 28, 2024
Via arXiv

PEEKABOO: Interactive Video Generation via Masked-Diffusion

Dec 12, 2023
Via arXiv

Signed Binarization: Unlocking Efficiency Through Repetition-Sparsity Trade-Off

Dec 04, 2023
Via arXiv

DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets

Nov 08, 2023
Via arXiv

Fine-grained Human Activity Recognition Using Virtual On-body Acceleration Data

Nov 02, 2022
Via arXiv

ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition

Feb 01, 2022
Via arXiv

Deep Neural Matching Models for Graph Retrieval

Oct 03, 2021
Via arXiv

Integrating Transductive And Inductive Embeddings Improves Link Prediction Accuracy

Aug 23, 2021
Via arXiv

Towards an Interpretable Latent Space in Structured Models for Video Prediction

Jul 16, 2021
Via arXiv