Zane Durante

Towards Fine-Grained Video Question Answering

Mar 10, 2025
Via arXiv

HourVideo: 1-Hour Video-Language Understanding

Nov 07, 2024
Via arXiv

When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?

Jul 21, 2024
Via arXiv

Few-Shot Classification of Interactive Activities of Daily Living (InteractADL)

Jun 03, 2024
Via arXiv

Position Paper: Agent AI Towards a Holistic Intelligence

Feb 28, 2024
Via arXiv

An Interactive Agent Foundation Model

Feb 08, 2024
Via arXiv

Agent AI: Surveying the Horizons of Multimodal Interaction

Jan 07, 2024
Via arXiv

MindAgent: Emergent Gaming Interaction

Sep 19, 2023
Via arXiv

Differentially Private Video Activity Recognition

Jun 27, 2023
Via arXiv

Speech Representations and Phoneme Classification for Preserving the Endangered Language of Ladin

Aug 27, 2021
Via arXiv