Yoshimitsu Aoki

Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos

Jul 16, 2025

Iterative Event-based Motion Segmentation by Variational Contrast Maximization

Apr 25, 2025

Formula-Supervised Sound Event Detection: Pre-Training Without Real Data

Apr 06, 2025

Simultaneous Motion And Noise Estimation with Event Cameras

Apr 05, 2025

BoundMatch: Boundary detection applied to semi-supervised segmentation for urban-driving scenes

Mar 30, 2025

Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering

Mar 27, 2025

Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding

Jan 16, 2025

Acoustic-based 3D Human Pose Estimation Robust to Human Position

Nov 08, 2024

Pre-training with Synthetic Patterns for Audio

Oct 01, 2024

Data Collection-free Masked Video Modeling

Sep 10, 2024