Alessio Tonioni

Test-Time Visual In-Context Tuning

Mar 27, 2025
Via arXiv

Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos

Mar 17, 2025
Via arXiv

UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency

Dec 19, 2024
Via arXiv

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Nov 27, 2024
Via arXiv

BRAVE: Broadening the visual encoding of vision-language models

Apr 10, 2024
Via arXiv

Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces

Mar 29, 2024
Via arXiv

InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes

Jan 10, 2024
Via arXiv

Text-Conditioned Resampler For Long Form Video Understanding

Dec 19, 2023
Via arXiv

LIME: Localized Image Editing via Attention Regularization in Diffusion Models

Dec 14, 2023
Via arXiv

TouchSDF: A DeepSDF Approach for 3D Shape Reconstruction using Vision-Based Tactile Sensing

Nov 21, 2023
Via arXiv