Volker Tresp

PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection

Feb 17, 2025
Via arXiv

FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings

Jan 11, 2025
Via arXiv

Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries

Dec 26, 2024
Via arXiv

Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering

Dec 16, 2024
Via arXiv

PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model

Nov 12, 2024
Via arXiv

FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models

Oct 07, 2024
Via arXiv

VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs

Sep 30, 2024
Via arXiv

Visual Question Decomposition on Multimodal Large Language Models

Sep 28, 2024
Via arXiv

Multimodal Pragmatic Jailbreak on Text-to-image Models

Sep 27, 2024
Via arXiv

WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration

Aug 28, 2024
Via arXiv