Volker Tresp

Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering

Dec 16, 2024
Via arXiv

PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model

Nov 12, 2024
Via arXiv

FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models

Oct 07, 2024
Via arXiv

VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs

Sep 30, 2024
Via arXiv

Visual Question Decomposition on Multimodal Large Language Models

Sep 28, 2024
Via arXiv

Multimodal Pragmatic Jailbreak on Text-to-image Models

Sep 27, 2024
Via arXiv

WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration

Aug 28, 2024
Via arXiv

Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance

Aug 24, 2024
Via arXiv

DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models

Aug 08, 2024
Via arXiv

LookupViT: Compressing visual information to a limited number of tokens

Jul 17, 2024
Via arXiv