Dimitris N. Metaxas

Rutgers University

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering

Feb 05, 2025

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

Feb 04, 2025

RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models

Feb 04, 2025

MLLM-as-a-Judge for Image Safety without Human Labeling

Dec 31, 2024

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Nov 30, 2024

Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

Nov 27, 2024

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment

Sep 22, 2024

APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking

Jun 20, 2024

SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Jun 03, 2024

Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models

May 31, 2024