Ruiyi Zhang

Defense against Prompt Injection Attacks via Mixture of Encodings

Apr 10, 2025

Towards Visual Text Grounding of Multimodal Large Language Model

Apr 07, 2025

Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models

Mar 20, 2025

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

Mar 18, 2025

From Selection to Generation: A Survey of LLM-based Active Learning

Feb 17, 2025

GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration

Jan 27, 2025

A High-Quality Text-Rich Image Instruction Tuning Dataset via Hybrid Instruction Generation

Dec 20, 2024

GUI Agents: A Survey

Dec 18, 2024

Numerical Pruning for Efficient Autoregressive Models

Dec 17, 2024

SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner

Dec 13, 2024