Aishwarya Agrawal

TECCI: Tricky Edits of Collected and Curated Images

May 31, 2026

How and What to Imagine? Visual Thinking in Unified Multimodal Models for Cross-View Spatial Reasoning

May 26, 2026

RiT: Vanilla Diffusion Transformers Suffice in Representation Space

May 21, 2026

From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs

May 04, 2026

Discovering Failure Modes in Vision-Language Models using RL

Apr 06, 2026

Communicating about Space: Language-Mediated Spatial Integration Across Partial Views

Apr 01, 2026

CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics

Jun 10, 2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning

May 26, 2025

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Mar 27, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Mar 19, 2025