Anand Mishra

Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions

Feb 12, 2025
Via arXiv

Towards Making Flowchart Images Machine Interpretable

Jan 29, 2025
Via arXiv

PatentLMM: Large Multimodal Model for Generating Descriptions for Patent Figures

Jan 25, 2025
Via arXiv

Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant

Oct 24, 2024
Via arXiv

Sketch-guided Image Inpainting with Partial Discrete Diffusion Process

Apr 18, 2024
Via arXiv

Towards Scene-Text to Scene-Text Translation

Aug 06, 2023
Via arXiv

Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering

Jun 29, 2023
Via arXiv

Query-guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch

Mar 15, 2023
Via arXiv

Multimodal Query-guided Object Localization

Dec 01, 2022
Via arXiv

Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification

Nov 23, 2022
Via arXiv