Pranav Shetty

Where is this coming from? Making groundedness count in the evaluation of Document VQA models

Mar 24, 2025

Armineh Nourbakhsh, Siddharth Parekh, Pranav Shetty, Zhao Jin, Sameena Shah, Carolyn Rose

Abstract: Document Visual Question Answering (VQA) models have evolved at an impressive rate over the past few years, coming close to or matching human performance on some benchmarks. We argue that common evaluation metrics used by popular benchmarks do not account for the semantic and multimodal groundedness of a model's outputs. As a result, hallucinations and major semantic errors are treated the same way as well-grounded outputs, and the evaluation scores do not reflect the reasoning capabilities of the model. In response, we propose a new evaluation methodology that accounts for the groundedness of predictions with regard to the semantic characteristics of the output as well as the multimodal placement of the output within the input document. Our proposed methodology is parameterized in such a way that users can configure the score according to their preferences. We validate our scoring methodology using human judgment and show its potential impact on existing popular leaderboards. Through extensive analyses, we demonstrate that our proposed method produces scores that are a better indicator of a model's robustness and tends to give higher rewards to better-calibrated answers.

* Accepted to NAACL Findings 2025

"What is the value of {templates}?" Rethinking Document Information Extraction Datasets for LLMs

Oct 20, 2024

Ran Zmigrod, Pranav Shetty, Mathieu Sibue, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu, Manuela Veloso

Abstract: The rise of large language models (LLMs) for visually rich document understanding (VRDU) has kindled a need for prompt-response, document-based datasets. As annotating new datasets from scratch is labor-intensive, the existing literature has generated prompt-response datasets from available resources using simple templates. For the case of key information extraction (KIE), one of the most common VRDU tasks, past work has typically employed the template "What is the value for the {key}?". However, given the variety of questions encountered in the wild, simple and uniform templates are insufficient for creating robust models in research and industrial contexts. In this work, we present K2Q, a diverse collection of five datasets converted from KIE to a prompt-response format using a plethora of bespoke templates. The questions in K2Q can span multiple entities and be extractive or boolean. We empirically compare the performance of seven baseline generative models on K2Q with zero-shot prompting. We further compare three of these models when training on K2Q versus training on simpler templates to motivate the need for our work. We find that creating diverse and intricate KIE questions enhances the performance and robustness of VRDU models. We hope this work encourages future studies on data quality for generative model training.

* Accepted to EMNLP Findings 2024
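
The single-template baseline that the abstract above contrasts with K2Q can be illustrated with a minimal sketch. Only the template string comes from the abstract; the function name `kie_to_prompts`, the record layout, and the example keys and values are hypothetical and are not taken from the K2Q paper or its code.

```python
from typing import Dict, List, Tuple

# Template quoted in the abstract as the one typically used in past work.
BASELINE_TEMPLATE = "What is the value for the {key}?"


def kie_to_prompts(
    record: Dict[str, str], template: str = BASELINE_TEMPLATE
) -> List[Tuple[str, str]]:
    """Turn one KIE record (key -> value) into (question, answer) pairs.

    Hypothetical helper: every key gets the same uniform question,
    which is the simple setup the abstract describes as insufficient.
    """
    return [(template.format(key=key), value) for key, value in record.items()]


# Hypothetical annotated document with two extracted fields.
example = {"invoice number": "INV-001", "total": "42.00"}
pairs = kie_to_prompts(example)
for question, answer in pairs:
    print(question, "->", answer)
```

Because each key is mapped to one fixed phrasing, every question has the same surface form, which may be why the abstract argues for more diverse, bespoke templates.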

Accelerating materials discovery for polymer solar cells: Data-driven insights enabled by natural language processing

Feb 29, 2024

Pranav Shetty, Aishat Adeboye, Sonakshi Gupta, Chao Zhang, Rampi Ramprasad

Abstract: We present a natural language processing pipeline that was used to extract polymer solar cell property data from the literature and simulate various active learning strategies. While data-driven methods have been well established to discover novel materials faster than Edisonian trial-and-error approaches, their benefits have not been quantified. Our approach demonstrates a potential reduction in discovery time by approximately 75%, equivalent to a 15-year acceleration in material innovation. Our pipeline enables us to extract data from more than 3300 papers, which is ~5 times larger than similar data sets reported by others. We also trained machine learning models to predict the power conversion efficiency and used our model to identify promising donor-acceptor combinations that are as yet unreported. We thus demonstrate a workflow that goes from published literature to extracted material property data, which in turn is used to obtain data-driven insights. Our insights include active learning strategies that can simultaneously optimize the material system and train strong predictive models of material properties. This work provides a valuable framework for research in material science.

PolyIE: A Dataset of Information Extraction from Polymer Material Scientific Literature

Nov 13, 2023

Jerry Junyang Cheung, Yuchen Zhuang, Yinghao Li, Pranav Shetty, Wantian Zhao, Sanjeev Grampurohit, Rampi Ramprasad, Chao Zhang

Abstract: Scientific information extraction (SciIE), which aims to automatically extract information from scientific literature, is becoming more important than ever. However, there are no existing SciIE datasets for polymer materials, which are an important class of materials used ubiquitously in our daily lives. To bridge this gap, we introduce POLYIE, a new SciIE dataset for polymer materials. POLYIE is curated from 146 full-length polymer scholarly articles, which are annotated with different named entities (i.e., materials, properties, values, conditions) as well as their N-ary relations by domain experts. POLYIE presents several unique challenges due to diverse lexical formats of entities, ambiguity between entities, and variable-length relations. We evaluate state-of-the-art named entity extraction and relation extraction models on POLYIE, analyze their strengths and weaknesses, and highlight some difficult cases for these models. To the best of our knowledge, POLYIE is the first SciIE benchmark for polymer materials, and we hope it will lead to more research efforts from the community on this challenging task. Our code and data are available at https://github.com/jerry3027/PolyIE.

* Work in progress

Cross-Geography Generalization of Machine Learning Methods for Classification of Flooded Regions in Aerial Images

Oct 04, 2022

Sushant Lenka, Pratyush Kerhalkar, Pranav Shetty, Harsh Gupta, Bhavam Vidyarthi, Ujjwal Verma

Abstract: Identification of regions affected by floods is a crucial piece of information required for better planning and management of post-disaster relief and rescue efforts. Traditionally, remote sensing images are analyzed to identify the extent of damage caused by flooding. The data acquired from sensors onboard earth observation satellites are analyzed to detect the flooded regions, but this detection can be affected by low spatial and temporal resolution. However, in recent years, the images acquired from Unmanned Aerial Vehicles (UAVs) have also been utilized to assess post-disaster damage. Indeed, a UAV-based platform can be rapidly deployed with a customized flight plan and minimum dependence on the ground infrastructure. This work proposes two approaches for identifying flooded regions in UAV aerial images. The first approach utilizes texture-based unsupervised segmentation to detect flooded areas, while the second uses an artificial neural network on the texture features to classify images as flooded and non-flooded. Unlike the existing works where the models are trained and tested on images of the same geographical regions, this work studies the performance of the proposed model in identifying flooded regions across geographical regions. An F1-score of 0.89 is obtained using the proposed segmentation-based approach, which is higher than that of existing classifiers. The robustness of the proposed approach demonstrates that it can be utilized to identify flooded regions in any geography with minimum or no user intervention.

A general-purpose material property data extraction pipeline from large polymer corpora using Natural Language Processing

Sep 27, 2022

Pranav Shetty, Arunkumar Chitteth Rajan, Christopher Kuenneth, Sonakshi Gupta, Lakshmi Prerana Panchumarti, Lauren Holm, Chao Zhang, Rampi Ramprasad

Abstract: The ever-increasing number of materials science articles makes it hard to infer chemistry-structure-property relations from published literature. We used natural language processing (NLP) methods to automatically extract material property data from the abstracts of polymer literature. As a component of our pipeline, we trained MaterialsBERT, a language model, using 2.4 million materials science abstracts, which outperforms other baseline models in three out of five named entity recognition datasets when used as the encoder for text. Using this pipeline, we obtained ~300,000 material property records from ~130,000 abstracts in 60 hours. The extracted data was analyzed for a diverse range of applications such as fuel cells, supercapacitors, and polymer solar cells to recover non-trivial insights. The data extracted through our pipeline is made available through a web platform at https://polymerscholar.org, which can be used to conveniently locate material property data recorded in abstracts. This work demonstrates the feasibility of an automatic pipeline that starts from published literature and ends with a complete set of extracted material property information.

PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning

Mar 18, 2022

Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, Chao Zhang

Abstract: Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set is tedious and difficult. We study interactive weakly-supervised learning -- the problem of iteratively and automatically discovering novel labeling rules from data to improve the WSL model. Our proposed model, named PRBoost, achieves this goal via iterative prompt-based rule discovery and model boosting. It uses boosting to identify large-error instances and then discovers candidate rules from them by prompting pre-trained LMs with rule templates. The candidate rules are judged by human experts, and the accepted rules are used to generate complementary weak labels and strengthen the current model. Experiments on four tasks show PRBoost outperforms state-of-the-art WSL baselines by up to 7.1% and bridges the gaps with fully supervised models. Our implementation is available at https://github.com/rz-zhang/PRBoost.

* ACL 2022 (Main Conference). Code: https://github.com/rz-zhang/PRBoost

BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition

May 30, 2021

Yinghao Li, Pranav Shetty, Lucas Liu, Chao Zhang, Le Song

Abstract: We study the problem of learning a named entity recognition (NER) tagger using noisy labels from multiple weak supervision sources. Though cheap to obtain, the labels from weak supervision sources are often incomplete, inaccurate, and contradictory, making it difficult to learn an accurate NER model. To address this challenge, we propose a conditional hidden Markov model (CHMM), which can effectively infer true labels from multi-source noisy labels in an unsupervised way. CHMM enhances the classic hidden Markov model with the contextual representation power of pre-trained language models. Specifically, CHMM learns token-wise transition and emission probabilities from the BERT embeddings of the input tokens to infer the latent true labels from noisy observations. We further refine CHMM with an alternate-training approach (CHMM-ALT). It fine-tunes a BERT-NER model with the labels inferred by CHMM, and this BERT-NER's output is regarded as an additional weak source to train the CHMM in return. Experiments on four NER benchmarks from various domains show that our method outperforms state-of-the-art weakly supervised NER models by wide margins.
