Kim Gerdes

LISN

PatentEval: Understanding Errors in Patent Generation

Jun 05, 2024

You Zuo, Kim Gerdes, Eric Villemonte de La Clergerie, Benoît Sagot

Abstract: In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.

* NAACL 2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico

PatFig: Generating Short and Long Captions for Patent Figures

Sep 15, 2023

Dana Aubakirova, Kim Gerdes, Lufei Liu

Abstract: This paper introduces Qatent PatFig, a novel large-scale patent figure dataset comprising 30,000+ patent figures from over 11,000 European patent applications. For each figure, this dataset provides short and long captions, reference numerals, their corresponding terms, and the minimal claim set that describes the interactions between the components of the image. To assess the usability of the dataset, we finetune an LVLM model on Qatent PatFig to generate short and long descriptions, and we investigate the effects of incorporating various text-based cues at the prediction stage of the patent figure captioning process.

* Accepted to ICCV 2023, CLVL: 5th Workshop on Closing the Loop Between Vision and Language
