Marzieh Zare

The future of document indexing: GPT and Donut revolutionize table of content processing

Mar 12, 2024

Degaga Wolde Feyisa, Haylemicheal Berihun, Amanuel Zewdu, Mahsa Najimoghadam, Marzieh Zare

Abstract: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an approach to automate this process, leveraging the capabilities of two AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a large language model. The proposed method first acquires the tables of contents (ToCs) from construction specification documents and then structures the ToC text into JSON data. Donut reaches 85% accuracy and GPT-3.5 Turbo reaches 89% in organizing the ToCs. This result represents a significant step forward in document indexing, demonstrating the potential of AI to automate information extraction tasks across diverse document types, boosting efficiency and freeing critical resources in various industries.

* Document AI, Document Classification, Information extraction, Large Language Models, OCR Models, Visual Document Understanding
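
To make the target output concrete, here is a minimal sketch of turning ToC text into JSON records. It is not the paper's method: the abstract uses Donut and GPT-3.5 Turbo, whereas this sketch swaps in a plain regular expression. The line layout (`<section number> <title> .... <page>`), the sample text, and the field names `number`, `title`, `page`, and `level` are all assumptions made for illustration.

```python
import json
import re

# Assumed ToC line shape (hypothetical): "<section number> <title> .... <page>"
TOC_LINE = re.compile(
    r"^\s*(?P<number>\d+(?:\.\d+)*)\s+(?P<title>.+?)\s*\.{2,}\s*(?P<page>\d+)\s*$"
)


def toc_to_records(text):
    """Parse ToC lines into dicts; lines that do not match the pattern are skipped."""
    records = []
    for line in text.splitlines():
        match = TOC_LINE.match(line)
        if match is None:
            continue
        number = match.group("number")
        records.append({
            "number": number,
            "title": match.group("title"),
            "page": int(match.group("page")),
            # Nesting depth inferred from the dotted section number.
            "level": number.count(".") + 1,
        })
    return records


# Made-up sample text, not taken from any real specification document.
SAMPLE_TOC = """\
1 General Requirements ........ 3
1.1 Summary of Work .... 4
2 Concrete ...... 10
"""

records = toc_to_records(SAMPLE_TOC)
print(json.dumps(records, indent=2))
```

A rule-based parser like this only works when the layout is regular, which is the limitation that motivates learned models for scanned or inconsistently formatted documents.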

Comparison of single and multitask learning for predicting cognitive decline based on MRI data

Sep 21, 2021

Vandad Imani, Mithilesh Prakash, Marzieh Zare, Jussi Tohka

Abstract: The Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog) is a neuropsychological tool designed to assess the severity of cognitive symptoms of dementia. Personalized prediction of the changes in ADAS-Cog scores could help in timing therapeutic interventions in dementia and at-risk populations. In the present work, we compared single and multitask learning approaches to predict the changes in ADAS-Cog scores based on T1-weighted anatomical magnetic resonance imaging (MRI). In contrast to most machine learning-based methods for predicting ADAS-Cog changes, we stratified the subjects based on their baseline diagnoses and evaluated the prediction performance in each group. Our experiments indicated a positive relationship between the predicted and observed ADAS-Cog score changes in each diagnostic group, suggesting that T1-weighted MRI has predictive value for evaluating cognitive decline across the entire AD continuum. We further studied whether correcting for differences in the magnetic field strength of MRI would improve the ADAS-Cog score prediction. The partial least squares-based domain adaptation improved the prediction performance only marginally. In summary, this study demonstrated that ADAS-Cog change could be, to some extent, predicted based on anatomical MRI. Based on this study, the recommended method for learning the predictive models is a single-task regularized linear regression due to its simplicity and good performance. It appears important to combine the training data across all subject groups for the most effective predictive models.
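
As an illustration of what a single-task regularized linear regression looks like, here is a minimal sketch on synthetic data. The abstract does not say which regularizer the authors used; this sketch assumes an L2 (ridge) penalty with a closed-form solution. The features, coefficients, penalty strength `lam`, and the use of Pearson correlation to compare predicted and observed values are all assumptions, and none of the numbers relate to the study's data or results.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y.

    No intercept is fitted; the synthetic data below has zero mean by design.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


rng = np.random.default_rng(0)

# Synthetic stand-ins: X plays the role of imaging-derived features and
# y the role of a score change. The coefficients are made up.
true_w = np.array([2.0, -1.0, 0.5, 0.0, 0.0])
X = rng.normal(size=(300, 5))
y = X @ true_w + rng.normal(scale=0.1, size=300)

# Hold out the last 100 samples for evaluation.
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

w = fit_ridge(X_train, y_train, lam=1.0)
pred = X_test @ w

# Pearson correlation between predicted and observed held-out values.
corr = np.corrcoef(pred, y_test)[0, 1]
print("estimated weights:", np.round(w, 3))
print("held-out correlation:", round(float(corr), 3))
```

In practice the penalty strength would be chosen by cross-validation rather than fixed in advance as it is here.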
