Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ali Mohammadi

REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs

May 03, 2024

Deepa Tilwani, Yash Saxena, Ali Mohammadi, Edward Raff, Amit Sheth, Srinivasan Parthasarathy, Manas Gaur

Figure 1 for REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs

Figure 2 for REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs

Figure 3 for REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs

Figure 4 for REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs

Abstract:Automatic citation generation for sentences in a document or report is paramount for intelligence analysts, cybersecurity, news agencies, and education personnel. In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, LLMs are asked to provide author names of the given research article, and (b) Indirect Queries, LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. To demonstrate where LLM stands in this task, we introduce a large dataset called REASONS comprising abstracts of the 12 most popular domains of scientific research on arXiv. From around 20K research articles, we make the following deductions on public and proprietary LLMs: (a) State-of-the-art, often called anthropomorphic GPT-4 and GPT-3.5, suffers from high pass percentage (PP) to minimize the hallucination rate (HR). When tested with Perplexity.ai (7B), they unexpectedly made more errors; (b) Augmenting relevant metadata lowered the PP and gave the lowest HR; (c) Advance retrieval-augmented generation (RAG) using Mistral demonstrates consistent and robust citation support on indirect queries and matched performance to GPT-3.5 and GPT-4. The HR across all domains and models decreased by an average of 41.93% and the PP was reduced to 0% in most cases. In terms of generation quality, the average F1 Score and BLEU were 68.09% and 57.51%, respectively; (d) Testing with adversarial samples showed that LLMs, including the Advance RAG Mistral, struggle to understand context, but the extent of this issue was small in Mistral and GPT-4-Preview. Our study con tributes valuable insights into the reliability of RAG for automated citation generation tasks.

* Submitted to ACL ARR April 2024

Via

Access Paper or Ask Questions

Empowering Distributed Solutions in Renewable Energy Systems and Grid Optimization

Oct 24, 2023

Mohammad Mohammadi, Ali Mohammadi

Abstract:This study delves into the shift from centralized to decentralized approaches in the electricity industry, with a particular focus on how machine learning (ML) advancements play a crucial role in empowering renewable energy sources and improving grid management. ML models have become increasingly important in predicting renewable energy generation and consumption, utilizing various techniques like artificial neural networks, support vector machines, and decision trees. Furthermore, data preprocessing methods, such as data splitting, normalization, decomposition, and discretization, are employed to enhance prediction accuracy. The incorporation of big data and ML into smart grids offers several advantages, including heightened energy efficiency, more effective responses to demand, and better integration of renewable energy sources. Nevertheless, challenges like handling large data volumes, ensuring cybersecurity, and obtaining specialized expertise must be addressed. The research investigates various ML applications within the realms of solar energy, wind energy, and electric distribution and storage, illustrating their potential to optimize energy systems. To sum up, this research demonstrates the evolving landscape of the electricity sector as it shifts from centralized to decentralized solutions through the application of ML innovations and distributed decision-making, ultimately shaping a more efficient and sustainable energy future.

Via

Access Paper or Ask Questions

Deep neural operator for learning transient response of interpenetrating phase composites subject to dynamic loading

Mar 30, 2023

Minglei Lu, Ali Mohammadi, Zhaoxu Meng, Xuhui Meng, Gang Li, Zhen Li

Abstract:Additive manufacturing has been recognized as an industrial technological revolution for manufacturing, which allows fabrication of materials with complex three-dimensional (3D) structures directly from computer-aided design models. The mechanical properties of interpenetrating phase composites (IPCs), especially response to dynamic loading, highly depend on their 3D structures. In general, for each specified structural design, it could take hours or days to perform either finite element analysis (FEA) or experiments to test the mechanical response of IPCs to a given dynamic load. To accelerate the physics-based prediction of mechanical properties of IPCs for various structural designs, we employ a deep neural operator (DNO) to learn the transient response of IPCs under dynamic loading as surrogate of physics-based FEA models. We consider a 3D IPC beam formed by two metals with a ratio of Young's modulus of 2.7, wherein random blocks of constituent materials are used to demonstrate the generality and robustness of the DNO model. To obtain FEA results of IPC properties, 5,000 random time-dependent strain loads generated by a Gaussian process kennel are applied to the 3D IPC beam, and the reaction forces and stress fields inside the IPC beam under various loading are collected. Subsequently, the DNO model is trained using an incremental learning method with sequence-to-sequence training implemented in JAX, leading to a 100X speedup compared to widely used vanilla deep operator network models. After an offline training, the DNO model can act as surrogate of physics-based FEA to predict the transient mechanical response in terms of reaction force and stress distribution of the IPCs to various strain loads in one second at an accuracy of 98%. Also, the learned operator is able to provide extended prediction of the IPC beam subject to longer random strain loads at a reasonably well accuracy.

* 23 pages, 14 figures

Via

Access Paper or Ask Questions