Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Aditi Singh

Multilingual Prompt Engineering in Large Language Models: A Survey Across NLP Tasks

May 16, 2025

Shubham Vatsal, Harsh Dubey, Aditi Singh

Abstract:Large language models (LLMs) have demonstrated impressive performance across a wide range of Natural Language Processing (NLP) tasks. However, ensuring their effectiveness across multiple languages presents unique challenges. Multilingual prompt engineering has emerged as a key approach to enhance LLMs' capabilities in diverse linguistic settings without requiring extensive parameter re-training or fine-tuning. With growing interest in multilingual prompt engineering over the past two to three years, researchers have explored various strategies to improve LLMs' performance across languages and NLP tasks. By crafting structured natural language prompts, researchers have successfully extracted knowledge from LLMs across different languages, making these techniques an accessible pathway for a broader audience, including those without deep expertise in machine learning, to harness the capabilities of LLMs. In this paper, we survey and categorize different multilingual prompting techniques based on the NLP tasks they address across a diverse set of datasets that collectively span around 250 languages. We further highlight the LLMs employed, present a taxonomy of approaches and discuss potential state-of-the-art (SoTA) methods for specific multilingual datasets. Additionally, we derive a range of insights across language families and resource levels (high-resource vs. low-resource), including analyses such as the distribution of NLP tasks by language resource type and the frequency of prompting methods across different language families. Our survey reviews 36 research papers covering 39 prompting techniques applied to 30 multilingual NLP tasks, with the majority of these studies published in the last two years.

Via

Access Paper or Ask Questions

A survey of agent interoperability protocols: Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent-to-Agent Protocol (A2A), and Agent Network Protocol (ANP)

May 04, 2025

Abul Ehtesham, Aditi Singh, Gaurav Kumar Gupta, Saket Kumar

Abstract:Large language model (LLM)-powered autonomous agents demand robust, standardized protocols to integrate tools, share contextual data, and coordinate tasks across heterogeneous systems. Ad-hoc integrations are difficult to scale, secure, and generalize across domains. This survey examines four emerging agent communication protocols: Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent-to-Agent Protocol (A2A), and Agent Network Protocol (ANP), each addressing interoperability in distinct deployment contexts. MCP provides a JSON-RPC client-server interface for secure tool invocation and typed data exchange. ACP introduces REST-native messaging via multi-part messages and asynchronous streaming to support multimodal agent responses. A2A enables peer-to-peer task outsourcing through capability-based Agent Cards, facilitating enterprise-scale workflows. ANP supports open-network agent discovery and secure collaboration using decentralized identifiers (DIDs) and JSON-LD graphs. The protocols are compared across multiple dimensions, including interaction modes, discovery mechanisms, communication patterns, and security models. Based on the comparative analysis, a phased adoption roadmap is proposed: beginning with MCP for tool access, followed by ACP for multimodal messaging, A2A for collaborative task execution, and extending to ANP for decentralized agent marketplaces. This work provides a comprehensive foundation for designing secure, interoperable, and scalable ecosystems of LLM-powered agents.

Via

Access Paper or Ask Questions

Pediatric Asthma Detection with Googles HeAR Model: An AI-Driven Respiratory Sound Classifier

Apr 28, 2025

Abul Ehtesham, Saket Kumar, Aditi Singh, Tala Talaei Khoei

Abstract:Early detection of asthma in children is crucial to prevent long-term respiratory complications and reduce emergency interventions. This work presents an AI-powered diagnostic pipeline that leverages Googles Health Acoustic Representations (HeAR) model to detect early signs of asthma from pediatric respiratory sounds. The SPRSound dataset, the first open-access collection of annotated respiratory sounds in children aged 1 month to 18 years, is used to extract 2-second audio segments labeled as wheeze, crackle, rhonchi, stridor, or normal. Each segment is embedded into a 512-dimensional representation using HeAR, a foundation model pretrained on 300 million health-related audio clips, including 100 million cough sounds. Multiple classifiers, including SVM, Random Forest, and MLP, are trained on these embeddings to distinguish between asthma-indicative and normal sounds. The system achieves over 91\% accuracy, with strong performance on precision-recall metrics for positive cases. In addition to classification, learned embeddings are visualized using PCA, misclassifications are analyzed through waveform playback, and ROC and confusion matrix insights are provided. This method demonstrates that short, low-resource pediatric recordings, when powered by foundation audio models, can enable fast, noninvasive asthma screening. The approach is especially promising for digital diagnostics in remote or underserved healthcare settings.

Via

Access Paper or Ask Questions

Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Jan 15, 2025

Aditi Singh, Abul Ehtesham, Saket Kumar, Tala Talaei Khoei

Figure 1 for Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Figure 2 for Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Figure 3 for Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Figure 4 for Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Abstract:Large Language Models (LLMs) have revolutionized artificial intelligence (AI) by enabling human like text generation and natural language understanding. However, their reliance on static training data limits their ability to respond to dynamic, real time queries, resulting in outdated or inaccurate outputs. Retrieval Augmented Generation (RAG) has emerged as a solution, enhancing LLMs by integrating real time data retrieval to provide contextually relevant and up-to-date responses. Despite its promise, traditional RAG systems are constrained by static workflows and lack the adaptability required for multistep reasoning and complex task management. Agentic Retrieval-Augmented Generation (Agentic RAG) transcends these limitations by embedding autonomous AI agents into the RAG pipeline. These agents leverage agentic design patterns reflection, planning, tool use, and multiagent collaboration to dynamically manage retrieval strategies, iteratively refine contextual understanding, and adapt workflows to meet complex task requirements. This integration enables Agentic RAG systems to deliver unparalleled flexibility, scalability, and context awareness across diverse applications. This survey provides a comprehensive exploration of Agentic RAG, beginning with its foundational principles and the evolution of RAG paradigms. It presents a detailed taxonomy of Agentic RAG architectures, highlights key applications in industries such as healthcare, finance, and education, and examines practical implementation strategies. Additionally, it addresses challenges in scaling these systems, ensuring ethical decision making, and optimizing performance for real-world applications, while providing detailed insights into frameworks and tools for implementing Agentic RAG

Via

Access Paper or Ask Questions

A Survey of Sustainability in Large Language Models: Applications, Economics, and Challenges

Dec 06, 2024

Aditi Singh, Nirmal Prakashbhai Patel, Abul Ehtesham, Saket Kumar, Tala Talaei Khoei

Abstract:Large Language Models (LLMs) have transformed numerous domains by providing advanced capabilities in natural language understanding, generation, and reasoning. Despite their groundbreaking applications across industries such as research, healthcare, and creative media, their rapid adoption raises critical concerns regarding sustainability. This survey paper comprehensively examines the environmental, economic, and computational challenges associated with LLMs, focusing on energy consumption, carbon emissions, and resource utilization in data centers. By synthesizing insights from existing literature, this work explores strategies such as resource-efficient training, sustainable deployment practices, and lifecycle assessments to mitigate the environmental impacts of LLMs. Key areas of emphasis include energy optimization, renewable energy integration, and balancing performance with sustainability. The findings aim to guide researchers, practitioners, and policymakers in developing actionable strategies for sustainable AI systems, fostering a responsible and environmentally conscious future for artificial intelligence.

Via

Access Paper or Ask Questions

A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Dec 06, 2024

Aditi Singh, Akash Shetty, Abul Ehtesham, Saket Kumar, Tala Talaei Khoei

Figure 1 for A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Figure 2 for A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Figure 3 for A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Figure 4 for A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Abstract:Text-to-SQL systems facilitate smooth interaction with databases by translating natural language queries into Structured Query Language (SQL), bridging the gap between non-technical users and complex database management systems. This survey provides a comprehensive overview of the evolution of AI-driven text-to-SQL systems, highlighting their foundational components, advancements in large language model (LLM) architectures, and the critical role of datasets such as Spider, WikiSQL, and CoSQL in driving progress. We examine the applications of text-to-SQL in domains like healthcare, education, and finance, emphasizing their transformative potential for improving data accessibility. Additionally, we analyze persistent challenges, including domain generalization, query optimization, support for multi-turn conversational interactions, and the limited availability of datasets tailored for NoSQL databases and dynamic real-world scenarios. To address these challenges, we outline future research directions, such as extending text-to-SQL capabilities to support NoSQL databases, designing datasets for dynamic multi-turn interactions, and optimizing systems for real-world scalability and robustness. By surveying current advancements and identifying key gaps, this paper aims to guide the next generation of research and applications in LLM-based text-to-SQL systems.

Via

Access Paper or Ask Questions

Movie Gen: SWOT Analysis of Meta's Generative AI Foundation Model for Transforming Media Generation, Advertising, and Entertainment Industries

Dec 05, 2024

Abul Ehtesham, Saket Kumar, Aditi Singh, Tala Talaei Khoei

Figure 1 for Movie Gen: SWOT Analysis of Meta's Generative AI Foundation Model for Transforming Media Generation, Advertising, and Entertainment Industries

Figure 2 for Movie Gen: SWOT Analysis of Meta's Generative AI Foundation Model for Transforming Media Generation, Advertising, and Entertainment Industries

Abstract:Generative AI is reshaping the media landscape, enabling unprecedented capabilities in video creation, personalization, and scalability. This paper presents a comprehensive SWOT analysis of Metas Movie Gen, a cutting-edge generative AI foundation model designed to produce 1080p HD videos with synchronized audio from simple text prompts. We explore its strengths, including high-resolution video generation, precise editing, and seamless audio integration, which make it a transformative tool across industries such as filmmaking, advertising, and education. However, the analysis also addresses limitations, such as constraints on video length and potential biases in generated content, which pose challenges for broader adoption. In addition, we examine the evolving regulatory and ethical considerations surrounding generative AI, focusing on issues like content authenticity, cultural representation, and responsible use. Through comparative insights with leading models like DALL-E and Google Imagen, this paper highlights Movie Gens unique features, such as video personalization and multimodal synthesis, while identifying opportunities for innovation and areas requiring further research. Our findings provide actionable insights for stakeholders, emphasizing both the opportunities and challenges of deploying generative AI in media production. This work aims to guide future advancements in generative AI, ensuring scalability, quality, and ethical integrity in this rapidly evolving field.

Via

Access Paper or Ask Questions

Architectural Flaw Detection in Civil Engineering Using GPT-4

Oct 26, 2024

Saket Kumar, Abul Ehtesham, Aditi Singh, Tala Talaei Khoei

Abstract:The application of artificial intelligence (AI) in civil engineering presents a transformative approach to enhancing design quality and safety. This paper investigates the potential of the advanced LLM GPT4 Turbo vision model in detecting architectural flaws during the design phase, with a specific focus on identifying missing doors and windows. The study evaluates the model's performance through metrics such as precision, recall, and F1 score, demonstrating AI's effectiveness in accurately detecting flaws compared to human-verified data. Additionally, the research explores AI's broader capabilities, including identifying load-bearing issues, material weaknesses, and ensuring compliance with building codes. The findings highlight how AI can significantly improve design accuracy, reduce costly revisions, and support sustainable practices, ultimately revolutionizing the civil engineering field by ensuring safer, more efficient, and aesthetically optimized structures.

Via

Access Paper or Ask Questions

Exploring Prompt Engineering: A Systematic Review with SWOT Analysis

Oct 09, 2024

Aditi Singh, Abul Ehtesham, Gaurav Kumar Gupta, Nikhil Kumar Chatta, Saket Kumar, Tala Talaei Khoei

Abstract:In this paper, we conduct a comprehensive SWOT analysis of prompt engineering techniques within the realm of Large Language Models (LLMs). Emphasizing linguistic principles, we examine various techniques to identify their strengths, weaknesses, opportunities, and threats. Our findings provide insights into enhancing AI interactions and improving language model comprehension of human prompts. The analysis covers techniques including template-based approaches and fine-tuning, addressing the problems and challenges associated with each. The conclusion offers future research directions aimed at advancing the effectiveness of prompt engineering in optimizing human-machine communication.

* 14 pages, 1 figures

Via

Access Paper or Ask Questions

Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

May 09, 2024

Gaurav Kumar Gupta, Aditi Singh, Sijo Valayakkad Manikandan, Abul Ehtesham

Figure 1 for Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

Figure 2 for Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

Figure 3 for Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

Figure 4 for Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

Abstract:The recent swift development of LLMs like GPT-4, Gemini, and GPT-3.5 offers a transformative opportunity in medicine and healthcare, especially in digital diagnostics. This study evaluates each model diagnostic abilities by interpreting a user symptoms and determining diagnoses that fit well with common illnesses, and it demonstrates how each of these models could significantly increase diagnostic accuracy and efficiency. Through a series of diagnostic prompts based on symptoms from medical databases, GPT-4 demonstrates higher diagnostic accuracy from its deep and complete history of training on medical data. Meanwhile, Gemini performs with high precision as a critical tool in disease triage, demonstrating its potential to be a reliable model when physicians are trying to make high-risk diagnoses. GPT-3.5, though slightly less advanced, is a good tool for medical diagnostics. This study highlights the need to study LLMs for healthcare and clinical practices with more care and attention, ensuring that any system utilizing LLMs promotes patient privacy and complies with health information privacy laws such as HIPAA compliance, as well as the social consequences that affect the varied individuals in complex healthcare contexts. This study marks the start of a larger future effort to study the various ways in which assigning ethical concerns to LLMs task of learning from human biases could unearth new ways to apply AI in complex medical settings.

* 14 pages, 4 figures

Via

Access Paper or Ask Questions