Long Nguyen

Advancing Vietnamese Information Retrieval with Learning Objective and Benchmark

Mar 10, 2025

Phu-Vinh Nguyen, Minh-Nam Tran, Long Nguyen, Dien Dinh

Abstract:With the rapid development of natural language processing, many language models have been developed for a variety of tasks. One important task is information retrieval (IR), which requires models to retrieve relevant documents. Despite its importance in many real-life applications, especially in retrieval-augmented generation (RAG) systems, this task lacks Vietnamese benchmarks. This gap makes it difficult to assess and compare the many existing Vietnamese embedding language models on the task and slows the advancement of Vietnamese natural language processing (NLP) research. In this work, we aim to provide the Vietnamese research community with a new benchmark for information retrieval, which mainly focuses on retrieval and reranking tasks. Furthermore, we present a new objective function based on the InfoNCE loss function, which is used to train our Vietnamese embedding model. Our function aims to outperform the original InfoNCE loss on information retrieval tasks. Finally, we analyze the effect of temperature, a hyper-parameter in both objective functions, on the performance of text embedding models.

* PACLIC38-2024
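
For readers unfamiliar with the baseline mentioned in this abstract, the following is a minimal sketch of the standard InfoNCE loss for a single query, showing where the temperature hyper-parameter enters. It is not the new objective proposed in the paper, which the abstract does not specify. The function name `info_nce`, the use of cosine similarity, and the default temperature of 0.05 are assumptions made for illustration.

```python
import math


def info_nce(query, positive, negatives, temperature=0.05):
    """Standard InfoNCE loss for one query (illustrative sketch only).

    query, positive: embedding vectors as lists of floats.
    negatives: list of embedding vectors for non-relevant documents.
    temperature: assumed default; scales the similarities before softmax.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    # Index 0 holds the positive pair; the rest are negatives.
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]

    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))

    # Negative log-probability of the positive under the softmax.
    return log_denominator - logits[0]
```

In this sketch a lower temperature sharpens the softmax, so a well-separated positive yields a loss closer to zero, while a higher temperature flattens the distribution.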

URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT

Jan 27, 2025

Long Nguyen, Tho Quan

Abstract:With the rapid advancement of Artificial Intelligence, particularly in Natural Language Processing, Large Language Models (LLMs) have become pivotal in educational question-answering systems, especially university admission chatbots. Concepts such as Retrieval-Augmented Generation (RAG) and other advanced techniques have been developed to enhance these systems by integrating specific university data, enabling LLMs to provide informed responses on admissions and academic counseling. However, these enhanced RAG techniques often involve high operational costs and require the training of complex, specialized modules, which poses challenges for practical deployment. Additionally, in the educational context, it is crucial to provide accurate answers to prevent misinformation, a task that LLM-based systems find challenging without appropriate strategies and methods. In this paper, we introduce the Unified RAG (URAG) Framework, a hybrid approach that significantly improves the accuracy of responses, particularly for critical queries. Experimental results demonstrate that URAG enhances our in-house, lightweight model to perform comparably to state-of-the-art commercial models. Moreover, to validate its practical applicability, we conducted a case study at our educational institution, which received positive feedback and acclaim. This study not only proves the effectiveness of URAG but also highlights its feasibility for real-world implementation in educational settings.

* Under review at SoICT'24

RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval

Jan 27, 2025

Long Nguyen, Huy Nguyen, Bao Khuu, Huy Luu, Huy Le, Tuan Nguyen, Tho Quan

Abstract:Retrieving events from videos using text queries has become increasingly challenging due to the rapid growth of multimedia content. Existing methods for text-based video event retrieval often focus heavily on object-level descriptions, overlooking the crucial role of contextual information. This limitation is especially apparent when queries lack sufficient context, such as missing location details or ambiguous background elements. To address these challenges, we propose a novel system called RAPID (Retrieval-Augmented Parallel Inference Drafting), which leverages advancements in Large Language Models (LLMs) and prompt-based learning to semantically correct and enrich user queries with relevant contextual information. These enriched queries are then processed through parallel retrieval, followed by an evaluation step to select the most relevant results based on their alignment with the original query. Through extensive experiments on our custom-developed dataset, we demonstrate that RAPID significantly outperforms traditional retrieval methods, particularly for contextually incomplete queries. Our system was validated for both speed and accuracy through participation in the Ho Chi Minh City AI Challenge 2024, where it successfully retrieved events from over 300 hours of video. Further evaluation comparing RAPID with the baseline proposed by the competition organizers demonstrated its superior effectiveness, highlighting the strength and robustness of our approach.

* Under review at SoICT'24

Automated Body Composition Analysis Using DAFS Express on 2D MRI Slices at L3 Vertebral Level

Sep 11, 2024

Varun Akella, Razeyeh Bagherinasab, Jia Ming Li, Long Nguyen, Vincent Tze Yang Chow, Hyunwoo Lee, Karteek Popuri, Mirza Faisal Beg

Abstract:Body composition analysis is vital in assessing health conditions such as obesity, sarcopenia, and metabolic syndromes. MRI provides detailed images of skeletal muscle (SKM), visceral adipose tissue (VAT), and subcutaneous adipose tissue (SAT), but their manual segmentation is labor-intensive and limits clinical applicability. This study validates an automated tool for MRI-based 2D body composition analysis, Data Analysis Facilitation Suite (DAFS) Express, by comparing its automated measurements with expert manual segmentations using UK Biobank data. A cohort of 399 participants from the UK Biobank dataset was selected, yielding 423 single L3 slices for analysis. DAFS Express performed automated segmentations of SKM, VAT, and SAT, which were then manually corrected by expert raters for validation. Evaluation metrics included Jaccard coefficients, Dice scores, Intraclass Correlation Coefficients (ICCs), and Bland-Altman plots to assess segmentation agreement and reliability. High agreement was observed between automated and manual segmentations, with mean Jaccard scores of SKM 99.03%, VAT 95.25%, and SAT 99.57%, and mean Dice scores of SKM 99.51%, VAT 97.41%, and SAT 99.78%. Cross-sectional area comparisons showed consistent measurements, with automated methods closely matching manual measurements for SKM and SAT and giving slightly higher values for VAT (SKM: Auto 132.51 cm^2, Manual 132.36 cm^2; VAT: Auto 137.07 cm^2, Manual 134.46 cm^2; SAT: Auto 203.39 cm^2, Manual 202.85 cm^2). ICCs confirmed strong reliability (SKM: 0.998, VAT: 0.994, SAT: 0.994). Bland-Altman plots revealed minimal biases, and boxplots illustrated distribution similarities across SKM, VAT, and SAT areas. On average, DAFS Express took 18 seconds per DICOM. This underscores its potential to streamline image analysis processes in research and clinical settings, enhancing diagnostic accuracy and efficiency.
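
The two overlap metrics named in this abstract can be illustrated with a short sketch. It assumes each segmentation mask is represented as a set of pixel coordinates, which is a simplification chosen for the example and not a description of how DAFS Express stores its output; the function names `jaccard` and `dice` are likewise hypothetical.

```python
def jaccard(auto_mask, manual_mask):
    """Jaccard coefficient: |A intersect B| / |A union B|."""
    a, b = set(auto_mask), set(manual_mask)
    union = a | b
    # Two empty masks are treated as a perfect match (an assumption).
    return len(a & b) / len(union) if union else 1.0


def dice(auto_mask, manual_mask):
    """Dice score: 2 * |A intersect B| / (|A| + |B|)."""
    a, b = set(auto_mask), set(manual_mask)
    total = len(a) + len(b)
    # Two empty masks are treated as a perfect match (an assumption).
    return 2 * len(a & b) / total if total else 1.0


# Toy masks: each tuple is a (row, column) pixel labelled as tissue.
auto = {(0, 0), (0, 1), (1, 0)}
manual = {(0, 0), (0, 1), (1, 1)}
j = jaccard(auto, manual)
d = dice(auto, manual)
```

For a single pair of masks the two metrics are related by Dice = 2J / (1 + J). That identity does not carry over exactly to means taken across many slices, so averaged Dice and Jaccard values need not satisfy it.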

Public Health in Disaster: Emotional Health and Life Incidents Extraction during Hurricane Harvey

Aug 20, 2024

Thomas Hoang, Quynh Anh Nguyen, Long Nguyen

Abstract:Countless disasters have resulted from climate change, causing severe damage to infrastructure and the economy. These disasters have significant societal impacts, necessitating mental health services for the millions affected. To prepare for and respond effectively to such events, it is important to understand people's emotions and the life incidents they experience before and after a disaster strikes. In this case study, we collected a dataset of approximately 400,000 public tweets related to Hurricane Harvey. Using a BERT-based model, we predicted the emotions associated with each tweet. To efficiently identify the topics discussed in these tweets, we utilized the Latent Dirichlet Allocation (LDA) technique for topic modeling, which allowed us to bypass manual content analysis and extract meaningful patterns from the data. However, rather than stopping at topic identification like previous methods \cite{math11244910}, we further refined our analysis by integrating Graph Neural Networks (GNN) and Large Language Models (LLM). The GNN was employed to generate embeddings and construct a similarity graph of the tweets, which was then used to optimize clustering. Subsequently, we used an LLM to automatically generate descriptive names for each event cluster, offering critical insights for disaster preparedness and response strategies.

Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT

Apr 14, 2024

Tuan Bui, Oanh Tran, Phuong Nguyen, Bao Ho, Long Nguyen, Thang Bui, Tho Quan

Abstract:In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic. LLMs find applications in various fields and contribute significantly. Despite their powerful language capabilities, LLMs, like pre-trained language models (PLMs), still face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations. To overcome these limitations, some researchers have proposed Retrieval-Augmented Generation (RAG) techniques, while others have proposed integrating LLMs with Knowledge Graphs (KGs) to provide factual context, thereby improving performance and delivering more accurate feedback to user queries. Education plays a crucial role in human development and progress. With technological transformation, traditional education is being replaced by digital or blended education, so educational data in the digital environment is increasing day by day. Data in higher education institutions are diverse, comprising various sources such as unstructured/structured text, relational databases, web/app-based API access, etc. Constructing a Knowledge Graph from these cross-data sources is not a simple task. This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources and discusses some initial applications (experimental trials) of KGs in conjunction with LLMs for question-answering tasks.

* 8 pages, 7 figures

A Deep Learning-Based System for Automatic Case Summarization

Dec 13, 2023

Minh Duong, Long Nguyen, Yen Vuong, Trong Le, Ha-Thanh Nguyen

Abstract:This paper presents a deep learning-based system for efficient automatic case summarization. Leveraging state-of-the-art natural language processing techniques, the system offers both supervised and unsupervised methods to generate concise and relevant summaries of lengthy legal case documents. The user-friendly interface allows users to browse the system's database of legal case documents, select their desired case, and choose their preferred summarization method. The system generates comprehensive summaries for each subsection of the legal text as well as an overall summary. This demo streamlines legal case document analysis, potentially benefiting legal professionals by reducing workload and increasing efficiency. Future work will focus on refining summarization techniques and exploring the application of our methods to other types of legal texts.

Classifying Whole Slide Images: What Matters?

Oct 05, 2023

Long Nguyen, Aiden Nibali, Joshua Millward, Zhen He

Abstract:Recently there have been many algorithms proposed for the classification of very high resolution whole slide images (WSIs). These new algorithms are mostly focused on finding novel ways to combine the information from small local patches extracted from the slide, with an emphasis on effectively aggregating more global information for the final predictor. In this paper we thoroughly explore different key design choices for WSI classification algorithms to investigate what matters most for achieving high accuracy. Surprisingly, we found that capturing global context information does not necessarily mean better performance. A model that captures the most global information consistently performs worse than a model that captures less global information. In addition, a very simple multi-instance learning method that captures no global information performs almost as well as models that capture a lot of global information. These results suggest that the most important features for effective WSI classification are captured at the local small patch level, where cell and tissue micro-environment detail is most pronounced. Another surprising finding was that unsupervised pre-training on a larger set of 33 cancers gives significantly worse performance compared to pre-training on a smaller dataset of 7 cancers (including the target cancer). We posit that pre-training on a smaller, more focused dataset allows the feature extractor to make better use of the limited feature space to better discriminate between subtle differences in the input patch.

Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages

Apr 02, 2023

Viet H. Pham, Thang M. Pham, Giang Nguyen, Long Nguyen, Dien Dinh

Abstract:The advent of deep learning has led to significant gains in machine translation. However, most studies require a large parallel dataset, which is scarce, expensive to construct, and even unavailable for some languages. This paper presents a simple yet effective method to tackle this problem for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner. Specifically, our approach combines the cross-entropy loss for supervised learning with KL divergence for unsupervised learning, given pseudo and augmented target sentences derived from the model. We also introduce a SentenceBERT-based filter to enhance the quality of the augmented data by retaining semantically similar sentence pairs. Experimental results show that our approach significantly improves NMT baselines, especially on low-resource datasets, with gains of 0.46 to 2.03 BLEU. We also demonstrate that using unsupervised training for augmented data is more efficient than reusing the ground-truth target sentences for supervised learning.

* TMP and GN contributed equally
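
As a rough illustration of the kind of combined objective this abstract describes, the sketch below adds a supervised cross-entropy term to a KL-divergence consistency term for a single token position. The abstract does not give the exact formulation, so the direction of the KL term, the `weight` coefficient, and all function names here are assumptions for illustration, not the authors' implementation.

```python
import math


def cross_entropy(probs, target_index):
    """Supervised term: negative log-probability of the reference token."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; assumes q[i] > 0 where p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(sup_probs, target_index, pseudo_probs, aug_probs, weight=1.0):
    """Cross-entropy on labelled data plus a weighted consistency term.

    sup_probs: model distribution for a labelled example.
    pseudo_probs, aug_probs: model distributions for a pseudo target and its
    augmented counterpart (hypothetical inputs for this sketch).
    weight: assumed balancing coefficient, not taken from the paper.
    """
    supervised = cross_entropy(sup_probs, target_index)
    consistency = kl_divergence(pseudo_probs, aug_probs)
    return supervised + weight * consistency
```

When the two unsupervised distributions agree, the consistency term vanishes and only the supervised cross-entropy remains.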

A Multi-variable Stacked Long-Short Term Memory Network for Wind Speed Forecasting

Nov 24, 2018

Sisheng Liang, Long Nguyen, Fang Jin

Abstract:Precisely forecasting wind speed is essential for wind power producers and grid operators. However, this task is challenging due to the stochasticity of wind speed. To accurately predict short-term wind speed under uncertainty, this paper proposes a multi-variable stacked LSTM model (MSLSTM). The proposed method utilizes multiple historical meteorological variables, such as wind speed, temperature, humidity, pressure, dew point, and solar radiation, to accurately predict wind speeds. The prediction performance is extensively assessed using real data collected in West Texas, USA. The experimental results show that the proposed MSLSTM can capture and learn uncertainties well while delivering competitive performance.
