Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ha-Thanh Nguyen

BIS Reasoning 1.0: The First Large-Scale Japanese Benchmark for Belief-Inconsistent Syllogistic Reasoning

Jun 08, 2025

Ha-Thanh Nguyen, Chaoran Liu, Hirokazu Kiyomaru, Koichi Takeda, Yusuke Miyao, Maki Matsuda, Yusuke Oda, Pontus Stenetorp, Qianying Liu, Su Myat Noe(+3 more)

Abstract:We present BIS Reasoning 1.0, the first large-scale Japanese dataset of syllogistic reasoning problems explicitly designed to evaluate belief-inconsistent reasoning in large language models (LLMs). Unlike prior datasets such as NeuBAROCO and JFLD, which focus on general or belief-aligned reasoning, BIS Reasoning 1.0 introduces logically valid yet belief-inconsistent syllogisms to uncover reasoning biases in LLMs trained on human-aligned corpora. We benchmark state-of-the-art models - including GPT models, Claude models, and leading Japanese LLMs - revealing significant variance in performance, with GPT-4o achieving 79.54% accuracy. Our analysis identifies critical weaknesses in current LLMs when handling logically valid but belief-conflicting inputs. These findings have important implications for deploying LLMs in high-stakes domains such as law, healthcare, and scientific literature, where truth must override intuitive belief to ensure integrity and safety.

Via

Access Paper or Ask Questions

Exploiting LLMs' Reasoning Capability to Infer Implicit Concepts in Legal Information Retrieval

Oct 16, 2024

Hai-Long Nguyen, Tan-Minh Nguyen, Duc-Minh Nguyen, Thi-Hai-Yen Vuong, Ha-Thanh Nguyen, Xuan-Hieu Phan

Abstract:Statutory law retrieval is a typical problem in legal language processing, that has various practical applications in law engineering. Modern deep learning-based retrieval methods have achieved significant results for this problem. However, retrieval systems relying on semantic and lexical correlations often exhibit limitations, particularly when handling queries that involve real-life scenarios, or use the vocabulary that is not specific to the legal domain. In this work, we focus on overcoming this weaknesses by utilizing the logical reasoning capabilities of large language models (LLMs) to identify relevant legal terms and facts related to the situation mentioned in the query. The proposed retrieval system integrates additional information from the term--based expansion and query reformulation to improve the retrieval accuracy. The experiments on COLIEE 2022 and COLIEE 2023 datasets show that extra knowledge from LLMs helps to improve the retrieval result of both lexical and semantic ranking models. The final ensemble retrieval system outperformed the highest results among all participating teams in the COLIEE 2022 and 2023 competitions.

* Presented at NeLaMKRR@KR, 2024 (arXiv:2410.05339)

Via

Access Paper or Ask Questions

Detection of ransomware attacks using federated learning based on the CNN model

May 01, 2024

Hong-Nhung Nguyen, Ha-Thanh Nguyen, Damien Lescos

Figure 1 for Detection of ransomware attacks using federated learning based on the CNN model

Figure 2 for Detection of ransomware attacks using federated learning based on the CNN model

Figure 3 for Detection of ransomware attacks using federated learning based on the CNN model

Figure 4 for Detection of ransomware attacks using federated learning based on the CNN model

Abstract:Computing is still under a significant threat from ransomware, which necessitates prompt action to prevent it. Ransomware attacks can have a negative impact on how smart grids, particularly digital substations. In addition to examining a ransomware detection method using artificial intelligence (AI), this paper offers a ransomware attack modeling technique that targets the disrupted operation of a digital substation. The first, binary data is transformed into image data and fed into the convolution neural network model using federated learning. The experimental findings demonstrate that the suggested technique detects ransomware with a high accuracy rate.

Via

Access Paper or Ask Questions

Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models

Mar 26, 2024

Hai-Long Nguyen, Duc-Minh Nguyen, Tan-Minh Nguyen, Ha-Thanh Nguyen, Thi-Hai-Yen Vuong, Ken Satoh

Abstract:Large language models with billions of parameters, such as GPT-3.5, GPT-4, and LLaMA, are increasingly prevalent. Numerous studies have explored effective prompting techniques to harness the power of these LLMs for various research problems. Retrieval, specifically in the legal data domain, poses a challenging task for the direct application of Prompting techniques due to the large number and substantial length of legal articles. This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking. Experiments on the COLIEE 2023 dataset demonstrate that integrating prompting techniques on LLMs into the retrieval system significantly improves retrieval accuracy. However, error analysis reveals several existing issues in the retrieval system that still need resolution.

* JURISIN 2024

Via

Access Paper or Ask Questions

GPTs and Language Barrier: A Cross-Lingual Legal QA Examination

Mar 26, 2024

Ha-Thanh Nguyen, Hiroaki Yamada, Ken Satoh

Abstract:In this paper, we explore the application of Generative Pre-trained Transformers (GPTs) in cross-lingual legal Question-Answering (QA) systems using the COLIEE Task 4 dataset. In the COLIEE Task 4, given a statement and a set of related legal articles that serve as context, the objective is to determine whether the statement is legally valid, i.e., if it can be inferred from the provided contextual articles or not, which is also known as an entailment task. By benchmarking four different combinations of English and Japanese prompts and data, we provide valuable insights into GPTs' performance in multilingual legal QA scenarios, contributing to the development of more efficient and accurate cross-lingual QA solutions in the legal domain.

* NLP 2024, Kobe, Japan

Via

Access Paper or Ask Questions

VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Mar 06, 2024

Vu Tran, Ha-Thanh Nguyen, Trung Vo, Son T. Luu, Hoang-Anh Dang, Ngoc-Cam Le, Thi-Thuy Le, Minh-Tien Nguyen, Truong-Son Nguyen, Le-Minh Nguyen

Figure 1 for VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Figure 2 for VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Figure 3 for VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Figure 4 for VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

Abstract:In this new era of rapid AI development, especially in language processing, the demand for AI in the legal domain is increasingly critical. In the context where research in other languages such as English, Japanese, and Chinese has been well-established, we introduce the first fundamental research for the Vietnamese language in the legal domain: legal textual entailment recognition through the Vietnamese Language and Speech Processing workshop. In analyzing participants' results, we discuss certain linguistic aspects critical in the legal domain that pose challenges that need to be addressed.

Via

Access Paper or Ask Questions

Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding

Mar 02, 2024

Ha-Thanh Nguyen, Ken Satoh

Abstract:Finetuning approaches in NLP often focus on exploitation rather than exploration, which may lead to suboptimal models. Given the vast search space of natural language, this limited exploration can restrict their performance in complex, high-stakes domains, where accurate negation understanding and logical reasoning abilities are crucial. To address this issue, we leverage Reinforcement Learning from Logical Feedback (RLLF) to create an effective balance between exploration and exploitation in LLMs. Our approach employs an appropriate benchmark dataset for training and evaluation, highlighting the importance of exploration in enhancing negation understanding capabilities. We compare the performance of our RLLF-enhanced LLMs with baseline models trained without RLLF, demonstrating the value of this balanced approach. Furthermore, we showcase the potential of our method in legal AI applications by employing transfer learning and evaluating its impact on negation understanding. Our experimental results exhibit the effectiveness of balancing exploration and exploitation with RLLF in improving LLMs' negation capabilities. This has implications for the development of more accurate, reliable, and logically consistent language models in high-stakes domains.

* JURISIN 2024

Via

Access Paper or Ask Questions

A Deep Learning-Based System for Automatic Case Summarization

Dec 13, 2023

Minh Duong, Long Nguyen, Yen Vuong, Trong Le, Ha-Thanh Nguyen

Abstract:This paper presents a deep learning-based system for efficient automatic case summarization. Leveraging state-of-the-art natural language processing techniques, the system offers both supervised and unsupervised methods to generate concise and relevant summaries of lengthy legal case documents. The user-friendly interface allows users to browse the system's database of legal case documents, select their desired case, and choose their preferred summarization method. The system generates comprehensive summaries for each subsection of the legal text as well as an overall summary. This demo streamlines legal case document analysis, potentially benefiting legal professionals by reducing workload and increasing efficiency. Future work will focus on refining summarization techniques and exploring the application of our methods to other types of legal texts.

Via

Access Paper or Ask Questions

Enhancing Logical Reasoning in Large Language Models to Facilitate Legal Applications

Nov 22, 2023

Ha-Thanh Nguyen, Wachara Fungwacharakorn, Ken Satoh

Abstract:Language serves as a vehicle for conveying thought, enabling communication among individuals. The ability to distinguish between diverse concepts, identify fairness and injustice, and comprehend a range of legal notions fundamentally relies on logical reasoning. Large Language Models (LLMs) attempt to emulate human language understanding and generation, but their competency in logical reasoning remains limited. This paper seeks to address the philosophical question: How can we effectively teach logical reasoning to LLMs while maintaining a deep understanding of the intricate relationship between language and logic? By focusing on bolstering LLMs' capabilities in logical reasoning, we aim to expand their applicability in law and other logic-intensive disciplines. To this end, we propose a Reinforcement Learning from Logical Feedback (RLLF) approach, which serves as a potential framework for refining LLMs' reasoning capacities. Through RLLF and a revised evaluation methodology, we explore new avenues for research in this domain and contribute to the development of LLMs capable of handling complex legal reasoning tasks while acknowledging the fundamental connection between language and logic.

* ALP@JURIX2023

Via

Access Paper or Ask Questions

RMDM: A Multilabel Fakenews Dataset for Vietnamese Evidence Verification

Sep 16, 2023

Hai-Long Nguyen, Thi-Kieu-Trang Pham, Thai-Son Le, Tan-Minh Nguyen, Thi-Hai-Yen Vuong, Ha-Thanh Nguyen

Abstract:In this study, we present a novel and challenging multilabel Vietnamese dataset (RMDM) designed to assess the performance of large language models (LLMs), in verifying electronic information related to legal contexts, focusing on fake news as potential input for electronic evidence. The RMDM dataset comprises four labels: real, mis, dis, and mal, representing real information, misinformation, disinformation, and mal-information, respectively. By including these diverse labels, RMDM captures the complexities of differing fake news categories and offers insights into the abilities of different language models to handle various types of information that could be part of electronic evidence. The dataset consists of a total of 1,556 samples, with 389 samples for each label. Preliminary tests on the dataset using GPT-based and BERT-based models reveal variations in the models' performance across different labels, indicating that the dataset effectively challenges the ability of various language models to verify the authenticity of such information. Our findings suggest that verifying electronic information related to legal contexts, including fake news, remains a difficult problem for language models, warranting further attention from the research community to advance toward more reliable AI models for potential legal applications.

* ISAILD@KSE 2023

Via

Access Paper or Ask Questions