Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zenan Zhai

RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises

Feb 18, 2025

Zenan Zhai, Hao Li, Xudong Han, Zhenxuan Zhang, Yixuan Zhang, Timothy Baldwin, Haonan Li

Abstract:Recent advances in large language models (LLMs) have shown that they can answer questions requiring complex reasoning. However, their ability to identify and respond to text containing logical fallacies or deliberately misleading premises remains less studied. To address this gap, we introduce RuozhiBench, a bilingual dataset comprising 677 carefully curated questions that contain various forms of deceptive reasoning, meticulously crafted through extensive human effort and expert review. In a comprehensive evaluation of 17 LLMs from 5 Series over RuozhiBench using both open-ended and two-choice formats, we conduct extensive analyses on evaluation protocols and result patterns. Despite their high scores on conventional benchmarks, these models showed limited ability to detect and reason correctly about logical fallacies, with even the best-performing model, Claude-3-haiku, achieving only 62% accuracy compared to the human of more than 90%.

Via

Access Paper or Ask Questions

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Dec 24, 2024

Haonan Li, Xudong Han, Zenan Zhai, Honglin Mu, Hao Wang, Zhenxuan Zhang, Yilin Geng, Shom Lin, Renxi Wang, Artem Shelmanov(+25 more)

Figure 1 for Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Figure 2 for Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Figure 3 for Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Figure 4 for Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Abstract:To address this gap, we introduce Libra-Leaderboard, a comprehensive framework designed to rank LLMs through a balanced evaluation of performance and safety. Combining a dynamic leaderboard with an interactive LLM arena, Libra-Leaderboard encourages the joint optimization of capability and safety. Unlike traditional approaches that average performance and safety metrics, Libra-Leaderboard uses a distance-to-optimal-score method to calculate the overall rankings. This approach incentivizes models to achieve a balance rather than excelling in one dimension at the expense of some other ones. In the first release, Libra-Leaderboard evaluates 26 mainstream LLMs from 14 leading organizations, identifying critical safety challenges even in state-of-the-art models.

Via

Access Paper or Ask Questions

Loki: An Open-Source Tool for Fact Verification

Oct 02, 2024

Haonan Li, Xudong Han, Hao Wang, Yuxia Wang, Minghan Wang, Rui Xing, Yilin Geng, Zenan Zhai, Preslav Nakov, Timothy Baldwin

Figure 1 for Loki: An Open-Source Tool for Fact Verification

Figure 2 for Loki: An Open-Source Tool for Fact Verification

Figure 3 for Loki: An Open-Source Tool for Fact Verification

Figure 4 for Loki: An Open-Source Tool for Fact Verification

Abstract:We introduce Loki, an open-source tool designed to address the growing problem of misinformation. Loki adopts a human-centered approach, striking a balance between the quality of fact-checking and the cost of human involvement. It decomposes the fact-checking task into a five-step pipeline: breaking down long texts into individual claims, assessing their check-worthiness, generating queries, retrieving evidence, and verifying the claims. Instead of fully automating the claim verification process, Loki provides essential information at each step to assist human judgment, especially for general users such as journalists and content moderators. Moreover, it has been optimized for latency, robustness, and cost efficiency at a commercially usable level. Loki is released under an MIT license and is available on GitHub. We also provide a video presenting the system and its capabilities.

Via

Access Paper or Ask Questions

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Mar 31, 2024

Lizhi Lin, Honglin Mu, Zenan Zhai, Minghan Wang, Yuxia Wang, Renxi Wang, Junjie Gao, Yixuan Zhang, Wanxiang Che, Timothy Baldwin(+2 more)

Figure 1 for Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Figure 2 for Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Figure 3 for Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Figure 4 for Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Abstract:Generative models are rapidly gaining popularity and being integrated into everyday applications, raising concerns over their safety issues as various vulnerabilities are exposed. Faced with the problem, the field of red teaming is experiencing fast-paced growth, which highlights the need for a comprehensive organization covering the entire pipeline and addressing emerging topics for the community. Our extensive survey, which examines over 120 papers, introduces a taxonomy of fine-grained attack strategies grounded in the inherent capabilities of language models. Additionally, we have developed the searcher framework that unifies various automatic red teaming approaches. Moreover, our survey covers novel areas including multimodal attacks and defenses, risks around multilingual models, overkill of harmless queries, and safety of downstream applications. We hope this survey can provide a systematic perspective on the field and unlock new areas of research.

Via

Access Paper or Ask Questions

A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Feb 19, 2024

Yuxia Wang, Zenan Zhai, Haonan Li, Xudong Han, Lizhi Lin, Zhenxuan Zhang, Jingru Zhao, Preslav Nakov, Timothy Baldwin

Figure 1 for A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Figure 2 for A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Figure 3 for A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Figure 4 for A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Abstract:Many studies have demonstrated that large language models (LLMs) can produce harmful responses, exposing users to unexpected risks when LLMs are deployed. Previous studies have proposed comprehensive taxonomies of the risks posed by LLMs, as well as corresponding prompts that can be used to examine the safety mechanisms of LLMs. However, the focus has been almost exclusively on English, and little has been explored for other languages. Here we aim to bridge this gap. We first introduce a dataset for the safety evaluation of Chinese LLMs, and then extend it to two other scenarios that can be used to better identify false negative and false positive examples in terms of risky prompt rejections. We further present a set of fine-grained safety assessment criteria for each risk type, facilitating both manual annotation and automatic evaluation in terms of LLM response harmfulness. Our experiments on five LLMs show that region-specific risks are the prevalent type of risk, presenting the major issue with all Chinese LLMs we experimented with. Warning: this paper contains example data that may be offensive, harmful, or biased.

Via

Access Paper or Ask Questions

COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

Aug 18, 2020

Karin Verspoor, Simon Šuster, Yulia Otmakhova, Shevon Mendis, Zenan Zhai, Biaoyan Fang, Jey Han Lau, Timothy Baldwin, Antonio Jimeno Yepes, David Martinez

Figure 1 for COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

Figure 2 for COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

Figure 3 for COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

Figure 4 for COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

Abstract:We present COVID-SEE, a system for medical literature discovery based on the concept of information exploration, which builds on several distinct text analysis and natural language processing methods to structure and organise information in publications, and augments search by providing a visual overview supporting exploration of a collection to identify key articles of interest. We developed this system over COVID-19 literature to help medical professionals and researchers explore the literature evidence, and improve findability of relevant information. COVID-SEE is available at http://covid-see.com.

* COVID-SEE is available at http://covid-see.com

Via

Access Paper or Ask Questions

Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

Jul 05, 2019

Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor

Figure 1 for Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

Figure 2 for Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

Figure 3 for Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

Figure 4 for Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

Abstract:Chemical patents are an important resource for chemical information. However, few chemical Named Entity Recognition (NER) systems have been evaluated on patent documents, due in part to their structural and linguistic complexity. In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo word representations for chemical patents. We compare word embeddings pre-trained on biomedical and chemical patent corpora. The effect of tokenizers optimized for the chemical domain on NER performance in chemical patents is also explored. The results on two patent corpora show that contextualized word representations generated from ELMo substantially improve chemical NER performance w.r.t. the current state-of-the-art. We also show that domain-specific resources such as word embeddings trained on chemical patents and chemical-specific tokenizers have a positive impact on NER performance.

Via

Access Paper or Ask Questions

A bag-of-concepts model improves relation extraction in a narrow knowledge domain with limited data

Apr 24, 2019

Jiyu Chen, Karin Verspoor, Zenan Zhai

Figure 1 for A bag-of-concepts model improves relation extraction in a narrow knowledge domain with limited data

Figure 2 for A bag-of-concepts model improves relation extraction in a narrow knowledge domain with limited data

Figure 3 for A bag-of-concepts model improves relation extraction in a narrow knowledge domain with limited data

Figure 4 for A bag-of-concepts model improves relation extraction in a narrow knowledge domain with limited data

Abstract:This paper focuses on a traditional relation extraction task in the context of limited annotated data and a narrow knowledge domain. We explore this task with a clinical corpus consisting of 200 breast cancer follow-up treatment letters in which 16 distinct types of relations are annotated. We experiment with an approach to extracting typed relations called window-bounded co-occurrence (WBC), which uses an adjustable context window around entity mentions of a relevant type, and compare its performance with a more typical intra-sentential co-occurrence baseline. We further introduce a new bag-of-concepts (BoC) approach to feature engineering based on the state-of-the-art word embeddings and word synonyms. We demonstrate the competitiveness of BoC by comparing with methods of higher complexity, and explore its effectiveness on this small dataset.

* In Proceedings of the Student Research Workshop at North American Association for Computational Linguistics (NAACL) 2019
* To appear in Proceedings of the Student Research Workshop at the North American Association for Computational Linguistics (NAACL) meeting 2019

Via

Access Paper or Ask Questions

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Aug 25, 2018

Zenan Zhai, Dat Quoc Nguyen, Karin Verspoor

Figure 1 for Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Figure 2 for Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Figure 3 for Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Figure 4 for Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Abstract:We compare the use of LSTM-based and CNN-based character-level word embeddings in BiLSTM-CRF models to approach chemical and disease named entity recognition (NER) tasks. Empirical results over the BioCreative V CDR corpus show that the use of either type of character-level word embeddings in conjunction with the BiLSTM-CRF models leads to comparable state-of-the-art performance. However, the models using CNN-based character-level word embeddings have a computational performance advantage, increasing training time over word-based models by 25% while the LSTM-based character-level word embeddings more than double the required training time.

* In Proceedings of the 9th International Workshop on Health Text Mining and Information Analysis (LOUHI 2018), to appear

Via

Access Paper or Ask Questions