Ji-rong Wen

UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models

Feb 22, 2024

Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Ji-rong Wen

Abstract: Large language models (LLMs) may generate text that lacks consistency with human knowledge, leading to factual inaccuracies or hallucination. Existing research for evaluating the factuality of LLMs involves extracting fact claims using an LLM and verifying them against a predefined fact source. However, these evaluation metrics are task-specific and not scalable, and the substitutability of fact sources across tasks is under-explored. To address these challenges, we categorize four available fact sources: human-written evidence, reference documents, search engine results, and LLM knowledge, along with five text generation tasks containing six representative datasets. Then, we propose UFO, an LLM-based unified and flexible evaluation framework to verify facts against plug-and-play fact sources. We implement five evaluation scenarios based on this framework. Experimental results show that for most QA tasks, human-written evidence and reference documents are crucial, and they can substitute for each other in retrieval-augmented QA tasks. In news fact generation tasks, search engine results and LLM knowledge are essential. Our dataset and code are available at https://github.com/WaldenRUC/UFO.

* under review
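
The abstract above describes extracting fact claims and verifying them against plug-and-play fact sources. The following is a minimal sketch of that general shape, not the UFO implementation: `extract_claims`, `make_corpus_source`, and `factuality_score` are hypothetical names, claim extraction is a sentence split standing in for an LLM, verification is plain word overlap, and scoring as the fraction of supported claims is an assumption.

```python
from typing import Callable, List, Optional, Sequence

# A fact source maps a claim to supporting evidence, or None if it has none.
# (Hypothetical interface; the paper's actual interface may differ.)
FactSource = Callable[[str], Optional[str]]


def make_corpus_source(passages: Sequence[str]) -> FactSource:
    """Build a toy fact source backed by a fixed list of passages."""
    def lookup(claim: str) -> Optional[str]:
        claim_words = set(claim.lower().split())
        for passage in passages:
            # Toy verification: every word of the claim appears in the passage.
            if claim_words <= set(passage.lower().split()):
                return passage
        return None
    return lookup


def extract_claims(text: str) -> List[str]:
    """Stand-in for LLM claim extraction: one claim per sentence."""
    return [s.strip() for s in text.split(".") if s.strip()]


def factuality_score(text: str, sources: Sequence[FactSource]) -> float:
    """Fraction of claims supported by at least one source, tried in order."""
    claims = extract_claims(text)
    if not claims:
        return 0.0
    supported = sum(
        1 for claim in claims
        if any(source(claim) is not None for source in sources)
    )
    return supported / len(claims)


# Two interchangeable sources, e.g. human-written evidence and reference documents.
human_evidence = make_corpus_source(["paris is the capital of france"])
reference_docs = make_corpus_source(["the moon orbits the earth"])

answer = "Paris is the capital of France. The moon orbits Mars."
score = factuality_score(answer, [human_evidence, reference_docs])
```

Because each source is just a callable, swapping or reordering sources only changes the list passed to `factuality_score`, which is one simple way to model the "plug-and-play" property.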

Temporal Embedding in Convolutional Neural Networks for Robust Learning of Abstract Snippets

Feb 18, 2015

Jiajun Liu, Kun Zhao, Brano Kusy, Ji-rong Wen, Raja Jurdak

Abstract: The prediction of periodical time-series remains challenging due to various types of data distortions and misalignments. Here, we propose a novel model called Temporal embedding-enhanced convolutional neural Network (TeNet) to learn repeatedly-occurring-yet-hidden structural elements in periodical time-series, called abstract snippets, for predicting future changes. Our model uses convolutional neural networks and embeds a time-series with its potential neighbors in the temporal domain for aligning it to the dominant patterns in the dataset. The model is robust to distortions and misalignments in the temporal domain and demonstrates strong prediction power for periodical time-series. We conduct extensive experiments and discover that the proposed model shows significant and consistent advantages over existing methods on a variety of data modalities ranging from human mobility to household power consumption records. Empirical results indicate that the model is robust to various factors such as number of samples, variance of data, numerical ranges of data, etc. The experiments also verify that the intuition behind the model can be generalized to multiple data types and applications and promises significant improvement in prediction performance across the datasets studied.

* a submission to KDD '15
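
The abstract above mentions embedding a time-series with its temporal neighbors before applying a convolution. The sketch below shows one possible reading of that idea and is not the TeNet architecture: it stacks a period with its adjacent periods as input channels and applies a single multi-channel 1D filter. The function names, the edge clamping, and the fixed kernel are all assumptions made for illustration.

```python
from typing import List

Series = List[float]


def embed_with_neighbors(periods: List[Series], index: int, k: int) -> List[Series]:
    """Stack period `index` with k neighbors on each side (indices clamped at the edges)."""
    last = len(periods) - 1
    return [periods[min(max(index + offset, 0), last)] for offset in range(-k, k + 1)]


def conv1d_valid(channels: List[Series], kernel: List[Series]) -> Series:
    """Multi-channel 1D cross-correlation without padding.

    `kernel` must have one row of weights per input channel.
    """
    width = len(kernel[0])
    length = len(channels[0])
    output = []
    for start in range(length - width + 1):
        total = 0.0
        for channel, weights in zip(channels, kernel):
            for w in range(width):
                total += channel[start + w] * weights[w]
        output.append(total)
    return output


# Three consecutive periods (e.g. three days) of a periodic signal.
periods = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]

# Embed the middle period with one neighbor on each side: 3 input channels.
stacked = embed_with_neighbors(periods, index=1, k=1)

# Hand-picked kernel that averages adjacent samples of the middle channel only;
# in a trained network these weights would be learned.
kernel = [[0.0, 0.0], [0.5, 0.5], [0.0, 0.0]]
features = conv1d_valid(stacked, kernel)
```

Feeding neighboring periods as extra channels lets a filter compare the same time-of-period position across adjacent periods, which is one plausible way a model could tolerate small temporal misalignments.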
