Shun Wang

Exploring Task Performance with Interpretable Models via Sparse Auto-Encoders

Jul 08, 2025

Shun Wang, Tyler Loakman, Youbo Lei, Yi Liu, Bohao Yang, Yuting Zhao, Dong Yang, Chenghua Lin

Abstract:Large Language Models (LLMs) are traditionally viewed as black-box algorithms, therefore reducing trustworthiness and obscuring potential approaches to increasing performance on downstream tasks. In this work, we apply an effective LLM decomposition method using a dictionary-learning approach with sparse autoencoders. This helps extract monosemantic features from polysemantic LLM neurons. Remarkably, our work identifies model-internal misunderstanding, allowing the automatic reformulation of the prompts with additional annotations to improve the interpretation by LLMs. Moreover, this approach demonstrates a significant performance improvement in downstream tasks, such as mathematical reasoning and metaphor detection.
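As a rough illustration of the dictionary-learning idea mentioned in this abstract, the sketch below shows a single forward pass and loss for a sparse autoencoder over one activation vector. It is a minimal pure-Python toy under stated assumptions, not the authors' implementation: the function names, the dimensions, the ReLU encoder, and the L1 coefficient are all hypothetical choices made for the example.

```python
import random

def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W @ v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    # Encode into an overcomplete, non-negative feature vector (ReLU),
    # then decode back into the original activation space.
    pre = matvec(W_enc, x)
    f = [max(0.0, p + b) for p, b in zip(pre, b_enc)]
    rec = matvec(W_dec, f)
    x_hat = [r + b for r, b in zip(rec, b_dec)]
    return f, x_hat

def sae_loss(x, f, x_hat, l1_coeff=1e-3):
    # Squared reconstruction error plus an L1 penalty that pushes
    # most feature activations towards zero (the sparsity pressure).
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(v) for v in f)
    return recon + l1_coeff * sparsity

random.seed(0)
d_model, d_dict = 4, 12  # hypothetical sizes; the dictionary is wider than the input
W_enc = [[random.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_dict)]
W_dec = [[random.gauss(0, 0.1) for _ in range(d_dict)] for _ in range(d_model)]
b_enc = [0.0] * d_dict
b_dec = [0.0] * d_model

x = [random.gauss(0, 1) for _ in range(d_model)]  # stand-in for one LLM activation vector
f, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, f, x_hat)
```

In a trained autoencoder of this kind, each entry of `f` would ideally correspond to one interpretable feature; here the weights are random, so the example only demonstrates the shapes and the objective.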

ZENN: A Thermodynamics-Inspired Computational Framework for Heterogeneous Data-Driven Modeling

May 14, 2025

Shun Wang, Shun-Li Shang, Zi-Kui Liu, Wenrui Hao

Abstract:Traditional entropy-based methods - such as cross-entropy loss in classification problems - have long been essential tools for quantifying uncertainty and disorder in data and developing artificial intelligence algorithms. However, the rapid growth of data across various domains has introduced new challenges, particularly the integration of heterogeneous datasets with intrinsic disparities. In this paper, we extend zentropy theory into the data science domain by introducing intrinsic entropy, enabling more effective learning from heterogeneous data sources. We propose a zentropy-enhanced neural network (ZENN) that simultaneously learns both energy and intrinsic entropy components, capturing the underlying structure of multi-source data. To support this, we redesign the neural network architecture to better reflect the intrinsic properties and variability inherent in diverse datasets. We demonstrate the effectiveness of ZENN on classification tasks and energy landscape reconstructions, showing its superior generalization capabilities and robustness, particularly in predicting high-order derivatives. As a practical application, we employ ZENN to reconstruct the Helmholtz energy landscape of Fe3Pt using data generated from DFT and capture key material behaviors, including negative thermal expansion and the critical point in the temperature-pressure space. Overall, our study introduces a novel approach for data-driven machine learning grounded in zentropy theory, highlighting ZENN as a versatile and robust deep learning framework for scientific problems involving complex, heterogeneous datasets.

* 9 pages, 4 figures

MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language

Jun 19, 2024

Shun Wang, Ge Zhang, Han Wu, Tyler Loakman, Wenhao Huang, Chenghua Lin

Abstract:Machine Translation (MT) has developed rapidly since the release of Large Language Models and current MT evaluation is performed through comparison with reference human translations or by predicting quality scores from human-labeled data. However, these mainstream evaluation methods mainly focus on fluency and factual reliability, whilst paying little attention to figurative quality. In this paper, we investigate the figurative quality of MT and propose a set of human evaluation metrics focused on the translation of figurative language. We additionally present a multilingual parallel metaphor corpus generated by post-editing. Our evaluation protocol is designed to estimate four aspects of MT: Metaphorical Equivalence, Emotion, Authenticity, and Quality. In doing so, we observe that translations of figurative expressions display different traits from literal ones.

Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers

Oct 24, 2023

Chen Tang, Shun Wang, Tomas Goldsack, Chenghua Lin

Abstract:Abstracts derived from biomedical literature possess distinct domain-specific characteristics, including specialised writing styles and biomedical terminologies, which necessitate a deep understanding of the related literature. As a result, existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts, given the absence of domain-specific background knowledge. This paper aims to enhance the performance of language models in biomedical abstractive summarisation by aggregating knowledge from external papers cited within the source article. We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers, allowing neural networks to generate summaries by leveraging both the paper content and relevant knowledge from citation papers. Furthermore, we construct and release a large-scale biomedical summarisation dataset that serves as a foundation for our research. Extensive experiments demonstrate that our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.

* EMNLP 2023

Metaphor Detection via Explicit Basic Meanings Modelling

May 26, 2023

Yucheng Li, Shun Wang, Chenghua Lin, Frank Guerin

Abstract:One noticeable trend in metaphor detection is the embrace of linguistic theories such as the metaphor identification procedure (MIP) for model architecture design. While MIP clearly defines that the metaphoricity of a lexical unit is determined based on the contrast between its contextual meaning and its basic meaning, existing work does not strictly follow this principle, typically using the aggregated meaning to approximate the basic meaning of target words. In this paper, we propose a novel metaphor detection method, which models the basic meaning of the word based on literal annotation from the training set, and then compares this with the contextual meaning in a target sentence to identify metaphors. Empirical results show that our method outperforms the state-of-the-art method significantly by 1.0% in F1 score. Moreover, our performance even reaches the theoretical upper bound on the VUA18 benchmark for targets with basic annotations, which demonstrates the importance of modelling basic meanings for metaphor detection.
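The MIP contrast described in this abstract can be illustrated with a toy sketch: average the vectors of a word's literal usages to stand in for its basic meaning, then compare that with the vector of the word in a target sentence. This is only an illustration under stated assumptions and not the paper's model, which is learned end to end; the three-dimensional vectors, the function names, and the similarity threshold are all hypothetical.

```python
import math

def mean_vector(vectors):
    # Element-wise average of equally sized vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector has zero norm.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def is_metaphor(contextual_vec, literal_vecs, threshold=0.5):
    # Basic meaning: mean of the word's literally annotated usages.
    # A low similarity to the contextual meaning suggests a metaphorical use.
    basic_vec = mean_vector(literal_vecs)
    return cosine(contextual_vec, basic_vec) < threshold

# Hypothetical embeddings of one target word in different sentences.
literal_uses = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]]
literal_context = [0.95, 0.15, 0.05]
figurative_context = [0.0, 0.2, 1.0]

literal_flag = is_metaphor(literal_context, literal_uses)
figurative_flag = is_metaphor(figurative_context, literal_uses)
```

With these made-up vectors, the usage that lies close to the literal average is not flagged, while the usage pointing in a different direction is.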

* ACL 2023

Metaphor Detection with Effective Context Denoising

Feb 11, 2023

Shun Wang, Yucheng Li, Chenghua Lin, Loïc Barrault, Frank Guerin

Abstract:We propose a novel RoBERTa-based model, RoPPT, which introduces a target-oriented parse tree structure in metaphor detection. Compared to existing models, RoPPT focuses on semantically relevant information and achieves the state-of-the-art on several main metaphor datasets. We also compare our approach against several popular denoising and pruning methods, demonstrating the effectiveness of our approach in context denoising. Our code and dataset can be found at https://github.com/MajiBear000/RoPPT

FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning

Feb 09, 2023

Yucheng Li, Shun Wang, Chenghua Lin, Frank Guerin, Loïc Barrault

Abstract:In this paper, we propose FrameBERT, a RoBERTa-based model that can explicitly learn and incorporate FrameNet Embeddings for concept-level metaphor detection. FrameBERT not only achieves better or comparable performance to the state-of-the-art, but is also more explainable and interpretable than existing models, owing to its ability to account for external knowledge from FrameNet.
