Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhenghao Zhou

Meaning Beyond Truth Conditions: Evaluating Discourse Level Understanding via Anaphora Accessibility

Feb 19, 2025

Xiaomeng Zhu, Zhenghao Zhou, Simon Charlow, Robert Frank

Abstract:We present a hierarchy of natural language understanding abilities and argue for the importance of moving beyond assessments of understanding at the lexical and sentence levels to the discourse level. We propose the task of anaphora accessibility as a diagnostic for assessing discourse understanding, and to this end, present an evaluation dataset inspired by theoretical research in dynamic semantics. We evaluate human and LLM performance on our dataset and find that LLMs and humans align on some tasks and diverge on others. Such divergence can be explained by LLMs' reliance on specific lexical items during language comprehension, in contrast to human sensitivity to structural abstractions.

Via

Access Paper or Ask Questions

A White-Box Deep-Learning Method for Electrical Energy System Modeling Based on Kolmogorov-Arnold Network

Sep 12, 2024

Zhenghao Zhou, Yiyan Li, Zelin Guo, Zheng Yan, Mo-Yuen Chow

Abstract:Deep learning methods have been widely used as an end-to-end modeling strategy of electrical energy systems because of their conveniency and powerful pattern recognition capability. However, due to the "black-box" nature, deep learning methods have long been blamed for their poor interpretability when modeling a physical system. In this paper, we introduce a novel neural network structure, Kolmogorov-Arnold Network (KAN), to achieve "white-box" modeling for electrical energy systems to enhance the interpretability. The most distinct feature of KAN lies in the learnable activation function together with the sparse training and symbolification process. Consequently, KAN can express the physical process with concise and explicit mathematical formulas while remaining the nonlinear-fitting capability of deep neural networks. Simulation results based on three electrical energy systems demonstrate the effectiveness of KAN in the aspects of interpretability, accuracy, robustness and generalization ability.

Via

Access Paper or Ask Questions

Unsupervised and Interpretable Synthesizing for Electrical Time Series Based on Information Maximizing Generative Adversarial Nets

Jul 18, 2024

Zhenghao Zhou, Yiyan Li, Runlong Liu, Zheng Yan, Mo-Yuen Chow

Figure 1 for Unsupervised and Interpretable Synthesizing for Electrical Time Series Based on Information Maximizing Generative Adversarial Nets

Figure 2 for Unsupervised and Interpretable Synthesizing for Electrical Time Series Based on Information Maximizing Generative Adversarial Nets

Figure 3 for Unsupervised and Interpretable Synthesizing for Electrical Time Series Based on Information Maximizing Generative Adversarial Nets

Figure 4 for Unsupervised and Interpretable Synthesizing for Electrical Time Series Based on Information Maximizing Generative Adversarial Nets

Abstract:Generating synthetic data has become a popular alternative solution to deal with the difficulties in accessing and sharing field measurement data in power systems. However, to make the generation results controllable, existing methods (e.g. Conditional Generative Adversarial Nets, cGAN) require labeled dataset to train the model, which is demanding in practice because many field measurement data lacks descriptive labels. In this paper, we introduce the Information Maximizing Generative Adversarial Nets (infoGAN) to achieve interpretable feature extraction and controllable synthetic data generation based on the unlabeled electrical time series dataset. Features with clear physical meanings can be automatically extracted by maximizing the mutual information between the input latent code and the classifier output of infoGAN. Then the extracted features are used to control the generation results similar to a vanilla cGAN framework. Case study is based on the time series datasets of power load and renewable energy output. Results demonstrate that infoGAN can extract both discrete and continuous features with clear physical meanings, as well as generating realistic synthetic time series that satisfy given features.

Via

Access Paper or Ask Questions

Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming

Jun 26, 2024

Zhenghao Zhou, Robert Frank, R. Thomas McCoy

Abstract:Large language models (LLMs) have shown the emergent capability of in-context learning (ICL). One line of research has explained ICL as functionally performing gradient descent. In this paper, we introduce a new way of diagnosing whether ICL is functionally equivalent to gradient-based learning. Our approach is based on the inverse frequency effect (IFE) -- a phenomenon in which an error-driven learner is expected to show larger updates when trained on infrequent examples than frequent ones. The IFE has previously been studied in psycholinguistics because humans show this effect in the context of structural priming (the tendency for people to produce sentence structures they have encountered recently); the IFE has been used as evidence that human structural priming must involve error-driven learning mechanisms. In our experiments, we simulated structural priming within ICL and found that LLMs display the IFE, with the effect being stronger in larger models. We conclude that ICL is indeed a type of gradient-based learning, supporting the hypothesis that a gradient component is implicitly computed in the forward pass during ICL. Our results suggest that both humans and LLMs make use of gradient-based, error-driven processing mechanisms.

Via

Access Paper or Ask Questions