Abstract: The world's languages exhibit certain so-called typological or implicational universals; for example, languages with Subject-Object-Verb (SOV) word order typically employ postpositions. Explaining the source of such biases is a key goal in linguistics. We study word-order universals through a computational simulation with language models (LMs). Our experiments show that typologically typical word orders tend to have lower perplexity as estimated by LMs with cognitively plausible biases: syntactic biases, specific parsing strategies, and memory limitations. This suggests that the interplay of these cognitive biases and predictability (perplexity) can explain many aspects of word-order universals. It also showcases the advantage of cognitively motivated LMs, which are typically employed in cognitive modeling, for the computational simulation of language universals.
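A minimal sketch of the kind of perplexity comparison described above, assuming an off-the-shelf causal LM (GPT-2 here stands in as a placeholder; the paper instead uses LMs with cognitively plausible biases, and the toy sentences below are purely illustrative):

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical example: compare perplexity of the same content under two word orders.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(sentence: str) -> float:
    """Mean token-level perplexity of a single sentence under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy over predicted tokens
    return math.exp(loss.item())

# Toy stand-ins for corpora rendered in different word orders (illustrative only).
svo_sentence = "the dog chased the cat"
sov_sentence = "the dog the cat chased"
print(perplexity(svo_sentence), perplexity(sov_sentence))
```

In the actual simulation one would train separate LMs on artificial corpora differing only in word order and compare held-out perplexities, rather than scoring single sentences with a pretrained English model.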
Abstract: As a sub-discipline of evolutionary and computational linguistics, emergent communication (EC) studies communication protocols, called emergent languages, that arise in simulations where agents communicate. A key goal of EC is to give rise to languages that share statistical properties with natural languages. In this paper, we reinterpret Lewis's signaling game, a frequently used setting in EC, as a beta-VAE and reformulate its objective function as an ELBO. Consequently, we clarify the existence of prior distributions over emergent languages and show that the choice of prior can influence their statistical properties. Specifically, we address the properties of word lengths and segmentation, known as Zipf's law of abbreviation (ZLA) and Harris's articulation scheme (HAS), respectively. It has been reported that emergent languages do not follow these laws when the conventional objective is used. We experimentally demonstrate that more natural segments emerge when an appropriate prior distribution is selected, while the conventional one appears to prevent the languages from following ZLA and HAS.
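A schematic form of the objective being referred to, written as a beta-VAE ELBO (the notation is generic rather than the paper's own: x is the speaker's input, m is the message treated as the latent variable, Q_spk is the speaker/encoder, P_lis is the listener/decoder, and P(m) is the prior over messages whose choice the abstract discusses):

```latex
\mathcal{L}(x) \;=\;
  \mathbb{E}_{m \sim Q_{\mathrm{spk}}(m \mid x)}
    \bigl[\log P_{\mathrm{lis}}(x \mid m)\bigr]
  \;-\; \beta \, D_{\mathrm{KL}}\!\bigl(Q_{\mathrm{spk}}(m \mid x) \,\big\|\, P(m)\bigr)
```

Maximizing the first term rewards successful reconstruction of the speaker's input from the message, while the KL term pulls the distribution over messages toward the prior, which is how the prior can shape word lengths and segmentation.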
Abstract: To develop computational agents that communicate more effectively using their own emergent language, we endow the agents with the ability to focus their attention on particular concepts in the environment. Humans often understand an object or scene as a composite of concepts, and those concepts are in turn mapped onto words. We implement this intuition as cross-modal attention mechanisms in the Speaker and Listener agents of a referential game and show that attention leads to more compositional and interpretable emergent language. We also demonstrate how attention aids in understanding the learned communication protocol by investigating the attention weights associated with each message symbol and the alignment of attention weights between the Speaker and Listener agents. Overall, our results suggest that attention is a promising mechanism for developing more human-like emergent language.
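A minimal sketch of one way cross-modal attention between message symbols and concept features could be wired up (the dot-product scoring, dimensions, and function name are illustrative assumptions, not the paper's exact architecture):

```python
import torch
import torch.nn.functional as F

def cross_modal_attention(symbol_embeds, concept_feats):
    """
    symbol_embeds: (msg_len, d) embeddings of message symbols (queries)
    concept_feats: (n_concepts, d) features of candidate concepts/objects (keys/values)
    Returns the attended features per symbol and the attention weights,
    which can be inspected per message symbol as described in the abstract.
    """
    d = symbol_embeds.size(-1)
    scores = symbol_embeds @ concept_feats.t() / d ** 0.5  # (msg_len, n_concepts)
    weights = F.softmax(scores, dim=-1)                    # one distribution per symbol
    attended = weights @ concept_feats                     # (msg_len, d)
    return attended, weights

# Toy usage: a 3-symbol message attending over 5 concept vectors of dimension 8.
sym = torch.randn(3, 8)
con = torch.randn(5, 8)
attended, w = cross_modal_attention(sym, con)
print(w.shape)  # torch.Size([3, 5])
```

Comparing such weight matrices between the Speaker and Listener sides is one way to measure the alignment mentioned above.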
Abstract: A unique feature of Recurrent Neural Networks (RNNs) is that they process input sequences incrementally. In this research, we aim to uncover the inherent generalization properties, i.e., the inductive bias, of RNNs with respect to how frequently they switch their outputs across time steps in a sequence classification task, which we call the output sequence frequency. Previous work analyzed inductive bias by training models on small amounts of synthetic data and comparing the models' generalization with candidate generalization patterns. However, when examining the output sequence frequency, such methods cannot be applied directly, since enumerating candidate patterns is computationally difficult for longer sequences. We therefore propose to calculate the output sequence frequency of each model directly, by regarding the model's outputs as discrete-time signals and applying frequency-domain analysis. Experimental results showed that Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks have an inductive bias towards lower-frequency patterns, while the Elman RNN tends to learn patterns in which the output changes at high frequencies. We also found that the inductive bias of LSTMs and GRUs varies with the number of layers and the size of the hidden layers.
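A minimal sketch of the frequency-domain view described above, assuming the per-step binary decisions of a trained RNN have already been collected; the FFT-based dominant-frequency estimate here is an illustrative choice, not necessarily the paper's exact measure:

```python
import numpy as np

def dominant_output_frequency(outputs):
    """
    outputs: 1-D sequence of per-time-step binary class decisions (0/1).
    Treat the output sequence as a discrete-time signal and return the
    frequency (in cycles per time step) with the largest non-DC magnitude.
    """
    signal = np.asarray(outputs, dtype=float)
    signal = signal - signal.mean()          # remove the DC component
    spectrum = np.abs(np.fft.rfft(signal))   # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal))     # frequency of each bin
    return freqs[spectrum.argmax()]

# Toy usage: an output that flips every step vs. one that flips only once.
print(dominant_output_frequency([0, 1] * 16))           # high frequency (0.5)
print(dominant_output_frequency([0] * 16 + [1] * 16))   # low frequency (~0.03)
```

Applying such a measure to many trained models lets one characterize whether an architecture prefers slowly varying or rapidly switching output patterns without enumerating candidate generalizations.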