Abstract:Recent research in computational linguistics has developed algorithms which associate matrices with adjectives and verbs, based on the distribution of words in a corpus of text. These matrices are linear operators on a vector space of context words. They are used to construct the meaning of composite expressions from that of the elementary constituents, forming part of a compositional distributional approach to semantics. We propose a Matrix Theory approach to this data, based on permutation symmetry along with Gaussian weights and their perturbations. A simple Gaussian model is tested against word matrices created from a large corpus of text. We characterize the cubic and quartic departures from the model, which we propose, alongside the Gaussian parameters, as signatures for comparison of linguistic corpora. We propose that perturbed Gaussian models with permutation symmetry provide a promising framework for characterizing the nature of universality in the statistical properties of word matrices. The matrix theory framework developed here exploits the view of statistics as zero dimensional perturbative quantum field theory. It perceives language as a physical system realizing a universality class of matrix statistics characterized by permutation symmetry.
Abstract:This volume contains the Proceedings of the 2016 Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science (SLPCS 2016), which was held on the 11th of June at the University of Strathclyde, Glasgow, and was co-located with Quantum Physics and Logic (QPL 2016). Exploiting the common ground provided by the concept of a vector space, the workshop brought together researchers working at the intersection of Natural Language Processing (NLP), cognitive science, and physics, offering them an appropriate forum for presenting their uniquely motivated work and ideas. The interplay between these three disciplines inspired theoretically motivated approaches to the understanding of how word meanings interact with each other in sentences and discourse, how diagrammatic reasoning depicts and simplifies this interaction, how language models are determined by input from the world, and how word and sentence meanings interact logically. This first edition of the workshop consisted of three invited talks from distinguished speakers (Hans Briegel, Peter G\"ardenfors, Dominic Widdows) and eight presentations of selected contributed papers. Each submission was refereed by at least three members of the Programme Committee, who delivered detailed and insightful comments and suggestions.