Abstract: We tackle the hard problem of consciousness, taking the naturally selected, self-organising, embodied organism as our starting point. We provide a mathematical formalism describing how biological systems self-organise to hierarchically interpret unlabelled sensory information according to valence and specific needs. Such interpretations imply behavioural policies which can only be differentiated from each other by the qualitative aspect of information processing. Selection pressures favour systems that can intervene in the world to achieve homeostatic and reproductive goals. Quality is a property arising in such systems, linking cause to affect in order to motivate real-world interventions. This produces a range of qualitative classifiers (interoceptive and exteroceptive) that motivate specific actions and determine priorities and preferences. Building upon the seminal distinction between access and phenomenal consciousness, our radical claim here is that phenomenal consciousness without access consciousness is likely very common, but the reverse is implausible. To put it provocatively: Nature does not like zombies. We formally describe the multilayered architecture of self-organisation from rocks to Einstein, illustrating how our argument applies in the real world. We claim that access consciousness at the human level is impossible without the ability to hierarchically model i) the self, ii) the world/others, and iii) the self as modelled by others. Phenomenal consciousness is therefore required for human-level functionality. Our proposal lays the foundations of a formal science of consciousness, deeply connected with natural selection rather than abstract thinking, and closer to human fact than zombie fiction.
Abstract: Biological intelligence uses a "multiscale competency architecture" (MCA): it exhibits adaptive, goal-directed behaviour at all scales, from cells to organs to organisms. In contrast, machine intelligence is adaptive and goal-directed only at a high level; learned policies are passively interpreted using abstractions (e.g. arithmetic) embodied in static interpreters (e.g. x86). Biological intelligence excels at causal learning; machine intelligence does not. Previous work showed that causal learning follows from weak policy optimisation, which is hindered by presupposed abstractions in silico. Here we formalise MCAs as nested "agentic abstraction layers" to understand how they might learn causes. We show that weak policy optimisation at low levels enables weak policy optimisation at high levels. This facilitates what we call "multiscale causal learning" and high-level goal-directed behaviour. We argue that by engineering human abstractions in silico we disconnect high-level goal-directed behaviour from the low-level goal-directed behaviour that gave rise to it. This inhibits causal learning, and we speculate it is one reason why human recall would be accompanied by feeling while in silico recall would not.
Abstract: Simplicity is held by many to be the key to general intelligence. Simpler models tend to "generalise", identifying the cause or generator of data with greater sample efficiency. The implications of the correlation between simplicity and generalisation extend far beyond computer science, addressing questions of physics and even biology. Yet simplicity is a property of form, while generalisation is a property of function. In interactive settings, any correlation between the two depends on interpretation; in theory there could be no correlation, and yet in practice there is. Previous theoretical work showed generalisation to be a consequence of "weak" constraints implied by function, not form. Experiments demonstrated that choosing weak constraints over simple forms yielded a 110-500% improvement in generalisation rate. Here we measure the complexity of weak constraints and show that, if one does not presuppose an abstraction layer, all have equal complexity. However, in the context of a spatially and temporally extended abstraction layer, efficiency demands that weak constraints take simple forms, and simplicity becomes correlated with generalisation. Simplicity has no causal influence on generalisation; it merely appears to, owing to confounding.
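A minimal sketch of the relativity claim, using a toy construction rather than the paper's formalism: two hypothetical interpreters (one listing permitted situations, one listing forbidden ones) assign very different minimal form lengths to the same constraint, so whether a weak constraint counts as "simple" is decided by the presupposed abstraction layer.

```python
# Toy illustration, not the paper's formalism: the length of a constraint's
# written form depends on the interpreter (the abstraction layer), while the
# constraint itself -- its extension, the set of situations it permits -- does not.
from itertools import product

DOMAIN = frozenset("".join(b) for b in product("01", repeat=3))  # all 3-bit strings

def interpret_include(program: str) -> frozenset:
    """Interpreter A: a program lists the permitted strings, 3 characters each."""
    return frozenset(program[i:i + 3] for i in range(0, len(program), 3))

def interpret_exclude(program: str) -> frozenset:
    """Interpreter B: a program lists the forbidden strings, 3 characters each."""
    return DOMAIN - interpret_include(program)

def shortest_form(extension: frozenset, interpreter) -> str:
    """Shortest program denoting `extension` under the given toy interpreter:
    listing each relevant string exactly once, in sorted order, is minimal here."""
    chunks = extension if interpreter is interpret_include else DOMAIN - extension
    program = "".join(sorted(chunks))
    assert interpreter(program) == extension  # sanity check: denotes the intended extension
    return program

weak_constraint = DOMAIN - {"111"}       # permits 7 of the 8 situations
strong_constraint = frozenset({"000"})   # permits only 1 situation

for name, ext in (("weak", weak_constraint), ("strong", strong_constraint)):
    a = shortest_form(ext, interpret_include)
    b = shortest_form(ext, interpret_exclude)
    print(f"{name:>6} constraint: {len(a):2d} chars under A, {len(b):2d} chars under B")

# The weak constraint takes the longer form under A but the shorter form under B,
# and vice versa for the strong one: simplicity of form is relative to the
# abstraction layer, not intrinsic to the constraint.
```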
Abstract: We integrate foundational theories of meaning with a mathematical formalism of artificial general intelligence (AGI) to offer a comprehensive mechanistic explanation of meaning, communication, and symbol emergence. This synthesis holds significance for both AGI and broader debates concerning the nature of language, as it unifies pragmatics, logical truth-conditional semantics, Peircean semiotics, and a computable model of enactive cognition, addressing phenomena that have traditionally evaded mechanistic explanation. By examining the conditions under which a machine can generate meaningful utterances or comprehend human meaning, we establish that the current generation of language models does not possess the same understanding of meaning as humans, nor does it intend any meaning that we might attribute to its responses. To address this, we propose simulating human feelings and optimising models to construct weak representations. Our findings shed light on the relationship between meaning and intelligence, and on how we can build machines that comprehend and intend meaning.
Abstract: To make accurate inferences in an interactive setting, an agent must not confuse passive observation of events with having participated in causing those events. The do operator formalises interventions so that we may reason about their effect. Yet there exist at least two Pareto optimal mathematical formalisms of general intelligence in an interactive setting which, presupposing no explicit representation of intervention, make maximally accurate inferences. We examine one such formalism. We show that in the absence of an operator, an intervention can still be represented by a variable. Furthermore, the need to explicitly represent interventions in advance arises only because we presuppose abstractions. The aforementioned formalism avoids this and so, initial conditions permitting, representations of relevant causal interventions will emerge through induction. These emergent abstractions function as representations of one's self and of any other object, inasmuch as the interventions of those objects impact the satisfaction of goals. We argue (with reference to theory of mind) that this explains how one might reason about one's own identity and intent, those of others, one's own as perceived by others, and so on. In a narrow sense this describes what it is to be aware, and it constitutes a mechanistic explanation of aspects of consciousness.
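A minimal sketch of how an intervention can be captured by a variable rather than an operator, assuming a toy confounded system of our own devising (the variables U, I, X, Y below are illustrative, not taken from the formalism): conditioning on an ordinary indicator of whether the agent set X reproduces the answer the do operator would give.

```python
# Sketch: an intervention represented by a plain random variable.
# U is a hidden common cause of X and Y; I marks whether the agent set X.
import random

random.seed(0)

def sample():
    u = random.randint(0, 1)          # hidden common cause of X and Y
    i = random.randint(0, 1)          # 1 = the agent intervenes on X
    x = 1 if i == 1 else u            # intervention overrides X's natural mechanism
    y = u                             # Y is caused by U alone, not by X
    return u, i, x, y

samples = [sample() for _ in range(100_000)]

def p_y1_given(pred):
    relevant = [y for (u, i, x, y) in samples if pred(u, i, x, y)]
    return sum(relevant) / len(relevant)

# Passive observation: among cases with I=0, X=1 implies U=1, hence Y=1.
print("P(Y=1 | X=1, I=0) ~", round(p_y1_given(lambda u, i, x, y: x == 1 and i == 0), 2))
# Intervention: among cases with I=1, X=1 reveals nothing about U.
print("P(Y=1 | X=1, I=1) ~", round(p_y1_given(lambda u, i, x, y: x == 1 and i == 1), 2))
# The second value (~0.5) equals P(Y=1 | do(X=1)); no operator was needed,
# only a variable marking whether the agent intervened.
```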
Abstract: Software's effect upon the world hinges upon the hardware that interprets it. This tends not to be an issue, because we standardise hardware. AI is typically conceived of as a software ``mind'' running on such interchangeable hardware. This formalises mind-body dualism, in that a software ``mind'' can be run on any number of standardised bodies. While this works well for simple applications, we argue that this approach is less than ideal for the purposes of formalising artificial general intelligence (AGI) or artificial super-intelligence (ASI). The general reinforcement learning agent AIXI is Pareto optimal. However, this claim regarding AIXI's performance is highly subjective, because that performance depends upon the choice of interpreter. We examine this problem and formulate an approach based upon enactive cognition and pancomputationalism to address the issue. Weakness is a measure of simplicity, a ``proxy for intelligence'' unrelated to compression. If hypotheses are evaluated in terms of weakness rather than length, we are able to make objective claims regarding performance. Subsequently, we propose objectively optimal notions of AGI and ASI such that the former is computable and the latter anytime computable (though impractical).
Abstract: If $A$ and $B$ are sets such that $A \subset B$, generalisation may be understood as the inference from $A$ of a hypothesis sufficient to construct $B$. One might infer any number of hypotheses from $A$, yet only some of those may generalise to $B$. How can one know which are likely to generalise? One strategy is to choose the shortest, equating the ability to compress information with the ability to generalise (a proxy for intelligence). We examine this in the context of a mathematical formalism of enactive cognition. We show that compression is neither necessary nor sufficient to maximise performance (measured in terms of the probability of a hypothesis generalising). We formulate a proxy unrelated to length or simplicity, called weakness. We show that if tasks are uniformly distributed, then there is no choice of proxy that performs at least as well as weakness maximisation in all tasks while performing strictly better in at least one. In other words, weakness maximisation is the Pareto optimal choice of proxy. In experiments comparing maximum weakness and minimum description length in the context of binary arithmetic, the former generalised at between $1.1$ and $5$ times the rate of the latter. We argue this demonstrates that weakness is a far better proxy, and explains why DeepMind's Apperception Engine is able to generalise effectively.
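A deliberately simplified Monte Carlo sketch of the counting intuition behind weakness maximisation, not the paper's formalism or its binary arithmetic experiments: here a hypothesis is just a subset $H$ of a finite domain containing the training data $A$, and it "generalises" if it contains a hidden target $B$ drawn uniformly from the supersets of $A$. The probability of generalising grows with the cardinality of $H$, irrespective of how briefly $H$ can be written down.

```python
# Toy counting argument: weaker (larger-extension) hypotheses generalise more often.
import random

random.seed(1)
DOMAIN = [format(i, "04b") for i in range(16)]   # sixteen 4-bit "situations"
A = set(random.sample(DOMAIN, 4))                # observed training data

def random_target():
    """Hidden target B: a uniformly random superset of A within the domain."""
    return A | {s for s in DOMAIN if s not in A and random.random() < 0.5}

def random_hypothesis(extra):
    """A hypothesis consistent with A, with `extra` additional situations."""
    return A | set(random.sample([s for s in DOMAIN if s not in A], extra))

TRIALS = 20_000
for extra in (2, 6, 10, 12):                     # increasing weakness |H|
    wins = 0
    for _ in range(TRIALS):
        H, B = random_hypothesis(extra), random_target()
        wins += B <= H                           # does H generalise to B?
    expected = 0.5 ** (len(DOMAIN) - len(A) - extra)   # analytic value 2^-(|D|-|H|)
    print(f"|H| = {len(A) + extra:2d}: observed {wins / TRIALS:.4f}, "
          f"predicted {expected:.4f}")
```

Under this uniform prior over targets, the weakest consistent hypothesis (the whole domain) is the Pareto optimal bet, mirroring the counting step in the paper's proof.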
Abstract: An artificial general intelligence (AGI), by one definition, is an agent that requires less information than any other to make an accurate prediction. It is arguable that the general reinforcement learning agent AIXI not only met this definition, but was the only mathematical formalism to do so. Though a significant result, AIXI was incomputable and its performance subjective. This paper proposes an alternative formalism of AGI which overcomes both problems. Formal proof of its performance is given, along with a simple implementation and experimental results that support these claims.
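One way to read "requires less information than any other", in hypothetical notation not taken from the paper: write $n_{\pi}(\mu)$ for the least number of interactions after which agent $\pi$'s predictions in environment $\mu$ are accurate. An agent $\pi^{*}$ then satisfies the definition if

$$ n_{\pi^{*}}(\mu) \;\le\; n_{\pi}(\mu) \quad \text{for every environment } \mu \text{ and every agent } \pi, $$

i.e. it is at least as sample efficient as any alternative on every environment in the class considered.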
Abstract: The following briefly discusses possible difficulties in communication with and control of an AGI (artificial general intelligence), building upon an explanation of The Fermi Paradox and preceding work on symbol emergence and artificial general intelligence. The latter suggests that, to infer what someone means, an agent constructs a rationale for the observed behaviour of others. Communication then requires that two agents labour under similar compulsions and have similar experiences (construct similar solutions to similar tasks). A non-human intelligence may construct solutions such that any rationale for its behaviour (and thus the meaning of its signals) is outside the scope of what a human is inclined to notice or comprehend. Further, the more compressed a signal, the closer it will appear to random noise. Another intelligence may possess the ability to compress information to the extent that, to us, its signals would appear indistinguishable from noise (an explanation for The Fermi Paradox). To facilitate predictive accuracy, an AGI would tend toward more compressed representations of the world, making any rationale for its behaviour more difficult to comprehend for the same reason. Communication with and control of an AGI may subsequently necessitate not only human-like compulsions and experiences, but imposed cognitive impairment.
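A quick empirical illustration of the compression-noise claim, using zlib and a crude byte-frequency entropy estimate (the message and parameters below are arbitrary choices, not drawn from the paper): once a highly regular message is compressed, its byte statistics approach those of uniformly random bytes, so without the shared decompressor it is statistically close to noise.

```python
# The more a signal is compressed, the closer its statistics get to random noise.
import math
import os
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Empirical Shannon entropy in bits per byte (maximum 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

message = ("the quick brown fox jumps over the lazy dog. " * 2000).encode()
compressed = zlib.compress(message, 9)      # maximum compression level
noise = os.urandom(len(compressed))         # truly random bytes of the same length

print(f"original:   {len(message):7d} bytes, {byte_entropy(message):.2f} bits/byte")
print(f"compressed: {len(compressed):7d} bytes, {byte_entropy(compressed):.2f} bits/byte")
print(f"random:     {len(noise):7d} bytes, {byte_entropy(noise):.2f} bits/byte")
# The compressed signal's per-byte entropy is close to that of random bytes;
# the better the compression, the less visible structure remains to an observer
# who lacks the rationale (the decompressor) behind the signal.
```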
Abstract: We attempt to define what is necessary to construct an Artificial Scientist, explore and evaluate several approaches to artificial general intelligence (AGI) which may facilitate this, conclude that a unified or hybrid approach is necessary, and explore two theories that satisfy this requirement to some degree.