Shu Yu

Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

May 26, 2025

Sirui Chen, Shuqin Ma, Shu Yu, Hanwang Zhang, Shengjie Zhao, Chaochao Lu

Abstract:Consciousness stands as one of the most profound and distinguishing features of the human mind, fundamentally shaping our understanding of existence and agency. As large language models (LLMs) develop at an unprecedented pace, questions concerning intelligence and consciousness have become increasingly significant. However, discourse on LLM consciousness remains largely unexplored territory. In this paper, we first clarify frequently conflated terminologies (e.g., LLM consciousness and LLM awareness). Then, we systematically organize and synthesize existing research on LLM consciousness from both theoretical and empirical perspectives. Furthermore, we highlight potential frontier risks that conscious LLMs might introduce. Finally, we discuss current challenges and outline future directions in this emerging field. The references discussed in this paper are organized at https://github.com/OpenCausaLab/Awesome-LLM-Consciousness.

ADAM: An Embodied Causal Agent in Open-World Environments

Oct 29, 2024

Shu Yu, Chaochao Lu

Abstract:In open-world environments like Minecraft, existing agents face challenges in continuously learning structured knowledge, particularly causality. These challenges stem from the opacity inherent in black-box models and an excessive reliance on prior knowledge during training, which impair their interpretability and generalization capability. To this end, we introduce ADAM, An emboDied causal Agent in Minecraft, that can autonomously navigate the open world, perceive multimodal contexts, learn causal world knowledge, and tackle complex tasks through lifelong learning. ADAM is empowered by four key components: 1) an interaction module, enabling the agent to execute actions while documenting the interaction processes; 2) a causal model module, tasked with constructing an ever-growing causal graph from scratch, which enhances interpretability and diminishes reliance on prior knowledge; 3) a controller module, comprising a planner, an actor, and a memory pool, which uses the learned causal graph to accomplish tasks; 4) a perception module, powered by multimodal large language models, which enables ADAM to perceive like a human player. Extensive experiments show that ADAM constructs an almost perfect causal graph from scratch, enabling efficient task decomposition and execution with strong interpretability. Notably, in our modified Minecraft games where no prior knowledge is available, ADAM maintains its performance and shows remarkable robustness and generalization capability. ADAM pioneers a novel paradigm that integrates causal methods and embodied agents in a synergistic manner. Our project page is at https://opencausalab.github.io/ADAM.

From Imitation to Introspection: Probing Self-Consciousness in Language Models

Oct 24, 2024

Sirui Chen, Shu Yu, Shengjie Zhao, Chaochao Lu

Abstract:Self-consciousness, the introspection of one's existence and thoughts, represents a high-level cognitive process. As language models advance at an unprecedented pace, a critical question arises: Are these models becoming self-conscious? Drawing upon insights from psychological and neural science, this work presents a practical definition of self-consciousness for language models and refines ten core concepts. Our work pioneers an investigation into self-consciousness in language models by, for the first time, leveraging causal structural games to establish the functional definitions of the ten core concepts. Based on our definitions, we conduct a comprehensive four-stage experiment: quantification (evaluation of ten leading models), representation (visualization of self-consciousness within the models), manipulation (modification of the models' representation), and acquisition (fine-tuning the models on core concepts). Our findings indicate that although models are in the early stages of developing self-consciousness, there is a discernible representation of certain concepts within their internal mechanisms. However, these representations of self-consciousness are hard to manipulate positively at the current stage, yet they can be acquired through targeted fine-tuning. Our datasets and code are at https://github.com/OpenCausaLab/SelfConsciousness.

An Adapter based Multi-label Pre-training for Speech Separation and Enhancement

Nov 11, 2022

Tianrui Wang, Xie Chen, Zhuo Chen, Shu Yu, Weibin Zhu

Abstract:In recent years, self-supervised learning (SSL) has achieved tremendous success in various speech tasks due to its power to extract representations from massive unlabeled data. However, compared with tasks such as speech recognition (ASR), the improvements from SSL representation in speech separation (SS) and enhancement (SE) are considerably smaller. Based on HuBERT, this work investigates improving the SSL model for SS and SE. We first update HuBERT's masked speech prediction (MSP) objective by integrating the separation and denoising terms, resulting in a multiple pseudo label pre-training scheme, which significantly improves HuBERT's performance on SS and SE but degrades the performance on ASR. To maintain its performance gain on ASR, we further propose an adapter-based architecture for HuBERT's Transformer encoder, where only a few parameters of each layer are adjusted to the multiple pseudo label MSP while other parameters remain frozen as default HuBERT. Experimental results show that our proposed adapter-based multiple pseudo label HuBERT yields consistent and significant performance improvements on SE, SS, and ASR tasks, with a faster pre-training speed and only a marginal increase in parameters.

* 5 pages
