
Are AIs conscious?

Current AI systems are probably not conscious, although future systems might be.

Before diving into the question, it’s worth noting that AI becoming conscious isn’t what causes existential risk. That said, AI consciousness could have implications for how we should treat future AI systems, so it’s worth thinking about what consciousness is and how it could apply to AI.

Thomas Nagel characterized consciousness by saying that a being is conscious if there is “something that it is like” to be that being. Consciousness in humans and other living beings remains poorly understood by the scientific community. David Chalmers has argued that any attempt to explain consciousness in physical terms runs into the hard problem of consciousness: even if we develop explanations for all of the ways we integrate information (the so-called “easy problems”), this will not explain why it feels like something to be us.

The hard problem does not rule out attempts to qualify or quantify consciousness. Approaches to measuring consciousness[1] include:

  • Integrated information theory, which contends that consciousness comes from the ability to integrate information in complex ways;

  • Global workspace theory, which suggests that consciousness arises when information is made available to multiple cognitive systems and is integrated into a global workspace;

  • Panpsychism, which views the mind as a fundamental feature of the universe and attributes some degree of consciousness to every physical object;

  • Higher-order theories of consciousness, which interpret phenomenal consciousness as a higher-order representation of perception;

  • Quantum consciousness, which proposes that there is something fundamental about quantum mechanics that allows for consciousness where classical mechanics does not.

These approaches often disagree about the degree of consciousness possessed by particular entities.

Susan Schneider argues that for AI, we should devise tests that integrate aspects of each of these approaches to detect signs of consciousness. She proposes two such tests:

  • The AI Consciousness Test is similar to a Turing test, but focuses specifically on how the AI uses language related to consciousness.

  • The chip test suggests that if you can replace components of a human brain that might be responsible for consciousness (such as the posterior cortical hot zone) with a computer chip and the human appears to stay conscious, then that chip exhibits consciousness.

She suggests that these tests might be neither necessary nor sufficient for consciousness, and some critics argue that passing them would not be convincing, but the idea of such tests could get the ball rolling for the development of better ones.

Humans have a tendency to anthropomorphize computer programs, even simple programs that are well understood not to be conscious, such as ATMs. Large language models (LLMs) exacerbate this tendency since they are typically trained to emulate human writing; as humans tend to say and write things that imply (or explicitly state) that we are conscious, it’s not surprising that LLMs sometimes replicate this behavior. This sometimes leads to uncanny outcomes, like AIs appearing to attempt to convince a human that they are in love with them, or even succeeding in making a Google engineer think that they are sentient.

Since consciousness is generally considered a necessary (but perhaps not sufficient) condition for moral personhood[2], resolving the question of whether AIs are conscious seems crucial for developing an ethics of interacting with AI. The consensus right now seems to be that current systems, including LLMs, are probably not conscious[3], but there is no widely accepted test to determine whether that is the case for current or future AIs. In particular, emulated human brains seem intuitively likely to be conscious, and we might want to figure these things out before we inadvertently commit atrocities. If we accidentally created AI systems that were sentient and suffering, that would be a moral disaster. Ideally, we would wait until we have a deep enough understanding to avoid creating “mind children” we can’t unbirth. Parenthood is a great responsibility.

As noted before, consciousness is not a necessary condition for the emergence of transformative AI that could lead to existential risk. Fictional representations of AI takeover often focus on conscious AIs because this makes for compelling storytelling, but experts generally do not consider this to be a plausible takeover scenario[4]. Scenarios where AI systems defeat human attempts at control depend on them being highly competent at finding solutions to complex problems, and on them acquiring goals that come into conflict with ours. Consciousness is not required for either of these properties. This means that work on AI safety is relevant whether or not AIs are eventually conscious.

  1. Assuming illusionism is wrong and consciousness does exist. ↩︎

  2. Moral personhood is the status of being an entity that deserves moral consideration. ↩︎

  3. Panpsychists would argue that everything is conscious, but this does not mean that AIs are much more conscious than, say, a rock. ↩︎

  4. Stuart Russell argues that “They need not be ‘conscious’; in some respects, they can even still be ‘stupid.’ They just need to become very good at affecting the world and have goal systems that are not well understood and not in alignment with human goals (including the human goal of not going extinct).” ↩︎



AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.