What is embedded agency?

An embedded agent is an agent which is a part of its environment.

Standard decision theory models agents as separate from the environment they act upon. For example, when you play a video game, you affect the world of the game, but are not yourself part of the game. There are defined input channels (e.g. a mouse and keyboard) and output channels (e.g. images on the screen), and you don’t need to model yourself as part of the video game world in order to understand and play the game. You, as an agent, are not embedded in the game.
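As a minimal sketch of this separation, the agent and the environment can be written as two objects that interact only through an action and an observation. The names `GameWorld`, `Player`, and `play` are hypothetical, chosen purely for illustration, and the toy world is an assumption rather than anything from a real game or library.

```python
from dataclasses import dataclass


@dataclass
class GameWorld:
    """Toy environment: a character on a number line (hypothetical)."""
    position: int = 0

    def step(self, action: int) -> int:
        # The action is the only thing that crosses from the agent into the world.
        self.position += action
        # The observation is the only thing the world sends back.
        return self.position


@dataclass
class Player:
    """Toy agent that lives outside the world it acts on."""
    target: int = 3

    def act(self, observation: int) -> int:
        # Move one step toward the target, or stay put once there.
        if observation < self.target:
            return 1
        if observation > self.target:
            return -1
        return 0


def play(world: GameWorld, player: Player, steps: int) -> int:
    """Run the observe-act loop and return the final observation."""
    observation = world.position
    for _ in range(steps):
        observation = world.step(player.act(observation))
    return observation
```

Nothing in `GameWorld.step` can reach into `Player`: the player's target and decision rule stay fixed no matter what happens in the game, which is what it means for the agent not to be embedded.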

This division is much less clear in the physical world, where the objects around you can directly influence your physical state. When you act, your actions can affect you as well, so you cannot restrict your analysis to your impact on the world outside you. In this sense, you are an “embedded agent” in the world.

Being an embedded agent has numerous implications.

  • Your plans can affect you directly, and you need to model that.

    • If you accidentally cause an explosion, you could be injured.

    • In particular, since you are part of the world, your actions can also change your own abilities.

      • For example, you can improve your skills to be better able to solve future problems.

  • You cannot contain a complete description of your environment, since you are contained by the environment.

  • Your investigation of counterfactual scenarios is limited since you cannot model your entire environment.
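The first of these implications can be sketched in a few lines of code. This is a toy illustration under stated assumptions, not a standard formalism: the `World` class, its fields, and the action names are all hypothetical. The point is only that the agent's body and abilities are stored in the same state that its actions modify.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Toy world in which the agent is one more piece of state (hypothetical)."""
    hazard_armed: bool = True
    agent_health: int = 10  # the agent's body is part of the world
    agent_skill: int = 1    # so are the agent's abilities


def apply_action(world: World, action: str) -> None:
    """Update the world, including the parts of it that make up the agent."""
    if action == "trigger_hazard" and world.hazard_armed:
        # An accident in the world feeds back on the agent itself.
        world.hazard_armed = False
        world.agent_health -= 5
    elif action == "practice":
        # The agent can also act on itself and change its future abilities.
        world.agent_skill += 1


def can_solve(world: World, difficulty: int) -> bool:
    # What the agent can do later depends on state its earlier actions changed.
    return world.agent_health > 0 and world.agent_skill >= difficulty
```

For example, starting from `World()`, the agent cannot solve a problem of difficulty 3, but after two `"practice"` actions it can. A plan that includes `"trigger_hazard"` lowers `agent_health`, so a planner that ignored its own place in the world would misjudge the outcome.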

Many areas of alignment research can be understood through the lens of embedded agency. For example, an embedded agent can change its own internal structure, which could give rise to subagents or mesa-optimizers that do not share the goals of the original system.

The following figure (from here) illustrates how many alignment challenges relate to embedded agency:



AISafety.info
