
What is behavioral cloning?

Behavioral cloning is a form of imitation learning. It involves gathering observations of the behavior of an “expert demonstrator” who is good at the task being trained for, and then using supervised learning to train an AI agent to imitate the observed behavior.

Behavioral cloning differs from other forms of imitation learning (such as inverse reinforcement learning or cooperative inverse reinforcement learning) in that it aims to have the AI replicate the demonstrator's behavior as closely as possible (rather than to have the AI, e.g., infer the demonstrator's goals or implicit reward function).

One of the earliest applications of behavioral cloning was training self-driving cars, and this use case serves as a good example of how behavioral cloning works:

  • First, while a human "demonstrator" drives a car, we collect data about 1) the states of the environment (using sensors such as cameras and lidar) and 2) the actions that the demonstrator takes in each environmental state (such as steering, accelerating/braking, and gear shifting).
  • Next, we create a dataset consisting of (state, action) pairs.
  • Finally, we use supervised learning to train a model that takes the environmental state as an input and predicts the driver’s action.

When the accuracy of this model is high enough, we can say that the driver’s behavior has been “cloned”.
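
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not how a real driving system is built. It assumes a made-up two-number state (`lane_offset`, `heading_error`), a hypothetical `expert_policy` standing in for the human driver, and a linear model fitted by least squares as the supervised learner. Because the action here is a continuous steering value, the sketch checks held-out mean squared error rather than accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_policy(states):
    """Hypothetical demonstrator: steer against lane offset and heading error."""
    lane_offset, heading_error = states[:, 0], states[:, 1]
    return -0.5 * lane_offset - 0.2 * heading_error

# Step 1: record the states the demonstrator sees and the actions they take.
# A little noise is added because human demonstrations are never perfectly consistent.
states = rng.uniform(-1.0, 1.0, size=(500, 2))
actions = expert_policy(states) + rng.normal(0.0, 0.01, size=500)

# Step 2: build the (state, action) dataset and hold some pairs out for evaluation.
train_states, test_states = states[:400], states[400:]
train_actions, test_actions = actions[:400], actions[400:]

def add_bias(s):
    """Append a constant feature so the linear model can learn an offset."""
    return np.hstack([s, np.ones((len(s), 1))])

# Step 3: supervised learning. Here, a least-squares fit from state to action.
weights, *_ = np.linalg.lstsq(add_bias(train_states), train_actions, rcond=None)

def cloned_policy(s):
    """Predict the demonstrator's action for each state."""
    return add_bias(s) @ weights

# Check how closely the clone matches the demonstrator on unseen states.
test_mse = float(np.mean((cloned_policy(test_states) - test_actions) ** 2))
print("learned weights:", weights)
print("held-out mean squared error:", test_mse)
```

Note that the model never sees the demonstrator's goal (staying in the lane). It only learns the mapping from states to actions, which is what distinguishes behavioral cloning from approaches that try to infer a reward function.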

Behavioral cloning is also sometimes used to fine-tune large language models (LLMs). In this case, the dataset consists of human-generated (prompt, completion) pairs. For example, after an LLM has learned to predict text from the internet, it can be fine-tuned to follow instructions by training it to imitate human-written responses to those instructions.
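
To make the parallel with (state, action) pairs concrete, here is a rough sketch of how one (prompt, completion) pair could be unrolled into supervised examples. The demonstrations, the `to_training_examples` helper, the `<eos>` marker, and the whitespace "tokenizer" are all illustrative assumptions; real LLMs use subword tokenizers, and using only completion tokens as targets is one common choice rather than a requirement.

```python
# Hypothetical human-written demonstrations: (prompt, completion) pairs.
demonstrations = [
    ("Translate to French: cat", "chat"),
    ("Add 2 and 3", "The answer is 5"),
]

def to_training_examples(prompt, completion):
    """Unroll one pair into (context, next_token) examples.

    The context plays the role of the "state" and the next token plays the
    role of the "action". Only completion tokens are used as targets here
    (an assumption of this sketch).
    """
    prompt_tokens = prompt.split()  # toy whitespace tokenization
    completion_tokens = completion.split() + ["<eos>"]  # mark where to stop
    examples = []
    for i, target in enumerate(completion_tokens):
        context = prompt_tokens + completion_tokens[:i]
        examples.append((context, target))
    return examples

dataset = [
    example
    for prompt, completion in demonstrations
    for example in to_training_examples(prompt, completion)
]

for context, target in dataset:
    print(context, "->", target)
```

A model trained to predict each target from its context is being trained, token by token, to reproduce what the human demonstrator wrote.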



AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.

© AISafety.info, 2022—2025

Aisafety.info is an Ashgro Inc Project. Ashgro Inc (EIN: 88-4232889) is a 501(c)(3) Public Charity incorporated in Delaware.