What is the Future of Humanity Institute working on?
Non-Canonical Answers
FHI does a lot of work on non-technical AI safety, but as far as we can tell its primary technical agenda is the Causal Incentives group (joint between FHI and DeepMind), which uses notions from causality to study incentives and their applications to AI safety. Recent work includes:
- Agent Incentives: A Causal Perspective, a paper which formalizes concepts such as the value of information and control incentives (a toy illustration of the value of information follows this list).
- Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective, a paper which theoretically analyzes wireheading.
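As a rough illustration of the value-of-information concept that the first paper formalizes graphically, the sketch below computes it for a toy one-shot decision problem: the gain in expected utility from letting the decision observe the state before acting. The variable names, prior, and utility function are made up for illustration and are not taken from the paper.

```python
# Toy sketch (illustrative, not from the paper): the "value of information"
# for a one-shot decision with a binary state S, decision D, and utility
# U(S, D) = 1 if the decision matches the state, else 0.

P_S = {0: 0.7, 1: 0.3}  # assumed prior over the state S


def utility(s: int, d: int) -> float:
    return 1.0 if s == d else 0.0


def best_expected_utility_informed() -> float:
    # D observes S, so it can pick the best action separately for each state.
    return sum(p * max(utility(s, d) for d in (0, 1)) for s, p in P_S.items())


def best_expected_utility_uninformed() -> float:
    # D cannot observe S, so it must commit to a single action in advance.
    return max(sum(p * utility(s, d) for s, p in P_S.items()) for d in (0, 1))


value_of_information = (
    best_expected_utility_informed() - best_expected_utility_uninformed()
)
print(f"Value of information about S: {value_of_information:.2f}")  # 0.30 with this prior
```

Intuitively, a positive value of information about S means the agent has an incentive to learn (or preserve access to) S, which is the kind of incentive the paper's causal influence diagrams make precise.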
Stamps: None
Asked by: RoseMcClelland
Origin: Wiki
Date: 2022/09/13