Almost, but not entirely, Unreasonable's question on Reward Hacking Reloaded
From Stampy's Wiki
How is AI 'per se' NOT a HUMAN REWARD HACK? It seems humans cannot be bothered to solve problems at a marginal level anymore, so some specialists develop AI to 'think / solve' electronically on behalf of people, ultimately displacing them entirely. AI is self-annihilation by design, no more, no less.
Awesome, clear insight into KPIs: Show me how you measure me, and I'll show you how I behave. It's an age-old operations vs management issue, where both sides are trying to MINIMISE the other's influence while trying to MAXIMISE their own. What an awesome problem to hand to a Technocratic Optimizing System. Who knows, it may even turn out balanced, in which case Management will summarily drop it. Maybe there IS hope for AI?
Asked by: | Almost, but not entirely, Unreasonable |
Origin: | YouTube (comment link) |
On video: | Reward Hacking Reloaded: Concrete Problems in AI Safety Part 3.5 |
Date: | 2017-08-30T11:16 |
Asked on Discord? | No |
Discussion