Super interesting! If this kind of reward hacking exists in current AI, does that have any serious implications if someone wanted to deploy one for the stock market, for example? Like, would the AI seek to "cheat" and commit fraud or somehow gain insider info rather than play the stock market fairly?

Asked by: doublebrass
YouTube (comment link)
On video: Reward Hacking: Concrete Problems in AI Safety Part 3
