Artis Zelmenis's question on Reward Modeling

Artis Zelmenis's question on Reward Modeling id:UgxxG92UOJ7wO76Ud254AaABAg

What if, instead of a linear NN that mainly just runs back and forth (feedforward and backpropagation), we tried an NN as an actual >net<? A self-contained system of several sub-NNs, each doing the thing described in the video, but mirrored many times and feeding their results into each other. Something like brain areas, each with a task, where the area that gets the best result becomes the main 'executor', temporarily subduing the others. Like an ever-changing (plasticity?) brain on a small scale.
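To make the proposed wiring concrete, here is a minimal sketch of one possible reading of the idea. It is not the method from the video: the sub-networks are single tanh units with random weights, nothing is trained, and the toy target and all function names (`make_subnet`, `step`, `pick_executor`, `run`) are assumptions made for illustration. It only shows sub-networks feeding their outputs into each other and the best one being chosen as the executor for a single step.

```python
import math
import random


def make_subnet(n_inputs, rng):
    """Random weights for a one-unit sub-network (hypothetical stand-in for a real NN)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]


def run_subnet(weights, inputs):
    """Weighted sum of the inputs, squashed into (-1, 1) by tanh."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)))


def step(subnets, observation, prev_outputs):
    """Each sub-network sees the observation plus every sub-network's previous output."""
    inputs = list(observation) + list(prev_outputs)
    return [run_subnet(weights, inputs) for weights in subnets]


def pick_executor(outputs, target):
    """Index of the sub-network whose output is closest to the target."""
    errors = [abs(output - target) for output in outputs]
    return errors.index(min(errors))


def run(n_subnets=4, n_obs=3, n_steps=5, seed=0):
    """Run the net for a few steps and record which sub-network was the executor."""
    rng = random.Random(seed)
    subnets = [make_subnet(n_obs + n_subnets, rng) for _ in range(n_subnets)]
    outputs = [0.0] * n_subnets
    history = []
    for _ in range(n_steps):
        observation = [rng.uniform(-1.0, 1.0) for _ in range(n_obs)]
        target = math.tanh(sum(observation))  # toy task, assumed for illustration
        outputs = step(subnets, observation, outputs)
        # The executor is re-chosen every step, so no sub-network dominates permanently.
        executor = pick_executor(outputs, target)
        history.append((executor, outputs[executor]))
    return history


if __name__ == "__main__":
    for executor, output in run():
        print(f"executor: sub-network {executor}, output {output:.3f}")
```

Because the executor is picked again at every step, dominance is non-permanent, as the question suggests. A real version would also need a learning rule for each sub-network, which this sketch leaves out.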

Tags: None
Question Info
Asked by: Artis Zelmenis
Origin: YouTube (comment link)
On video: Training AI Without Writing A Reward Function, with Reward Modelling
Date: 2019-12-13T18:05
Asked on Discord? No