What about having a human supervisor who must approve all the AI's decisions before executing them?
The problem is that the actions can be harmful in a very non-obvious, indirect way. It's not at all obvious which actions should be stopped.
For example when the system comes up with a very clever way to acquire resources - this action's safety depends on what it intends to use these resources for.
Such a supervision may buy us some safety, if we find a way to make the system's intentions very transparent.
OriginWhere was this question originally asked