utility functions

From Stampy's Wiki

Description

A utility function assigns numerical values ("utilities") to outcomes, in such a way that outcomes with higher utilities are always preferred to outcomes with lower utilities.
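This definition can be made concrete with a minimal sketch (the outcomes and numbers are illustrative, not from the article): a utility function over a finite set of outcomes, with preference determined by higher utility.

```python
# Illustrative utility function: maps each outcome to a numerical utility.
utility = {"outcome_a": 5.0, "outcome_b": 2.0, "outcome_c": 1.0}

def prefers(x, y):
    """True if outcome x is preferred to outcome y under this utility function."""
    return utility[x] > utility[y]

# The most preferred outcome is the one with the highest utility.
best = max(utility, key=utility.get)

print(prefers("outcome_a", "outcome_b"))  # True
print(best)                               # outcome_a
```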

See also: Complexity of Value, Decision Theory, Game Theory, Orthogonality Thesis, Utilitarianism, Preference, Utility, VNM Theorem

Utility functions do not work well in practice for individual humans. Human drives are not coherent, nor is there any reason to expect them to be (Thou Art Godshatter), and even people with a strong interest in the concept struggle to work out what their own utility function actually is (Post Your Utility Function). Furthermore, humans appear to evaluate utility and disutility separately: summing the two does not accurately predict their behavior. This framing-dependence makes humans highly exploitable.
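The "separate utility and disutility" point can be sketched with a prospect-theory-style valuation in which losses loom larger than gains (the loss-aversion coefficient 2.25 is Kahneman and Tversky's empirical estimate; its use here is an assumption for illustration). The same net outcome is valued differently depending on how it is split into gains and losses, which an exploiter can use against such an agent.

```python
def value(x, loss_aversion=2.25):
    """Sign-dependent valuation: losses are weighted more heavily than gains."""
    return x if x >= 0 else loss_aversion * x

# A bundled deal with a net gain of +10, presented as a single outcome.
bundled = value(+10)

# The very same deal, framed as a separate gain of +110 and a loss of -100,
# evaluated piece by piece rather than on its net outcome.
split = value(+110) + value(-100)

print(bundled)  # 10: accepted when bundled
print(split)    # -115.0: rejected when split, despite identical net outcome
```

A coherent expected-utility maximizer would value both framings identically; the gap between `bundled` and `split` is exactly the inconsistency the paragraph describes.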

pjeby posits that humans' difficulty in understanding their own utility functions is the root of akrasia.

However, utility functions can be a useful model for dealing with humans in groups, e.g. in economics.

The VNM Theorem tag is likely to be a strict subtag of the Utility Functions tag, because the VNM theorem establishes when preferences can be represented by a utility function, but a post discussing utility functions may or may not discuss the VNM theorem/axioms.
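The VNM result the paragraph refers to can be stated compactly (this is the standard formulation, not quoted from the article):

```latex
% If a preference relation \succeq over lotteries satisfies the VNM axioms
% (completeness, transitivity, continuity, independence), then there exists
% a utility function u over outcomes such that, for all lotteries A and B,
A \succeq B \iff \mathbb{E}[u(A)] \ge \mathbb{E}[u(B)],
% and u is unique up to positive affine transformation u' = a\,u + b with a > 0.
```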

Non-canonical answers

Might an aligned superintelligence immediately kill everyone and then go on to create a "hedonium shockwave"?


I think an AI inner-aligned to optimize a utility function of "maximize happiness minus suffering" is likely to do something like this.

"Inner-aligned" means the AI is actually trying to do the thing we trained it to do, whether or not that is what we actually want.

"Aligned to what" is the outer alignment problem which is where the failure in this example is. There is a lot of debate on what utility functions are safe or desirable to maximize, and if human values can even be described by a utility function.