Terminal Value - History - AI Alignment Forum

Created by Joshua Fox at 4y

TheAncientGeek v1.43.0Mar 27th 2014 GMT (+7/-7) /* Human terminal values */ ce

It is not known whether humans have terminal values that are clearly distinct from another set of instrumental values. Humans appear to adopt different values at different points in life. Nonetheless, if the theory of terminal values applies to ~~Humans'~~humans', then their system of terminal values is quite complex. The values were forged by evolution in the ancestral environment to maximize inclusive genetic fitness. These values include survival, health, friendship, social status, love, joy, aesthetic pleasure, curiosity, and much more. Evolution's implicit goal is inclusive genetic fitness, but humans do not have inclusive genetic fitness as a goal. Rather, these values, which were instrumental to inclusive genetic fitness, have become humans' terminal values (an example of subgoal stomp).

TheAncientGeek v1.42.0Mar 27th 2014 GMT (+4/-4)

In an artificial general intelligence with a utility or reward function, the terminal value is the maximization of that function. The concept is not usefully applicable to all ~~ASs,~~AIs, and it is not known how applicable it is to organic entities.

TheAncientGeek v1.41.0Mar 27th 2014 GMT (+370) clarify the distinction between the theoretical and factual aspects of this subject

In an artificial general intelligence with a utility or reward function, the terminal value is the maximization of that function. The concept is not usefully applicable to all ASs, and it is not known how applicable it is to organic entities.

It is not known whether humans have terminal values that are clearly distinct from another set of instrumental values. Humans appear to adopt different values at different points in life. Nonetheless, if the theory of terminal values applies to Humans', then their system of terminal values is quite complex. The values were forged by evolution in the ancestral environment to maximize inclusive genetic fitness. These values include survival, health, friendship, social status, love, joy, aesthetic pleasure, curiosity, and much more. Evolution's implicit goal is inclusive genetic fitness, but humans do not have inclusive genetic fitness as a goal. Rather, these values, which were instrumental to inclusive genetic fitness, have become humans' terminal values (an example of subgoal stomp).

Joshua Fox v1.40.0Jan 19th 2014 GMT (+183/-55) /* Terminal vs. instrumental values */

Some values may be called "terminal" merely in relation to an instrumental goal, yet themselves serve instrumentally towards a higher goal. However, in considering future artificial general intelligence, the phrase "terminal value" is generally used only for the top level of the goal ~~hierarchy:~~hierarchy of the AGI itself: the true ultimate goals of ~~a system, those~~the system; but excluding goals inside the AGI in service of other goals, and excluding the purpose of the AGI's makers, the goal for which ~~do not serve any higher value.~~they built the system.

andre v1.39.0Jan 18th 2014 GMT (+6/-5) /* Terminal vs. instrumental vales */

Terminal vs. instrumental ~~vales~~values

Joshua Fox v1.38.0Dec 3rd 2013 GMT (+11/-17) /* Human terminal values */

Humans cannot fully introspect their terminal values. Humans' terminal values are often mutually contradictory, inconsistent, and ~~change over time.~~changeable.

Joshua Fox v1.37.0Dec 3rd 2013 GMT (+9) /* Terminal vs. instrumental vales */

Terminal values stand in contrast to instrumental values (also known as extrinsic values), which are means-to-an-end, mere tools in achieving terminal values. For example, if a given university student studies merely as a professional qualification, his terminal value is getting a job, while getting good grades is an instrument to that end. If a (simple) chess program tries to maximize piece value three turns into the future, that is an instrumental value to its implicit terminal value of winning the game.
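The chess example above can be illustrated with a minimal sketch. It is not a real chess engine: the piece values, the position representation, and the function names (`material_score`, `terminal_value`, `choose_move`) are all hypothetical, and the program looks only one move ahead rather than three. The point is only that the quantity the program optimizes (material) differs from the outcome that material is meant to serve (winning).

```python
# Illustrative piece values; these numbers are a common convention, assumed here.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}


def material_score(pieces, side):
    """Instrumental value: material balance from `side`'s point of view.

    `pieces` is a list of (owner, piece_letter) tuples, e.g. ("white", "Q").
    """
    score = 0
    for owner, letter in pieces:
        value = PIECE_VALUES[letter]
        score += value if owner == side else -value
    return score


def terminal_value(winner, side):
    """Terminal value: 1 if `side` won the game, 0 otherwise."""
    return 1 if winner == side else 0


def choose_move(candidate_positions, side):
    """Return the move label whose resulting position has the best material.

    `candidate_positions` maps a move label to the piece list after that move.
    The program never evaluates `terminal_value` directly; material stands in
    for it as a proxy.
    """
    return max(
        candidate_positions,
        key=lambda move: material_score(candidate_positions[move], side),
    )


positions = {
    "capture_queen": [("white", "K"), ("white", "R"), ("black", "K")],
    "quiet_move": [("white", "K"), ("white", "R"), ("black", "K"), ("black", "Q")],
}
best = choose_move(positions, "white")
```

Here `choose_move` prefers `"capture_queen"` because it maximizes material, which is useful only insofar as it tends to lead to a won game.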

Joao Fabiano v1.36.0Oct 10th 2012 GMT (+6/-6)

A terminal value (also known as an intrinsic ~~value)~~value) is an ultimate goal, an end-in-itself. The non-standard term "supergoal" is used for this concept in Eliezer Yudkowsky's earlier writings.

Joshua Fox v1.35.0Sep 4th 2012 GMT (+100/-100)

A terminal value (also known as an intrinsic value) is an ultimate goal, an end-in-itself. The non-standard term "supergoal" is used for this concept in Eliezer Yudkowsky's earlier writings.

In an artificial general intelligence with a utility or reward function, the terminal value is the maximization of that function. ~~The non-standard term "supergoal" is used for this concept in Eliezer Yudkowsky's~~ ~~earlier writings~~.

Joshua Fox v1.34.0Aug 31st 2012 GMT (+40/-7) /* Terminal vs. instrumental vales */

Terminal values stand in contrast to instrumental ~~values,~~values (also known as extrinsic values), which are means-to-an-end, mere tools in achieving terminal values. For example, if a given university student studies merely as a professional qualification, his terminal value is getting a job, while getting good grades is an instrument to that end. If a (simple) chess program tries to maximize piece value three turns into the future, that is an instrumental value to its terminal value of winning the game.

Joshua Fox v1.33.0Aug 31st 2012 GMT (+7/-243) /* Terminal vs. instrumental vales */

Terminal values stand in contrast to instrumental values, which are means-to-an-end, mere tools in achieving terminal values. For example, if a given university student ~~does not enjoy studying, but is doing so~~studies merely as a professional qualification, his terminal value is getting a job, while getting good grades is an instrument to that end. If a (simple) chess program tries to maximize piece value three turns into the future, that is an instrumental value to its terminal value of winning the game.

Some values may be called "terminal" merely in relation to an instrumental goal, yet themselves serve instrumentally towards a higher goal. The student described above may want the job to gain social status and money; if he could get prestige and money without working he would; and in this case the job is instrumental to these other values. However, in considering future artificial general intelligence, the phrase "terminal value" is generally used only for the top level of the goal hierarchy: the true ultimate goals of a system, those which do not serve any higher value.

Joshua Fox v1.32.0Aug 31st 2012 GMT (-13) /* Non-human terminal values */

An intelligence can ~~in principle~~ work towards any terminal value, not just human-like ones. AIXI is a mathematical formalism for modeling intelligence. It illustrates that an intelligence may optimize arbitrary terminal values: AIXI is provably more intelligent than any other agent for any computable reward function.

Joshua Fox v1.30.0Aug 30th 2012 GMT (-234) /* References */

References

~~Joshua Fox and Carl Shulman (2010), "Superintelligence does not imply benevolence", Proceedings of the VIII European Conference on Computing and Philosophy, Oct, 2010. Ed. Klaus Mainzer. (Munich: Verlag Dr. Hut), pp. 456-461~~

Joshua Fox v1.29.0Aug 30th 2012 GMT (-253) /* References */

Joshua Fox and Carl Shulman (2010), "Superintelligence does not imply benevolence", Proceedings of the VIII European Conference on Computing and Philosophy, Oct, 2010. Ed. Klaus Mainzer. (Munich: Verlag Dr. Hut), pp. 456-461
~~S. Omohundro, The basic AI drives. In Artificial general intelligence 2008: Proceedings of the first AGI conference, ed. Pei Wang, Ben Goertzel, and Stan Franklin, 483–492. Frontiers in Artificial Intelligence and Applications 171. Amsterdam: IOS Press.~~

Joshua Fox v1.28.0Aug 30th 2012 GMT (-1227)

Benevolence may arise even if not specified as an end-goal, as it is a common instrumental value for agents with a variety of terminal values. For example, humans often cooperate because they expect an immediate benefit in response; or because they want to establish a reputation that may engender future cooperation; or because they live in a human society that rewards cooperation and punishes misbehavior. Humans sometimes undergo a moral shift (described by Immanuel Kant) in which benevolence changes from a merely instrumental value to a terminal one: they become altruistic and learn to value benevolence in its own right.

However, such shifts cannot be relied on to bring about benevolence in an artificial general intelligence. Benevolence is relevant as an instrumental value for an AGI only when humans are at roughly equal power to it. If the AGI is much more intelligent than humans, it will not care about the rewards and punishments which humans can deliver. Moreover, a Kantian shift is unlikely in a sufficiently powerful AGI, as any changes in one's goals, including ~~replacement of terminal by instrumental values, generally reduces the likelihood of achieving one's goals (Fox & Shulman 2010; Omohundro 2008).~~

Joshua Fox v1.27.0Aug 30th 2012 GMT (+23/-29)

Humans' system of terminal values is quite complex. The values were forged by evolution in the ancestral environment to maximize inclusive genetic fitness. These values include survival, health, friendship, social status, love, joy, aesthetic pleasure, curiosity, and much more. Evolution's implicit goal is inclusive genetic fitness, but humans do not have inclusive genetic fitness as a goal. Rather, these values, which were ~~*instrumental*~~instrumental to inclusive genetic fitness, have become humans' ~~*terminal*~~terminal values (an example of subgoal stomp).

An intelligence can in principle work towards any terminal value, not just human-like ones. AIXI is a mathematical formalism for modeling intelligence. It illustrates that an intelligence may optimize arbitrary terminal values: AIXI is provably more intelligent than any other agent for ~~*any*~~any computable reward function.

Joshua Fox v1.26.0Aug 30th 2012 GMT (+221/-118)

In an ~~AGI~~artificial general intelligence with a utility or reward function, the terminal value is the maximization of that function. The non-standard term "supergoal" is used for this concept in Eliezer Yudkowsky's earlier writings.

Some values may be called "terminal" merely in relation to an instrumental goal, yet themselves serve instrumentally towards a higher goal. The student described above may want the job to gain social status and money; if he could get prestige and money without working he would; and in this case the job is instrumental to these other values. However, in considering future ~~AI,~~artificial general intelligence, the phrase "terminal value" is generally used only for the top level of the goal hierarchy: the true ultimate goals of a system, those which do not serve any higher value.

Since people make tools instrumentally, to serve specific human values, the ~~AI's~~ assigned value system of the artificial general intelligence may be much simpler than humans'. This will pose a danger, as an AI must seek to protect all human values if a positive human future is to be achieved. The paperclip maximizer is a thought experiment about an artificial general intelligence with consequences disastrous to humanity, with the apparently innocuous terminal value of maximizing the number of paperclips in its collection.

However, such shifts cannot be relied on to bring about benevolence in an ~~AI.~~artificial general intelligence. Benevolence as an instrumental value for an ~~AI is relevant~~AGI only when humans are at roughly equal power to ~~the AI.~~it. If the ~~AI~~AGI is much more intelligent than humans, it will not care about the rewards and punishments ~~from humans..~~which humans can deliver. Moreover, a Kantian shift is unlikely in a sufficiently powerful ~~AI is unlikely to undergo a Kantian shift,~~AGI, as any changes in one's goals, including replacement of terminal by instrumental values, generally reduces the likelihood of ~~maximizing~~achieving one's ~~utility function~~goals (Fox & Shulman 2010; Omohundro 2008).

Kaj Sotala v1.25.0Aug 29th 2012 GMT (+55/-35)

A terminal value (also known as an intrinsic value) is an ultimate goal, an end-in-itself.

In an ~~AI~~AGI with a utility or reward function, the terminal value is the maximization of that function.

~~the~~ The non-standard term "supergoal" is used for this concept in Eliezer Yudkowsky's earlier writings.

Future artificial general intelligences may have the maximization of a utility function or of a reward function (reinforcement learning) as their terminal value. The function will likely be set by the ~~AI'~~AGI's designers.

Since people make tools instrumentally, to serve specific human values, the AI's assigned value system may be much simpler than humans'. This will pose a danger, as an AI must seek to protect ~~*all*~~all human values if a positive human future is to be achieved. The paperclip maximizer is a thought experiment about an artificial general intelligence with consequences disastrous to humanity, with the apparently innocuous terminal value of maximizing the number of paperclips in its collection.

However, such shifts cannot be relied on to bring about benevolence in an AI. Benevolence as an instrumental value for an AI is relevant only when humans are at roughly equal power to the AI. If the AI is much more intelligent than humans, it will not care about rewards and punishments from humans. Moreover, a sufficiently powerful AI is unlikely to undergo a Kantian shift, as any changes in one's goals, including ~~[Subgoal stomp|~~replacement of terminal by instrumental ~~values]~~values, generally reduces the likelihood of maximizing one's utility function (Fox & Shulman 2010; Omohundro 2008).

Joshua Fox v1.24.0Aug 26th 2012 GMT (+8/-8) /* In a Friendly AI */

However, such shifts cannot be relied on to bring about benevolence in an AI. Benevolence as an instrumental value for an AI is relevant only when humans are at roughly equal power to the AI. If the AI is much more intelligent than humans, it will not care about rewards and punishments from humans. Moreover, a sufficiently powerful AI is unlikely to undergo a Kantian shift, as any changes in one's goals, including ~~[subgoal~~[Subgoal stomp|replacement of terminal by instrumental values], generally reduces the likelihood of maximizing one's utility function (Fox & Shulman 2010; Omohundro 2008).