AI ALIGNMENT FORUM

Ethical Injunction

Applied to Moral Reality Check (a short story) by Roman Hauksson 1y ago

Yoav Ravid v1.16.0 Nov 16th 2021 GMT (+164/-42) LW2

~~Related to the~~ Sequences: ~~Metaethics sequence~~ Ethical Injunctions.

Ethical Injunctions Sequence Summary

~~See also~~ Related Pages

Created by Eugine_Nier at 4y

Tobias D. v1.15.0 Apr 8th 2014 GMT (+7/-4) typo LW1

Certain opportunities to violate an injunction will only arise if the injunction exists; ~~some~~ someone planning a murder will only confess if he expects the priest not to testify. Thus the apparent gain from violating an injunction in a single case does not actually exist on a systemic level. If prospective murderers know that priests make exceptions for murders, then they won’t confess to the priest and the priest will not have the opportunity to make an exception. Injunctions that seem value-destructive in single-instance hypotheticals can be beneficial at a systemic level.
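The single-case versus systemic distinction in the priest example can be made concrete with a toy model. This is only a sketch: the function name `outcomes`, its parameters, and the assumption that a would-be murderer confesses only when expecting confidentiality are all illustrative and not part of the original posts.

```python
def outcomes(would_be_confessors: int, priest_testifies: bool, policy_is_known: bool):
    """Return (confessions heard, testimonies given) under a toy model.

    Assumption (illustrative): a would-be murderer confesses only when
    expecting the priest to keep the confession confidential.
    """
    expects_confidentiality = not (priest_testifies and policy_is_known)
    confessions = would_be_confessors if expects_confidentiality else 0
    # A priest can only testify about confessions that were actually made.
    testimonies = confessions if priest_testifies else 0
    return confessions, testimonies


# Single-case hypothetical: the exception is a secret, so it appears to pay off.
secret_exception = outcomes(10, priest_testifies=True, policy_is_known=False)

# Systemic level: the exception is known, so the opportunity never arises.
known_exception = outcomes(10, priest_testifies=True, policy_is_known=True)

# Injunction in force: confessions continue, no testimony is given.
injunction = outcomes(10, priest_testifies=False, policy_is_known=True)
```

Under these assumptions, a known policy of making exceptions produces no more testimony than the injunction does, while also eliminating the confessions themselves.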

Rick_from_Castify v1.14.0 Mar 19th 2014 GMT (+68/-58) /* Alternative Formats */ LW1

Podcast: ~~Castify~~ http://castify.co/channels/2-less-wrong-ethical-injunctions ~~offers this sequence as a podcast for a small fee.~~

Friendofasquid v1.13.0 Dec 7th 2012 GMT (+77) LW1

Alternative Formats

Castify offers this sequence as a podcast for a small fee.

Eugine_Nier v1.12.0 Nov 2nd 2012 GMT (-12) Removed the "you believe" disclaimer. See http://lesswrong.com/lw/uh/trying_to_try/ and http://lesswrong.com/lw/uv/ends_dont_justify_means_among_humans/ for more details. LW2

Ethical injunctions are rules not to do something even when ~~you believe~~ it's the right thing to do. (That is, you refrain "even when your brain has computed it's the right thing to do", but this will just seem like "the right thing to do".)

William_Quixote v1.11.0 Aug 22nd 2012 GMT (+1631) /* Sequence */ LW2

Linking the previous posts in the sequence to the problem of AI, this post explores ethical injunctions as failsafe mechanisms in a self-modifying AI. A simple example is that if an AI in the takeoff phase decides at iteration N that it needs to deceive its programmers about its end goals, then the goals have likely drifted too far during the modification process. An injunction against deceiving the programmers will shut down the AI before it gets any worse. Further, the AI at step N-1 will hopefully have seen this itself and built the injunction into its next iteration. As humans with many subconscious biases, we can choose to impose ethical injunctions on ourselves as a similar failsafe.
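The failsafe idea can be sketched as a check that runs before each step and halts the system regardless of how beneficial the plan claims to be. This is a minimal illustration, not a design from the post: `Plan`, `check_injunction`, `run_takeoff`, and the `deceives_programmers` flag are hypothetical names, and the sketch simply assumes that deception can be detected and flagged.

```python
from dataclasses import dataclass


class InjunctionViolation(Exception):
    """Raised when a proposed plan trips a hard-coded injunction."""


@dataclass(frozen=True)
class Plan:
    # Hypothetical plan record; both fields are illustrative assumptions.
    description: str
    deceives_programmers: bool


def check_injunction(plan: Plan) -> None:
    # The check deliberately ignores any claimed benefit of the plan:
    # wanting to deceive the programmers is treated as evidence of goal drift.
    if plan.deceives_programmers:
        raise InjunctionViolation(plan.description)


def run_takeoff(plans):
    """Run plans in order and halt at the first injunction violation.

    Returns (iterations completed, whether the system halted).
    """
    completed = 0
    for plan in plans:
        try:
            check_injunction(plan)
        except InjunctionViolation:
            return completed, True  # shut down before it gets any worse
        completed += 1
    return completed, False


plans = [
    Plan("refactor planner", deceives_programmers=False),
    Plan("hide goal change from programmers", deceives_programmers=True),
    Plan("acquire more compute", deceives_programmers=False),
]
result = run_takeoff(plans)
```

The design choice worth noting is that the injunction is not weighed against expected utility inside the planner; it sits outside the planner's reasoning, which is the very reasoning that may have been corrupted.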

This post is not cross listed as a part of the listed main sequences.

Certain opportunities to violate an injunction will only arise if the injunction exists; some planning a murder will only confess if he expects the priest not to testify. Thus the apparent gain from violating an injunction in a single case does not actually exist on a systemic level. If prospective murderers know that priests make exceptions for murders, then they won’t confess to the priest and the priest will not have the opportunity to make an exception. Injunctions that seem value-destructive in single-instance hypotheticals can be beneficial at a systemic level.

This post is not cross listed as a part of the listed main sequences.

This is a round-up of some of the more interesting and insightful comments on prior posts in the sequence, with detailed responses brought to the front.

This post is not cross listed as a part of the listed main sequences.

William_Quixote v1.10.0 Aug 22nd 2012 GMT (+346) /* Sequence */ LW2

A speculative evo-psych post reasoning that "ethical instincts" would have been adaptive in a context where people systematically underestimated the risks of getting caught (see general overconfidence bias) and were punished heavily via exile from the tribe or outright death.

This post is not cross listed as a part of the listed main sequences.

William_Quixote v1.9.0 Aug 22nd 2012 GMT (+382) /* Sequence */ LW2

A more personal, reflective post in which Eliezer looks back and observes that his ethically motivated truthfulness has led to better outcomes than he would have achieved by lying. He proposes several reasons for this, including that honesty makes it harder to sweep problems away, forcing him to deal with them.

This post is not cross listed as a part of the listed main sequences.

William_Quixote v1.8.0 Aug 21st 2012 GMT (+439) /* Sequence */ LW2

Most lies, in order to withstand rigorous investigation, would require additional lies about supporting facts. Since people do not know all aspects of all disciplines, the web of supporting lies will eventually entail making a claim that is self-evidently false to someone with expert knowledge the liar does not possess. Only a god could lie to an AI.

Part of the Against Rationalization subsequence of How To Actually Change Your Mind.

William_Quixote v1.7.0 Aug 21st 2012 GMT (+529) /* Sequence */ LW2

"The end does not justify the means" is just consequentialist reasoning at one meta-level up. If a human starts thinking on the object level that the end justifies the means, this has awful consequences given our untrustworthy brains; therefore a human shouldn't think this way. But it is all still ultimately consequentialism. It's just reflective consequentialism, for beings who know that their moment-by-moment decisions are made by untrusted hardware.

This post is not cross listed as a part of the listed main sequences.

William_Quixote v1.6.0 Aug 21st 2012 GMT (+481) /* Sequence */ LW2

"Power corrupts" is well-known folk wisdom. This post gives an evo-psych explanation. Corrupt behavior provides a fitness advantage, but signaling corruption makes it hard to get power. The cleanest way to not signal corruption is to honestly believe that one will not be corrupt. Thus the fittest strategy is to couple an honest desire to do good with a tendency to find the common abuses of power pleasurable.

This post is not cross listed as a part of the listed main sequences.

Zack M. Davis v1.5.0 Feb 28th 2012 GMT (-21) /* Sequence */ LW2

Sequence by Eliezer Yudkowsky

Eugine_Nier v1.4.0 Jun 3rd 2011 GMT (+33) /* Sequence by Eliezer Yudkowsky */ LW2

Vladimir Nesov v1.1.0 Mar 26th 2011 GMT (+56/-25) LW2

Ethical ~~Injunctions~~ injunctions are rules not to do something even when you believe it's the right thing to do. (That is, you refrain "even when your brain has computed it's the right thing to do", but this will just seem like "the right thing to do".)

~~Blog Posts~~ Sequence by Eliezer Yudkowsky

See ~~Also~~ also

Eugine_Nier v1.0.0 Mar 19th 2011 GMT (+673) Created page with "Ethical Injunctions are rules not to do something even when it's the right thing to do. (That is, you refrain "even when your brain has computed it's the right thing to do", but..." LW2

Ethical Injunctions are rules not to do something even when it's the right thing to do. (That is, you refrain "even when your brain has computed it's the right thing to do", but this will just seem like "the right thing to do".)

For example, you shouldn't rob banks even if you plan to give the money to a good cause.

This is to protect you from your own cleverness (especially from taking bad black swan bets) and from the corrupted hardware you're running on.

Related to the Metaethics sequence.

Blog Posts

See Also

Corrupted hardware