AI ALIGNMENT FORUM

Hansonian Pre-Rationality

Edited by abramdemski, et al. last updated 30th Jul 2020

In defining Hansonian Pre-Rationality, Robin Hanson offers an intriguing argument: upon learning that our beliefs were created by an irrational process (be it a religious upbringing or a genetic predisposition to paranoid depression), we should update to agree with the alternate versions of ourselves who could have had different beliefs. Agents who agree with their alternate selves in this way are "pre-rational". (NOTE: this is not to be confused with "pre-rational" in the sense of "not yet rational" or "less than rational".)

Suppose you are an AI who was designed by a drunk programmer. Your prior contains an "optimism" parameter which broadly skews how you see the world -- set it to -100 and you'd expect world-ending danger around every corner, while +100 would make you expect heaven around every corner. Although your powerful learning algorithm allows you to accurately predict the world, the optimism/pessimism bias never fully goes away: it skews your views about anything you don't know.
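One way to make this setup concrete is the toy sketch below. It is only an illustration: the Beta-Bernoulli model, the `strength` pseudo-count weight, and the function names are all assumptions introduced here, not part of the thought experiment or of Hanson's argument. The optimism parameter is mapped to a prior probability that any given outcome is "good". Two agents that differ only in that parameter nearly agree about a question they have a lot of data on, but still disagree sharply about a question they have no data on.

```python
def prior_pseudo_counts(optimism, strength=10.0):
    """Map optimism in [-100, 100] to Beta(a, b) pseudo-counts (hypothetical model)."""
    p_good = (optimism + 100.0) / 200.0  # -100 -> 0.0, +100 -> 1.0
    # Clamp away from 0 and 1 so both pseudo-counts stay positive.
    p_good = min(max(p_good, 0.01), 0.99)
    return strength * p_good, strength * (1.0 - p_good)


def posterior_mean(optimism, good, bad):
    """Posterior probability of a good outcome after observing good/bad counts."""
    a, b = prior_pseudo_counts(optimism)
    return (a + good) / (a + b + good + bad)


pessimist, optimist = -80, 80

# A well-studied question: 700 good and 300 bad outcomes observed.
gap_known = posterior_mean(optimist, 700, 300) - posterior_mean(pessimist, 700, 300)

# A question with no data at all: only the prior speaks.
gap_unknown = posterior_mean(optimist, 0, 0) - posterior_mean(pessimist, 0, 0)

print(f"disagreement with 1000 observations: {gap_known:.4f}")
print(f"disagreement with no observations:   {gap_unknown:.4f}")
```

In this sketch the gap shrinks as data accumulates but never reaches exactly zero, which matches the claim that the bias never fully goes away.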

Unfortunately for you, your programmer set the parameter at random rather than trying to figure out which setting was most accurate or useful. You know for a fact that they just mashed the numpad.

How should you think about this?

Posts tagged Hansonian Pre-Rationality
Towards a mechanistic understanding of corrigibility (evhub)
Towards an Intentional Research Agenda (romeostevensit)