Non-manipulative oracles

by Stuart_Armstrong

6th Feb 2015

Oracle AI

Personal Blog

jessicata (11y ago)

I discussed this with Benja at a previous MIRIx workshop. I don't remember exactly what we concluded, but I think it mostly works; it just requires that people behave sensibly when they get scrubbed predictions.

Now that I think about it: to handle cases where people don't behave sensibly with scrubbed predictions, maybe we want a sequence of oracles, where oracle 0 outputs nothing and oracle n+1 outputs what would happen if it were replaced with oracle n. We could take the limit as n approaches infinity, but then we don't know much about which fixed point we will get (it will be controlled by subtle feedback loops). So maybe we want n to be random between 0 and 3, with n=3 most probable, so that it's meaningful to condition on n=0, n=1, and n=2.
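A rough sketch of the oracle sequence, in case it helps make the recursion concrete. Everything here is an illustrative assumption rather than part of the proposal: `toy_world` is a made-up feedback rule standing in for "what happens when people see a prediction", and the weights in `sample_level` are arbitrary numbers chosen only so that n=3 is most probable.

```python
import random
from typing import Callable, Optional

# A prediction is a number, or None when the oracle outputs nothing.
Prediction = Optional[float]


def oracle_output(n: int, world: Callable[[Prediction], float]) -> Prediction:
    """Output of oracle n.

    Oracle 0 outputs nothing. Oracle n+1 outputs what would happen
    if it were replaced with oracle n, i.e. world(output of oracle n).
    """
    output: Prediction = None
    for _ in range(n):
        output = world(output)
    return output


def toy_world(prediction: Prediction) -> float:
    """Hypothetical world: the outcome depends on the prediction people see."""
    if prediction is None:
        return 10.0  # assumed outcome when no prediction is published
    # Assumed feedback rule: people react, pulling the outcome toward 4.0.
    return 0.5 * prediction + 2.0


def sample_level(rng: random.Random, weights=(0.1, 0.1, 0.1, 0.7)) -> int:
    """Draw n from {0, 1, 2, 3}; the weights are illustrative, with n=3 most likely."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    n = sample_level(rng)
    print("sampled n:", n, "oracle output:", oracle_output(n, toy_world))
    for level in range(4):
        print(level, oracle_output(level, toy_world))
```

In this toy world the outputs for n = 0, 1, 2, 3 are nothing, 10.0, 7.0, and 5.5. They drift toward the fixed point at 4.0, which is the feedback-driven limit that capping n at 3 is meant to avoid relying on.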

AI ALIGNMENT FORUM