Posts

Sorted by New

Wiki Contributions

Comments

Sorted by

Newest

How to solve deception and still fail.

Nathaniel Monson1y4-2

Those doesn't necessarily seem correct to me. If, eg, OpenAI develops a super intelligent, non deceptive AI, then I'd expect some of the first questions they'd ask it to be of the form "are there questions which we would regret asking you, according to our own current values? How can we avoid asking you those while still getting lots of use and insight from you? What are some standard prefaces we should attach to questions to make sure following through on your answer is good for us? What are some security measures that we can take to make sure our users lives are generally improved by interacting with you? What are some security measures we can take to minimize the chances of a world turning out very badly according to our own desires?" Etc.

Reply