This jumps from mathematical consistency to a kind of opinion once Pareto improvement enters the picture. Sure, if we have a choice between two social policies and everyone prefers one over the other because their personal lot is better under it, there is no conflict over the ordering. That could be warranted if for some reason we needed consensus to get a thing passed. However, where there is genuine conflict, it seems to say that a "good" social policy can't be formed.
By analogy with the "utility monster", construct a "consensus spoiler". He prefers exactly what everyone else anti-prefers, weighting everyone with a coefficient of -1. If someone else would gain something, he counts it as a loss for himself. So no Pareto improvements are possible. If you have a community of 100 agents that would agree to pick some states over others, and you form a new community of 101 by adding the consensus spoiler, then they can't form any choice function. The consensus spoiler is, in effect, maximally antagonistic towards everyone else. Whether it is warranted, allowed or forbidden for the coalition of 100 to just proceed with the policy choice that screws the spoiler over doesn't seem to be a mathematical kind of claim.
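A minimal sketch of the construction, assuming the spoiler's utility for each policy is just the negated sum of everyone else's (the agents, policies and numbers here are made up for illustration):

```python
# Toy utilities of three ordinary agents over four candidate policies.
base_utilities = [
    [1, 2, 0, 3],  # agent 0
    [2, 2, 1, 3],  # agent 1
    [0, 1, 1, 2],  # agent 2
]

# The consensus spoiler weights everyone else with a coefficient of -1:
# its utility for each policy is minus the sum of the others' utilities.
spoiler = [-sum(u[p] for u in base_utilities) for p in range(4)]

def pareto_improvements(utilities):
    """Pairs (a, b) where moving from policy a to policy b leaves everyone
    at least as well off and someone strictly better off."""
    n_policies = len(utilities[0])
    pairs = []
    for a in range(n_policies):
        for b in range(n_policies):
            if a == b:
                continue
            weakly_better = all(u[b] >= u[a] for u in utilities)
            strictly_better = any(u[b] > u[a] for u in utilities)
            if weakly_better and strictly_better:
                pairs.append((a, b))
    return pairs

print(pareto_improvements(base_utilities))              # several improvements exist
print(pareto_improvements(base_utilities + [spoiler]))  # [] -- the spoiler blocks all of them
```

Any change that strictly helps some ordinary agent strictly hurts the spoiler, so once the spoiler is included the set of Pareto improvements is always empty.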
And even in less extreme cases I don't see how you could use this setup to judge values that are in conflict. If you encounter an unknown agent, it seems ambiguous whether you should take heed of its values and compromise, or treat it as a possible enemy and just stick to your personal choices.
Impact is the amount by which I must do things differently to reach my goals.
At least three big, strong intuitions. A thing that happens which makes the results of my current actions way worse is big impact. A thing that happens which changes the feasibility or utility of an action at my disposal a lot is a big deal (which often means that the action must be performed, or must not be performed). If there is a lot of surprise, but the way to overcome the surprises is to carry on exactly as I was already doing, that is low to no impact.
As I understand it, expanding candy into candy A and candy B, but not expanding the other state, will make the ratios come out differently.
In probability one can make the assumption of equiprobability: if you have no reason to think one outcome is more likely than another, it might be reasonable to assume they are equally likely.
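A toy version of the arithmetic, with made-up outcomes: with two states, candy and chocolate, equiprobability gives each probability 1/2, so the candy-to-chocolate ratio is 1:1. Expand candy into candy A and candy B (but leave chocolate alone) and apply equiprobability to the three resulting states, and each gets 1/3; candy as a whole now has 2/3 against chocolate's 1/3, a 2:1 ratio, purely from redescribing the states.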
If we knew what was important and what was not, we would be sure about the optimality. But since we think we don't know it, or might be in error about it, we treat value as if it could be hiding anywhere. That seems to work in a world where each node is roughly equally likely to contain value. I guess it comes from the relevant utility functions being defined in terms of the states we know about.