This is one of the key open questions in the plan experiment.
Coherent extrapolated volition (alignment target)
From "Policy Desiderata in the Development of Machine Superintelligence":
In the most general terms, we optimistically take the overarching objective to be the realization of a widely appealing and inclusive near- and long-term future that ultimately achieves humanity’s potential for desirable development while being considerate to beings of all kinds whose interests may be affected by our choices. An ideal proposal for governance arrangements would be one conducive to that end.