A behaviorist genie is an AI that has been averted from modeling minds in more detail than some whitelisted class of models.
This is potentially a good idea because many of the anticipated difficulties seem to be associated with the AI having a sufficiently advanced model of human minds or of other AI minds, including:
...and yet an AI that is extremely good at understanding material objects and technology (just not other minds) would still be capable of some important classes of pivotal achievement.
A behaviorist genie would still require most of genie theory and corrigibility to be solved. But it's plausible that the restrictions away from modeling humans, programmers, and some types of reflectivity would collectively make it significantly easier to build a safe form of this genie.
Thus, a behaviorist genie is one of the relatively few open candidates for "an AI that is restricted in a way that actually makes it safer to build, without being so restricted as to be incapable of game-changing achievements".
Nonetheless, limiting the degree to which the AI can understand cognitive science, other minds, its own programmers, and itself is a very severe restriction, one that would rule out a number of obvious ways to make progress on the AGI subproblem and on the value identification problem, even for commands given to Task AGIs (Genies). Furthermore, there could be easier types of genie to build, or there might be grave difficulties in restricting the model class to a space that is useful without being dangerous.
Broadly speaking, two possible clusters of behaviorist-genie design are:
Breaking the first cluster down in more detail, the potential desiderata for a behaviorist design are:
These are different goals, though with some overlap between them. Some of the things we might need:
In the KANSI case, we'd presumably be working 'naturally' with limited model classes (on the assumption that everything the AI is using is being monitored, has a known algorithm, and has a known model class), and the goal would just be to prevent the KANSI agent from spilling over and creating other human models somewhere else; this might fit well into a general agenda against self-modification and subagent creation. Similarly, if every new subject is being identified and whitelisted by human monitors, then simply declining to whitelist the topics of modeling distant superintelligences or devising strategies for programmer manipulation might get most of the job done to an acceptable level, provided the underlying whitelist is never evaded (even emergently). This would require a lot of successfully maintained vigilance and human monitoring, though, especially if the KANSI agent is trying to allocate a new human-modeling domain once per second and every instance has to be manually checked.
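As a purely illustrative sketch of the kind of gate such a whitelisting regime implies (nothing here is drawn from an actual design; the topic labels, class names, and queue mechanics are all hypothetical), the check might sit between the agent's domain-allocation requests and its model store:

```python
from dataclasses import dataclass, field
from queue import Queue

# Hypothetical topic labels; in a real system these would come from whatever
# ontology the KANSI agent's monitored components expose.
WHITELISTED_TOPICS = {"materials_science", "protein_folding", "robotics"}
BLACKLISTED_TOPICS = {"human_modeling", "programmer_manipulation",
                      "distant_superintelligences"}

@dataclass
class DomainRequest:
    topic: str             # what the agent wants to model
    model_class: str       # e.g. "finite_state", "causal_graph"
    requested_by: str      # which monitored component asked

@dataclass
class WhitelistGate:
    """Illustrative gate: pre-approved topics get models allocated
    automatically; blacklisted topics are refused outright; everything
    else waits for a human monitor to decide."""
    pending_review: Queue = field(default_factory=Queue)

    def handle(self, req: DomainRequest) -> bool:
        if req.topic in BLACKLISTED_TOPICS:
            return False                 # never allocate a model for these
        if req.topic in WHITELISTED_TOPICS:
            return True                  # allocate within the known model class
        self.pending_review.put(req)     # novel topic: queue for manual review
        return False
```

The last branch is where the vigilance problem shows up: if novel topics arrive on the order of once per second, the pending-review queue becomes the bottleneck, and the scheme degrades to whatever the human monitors can actually keep up with.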