You (correctly, I believe) distinguish between controlling the reward function and controlling the rewards. This is very important as reflected in your noting the disanalogy to AGI. So I'm a little puzzled by your association of the second bullet point (controlling the reward function, which parents have quite low but non-zero control over) with behaviorism (controlling the rewards, which parents have a lot of control over).
(UPDATE: I wrote a better discussion of this topic at: Heritability, Behaviorism, and Within-Lifetime RL)
Hmm. I’m not sure it’s that important what is or isn’t “behaviorism”, and in any case I’m not an expert on that (I haven’t read original behaviorist writing, so maybe my understanding of “behaviorism” is a caricature by its critics). But I thought Scott & Eliezer were both interested in the question of what happens when the kid grows up and the parents are no longer around.
My comment above was a bit sloppy. Let me try again. Here are two stories:
“RL with continuous learning” story: The person has an internal reward function in their head, and over time they’ll settle into the patterns of thought & behavior that best tickle their internal reward function. If they spend a lot of time in the presence of their parents, they’ll gradually learn patterns of thought & behavior that best tickle their internal reward function in the presence of their parents. If they spend a lot of time hanging out with friends, they’ll gradually learn patterns of thought & behavior that best tickle their internal reward function when they’re hanging out with friends. As adults in society, they’ll gradually learn patterns of thought & behavior that best tickle their internal reward function as adults in society.
“RL learn-then-get-stuck” story: As Scott wrote in OP, “a child does something socially proscribed (eg steal). Their parents punish them. They learn some combination of "don't steal" and "don't get caught stealing". A few people (eg sociopaths) learn only "don't get caught stealing", but most of the rest of us get at least some genuine aversion to stealing that eventually generalizes into a real sense of ethics.” (And that “real sense of ethics” persists through adulthood.)
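To make the contrast between the two stories concrete, here's a minimal toy sketch in Python. Everything in it is an assumption for illustration: the two contexts, the two abstract habits, the reward numbers, the learning rate, and the helper name `run` are all hypothetical, and it's a cartoon rather than a model of how human learning actually works. The only point is that a learner who keeps updating ends up with whatever best tickles the reward function in its current context, while a learner who freezes after childhood keeps whatever it learned around its parents.

```python
# Hypothetical payoffs: how much each habit tickles the (fixed) internal
# reward function in each context. The numbers are made up for illustration.
REWARD = {
    "with_parents": {"habit_a": 1.0, "habit_b": 0.0},
    "adult_society": {"habit_a": 0.0, "habit_b": 1.0},
}


def run(freeze_after_childhood, steps_per_phase=50, alpha=0.2):
    """Return the habit the learner prefers at the end of adulthood.

    freeze_after_childhood=False -> "RL with continuous learning" story
    freeze_after_childhood=True  -> "RL learn-then-get-stuck" story
    """
    # Learned value estimate for each habit, starting from indifference.
    value = {"habit_a": 0.0, "habit_b": 0.0}

    for phase, context in enumerate(["with_parents", "adult_society"]):
        # In the "get stuck" story, learning stops once childhood is over.
        if freeze_after_childhood and phase > 0:
            break
        for _ in range(steps_per_phase):
            # Simplification: try every habit once per step (no randomness)
            # and nudge its value toward the reward it produced.
            for habit in value:
                reward = REWARD[context][habit]
                value[habit] += alpha * (reward - value[habit])

    # The preferred habit is the one with the highest learned value.
    return max(value, key=value.get)


continuous_learner = run(freeze_after_childhood=False)
stuck_learner = run(freeze_after_childhood=True)
print("continuous learning:", continuous_learner)
print("learn-then-get-stuck:", stuck_learner)
```

With these made-up numbers, the continuous learner ends up preferring `habit_b` (what pays off in adult society) and the stuck learner keeps preferring `habit_a` (what paid off around the parents). Note that the reward table is the same for both learners; the two stories differ only in whether learning continues.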
I think lots of evidence favors the first story over the second story, at least in humans (I don’t know much about non-human animals). Particularly: (1) heritability studies, (2) cultural shifts, (3) p