All of Joe Kwon's Comments + Replies

Really appreciated this post and I'm especially excited for post 13 now! In the past month or two, I've been thinking about stuff like "I crave chocolate" and "I should abstain from eating chocolate" as being a result of two independent value systems (one whose policy was shaped by evolutionary pressure and one whose policy is... idk vaguely "higher order" stuff where you will endure higher states of cortisol to contribute to society or something). 

I'm starting to lean away from this a little bit, and I think reading this post gave me a good idea of w...

3 Steve Byrnes
Thanks! Right, I think there's one reward function (well, one reward function that's relevant for this discussion), and that for every thought we think, we're thinking it because it's rewarding to do so, or at least more rewarding than alternative thoughts. Sometimes a thought is rewarding because it involves feeling good now, sometimes it's rewarding because it involves an expectation of feeling good in the distant future, sometimes it's rewarding because it involves an expectation that it will make your beloved friend feel good, sometimes it's rewarding because it involves an expectation that it will make your admired in-group members very impressed with you, etc.

I think that the thing that gets rewarded is thoughts / plans, not just actions / states. So we don't have to assume that the Thought Generator is proposing an action that's unrewarding now (going to the gym) in order to get into a more-rewarding state later on (being ripped). Instead, the Thought Generator can generate one thought right now, “I'm gonna go to the gym so that I can get ripped”. That one thought can be rewarding right now, because the “…so that I can get ripped” is right there in the thought, providing evidence to the brainstem that the thought should be rewarded, and that evidence can plausibly outweigh the countervailing evidence from the “I'm gonna go to the gym…” part of the thought.

I do think there's still an adjustable parameter in the brain related to time-discounting, even if the details are kinda different than in normal RL. But I don't see a strong connection between that and social instincts. For example, if you abstain from ice cream to avoid a stomach ache, that's a time-discounting thing, but it's not a social-instincts thing. It's possible that social animals in general are genetically wired to time-discount less than non-social animals, but I don't have any particular reason to expect that to be the case. Or, maybe humans in particular are genetically wired to time-disc