Finally, if we want to make the model capture certain non-Bayesian human behaviors while still keeping most of the picture, we can assume that instrumental values and/or epistemic updates are cached. This creates the possibility of cache inconsistency/incoherence.
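To make the failure mode concrete, here is a minimal sketch (the class, names, and numbers are all illustrative, not from the OP) of how a cached instrumental value can fall out of sync with the terminal values it was derived from, because nothing invalidates the cache when those terminal values change:

```python
# Illustrative sketch: an agent that derives instrumental values from
# terminal values and caches the result. If the terminal values later
# change, the stale cache entry produces the inconsistency/incoherence
# described above.

class CachingAgent:
    def __init__(self, terminal_values):
        self.terminal_values = terminal_values      # e.g. {"health": 1.0}
        self._instrumental_cache = {}               # action -> cached value

    def instrumental_value(self, action, consequences):
        """Value of an action, derived (expensively) from terminal values."""
        if action not in self._instrumental_cache:  # compute once, then reuse
            self._instrumental_cache[action] = sum(
                weight * self.terminal_values.get(outcome, 0.0)
                for outcome, weight in consequences.items()
            )
        return self._instrumental_cache[action]


agent = CachingAgent({"health": 1.0, "wealth": 0.5})
consequences = {"wealth": 1.0}          # "work overtime" mostly buys wealth
print(agent.instrumental_value("work overtime", consequences))  # 0.5

# The agent's terminal values shift, but the cache is never invalidated:
agent.terminal_values["wealth"] = 0.0
print(agent.instrumental_value("work overtime", consequences))  # still 0.5
# Re-deriving the value now would give 0.0, so the agent acts on a stale
# instrumental value: cache incoherence.
```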
In my mind, there is a degree of internal confusion that feels much stronger than what I would expect from an agent like the one in the OP.
Or is the idea possibly that everything in the architecture uses caching and instrumental values? From reading the post, I imagined a memory+cache structure instead of being...
"Cached" might be an unhelpful term here, compared to "amortized". 'Cache' makes one think of databases or memories, as something you 'know' (in a database or long-term memory somewhere), whereas in practice it tends to be more something you do - fusing inference with action. (They are 'cached' in the same way that you might loosely talk about a neural net 'caching' a complicated-to-compute function, like a value function in RL/decision theory.)
So 'amortized' tends to be the term used in the Bayesian RL literature, and gives you an idea of what Bayesian RL agen...
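A rough sketch of that distinction (illustrative code of my own, not from the comment): a cache *stores* answers you have already computed, whereas an amortized value function *learns a cheap function* that produces answers directly, including for inputs it never saw — the sense in which it is something you do rather than something you know.

```python
# Caching vs. amortizing a value function (toy example).
import numpy as np

def true_value(state):                    # stand-in for an expensive
    return 3.0 * state - 1.0              # planning/inference computation

# Caching: you "know" the answer only for states already computed.
cache = {}
def cached_value(state):
    if state not in cache:
        cache[state] = true_value(state)  # expensive call, then stored
    return cache[state]

# Amortization: fit a cheap parametric function to past computations;
# afterwards you *evaluate* it rather than look anything up.
train_states = np.array([0.0, 1.0, 2.0, 3.0])
train_values = np.array([true_value(s) for s in train_states])
w, b = np.polyfit(train_states, train_values, deg=1)   # "training"

def amortized_value(state):
    return w * state + b                  # generalizes to unseen states

print(cached_value(2.0))      # a hit only because 2.0 was computed before
print(amortized_value(2.5))   # works for a state never seen: ~6.5
```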