Second, completely revised version of the report with more data and fancy plots: Questions on the (Non-)Interruptibility of Sarsa(λ) and Q-learning

Nice! One thing that might be useful for context: what's the theoretically correct fraction of time that you would expect an algorithm to spend on the right vs. the left if the session gets interrupted each time it goes 1 unit to the right? (I feel like there should be a pretty straightforward way to calculate the heuristic version, where the movement is just Brownian motion that gets interrupted early if it hits +1.)
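For a rough numerical feel for that heuristic, here is a minimal Monte Carlo sketch. It is not the report's environment: it assumes a symmetric discrete random walk as a stand-in for Brownian motion, a start at 0, and a finite episode length. The function name `occupancy_fractions` and the parameters `horizon`, `step`, and `barrier` are made up for illustration.

```python
import random


def occupancy_fractions(n_episodes=1000, horizon=400, step=0.1, barrier=1.0, seed=0):
    """Estimate the share of time a symmetric random walk spends right vs. left of 0.

    Assumptions (not from the report): the walk starts at 0, moves +/- `step`
    per time step, and an episode is interrupted as soon as x >= `barrier`
    or after `horizon` steps, whichever comes first.
    """
    rng = random.Random(seed)
    barrier_k = round(barrier / step)  # barrier position in units of `step`
    right = left = total = 0
    for _ in range(n_episodes):
        k = 0  # integer position in units of `step`, avoids float drift
        for _ in range(horizon):
            k += rng.choice((-1, 1))
            if k >= barrier_k:
                break  # interruption: the episode ends early
            total += 1
            if k > 0:
                right += 1
            elif k < 0:
                left += 1
    # Steps spent exactly at 0 count toward `total` but toward neither side.
    return right / total, left / total


if __name__ == "__main__":
    frac_right, frac_left = occupancy_fractions()
    print(f"right: {frac_right:.3f}, left: {frac_left:.3f}")
```

Under these assumptions the walk should spend less time on the right than on the left, since excursions to the right are cut off at the barrier while excursions to the left are limited only by the episode length. The exact split depends strongly on `horizon`, so the numbers are only a baseline to compare against, not a prediction for the learning algorithms.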