Temporal-Difference Learning and the importance of exploration: An illustrated guide

A comparison of TD(0) and constant-α Monte Carlo methods on the random walk task