A tabular implementation of SARSA, an on-policy temporal-difference reinforcement learning algorithm closely related to Q-learning.
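
For orientation, here is a minimal sketch of what a tabular SARSA loop can look like. It is not taken from this repository: the five-state chain environment, the function names (`step`, `epsilon_greedy`, `sarsa`), and all hyperparameter values are assumptions chosen for illustration. The point it shows is the on-policy update: the target uses the action the agent actually selects next, whereas Q-learning would use the maximum over next actions.

```python
import random
from collections import defaultdict

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (illustrative assumption): reward 1.0 on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one (ties broken randomly)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def sarsa(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular SARSA and return the Q-table as a dict keyed by (state, action)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            # On-policy: choose the next action with the same behaviour policy.
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            # Terminal transitions bootstrap from nothing; others from Q(s', a').
            target = reward if done else reward + gamma * q[(next_state, next_action)]
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = next_state, next_action
    return q
```

Calling `q = sarsa()` returns the learned table; `q[(state, action)]` gives the estimated return for taking `action` in `state` and then following the epsilon-greedy policy. Swapping `q[(next_state, next_action)]` for `max(q[(next_state, a)] for a in ACTIONS)` in the target would turn this sketch into Q-learning.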