Use of reinforcement learning, agent acts, receives reward