A reinforcement learning algorithm that learns the value of taking each action in a given state. It doesn't require a model of its environment, which lets it handle stochastic transitions and rewards. Q-learning is an off-policy algorithm: it learns the value of the greedy policy even while the agent follows a different, more exploratory policy.
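As an illustration, the following is a minimal sketch of tabular Q-learning. The four-state chain environment, the `step` and `train` helpers, and the values chosen for the learning rate `alpha`, discount `gamma`, and exploration rate `epsilon` are all assumptions made for this example, not part of the definition. The update in `q_update` is the standard one: move Q(s, a) toward r + gamma * max over a' of Q(s', a').

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]  # hypothetical action set for the toy example
GOAL = 3                     # hypothetical terminal state


def step(state, action):
    """Toy deterministic chain 0-1-2-3; reaching state 3 pays reward 1."""
    if action == "right":
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_update(Q, state, action, reward, next_state, done, alpha=0.5, gamma=0.9):
    """One Q-learning step: nudge Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    # Off-policy: the target uses the best next action, not the one
    # the agent will actually take next.
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]


def train(episodes=500, epsilon=0.2, seed=0, max_steps=100):
    """Learn Q from experience only; the agent never inspects step() itself."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy behavior policy: mostly greedy, sometimes random.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[(state, a)])
            next_state, reward, done = step(state, action)
            q_update(Q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return Q
```

Calling `train()` returns a table keyed by `(state, action)`. The agent only ever sees sampled transitions and rewards, which is what "model-free" means here, and the `max` in the update target is what makes the method off-policy.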