Reinforcement Learning

Reinforcement Learning (RL) is a machine learning paradigm where an agent learns to make decisions in an environment to maximize a cumulative reward. The agent interacts with the environment, takes actions, receives feedback (rewards or penalties), and adjusts its strategy (policy) accordingly.

Explanation

Reinforcement Learning differs from both supervised and unsupervised learning. Unlike supervised learning, RL does not rely on labeled examples of correct actions; the agent instead discovers a good policy through trial and error. Unlike unsupervised learning, RL has an explicit reward signal that guides the learning process.

At each step, the agent observes the current state of the environment and chooses an action according to its current policy. The environment then transitions to a new state and returns a reward. The agent uses this feedback to update its policy, aiming to maximize the total reward it receives over time.

Common RL algorithms include Q-learning, SARSA, and Deep Q-Networks (DQN); DQN combines Q-learning with deep neural networks to handle large or complex state spaces. RL is used in applications such as robotics, game playing (e.g., AlphaGo), and resource management.
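
The observe, act, receive reward, update loop described above can be sketched with tabular Q-learning on a toy problem. Everything specific here is an assumption made for illustration and does not come from the glossary entry: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the values of the learning rate `alpha`, discount factor `gamma`, and exploration rate `epsilon`. It is a minimal sketch, not a reference implementation.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def step(state, action_index):
    """Toy environment (assumed for illustration): returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the estimates are tied),
            # otherwise exploit the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the reward plus the
            # discounted value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to 'right')."""
    return [0 if left > right else 1 for left, right in q]


q_table = train()
policy = greedy_policy(q_table)
```

With these assumed settings, the learned greedy policy should choose "right" (action 1) in every non-goal cell, since that is the shortest path to the only reward. SARSA would differ only in the update target, which uses the value of the action actually taken next rather than the maximum over next actions.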
