This document provides an overview of reinforcement learning concepts including:
1) It defines the key components of a Markov Decision Process (MDP) including states, actions, transitions, rewards, and discount rate.
2) It describes value functions which estimate the expected return for following a particular policy from each state or state-action pair.
3) It discusses several elementary solution methods for reinforcement learning problems including dynamic programming, Monte Carlo methods, temporal-difference learning, and actor-critic methods.