This equation describes the expected return obtained by following the actions prescribed by some policy $\pi$.
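For reference, in the notation used for the optimality equation below, the Bellman expectation equation for a deterministic policy $\pi$ presumably takes the form

V^{\pi}(s) = R(s, \pi(s)) + \gamma \sum_{s'} P(s' \mid s, \pi(s))\, V^{\pi}(s').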
The equation for the optimal policy is referred to as the Bellman optimality equation:
V^{*}(s) = \max_{a}\left[\, R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \,\right].
It gives the expected return when, in each state, we take the action with the highest expected return.
These Bellman equations are extremely important and useful recurrence relations: they can be used to compute the return from a given policy (policy evaluation) or to compute the optimal return via value iteration, as sketched below.
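As a concrete illustration, here is a minimal NumPy sketch of both uses for a small finite MDP. The arrays P (transitions, shape (A, S, S)), R (expected immediate rewards, shape (S, A)), the discount gamma, and both function names are hypothetical stand-ins chosen for this sketch, not taken from the text above. `policy_evaluation` solves the Bellman expectation equation for a fixed deterministic policy as a linear system; `value_iteration` repeatedly applies the Bellman optimality backup.

```python
import numpy as np

def policy_evaluation(P, R, pi, gamma=0.9):
    """Return V^pi by solving (I - gamma * P_pi) V = R_pi as a linear system.

    P  -- transition probabilities, shape (A, S, S): P[a, s, t] = Pr(t | s, a)
    R  -- expected immediate rewards, shape (S, A):  R[s, a]
    pi -- deterministic policy, shape (S,): pi[s] is the action taken in s
    """
    S = P.shape[1]
    P_pi = P[pi, np.arange(S), :]        # (S, S): row s is P(. | s, pi[s])
    R_pi = R[np.arange(S), pi]           # (S,):  R(s, pi[s])
    return np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Return (V*, greedy policy) by iterating the Bellman optimality backup."""
    S = P.shape[1]
    V = np.zeros(S)
    while True:
        # Q[s, a] = R(s, a) + gamma * sum_{s'} P(s' | s, a) V(s')
        Q = R + gamma * np.einsum('ast,t->sa', P, V)
        V_new = Q.max(axis=1)            # Bellman optimality backup
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

# Tiny hypothetical 2-state, 2-action MDP, for illustration only.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
              [[0.5, 0.5], [0.0, 1.0]]])  # transitions under action 1
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V_star, greedy = value_iteration(P, R, gamma=0.95)
V_pi = policy_evaluation(P, R, pi=np.array([0, 1]), gamma=0.95)
```

Value iteration stops once successive value estimates differ by less than tol; at that point the greedy policy extracted from Q is (approximately) optimal, while policy_evaluation returns the exact return of the fixed policy it is given.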