Let $\pi$ and $\pi'$ be any pair of deterministic policies such that, for all $s\in S$,
$$
q_\pi (s, \pi'(s)) \ge v_\pi (s).
$$
Then the policy $\pi'$ must be as good as, or better than, $\pi$. That is, it must obtain greater or equal expected return from all states $s\in S$:
$$
v_{\pi'} (s) \ge v_\pi (s).
$$
Moreover, if there is strict inequality at any state in the condition, then there must be strict inequality at that state in the conclusion.
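
The statement can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the 3-state, 2-action MDP `P`, the discount `GAMMA = 0.9`, and the helper names `q_value` and `evaluate` are all hypothetical and do not come from the text above. It evaluates a fixed policy $\pi$ by iterative policy evaluation, builds $\pi'$ by acting greedily with respect to $q_\pi$ (one simple way to satisfy the condition, not the only one), and then confirms that $v_{\pi'}(s) \ge v_\pi(s)$ in every state.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical MDP: P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 2.0)]},
    2: {0: [(1.0, 2, 0.5)], 1: [(0.5, 0, 0.0), (0.5, 1, 3.0)]},
}

def q_value(v, s, a):
    """One-step lookahead: expected reward plus discounted value of the successor."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in P[s][a])

def evaluate(policy, tol=1e-12):
    """Iterative policy evaluation for a deterministic policy {state: action}."""
    v = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            new = q_value(v, s, policy[s])
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            return v

# Original policy pi: always take action 0.
pi = {0: 0, 1: 0, 2: 0}
v_pi = evaluate(pi)

# pi' is greedy with respect to q_pi, so q_pi(s, pi'(s)) >= v_pi(s) holds.
pi_new = {s: max(P[s], key=lambda a: q_value(v_pi, s, a)) for s in P}
v_new = evaluate(pi_new)

EPS = 1e-9  # tolerance for floating-point comparison
for s in P:
    # Condition of the theorem.
    assert q_value(v_pi, s, pi_new[s]) >= v_pi[s] - EPS
    # Conclusion of the theorem.
    assert v_new[s] >= v_pi[s] - EPS
    print(s, pi[s], pi_new[s], round(v_pi[s], 3), round(v_new[s], 3))
```

In this particular example the condition is strict in states 0 and 1 and holds with equality in state 2, and the computed values improve strictly in states 0 and 1 while staying equal in state 2, consistent with the strict-inequality remark.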