A Concentration Bound for LSPE(λ)
The popular LSPE(λ) algorithm for policy evaluation is revisited to derive a concentration bound that gives high-probability performance guarantees from some time on.