README
Part I
Chapter 1: Introduction
Chapter 2: Multi-armed Bandits
Chapter 3: Finite Markov Decision Processes
Chapter 4: Dynamic Programming
Chapter 5: Monte Carlo Methods
Chapter 6: Temporal-Difference Learning
Chapter 7: n-step Bootstrapping
Chapter 8: Planning and Learning with Tabular Methods
Part II
Chapter 9: On-policy Prediction with Approximation
Chapter 10: On-policy Control with Approximation
Chapter 11: Off-policy Methods with Approximation
Chapter 12: Eligibility Traces
Chapter 13: Policy Gradient Methods
Part III
Chapter 14: Psychology
Chapter 15: Neuroscience
Chapter 16: Applications and Case Studies
Chapter 14: Psychology
Omitted.