Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning series) Hardcover – 8 May 1998
Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. This text aims to provide a clear and simple account of the key ideas and algorithms of reinforcement learning. The discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part one defines the reinforcement learning problems in terms of Markov decision problems. Part two provides basic solution methods - dynamic programming, Monte Carlo simulation and temporal-difference learning - and part three presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces and planning. The two final chapters present case studies and consider the future of reinforcement learning.
From the Author
A unified approach to AI, machine learning, and control
Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In this book, we provide an explanation of the key ideas and algorithms of reinforcement learning. The discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. This book is meant to be an introductory treatment of reinforcement learning, emphasizing foundations and ideas rather than the latest developments and mathematical proofs. We divide the ideas underlying the field into a half dozen primary dimensions, consider each in detail, and then combine them to form a much larger space of possible methods including all the most popular ones from Q-learning to value iteration and heuristic search. In this way we have tried to make the book interesting to both newcomers and experts alike. We have tried to make the work accessible to the broadest possible audiences in artificial intelligence, control engineering, operations research, psychology, and neuroscience. If you are a teacher, we urge you to consider creating or altering a course to use the book. We have found that the book works very well as the text for a course on reinforcement learning at the graduate or advanced undergraduate level. The eleven chapters can be covered one per week. Exercises are provided in each chapter to help the students think on their own about the material. Answers to the exercises are available to instructors, for now from me, and probably later from MIT Press in an instructor's manual. Programming projects are also suggested throughout the book. 
Of course, the book can also be used to help teach reinforcement learning as it is most commonly done now, that is, as part of a broader course on machine learning, artificial intelligence, neural networks, or advanced control. I have taught all the material in the book in as little as four weeks, and of course subsets can be covered in less time.
Table of contents:
Part I: The Problem
1 Introduction
2 Evaluative Feedback
3 The Reinforcement Learning Problem
Part II: Elementary Methods
4 Dynamic Programming
5 Monte Carlo Methods
6 Temporal Difference Learning
Part III: A Unified View
7 Eligibility Traces
8 Generalization and Function Approximation
9 Planning and Learning
10 Dimensions of Reinforcement Learning
11 Case Studies
For further information, see http://envy.cs.umass.edu/~rich/book/the-book.html.
Most helpful customer reviews on Amazon.com
The authors define reinforcement learning as learning how to map situations to actions so as to maximize a numerical reward. A machine engaged in reinforcement learning discovers on its own which actions will optimize the reward by trying them out. It is this ability to learn from experience that distinguishes it from a machine engaged in supervised learning, for in the latter, examples are needed to guide the machine to the proper concept or knowledge. The authors emphasize the "exploration-exploitation" tradeoffs that reinforcement-learning machines have to deal with as they interact with the environment.
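The exploration-exploitation tradeoff can be illustrated with a small sketch. The code below is not taken from the book; it assumes a hypothetical three-armed bandit with Gaussian rewards and uses epsilon-greedy selection, one common way to balance trying new actions against repeating the best one found so far. The function name, the arm means, and the parameter values are all made up for illustration.

```python
import random

def run_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Epsilon-greedy action selection on a Gaussian multi-armed bandit."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each arm's mean reward
    counts = [0] * n_arms       # number of times each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # incremental sample-average update of the estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical arm means; the third arm is the best one.
estimates, counts = run_bandit([0.2, 0.5, 1.5])
```

With a small epsilon the machine spends most of its pulls on the arm it currently believes is best, while the occasional random pull keeps the other estimates from going stale.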
For the authors, a reinforcement learning system consists of a `policy', a `reward function', a `value function', and a `model' of the environment. A policy is a mapping from the states of the environment that are perceived by the machine to the actions that are to be taken by the machine when in those states. The reward function maps each perceived state of the environment to a number (the reward). A value function specifies what is good for the machine over the long run. A model, as the name implies, is a representation of the behavior of the environment. The authors emphasize that all of the reinforcement learning methods discussed in the book are concerned with the estimation of value functions, but they point out that other techniques are available for solving reinforcement learning problems, such as genetic algorithms and simulated annealing.
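To make the four components concrete, here is a minimal sketch, assuming a hypothetical two-state environment invented for illustration (the state names, action names, and probabilities are not from the book). The policy, reward function, and model are plain dictionaries, and the value function is estimated by repeatedly applying the expected-return equation to each state.

```python
# Hypothetical two-state environment, invented for illustration.
policy = {"low": "recharge", "high": "search"}  # state -> action
reward = {"low": 0.0, "high": 1.0}              # state -> immediate reward
# Model: (state, action) -> list of (probability, next_state)
model = {
    ("low", "recharge"): [(1.0, "high")],
    ("high", "search"): [(0.5, "high"), (0.5, "low")],
}

def evaluate_policy(policy, reward, model, gamma=0.9, sweeps=200):
    """Estimate the long-run (discounted) value of each state under the policy."""
    values = {s: 0.0 for s in policy}
    for _ in range(sweeps):
        for s, a in policy.items():
            expected_next = sum(p * values[s2] for p, s2 in model[(s, a)])
            values[s] = reward[s] + gamma * expected_next
    return values

values = evaluate_policy(policy, reward, model)
```

The reward dictionary says what is good immediately, while the resulting values say what is good over the long run: the "low" state has zero immediate reward but a high value, because it leads straight to the rewarding "high" state.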
The authors use dynamic programming, Monte Carlo simulation, and temporal-difference learning to solve the reinforcement learning problem, but they emphasize that none of these methods offers a free lunch. An entire chapter is devoted to each of these methods, however, giving the reader a good overview of the weaknesses and strengths of each approach. The differences between them usually boil down to issues of performance rather than accuracy in the generated solutions. Temporal-difference learning is in fact viewed in the book as a combination of Monte Carlo and dynamic programming techniques, and in the opinion of this reviewer, it has resulted in some of the most impressive successes for applications based on reinforcement learning. One of these is TD-Gammon, developed to play backgammon, which is also discussed in the book.
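The sense in which temporal-difference learning combines the other two methods can be sketched in a few lines. The example below is a minimal tabular TD(0) sketch on an assumed five-state random walk (left exit pays 0, right exit pays 1); the function name, step size, and episode count are illustrative choices, not the book's code. Like Monte Carlo, it learns from sampled transitions without a model; like dynamic programming, it updates each estimate from the current estimate of the successor state.

```python
import random

def td0_random_walk(episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Tabular TD(0) value estimation for a 5-state random walk."""
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # initial guess for each non-terminal state
    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:            # left terminal, reward 0
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:  # right terminal, reward 1
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # Sampled transition (as in Monte Carlo), bootstrapped target (as in DP)
            values[state] += alpha * (reward + gamma * next_value - values[state])
            if done:
                break
            state = next_state
    return values

values = td0_random_walk()
```

For this walk the true values are 1/6, 2/6, ..., 5/6 from left to right, so the returned list should settle near those numbers, with some noise because the step size is constant.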
The authors emphasize that these three main strategies for solving reinforcement learning problems are not mutually exclusive. Instead, each of them can be used together with the others, and they devote a few chapters of the book to illustrating how this "unified" approach can be advantageous for reinforcement learning problems. They do this by using explicit algorithms and not just philosophical discussion. These discussions are very interesting and illustrate beautifully the idea that there is no "free lunch" in any of the different algorithms involved in reinforcement learning.
In the last chapter of the book the authors survey some of the more successful applications of reinforcement learning, one of them already mentioned. Another one discussed is the `acrobot', a two-link, underactuated robot that models to some extent the motion of a gymnast on a high bar. The motion of the acrobot is to be controlled by swinging its tip above the first joint, with appropriate rewards given until this goal is reached. The authors use the `Sarsa' learning algorithm, developed earlier in the book, for solving this reinforcement learning problem. The acrobot is an example of the current intense interest in machine learning of physical motion and intelligent control theory.
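The core of Sarsa is a single update rule, which can be sketched as follows. This is a tabular simplification with made-up state and action names, not the acrobot setup described in the book; the function name and default parameters are assumptions. The update uses the action the policy actually takes in the next state, which is what makes Sarsa an on-policy method.

```python
from collections import defaultdict

def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9, done=False):
    """One on-policy Sarsa step: Q(s,a) += alpha * (target - Q(s,a))."""
    # The target uses the action actually chosen in the next state.
    target = reward if done else reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

# Hypothetical states and actions, for illustration only.
q = defaultdict(float)
q[("s1", "right")] = 2.0
new_value = sarsa_update(q, "s0", "right", reward=1.0,
                         next_state="s1", next_action="right")
# target = 1.0 + 0.9 * 2.0 = 2.8, so Q(s0, right) moves from 0.0 to 0.28
```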
Another example discussed in this chapter deals with the problem of elevator dispatching, which the authors include as an example of a problem that cannot be dealt with efficiently by dynamic programming. This problem is studied with Q-learning and via the use of a neural network trained by back propagation.
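For reference, the Q-learning update itself is short. The sketch below is a tabular version with hypothetical state and action names; it stands in for, and is much simpler than, the neural-network approach described for the elevator problem. The function name and default parameters are assumptions. The target takes the maximum over the next state's actions, regardless of which action the machine goes on to take.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9, done=False):
    """One Q-learning step using the greedy value of the next state."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

# Hypothetical states and actions, for illustration only.
actions = ("up", "down")
q = defaultdict(float)
q[("s1", "up")] = 1.0
q[("s1", "down")] = 3.0
new_value = q_learning_update(q, "s0", "up", reward=-1.0,
                              next_state="s1", actions=actions)
# target = -1.0 + 0.9 * max(1.0, 3.0) = 1.7, so Q(s0, up) moves from 0.0 to 0.17
```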
The authors also treat a problem of great importance in the cellular phone industry, namely that of dynamic channel allocation. This problem is formulated as a semi-Markov decision problem, and reinforcement learning techniques were used to minimize the probability of blocking a call. Reinforcement learning has become very important in the communications industry of late, as well as in queuing networks.