## About the book
This book is written in Markdown and compiled into HTML using Jupyter Book, an extension of Jupyter notebooks.
Each HTML page can be downloaded as a Jupyter notebook from the GitHub Pages version of these notes, but note that you will need to install the code and dependencies below to run it.
## About the code
All code in this book is executable. You can download the code from GitHub (Zip file, 2.5MB).
Note: the code in this book is written for understandability rather than efficiency. It is not intended to be production-level code, unlike packages such as the Keras-RL reinforcement learning package.
However, if you understand the code in these notes, you will have little trouble using production-level packages such as Keras-RL.
The code in this book uses as little Python-specific syntax as possible, so that readers less familiar with Python can still follow the snippets.
The code also uses as few external libraries as possible, to make it easy to download and run yourself.
Once you have downloaded the code, unzip it and add the folder to your PYTHONPATH environment variable if you want to run the downloaded Jupyter notebooks.
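The setup above can be sketched as follows; the folder name `rl-notes-code` is an assumption, so replace it with wherever you actually unzipped the code:

```shell
# Assumed folder name -- replace with the real location of the unzipped code.
CODE_DIR="$HOME/rl-notes-code"

# Prepend the code folder so Python (and Jupyter) can import its modules.
export PYTHONPATH="$CODE_DIR${PYTHONPATH:+:$PYTHONPATH}"

# Check the variable is set as expected.
echo "$PYTHONPATH"
```

Adding this `export` line to your shell profile (e.g. `~/.bashrc`) makes the setting persist across sessions.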
Most files in the code have a `main` function that can be run using just `python <filename>.py`. For most of these, no external libraries are required. However, if you want to plot the graphs or draw the trees, you will need to install:
- The Matplotlib library for plotting graphs. You can download it from the website or install it with `pip install matplotlib`.
- The SciPy library, which helps with the graph plotting. You can download it from the website or install it with `pip install scipy`.
- The Graphviz Python library for drawing trees. You can download it from the website or install it with `pip install graphviz`. To render the generated graphs, you will also need to install the Graphviz tool itself, which is called by the Python package.
- The PyTorch deep learning framework, which is used for all deep reinforcement learning code in the book. You can download it from the website or install it with `pip install torch`.
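As noted above, most files expose a `main` function so they can be run directly with `python <filename>.py`. A minimal sketch of that layout (the file name, function names, and toy computation here are hypothetical, not taken from the book's code):

```python
# value_demo.py -- hypothetical sketch of the book's file layout.
# Each file defines its logic plus a main() guarded by a __name__ check,
# so `python value_demo.py` runs the demo with no extra setup.

def run_demo():
    """Toy computation standing in for an algorithm's demo run."""
    values = [0.0, 0.5, 1.0]
    return max(values)

def main():
    print("best value:", run_demo())

# Only runs when the file is executed directly, not when imported.
if __name__ == "__main__":
    main()
```

Because of the `__name__` guard, the same file can also be imported from a Jupyter notebook without triggering the demo.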
## Acknowledgements
Thanks to Alan Lewis for his excellent idea of demonstrating policy gradients using a logistic regression policy, and for implementing the source code for this and the deep policy gradient agent. Thanks also to Alan for setting up the library for playing GIF files, which supports the interactive visualisations that are so useful in this book.
Thanks to Emma Baillie for the idea and implementation of the Contested Crossing examples, and for writing the code to run these examples on the various algorithms.