In this project-based course, we will explore Reinforcement Learning in Python. Reinforcement Learning, or RL for short, differs from supervised learning in that, rather than being given correct examples by humans, the AI finds good answers for itself, guided by a predefined framework of reward signals.
We will discuss theories and concepts that are integral to RL, such as the Multi-Armed Bandit problem and its implications, and how Markov Decision Processes can be leveraged to find solutions. Then we will implement basic Temporal Difference algorithms and Monte Carlo techniques in Python. Finally, we will implement an example of Q-learning in Python.
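To give a flavor of the Multi-Armed Bandit problem, here is a minimal sketch of an epsilon-greedy agent. It is an illustration only, not taken from the course materials: the function name `run_bandit`, the Gaussian reward model, and the parameter values are all assumptions chosen for the example.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with Gaussian rewards (illustrative)."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    n_arms = len(true_means)
    counts = [0] * n_arms              # how often each arm was pulled
    estimates = [0.0] * n_arms         # running average reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: the arm's true mean plus unit Gaussian noise.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_bandit([0.2, 0.5, 0.9])
print("estimated means:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
```

The agent is never told which arm is best. It only observes rewards, and the `epsilon` parameter controls how often it tries an arm at random instead of exploiting its current best estimate.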
Learners are highly encouraged to experiment with the tools and methods discussed in this course, and to keep experimenting beyond its scope.
Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.