Practical Reinforcement Learning

By: Coursera

Overview

Welcome to the Reinforcement Learning online course.

Here you will find out about:

- foundations of RL methods: value/policy iteration, Q-learning, policy gradient, etc.
--- with math & batteries included

- using deep neural networks for RL tasks
--- also known as "the hype train"

- state of the art RL algorithms
--- and how to apply duct tape to them for practical problems.

- and, of course, teaching your neural network to play games
--- because that's what everyone thinks RL is about. We'll also use it for seq2seq and contextual bandits.

Jump in. It's gonna be fun!

Do you have technical problems? Write to us: [email protected].

Syllabus

  • Intro: why should I care?
    • In this module we are gonna define and "taste" what reinforcement learning is about. We'll also learn one simple algorithm that can solve reinforcement learning problems with embarrassing efficiency.
  • At the heart of RL: Dynamic Programming
    • This week we'll consider the reinforcement learning formalisms in a more rigorous, mathematical way. You'll learn how to effectively compute the return your agent gets for a particular action, and how to pick the best actions based on that return.
  • Model-free methods
    • This week we'll find out how to apply last week's ideas to real-world problems: ones where you don't have a perfect model of your environment.
  • Approximate Value Based Methods
    • This week we'll learn to scale things up even further by training agents based on neural networks.
  • Policy-based methods
    • We spent the previous 3 modules working on value-based methods: learning state values, action values and whatnot. Now's the time to see an alternative approach that doesn't require you to predict all future rewards to learn something.
  • Exploration
    • In this final week you'll learn how to build better exploration strategies, with a focus on the contextual bandit setup. In the honors track, you'll also learn how to apply reinforcement learning to train structured deep learning models.

Taught by

Pavel Shvechikov and Alexander Panin

  • Coursera
  • Free
  • English
  • Certificate available
  • Fixed dates
  • Everyone
  • Arabic, French, Portuguese, Italian, German, Russian, English, Spanish, Korean