Training and Validation
Train and simulate reinforcement learning agents
To learn an optimal policy, a reinforcement learning agent interacts with the environment through a repeated trial-and-error process. During training, the agent tunes the parameters of its policy representation to maximize the long-term reward. Reinforcement Learning Toolbox™ software provides functions for training agents and validating the training results through simulation. For more information, see Train Reinforcement Learning Agents.
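The train-then-validate workflow described above can be illustrated with a minimal, toolbox-independent sketch. It is written in plain Python rather than with Reinforcement Learning Toolbox functions. The corridor environment, the reward values, and the names `step`, `train`, and `simulate` are all hypothetical and chosen only for illustration. The agent here is a tabular Q-learning agent whose "policy parameters" are the entries of a Q-table.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells, start at cell 0, goal at cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0  # assumed reward: -1 per move, +10 at the goal
    return next_state, reward, done


def greedy_action(q, state):
    """Index of the action with the highest estimated long-term reward."""
    return max(range(len(ACTIONS)), key=lambda i: q[state][i])


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=50, seed=0):
    """Trial-and-error training: tune the Q-table with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise exploit current estimates.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update toward the bootstrapped long-term reward.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def simulate(q, max_steps=50):
    """Validation: run the greedy policy without exploration or learning."""
    state, total = 0, 0.0
    for _ in range(max_steps):
        state, reward, done = step(state, ACTIONS[greedy_action(q, state)])
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    q_table = train()
    print("Return of the trained greedy policy:", simulate(q_table))
```

In this toy setup the shortest path from the start to the goal takes four moves, so the best achievable return is 7.0 (three moves at -1 plus the +10 goal reward). Comparing the simulated return against that value is the kind of check that validation through simulation provides.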
Apps
Reinforcement Learning Designer | Design, train, and simulate reinforcement learning agents |
Functions
Topics
Training and Simulation Basics
- Train Reinforcement Learning Agents
Find the optimal policy by training your agent within a specified environment.
- Train Reinforcement Learning Agent in Basic Grid World
Train Q-learning and SARSA agents to solve a grid world in MATLAB®.
- Train Reinforcement Learning Agent in MDP Environment
Train a reinforcement learning agent in a generic Markov decision process environment.
- Create Simulink Environment and Train Agent
Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.
- Train Reinforcement Learning Agent for Simple Contextual Bandit Problem
Train Q and DQN agents to solve a contextual bandit problem.
- Log Training Data to Disk
Log a variety of data to disk while training an agent.
- Train Agent or Tune Environment Parameters Using Parameter Sweeping
Tune a DDPG agent using hyperparameter sweeping.
Use the Reinforcement Learning Designer App
- Design and Train Agent Using Reinforcement Learning Designer
Design and train a DQN agent for a cart-pole system using the Reinforcement Learning Designer app.
- Specify Simulation Options in Reinforcement Learning Designer
Interactively specify options for simulating reinforcement learning agents using the Reinforcement Learning Designer app.
- Specify Training Options in Reinforcement Learning Designer
Interactively specify options for training reinforcement learning agents using the Reinforcement Learning Designer app.
Use Multiple Processes and GPUs
- Train Agents Using Parallel Computing and GPUs
Accelerate agent training by running simulations in parallel on multiple cores, GPUs, clusters, or cloud resources.
- Train AC Agent to Balance Cart-Pole System Using Parallel Computing
Train a discrete action space AC agent using asynchronous parallel computing.
- Train DQN Agent for Lane Keeping Assist Using Parallel Computing
Train a DQN agent for an automated driving application using parallel computing.
Multi-Agent Training
- Train Multiple Agents to Perform Collaborative Task
Train two continuous action space PPO agents to collaboratively move an object.
- Train Multiple Agents for Area Coverage
Train three discrete action space PPO agents to explore a grid-world environment in a collaborative-competitive manner.
- Train Multiple Agents for Path Following Control
Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.
Train Agents to Control Double Integrator System
- Train DDPG Agent to Control Double Integrator System
Train a DDPG agent to control a second-order dynamic system modeled in MATLAB and compare it to an LQR controller.
- Train PG Agent with Baseline to Control Double Integrator System
Train a discrete action space PG agent with a baseline to control a double integrator system modeled in MATLAB.
Train Agents to Balance Cart-Pole System
- Train DQN Agent to Balance Cart-Pole System
Train a DQN agent to balance a cart-pole system modeled in MATLAB.
- Train PG Agent to Balance Cart-Pole System
Train a discrete action space PG agent to balance a cart-pole system modeled in MATLAB.
- Train AC Agent to Balance Cart-Pole System
Train a discrete action space AC agent to balance a cart-pole system modeled in MATLAB.
- Train DDPG Agent to Swing Up and Balance Cart-Pole System
Train a DDPG agent to swing up and balance a cart-pole system modeled in Simscape™ Multibody™.
- Train MBPO Agent to Balance Cart-Pole System
A model-based reinforcement learning agent learns a model of its environment that it can use to generate additional experiences for training.
Train Agents to Swing Up and Balance Pendulum
- Train DQN Agent to Swing Up and Balance Pendulum
Train a DQN agent to swing up and balance a pendulum modeled in Simulink.
- Train DDPG Agent to Swing Up and Balance Pendulum
Train a DDPG agent to balance a pendulum modeled in Simulink.
- Train DDPG Agent to Swing Up and Balance Pendulum with Bus Signal
Train a DDPG agent to balance a pendulum Simulink model that contains observations in a bus signal.
- Train DDPG Agent to Swing Up and Balance Pendulum with Image Observation
Train a DDPG agent using an image-based observation signal.
- Create DQN Agent Using Deep Network Designer and Train Using Image Observations
Create a reinforcement learning agent using the Deep Network Designer app from the Deep Learning Toolbox™.
Train Agents to Perform Control Tasks
- Tune PI Controller Using Reinforcement Learning
Tune the gains of a PI controller using a TD3 agent.
- Train SAC Agent for Ball Balance Control
Train a SAC agent to balance a ball on a flat surface using a robot arm.
- Train Reinforcement Learning Agents to Control Quanser QUBE Pendulum
Train SAC and PPO agents to balance the Quanser QUBE rotational inverted pendulum.
- Train TD3 Agent for PMSM Control
Train a TD3 agent to control the currents in a permanent magnet synchronous motor.
- Train DQN Agent with LSTM Network to Control House Heating System
Train a DQN agent with a recurrent network to control the temperature of a house.
- Train Reinforcement Learning Agent with Constraint Enforcement
Train a DDPG agent with actions constrained using the Constraint Enforcement block.
Train Agents to Control Robots
- Train DDPG Agent to Control Flying Robot
Train a DDPG agent to control a flying robot model.
- Train PPO Agent for a Lander Vehicle
Train a discrete action space PPO agent to land a flying robot.
- Train Biped Robot to Walk Using Reinforcement Learning Agents
Compare DDPG and TD3 agents for the control of a biped walking robot modeled in Simscape Multibody.
Generate Rewards from Control Specifications
- Generate Reward Function from a Model Predictive Controller for a Servomotor
Generate a reward function from an MPC controller applied to a servomotor and use it to train a TD3 agent.
- Generate Reward Function from a Model Verification Block for a Water Tank System
Generate a reward function from a model verification block applied to a water tank system and use it to train a TD3 agent.
Imitation Learning
- Imitate MPC Controller for Lane Keeping Assist
Train a deep neural network to imitate the behavior of a model predictive controller within a lane keeping assist system.
- Imitate Nonlinear MPC Controller for Flying Robot
Train a deep neural network to imitate the behavior of a nonlinear model predictive controller for a flying robot.
- Train DDPG Agent with Pretrained Actor Network
Train a DDPG agent using an actor network that has been previously trained using supervised learning.
Train Agents for Automotive Applications
- Train DQN Agent for Lane Keeping Assist
Train a DQN agent for a lane keeping assist application.
- Train DDPG Agent for Adaptive Cruise Control
Train a DDPG agent for an adaptive cruise control application.
- Train DDPG Agent for Path-Following Control
Train a DDPG agent for a lane following application.
- Train PPO Agent for Automatic Parking Valet
Train a discrete action space PPO agent to park a car in an open parking space.
Other Applications
- Deep Reinforcement Learning for Optimal Trade Execution
This example shows how to use the Reinforcement Learning Toolbox™ and Deep Learning Toolbox™ to design agents for optimal trade execution.
- Train DQN Agent for Beam Selection
Train a deep Q-network (DQN) reinforcement learning agent for beam selection in a 5G new radio communications system.
- Water Distribution System Scheduling Using Reinforcement Learning
Train a DQN agent to optimally activate pumps in a water distribution system.
Develop Custom Agents and Training Algorithms
- Train Reinforcement Learning Policy Using Custom Training Loop
Train a reinforcement learning policy using your own custom training loop.
- Custom Training Loop with Simulink Action Noise
Use a custom training loop to train a continuous action space reinforcement learning policy in Simulink when action noise is generated within the model.
- Create Agent for Custom Reinforcement Learning Algorithm
Create an agent for a custom reinforcement learning algorithm.
- Train Custom LQR Agent
Create and train a custom agent that solves an LQR problem.
- Model-Based Reinforcement Learning Using Custom Training Loop
You can create a model-based reinforcement learning agent using your own custom training loop.
Deploy Agents and Policies
- Run SIL and PIL Verification for Reinforcement Learning
Verify a reinforcement learning agent in software-in-the-loop and processor-in-the-loop modes.
- Generate Policy Block for Deployment
Generate a policy block to deploy a trained policy.