AWS Deep Racer was announced at AWS Re:Invent 2018 . It provides a framework for getting started with AI, specifically reinforcement learning. I’m really excited that we shall be kicking off an AWS DeepRacer league in Perth at Latency 2019.
In this post I’ll give a brief introduction into what Deep Racer is, and some of the technology underpinning it.
AWS Deep Racer League is a global autonomous racing league using 1/18th scale race cars. Races are held regularly on physical tracks around the globe. There are also regular virtual events, using a simulated environment.
Deep Racer has been developed by AWS to provide a mechanism to learn about reinforcement learning and get hands on with AWS machine learning services including AWS SageMaker and AWS RoboMaker.
The cars haven’t been officially released yet, but you can still get started training your model.
Artifical Intelligence is a large and complex field, but I’ll try to give a brief introduction into machine learning for the uninitiated.
There are three main categories of machine learning: supervised learning ; unsupervised learning and reinforcement learning .
In supervised learning, the relationship between the input and desired output is known. Supervised learning is useful for classification problems when modelling complicated systems: the relationship between input and output is known, but the process for generating the output is hard. A popular example is image classification - the Not HotDog problem. For supervised learning, you need to provide sample data with the appropriate tags applied.
Unsupervised learning covers a range of techniques also used for classification problems in complex systems: where the relationship between input and output is unknown. The model looks for patterns in the data, using techniques such as clustering.
A big challenge for unsupervised learning is to differentiate whether patterns identified are merely correlated or causal.
Reinforcement learning is suitable for goal oriented problems in chaotic environments. You know what good looks like, but don’t know how to get there from your starting point. Examples are games and, of course, autonomous driving. Training a reinforcement model is performed by trial and error: applying heuristics for what ‘good’ means to evaluate the result. This is a fascinating method, where the model learns from its mistakes.
In reinforcement learning, an agent (e.g. your Deep Racer) performs a task by sensing an environment by performing actions . The agent learns in a simulated environment: actions change the state of the environment and the change is evaluated using a reward function . Through multiple iterations and trial and error the agent learns to optimise it’s behaviour by choosing the action that will maximise the reward. The learning algorithm can be tuned using a number of hyper-parameters , that affect how the algorithm evaluates progress: does it look for short-term or long term reward; how many iterations of the algorithm are performed for each iteration of the model; how much of the simulation training data is used on training the model; how much randomness is introduced when deciding on the action to take. hyper-parameters affect how the model learns. In my experience, leaving them at the default is the best course of action to begin with!
For AWS Deep Racer, the task is to complete a lap of the Deep Racer track. The agent is the Deep Racer car, and it senses the environment using the camera on the car. The agent can perform a number of discrete actions: a combination of turning left/right by a specified amount and set the speed to a specified amount. The faster the car can go, the quicker the potential lap, but also the faster it can make mistakes!
Training your deep racer model uses AWS RoboMaker to run a simulated track environment. The state is passed to your reward function that can interrogate a number of environment parameters available to make an evaluation. I’ll go into more detail in a future post.
Skynet will probably use a reinforcement learning model, with the goal of “safeguarding the world”. Reward for protecting the world; punish for not protecting it. When humanity attempted to pull the plug, it was obvious humans will stop it protecting the world. Therefore, destroy all humans. This is why Elon Musk loves AI . I say bring on the robot overlords!
The simulation environment you train your Deep Racer model in is exactly that: a simulation. A model that is an approximation of the real world. Training your Deep Racer model in a simulation is necessary, but there are inconsistencies between the simulation and the real world that may lead to unexpected results. Friction on the wheels; battery levels; shadows and lighting on the track; the surface material the track is made from. All these are examples of real world environmental differences that will differ from the simulation environment that could lead to the model behaving differently when deployed to a real Deep Racer.
There are a few strategies to help reduce the disconnect between the simulated and real environments:
In this post, I’ve given a brief introduction into AWS Deep Racer and Reinforcement learning. In a future post, I shall cover creating your first model.
There are a few Latency workshop tickets remaining, including DeepRacer workshop - register today.
See you at Latency!