How an autonomous vehicle learns to drive on the UK’s roads
By William Sachiti, Academy of Robotics
Have you ever wondered how an autonomous vehicle sees? How it manages to navigate its way around obstacles and avoid pedestrians - even if one runs out in front of it? And how it moves among the traffic avoiding collisions with other vehicles as they change lanes, make turns and stop/start? Can it spot a cat making a dash for it and avoid it? What about the wind blowing a dustbin into the road? Can the car predict what might happen next, a bit like we do when driving, and anticipate its next move?
To teach an autonomous vehicle to do all these things, we need to start by gathering huge quantities of data. To do this, we use a data-gathering car.
These custom-made vehicles, such as the example above produced by Pilgrim Motorsports with guidance from the Academy of Robotics in the UK, carry sophisticated specialist camera and computing equipment to gather the data an autonomous car needs. Their job is to drive around a town capturing visual data in the form of video footage from up to 12 cameras with a combined 360-degree view around the car, as well as feedback from sensors and infrared detectors. This is all to gain a comprehensive understanding of the road environment and the road’s users, particularly in residential areas.
Learning by watching
We then take this data back to a bank of supercomputers, which watch it over and over again to learn. This type of computer science is called machine learning and uses evolutionary neural networks. A neural network is a computer system modelled on the human brain and nervous system, and we run our algorithms on these networks. In this way the algorithms not only learn but also evolve with each iteration. This is not dissimilar to how we, as humans, take driving lessons and learn a little more with each session.
Much as a child is taught at school what objects are, we take images of scenes similar to the roads where the car will drive. In these scenes we mark out what each object is; we call this annotation. We then feed the annotated data to a machine learning algorithm, which begins to compare images and learn the difference between a car, a pedestrian, a cyclist, the road, the sky, etc. After some time of doing this, with us showing the computer progressively more complex or harder-to-understand scenes, the algorithm eventually figures out the rest by applying what it has been taught to what it sees.
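To make the idea of learning from annotated examples concrete, here is a minimal sketch in Python. It is not the neural network described in this article: it swaps in a much simpler technique, a nearest-centroid classifier, and the two features (object height and speed) and all the numbers are invented purely for illustration.

```python
from collections import defaultdict
from math import dist

# Hypothetical annotated data: (feature vector, label). The two features
# (height in metres, speed in metres per second) are invented for illustration.
ANNOTATED = [
    ((1.7, 1.4), "pedestrian"),
    ((1.8, 1.2), "pedestrian"),
    ((1.5, 12.0), "car"),
    ((1.4, 14.0), "car"),
    ((1.7, 5.0), "cyclist"),
    ((1.8, 6.0), "cyclist"),
]

def train(examples):
    """Average the feature vectors of each annotated class (one centroid per label)."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(axis) / len(axis) for axis in zip(*vectors))
        for label, vectors in grouped.items()
    }

def classify(centroids, features):
    """Return the label whose centroid is closest to the unseen example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(ANNOTATED)
print(classify(centroids, (1.6, 13.0)))  # a fast, car-height object
```

The principle is the same as in the real system, only far cruder: humans supply the labels, and the algorithm generalises from them to examples it has not seen before.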
Now that the algorithm can tell what objects are, we attach multiple cameras looking in all directions. In real time, the algorithm is able to identify pretty much everything that is relevant in a scene. Onboard supercomputers, performing up to 7 trillion calculations per second, interpret the camera data to reveal something like the image below.
Understanding what is in the scene is just one small part of the puzzle. The next step is to predict what each person, car, bicycle and traffic light is going to do next. Yes, we are going to predict the potential future of everything in the scene. While this sounds complex at first, if you break it down, it’s actually quite simple.
In the real world, if your smartphone were to slip from your fingers and start to fall, you know it will hit the ground. It is not going to stop all of a sudden and float or spontaneously shoot up. Its falling and hitting the ground is, to you, a simple, predictable action with an inevitable result. Similarly, the vehicle is able to see and identify pedestrians, cars, bicycles, etc., and then predict multiple realistic potential scenarios, taking action based on which of those scenarios are more likely to happen.
We are using computer power not only to see everything in the scene but also to predict what everything is likely to do in the next three seconds. While three seconds does not sound like very long to calculate potential futures, from the car’s frame of reference everything is happening very, very slowly. It sees the world at 1000 frames per second. To it, all objects on screen are moving as slowly as snails do to you and me.
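As a rough illustration of what predicting the next three seconds can mean, the sketch below extrapolates a tracked object in a straight line at constant speed and checks whether it would enter the car's lane. The constant-velocity model, the `Track` type, the lane boundaries and the numbers are all assumptions made for this example; only the 1000 frames per second and the three-second horizon come from the text above, and the real system weighs multiple potential scenarios rather than a single one.

```python
from dataclasses import dataclass

FRAME_RATE_HZ = 1000  # frames per second, as quoted in the article
HORIZON_S = 3.0       # prediction horizon in seconds, as quoted in the article

@dataclass
class Track:
    """A hypothetical tracked object: position in metres, velocity in metres per second."""
    label: str
    x: float
    y: float
    vx: float
    vy: float

def frames_in_horizon(rate_hz=FRAME_RATE_HZ, horizon_s=HORIZON_S):
    """Number of camera frames the car sees within the prediction horizon."""
    return int(rate_hz * horizon_s)

def predict(track, t):
    """Constant-velocity guess at where the object will be t seconds from now."""
    return (track.x + track.vx * t, track.y + track.vy * t)

def will_enter_lane(track, lane_y_min, lane_y_max, horizon_s=HORIZON_S, step_s=0.1):
    """Check the predicted path at regular steps for any point inside the lane."""
    steps = int(round(horizon_s / step_s))
    for i in range(steps + 1):
        _, y = predict(track, i * step_s)
        if lane_y_min <= y <= lane_y_max:
            return True
    return False

# A pedestrian 4 m to the side, walking towards a lane that spans y = 0..2 m.
pedestrian = Track("pedestrian", x=10.0, y=4.0, vx=0.0, vy=-1.5)
parked_car = Track("car", x=15.0, y=4.0, vx=0.0, vy=0.0)

print(frames_in_horizon())                      # 1000 frames/s * 3 s = 3000
print(will_enter_lane(pedestrian, 0.0, 2.0))    # True
print(will_enter_lane(parked_car, 0.0, 2.0))    # False
```

At 1000 frames per second, three seconds gives the car 3000 frames in which to refine a prediction like this one, which is why the world appears to it to move so slowly.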
Keeping in mind that there is more than one camera looking around the car, we fuse the findings from each camera, creating a combined view of the world as seen by the car. This combined view gives us a more accurate account of what is happening in the world around it.
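One simple way to picture this fusion step is sketched below. It assumes each camera has already reported its detections in a shared, car-centred coordinate frame as a label, a position in metres and a confidence; two detections with the same label that lie close together are treated as the same object and merged by confidence-weighted averaging. The data layout, the `fuse` function and the one-metre merge radius are illustrative assumptions, not a description of the actual system.

```python
from math import dist

def fuse(detections, merge_radius_m=1.0):
    """Merge same-label detections that lie within merge_radius_m of each other.

    detections: iterable of (label, (x, y), confidence) from all cameras.
    Returns a list of (label, (x, y)) with one entry per fused object.
    """
    fused = []  # each entry: [label, x, y, accumulated confidence]
    for label, (x, y), conf in detections:
        for entry in fused:
            same_object = (
                entry[0] == label
                and dist((entry[1], entry[2]), (x, y)) <= merge_radius_m
            )
            if same_object:
                total = entry[3] + conf
                # Confidence-weighted average of the two position estimates.
                entry[1] = (entry[1] * entry[3] + x * conf) / total
                entry[2] = (entry[2] * entry[3] + y * conf) / total
                entry[3] = total
                break
        else:
            fused.append([label, x, y, conf])
    return [(label, (x, y)) for label, x, y, _ in fused]

# Two cameras see the same pedestrian at slightly different positions;
# a third detection is a car seen by one camera only.
detections = [
    ("pedestrian", (5.0, 2.0), 0.5),
    ("pedestrian", (5.4, 2.0), 0.5),
    ("car", (20.0, -3.0), 0.9),
]
combined = fuse(detections)
print(len(combined))  # 2 objects: one pedestrian, one car
```

Averaging two slightly different sightings of the same pedestrian is what makes the combined view more accurate than any single camera's account.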
Keeping in lane
There is a similar process for a vehicle to know how to keep in lane, where the road is and where it needs to be driving.
The example below shows a vehicle driving through a residential street in the UK.
As the vehicle needs to give way, it highlights in red the areas where it cannot drive and in green the areas it considers free space on the road. An entire algorithm, with its own neural network, has been trained to understand just the road, taking into account details like texture, colour, obstructions, etc.
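The red and green highlighting can be thought of as a labelling of small patches of the image. The sketch below assumes a road-understanding network has already produced a "drivable" score between 0 and 1 for each patch across one row of the image; it colours each patch and then finds the widest run of free space. The scores, the 0.5 threshold and both function names are hypothetical and chosen only to illustrate the idea.

```python
def colour_cells(scores, threshold=0.5):
    """Label each patch green (free space) or red (cannot drive) from its score."""
    return ["green" if score >= threshold else "red" for score in scores]

def widest_free_gap(colours):
    """Return (start index, length) of the longest run of green patches."""
    best = (0, 0)
    start = None
    # A trailing "red" sentinel closes any green run that reaches the edge.
    for i, colour in enumerate(colours + ["red"]):
        if colour == "green" and start is None:
            start = i
        elif colour != "green" and start is not None:
            if i - start > best[1]:
                best = (start, i - start)
            start = None
    return best

# Hypothetical drivable scores for eight patches across one row of the image.
row_scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.9, 0.95, 0.1]
row_colours = colour_cells(row_scores)
print(row_colours)
print(widest_free_gap(row_colours))  # (4, 3): three free patches starting at index 4
```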
These are a few of the ways an autonomous car understands the world around it - but there is more. We also have subsystems for reading road markings, reading traffic signs, infrared and more. All these subsystems, each running in its own neural network, are combined to create one super view of the world as the car sees it.
The end result is that some of our test vehicles driven by neural networks are already outperforming their human counterparts in many scenarios. For what we do, autonomous delivery in the last mile, we have no need to learn how to drive on every road in the UK; we only need to master specific postcodes for residential last-mile delivery, which is why we are already so close to deployment.
The first smartphones were giant bricks which could not do much more than make phone calls. As time went on, they got more advanced and could do more.
Self-driving vehicles are the result of years of computer science, and their arrival is the next step in the evolution of vehicles. First, we saw vehicles with cruise control, then cruise control with lane assist, then self-parking, and now we’re moving on to self-driving. The first autonomous cars will do an excellent job of driving themselves on very specific routes. With time, the vehicles will begin to drive more complex roads and routes. Eventually, they will connect to each other and share data with one another; it is a step-by-step process.
I predict that we will seriously begin to see passenger-carrying self-driving cars on the roads by 2020, followed by a period of mass adoption between 2021 and 2025. The first self-driving cars you will see on the road are likely to be autonomous cars which deliver goods and don’t carry people. This is a simple, low-risk start with a valid use. Our own autonomous delivery vehicle Kar-go is scheduled for trials later this year.
ABOUT THE AUTHOR
William Sachiti is the Founder and CEO of the Academy of Robotics, a UK company specialising in creating ground-breaking robotics technologies such as autonomous vehicles. Their first commercial solution, Kar-Go, is an autonomous delivery vehicle that will vastly reduce the last-mile costs associated with deliveries.