It’s hard to imagine the world without GPS. The satellite-based global positioning system helps people get from point A to point B as quickly and efficiently as possible. Scientists employ even more powerful GPS sensors to study everything from earthquakes to snowmelt. And it’s one of the essential technologies behind the coming autonomous vehicle revolution.
But GPS isn’t foolproof. How many of us have had a momentary panic when losing cell service and GPS at a crucial juncture? Now imagine trying to navigate without GPS in unfamiliar territory, in the dark or even underground, while building a map of your route at the same time.
This is a challenge the U.S. Army asked SRI International to solve for robotic vehicles and autonomous aerial systems tasked with search-and-rescue and other missions in unknown territory where GPS doesn’t work and visibility is low or nonexistent.
“Our goal is not only to do navigation and mapping of the environment, but we also want to understand what’s inside the environment,” explained Han-Pang Chiu, Senior Technical Manager in the Vision and Robotics Laboratory in SRI’s Center for Vision Technologies. Chiu led a year-long development effort as principal investigator on the Semantically Informed GPS-denied NAVigation (SIGNAV) in Visually Degraded Environments project.
SLAM is not always a slam dunk
Many autonomous robots — whether cleaning the kitchen floor or exploring the surface of Mars — use a combination of sensors, hardware and software to navigate. This methodology has become known as SLAM, for simultaneous localization and mapping. SLAM refers to how an autonomous robot determines its location and maps its environment.
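For a concrete picture of the idea, the toy sketch below (an illustration of the general SLAM concept, not SRI's system) shows a 2D robot dead-reckoning its own position from noisy odometry while placing the landmarks it observes into a shared map.

```python
import numpy as np

# Toy sketch of the SLAM idea (illustrative only, not SRI's system):
# a 2D robot localizes itself from noisy odometry while mapping the
# landmarks it observes in the same coordinate frame.

rng = np.random.default_rng(42)
true_pose = np.zeros(2)        # where the robot actually is
pose = np.zeros(2)             # where the robot thinks it is
landmark_map = {}              # landmark id -> estimated (x, y)

true_landmarks = {0: np.array([2.0, 1.0]), 1: np.array([4.0, -1.0])}

for step in range(5):
    # Localization: apply a forward motion command; the estimate drifts
    # because odometry is noisy.
    motion = np.array([1.0, 0.0])
    true_pose = true_pose + motion
    pose = pose + motion + rng.normal(0.0, 0.05, 2)

    # Mapping: landmarks within sensor range are observed as noisy
    # relative offsets and placed in the map via the current pose estimate.
    for lm_id, lm_true in true_landmarks.items():
        relative = (lm_true - true_pose) + rng.normal(0.0, 0.05, 2)
        if np.linalg.norm(relative) < 3.0:
            estimate = pose + relative
            if lm_id in landmark_map:
                landmark_map[lm_id] = 0.5 * (landmark_map[lm_id] + estimate)
            else:
                landmark_map[lm_id] = estimate

print("estimated pose:", pose)
print("estimated landmarks:", landmark_map)
```

Real systems replace the simple averaging above with probabilistic filters or factor-graph optimization, but the core loop of localizing and mapping at the same time is the same.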
Techniques vary, but on the hardware side, these robotic platforms carry a range of sensors, including familiar technologies like GPS and cameras that operate in both visible and infrared wavelengths. Autonomous robots also have many of the same sensors found in smartphones and wearables today, such as accelerometers, gyroscopes and magnetometers that enable our devices to count steps and orient a picture on the screen. More sophisticated sensors might include LiDAR, a laser device that provides high-precision distance measurements in two or three dimensions to calculate depth.
And, of course, there’s an onboard computer processor. The SRI version of the SLAM navigation system runs on a small Nvidia Jetson AGX Xavier unit that includes graphics processing units (GPUs), the gold standard for computers running artificial intelligence algorithms.
On the software end, sophisticated algorithms fuse the different incoming streams of sensor data into a coherent picture that enables a robot to maneuver through and map its environment. Deep learning-based AI techniques further enable this world-building by training the robot to identify objects and locate them in the scene, a process known as semantic navigation. This type of computer vision system can also semantically segment images, assigning each pixel to a class of object it depicts, such as a portion of a door or window.
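As a rough illustration of that per-pixel labeling (using an off-the-shelf torchvision model rather than the SIGNAV network, and a placeholder image in place of a camera frame), the sketch below assigns a class id to every pixel:

```python
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50
from PIL import Image

# Illustrative only: a publicly available DeepLabV3 model assigns each
# pixel a semantic class id, the per-pixel labeling described above.
model = deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Placeholder frame; on a robot this would come from the onboard camera.
frame = Image.new("RGB", (640, 480), color=(80, 80, 80))

with torch.no_grad():
    batch = preprocess(frame).unsqueeze(0)   # shape [1, 3, 480, 640]
    logits = model(batch)["out"]             # shape [1, num_classes, 480, 640]
    labels = logits.argmax(dim=1)[0]         # per-pixel class ids

print("segmentation map shape:", tuple(labels.shape))
```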
However, in the sort of dark GPS-denied scenarios envisioned by the Army, SLAM systems that use deep learning-enabled semantic navigation struggle to work because the image quality is very poor, according to Chiu. Fortunately, the SRI team wasn’t in the dark when it came to solving this challenge.
A SLAM in the dark
SRI had already successfully developed novel SLAM navigation systems on previous programs for the Defense Advanced Research Projects Agency (DARPA). On the DARPA All Source Positioning and Navigation (ASPN) program, for example, SRI scientists developed a real-time plug-and-play positioning and navigation system for multiple platforms using more than 30 different types of sensors.
Under the succeeding DARPA Squad-X program, SRI expanded the navigation system to include data sharing between ranging radios on multiple platforms. This innovation serves as a key part of the SIGNAV system because it improves SLAM performance by measuring the distance between multiple autonomous robots in real time. SRI scientists also developed a new way to use these distance measurements to improve navigation accuracy without knowing or estimating the positions of the ranging radios. This enables more flexible and broadly applicable navigation among multiple robots, Chiu said. The team is also working to share specific semantic data between robots to build and merge more accurate maps.
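A simplified sketch of how an inter-robot range measurement can tighten both robots' position estimates (a plain least-squares stand-in for the factor-graph machinery, with made-up numbers) might look like this:

```python
import numpy as np
from scipy.optimize import least_squares

# Simplified stand-in for inter-robot range constraints (not SRI's
# implementation): two robots each have a drifting dead-reckoned position,
# and a ranging radio reports the distance between them. A small
# least-squares problem pulls both estimates toward consistency.

dead_reckoned_a = np.array([10.0, 0.5])   # robot A, from its own odometry
dead_reckoned_b = np.array([0.2, 9.5])    # robot B, from its own odometry
measured_range = 13.0                     # radio-measured distance A <-> B

def residuals(x):
    pa, pb = x[:2], x[2:]
    return np.concatenate([
        pa - dead_reckoned_a,                                # odometry prior on A
        pb - dead_reckoned_b,                                # odometry prior on B
        [2.0 * (np.linalg.norm(pa - pb) - measured_range)],  # weighted range factor
    ])

x0 = np.concatenate([dead_reckoned_a, dead_reckoned_b])
solution = least_squares(residuals, x0)
print("refined A:", solution.x[:2])
print("refined B:", solution.x[2:])
```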
However, the real breakthrough on the 12-month project involved fusing the LiDAR data with 2D images to produce what Chiu called a 2.5D semantic segmentation. The laser sensor produces a three-dimensional point cloud of precise depth measurements, which are projected onto the two-dimensional semantically segmented image. This combination of 2D and 3D techniques creates a hybrid 2.5D semantic segmentation generated in real time.
“The depth information is not influenced by the lighting, so even in the dark, you will have depth from LiDAR and use that to improve your semantic segmentation,” Chiu explained. “That’s the central idea.”
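A rough sketch of that projection step, with assumed pinhole camera intrinsics and random stand-in data, looks something like this:

```python
import numpy as np

# Illustrative sketch (not SRI's implementation): project LiDAR points that
# are already expressed in the camera frame onto the image plane, producing
# a sparse depth channel aligned with the 2D semantic labels ("2.5D").

H, W = 480, 640
fx, fy, cx, cy = 525.0, 525.0, W / 2.0, H / 2.0        # assumed intrinsics
rng = np.random.default_rng(0)
points_cam = rng.uniform([-5.0, -2.0, 1.0], [5.0, 2.0, 20.0], size=(5000, 3))
seg_labels = np.zeros((H, W), dtype=np.int64)          # per-pixel classes from the 2D network

depth = np.full((H, W), np.nan)                        # sparse depth channel
x, y, z = points_cam.T
u = np.round(fx * x / z + cx).astype(int)
v = np.round(fy * y / z + cy).astype(int)
valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
depth[v[valid], u[valid]] = z[valid]

# Stack the class labels and the LiDAR depth into one 2.5D representation.
seg_2p5d = np.dstack([seg_labels.astype(float), depth])
print("2.5D tensor shape:", seg_2p5d.shape)            # (480, 640, 2)
```

Because the depth channel comes from the laser rather than the camera, it stays reliable when the image goes dark, which is what makes the fused representation useful in low light.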
In addition, SRI’s navigation system applies loop-closure detection using semantic information. This means the robot improves its location accuracy by detecting a place that it has visited before based on geometric knowledge and semantic scene information.
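As a toy illustration (not the project's actual algorithm), a revisited place can be flagged by comparing a compact semantic summary of the current view against summaries of previously visited places:

```python
import numpy as np

# Toy semantic loop-closure sketch: summarize each visited place as a
# normalized histogram of semantic class labels, then flag a revisit when
# the current view's histogram closely matches a stored one.

NUM_CLASSES = 8
rng = np.random.default_rng(1)

def place_signature(seg_labels):
    hist = np.bincount(seg_labels.ravel(), minlength=NUM_CLASSES).astype(float)
    return hist / hist.sum()

def detect_loop(current_sig, stored_sigs, threshold=0.98):
    best_idx, best_sim = None, 0.0
    for idx, sig in enumerate(stored_sigs):
        sim = float(np.dot(current_sig, sig) /
                    (np.linalg.norm(current_sig) * np.linalg.norm(sig)))
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    return (best_idx, best_sim) if best_sim >= threshold else (None, best_sim)

# Three previously visited places, each with a distinct mix of classes.
keyframes = []
for _ in range(3):
    class_mix = rng.dirichlet(np.ones(NUM_CLASSES))
    labels = rng.choice(NUM_CLASSES, size=(60, 80), p=class_mix)
    keyframes.append(place_signature(labels))

# The current view resembles the first place, so a loop closure is reported.
current = keyframes[0] + rng.normal(0.0, 0.005, NUM_CLASSES)
print(detect_loop(current, keyframes))
```

In practice the semantic match would be combined with the geometric checks the system already performs before the loop closure is accepted.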
In a demonstration of this capability in a GPS-denied environment, the team’s navigation system estimate of the vehicle’s location was off by less than six meters after a drive of nearly 33 kilometers over 75 minutes. Due to the COVID-19 pandemic, a recent demo of SIGNAV took place in an SRI facility, which Chiu streamed to the Army customers on his phone.
“They were very happy about the live demo results,” Chiu said proudly. “We are the first ones to combine semantic segmentation with traditional SLAM, and make it work for the GPS-denied dark environment.”