Defining levels of automation for machine learning applied to geospatial intelligence
By David Lindenbaum and Ryan Lewis, CosmiQ Works; Todd M. Bacastow, Radiant Solutions; and Joe Flasher, Amazon Web Services
Machine Learning Applied to Remote Sensing
Since the release of AlexNet by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012 to compete in the ImageNet Large Scale Visual Recognition Challenge, there has been an explosion of computer vision research focused on deep learning. There have been marked improvements in computer vision tasks such as image classification, object detection, and instance segmentation. Improvements in these computer vision tasks have profound implications for geospatial intelligence (GEOINT).
In recent years, there have been several data science competitions that aim to direct more computer vision research and development toward remote sensing applications. These competitions have generated new analytic techniques, ranging from general object detection to feature segmentation and classification (see Figure 1), that combine state-of-the-art computer vision with geospatial problems. As remote sensing-focused machine learning techniques mature, GEOINT practitioners need to understand and engage the research community to help structure the application of these new techniques against geospatial problems.
Currently, it is difficult to translate mission requirements into machine learning evaluation metrics and vice versa. For example, in the computer vision community, most results are described by image-specific metrics such as mean average precision (mAP), F1 score, precision, and recall. Meanwhile, a GEOINT practitioner may want to incorporate machine learning capabilities into his or her workflow, but not know what level of performance (or augmented support) is necessary for a specific mission.
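To make these metrics concrete, the sketch below computes precision, recall, and F1 from raw detection counts. The counts themselves are hypothetical, chosen only to illustrate how the numbers relate:

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall, and F1 from detection counts.

    tp: true positives  (detections that match a labeled feature)
    fp: false positives (detections with no matching label)
    fn: false negatives (labeled features the model missed)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: 80 correct detections, 20 spurious, 40 missed.
p, r, f = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

A mission requirement such as "miss no more than one building in ten" maps to a recall floor of 0.9, while a tolerance for spurious detections maps to a precision floor; the F1 score balances the two.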
In 2014, the automotive industry addressed this challenge for autonomous vehicle capabilities by establishing a taxonomy for levels of autonomous driving, defined from Level 0 (no automation) to Level 5 (full automation). In this article, we will explore parallels of this framework relevant to GEOINT practitioners and propose a framework for defining levels of analyst augmentation. We hope this will allow geospatial end users and machine learning researchers to better understand each other, and perhaps help direct the application of these algorithms against geospatial problems.
The use case of foundational mapping requirements before, during, and after a hurricane is relevant given recent natural disasters. We will define a taxonomy similar to the Society of Automotive Engineers’ (SAE) Levels of Automation to understand which capabilities are nearing readiness and which require more directed research.
- This article is part of USGIF’s 2019 State & Future of GEOINT Report.
Hurricane Disaster Response Use Case
Disaster response scenarios present a challenge for geospatial analysts and geographic information systems (GIS) professionals. Throughout the preparatory, response, and recovery phases of a disaster, analysts and aid organizations are charged with providing mapping solutions that are timely, dynamic, and accurate in order to support aid functions such as the delivery of critical supplies and services. Yet, the complexity, volatility, and sheer geographic scale of many natural disasters may limit the speed, and in some cases the accuracy, of manual mapping annotation techniques. While global crowdsourcing initiatives such as Humanitarian OpenStreetMap Team (HOT) have significantly increased the speed and robustness of dynamic mapping data generation and dissemination, rapidly maturing machine learning techniques, specifically computer vision, can help accelerate the development of timely maps over large geographic areas.
Hurricanes Irma and Maria wreaked near record-level economic and humanitarian devastation across a large portion of the Caribbean in September 2017. Some of the hardest hit areas, such as Puerto Rico, are still recovering from the storms’ effects more than a year later. The large number of affected areas along with the speed of the storms, particularly Hurricane Maria, pushed open-source, manual mapping processes to the limit. For example, HOT leveraged more than 5,300 mapping volunteers to produce more than 950,000 building footprint labels and upward of 30,000 kilometers of road labels in approximately five weeks for locations affected by Maria. This was truly an amazing feat, but it raises the question: How could machine learning accelerate this map generation process? More specifically, what are the map key features (layers) contributors are labeling, and which features could benefit from automation?
During the early response to Maria, the most important map feature was arguably building footprints as they represent the foundational infrastructure of where people live and work. Since there were limited preexisting quality data on structure counts, locations, and classifications (i.e., purpose of the structure), first responders did not have detailed information on the number of people potentially in vulnerable or remote locations. As a result, it was difficult for responders to prioritize aid missions. For instance, when authorities decided to evacuate areas downstream from the Guajataca Dam in Puerto Rico due to the dam’s potential for collapse, officials needed to know the size of the surrounding population. Counting and classifying structures was one method for approximating population size. From the American Red Cross’ request for updated building footprints on September 22 to the release of the “first pass” map on October 25, HOT, in conjunction with its mission partners, conducted 12 separate labeling campaigns for buildings in Puerto Rico.
Although there were existing road network maps for a majority of Puerto Rico, the dynamic nature of Hurricane Maria required timely updates to the road network. More than 1,500 roads were damaged, blocked, or washed out from the hurricane. As a result, first responders needed rapid updates to transportation maps to determine where supplies could and should be sent. Given the widespread damage to the road network, initial mapping efforts were primarily focused on identifying which routes were passable. Efficient logistics and route planning were particularly important during the first days of the response phase because Puerto Rico did not have sufficient aid supplies such as generators and water filtration systems warehoused locally. Analysts and mapping volunteers completely updated the labels for Puerto Rico’s road network during a five-week period.
The third map feature category analysts provided was critical infrastructure points of interest (POIs). Since the entire island of Puerto Rico lost power when Maria made landfall, an important classification feature was power infrastructure. The island’s prolonged blackouts, and the associated catastrophic effects including loss of life, highlight the complexity of identifying specific types of infrastructure. Puerto Rico also experienced severe communications challenges in the days following Maria. To make matters worse, officials and responders had an insufficient supply of satellite phones. Analysts were also asked to identify communications infrastructure such as microwave towers in an effort to assist responders and local utility providers.
Lastly, identifying medical facilities and infrastructure was important due to power outages, flooding, and damage at some of the area’s largest hospital centers. The identification of POIs was particularly challenging for analysts because it required them to identify a particular structure, classify the type of structure, and then determine the presence and severity of damage. Based on previous studies of remote sensing imagery after the 2010 earthquake in Haiti, accurate classification of structures and subsequent damage using only satellite imagery or airborne datasets was not possible because a damaged building was not necessarily visible from directly overhead. In order to detect and verify building damage, a site survey and/or off-angle imagery was required to adequately capture the characteristics of building damage, particularly collapsed or partially collapsed structures.
The scale and diversity of mapping tasks associated with disaster response scenarios such as Hurricane Maria present several potential functions for emerging machine learning technologies. First, and most generally, machine learning can assist in the provision of labeling assignments by determining the level of complexity in each image assignment prior to tasking. More complex scenes could be assigned to experienced mapping analysts and labelers while simpler scenes could be directed toward novice analysts. Second, object detection algorithms could be used to perform quality control on the mapping annotation data submitted by analysts. The primary role of algorithms in this function would be as an assistive technology to ensure analysts do not miss key features. Third, object detection (and potentially classification) algorithms could provide an assessment of each image before being assigned to a mapping analyst for human inspection. While this implementation could greatly increase analyst performance and speed, it requires a high level of algorithmic performance that may not be realistic in some complex scenes with today’s technology.
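The second function above, quality control, can be sketched simply: compare model detections against a human labeler’s annotations and flag detections with no matching annotation as possible missed features. The box format, function names, and the IoU matching threshold below are illustrative assumptions, not part of any specific mapping tool:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def flag_possible_misses(detections, annotations, threshold=0.5):
    """Return model detections that overlap no human annotation above threshold."""
    return [d for d in detections
            if all(iou(d, h) < threshold for h in annotations)]

# Hypothetical tile: the model found three buildings; the analyst labeled two.
model_boxes = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
human_boxes = [(1, 1, 10, 10), (21, 19, 30, 31)]
print(flag_possible_misses(model_boxes, human_boxes))
# The unmatched third box is surfaced for human review.
```

Keeping the algorithm in this assistive role means a false positive costs only a moment of analyst attention, which is why the quality-control function tolerates far lower algorithmic performance than full automation would.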
Defining an Automation Taxonomy
In January 2014, SAE released its first version of J3016: Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. This document was instrumental in unifying the language and providing clarity about the intended capabilities of products in design. It describes six levels of automation from no driving automation (Level 0) to full driving automation (Level 5). As autonomous driving capabilities have evolved, this taxonomy has gone through two revisions and has grown from 12 to 35 pages.
Building a similar taxonomy for geospatial problems would allow the GEOINT Community to move from a technology-centric definition to a use case-centric definition. This would help the community better understand what it is asking of new technology and the types of performance that should be expected.
The following table is a proposed taxonomy for moving toward automated building extraction for foundational mapping in the context of a disaster response scenario. It separates two equally difficult tasks: localizing objects in an image and fine-grained classification of those objects.
| Level | Description |
|-------|-------------|
| Level 0 | No automation from machine learning. Traditional desktop or web-based GIS software would commonly be used with standard cartographic functions and tools. |
| Level 1 | Machine learning is used to create a general count of an object in a broad feature class in an area. This should be used in situations for which large errors in count can be tolerated. |
| Level 2 | One single specific task is automated to provide a suggestion to a human. For example, providing a geo-located bounding box or polygon, or providing a recommended label for a specific feature such as a residence, office, police station, or hospital. |
| Level 3 | The complete labeling activity is automated: a complete footprint and narrow feature label is sent to a human labeler, and a recommended label for a specific feature is provided for a human to assess. |
| Level 4 | The complete labeling activity is automated: a complete footprint and narrow feature label is sent to a human, and a recommended label for a specific feature is automated for a geospatially confined area. |
| Level 5 | The complete labeling activity is automated: a complete footprint and narrow feature label is automated for the entire globe. |
Current State of the Art
In the last two years, several open-source datasets have been developed to move the state of the art forward in applying machine learning to the challenge of accurate building mapping. The SpaceNet Buildings Dataset has more than 800,000 building footprints across six cities (Atlanta, Khartoum, Las Vegas, Paris, Rio de Janeiro, and Shanghai) and is designed to improve the performance of extracting building footprints from satellite imagery.
In July 2017, the Intelligence Advanced Research Projects Activity (IARPA) released its Functional Map of the World (fMoW) dataset, which includes more than 1,000,000 DigitalGlobe satellite image chips covering 63 categories such as airport, police station, hospital, shopping mall, and single unit residential building, and is designed to improve the classification of already identified buildings and structures.
The SpaceNet dataset enables the creation of Level 1 or Level 2 automated systems. The fMoW dataset enables the creation of a Level 2 system for building classification. To enable Level 3 through Level 5, systems trained on both datasets would be required, or, ideally, another dataset would be created to enable assessment of Level 3 through Level 5 systems for creating foundational maps of a region.
Innovations in machine learning continue to benefit the GEOINT Community by providing automation to enable mapping and analysis at unprecedented speed, scale, and efficiency. The application of this technology to drive improved mission outcomes should remain the focus of the community. To this end, understanding what level of performance or augmented support is necessary for a given mission remains a challenge and opportunity for GEOINT practitioners. We proposed a taxonomy and definition analogous to the six levels for autonomous vehicle driving with the goal of helping to enable the application of advanced machine learning algorithms against geospatial problems. Improving the community’s understanding of what levels of automation are possible and how much automation should be applied in a given scenario is essential to gaining advantage during mission-critical situations such as natural disaster response.
- The Development and Uses of Crowdsourced Building Damage Information Based on Remote-Sensing.
Headline Image Courtesy of Radiant Solutions