In the defense and intelligence communities, machine learning has graduated from nascent to operational
For more than 33 years spanning more than 7,000 episodes, Alex Trebek has been host of the popular TV game show Jeopardy! During that time, the erudite emcee has seen it all. He’s watched geniuses crash and burn. He’s seen people win—and lose—millions. Perhaps the most remarkable Jeopardy! event of which Trebek has been a part, however, was the match wherein Jeopardy! champions Ken Jennings and Brad Rutter challenged Watson, an IBM supercomputer.
Watson had been trained to answer trivia questions using a combination of data mining, pattern recognition, and natural language processing. During the course of three televised matches that aired in February 2011, Watson destroyed its human challengers, winning $77,147 compared to Jennings’ $24,000 and Rutter’s $21,600.
“I didn’t give it the kind of serious thought I should have in terms of examining the technology required to have a computer that will understand the nuances and subtleties we present with our clues in Jeopardy!,” Trebek told TIME magazine. “It wasn’t until I saw the computer play that I thought, ‘Holy smokes, this is serious stuff.’”
When it was first developed in 2007, Watson took two hours to answer a single Jeopardy! question and struggled to beat even child competitors. That it could trounce adult champions a mere four years later is a testament to the power of artificial intelligence (AI) and machine learning. AI and machine learning have matured even further in the seven years since Watson’s victory, graduating from novelty to necessity—especially for the defense and intelligence communities, which are simultaneously researching and operationalizing machine learning in order to win an entirely different kind of competition: what some senior military officials are calling an “AI arms race.”
The Case for Cognitive Computing
So-called “deep learning” is the machine learning technique that most interests the defense and intelligence communities. Although the computer science behind it is complex, its premise is simple: A computer receives a question and identifies an array of possible answers. To determine which answer is correct, it uses hundreds of algorithms to examine the available evidence, including what type of information is available and how reliable it is. Neural networks that simulate human brain function weigh each piece of evidence against the rest. Ultimately, the computer ranks the possible answers from most to least likely and puts forth the most promising one. Human operators subsequently tell the computer whether it was right or wrong, at which point it self-edits its algorithms. Each time the computer answers a question it “learns” something new, which over time allows it to reach more accurate and reliable conclusions.
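The rank-then-learn loop described above can be sketched in a few lines of Python. This is a toy illustration, not Watson's actual architecture: the function names, the evidence scores, the starting weights, and the perceptron-style update rule are all assumptions made for the example.

```python
# Toy sketch of the rank-then-learn loop; all names and numbers are hypothetical.

def score(evidence, weights):
    """Weighted sum of one candidate's evidence scores."""
    return sum(w * e for w, e in zip(weights, evidence))

def rank_answers(candidates, weights):
    """Return candidate answers ordered from most to least likely."""
    return sorted(
        candidates,
        key=lambda name: score(candidates[name], weights),
        reverse=True,
    )

def update_weights(weights, candidates, guess, correct, lr=0.1):
    """Perceptron-style correction: after a wrong guess, shift weight toward
    evidence that favored the correct answer and away from evidence that
    favored the guess. A right guess leaves the weights unchanged."""
    if guess == correct:
        return list(weights)
    return [
        w + lr * (c - g)
        for w, c, g in zip(weights, candidates[correct], candidates[guess])
    ]

# Each candidate answer has one score per evidence source (made-up values).
candidates = {
    "answer_a": [0.9, 0.2, 0.1],
    "answer_b": [0.4, 0.8, 0.7],
}
weights = [1.0, 0.3, 0.3]

first_guess = rank_answers(candidates, weights)[0]

# A human operator reports that the correct answer was "answer_b".
for _ in range(10):
    guess = rank_answers(candidates, weights)[0]
    weights = update_weights(weights, candidates, guess, "answer_b")

final_guess = rank_answers(candidates, weights)[0]
```

With these invented numbers the system first prefers `answer_a`, then switches to `answer_b` after the operator's corrections shift the weights. A real deep learning system adjusts millions of parameters by gradient descent rather than three weights by hand, but the feedback principle is the same.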
This capability is especially attractive to the U.S. Department of Defense (DoD) as it pursues what it calls the “Third Offset.”
Consider, for example, the U.S. Air Force. “We look at neural networks like they’re wingmen,” said Capt. Michael Kanaan, director’s action officer for U.S. Air Force Intelligence (AF-A2), which envisions a future where U.S. airmen are assisted by machine sidekicks that constantly analyze the world around them, à la Star Wars’ R2-D2. “We’re training our R2-D2 to be right there with us. In that way, machine learning acts as a decision aid. It can shrink the time radius of our OODA loops: observe, orient, decide, and act.”
Simply put: The increased speed and certainty the U.S. can gain from machine learning will allow it to remain ahead of its adversaries. “AI and machine learning provide us insight at speed and scale that we otherwise would not have,” Kanaan continued. “It has second-, third-, and fourth-order effects that create decision advantage for us.”
There are practical benefits as well as strategic ones.
“One of the main challenges [the Intelligence Community (IC) faces] is data volume,” said Central Intelligence Agency (CIA) spokesperson Jonathan Liu. “For example, thousands of terrorist videos are uploaded on a daily basis. Therefore, there is a need to detect, characterize, and triage data in a scalable manner. Machine learning enables and assists our officers to maximize their time in solving problems and making high-level decisions.”
Data-processing fatigue is another important challenge machine learning can help solve. “For example, it is known that human visual recognition performance decays with time. In contrast, well-trained algorithms can sustain constant performance and process data 24/7,” Liu said. “Combining both human and machine-driven decision-making is the optimal way to solve problems. Specifically, using machines to automatically solve basic, repetitive, and time-consuming tasks, such as finding small objects in image collections. The summarized data then serve as the initial pre-culled data set needed to solve highly complex intelligence problems.”
In other words, machine learning makes a force multiplier of computers, achieving maximal analytic capacity with minimal human resources.
AI’s benefits have been apparent for decades. As the technology advances, the IC is moving quickly to test and field new machine learning capabilities.
The CIA has approximately 140 pilot projects underway, with the goal to transfer machine learning from researchers and data scientists to customers and operators.
“The range of applications varies widely and applies to most of CIA’s components. Data understanding is an overarching objective, including the extraction of patterns impossible to find with standard or traditional manual techniques. Examples include change detection across different timeframes, and amplifying imperceptible motion patterns from pixels,” Liu reported.
Multimedia analytics is an area of emphasis, according to Liu: “This includes natural language processing tasks such as automatic machine translation and transcription, and image and video processing tasks such as object and activity characterization.”
The Intelligence Advanced Research Projects Activity (IARPA) is working on numerous programs designed to deliver machine learning capabilities to the IC, according to IARPA Program Manager Hakjae Kim.
The program about which Kim is most enthusiastic is the Functional Map of the World (fMoW) Challenge, which concluded in December and in February will award cash prizes to the top five participants who developed algorithms to detect and categorize buildings, structures, and land uses in satellite imagery—a challenging task due to the sometimes low resolution and high variability (e.g., time of day, weather, etc.) of satellite images. To help participants train their algorithms, IARPA published one of the largest-ever publicly available satellite image datasets, annotated with more than a million points of interest across approximately 60 categories such as hospitals, schools, lighthouses, bridges, and cellphone towers.
“We’ve invested a lot of money to create inputs and outputs that can be used to train deep neural nets,” explained Kim, who hopes the algorithms produced during the fMoW Challenge will activate a community of developers who continue to apply their expertise toward IC objectives. “As more people become familiar with IC challenges, they’ll be able to use [the dataset we created] to help us solve our problems, which will be a bigger contribution than the algorithms that come out of the competition.”
Like IARPA, NGA is leveraging external expertise to acquire and scale its machine learning capabilities—most notably through its Global Enhanced GEOINT Delivery (Global-EGD) contract with DigitalGlobe, whose Geospatial Big Data Analytics (GBDX) platform is a marketplace through which customers can acquire machine learning algorithms created by DigitalGlobe and third-party developers for use with DigitalGlobe imagery.
“The Global-EGD contract’s largest and most attractive asset is the EnhancedView Web Hosting Service, which provides near-real-time access to over 1 billion square kilometers of DigitalGlobe imagery,” explained NGA Program Manager Brian Bates. “We’ve worked very closely with DigitalGlobe to build an interface between the EnhancedView Web Hosting Service and GBDX so our analysts can access … algorithms to run over different areas of interest that correspond with their mission set.”
An acquisition, design, delivery, and demonstration activity completed in summer 2017 unearthed a number of algorithms NGA analysts are currently applying across missions, according to Bates. There’s a water detection algorithm, for example, to identify water inundation after natural disasters; a soil detection algorithm to identify construction activity; and ship and plane detection algorithms to detect unusual air and marine activity. As of November, NGA is using a material identification algorithm that can detect manmade paints and polymers and a vehicle detection algorithm that can identify cars and trucks, as well as distinguish between them.
“[Analysts] receive alerts in the interface as well as email alerts … indicating that the threshold they have set for activity or number of objects has been met or exceeded, and what area that happened in,” Bates said. “Eyes-on-imagery analysis is a time-consuming process, and if you’re doing missions like search or monitoring it can be extremely tedious.”
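The threshold alerts Bates describes might work along the following lines. This is a minimal sketch under stated assumptions, not the GBDX or EnhancedView interface: the area names, object types, counts, thresholds, and function names are all invented for illustration.

```python
# Hypothetical alerting sketch; area names, counts, and thresholds are invented.
from collections import Counter

def count_by_area(detections):
    """Tally detections per (area, object_type) pair."""
    return Counter((d["area"], d["object_type"]) for d in detections)

def alerts(detections, thresholds):
    """Return one alert for each pair whose count meets or exceeds its threshold."""
    counts = count_by_area(detections)
    return [
        {"area": area, "object_type": obj, "count": n}
        for (area, obj), n in sorted(counts.items())
        if (area, obj) in thresholds and n >= thresholds[(area, obj)]
    ]

# Output of a detection algorithm run over two areas of interest (made up).
detections = [
    {"area": "port_a", "object_type": "ship"},
    {"area": "port_a", "object_type": "ship"},
    {"area": "port_a", "object_type": "ship"},
    {"area": "airfield_b", "object_type": "plane"},
]

# Analyst-set thresholds: alert at 3 ships in port_a or 2 planes at airfield_b.
thresholds = {("port_a", "ship"): 3, ("airfield_b", "plane"): 2}

triggered = alerts(detections, thresholds)
```

Here only the ship count in `port_a` reaches its threshold, so the analyst would be alerted to that area alone and could skip the imagery where nothing crossed a limit.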
Deep Learning at DoD
The DoD is pursuing machine learning capabilities as enthusiastically as the IC. The Army Research Lab (ARL), for example, is exploring a number of ways to enable deep learning at the tactical edge.
“This kind of computing is going to be embedded wherever we do computing,” said Dr. Brian Sadler, Army senior scientist for intelligent systems at ARL. “It’s going to be lightweight and low-power, and that’s going to allow us to apply algorithms not just in robots, but on sensors.”
On sensors, machine learning eventually will exploit “cognitive radio” techniques to create self-forming and self-healing networks. Such techniques allow warfighters and sensors to intelligently manage spectrum usage and network capacity in contested environments where wireless communications face hacking, jamming, and spectrum scarcity. Many of these challenges can be mitigated by machine learning algorithms that support dynamic changes to signal structure and frequencies, allowing data to be shared freely and securely as the environment evolves.
Watson is also helping the Army push machine learning to the tactical edge, according to IBM. Instead of competing on game shows, Watson is helping the Army’s Logistics Support Activity leverage the Internet of Things (IoT) to predict vehicle maintenance failures across the service’s fleet of 3,500 Stryker combat vehicles.
“The Stryker has the same [IoT] computer system that we all have in our cars,” said Sam Gordy, general manager of IBM’s federal business. “Merging that structured engine data with unstructured data like training manuals, field manuals, and handwritten maintenance reports from the field, then laying predictive analytics on top of that, allows us to, in essence, deliver personalized medicine to each individual Stryker combat vehicle. That not only gives you return on investment—lower maintenance costs—but more importantly gives you return on mission in the form of equipment uptime so you’re not putting soldiers at risk in the field.”
As it ingests more maintenance data from more vehicles, Watson will become smart enough to predict which vehicles will fail, as well as how, when, and under what circumstances.
As powerful as these predictive analytics are, machine learning’s greatest promise doesn’t lie in IoT insights, but in computer vision, which is the focus of DoD’s signature machine learning operation: Project Maven.
Established in April 2017 by Deputy Defense Secretary Robert O. Work, Project Maven—otherwise known as the Algorithmic Warfare Cross-Functional Team—is led by Air Force Lt. Gen. John N.T. “Jack” Shanahan, director for defense intelligence for warfighter support with the Office of the Under Secretary of Defense for Intelligence. Project Maven’s goal, according to the memo that established it, is “to turn the enormous volume of data available to DoD into actionable intelligence and insights at speed.” Step one toward achieving that objective is augmenting or automating processing, exploitation, and dissemination (PED) of full-motion video (FMV) captured by unmanned aerial vehicles in support of DoD’s campaign to defeat ISIS.
At press time, Shanahan’s 12-person team was on track to achieve the following goals by the end of calendar year 2017: organizing a data-labeling effort; developing, acquiring, and/or modifying algorithms to accomplish object detection, classification, and alerts for FMV PED; identifying required computational resources; determining a path to fielding necessary infrastructure; and integrating algorithmic-based technology with programs of record.
As promised by Shanahan, Project Maven’s first algorithms were delivered in December for testing.
“DoD has a huge influx of video coming in. Inside all this video are nuggets of intelligence, but there’s too much of it for analysts to ingest and digest to then make an intelligence decision on,” said Kevin Berce, business development manager at NVIDIA and co-chair of USGIF’s Machine Learning & Artificial Intelligence Working Group. “Machine learning is going to help tell the analysts where to look. If you’re looking for a white truck, why spend time looking at hours of video where there’s no white truck? Let’s just give the analysts the video where the white truck is.”
Man vs. Machine
Project Maven is expected to be a playbook for acquiring and operationalizing machine learning capabilities across DoD and the IC. One of the most valuable lessons it has yielded so far is that human analysts remain essential, according to Kanaan.
“Our approach is the idea of human-machine teaming,” explained Kanaan, who said the ultimate goal is for machines to take over the “observe” and “orient” components of a typical OODA loop so human analysts can concentrate on the “decide” and “act” components.
Although the goal is for humans to eventually rely on machines, for now it’s machines that must rely on humans, according to Kanaan, who stressed labeling as a key component of Project Maven; so far, he said, more than 1,000 Air Force intelligence analysts have labeled “tens of thousands” of objects for use in training Maven’s algorithms.
Data labeling is only the first step. Next must come data validation, which is a major priority for NGA, according to Bates. “We will be instituting a feedback mechanism where the analyst can click on the image and tell the algorithm where it failed,” he said. “That information will then go back to the algorithm developers to help them retrain their algorithm.”
Currently, Bates said, the algorithms NGA acquired from GBDX have an accuracy rate of approximately 70 percent. “That’s pretty good,” he continued, “but for government work we need it to be a lot more authoritative than that.”
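As a rough illustration of how analyst feedback could yield both an accuracy figure and a retraining set, consider the sketch below. It assumes accuracy means the share of reviewed detections an analyst marked correct, which may differ from how NGA measures it. The record layout, field names, and verdicts are hypothetical.

```python
# Hypothetical feedback records; the verdicts are invented for illustration.

def accuracy(feedback):
    """Fraction of reviewed detections the analyst marked as correct."""
    if not feedback:
        raise ValueError("no reviewed detections")
    correct = sum(1 for f in feedback if f["analyst_says_correct"])
    return correct / len(feedback)

def failures(feedback):
    """Detections flagged as wrong, to be returned to the algorithm developers."""
    return [f for f in feedback if not f["analyst_says_correct"]]

# Ten reviewed detections, three of which the analyst marked as wrong.
feedback = [
    {"detection_id": i, "analyst_says_correct": i not in (2, 5, 8)}
    for i in range(10)
]

rate = accuracy(feedback)        # 7 of 10 reviewed detections were correct
to_retrain = failures(feedback)  # the 3 flagged detections
```

The flagged records are what would flow back to developers, so each round of analyst review both measures the algorithm and supplies material to improve it.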
Cooperation between analysts and algorithms hinges on trust, according to Bates, who cited user confidence as a major hurdle. “[It’s the] crawl-walk-run paradigm,” he said. “Right now we’re crawling. And the reason I say that is because you can run algorithms against imagery all day long, but you’re not going to gain any kind of authority or trust with the elements if analysts don’t have the ability to verify the accuracy of those algorithms.”
The Defense Advanced Research Projects Agency (DARPA) is working on a solution: what it calls “Explainable AI.”
“These very complex data analysis algorithms are giving recommendations to an intel analyst, but the analyst may not understand why the system is making that recommendation,” said David Gunning, a program manager in DARPA’s Information Innovation Office. “That analyst gets judged according to the quality and accuracy of her recommendations, so in order to feel comfortable putting her name on the recommendation that goes forward she wants to understand what the machine learning model was thinking.”
Enter Explainable AI, which launched in August 2016 under Gunning’s tutelage. The five-year program has awarded contracts to 11 teams that are building software prototypes capable of explaining machine learning outcomes to human users. Like students in a high school math class, each team’s system will be instructed to “show its work.”
“Users will be able to ask the system, ‘Why do you think that’s a convoy in North Korea?’ And the system will come back with an initial explanation like, ‘Oh, I think these are trucks and they’ve all been on the road for an hour,’” explained Gunning, who said explanations might be verbal or visual (e.g., a photo with items circled on it). Understanding a system’s logic will build fidelity in a way that increases machine learning’s adoption across government. It’s not just about adoption, however. Because future adversaries might be able to hack American algorithms, it’s also about security, according to Gunning, who cited research wherein users were invited to use two different machine learning systems running the exact same algorithm.
“Experiments have shown that … if you just put a smiley face on one of the systems, people will trust that system more than the other one,” Gunning said. “So, it’s easy to fool people [when they can’t see] if the system is making a mistake or not.”
And machines do, in fact, make mistakes—just like humans. Which is why the future of data analytics isn’t man or machine; it’s man and machine. The question facing the defense and intelligence communities now is when and how the two can work together most effectively.
“Our workforce is ready for this. They deserve an unleashing of their innovative culture, and largely what underpins that innovative culture is the tactics in which you use technology,” Kanaan concluded. “While the nature of war remains largely unchanged, the character behind it is defined by those who can most quickly and effectively adapt in response to new and disruptive technologies.”
To learn more about USGIF’s Machine Learning & Artificial Intelligence Working Group, visit usgif.org/community/committees/machinelearning.