An Orchestra of Machine Intelligence

What the future might hold for intelligence analysis

Imagine a near future in which complex intelligence questions such as “Where is Osama bin Laden?” could be posed as simple textual queries, with answers automatically generated in milliseconds rather than months. Analysts wouldn’t spend countless hours searching for data to help answer their complex questions, nor would they spend many more hours waiting for large datasets to download and process on local computing resources. In this future, analysts would simply type a query, then an orchestra of machine intelligence (MI) systems would present a short list of high-probability answers with supporting information for each response. Let’s look into our crystal ball to see if we can catch a glimpse or two of what the future might hold for intelligence analysis. Before we do, here are some of the ground rules:

  • We refer to the following group of technologies collectively as machine intelligence (MI): classical artificial intelligence (AI), machine learning (ML), deep learning (DL), multitask learning, reinforcement learning (RL), data mining (DM), decision analysis (DA), and large-scale stochastic dynamic optimization (metaheuristics).
  • We believe MI is critical to answer demanding intelligence questions quickly and effectively due to the volume, velocity, and variety of data involved, but in this article, we do not advocate for one technique over another for any particular purpose.
  • Accurately answering complex questions often requires a broad spectrum of intelligence disciplines. Because of this, we are not constraining this discussion to geospatial intelligence (GEOINT) alone.
  • A sample scenario will help illustrate how MI is applied to answer complex intelligence questions. As such, we will attempt to show how the analysis involved in finding a high-value target (HVT), such as Osama bin Laden (OBL), is accelerated and enhanced through the use of MI. Note: The scenario described herein is a hypothetical application of how an MI-enabled system could have been used in the search for OBL. It is not intended to be historically accurate and is used only to illustrate the use of MI in an intelligence scenario.

The Orchestral Ensemble

Analysts are ever engaged in answering intelligence questions: Who was that and where did they come from? Where are they now? What are they planning and when will they strike? Though these questions seem simple, unearthing the answers is often challenging due to the mountains of raw data involved and the extensive skills and experience required. The Intelligence Community (IC) has come to know this broad spectrum of domain-specific data and skills by many names, such as geospatial intelligence (GEOINT), signals intelligence (SIGINT), human intelligence (HUMINT), measurement and signature intelligence (MASINT), and open-source intelligence (OSINT), among others.

Earlier, we introduced the concept of “an orchestra of MI systems” that could aid in answering these challenging questions. In a broad sense, we are talking about a system of humans and instruments working together to make beautiful music, especially if saving lives and ensuring national security is music to your ears. MI encompasses a broad set of computer science techniques dedicated to developing systems that can perform complex skills generally requiring human intelligence, including advanced visual perception tasks like automated target recognition (ATR), natural language comprehension, and other complex decision-making processes. Because the quantity of intelligence data has outpaced the growth of human resources, the community needs intelligent computer systems analysts can use to automate functions they cannot or should not perform manually: tasks such as searching for, downloading, selecting, deleting, moving, processing, re-processing, and sharing data. When analysts are freed from these inefficient tasks, they can focus their time on what they do best: activities that require human creativity and critical thinking, such as asking insightful sequences of questions, collaborating with other analysts, and evaluating which potential answers are most plausible for human decision-makers. This is the true value of highly trained analysts.

The Path Forward

How does the GEOINT Community get there from here? We can best illustrate our vision by examining a sample intelligence scenario. We envision that the user of this system would compose a question in plain language, asking something like, “Where is Osama bin Laden?” Asking this question would initiate an intricate series of activities to decompose the query, identify the applicable data sources, perform complex data analysis via disparate MI-enabled subsystems, fuse data and query results, and ultimately generate a ranked set of highest-probability responses for the analyst to consider. In our envisioned orchestra, MI is important as both composer and conductor, and our system can be conceptualized as these two main parts:

Query Composer: The first role of the query composer is to determine what the analyst is asking. The MI system uses natural language processing (NLP) to disambiguate the query and identify its specific interests: the HVT named “Osama bin Laden” (resolved to a unique entity identifier) and his most probable current location. The query composer is additionally responsible for identifying which data sources are applicable to answering the analyst’s question. Data suitability assessments are required to identify which sources provide relevant information with respect to the question asked, as well as to gauge each source’s accuracy and reliability. For our example, the query composer determines various intelligence sources are likely to contribute to answering this question, including all-source reports, field reports (HUMINT), cell-phone logs and voice recordings (SIGINT), satellite imagery and video (GEOINT), and information found on open sources such as real estate records and social media (OSINT).
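
As a rough illustration, a query composer might begin with intent and entity resolution along the following lines. This is a minimal Python sketch; the lookup tables, identifiers, and `compose` function are hypothetical stand-ins for the trained NLP models a real system would use:

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables for illustration only; a production composer
# would rely on trained NLP models rather than keyword matching.
INTENT_KEYWORDS = {"where": "LOCATE", "who": "IDENTIFY", "when": "TIMELINE"}
KNOWN_ENTITIES = {"osama bin laden": "HVT-0001"}  # name -> unique entity ID
SOURCES_BY_INTENT = {"LOCATE": ["ALL-SOURCE", "HUMINT", "SIGINT", "GEOINT", "OSINT"]}

@dataclass
class ComposedQuery:
    intent: str                # e.g., "LOCATE"
    entity_id: str             # resolved unique entity identifier
    sources: list[str] = field(default_factory=list)

def compose(question: str) -> ComposedQuery:
    """Resolve intent, target entity, and candidate data sources from plain text."""
    text = question.lower().rstrip("?")
    intent = next((v for k, v in INTENT_KEYWORDS.items() if text.startswith(k)), "UNKNOWN")
    entity = next((eid for name, eid in KNOWN_ENTITIES.items() if name in text), "UNRESOLVED")
    return ComposedQuery(intent, entity, SOURCES_BY_INTENT.get(intent, []))

print(compose("Where is Osama bin Laden?"))
```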

Query Conductor: When the analyst’s question is clearly understood and relevant data sources are identified, the query conductor begins orchestrating federated queries across domain-specific information subsystems, such as GEOINT libraries and SIGINT databases, and fusing results from MI-enabled analysis engines.
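
A federated fan-out of this kind might be structured as in the sketch below, which assumes each domain subsystem sits behind a common adapter interface; the adapter and function names are hypothetical:

```python
import asyncio

async def query_subsystem(name: str, query: dict) -> dict:
    """Stand-in for a domain adapter (GEOINT library, SIGINT database, etc.);
    a real adapter would translate the query and call its backing store."""
    await asyncio.sleep(0.1)  # simulate network and processing latency
    return {"source": name, "hits": [], "reliability": 0.8}

async def conduct(query: dict, sources: list[str]) -> list[dict]:
    """Fan the composed query out to all relevant subsystems in parallel,
    then hand the partial results to downstream fusion."""
    return await asyncio.gather(*(query_subsystem(s, query) for s in sources))

results = asyncio.run(
    conduct({"intent": "LOCATE", "entity": "HVT-0001"},
            ["ALL-SOURCE", "SIGINT", "GEOINT", "OSINT"]))
```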

For our scenario, the query conductor initially prioritizes all-source analysis reports generated by leading analysts. Discovered reports indicate several experts believe OBL is hiding in either Pakistan or Afghanistan. The query conductor uses these initial findings to query SIGINT databases to analyze cell-phone data recorded from those countries, looking for voice recognition patterns of OBL or his known lieutenants. While no direct matches are found for OBL’s voice signature, cell-phone activity of his known associates is used to create a pattern of repeating locations and times, known as a “pattern of life.”
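
For illustration, a pattern of life can be distilled from call records by counting repeating (location, time) pairs, as in this minimal sketch; the record format and threshold are assumptions, not a prescribed method:

```python
from collections import Counter

# Hypothetical call records for a known associate: (cell tower ID, hour of day).
calls = [
    ("TOWER-14", 9), ("TOWER-14", 9), ("TOWER-02", 20),
    ("TOWER-14", 9), ("TOWER-02", 20), ("TOWER-02", 21),
]

def pattern_of_life(records, min_count=2):
    """Repeating (location, hour) pairs constitute the subject's pattern of life."""
    return {pair for pair, n in Counter(records).items() if n >= min_count}

habitual = pattern_of_life(calls)  # {("TOWER-14", 9), ("TOWER-02", 20)}
```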

Using these patterns, the query conductor initiates anomaly detection algorithms and quickly detects an unusual call originating from a public pay phone booth to the cell phone of one of OBL’s lieutenants. Using the call time and phone booth location, the query conductor cross-correlates this information with GEOINT databases to identify potentially relevant datasets and queue them for automated processing. Facial detection/recognition algorithms automatically identify video footage from an unsecured web camera nearby containing the face of a person in the phone booth at the time of the call. However, due to low video resolution, the algorithms are unable to pinpoint the specific identity of the unknown caller. Simultaneously, a wide-area motion imagery (WAMI) collection acquired during the time of the phone call is found to also contain the phone booth’s geographic coordinates. Motion tracking algorithms are applied to the WAMI data.
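
One simple way to flag such a call is to test each new event against the established pattern of life. The sketch below reuses the hypothetical record format from the previous sketch; real anomaly detectors would be statistical rather than rule-based:

```python
# Habitual (location, hour) pairs, e.g., as produced by the pattern-of-life
# step sketched earlier (values hypothetical).
habitual = {("TOWER-14", 9), ("TOWER-02", 20)}

def is_anomalous(event, habitual, tolerance_hours=1):
    """An event is anomalous if its (location, hour) matches no habitual pair,
    allowing a small tolerance around the habitual hours."""
    loc, hour = event
    return not any(loc == h_loc and abs(hour - h_hour) <= tolerance_hours
                   for h_loc, h_hour in habitual)

# A call touching the network from an unfamiliar location stands out at once.
print(is_anomalous(("PAYPHONE-07", 14), habitual))  # True
```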

The caller’s movements both before and after the call are revealed, showing numerous stops throughout the day. The query conductor cross-correlates the stops with geographic information system (GIS) foundation databases, and one of the locations is identified as a residential compound of unknown ownership. A scan of available OSINT real estate websites reveals that a senior military officer owns the property. Additional OSINT scans of the officer’s social media posts reveal fervent support for OBL’s activities and ideology. Pattern analysis shows the officer’s prolific stream of social media posts suddenly ceased on the date of OBL’s last known public appearance.
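
Cross-correlating tracked stops against GIS foundation data can be as simple as a distance test against known features. The following sketch uses the haversine formula with hypothetical feature records and coordinates:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Hypothetical GIS foundation records: (feature name, lat, lon).
gis_features = [("Residential compound", 34.1700, 73.2200),
                ("Market", 34.1500, 73.2000)]

def match_stops(stops, features, radius_km=0.1):
    """Associate each tracked stop with any GIS feature within radius_km."""
    return [(name, stop)
            for stop in stops
            for name, f_lat, f_lon in features
            if haversine_km(stop[0], stop[1], f_lat, f_lon) <= radius_km]

print(match_stops([(34.1701, 73.2201)], gis_features))
```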

The query conductor then initiates GEOINT analysis of the residential compound across a series of high-resolution satellite images, and automated techniques identify unusual movement behaviors within the property. At this point, sufficient evidence has been collected to generate a high confidence score, and the query conductor presents the findings and supporting materials to the analyst for review. Upon confirmation by the analyst, the query conductor flags the compound’s address as a possible location of OBL in relevant interagency databases and includes it in tasking for future multisource surveillance activities.
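
One simple family of techniques for surfacing such movement is change detection by frame differencing across co-registered images. The sketch below, using NumPy on a synthetic image stack, illustrates the idea; operational systems would use far more sophisticated methods:

```python
import numpy as np

def changed_fraction(img_a: np.ndarray, img_b: np.ndarray, threshold: float = 30.0) -> float:
    """Fraction of pixels whose intensity changed by more than `threshold`
    between two co-registered images of the same scene."""
    diff = np.abs(img_a.astype(float) - img_b.astype(float))
    return float((diff > threshold).mean())

# Hypothetical stack of co-registered grayscale images over the compound.
rng = np.random.default_rng(0)
stack = rng.integers(0, 256, size=(5, 64, 64))

# Score consecutive pairs; unusually high scores suggest movement or change.
scores = [changed_fraction(stack[i], stack[i + 1]) for i in range(len(stack) - 1)]
```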

This scenario presents a single thread that MI algorithms might follow. However, we believe the system would best serve the analyst if several possible results, along with confidence scores and links to the supporting data, were provided in a format such as the following:

   #   Possible Answer     Confidence
   1   XXXX                90
   2   XXXX                82
   3   XXXX                76
   4   XXXX                40

The confidence scores are generated using weighted analysis across all contributing information sources and would vary depending on the pedigree and provenance of the source information and timeline data. Analysts would interact with each answer and confidence score to display the supporting data. This enables analysts to visually traverse the logic behind each recommendation and to either “agree” or “disagree” with the individual assessments. Based on this feedback, the MI algorithms would automatically re-assess the weights of the recommendations, thereby learning from the analyst’s assessment of the supporting data. This information can also drive data collection priorities and methodologies in preparation for the same or similar questions being posed again.
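
The weighted fusion and feedback loop described above might look like the following sketch; the source weights and the multiplicative update rule are illustrative assumptions, not a prescribed method:

```python
def fused_confidence(evidence: dict, weights: dict) -> float:
    """Weighted average of per-source confidence, weighted by source pedigree."""
    total = sum(weights[s] for s in evidence)
    return sum(weights[s] * c for s, c in evidence.items()) / total

def update_weight(weights: dict, source: str, agreed: bool, rate: float = 0.1) -> dict:
    """Nudge a source's weight up when the analyst agrees with the assessment
    it supported, and down when the analyst disagrees."""
    weights[source] *= (1 + rate) if agreed else (1 - rate)
    return weights

weights = {"HUMINT": 0.9, "SIGINT": 0.8, "GEOINT": 0.85, "OSINT": 0.6}
evidence = {"SIGINT": 0.82, "GEOINT": 0.90, "OSINT": 0.75}  # per-source confidence
print(round(fused_confidence(evidence, weights), 2))        # 0.83
weights = update_weight(weights, "OSINT", agreed=True)      # analyst feedback
```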

Additional Assertions

  • A flexible framework is needed in which new domain-specific algorithms can be plugged in, trained, and validated easily and effectively. Data from many information sources would require automated conditioning and source preparation to assist in conflating, normalizing, adding metadata to, and contextualizing the collected intelligence. These conditioning capabilities must exist as flexible, discrete services so that processing workflows can drive the conditioning and preparation of content; a sketch of such a pluggable framework appears after this list.
  • Analytic tradecraft is concerned with the interpretation of errors, including their nature, magnitude, direction, and associated consequences. Therefore, MI calculations must articulate their accuracy in a manner the analyst can readily understand. As MI algorithms improve, the accuracy of responses to posed questions should trend upward. Measures of effectiveness and performance must be established and followed for the various algorithms. For example, a 40 percent accuracy threshold may be sufficient for recommending the likelihood that an entity is at a specific location, but would not be suitable for situations involving kinetic effects.
  • Validating MI algorithms requires cross-industry approaches that establish each algorithm’s credibility. Validation datasets should minimize bias in model performance estimates and provide a mechanism for evaluating model-tuning parameters, and validation should occur whenever a model is prepared or retrained. Analyst feedback is important to improve confidence in the algorithms; a cross-validation sketch appears after this list.
  • Many analyst questions will revolve around scenarios with little available training data or lower-confidence predictions. For these scenarios, we believe the techniques described remain valid, but less mature data and training will require additional human expert involvement to better train the MI systems.
  • It is imperative that the knowledge and expertise of senior analysts be retained before they transition out of the analytic workforce. To facilitate this, MI techniques should be employed to watch and learn as expert analysts perform their tradecraft. This will capture expert tradecraft within the MI knowledge base without placing an additional training burden upon the analysts.
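
As promised in the first assertion above, here is a minimal sketch of a pluggable conditioning framework, assuming a simple registry keyed by service name; all names are hypothetical:

```python
from abc import ABC, abstractmethod

class ConditioningService(ABC):
    """Common interface every pluggable conditioning or analysis service implements."""
    @abstractmethod
    def process(self, record: dict) -> dict: ...

REGISTRY: dict[str, ConditioningService] = {}

def register(name: str, service: ConditioningService) -> None:
    """Plug a new domain-specific service into the framework at runtime."""
    REGISTRY[name] = service

class SourceNormalizer(ConditioningService):
    """Example plug-in: normalize the source field of an incoming record."""
    def process(self, record: dict) -> dict:
        record["source"] = record.get("source", "").strip().upper()
        return record

register("normalize-source", SourceNormalizer())

def run_pipeline(record: dict, steps: list[str]) -> dict:
    """Apply a configurable sequence of registered services to one record."""
    for step in steps:
        record = REGISTRY[step].process(record)
    return record

print(run_pipeline({"source": " sigint ", "payload": "..."}, ["normalize-source"]))
```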
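
And for the validation assertion, a standard way to reduce bias in performance estimates is k-fold cross-validation. This sketch uses scikit-learn on synthetic data purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data; real validation would use curated, held-out
# mission datasets with trusted ground truth.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# k-fold cross-validation avoids the bias of a single train/test split and
# yields a spread of scores for comparing model-tuning parameters.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())
```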

Conclusion

This article describes a future GEOINT system that employs MI technologies to supplement and support human analysis. Though creation of a fully functional MI system would require significant advances in technology, governance, and policy, the result would be highly valuable—a revolutionary advancement in intelligence analysis. Allowing analysts to do what they do best (creativity and critical thinking) is crucial to maintaining national security, and freeing analysts from rote and repetitive tasks would enable them to reach key decisions faster. Additionally, incorporating learning into the system will improve where, when, and how intelligence data is collected.
