Trajectory spoke with Rendered.ai’s CEO about the company’s recent partnership with Orbital Insight to contribute synthetic data for a project that is building automated detection technology for the National Geospatial-Intelligence Agency.
In his keynote address at last year’s GEOINT Symposium, National Geospatial-Intelligence Agency (NGA) Director VADM Robert Sharp said partners cooperating and combining their efforts to meet needs will be a key aspect of our evolution to get the best out of the GEOINT community in the future. How did Orbital Insight and Rendered.ai become collaborators on a Small Business Innovation Research (SBIR) grant from NGA?
Kuntz: When we founded Rendered.ai, we teamed up with an engineer at Orbital Insight to take a look at whether or not synthetic data could help improve some of the rare object detectors they were working on. In doing that exploration, we came upon several opportunities and challenges ahead for synthetic data, which we started to integrate into the Rendered.ai platform. When the SBIR solicitation came out, it seemed natural for us to extend that work to address NGA’s goals.
What are the goals of your research grant (Phase 1 and Phase 2) and how are you measuring achievements toward those goals?
We are focused, very broadly, on rare-object detection and building tools that can help train detection algorithms and assess their likely efficacy. One of the major challenges in AI for remote sensing is the importance of rare objects. Sometimes the object itself is rarely seen. In other circumstances, the object needs to be detected in a different context. Furthermore, there is often such a dearth of data that it is hard to even test algorithms. We are exploring just how few “real” examples you can get away with.
Can you explain what synthetic data is and why it’s important for training AI models?
Wow. That’s a big question. So AI is basically software in which the behavior is determined by “training” the software with data, and just about every AI project ever undertaken is performance-limited by the availability of that data. There are a lot of reasons for this, such as:
Rare objects and edge cases: The most important things to find are often the things you rarely see, and these are consequently the objects least likely to yield large quantities of training data. If an object appears in 10% of the images collected, you will need to collect 10 times as many images as you need to have data for AI training.
Data labeling: It’s actually not enough to have data for AI training. You also have to tell the computer what is in that data. For detection, human beings have to literally draw boxes around those objects. This is a tedious and expensive process that is prone to error. In remote sensing it can be hard for humans to identify objects (we have seen a lot of examples of incorrect labeling) or it might be impossible for a human to interpret the data (such as in some radar imagery) if you don’t have prior knowledge of the scene.
New sensors: In a world in which we have collected millions of images in order to train AI systems, what do we do with new sensor technology? Do we have to do that collection and labeling again? For that matter, how do we know what sensor specifications will be most useful for detection with AI? If a user could simulate data from new or proposed sensor types, then they could start to train AI before they deploy the new sensors in the field, greatly accelerating innovation.
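As a rough illustration of the arithmetic behind the rare-objects point, the short Python sketch below estimates how many images would have to be collected, on average, to end up with a given number that contain the object. The function name `images_to_collect` and the target of 1,000 examples are assumptions made for this illustration; they do not come from the interview.

```python
import math


def images_to_collect(examples_needed: int, appearance_rate: float) -> int:
    """Expected number of images to collect so that, on average,
    `examples_needed` of them contain the object of interest.

    `appearance_rate` is the fraction of collected images in which
    the object appears (0.10 means 10% of images).
    """
    if not 0 < appearance_rate <= 1:
        raise ValueError("appearance_rate must be in (0, 1]")
    # Round before taking the ceiling to guard against floating-point fuzz.
    return math.ceil(round(examples_needed / appearance_rate, 6))


# Hypothetical target: 1,000 images that actually contain the object.
for rate in (0.10, 0.01, 0.001):
    total = images_to_collect(1000, rate)
    print(f"object in {rate:.1%} of images: collect about {total:,} images")
```

At a 10% appearance rate the multiplier is 10, matching the figure quoted above, and it grows in inverse proportion as the object gets rarer.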
Synthetic data provides a means to address these challenges by essentially using physics-based simulations to create datasets from a combination of a known ground truth (usually a 3D mesh) and a physics simulation of the sensor (RGB, HSI, SAR, etc.).
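The idea of pairing a known ground truth with a simulated sensor can be sketched with a deliberately simplified Python toy. This is not Rendered.ai's platform or method: the "ground truth" here is just a bright rectangle on a dark background rather than a 3D mesh, the "sensor" is plain Gaussian noise rather than a physics simulation, and the name `render_toy_scene` and all parameter values are hypothetical. What it shows is that, because the generator places the object itself, every image comes with a correct bounding box and no human labeling step.

```python
import random


def render_toy_scene(rng, size=32, obj_h=4, obj_w=6, noise_sigma=0.05):
    """Return (image, bbox) for one toy synthetic scene.

    image: size x size grid of brightness values in [0, 1]
    bbox:  (left, top, right, bottom) of the object, known exactly
           because the generator chose where to put it.
    """
    # Ground truth: pick the object's position inside the frame.
    top = rng.randrange(size - obj_h + 1)
    left = rng.randrange(size - obj_w + 1)

    image = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = top <= y < top + obj_h and left <= x < left + obj_w
            ideal = 0.8 if inside else 0.2  # bright object, dark background
            # Stand-in "sensor model": additive Gaussian noise, clipped to [0, 1].
            noisy = ideal + rng.gauss(0.0, noise_sigma)
            row.append(min(1.0, max(0.0, noisy)))
        image.append(row)

    bbox = (left, top, left + obj_w, top + obj_h)
    return image, bbox


rng = random.Random(0)  # fixed seed so the dataset is reproducible
dataset = [render_toy_scene(rng) for _ in range(100)]
print(len(dataset), "labeled images; first bounding box:", dataset[0][1])
```

Changing the noise model or the object size in a sketch like this is a stand-in for the "new sensors" case: the same ground truth can be re-rendered under a different simulated sensor without collecting or labeling anything again.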
How will the work you are doing benefit NGA and the broader GEOINT community?
The most immediate benefit is improved detection algorithms. But the extended benefits may become even more important, such as being able to anticipate the performance of algorithms in unforeseen circumstances, to fix bias and model drift, and to provide a test bed for continuous engineering of these algorithms. It turns out AI is not the first form of software to have bugs.
Has USGIF membership benefited your organizations? If so, how?
We are relatively new to the USGIF community and are excited to start participating!
Do you have any guidance for other companies seeking collaborative partnerships like yours?
I actually spend a lot of time mentoring startup companies. It’s hard to wrap all of the advice I would give into a written answer here, but one thing I encourage small companies to do is to work with larger companies as prime contractors when they are first getting started. Startups face a lot of hurdles like a lack of resources for proposal writing, and a lack of past performance. Teaming up with a leader in the space (like Orbital Insight) can be a great way to jumpstart the process.
Featured image, provided by Rendered.ai. Top row: synthetic images are 2nd and 3rd from left. Bottom row: synthetic image is 2nd from left.