Prior to July 2016, acclaimed data scientist Jeff Jonas was an IBM fellow and the company’s chief scientist of context computing. In August 2016, Jonas founded Senzing, a spinout of IBM focused on delivering easy-to-use, affordable, smart, real-time entity resolution to the world at large.
Jonas’ work played a role in defeating card-counting teams as depicted in the book Bringing Down the House and the movie 21. He is currently the author or co-author of 14 patents. Jonas was briefly a quadriplegic in 1988 following a car accident. Since then, he has fully recovered to compete in more than 50 Ironman triathlons and is one of only four people to complete every Ironman triathlon in the world. Jonas is also a member of USGIF’s Board of Directors.
Can you tell us about Senzing?
I proposed a one-of-a-kind spinout to IBM, and we spun out a license for the source code, the rights to practice some patents, and some core team members. It has turned into a really unique and fantastic partnership, and has allowed my team to get singularly focused on democratizing entity resolution (ER). We are headquartered in Venice Beach, Calif., but have people located all over the country.
What is entity resolution?
Senzing’s mission is ER—it’s all we do. All organizations have duplicate identities in their data. On your phone you probably have duplicates. Imagine this problem for a bank, social service agency, or healthcare organization with tens or hundreds of thousands, or even millions, of identities to manage. Anyone managing identity lists has a need for ER.
A marketing department trying to remove duplicates from their mailing list is the simplest use case. Most organizations purchase expensive and complicated ER products that are difficult to use and require experts. Or they try to perform ER themselves, which is even more challenging and requires a team of programmers—some companies have 10, 20, or more engineers dedicated to building ER. Our mission is to democratize ER—to make world-class ER easy and affordable.
Our ER software, G2, helps organizations find non-obvious connections in their data. For a bank, are you looking at five customers each with one bank account or one customer with five bank accounts? ER is also ideal for insider threat detection, like finding the nexus between a former employee who was fired and an insider threat investigation.
How might ER be useful to the Intelligence Community?
Intelligence is often about keeping an eye out for bad guys, for example, to make sure they’re not coming into the country or showing up on the radar in some surprising way. Historic intelligence failures often happened because the dots weren’t connected fast enough.
You have to be able to do fuzzy matching to find clever criminals. As such, ER must take into account things like name misspellings, messy addresses, and number transpositions. It must see through all of this fuzziness to determine who’s who. Roughly 50 companies sell some form of ER, but we’re the only one that does so using real-time machine learning (ML). Most ML you have to teach, tune, and reload. Our method is self-tuning and self-correcting in real time, without reloading. It just gets smarter as you go forward, and that eliminates the need for experts.
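To make the idea of fuzzy matching concrete, here is a minimal Python sketch of comparing two records despite a misspelled name, a messy address, and a transposed phone digit. It is an illustration only and is not Senzing's or G2's method: the record fields (`name`, `address`, `phone`), the helper names, and the 0.8 threshold are all assumptions made for this example, and it uses the standard library's `difflib.SequenceMatcher` rather than any machine learning.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_transposition(a: str, b: str) -> bool:
    """True if b is a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


def likely_same(rec_a: dict, rec_b: dict, threshold: float = 0.8) -> bool:
    """Hypothetical rule: names must be close, plus a close address or phone."""
    name_ok = similarity(rec_a["name"], rec_b["name"]) >= threshold
    addr_ok = similarity(rec_a["address"], rec_b["address"]) >= threshold
    phone_ok = rec_a["phone"] == rec_b["phone"] or is_transposition(
        rec_a["phone"], rec_b["phone"]
    )
    return name_ok and (addr_ok or phone_ok)


# Made-up records for illustration only.
a = {"name": "Jeffrey Jonas", "address": "123 Main St.", "phone": "7025551234"}
b = {"name": "Jefrey Jonas", "address": "123 main street", "phone": "7025552134"}
print(likely_same(a, b))
```

A fixed threshold like this is exactly the kind of hand tuning the answer above says real-time, self-tuning ER is meant to avoid; the sketch only shows what "seeing through the fuzziness" means at the level of a single pair of records.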
What are the implications of ER for the GEOINT Community, more specifically?
ER allows data without a geographic location to be combined with data containing a geographic location. When such records resolve, data previously without location can be mapped.
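A small Python sketch can illustrate this point. It assumes, purely for the example, that records resolve when they share a normalized email address (a crude stand-in for real ER) and that each record carries an optional `location` field; the function names and record layout are hypothetical and do not come from G2.

```python
from collections import defaultdict


def resolve_by_key(records, key="email"):
    """Group records sharing a normalized key (a simple stand-in for real ER)."""
    entities = defaultdict(list)
    for rec in records:
        entities[rec[key].strip().lower()].append(rec)
    return entities


def propagate_location(entities):
    """Give each record the first known location found within its entity."""
    mapped = []
    for recs in entities.values():
        known = next((r["location"] for r in recs if r.get("location")), None)
        for r in recs:
            mapped.append({**r, "location": r.get("location") or known})
    return mapped


# Made-up records: only record 2 has coordinates (latitude, longitude).
records = [
    {"id": 1, "email": "A@example.org", "location": None},
    {"id": 2, "email": "a@example.org ", "location": (38.9, -77.0)},
    {"id": 3, "email": "b@example.org", "location": None},
]
for rec in propagate_location(resolve_by_key(records)):
    print(rec["id"], rec["location"])
```

Once records 1 and 2 resolve to the same entity, record 1 inherits a location and can be placed on a map, while record 3, which resolves to nothing with a location, stays unmapped.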
What do you mean by the “democratization” of ER?
We are making ER very easy to use, literally for the first time. As long as someone can use Microsoft Excel, they can resolve entities. For example, if you buy a marketing file, how can you be sure you’re not marketing to people who are already customers?
Or, one of our customers does supply chain risk assessment for global brands. They use ER to scrape lists looking for derogatory information about their vendors, such as toxic spills or child labor. ER allows them to go back to their customer and say, for example, “Do you realize your manufacturer is in trouble for three toxic spills and child labor? This could place your brand’s reputation at risk.”
G2 is quite versatile. For example, since 2012, an early version of G2 has been used by the Electronic Registration Information Center (ERIC) nonprofit organization to modernize voter registration in America. As of December, one third of the country runs on this system and both Democrats and Republicans love it. It’s a great system, and one of the systems I’m most proud of.
Some experts say a global artificial intelligence (AI) “arms race” is beginning to unfold. What are your thoughts on this?
AI and ML have captured everybody’s imagination. They are currently overhyped, but I still think it would be foolish to not make the most of them. AI will be a key differentiator for how people compete, whether banks competing with banks, governments competing with other governments, or law enforcement competing with organized crime. Regarding the notion of an arms race, I think technology has always been an arms race and this is just another flavor. Certainly you’re going to have to be better at it than your adversary if you want to remain competitive.
What’s in store for the future of Senzing?
My team and I collectively have more than 200 years of ER experience. ER is a huge market and we’re going to serve every corner of it. Our style of ML is unique and tailored to the kind of analytics required to conduct real-time ER. We have a long list of advanced features, though we are now focused on masking G2’s sophistication to make it smart enough that it runs itself—like a giant easy button.
The basics of this technology have been around for a while. It’s getting smarter and smarter but it has always been complex and difficult to deploy. The next evolution is making it easier and more affordable so it doesn’t take an army of systems integrators and lots of time and money to implement.
Headline Image: Jeff Jonas addresses young professionals during a mentoring session in the USGIF booth at the GEOINT 2013* Symposium.