Data scientists are key players in helping enterprises derive value from their data environments. They use a combination of industry domain knowledge, programming skills, and statistical expertise to extract meaningful insights from data. This skillset is critical for enterprises using data to optimize business operations, but data science expertise is already scarce and getting even harder to find. In order to maximize the effort of valuable data scientists, organizations need to figure out what to do in-house, and where to leverage external resources. Socure deploys a team of data scientists to solve application fraud and identity verification problems in a single, comprehensive solution, freeing up your own team to focus on tasks that are more central to your core business. 

Socure utilizes machine learning (ML) to solve crucial pain points our customers face. We combine ML with the efforts of our own data science team in a continuous innovation approach that gives customers the most current and comprehensive consumer identity verification capabilities, which lead to seamless, safe account openings at scale. In fact, data scientists make up about 25 percent of our current staff, and that number continues to rise as we never stop pursuing better ways to verify identities and predict fraud for enterprises in all industries. Those solutions are a result of our data scientists, our proprietary identity graph, and our enormous investments in data. Because we do the heavy lifting in what we do best (ID verification and fraud detection), we help enterprises make better use of their own scarce data science expertise to address the aspects of their business they know best.

Applying Data Science: The Process, Pitfalls, and Power

Finding the right data sources, developing meaningful models, and continuously improving those models is a challenge even for the most mature and well-funded data science teams. Starting with raw data analysis, you have to locate the optimal data sources to inform your models. For every type of data, you typically need multiple sources so that you are not relying on a single source that might have flaws or gaps. Taking it a step further, you have to continuously scan these sources, as well as new ones that are spun up, to inform and improve your models. This is not a one-and-done exercise that yields a fixed list of data sources, but an ongoing process of scanning for new sources and evaluating their efficacy. Socure evaluates eight to ten new data sources every quarter.

The next step is extracting insights from the data to inform your model; this is also referred to as “feature engineering.” In the context of fraud, that means identifying attributes that provide insights into the riskiness of an identity, including name, email, phone number, device, physical address, IP address, and other elements that make up a digital identity. What’s more, it is not simply a matter of evaluating what is good and bad in individual data elements, but of evaluating the combinations of those elements that might indicate fraud. For example, a single device used for five different account applications might be benign, but if a different identity is associated with each application, that might raise the risk of those applications. Or, if the tenures of an email address, phone number, and physical address are proximate (all established at around the same time), that could be an indicator of a synthetic identity. These features, or predictors, need to be evaluated against training data composed of hundreds of millions of known outcomes to find what is most indicative of fraud. For our latest Sigma Identity Fraud solution, we evaluated 17,000 features. And you repeat this process on an ongoing basis to compare your existing “champion” model against “challenger” models that might be better.
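
To make the two examples above concrete, here is a minimal sketch of how such combination features might be computed. It is an illustration only, not Socure's implementation: the `Application` record, its field names, and the 30-day `max_spread_days` threshold are all hypothetical assumptions chosen for readability.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    """One account application. All field names are hypothetical."""
    app_id: str
    identity_id: str
    device_id: str
    email_age_days: int
    phone_age_days: int
    address_age_days: int


def identities_per_device(apps):
    """Map each device_id to the number of distinct identities seen on it."""
    seen = defaultdict(set)
    for app in apps:
        seen[app.device_id].add(app.identity_id)
    return {device: len(ids) for device, ids in seen.items()}


def tenure_spread_days(app):
    """Gap in days between the oldest and newest of the three tenures."""
    ages = (app.email_age_days, app.phone_age_days, app.address_age_days)
    return max(ages) - min(ages)


def build_features(apps, max_spread_days=30):
    """Return one feature dict per application, keyed by app_id.

    max_spread_days is an illustrative threshold, not a recommended value.
    """
    per_device = identities_per_device(apps)
    features = {}
    for app in apps:
        spread = tenure_spread_days(app)
        features[app.app_id] = {
            "identities_on_device": per_device[app.device_id],
            "tenure_spread_days": spread,
            "tenures_proximate": spread <= max_spread_days,
        }
    return features


# Made-up sample data: two applications per device.
apps = [
    Application("a1", "alice", "dev-1", 2000, 1500, 900),
    Application("a2", "alice", "dev-1", 2000, 1500, 900),
    Application("a3", "bob", "dev-2", 40, 35, 30),
    Application("a4", "carol", "dev-2", 700, 20, 365),
]
features = build_features(apps)
```

In this made-up data, device `dev-1` carries repeat applications from one identity, while `dev-2` carries two different identities, and application `a3` has an email, phone, and address that all appeared within days of each other. Neither signal is proof of fraud on its own; in practice, features like these would be scored against known outcomes to see which are actually predictive.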

You also need to continually improve models based on feedback data. Getting more feedback data improves model precision. A single enterprise might be able to use feedback data from its particular base of consumers, but Socure is leveraging feedback data from a consortium of more than 750 leading companies, including four of the top five banks, and seven of the top ten credit card issuers. This data enables us to identify trends within industries and across industries to which in-house teams don’t have access. Socure’s intelligence insights are derived from credit issuers, credit unions, lenders, banks, fintechs, insurance, remittance, payroll and other digital industries. This training data allows us to see every possible pattern of good, bad, and questionable identity attributes so we can identify and evaluate them. 

How Socure Complements Your Data Science Initiatives

Organizations face a variety of challenges in selecting the right data science models and teams for the right business case. Do you want to gain better insights into customer behaviors, optimize the customer journey across channels (store/branch/facility, call center, website, mobile, or other), or gain a single view of the customer? The challenge is choosing the most impactful problems to analyze and solve. 

Another data science challenge lies in establishing and maintaining your data pipeline. Socure automates the process of acquiring, ingesting, and cleansing data originating from hundreds of sources. In addition, we can automate the ingestion of customer feedback data from a multitude of sources. The end result is the most accurate modeling in the industry. This provides more accuracy for our customers, who then supply even better data, creating a loop that is continually improving. We typically issue a new fraud model every quarter, and are looking to improve models with even greater frequency.

A significant reason for Socure’s business growth and momentum is that companies recognize they can apply Socure’s data science expertise to solve their digital identity verification problems. The continually improving accuracy delivered by Socure enables our customers to auto-accept more good consumers, deflect fraudsters, and minimize the operational costs of manual application review. Having Socure solve for application fraud lets you improve both fraud prevention and conversion while freeing your internal data science talent to work on problems that are more significant or more specialized to your business.

Give us a shout if you want to understand how Socure can help you reduce your fraud risk, auto-accept more good customers, minimize your false positives, and streamline your risk program. 

Topics: Identity verification, ML, identity fraud detection

Todd Thiemann

Todd is Senior Director for Product Marketing at Socure, where he manages marketing for Socure’s Fraud suite of offerings. Prior to Socure, Todd worked in cybersecurity and identity at companies including Arctic Wolf Networks, Nok Nok Labs, Vormetric/Thales, and Trend Micro. Find Todd on LinkedIn.