Q: Why has the industry seen such a significant uptick in synthetic ID fraud attacks recently?
A: Four main factors contribute to this trend: the sheer volume of compromised PII available on the internet, the industry's move to digital services, Social Security number (SSN) randomization, and the prioritization of customer experience and growth over security.
Data breaches have been with us for some time now, but their scale keeps growing rapidly. In 2019, over 2.7 billion identity records were stolen and posted online for sale. The year got off to a rough start with the “Collection #1” breach in January. At the time, it was reported to be the largest public data breach by volume, with 772,904,991 unique emails and 21,222,975 unique passwords exposed.
With the shift to digital and online services for consumers, fraudsters can take PII they find on the internet and game the credit system with synthetic identities at scale, from the comfort of their homes. There’s also evidence that as institutions get better at combating third-party fraud, fraudsters are shifting to synthetic fraud because it is more difficult to detect. Information and tactics for perpetrating synthetic fraud are becoming more widely available as well.
Another material factor that continues to ravage ID verification systems is the randomization of SSNs by the Social Security Administration (SSA). Until June 25, 2011, the first three digits, known as the "area number," referred to the state or territory where an SSN was assigned. Since then, those digits have been randomly assigned. The unintended consequence was the unmooring of traditional fraud detection systems that used the geographic data as a factor in matching an SSN to a person. From the first three digits of a pre-randomization SSN, you could often tell in which state the holder was born or, at the very least, one of the states where they once lived. With randomized SSNs, you can no longer associate the state of birth with the SSN of the individual.
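To make the lost signal concrete, here is a minimal Python sketch of the kind of geographic check a pre-randomization system could run. The function names and return labels are hypothetical, and the lookup table is only a small excerpt of the historical area-number allocation; treat it as an assumption to verify against the SSA's published table rather than as reference data.

```python
from datetime import date
from typing import Optional, Set

# Cutover to randomized assignment, as stated in the text above.
RANDOMIZATION_DATE = date(2011, 6, 25)

# Illustrative, partial excerpt of historical area-number ranges.
# Assumption: verify against the SSA's published allocation before real use.
AREA_RANGES = [
    (1, 3, "NH"),
    (4, 7, "ME"),
    (8, 9, "VT"),
    (10, 34, "MA"),
    (35, 39, "RI"),
    (40, 49, "CT"),
    (50, 134, "NY"),
]


def area_state(ssn: str) -> Optional[str]:
    """Return the state implied by the area number, or None if not in the excerpt."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError("expected a 9-digit SSN")
    area = int(digits[:3])
    for low, high, state in AREA_RANGES:
        if low <= area <= high:
            return state
    return None


def geography_signal(ssn: str, issued_on: date, claimed_states: Set[str]) -> str:
    """Compare the area number with states the applicant claims to have lived in."""
    if issued_on >= RANDOMIZATION_DATE:
        # Randomized SSNs carry no geographic information at all.
        return "no_signal"
    state = area_state(ssn)
    if state is None:
        return "unknown_area"
    return "match" if state in claimed_states else "mismatch"
```

For an SSN issued in 1990 with area number 010, the check returns "match" for an applicant with a Massachusetts history and "mismatch" for one who has only ever lived in California. For any SSN issued after the cutover it returns "no_signal", which is exactly the gap described above.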
Finally, with the tech boom and fierce competition among fintechs, it is not uncommon for institutions to prioritize customer acquisition by minimizing friction in the onboarding process. Synthetic fraud detection gets overlooked in the rush to grow the customer base and introduce new services. But growth and security aren’t mutually exclusive. Working with a company like Socure optimizes fraud and IDV checks efficiently and accurately, without degrading the onboarding experience.
Q: How do fraudsters build and pollinate synthetic identities?
A: Synthetic identity fraud is a type of fraud in which a fraudster creates an entirely new identity, often by combining real and fake pieces of information. The three common approaches are:
- Identity fabrication: first-party synthetic fraud, where the identity is built entirely from fake elements
- Identity manipulation: numbers or letters in real identity elements are transposed, e.g. ‘1234 Main Street’ becomes ‘1243 Main Street’, which may still pass fuzzy matching
- Identity compilation: third-party synthetic fraud, using a combination of real and fake identity elements
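The manipulation example above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any production matcher: the 0.9 threshold and the function names are assumptions, and `difflib.SequenceMatcher` from the standard library stands in for whatever fuzzy-matching algorithm a real system uses.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_adjacent_transposition(a: str, b: str) -> bool:
    """True if b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b):
        return False
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


def compare_addresses(on_file: str, submitted: str, threshold: float = 0.9) -> dict:
    """Report both the fuzzy-match decision and a tumbling check.

    The threshold is a hypothetical value chosen for illustration.
    """
    score = similarity(on_file, submitted)
    return {
        "score": score,
        "fuzzy_match": score >= threshold,
        "tumbled": is_adjacent_transposition(on_file, submitted),
    }
```

Calling `compare_addresses("1234 Main Street", "1243 Main Street")` reports a fuzzy match and also flags the pair as tumbled. A similarity score alone accepts the manipulated address, whereas an explicit transposition check turns the same near-match into a warning sign.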
Unlike traditional identity fraud, where a criminal uses a victim’s identity for a quick ‘smash and grab’ scheme, synthetic identities are usually established over time (months or even years) to extract maximum financial reward. The perpetrators build up a decent credit score, open multiple accounts, and often appear to be good customers, going undetected until they decide to cash in, or “bust out,” using up all available credit lines and disappearing.
Q: Why are synthetic identities difficult to detect?
A: One of the biggest issues we hear from prospects and customers alike is the lack of patterns to predict synthetic fraud. What is needed is a common definition of a synthetic ID and a consistent dependent variable (the label a model is trained to predict) for classifying it. Without this kind of detail, it is difficult to build fraud models that can consistently and accurately detect synthetic identity fraud.
Synthetic identity fraud also often has no individual victim whose identity was taken over, so there is no one to raise their hand and alert lenders that someone is using their identity to open accounts. When the fraudster disappears, it is difficult to tell the difference between a synthetic identity and a genuine customer who defaulted and simply could not be reached. This enables fraudsters to work completely anonymously and with the luxury of time to reap maximum benefit.
In many cases there is also a simple lack of awareness. Some institutions are unaware of the magnitude of synthetic identities hiding in their customer portfolios because they look benign or even great (many have FICO scores over 750).
Q: What attributes indicate potential synthetic ID fraud?
A: We have found that certain attributes are indicators of synthetic identity, and our ML models evaluate hundreds of variables. A few general ones that come to mind are:
- High number of inquiries across all credit reports
- Randomized SSN issued after 2011
- The inability to match and/or verify an identity across all PII elements
- High risk associated with a phone number
- High risk associated with an email address
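As a rough illustration of how the indicators above might be surfaced for review, here is a minimal Python sketch that turns them into named flags. It is a hand-written rule set, not Socure's ML model: the `Applicant` fields, the thresholds, and the 0-to-1 risk scores from upstream phone and email services are all hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, chosen only for illustration.
HIGH_INQUIRIES = 6
HIGH_RISK = 0.8


@dataclass
class Applicant:
    inquiry_count: int          # inquiries across all credit reports
    ssn_randomized: bool        # SSN issued after the 2011 randomization
    pii_elements_matched: int   # PII elements that verified against records
    pii_elements_total: int     # PII elements submitted
    phone_risk: float           # 0.0 (low) to 1.0 (high), assumed upstream score
    email_risk: float           # 0.0 (low) to 1.0 (high), assumed upstream score


def synthetic_indicators(a: Applicant) -> List[str]:
    """Return the names of the indicators present, in the order listed above."""
    flags = []
    if a.inquiry_count >= HIGH_INQUIRIES:
        flags.append("high_inquiry_count")
    if a.ssn_randomized:
        flags.append("randomized_ssn")
    if a.pii_elements_matched < a.pii_elements_total:
        flags.append("pii_mismatch")
    if a.phone_risk >= HIGH_RISK:
        flags.append("high_risk_phone")
    if a.email_risk >= HIGH_RISK:
        flags.append("high_risk_email")
    return flags
```

Returning named flags rather than a single weighted score keeps the sketch free of invented weights. No single flag is conclusive on its own; a randomized SSN, for example, is also what every legitimate applicant issued a number after 2011 will have.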
The Socure Approach
Sigma Synthetic Fraud was developed for a digital-first environment. The product was built from the ground up in partnership with our largest customers to meet high consumer expectations while combating increasingly sophisticated fraud tactics and vectors.
Socure has uniquely deployed specific reason codes that enable clients to clearly understand hidden patterns of synthetic fraud. These reason codes were derived using graph-based techniques, built on Neo4j and Spark, that extract topological, velocity, and PII interaction features.
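To give a sense of what a PII interaction feature can look like, here is a minimal Python sketch that counts how many other identities share each PII element with a given identity. It uses plain dictionaries in place of Neo4j or Spark and is not Socure's pipeline; the record layout, function names, and feature names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (identity_id, pii_type, pii_value).
Record = Tuple[str, str, str]


def build_pii_index(records: Iterable[Record]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each PII element to the set of identities that use it.

    This is the identity-to-PII bipartite graph stored as adjacency sets.
    """
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for identity_id, pii_type, value in records:
        index[(pii_type, value)].add(identity_id)
    return index


def sharing_features(records: Iterable[Record]) -> Dict[str, Dict[str, int]]:
    """For each identity, count the other identities sharing each PII type."""
    records = list(records)
    index = build_pii_index(records)
    features: Dict[str, Dict[str, int]] = defaultdict(dict)
    for identity_id, pii_type, value in records:
        others = len(index[(pii_type, value)]) - 1
        key = f"shared_{pii_type}"
        # Keep the largest count if an identity has several values of one type.
        features[identity_id][key] = max(features[identity_id].get(key, 0), others)
    return dict(features)
```

An identity whose SSN or phone number is shared with several otherwise unrelated identities is the kind of topological pattern a graph query can expose. Velocity features would add a time dimension, such as how quickly those links appear, which this sketch leaves out.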
Using unsupervised ML and multi-class classification, Socure is able to further understand and share with clients the differences between synthetic label patterns across a diverse base of customer institutions. This enables Socure to provide intelligent industry recommendations on how to best label synthetic fraud for collections and other purposes.
Using supervised ML and a variety of boosting algorithms, Socure is able to deliver an explainable synthetic ID fraud model that has the highest classification rates of any vendor in the industry.