August 13, 2024 - Dan Gringarten

4 Super Powers of Unsupervised Machine Learning

PwC’s Global Economic Crime and Fraud Survey 2022 reported that 51% of surveyed organizations experienced fraud over the previous two years. Fraudulent activities often deviate from normal behavior patterns, but these deviations are not always apparent.

As financial transactions grow in volume and complexity, the ability to automatically adapt and uncover new fraud patterns becomes crucial. Unsupervised machine learning (UML) – a type of algorithm used to analyze and cluster unlabeled datasets – offers a dynamic, proactive approach to fraud detection, ensuring that financial institutions stay ahead of increasingly sophisticated fraudulent schemes.

Supervised ML vs. Unsupervised ML

Before we explore how UML empowers fraud detection, let’s take a look at why it has moved to the forefront.

A few years back, fraud was primarily individual in nature. For instance, a person might walk into a bank to cash a forged check. As businesses, financial institutions and consumers have transitioned to digital platforms, the sophistication and scale of fraud have increased dramatically.

We are now in the third generation of online theft, known as “Ubiquitous Identity Theft.” According to the Cybercrime Journal, 1.077 billion identity records had been stolen in data breaches by 2018. That works out to about 4.4 stolen records for every adult in the U.S., on average.

Traditional approaches to fraud detection, such as supervised machine learning (SML), are insufficient for combating modern fraud. SML relies on labeled historical data to train models, so it is slow to adapt to new and evolving fraud tactics, and fraudsters are constantly changing their strategies. Unlike SML, UML works with unlabeled data and is capable of identifying new and evolving patterns in real time, providing a more robust defense against sophisticated and large-scale fraud.

4 Super Powers of UML

UML algorithms can sift through vast amounts of transaction data to uncover anomalies that might indicate fraudulent behavior. By identifying these outliers, financial institutions can detect and prevent fraud more effectively.
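
To make the idea of flagging outliers concrete, here is a minimal sketch in Python. It uses a modified z-score based on the median absolute deviation (MAD), which is one common way to flag outliers without labels and not necessarily what any particular product uses. The function name, the 3.5 threshold and the sample amounts are assumptions chosen for illustration.

```python
from statistics import median

def mad_outliers(amounts, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(amounts)
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be scored
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [i for i, x in enumerate(amounts)
            if 0.6745 * abs(x - med) / mad > threshold]

# Hypothetical transaction amounts; only the seventh value is far from the rest.
amounts = [42.0, 38.5, 51.0, 47.2, 40.1, 44.9, 2950.0, 39.8]
flagged = mad_outliers(amounts)
print(flagged)
```

A check like this looks at one value at a time, so on its own it can be noisy; a large but legitimate purchase would be flagged just as readily as a fraudulent one.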

But that’s not all! Here are four “secret” capabilities of UML in proactive fraud detection.

1) It slashes false positives.

UML doesn’t need labeled data; it analyzes large sets of unlabeled data to find hidden patterns and spot new threats. Using techniques like anomaly detection, clustering and graph analysis, it can see how anomalies and activities are related. This deeper insight means fewer false positives and better fraud detection compared to supervised models.

2) It finds hidden fraud clusters.

Anomalies can happen for a number of reasons, and often result in a high false positive rate. However, when the relationships between anomalies are discovered, false positives decrease, because organizations gain insight into whether an anomaly is truly suspicious.

For example, fraud rings may synchronize their behaviors instead of creating one-off fraudulent transactions, so their activity may not appear anomalous when each user is examined individually. This is why it’s important to go a few layers deeper and define the relationships between activities. UML goes beyond simple anomaly detection because it looks at clusters of related activities, not just individual activities that don’t fit any specific pattern.
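
One way to picture “defining the relationships between activities” is to treat accounts and the attributes they use as nodes in a graph and then look for connected groups. The sketch below does this with a small union-find structure. The attribute names (device, ip), the min_size cutoff and the sample events are hypothetical, and real systems would likely use fuzzier linking than exact matches.

```python
from collections import defaultdict

def find_clusters(events, min_size=3):
    """Group accounts that are linked, directly or indirectly, by shared attribute values."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups fast
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link each account node to a node for every attribute value it used.
    for account, attrs in events:
        for key, value in attrs.items():
            union(("account", account), (key, value))

    clusters = defaultdict(set)
    for account, _ in events:
        clusters[find(("account", account))].add(account)
    return sorted(sorted(c) for c in clusters.values() if len(c) >= min_size)

# Hypothetical events: a1 and a2 share a device, a2 and a3 share an IP address.
events = [
    ("a1", {"device": "d1", "ip": "10.0.0.1"}),
    ("a2", {"device": "d1", "ip": "10.0.0.2"}),
    ("a3", {"device": "d7", "ip": "10.0.0.2"}),
    ("a4", {"device": "d9", "ip": "10.0.0.9"}),
]
rings = find_clusters(events)
print(rings)
```

Here a1 and a3 have nothing in common directly, yet they land in the same group because both are linked to a2. None of the three would necessarily stand out on its own.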

3) It provides explainability.

Model governance ensures the quality and reliability of the model, but only if users understand how their fraud model works and how it arrives at certain conclusions. UML provides specific reason codes explaining why a transaction or activity was flagged as fraudulent, based on activities, behaviors, timing and other factors.

Additionally, UML models dynamically observe data in real time, making them resilient to outliers or data skews. This capability allows for clear explanations of flagged activities.

4) It provides scalability and rapid results.

UML provides unmatched scalability for organizations, enabling users to manage big data volumes at high queries per second (QPS) and low latency to power real-time threat response. Some UML-based solutions process billions of user accounts with real-time activity streams, and can handle millions of transactions simultaneously.

And because UML doesn’t require extensive training or data labeling, organizations can usually get it up and running in as little as two weeks.

Examples of UML in Action

Now that you understand how UML works, let’s take a look at a few real-world examples where it was used successfully to thwart fraud.

Financial institution uncovers a fraud ring

A financial institution recently uncovered a sophisticated fraud ring through meticulous analysis of credit card applications. They discovered a coordinated effort led by a social media influencer who conducted webinars instructing participants on how to falsify income details and personal data to fraudulently secure credit lines. Webinar attendees purchased emails from the dark web to bypass detection, avoiding third-party address verification signals.

UML was used to link the fraudulent applications to referral URLs from the webinars. Patterns emerged, such as 80% of applications declaring a monthly income of $6,833 and all applicants listing themselves as “self-employed.” What’s more, applications were often submitted in the middle of the night, and frequently used the @outlook.com email domain. These consistent indicators exposed the coordinated effort to deceive the financial institution.
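
The kind of pattern described above can be surfaced by counting how many applications share the same combination of field values. The following is a simplified sketch, not the institution’s or DataVisor’s actual method. The field names, the definition of “middle of the night” as before 6 a.m., the min_count cutoff and every sample record are invented for illustration; only the $6,833 income, “self-employed” status and @outlook.com domain echo the case above.

```python
from collections import Counter

def shared_signatures(applications, min_count=3):
    """Return field-value combinations shared by at least min_count applications."""
    def signature(app):
        domain = app["email"].rsplit("@", 1)[-1].lower()
        overnight = app["hour"] < 6  # assumption: overnight means 00:00 to 05:59
        return (app["monthly_income"], app["employment"], domain, overnight)

    counts = Counter(signature(app) for app in applications)
    return {sig: n for sig, n in counts.items() if n >= min_count}

# Invented sample records; the first three share one signature.
applications = [
    {"monthly_income": 6833, "employment": "self-employed", "email": "x1@outlook.com", "hour": 2},
    {"monthly_income": 6833, "employment": "self-employed", "email": "x2@outlook.com", "hour": 3},
    {"monthly_income": 6833, "employment": "self-employed", "email": "x3@Outlook.com", "hour": 1},
    {"monthly_income": 5200, "employment": "salaried", "email": "y1@example.com", "hour": 14},
]
suspicious = shared_signatures(applications)
print(suspicious)
```

Each field value is ordinary by itself; it is the repeated combination across many applications that makes the group worth a closer look.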

Using DataVisor’s UML capabilities, the institution increased its fraud detection rate from 52% to 97%, and prevented $900,000 worth of losses.

Enhancing fraud detection

A loan originator was relying heavily on supervised machine learning (SML) and rule-based systems for detecting fraud. These systems depended on historical labels and struggled with an unsatisfactory precision-recall balance.

The organization integrated DataVisor’s UML model with its existing SML model using an ensemble technique. Specifically, the new approach augmented labels for various transaction processors, accelerating the training of additional SML models. Leveraging UML, the client was able to capture new fraud patterns in real time, leading to a 20% increase in fraud detection and a 70% decrease in false positives.
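
The exact ensemble technique is not described here, so the sketch below shows just one plausible reading: a weighted average of the two model scores, plus a helper that fills in missing labels when the unsupervised score is high. Both functions, the 0.5 weight and the 0.9 threshold are assumptions for illustration only.

```python
def ensemble_score(sml_prob, uml_score, uml_weight=0.5):
    """Weighted average of a supervised probability and an unsupervised score, both in [0, 1]."""
    return (1 - uml_weight) * sml_prob + uml_weight * uml_score

def augment_labels(labels, uml_scores, threshold=0.9):
    """Fill missing labels (None) with 1 (fraud) where the unsupervised score is high.

    Existing labels are never overwritten; low-scoring unlabeled rows stay None.
    """
    return [
        label if label is not None else (1 if score >= threshold else None)
        for label, score in zip(labels, uml_scores)
    ]

# Hypothetical data: two known labels and two unlabeled rows.
labels = [1, 0, None, None]
uml_scores = [0.95, 0.10, 0.97, 0.40]
augmented = augment_labels(labels, uml_scores)
print(augmented)
print(ensemble_score(0.2, 0.8))
```

The augmented labels could then be used to train additional supervised models, while the combined score lets a strong unsupervised signal lift a case that the supervised model alone would have scored low.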

Discover the Best UML in Fraud Prevention

DataVisor’s comprehensive fraud and AML solution suite combines patented UML technology with native device intelligence and a powerful decision engine to provide protection for the entire customer lifecycle. Schedule a personalized demo of our UML capabilities to see why we’ve been adopted by many Fortune 500 companies across the globe.

about Dan Gringarten
Dan is a Product Marketing Manager at DataVisor, with over eight years of diverse professional experience, including a finance background where he earned his CPA. He is passionate about sports, cats and the art of mixology. Dan holds an MBA from Berkeley Haas.