Digital Fraud Wiki

Your source for the latest fraud intelligence, insights, research, and commentary.

Unsupervised Machine Learning

Machine learning is a branch of artificial intelligence that enables algorithms to learn from existing data and then apply that knowledge to new data. Unsupervised machine learning (UML) is a major category of machine learning techniques that works without requiring labeled input data. Instead, it infers a function to describe the hidden structures of “unlabeled” input data points. UML is often used to discover patterns within large amounts of unlabeled data, and is especially effective for discovering new and unknown patterns.

Common UML approaches today broadly include anomaly detection techniques that attempt to identify outliers, and clustering/graph analysis techniques that focus on studying the relationships and connectivity among input data. The DataVisor UML Engine is based on the latter approach, combining clustering techniques with graph analysis algorithms to discover correlated fraudulent or suspicious patterns in unlabeled data. By analyzing the distance and connectivity between data points that represent accounts and their activities across a large time period, the DataVisor UML Engine is able to automatically discover new abuse, fraud, and money laundering activities.
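As a rough illustration of the connectivity idea described above, the sketch below links accounts that share any attribute value and then extracts the connected groups. It is a minimal example under stated assumptions, not the DataVisor UML Engine's algorithm: the function name find_linked_groups, the attribute names (ip, device), and the min_size parameter are all hypothetical.

```python
from collections import defaultdict


def find_linked_groups(accounts, min_size=3):
    """Group accounts that are connected through shared attribute values.

    accounts: dict mapping account id -> dict of attribute name -> value.
    Returns a sorted list of groups (each a sorted list of account ids)
    containing at least min_size accounts. All names are hypothetical.
    """
    parent = {acct: acct for acct in accounts}

    def find(x):
        # Follow parent links to the group representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index accounts by every (attribute, value) pair they exhibit.
    by_value = defaultdict(list)
    for acct, attrs in accounts.items():
        for key, value in attrs.items():
            by_value[(key, value)].append(acct)

    # Accounts that share a value are treated as connected.
    for members in by_value.values():
        for other in members[1:]:
            union(members[0], other)

    # Collect the connected components and keep the larger ones.
    groups = defaultdict(set)
    for acct in accounts:
        groups[find(acct)].add(acct)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


sample_accounts = {
    "a1": {"ip": "10.0.0.1", "device": "d1"},
    "a2": {"ip": "10.0.0.1", "device": "d2"},
    "a3": {"ip": "10.0.0.9", "device": "d2"},
    "a4": {"ip": "10.0.0.7", "device": "d7"},
}
linked = find_linked_groups(sample_accounts)
```

In this sample, a1 and a2 share an IP address and a2 and a3 share a device, so the three form one group even though a1 and a3 have nothing directly in common. No labels are needed at any point; the grouping comes only from the structure of the data.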

UML for Fraud Detection and Prevention

There are many fraud use cases for which UML can be applied, including:

Application Fraud

Using UML, banks and financial institutions can analyze whole networks of applications to detect hidden connections among applications that may each appear legitimate when viewed in isolation.

Money Laundering

Unsupervised machine learning algorithms can look at complex networks of transactions instead of individual ones, and can detect launderers who split deposits into small amounts to avoid triggering Currency Transaction Reports (CTRs).
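To make the contrast between individual transactions and networks of transactions concrete, here is a minimal sketch that aggregates deposits flowing to the same beneficiary over a short window. It is an illustration only: the function name flag_structuring, the tuple layout of the deposit records, the window length, and the 10,000 threshold value are assumptions made for the example, not rules taken from this article.

```python
from collections import defaultdict

REPORTING_THRESHOLD = 10_000  # assumed, illustrative reporting threshold


def flag_structuring(deposits, threshold=REPORTING_THRESHOLD, window_days=3):
    """Flag beneficiaries whose small deposits add up past the threshold.

    deposits: iterable of (beneficiary, source_account, day, amount),
    where day is an integer day index. Hypothetical record layout.
    Returns a dict: beneficiary -> sorted list of contributing sources.
    """
    by_beneficiary = defaultdict(list)
    for beneficiary, source, day, amount in deposits:
        # Keep only deposits that would not be reported on their own.
        if amount <= threshold:
            by_beneficiary[beneficiary].append((day, source, amount))

    flagged = {}
    for beneficiary, rows in by_beneficiary.items():
        rows.sort()  # order by day
        start = 0
        total = 0
        for end, (day, _source, amount) in enumerate(rows):
            total += amount
            # Slide the window so it covers only the last window_days days.
            while rows[start][0] < day - window_days + 1:
                total -= rows[start][2]
                start += 1
            if total > threshold:
                sources = {src for _day, src, _amt in rows[start:end + 1]}
                flagged[beneficiary] = sorted(sources)
                break
    return flagged


sample_deposits = [
    ("b1", "s1", 1, 4000),
    ("b1", "s2", 2, 4000),
    ("b1", "s3", 3, 4000),
    ("b2", "s4", 1, 4000),
    ("b2", "s4", 9, 4000),
]
suspicious = flag_structuring(sample_deposits)
```

No single deposit in the sample crosses the threshold, but the three deposits to b1 from different sources within three days do so together, which is the kind of pattern that only appears when related transactions are viewed as a group. The deposits to b2 are far enough apart that they never fall in the same window.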

Fake Content

A UML engine can correlate behavior across accounts, instead of merely viewing accounts in isolation. This enables organizations to stop fraudulent attacks that originate from new account registrations, and incapacitate incubating accounts before they can cause damage.
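One simple way to picture correlating behavior across accounts is to measure how much the recorded actions of two accounts overlap. The sketch below uses Jaccard similarity for this; it is an assumed, simplified stand-in rather than a description of any particular engine, and the names similar_account_pairs, min_similarity, and the event strings are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_account_pairs(behaviors, min_similarity=0.8):
    """Return (account_a, account_b, score) for near-identical accounts.

    behaviors: dict mapping account id -> set of observed event names.
    Compares every pair, so this is only practical for small inputs.
    """
    pairs = []
    for acct_a, acct_b in combinations(sorted(behaviors), 2):
        score = jaccard(behaviors[acct_a], behaviors[acct_b])
        if score >= min_similarity:
            pairs.append((acct_a, acct_b, score))
    return pairs


sample_behaviors = {
    "new1": {"signup", "set_avatar", "follow:x", "post:promo"},
    "new2": {"signup", "set_avatar", "follow:x", "post:promo"},
    "new3": {"signup", "browse", "purchase"},
}
matches = similar_account_pairs(sample_behaviors)
```

Each account in the sample looks unremarkable on its own, but new1 and new2 perform exactly the same set of actions, which stands out once accounts are compared with one another. Comparing every pair scales quadratically, so a sketch like this would need a different strategy at the account volumes discussed in this article.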

Promotion Abuse

UML solutions enable the capture of all members of a given fraud ring by identifying hidden linkages between fake account registrations and discovering unknown attacks without labels or training data.

Spam and Scam

Unsupervised machine learning models can proactively detect evolving attack patterns, and reduce operational costs by increasing accuracy, lowering false positives, and enabling real-time bulk decisioning.

Proactivity, UML, and DataVisor

By its nature, UML is proactive. Because UML does not require existing labels or training data, it can detect new and unknown fraud types, enabling organizations to take early action and prevent damage before it occurs. This makes UML an ideal approach for dealing with the most sophisticated emerging fraud techniques.

At DataVisor, our approach combines cutting-edge AI and machine learning technologies to correlate fraudulent and suspicious patterns across billions of accounts in real time. Patented and proprietary UML algorithms work without labeled input data to automatically detect new and previously unidentified fraud and abuse patterns.

The DataVisor Unsupervised Machine Learning Engine processes all events and account activities simultaneously to analyze patterns across hundreds of millions of accounts. This enables detection of suspicious connections between malicious accounts, even when those accounts are incubating, mimicking legitimate user activities, or changing attack techniques. This also allows the UML Engine to detect all the members of an attack ring at once, ensuring the attack is fully stopped.