Man Vs. Machine

Machine learning has become an invaluable tool in the fight against fraud. It combines computational statistics, artificial intelligence, signal processing, optimisation, and other methods to identify patterns.

Machine learning has been a significant breakthrough in helping companies move from reactive to predictive fraud management: it highlights suspicious attributes or relationships that may be invisible to the naked eye but indicate a larger pattern of fraud.

The great value of machine learning is that computers, using a variety of pattern recognition algorithms, can analyse a volume of data that humans cannot. This lets you add far more data to your analysis, but selecting the right data and the right approach to model the problem is critical.

A solid solution also requires specialised expertise to apply rigorous methodology in data analysis and to develop fraud models of consistent quality. This expertise includes carefully analysing the data, correctly treating irregular values and data elements, dealing with bias, and validating the underlying assumptions of the machine learning techniques, all whilst avoiding pitfalls such as focusing on trivial patterns in historical data and failing to generalise the results to future events.

Traditionally, the majority of machine learning systems have relied exclusively on supervised learning, which incorporates prior knowledge of fraud tactics to guide pattern identification, because it’s easy to teach the machine once there is a clear target for it to learn.

This dependence on prior knowledge leads to some limitations of supervised machine learning-based fraud detection systems, including:

  • Collecting and analysing enough data (historical fraud tags and transactions) to accurately identify future fraud behaviour, and then deploying the resulting models, may take several weeks or months. Machine learning systems can therefore be slow to react and prevent fraud, and in that time fraudsters can do a lot of damage.
  • Given rapid changes in behaviour by fraudsters to evade detection, machine learning can fail to generate an effective pattern or consistent profile, thus dramatically reducing its efficacy.
  • Poor use of machine learning can generate a lot of false positives that can lead to the types of disproportionate customer challenges and friction that we have highlighted throughout this report.
  • Dirty fraud tags, caused by mislabelling by fraud analysts or by unreliable reporting, can bias the fraud model toward detecting certain behaviours that do not necessarily represent fraud.

A way to increase the accuracy of supervised machine learning-based fraud detection is to pair it with unsupervised machine learning techniques that look for irregular or uncharacteristic items, known as anomaly detection.

Anomaly detection approaches can complement supervised learning methods

Unsupervised machine learning techniques used for anomaly detection complement supervised learning by looking for aberrations in the patterns of a transaction flow. These deviations may indicate fraud, or may simply reflect a change in global behaviour (what “normal” looks like). For this reason, anomaly detection models generate more false positives than a good supervised learning-based model does, and are inappropriate to deploy as the only machine learning technique.
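As a rough illustration of the idea, the sketch below scores each transaction amount by how far it sits from the typical value in a customer's history, using a robust z-score built from the median and the median absolute deviation. This is only one simple anomaly detection technique chosen for the example, not a description of any production model; the function names, the sample history, and the threshold of 3.5 are all assumptions.

```python
from statistics import median


def anomaly_scores(amounts):
    """Return a robust z-score for each amount, based on median and MAD."""
    centre = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(a - centre) for a in amounts)
    if mad == 0:
        # Simplification for this sketch: with no measurable spread,
        # treat every value as normal rather than dividing by zero.
        return [0.0 for _ in amounts]
    # 1.4826 rescales MAD so it is comparable with a standard deviation
    # when the data are roughly normally distributed.
    return [abs(a - centre) / (1.4826 * mad) for a in amounts]


def flag_anomalies(amounts, threshold=3.5):
    """Return the positions of amounts whose score exceeds the threshold."""
    scores = anomaly_scores(amounts)
    return [i for i, score in enumerate(scores) if score > threshold]


# Hypothetical transaction history for one customer.
history = [42.0, 38.5, 40.0, 45.0, 39.0, 41.5, 950.0, 43.0]
print(flag_anomalies(history))
```

Note that the flagged transaction is only unusual, not necessarily fraudulent: a legitimate one-off purchase would score just as highly, which is why a detector of this kind tends to produce more false positives when used alone.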

However, anomaly detection models can be a strong complement to supervised learning approaches because the two approach the same problem from entirely different angles and exploit orthogonal information. When both techniques are combined, the resultant analytic engine can recognise previous patterns of confirmed fraud, whilst also raising an alert if a pattern of activity changes. Making both techniques work together requires robust expertise, but the combined approach can outperform either technique alone, increasing fraud detection rates and reducing false positives.
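One way to picture the combination is as a decision rule that takes both signals as inputs. The sketch below is a hypothetical example only: it assumes a supervised model that outputs a fraud probability between 0 and 1 and an anomaly detector that outputs a deviation score, and the thresholds and action names are invented for illustration. Real systems would tune these against their own data.

```python
def triage(supervised_score, anomaly_score,
           fraud_threshold=0.8, anomaly_threshold=3.5):
    """Combine two independent signals into a single action.

    supervised_score: probability-like value in [0, 1] from a model
        trained on confirmed fraud (assumed input).
    anomaly_score: deviation from normal behaviour, where larger means
        more unusual (assumed input).
    """
    known_pattern = supervised_score >= fraud_threshold
    unusual = anomaly_score >= anomaly_threshold

    if known_pattern and unusual:
        return "block"    # both signals agree: strongest evidence
    if known_pattern:
        return "review"   # matches a previously confirmed fraud pattern
    if unusual:
        return "alert"    # behaviour has changed, cause not yet known
    return "allow"


# Hypothetical scores for four transactions.
for scores in [(0.9, 5.0), (0.9, 1.0), (0.1, 5.0), (0.1, 0.5)]:
    print(scores, triage(*scores))
```

Keeping the two scores separate until the final rule preserves the distinction drawn above: the supervised score recognises known fraud, while the anomaly score raises an alert when activity changes.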

Experian continues to be at the forefront of machine learning advances, developing sophisticated models that are less reliant on fraud tagging and react quickly to attacker behaviours. We view the combination of approaches as pioneering machine learning for fraud detection.