Live training with Aric LaBarr starts on April 13 at 1 PM (ET)

Training duration: 4 hours (Hands-on)

Price with 30% discount

Regular Price: $210.00

Subscribe now and start 7-day free trial

Sign up for a Basic or Premium Plan and get an additional 10-35% discount on live training

Instructor Bio:

Aric LaBarr, PhD

Associate Professor of Analytics | Institute for Advanced Analytics at NC State University

A Teaching Associate Professor in the Institute for Advanced Analytics, Dr. Aric LaBarr is passionate about helping people solve challenges using their data. There, at the nation's first Master of Science in Analytics degree program, he helps design the innovative program that prepares a modern workforce to communicate wisely and handle a data-driven future. He teaches courses in predictive modeling, forecasting, simulation, financial analytics, and risk management. Previously, he was Director and Senior Scientist at Elder Research, where he mentored and led a team of data scientists and software engineers. As director of the Raleigh, NC office, he worked closely with clients and partners to solve problems in the fields of banking, consumer product goods, healthcare, and government. Dr. LaBarr holds a B.S. in economics, as well as a B.S., M.S., and Ph.D. in statistics, all from NC State University.

By the end of the course, participants will be able to:

  • Develop good features (recency, frequency, and monetary value as well as categorical transformations) for detecting and preventing fraud

  • Identify anomalies using statistical techniques like z-scores, robust z-scores, Mahalanobis distances, k-nearest neighbors (k-NN), and local outlier factor (LOF)

  • Identify anomalies using machine learning approaches like isolation forests and classifier-adjusted density estimation (CADE)

  • Visualize the anomalies identified by the above approaches
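
To make the univariate techniques named in the objectives more concrete, here is a minimal sketch that compares ordinary z-scores, robust z-scores (median and MAD), and the IQR rule on one small sample. The `amounts` data, the function names, and the cutoffs (3 for z-scores, 3.5 for robust z-scores, 1.5 x IQR for the fences) are illustrative assumptions based on common conventions, not course material.

```python
import statistics

# Hypothetical claim amounts with one extreme value (made-up data).
amounts = [10, 12, 11, 13, 12, 11, 10, 12, 13, 500]

def z_scores(values):
    """Ordinary z-scores: distance from the mean in standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def robust_z_scores(values):
    """Robust z-scores: 0.6745 * (x - median) / MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    return [0.6745 * (v - med) / mad for v in values]

def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

z_flagged = [v for v, z in zip(amounts, z_scores(amounts)) if abs(z) > 3]
robust_flagged = [
    v for v, z in zip(amounts, robust_z_scores(amounts)) if abs(z) > 3.5
]

print("z-score flags:       ", z_flagged)
print("robust z-score flags:", robust_flagged)
print("IQR rule flags:      ", iqr_outliers(amounts))
```

In this sample the extreme value inflates the mean and standard deviation enough that its own z-score stays below 3, while the median-based score and the IQR fences both flag it. That masking effect is one common motivation for the robust variants.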

DIFFICULTY LEVEL: BEGINNER-INTERMEDIATE

Course Abstract

Data is everywhere, and its prevalence drives decisions in almost every industry. However, anomalies in data can lead to incorrect or out-of-date decisions. Whether you are cleaning data during exploratory data analysis, monitoring the health of a computer system to make sure things are working properly, or trying to catch fraudulent claims in life insurance, anomaly detection helps detect outliers before they become too much of a problem for decision makers. This course examines anomaly detection through the example of fraud, but all of these techniques can be applied to other areas as well. We will start with the importance of feature creation and transformation. We will then cover statistical approaches to anomaly detection. Finally, we will cover machine learning approaches so that the learner can approach anomalies from any angle and for any industry need.

Course Outline

1. Introduction to Fraud

  • The Problem of Fraud - How can we analytically define fraud? There are important characteristics of fraud that put the modeling and identification of fraud in better perspective.
  • Detection and Prevention - The two biggest pieces that any holistic fraud solution should have are detection of previous instances of fraud and prevention of new instances. This section also defines the typical fraud identification process in organizations.
  • Analytical Solution - Now that we know what fraud is, as well as the organizational structure for dealing with it, we introduce the analytical approaches that help an organization mature in detecting and preventing fraud.


2. Data Preparation

  • Feature Engineering - The best way to glean information from data is to develop good features to help detect and identify fraud. We discuss and practice strategies for building good features for anomaly detection.
  • RFM Features - Thinking about new features in terms of recency, frequency, and monetary impact helps define important characteristics of fraud. This is where the session gets interactive as participants put on their "fraudster hat" and try to think like a criminal to help develop new features.
  • Categorical Feature Engineering - This section will cover ways to use categorical pieces of information to create even richer features for our anomaly detection.
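
As a rough illustration of the RFM idea described above, the sketch below derives recency, frequency, and monetary features per customer from a small transaction log. The `transactions` records, the `rfm_features` helper, and the `as_of` reference date are hypothetical and exist only for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, transaction_date, amount).
transactions = [
    ("A", date(2023, 1, 5), 120.0),
    ("A", date(2023, 3, 20), 80.0),
    ("B", date(2023, 3, 28), 950.0),
    ("A", date(2023, 3, 29), 40.0),
]

def rfm_features(transactions, as_of):
    """Return {customer_id: {"recency_days", "frequency", "monetary"}}."""
    grouped = defaultdict(list)
    for customer_id, when, amount in transactions:
        grouped[customer_id].append((when, amount))

    features = {}
    for customer_id, rows in grouped.items():
        last_seen = max(when for when, _ in rows)
        features[customer_id] = {
            # Recency: days since the most recent transaction.
            "recency_days": (as_of - last_seen).days,
            # Frequency: number of transactions on record.
            "frequency": len(rows),
            # Monetary: total amount across all transactions.
            "monetary": sum(amount for _, amount in rows),
        }
    return features

rfm = rfm_features(transactions, as_of=date(2023, 3, 31))
for customer_id, feats in sorted(rfm.items()):
    print(customer_id, feats)
```

Each of the three features can be windowed (for example, frequency over the last 30 days only) to capture sudden changes in behavior; which windows are useful depends on the fraud scenario.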


3. Anomaly Models

  • Non-statistical Techniques - This section covers Benford's Law and why it was used (and still is) for basic anomaly detection.
  • Univariate Analysis - When addressing anomalies for one variable at a time, we can use a variety of techniques. This section covers z-scores, robust z-scores, the IQR Rule, and the adjusted IQR rule.
  • Multivariate Analysis - This is where the biggest improvements in anomaly detection have happened over the past decade. We will start with more statistical approaches like Mahalanobis distances (and their robust counterparts) as well as k-Nearest Neighbors (k-NN) and the Local Outlier Factor (LOF). Then we will move into more advanced machine learning approaches to anomaly detection like isolation forests and classifier-adjusted density estimation (CADE).
  • Wrap-up - Here we will summarize everything we have done to build up our anomaly detection, as well as hint at the next course on more advanced fraud detection models.
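
For the multivariate techniques in this section, a minimal sketch of the Mahalanobis distance in two dimensions may help show why looking at variables jointly matters. It uses the classical (non-robust) sample mean and covariance; a robust counterpart would swap in robust estimates of both. The `points` data and the `mahalanobis_2d` helper are made up for this example, and in practice a numerical library would typically handle the matrix algebra.

```python
import math
import statistics

# Hypothetical (x, y) observations. The last point is unremarkable on
# each axis alone but breaks the strong x-y relationship of the others.
points = [
    (1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8),
    (5, 5.1), (6, 5.9), (2, 5.5),
]

def mahalanobis_2d(points):
    """Mahalanobis distance of each point from the sample mean (2-D only)."""
    n = len(points)
    mx = statistics.fmean(p[0] for p in points)
    my = statistics.fmean(p[1] for p in points)

    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out
        # using the closed-form inverse of a 2x2 matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances

distances = mahalanobis_2d(points)
most_unusual = distances.index(max(distances))
print("most unusual point:", points[most_unusual])
```

The last point falls inside the observed range of both variables, so a one-variable-at-a-time check would not single it out, yet it receives the largest Mahalanobis distance because it contradicts the correlation between the two.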

Which knowledge and skills should you have?

  • Introductory R/Python

  • Basic introduction to decision trees (this isn't required, but helpful for understanding)

  • Basic introduction to classification models like logistic regression, decision trees, etc. (this isn't required, but helpful for understanding)

What is included in your ticket?

  • Access to live training and Q&A session with the instructor

  • Access to the on-demand recording

  • Certificate of completion
