
Machine Learning Methods for the Detection of Fraudulent Insurance Claims

Zhao, Sisheng (2020) Machine Learning Methods for the Detection of Fraudulent Insurance Claims. Masters thesis, Concordia University.

Text (application/pdf)
Zhao_Master_S2020.pdf - Accepted Version
3MB

Abstract

This thesis focuses on detecting fraudulent claims in automobile insurance, a particular Property and Casualty (P&C) insurance product. By analyzing customer information, we try to build a model that determines whether a customer has filed a fraudulent claim.

Two datasets are used in this thesis. One of them is very imbalanced: 6.1% of policyholders file fraudulent claims (coded as 1) and 93.9% file normal claims (coded as 0). We therefore need to deal with the imbalanced classes by using rebalancing methods such as SMOTE and under-sampling. We then use classical methods (naïve Bayes and logistic regression) and newer data science methods (random forest and gradient boosting) to detect the fraudulent claims. Along the way, we compare these methods to find which one performs best for this application.
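The two rebalancing ideas named above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the thesis's implementation: the function names `random_undersample` and `smote`, the neighbour count `k=3`, and the synthetic two-feature toy data are all assumptions made for the example. Only the 6.1% / 93.9% class split is taken from the abstract. In practice one would more likely use an established library such as imbalanced-learn.

```python
import random

def random_undersample(X, y, seed=0):
    """Randomly drop normal claims (label 0) until both classes are the same size.
    Assumes class 0 is the majority class."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    kept = rng.sample(majority, len(minority))
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]

def smote(X, y, k=3, seed=0):
    """Add synthetic fraud rows (label 1) until both classes are the same size.
    Each synthetic row lies on the segment between a minority row and one of
    its k nearest minority neighbours. Needs at least two minority rows."""
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == 1]
    n_new = sum(1 for label in y if label == 0) - len(minority)
    synthetic = []
    for _ in range(max(n_new, 0)):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return X + synthetic, y + [1] * len(synthetic)

# Toy data (hypothetical): 939 normal and 61 fraudulent claims, i.e. 6.1% fraud.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(939)]
X += [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(61)]
y = [0] * 939 + [1] * 61

X_under, y_under = random_undersample(X, y)
X_over, y_over = smote(X, y)
print(sum(y_under), len(y_under))  # 61 122
print(sum(y_over), len(y_over))    # 939 1878
```

Under-sampling shrinks the data to 122 rows and discards most normal claims, while SMOTE grows it to 1878 rows by interpolating new fraud cases. Either rebalanced set would then be passed to a classifier; rebalancing is normally applied to the training split only, so that the test set keeps the original class proportions.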

In addition, a combination of SMOTE and clustering is applied to both datasets. This is unusual in fraud detection, yet it improves the results considerably for all four classification models. The link analysis method is also mentioned in the conclusion.

These methods are also applied to the other dataset, which is less imbalanced, with 24.7% fraudulent claims and 75.3% normal claims. Two datasets are used to see whether the degree of imbalance affects the performance of oversampling, under-sampling, and the different models. If it does, these methodologies will be more convincing; if not, we can dig deeper to find the reason.

Divisions: Concordia University > Faculty of Arts and Science > Mathematics and Statistics
Item Type: Thesis (Masters)
Authors: Zhao, Sisheng
Institution: Concordia University
Degree Name: M.A.
Program: Mathematics
Date: 20 March 2020
Thesis Supervisor(s): Garrido, Jose
ID Code: 986611
Deposited By: Sisheng Zhao
Deposited On: 26 Jun 2020 13:45
Last Modified: 26 Jun 2020 13:45
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
