Transaction Graph Analysis for Bitcoin Address Classification: Traditional Supervised Machine Learning and Deep Learning Methods

Saeidimanesh, Seyedarash (2024) Transaction Graph Analysis for Bitcoin Address Classification: Traditional Supervised Machine Learning and Deep Learning Methods. Masters thesis, Concordia University.

Text (application/pdf)
Saeidimanesh_MASc_F2024.pdf - Accepted Version
Available under License Spectrum Terms of Access.
2MB

Abstract

In this thesis, we consider the problem of Bitcoin address classification and clustering, which is common in the domains of law enforcement and regulatory compliance. We build a machine learning-based classification framework that can attribute a Bitcoin address to one of the predefined classes or to a specific company. We consider five distinct classes for coarse-grained classification (cryptocurrency exchanges, online marketplaces, mining pools, fundraising/charity platforms, and gambling) and 180 companies for fine-grained classification. The classes and companies were selected to represent a broad spectrum of entities and activities within the Bitcoin ecosystem.
This thesis makes three main contributions. First, because publicly available datasets suitable for testing machine-learning classification algorithms are lacking, we create our own labeled dataset of 3M Bitcoin addresses (from 2016-2022), with each address assigned a ready-to-use vector of carefully crafted features. Second, using this dataset, we conduct a comparative analysis of different machine-learning techniques and features for classification. Finally, we develop two types of classifiers: one based on the Boosted tree algorithm and one based on a neural network. Both are able to attribute a Bitcoin address to one of the predefined classes/companies. Our binary classification model achieves an F1 score of 76% using the Boosted tree algorithm, while our deep learning model achieves a 90% F1 score for multi-class classification with an accuracy of 92%, which is 28% higher than related work. We achieve 67% accuracy for linking Bitcoin addresses to one of the 180 companies with our deep-learning model.
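
The abstract reports both F1 score and accuracy, which can diverge when classes are imbalanced. The following is a minimal sketch of how the two metrics are computed for a multi-class problem; it is not the thesis's evaluation code. The label strings, the toy predictions, and the choice of macro averaging are assumptions made for illustration (the abstract does not state which averaging was used).

```python
# Illustrative only: label names and toy data are hypothetical,
# and macro averaging is an assumption, not taken from the thesis.

def per_class_f1(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of addresses whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example with three of the five coarse-grained classes.
labels = ["exchange", "gambling", "mining_pool"]
y_true = ["exchange", "exchange", "gambling", "mining_pool"]
y_pred = ["exchange", "gambling", "gambling", "mining_pool"]

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred, labels))
```

Macro averaging weights every class equally, so a rare class such as fundraising/charity platforms influences the score as much as a large class such as exchanges; a weighted average would instead follow the class frequencies.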

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (Masters)
Authors: Saeidimanesh, Seyedarash
Institution: Concordia University
Degree Name: M.A. Sc.
Program: Information Systems Security
Date: 1 March 2024
Thesis Supervisor(s): Pustogarov, Ivan
ID Code: 993816
Deposited By: Seyedarash Saeidimanesh
Deposited On: 24 Oct 2024 18:04
Last Modified: 24 Oct 2024 18:04
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
