Pilehchiha, Sina (2022) Improving Reproducibility in Smart Contract Research. Masters thesis, Concordia University.
Text (application/pdf)
Pilehchiha_MASc_F2022.pdf (487kB) - Accepted Version
Abstract
The most popular smart contract-based blockchain platform at the moment is Ethereum.
Based on market value, it is the second-largest blockchain platform behind Bitcoin, with a steadily increasing market share. Ethereum smart contracts are used to secure billions of dollars' worth of assets.
Because smart contracts cannot be modified after deployment, their source code must be examined for potential flaws that could result in significant financial losses and damage trust.
A wide range of tools has been developed for this purpose, and an extensive literature on vulnerabilities and detection techniques continues to emerge.
The analysis, testing, and debugging of smart contracts through automated processes have also been the subject of extensive research.
Researchers have worked on the development of tools that can automatically detect and fix vulnerabilities in smart contracts, especially tools that rely on less explored methodologies, such as machine learning-based tools.
We describe our work on Slither-simil, a statistical addition to a static analyzer and a data-driven effort to complement existing security analysis methods for smart contracts.
Slither-simil allows developers and auditors to check the similarity between source code snippets of smart contracts written in Solidity. Through the same similarity-checking mechanism, users can compare smart contracts against a database of vulnerable smart contracts, which facilitates the discovery of security vulnerabilities.
However, such automated analysis tools typically need datasets for their training, testing, and validation phases; collecting such data for smart contracts is time-consuming.
In addition, it is difficult and time-consuming to replicate the findings of most prior empirical studies, or to compare one's findings with those of others who have researched these topics.
The datasets that research studies offer are frequently sparse and come with little to no usage guidance.
Due to the fast-paced nature of the Ethereum ecosystem, the datasets available are often quickly outdated.
These are significant barriers to performing verifiable, reproducible research, as it takes a substantial amount of time to accomplish many subtasks such as locating, extracting, cleaning, and categorizing a reasonable amount of high-quality, heterogeneous smart contract data.
To address this issue, we introduce EtherBase, an extensible, queryable, and user-friendly database of smart contracts and their metrics that improves reproducibility and benchmarking in smart contract research.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Pilehchiha, Sina |
Institution: | Concordia University |
Degree Name: | M.A.Sc. |
Program: | Electrical and Computer Engineering |
Date: | 15 August 2022 |
Thesis Supervisor(s): | Clark, Jeremy and Aghdam, Amir G. |
ID Code: | 990953 |
Deposited By: | Sina Pilehchiha |
Deposited On: | 27 Oct 2022 14:23 |
Last Modified: | 27 Oct 2022 14:23 |