Salari, Mohammad Ali (2021) The effect of Computational Environments on Big Data Processing Pipelines in Neuroimaging. PhD thesis, Concordia University.
Preview |
Text (application/pdf)
3MBSalari_PhD_S2022.pdf - Accepted Version Available under License Spectrum Terms of Access. |
Abstract
Variations in computational infrastructures, including operating systems, software versions, and hardware architectures, introduce variability in neuroimaging analyses that could affect the reproducibility of scientific conclusions. These variations are due to the creation, propagation, and amplification of numerical instabilities in analysis pipelines. It is critical to identify numerical instabilities to make experiments computationally reproducible. In this thesis, we characterize the numerical stability of commonly-used complex pipelines in the context of neuroimaging analysis across operating systems and provide accessible tools for developers and researchers to evaluate their pipelines and findings. First, we present the Spot tool that identifies the processes from which differences originate and the path along which they propagate in a pipeline. In the next step, to study the numerical instabilities more comprehensively, we introduce controlled numerical perturbations to the floating-point computations using the Monte-Carlo arithmetic method. For this purpose, we propose an interposition technique to model the effect of operating system updates on analysis pipelines using the Monte-Carlo arithmetic. Finally, leveraging the interposition technique, we compare numerical variability with tool variability in an fMRI analysis. We show that the results of analyses are sensitive to computational environment changes originating from numerical errors. All the methods implemented in this thesis are publicly available and can be used to facilitate further investigations toward stabilizing pipelines.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Salari, Mohammad Ali |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Computer Science |
Date: | 31 December 2021 |
Thesis Supervisor(s): | Glatard, Tristan |
Keywords: | Computational reproducibility, Neuroimaging pipelines, Monte-Carlo arithmetic, Computing environment, Numerical quantification |
ID Code: | 990449 |
Deposited By: | Ali Salari |
Deposited On: | 16 Jun 2022 14:43 |
Last Modified: | 16 Jun 2022 14:43 |
Repository Staff Only: item control page