Munyamahoro, Fidence (2024) Outliers detection based on quantiles and depth functions. Masters thesis, Concordia University.
Text (application/pdf), 2MB: Fidence_MSc_S2024.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Outliers detection based on quantiles and depth functions
Outlier detection plays a crucial role in data analysis and is employed in domains such as finance, healthcare, and anomaly detection. This thesis presents a novel approach for detecting outliers using quantiles and depth functions. Quantiles provide a statistical measure of the distribution of the data, while depth functions assess the centrality of each observation relative to the entire dataset. Combining these two techniques, we propose a robust and effective method for identifying outliers in multidimensional datasets. The approach is particularly useful where traditional outlier detection methods are inadequate or fail to capture the complex patterns present in the data. By considering multiple quantiles, we can identify outliers that deviate from different aspects of the data distribution, and the depth functions further refine the detection process.

To evaluate the approach, we apply it to a real-world air quality dataset: daily New York air quality measurements recorded over five months, from May to September 1973, comprising 153 observations of 6 variables. The method identifies outliers that represent unusual air quality patterns, potentially indicating anomalies or errors in the data collection process. Our experimental findings support the proposed approach, which effectively detects outliers in the air quality dataset. Compared to traditional outlier detection techniques, our method achieves higher accuracy and provides more detailed insights into the nature of the outliers. Furthermore, we show that the identified outliers can be valuable in understanding the factors contributing to air pollution and in improving air quality monitoring systems. These findings contribute to the advancement of outlier detection methodologies and provide valuable insights for practitioners in identifying and handling outliers in real-world applications.
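
As a rough illustration of the two ingredients named in the abstract, the Python sketch below flags observations using per-variable quantile fences together with a depth function. It is not the method developed in the thesis. The Tukey-style fences, the choice of Mahalanobis depth, the rule that combines the two flags, the thresholds, and all function names are assumptions made for this example, and the data are synthetic rather than the air quality measurements.

```python
import numpy as np


def quantile_flags(X, lower=0.25, upper=0.75, k=1.5):
    """Flag rows with at least one coordinate outside quantile-based fences.

    The fences are Tukey-style (quantile +/- k * inter-quantile range),
    an assumption chosen for this sketch.
    """
    q_lo = np.quantile(X, lower, axis=0)
    q_hi = np.quantile(X, upper, axis=0)
    spread = q_hi - q_lo
    outside = (X < q_lo - k * spread) | (X > q_hi + k * spread)
    return outside.any(axis=1)


def mahalanobis_depth(X):
    """Mahalanobis depth: 1 / (1 + squared Mahalanobis distance to the mean).

    Values lie in (0, 1]; central points have depth near 1.
    """
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return 1.0 / (1.0 + d2)


def flag_outliers(X, depth_quantile=0.05):
    """Flag a row if it breaches a quantile fence OR has very low depth.

    Taking the union of the two criteria is an illustrative choice only.
    """
    depth = mahalanobis_depth(X)
    low_depth = depth <= np.quantile(depth, depth_quantile)
    return quantile_flags(X) | low_depth


if __name__ == "__main__":
    # Synthetic data: 100 standard-normal points plus one planted outlier.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(size=(100, 2)), [[10.0, 10.0]]])
    print("flagged rows:", np.flatnonzero(flag_outliers(X)))
```

The depth-based criterion looks at all variables jointly, so it can pick up observations that are unremarkable in each coordinate but unusual in combination, which per-variable quantile fences alone would miss.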
Divisions: | Concordia University > Faculty of Arts and Science > Mathematics and Statistics |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Munyamahoro, Fidence |
Institution: | Concordia University |
Degree Name: | M. Sc. |
Program: | Mathematics |
Date: | 16 January 2024 |
Thesis Supervisor(s): | Mailhot, Mélina |
ID Code: | 993928 |
Deposited By: | Fidence MUNYAMAHORO |
Deposited On: | 05 Jun 2024 16:27 |
Last Modified: | 05 Jun 2024 16:27 |