Cluster Analysis of Multivariate Data Using Scaled Dirichlet Finite Mixture Model


Oboh, Eromonsele Samuel (2016) Cluster Analysis of Multivariate Data Using Scaled Dirichlet Finite Mixture Model. Masters thesis, Concordia University.

Text (application/pdf)
Oboh_MASc_F2016.pdf - Accepted Version
Available under License Spectrum Terms of Access.


We have designed and implemented a finite mixture model based on the scaled Dirichlet distribution for the cluster analysis of multivariate proportional data. In this thesis, cluster analysis begins with model selection, which discovers the number of natural groupings underlying a dataset. The model parameters for those groupings are then estimated within the expectation-maximization framework.
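As a rough illustration of the expectation-maximization step mentioned above, the sketch below computes posterior membership probabilities (the E-step) and the resulting mixing-weight update for a generic finite mixture. It is not taken from the thesis: the function names `responsibilities` and `update_weights` are hypothetical, and the per-component log-densities are assumed to be supplied by the caller.

```python
import math


def responsibilities(log_weights, log_densities):
    """E-step of a generic finite mixture (illustrative sketch).

    log_weights:   list of K log mixing weights.
    log_densities: list of N rows, each holding K values log p(x_n | theta_k).
    Returns N rows of K posterior membership probabilities.
    """
    result = []
    for row in log_densities:
        # Unnormalized log posterior for each component.
        joint = [lw + ld for lw, ld in zip(log_weights, row)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(joint)
        log_norm = m + math.log(sum(math.exp(j - m) for j in joint))
        result.append([math.exp(j - log_norm) for j in joint])
    return result


def update_weights(resp):
    """M-step update of the mixing weights: average responsibility per component."""
    n = len(resp)
    k = len(resp[0])
    return [sum(row[j] for row in resp) / n for j in range(k)]
```

In a full algorithm these two steps would alternate with an update of the component parameters until the likelihood stops improving; that parameter update depends on the chosen distribution and is not shown here.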

This work aims to address the limited flexibility of the Dirichlet distribution by introducing a distribution with an extra model parameter. This is important because scientists and researchers are constantly searching for models that can fully describe the intrinsic characteristics of the observed data, and flexible models are increasingly used for this purpose.
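To make the role of the extra parameter concrete, the following sketch evaluates the log-density of a scaled Dirichlet distribution. It assumes the parameterization commonly given in the literature, with a shape vector `alpha` and a scale vector `beta`; this record does not state the formula, so the exact form and the function name `scaled_dirichlet_logpdf` are assumptions rather than the thesis's own code.

```python
import math


def scaled_dirichlet_logpdf(x, alpha, beta):
    """Log-density of a scaled Dirichlet distribution (assumed parameterization).

    x:     proportions, all positive and summing to 1.
    alpha: positive shape parameters, one per dimension.
    beta:  positive scale parameters, one per dimension (the extra parameter).

    Assumed density:
        Gamma(a+) / prod Gamma(a_d) * prod beta_d^a_d * x_d^(a_d - 1)
        / (sum beta_d * x_d)^(a+),   where a+ = sum of alpha.
    """
    a_plus = sum(alpha)
    log_norm = math.lgamma(a_plus) - sum(math.lgamma(a) for a in alpha)
    log_scale = sum(a * math.log(b) for a, b in zip(alpha, beta))
    log_kernel = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    log_denom = a_plus * math.log(sum(b * xi for b, xi in zip(beta, x)))
    return log_norm + log_scale + log_kernel - log_denom
```

Under this form, setting every entry of `beta` to the same value cancels the scale terms and recovers the ordinary Dirichlet density, which is one way to see the scaled version as a generalization.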

In addition, we have applied our estimation and model selection algorithm to both synthetic and real datasets. Most importantly, we considered two areas of application: defect prediction in software modules and customer segmentation. Today, detecting defective modules early in complex software development projects is a growing challenge. Machine learning algorithms of this kind are therefore crucial in driving key quality improvements that affect the bottom line and customer satisfaction.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (Masters)
Authors: Oboh, Eromonsele Samuel
Institution: Concordia University
Degree Name: M.A. Sc.
Program: Quality Systems Engineering
Date: December 2016
Thesis Supervisor(s): Bouguila, Nizar
Keywords: Unsupervised Learning, Software Modules Categorization, Finite Mixture Models, Scaled Dirichlet Distribution
ID Code: 982063
Deposited On: 09 Jun 2017 14:49
Last Modified: 18 Jan 2018 17:54
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
