Tayanov, Vitaliy (2022) Nonlinear Classifier Stacking on Riemannian and Grassmann Manifolds with Application to Video Analysis. PhD thesis, Concordia University.
Preview |
Text (application/pdf)
6MBTayanov_PhD_F2022.pdf - Accepted Version |
Abstract
This research is devoted to the problem of overfitting in Machine Learning and Pattern Recognition. It should lead to improving the generalisation ability and accuracy boosting in the case of small and/or difficult classification datasets. The aforementioned two problems have been solved in two different ways: by splitting the entire datasets into functional groups depending on the classification difficulty using consensus of classifiers, and by embedding the data obtained during classifier stacking into nonlinear spaces i.e. Riemannian and Grassmann manifolds. These two techniques are the main contributions of the thesis. The insight behind the first approach is that we are not going to use the entire training subset to train our classifiers but some part of it in order to approximate the true geometry and properties of classes. In terms of Data Science, this process can also be understood as Data Cleaning. According to the first approach, instances with high positive (easy) and negative (misclassified) margins are not considered for training as those that do not improve (or even worsen) the evaluation of the true geometry of classes. The main goal of using Riemannian geometry consists of embedding our classes in nonlinear spaces where the geometry of classes in terms of easier classification has to be obtained. Before embedding our classes on Riemannian and Grassmann manifolds we do several Data Transformations using different variants of Classifier Stacking. Riemannian manifolds of Symmetric Positive Definite matrices are created using the classifier interactions while Grassmann manifolds are built based on Decision Profiles. The purpose of the two aforementioned approaches is Data Complexity reduction. There is a consensus among researchers, that Data Complexity reduction should lead to an overfitting decrease as well as to classification accuracy enhancement.
We carried out our experiments on various datasets from the UCI Machine Learning repository. We also tested our approaches on two datasets related to the Video Analysis problem. The first dataset is a Phase Gesture Segmentation dataset taken from the UCI Machine Learning repository. The second one is the Deep Fake detection Challenge dataset. In order to apply our approach to solve the second problem, some image processing has been carried out. Numerous experiments on datasets of general character and those related to Video Analysis problems show the consistency and efficiency of the proposed techniques. We also compared our techniques with the state-of-the-art techniques. The obtained results show the superiority of our approaches for most of the cases. The significance of carried out research and obtained results manifests in better representation and evaluation of the geometry of classes which may overlap only in feature space due to some improper measurements, errors, noises, or by selecting features that do not represent well our classes. Carried out research is a pioneering in terms of Data Cleaning and Classifier Ensemble Learning in Riemannian geometry.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Tayanov, Vitaliy |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Computer Science |
Date: | 26 August 2022 |
Thesis Supervisor(s): | Suen, Ching Y. and Krzyzak, Adam |
ID Code: | 990999 |
Deposited By: | Vitaliy Tayanov |
Deposited On: | 27 Oct 2022 14:27 |
Last Modified: | 27 Oct 2022 14:27 |
Repository Staff Only: item control page