Fu, Shuai (2018) Bayesian Learning of Asymmetric Gaussian-Based Statistical Models using Markov Chain Monte Carlo Techniques. Masters thesis, Concordia University.
Text (application/pdf), 1 MB: Shuai_MASc_F2018.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
A novel unsupervised Bayesian learning framework based on the asymmetric Gaussian mixture (AGM) statistical model is proposed, since the AGM has been shown to be more effective than the classic Gaussian mixture. The Bayesian learning framework is developed by adopting a sampling-based Markov chain Monte Carlo (MCMC) methodology. More precisely, the core learning algorithm is a hybrid Metropolis-Hastings within Gibbs sampler integrated into a reversible jump MCMC (RJMCMC) framework, a self-adapting sampling-based MCMC implementation that allows moves between models of different dimensions during parameter learning and therefore converges automatically to the optimal number of data groups. Furthermore, a feature selection technique is included to handle irrelevant and redundant information in the datasets. The performance of the AGM is compared against other popular solutions, and both synthetic and real datasets drawn from challenging applications such as intrusion detection, spam filtering and image categorization are evaluated to show the merits of the proposed approach.
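For illustration, the minimal Python sketch below shows, under stated assumptions, what the building blocks named in the abstract might look like: the log-density of a diagonal asymmetric Gaussian with per-dimension left/right standard deviations, the log-likelihood of an AGM, and one random-walk Metropolis step for a component mean of the kind that would sit inside a Gibbs sweep. The parameterization follows the standard asymmetric Gaussian used in the related literature; the function names (`agd_logpdf`, `agm_loglik`, `mh_update_mu`) are illustrative rather than taken from the thesis, and the Metropolis step assumes a flat prior on the mean for brevity.

```python
import numpy as np

def agd_logpdf(x, mu, sigma_l, sigma_r):
    """Log-density of a D-dimensional asymmetric Gaussian with
    per-dimension left/right standard deviations (diagonal model)."""
    x, mu = np.atleast_1d(x), np.atleast_1d(mu)
    sigma = np.where(x < mu, sigma_l, sigma_r)            # side-dependent scale
    log_norm = 0.5 * np.log(2.0 / np.pi) - np.log(sigma_l + sigma_r)
    return np.sum(log_norm - 0.5 * ((x - mu) / sigma) ** 2)

def agm_loglik(X, weights, mus, sigmas_l, sigmas_r):
    """Log-likelihood of data X under a K-component asymmetric Gaussian mixture."""
    ll = 0.0
    for x in X:
        comp = [np.log(w) + agd_logpdf(x, m, sl, sr)
                for w, m, sl, sr in zip(weights, mus, sigmas_l, sigmas_r)]
        ll += np.logaddexp.reduce(comp)                   # log-sum-exp over components
    return ll

def mh_update_mu(X_j, mu, sigma_l, sigma_r, step=0.1, rng=np.random):
    """One random-walk Metropolis step for a component mean given the data X_j
    currently assigned to it, as would be embedded in a Gibbs sweep.
    A flat prior on the mean is assumed, so only the likelihood ratio appears."""
    proposal = mu + step * rng.standard_normal(mu.shape)
    log_acc = (sum(agd_logpdf(x, proposal, sigma_l, sigma_r) for x in X_j)
               - sum(agd_logpdf(x, mu, sigma_l, sigma_r) for x in X_j))
    return proposal if np.log(rng.uniform()) < log_acc else mu
```

In the full framework described in the thesis, analogous updates for the left/right standard deviations, mixing weights and feature-saliency variables would be interleaved with RJMCMC birth/death and split/combine moves that change the number of components; those moves are omitted from this sketch.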
Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (Masters)
Authors: Fu, Shuai
Institution: Concordia University
Degree Name: M.A.Sc.
Program: Information Systems Security
Date: July 2018
Thesis Supervisor(s): Bouguila, Nizar
ID Code: 984499
Deposited By: Shuai Fu
Deposited On: 16 Nov 2018 16:22
Last Modified: 17 Aug 2022 16:38