On the smoothing of multinomial estimates using Liouville mixture models and applications

Title:

On the smoothing of multinomial estimates using Liouville mixture models and applications

Bouguila, Nizar (2013) On the smoothing of multinomial estimates using Liouville mixture models and applications. Pattern Analysis and Applications, 16 (3). pp. 349-363. ISSN 1433-7541

Preview

Text (application/pdf)
bouguila2013a.pdf - Accepted Version

4MB

Official URL: http://dx.doi.org/10.1007/s10044-011-0236-8

Abstract

There has been major progress in recent years in statistical model-based pattern recognition, data mining and knowledge discovery. In particular, generative models are widely used and are very reliable in terms of overall performance. Success of these models hinges on their ability to construct a representation which captures the underlying statistical distribution of data. In this article, we focus on count data modeling. Indeed, this kind of data is naturally generated in many contexts and in different application domains. Usually, models based on the multinomial assumption are used in this case that may have several shortcomings, especially in the case of high-dimensional sparse data. We propose then a principled approach to smooth multinomials using a mixture of Beta-Liouville distributions which is learned to reflect and model prior beliefs about multinomial parameters, via both theoretical interpretations and experimental validations, we argue that the proposed smoothing model is general and flexible enough to allow accurate representation of count data.

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type:	Article
Refereed:	Yes
Authors:	Bouguila, Nizar
Journal or Publication:	Pattern Analysis and Applications
Date:	August 2013
Digital Object Identifier (DOI):	10.1007/s10044-011-0236-8
Keywords:	Liouville family of distributions Mixture models Smoothing Count data Generative discriminative learning SVM Texture classification Object recognition
ID Code:	977853
Deposited By:	Danielle Dennie
Deposited On:	27 Sep 2013 14:13
Last Modified:	18 Jan 2018 17:45

References:

1. Brodley CE, Smyth P (1997) Applying classification algorithms in practice. Stat Comput 7(1):45–56

2.Bouguila N, Ziou D, Vaillancourt J (2003) Novel Mixture based on the Dirichlet distribution: application to data and image classification. In: Perner P, Rosenfeld A (eds) Machine learning and data mining in pattern recognition (MLDM). LNAI, vol 2734. Springer, Berlin, pp 172–181

3.Vijaya PA, Murty MN, Subramanian DK (2006) Efficient median based clustering and classification techniques for protein sequences. Pattern Anal Appl 9(2-3):243–255

4.Dagan I, Lee L, Perrira FCN (1999) Similarity-based models of word cooccurrence probabilities. Mach Learn 34(1–3):43–69

5.Scott S, Matwin S (1999) Feature engineering for text classification. In: Proceedings of the international conference on machine learning (ICML), pp 379–388

6.Csurka G, Dance CR, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In: Workshop on statistical learning in computer vision, 8th European conference on computer vision (ECCV)

7.Leung T, Malik J (2001) Representing and recognizing the visual appearance of materials using three-dimensional textons. Int J Comput Vis 43(1):29–44

8.Bouguila N, ElGuebaly W (2009) Discrete data clustering using finite mixture models. Pattern Recognit 42(1):33–42

9.Cheng BYM, Carbonell JG, Klein-Seetharaman J (2005) Protein classification based on text document classification techniques. Prot Struct Funct Bioinform 58:955–970

10.Witten IH, Bell TC (1991) The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression. IEEE Trans Inform Theory 37(4):1085–1094

11.Fienberg SE, Holland PW (1973) Simultaneous estimation of multinomial cell probabilities. J Am Stat Assoc 68(343):683–691

12.Hall P, Titterington DM (1987) On smoothing sparse multinomial data. Aust J Stat 29(1):19–37» CrossRef
13.Simonoff JS (1995) Smoothing categorical data. J Stat Plann Infer 47:41–69

14.Bouguila N, Ziou D (2007) Unsupervised learning of a finite discrete mixture: applications to texture modeling and image databases summarization. J Vis Commun Image Represent 18(4):295–309

15.Bouguila N, Ziou D (2004) A powerful finite mixture model based on the generalized Dirichlet distribution: unsupervised learning and applications. In Proceedings of the 17th international conference on pattern recognition (ICPR), pp 280–283

16.Bouguila N, Ziou D (2004) Dirichlet-based probability model applied to human skin detection. In: IEEE international conference on acoustics, speech, and signal processing (ICASSP), pp 521–524

17.Bouguila N, Ziou D, Hammoud RI (2009) On Bayesian analysis of a finite generalized Dirichlet mixture via a metropolis-within-Gibbs sampling. Pattern Anal Appl 12(2):151–166

18.McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York

19.Hoare Z (2008) Landscapes of Naive Bayes classifiers. Pattern Anal Appl 11(1):59–72

20.Andrés-Ferrer J, Juan A (2010) Constrained domain maximum likelihood estimation for Naive Bayes text classification. Pattern Anal Appl 13(2):189–196

21.Goodman LA (1970) The multivariate analysis of qualitative data: interactions among multiple classifications. J Am Stat Assoc 65(329):226–256

22.Goodman LA (1971) The analysis of multidimensional contingency tables: stepwise procedures and direct estimation methods for building models for multiple classifications. Technometrics 13(1):33–61

23.Goodman LA (1964) Interactions in multidimensional contingency tables. Ann Math Stat 35(2):632–646

24.Gart JJ, Zweifel JR (1967) On the bias of various estimators of the logit and its variance with application to quantal bioassay. Biometrika 54(1/2):181–187

25.Grizzle JE, Starmer CF, Koch GG (1969) Analysis of categorical data by linear models. Biometrics 25(3):489–504

26.Bouguila N, Ziou D (2004) Improving content based image retrieval systems using finite multinomial Dirichlet mixture. In: Proceedings of the IEEE workshop on machine learning for signal processing (MLSP), pp 23–32

27.Bouguila N (2007) Spatial color image databases summarization. In: IEEE International conference on acoustics, speech, and signal processing (ICASSP), vol 1, Honolulu, HI, USA, pp 953–956

28.Good IJ, Bayesian A (1967) Significance test for multinomial distribution (with Discussion). J R Stat Soc B 29(3):399–431

29.Fienberg SE (1972) On the choice of flattening constants for estimating multinomial probabilities. J Multivar Anal 2(1):127–134

30.Lidstone GJ (1920) Note on the general case of the Bayes–Laplace formula for inductive or a posteriori probabilities. Trans Fac Actuar 8:182–192

31.Jeffreys J (1961) Theory of probability. 3rd edn. Clarendon Press, Oxford

32.Perks W (1947) Some observations on inverse probability including a new indifference rule (with discussion). J Inst Actuar 73:285–334

33.Bouguila N, Ziou D, Vaillancourt J (2004) Unsupervised learning of a finite mixture model based on the Dirichlet distribution and its application. IEEE Trans Image Process 13(11):1533–1543

34.Lochner RH (1975) A generalized Dirichlet distribution in Bayesian life testing. J R Stat Soc B 37:103–113

35.Bouguila N, ElGuebaly W (2008) On discrete data clustering. In: Proceedings of the Pacific–Asia conference on knowledge discovery and data mining (PAKDD). LNCS, vol 5012. Springer, Osaka, pp 503–510

36.Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman and Hall, New York

37.Bouguila N, Ziou D (2005) Using unsupervised learning of a finite Dirichlet mixture model to improve pattern recognition applications. Pattern Recognit Lett 26(12):1916–1925

38.Bouguila N, Ziou D, Monga E (2006) Practical Bayesian estimation of a finite beta mixture through Gibbs sampling and its applications. Stat Comput 16(2):215–225

39.Robbins HE (1956) An empirical Bayes approach to statistics. In: Neyman J (ed) Proceedings of the third Berkeley symposium on mathematical statistics and probability, vol 1, pp 157–163

40.Robbins HE (1964) The empirical Bayes approach to statistics. Ann Math Stat 35(1):1–20

41.Deely JJ, Lindley DV (1981) Bayes empirical Bayes. J Am Stat Assoc 76(376):833–841

42.Carlin BP, Louis TA (2000) Bayes and empirical Bayes methods for data analysis, 2nd edn. Chapman & Hall/CRC, Boca Raton

43.McLachlan JG, Krishnan T (1997) The EM Algorithm and Extensions. Wiley

44.Hu T, Sung SY (2005) Clustering spatial data with a hybrid EM approach. Pattern Anal Appl 8(1–2):139–148

45.Bouguila N, Ziou D (2007) High-dimensional unsupervised selection and estimation of a finite generalized Dirichlet mixture model based on minimum message length. IEEE Trans Pattern Anal Mach Intell 29(10):1716–1731

46.Rissanen J (1978) Modeling by shortest data description. Automatica 14:465–471

47.Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42(1–2):143–175

48.Lebanon G, Lafferty J (2004) Hyperplane margin classifiers on the multinomial manifold. In: Proceedings of the international conference on machine learning (ICML), pp 66–73

49.Vapnik VN (1998) Statistical learning theory. Wiley, New York

50.Zhang D, Chen X, Lee WS (2005) Text classification with kernels on the multinomial manifold. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR), pp 266–273

51.Jebara T, Kondor R, Howard A (2004) Probability product kernels. J Mach Learn Res 5:819–844

52.Moreno PJ, Ho PP, Vasconcelos N (2003) A Kullback–Leibler divergence based kernel for SVM classification in multimedia applications. In: Proceedimgs of advances in neural information processing systems (NIPS). MIT Press, Cambridge

53.Topsoe F (2000) Some inequalities for information divergence and related measures of discrimination. IEEE Trans Inform Theory 46(4):1602–1609

54.Chapelle O, Haffner P, Vapnik VN (1999) Support vector machines for histogram-based image classification. IEEE Trans Neural Netw 10(5):1055–1064

55.Varma M, Zisserman A (2002) Classifying images of materials: achieving viewpoint and illumination independence. In: Proceedings of the European conference on computer vision (ECCV), pp 255–271

56.Szczypiński PM, Strzelecki M, Materka A, Klepaczko A (2009) MaZda: a software package for image texture analysis. Comput Methods Prog Biomed 94(1):66–76

57.Zhu SC, Wu Y, Mumford D (1998) Filters, random fields and maximum entropy (FRAME): towards a unified theory for texture modeling. Int J Comput Vis 27(2):107–126

58.Varma M, Zisserman A (2009) A statistical approach to material classification using image patch exemplars. IEEE Trans Pattern Anal Mach Intell 31(11):2032–2047

59.Dana KJ, van Ginneken B, Nayar SK, Koenderink JJ (1999) Reflectance and texture of real-world surfaces. ACM Trans Graphics 18(1):1–34

60.Lazebnik S, Schmid C, Ponce J (2005) A sparse texture representation using local affine regions. IEEE Trans Pattern Anal Mach Intell 27(8):1265–1278

61.Grzegorzek M (2010) A system for 3D texture-based probabilistic object recognition and its applications. Pattern Anal Appl 13(3):333–348

62.Schiele B, Pentland A (1999) Probabilistic object recognition and localization. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp 177–182

63.Amsaleg L, Gros P (2001) Content-based retrieval using local descriptors: problems and issues from a database perspective. Pattern Anal Appl 4(2–3):108–124

64.Caputo B, Wallraven C, Nilsback M-E (2004) Object categorization via local kernels. In: Proceedings of the 17th international conference on pattern recognition (ICPR), pp 132–135

65.Lyu S (2005) Mercer kernels for object recognition with local features. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 223–229

66.Deselaers T, Keysers D, Ney H (2005) Discriminative training for object recognition using image patches. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 157–162

67.Loupias E, Sebe N, Bres S, Jolion J (2000) Wavelet-based salient points for image retrieval. In: Proceedings of the IEEE international conference on image processing (ICIP), pp 518–521

68.Linde Y, Buzo A, Gray RM (1980) An algorithm for vector quantization design. IEEE Trans Commun 28:84–95

69.Nene SA, Nayar SK, Murase H (1996) Columbia object image library (COIL-20). Technical Report CUCS-005-96, Columbia University

70.Nene SA, Nayar SK, Murase H (1996) Columbia object image library (COIL-100). Technical Report CUCS-006-96, Columbia University

71.Weber M, Welling M, Perona P (2000) Unsupervised learning of object models and recognition. In: Proceedings of the European conference on computer vision (ECCV), pp 18–32

72.Bouguila N, Ziou D (2010) A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling. IEEE Trans Neural Netw 21(1):107–122

73.Bouguila N (2009) A model-based approach for discrete data clustering and feature weighting using MAP and stochastic complexity. IEEE Trans Knowl Data Eng 21(12):1649–1664

Repository Staff Only: item control page

Download Statistics

Downloads per month over past year

Altmetric

Research related to the current document (at the CORE website)

Spectrum Research Repository

On the smoothing of multinomial estimates using Liouville mixture models and applications

On the smoothing of multinomial estimates using Liouville mixture models and applications

Abstract

References: