Variational learning of a Dirichlet process of generalized Dirichlet distributions for simultaneous clustering and feature selection

Fan, Wentao and Bouguila, Nizar (2013) Variational learning of a Dirichlet process of generalized Dirichlet distributions for simultaneous clustering and feature selection. Pattern Recognition, 46 (10). pp. 2754-2769. ISSN 00313203

Text (application/pdf)
bouguila2013b.pdf - Accepted Version
613kB

Official URL: http://dx.doi.org/10.1016/j.patcog.2013.03.026

Abstract

This paper introduces a novel enhancement for unsupervised feature selection based on generalized Dirichlet (GD) mixture models. Our proposal extends the finite mixture model previously developed in [1] to the infinite case through Dirichlet process mixtures, which can be viewed as a purely nonparametric model since the number of mixture components can increase as data are introduced. The infinite assumption avoids problems related to model selection (i.e., determining the number of clusters) and allows data to be separated into similar clusters while relevant features are selected simultaneously. The resulting model is learned within a principled variational Bayesian framework that we have developed. Experimental results on both synthetic data and challenging real-world applications involving image categorization, automatic semantic annotation, and retrieval show that our approach provides accurate models by distinguishing between relevant and irrelevant features without over- or under-fitting the data.
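
As an illustration of how a Dirichlet process mixture can leave the number of components open, the sketch below draws mixing weights from a truncated stick-breaking construction. This is a standard representation assumed here for illustration; the abstract does not say which construction the paper uses. The function name, the concentration parameter `alpha`, and the truncation level are hypothetical choices, not values from the paper.

```python
import random

def stick_breaking_weights(alpha, truncation, seed=0):
    """Draw mixing weights from a truncated stick-breaking construction.

    alpha and truncation are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(truncation):
        # Each component takes a Beta(1, alpha) fraction of what is left;
        # the last fraction is fixed to 1 so the truncated weights sum to 1.
        v = 1.0 if k == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights

weights = stick_breaking_weights(alpha=1.0, truncation=10)
```

With a small `alpha`, most of the mass tends to fall on the first few components, so a generous truncation level leaves many components with negligible weight. In that sense the effective number of clusters is inferred from the data and not fixed in advance.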

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Article
Refereed: Yes
Authors: Fan, Wentao and Bouguila, Nizar
Journal or Publication: Pattern Recognition
Date: October 2013
Digital Object Identifier (DOI): 10.1016/j.patcog.2013.03.026
Keywords: Infinite mixture models; Dirichlet process; Generalized Dirichlet; Feature selection; Clustering; Images categorization; Image auto-annotation
ID Code: 977854
Deposited By: DANIELLE DENNIE
Deposited On: 27 Sep 2013 14:19
Last Modified: 18 Jan 2018 17:45
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
