Kégl, Balázs (1999) Principal curves : learning, design, and applications. PhD thesis, Concordia University.
The subjects of this thesis are unsupervised learning in general, and principal curves in particular. Principal curves were originally defined by Hastie [Has84] and Hastie and Stuetzle [HS89] (hereafter HS) to formally capture the notion of a smooth curve passing through the "middle" of a d -dimensional probability distribution or data cloud. Based on the definition, HS also developed an algorithm for constructing principal curves of distributions and data sets. The field has been very active since Hastie and Stuetzle's groundbreaking work. Numerous alternative definitions and methods for estimating principal curves have been proposed, and principal curves were further analyzed and compared with other unsupervised learning techniques. Several applications in various areas including image analysis, feature extraction, and speech processing demonstrated that principal curves are not only of theoretical interest, but they also have a legitimate place in the family of practical unsupervised learning techniques. Although the concept of principal curves as considered by HS has several appealing characteristics, complete theoretical analysis of the model seems to be rather hard. This motivated us to redefine principal curves in a manner that allowed us to carry out extensive theoretical analysis while preserving the informal notion of principal curves. Our first contribution to the area is, hence, a new theoretical model that is analyzed by using tools of statistical learning theory. Our main result here is the first known consistency proof of a principal curve estimation scheme. The theoretical model proved to be too restrictive to be practical. However, it inspired the design of a new practical algorithm to estimate principal curves based on data. The polygonal line algorithm, which compares favorably with previous methods both in terms of performance and computational complexity, is our second contribution to the area of principal curves. To complete the picture, in the last part of the thesis we consider an application of the polygonal line algorithm to hand-written character skeletonization.
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Computer Science and Software Engineering|
|Item Type:||Thesis (PhD)|
|Pagination:||xv, 122 leaves : ill. ; 29 cm.|
|Degree Name:||Theses (Ph.D.)|
|Program:||Computer Science and Software Engineering|
|Thesis Supervisor(s):||Krzyzak, Adam|
|Deposited By:||Concordia University Libraries|
|Deposited On:||27 Aug 2009 13:15|
|Last Modified:||08 Dec 2010 10:17|
Repository Staff Only: item control page