<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>0120-6230</journal-id>
<journal-title><![CDATA[Revista Facultad de Ingeniería Universidad de Antioquia]]></journal-title>
<abbrev-journal-title><![CDATA[Rev.fac.ing.univ. Antioquia]]></abbrev-journal-title>
<issn>0120-6230</issn>
<publisher>
<publisher-name><![CDATA[Facultad de Ingeniería, Universidad de Antioquia]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S0120-62302010000600024</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Analysis and convergence of weighted dimensionality reduction methods]]></article-title>
<article-title xml:lang="es"><![CDATA[Análisis y convergencia de métodos de reducción de dimensionalidad ponderados]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Riaño Rojas]]></surname>
<given-names><![CDATA[Juan Carlos]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Prieto Ortiz]]></surname>
<given-names><![CDATA[Flavio Augusto]]></given-names>
</name>
<xref ref-type="aff" rid="A02"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Sánchez Camperos]]></surname>
<given-names><![CDATA[Edgar Nelson]]></given-names>
</name>
<xref ref-type="aff" rid="A03"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Acosta Medina]]></surname>
<given-names><![CDATA[Carlos Daniel]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Castellanos Domínguez]]></surname>
<given-names><![CDATA[Germán Augusto]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Universidad Nacional de Colombia Sede Manizales]]></institution>
<addr-line><![CDATA[Manizales ]]></addr-line>
<country>Colombia</country>
</aff>
<aff id="A02">
<institution><![CDATA[Universidad Nacional de Colombia Sede Bogotá]]></institution>
<addr-line><![CDATA[Bogotá D.C ]]></addr-line>
<country>Colombia</country>
</aff>
<aff id="A03">
<institution><![CDATA[Centro de Investigaciones Avanzadas IPN]]></institution>
<addr-line><![CDATA[Jalisco ]]></addr-line>
<country>México</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>12</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>12</month>
<year>2010</year>
</pub-date>
<numero>56</numero>
<fpage>245</fpage>
<lpage>254</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_arttext&amp;pid=S0120-62302010000600024&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_abstract&amp;pid=S0120-62302010000600024&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_pdf&amp;pid=S0120-62302010000600024&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[We propose to use a Fisher type discriminant objective function addressed to weighted principal component analysis (WPCA) and weighted regularized discriminant analysis (WRDA) for dimensionality reduction. Additionally, two different proofs for the convergence of the method are obtained. First one analytically, by using the completeness theorem, and second one algebraically, employing spectral decomposition. The objective function depends on two parameters U matrix being the rotation and D diagonal matrix weight of relevant features, respectively. These parameters are computed iteratively, in order to maximize the reduction. Relevant features were obtained by determining the eigenvector associated to the most weighted eigenvalue on the maximum value in U. Performance evaluation of the reduction methods was carried out on 70 benchmark databases. Results showed that weighted reduction methods presented the best behavior, PCA and PPCA lower than 17% while WPCA and WRDA higher than 45%. Particularly, WRDA method had the best performance in the 75% of the cases compared with the others studied here.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[En este trabajo se propone utilizar una función objetivo discriminante tipo de Fisher, para la reducción de la dimensionalidad, en el análisis de componentes principales ponderados (WPCA) y al análisis discriminante regularizado ponderados (WRDA). Además, se desarrollan dos pruebas de la convergencia del método. Primero analíticamente, usando el teorema de completitud, y una segunda prueba algebraica, empleando descomposición espectral. La función objetivo depende de dos parámetros: U matriz de rotaciones y D matriz pesos de características relevantes, respectivamente. Estos parámetros se calculan iterativamente, para maximizar la reducción. Las características relevantes fueron obtenidas determinando el vector propio asociado al valor propio con máximo valor en U. La evaluación del desempeño de los métodos de la reducción fue realizada sobre 70 bases de datos (benchmark). Los resultados mostraron que los métodos ponderados presentan un mejor comportamiento PCA y PPCA por debajo del 17% mientras que WPCA y WRDA por encima del 45%. Particularmente, el método WRDA tuvo el mejor funcionamiento en el 75% de los casos comparados con los otros estudiados en este trabajo.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[PCA]]></kwd>
<kwd lng="en"><![CDATA[PPCA]]></kwd>
<kwd lng="en"><![CDATA[WPCA]]></kwd>
<kwd lng="en"><![CDATA[WRDA]]></kwd>
<kwd lng="en"><![CDATA[dimensionality reduction]]></kwd>
<kwd lng="es"><![CDATA[WPCA]]></kwd>
<kwd lng="es"><![CDATA[WRDA]]></kwd>
<kwd lng="es"><![CDATA[reducción de dimensión]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <p align="center"><font face="Verdana" size="4"> <b>Analysis and convergence of weighted dimensionality reduction methods</b></font></p>      <p align="center"><font face="Verdana" size="4"> <b>An&aacute;lisis y convergencia de m&eacute;todos de reducci&oacute;n de dimensionalidad ponderados</b></font></p>      <p> <font face="Verdana" size="2"> <i>Juan Carlos Ria&ntilde;o Rojas<sup>1</sup>*, Flavio Augusto Prieto Ortiz<sup>2</sup>, Edgar Nelson S&aacute;nchez Camperos<sup>3</sup>, Carlos Daniel Acosta Medina<sup>1</sup>, Germ&aacute;n Augusto Castellanos Dom&iacute;nguez<sup>1</sup></i></font></p>       <p> <font face="Verdana" size="2"><sup>1</sup>Universidad Nacional de Colombia Sede Manizales. A.A. 127. Km 9 v&iacute;a el aeropuerto campus la Nubia Manizales, Colombia    <br>    <br>  <sup>2</sup>Universidad Nacional de Colombia Sede Bogot&aacute;. A.A. 14490. Carrera 30 N&deg; 45-03 Edificio IEI 406 Bogot&aacute; D.C., Colombia    <br>    <br>  <sup>3</sup>Centro de Investigaciones Avanzadas IPN, Cinvestav. Guadalajara. Av. Cient&iacute;fica 1145, colonia el Baj&iacute;o, Zapopan, 45015, Jalisco, M&eacute;xico </font></p>      <br>  <hr noshade size="1">      <p><font face="Verdana" size="3"><b>Abstract</b></font></p>       ]]></body>
<body><![CDATA[<p><font face="verdana" size= "2">We  propose to use a Fisher type discriminant objective function addressed to  weighted principal component analysis (WPCA) and weighted regularized  discriminant analysis (WRDA) for dimensionality reduction. Additionally, two  different proofs for the convergence of the method are obtained. First one  analytically, by using the completeness theorem, and second one algebraically,  employing spectral decomposition. The objective function depends on two  parameters  U matrix  being the rotation and D diagonal matrix weight of relevant features, respectively.  These parameters are computed iteratively, in order to maximize the reduction.  Relevant features were obtained by determining the eigenvector associated to  the most weighted eigenvalue on the maximum value in U. Performance evaluation of the  reduction methods was carried out on 70 benchmark databases. Results showed  that weighted reduction methods presented the best behavior, PCA and PPCA lower  than 17% while WPCA and WRDA higher than 45%. Particularly, WRDA method had the  best performance in the 75% of the cases compared with the others studied here.</font></p>       <p><font face="Verdana" size="2"><i>Keywords:</i>PCA, PPCA, WPCA, WRDA, dimensionality reduction</font>.</p>   <hr noshade size="1">      <p><font face="Verdana" size="3"><b>Resumen</b></font></p>      <p><font face="Verdana" size="2">En este trabajo se propone utilizar una funci&oacute;n objetivo discriminante tipo de Fisher, para la reducci&oacute;n de la dimensionalidad, en el an&aacute;lisis de componentes principales ponderados (WPCA) y al an&aacute;lisis discriminante regularizado ponderados (WRDA). Adem&aacute;s, se desarrollan dos pruebas de la convergencia del m&eacute;todo. Primero anal&iacute;ticamente, usando el teorema de completitud, y una segunda prueba algebraica, empleando descomposici&oacute;n espectral. 
La funci&oacute;n objetivo depende de dos par&aacute;metros: <i>U</i> matriz de rotaciones y <i>D</i> matriz pesos de caracter&iacute;sticas relevantes, respectivamente. Estos par&aacute;metros se calculan iterativamente, para maximizar la reducci&oacute;n. Las caracter&iacute;sticas relevantes fueron obtenidas determinando el vector propio asociado al valor propio con m&aacute;ximo valor en <i>U</i>. La evaluaci&oacute;n del desempe&ntilde;o de los m&eacute;todos de la reducci&oacute;n fue realizada sobre 70 bases de datos (benchmark). Los resultados mostraron que los m&eacute;todos ponderados presentan un mejor comportamiento PCA y PPCA por debajo del 17% mientras que WPCA y WRDA por encima del 45%. Particularmente, el m&eacute;todo WRDA tuvo el mejor funcionamiento en el 75% de los casos comparados con los otros estudiados en este trabajo.</font></p>      <p><font face="Verdana" size="2"><i>Palabras clave:</i> WPCA, WRDA, reducci&oacute;n de dimensi&oacute;n</font>.</p>  <hr noshade size="1">      <p><font face="Verdana" size="3"><b>Introduction</b></font></p>      <p><font face="Verdana" size="2">Extracting relevant information from a data set with a large number of features is considered a major problem in machine learning and pattern recognition. Such large sets commonly appear in areas such as bioinformatics and text recognition, where it is common to find feature vectors with dimensions higher than 107 but with a low number of relevant characteristics. As a result, the performance of classification algorithms is limited and their computing time is high, which restricts their application in real-time tasks [1]. To solve this problem, irrelevant features that do not contribute to the extraction and selection process must be rejected, which improves classifier performance. 
Traditionally, dimensionality reduction has been carried out with linear techniques such as principal component analysis (PCA), probabilistic principal component analysis (PPCA) and factor analysis [2-4]. Nevertheless, these linear techniques are not suitable for handling complex nonlinear data. For this reason, in recent years a great number of nonlinear techniques for dimensionality reduction have been proposed, including several geometric methods for feature selection and dimensional reduction. These are divided into projective methods and methods that model the manifold on which the data lie. The projective methods include projection pursuit, PCA, kernel PCA, probabilistic PCA and oriented PCA; the manifold methods include multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps and spectral clustering [5-7]. Nevertheless, they lack a convergence analysis. Nonlinear reduction techniques perform well in complex artificial tasks; however, they do not outperform the traditional linear techniques in real-world tasks using several databases, and no formal proof of this fact has been given [8]. In [9] an algebraic approximation for weighting variables is presented, based on the spectral properties of the kernel matrix. Its main contribution consists in obtaining relevant variables using the weighted objective function and proving its convergence under strong hypotheses from the foundations of analysis, which differ from those used in this paper. An exhaustive review of feature extraction and selection methods, grouping them in two classes, is given in [10]. The first class contains the binary search methods, which are catalogued as exhaustive search methods but are prohibitive because of their computational cost. 
Exhaustive search finds the optimum of the objective function, whereas its heuristic substitutes provide unstable and non-optimal results. The second group includes the weighted methods, which multiply the features by continuous values so that techniques from mathematical analysis can be used to optimize the objective function. References [11-14] give descriptive studies of reduction and regression methods. Some of them show applications in pattern classification for identifying faces, while others present applications in materials science. Although the last two references use the same abbreviation (WPCA), they refer to other aspects, employing &quot;W&quot; for window or whitened; they differ from the methods considered in this paper because they carry out a local reduction.    <br>    <br> This work includes a complete study of two of these weighted methods, WPCA and WRDA. These methods were already introduced in [15,16] by using an EM algorithm, without a theoretical study. Their main advantage is the capability of combining two tasks (feature selection and extraction) in one step, returning the so-called relevant features. Here we present the methods as weighted rotations that maximize the objective function <i>J</i>, which is the ratio of matrix traces that represents the inter-class and intra-class dispersion. From this definition, we develop a convergence analysis. The main convergence result is obtained from two different points of view: first analytically, using the completeness theorem, and second algebraically, employing spectral decomposition.    <br>    ]]></body>
<body><![CDATA[<br> Convergence was validated on artificial and real data by selecting relevant variables from a set of 70 features, both geometric (areas, perimeters, fractal dimension, curvatures, Hausdorff dimension, among others) and statistical (correlations, means, entropy and moments), needed to characterize patterns that can separate two classes. To evaluate the performance of these methods, the hyper-volume under the ROC surface (hyper-surface) is calculated with the Monte Carlo method, in addition to the classification error.</font></p>      <p><font face="Verdana" size="3"><b>Methodology</b> </font></p>      <p> <font face="Verdana" size="2"><b><i>Relevance and variables selection using weighting</i></b></font></p>      <p> <font face="Verdana" size="2">The variable selection problem can be understood as choosing a subset of <i>p</i> features, from the whole set of <i>c</i> features, that yields a suitable performance in the classification process. This search is guided by an evaluation function, and its result is named the relevant set. Extraction techniques, on the other hand, transform the space of <i>c</i> features into a space of lower dimension. In order to guarantee the optimum solution, these methods execute an exhaustive search, which increases the computational cost. To solve this problem, heuristic methodologies have been proposed, but they behave unstably with respect to the objective function. Another alternative consists in using weighting methods; although they do not give the optimum solution, they are more stable and flexible and produce a suitable solution [17]. Some of these methods are described below.</font></p>       <p> <font face="Verdana" size="2"><b><i>Weighted probabilistic PCA</i></b></font></p>       <p> <font face="Verdana" size="2">It is a particular case of factor analysis. 
In this method, the original features <i>X</i> are modeled as a linear combination of a group of factors <i>Z</i>, with loading coefficients <i>C</i>, plus a specific error <i>V</i>, according to (1):</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e01.gif"></p>      <p> <font face="Verdana" size="2">where the random variables <i>Z</i> can be assumed independent and identically distributed, with a unit spherical Gaussian variance. Note that there is a difference between the probabilistic PCA model (PPCA) and PCA, where the variance of the random variables can be associated with the diagonal elements of <i>Z</i>. The model also considers the general perturbation matrix <i>V</i>, but in [18] the Gaussian variance is restricted to <i>R = &epsilon;I</i> (isotropic noise). The previous model formulation is modified by introducing weights on the original variables (features), so that it contains the weighted rotation. Let <i>D</i> be a diagonal matrix that contains the weight of the <i>i</i>-th variable in the element <i>d<sub>ii</sub></i>. If the new variable is taken as <i>d<sub>ii</sub>x<sub>i</sub></i>, a new data set <i>y = Dx</i> is generated, whose probabilistic model is defined in (2):</font></p>        <p> <img src="../img/revistas/rfiua/n56/n56a24e02.gif"></p>      <p> <font face="Verdana" size="2">where <i>Z</i> and <i>V</i> are distributed as in equation (1). From this definition, <i>Y</i> is normally distributed, with zero mean and the covariance given by (3):</font></p>      ]]></body>
<body><![CDATA[<p> <img src="../img/revistas/rfiua/n56/n56a24e03.gif"></p>      <p> <font face="Verdana" size="2">The weights are found in <i>D</i> and they are responsible of generating the noise for the <i>X</i> variables. From (3) the EM (<i>Expectation - Maximization</i>) algorithm is obtained, in order to estimate the unknown variables state in the <i>E</i>-step and maximizing the total probability from the estimation of <i>Z</i> and the observation of <i>Y</i>, in the <i>M</i>-step. <i>E</i>-step and <i>M</i>-step are observed in equation (4) and (5), respectively:</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e04.gif"></p>       <p> <font face="Verdana" size="2"><b><i>Weighted regularized discriminant Analysis WRDA</i></b></font></p>      <p> <font face="Verdana" size="2">RDA was proposed by Friedman [19&#93; for being used in small samples, where data possess high dimensionality, trying to overcome the discriminant rule degradation. In this document, they were identified as the regularized linear discriminant analysis method. The aim of this technique is to find the lineal projection space where the dispersion between classes was the maximum value. One way consists in maximizing the ratio between projected classes in the dispersion matrix inter-classes &sum;<sub>B</sub> and the dispersion matrix intra-class &sum;<sub>W</sub>, as is expressed in (6):</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e06.gif"></p>      <p> <font face="Verdana" size="2">where <i>W</i> is the projection matrix, which dimension is defined by the number of linearly separated classes <i>k</i>. The aim is to maximize the previous objective function (6), under restriction |<i>W<sup>T</sup>&sum;<sub>W</sub>W</i>| = 1. 
The solution is obtained with Lagrange multipliers: the solutions are the <i>k</i>-1 generalized eigenvectors of &Sigma;<sub>B</sub> and &Sigma;<sub>W</sub>, which correspond to the principal eigenvectors of &Sigma;<sup>-1</sup><sub>W</sub>&Sigma;<sub>B</sub>. Regularization is needed because, for small sample sizes, &Sigma;<sub>W</sub> cannot be inverted directly. The solution is then reformulated as expressed in (7), where &Lambda; is the (diagonal) eigenvalue matrix:</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e07.gif"></p>      <p> <font face="Verdana" size="2">After weighting the data to obtain <i>XD</i>, where <i>D</i> is a diagonal matrix, the function to optimize becomes (8):</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e08.gif"></p>      ]]></body>
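<body><![CDATA[<p> <font face="Verdana" size="2">The weighted discriminant projection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names <i>scatter_matrices</i> and <i>weighted_rda</i>, the parameter <i>reg</i> and the ridge-type regularizer &Sigma;<sub>W</sub> + <i>reg</i>&middot;I are assumptions made for illustration, since equations (6) to (8) are only available as images. The sketch weights the data as <i>XD</i>, builds the inter-class and intra-class dispersion matrices, and solves the generalized eigenproblem through a Cholesky whitening step.</font></p>
<pre>
```python
import numpy as np

def scatter_matrices(X, y):
    """Return the inter-class and intra-class dispersion matrices of X (n x c)."""
    mean_all = X.mean(axis=0)
    c = X.shape[1]
    S_b = np.zeros((c, c))
    S_w = np.zeros((c, c))
    for label in np.unique(y):
        X_k = X[y == label]
        mean_k = X_k.mean(axis=0)
        diff = (mean_k - mean_all)[:, None]
        S_b += X_k.shape[0] * (diff @ diff.T)
        centered = X_k - mean_k
        S_w += centered.T @ centered
    return S_b, S_w

def weighted_rda(X, y, d, n_components, reg=1e-3):
    """Leading discriminant directions of the weighted data X D (hypothetical helper)."""
    X_weighted = X * d                      # same as X @ np.diag(d)
    S_b, S_w = scatter_matrices(X_weighted, y)
    c = S_w.shape[0]
    # Assumed regularizer: add a small ridge so that S_w becomes invertible.
    S_w_reg = S_w + reg * np.eye(c)
    # Whitening with the Cholesky factor turns the generalized problem
    # S_b w = lambda S_w_reg w into an ordinary symmetric eigenproblem.
    L = np.linalg.cholesky(S_w_reg)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    eigvals, V = np.linalg.eigh(M)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    W = L_inv.T @ V[:, order]               # satisfies W.T @ S_w_reg @ W = I
    return W, eigvals[order]

# Synthetic two-class example (k = 2, so k - 1 = 1 projection vector).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(1.5, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
d = np.ones(5) / np.sqrt(5)                 # uniform weights with unit norm
W, lam = weighted_rda(X, y, d, n_components=1)
```
</pre>
<p> <font face="Verdana" size="2">With this normalization the projected intra-class dispersion is the identity, which is one way to meet a unit-determinant restriction of the kind stated for (6).</font></p>]]></body>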
<body><![CDATA[<p> <font face="Verdana" size="2"><b><i>Weighted variables and relevance criteria</i></b></font></p>      <p> <font face="Verdana" size="2">In previous subsections some weighted linear transformations were defined. Now, the interest is to project data in a <i>f</i> dimension space. Such dimension depends on the chosen rotation criterion; for instance, there is a bi-class problem and the convergence of WRDA is required, the fixed dimension must be <i>f</i>=1, in order to reach the convergence. For evaluating the weighted projection relevance at a fixed dimension, the measurement of separability is required. The parameter to be optimized is the weight diagonal matrix <i>D</i>, and the selected criterion is the inter-classes and intra-classes dispersion matrix traces coefficient, known as <i>J<sub>4</sub></i> [20&#93;. For projected and weighted data, this measurement is given by (9):</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e09.gif"></p>      <p> <font face="Verdana" size="2">where &#1092; can represent <i>U</i> or <i>W</i>. The size of &#1092; is <i>c</i> x <i>f</i> and <i>f</i> denotes the fixed dimension, corresponding to the number of projection vectors &#1092; such that: &#1092; = (<i>&#1092;<sub>1</sub>, &#1092;<sub>2</sub>,…, &#1092;<sub>f</sub></i>).    
<br>    <br>  Rewriting the matrix <i>D</i> as a column vector <i>d</i>, and using the Hadamard matrix product (denoted by o), the traces of equation (9) can be rewritten as presented in (10):</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e10.gif"></p>      <p> <font face="Verdana" size="2">Then, equation (9) is transformed into equation (11):</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e11.gif"></p>      <p> <font face="Verdana" size="2">This function has essentially the same form as the LDA (linear discriminant analysis) function; therefore, the solution for <i>d</i> under the <i>L<sub>2</sub></i> norm is the principal eigenvector given by (12):</font></p>      ]]></body>
<body><![CDATA[<p> <img src="../img/revistas/rfiua/n56/n56a24e12.gif"></p>      <p> <font face="Verdana" size="2">Note that this kind of description assumes that the &#1092; elements are static; this problem is overcome by overlapping <i>d</i> and &#1092; calculation, until the convergence of both is reached [9&#93;. Related to weight interpretability, in order to define the dispersion, positive values are generally required. Nevertheless, in the context of relevance function used in this work, negative values can be obtained, this is avoided taking the absolute value of <i>d</i>.</font></p>      <p> <font face="Verdana" size="2"><b><i>Weighted reduction WPCA and WRDA</i></b></font></p>      <p> <font face="Verdana" size="2">In [9&#93; the convergence towards local maximum, using the power method applied to Q-&alpha; method which objective function is similar to <i>J<sub>4</sub></i> is shown. Such proof was carried out for a particular case, where the objective function is poor, which means that there is a subset of characteristics including a coherent cluster and a positive function, conditions that normally can not be demanded. For this reason, in this section the convergence objective function  in WPCA is proved, but it requires using the lemma 1, which can be demonstrated from two perspectives: analytic and algebraic.    <br>         <br>  The next lemma guaranties the objective function has a maximum. Moreover, it is observed that any search method converges to the same limit.    <br>    <br>  <i>Lemma:</i> The objective function <img src="../img/revistas/rfiua/n56/n56a24e00.gif">has a maximum.    <br>    <br>  <i>Proof (analyticversion):</i> the objective function can be represented as: <i>J (W, D) = (d, &Gamma;w (d)</i> where <img src="../img/revistas/rfiua/n56/n56a24e001.gif"></font></p>      ]]></body>
<body><![CDATA[<p><font face="Verdana" size="2">Let be C = &#123; (d, &Gamma;<sub>w</sub> (d)) : ||d|| = 1&#125;   a set. The set is not empty  since any eigenvector d = &beta;<sub>i</sub> satisfies the condition in the <i>C</i> set. Now, it is necessary to  show that the set posses supremum. Taking a base of eigenvectors  &#123; &beta;<sub>1</sub>, &beta;<sub>2</sub>,&hellip;,&beta;<sub>n</sub>&#125;  <em>,</em> associated to the  transformation &Gamma;w then, for any d vector conformed by a linear  combination of &nbsp;&beta;<sub>i</sub>&nbsp; i it is had that d = <b>&sum;<sup>n</sup><sub>i=1</sub>c<sub>i</sub>&beta;<sub>i</sub></b>,  such that:    <br>    <br>  <i>J</i>(<i>W,D</i>)=(<i>d,&Gamma;<sub>w</sub>(d)</i>) =(<b>&sum;<sup>n</sup><sub>i=1</sub>c<sub>i</sub>&beta;<sub>i</sub>,&Gamma;<sub>W</sub>(&sum;<sup>n</sup><sub>i=1</sub>c<sub>i</sub>&beta;<sub>i</sub>)</b>)=  <b>&sum;<sup>n</sup><sub>i=1</sub>&lambda;<sub>i</sub>c<sup>2</sup><sub>i</sub></b>, where ||<i><b>d</b></i>||<sup>2</sup>=||<b>&sum;<sup>n</sup><sub>i=1</sub>c<sup>2</sup><sub>i</sub></b>||&gt;1.  Then <i>C</i> is upper bounded, because of the  supremum axiom <i>sup(C)</i> exists. Considering ||<i><b>d</b></i>||<sup>2</sup> = <b>&sum;<sup>n</sup><sub>i=1</sub>c<sup>2</sup><sub>i</sub></b>  = 1 and &lambda; = max<sub><i>i&le;i&le;n</i></sub>&#123; <i>&lambda;</i><sub>i</sub>&#125;   it implies that <i>sup(C)</i> = <i>&lambda;</i>.  For this reason, if <i>d</i> = &beta; the eigenvector associated to the greater eigenvalue <i>&lambda;</i>, maximizes the objective function.    <br>    <br>  <i>Proof 2 (algebraic  version):</i>  The objective function J(w,D) = <i>trace</i>(W<sup>T</sup> DADW) is maximized, under  restrictions:  W<sup>T</sup>W=Id, <i>trace</i>(W<sup>T</sup>DBDW) = 1 where <i>D</i> be a diagonal matrix and ||<i><b>d</b></i>||  = 1. 
The original data are stored in the matrix <i>X</i> and its covariance <i>X<sup>T</sup>X</i> is analyzed, writing it as <i>X<sup>T</sup>X</i> =  <i>A + B</i>, where <i>A</i> and <i>B</i> are symmetric positive semidefinite matrices, which can be replaced by their Cholesky decompositions: <i>X<sup>T</sup>X</i> = <i>A<sup>T</sup><sub>1</sub>A<sub>1</sub></i> + <i>B<sup>T</sup><sub>1</sub>B<sub>1</sub></i>. Multiplying on the left by <i>W<sup>T</sup>D</i> and on the right by <i>DW</i> and taking traces, the following expression is obtained: J(W,D) = <i>trace</i>(W<sup>T</sup>DA<sup>T</sup><sub>1</sub>A<sub>1</sub>DW) = <i>trace</i>(DA<sup>T</sup><sub>1</sub>A<sub>1</sub>DWW<sup>T</sup>).  Using the orthogonality conditions on <i>W</i>, it can be written as J(W,D) = <i>trace</i>(DA<sup>T</sup><sub>1</sub> A<sub>1</sub>D)= ||A<sub>1</sub>D||<sup>2</sup><sub>F</sub>.    <br>     <br>  Finally, it can be expressed as  J(W,D) = ||&#91;d<sub>1</sub>A<sub>1</sub>(1,:), d<sub>2</sub>A<sub>1</sub>(2,:),&hellip;,d<sub>n</sub>A<sub>1</sub>(n,:)&#93;<sup>T</sup>||<sup>2</sup><sub>2</sub>.  Its matrix representation is given by:</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e002.gif"></p>      <p> <font face="Verdana" size="2"> Then, <i>trace</i> (W<sup>T</sup>DB<sup>T</sup><sub>1</sub>B<sub>1</sub>DW)  = 1 implies that &sum;d<sup>2</sup><sub>i</sub>B<sub>1</sub>(:,i)<sup>T</sup>B<sub>1</sub>(:,i)  = 1, which can be written as &sum;d&tilde;<sub>i</sub> = 1, where d&tilde;<sub>i</sub> = d<sup>2</sup><sub>i</sub>y<sub>i</sub> and y<sub>i</sub> =B<sub>1</sub>(:,i)<sup>T</sup>B<sub>1</sub>(:,i).  Therefore <i>D</i> can be expressed in matrix form as:</font></p>      <p><img src="../img/revistas/rfiua/n56/n56a24e003.gif"></p>      ]]></body>
<body><![CDATA[<p> <font face="Verdana" size="2"> The last expression can be transformed in the next function for maximizing: J(W,D) = ||Md||<sup>2</sup><sub>2</sub>, joined to ||<i><b>d</b></i>|| = 1, which has as a maximum ||M|| as is evident.    <br>    <br>  Following, the WPCA algorithm and its convergence is presented.</font></p>      <p> <font face="Verdana" size="2"><b><i>WPCA algorithm</i></b></font></p>      <p> <font face="Verdana" size="2">The iterative nature and its convergence of EM and PCA probabilistic parameters estimation is used for obtaining the WPCA algorithm, which is described as follow, employing <i>r</i> as the iteration index:    <br>         <br>  i &nbsp;&nbsp;Normalize each characteristics vector for obtaining zero mean and the one Euclidean norm (||x||<sub>2</sub> = 1).    <br>    <br>  ii &nbsp;&nbsp;Start with some orthogonal set of vectors U<sup>(0)</sup>.    <br>    ]]></body>
<body><![CDATA[<br>  iii &nbsp;&nbsp;Calculate D<sup>(r)</sup> from the solution given in equation (12) and weight the data.    <br>    <br>  iv &nbsp;&nbsp;Calculate the <i>E-step</i> and the <i>M-step</i> from equations (4) and (5), respectively. Normalize the columns of <i>C</i> to obtain ||C(:,i)||<sub>2</sub> = 1.    <br>    <br>  v &nbsp;&nbsp;If ||C<sup>(r)</sup> - C<sup>(r-1)</sup>||<sub>2</sub> &gt; &epsilon;, return to step iii.    <br>    <br>  vi &nbsp;&nbsp;Orthonormalize the subspace obtained by finding its singular value decomposition (SVD), as follows: SVD(C<sup>T</sup>DX<sup>T</sup>XDC) = ASA<sup>T</sup>, C<sub>final</sub>= A<sup>T</sup>C, where <i>A</i> and <i>S</i> are the elements obtained from the SVD.</font></p>      <p> <font face="Verdana" size="2"><b><i>WPCA convergence</i></b></font></p>      <p> <font face="Verdana" size="2">As stated in the last section, weighting the characteristics within the <i>EM</i> method can guarantee the convergence of the weights <i>D</i>, which ensures that the relevant features are obtained. From equation (4), the relationship (13) is reached in the algorithm:</font></p>      <p> <img src="../img/revistas/rfiua/n56/n56a24e13.gif"></p>      ]]></body>
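<body><![CDATA[<p> <font face="Verdana" size="2">As a complement to step iii of the algorithm listed above, the following Python sketch illustrates one possible form of the weight update. Because equations (9) to (12) are only available as images, the formulas used here are assumptions: the trace identity trace(&phi;<sup>T</sup>DSD&phi;) = d<sup>T</sup>(S o &phi;&phi;<sup>T</sup>)d for a symmetric matrix S, and a weight vector taken as the principal generalized eigenvector of the two Hadamard-weighted dispersion matrices. The names <i>j4</i>, <i>update_weights</i>, <i>ridge</i> and <i>take_abs</i> are hypothetical, and the dispersion matrices are synthetic.</font></p>
<pre>
```python
import numpy as np

def j4(S_b, S_w, Phi, d):
    """Trace ratio of inter-class to intra-class dispersion for weighted, projected data."""
    D = np.diag(d)
    num = np.trace(Phi.T @ D @ S_b @ D @ Phi)
    den = np.trace(Phi.T @ D @ S_w @ D @ Phi)
    return num / den

def update_weights(S_b, S_w, Phi, ridge=1e-8, take_abs=True):
    """Unit-norm weights for a fixed rotation Phi (assumed form of the update)."""
    G = Phi @ Phi.T
    A = S_b * G                               # Hadamard (elementwise) product
    B = S_w * G + ridge * np.eye(G.shape[0])  # assumed safeguard against a singular B
    L = np.linalg.cholesky(B)
    L_inv = np.linalg.inv(L)
    eigvals, V = np.linalg.eigh(L_inv @ A @ L_inv.T)
    d = L_inv.T @ V[:, -1]                    # principal generalized eigenvector
    if take_abs:
        # The text avoids negative weights with the absolute value;
        # note that this may change the value of the criterion.
        d = np.abs(d)
    return d / np.linalg.norm(d)

rng = np.random.default_rng(1)
c, f = 6, 2
R_b = rng.normal(size=(c, c))
R_w = rng.normal(size=(c, c))
S_b = R_b @ R_b.T                             # synthetic inter-class dispersion
S_w = R_w @ R_w.T + np.eye(c)                 # synthetic intra-class dispersion
Phi, _ = np.linalg.qr(rng.normal(size=(c, f)))  # rotation with orthonormal columns

# Numerical check of the trace identity for an arbitrary weight vector.
d = rng.normal(size=c)
lhs = np.trace(Phi.T @ np.diag(d) @ S_b @ np.diag(d) @ Phi)
rhs = d @ (S_b * (Phi @ Phi.T)) @ d

d_star = update_weights(S_b, S_w, Phi, take_abs=False)
d_abs = update_weights(S_b, S_w, Phi)
```
</pre>
<p> <font face="Verdana" size="2">In an alternating scheme of this kind, the update would be followed by a new estimate of the rotation, and both would be repeated until the change between iterations falls below the tolerance &epsilon;.</font></p>]]></body>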
<body><![CDATA[<p> <font face="Verdana" size="2">  Again, <i>r</i> corresponds to the iteration.  As  <i>EM</i> is applied  (increasing  r) the  perturbation  <i>V<sup>(r)</sup></i> decreases,  due to it is approximated to the most discriminant axes. That is, if r &rarr; &infin;, then ||  <i>V<sup>(r)</sup></i>||  &rarr; 0, guarantying the  convergence.    <br>         <br>  <i>Theorem 1:</i> If <i>C(r)&rarr; C&circ; y Z<sup>(r)</sup>&rarr; Z&circ;</i> then <i>D<sup>(r)</sup>&rarr; D&circ;</i>.</font></p>      <p><font face="Verdana" size="2"> <i>Proof:</i> Given the Eq. (13) for iterations <i>r</i> and <i>r</i> + 1, the subtraction produces: (D<sup>(r+1)</sup> -D<sup>(r)</sup>)X = (C<sup>(r+1)</sup>Z<sup>(r+1)</sup>  -C<sup>(r)</sup>Z<sup>(r)</sup>) + (V<sup>(r+1)</sup> - V<sup>(r)</sup>) . Applying to the last relationship, any type of  norm follows: ||(D<sup>(r+1)</sup> -D<sup>(r)</sup>)X|| &le; ||C<sup>(r+1)</sup>Z<sup>(r+1)</sup> - C<sup>(r)</sup>Z<sup>(r)</sup> + V<sup>(r+1)</sup> - V<sup>(r)</sup>||. It  is known that if r &rarr; &infin;,  then || V<sup>(r)</sup>|| &rarr; 0, then: ||(D<sup>(r+1)</sup> -D<sup>(r)</sup>)X|| &le; ||C<sup>(r+1)</sup>Z<sup>(r+1)</sup> - C<sup>(r)</sup>Z<sup>(r)</sup>||.  
Adding and subtracting C<sup>(r+1)</sup> Z<sup>(r)</sup> on the right side, we obtain ||(D<sup>(r+1)</sup> -D<sup>(r)</sup>)X|| &le;  ||C<sup>(r+1)</sup>(Z<sup>(r+1)</sup> -Z<sup>(r)</sup>) + (C<sup>(r+1)</sup> &ndash;C<sup>(r)</sup>)Z<sup>(r)</sup>||. Applying the triangle inequality and the multiplicative property, the expression ||(D<sup>(r+1)</sup> -D<sup>(r)</sup>) ||||X||&le;|| C<sup>(r+1)</sup>|||| (Z<sup>(r+1)</sup> -Z<sup>(r)</sup>)||+ || (C<sup>(r+1)</sup>  &ndash;C<sup>(r)</sup>)|||| Z<sup>(r)</sup>|| is reached. When <i>r</i> &rarr; &infin;, then ||(D<sup>(r+1)</sup> -D<sup>(r)</sup>) ||||X||&le;|| C&circ;|||| (Z<sup>(r+1)</sup> -Z<sup>(r)</sup>)||+ || (C<sup>(r+1)</sup>  &ndash;C<sup>(r)</sup>)|||| Z&circ;||. By hypothesis, <i>X</i> is the original data matrix, so ||X|| &gt; 0. In a normed space, every convergent sequence is a Cauchy sequence. Hence, if <i>r</i> &rarr; &infin;, then ||C<sup>(r+1)</sup> &ndash;C<sup>(r)</sup>||&rarr; 0, ||Z<sup>(r+1)</sup>  -Z<sup>(r)</sup>||&rarr; 0, and ||D<sup>(r+1)</sup> -D<sup>(r)</sup>||&rarr; 0. Thus, working in a Banach space, the weights converge. </font></p>      <p> <font face="Verdana" size="2"><b><i>WRDA Algorithm</i></b></font></p>      <p> <font face="Verdana" size="2">For this algorithm, errors produced by the rotation of the weighted data are not important, since the rotation function and the weighting function have similar directions. The algorithm is described as follows:    <br>    <br>  Fix the dimension <i>k</i> - 1, where <i>k</i> is the number of classes.    <br>    <br>  Normalize each feature vector to zero mean and unit Euclidean norm.    ]]></body>
<body><![CDATA[<br>    <br>  Start with some orthogonal set of vectors <i>W<sup>(0)</sup></i>.    <br>    <br>  Calculate <i>d<sup>(r)</sup></i> from the solution of equation (12), weighting the data.    <br>    <br>  Calculate <i>W<sup>(r)</sup></i> from equations (8) and (9).</font></p>     <p><font face="Verdana" size="2"> If ||<i>W<sup>(r)</sup></i> - <i>W<sup>(r-1)</sup></i>||<sub>2</sub> &gt; &epsilon;, return to numeral iii, where &epsilon; is the fixed error tolerance of the process.    <br>       <br>   Its objective function is precisely the one given in Lemma 1.</font></p>     <p><font face="Verdana" size="3"><b>Results and analysis</b> </font></p>      ]]></body>
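<body><![CDATA[<p><font face="Verdana" size="2">Before turning to the experiments, the iteration structure of the WRDA algorithm described above can be illustrated with a minimal Python sketch. It covers only the normalization step and the stopping rule ||<i>W<sup>(r)</sup></i> - <i>W<sup>(r-1)</sup></i>||<sub>2</sub> &le; &epsilon;. Because equations (8), (9) and (12) are not restated in this section, a plain power-iteration step on the Gram matrix of the normalized features is used as a stand-in update, and it is assumed that features are normalized column-wise. The names <i>normalize_features</i>, <i>iterate_until_stable</i> and <i>power_step</i> are hypothetical and are not part of the original work.</font></p>

```python
import math


def normalize_features(samples):
    """Center each feature (column) and scale it to unit Euclidean norm."""
    n_features = len(samples[0])
    columns = []
    for j in range(n_features):
        col = [row[j] for row in samples]
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        # A constant feature has zero norm after centering; leave it at zero.
        columns.append([v / norm for v in centered] if norm > 0 else centered)
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*columns)]


def l2_distance(a, b):
    """Euclidean norm of the difference between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def iterate_until_stable(update, w0, eps=1e-8, max_iter=1000):
    """Repeat w <- update(w) until ||w(r) - w(r-1)||_2 <= eps."""
    w_prev = w0
    for r in range(1, max_iter + 1):
        w = update(w_prev)
        if l2_distance(w, w_prev) <= eps:
            return w, r
        w_prev = w
    raise RuntimeError("no convergence within max_iter iterations")


def power_step(matrix):
    """Stand-in update (NOT equations (8), (9) or (12)): one power-iteration step."""
    def update(w):
        v = [sum(m_ij * w_j for m_ij, w_j in zip(row, w)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]
    return update


if __name__ == "__main__":
    # Toy data: four samples, two strongly correlated features (illustrative only).
    data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
    x = normalize_features(data)
    # Gram matrix of the normalized features (2 x 2).
    gram = [[sum(r[i] * r[j] for r in x) for j in range(2)] for i in range(2)]
    w, iterations = iterate_until_stable(power_step(gram), [1.0, 0.0])
    print(w, iterations)
```

<p><font face="Verdana" size="2">Stopping on the distance between successive iterates mirrors the Cauchy-sequence argument used in the proof of Theorem 1. In an actual WRDA implementation the stand-in update would be replaced by the weight and projection updates of equations (12), (8) and (9).</font></p>
]]></body>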
<body><![CDATA[<p><font size="2" face="Verdana">The behavior of the PCA, PPCA, WPCA and WRDA dimensionality reduction techniques was studied using a support vector machine (SVM) classifier, whose performance was evaluated with two different approaches: ROC curves and hyper-surfaces, and the classification error. Real data generated from geometric features such as area, perimeter, orientation, dispersion, centroids and different statistical moments were used, yielding 70 features computed on 50 capillary images of people without lupus erythematosus and 50 capillary images of people with the disease. <a href="#Figura1">Figure 1</a> shows the data projected on the principal plane. With the WPCA and WRDA methods, the data clouds of the healthy class (circles) and of the lupus erythematosus class (crosses) are more compact: they range between -0.1 and 0.1 on the horizontal axis and between -0.04 and 0.04 on the vertical axis. With the PCA and PPCA methods, in contrast, the clouds are more spread out, ranging between -0.5 and 0.5 on the horizontal axis and between -0.3 and 0.2 on the vertical axis.    <br> </font><font face="Verdana" size="2">    <p align="center"><img src="../img/revistas/rfiua/n56/n56a24i01.gif" ><a name="Figura1"></a></p> <a href="#Figura2">Figure 2</a> shows the performance of the reduction methods. In the left curve, the highest point indicates that WRDA obtained the greatest efficiency percentage, while the right curve shows that WRDA has the lowest classification error compared with the other methods.</font></p>       <p align="center"><img src="../img/revistas/rfiua/n56/n56a24i02.gif" ><a name="Figura2"></a></p>      <p> <font face="Verdana" size="2"><a href="#Figura3">Figure 3</a> shows the performance of the reduction methods in classification when applied to 70 databases downloaded from the website [21&#93;. 
The figure shows that the WRDA method performed best among the compared methods.</font></p>      <p align="center"><img src="../img/revistas/rfiua/n56/n56a24i03.gif" ><a name="Figura3"></a></p>      <p> <font face="Verdana" size="2"><a href="#Tabla1">Table 1</a> summarizes the performance and classification errors of the reduction methods applied to the 70 databases, reported as mean, median, standard deviation and minimum. The row "number of match" shows that WRDA achieved the best performance in 53 of the 70 databases, while PCA and PPCA were the worst. </font></p>      <p align="center"><img src="../img/revistas/rfiua/n56/n56a24t01.gif" ><a name="Tabla1"></a></p>      <p><font face="Verdana" size="3"><b>Conclusions</b> </font></p>      <p> <font face="Verdana" size="2">Proofs for the WPCA and WRDA reduction methods were carried out from two points of view: algebraic and geometric. Such proofs are relatively weaker than those known for other weighted methods. Performance results for the reduction methods PPCA, WPCA and WRDA were presented. These results indicate that the WRDA method achieves the best performance in about 75% of the databases (53 of 70), ranking first among the compared methods. In the literature, no rigorous proofs of the WPCA method and its variants are reported. For this reason, a large part of the document was dedicated to formalizing these proofs, one analytic and the other algebraic.    ]]></body>
<body><![CDATA[<br>         <br>  Capillary images proved highly complex for extracting relevant features because of their signal-to-noise ratio. This was confirmed by the classification errors, on the order of 30%, produced by the feature reduction methods implemented in this paper, and it can be explained by the high overlap between the classes. Nevertheless, the method with the best performance was WRDA, which reduced the classification error in spite of the class overlap.</font></p>      <p><font face="Verdana" size="3"><b>Acknowledgments</b> </font></p>      <p> <font face="Verdana" size="2">The authors are grateful for the financial support of Colciencias and CONACyT through Project 364, to Dima for the project "Herramienta para soporte al diagn&oacute;stico de la enfermedades vasculares usando im&aacute;genes capilares", and to Cristian Ocampo Bland&oacute;n for developing the graphical interface used in the dimensionality reduction process.</font></p>      <p><font face="Verdana" size="3"><b>References</b> </font></p>      <!-- ref --><p> <font face="Verdana" size="2">1. B. A. Olshausen, D. J. Field. "Emergence of simple-cell receptive field properties by learning a sparse code for natural images". <i>Nature</i>. Vol. 381. 1996. pp. 607-609.<!-- end-ref --><br>    <!-- ref --><br>   2. C. M. Bishop. <i>Pattern recognition and machine learning</i>. Ed. Springer. New York. 2006. pp. 1-738.<!-- end-ref --><br>    <!-- ref --><br>  3. M. 
E. Tipping, C. M. Bishop. "Mixtures of probabilistic principal component analyzers". <i>Neural Computation</i>. Vol. 11. 1999. pp. 443-482.<!-- end-ref --><br>     <!-- ref --><br>  4. A. Sharma, K. K. Paliwal, G. C. Onwubolu. "Pattern Classification: An Improvement Using Combination of VQ and PCA Based Techniques". <i>American Journal of Applied Sciences</i>. Vol. 2. 2005. pp. 1445-1455.<!-- end-ref --><br>    <!-- ref --><br>  5. J. C. Burges. "Geometric Methods for Feature Selection and Dimensional Reduction". In: O. Maimon, L. Rokach (editors). <i>Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers</i>. Ed. Springer. New York. 2006. pp. 59-91.<!-- end-ref --><br>    <!-- ref --><br>  6. L. K. Saul, K. Q. Weinberger, J. H. Ham, F. Sha, D. D. Lee. "Spectral methods for dimensionality reduction". In: O. Chapelle, B. Schölkopf, A. Zien (editors). <i>Semisupervised Learning</i>. Ed. MIT Press. Cambridge. Massachusetts. USA. 2006. pp. 279-294.    
<!-- end-ref --><br>    <!-- ref --><br>   7. J. Venna. <i>Dimensionality reduction for visual exploration of similarity structures</i>. PhD thesis. Helsinki University of Technology. Helsinki. 2007. pp. 11-32.<!-- end-ref --><br>    <!-- ref --><br>  8. L. V. Maaten, E. Postma, J. V. Herik. "Dimensionality Reduction: A Comparative Review". <i>Elsevier Journal of Machine Learning Research</i>. Vol. 10. 2009. pp. 1-41.<!-- end-ref --><br>    <!-- ref --><br>   9. L. Wolf, A. Shashua. "Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach". <i>Journal of Machine Learning Research</i>. Vol. 6. 2005. pp. 1855-1887.<!-- end-ref --><br>    <!-- ref --><br>  10. A. Blum, P. Langley. "Selection of relevant features and examples in machine learning". <i>Artificial Intelligence</i>. Vol. 
97. 1997. pp. 245-271.<!-- end-ref --><br>    <!-- ref --><br>   11. M. A. Turk, A. P. Pentland. "Face Recognition Using Eigenfaces". <i>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</i>. Maui. Hawaii. USA. Vol. 1. 1991. pp. 586-591.<!-- end-ref --><br>    <!-- ref --><br>  12. G. Balcerowska, R. Siuda. "Inelastic background subtraction from a set of angle-dependent XPS spectra using PCA and polynomial approximation". <i>Vacuum</i>. Vol. 54. 1999. pp. 195-199.<!-- end-ref --><br>    <!-- ref --><br>  13. W. Deng, J. Hu, J. Guo, W. Cai, D. Feng. "Robust, accurate and efficient face recognition from a single training image: A uniform pursuit approach". <i>Pattern Recognition</i>. Vol. 43. 2010. pp. 1748-1762.<!-- end-ref --><br>    <!-- ref --><br>  14. Q. Zhao, H. Lu, D. Zhang. 
"A fast evolutionary pursuit algorithm based on linearly combining vectors". <i>Pattern Recognition</i>. Vol. 39. 2006. pp. 310-312.<!-- end-ref --><br>    <!-- ref --><br>   15. L. G. S&aacute;nchez-Giraldo, F. Mart&iacute;nez-Tabares, G. Castellanos-Dom&iacute;nguez. "Functional Feature Selection by Weighted Projections in Pathological Voice Detection". <i>Lecture Notes in Computer Science</i>. Vol. 5856. 2009. pp. 329-336.<!-- end-ref --><br>    <!-- ref --><br>  16. L. G. S&aacute;nchez-Giraldo, G. Castellanos-Dom&iacute;nguez. "Weighted feature extraction with a functional data extension". <i>Neurocomputing</i>. Vol. 73. 2010. pp. 1760-1773.<!-- end-ref --><br>    <!-- ref --><br>  17. D. Skocaj, A. Leonardis. "Weighted and robust incremental method for subspace learning". <i>Proceedings of the Ninth IEEE International Conference on Computer Vision</i>. Vol. 2. 2003. pp. 1494-1501.     
<!-- end-ref --><br>    <!-- ref --><br>  18. M. Tipping, C. Bishop. "Probabilistic principal component analysis". <i>Journal of the Royal Statistical Society. Series B</i>. Vol. 61. 1999. pp. 611-622.<!-- end-ref --><br>    <!-- ref --><br>   19. J. H. Friedman. "Regularized discriminant analysis". <i>Journal of the American Statistical Association</i>. Vol. 84. 1989. pp. 165-175.<!-- end-ref --><br>    <!-- ref --><br>  20. A. R. Webb. <i>Statistical Pattern Recognition</i>. 2nd ed. Ed. John Wiley and Sons. London. 2002. pp. 305-360.<!-- end-ref --><br>    <!-- ref --><br>   21. C. Lai, W. J. Lee, M. Loog, P. Paclik, D. Tax. URL: <a href="http://ict.ewi.tudelft.nl/~davidt/occ/index.html" target="_blank">http://ict.ewi.tudelft.nl/~davidt/occ/index.html</a>. 
Accessed July 1, 2009.</font><!-- end-ref --><br>    <br>       <p><font face="Verdana" size="2">(Received December 17, 2009. Accepted August 31, 2010) </font></p>      <p><font face="Verdana" size="2"><sup>*</sup>Corresponding author: e-mail: <a href="mailto:jcrianoro@unal.edu.co">jcrianoro@unal.edu.co</a> (J. C. Ria&ntilde;o)</font></p>     ]]></body>
<body><![CDATA[ ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Olshausen]]></surname>
<given-names><![CDATA[B. A]]></given-names>
</name>
<name>
<surname><![CDATA[Field]]></surname>
<given-names><![CDATA[D. J]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Emergence of simple-cell receptive field properties by learning a sparse code for natural images]]></article-title>
<source><![CDATA[Nature]]></source>
<year>1996</year>
<volume>381</volume>
<page-range>607-609</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bishop]]></surname>
<given-names><![CDATA[C. M]]></given-names>
</name>
</person-group>
<source><![CDATA[Pattern recognition and machine learning]]></source>
<year>2006</year>
<page-range>1-738</page-range><publisher-loc><![CDATA[New York ]]></publisher-loc>
<publisher-name><![CDATA[Ed. Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tipping]]></surname>
<given-names><![CDATA[M. E]]></given-names>
</name>
<name>
<surname><![CDATA[Bishop]]></surname>
<given-names><![CDATA[C. M]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Mixtures of probabilistic principal component analyzers]]></article-title>
<source><![CDATA[Neural Computation]]></source>
<year>1999</year>
<volume>11</volume>
<page-range>443-482</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sharma]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Paliwal]]></surname>
<given-names><![CDATA[K. K]]></given-names>
</name>
<name>
<surname><![CDATA[Onwubolu]]></surname>
<given-names><![CDATA[G. C]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Pattern Classification: An Improvement Using Combination of VQ and PCA Based Techniques]]></article-title>
<source><![CDATA[American Journal of Applied Sciences]]></source>
<year>2005</year>
<volume>2</volume>
<page-range>1445-1455</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Burges]]></surname>
<given-names><![CDATA[J. C]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Geometric Methods for Feature Selection and Dimensional Reduction]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Maimon]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Rokach]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
</person-group>
<source><![CDATA[Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers]]></source>
<year>2006</year>
<page-range>59-91</page-range><publisher-loc><![CDATA[New York ]]></publisher-loc>
<publisher-name><![CDATA[Ed. Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Saul]]></surname>
<given-names><![CDATA[L. K]]></given-names>
</name>
<name>
<surname><![CDATA[Weinberger]]></surname>
<given-names><![CDATA[K. Q]]></given-names>
</name>
<name>
<surname><![CDATA[Ham]]></surname>
<given-names><![CDATA[J. H]]></given-names>
</name>
<name>
<surname><![CDATA[Sha]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[D. D]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Spectral methods for dimensionality reduction]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Chapelle]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Schölkopf]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Zien]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[Semisupervised Learning]]></source>
<year>2006</year>
<page-range>279-294</page-range><publisher-loc><![CDATA[Cambridge ]]></publisher-loc>
<publisher-name><![CDATA[Ed. MIT Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Venna]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<source><![CDATA[Dimensionality reduction for visual exploration of similarity structures]]></source>
<year>2007</year>
<page-range>11-32</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Maaten]]></surname>
<given-names><![CDATA[L. V]]></given-names>
</name>
<name>
<surname><![CDATA[Postma]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Herik]]></surname>
<given-names><![CDATA[J. V]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Dimensionality Reduction: A Comparative Review]]></article-title>
<source><![CDATA[Elsevier Journal of Machine Learning Research]]></source>
<year>2009</year>
<volume>10</volume>
<page-range>1-41</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wolf]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Shashua]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach]]></article-title>
<source><![CDATA[Journal of Machine Learning Research]]></source>
<year>2005</year>
<volume>6</volume>
<page-range>1855-1887</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Blum]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Langley]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Selection of relevant features and examples in machine learning]]></article-title>
<source><![CDATA[Artificial Intelligence]]></source>
<year>1997</year>
<volume>97</volume>
<page-range>245-271</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Turk]]></surname>
<given-names><![CDATA[M. A]]></given-names>
</name>
<name>
<surname><![CDATA[Pentland]]></surname>
<given-names><![CDATA[A. P]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Face Recognition Using Eigenfaces]]></article-title>
<source><![CDATA[Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition]]></source>
<year>1991</year>
<volume>1</volume>
<page-range>586-591</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Balcerowska]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
<name>
<surname><![CDATA[Siuda]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Inelastic background subtraction from a set of angle-dependent XPS spectra using PCA and polynomial approximation]]></article-title>
<source><![CDATA[Vacuum]]></source>
<year>1999</year>
<volume>54</volume>
<page-range>195-199</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Deng]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Hu]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Guo]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Cai]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Feng]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Robust, accurate and efficient face recognition from a single training image: A uniform pursuit approach]]></article-title>
<source><![CDATA[Pattern Recognition]]></source>
<year>2010</year>
<volume>43</volume>
<page-range>1748-1762</page-range></nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhao]]></surname>
<given-names><![CDATA[Q]]></given-names>
</name>
<name>
<surname><![CDATA[Lu]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A fast evolutionary pursuit algorithm based on linearly combining vectors]]></article-title>
<source><![CDATA[Pattern Recognition]]></source>
<year>2006</year>
<volume>39</volume>
<page-range>310-312</page-range></nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sánchez-Giraldo]]></surname>
<given-names><![CDATA[L. G]]></given-names>
</name>
<name>
<surname><![CDATA[Martínez-Tabares]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Castellanos-Domínguez]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Functional Feature Selection by Weighted Projections in Pathological Voice Detection]]></article-title>
<source><![CDATA[Lecture Notes in Computer Science]]></source>
<year>2009</year>
<volume>5856</volume>
<page-range>329-336</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sánchez-Giraldo]]></surname>
<given-names><![CDATA[L. G]]></given-names>
</name>
<name>
<surname><![CDATA[Castellanos-Domínguez]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Weighted feature extraction with a functional data extension]]></article-title>
<source><![CDATA[Neurocomputing]]></source>
<year>2010</year>
<volume>73</volume>
<page-range>1760-1773</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Skocaj]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Leonardis]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Weighted and robust incremental method for subspace learning]]></article-title>
<source><![CDATA[Proceedings of the Ninth IEEE International Conference on Computer Vision]]></source>
<year>2003</year>
<volume>2</volume>
<page-range>1494-1501</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tipping]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Bishop]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Probabilistic principal component analysis]]></article-title>
<source><![CDATA[Journal of the Royal Statistical Society]]></source>
<year>1999</year>
<volume>61</volume>
<page-range>611-622</page-range></nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Friedman]]></surname>
<given-names><![CDATA[J. H]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Regularized discriminant analysis]]></article-title>
<source><![CDATA[Journal of the American Statistical Association]]></source>
<year>1989</year>
<volume>84</volume>
<page-range>165-175</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Webb]]></surname>
<given-names><![CDATA[A. R]]></given-names>
</name>
</person-group>
<source><![CDATA[Statistical Pattern Recognition]]></source>
<year>2002</year>
<edition>2</edition>
<page-range>305-360</page-range><publisher-loc><![CDATA[London ]]></publisher-loc>
<publisher-name><![CDATA[Ed. John Wiley and Sons]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lai]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[W. J]]></given-names>
</name>
<name>
<surname><![CDATA[Loog]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Paclik]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Tax]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
</person-group>
<source><![CDATA[]]></source>
<year></year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
