Paper
22 March 1999 Recent progresses of neural network unsupervised learning: I. Independent component analyses generalizing PCA
Abstract
The early-vision principle of redundancy reduction of $10^8$ sensor excitations is understandable from the computer-vision viewpoint as a move toward sparse edge maps. It has only recently been derived using a truly unsupervised learning paradigm of artificial neural networks (ANN). In fact, the Hubel-Wiesel edge maps of biological vision are reproduced by seeking the underlying independent components (independent component analysis, ICA) among $10^2$ image samples, maximizing the ANN output entropy through $\partial H(V)/\partial [W] = \partial [W]/\partial t$. When a pair of newborn eyes or ears meets the hustling and bustling world without supervision, it seeks ICA by comparing two sensory measurements, $(x_1(t), x_2(t))^T \equiv X(t)$. Assuming a linear and instantaneous mixture model of the external world, $X(t) = [A]\,S(t)$, where both the mixing matrix $[A] \equiv [a_1, a_2]$ of ICA vectors and the source percentages $(s_1(t), s_2(t))^T \equiv S(t)$ are unknown, we seek independent sources satisfying $\langle S(t)\,S^T(t)\rangle \approx [I]$, where the approximate sign indicates that higher-order statistics (HOS) may not be trivial. Without a teacher, the ANN weight matrix $[W] \equiv [w_1, w_2]$ adjusts the outputs $V(t) = \tanh([W]X(t)) \approx [W]X(t)$ until no desired outputs remain except the (Gaussian) 'garbage' (neither YES '1' nor NO '-1', but in the linear 'maybe' range around the origin '0'), defined by the Gaussian covariance $\langle V(t)\,V^T(t)\rangle_G = [I] = [W][A]\,\langle S(t)\,S^T(t)\rangle\,[A]^T[W]^T$. Thus the ANN obtains $[W][A] \approx [I]$ without an explicit teacher and discovers the internal knowledge representation $[W]$ as the inverse of the external world matrix, $[A]^{-1}$. To unify the ICA, PCA, ANN, and HOS theories developed since 1991 (advanced by Jutten & Herault, Comon, Oja, Bell-Sejnowski, Amari-Cichocki, and Cardoso), the Lyapunov function $L(v_1,\ldots,v_n, w_1,\ldots,w_n) = E(v_1,\ldots,v_n) - H(w_1,\ldots,w_n)$ is constructed as the Helmholtz free energy to prove the convergence of both supervised energy ($E$) and unsupervised entropy ($H$) learning.
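The two-sensor de-mixing described above can be illustrated with a small, dependency-free Python sketch. It is not the paper's implementation: the mixing matrix, the Laplacian stand-in sources, the step size, and all function names are assumptions made for illustration, and the entropy ascent on V = tanh([W]X) is written in the natural-gradient form, W <- W + rate (I - <2 tanh(U) U^T>) W with U = [W]X, to avoid a matrix inverse.

```python
import math
import random


def laplacian_sources(n, seed=0):
    """Draw n pairs of independent, unit-variance Laplacian sources (hypothetical test data)."""
    rng = random.Random(seed)

    def draw():
        sign = 1.0 if rng.random() < 0.5 else -1.0
        return sign * rng.expovariate(1.0) / math.sqrt(2.0)

    return [(draw(), draw()) for _ in range(n)]


def mix(A, S):
    """Linear, instantaneous mixture X(t) = [A] S(t) for two channels."""
    (a11, a12), (a21, a22) = A
    return [(a11 * s1 + a12 * s2, a21 * s1 + a22 * s2) for s1, s2 in S]


def matmul2(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def learn_demixing(X, steps=300, rate=0.1):
    """Batch natural-gradient entropy ascent for V = tanh(W X); no teacher signal is used."""
    W = [[1.0, 0.0], [0.0, 1.0]]
    n = len(X)
    for _ in range(steps):
        c11 = c12 = c21 = c22 = 0.0
        for x1, x2 in X:
            u1 = W[0][0] * x1 + W[0][1] * x2
            u2 = W[1][0] * x1 + W[1][1] * x2
            f1 = 2.0 * math.tanh(u1)
            f2 = 2.0 * math.tanh(u2)
            c11 += f1 * u1
            c12 += f1 * u2
            c21 += f2 * u1
            c22 += f2 * u2
        # G = I - <2 tanh(U) U^T>, averaged over the batch
        G = [[1.0 - c11 / n, -c12 / n], [-c21 / n, 1.0 - c22 / n]]
        D = matmul2(G, W)
        W = [[W[i][j] + rate * D[i][j] for j in range(2)] for i in range(2)]
    return W


if __name__ == "__main__":
    A = [[1.0, 0.6], [0.5, 1.0]]          # assumed "external world" mixing matrix
    X = mix(A, laplacian_sources(2000))   # only X is shown to the learner
    W = learn_demixing(X)
    print(matmul2(W, A))                  # close to a scaled permutation matrix
```

Because ICA recovers sources only up to order, sign, and scale, the product [W][A] is expected to approach a scaled permutation matrix rather than the identity exactly.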
Consequently, rather than using the faithful but dumb computer ('GARBAGE-IN, GARBAGE-OUT'), the smarter neurocomputer will be equipped with unsupervised learning that extracts 'RAW INFO-IN, (until) GARBAGE-OUT' for sensory knowledge acquisition, enhancing machine IQ. We must go beyond the least-mean-square (LMS) error energy and apply HOS to ANN. We begin with auto-regression (AR), which extrapolates from the past $X(t)$ to the future, $u_i(t+1) = w_i^T X(t)$, by varying the weight vector to minimize the LMS error energy $E = \langle [x(t+1) - u_i(t+1)]^2 \rangle$; at the fixed point $\partial E/\partial w_i = 0$, this results in an exact Toeplitz matrix inversion under a stationary covariance assumption. We generalize AR with a nonlinear output $v_i(t+1) = \tanh(w_i^T X(t))$ within $E = \langle [x(t+1) - v_i(t+1)]^2 \rangle$ and the gradient descent $\partial E/\partial w_i = -\partial w_i/\partial t$. Further generalization is possible because a specific image or speech signal has a specific histogram whose gray-scale statistics depart from those of a Gaussian random variable, and the departure can be measured by the fourth-order cumulant, the kurtosis $K(v_i) = \langle v_i^4\rangle - 3\langle v_i^2\rangle^2$ ($K \ge 0$, super-Gaussian, for speech; $K \le 0$, sub-Gaussian, for images). Thus the stationary value reached via $\partial K/\partial w_i = \pm 4\,\partial w_i/\partial t$ can de-mix unknown mixtures of noisy images or speech without a teacher. These stationary statistics may be implemented in parallel using the 'factorized pdf code' $\rho(v_1, v_2) = \rho(v_1)\,\rho(v_2)$, which occurs in a maximum-entropy algorithm improved by the natural gradient of Amari. Real-world applications are given in Part II (Wavelet Appl. VI, SPIE Proc. Vol. 3723), such as remote-sensing subpixel composition, speech segmentation by means of ICA de-hyphenation, and cable TV bandwidth enhancement by simultaneously mixing sports and movie entertainment events.
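The kurtosis criterion above can be checked numerically with a short sketch. The function name and the three synthetic distributions are assumptions chosen for illustration (a Laplacian as a stand-in for super-Gaussian signals, a uniform variable as a stand-in for sub-Gaussian ones); no actual speech or image data is involved, and the sample mean is removed first so that the raw-moment formula K(v) = <v^4> - 3<v^2>^2 applies.

```python
import math
import random


def kurtosis(v):
    """Fourth-order cumulant K(v) = <v^4> - 3 <v^2>^2 of mean-removed samples."""
    n = len(v)
    mean = sum(v) / n
    centered = [x - mean for x in v]
    m2 = sum(x * x for x in centered) / n
    m4 = sum(x ** 4 for x in centered) / n
    return m4 - 3.0 * m2 * m2


def sample_sets(n=50000, seed=1):
    """Unit-variance Gaussian, Laplacian, and uniform samples (synthetic stand-ins)."""
    rng = random.Random(seed)
    gauss = [rng.gauss(0.0, 1.0) for _ in range(n)]
    laplace = [(1.0 if rng.random() < 0.5 else -1.0)
               * rng.expovariate(1.0) / math.sqrt(2.0) for _ in range(n)]
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance
    uniform = [rng.uniform(-half_width, half_width) for _ in range(n)]
    return gauss, laplace, uniform


if __name__ == "__main__":
    for name, data in zip(("gaussian", "laplacian", "uniform"), sample_sets()):
        print(name, round(kurtosis(data), 3))
```

For unit-variance data the theoretical values are 0 for the Gaussian, 3 for the Laplacian, and -1.2 for the uniform case, so the sign of K separates the super-Gaussian and sub-Gaussian regimes named in the abstract.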
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Harold H. Szu "Recent progresses of neural network unsupervised learning: I. Independent component analyses generalizing PCA", Proc. SPIE 3722, Applications and Science of Computational Intelligence II, (22 March 1999); https://doi.org/10.1117/12.342876
KEYWORDS
Sensors

Machine learning

Independent component analysis

Principal component analysis

Neurons

Neural networks

Positron emission tomography
