Making Principal Components Visible for Teaching and Research
ISBN 978-85-88783-11-9
Authors
Van Der Merwe, F.¹
¹University of Pretoria. Email: fritz.vandermerwe@up.ac.za
Abstract
Principal Component Analysis (PCA) is often used to reveal the internal structure of a complex data set and to reduce data dimensionality in spatial data analysis, remote sensing, meteorology and other computational fields. In Geographic Information Science (GISc), these internal structures, or factors, which are sometimes simpler and easier to understand than the original data, can be used for further analysis or to create understandable thematic maps. It is therefore important for students to understand the technique and be able to use it. The difficulty in teaching or using the technique lies in the inherent multidimensionality of the principal components. It is fairly easy to demonstrate the rotation of the data space in two or three dimensions, but beyond that, understanding depends on the power of the learner’s or researcher’s imagination. Principal components are also notoriously difficult to interpret, which is why they are not used more in research projects. One way to make the technique more understandable and to assist interpretation is to make the effects of principal component transformations visible. The purpose is to show the student an approach that makes interpretation easier and more intuitive through visualisation of the data. To illustrate the process, SAGA GIS is used to isolate and interpret the effect of the main principal components of a set of Landsat multi-spectral images by projecting the components onto a single RGB composite image. The result gives the student and analyst a picture of how the main parts of the spectrum reflected from the different land cover types combine. Different combinations of the components, together with a colour slider, illustrate the weights of the components in the different land cover types.
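The idea of projecting the leading components onto one RGB composite can be sketched outside any GIS package. The NumPy example below is a minimal illustration, not the SAGA GIS workflow itself: the function name `pca_rgb_composite`, the six-band synthetic scene and the simple min-max stretch are all assumptions made for the sketch.

```python
import numpy as np

def pca_rgb_composite(bands):
    """Project a (n_bands, rows, cols) stack onto its first three principal
    components and stretch each one to 0..255 for display as R, G and B."""
    n_bands, rows, cols = bands.shape
    # One row per pixel, one column per spectral band.
    pixels = bands.reshape(n_bands, -1).T.astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Eigen-decomposition of the band-to-band covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs[:, :3]          # component scores per pixel
    # Linear min-max stretch of each component to the display range.
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    rgb = ((scores - lo) / span * 255).astype(np.uint8)
    explained = eigvals / eigvals.sum()         # share of variance per component
    return rgb.reshape(rows, cols, 3), eigvecs[:, :3], explained

# Synthetic stand-in for a six-band scene (assumption: not real Landsat data).
rng = np.random.default_rng(0)
scene = rng.normal(size=(6, 40, 50))
scene[1] += 2.0 * scene[0]                      # make two bands correlated
composite, loadings, explained = pca_rgb_composite(scene)
print(composite.shape, np.round(explained[:3], 3))
```

Each column of `loadings` holds the band weights of one component, so inspecting it alongside the composite shows which parts of the spectrum drive the red, green and blue channels.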
To illustrate how PCA can be used for further analysis, the results of an experiment are discussed. The experiment compares the accuracy of supervised classification using only the isolated components with that of classification using all the spectral data. Attention is directed at the similarity between the land cover and the component colours.
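A comparison of this kind can be set up in a few lines. The abstract does not name the classifier used, so the sketch below swaps in a simple minimum-distance (nearest-centroid) classifier as an assumption. The three class signatures, the noise level and the 50/50 train/test split are hypothetical, and the printed accuracies describe only this synthetic data, not the experiment reported here.

```python
import numpy as np

def fit_centroids(X, y):
    """Mean feature vector of each class in the training samples."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(X, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]

def pca_scores(X, k):
    """Scores of the samples on the k components with the largest variance."""
    mean = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X - mean, rowvar=False))
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return (X - mean) @ top

# Synthetic pixels: three land cover classes, six bands (all hypothetical).
rng = np.random.default_rng(1)
n_per_class, n_bands = 300, 6
signatures = rng.uniform(0, 10, size=(3, n_bands))
X = np.vstack([s + rng.normal(size=(n_per_class, n_bands)) for s in signatures])
y = np.repeat(np.arange(3), n_per_class)

# Random split into training and test pixels.
train = rng.random(len(y)) < 0.5
test = ~train

def accuracy(features):
    classes, centroids = fit_centroids(features[train], y[train])
    predicted = predict(features[test], classes, centroids)
    return float((predicted == y[test]).mean())

# PCA is unsupervised, so it is computed from all pixels, as for a whole image.
acc_all = accuracy(X)
acc_pca = accuracy(pca_scores(X, 3))
print(f"all bands: {acc_all:.3f}  first three components: {acc_pca:.3f}")
```

Changing the number of retained components in `pca_scores` shows how much class separation survives the dimensionality reduction.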
Keywords
Teaching PCA; Visualization and PCA; PCA Classification