17
Dealing with Complexity: Data Reduction and Clustering
This is certainly an age in which we do not suffer from a scarcity of data. Using information infrastructures and the Web, we may collect plenty of observations of many variables, resulting in rich datasets waiting for analysis, perhaps too rich. We sometimes need to simplify data in order to visualize them, to discover patterns, and to make decisions based on them. In this chapter we outline some of the most relevant techniques, which have several applications in supply chain management, marketing, finance, and related fields.

First, we motivate the need for data reduction in Section 17.1; this is often a preliminary step that makes the application of other quantitative methods possible. Principal component analysis (PCA), the subject of Section 17.2, is a nice illustration of the role played by linear algebra in multivariate statistics; be sure to master the material on eigenvalues and eigenvectors from Chapter 3 before getting here. Section 17.3 illustrates factor analysis, which shares some of the technical machinery of PCA but takes a different view. Factor analysis is one of the statistical techniques that try to find latent, i.e., not directly observable, variables that may help in understanding an otherwise too complicated phenomenon. The chapter closes with Section 17.4, which outlines a range of techniques collectively known as cluster analysis. This set of methods aims at grouping observations into similar clusters, ...
From Quantitative Methods: An Introduction for Business Management.