StatTools : Factor Analysis - Parallel Analysis Explained, Tables, and Program




Parallel Analysis is a procedure sometimes used to determine the number of Factors or Principal Components to retain in the initial stage of Exploratory Factor Analysis.

This discussion assumes that the user understands Factor Analysis and the procedure of Principal Component extraction, and no details for these are provided here.

A critical decision in Exploratory Factor Analysis is how many Principal Components to retain, as each successive extraction produces a less significant Factor. Retaining too few loses information in the data, and retaining too many includes trivial and random information. Either error produces misleading and unreproducible results.

Traditionally, researchers depend on one or more of the following criteria to determine how many components to retain.

  • The most common, and the default criterion in most statistical packages, is the K1 rule. Principal Components are retained while the Eigen value (the variance associated with the component) is >=1. This is based on the argument that, in a correlation matrix, each variable contributes a variance of 1, so a component that accounts for less than that has no meaning and should be discarded. The criticism of the K1 rule is that the Eigen values are inflated by random associations in the data, so the K1 rule often retains more components or factors than appropriate, particularly when the sample size is small.
  • Another rule is the Scree Test. The Eigen values are plotted against component number, and from the point where the sharp decrease in Eigen values levels off (the scree), the remaining components are abandoned. This is based on the argument that the initial and significant components each extract a large proportion of the variance from the correlation matrix, while the insignificant ones contain mostly data noise, so their Eigen values are similar. The criticism of the Scree Test is that it depends on eyeballing the plot, and when there is no sharp transition, researchers often disagree on where the scree begins.
  • Some researchers try different retention levels and match the results with the theoretical model of the data. This does produce neat outcomes, but runs the risk that, if the underlying theory is flawed to start with, the results tend not to be reproducible.
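
As a minimal sketch of the K1 rule described above, the Python snippet below counts the components whose Eigen value is at least 1. The function name `k1_retained` and the six Eigen values are hypothetical, chosen only for illustration (they sum to 6, as the Eigen values of a 6-variable correlation matrix would); they do not come from any real data set.

```python
def k1_retained(eigenvalues):
    """Count components kept by the K1 rule (Eigen value >= 1)."""
    # Sort descending so the count matches "retain while Eigen value >= 1".
    ordered = sorted(eigenvalues, reverse=True)
    return sum(1 for value in ordered if value >= 1.0)


# Hypothetical Eigen values for a 6-variable correlation matrix (sum = 6).
example_eigenvalues = [2.80, 1.40, 1.05, 0.40, 0.25, 0.10]

n_kept = k1_retained(example_eigenvalues)
print("Components retained by K1:", n_kept)
```

Note that the third component, with an Eigen value of 1.05, is retained even though it barely clears the cut-off. This is the kind of borderline case where inflation by random association matters.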

Parallel Analysis takes a different approach, and is based on Monte Carlo simulation. A data set of random numbers, with the same sample size and number of variables as the user's research data, is subjected to the same analysis, and the Eigen values obtained are recorded. This is repeated many times (often 50 to 100 iterations; the tables later on this page used 1000 iterations).
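
The simulation step can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the program used for the tables on this page: the random numbers are assumed to be independent standard Normal draws, the Eigen values are taken from the correlation matrix of each random data set, and the function name `simulate_random_eigenvalues` and its parameters are hypothetical.

```python
import numpy as np


def simulate_random_eigenvalues(n_samples, n_variables, n_iterations=100, seed=0):
    """Eigen values of correlation matrices computed from random data.

    Returns an array of shape (n_iterations, n_variables); each row holds
    the Eigen values of one random data set, sorted largest first.
    Assumption: random data are independent standard Normal draws.
    """
    rng = np.random.default_rng(seed)
    results = np.empty((n_iterations, n_variables))
    for i in range(n_iterations):
        data = rng.standard_normal((n_samples, n_variables))
        corr = np.corrcoef(data, rowvar=False)  # columns are variables
        eig = np.linalg.eigvalsh(corr)          # ascending order
        results[i] = eig[::-1]                  # store largest first
    return results


# Example: 50 cases, 6 variables, 100 replications.
sim = simulate_random_eigenvalues(n_samples=50, n_variables=6, n_iterations=100)
print(sim.mean(axis=0))  # mean random Eigen value for each component
```

Even though the data are pure noise, the first random Eigen value is always at least 1, which illustrates why the K1 rule tends to retain too many components.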

The mean and Standard Deviation of the replicated Eigen values for each component are then calculated, from which the 95th percentile value is estimated (95th percentile = mean + 1.65SD, which assumes the replicated values are approximately Normally distributed). These form the standard against which the Eigen value of each component from the research data is compared, and a component is retained if its Eigen value exceeds the 95th percentile of the simulated values. The argument is that a component should be retained only if its Eigen value is clearly greater (at the 95th percentile) than that obtained at random.
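
The comparison step can be sketched with Python's standard `statistics` module. Everything here is illustrative: the function names are hypothetical, the four replicates and the observed Eigen values are made-up numbers, the sample Standard Deviation (n-1 denominator) is assumed, and counting stops at the first component that fails the test, which is one common convention rather than something this page specifies.

```python
from statistics import mean, stdev


def percentile95_thresholds(simulated):
    """Mean + 1.65 SD of the simulated Eigen values, per component.

    simulated: list of replicates, each a list of Eigen values
    sorted largest first. Uses the sample SD (n-1), an assumption.
    """
    by_component = zip(*simulated)  # regroup values by component position
    return [mean(values) + 1.65 * stdev(values) for values in by_component]


def count_retained(observed, thresholds):
    """Count leading components whose Eigen value exceeds its threshold."""
    kept = 0
    for value, limit in zip(observed, thresholds):
        if value <= limit:
            break  # assumption: stop at the first component that fails
        kept += 1
    return kept


# Hypothetical replicates (4 random data sets, 3 variables each).
simulated = [
    [1.30, 1.00, 0.70],
    [1.40, 1.05, 0.55],
    [1.20, 0.95, 0.85],
    [1.30, 1.00, 0.70],
]
# Hypothetical Eigen values from the research data.
observed = [1.90, 1.02, 0.08]

thresholds = percentile95_thresholds(simulated)
n_retained = count_retained(observed, thresholds)
print([round(t, 3) for t in thresholds], n_retained)
```

With these made-up numbers the second component has an Eigen value of 1.02, so the K1 rule would keep it, but it falls below its simulated threshold of about 1.067 and Parallel Analysis retains only the first component.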