Mastering Principal Components Analysis (PCA) Using SPSS Statistics


Principal Components Analysis (PCA) is a statistical technique that reduces the dimensionality of a dataset while preserving as much of its variability as possible. This tutorial walks through conducting PCA in SPSS Statistics: checking the assumptions, interpreting the SPSS output, and reporting the results in APA style. For background, see our previous posts on sphericity and on descriptive and inferential statistics.

Assumptions and Data Requirements

Before performing PCA, ensure that your data meets the following assumptions:

  • Sample size should be adequate. A common rule of thumb is to have at least 5 to 10 observations per variable.
  • Variables should be measured at the interval or ratio level.
  • Variables should have linear relationships with each other.
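  • The data should be approximately normally distributed. PCA as a descriptive technique does not strictly require this, but Bartlett’s test assumes it, and strong skew or outliers can distort the correlations the analysis is based on.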

Example Dataset

For this example, we will use a dataset containing measurements of different attributes from various products. The variables include weight, height, width, and depth.

Conducting PCA in SPSS

  1. Open SPSS and load your dataset.
  2. Navigate to Analyze > Dimension Reduction > Factor.
  3. In the Factor Analysis dialog box, move your variables (e.g., weight, height, width, depth) to the Variables box.
  4. Click on the Descriptives button, select KMO and Bartlett’s test of sphericity, then click Continue.
  5. Click on the Extraction button, select Principal components as the method, choose to extract based on eigenvalues greater than 1, tick Scree plot, then click Continue.
  6. Click on the Rotation button, select Varimax, then click Continue.
  7. Click OK to run the analysis.
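
For readers who want to see what the extraction step computes, here is a minimal Python sketch using NumPy. It is not SPSS's implementation: it only mirrors the default approach of analysing the correlation matrix, and it leaves out the Varimax rotation. The data are randomly generated stand-ins for the four product measurements, and the function name `pca_correlation` is made up for this illustration.

```python
import numpy as np

def pca_correlation(data):
    """PCA on the correlation matrix of `data` (rows = cases, columns = variables)."""
    corr = np.corrcoef(data, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]          # largest eigenvalue first
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)
    eigenvectors = eigenvectors[:, order]
    # Unrotated loadings: each eigenvector scaled by the square root of its eigenvalue
    loadings = eigenvectors * np.sqrt(eigenvalues)
    return eigenvalues, loadings

# Hypothetical data: 200 cases, 4 variables sharing one common "size" factor
rng = np.random.default_rng(0)
size = rng.normal(size=(200, 1))
data = size + 0.5 * rng.normal(size=(200, 4))

eigenvalues, loadings = pca_correlation(data)
n_retained = int(np.sum(eigenvalues > 1.0))        # eigenvalue-greater-than-1 rule
print("Eigenvalues:", eigenvalues.round(3))
print("Components with eigenvalue > 1:", n_retained)
```

Because the analysis runs on a correlation matrix, the eigenvalues always sum to the number of variables (4 here), which is why an eigenvalue of 1 is a natural cutoff: it is the amount of variance contributed by one standardized variable.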

SPSS Output and Interpretation

After running PCA, SPSS provides several output tables. Below, we discuss and interpret the key tables.

KMO and Bartlett’s Test

Test                                                  Value
Kaiser-Meyer-Olkin Measure of Sampling Adequacy       0.75
Bartlett’s Test of Sphericity: Approx. Chi-Square     50.123
Bartlett’s Test of Sphericity: df                     6
Bartlett’s Test of Sphericity: Sig.                   0.001

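The KMO value of 0.75 falls in the range Kaiser labelled “middling” (0.70 to 0.79), which is acceptable for PCA. Bartlett’s Test of Sphericity is significant (p = .001), so the null hypothesis that the correlation matrix is an identity matrix is rejected: the correlations between variables are large enough for PCA to be worthwhile.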
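
Both statistics can be computed from the correlation matrix alone. The sketch below uses the standard textbook formulas, not SPSS's own code, and the correlation matrix and sample size in it are hypothetical (every pair of variables correlated at 0.5, n = 50). Note that with four variables the degrees of freedom come out as 4 × 3 / 2 = 6, matching the df in the table above.

```python
import numpy as np

def bartlett_sphericity(corr, n):
    """Bartlett's chi-square statistic and df for a p x p correlation matrix and n cases."""
    p = corr.shape[0]
    _, logdet = np.linalg.slogdet(corr)            # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi_square, df

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)        # partial correlations (off-diagonal)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)               # squared correlations
    q2 = np.sum(partial[off_diag] ** 2)            # squared partial correlations
    return r2 / (r2 + q2)

# Hypothetical correlation matrix: four variables, all pairwise r = 0.5
corr = np.full((4, 4), 0.5)
np.fill_diagonal(corr, 1.0)

chi_square, df = bartlett_sphericity(corr, n=50)   # n = 50 is an assumed sample size
kmo_value = kmo(corr)
print(f"Bartlett chi-square = {chi_square:.3f}, df = {df}, KMO = {kmo_value:.2f}")
```

The p-value for Bartlett's test is the upper tail of a chi-square distribution with `df` degrees of freedom; if SciPy is available, `scipy.stats.chi2.sf(chi_square, df)` returns it.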

Total Variance Explained

Component   Initial Eigenvalue   % of Variance   Cumulative %
1           2.543                63.575          63.575
2           0.987                24.675          88.250
3           0.267                6.675           94.925
4           0.203                5.075           100.000

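The table shows that only the first component has an eigenvalue greater than 1 (2.543); the second falls just below the cutoff at 0.987. Under a strict eigenvalue-greater-than-1 rule, a single component explaining 63.58% of the variance would be extracted. Retaining the second component as well, as this example does, raises the variance explained to 88.25%, which suggests that two components summarize the data effectively. To extract two components in SPSS when the second eigenvalue is below 1, set a fixed number of factors (2) in the Extraction dialog instead of relying on the eigenvalue rule.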
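
The percentage columns follow directly from the eigenvalues: each eigenvalue is divided by their total (4, the number of variables) and the results are accumulated. A short Python sketch using the values from the table above reproduces them and applies the eigenvalue-greater-than-1 rule to the same numbers; the variable names are chosen for this illustration only.

```python
from itertools import accumulate

# Initial eigenvalues as reported in the Total Variance Explained table
eigenvalues = [2.543, 0.987, 0.267, 0.203]

total = sum(eigenvalues)                            # equals the number of variables
pct_variance = [100 * ev / total for ev in eigenvalues]
cumulative_pct = list(accumulate(pct_variance))

# Components whose eigenvalue exceeds 1 (numbered from 1)
kaiser_retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

for i, (ev, pct, cum) in enumerate(zip(eigenvalues, pct_variance, cumulative_pct), start=1):
    print(f"Component {i}: eigenvalue {ev:.3f}, {pct:.3f}% of variance, {cum:.3f}% cumulative")
print("Eigenvalue > 1:", kaiser_retained)
```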

Component Matrix

Variable   Component 1   Component 2
Weight     0.879         0.326
Height     0.852         -0.214
Width      0.802         0.414
Depth      0.774         -0.437

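The Component Matrix shows the loading of each variable on each component, that is, the correlation between the variable and the component. The larger the absolute loading, the more strongly the variable is associated with that component. Here all four variables load highly on Component 1 (0.774 to 0.879), while Component 2 separates Weight and Width (positive loadings) from Height and Depth (negative loadings). These are unrotated loadings; because Varimax rotation was requested, SPSS also prints a Rotated Component Matrix, which is normally the one to interpret.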
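
One quantity that is easy to derive from the loadings is each variable's communality: the sum of its squared loadings across the retained components, which gives the proportion of that variable's variance the components account for. The Python sketch below applies this to the loadings in the table above; the dictionary layout is just a convenient choice for this illustration.

```python
# Loadings from the Component Matrix: (Component 1, Component 2)
loadings = {
    "Weight": (0.879, 0.326),
    "Height": (0.852, -0.214),
    "Width": (0.802, 0.414),
    "Depth": (0.774, -0.437),
}

# Communality = sum of squared loadings across the retained components
communalities = {
    name: sum(value ** 2 for value in row) for name, row in loadings.items()
}

for name, h2 in communalities.items():
    print(f"{name}: communality = {h2:.3f}")
```

A communality close to 1 means the two components reproduce that variable almost completely, while a low value flags a variable the solution represents poorly.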

Reporting Results in APA Style

When reporting PCA results in APA style, include the following elements:

  • A brief description of PCA and its purpose.
  • Details about the dataset and sample size.
  • Results of the KMO and Bartlett’s tests.
  • The number of components retained and the variance explained.
  • Interpretation of the component loadings.

For example:

Principal Components Analysis (PCA) was conducted to reduce the dimensionality of the dataset containing measurements of Weight, Height, Width, and Depth. The Kaiser-Meyer-Olkin measure verified the sampling adequacy for the analysis, KMO = .75, and Bartlett’s test of sphericity, χ²(6) = 50.12, p = .001, indicated that correlations between items were sufficiently large for PCA. One component had an eigenvalue over Kaiser’s criterion of 1 (2.54), and a second fell just below it (0.99); together the two explained 88.25% of the variance. The scree plot also showed an inflection that justified retaining two components. All four variables loaded strongly on Component 1 (loadings from .77 to .88).

Conclusion

Principal Components Analysis is a valuable tool for data reduction, allowing researchers to simplify complex datasets while retaining essential information. By following the steps outlined in this tutorial and interpreting the SPSS output correctly, you can effectively use PCA in your research projects. For more detailed tutorials, consider exploring our posts on sphericity and descriptive and inferential statistics.

