Least Squares Factor Analysis is a statistical method used in exploratory factor analysis (EFA) that aims to identify underlying latent factors by minimizing the discrepancies between observed correlations and those predicted by the factor model. It estimates factor loadings and unique variances by fitting the model to the data, ensuring the "best fit" in a least squares sense.
This approach is popular because it does not require assumptions of multivariate normality, making it a robust alternative to Maximum Likelihood Factor Analysis, especially with non-normal data.
The Core Principle of Least Squares
At its heart, "least squares" refers to the mathematical principle of minimizing the sum of the squared differences (or residuals) between the observed data (e.g., correlations between variables) and the data reproduced by the proposed factor model. The goal is to find factor loadings that result in the smallest possible overall error, thus providing the most accurate representation of the relationships among variables.
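To make this concrete, here is a minimal numerical sketch of the quantity least squares minimizes, using a hypothetical three-variable, one-factor model (the correlation matrix and loadings below are illustrative toys, not real data):

```python
import numpy as np

# Toy observed correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

loadings = np.array([[0.8], [0.7], [0.6]])           # candidate factor loadings
uniqueness = 1.0 - (loadings ** 2).sum(axis=1)       # unique variances
R_hat = loadings @ loadings.T + np.diag(uniqueness)  # reproduced correlations

residuals = R - R_hat
ssr = np.sum(residuals ** 2)  # the sum of squared residuals to be minimized
```

The factor extraction routine searches over loadings and uniquenesses to make `ssr` as small as possible.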
Types of Least Squares Factor Analysis
Within the broad category of least squares factor analysis, two prominent methods are often distinguished:
1. Unweighted Least Squares (ULS) / Minimum Residuals (Minres)
- Description: Unweighted Least Squares (ULS), often implemented as Minres (Minimum Residuals), is the most straightforward form. It minimizes the sum of the squared differences between the observed and reproduced correlation matrices without applying any differential weighting to these differences. Every residual is treated equally.
- Strengths:
- Computationally simple and efficient.
- Does not assume multivariate normality of the data.
- Can be a good choice for initial factor extraction.
- Limitations:
- Does not typically provide a statistical test of goodness-of-fit, such as a chi-square test, making it harder to formally assess model fit compared to other methods.
- Can sometimes lead to "Heywood cases" where a variable's communality (the proportion of its variance explained by the common factors) exceeds 1.0.
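A minimal Minres-style sketch, assuming a toy correlation matrix with an exact one-factor structure: the optimizer searches for loadings that minimize the sum of squared off-diagonal residuals, ignoring the diagonal as Minres does.

```python
import numpy as np
from scipy.optimize import minimize

# Toy correlation matrix that happens to admit a perfect one-factor fit.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
mask = ~np.eye(3, dtype=bool)  # Minres ignores the diagonal

def uls_loss(lam):
    """Sum of squared off-diagonal residuals for one-factor loadings lam."""
    resid = R - np.outer(lam, lam)
    return np.sum(resid[mask] ** 2)

fit = minimize(uls_loss, x0=np.full(3, 0.5))
loadings = fit.x  # for this toy R the residual loss goes to ~0
```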
2. Generalized Least Squares (GLS)
- Description: Generalized Least Squares (GLS) factoring refines the ULS approach by weighting the residuals in the minimization: each correlation is weighted inversely to the uniqueness of the variables involved. Variables with higher uniqueness (i.e., more variance not explained by the common factors) therefore receive less weight and have less influence on the overall factor solution, while variables that share more common variance are weighted more heavily.
- Strengths:
- More statistically rigorous than ULS.
- Does not strictly require multivariate normality, making it more robust than Maximum Likelihood Factor Analysis when assumptions are violated.
- Generates a chi-square goodness-of-fit test, which allows for a statistical evaluation of how well the factor model reproduces the observed correlation matrix.
- Often provides more stable and reliable factor solutions than ULS, especially when variables have differing levels of uniqueness.
- Limitations:
- More complex computationally than ULS.
- While more robust than ML, extreme deviations from normality can still affect the accuracy of the chi-square test.
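As a hedged sketch, one standard matrix formulation of the GLS discrepancy weights the residuals through the inverse of the observed correlation matrix, which has the effect of downweighting high-uniqueness variables. The matrix `R` and the loadings below are illustrative toys:

```python
import numpy as np

# One common GLS discrepancy: F = 1/2 * tr[(I - R_inv @ Sigma)^2],
# where Sigma is the model-implied matrix. Weighting through R_inv
# downweights variables with high uniqueness.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
R_inv = np.linalg.inv(R)

def gls_loss(loadings, uniqueness):
    sigma = np.outer(loadings, loadings) + np.diag(uniqueness)
    M = np.eye(3) - R_inv @ sigma
    return 0.5 * np.trace(M @ M)

# This toy R has an exact one-factor solution, which drives F to zero.
lam = np.array([0.86602540, 0.69282032, 0.57735027])
value = gls_loss(lam, 1.0 - lam ** 2)
```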
How it Works: A Step-by-Step Overview
Least squares factor analysis generally involves an iterative process to arrive at the factor solution:
1. Initial Estimates: Start with initial estimates for the factor loadings and unique variances.
2. Reproduced Correlation Matrix: Use these estimates to calculate a "reproduced" or model-implied correlation matrix.
3. Calculate Residuals: Compare the observed correlation matrix with the reproduced matrix to find the differences (residuals).
4. Minimize Squared Residuals: Adjust the factor loadings and unique variances to reduce the sum of these squared residuals. For GLS, the residuals are weighted based on the inverse of variable uniqueness.
5. Iterate: Repeat steps 2-4 until the changes in the factor loadings and minimized residuals are negligible, indicating convergence to a stable solution.
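The loop above can be sketched with iterated principal-axis extraction, a classic way to carry out unweighted least squares fitting; the correlation matrix and single-factor setup are illustrative toys:

```python
import numpy as np

# Toy correlation matrix with a single underlying factor.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# Step 1: initial communality estimates (squared multiple correlations).
communality = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
for _ in range(500):
    # Step 2: "reduced" matrix with communalities on the diagonal.
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, communality)
    vals, vecs = np.linalg.eigh(R_reduced)
    lam = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))  # one-factor loadings
    # Step 4: updated communalities from the new loadings.
    new_comm = lam ** 2
    # Step 5: stop when the estimates no longer change.
    if np.max(np.abs(new_comm - communality)) < 1e-9:
        break
    communality = new_comm

# Step 3's residuals at the converged solution are near zero here.
residual = R - (np.outer(lam, lam) + np.diag(1.0 - communality))
```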
Practical Applications and Benefits
Least squares factor analysis is a valuable tool across various fields:
- Psychology and Education: Developing and validating psychological scales, personality inventories, and educational tests by identifying underlying constructs (e.g., intelligence, anxiety).
- Social Sciences: Uncovering latent variables in survey data, such as public opinion, political attitudes, or social class indicators.
- Market Research: Segmenting customers based on underlying preferences or behaviors, or identifying key dimensions of product perception.
- Healthcare: Understanding symptom clusters or quality of life dimensions.
The primary benefit is that it does not rely on strict normality assumptions, making it a flexible and robust option for exploring complex data structures. The chi-square test provided by GLS also gives researchers a statistical measure to evaluate the model's fit to the data.
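As a hedged sketch of how a GLS fit statistic is typically formed: the minimized discrepancy is scaled by sample size, chi-square = (N − 1) × F_min, with degrees of freedom ((p − m)² − (p + m))/2 for p variables and m factors. The `F_min` value below is an assumed illustrative number, not the result of a real fit.

```python
from scipy.stats import chi2

N, p, m = 300, 6, 1   # hypothetical sample size, variables, factors
F_min = 0.031         # hypothetical minimized GLS discrepancy

chi_sq = (N - 1) * F_min
df = ((p - m) ** 2 - (p + m)) // 2
p_value = chi2.sf(chi_sq, df)  # a non-significant result suggests adequate fit
```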
Comparing Least Squares, ULS, GLS, and Maximum Likelihood
To further understand Least Squares Factor Analysis, it's helpful to see how its variants (ULS, GLS) compare to the widely used Maximum Likelihood (ML) method.
| Feature | Unweighted Least Squares (ULS) / Minres | Generalized Least Squares (GLS) | Maximum Likelihood (ML) |
|---|---|---|---|
| Weighting | No differential weighting of residuals | Residuals weighted inversely proportional to variable uniqueness | Residuals effectively weighted by the inverse of the model-implied covariance matrix |
| Normality assumption | Not required | Not strictly required; more robust to non-normality than ML | Assumes multivariate normality of observed variables |
| Chi-square test | Generally not provided | Yes | Yes |
| Objective | Minimize the sum of squared differences (residuals) | Minimize the weighted sum of squared differences (residuals) | Maximize the likelihood of the observed data given the model |
| Robustness | Good | Good, especially with non-normal data | Sensitive to violations of normality assumptions |
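A hedged sketch contrasting the three fit functions in the table, evaluated at the same illustrative candidate solution for a toy correlation matrix (the GLS form here is a common matrix formulation that weights residuals through the inverse of the observed correlation matrix):

```python
import numpy as np

R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
p = 3
lam = np.array([0.9, 0.7, 0.5])                       # illustrative loadings
sigma = np.outer(lam, lam) + np.diag(1.0 - lam ** 2)  # model-implied matrix

F_uls = 0.5 * np.sum((R - sigma) ** 2)                # unweighted residuals
M = np.eye(p) - np.linalg.inv(R) @ sigma
F_gls = 0.5 * np.trace(M @ M)                         # residuals weighted via R^-1
A = R @ np.linalg.inv(sigma)
F_ml = np.trace(A) - np.log(np.linalg.det(A)) - p     # likelihood-based discrepancy
```

All three discrepancies are zero only when the model-implied matrix reproduces `R` exactly; they differ in how departures from `R` are penalized.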
Least Squares Factor Analysis, particularly GLS, provides a strong, flexible approach for discovering underlying latent structures in data when multivariate normality cannot be assumed. It offers a crucial alternative to maximum likelihood methods while still providing statistical tests of model fit.