What is the difference between factor analysis and discriminant analysis?

The core difference between Factor Analysis and Discriminant Analysis lies in their primary objectives: Factor Analysis aims to uncover underlying, unobservable (latent) variables and reduce data dimensionality, while Discriminant Analysis focuses on predicting group membership based on observed variables.

Understanding Factor Analysis

Factor Analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors or latent variables. It is an exploratory or confirmatory technique that seeks to simplify complex datasets.

Key Objectives of Factor Analysis:

- Identify Latent Variables: Its primary goal is to find underlying constructs or factors that explain the patterns of correlations among a set of observed variables. For instance, if you have many questions on a survey, Factor Analysis can help identify a few core themes (factors) that those questions measure.
- Data Reduction: When dealing with a large number of variables, Factor Analysis can reduce the dataset to a smaller set of factors that capture most of the variance the original variables share. This is particularly useful for preparing data for other analyses.
- Structure Exploration: It helps in understanding the fundamental structure of a set of variables, revealing how they group together.

How Factor Analysis Works:

Factor Analysis assumes that each observed variable is a linear combination of a small number of unobserved factors plus a variable-specific error term. It examines the correlations among the variables to identify these common underlying dimensions.
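The idea that shared factors produce the observed correlations can be illustrated with a minimal sketch. It assumes the simplest possible case: a single factor, three standardized variables, and made-up loadings of 0.9, 0.8 and 0.7. Under that assumption the correlation between any two variables is roughly the product of their loadings, so the loadings can be recovered from the three pairwise correlations alone (the classical "triad" calculation). The function names and the simulated data are hypothetical, and statistical software typically uses other estimation methods such as maximum likelihood or principal axis factoring.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_one_factor(loadings, n, seed=0):
    """Simulate x_i = loading_i * f + e_i, scaled so each x_i has variance 1."""
    rng = random.Random(seed)
    columns = [[] for _ in loadings]
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # latent factor score, never observed directly
        for col, lam in zip(columns, loadings):
            noise_sd = math.sqrt(1.0 - lam ** 2)  # variable-specific error
            col.append(lam * f + rng.gauss(0.0, noise_sd))
    return columns


def triad_loadings(x1, x2, x3):
    """Estimate one-factor loadings from the three pairwise correlations."""
    r12, r13, r23 = pearson(x1, x2), pearson(x1, x3), pearson(x2, x3)
    return (
        math.sqrt(r12 * r13 / r23),
        math.sqrt(r12 * r23 / r13),
        math.sqrt(r13 * r23 / r12),
    )


# Assumed loadings, chosen only for illustration.
true_loadings = [0.9, 0.8, 0.7]
x1, x2, x3 = simulate_one_factor(true_loadings, n=20000, seed=42)
estimated = triad_loadings(x1, x2, x3)
print("estimated loadings:", [round(v, 2) for v in estimated])
```

With real survey data there are usually several factors and many more variables, so a library routine (for example scikit-learn's `FactorAnalysis`) would be the more practical choice; the sketch only shows why correlated variables point to a common factor.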

Practical Example:

Imagine a customer satisfaction survey with 30 questions about product features, service quality, price, and brand perception. Instead of analyzing 30 individual scores, Factor Analysis could reduce these to a few latent factors like "Product Value," "Customer Support," and "Brand Loyalty," making the results much easier to interpret and act upon.

Understanding Discriminant Analysis

Discriminant Analysis (DA), particularly Linear Discriminant Analysis (LDA), is a predictive statistical technique used to classify observations into one of several predefined groups. It constructs a classification rule using a set of independent variables (predictors) to predict the group membership of new observations.

Key Objectives of Discriminant Analysis:

- Predict Group Membership: The main purpose is to predict which group an individual or item belongs to based on a set of predictor variables. This is a powerful tool for classification.
- Identify Discriminating Variables: It identifies which variables best differentiate between the groups.
- Develop Classification Rules: It creates functions (discriminant functions) that maximize the separation between the groups.

How Discriminant Analysis Works:

Discriminant Analysis works by finding linear combinations of the independent variables that best separate the pre-existing groups. It creates one or more discriminant functions (at most the smaller of the number of predictors and the number of groups minus one) that act as a new set of dimensions, along which the groups are maximally separated.
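For two groups, the separating combination can be sketched with Fisher's linear discriminant: the weight vector is the inverse of the pooled within-group scatter matrix applied to the difference between the group means, and a new observation is assigned by comparing its projected score with a cut-off. The sketch below is illustrative only. It assumes two predictors, synthetic groups labeled "A" and "B", equal group sizes and equal prior probabilities (hence the midpoint cut-off); all function names are hypothetical.

```python
import random


def dot(u, v):
    """Dot product of two 2-element vectors."""
    return u[0] * v[0] + u[1] * v[1]


def mean_vec(rows):
    """Mean of each of the two predictors."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def scatter(rows, mu):
    """2x2 scatter matrix: sum of outer products of deviations from mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d0, d1 = r[0] - mu[0], r[1] - mu[1]
        s[0][0] += d0 * d0
        s[0][1] += d0 * d1
        s[1][0] += d1 * d0
        s[1][1] += d1 * d1
    return s


def fit_fisher_lda(group_a, group_b):
    """Return (w, threshold) for a two-group linear discriminant."""
    mu_a, mu_b = mean_vec(group_a), mean_vec(group_b)
    sa, sb = scatter(group_a, mu_a), scatter(group_b, mu_b)
    # Pooled within-group scatter and its 2x2 inverse.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Midpoint between the projected group means (equal priors assumed).
    threshold = 0.5 * (dot(w, mu_a) + dot(w, mu_b))
    return w, threshold


def predict(w, threshold, x):
    """Assign an observation to group "B" if its score exceeds the cut-off."""
    return "B" if dot(w, x) > threshold else "A"


def sample_group(center, n, rng):
    """Synthetic observations scattered around a group center."""
    return [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)]
            for _ in range(n)]


rng = random.Random(7)
train_a = sample_group((0.0, 0.0), 500, rng)
train_b = sample_group((3.0, 3.0), 500, rng)
w, threshold = fit_fisher_lda(train_a, train_b)

# Classify fresh observations that were not used to fit the rule.
new_a = sample_group((0.0, 0.0), 500, rng)
new_b = sample_group((3.0, 3.0), 500, rng)
hits = sum(predict(w, threshold, x) == "A" for x in new_a)
hits += sum(predict(w, threshold, x) == "B" for x in new_b)
accuracy = hits / (len(new_a) + len(new_b))
print(f"accuracy on new observations: {accuracy:.3f}")
```

With more than two groups or many predictors, a library implementation such as scikit-learn's `LinearDiscriminantAnalysis` would normally replace the hand-written matrix algebra.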

Practical Example:

A bank might use Discriminant Analysis to predict whether a loan applicant will default on a loan. Using financial indicators (income, credit score, existing debt) and demographic data, the analysis can build a model to classify new applicants into the "high risk of default" or "low risk of default" groups. This helps the bank make informed lending decisions.

Key Differences Summarized

The fundamental distinction lies in their purpose: Factor Analysis is about discovering underlying structure and reducing data, while Discriminant Analysis is about predicting category membership.

Feature	Factor Analysis	Discriminant Analysis
Primary Goal	Uncover latent variables; data reduction	Predict group membership; classify observations
Dependent Variable	None (no variable is singled out as an outcome)	Categorical (group membership)
Independent Variables	No predictor role; continuous observed variables serve as indicators of the factors	Continuous (predictor variables)
Nature of Technique	Exploratory/Confirmatory (structure discovery)	Predictive/Classificatory
Output	Factors, factor loadings, eigenvalues	Discriminant functions, classification rules, accuracy rates
Data Requirement	Assumes underlying factors drive observed variables	Requires predefined groups
Use Case Example	Understanding dimensions of customer satisfaction	Predicting customer churn or disease diagnosis
Data Flow	Input many variables, output fewer factors	Input predictor variables, output group assignment

When to Use Which Method

Choosing between Factor Analysis and Discriminant Analysis depends entirely on your research question and the nature of your data:

Choose Factor Analysis when:
- You want to identify unobserved, latent constructs that explain the relationships among a set of observed variables.
- You have many variables and want to reduce their number into a smaller, more manageable set of factors for subsequent analysis.
- You are exploring the underlying structure of a questionnaire or scale.
- Your goal is to understand why variables correlate in a certain way.

Choose Discriminant Analysis when:
- You have clearly defined groups (e.g., "buyer/non-buyer," "successful/unsuccessful," "treatment/control").
- Your primary objective is to build a model that can predict which group new observations will belong to.
- You want to understand which variables contribute most to distinguishing between these groups.
- You are interested in how groups differ based on a set of predictors.

In essence, if your interest lies in discovering hidden structures or simplifying complex data, Factor Analysis is your tool. If your goal is predicting which category an item or person belongs to, Discriminant Analysis is the appropriate method.