Aug. 19 . . . Class 1: Introduction to class
This class needs to (1) introduce everyone to each other, (2) introduce
everyone to the class' requirements, objectives and style, (3) build
the proper mindset for studying this technique, (4) survey the
basic rationale underlying SEM, and (5) review the logical plan of the course. SEM is very much "regression
+ latent variables," but the latent variables make a big difference,
and it may help to get these similarities and differences out on
the table right away.
Reading: none
Leading questions: (1) What!? (2) How is SEM different
from regression? (3) What is a latent variable?
Aug. 26 . . . Class 2: Overview of structural
equation modeling
This class should show us where we are headed. It may help students
to put the coming material into a larger perspective.
Reading: Rigdon (1998), "Structural
Equation Modeling." (in two parts, on the online reserves
site) In Marcoulides (ed.), Modern Methods for Business Research,
Mahwah, NJ: LEA.
Leading questions: What is SEM supposed to accomplish? What are
the requirements for using SEM? What are SEM's key limitations?
Sep. 2 . . . Class 3: Tools--covariances,
matrices and matrix algebra
This class will begin with an introduction to the various elements
that appear in structural equation models. Then we will look at
two of the most important mathematical tools for understanding this
method--the algebra of variances and covariances, and the algebra
of matrices. These tools are essential in helping researchers to
understand how the method should behave under well-specified
conditions--so that the computer programs will not be a "black
box." Partial correlations also offer insights regarding "what
to expect" from SEM analysis of a data set.
Reading: "Model Notation,
Covariances and Path Analysis" (Bollen 1989, Structural
Equations with Latent Variables); author unknown, "matrix algebra
chapters" (this electronic file will be distributed by the
instructor). This semester, we will emphasize the SIMPLIS natural
language syntax, but we still need to talk matrices in order to
understand what's happening, especially when something goes wrong.
Leading questions: (1) How do you pronounce the following Greek
characters: η, ζ, Θ, ξ, φ, ψ? (2)
What are the eight parameter matrices in the general LISREL
model (hint: not all Greek characters represent parameters)? (3)
Suppose that x = a * ξ + δ
and y = b * η + c * ε,
and suppose that δ and ε
are known to be uncorrelated with ξ and
η--what is the covariance between x
and y? (4) If A is a 3 x 3 matrix of 2s, and I is a 3 x 3
identity matrix, what is (A + I) and what is A * I? (5) What does
it mean when a matrix is "not positive definite," and
what does it imply about other operations that may be performed
on such a matrix?
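For question (5), it may help to see the condition fail on a small example. The sketch below is only an illustration: it attempts a Cholesky factorization, which succeeds exactly when a symmetric matrix is positive definite. The function name `cholesky` and both correlation matrices are made up for this example and do not come from any of the readings.

```python
import math


def cholesky(matrix):
    """Return the lower-triangular Cholesky factor of a symmetric matrix,
    or None if the matrix is not positive definite."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    # A zero or negative pivot means the factorization
                    # cannot continue: the matrix is not positive definite.
                    return None
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


# Hypothetical correlation matrix that a real data set could produce.
consistent = [[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]]

# Hypothetical matrix whose pairwise correlations cannot all hold at once.
inconsistent = [[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]]

print(cholesky(consistent) is not None)
print(cholesky(inconsistent) is not None)
```

When the factorization fails, operations that depend on it, such as inverting the matrix or taking the log of its determinant, are not available either.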
Sep. 9 . . .
Class 4: Using SEM software.
LISREL is the "grand
old dame" of SEM packages (though Mplus is clearly the most sophisticated SEM package today). There is a free
student version that will be sufficient for most of our analysis,
up until the term project. We'll get the basics today, so students
can start estimating models. Bagozzi's 1980 paper was one of the
first applications of SEM in the marketing literature, and offers an
unusually detailed (and somewhat erroneous) discussion of the model,
so we'll use that as a starting point. Both the detailed discussion
and the errors are important preparation for using SEM.
Reading: Simplis
manual, ch. 6, "Simplis Reference Guide"; Bagozzi (1980),
"Performance and Satisfaction
in an Industrial Sales Force: An Examination of their Antecedents
and Simultaneity," Journal of Marketing, 44 (Spring),
65-77.
Leading questions:
(1) What are the "order" restrictions in a Simplis command
file? That is, which lines must come first, and so forth? (2) Suppose
you specify at one point that a certain parameter is free, but at
another point specify that it is fixed--which specification takes
precedence?
Sep. 16 . . .
Class 5: Looking at a LISREL printout
We'll look at a printout from a LISREL run. We'll see how we got
from a path diagram to a mathematical model. I'll talk about a systematic
approach to examining a LISREL printout, which may optimize efficiency
and minimize frustration. The 1994 paper describes a simple heuristic
for computing degrees of freedom. DF is an important check to help
make sure you have set up a model correctly.
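As a rough check on your own counting, here is a minimal sketch of the standard rule for a covariance-structure model without a mean structure: degrees of freedom equal the number of distinct variances and covariances minus the number of free parameters. This is the textbook counting rule, not necessarily the heuristic from the 1994 paper, and the function name and the example model are assumptions made for illustration.

```python
def sem_degrees_of_freedom(n_observed, n_free_parameters):
    """Degrees of freedom for a covariance-structure model
    (no mean structure): distinct moments minus free parameters."""
    distinct_moments = n_observed * (n_observed + 1) // 2
    return distinct_moments - n_free_parameters


# Hypothetical two-factor CFA: six indicators, three per factor,
# factor variances fixed to 1 to set the scale.
loadings = 6
error_variances = 6
factor_covariances = 1
free_parameters = loadings + error_variances + factor_covariances

print(sem_degrees_of_freedom(6, free_parameters))  # 21 - 13 = 8
```

If the degrees of freedom on the printout differ from your hand count, the model that was estimated is probably not the model you meant to specify.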
Reading: Simplis manual, ch. 5, "LISREL Output"; Sample
outputs (see classnotes page); Rigdon
(1994), "Calculating
Degrees of Freedom for a Structural Equation Model,"
Structural Equation
Modeling, 1, 274-78.
Leading questions: (1) Looking at a LISREL printout, how can you
tell if you set up the problem as you intended? (2) Based on the
printout, and what you have already learned, what are two ways you
can determine the number of parameters that the model is estimating?
(3) How do you read a stem-leaf diagram?
Sep. 23 . . . Class 6: Measurement models
and psychometrics
SEM offers us the opportunity to closely examine the psychometric
properties of measures, yet the SEM measurement model does not
match up precisely with classical true-score test theory. We'll
look at some classical indices of measure quality, such as Cronbach's
alpha, and some substitute indices that SEM users have adopted,
including "composite reliability" and "average variance
extracted." We'll also take a critical look at the "formative
measurement model," which purports to reverse the factor analytic
relationship between measures and constructs. The discussion of
factor models vs principal components sets up our later discussion
of partial least squares (PLS), an alternative modeling approach.
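To make the two substitute indices concrete, here is a minimal sketch using their commonly cited formulas. It assumes a single factor with its variance fixed to 1 and uncorrelated errors. The function names and the loadings are hypothetical, chosen only for illustration.

```python
def composite_reliability(loadings, error_variances):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    true_part = sum(loadings) ** 2
    return true_part / (true_part + sum(error_variances))


def average_variance_extracted(loadings, error_variances):
    """sum of squared loadings / (sum of squared loadings + sum of error variances)."""
    explained = sum(value * value for value in loadings)
    return explained / (explained + sum(error_variances))


# Hypothetical standardized loadings; with standardized indicators each
# error variance is 1 minus the squared loading.
loadings = [0.8, 0.7, 0.6]
error_variances = [1.0 - value * value for value in loadings]

print(round(composite_reliability(loadings, error_variances), 3))
print(round(average_variance_extracted(loadings, error_variances), 3))
```

Note that the two indices can disagree: a set of measures can look acceptably reliable as a composite while the factor still accounts for less than half of the variance in the typical indicator.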
Reading: Rigdon (1994), "Demonstrating
the Effects of Unmodeled Random Measurement Error," Structural
Equation Modeling, 1, 375-80; Joreskog, "Basic
Ideas of Factor and Component Analysis"; Jarvis, MacKenzie
and Podsakoff (2003), "A Critical Review of Construct Indicators
and Measurement Model Misspecification in Marketing and Consumer
Research," Journal of Consumer Research, 30 (2), 199-218, especially
pp. 199-205.
Leading questions:
(1) When can researchers ignore the potential for random measurement
error? (2) What are the attributes of a good measure? (3) Can we
use SEM parameter estimates or other output to evaluate the psychometric
properties of our observed variables? (4) How can you tell whether
a CFA model will fit just by looking at a correlation matrix? (5)
Given the impact of random measurement error, how can proxy variables
formed as linear composites be used in theory-based research?
Sep. 30 . . . Class
7: Estimation methods, fit assessment and fit
indices
In a world where all fit is "approximate," how do we
distinguish between "good approximate fit" and plain-old
bad fit? We'll talk about both "traditional" and newer
fit indices, and look at some recent findings.
Also, let's practice reproducing a LISREL analysis from a
published article. The article is Hallen, Johanson and Seyed-Mohamed
(1991). Warning: if you estimate their model, as described in the
paper, you will not get their result. A correct reproduction
of their model should yield a χ² of
about 74, with 39 degrees of freedom. Set AD=OFF on the OUtput line.
Interpret the fit, and look for evidence of fit problems. Ask yourself
how the authors obtained the results that they reported.
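As a starting point for interpreting that result, the sketch below computes two simple summaries from a chi-square and its degrees of freedom: the chi-square/df ratio and one common form of the RMSEA point estimate. The sample size used here is a placeholder, not the sample size from the article, and some programs divide by N rather than N - 1, so treat the numbers as approximate.

```python
import math


def chi_square_ratio(chi_square, df):
    """Chi-square divided by its degrees of freedom."""
    return chi_square / df


def rmsea(chi_square, df, sample_size):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    noncentrality = max(chi_square - df, 0.0)
    return math.sqrt(noncentrality / (df * (sample_size - 1)))


# Chi-square and df from the class description; the sample size of 200
# is a hypothetical placeholder, so substitute the value from the paper.
chi_square, df, hypothetical_n = 74.0, 39, 200

print(round(chi_square_ratio(chi_square, df), 2))
print(round(rmsea(chi_square, df, hypothetical_n), 3))
```

Because RMSEA depends on sample size, the same chi-square and degrees of freedom can look better or worse depending on N.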
Readings: Simplis manual, ch. 4, "Fitting and Testing";
Hu and Bentler (1999), "Cutoff
Criteria for Fit Indexes in Covariance Structure Analysis: Conventional
Criteria Versus New Alternatives," Structural Equation
Modeling, 6 (1), 1-31; Hallen, Johanson and Seyed-Mohamed
(1991), "Interfirm Adaptation
in Business Relationships," Journal of Marketing,
55 (April), 29-37.
Oct. 7 . . . Class 8:
Data issues: non-normal and incomplete data
Some researchers
say that having data which are not normally distributed, or which
are only of ordinal--rather than interval--scale, is a common situation
that is too often ignored. Others say that this situation isn't
ignored often enough. Improved methods for dealing with this problem
are now available. Incomplete observations are also a common problem.
Superior FIML methods have been available for years in the Amos
package, and have recently been incorporated into SEM packages in
general.
Readings: Finney and DiStefano, "Nonnormal and Categorical Data in Structural Equation Modeling," pp. 269-314 in Hancock and Mueller (2006); Enders, "Analyzing Structural Equation Models with Missing Data," pp. 315-344 in Hancock and Mueller (2006); PRELIS instructions (to be distributed by instructor).
Leading questions:
TBA.
NOTE: The Midterm Exam will be distributed during this class.
It will be due at the beginning of next week's class. Bring your
lingering questions, so we can address them before I hand out the
midterm.
Oct. 13 . . . Semester Midpoint
Oct. 14 . . . Class 9: Midterm exam discussion
Students must turn in their midterm exam at the beginning
of class. Late submissions, after class discussion, will
not be accepted, even if the student does not attend the class discussion.
I'm serious about this.
Reading: there is no additional reading assignment
for this class. Students should just do their best on the exam.
Leading questions: see exam.
Oct. 21 . . .
Class 10: Identification
We cannot estimate
all models that might be of theoretical interest. In particular,
we cannot estimate models that are not "identified." We
will look at ways to tell whether or not a given model is identified,
and talk about some very recent developments. We will also look
at the problem of "empirical underidentification."
Reading: Rigdon (1995), "A
Necessary and Sufficient Identification Rule for Structural Models
Estimated in Practice," Multivariate
Behavioral Research.
Leading questions:
(1) How many measures do we need in order to estimate a one-factor
model? A two-factor model? (2) Can we "eyeball" a model
and tell whether or not it is identified? (3) If our model of interest
is not identified, what can we do?
Oct. 28 . . .
Class 11: Multi-group analysis, mean structures
and invariance
We'll talk about
the use of SEM to analyze experimental data, as well as the analytical
opportunities in comparing model fits and parameter estimates across
data samples taken from different groups. We'll also talk, finally,
about the means of latent variables, and how we can estimate them.
This discussion leads into a consideration of ways to include interaction
effects in structural equation models. (I'm pushing here--I hope
to do this in one class, in order to conserve time for another advanced
topic.)
Readings: Simplis
manual, ch. 2, "Multi-sample Examples"; Steenkamp and
Baumgartner (1998), "Assessing
Measurement Invariance in Cross-National Consumer Research,"
Journal of Consumer Research, 25, 78-90;
MacKenzie and Spreng (1992),
"How Does Motivation Moderate the Impact of Central and Peripheral
Processing on Brand Attitudes and Intentions?" Journal
of Consumer Research, 18, 519-29.
Leading questions:
(1) How might differences in group sample size affect the results
of a multi-sample analysis? (2) How are degrees of freedom calculated
for the MacKenzie and Spreng model? (3) When is it important to
include mean effects in a structural equation model? (4) Can we
confidently assert that a measure behaves "the same way"
in different populations?
Nov. 4 . . . Class
12: Latent variable interactions
Main effects are
one thing, but sometimes relations are interactive--the strength
of factor a may be partly a function of the level of factor b. We
will talk about modeling interaction relationships using both the
multiple group approach and the multiplicative interaction approach.
Among multiplicative approaches, we will focus on Marsh, Wen and
Hau's comparison of different approaches.
Readings: Marsh,
Wen and Hau, "Structural Equation Models of Latent Interaction and Quadratic Effects," pp. 225-265 in Hancock and Mueller (2006).
Leading questions:
What is an interaction? What kinds of interactions would
you expect to encounter in your field of interest? Can you render
those interactions within an SEM model? Are strictly correct but
mind-bendingly complex methods better than more approximate but
simpler methods?
Nov. 11 . . .
Class 13: Multilevel and longitudinal analysis
Sometimes respondents
are clustered--salespeople are clustered by employer, students are
clustered by classroom or school, and so forth. In longitudinal
analysis, observations are clustered by respondent. Cluster-level
effects may be substantively interesting in their own right, while
failing to account for these effects may produce misleading results.
Reading: Hancock and Lawrence, "Using Latent Growth Models to Evaluate Longitudinal Change," pp. 171-196 in Hancock and Mueller (2006); Stapleton, "Using Multilevel Structural Equation Modeling Techniques with Complex Sampling Data," pp. 345-384 in Hancock and Mueller (2006).
Leading questions:
(1) How would you represent a
linear change process in SEM terms? A quadratic change process?
What about a process that was irregular over time? (2) How do you adapt a structural equation model when responses are clustered, as when different customers shop at the same store (but not all at the same store), or when different employees share the same boss (but not all the same boss)? (3) How is multilevel modeling like multigroup modeling? (4) How is multilevel modeling very much the same as longitudinal modeling?
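As a hint toward question (1), a latent growth model is often written as a factor model whose loadings are fixed to powers of time. The sketch below builds such loading matrices and the means they imply. It assumes equally spaced waves coded 0, 1, 2, ..., and the function names and factor means are hypothetical.

```python
def growth_loadings(n_waves, degree):
    """Fixed loadings for a polynomial growth model.

    Rows are measurement occasions coded 0, 1, 2, ...; columns are the
    intercept factor, the linear slope factor, the quadratic factor, etc.
    """
    return [[float(time ** power) for power in range(degree + 1)]
            for time in range(n_waves)]


def implied_means(loadings, factor_means):
    """Model-implied mean of the observed variable at each occasion."""
    return [sum(weight * mean for weight, mean in zip(row, factor_means))
            for row in loadings]


linear = growth_loadings(4, 1)     # intercept and slope, four waves
quadratic = growth_loadings(4, 2)  # adds a quadratic factor

# Hypothetical factor means: start at 10 and gain 2 units per wave.
print(linear)
print(implied_means(linear, [10.0, 2.0]))
print(quadratic)
```

Only the loading matrix changes between the linear and the quadratic version, which is one reason the SEM framing is convenient for longitudinal questions.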
Nov. 16 . . . Last day to withdraw and receive a WF.
Nov. 18 . . .
Class 14: Path modeling with Partial Least
Squares (PLS)
PLS is a
tool for path modeling that looks a lot like SEM--on the surface--and
is very popular in certain fields, like information systems. We'll
look at how PLS works and hopefully conduct some analyses using
SmartPLS (see my website for a link to this freeware program). We'll
talk about how to compare SEM and PLS results.
Reading: Rigdon (2005),
"Structural Equation Modeling: Nontraditional Alternatives."
In Everitt and Howell (eds.), Encyclopedia of Statistics
in Behavioral Science vol. 4. New York: Wiley,
1934-41 (to be distributed by instructor); other readings
TBA.
Leading questions:
TBA
Nov. 25 -- Thanksgiving
Holiday / No class
Dec. 2 . . . Class
15: Exploratory modeling with Tetrad OR Latent Variable Mixture Modeling.
A group of researchers
at Carnegie Mellon--joined by others around the world--argues that
the role of theory and a priori models in the social sciences is
way overblown. They argue that the truth reveals itself in the data,
and that researchers can and should make plausible causal inferences
about the truth largely from the data alone. Their program, called
Tetrad IV, helps researchers to find the class of models that is
consistent with the data, given some strong assumptions.
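The program takes its name from "tetrad differences," which are differences between products of pairs of covariances. If a single factor underlies four measures and the errors are uncorrelated, all three tetrad differences among those measures vanish in the population. The sketch below only illustrates that quantity and is not a description of Tetrad IV's search procedures. The function names and the loadings are made up for illustration.

```python
def one_factor_covariance(loadings, error_variances):
    """Covariance matrix implied by a one-factor model with factor
    variance 1 and uncorrelated errors."""
    size = len(loadings)
    return [[loadings[i] * loadings[j]
             + (error_variances[i] if i == j else 0.0)
             for j in range(size)]
            for i in range(size)]


def tetrad_differences(cov, i, j, k, l):
    """The three tetrad differences among variables i, j, k and l."""
    return (
        cov[i][j] * cov[k][l] - cov[i][k] * cov[j][l],
        cov[i][k] * cov[j][l] - cov[i][l] * cov[j][k],
        cov[i][j] * cov[k][l] - cov[i][l] * cov[j][k],
    )


# Hypothetical loadings and error variances for four measures.
cov = one_factor_covariance([0.9, 0.8, 0.7, 0.6], [0.19, 0.36, 0.51, 0.64])
print(tetrad_differences(cov, 0, 1, 2, 3))  # all three are zero up to rounding

# Adding a correlated error between the first two measures breaks
# two of the three constraints.
cov[0][1] += 0.2
cov[1][0] += 0.2
print(tetrad_differences(cov, 0, 1, 2, 3))
```

Constraints of this kind are what allow some models to be ruled out from the covariances alone, given the strong assumptions mentioned above.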
Reading: Scheines
et al (1998), "The TETRAD Project: Constraint-Based Aids to
Causal Model Specification," Multivariate Behavioral Research,
33, 65-118; selections from the Tetrad IV manual, available with
the Tetrad IV download at http://www.phil.cmu.edu/projects/tetrad_download/.
Leading questions:
TBA
If we choose mixture modeling instead, we will read Phil Gagne's chapter on the subject in Hancock and Mueller (2006).
Dec. 9 (12:30)
. . . Assigned Final Exam period--exam is due by 5:00 PM.