If we grant that no social
scientific inquiry can be value-neutral, then we are left with the situation
where political scientists are relying upon implicit and uninvestigated value
assumptions when constructing their descriptive categories (see Pt. II). This
parallels the “no-theory” fallacy consistently committed by political scientists who employ quantitative methodologies. Quantitative political science is quite resistant, and at times openly hostile, to theorizing. Instead, the majority of papers published over the past decade have openly eschewed it in favor of strict hypothesis testing. The field has become almost completely
inductive.
What this means is that political scientists are largely content with investigating
whether or not the observed relationship between two variables is explainable
by random chance. While in the natural sciences this is done by controlling the
environment and running a large number of trials, political scientists are
frequently unable to do this. Instead, they must rely upon large sample sizes
and statistical ‘controls’ to accomplish this task.[1]
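To make the idea of a statistical ‘control’ concrete, here’s a minimal sketch in Python; the data file and every variable name in it are hypothetical:

    # Estimate the effect of x on y while statistically holding other
    # measured variables constant, instead of controlling the environment.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("survey.csv")  # hypothetical dataset

    # 'income' and 'education' are controls: variables that might
    # otherwise be driving the observed x-y relationship.
    model = smf.ols("y ~ x + income + education", data=df).fit()
    print(model.summary())  # is x's coefficient distinguishable from chance?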
To a certain extent this explains why the bar has been set “lower” with respect
to social scientific publications: researchers can’t run island experiments for both ethical and practical reasons. Any well-designed statistical model that
strongly suggests a correlation between two variables is in itself an
accomplishment. So I do not mean to begrudge the political scientist this
difficult task.
But to suggest that we should stop here, I think, sells us short. We should try to explain
why these relationships exist. The
approach that political scientists take today is akin to Newton being fully
content with knowing that two objects of different mass fall at the same rate.
For the political scientist, there’s no need to take the extra step and start
theorizing about gravity.
Except, they’re still tempted to do a little bit of explaining. But these explanations
largely serve as illustrations of how the variables might fit together in the larger political ecosystem. Stories are
told in a few lines that narrate why these variables might interact with each
other the way that they do. This is crude theory construction. Or rather, it’s
hypothesizing a possible theory. The theory itself, though, is rarely tested
against other competing theories.
Normally the story that links our variables together is some variation on a rational-expectations model. “Of course this would be the outcome!” the political scientist implicitly says. “The actors are maximizing their payoffs
subject to their budget constraints. Now look over here at this correlation I
found…”
There’s a term for this. It’s called “assuming your conclusions”. It means that the explanations are always made to fit the findings. In other words, our
explanations are inherently unfalsifiable.[2]
“Theory” should not be confused with “models”. Political scientists love their models.
Quants build statistical models, which is to say that they put certain
variables in relation to one another. They could be linear, multi-linear, or
logarithmic (to name a few). To illustrate, here’s a simple quadratic model:
y = x² + x
It’s postulated that there’s some sort of quadratic
relationship between the independent and dependent variables. This is a model.
It proposes a relationship. It doesn’t explain why this relationship exists.
That’s what theory’s for.
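The same point in code: fitting a quadratic to data recovers the shape of a relationship and nothing more. (The data here are fabricated purely for illustration.)

    # Fit y = x² + x to noisy simulated data. The fitted coefficients
    # describe a relationship; they don't explain why it holds.
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0, 10, 100)
    y = x**2 + x + rng.normal(0, 5, size=x.size)  # noisy quadratic

    coeffs = np.polyfit(x, y, deg=2)  # roughly [1, 1, 0]
    print(coeffs)  # a model, not a theory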
One result of neglecting rigorous theorization is that there’s not much logic
behind why political scientists include the particular control variables that
they do beyond the impression that they might be important (or others have used
them). The strategy is oftentimes to throw the kitchen sink at the problem and
see what works. I can understand this impulse (I’ve had it myself) and to a
certain extent it rests on compelling logic: you can’t know a priori that a variable doesn’t have a
confounding effect on the relationship you’re looking to measure. Therefore, you
might as well test everything you can in order to come up with the most robust
model possible.
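Here’s a sketch of what that strategy amounts to, assuming a dataframe of hypothetical candidate controls: try every subset and keep whichever specification fits “best”, with no theory guiding the choice.

    # The kitchen-sink strategy: exhaustively search over control subsets
    # and keep the best-fitting model. All variable names are hypothetical.
    from itertools import combinations
    import statsmodels.formula.api as smf

    candidates = ["gender", "race", "ideology", "income", "temperature"]

    def kitchen_sink(df):
        results = []
        for k in range(len(candidates) + 1):
            for subset in combinations(candidates, k):
                formula = "y ~ x" + "".join(f" + {c}" for c in subset)
                fit = smf.ols(formula, data=df).fit()
                results.append((formula, fit.rsquared_adj))
        # whichever specification fits best "wins"
        return max(results, key=lambda r: r[1])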
And while the point that you cannot know whether a variable matters before testing it is well taken, you should have a good idea, based upon theory, whether a given variable should affect your findings.
Take the classic example of the correlation between ice cream sales and murders. Obviously ice cream sales are not causing murders to rise, nor vice
versa. The confounding variable here, temperature, explains both. As
temperatures go up, people get angrier and more murders take place. Similarly,
people enjoy cool treats on hot days. It makes physiological sense.
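A quick simulation makes the point, with made-up effect sizes: temperature drives both series, so the raw correlation is strong, but ice cream’s coefficient collapses once temperature enters the model.

    # Temperature causes both ice cream sales and murders; neither
    # causes the other. (Coefficients and noise levels are invented.)
    import numpy as np

    rng = np.random.default_rng(0)
    temperature = rng.uniform(0, 35, 1000)               # daily highs
    ice_cream = 2.0 * temperature + rng.normal(0, 5, 1000)
    murders = 0.5 * temperature + rng.normal(0, 5, 1000)

    print(np.corrcoef(ice_cream, murders)[0, 1])         # strongly positive

    # Regress murders on ice cream *and* temperature: the ice cream
    # coefficient shrinks toward zero once the confounder is included.
    X = np.column_stack([np.ones(1000), ice_cream, temperature])
    beta, *_ = np.linalg.lstsq(X, murders, rcond=None)
    print(beta)  # ~[0, 0, 0.5]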
But, if political scientists were attacking this problem, they’d run through an
exhaustive search of all possible variables they could think of (gender, race,
political ideology, income, et cetera) along with temperature. And though they’d
arrive at the same finding, their procedure is thoughtless. It’s the same
exhaustive search that computers execute while playing chess. And more
importantly, it is only after the fact that explanations (i.e., crude theories) are
offered.
We should be testing our theories
and our models at the same time. A
successful model may produce evidence against a theory. Or it may support it.
By waiting until after the model is confirmed to weave a possible explanation,
we are theorizing without theory testing. The paper usually ends here. We don’t
then test the theory in differing circumstances to see if it holds.
Despite the “let the data speak for themselves” ethos prevalent in both political
science and pop quant journalism, it’s actually more theory that would make model building easier. Knowing how our findings fit together would tell us beforehand which combinations of variables are more likely to work. We’d also know which combinations of variables to test in order to substantiate or refute our prior theory.
[1] Control variables are variables other than your independent and dependent variables that you build into your model. You include these because they might actually be driving your findings (i.e., be ‘confounding’ variables). The temperature
variable I discuss below is a good illustration of this.