
The following table identifies a number of fallacies, problems, biases, and effects that scholars have, over the centuries, recognized as confounding the conduct of good research. Note that some of these "methodological potholes" remain contentious among some scholars.
ad hominem argument Criticizing the person rather than criticizing the argument. Advice: Focus on the quality of the argument.
discovery fallacy Criticizing an idea because of its origin
(for example, an idea given in a religious text). Advice: Criticize the justifications offered in
support of an idea rather than how the idea originated.
In 1858, Kekulé was having difficulty deciphering
the chemical structure of benzene.
One night he went to bed and dreamed of a snake biting its tail.
The next morning, he awoke with the insight that perhaps
benzene is chemically ring-shaped.
Following up on the idea, his subsequent work did indeed
establish that the structure of benzene could be explained
only by joining the ends of the carbon chain to form a
closed ring.
Some people might be uncomfortable with the idea that
Kekulé's dream proved seminal in understanding benzene.
For most people, dreams are not the preeminent technique
for doing research.
However, scholars make a distinction between the
context of discovery and the context of legitimation.
Kekulé's dream did not establish the structure of benzene.
It was Kekulé's subsequent experimental work that was
consistent with a ring-structure.
Chemists would be wrong to criticize the idea because it originated
in a dream.
It simply doesn't matter what the source of an idea is.
It was Kekulé's subsequent arguments
(context of legitimation) -- the evidence he assembled
in favor of a ring-structure -- that convinced other chemists of
the idea.
ipse dixit Appealing to an authority figure in support of an argument. Advice: Cite published research rather than
identifying authority figures.
Provide references so others can judge the quality of the
supporting research for themselves.
ad baculum argument An appeal to physical or psychological threat.
A surprisingly common form of argument, even in modern times.
In his debate with the Scholastics concerning the sun-centered theory
of the solar system, Galileo "was shown the instruments of torture"
as a way of bringing the debate to a conclusion.
The threat is often made as an observation that someone else will
cause harm, as in "Your boss will fire you if your views become known."
Common in many religious arguments, as when "one should not do
(or think) X because you will be punished by God."
May also appear as a threat directed at others, as when
"The government will deport your friends if you don't stop
publishing criticisms."
Also pervasive in business, as echoed in the phrase "He twisted my arm."
The ad baculum argument has force only insofar as we fear pain or empathize
with others -- more than we are motivated by a just argument.
Advice: Do not threaten.
Draw attention when others use threats in debates.
egocentric bias The tendency to assume that other people experience things
the same way we do. Advice: Don't rely exclusively on introspection.
Listen carefully to what others report.
Carry out a survey or run an experiment
in order to observe the behaviors of others.
Be wary when generalizing from your own experiences.
cultural bias The inappropriate application of a concept to people from
another culture. Advice: Talk with culturally knowledgeable people.
Carry out cross-cultural experiments.
Listen carefully in post-experiment debriefings.
cultural ignorance The failure to make a distinction that people in another
culture readily make. Advice: Talk with culturally knowledgeable people.
Listen carefully in post-experiment debriefings.
over-generalization The tendency to assume that an experimental result
generalizes to a wide variety of real-world situations. Advice: Be careful.
Look for converging evidence.
Analyze additional works.
Run further experiments.
inertia fallacy The idea that research consistent with a particular conclusion
will "grow" in the future. A subtle fallacy that is evident in such statements
as "Research is increasingly showing that ...".
Future research is just as likely to overturn a current theory as to confirm it.
As in the stock market, past "trends" are not necessarily indicative of future results.
Research results do not have inertia. Advice: Talk about research results in the past tense ("Research has shown ..."
rather than "Research is showing ...").
Avoid "growth" or "band-wagon" metaphors when describing the evidence
pertaining to some theory.
relativist fallacy The belief that no idea, hypothesis, theory or belief is
better than another. Advice: Avoid "absolute" relativism; the world appears to be
"relatively relative." Don't mistake relativism for pluralism.
universalist phobia A prejudice against the possibility of cross-cultural universals. Advice: Familiarize yourself with music from a variety
of cultures.
Investigate notions of similarity and difference.
Use cross-cultural surveys or experiments where appropriate.
problem of induction The problem (identified by Hume) that no number of particular
observations can establish the truth of some general conclusion. Advice: Avoid claiming you know the truth.
Present your research results as "consistent" or "inconsistent"
with a particular theory, hypothesis or interpretation.
How do we learn from observation?
The classic response is that we learn through a process dubbed induction.
Induction entails making a set of specific observations,
and then forming a general principle from these observations.
For example, having stubbed my toe on many occasions over
the course of my life,
I have formed a general conviction that rapid acceleration of my toe
into massive objects is likely to evoke pain.
We might say that I have learned from experience
(although my continued toe-stubbings make me question how
well I've learned this lesson).
The 18th-century Scottish philosopher, David Hume,
recognized that there are serious difficulties with
the concept of induction.
Hume noted that no amount of observation could ever resolve
the truth of some general statement.
For example, no matter how many white swans one observes,
an observer would never be justified in concluding that
all swans are white.
In postmodernist language, one would say that we cannot
legitimately raise local observations to
the status of global truths.
Several serious attempts have been made by philosophers to
resolve the problem of induction.
Three of these attempts have been influential in scientific
circles:
falsificationism,
conventionalism and instrumentalism.
However, these attempts suffer from serious problems of their own.
In all three philosophies, the validity of empirical knowledge
is preserved by forfeiting any strong claim to absolute truth.
Observation can never be used to "prove" anything.
But that doesn't mean that observation is useless;
we still manage to learn from observation, even
if the process seems mysterious.
In observation-based research, we never claim to prove something.
Instead, we can say that the observations are consistent with
a particular theory, hypothesis or interpretation.
positivist fallacy The problem arising when a phenomenon is deemed not to
exist because no evidence is available:
"Absence of evidence is interpreted as evidence of absence." Advice: Recognize that not all phenomena leave obvious evidence
of their existence.
Some areas of research have little or no available data or evidence.
Data-poor fields raise some special methodological concerns,
one of which is the positivist fallacy.
If a phenomenon leaves no trail of evidence,
then there is nothing to study.
We may even be tempted to conclude that nothing has happened.
In other words, the positivist fallacy is the
misconception that absence of evidence may be interpreted
as evidence of absence.
Positivism had a marked impact on mid-twentieth century psychology.
In particular, the influence of logical positivism was notable
among behaviorists such as B.F. Skinner.
The classic example of the positivist fallacy was the penchant
of behaviorists to dismiss unobservable mental states as non-existent.
For example, because "consciousness" could not be directly observed,
for the positivist it must be regarded as an occult or fictional
quality with no truth status (Ayer, 1936).
Psychology escaped the excesses of behaviorism with the
advent of the cognitive revolution which re-introduced mental states
as legitimate topics of investigation.
It is perhaps ironic that computer technology played an
important role in facilitating the acceptance of mental states.
Although not easily observable, computer memories could clearly
exist in a number of different states, and these states could
significantly affect the ensuing computational behavior.
If it is true that the positivist fallacy tends to arise
from data-poor conditions, then it should be possible to
observe this same misconception in humanities
scholarship -- whenever data is limited.
Consider, by way of example, the following argument from
the distinguished historical musicologist, Albert Seay.
At the beginning of his otherwise fine book on medieval music,
Seay provides the following rationale for focusing predominantly
on sacred music in preference to secular music:
"Although much music did exist for secular purposes and many
musicians satisfied the needs of secular audiences, the Church and its
musical opportunities remained the central preoccupation.
No better evidence of this emphasis on the religious can
be seen than in the relative scarcity of both information and
primary source materials for secular music as compared to
those for the sacred." (Seay, 1975, p.2)
In other words, Seay is arguing that, with regard to
secular medieval music-making, absence of evidence is
evidence of absence.
Since secular activities generated little documentation,
we have almost no idea of the extent and day-to-day
pertinence of medieval secular music-making.
For illiterate peasants, "do-it-yourself" folk music may
have shaped daily musical experience far more than has
been supposed.
Of course Seay may be entirely right about the
relative unimportance of secular music-making,
but in basing his argument on the absence of data,
he is in the company of the most rabid logical positivist.
The positivist fallacy is commonly regarded as a symptom
of scientific excess.
However, it knows no disciplinary boundaries;
it tends to appear whenever pertinent data is scarce.
confirmation bias The tendency to see events as conforming to a hypothesis while
viewing falsifying events as "exceptions". Advice: Be systematic in your observations.
hindsight bias The ease with which people confidently interpret or explain any
set of existing data. Advice: Whenever possible, attempt to predict data in advance.
Aim to test ideas rather than to look for confirmation.
unfalsifiable hypothesis The formulation of a theory or hypothesis which cannot be,
in principle, falsified. Advice: Whenever possible, formulate theories, hypotheses
or interpretations so they are, in principle, falsifiable.
Identify the sorts of observations that would be inconsistent with
your views.
The most well-known attempt to resolve the
problem of induction
was formulated by Karl Popper in 1934.
Popper accepted the view that no amount of observation
could ever verify that a particular proposition is true.
That is, an observer cannot prove that all swans are white.
However, Popper argued that one could be certain of falsity.
For example, observing a single black swan would allow
one to conclude that the claim -- all swans are white
-- is false.
Accordingly, Popper endeavored to explain the growth
of knowledge as arising by trimming the tree of
possible hypotheses using the pruning shears of falsification.
Truth is what remains after the falsehoods have been trimmed away.
For Popper, what makes a theory a "scientific theory" is
not that the theory is true (since we cannot know this).
Rather, what makes a theory "scientific" is that the
theory is, in principle, falsifiable.
The mark of a good theory, for Popper, is that the
theory is stated in a way that admits the possibility
of being disproved or falsified.
Accordingly, a theory that claims "all blungs are blue"
is a bad theory, not simply because we don't know what a "blung" is,
but because as stated, the claim would be impossible to falsify.
A number of music scholars, such as Eugene Narmour and William Poland,
have argued that music scholars need, whenever possible,
to state their theories in a way that they can, in principle,
be falsified.
post-hoc hypothesis Following data collection, the formulation and testing of
additional hypotheses not envisaged before the data was collected. Advice: Limit. Beware of hindsight bias and multiple tests.
Collect new data;
analyze new works.
smorgasbord thinking Sometimes we don't realize that we unconsciously hold a
collection of hypotheses for all occasions.
Suppose two very different people marry: we explain it by saying
"Opposites attract."
But if two very similar people marry, we explain it by saying
"Birds of a feather flock together."
If a group of people dealing with a problem work inefficiently,
we explain it by saying
"Too many cooks spoil the broth."
But if a group of people excels at a task, we explain it by saying
"Two heads are better than one."
If a person rushes into a poor decision, we conclude that one should
"look before you leap."
But if a fast decision produces a good result, we conclude that
"He who hesitates is lost."
Similarly, we say "Time waits for no man," but we also say
"Haste makes waste."
We say "Cross that bridge when you come to it," but we also
say "Don't put off 'til tomorrow what you can do today."
Most of us move easily from one contradictory hypothesis to
another with little awareness of what we are doing.
If explanations are so easy to come by -- no matter what trend
exists in the data -- then no explanation can be trusted.
Consider, for example, two common ideas related to human behavior.
Many people believe that there is something to the ancient Greek
idea of "catharsis."
That is, by watching (for example) a drama portraying a murder,
viewers can, in some sense "purge" any murderous instinct they may have.
But many of the same people also believe "Monkey see, monkey do."
So what do you make of television violence? Or pornography?
If someone thinks pornography is okay, they will tend to argue that
it is cathartic. But if someone thinks pornography is bad,
they will argue "monkey see, monkey do."
Advice: Don't deceive yourself that you have only
one prediction. Write your prediction down before you analyse
any data. Ask yourself whether you have a "spare" explanation should
the data show a reverse trend; if so, ask yourself what
hypothesis you should be testing.
ad-hoc hypothesis The proposing of a supplementary hypothesis that is intended to
explain why a favorite theory failed an experimental test. Advice: Open to grave abuse. Try to avoid.
Test the ad hoc hypothesis in another experiment.
sensitivity syndrome The tendency to try to interpret every perturbation in a data set;
a failure to recognize that data always contains some "noise". Advice: Use test-retest and other techniques to estimate the margin of
error for any collected data.
Report chance levels, p values, effect sizes.
Beware of hindsight bias.
positive results bias A bias commonly shown by scholarly journals to publish only studies
that demonstrate positive results (i.e., where data and theory agree). Advice: Seek replications for suspect phenomena.
Be aware of possible "bottom-drawer effect".
bottom-drawer effect Unawareness of unpublished negative results of earlier experiments. Advice:
Maintain contact and communicate within a scholarly community.
Ask other scholars whether they have carried
out a given analysis, survey or experiment.
Widely report negative results through informal channels.
head-in-the-sand syndrome The failure to test important theories, assumptions,
or hypotheses that are readily testable. Advice:
Collect pertinent data.
Carry out analyses.
Do a survey.
Run an experiment.
data neglect The tendency to ignore readily available data when
assessing theories, assumptions or hypotheses. Advice: Don't ignore existing resources.
Test your hypotheses using other available data sets.
research hoarding The failure to make the fruits of your scholarship
available for the benefit of others. Advice: Publish often. Prefer to write short research
articles rather than books. Make your data available to others.
I once worked with a colleague who spent 30 years studying the
life of Felix Mendelssohn.
Mendelssohn was a very prolific letter writer, and my colleague
had produced English translations of several thousand letters.
The letters include details of trips, concerts, rehearsals, meetings,
etc. from which my colleague had produced a detailed day-by-day
chronicle of Mendelssohn's life.
On such-and-such a day Mendelssohn conducted a concert
and made revisions to a particular work.
The next day he read a newspaper review and received a
letter from his publisher.
Etc., etc.
The assembled chronicle runs to several hundred pages.
None of the translations have been published, and
the day-by-day chronicle remains a manuscript.
My colleague likes to tell the story of an experience
that took place while he was on sabbatical about 15 years ago.
He had travelled to Cambridge University where many
of Mendelssohn's letters are archived.
As one might expect, my colleague encountered another
musicologist who was also studying Mendelssohn's letters.
By chance, this other scholar happened to glimpse the
computer printout of my colleague's chronicle of Mendelssohn's
day-by-day activities.
About an hour later, my colleague returned to his table
and discovered that a section of the printout was missing.
After several minutes of searching, my colleague noticed
the missing printout sitting under a pile of books
belonging to the other musicologist.
Red-faced, she handed the chronicle back to my colleague.
My colleague likes to tell this story as an example of
the low morals to which scholars might stoop.
But, I think there is a more compelling lesson.
To this day, after the passage of some 30 years,
my colleague has not yet published his day-by-day
chronicle of Mendelssohn's life.
My colleague argues that the chronicle is not yet "perfect"
and still requires some work reconciling some conflicting
dates, etc.
My colleague retired about 5 years ago, and still works
occasionally on the project.
In the meantime, an entire generation of Mendelssohn scholars
has had to do without the benefit of his work.
Yes, there are likely to be errors in the chronicle.
But these errors will be identified more quickly by
soliciting the feedback of other Mendelssohn scholars.
Research is a communal activity.
All scholars benefit by being in constant dialogue
with our peers.
We might be tempted to hold back information from our
colleagues in order to bask in the glory of some discovery.
But we shouldn't wait long before allowing others
to benefit from the fruits of our labors.
There is a point where our own egos actually
impede the development of an area of knowledge.
double-use data The use of a single data set both to formulate a
theory and to "independently" test the theory. Advice: Avoid. Collect new data.
A pernicious problem plaguing much scholarship
is the tendency to use a single data set both
to generate the theory and to support the theory.
Formally, if observation O is used to formulate theory T,
then O cannot be construed as a predicted outcome of T.
That is, observation O in no way supports T.
The theory of continental drift arose from observing
the suspicious visual fit between the east coasts of the American
continents and the west coasts of Europe and Africa.
The bulge of north-west Africa appears to fit like a piece
of a jig-saw puzzle into the Caribbean gulf.
This observation was ridiculed as childish nonsense by
geologists in the first part of the twentieth century.
Geologists were right to dismiss the similarity of the coast-lines as
evidence in support of the theory of continental drift,
since this similarity was the origin of the theory in the first place.
Plate tectonics gained credence only when independent evidence was
gathered consistent with the spreading of the Atlantic sea-bed.
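The point that observation O cannot support a theory T that was formulated from O can be illustrated with a small simulation. The following Python sketch is illustrative only and rests on invented assumptions: the 200 candidate "theories" are pure coin-flip noise, and the names (`coin_flips`, `agreement`, `theories`) and sample sizes are hypothetical. The candidate that best fits the original observations is chosen, and its fit is then measured both on those same observations and on independent ones.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def coin_flips(n):
    """Return n independent fair coin flips as booleans."""
    return [rng.random() < 0.5 for _ in range(n)]

def agreement(predicted, observed):
    """Proportion of cases where the prediction matches the observation."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

N_THEORIES, N_OLD, N_NEW = 200, 40, 400

# Observation set O (used to pick the theory) and a fresh, independent set.
observed_old = coin_flips(N_OLD)
observed_new = coin_flips(N_NEW)

# Every candidate "theory" is pure noise, so none has real predictive power.
theories = [
    {"old": coin_flips(N_OLD), "new": coin_flips(N_NEW)}
    for _ in range(N_THEORIES)
]

# Formulate T from O: keep whichever candidate fits the old data best.
best = max(theories, key=lambda t: agreement(t["old"], observed_old))

fit_old = agreement(best["old"], observed_old)  # same data: looks impressive
fit_new = agreement(best["new"], observed_new)  # new data: near chance (0.5)
print(f"fit on the data that suggested the theory: {fit_old:.2f}")
print(f"fit on independent data:                   {fit_new:.2f}")
```

Because the winning candidate was selected for its fit to the original observations, its high agreement with them says nothing about whether it is right. Only the comparison against independent data is informative.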
skills neglect The human disposition to resist learning new scholarly
methods that may be pertinent to a research problem. Advice: Resist scholarly laziness.
Engage in continuing education.
Learn things your peers and teachers don't know.
control failure The failure to contrast an experimental group with a control group. Advice: Add a control group.
third variable problem The presumption that two correlated variables are causally linked;
such links may arise through an unknown third variable. Advice: Avoid interpreting correlation as causality.
Carry out an experiment where manipulating variables can test
notions of probable causality.
In correlational studies, the researcher can demonstrate that
there is a relationship or association between two
variables or events.
But there is no way to determine whether A causes B or B causes A.
Moreover, the researcher cannot dismiss the possibility that
A and B are not causally connected.
It may be the case that both A and B are caused by an
independent third variable.
By way of illustration we might note that there is a
strong correlation between consumption of ice cream
and death by drowning.
Whenever ice cream consumption increases there is a
concomitant increase in drowning deaths (and vice versa).
Of course the likely reason for this correlation is
that warm summer days lead people to go swimming and
also lead to greater ice cream consumption.
In historical disciplines, one can never know whether
the association of two events is causal, accidental,
or the effect of a third (unidentified) event or factor.
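The ice cream example can be sketched as a small simulation. The Python code below is an illustration under invented assumptions, not real data: the coefficients, noise levels, and variable names are all hypothetical. Temperature drives both simulated ice cream consumption and simulated drownings, and neither influences the other. The raw correlation is nonetheless substantial, and it largely disappears once temperature is statistically held constant (here using a partial correlation).

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical daily temperatures in degrees Celsius: the third variable.
temperature = [rng.uniform(0, 35) for _ in range(2000)]

# Both outcomes depend on temperature plus independent noise (invented numbers).
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.1 * t + rng.gauss(0, 1) for t in temperature]

raw = pearson(ice_cream, drownings)
controlled = partial_corr(ice_cream, drownings, temperature)
print(f"raw correlation:     {raw:.2f}")
print(f"partial correlation: {controlled:.2f}")
```

Controlling for a variable in this way is only possible when the third variable has been identified and measured, which is why the text recommends experimental manipulation where it is feasible.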
reification Falsely concretizing an abstract concept
(e.g. regarding spatial representations of pitch structure as
mental representations). Advice: Take care with terminology.
validity problem When an operational definition of a variable fails to accurately
reflect the true theoretical meaning of the variable (See Cozby, p.31). Advice: Think carefully when forming operational definitions.
Use more than one operational definition.
Seek converging evidence.
anti-operationalizing problem The tendency to raise perpetual objections to all operational definitions. Advice: Propose better operational definitions.
Seek converging evidence using several alternative
operational definitions.
Most concepts are vaguely defined.
For example, what is meant by a "melodic arch"?
Ostensibly, a phrase may be considered "arch-shaped"
if the pitches go up and then go back down.
But surely it's possible for some notes to deviate
from this strict criterion, and yet the phrase may
still be considered "arch-shaped".
Suppose the pitches go: A-B-C-D-E-C-E-D-C-B-A.
Does the "C" in the middle mean that there is no
arch?
If we are unable to provide a precise definition
of a melodic arch, then it would appear that we'd
never be able to determine whether most phrases are
arch-shaped, or whether there is a tendency to
create arch-shaped phrases.
If everyone is unable to agree on a definition of
a melodic arch, then, by definition, it is impossible
to study the purported phenomenon.
One could legitimately claim there is no such thing
as a melodic arch.
The best way to address this problem is by proposing
several alternative operational definitions of a melodic arch.
For each operational definition, we can examine a large
number of phrases to determine which phrases conform to
the given definition.
Suppose that we provide 5 contrasting definitions of a
melodic arch.
If we can show that most phrases are arch-shaped, no
matter which definition we choose, then it becomes
more difficult for someone to claim that arch-shaped
phrases are mere figments of our imagination.
See
The Melodic Arch in Western Folksongs.
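The strategy of testing several alternative operational definitions can be sketched in code. The three definitions below are hypothetical illustrations made up for this sketch, not the definitions used in any published study, and pitches are encoded simply as scale steps (A=0, B=1, C=2, D=3, E=4). Each function is one candidate definition of an "arch-shaped" phrase, applied to the example phrase A-B-C-D-E-C-E-D-C-B-A.

```python
def strict_arch(pitches):
    """Definition 1: strictly rises to one peak, then strictly falls."""
    peak = pitches.index(max(pitches))
    rising = all(a < b for a, b in zip(pitches[:peak], pitches[1:peak + 1]))
    falling = all(a > b for a, b in zip(pitches[peak:], pitches[peak + 1:]))
    return 0 < peak < len(pitches) - 1 and rising and falling

def peak_in_middle(pitches):
    """Definition 2: the highest note first occurs in the middle third of
    the phrase, and the first and last notes both lie below it."""
    n = len(pitches)
    top = max(pitches)
    peak = pitches.index(top)
    return n / 3 <= peak < 2 * n / 3 and pitches[0] < top and pitches[-1] < top

def middle_is_highest_on_average(pitches):
    """Definition 3: the mean pitch of the middle portion exceeds the mean
    pitch of both the opening and the closing portions."""
    k = len(pitches) // 3
    if k == 0:
        return False  # too short to divide into three portions
    first, middle, last = pitches[:k], pitches[k:-k], pitches[-k:]
    mean = lambda xs: sum(xs) / len(xs)
    return mean(middle) > mean(first) and mean(middle) > mean(last)

DEFINITIONS = [strict_arch, peak_in_middle, middle_is_highest_on_average]

# The example phrase from the text: A-B-C-D-E-C-E-D-C-B-A.
example = [0, 1, 2, 3, 4, 2, 4, 3, 2, 1, 0]
results = {fn.__name__: fn(example) for fn in DEFINITIONS}
for name, is_arch in results.items():
    print(f"{name}: {is_arch}")
```

Under these assumed definitions the example phrase fails the strict definition (because of the "C" in the middle) but satisfies the other two. Applied to a large sample of phrases, the same loop would show whether a tendency toward arch shapes holds regardless of which definition is chosen.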
problem of ecological validity The problem of generalizing results from controlled experiments
to real-world contexts. Advice: Seek converging evidence between controlled experiments and
experiments in real-world settings.
naturalist fallacy The belief that what IS is what OUGHT to be. Advice: Imagine desirable alternatives.
presumptive representation The practice of representing others to themselves. (Natoli, 1997; p.151). Advice: Exercise care when portraying or summarizing the views of others
-- especially when your portrayal causes a disadvantaged group
to lose power.
exclusion problem The tendency to prematurely exclude competing views. (Natoli, 1997; p.151). Advice: Remember that "no theory is ever truly dead." (Popper)
contradiction blindness The failure to take contradictions seriously. Advice: Attend to possible contradictions.
multiple tests If a statistical test relies on a 0.05 significance level,
then, on average, a spuriously significant result will occur
about once in every 20 tests performed. Advice: Avoid excessive numbers of tests for a given data set.
Use statistical techniques to compensate for multiple tests.
Split large data sets into one or more "reserved sets."
Prefer hypothesis testing over open-ended chasing after
significance.
magnitude blindness The tendency to become preoccupied with significant
results that have a small magnitude of effect. Advice: Aim to uncover the most important factors first.
regression artifacts The tendency to interpret regression toward the mean
as an experimental phenomenon. Advice: Don't use extreme values as a sampling criterion.
Use a control group (such as scrambling orders) to
compare with the experimental group.
range restriction effect Failure to vary an independent variable over a sufficient
range of values -- with the consequence that the effect size
looks small. Advice: Decide what range of a variable or what effect size is of
interest. Run a pilot study.
ceiling effect When a task is so easy that the experimental manipulation shows
little/no effect. Advice: Make the task more difficult. Run a pilot study.
floor effect When a task is so difficult that the experimental manipulation
shows little/no effect. Advice: Make the task easier. Run a pilot study.
sampling bias Any confound that causes the sample to not be representative
of the pertinent population. Advice: Use random sampling.
If there are identifiable sub-groups
use a stratified random sample.
Where possible, avoid "convenience" or
haphazard sampling.
subsample ignorance Failure to recognize that sub-groups within a sample
respond differently. For example, where responses diverge between
males and females, or between vocalists and instrumentalists. Advice: Use descriptive methods and data exploration
methods to examine the experimental results.
Use cluster analysis methods where appropriate.
cohort bias or cohort effect Differences between age groups in a cross-sectional study
that are due to generational differences rather than due
to the experimental manipulation. Advice: Use a narrower range of ages. Use a longitudinal design
instead of a cross-sectional design.
Suppose we had a theory that listeners tend to become
less tolerant of dissonance as they age.
In a cross-sectional design, we might randomly select 30 people
(say) for each of the ages of 15, 25, 35, 45, 55, 65 and 75.
We might present each subject with chords, phrases, passages,
or complete musical works, and ask them to rate how
unpleasant or annoying they find them.
Further suppose that we found that older listeners rate
the passages significantly more unpleasant/annoying than
younger listeners.
On this basis, we would not be justified in claiming that
listeners become less tolerant of dissonance with increasing age.
Why?
It is possible that music has become more dissonant
with successive generations.
In other words, the effect may have nothing to do with age,
and may be simply attributable to when a person was born.
A better approach to studying this question would employ a
longitudinal design.
Using this approach, the researcher would follow specific
individuals as they age over several decades.
One would need to test each subject several times over a
long period of time to determine whether they rate the
same sounds more unpleasant or annoying as they grow older.
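The difference between the two designs can be made concrete with a simulation. The Python sketch below is purely illustrative: the rating function, its coefficients, the test year of 2000, and the birth year of 1985 are invented assumptions. Ratings are constructed to depend only on when a listener was born, never on age. A cross-sectional sample still shows ratings rising with age, whereas following a single cohort over time shows no trend.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable
AGES = [15, 25, 35, 45, 55, 65, 75]
PER_GROUP = 30

def rating(birth_year):
    """Hypothetical unpleasantness rating that depends ONLY on birth year:
    listeners born earlier rate the passages as more unpleasant."""
    return 5.0 - 0.05 * (birth_year - 1950) + rng.gauss(0, 0.5)

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Cross-sectional design: everyone is tested in the same (assumed) year,
# so each age group was born in a different year.
TEST_YEAR = 2000
cross_ages, cross_ratings = [], []
for age in AGES:
    for _ in range(PER_GROUP):
        cross_ages.append(age)
        cross_ratings.append(rating(TEST_YEAR - age))

# Longitudinal design: the same cohort (assumed born in 1985) is retested
# at each age, so birth year is held constant.
long_ages, long_ratings = [], []
for age in AGES:
    for _ in range(PER_GROUP):
        long_ages.append(age)
        long_ratings.append(rating(1985))

cross_slope = slope(cross_ages, cross_ratings)
long_slope = slope(long_ages, long_ratings)
print(f"cross-sectional change in rating per year of age: {cross_slope:+.3f}")
print(f"longitudinal change in rating per year of age:    {long_slope:+.3f}")
```

In the simulated cross-sectional data, the apparent age trend is entirely a cohort effect, which is exactly the confound the longitudinal design removes.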
expectancy effect Any unconscious or conscious cues that convey to the subject how the
experimenter wants them to respond. Expecting someone to behave in a particular
way has been shown to promote the expected behavior. Advice: Use standardized interactions with subjects.
Use automated data-gathering methods. Use double-blind protocol.
One of the earliest demonstrations of the expectancy effect is
found in the famous case of "Clever Hans" -- a horse that
appeared to be an utter genius.
Here is a description by Robert Rosenthal:
"Hans, it will be remembered, was the clever horse who could solve
problems of mathematics and musical harmony with equal skill and
grace, simply by tapping out the answers with his hoof.
A committee of eminent experts testified that Hans, whose
owner made no profit from his horse's talents, was receiving
no cues from his questioners.
Of course, Pfungst later showed that this was not so,
that tiny head and eye movements were Hans' signals to begin
and to end his tapping.
When Hans was asked a question, the questioner looked at
Hans' hoof, quite naturally so, for that was the way for
him to determine whether Hans' answer was correct.
Then, it was discovered that when Hans approached the
correct number of taps, the questioner would inadvertently
move his head or eyes upward -- just enough that Hans
could discriminate the cue, but not enough that even
trained animal observers or psychologists could see it."
When interacting with an experimental subject, there are
innumerable subtle cues by which the experimenter may
unwittingly communicate what the subject is expected to do.
In such circumstances it is better that the experimenter
not be present, or that the experimenter not know the
purpose of the experiment (double-blind), or that the
interaction with the subject is done using automated
procedures.
placebo effect The positive or negative response arising from the subject's belief
about the efficacy of some manipulation. Advice: Use a placebo control group.
Medical practitioners long ago discovered that a
patient's belief in the effectiveness of a treatment
can have a marked impact on their recovery, or
on their experience of pain.
Roughly one-third of patients report feeling better
after taking a simple "sugar pill".
The placebo effect makes it more difficult to test
the efficacy of new drugs in pharmaceutical research.
The simple act of injecting someone, or giving them a pill,
will cause people to report improvement.
Consequently, drug trials must include a "placebo group"
who are treated in an identical fashion to the experimental
group -- with the exception that any pills or injections
use inert substances.
A drug is likely to be effective only if the improvement
for the experimental group exceeds any improvement
among the placebo group.
The placebo effect is not limited to improvement.
Simply suggesting that a pill contains a toxin, for example,
is likely to make recipients feel sick.
In some cases, the placebo effect can be remarkably large.
For example, Marlatt and Rohsenow (1980) carried out a
study comparing the effect of alcohol with
the psychological effect of believing one is drinking
alcohol.
The results showed that the belief that one has consumed
alcohol has a greater effect on behavior than the alcohol
itself.
The placebo effect is an example of the influence of
demand characteristics.
demand characteristics Any aspect of an experiment that might inform subjects of
the purpose of the study. Advice: Control demand characteristics by: (1) using deception
(for example, by adding "filler" questions that make it more
difficult for subjects to infer the experimental question),
(2) debriefing subjects at the end of the experiment,
(3) using field observation,
(4) avoiding within-subjects designs where all subjects are aware of all
the experimental conditions,
(5) asking subjects not to discuss the experiment with
future participants.
People who participate in studies, surveys or experiments
are not "inert" or "neutral".
People form their own intuitions about the purpose of a
study, and will often respond in a way that reflects
their opinion about the hypothesis, rather than their
unreflective way of behaving.
For example, a person might receive in the mail a survey
distributed by a large chemical corporation.
Browsing through the survey, the respondent might form the
view that the industry is carrying out the survey in the
hopes of showing that people are less concerned about
pollution than is widely believed.
The respondent might be offended by this possibility,
and consequently respond to questions in a way that
actually exaggerates their views.
Conversely, participants in experiments are often
eager to please the researcher.
This is especially true in cross-cultural studies.
Faced with a European anthropologist, a native
Papuan might well respond to questions based on what
they think the anthropologist wants to hear.
Some years ago, I carried out an experiment to try to
determine whether people are prejudiced against women composers.
In the experiment, we provided listeners with copies of
concert programs that identified various pieces of contemporary
art music and included brief biographical descriptions of the
composers -- some of whom were male and some female.
We played brief excerpts from each piece on the program
and asked listeners to rate how well they liked them.
Two groups of listeners were used.
For one group, we made slight changes to the program so
that male and female composers were switched.
So some listeners heard an excerpt thinking the composer was
a woman, whereas other listeners heard the same excerpt
thinking the composer was a man.
We found no difference whatsoever in the ratings according
to sex.
Some excerpts were rated more highly than others, but it
didn't matter whether the composer was a man or a woman.
After the experiment, we carried out brief interviews with
our participants.
We discovered that the vast majority of our listeners
accurately suspected that the purpose of the experiment was to
determine the effect of sex on musical ratings.
If we had not carried out post-experiment debriefings,
we might have wrongly concluded that listeners are
not prejudiced by whether a composer is a man or a woman.
A better experiment might have found a much more subtle
way (deception) to imply that a given excerpt is written
by a man or a woman.
Demand characteristics are also evident in non-experimental
situations.
For example, after being taught a particular analytic method,
music students will naturally tend to assume that an
assigned musical work can readily be explicated using
that method.
reactivity problem When the act of measuring something changes the
measurement itself. (See Cozby, p.33) Advice: Use clandestine measurement methods.
history effect Any change between a pretest measure and posttest measure that is
not attributable to the experimental manipulation. Advice: Isolate subjects from external information.
Use post-experiment debriefing to identify possible confounds.
maturation confounds Any changes in responses due to changes in the subject not related to
the experimental manipulation.
Examples of maturation changes include increasing boredom, becoming hungry,
and (for longer experiments) reduced reaction times, fading beauty,
becoming wiser, etc. (See Cozby, p.68) Advice: Prefer short experiments. Provide breaks. Run a pilot study.
testing effect In a pretest-posttest design, where a pre-test causes subjects to
behave differently. (See Cozby, p.69). Advice: Use clandestine measurement methods.
Use a control group with no manipulation between pre- and post-test.
carry-over effect When the effects of one treatment are still present when the
next treatment is given. (See Cozby, p.281) Advice: Leave lots of time between treatments.
Use between-subjects design.
order effect In a repeated measures design, the effect that the order of
introducing treatment has on the dependent variable. Advice: Randomize or counter-balance treatment order.
Use between-subjects design.
Suppose you play a series of musical excerpts and ask listeners
to rate how well they like each excerpt.
It turns out that the first musical excerpt tends
to be more highly rated than the subsequent excerpts.
This effect is common: people enjoy the first bite from a
chocolate bar more than subsequent bites.
In addition, if a "nice" musical excerpt is preceded by an
"ugly" musical excerpt, then the "nice" excerpt will typically
be rated as more pleasant than if it had been preceded by a
less "ugly" excerpt.
These order effects must be overcome whenever one asks
people to rate several musical excerpts or stimuli.
One way to avoid order effects is to provide each listener
with a unique random ordering of the excerpts.
This means that all excerpts have an equal probability of
occurring first, or occurring after an especially "ugly" excerpt.
In short, one should avoid playing the excerpts in the same
order to each of the listeners.
Alternatively, the experimenter can use all possible
orderings. This can be done only if the number of items
is small.
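The two strategies just described, a unique random ordering per listener and the use of all possible orderings, can be sketched as follows. This is a minimal illustration only: the function names, the excerpt labels, and the fixed random seed are assumptions made for the example.

```python
import random
from itertools import permutations


def random_order(excerpts, rng):
    """Return a freshly shuffled copy of the excerpts for one listener."""
    order = list(excerpts)
    rng.shuffle(order)
    return order


def all_orders(excerpts):
    """Return every possible ordering; the count is n!, so keep n small."""
    return [list(p) for p in permutations(excerpts)]


excerpts = ["A", "B", "C"]  # hypothetical excerpt labels
rng = random.Random(0)      # seeded only so the example is repeatable

# Strategy 1: an independent random ordering for each of five listeners.
listener_orders = [random_order(excerpts, rng) for _ in range(5)]

# Strategy 2: all possible orderings (feasible only for few items).
orders = all_orders(excerpts)
print(len(orders))
```

With three excerpts there are only six possible orderings, but the count grows factorially (ten excerpts give 3,628,800), which is why the exhaustive approach works only when the number of items is small.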
mortality problem In a longitudinal study, the bias introduced by some subjects
disappearing from the sample. Advice: Convince subjects to continue; investigate possible differences
between continuing and non-continuing subjects.
premature reduction The tendency to rush into an experiment without first
familiarizing yourself with a complex phenomenon. Advice: Use descriptive and qualitative methods to explore
a complex phenomenon. Use exploratory information to help form
testable hypotheses and to identify plausible confounds that
need to be controlled.
spelunking Exploring a phenomenon without ever testing a proper
hypothesis. Advice: Formulate and test hypotheses.
shifting population problem The tendency to reconceive a sample as representing a different
population than originally conceived. Advice: Write down in advance what you think is the population.
instrument decay Changes of measurement over time due to fatigue, increased observational
skill, or changes of observational standards. Advice: Use a pilot study to establish observational standards and develop skill.
reliability problem When various measures or judgments are inconsistent. Advice: (1) train experimenters carefully,
(2) pay careful attention to instrumentation, (3) measure reliability,
and avoid interpreting effects smaller than the error bars.
hypocrisy Holding others to a higher methodological standard than oneself. Advice: Employ higher standards than others.
At first, this advice will seem unfair.
Surely, there is no imperative for one researcher (me) to be more rigorous
than another researcher.
There are two rejoinders to this complaint.
First, legitimate criticisms can come from otherwise incompetent or ill-informed people.
We may rightly feel angry when another scholar criticizes our work,
when their own work is less rigorous.
But this does not mean that the criticism is less valid.
Secondly, by following the advice to employ higher standards than others,
a "virtuous circle" is established through which the methodological rigor
of a discipline is advanced.
It is for these reasons that hypocrisy is here defined as "holding others
to a higher methodological standard than oneself" rather than simply
"holding others to a higher methodological standard".
As defined here, hypocrisy is an error that only I can make.
If another person holds me to a higher methodological standard, he/she
is not a hypocrite.
But if I hold someone else to a higher methodological standard, then I
am a fully qualified hypocrite.
"That's an absolute goldmine you've got there," she said.
"Yes, I know," responded my colleague proudly.
"Have you seen my chronicle?" he asked of the other scholar.
"No," she replied with a guilty look on her face.