Changes in the notes throughout the class will be indicated here.
Mention the research paper and in-class tests. Mention the MSU Honor Code.
Classical era- 700 BC to 1850 AD
Philosophical orientation, asks the "ought", how should things be, asks what justice is, who should rule (the wise or the multitude), what are the obligations of citizens and of government.
Institutional era- 1850 to 1900 AD
Traditional approach, focus on institutional process, how a bill becomes a law, the structure of the government, a legalistic case-study approach, a nation is seen as unitary and as a rational actor, very descriptive approach, very historical method.
The Traditionalist approach to analysis combines the classical and institutional eras.
Transitional era- 1900 to 1945. Problems of irony of form, pluralism exists.
Behavioral era- 1945 to present- characteristics are:
1) Science, theory, predictions, explanation, patterns:
Examples:
a) Theories of presidential voting behavior: sociological theory of
voting, such as race and income affecting a voter's vote
choice; social-psychological University of Michigan model using party
identification, issues, and candidate evaluations; simple satisfaction
versus dissatisfaction predicting vote for presidential party's
candidate.
b) Southern state legislative groups: white Republicans are fairly
conservative; African-American Democrats are fairly liberal; and white
Democrats are essentially centrist or moderate, depending on the
issues.
c) Mass-elite study: which party organization is closer to the average
voter on issues? In the 1991 and 2001 Alabama-Mississippi study, it was the
Democrats, though their organization has moved to the left over the years.
A contrary study examined why Republicans now control a majority of U.S.
House and Senate seats in the South; it found that Democratic House
members and Senators had steadily moved ideologically to the left since
1970, suggesting that Democratic "elites" in today's South may have become
too liberal for many white southern voters.
2) Data gathering and research are theory directed
Examples:
a) For presidential voting behavior, a national survey of voters was
conducted, asking their party identification, their attitudes on public
issues, and their likes and dislikes of the major party candidates;
such a national study would also ask voters their race and income,
and whether they were financially satisfied or dissatisfied.
b) For southern state legislative factions, we identified legislators'
party from their websites, their race from their pictures, and their roll
call votes from newspapers' reports.
c) For mass-elite study, we conducted mail surveys of Democratic and
Republican county executive committee members in Mississippi and Alabama,
and statewide telephone polls of average adults in both states. We asked
identically worded questions on about twenty different public policy
issues, including ideological self-identification and party
identification.
3) Value free
Examples:
a) We simply seek to predict and explain how the political world operates;
we do not let our own opinions about how it should operate influence our
research. Hence, though conservatives may claim that Reagan won in 1980
because of his conservative philosophy, and liberals may claim that
Clinton won in 1992 because of his moderate liberal philosophy, our
research may indicate that each won merely because voters were
dissatisfied with the economic recessions (and in 1980 the foreign policy
crises).
b) A researcher may be a disillusioned liberal who believes that
African-American lawmakers are isolated from all other lawmakers, but the
data may show that white Democrats often vote with them on education,
race, and election issues, and that Republican lawmakers lose on more roll
call votes than do black Democrats (if Democrats control the legislature
numerically).
c) A researcher may be a conservative Republican who has friends in the
state Republican party headquarters, but the data may show that average
voters are essentially moderate, that Democratic party organization
members in the South until the turn of the century were moderate liberal,
while Republican party organization members were conservative. Hence,
especially on education and health care issues, Democratic party members
were closer to average southern voters than were Republicans, at least
until the turn of the century. A researcher may be a liberal, but the data
would show that Democratic U.S. House members and Senators from the South had
increasingly liberal voting records as the decades passed from 1970 to 2010;
hence, today's Democratic elites may be too liberal for many conservative
white southern voters.
4) Interdisciplinary- sociology, psychology, economics
Examples:
a) The earlier American presidential election studies of the 1940s relied
heavily on sociology, proposing that group membership affected the party
voted for outside of the South. Rurality, Protestantism, and higher income
predicted more Republican votes, while urban residence, Catholicism, and
lower income predicted more Democratic votes.
b) My study of Balance Theory drew on psychology. People tend to acquire
and retain psychologically consistent beliefs and attitudes. If a person
likes a candidate, they tend to believe that the candidate agrees with
their own positions on issues regardless of whether the candidate actually
does; if a voter dislikes a candidate, they tend to believe that they are
in disagreement with the candidate on the issue.
c) Shaffer worked with an economics professor (Chressanthis) in studying
whether U.S. Senate election margins were accountable to the public, which
they were in an indirect sense. Elections were affected
by presidential coattails, campaign spending, divisive primaries, and
preceding election margin. Economic conditions in the state and federal
pork barrel dollars did not affect the elections.
5) Methodological sophistication-
Examples:
a) We conduct national public opinion polls that are representative of
the nation's diversity. We do not conduct shopping mall polls, or phone-in
or internet polls that fail to reflect the views of lower socioeconomic
classes. So we can test whether the sociological group,
social-psychological, or economic models of presidential voting are
upheld. In yet another published study, I used such national polls from
1960 through 1976 to explain how voter turnout declined due to decreased
political efficacy, decreased partisan intensity, and decreased newspaper
readership.
b) The southern state legislative factions research started with one
southern state, Mississippi, in only a few years. We expanded to a twenty
year time frame in Mississippi. Then, we added other southern states, like
Georgia, Florida, Arkansas, and Texas.
c) My mass-elite linkage study started with just Mississippi in one year,
but then I contacted Pat Cotter at the University of Alabama and we had a second
state for confirmation. We also did the study originally in 1991, and
then repeated it in 2001. Finally, in 2001 we also included a national
survey that was representative of the entire South, which had many
spending preference items.
d) Shaffer's study of balance theory relied on the 1994-1996 American
national panel study to examine cognition change over time; panel studies
follow the same people over time.
e) The Shaffer and Chressanthis study of Senate accountability used a pooled
time-series, cross-sectional approach. All even-numbered years from 1976
through 1986 were included, as were all 33 state contests in each election
year. Regression and probit were used.
6) Individual and group level of analysis
a) The presidential voting studies used the individual voter as the unit
of analysis.
b) The southern state legislative factions also looked at individuals
(legislators in this case), but they combined them into three groups based
on their race and party.
c) Balance theory and voter turnout studies also looked at
individuals.
d) Mass-elite study looked at individuals of different types, the mass
voter versus the elite party member.
Criticisms of Behavioralism- are people and events predictable, can we be value free; discuss
History of Public Administration.
Methodological Issues in PA
Four characteristics of a good theory:
1) Explanation- why does something happen
Examples:
a) Presidential voting models. People vote Democratic because they
psychologically identify with the Democratic party, because they are
liberal, and because they prefer the Democratic presidential candidate's
characteristics. Or, people hold the President's party responsible for
economic conditions in the country, so they tend to vote for the President
or his party's successor when things are going well, and they tend to
vote against him or his party's successor when things are going badly.
b) Southern state legislative factions. White conservatives are
gravitating toward the more conservative party nationally, the
Republicans, therefore white Republican legislators tend to vote
conservatively. Liberal African-Americans tend to join the more liberal
party nationally, the Democrats, so African-American Democratic
legislators tend to vote liberally. Moderate whites tend to join the more
ideologically inclusive party in the modern South, so they tend to be
Democrats; hence, white Democratic legislators tend to vote moderately.
2) Prediction- if we know people's positions on the
independent variables, can we predict their positions on
the dependent variables
Examples:
a) In presidential vote model, if a voter is a Democrat, a liberal, and
prefers the Democratic candidate's attributes, we predict that they would
vote for the Democratic presidential candidate. If a voter is a
Republican, a conservative, and prefers the Republican candidate's
attributes, we predict they would vote for the Republican presidential
candidate.
b) In the southern state legislative project, we predict that
African-American Democrats will tend to vote more liberally, against
anti-crime measures, for public education projects, and for
affirmative action programs. We predict that white Republicans will
tend to vote in the opposite manner, in a conservative direction. We
also predict that white Democrats will tend to vote somewhere in
between these two groups.
c) Clinton impeachment vote was very partisan in committee. In House
Judiciary Committee, conservative white male Republicans opposed
demographically diverse liberal Democrats.
3) Generalizability- does theory apply to different
situations and circumstances and time and geographic areas
Examples:
a) Presidential vote model. Can apply to other offices, such as
U.S. Congress, governor, and state legislature. Applies to any time span;
19th century would have different parties though (Whigs and Democrats,
Federalists and Democratic-Republicans). Can apply to different geographic
areas, such as other nations (Ohio State professor Bradley Richardson used
party identification model in Japan, Netherlands, Germany, France, Britain,
Italy).
b) Southern state legislative factions project. Can be generalized to
other southern states, even to northern states and the Congress, as the
literature indicates. Can be generalized over time, such as 1980 to
present. Can it be generalized to other nations having a newly empowered
group, such as South Africa?
4) Parsimony- simple with few independent variables,
simplest theory is best if everything else is equal
Examples:
a) Presidential Vote models. The model is parsimonious, as it has only three
predictors--party identification, issues, candidates. The economic
dissatisfaction model has even fewer predictors--one.
b) Southern state legislative factions project. It has only two
predictors--party and race of legislator. The dependent variable is less
parsimonious, as it is not merely ideology, but different types of issues
such as education, crime, race issues.
Example of Predictive Ability of a Theory.
The party identification model. The last eight presidential elections (since 1984, inclusive) were very competitive with Democrats winning four and Republicans winning four. So if we had no other information about a state like Mississippi, we would predict that a Mississippi survey respondent would have a 50-50 chance of voting Democratic or Republican. Our predictive success improves once we ask a respondent what their party identification (a 7-point scale) is. How they vote follows (using the Mississippi Poll data):
Many conservative whites switched to the GOP in the last two decades of the 20th century, so let us repeat this analysis, examining only the first four presidential elections in the 21st century, where Republicans won the first two with Bush, and Democrats won the last two with Obama.
Hypothesis Testing
Independent variable is the predictor; it comes first temporally and causally, it causes the dependent variable.
Dependent variable is the effect, it is being caused by the independent variable.
Ideology (Independent Variable) --------------------------> Presidential Vote (Dependent Variable)
Hypothesis is a statement of a relationship between concepts.
Example: self-identified conservatives are more likely to vote Republican, compared to self-identified liberals.
Hypothesis test- example with crosstabulations, put independent variable at top, dependent variable at the side. Calculate column percents.
VOTE FOR: | | | |
BARACK OBAMA | 81% | 67% | 27% |
MITT ROMNEY | 19% | 33% | 73% |
TOTAL | 100% | 100% | 100% |
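The column-percent calculation described above can be sketched in a few lines of Python. This is only an illustration: the cell counts and the ideology column labels below are hypothetical (they are not Mississippi Poll data), and `column_percents` is a helper name made up for this sketch.

```python
def column_percents(counts):
    """Convert a crosstab of raw counts into column percentages.

    counts maps each dependent-variable category (row) to a list of
    cell counts, one per independent-variable category (column).
    """
    n_cols = len(next(iter(counts.values())))
    # Column totals: add up every row's count for that column.
    totals = [sum(row[c] for row in counts.values()) for c in range(n_cols)]
    return {
        label: [round(100 * row[c] / totals[c]) for c in range(n_cols)]
        for label, row in counts.items()
    }


# Hypothetical counts; columns are liberal, moderate, conservative.
hypothetical_counts = {
    "Democratic vote": [80, 60, 30],
    "Republican vote": [20, 40, 70],
}

percents = column_percents(hypothetical_counts)
for label, row in percents.items():
    print(label, " | ".join(f"{p}%" for p in row))
```

Because the independent variable runs across the top, each column sums to 100%, so the percentages are compared across a row (in this made-up table, 80% of liberals versus 30% of conservatives voting Democratic).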
Theory
|
\|/
Hypothesis
Concept <------------------------> Concept
(Relationship between concepts)
The hypothesis above is at the theoretical level- general, abstract
Indicator <------------------------> Indicator
(Relationship between indicators; hypothesis testing)
Operationalizing your concept is to select specific indicators of your abstract concepts. Hypothesis testing occurs at the indicator level, and it measures the relationship between the indicators.
If hypothesis is rejected, maybe the indicator is not valid.
Religiosity example of a theory.
At the theoretical level, the two principal concepts are Social
Deprivation and Religiosity. The principal hypothesis at the theoretical
level is that people who are socially deprived are more likely to be
intensely religious than are people who are not socially deprived.
Operationalizing the concepts is to choose valid, specific
indicators of those concepts. One indicator of religiosity might be
frequency of church attendance. An indicator of social deprivation might
be annual family income before taxes. The major problem with
operationalizing one's concepts is whether the indicators are valid
measures of those theoretical concepts. Is a person who attends church
twice a week necessarily more religious than someone who never attends
church, but who reads the Bible and prays daily? Is a person with a large
family income, but who also has a large family size, necessarily well-off
financially? Can you think of more valid indicators of these concepts of
social deprivation and religiosity?
Hypothesis Testing measures the relationship between the indicators. Are people with low family incomes more likely to attend church weekly, compared to people with high family incomes? Are people with lower net worth more likely to pray daily, compared to people with higher net worth? If your hypothesis is rejected, there may be two reasons. Perhaps your theory is wrong, or perhaps your indicators are not valid measures of your concepts.
Actual Results of This Hypothesis Test:
Using the 2004-2012 Mississippi Poll, no substantively significant relationship was found between reported family income and reported frequency of church attendance (indeed, the relationship was the reverse of what we hypothesized).
YOUR RESEARCH PAPER
1) Introduction- discuss the importance of your subject. Discuss your initial expectations. Example of gender gap in presidential voting--why are women voting slightly more Democratic than are men? Why is this subject important? Why do you think this female Democratic bias is occurring?
2) Your model and hypotheses. List all five of your hypotheses, and draw your model.
Example of a model and its hypotheses:
Assume that sex is the earliest, independent variable; presidential vote
is the latest, dependent variable; ideology and income are the two
intervening variables located between sex and vote.
SEX (Male or Female) --(H1)--> Ideology --(H2)--> PRESIDENTIAL VOTE (D or R)
SEX (Male or Female) -----------(H3)-----------> PRESIDENTIAL VOTE (D or R)
SEX (Male or Female) --(H4)--> Income ---(H5)--> PRESIDENTIAL VOTE (D or R)
The hypotheses are:
H1: Women are more likely to be liberal, compared to men.
H2: Liberals are more likely to vote Democratic for President, compared to
conservatives.
H3: Women are more likely to vote Democratic for President, compared to
men.
H4: Women are more likely to have lower incomes, compared to men.
H5: Lower income people are more likely to vote Democratic for president,
compared to higher income people.
3) Literature review. Need at least 10 academic sources. The articles should be grouped by hypothesis, even if you must discuss the same article more than once. Most students use an on-line database for their literature search, such as JSTOR (website: http://www.jstor.org/action/showAdvancedSearch). For my on-line bibliography of articles since 1975 in four political science journals, click here. When in the internet, click on EDIT at top of page, then click on FIND (ON THIS PAGE), and then type in the keyword in the FIND WHAT box. Keep clicking on the FIND NEXT box to find multiple articles. Also, use different keywords for each of your variables (concepts).
4) Methods section. Provide information for each of the years of the Mississippi Poll that you are using. For information about the polls, click here. Information on the sampling methods used in each year is provided here. Three sample paragraphs for your paper follow:
To test my model, I used information drawn from The Mississippi Poll project, a series of statewide public opinion polls conducted by the Survey Research Unit of the Social Science Research Center (SSRC) at Mississippi State University and directed by political science professor Stephen D. Shaffer. In order to maximize my sample size and therefore minimize my sample error, I combined or pooled telephone surveys conducted in two years-- 2000 and 2004. The 2000 Mississippi Poll surveyed 613 adult Mississippi residents from April 3 to April 16, 2000 and had a response rate of 49%, while the 2004 Mississippi Poll surveyed 523 adult Mississippi residents from April 5 to April 21, 2004 for a response rate of 48%. The two years combined contained only 765 likely voters- respondents whose responses to three questionnaire items indicated that they were likely to vote in the presidential election, and to vote for candidates of the two major parties. With 765 likely voters interviewed, the sample error is 3.6%, which means that if every Mississippi likely voter had been interviewed, the results could differ from those reported here by as much as 3.6%. The pooled sample was adjusted or weighted by demographic characteristics to ensure that social groups less likely to answer the surveys or to own telephones were also represented in the sample in rough proportion to their presence in the state population. In both years, a random sampling technique was used to select the households and each individual within the household to be interviewed, and no substitutions were permitted. The SSRC's Computer Assisted Telephone Interviewing System (CATI) was used to collect the data.
I relied on four variables included in both years of the Mississippi Poll. Sex is very straightforward, while income was measured by reported total family income before taxes in the year before each survey. The presidential vote item asked respondents six months before the election which of the two major party candidates they planned to vote for if the election were held today. Ideology was a self-identification question, asking respondents the following question: "What about your political beliefs? Do you consider yourself very liberal, somewhat liberal, moderate or middle of the road, somewhat conservative, or very conservative?"
In order to have enough people to analyze using multivariate tables, I recoded or combined categories of two of the variables. Eight income categories were recoded into three levels--low income was defined as families making less than $20,000 a year, middle income was considered as $20-40,000 per year, and high income included families making over $40,000 annually. Five ideological self-identification categories were combined into three groups-- liberals included those considering themselves as "very" or "somewhat" liberal, conservatives were those identifying themselves as "somewhat" or "very" conservative, and the middle category of "moderate/middle of the road" constituted an intermediate "moderate" grouping. Sex and presidential vote already had only two categories for each, so they did not have to be recoded.
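Two of the calculations in the sample paragraphs above can be sketched in Python. The sketch rests on assumptions: the source does not say which formula produced the 3.6% sample error, so the code simply shows that the common rule of thumb 1/sqrt(n) (a multiplier of 2 at a 50-50 split) is consistent with that figure, while the 1.96 multiplier gives slightly less. The numeric ideology codes (1 through 5) and all function names are hypothetical; only the dollar cutoffs come from the text, and the original eight income categories are not reproduced here.

```python
import math


def sample_error(n, z=2.0):
    """Approximate 95% sampling error for a proportion near 50%.

    z=2.0 gives the rule of thumb 1/sqrt(n); z=1.96 is the usual exact value.
    Ignores weighting and design effects (an assumption of this sketch).
    """
    return z * math.sqrt(0.5 * 0.5 / n)


def recode_income(dollars):
    """Collapse family income into the three levels described in the text."""
    if dollars < 20_000:
        return "low"
    if dollars <= 40_000:
        return "middle"
    return "high"


# Hypothetical coding: 1 = very liberal ... 5 = very conservative.
IDEOLOGY_RECODE = {
    1: "liberal",
    2: "liberal",
    3: "moderate",
    4: "conservative",
    5: "conservative",
}

print(f"{sample_error(765):.1%}")        # rule-of-thumb multiplier
print(f"{sample_error(765, 1.96):.1%}")  # 1.96 multiplier
print(recode_income(35_000), IDEOLOGY_RECODE[4])
```

Collapsing categories this way trades detail for larger cell sizes, which is the reason given above for recoding before building multivariate tables.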
5) Findings-- bivariate. Test each of your 5 hypotheses using crosstabs. Compare percentages using complete sentences, which test your hypotheses. Mention the direction of the relationship, the magnitude of the relation using gamma or average percentage difference, and statistical significance level using chi-squared. Also, draw all tables and provide variable and value labels, and column percents and totals.
TABLE 3
SEX DIFFERENCES IN PRESIDENTIAL VOTE
Presidential Vote | Male | Female |
Gore or Kerry (D) Vote | 41% | 43% |
George W. Bush (R) Vote | 59% | 57% |
N Size | (359) | (406) |
Gamma = -.04
Chi-squared significance level > .05
Note: Percentages total 100% down each column.
Source: 2000 and 2004 Mississippi Polls, conducted by Mississippi State
University.
Example of text paragraph:
Hypothesis 3 of my model states that women will be more likely to vote Democratic for president, compared to men. In the 2000 and 2004 Mississippi Polls, 43% of women indicated that they intended to vote for Democratic presidential candidates, compared to a slightly smaller 41% of men who indicated an intended Democratic vote. However, this percentage difference in Democratic vote between the sexes is only 2%, and the gamma value reflecting the magnitude of the relationship between sex and the presidential vote is a mere -.04. Furthermore, the Chi-squared statistic is not significant at the .05 level, indicating that we cannot generalize this weak relationship between sex and the presidential vote, found in the 2000 and 2004 statewide polls, to the entire population. Hence, my hypothesis that women are more likely than men to vote Democratic for president is rejected.
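The gamma and chi-squared figures reported for Table 3 can be roughly checked by hand. The sketch below is an approximation under a stated assumption: the cell counts are reconstructed from the rounded percentages and N sizes in the table (for example, 41% of 359 men is about 147), so they are not the actual weighted poll counts. The function names are made up for this illustration, and 3.841 is the standard .05 critical value of chi-squared with one degree of freedom.

```python
def gamma_2x2(a, b, c, d):
    """Goodman-Kruskal gamma for a 2x2 table laid out as [[a, b], [c, d]]."""
    concordant = a * d
    discordant = b * c
    return (concordant - discordant) / (concordant + discordant)


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts reconstructed from Table 3 (an assumption, see above).
male_n, female_n = 359, 406
male_dem = round(0.41 * male_n)
female_dem = round(0.43 * female_n)
male_rep = male_n - male_dem
female_rep = female_n - female_dem

gamma = gamma_2x2(male_dem, female_dem, male_rep, female_rep)
chi2 = chi_squared_2x2(male_dem, female_dem, male_rep, female_rep)

CRITICAL_05_DF1 = 3.841
print(f"gamma = {gamma:.2f}")
print(f"chi-squared = {chi2:.2f}, significant at .05: {chi2 > CRITICAL_05_DF1}")
```

With these reconstructed counts, gamma rounds to -.04 and chi-squared falls well below 3.841, which agrees with the table's report of a weak, statistically insignificant relationship.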
6) Findings- multivariate. At least control for your two intervening variables. Provide the information listed in 5. What do these multivariate tables tell you about which of the variables is important in influencing the dependent variable, and about how important each is?
7) Conclusions- Redraw your model, discuss your findings and literature, suggestions for future research.
8) References- alphabetize your references by authors' last name. Give full citations for scholarly articles, books, and other citations.
Review a sample model and hypotheses.
Review the Mississippi Poll codebook, and choose four variables that will constitute the model that you will do your research paper on. Now, draw up the model, and type the exact wording of your five hypotheses. Turn this in to me at the next class.
An easy-to-read summary of the Mississippi Poll codebook, which includes the variables that are included in multiple years, is available here.
Examine the three samples of student research papers:
Sample one
Sample two
Sample three
10 STAGES OF A RESEARCH DESIGN
1) Problem Formulation- what are you studying, why is it important. Rivenbark article, casino gambling, importance due to regressivity, hurts poor, addiction.
2) Literature review- thorough. Political science journals are: American Political Science Review, American Journal of Political Science, Journal of Politics, American Politics Quarterly, Public Opinion Quarterly. For a list of on-line political science articles, click here.
Public administration journals: Public Administration Review, American Review of Public Administration; check syllabus for a sample of articles and journals.
Literature suggests hypotheses.
3) Identify Unit of Analysis- what are you collecting data on, getting information about what units.
The four units of analysis are: individual, county, state, nation
a) Individual level examples are public opinion polls.
b) County level example is a public policy study examining spending
in each of Mississippi's 82 counties.
c) State level example is a public policy study examining spending
in each of the nation's 50 states.
d) Nation unit of analysis example may be relating each of the
world's nations' suicide rates to the absence of Catholicism in its
population.
Test your ability to identify the unit of analysis of ten different studies by going back to the directory for this class, and accessing one of the sample tests for Test 1
4) Design data collection mode- survey, roll call,
aggregate (unit of analysis above the individual), content
analysis:
a) Survey is a public opinion survey. It can be of the mass population, or
of a more specialized group, such as government workers.
b) Roll call mode deals with congressional or state legislative votes on
public issues, and often includes demographic characteristics of their
district's constituents.
c) Aggregate mode deals with a level of analysis higher than the
individual. It deals with cases that combine numbers of individuals, such
as counties, states, etc. The data are often secondary data analysis,
collected by government agencies.
d) Content analysis is a study of the characteristics of messages, such as
how ideologically biased the mass media are, and how many liberal or
conservative themes are voiced by a President or governor.
5) Pre-test survey anticipates validity problems with indicators, and suggests variables you left out. For a statewide public opinion poll of 600 Mississippians who are asked 100 questions, you might ask a random sample of 25 Starkville residents the 100 questions, and then ask the interviewers whether the respondents had difficulty answering any of the questions, and if so why.
6) Data collection- surveys use CATI system, or
secondary data analysis (use existing dataset).
CATI stands for Computer-Assisted Telephone Interviewing system, and is
used by the researcher to collect her own data on an original study.
Secondary data analysis relies on existing data sources, such as the
University of Michigan National Election Studies conducted every two
years, or the MSU Mississippi Poll conducted every two years.
7) Data reduction, usually obsolete with CATI, often needed with in-person and mail surveys; enter data into SPSS program. Demonstrate in class.
8) Design statistical analysis technique, do a simple one first such as crosstabs.
9) Perform analysis, get results, show tables and results, discuss results.
10) Conclusions- what you found, so what, importance, theory upheld or rejected, future research directions.
ASSIGNMENT DUE: TURN IN MODEL CONTAINING FOUR VARIABLES AND FIVE ARROWS, AND FIVE HYPOTHESES, AND AN INTRODUCTION TO YOUR TOPIC.
Refer to your research paper model, and the five hypotheses you have proposed to conduct your literature review. You should find scholarly journal articles that examine each of these five hypotheses. Most students use an on-line database for their literature search, such as JSTOR (website: http://www.jstor.org/action/showAdvancedSearch). You can also use the professor's on-line bibliography of journal articles. Click on "EDIT" and then "FIND IN PAGE". In the Box that says "FIND WHAT?" type in the name of one of your variables. Keep finding relevant journal articles. If you don't find enough, slightly vary the name of that variable. Now repeat this step with the other three variables. Go to the library on the bottom floor, and find each journal in the stacks in alphabetical order by the name of the journal.
Stanley Milgram study- obedience to authority.
Informed Consent, components of-
Anonymity versus Confidentiality-
Anonymity- no one can identify a person with their
responses
Confidentiality- researcher knows who the respondent
is, but promises not to tell anyone
Examples of informed consent:
1) Mississippi Poll
2) NSF Grassroots Party Activists cover letter
One must never harm subjects.
MSU Human Subjects form approval
(this is an older form, but it is referenced because it clearly and
concisely asks about important issues)
Subpoena problem- confidential data can be subpoenaed, so convert
confidential data into anonymous data as soon as possible
Studies having ethics problems:
1) MSU literacy study, when suspected interviewer fraud results in
Attorney General request for respondent info
2) Ray Cleere's workplace study included identifiable questions
and political questions, and MSU dropped out of it
3) NSF Grassroots Party Activists study- ICPSR deleted county and
state variables
Political biases are a major problem in funded research:
1) Media sensationalism- 1982 Clarion-Ledger Senate poll
2) Official suppression of studies they disagree with- Mabus
governmental child care study suppressed by Fordice administration
ASPA Code of Ethics: 5 sources of ethics
1) Serve Public Interest: oppose discrimination and harassment, promote
affirmative action; public right to know; involve citizens in
decisionmaking
2) Respect Law and Constitution: change obsolete, counterproductive laws;
prevent mismanagement of public funds, need audits; protect privileged
information; whistleblower protection
3) Personal Integrity: give others credit for their work (no plagiarism); avoid
appearance of conflict-of-interest, such as nepotism, gift acceptance,
misusing public resources, improper outside employment; act nonpartisan in
actions; admit own errors
4) Ethical Organizations: promote creativity, open communication among
workers; permit dissent, no reprisal, use due process; use merit
5) Professional Excellence: keep current on new issues, problems, upgrade
professional competence; be active in professional associations; help public
service students, such as by providing internships
Review the full text of ASPA's ethics code.
LEVELS OF MEASUREMENT
NOMINAL- lowest level of measurement, mere
classification. No ability to order the categories.
An example is religion. Use crosstabulations.
ORDINAL- able to order the categories of the variable in
terms of a category having more of something than the
next category. But can't determine how much more of that
quality that the category has compared to the other
category.
Example is rating job performance of public officials into
excellent, good, fair, or poor categories.
INTERVAL- able to order the categories, and also
determine how much of the quality the category has.
Usually has numbers that have meaning to denote how
much of the quality each category has.
Example is income. Use regression techniques.
Test your ability to classify indicators by nominal, ordinal, and interval levels of measurement by turning to the sample tests, test 1. Click here.
RELIABILITY
Definition- repeated measurements of a concept (the indicator) should yield similar results.
Tests of reliability:
1) Test-Retest- using the same indicator on the same people at two or more time points. Should have consistent responses at both time points.
TEST-RETEST RELIABILITY TEST OF PARTY IDENTIFICATION
(Note: the following table is derived from Herbert B. Asher's Presidential Elections and American Politics, 5th edition, page 71; Brooks/Cole co., 1992)
1976 Partisanship
1972 Party Id | Strong Dem. | Weak Dem. | Indep. Dem. | Pure Indep. | Indep. Rep. | Weak Rep. | Strong Rep. |
Strong Dem. | 9 | 4 | 1 | 0 | 0 | 0 | 0 |
Weak Dem. | 5 | 13 | 3 | 2 | 1 | 1 | 0 |
Indep. Dem. | 2 | 3 | 4 | 1 | 1 | 0 | 0 |
Pure Indep. | 1 | 1 | 2 | 5 | 2 | 1 | 0 |
Indep. Rep. | 1 | 0 | 1 | 3 | 5 | 2 | 1 |
Weak Rep. | 0 | 1 | 0 | 1 | 3 | 7 | 2 |
Strong Rep. | 0 | 0 | 0 | 0 | 1 | 4 | 6 |
How much stability is there in this table? How many people have given the same response at both time points? Count the number of people in the diagonal. The number remaining stable in attitudes = (9 + 13 + 4 + 5 + 5 + 7 + 6) = 49. The total number of people in the table is 100. Hence, 49% of the sample has remained stable in attitudes. Is 49% high or low reliability? The stable percent must be compared to chance alone. Chance stability is the number of stable cells, divided by the total number of cells in the table. Hence, chance stability is 7 / 49 = 14%. Since 49% is significantly higher than 14%, this indicator is reliable.
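The stability arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `stability` is invented here, and "chance" follows the definition used in these notes (stable cells divided by all cells), not a formal statistical test.

```python
# Rows are 1972 party identification, columns are 1976 partisanship,
# both ordered Strong Dem. through Strong Rep. (counts copied from the table above).
panel_1972_1976 = [
    [9, 4, 1, 0, 0, 0, 0],
    [5, 13, 3, 2, 1, 1, 0],
    [2, 3, 4, 1, 1, 0, 0],
    [1, 1, 2, 5, 2, 1, 0],
    [1, 0, 1, 3, 5, 2, 1],
    [0, 1, 0, 1, 3, 7, 2],
    [0, 0, 0, 0, 1, 4, 6],
]


def stability(table):
    """Return (share of people on the diagonal, chance share as defined in the notes)."""
    total = sum(sum(row) for row in table)
    stable = sum(table[i][i] for i in range(len(table)))
    cells = len(table) * len(table[0])
    chance = len(table) / cells  # number of stable cells divided by all cells
    return stable / total, chance


observed, chance = stability(panel_1972_1976)
print(f"Observed stability: {observed:.0%}; chance stability: {chance:.0%}")
```

The same function works for any square table whose rows and columns share the same categories, such as the 1982-1997 table that follows.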
A more recent example follows:
(Source of the following info is: Political Behavior of the American Electorate, 12th edition, by William H.
Flanigan and Nancy Zingale, p. 104; data originally are from the Youth-Parent Socialization Panel
Study, 1965-1997, Youth Wave, data provided by the ICPSR).
| DEMOCRATS In 1982 | INDEPENDENTS In 1982 | REPUBLICANS In 1982 |
DEMOCRATS In 1997 | 23 | 5 | 4 |
INDEPENDENTS In 1997 | 7 | 27 | 10 |
REPUBLICANS In 1997 | 2 | 5 | 17 |
2) Alternate Forms (Parallel Forms)- using two or more indicators on the same people at one time point. Should have consistent responses for both indicators.
ALTERNATE FORMS
2002 Party Identification
Party that is best for "People like you" | Democratic | Independent | Republican |
Democrats | 172 | 51 | 7 |
Both are Equal | 18 | 40 | 29 |
Republican | 6 | 39 | 157 |
Consistent responses for both indicators are Democrats who believe that the Democratic party is best for people like themselves, Republicans who believe that the Republican party is best for people like themselves, and Independents who believe that both parties are equally good for people like themselves. The number of consistent responses is (172 + 40 + 157) = 369.
The total number of people is 519. The percentage of people who give consistent responses is:
369 / 519 = 71%. How reliable is the party identification indicator compared to chance alone? Chance is the number of consistent cells divided by the total number of cells: 3 / 9 = 33%. Since 71% is significantly greater than 33%, the party identification indicator is reliable.
Note- these data are from the Mississippi Poll.
3) Split Half- using multiple indicators of a concept on the same people at one time point. Forms two scales with each combining people's responses on half of the indicators. The two scales' scores should be consistent for people.
Health care example. In 2004 the Mississippi Poll included seven questions about how important people thought a number of health care issues were, and they rated them from scores of 1 for Very Important to scores of 4 for Not Important. An item on Recruiting and Retaining Doctors was not highly related to the other six items, so we excluded it from analysis. The other six items were:
These six indicators were divided into two groups: Group A included items 1, 3, and 5; and Group B included items 2, 4, and 6. Responses to all three items in each group were added together. Since each item was coded to range from 1 to 4, the scale for each group ranges from 3 to 12. The Pearson correlation between the two scales is .71, which is pretty respectable.
Another way of testing consistency is with a crosstabulation. Looking at the frequency distributions of each scale, I combined each scale's codes as follows: 3 and 4 were coded as High Priority; 5 and 6 were coded as Medium; 7 thru 12 were coded as Low Priority. The crosstabulation follows:
SPLIT HALF EXAMPLE
Group A Scale
Group B Scale | High | Medium | Low |
High | 141 | 25 | 1 |
Medium | 81 | 104 | 29 |
Low | 6 | 24 | 47 |
Notice that 292 people (141 + 104 + 47) gave consistent responses to both of the scales. They fall in the diagonal, being high-high, medium-medium, or low-low. The total number of people in the table is 458. Therefore, 292/458 people gave consistent responses, or 64% of the sample. Chance alone would predict about one-third or 33%. So the six indicators of the importance of health care demonstrate some reliability.
4) Cronbach's Alpha- used for multi-indicator indexes, calculates how reliable the component indicators are. Ranges from 0 for unreliable to 1 for most reliable. The Cronbach's Alpha for the six health care items included in the 2004 Mississippi Poll analysis discussed earlier was .80.
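For readers who want to see how Cronbach's Alpha is computed, here is a minimal sketch using the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the summed scale). The `responses` matrix below is hypothetical, made up purely for illustration; it is not Mississippi Poll data, and the function name is likewise an assumption.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """responses: one row per respondent, one column per indicator."""
    k = len(responses[0])
    # Variance of each indicator across respondents.
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of each respondent's summed scale score.
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# HYPOTHETICAL answers from five respondents to six items coded 1 to 4.
responses = [
    [1, 1, 2, 1, 1, 2],
    [2, 2, 2, 3, 2, 2],
    [1, 2, 1, 1, 2, 1],
    [4, 3, 4, 4, 3, 4],
    [3, 3, 2, 3, 4, 3],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's Alpha for the hypothetical data: {alpha:.2f}")
```

When every indicator gives identical answers for each person, the function returns 1, the top of the range described above.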
Reasons for low observed reliability:
Definition- are we really measuring what we think we are measuring?
Types of validity tests:
1) Face Validity- on its face, it appears to be valid. Simple concepts, such as a ruler. Just use it.
Very well established indicator, don't question it.
2) Construct (Criterion) Validity- relate your questionable indicator to more well established indicators, and see whether it behaves as you expect it to behave.
CONSTRUCT VALIDITY
Questionable Indicator is Party Identification
Well Established Indicators | Strong Dem | Weak Dem | Indep. Dem. | Pure Indep. | Indep. Rep. | Weak Rep. | Strong Rep. |
Pres. Vote | |||||||
1984-1992 | 13% | 54% | 49% | 77% | 95% | 91% | 95% |
1996-2004 | 7 | 32 | 22 | 58 | 87 | 92 | 94 |
2008-2012 | 11 | 26 | 20 | 64 | 88 | 95 | 89 |
1984 | 15 | 65 | 48 | 90 | 94 | 83 | 91 |
1988 | 13 | 46 | 52 | 68 | 94 | 87 | 98 |
1992 | 7 | 49 | 47 | 50 | 100 | 98 | 95 |
1996 | 7 | 26 | 23 | 45 | 90 | 84 | 92 |
2000 | 7 | 30 | 30 | 63 | 84 | 97 | 93 |
2004 | 7 | 40 | 15 | 69 | 86 | 96 | 97 |
2008 | 18 | 20 | 24 | 72 | 85 | 96 | 86 |
2012 | 2 | 36 | 0 | 50 | 93 | 93 | 92 |
Senate Vote | |||||||
1984-1994 | 29% | 54% | 55% | 80% | 86% | 80% | 92% |
1984 | 25 | 53 | 44 | 79 | 80 | 63 | 93 |
1988 | 15 | 40 | 50 | 80 | 83 | 76 | 92 |
1994 | 50 | 73 | 77 | 80 | 93 | 93 | 94 |
2014 | 12 | 26 | 32 | 67 | 67 | 92 | 94 |
Note: Cell entries are percentage vote for Republican candidate among each of the seven party identification categories. These data are from the Mississippi Poll.
Our expectation is that the percentage Republican vote should increase steadily as one moves from the most Democratic party identification category of Strong Democrat to the most Republican party identification category of Strong Republican. Examining the 1988 presidential vote indicator, we see a steady increase in Republican vote as we move from Strong Dem. to Strong Rep., with one reversal involving two cells. Only 87% of Weak Republicans voted for Republican Bush, while 94% of Independent Republicans voted for Bush. Those two categories should have reversed percentages, so circle both of those cells, since they involve validity problems with the party identification indicator. Examine the 1996 presidential vote and you find two sets of validity problems, one among Democrats and one among Republicans. Circle the four cells having validity problems.
Repeat this validity test for the other vote indicators, including the Senate vote items, and discuss the validity problems with the party identification indicator that you find.
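The "circle the cells" exercise can also be checked mechanically. The sketch below flags every pair of adjacent party identification categories where the Republican vote drops instead of rising. The function name `reversals` is invented for illustration, and treating ties as acceptable is an assumption, not a rule stated in these notes.

```python
CATEGORIES = ["Strong Dem.", "Weak Dem.", "Indep. Dem.", "Pure Indep.",
              "Indep. Rep.", "Weak Rep.", "Strong Rep."]


def reversals(percent_republican):
    """Return adjacent category pairs where the Republican vote falls."""
    pairs = []
    for i in range(len(percent_republican) - 1):
        if percent_republican[i + 1] < percent_republican[i]:
            pairs.append((CATEGORIES[i], CATEGORIES[i + 1]))
    return pairs


# Rows copied from the construct validity table above.
pres_1988 = [13, 46, 52, 68, 94, 87, 98]
pres_1996 = [7, 26, 23, 45, 90, 84, 92]

print("1988:", reversals(pres_1988))
print("1996:", reversals(pres_1996))
```

Each flagged pair corresponds to two cells to circle, so 1988 yields two cells and 1996 yields four.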
A good example of a construct validity test is in an MSU publication about health care issues. See table 1 in this link.
3) Convergent-Discriminant Validity Test- different measures of the same concept should yield similar results; the same measures of different concepts should yield different results. Examine correlation matrix.
A good example of a convergent-discriminant validity test is provided in table 8 of an MSU publication. See this link.
Another good, recent example, follows:
EXAMPLE OF CONVERGENT-DISCRIMINANT VALIDITY TEST
Adult Mississippians’ views of political issues in 2010 and 2012 (Pearson r’s)
| Abortion | Gay Marriage | Affirmative Action | Gov’t help Blacks’ socioeconomic Position | Gov’t help get doctors-hospitals low cost | Gov’t help get jobs and good living standard |
Abortion | - | | | | | |
Gay Marriage | .32 | - | | | | |
Affirmative Action | .04 | .15 | - | | | |
Black socioecon. Position | .15 | .21 | .61 | - | | |
Doctors & Hospitals | .16 | .23 | .43 | .61 | - | |
Jobs & living standards | .11 | .20 | .41 | .55 | .67 | - |
Civil liberty (abortion-gay marriage) average intra-cluster correlation = .32
Economic welfare (affirmative action-black socioeconomics-doctors-jobs) average intra-cluster correlation = .55
Average Inter-cluster correlation (between items from these two dimensions) = .16
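The three averages just reported can be reproduced from the Pearson r's in the 2010 and 2012 matrix. The following is a sketch only; the shortened item labels and the helper names `corr`, `intra`, and `inter` are invented for illustration.

```python
from itertools import combinations

# Pearson r's copied from the 2010 and 2012 correlation matrix above.
r = {
    ("Abortion", "Gay Marriage"): .32,
    ("Abortion", "Affirmative Action"): .04,
    ("Gay Marriage", "Affirmative Action"): .15,
    ("Abortion", "Black Position"): .15,
    ("Gay Marriage", "Black Position"): .21,
    ("Affirmative Action", "Black Position"): .61,
    ("Abortion", "Doctors"): .16,
    ("Gay Marriage", "Doctors"): .23,
    ("Affirmative Action", "Doctors"): .43,
    ("Black Position", "Doctors"): .61,
    ("Abortion", "Jobs"): .11,
    ("Gay Marriage", "Jobs"): .20,
    ("Affirmative Action", "Jobs"): .41,
    ("Black Position", "Jobs"): .55,
    ("Doctors", "Jobs"): .67,
}


def corr(a, b):
    """Look up a correlation regardless of the order of the two items."""
    return r[(a, b)] if (a, b) in r else r[(b, a)]


def intra(cluster):
    """Average correlation among all pairs of items inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(corr(a, b) for a, b in pairs) / len(pairs)


def inter(cluster_1, cluster_2):
    """Average correlation between items drawn from two different clusters."""
    values = [corr(a, b) for a in cluster_1 for b in cluster_2]
    return sum(values) / len(values)


civil_liberty = ["Abortion", "Gay Marriage"]
economic_welfare = ["Affirmative Action", "Black Position", "Doctors", "Jobs"]

print(f"Civil liberty intra-cluster: {intra(civil_liberty):.2f}")
print(f"Economic welfare intra-cluster: {intra(economic_welfare):.2f}")
print(f"Inter-cluster: {inter(civil_liberty, economic_welfare):.2f}")
```

Convergent validity shows up as high intra-cluster averages, and discriminant validity as a noticeably lower inter-cluster average.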
CONVERGENT-DISCRIMINANT VALIDITY TEST
Correlation Matrix of State Spending Preferences (1981-1999)
Day Care | Envir | Health | Industry | Police | Poor | Prisons | Highways | E&S Educ. | Tourism | |
Day Care | - | |||||||||
Envir | .19 | - | ||||||||
Hlth. | .36 | .17 | - | |||||||
Indus. | .06 | .07 | .10 | - | ||||||
Pol. | .08 | .10 | .08 | .09 | - | |||||
Poor | .39 | .11 | .39 | .03 | .05 | - | ||||
Prison | .15 | .06 | .13 | .07 | .22 | .12 | - | |||
High. | .15 | .11 | .12 | .14 | .16 | .08 | .12 | - | ||
E&S Educ. | .11 | .13 | .15 | .08 | .15 | .13 | .08 | .09 | - | |
Tour. | .07 | .10 | .02 | .25 | .15 | 0 | .12 | .13 | .01 | - |
University | .14 | .12 | .15 | .14 | .10 | .18 | .07 | .13 | .33 | .07 |
Note: data are based on the 1981-1999 Mississippi Poll, with some fictitious data included to simplify table interpretation.
Convergent-discriminant validity tests help to determine if your multiple indicators of one concept are actually measuring only one concept, or whether your indicators are measuring more than one concept (a multi-dimensional concept). Generate a correlation matrix as indicated above, and remember that the correlations range from 0 for no relationship to 1 for highest relationship. Then, pick out the highest correlations in order of their size. In the above table, the validity test shows that spending is a multi-dimensional concept involving four separate dimensions (concepts). Those dimensions are: social welfare (poor, day care, health), education (elementary-secondary and college), economic development (industry, tourism), and public order (police, prisons). The environment and highways indicators do not relate to any of these four, above the .2 correlation level. Hence, any researcher combining all eleven spending indicators into one scale that supposedly measures one concept of public support for government programs has validity problems, since there are four dimensions rather than one dimension of state spending.
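One way to make the "pick out the highest correlations" step concrete is to link every pair of spending items correlated above .2 and see which groups form. This is a rough sketch, not the procedure used in class: the function name `dimensions` and the simple linking rule are assumptions, and the matrix (which the note above says includes some fictitious data) is copied from the 1981-1999 table.

```python
labels = ["Day Care", "Envir", "Health", "Industry", "Police", "Poor",
          "Prisons", "Highways", "E&S Educ.", "Tourism", "University"]

# Lower triangle of the 1981-1999 matrix; row i holds correlations with items 0..i-1.
lower = [
    [],
    [.19],
    [.36, .17],
    [.06, .07, .10],
    [.08, .10, .08, .09],
    [.39, .11, .39, .03, .05],
    [.15, .06, .13, .07, .22, .12],
    [.15, .11, .12, .14, .16, .08, .12],
    [.11, .13, .15, .08, .15, .13, .08, .09],
    [.07, .10, .02, .25, .15, 0, .12, .13, .01],
    [.14, .12, .15, .14, .10, .18, .07, .13, .33, .07],
]


def dimensions(labels, lower, threshold=0.2):
    """Group items linked by correlations above the threshold."""
    group = {name: {name} for name in labels}
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            if value > threshold:
                merged = group[labels[i]] | group[labels[j]]
                for name in merged:
                    group[name] = merged
    unique = {frozenset(members) for members in group.values()}
    clusters = sorted(sorted(g) for g in unique if len(g) > 1)
    unrelated = sorted(next(iter(g)) for g in unique if len(g) == 1)
    return clusters, unrelated


clusters, unrelated = dimensions(labels, lower)
print("Dimensions:", clusters)
print("Unrelated items:", unrelated)
```

With the .2 cutoff this recovers the four dimensions named above and leaves the environment and highways items on their own. A different cutoff can give a different answer, which is why the later tables are harder to interpret.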
SECOND CONVERGENT-DISCRIMINANT VALIDITY TEST
Updated Correlation Matrix of State Spending Preferences (2000, 2004)
Envir | Health | Industry | Police | Poor | Highways | E&S Educ. | Tourism | |
Envir | - | |||||||
Health | .22 | - | ||||||
Industry | .12 | .12 | - | |||||
Police | .07 | .12 | .18 | - | ||||
Poor | .23 | .40 | .08 | .03 | - | |||
Highways | .11 | .17 | .17 | .08 | .14 | - | ||
E&S Educ. | .14 | .28 | .09 | .07 | .30 | .12 | - | |
Tourism | .07 | .05 | .36 | .10 | -.02 | .12 | .09 | - |
University | .12 | .42 | .13 | .09 | .26 | .17 | .31 | .04 |
Note: data are real world data drawn from the 2000 and 2004 Mississippi Polls.
In this updated correlation matrix, note that only two dimensions emerge, and that three spending items are unrelated to both dimensions. The highest correlations are between health care and poverty spending (.40) and between health care and universities (.42). Elementary/secondary and universities are correlated at .31. The three other correlations between these four spending items range from .26 to .30 in value. These items of elementary-secondary, universities, health care, and poverty spending form one dimension. The second dimension is tourism and industry, which are correlated at .36. The three spending items that are uncorrelated with these two dimensions are police and highways, where the correlations with other spending items never exceed .18, and the environment (correlations never exceed .23). Unlike ten years ago, people appear to see the relevance of education for social welfare programs, in that people with a better education are less likely to need social welfare programs. Also note that we no longer ask two spending items- day care and prisons.
THIRD CONVERGENT-DISCRIMINANT VALIDITY TEST
Most Recent Correlation Matrix of State Spending Preferences (2006, 2008, 2010, 2012)
Envir | Health | Industry | Police | Poor | Highways | E&S Educ. | Tourism | |
Envir | - | |||||||
Health | .27 | - | ||||||
Industry | .14 | .20 | - | |||||
Police | .14 | .18 | .16 | - | ||||
Poor | .33 | .46 | .11 | .18 | - | |||
Highways | .10 | .13 | .19 | .24 | .17 | - | ||
E&S Educ. | .26 | .34 | .12 | .20 | .36 | .17 | - | |
Tourism | .10 | .08 | .25 | .17 | .04 | .18 | .11 | - |
University | .20 | .42 | .21 | .15 | .24 | .17 | .41 | .15 |
Note: data are real world data drawn from the 2006, 2008, 2010, and 2012 Mississippi Polls.
In this most recent example, one could argue that there are either two different dimensions, or three different dimensions. Two of the three highest correlations are for schools-universities, and for poor-health. Environment is more highly correlated with poor and health than with schools (elementary-secondary ed) or universities, so we can place it in the social welfare rather than education dimension. But health-universities is also highly correlated, and the correlations between schools/universities and the other three items (poor-environment-health) are also respectable, so one could argue that instead of two dimensions of social welfare and education, there is only one dimension of education-welfare. A separate third dimension is industry-highways-police-tourism, an economic development dimension, though these programs are also correlated with the education-welfare items. So do we have one, two, or three dimensions? Let us turn to factor analysis.
4) Factor Analysis- can be used as a validity test for testing whether a concept is multi-dimensional.
2004 health care example. The six relevant items were subjected to a Principal Components Factor Analysis with Varimax Rotation. Only 457 of the 523 respondents were analyzed, since others lacked responses on one or more of the six items. Thus, 13% of the respondents were excluded from this factor analysis. Only one factor emerged, explaining 51% of the variance in all six items. Other factors explained less of the variance than each item did, so they were dropped from the analysis. The factor loadings for each item ranged from a low of .66 for public education to encourage nutrition and exercise to a high of .78 for providing health care for adults who can't afford it.
The Component Matrix, and the Component 1 scores follow:
Extraction Method: Principal Component Analysis. 1 components extracted.
These results suggest that it is valid to combine these six health care importance indicators into one scale measuring one dimension. If we had included the third health care item on the importance of recruiting and retaining doctors in Mississippi, we would have still ended up with one dimension, but the loading of that item on the factor was only .47, clearly the lowest of the factor loadings. This suggests that that item does not measure the one dimension very well, so we excluded it from the scale.
2006-2012 state spending programs example.
Rotated Component Matrix
Spending Program | Component 1 | Component 2 |
Environment | .538 | .128 |
Health Care | .752 | .108 |
Industrial Development | .098 | .647 |
Police Forces | .232 | .503 |
Poverty Programs | .730 | .015 |
Streets and Highways | .157 | .582 |
Elem.-Secondary Education | .687 | .127 |
Attracting Tourism | -.017 | .709 |
Higher Education | .597 | .247 |
As you can see with this factor analysis, varimax rotation, there are clearly two different dimensions- the education-welfare dimension with 5 programs, and the economic development dimension with 4 programs. That would probably be the most defensible conclusion, though using the entire correlation matrix would permit you to make an argument for using one or three dimensions as well.
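Reading the rotated component matrix amounts to assigning each program to the component on which it loads most heavily. A small sketch of that reading follows; the function name `assign_dimensions` is hypothetical, and comparing absolute loadings is an assumption about how one would handle negative loadings.

```python
# Loadings copied from the Rotated Component Matrix above: (Component 1, Component 2).
loadings = {
    "Environment": (.538, .128),
    "Health Care": (.752, .108),
    "Industrial Development": (.098, .647),
    "Police Forces": (.232, .503),
    "Poverty Programs": (.730, .015),
    "Streets and Highways": (.157, .582),
    "Elem.-Secondary Education": (.687, .127),
    "Attracting Tourism": (-.017, .709),
    "Higher Education": (.597, .247),
}


def assign_dimensions(loadings):
    """Place each program under the component with the larger absolute loading."""
    dims = {1: [], 2: []}
    for program, (c1, c2) in loadings.items():
        dims[1 if abs(c1) >= abs(c2) else 2].append(program)
    return dims


dims = assign_dimensions(loadings)
print("Education-welfare:", dims[1])
print("Economic development:", dims[2])
```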
It is interesting to correlate the factor scores of these two dimensions with other theoretically relevant factors. The education-welfare factor score is correlated -.35 with ideology identification and -.44 with party identification, indicating that conservatives and Republicans are more likely to want to spend less on these programs than liberals and Democrats. The economic development factor score is uncorrelated with ideology identification, and has a slight positive (.05 correlation) relationship with party identification, indicating that Republicans are slightly more supportive of these programs than are Democrats.
Discuss factor analysis as data reduction tool- it reduces number of variables into a smaller number of concepts.
Historic Problems with Polls:
Sample Error Correlates:
TABLE OF SAMPLE ERROR
(Source of table: Survey Research Methods, by Earl R. Babbie, Wadsworth Publishing Co., 1973, page 376)
HOMOGENEITY OF POPULATION
SAMPLE SIZE | 50/50 | 60/40 | 70/30 | 80/20 | 90/10 |
100 | 10 | 9.8 | 9.2 | 8 | 6 |
200 | 7.1 | 6.9 | 6.5 | 5.7 | 4.2 |
300 | 5.8 | 5.7 | 5.3 | 4.6 | 3.5 |
400 | 5 | 4.9 | 4.6 | 4 | 3 |
500 | 4.5 | 4.4 | 4.1 | 3.6 | 2.7 |
600 | 4.1 | 4 | 3.7 | 3.3 | 2.4 |
700 | 3.8 | 3.7 | 3.5 | 3 | 2.3 |
800 | 3.5 | 3.5 | 3.2 | 2.8 | 2.1 |
900 | 3.3 | 3.3 | 3.1 | 2.7 | 2 |
1000 | 3.2 | 3.1 | 2.9 | 2.5 | 1.9 |
1100 | 3 | 3 | 2.8 | 2.4 | 1.8 |
1200 | 2.9 | 2.8 | 2.6 | 2.3 | 1.7 |
1300 | 2.8 | 2.7 | 2.5 | 2.2 | 1.7 |
1400 | 2.7 | 2.6 | 2.4 | 2.1 | 1.6 |
1500 | 2.6 | 2.5 | 2.4 | 2.1 | 1.5 |
1600 | 2.5 | 2.4 | 2.3 | 2 | 1.5 |
1700 | 2.4 | 2.4 | 2.2 | 1.9 | 1.5 |
1800 | 2.4 | 2.3 | 2.2 | 1.9 | 1.4 |
1900 | 2.3 | 2.2 | 2.1 | 1.8 | 1.4 |
2000 | 2.2 | 2.2 | 2 | 1.8 | 1.3 |
Note: Cell entries are sample error figures.
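The entries in this table appear to match two standard errors of a proportion, 2 * sqrt(p * (1 - p) / n), expressed in percentage points. That formula is inferred here from the table values themselves and is not quoted from Babbie, so treat the sketch below as an approximation. The function name `sample_error` is invented for illustration.

```python
from math import sqrt


def sample_error(sample_size, split=0.5):
    """Approximate sample error in percentage points for a given population split."""
    return 2 * sqrt(split * (1 - split) / sample_size) * 100


for n in (100, 400, 1000, 2000):
    print(n, round(sample_error(n, 0.5), 1), round(sample_error(n, 0.9), 1))
```

The sketch makes the two correlates of sample error visible: error falls as sample size rises, and it falls as the population becomes more homogeneous (a 90/10 split rather than 50/50).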
Types of Surveys: In-person; Telephone; Mail; Mixed Methods; briefly discuss each.
For further information, see Mail and Telephone Surveys, by Don Dillman, John Wiley and Sons Co, 1978.
In-person-- pros:
1) Observe and clear up R's (respondent's) confusion
2) Obtain objective information about R's lifestyle
3) Visual Aids use
4) Establish rapport? High response rate?
In-person-- cons:
1) Expensive
2) Safety of interviewer
3) Interviewer fraud
Telephone-- pros:
1) Quick
2) Cost effective
3) Centralized interviewing- no fraud
4) Interviewer safety
Telephone-- cons:
1) Excludes those without telephones
2) No visual aids-- voice dependent
Mail-- pros:
1) Cheap
2) Use with specialized population
Mail-- cons:
1) Excludes illiterates
2) Can't control who answers survey
3) Can't control order of questions answered
4) Slow
5) Incomplete forms
6) Low response rate?
Probability Sampling. Definition of probability sample: each population unit has some chance of being in the sample, and that chance can be calculated. Types of probability samples:
Telephone Sampling Techniques:
Sampling within the household:
1) Kish method, ask household resident to list first names of all adults,
then toss dice to select adult to interview;
2) Carter-Troldahl method: multiple selection tables
asking number of adults and number of men in household;
3) Sociological last birthday method; problem that it oversamples women.
Demographic Groups Undersampled in Surveys, especially Telephone Surveys:
Weighting the Sample:
In the 2012 and 2014 Mississippi Polls, we included cell phones in our sampling frame, so underrepresenting the young was no longer as huge a problem as it had been in 2010. Check out how representative the three polls were, and how each was weighted to compensate for demographic groups underrepresented, by clicking on the following links:
ACTUAL EXAMPLES:
(From Survey Research for Public Administration, by
David H. Folz, Sage Publishers)
1) Perceptions of local problems- p. 5, 22, 107
A) No problem, Minor Problem, Major Problem
B) Most serious problem
C) Agree-disagree with problem statements
2) Quality of local services- p. 8
A) Excellent, good, fair, poor
3) Policy preferences- p. 5, 22
A) Single most important change
B) How improve quality of life- not important, somewhat important, very
important
C) One policy- oppose or favor, strong or some.
4) Funding priorities- p. 5, 22
A) Single choice, reduce funding first
B) City spending- too little, about right, too much
5) Tax hike backing- p. 20
A) Specific increase for specific policy
6) Citizen usage satisfaction- p. 8
A) Filter question, did they use service?
B) Satisfied or dissatisfied, very or somewhat
C) How often policy met expectations
7) Business usage satisfaction- p. 6
A) Survey gov't workers about complaints heard
B) Survey businesses about specific problems, Overall satisfaction
8) Wording problems- p. 99
A) Loaded or leading
B) Double barreled
C) Too complex, double negative (Miss Poll)
D) Unbalanced alternatives (Blacks treated same
as whites or worse)
E) Acquiescence bias (agreement bias)- especially on agree-disagree
items
F) Sensitive items- use income categories
G) Social desirability- race items
Read Some 2008 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2008.
Read Some 2010 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2010.
Read Some 2012 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2012.
Read Some 2014 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2014.
DESCRIPTIVE STATISTICS are Univariate Statistics, dealing with one variable.
CENTRAL TENDENCY- typical case
DISPERSION- diversity, how divided or united the cases are, the form of the distribution (interval level)
Identify the mode and median categories in each of the following examples drawn from the 2008 Mississippi Poll:
Punishment favored in cases of first-degree murder:
Death penalty............... = 48%
Life without parole......... = 44%
A shorter jail term than life = 8%
How rate President Bush's job performance:
Excellent = 9%
Good .... = 24%
Fair .... = 33%
Poor .... = 34%
Ideological self-identification:
Very Liberal......... = 2%
Somewhat Liberal..... = 17%
Moderate............. = 27%
Somewhat Conservative = 34%
Very Conservative.... = 20%
Education Level:
High School Dropout. = 27%
High School Graduate = 29%
Some College........ = 28%
College Graduate.... = 13%
Some Graduate Work.. = 3%
Annual Family Income
Under $10,000 = 16%
$10-20,000... = 15%
$20-30,000... = 15%
$30-40,000... = 10%
$40-50,000... = 9%
$50-60,000... = 8%
$60-70,000... = 7%
Over $70,000. = 20%
Likelihood of Living in the Current Community in Five Years:
Definitely No. = 8%
Probably No... = 14%
Probably Yes.. = 32%
Definitely Yes = 46%
Population of the Community You Live In:
Farm or ranch = 10%
Rural area... = 25%
Under 2,500.. = 14%
2,500-10,000. = 22%
10,000-50,000 = 27%
Over 50,000.. = 2%
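To check your answers, the mode and median category can be found mechanically: the mode is the category with the largest percentage, and the median is the first category at which the cumulative percentage reaches half of the total. The sketch below uses the President Bush job rating item; the function name is hypothetical, and the median is only meaningful when the categories can be ordered.

```python
def mode_and_median(distribution):
    """distribution: ordered list of (category, percent) pairs."""
    mode = max(distribution, key=lambda pair: pair[1])[0]
    half = sum(percent for _, percent in distribution) / 2
    running = 0
    for category, percent in distribution:
        running += percent  # cumulative percentage so far
        if running >= half:
            return mode, category


bush_job = [("Excellent", 9), ("Good", 24), ("Fair", 33), ("Poor", 34)]
mode, median = mode_and_median(bush_job)
print(f"Mode: {mode}; median: {median}")
```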
MEANS- What follows is a verbal interpretation of means, using the ideological self-identification and ideological perception questions.
Question wording: "What about your political beliefs? Do you consider yourself: very liberal, somewhat liberal, moderate or middle of the road, somewhat conservative, or very conservative?" Question wordings: "Please label the following political figures as very liberal, somewhat liberal, moderate (or middle of the road), somewhat conservative, or very conservative." "Democratic Presidential hopeful Hillary Clinton." "Democratic Presidential hopeful Barack Obama." "Republican Presidential hopeful John McCain." Ideological perception questions were not asked for the U.S. senate candidates. However, such questions were asked in previous years' polls for Musgrove, for when he was lieutenant governor (1998) and governor (2000, 2002 polls). For comparison purposes, we also include the perceptions of previous Democratic presidential candidates, asked in previous Mississippi polls.
The values below are "means" or averages for the ideological variables, all of which are coded as 1 for very liberal, 2 for somewhat liberal, 3 for moderate, 4 for somewhat conservative, and 5 for very conservative.
RANGE is distance between extreme categories. It requires an interval level measurement. Thus, merely subtract the lowest number representing the category at one end of the indicator from the highest number representing the category at the other end of the indicator. Examples follow:
A test of your knowledge of VARIANCE. Remember the example of Mississippi's party organization members. The mean for Democrats was 2.69, which was between somewhat liberal and moderate, but closer to moderate. The mean for Republicans was 4.45, which was between somewhat conservative and very conservative, but closer to somewhat conservative. However, remember the form of the distribution. Nearly 10% of Democrats were very conservative, and almost 20% were somewhat conservative, so there was considerable diversity or dispersion of ideologies in the Democratic party. Therefore, the variance of Democrats' ideology scores was a relatively higher number, a variance of 1.351. For Republicans on the other hand, less than 2% of them were very liberal or somewhat liberal. So there was much unity and clustering of ideological scores for the Republicans, and little diversity or dispersion of scores. Therefore, the variance of Republicans' ideology scores was a relatively low number, a variance of .493. Therefore, Democrats were more divided in ideology (a higher variance), and Republicans were more united on ideology (a lower variance).
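The party organization data themselves are not reproduced in these notes, so as an illustration the sketch below computes a mean and variance from the 2008 ideological self-identification percentages listed earlier (coded 1 for very liberal through 5 for very conservative). It treats the percentages as the whole distribution, so it gives a population variance; a statistical package working from individual cases may report a slightly different sample variance. The function name is invented for illustration.

```python
def mean_and_variance(percents):
    """percents: dict mapping each numeric code to its percentage of cases."""
    total = sum(percents.values())
    mean = sum(code * pct for code, pct in percents.items()) / total
    # Variance is the average squared distance of the cases from the mean.
    variance = sum(pct * (code - mean) ** 2 for code, pct in percents.items()) / total
    return mean, variance


ideology_2008 = {1: 2, 2: 17, 3: 27, 4: 34, 5: 20}
mean, variance = mean_and_variance(ideology_2008)
print(f"Mean = {mean:.2f}; variance = {variance:.2f}")
```

A distribution with more cases piled into one or two adjacent categories would produce a lower variance, which is the pattern described for Republicans above.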
Examples of variance follow:
4A. (5 points) The following two questions are based on recent Mississippi Polls, all conducted in the 21st century. Using the statistic of variance, are Democrats or Republicans most divided on each of the following five variables:
4B. (5 points) Using the statistic of variance, are whites or blacks most united on each of the following five variables:
Contingency tables can be used with nominal level measures, though we usually employ ordinal or interval level data having a limited number of categories. Contingency tables permit you to view the data in an easily interpretable and understood manner.
Percentage Difference is a measure of strength of the relationship. It ranges from a low of 0 to a high of 100. Always put the independent variable at the top of the table, and the dependent variable at the side. Then, calculate the column percentages. For ordinal and interval level indicators, compare the column percents (for the two extreme categories of the predictor) across the same category of your dependent variable. Make this comparison for the two extreme categories of your dependent variable, and take the average. If one of these comparisons is contrary to your hypothesis, make the difference a negative.
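As a worked illustration of the percentage difference, the sketch below uses the column percentages from Table 3 of these notes (ideology and health care spending, 2006-2010), with the hypothesis that liberals favor more spending than conservatives. The function name and argument order are assumptions made for this example.

```python
def percentage_difference(group_a, group_b):
    """Each argument lists column percentages from the first to the last
    category of the dependent variable. The hypothesis is that group_a is
    higher on the last category and group_b is higher on the first."""
    last = group_a[-1] - group_b[-1]   # comparison on the last category
    first = group_b[0] - group_a[0]    # comparison on the first category
    return (last + first) / 2          # a contrary comparison comes out negative


# Column percentages for Less, Same, More (Table 3 of these notes).
liberal = [3, 15, 82]
conservative = [12, 31, 57]

print(percentage_difference(liberal, conservative))
```

Here the "More" comparison is 82 - 57 = 25 and the "Less" comparison is 12 - 3 = 9, so the average is 17.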
Other Measures of Association to use (Source: Research Methods in Political Science: An Introduction Using MicroCase, 2nd edition, by Michael Corbett; p. 139-144; copyrighted by MicroCase Corporation):
All measures range from 0 for no relationship to 1 for perfect
relationship. A positive or negative sign is a function of the direction
of the coding of the variables and whether your hypothesis is upheld.
The following are nine examples of bivariate tables. In class, we will
review three features of each table. 1) Is the relationship statistically
significant? Is Chi-squared significant at the .05 level or below? 2) What
is the magnitude of the relationship? That is, what is the gamma value? To
determine the relative importance of the predictors-- which predictor is
most and least important-- use the absolute value of the gamma, and ignore
the sign. 3) What is the direction of the relationship? That is, devise a
hypothesis for each table that reflects how the two variables are related.
Example for table 1: People younger in age are more likely to favor
spending more on health care, compared to people older in age.
Note: The tables in your research paper should look like these
tables in format.
Table 1
Age Differences in State Spending Preferences for Health Care
AGE
STATE SPENDING DESIRED: | 18-35 | 36-55 | 56 and Over |
Less | 10% | 7% | 8% |
Same | 18% | 18% | 34% |
More | 72% | 75% | 58% |
N Size | (555) | (571) | (524) |
Gamma = -.16
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 2
Income Differences in State Spending Preferences for Health Care
FAMILY INCOME
STATE SPENDING DESIRED: | < $20,000 | $20-40,000 | $40-60,000 | > $60,000 |
Less | 10% | 4% | 7% | 10% |
Same | 13% | 17% | 30% | 36% |
More | 77% | 79% | 63% | 54% |
N Size | (365) | (363) | (222) | (333) |
Gamma = -.28
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 3
Ideological Differences in State Spending Preferences for Health Care
SELF-IDENTIFIED IDEOLOGY
STATE SPENDING DESIRED: | Liberal | Moderate | Conservative |
Less | 3% | 6% | 12% |
Same | 15% | 17% | 31% |
More | 82% | 77% | 57% |
N Size | (262) | (495) | (808) |
Gamma = -.41
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 4
Race Differences in State Spending Preferences for Health Care
RACE
STATE SPENDING DESIRED: | White | African-American |
Less | 10% | 3% |
Same | 31% | 10% |
More | 59% | 87% |
N Size | (1050) | (555) |
Gamma = .63
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
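To see where a gamma value comes from, the sketch below rebuilds approximate cell counts for Table 4 from its column percentages and N sizes, then applies the usual gamma formula: (concordant pairs - discordant pairs) / (concordant pairs + discordant pairs). Because the published percentages are rounded, the result only approximates the reported .63. The sign depends on how the categories are ordered, and the function name is an assumption for this example.

```python
def gamma(table):
    """table[i][j]: count in row i (dependent variable) and column j (predictor),
    with both sets of categories listed in a consistent order."""
    concordant = discordant = 0.0
    rows, cols = len(table), len(table[0])
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only cells in later rows
                for m in range(cols):
                    if m > j:                 # later on both variables
                        concordant += table[i][j] * table[k][m]
                    elif m < j:               # later on one, earlier on the other
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


# Table 4: rows are Less, Same, More; columns are White, African-American.
n_white, n_black = 1050, 555
percents = [(10, 3), (31, 10), (59, 87)]
counts = [[w * n_white / 100, b * n_black / 100] for w, b in percents]

print(f"Gamma = {gamma(counts):.2f}")
```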
Table 5
Sex Differences in State Spending Preferences for Health Care
SEX
STATE SPENDING DESIRED: | Men | Women |
Less | 12% | 5% |
Same | 27% | 20% |
More | 61% | 75% |
N Size | (772) | (889) |
Gamma = .33
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 6
Income Differences in Having Access to a Personal Computer
FAMILY INCOME
HAVE ACCESS TO A PC? | < $20,000 | $20-40,000 | $40-60,000 | > $60,000 |
Yes | 54% | 67% | 85% | 94% |
No | 46% | 33% | 15% | 6% |
N Size | (370) | (368) | (232) | (341) |
Gamma = -.59
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 7
Race Differences in Having Access to a Personal Computer
RACE
HAVE ACCESS TO A PC? | White | African-American |
Yes | 74% | 69% |
No | 26% | 31% |
N Size | (1084) | (560) |
Gamma = .12
Chi-squared significance < .05
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 8
Sex Differences in Having Access to a Personal Computer
SEX
HAVE ACCESS TO A PC? | Men | Women |
Yes | 74% | 70% |
No | 26% | 30% |
N Size | (790) | (910) |
Gamma = .10
Chi-squared significance < .06; Not Significant at .05 level.
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
Table 9
Age Differences in Having Access to a Personal Computer
AGE
HAVE ACCESS TO A PC? | 18-35 | 36-55 | 56 and Over |
Yes | 82% | 79% | 55% |
No | 18% | 21% | 45% |
N Size | (564) | (585) | (538) |
Gamma = .41
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.
The following nine examples of bivariate tables are from earlier years of the Mississippi Poll. How have demographic differences in attitudes towards health care, and in access to personal computers, changed over the years? Explain.
Table 1
Age Differences in State Spending Preferences for Health Care
AGE
STATE SPENDING DESIRED: | 18-35 | 36-55 | 56 and Over |
More | 78% | 74% | 69% |
Same | 18% | 20% | 26% |
Less | 4% | 6% | 5% |
N Size | (630) | (720) | (470) |
Gamma = .131
Table 2
Income Differences in State Spending Preferences for Health Care
FAMILY INCOME
STATE SPENDING DESIRED: | < $20,000 | $20-40,000 | $40-60,000 | > $60,000 |
More | 84% | 78% | 65% | 53% |
Same | 13% | 17% | 30% | 40% |
Less | 3% | 5% | 5% | 7% |
N Size | (418) | (499) | (280) | (222) |
Gamma = .371
Table 3
Ideological Differences in State Spending Preferences for Health Care
SELF-IDENTIFIED IDEOLOGY
STATE SPENDING DESIRED: | Liberal | Moderate | Conservative |
More | 79% | 79% | 67% |
Same | 15% | 18% | 27% |
Less | 6% | 3% | 6% |
N Size | (630) | (720) | (470) |
Gamma = .223
Table 4
Race Differences in State Spending Preferences for Health Care
RACE
STATE SPENDING DESIRED: | White | African-American |
More | 67% | 89% |
Same | 27% | 9% |
Less | 6% | 2% |
N Size | (1206) | (570) |
Gamma = -.574
Table 5
Sex Differences in State Spending Preferences for Health Care
SEX
STATE SPENDING DESIRED: | Men | Women |
More | 70% | 77% |
Same | 25% | 18% |
Less | 5% | 5% |
N Size | (825) | (995) |
Gamma = -.174
Table 6
Income Differences in Having Access to a Personal Computer
FAMILY INCOME
HAVE ACCESS TO A PC? | < $20,000 | $20-40,000 | $40-60,000 | > $60,000 |
Yes | 28% | 61% | 76% | 87% |
No | 72% | 39% | 24% | 13% |
N Size | (280) | (334) | (187) | (163) |
Gamma = -.634
Table 7
Race Differences in Having Access to a Personal Computer
RACE
HAVE ACCESS TO A PC? | White | African-American |
Yes | 66% | 45% |
No | 34% | 55% |
N Size | (806) | (376) |
Gamma = .412
Table 8
Sex Differences in Having Access to a Personal Computer
SEX
HAVE ACCESS TO A PC? | Men | Women |
Yes | 64% | 55% |
No | 36% | 45% |
N Size | (547) | (668) |
Gamma = .189
Table 9
Age Differences in Having Access to a Personal Computer
AGE
HAVE ACCESS TO A PC? | 18-35 | 36-55 | 56 and Over |
Yes | 72% | 62% | 38% |
No | 28% | 38% | 62% |
N Size | (409) | (482) | (321) |
Gamma = .410
Multivariate crosstabulations:
Multivariate analysis involves one dependent variable and more
than one independent variable (predictor).
Controlling- multivariate tables permit you to examine the relationship between a predictor and a dependent variable, after taking into account the impact of a second predictor.
For example, African-Americans tend to have a lower turnout than whites. A possible control variable is socioeconomic status (SES). Perhaps African-Americans have a lower average turnout than whites because of the lower average socioeconomic status of blacks, and we know that people of all races having a lower SES tend to have lower turnout compared to people of all races having a higher SES. To determine whether a lower SES level explains why African-Americans tend to have lower turnout than whites, we examine the relationship between race and turnout, controlling for SES. Do whites and blacks of the same SES level have the same turnout level? If so, SES is more important than race in shaping turnout.
RACE ....................> SES ....................> TURNOUT
RACE ..............................................> TURNOUT
Three types of variables that one would control for:
1) Outside variables- a variable that has an effect on one of your
predictors and on your dependent variable. Here, race is an outside
variable. You would control for it to determine if SES has a direct,
causal effect on turnout, or whether the SES-turnout relationship is
spurious. If spurious, then race directly affects or causes both SES and
turnout, but SES does not have a direct causal effect on turnout.
2) Intervening variable- a variable that is located between
a predictor and a dependent variable, and that explains why the "early"
predictor is related to the dependent variable. SES is an intervening
variable here, as it explains why race is related to turnout.
3) Specifying or Conditional variables- a predictor that
changes the relationship between another predictor and the dependent
variable. That is, the relationship has a different direction or magnitude
for different categories of the specifying variable. If a race gap in
turnout exists only among college grads in Mississippi but not among other
educational groups, then education is the specifying variable.
MODEL TESTED FOR ALL THREE SCENARIOS
RACE....................................> SES ...................................................> PARTICIPATION
RACE .................................................................................................> PARTICIPATION
SCENARIO 1:
BIVARIATE (includes low, medium, and high SES groups):
White Race | Black Race | |
Low Participation | 40% | 60% |
High Participation | 60% | 40% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (Low SES group only):
White Race | Black Race | |
Low Participation | 70% | 70% |
High Participation | 30% | 30% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (Medium SES group only):
White Race | Black Race | |
Low Participation | 50% | 50% |
High Participation | 50% | 50% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (High SES group only):
White Race | Black Race | |
Low Participation | 20% | 20% |
High Participation | 80% | 80% |
Column % Totalled | 100% | 100% |
RACE ...................................> SES .....................................> PARTICIPATION
SCENARIO 2:
BIVARIATE (includes low, medium, and high SES groups):
White Race | Black Race | |
Low Participation | 40% | 60% |
High Participation | 60% | 40% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (Low SES group only):
White Race | Black Race | |
Low Participation | 40% | 60% |
High Participation | 60% | 40% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (Medium SES group only):
White Race | Black Race | |
Low Participation | 40% | 60% |
High Participation | 60% | 40% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (High SES group only):
White Race | Black Race | |
Low Participation | 40% | 60% |
High Participation | 60% | 40% |
Column % Totalled | 100% | 100% |
RACE............................................................> SES
RACE............................................................> PARTICIPATION
SCENARIO 3:
BIVARIATE (includes low, medium, and high SES groups):
White Race | Black Race | |
Low Participation | 40% | 70% |
High Participation | 60% | 30% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (Low SES group only):
White Race | Black Race | |
Low Participation | 70% | 80% |
High Participation | 30% | 20% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (Medium SES group only):
White Race | Black Race | |
Low Participation | 50% | 60% |
High Participation | 50% | 40% |
Column % Totalled | 100% | 100% |
MULTIVARIATE (High SES group only):
White Race | Black Race | |
Low Participation | 30% | 40% |
High Participation | 70% | 60% |
Column % Totalled | 100% | 100% |
................................................................(40% multivariate)
RACE .........................................> SES .................................> PARTICIPATION
RACE .....................................................................................> PARTICIPATION
.......................(30% bivariate; 10% multivariate)
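The three scenarios can be compared by computing the race gap in high participation before and after controlling for SES. The sketch below is only illustrative: it uses the percentages from the scenario tables above, and it averages the three SES groups without weighting them by size, which is a simplifying assumption. All names are hypothetical.

```python
# (white % high participation, black % high participation) for each table.
scenarios = {
    1: {"bivariate": (60, 40), "low": (30, 30), "medium": (50, 50), "high": (80, 80)},
    2: {"bivariate": (60, 40), "low": (60, 40), "medium": (60, 40), "high": (60, 40)},
    3: {"bivariate": (60, 30), "low": (30, 20), "medium": (50, 40), "high": (70, 60)},
}


def race_gap(white_pct, black_pct):
    """Percentage-point difference between whites and blacks."""
    return white_pct - black_pct


def gaps(tables):
    """Return the bivariate gap and the average gap within SES groups."""
    bivariate = race_gap(*tables["bivariate"])
    within = [race_gap(*tables[ses]) for ses in ("low", "medium", "high")]
    # Unweighted average across SES groups (an assumption of this sketch).
    controlled = sum(within) / len(within)
    return bivariate, controlled


def interpret(bivariate, controlled):
    if controlled == 0:
        return "SES accounts for the whole race gap (race works only through SES)"
    if controlled == bivariate:
        return "SES accounts for none of the race gap (race has only a direct effect)"
    return "SES accounts for part of the race gap (direct and indirect effects)"


results = {number: gaps(tables) for number, tables in scenarios.items()}
for number, (bivariate, controlled) in results.items():
    print(number, bivariate, controlled, interpret(bivariate, controlled))
```

In Scenario 3 the gap falls from 30 points to 10 points once SES is controlled, which matches the 30% bivariate and 10% multivariate figures on the diagram.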
MODEL OF GENDER AND SENIORITY AFFECTING JOB SECURITY
GENDER ....................................> SENIORITY ..............................> JOB
GENDER ...........................................................................................> SECURITY
BIVARIATE: Gender .......> Job Security
Men | Women | |
Fired | 22% (55) | 50% (75) |
Kept Job | 78% (195) | 50% (75) |
100% (250) | 100% (150) |
BIVARIATE: Gender .......> Seniority
Men | Women | |
Low Seniority | 20% (50) | 67% (100) |
High Seniority | 80% (200) | 33% (50) |
100% (250) | 100% (150) |
BIVARIATE: Seniority .......> Job Security
Low Seniority | High Seniority | |
Fired | 70% (105) | 10% (25) |
Kept Job | 30% (45) | 90% (225) |
100% (150) | 100% (250) |
MULTIVARIATE (Low Seniority Group Only):
Men | Women | |
Fired | 70% (35) | 70% (70) |
Kept Job | 30% (15) | 30% (30) |
100% (50) | 100% (100) |
MULTIVARIATE (High Seniority Group Only):
Men | Women | |
Fired | 10% (20) | 10% (5) |
Kept Job | 90% (180) | 90% (45) |
100% (200) | 100% (50) |
GENDER ................... > SENIORITY ........................> JOB SECURITY
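The gender and seniority tables can be checked against each other: adding the two seniority groups back together should reproduce the bivariate gender table. The sketch below does this with the cell counts given above; the variable names are illustrative only.

```python
# counts[seniority][gender] = (fired, kept job), taken from the multivariate tables.
counts = {
    "low": {"men": (35, 15), "women": (70, 30)},
    "high": {"men": (20, 180), "women": (5, 45)},
}


def pct_fired(fired, kept):
    """Percent fired, rounded to a whole number."""
    return round(100 * fired / (fired + kept))


# Within each seniority group, men and women are fired at the same rate.
within = {
    seniority: {gender: pct_fired(*cell) for gender, cell in groups.items()}
    for seniority, groups in counts.items()
}

# Collapsing across seniority recovers the bivariate gender table.
bivariate = {}
for gender in ("men", "women"):
    fired = sum(counts[seniority][gender][0] for seniority in counts)
    kept = sum(counts[seniority][gender][1] for seniority in counts)
    bivariate[gender] = pct_fired(fired, kept)

print(within)
print(bivariate)
```

The gender gap of 28 points in the bivariate table disappears inside each seniority group, so seniority acts as an intervening variable here.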
MODEL OF PARTY ID AND ATTITUDE TOWARD NIXON PARDON AFFECTING VOTE
PARTY ID ....................................> PARDON .... ..............................> 1976 PRESIDENTIAL
PARTY ID ...........................................................................................> VOTE
BIVARIATE: Party Id .......> Presidential Vote
Democratic Party Id | Republican Party Id | |
Carter (Dem) Vote | 80% (400) | 10% (30) |
Ford (Rep) Vote | 20% (100) | 90% (270) |
100% (500) | 100% (300) |
BIVARIATE: Party Id .......> Attitude toward Ford Pardon of Nixon
Democratic Party Id | Republican Party Id | |
For Pardon | 10% (50) | 83% (250) |
Against Pardon | 90% (450) | 17% (50) |
100% (500) | 100% (300) |
BIVARIATE: Attitude to Pardon .......> Presidential Vote
For Pardon | Against Pardon | |
Carter (Dem) Vote | 22% (65) | 73% (365) |
Ford (Rep) Vote | 78% (235) | 27% (135) |
100% (300) | 100% (500) |
MULTIVARIATE (Among Democrats Only)
For Pardon | Against Pardon | |
Carter (Dem) Vote | 80% (40) | 80% (360) |
Ford (Rep) Vote | 20% (10) | 20% (90) |
100% (50) | 100% (450) |
MULTIVARIATE (Among Republicans Only)
For Pardon | Against Pardon | |
Carter (Dem) Vote | 10% (25) | 10% (5) |
Ford (Rep) Vote | 90% (225) | 90% (45) |
100% (250) | 100% (50) |
PARTY.......................................> Attitude to Pardon
IDENT........................................>Presidential Vote
Statistical inference is our ability to generalize a relationship found in a sample to the entire population from which that sample was drawn. That is, can we infer population characteristics from sample data? If our statistical inference test suggests that in the population the relationship between the two variables is nonrandom, the relationship is said to be statistically significant.
For example, our 1996 Mississippi Poll sampled only 601 adult
Mississippians from a population of over two million. We found a definite
relationship in the sample between gender and seat belt use. 60% of women
said they "always" used their seat belts, compared to only 42% of men. 9%
of men said they "never" used their seat belts, compared to only 4% of
women. The magnitude of this relationship between gender and seat belt use
was 12%: [(60-42) + (9-4)] / 2. But can we generalize this relationship
found in the sample to the entire population? Is there a relationship
between gender and seat belt use in the entire population? Statistical
inference is the procedure we use to determine if any relationship exists
in the entire population.
In this example, the chi-squared (Pearson) is 22.9 with 3 df, and
is significant at the .001 level. If no relationship existed in the
population, a sample relationship this strong would occur by chance less
than 1 time in a thousand.
Two tests of statistical inference:
1) Chi-squared is for nominal level variables. Hence, it does not provide information about the direction of the relationship, it simply indicates that a relationship exists in the population. Since the value of chi-squared tends to increase as sample size increases, it does not measure the strength of the association between variables.
Chi-squared = summation [ (fo - fe) squared / fe ], where fo is the observed frequency and fe is the expected frequency for each cell.
For the expected frequency for each cell, multiply the column total and the row total for that cell, and divide by the table total.
Degrees of freedom equal the number of columns minus 1 multiplied by the number of rows minus 1.
Consult a Chi-squared chart.
On the SPSS output, use the Pearson chi-squared, which is the most widely used form.
Warning: chi-squared should not be used if any cell has an expected value less than 1, or if more than 20% of the cells have expected values less than 5.
Example from Berman, Evan M., Public Administration Review, March/April 1997, Vol. 57 Issue 2, pages 105-113, "Dealing with Cynical Citizens" article, table 3, where he examines whether there is a link between the number of strategies that cities use to keep people informed about local government's actions and how much trust they have in city government.
-- OBSERVED FREQUENCIES--
NUMBER OF STRATEGIES
TRUST | Few Strategies | Some or Many Strategies | Row N Sizes |
Trust Low | 37 (50.7%) | 65 (28.1%) | 102 |
Medium or High Trust | 36 (49.3%) | 166 (71.9%) | 202 |
Column N | 73 | 231 | 304 |
-- EXPECTED FREQUENCIES--
NUMBER OF STRATEGIES
TRUST | Few Strategies | Some or Many Strategies | Row N Sizes |
Trust Low | 24.5 | 77.5 | 102 |
Medium or High Trust | 48.5 | 153.5 | 202 |
Column N | 73 | 231 | 304 |
The chi-squared computation for each cell is:
(37 - 24.5) squared / 24.5 = 156.25/24.5 = 6.4
(65 - 77.5) squared / 77.5 = 156.25/77.5 = 2.0
(36 - 48.5) squared / 48.5 = 156.25/48.5 = 3.2
(166 - 153.5) squared / 153.5 = 156.25/153.5 = 1.0
Sum these four cell results: 6.4 + 2.0 + 3.2 + 1.0 = 12.6
Chi-squared value is 12.6 with 1 degree of freedom. (2-1) * (2-1) = 1 df.
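The same computation can be written as a short sketch that builds the expected frequencies from the row and column totals and then sums the cell contributions. It uses unrounded expected frequencies, so the total differs slightly from the hand calculation but still rounds to 12.6. The variable names are illustrative.

```python
# Observed frequencies from the Berman example.
# Rows: low trust, medium or high trust. Columns: few strategies, some or many.
observed = [
    [37, 65],
    [36, 166],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

chi_squared = 0.0
expected = []
for i, row in enumerate(observed):
    expected_row = []
    for j, fo in enumerate(row):
        # Expected frequency = (row total * column total) / table total.
        fe = row_totals[i] * col_totals[j] / table_total
        expected_row.append(fe)
        chi_squared += (fo - fe) ** 2 / fe
    expected.append(expected_row)

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Standard chi-squared table entries for 1 df: 3.84 (.05 level), 10.83 (.001 level).
print(round(chi_squared, 1), degrees_of_freedom)
```

Since the value exceeds the 1 df table entry for the .001 level, the link between number of strategies and trust is statistically significant.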
2) The t-test is an interval statistic (dependent variable must be interval). It tests the hypothesis that two groups have different means, and that the inter-group difference can be generalized to the population.
Two-sample t-test (SPSS-independent sample) means that each group is considered a sample.
A one-tailed t-test means that your hypothesis has a direction for the relationship. A two-tailed t-test is used to test nondirectional hypotheses. A two-tailed test is stricter, and SPSS reports only the two-tailed test. Hence, if your results are significant for the 2-tailed test and the difference is in the hypothesized direction, they will also be significant for the 1-tailed test.
Two statistics are reported-- for two populations having equal variances, or unequal variances.
The t-test is computed using a complex formula.
Degrees of freedom equal the sum of the two sample sizes minus two.
The absolute value of the t-value must be larger than the table entry to be significant at the specified level.
Using SPSS program. Use Compare Means- Independent Samples Statistics Menu. Your Test Variable is your dependent variable, which should be interval level. Your Grouping Variable should be a dichotomous independent variable (recode it, when necessary). Use Levene's test: if its significance is greater than .05, use the equal variances row; if it is p <= .05, the variances differ, so use the unequal variances row. Cite t-value and 2-tail sig. in papers. Significance Level must be <= .05.
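For readers who want to see what the "complex formula" involves, here is a minimal sketch of the equal-variances (pooled) two-sample t-test. The income codes are made up for illustration and are not Mississippi Poll data; the function name is also hypothetical. SPSS additionally reports an unequal-variances version, which this sketch does not cover.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Two-sample t-value and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2  # sum of the two sample sizes minus two
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical income codes (1 = lowest, 8 = highest) for two groups.
college = [6, 7, 5, 8, 6, 7]
no_college = [3, 4, 5, 2, 4, 3]

t_value, df = pooled_t(college, no_college)

# Standard t table entry for 10 df, two-tailed, .05 level: 2.228.
print(round(t_value, 2), df, abs(t_value) > 2.228)
```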
Example of a t-test problem (drawn from 2004-2008 Mississippi Poll data).
Examining predictors of family income. Family income is treated as an interval-level variable, coded from a low of 1 for under $10,000 to a high of 8 for over $70,000. The following indicates what the average income codes are for pairs of categories of each predictor, as well as what the t-test significance level is. Answer the following two questions: For each predictor, what group has the higher family income? Is the t-test statistically significant for each of the following five predictors (remember, it must be significant at least at the .05 level)?
Linear regression finds the best fitting straight line through a set of points. Best fitting is defined by minimizing the sum of squared vertical distances between the points and the regression line.
Equation of the line is Y = a + (b * X), where Y is the dependent variable, X is the independent variable, a is the Y intercept, and b is the slope of the line, or (change in Y)/(change in X).
R-squared is explained variance, the proportion of the variance in Y explained by the independent variable's regression line.
R-squared = (Total Variation - Unexplained Variation) / Total Variation
Total Variation = sum of squared distances between the mean of Y and each case's Y value
Unexplained Variation (Residual) = sum of squared distances between each case's Y value and each case's predicted Y value (from the regression equation)
Explained Variation = sum of squared distances between each case's predicted Y value and the mean of Y.
b = unstandardized regression coefficient = slope = (change in Y) / (change in X)
Beta = standardized regression coefficient = b * (sdx/sdy), where sd means standard deviation. It adjusts for the differing ranges and scales of the variables.
Beta ranges from -1 to +1 with 0 being no relationship between the independent and dependent variables. The sign depends on the direction of the coding of your variables. A +1 or -1 is a perfect relationship. b values have a greater range which is not confined to 1 or -1.
Pearson R is the correlation coefficient. It equals the Beta in the bivariate case only. See p. 462 of text (p. 466 for 3rd edition) for formula used in calculating R.
R-squared is the explained variation. It is the predictive ability of your independent variable.
Adjusted R-squared shrinks the value of R-squared by penalizing for each additional independent variable, and is statistically preferable to the R-squared. See p. 439 of text (p. 440 of 3rd edition).
The F statistic tests the statistical significance of the regression equation as a whole, and its significance level must be below .05. See p. 442 of text (p. 443 of 3rd edition).
Problem of outlying or deviant cases. See faculty example.
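The regression quantities defined above (a, b, R-squared, and Beta) can be computed directly from their definitions. The sketch below uses made-up years of service and salaries (in thousands of dollars) purely for illustration; they are not the class data, and the function name is hypothetical.

```python
from math import sqrt


def bivariate_regression(xs, ys):
    """Return intercept a, slope b, R-squared, and Beta for Y = a + (b * X)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    total_variation = sum((y - mean_y) ** 2 for y in ys)
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    r_squared = (total_variation - unexplained) / total_variation
    # Beta = b * (sdx / sdy); the (n - 1) terms in the two sd's cancel.
    beta = b * sqrt(sum_xx / total_variation)
    return a, b, r_squared, beta


# Hypothetical data: years since PhD and nine-month salary in $1,000s.
years = [0, 5, 10, 15, 20]
salary = [40, 46, 49, 56, 59]

a, b, r_squared, beta = bivariate_regression(years, salary)
starting_salary = a            # predicted Y when X = 0
salary_at_30 = a + b * 30      # predicted Y when X = 30
print(round(a, 2), round(b, 2), round(r_squared, 3), round(salary_at_30, 1))
```

In the bivariate case Beta equals Pearson R, so Beta squared should equal R-squared, which is a convenient check on hand calculations.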
Example of calculating a bivariate regression problem.
You are asked to examine the relationship between years of service since receiving a PhD degree, and nine-month salaries of ten history professors. You need to plot the following points on graph paper, and then calculate the b value (unstandardized regression coefficient value or slope) and the y-intercept. Also calculate what salary would have to be given to a senior professor hired from another university who has 30 years of service since their PhD, as well as what the starting salary would be (for someone with zero years of service who just got their PhD):
Multiple Regression is linear regression applied to more than one independent variable. With two independent variables, the predicted values comprise a plane (instead of a line in the one independent variable case).
Equation:
Y = a + b1x1 + b2x2 + b3x3 + b4x4
b value is the unstandardized regression coefficient, controlling for the effects of all other predictors. It is used to predict the value of the dependent variable from the known values of the independent variables.
b value is also used in making comparisons across subsamples. For example, it can show whether an independent variable is more important in affecting the dependent variable among men or among women.
Beta is the standardized regression coefficient, controlling for the effects of all other predictors. It tells the relative importance of the independent variables in influencing the dependent variable. Ignoring its sign, it generally ranges from 0 to 1, with 1 being most important and 0 being least important. Negative signs reflect the direction of the variables' coding.
Multiple r is the correlation between the actual Y value and the predicted Y value from the multiple regression equation.
R-squared is the variance in the dependent variable explained by all of the independent variables.
Class lecture on regression assumptions and problems; class examples
Example of a multiple regression equation problem (taken from the 2004-2008 Mississippi Polls).
Predicting who believes they have been racially profiled. This dependent variable is coded 1 for not profiled, and 2 for being profiled. The independent variables and their coding follow:
The Betas or standardized regression coefficients for these predictors follow:
The significance levels for each of these regression coefficients follow:
The adjusted R-squared for this regression equation is 12%.
Using the above information, answer the following questions:
Multiple regression provides only the direct effects that independent variables exert on dependent variables. Yet outside variables may also affect the dependent variable by affecting an intervening variable in the model. Hence, an outside variable may exert an indirect effect on the dependent variable.
Total effects of an independent variable are equal to the sum of the direct effect of that variable and all of its indirect effects. Each indirect effect is the product of the effect that that outside variable has on an intervening variable, and the effect that the intervening variable has on the dependent variable.
Causal Modeling procedures.
1) Devise a model that shows temporal-causal ordering of the
variables
2) Use multiple regression SPSS program and regress each dependent
variable in the model on all of the independent variables that are
"earlier" than it is
3) Draw arrows for all statistically significant linkages. Put
Betas just above each line.
4) Indirect effects involve multiplying the relevant Betas
together
5) Total effect = direct effect + indirect effects
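Steps 4 and 5 amount to simple arithmetic, sketched below. The Betas are hypothetical values for a race, SES, and participation model, chosen only to illustrate the calculation; they are not results from any survey, and the function name is illustrative.

```python
def total_effect(direct_beta, indirect_paths):
    """Direct effect plus the sum of each indirect path's product of Betas."""
    indirect = 0.0
    for path in indirect_paths:
        product = 1.0
        for beta in path:
            product *= beta  # multiply the Betas along one indirect path
        indirect += product
    return direct_beta + indirect


# Hypothetical Betas (assumptions for illustration only).
race_to_participation = 0.10   # direct effect
race_to_ses = 0.30
ses_to_participation = 0.40

race_total = total_effect(
    race_to_participation,
    [[race_to_ses, ses_to_participation]],  # one indirect path, through SES
)
print(round(race_total, 2))
```

With these values the indirect effect (.30 * .40 = .12) is larger than the direct effect (.10), so most of the outside variable's influence would run through the intervening variable.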
Pre-test ---------------> Stimulus ----------------> Post-test
Experimental Group
Pre-test ---------------------------------------------> Post-test
Control Group
Also, both groups must be equal in composition. Ensure equality by: matching; random assignment.
Internal invalidity problems— inferences (conclusions) drawn are not an accurate reflection of what actually happened.
External Invalidity Problems- unable to generalize to a population
(Note: internal and external invalidity problems derived from Donald T. Campbell and Julian C. Stanley's Experimental and Quasi-Experimental Designs for Research, Houghton Mifflin Co., 1963, pages 5-6.)
Solomon 4 Group Design- use the same two groups from the classical experimental design, and include two more groups: one having the stimulus and post-test only, and another having only the post-test. Must have equal groups, which then assumes equal pre-test scores.
Post-Test Only Design- one experimental and one control group, no pre-tests. Groups must be equal, use randomization
Factorial Designs- used with 2 or more stimuli
Classical Experimental Design is strong on internal validity, but weak on external validity
Quasi-experimental designs are only moderate on internal validity, since they are naturally occurring experiments, and people cannot be randomly assigned to the groups.
Two major types of quasi-experiments:
1) Time Series Design- multiple pre-tests before stimulus; multiple post-tests after stimulus; no control group. It fails to control for numerous threats to internal validity.
2) Control Series Design- two time series, one for experimental group, one for control group. Must have groups that are as comparable as possible. Controls for many internal validity problems.
Correlational Design- used extensively in social science research, such as
survey research.
This is a post-test only design, with statistical controls used to
simulate experimental and control groups. However, random assignment is not
used to create groups.
One shot case study is a pre-experiment weak on both internal and external validity. It consists of a stimulus and a post-test.
To test the hypotheses in my model, I used the 1998 telephone survey conducted by the Survey Research Unit of the Social Science Research Center at Mississippi State University. A random sampling technique was used to select the households, and a random method was employed to select one individual in each household to interview. Six hundred eight adult Mississippi residents were interviewed from April 14 to April 26, 1998. The response rate was 64%. The sample was adjusted by demographic characteristics (education, sex, race, adults, phone numbers) to ensure that all social groups were adequately represented in the survey. Census data for 1996 were used to obtain population estimates for education, and census data from 1990 were used for race and sex population estimates. With 608 people surveyed, the sample error is plus or minus 4%, which means that if every Mississippi resident had been interviewed, the results could differ from those reported here by as much as 4%.
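The plus or minus 4% figure can be reproduced with the usual margin of error formula for a proportion. The sketch assumes simple random sampling, a 95% confidence level (z = 1.96), and the most conservative case of a 50/50 split; the methods paragraph does not state these details, so they are assumptions here.

```python
from math import sqrt


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate sampling error for a proportion (assumes simple random sampling)."""
    return z * sqrt(proportion * (1 - proportion) / sample_size)


sample_error = margin_of_error(608)
print(round(100 * sample_error), "percentage points")
```

Because the error shrinks with the square root of the sample size, quadrupling the sample would only cut the sample error in half.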
Information on the methods used in each year of the Mississippi Poll is provided here.
Ecological fallacy is the incorrect assumption that relationships existing at the aggregate level also exist at the individual level.
Example of religion and presidential vote in the 1940s. Two tables showing individual level relations and aggregate marginal results.
First example from 1990 census- foreign born and college
degrees aggregate relationship
STATE.....% FOREIGN BORN.....% COLLEGE DEGREE
Mass...................9%......................20%
N.H....................5%......................18%
Vermont................4%......................19%
N.Y...................14%......................18%
N.J...................10%......................18%
Alab...................1%......................12%
Ark....................1%......................11%
La.....................2%......................14%
Miss...................1%......................12%
Ga.....................2%......................15%
S.C.................2%......................13%
The above table suggests that the foreign born are more likely to have college degrees than are U.S.-born adults. Such a conclusion would be committing the ecological fallacy. In reality, the data are merely indicating that states (not people) with a higher percentage of foreign born residents are also states that happen to have a population that contains a greater percentage of college educated adults, compared to states with a lower percentage of foreign born residents. The relationship between foreign born and education is a spurious (non-causal) one; states with well-funded education systems tend to be located in the Northeast and Midwest, and those are the same states where many immigrants settle.
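To see how strong the aggregate pattern in the first example is, one can compute the Pearson correlation across the eleven states listed in that table. This is only a sketch with illustrative names; the point is that the correlation describes states, and by itself says nothing about whether foreign born individuals are more likely to hold degrees.

```python
from math import sqrt

# (% foreign born, % college degree) for each state in the first example.
states = {
    "Mass": (9, 20), "N.H.": (5, 18), "Vermont": (4, 19), "N.Y.": (14, 18),
    "N.J.": (10, 18), "Alab": (1, 12), "Ark": (1, 11), "La": (2, 14),
    "Miss": (1, 12), "Ga": (2, 15), "S.C.": (2, 13),
}


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sum_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sum_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sum_xy / sqrt(sum_xx * sum_yy)


# The unit of analysis is the state, not the individual.
state_level_r = pearson_r(list(states.values()))
print(round(state_level_r, 2))
```

A strong positive state-level correlation is consistent with many different individual-level patterns, which is why individual-level data are needed before drawing conclusions about people.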
Second example from 1990 census- % black and %
Republican presidential vote at state level of analysis
STATE.....% BLACK.....% REPUBLICAN PRES. VOTE IN 1988
Alabama........25%.............59%
Georgia........27%.............60%
Miss...........36%.............60%
Virginia.......19%.............60%
Iowa............2%.............44%
Minn............2%.............46%
Penn............9%.............51%
Wash............3%.............48%
Wisc............5%.............48%
The above table suggests that African-Americans are more likely to vote Republican for President than are whites. Such a conclusion would be committing the ecological fallacy, since the table provides aggregate data, not individual-level data. The table in reality is merely showing that states having a high percentage of African-Americans are also states that just happen to be more likely to vote Republican for President, compared to states having a lower percentage of African-Americans. The relationship between race and vote at the state unit of analysis is a spurious, non-causal one. African-Americans merely happen to be concentrated in southern states, since such states historically relied on slavery on large plantations, and southern whites tend to be more conservative politically than are whites in the north.
Third example from 2010 census- % black and %
Republican presidential vote at state level of analysis
STATE.....% BLACK.....% REPUBLICAN PRES. VOTE IN 2008
Alabama........26%.............60%
Arkansas.......15%.............59%
Georgia........31%.............52%
Miss...........37%.............56%
Iowa............3%.............45%
Minn............5%.............44%
Penn............11%.............44%
Wash............4%.............41%
Wisc............6%.............42%
Example from Joe Parker book on Mississippi electoral patterns, 1st edition
Problems with cross-sectional surveys that gather data at only one time
point:
1) Inability to study change
2) Hard to make recursive causal assumptions
Panel design definition: the same people, asked the same questions, at two or more time points. Each time point is called a wave.
Problems with panel designs:
Examples of panel studies:
1) National election studies panels of 1956-58-60, 1972-74-76, and
1992-94-96. The second panel study was able to study the effects of
Watergate in the 1970s. The third panel study examined how party
identification and issue attitudes had reciprocal effects, each
affecting the other. It also found reciprocal effects for external
efficacy and turnout.
2) 1980, 4 wave U.S. national election study. It examined the effects of
campaigns on voters. It found that dissatisfaction with President
Carter's leadership caused his defeat.
3) The M. Kent Jennings panel of high school seniors and their parents.
Wave 1 was in 1965, wave 2 in 1973, and wave 3 in 1982. Subject was
socialization and persistence of attitudes over time. It found that
political orientations tended to stabilize by the time people were 30
years old.
Problems with obtrusive measures: (derived from the book Unobtrusive
Measures: Nonreactive Research in the Social Sciences, by Eugene Webb,
Donald T. Campbell, Richard D. Schwartz, and Lee Sechrest; Rand McNally
Co., 1972)
1) Guinea Pig or Testing Effect: subjects may feel they must leave
a good impression, or the test may make them interested in the subject
2) Role Selection: nonrepresentative role selected, especially
by those who are less educated and less familiar with the subject of the test
3) Response Sets, such as acquiescence bias, sequence, wording:
Mississippi Poll, spending items
4) Interviewer Effect: race, age, and sex of interviewer may
affect responses
Unobtrusive Measures directly remove the researcher from the research setting.
Types of Unobtrusive Measures (Source: Research Methods in the Social Sciences, 5th edition, by Chava Frankfort-Nachmias and David Nachmias; St. Martin's, 1996, pages 315-324)