Changes in the notes throughout the class will be indicated here.

RESEARCH METHODS CLASS NOTES

TOPIC ONE: INTRODUCTION TO THE COURSE

Mention the research paper and in-class tests. Mention the MSU Honor Code.

TOPIC TWO: THE SCIENTIFIC METHOD, THE HISTORY OF THE DISCIPLINES OF POLITICAL SCIENCE AND PUBLIC ADMINISTRATION, AND THE FUTURE OF PUBLIC ADMINISTRATION

THE HISTORY OF THE DISCIPLINE OF POLITICAL SCIENCE

Classical era- 700 BC to 1850 AD

Philosophical orientation: asks the "ought" questions. How should things be? What is justice? Who should rule (the wise or the multitude)? What are the obligations of citizens and of government?

Institutional era- 1850 to 1900 AD

Traditional approach, focus on institutional process, how a bill becomes a law, the structure of the government, a legalistic case-study approach, a nation is seen as unitary and as a rational actor, very descriptive approach, very historical method.

The Traditionalist approach to analysis combines the classical and institutional eras.

Transitional era- 1900 to 1945. Problems of irony of form, pluralism exists.

Behavioral era- 1945 to present- characteristics are:

1) Science, theory, predictions, explanation, patterns:
Examples:
a) Theories of presidential voting behavior: sociological theory of voting, such as race and income affecting voter's vote choice; social-psychological University of Michigan model using party identification, issues, and candidate evaluations; simple satisfaction versus dissatisfaction predicting vote for presidential party's candidate.
b) Southern state legislative groups: white Republicans are fairly conservative; African-American Democrats are fairly liberal; and white Democrats are essentially centrist or moderate, depending on the issues.
c) Mass-elite study: which party organization is closer to the average voter on issues? In the 1991 and 2001 Alabama-Mississippi study, it was the Democrats, though their organization has moved to the left over the years. A contrary study examined why Republicans now control a majority of U.S. House and Senate seats in the South; it found that Democratic House members and Senators had steadily moved ideologically to the left since 1970, suggesting that Democratic "elites" in today's South may have become too liberal for many white southern voters.

2) Data gathering and research are theory directed
Examples:
a) For presidential voting behavior, a national survey of voters was conducted, asking their party identification, their attitudes on public issues, and their likes and dislikes of the major party candidates; such a national study would also ask voters their race and income, and whether they were financially satisfied or dissatisfied.
b) For southern state legislative factions, we identified legislators' party from their websites, their race from their pictures, and their roll call votes from newspapers' reports.
c) For mass-elite study, we conducted mail surveys of Democratic and Republican county executive committee members in Mississippi and Alabama, and statewide telephone polls of average adults in both states. We asked identically worded questions on about twenty different public policy issues, including ideological self-identification and party identification.

3) Value free
Examples:
a) We simply seek to predict and explain how the political world operates, we do not let our own opinions about how it should operate influence our research. Hence, though conservatives may claim that Reagan won in 1980 because of his conservative philosophy, and liberals may claim that Clinton won in 1992 because of his moderate liberal philosophy, our research may indicate that each won merely because voters were dissatisfied with the economic recessions (and in 1980 the foreign policy crises).
b) A researcher may be a disillusioned liberal who believes that African-American lawmakers are isolated from all other lawmakers, but the data may show that white Democrats often vote with them on education, race, and election issues, and that Republican lawmakers lose on more roll call votes than do black Democrats (if Democrats control the legislature numerically).
c) A researcher may be a conservative Republican who has friends in the state Republican party headquarters, but the data may show that average voters are essentially moderate, that Democratic party organization members in the South until the turn of the century were moderate liberal, while Republican party organization members were conservative. Hence, especially on education and health care issues, Democratic party members were closer to average southern voters than were Republicans, at least until the turn of the century. A researcher may be a liberal, but the data would show that Democratic U.S. House members and Senators from the South had more and more liberal voting records as the decades passed from 1970 to 2010; hence, today's Democratic elites are too liberal for many conservative white southern voters.

4) Interdisciplinary- sociology, psychology, economics
Examples:
a) The earlier American presidential election studies of the 1940s relied heavily on sociology, proposing that group membership affected the party voted for outside of the South. Rurality, Protestantism, and higher income predicted more Republican votes, while urban residence, Catholicism, and lower income predicted more Democratic votes.
b) My study of Balance Theory drew on psychology. People tend to acquire and retain psychologically consistent beliefs and attitudes. If a person likes a candidate, they tend to believe that the candidate agrees with their own positions on issues regardless of whether the candidate actually does; if a voter dislikes a candidate, they tend to believe that they are in disagreement with the candidate on the issue.
c) Shaffer worked with economics professor (Chressanthis) in studying whether U.S. Senate election margins were accountable to the public, which they were in an indirect sense. Elections were affected by presidential coattails, campaign spending, divisive primaries, and preceding election margin. Economic conditions in the state and federal pork barrel dollars did not affect the elections.

5) Methodological sophistication-
Examples:
a) We conduct national public opinion polls that are representative of the nation's diversity. We do not conduct shopping mall polls, or phone-in or internet polls that fail to reflect the views of lower socioeconomic classes. So we can test whether the sociological group, social-psychological, or economic models of presidential voting are upheld. In yet another published study, I used such national polls from 1960 thru 1976 to explain how voter turnout declined due to decreased political efficacy, decreased partisan intensity, and decreased newspaper readership.
b) The southern state legislative factions research started with one southern state, Mississippi, in only a few years. We expanded to a twenty year time frame in Mississippi. Then, we added other southern states, like Georgia, Florida, Arkansas, and Texas.
c) My mass-elite linkage study started with just Mississippi in one year, but then I contacted Pat Cotter at the University of Alabama and we had a second state for confirmation. We also did the study originally in 1991, and then repeated it in 2001. Finally, in 2001 we also included a national survey that was representative of the entire South, which had many spending preference items.
d) Shaffer's study of balance theory relied on the 1994-1996 American national panel study to examine cognition change over time; panel studies follow the same people over time.
e) Shaffer and Chressanthis study of Senate accountability used pooled time-series, cross-sectional approach. All even-numbered years from 1976 thru 1986 were included, as were all 33 state contests in each election year. Regression and probit were used.

6) Individual and group level of analysis
a) The presidential voting studies used the individual voter as the unit of analysis.
b) The southern state legislative factions also looked at individuals (legislators in this case), but they combined them into three groups based on their race and party.
c) Balance theory and voter turnout studies also looked at individuals.
d) Mass-elite study looked at individuals of different types, the mass voter versus the elite party member.

Criticisms of Behavioralism- are people and events predictable, can we be value free; discuss

History of Public Administration.

Methodological Issues in PA

TOPIC THREE: THEORY BUILDING

Four characteristics of a good theory:

1. Explanation- why does something happen
Examples:
a) Presidential voting models. People vote Democratic because they psychologically identify with the Democratic party, because they are liberal, and because they prefer the Democratic presidential candidate's characteristics. Or, people hold the President's party responsible for economic conditions in the country, so they tend to vote for the President or his party's successor when things are going well, and they tend to vote against him or his party's successor when things are going badly.
b) Southern state legislative factions. White conservatives are gravitating toward the more conservative party nationally, the Republicans, therefore white Republican legislators tend to vote conservatively. Liberal African-Americans tend to join the more liberal party nationally, the Democrats, so African-American Democratic legislators tend to vote liberally. Moderate whites tend to join the more ideologically inclusive party in the modern South, so they tend to be Democrats; hence, white Democratic legislators tend to vote moderately.

2) Prediction- if we know people's positions on the independent variables, can we predict their positions on the dependent variables
Examples:
a) In presidential vote model, if a voter is a Democrat, a liberal, and prefers the Democratic candidate's attributes, we predict that they would vote for the Democratic presidential candidate. If a voter is a Republican, a conservative, and prefers the Republican candidate's attributes, we predict they would vote for the Republican presidential candidate.
b) In the southern state legislative project, we predict that African-American Democrats will tend to vote more liberally, against anti-crime measures, for public education projects, and for affirmative action programs. We predict that white Republicans will tend to vote in the opposite manner, in a conservative direction. We also predict that white Democrats will tend to vote somewhere in between these two groups.
c) Clinton impeachment vote was very partisan in committee. In House Judiciary Committee, conservative white male Republicans opposed demographically diverse liberal Democrats. Click here for info about the Judiciary Committee members.

3) Generalizability- does theory apply to different situations and circumstances and time and geographic areas
Examples:
a) Presidential vote model. Can apply to other offices, such as U.S. Congress, governor, and state legislature. Applies to any time span; 19th century would have different parties though (Whigs and Democrats, Federalists and Democratic-Republicans). Can apply to different geographic areas, such as other nations (Ohio State professor Bradley Richardson used party identification model in Japan, Netherlands, Germany, France, Britain, Italy).
b) Southern state legislative factions project. Can be generalized to other southern states, even to northern states and the Congress, as the literature indicates. Can be generalized over time, such as 1980 to present. Can it be generalized to other nations having a newly empowered group, such as South Africa?

4) Parsimony- simple with few independent variables, simplest theory is best if everything else is equal
Examples:
a) Presidential Vote models. The party identification model is parsimonious, as it has only three predictors--party identification, issues, candidates. The economic dissatisfaction model has even fewer predictors--one.
b) Southern state legislative factions project. It has only two predictors--party and race of legislator. The dependent variable is less parsimonious, as it is not merely ideology, but different types of issues such as education, crime, race issues.

Example of Predictive Ability of a Theory.

The party identification model. The last eight presidential elections (since 1984, inclusive) were very competitive with Democrats winning four and Republicans winning four. So if we had no other information about a state like Mississippi, we would predict that a Mississippi survey respondent would have a 50-50 chance of voting Democratic or Republican. Our predictive success improves once we ask a respondent what their party identification (a 7-point scale) is. How they vote follows (using the Mississippi Poll data):

Many conservative whites switched to the GOP in the last two decades of the 20th century, so let us repeat this analysis, examining only the first four presidential elections in the 21st century, where Republicans won the first two with Bush, and Democrats won the last two with Obama.

Hypothesis Testing

Independent variable is the predictor; it comes first temporally and causally, it causes the dependent variable.

Dependent variable is the effect, it is being caused by the independent variable.

Ideology --------------------------> Presidential Vote
(Independent variable)               (Dependent variable)

Hypothesis is a statement of a relationship between concepts.

Example: self-identified conservatives are more likely to vote Republican, compared to self-identified liberals.

Hypothesis test- example with crosstabulations, put independent variable at top, dependent variable at the side. Calculate column percents.

VOTE FOR:        LIBERAL    MODERATE    CONSERVATIVE
BARACK OBAMA     81%        67%         27%
MITT ROMNEY      19%        33%         73%
TOTAL            100%       100%        100%
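
The column-percent calculation described above can be sketched in a few lines of Python. This is only an illustration: the notes report percentages, not cell counts, so the counts below are hypothetical numbers chosen so that the column percents match the table. The function name `column_percents` is likewise an assumption made for this sketch.

```python
def column_percents(table):
    """Convert counts to column percents.

    table maps each category of the independent variable (the columns)
    to a dict of dependent-variable categories (the rows) and counts.
    """
    result = {}
    for column, cells in table.items():
        column_total = sum(cells.values())
        result[column] = {
            row: round(100 * count / column_total)
            for row, count in cells.items()
        }
    return result


# HYPOTHETICAL counts, invented so the percents match the table above.
counts = {
    "Liberal": {"Obama": 162, "Romney": 38},
    "Moderate": {"Obama": 201, "Romney": 99},
    "Conservative": {"Obama": 108, "Romney": 292},
}

percents = column_percents(counts)

# Test the hypothesis by comparing percentages ACROSS the columns:
# how much more Republican are conservatives than liberals?
gap = percents["Conservative"]["Romney"] - percents["Liberal"]["Romney"]
print(percents)
print("Conservative minus liberal Romney vote:", gap, "percentage points")
```

Percentaging down each column and then comparing across columns is what lets us say that conservatives were far more likely than liberals to vote Republican.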


                      Theory
                        |
                       \|/
                   -Hypothesis-

     Concept <------------------------> Concept
          (Relationship between concepts)

The hypothesis above is at the theoretical level- general, abstract

     Indicator <------------------------> Indicator
     (Relationship between indicators; hypothesis testing)

Operationalizing your concept is to select specific indicators of your abstract concepts. Hypothesis testing occurs at the indicator level, and it measures the relationship between the indicators.

If hypothesis is rejected, maybe the indicator is not valid.

Religiosity example of a theory.

At the theoretical level, the two principal concepts are Social Deprivation and Religiosity. The principal hypothesis at the theoretical level is that people who are socially deprived are more likely to be intensely religious than are people who are not socially deprived.

Operationalizing the concepts is to choose valid, specific indicators of those concepts. One indicator of religiosity might be frequency of church attendance. An indicator of social deprivation might be annual family income before taxes. The major problem with operationalizing one's concepts is whether the indicators are valid measures of those theoretical concepts. Is a person who attends church twice a week necessarily more religious than someone who never attends church, but who reads the Bible and prays daily? Is a person with a large family income, but who also has a large family size, necessarily well-off financially? Can you think of more valid indicators of these concepts of social deprivation and religiosity?

Hypothesis Testing measures the relationship between the indicators. Are people with low family incomes more likely to attend church weekly, compared to people with high family incomes? Are people with lower net financial worths more likely to pray daily, compared to people with high net financial worths? If your hypothesis is rejected, there may be two reasons. Perhaps your theory is rejected, or perhaps your indicators are not valid measures of your concepts.

Actual Results of This Hypothesis Test:

Using the 2004-2012 Mississippi Poll, no substantively significant relationship was found between reported family income and reported frequency of church attendance (indeed, the relationship was the reverse of what we hypothesized).

TOPIC FOUR: INTRODUCTION TO RESEARCH PAPER, MODEL, HYPOTHESES

YOUR RESEARCH PAPER

1) Introduction- discuss the importance of your subject. Discuss your initial expectations. Example of gender gap in presidential voting--why are women voting slightly more Democratic than are men? Why is this subject important? Why do you think this female Democratic bias is occurring?

2) Your model and hypotheses. List all five of your hypotheses, and draw your model.

Example of a model and its hypotheses:
Assume that sex is the earliest, independent variable; presidential vote is the latest, dependent variable; ideology and income are the two intervening variables located between sex and vote.

SEX -------(H1)-------> Ideology ------(H2)------> PRESIDENTIAL
Male or ------------------(H3)-------------------> VOTE
Female ----(H4)-------> Income --------(H5)------> (D or R)

The hypotheses are:
H1: Women are more likely to be liberal, compared to men.
H2: Liberals are more likely to vote Democratic for President, compared to conservatives.
H3: Women are more likely to vote Democratic for President, compared to men.
H4: Women are more likely to have lower incomes, compared to men.
H5: Lower income people are more likely to vote Democratic for president, compared to higher income people.
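
One way to check that a model like this has the required four variables and five arrows is to write each hypothesis as an arrow from a cause to an effect. The sketch below does that in Python; the function names `variables` and `paths` are made up for this illustration and are not part of any statistical package.

```python
# Each hypothesis is one arrow in the model: (cause, effect).
hypotheses = {
    "H1": ("Sex", "Ideology"),
    "H2": ("Ideology", "Presidential Vote"),
    "H3": ("Sex", "Presidential Vote"),
    "H4": ("Sex", "Income"),
    "H5": ("Income", "Presidential Vote"),
}


def variables(model):
    """Return the set of all variables mentioned in the model."""
    return {name for arrow in model.values() for name in arrow}


def paths(model, start, end):
    """List every causal path from start to end, as hypothesis labels.

    Assumes the model has no loops (arrows only point forward in time).
    """
    found = []
    for label, (cause, effect) in model.items():
        if cause != start:
            continue
        if effect == end:
            found.append([label])
        else:
            found.extend([label] + rest for rest in paths(model, effect, end))
    return found


print(len(variables(hypotheses)), "variables,", len(hypotheses), "arrows")
all_paths = paths(hypotheses, "Sex", "Presidential Vote")
print(all_paths)
```

The three paths show the logic of the model: sex may affect the vote directly (H3), or indirectly through ideology (H1 then H2) or through income (H4 then H5).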

3) Literature review. Need at least 10 academic sources. The articles should be grouped by hypothesis, even if you must discuss the same article more than once. Most students use an on-line database for their literature search, such as JSTOR (website: http://www.jstor.org/action/showAdvancedSearch). For my on-line bibliography of articles since 1975 in four political science journals, click here. In your browser, click on EDIT at the top of the page, then click on FIND (ON THIS PAGE), and then type the keyword in the FIND WHAT box. Keep clicking on the FIND NEXT box to find multiple articles. Also, use different keywords for each of your variables (concepts).

4) Methods section. Provide information for each of the years of the Mississippi Poll that you are using. For information about the polls, click here. Information on the sampling methods used in each year is provided here. Three sample paragraphs for your paper follow:

METHODS

To test my model, I used information drawn from The Mississippi Poll project, a series of statewide public opinion polls conducted by the Survey Research Unit of the Social Science Research Center (SSRC) at Mississippi State University and directed by political science professor Stephen D. Shaffer. In order to maximize my sample size and therefore minimize my sample error, I combined or pooled telephone surveys conducted in two years-- 2000 and 2004. The 2000 Mississippi Poll surveyed 613 adult Mississippi residents from April 3 to April 16, 2000 and had a response rate of 49%, while the 2004 Mississippi Poll surveyed 523 adult Mississippi residents from April 5 to April 21, 2004 for a response rate of 48%. The two years combined contained only 765 likely voters- respondents whose responses to three questionnaire items indicated that they were likely to vote in the presidential election, and to vote for candidates of the two major parties. With 765 likely voters interviewed, the sample error is 3.6%, which means that if every Mississippi likely voter had been interviewed, the results could differ from those reported here by as much as 3.6%. The pooled sample was adjusted or weighted by demographic characteristics to ensure that social groups less likely to answer the surveys or to own telephones were also represented in the sample in rough proportion to their presence in the state population. In both years, a random sampling technique was used to select the households and each individual within the household to be interviewed, and no substitutions were permitted. The SSRC's Computer Assisted Telephone Interviewing System (CATI) was used to collect the data.

I relied on four variables included in both years of the Mississippi Poll. Sex is very straightforward, while income was measured by reported total family income before taxes in the year before each survey. The presidential vote asked respondents six months before the election which of the two major party candidates they planned to vote for if the election were held today. Ideology was a self-identification question, asking respondents the following questions: "What about your political beliefs? Do you consider yourself very liberal, somewhat liberal, moderate or middle of the road, somewhat conservative, or very conservative?"

In order to have enough people to analyze using multivariate tables, I recoded or combined categories of two of the variables. Eight income categories were recoded into three levels--low income was defined as families making less than $20,000 a year, middle income was considered as $20-40,000 per year, and high income included families making over $40,000 annually. Five ideological self-identification categories were combined into three groups-- liberals included those considering themselves as "very" or "somewhat" liberal, conservatives were those identifying themselves as "somewhat" or "very" conservative, and the middle category of "moderate/middle of the road" constituted an intermediate "moderate" grouping. Sex and presidential vote already had only two categories for each, so they did not have to be recoded.
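
Two of the computations in the sample Methods section above can be sketched in Python: the sample error for 765 likely voters, and the recoding of ideology and income into three groups. Several assumptions are made here. The sample error uses the usual formula for a simple random sample at the 95% confidence level, which gives roughly 3.5%; the notes report 3.6%, and they do not say whether weighting or rounding accounts for the small difference. The income recode works on a dollar amount because the eight original income categories are not listed in the notes, and the treatment of exactly $20,000 and $40,000 is a guess.

```python
import math


def sample_error(n, p=0.5, z=1.96):
    """Approximate 95% sample error for a simple random sample of size n.

    p = 0.5 is the most conservative assumption (largest error).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Five ideology categories combined into three, as described above.
IDEOLOGY_RECODE = {
    "very liberal": "liberal",
    "somewhat liberal": "liberal",
    "moderate or middle of the road": "moderate",
    "somewhat conservative": "conservative",
    "very conservative": "conservative",
}


def recode_income(dollars):
    """Recode family income into low, middle, or high.

    ASSUMPTION: boundary cases ($20,000 and $40,000 exactly) are placed
    in the middle group; the notes do not specify this.
    """
    if dollars < 20_000:
        return "low"
    if dollars <= 40_000:
        return "middle"
    return "high"


error = sample_error(765)
print(f"Sample error for 765 likely voters: {error:.1%}")
print(IDEOLOGY_RECODE["somewhat liberal"], recode_income(35_000))
```

Note how the sample error shrinks only with the square root of the sample size, which is why pooling two years of polls helps but does not cut the error in half.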

5) Findings-- bivariate. Test each of your 5 hypotheses using crosstabs. Compare percentages using complete sentences, which test your hypotheses. Mention the direction of the relationship, the magnitude of the relation using gamma or average percentage difference, and statistical significance level using chi-squared. Also, draw all tables and provide variable and value labels, and column percents and totals.

TABLE 3

SEX DIFFERENCES IN PRESIDENTIAL VOTE

                                    SEX
                              Male      Female
Gore or Kerry (D) Vote        41%       43%
George Bush Jr. (R) Vote      59%       57%
N Size                        (359)     (406)

Gamma = -.04
Chi-squared significance level > .05
Note: Percentages total 100% down each column.
Source: 2000 and 2004 Mississippi Polls, conducted by Mississippi State University.

Example of text paragraph:

Hypothesis 3 of my model states that women will be more likely to vote Democratic for president, compared to men. In the 2000 and 2004 Mississippi Polls, 43% of women indicated that they intended to vote for Democratic presidential candidates, compared to a slightly smaller 41% of men who indicated an intended Democratic vote. However, this percentage difference in Democratic vote between the sexes is only 2%, and the gamma value reflecting the magnitude of the relationship between sex and the presidential vote is a mere -.04. Furthermore, the Chi-squared statistic is not significant at the .05 level, indicating that we cannot generalize this weak relationship between sex and the presidential vote, found in the 2000 and 2004 statewide polls, to the entire population. Hence, my hypothesis that women are more likely than men to vote Democratic for president is rejected.
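
The gamma and chi-squared values cited in the paragraph above can be approximated by hand or with a short Python sketch like the one below. It is a rough check only: the cell counts are reconstructed from the rounded percentages and N sizes in Table 3, and the poll's weighting is ignored, so SPSS output on the real data may differ slightly. The function names are made up for this illustration.

```python
def gamma_2x2(a, b, c, d):
    """Gamma for a 2x2 table laid out as [[a, b], [c, d]]."""
    concordant = a * d
    discordant = b * c
    return (concordant - discordant) / (concordant + discordant)


def chi_squared(table):
    """Pearson chi-squared statistic for a table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Counts RECONSTRUCTED from Table 3's rounded percentages (approximate).
male_n, female_n = 359, 406
male_dem = round(0.41 * male_n)
female_dem = round(0.43 * female_n)
table = [
    [male_dem, female_dem],                       # Democratic vote
    [male_n - male_dem, female_n - female_dem],   # Republican vote
]

gamma = gamma_2x2(table[0][0], table[0][1], table[1][0], table[1][1])
chi2 = chi_squared(table)

# Critical value of chi-squared at the .05 level with 1 degree of freedom.
CRITICAL_05_DF1 = 3.841
significant = chi2 > CRITICAL_05_DF1
print(f"gamma = {gamma:.2f}, chi-squared = {chi2:.2f}, significant: {significant}")
```

A chi-squared value below the critical value means the relationship is not statistically significant at the .05 level, which is why the sample paragraph rejects Hypothesis 3.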

6) Findings- multivariate. At least control for your two intervening variables. Provide information listed in 5. What do these multivariate tables tell you about which of the variables is important in influencing the dependent variable, and about how important each is.

7) Conclusions- Redraw your model, discuss your findings and literature, suggestions for future research.

8) References- alphabetize your references by authors' last name. Give full citations for scholarly articles, books, and other citations.

Review a sample model and hypotheses.

TOPIC FIVE: ASSIGNMENT- REVIEW THE MISSISSIPPI POLL CODEBOOK

Review the Mississippi Poll codebook, and choose four variables that will constitute the model that you will do your research paper on. Now, draw up the model, and type the exact wording of your five hypotheses. Turn this in to me at the next class.

An easy-to-read summary of the Mississippi Poll codebook, which includes the variables that are included in multiple years, is available here.

Examine the three samples of student research papers:
Sample one
Sample two
Sample three

TOPIC SIX: RESEARCH DESIGN

10 STAGES OF A RESEARCH DESIGN

1) Problem Formulation- what are you studying, why is it important. Rivenbark article, casino gambling, importance due to regressivity, hurts poor, addiction.

2) Literature review- thorough. Political science journals are: American Political Science Review, American Journal of Political Science, Journal of Politics, American Politics Quarterly, Public Opinion Quarterly. For a list of on-line political science articles, click here.

Public administration journals: Public Administration Review, American Review of Public Administration; check syllabus for a sample of articles and journals.

Literature suggests hypotheses.

3) Identify Unit of Analysis- what are you collecting data on, getting information about what units.

The four units of analysis are: Individual, county, state, nation
a) Individual level examples are public opinion polls.
b) County level example is a public policy study examining spending in each of Mississippi's 82 counties.
c) State level example is a public policy study examining spending in each of the nation's 50 states.
d) Nation unit of analysis example may be relating each nation's suicide rate to the absence of Catholicism in its population.

Test your ability to identify the unit of analysis of ten different studies by going back to the directory for this class, and accessing one of the sample tests for Test 1

4) Design data collection mode- survey, roll call, aggregate (unit analysis above individual), content analysis:
a) Survey is a public opinion survey. It can be of the mass population, or of a more specialized group, such as government workers.
b) Roll call mode deals with congressional or state legislative votes on public issues, and often includes demographic characteristics of their district's constituents.
c) Aggregate mode deals with a level of analysis higher than the individual. It deals with cases that combine numbers of individuals, such as counties, states, etc. The data are often secondary data analysis, collected by government agencies.
d) Content analysis is a study of the characteristics of messages, such as how ideologically biased the mass media is, and how many liberal or conservative themes are voiced by a President or governor.

5) Pre-test survey anticipates validity problems with indicators, and suggests variables you left out. For a statewide public opinion poll of 600 Mississippians who are asked 100 questions, you might ask a random sample of 25 Starkville residents the 100 questions, and then ask the interviewers whether the respondents had difficulty answering any of the questions, and if so why.

6) Data collection, surveys use CATI system, or secondary data analysis (use existing dataset).
CATI stands for Computer-Assisted Telephone Interviewing system, and is used for the researcher to collect her own data on an original study.
Secondary data analysis relies on existing data sources, such as the University of Michigan National Election Studies conducted every two years, or the MSU Mississippi Poll conducted every two years.

7) Data reduction, usually obsolete with CATI, often needed with in-person and mail surveys; enter data into SPSS program. Demonstrate in class.

8) Design statistical analysis technique, do a simple one first such as crosstabs.

9) Perform analysis, get results, show tables and results, discuss results.

10) Conclusions- what you found, so what, importance, theory upheld or rejected, future research directions.

TOPIC SEVEN: INTRODUCTION TO LITERATURE REVIEWS AND LIBRARY WORK

ASSIGNMENT DUE: TURN IN MODEL CONTAINING FOUR VARIABLES AND FIVE ARROWS, AND FIVE HYPOTHESES, AND AN INTRODUCTION TO YOUR TOPIC.

Refer to your research paper model, and the five hypotheses you have proposed to conduct your literature review. You should find scholarly journal articles that examine each of these five hypotheses. Most students use an on-line database for their literature search, such as JSTOR (website: http://www.jstor.org/action/showAdvancedSearch). You can also use the professor's on-line bibliography of journal articles. Click on "EDIT" and then "FIND IN PAGE". In the Box that says "FIND WHAT?" type in the name of one of your variables. Keep finding relevant journal articles. If you don't find enough, slightly vary the name of that variable. Now repeat this step with the other three variables. Go to the library on the bottom floor, and find each journal in the stacks in alphabetical order by the name of the journal.

TOPIC EIGHT: ETHICAL CONCERNS

Stanley Milgram study- obedience to authority.

Informed Consent, components of-

Anonymity versus Confidentiality-
Anonymity- no one can identify a person with their responses
Confidentiality- researcher knows who the respondent is, but promises not to tell anyone

Examples of informed consent:
1) Mississippi Poll
2) NSF Grassroots Party Activists cover letter

One must never harm subjects.
MSU Human Subjects form approval (this is an older form, but it is referenced because it clearly and concisely asks about important issues)
Subpoena problem: confidential data can be subpoenaed, so convert confidential data into anonymous data as soon as possible.

Studies having ethics problems:
1) MSU literacy study, when suspected interviewer fraud results in Attorney General request for respondent info
2) Ray Cleere's workplace study included identifiable questions and political questions, and MSU dropped out of it
3) NSF Grassroots Party Activists study- ICPSR deleted county and state variables

Political biases are a major problem in funded research:
1) Media sensationalism- 1982 Clarion-Ledger Senate poll
2) Official suppression of studies they disagree with- Mabus governmental child care study suppressed by Fordice administration

ASPA Code of Ethics: 5 sources of ethics
1) Serve Public Interest: oppose discrimination and harassment, promote affirmative action; public right to know; involve citizens in decisionmaking
2) Respect Law and Constitution: change obsolete, counterproductive laws; prevent mismanagement of public funds, need audits; protect privileged information; protect whistleblowers
3) Personal Integrity: give others credit for their work (no plagiarism); avoid appearance of conflict-of-interest, such as nepotism, gift acceptance, misusing public resources, improper outside employment; act nonpartisan in actions; admit own errors
4) Ethical Organizations: promote creativity, open communication among workers; permit dissent with no reprisal, use due process; use merit
5) Professional Excellence: keep current on new issues and problems, upgrade professional competence; be active in professional associations; help public service students, such as by providing internships

Review the full text of ASPA's ethics code.

TOPIC NINE: LEVELS OF MEASUREMENT

LEVELS OF MEASUREMENT

NOMINAL- lowest level of measurement, mere classification. No ability to order the categories.
Examples are religion. Use crosstabulations.

ORDINAL- able to order the categories of the variable, so that one category has more of some quality than the next category. But we cannot determine how much more of that quality one category has compared to another.
Example is rating job performance of public officials into excellent, good, fair, or poor categories.

INTERVAL- able to order the categories, and also determine how much of the quality the category has. Usually has numbers that have meaning to denote how much of the quality each category has.
Example is income. Use regression techniques.

Test your ability to classify indicators by nominal, ordinal, and interval levels of measurement by turning to the sample tests, test 1.

TOPIC TEN: RELIABILITY AND VALIDITY

RELIABILITY

Definition- repeated measurements of a concept (the indicator) should yield similar results.

Tests of reliability:

1) Test-Retest- using the same indicator on the same people at two or more time points. Should have consistent responses at both time points.

TEST-RETEST RELIABILITY TEST OF PARTY IDENTIFICATION

(Note: the following table is derived from Herbert B. Asher's Presidential Elections and American Politics, 5th edition, page 71; Brooks/Cole co., 1992)

1976 Partisanship

1972 Party Id  Strong Dem.  Weak Dem.  Indep. Dem.  Pure Indep.  Indep. Rep.  Weak Rep.  Strong Rep.
Strong Dem.    9            4          1            0            0            0          0
Weak Dem.      5            13         3            2            1            1          0
Indep. Dem.    2            3          4            1            1            0          0
Pure Indep.    1            1          2            5            2            1          0
Indep. Rep.    1            0          1            3            5            2          1
Weak Rep.      0            1          0            1            3            7          2
Strong Rep.    0            0          0            0            1            4          6

How much stability is there in this table? How many people have given the same response at both time points? Count the number of people in the diagonal. The number remaining stable in attitudes = (9 + 13 + 4 + 5 + 5 + 7 + 6) = 49. The total number of people in the table is 100. Hence, 49% of the sample has remained stable in attitudes. Is 49% high or low reliability? The stable percent must be compared to chance alone. Chance stability is the number of stable cells, divided by the total number of cells in the table. Hence, chance stability is 7 / 49 = 14%. Since 49% is significantly higher than 14%, this indicator is reliable.
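The count above can be checked with a few lines of code. This is a minimal sketch that follows the notes' own rule of comparing the diagonal share to "stable cells divided by all cells"; the function names are illustrative only.

```python
# Rows: 1972 party identification; columns: 1976 party identification.
# Both run from Strong Democrat to Strong Republican, as in the table above.
table = [
    [9, 4, 1, 0, 0, 0, 0],
    [5, 13, 3, 2, 1, 1, 0],
    [2, 3, 4, 1, 1, 0, 0],
    [1, 1, 2, 5, 2, 1, 0],
    [1, 0, 1, 3, 5, 2, 1],
    [0, 1, 0, 1, 3, 7, 2],
    [0, 0, 0, 0, 1, 4, 6],
]


def observed_stability(table):
    """Share of cases on the diagonal (same response at both time points)."""
    stable = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return stable / total


def chance_stability(table):
    """The notes' chance benchmark: number of diagonal cells / number of cells."""
    k = len(table)
    return k / (k * k)


print(round(100 * observed_stability(table)))  # 49
print(round(100 * chance_stability(table)))    # 14
```

The same two functions work for any square table in which the row and column categories match.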

A more recent example follows:

(Source of the following info is: Political Behavior of the American Electorate, 12th edition, by William H. Flanigan and Nancy Zingale, p. 104; data originally are from the Youth-Parent Socialization Panel Study, 1965-1997, Youth Wave, data provided by the ICPSR).

 

 

                       DEMOCRATS   INDEPENDENTS   REPUBLICANS
                       In 1982     In 1982        In 1982
DEMOCRATS In 1997      23          5              4
INDEPENDENTS In 1997   7           27             10
REPUBLICANS In 1997    2           5              17

2) Alternate Forms (Parallel Forms)- using two or more indicators on the same people at one time point. Should have consistent responses for both indicators.

ALTERNATE FORMS

2002 Party Identification

Party that is best for "People like you" Democratic Independent Republican
Democrats 172 51 7
Both are Equal 18 40 29
Republican 6 39 157

Consistent responses for both indicators are Democrats who believe that the Democratic party is best for people like themselves, Republicans who believe that the Republican party is best for people like themselves, and Independents who believe that both parties are equally good for people like themselves. The number of consistent responses is (172 + 40 + 157) = 369.

The total number of people is 519. The percentage of people who give consistent responses is 369 / 519 = 71%. How reliable is the party identification indicator compared to chance alone? Chance is the number of consistent cells divided by the total number of cells: 3 / 9 = 33%. Since 71% is significantly greater than 33%, the party identification indicator is reliable.

Note- these data are from the Mississippi Poll.

3) Split Half- using multiple indicators of a concept on the same people at one time point. Forms two scales with each combining people's responses on half of the indicators. The two scales' scores should be consistent for people.

Health care example. In 2004 the Mississippi Poll included seven questions about how important people thought a number of health care issues were, and they rated them from scores of 1 for Very Important to scores of 4 for Not Important. An item on Recruiting and Retaining Doctors was not highly related to the other six items, so we excluded it from analysis. The other six items were:

These six indicators were divided into two groups: Group A included items 1, 3, and 5; and Group B included items 2, 4, and 6. Responses to all three items in each group were added together. Since each item was coded to range from a 1 to 4, the scale for each group ranges from a 3 to a 12. The Pearson correlation between the two scales is a .71, which is pretty respectable.
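The split-half procedure just described can be sketched in code. The six-item responses below are hypothetical (the Mississippi Poll data are not reproduced here), and the function names are illustrative; only the steps (odd items into Group A, even items into Group B, then a Pearson correlation) come from the notes.

```python
from math import sqrt


def split_half_scales(responses):
    """Add items 1, 3, 5 into Group A and items 2, 4, 6 into Group B."""
    group_a = [r[0] + r[2] + r[4] for r in responses]
    group_b = [r[1] + r[3] + r[5] for r in responses]
    return group_a, group_b


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical respondents; each item is coded 1 (Very Important) to 4 (Not Important).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 2, 1, 1, 2, 1],
    [2, 2, 3, 2, 2, 3],
    [4, 3, 4, 4, 3, 4],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 2, 3, 4, 3],
]

group_a, group_b = split_half_scales(responses)
print(group_a, group_b)               # each scale score falls between 3 and 12
print(pearson_r(group_a, group_b))    # closer to 1 means more consistent halves
```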

Another way of testing consistency is with a crosstabulation. Looking at the frequency distributions of each scale, I combined each scale's codes as follows: 3 and 4 were coded as High Priority; 5 and 6 were coded as Medium; 7 thru 12 were coded as Low Priority. The crosstabulation follows:

SPLIT HALF EXAMPLE

Group A Scale

Group B Scale High Medium Low
High 141 25 1
Medium 81 104 29
Low 6 24 47

Notice that 292 people (141 + 104 + 47) gave consistent responses to both of the scales. They fall in the diagonal, being high-high, medium-medium, or low-low. The total number of people in the table is 458. Therefore, 292/458 people gave consistent responses, or 64% of the sample. Chance alone would predict about one-third or 33%. So the six indicators of the importance of health care demonstrate some reliability.

4) Cronbach's Alpha- used for multi-indicator indexes, calculates how reliable the component indicators are. Ranges from 0 for unreliable to 1 for most reliable. The Cronbach's Alpha for the six health care items included in the 2004 Mississippi Poll analysis discussed earlier was .80.
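For reference, a minimal sketch of how Cronbach's Alpha is usually computed, using the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the summed scale), where k is the number of items. The responses are hypothetical, not the Mississippi Poll data, and the function name is illustrative.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                  # one tuple of scores per item
    item_variance_sum = sum(pvariance(col) for col in items)
    total_variance = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical respondents answering six items coded 1 to 4.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 2, 1, 1, 2, 1],
    [2, 2, 3, 2, 2, 3],
    [4, 3, 4, 4, 3, 4],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 2, 3, 4, 3],
]

print(cronbach_alpha(responses))
```

When every item gives identical scores, the formula returns exactly 1, which is a quick sanity check on the function.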

Reasons for low observed reliability:

VALIDITY

Definition- are we really measuring what we think we are measuring?

Types of validity tests:

1) Face Validity- on its face, it appears to be valid. Simple concepts, such as a ruler. Just use it.

Very well established indicator, don't question it.

2) Construct (Criterion) Validity- relate your questionable indicator to more well established indicators, and see whether it behaves as you expect it to behave.

CONSTRUCT VALIDITY

Questionable Indicator is Party Identification

Well Established Indicators Strong Dem Weak Dem Indep. Dem. Pure Indep. Indep. Rep. Weak Rep. Strong Rep.
Pres. Vote
1984-1992 13% 54% 49% 77% 95% 91% 95%
1996-2004 7 32 22 58 87 92 94
2008-2012 11 26 20 64 88 95 89
1984 15 65 48 90 94 83 91
1988 13 46 52 68 94 87 98
1992 7 49 47 50 100 98 95
1996 7 26 23 45 90 84 92
2000 7 30 30 63 84 97 93
2004 7 40 15 69 86 96 97
2008 18 20 24 72 85 96 86
2012 2 36 0 50 93 93 92
Senate Vote
1984-1994 29% 54% 55% 80% 86% 80% 92%
1984 25 53 44 79 80 63 93
1988 15 40 50 80 83 76 92
1994 50 73 77 80 93 93 94
2014 12 26 32 67 67 92 94

Note: Cell entries are percentage vote for Republican candidate among each of the seven party identification categories. These data are from the Mississippi Poll.

Our expectations are that the percentage Republican vote would increase steadily as one moves from the most Democratic party identification category of Strong Democrat to the most Republican party identification category of Strong Republican. Examining the 1988 presidential vote indicator, we see a steady increase in Republican vote as we move from Strong Dem. to Strong Rep., with one reversal involving two cells. Only 87% of Weak Republicans voted for Republican Bush, while 94% of Independent Republicans voted for Bush. Those two categories should have reversed percentages, so circle both of those cells, since they involve validity problems with the party identification indicator. Examine the 1996 presidential vote and you find two sets of validity problems, one among Democrats and one among Republicans. Circle the four cells having validity problems.

Repeat this validity test for the other vote indicators, including the Senate vote items, and discuss the validity problems with the party identification indicator that you find.
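The "circle the cells" check can be sketched as a search for reversals between neighboring party identification categories. The two rows below are copied from the 1988 and 1996 presidential vote lines of the table; the function name is illustrative.

```python
CATEGORIES = [
    "Strong Dem.", "Weak Dem.", "Indep. Dem.", "Pure Indep.",
    "Indep. Rep.", "Weak Rep.", "Strong Rep.",
]

# Percent voting Republican, from the construct validity table.
pres_1988 = [13, 46, 52, 68, 94, 87, 98]
pres_1996 = [7, 26, 23, 45, 90, 84, 92]


def reversals(percents):
    """Return neighboring category pairs where the Republican vote drops."""
    found = []
    for i in range(len(percents) - 1):
        if percents[i + 1] < percents[i]:
            found.append((CATEGORIES[i], CATEGORIES[i + 1]))
    return found


print(reversals(pres_1988))  # [('Indep. Rep.', 'Weak Rep.')]
print(reversals(pres_1996))  # [('Weak Dem.', 'Indep. Dem.'), ('Indep. Rep.', 'Weak Rep.')]
```

Each pair returned corresponds to two cells to circle, so 1988 has two problem cells and 1996 has four.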

A good example of a construct validity test is in an MSU publication about health care issues. See table 1 in this link.

3) Convergent-Discriminant Validity Test- different measures of the same concept should yield similar results; the same measures of different concepts should yield different results. Examine correlation matrix.

A good example of a convergent-discriminant validity test is provided in table 8 of an MSU publication. See this link.

Another good, recent example follows:

EXAMPLE OF CONVERGENT-DISCRIMINANT VALIDITY TEST

 

Adult Mississippians’ views of political issues in 2010 and 2012 (Pearson r’s)

Columns: (1) Abortion; (2) Gay Marriage; (3) Affirmative Action; (4) Gov’t help Blacks’ socioeconomic Position; (5) Gov’t help get doctors-hospitals low cost; (6) Gov’t help get jobs and good living standard

                            (1)   (2)   (3)   (4)   (5)   (6)
Abortion                     -
Gay Marriage                .32    -
Affirmative Action          .04   .15    -
Black socioecon. Position   .15   .21   .61    -
Doctors & Hospitals         .16   .23   .43   .61    -
Jobs & living standards     .11   .20   .41   .55   .67    -

 

Civil liberty (abortion-gay marriage) average intra-cluster correlation = .32

Economic welfare (affirmative action-black socioeconomics-doctors-jobs) average intra-cluster correlation = .55

Average Inter-cluster correlation (between items from these two dimensions) = .16
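The three averages reported above can be reproduced from the Pearson r's in the matrix. A minimal sketch; the short item names and function names are illustrative labels, not names from the poll.

```python
from itertools import combinations

# Pearson r's copied from the 2010 and 2012 matrix above.
r = {
    ("abortion", "gay_marriage"): .32,
    ("abortion", "affirmative_action"): .04,
    ("gay_marriage", "affirmative_action"): .15,
    ("abortion", "black_position"): .15,
    ("gay_marriage", "black_position"): .21,
    ("affirmative_action", "black_position"): .61,
    ("abortion", "doctors_hospitals"): .16,
    ("gay_marriage", "doctors_hospitals"): .23,
    ("affirmative_action", "doctors_hospitals"): .43,
    ("black_position", "doctors_hospitals"): .61,
    ("abortion", "jobs_living"): .11,
    ("gay_marriage", "jobs_living"): .20,
    ("affirmative_action", "jobs_living"): .41,
    ("black_position", "jobs_living"): .55,
    ("doctors_hospitals", "jobs_living"): .67,
}

civil_liberty = ["abortion", "gay_marriage"]
economic_welfare = ["affirmative_action", "black_position",
                    "doctors_hospitals", "jobs_living"]


def corr(a, b):
    """Look up a correlation regardless of the order of the two items."""
    return r[(a, b)] if (a, b) in r else r[(b, a)]


def intra_cluster(cluster):
    """Average correlation among all pairs of items inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(corr(a, b) for a, b in pairs) / len(pairs)


def inter_cluster(cluster_1, cluster_2):
    """Average correlation between items drawn from two different clusters."""
    pairs = [(a, b) for a in cluster_1 for b in cluster_2]
    return sum(corr(a, b) for a, b in pairs) / len(pairs)


print(round(intra_cluster(civil_liberty), 2))                    # 0.32
print(round(intra_cluster(economic_welfare), 2))                 # 0.55
print(round(inter_cluster(civil_liberty, economic_welfare), 2))  # 0.16
```

Convergence shows up as high intra-cluster averages, and discrimination as a low inter-cluster average.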

 

 

CONVERGENT-DISCRIMINANT VALIDITY TEST

Correlation Matrix of State Spending Preferences (1981-1999)

            Day Care  Envir     Health    Industry  Police    Poor      Prisons   Highways  E&S Educ. Tourism
Day Care    -
Envir       .19       -
Hlth.       .36       .17       -
Indus.      .06       .07       .10       -
Pol.        .08       .10       .08       .09       -
Poor        .39       .11       .39       .03       .05       -
Prison      .15       .06       .13       .07       .22       .12       -
High.       .15       .11       .12       .14       .16       .08       .12       -
E&S Educ.   .11       .13       .15       .08       .15       .13       .08       .09       -
Tour.       .07       .10       .02       .25       .15       0         .12       .13       .01       -
University  .14       .12       .15       .14       .10       .18       .07       .13       .33       .07

Note: data are based on the 1981-1999 Mississippi Poll, with some fictitious data included to simplify table interpretation.

Convergent-discriminant validity tests help to determine if your multiple indicators of one concept are actually measuring only one concept, or whether your indicators are measuring more than one concept (a multi-dimensional concept). Generate a correlation matrix as indicated above, and remember that the correlations range from 0 for no relationship to 1 for highest relationship. Then, pick out the highest correlations in order of their size. In the above table, the validity test shows that spending is a multi-dimensional concept involving four separate dimensions (concepts). Those dimensions are: social welfare (poor, day care, health), education (elementary-secondary and college), economic development (industry, tourism), and public order (police, prisons). The environment and highways indicators do not relate to any of these four, above the .2 correlation level. Hence, any researcher combining all eleven spending indicators into one scale that supposedly measures one concept of public support for government programs has validity problems, since there are four dimensions rather than one dimension of state spending.
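The "pick out the highest correlations in order of their size" step can be sketched as follows, using the 1981-1999 matrix and the .2 cutoff mentioned in the text. The variable and function names are illustrative.

```python
NAMES = ["Day Care", "Envir", "Health", "Industry", "Police", "Poor",
         "Prisons", "Highways", "E&S Educ.", "Tourism", "University"]

# Lower triangle of the 1981-1999 matrix: row i holds correlations with items 0..i-1.
LOWER = [
    [],
    [.19],
    [.36, .17],
    [.06, .07, .10],
    [.08, .10, .08, .09],
    [.39, .11, .39, .03, .05],
    [.15, .06, .13, .07, .22, .12],
    [.15, .11, .12, .14, .16, .08, .12],
    [.11, .13, .15, .08, .15, .13, .08, .09],
    [.07, .10, .02, .25, .15, 0, .12, .13, .01],
    [.14, .12, .15, .14, .10, .18, .07, .13, .33, .07],
]


def strong_pairs(threshold=0.2):
    """Item pairs correlated above the threshold, largest correlation first."""
    pairs = [(LOWER[i][j], NAMES[j], NAMES[i])
             for i in range(len(NAMES)) for j in range(i)
             if LOWER[i][j] > threshold]
    return sorted(pairs, reverse=True)


def unrelated_items(threshold=0.2):
    """Items that never correlate above the threshold with any other item."""
    linked = {name for _, a, b in strong_pairs(threshold) for name in (a, b)}
    return [name for name in NAMES if name not in linked]


for value, a, b in strong_pairs():
    print(f"{a} - {b}: {value:.2f}")
print(unrelated_items())  # ['Envir', 'Highways']
```

The six pairs printed group into the four dimensions named above, and the two leftover items are the ones the text says relate to none of them.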

SECOND CONVERGENT-DISCRIMINANT VALIDITY TEST

Updated Correlation Matrix of State Spending Preferences (2000, 2004)

            Envir     Health    Industry  Police    Poor      Highways  E&S Educ. Tourism
Envir       -
Health      .22       -
Industry    .12       .12       -
Police      .07       .12       .18       -
Poor        .23       .40       .08       .03       -
Highways    .11       .17       .17       .08       .14       -
E&S Educ.   .14       .28       .09       .07       .30       .12       -
Tourism     .07       .05       .36       .10       -.02      .12       .09       -
University  .12       .42       .13       .09       .26       .17       .31       .04

Note: data are real world data drawn from the 2000 and 2004 Mississippi Polls.

In this updated correlation matrix, note that only two dimensions emerge, and that three spending items are unrelated to both dimensions. The highest correlations are between health care and poverty spending (.40) and between health care and universities (.42). Elementary/secondary and universities are correlated at .31. The three other correlations between these four spending items range from .26 to .30 in value. These items of elementary-secondary, universities, health care, and poverty spending form one dimension. The second dimension is tourism and industry, which are correlated at .36. The three spending items that are uncorrelated with these two dimensions are police and highways, where the correlations with other spending items never exceed .18, and the environment (correlations never exceed .23). Unlike ten years ago, people appear to see the relevance of education for social welfare programs, in that people with a better education are less likely to need social welfare programs. Also note that we no longer ask two spending items- day care and prisons.

THIRD CONVERGENT-DISCRIMINANT VALIDITY TEST

Most Recent Correlation Matrix of State Spending Preferences (2006, 2008, 2010, 2012)

            Envir     Health    Industry  Police    Poor      Highways  E&S Educ. Tourism
Envir       -
Health      .27       -
Industry    .14       .20       -
Police      .14       .18       .16       -
Poor        .33       .46       .11       .18       -
Highways    .10       .13       .19       .24       .17       -
E&S Educ.   .26       .34       .12       .20       .36       .17       -
Tourism     .10       .08       .25       .17       .04       .18       .11       -
University  .20       .42       .21       .15       .24       .17       .41       .15

Note: data are real world data drawn from the 2006, 2008, 2010, and 2012 Mississippi Polls.

In this most recent example, one could argue that there are either two different dimensions, or three different dimensions. Two of the three highest correlations are for schools-universities, and for poor-health. Environment is more highly correlated with poor and health than with schools (elementary-secondary ed) or universities, so we can place it in the social welfare rather than education dimension. But health-universities is also highly correlated, and the correlations between schools/universities and the other three items (poor-environment-health) are also respectable, so one could argue that instead of two dimensions of social welfare and education, there is only one dimension of education-welfare. A separate third dimension is industry-highways-police-tourism, an economic development dimension, though these programs are also correlated with the education-welfare items. So do we have one, two, or three dimensions? Let us turn to factor analysis.

4) Factor Analysis- can be used as a validity test for testing whether a concept is multi-dimensional.

2004 health care example. The six relevant items were subjected to a Principal Components Factor Analysis with Varimax Rotation. Only 457 of the 523 respondents were analyzed, since others lacked responses on one or more of the six items. Thus, 13% of the respondents were excluded from this factor analysis. Only one factor emerged, explaining 51% of the variance in all six items. Each of the other factors explained less variance than a single item contributes, so they were dropped from the analysis. The factor loadings for each item ranged from a low of .66 for public education to encourage nutrition and exercise to a high of .78 for providing health care for adults who can't afford it.

The Component Matrix, and the Component 1 scores follow:

Extraction Method: Principal Component Analysis. 1 components extracted.

These results suggest that it is valid to combine these six health care importance indicators into one scale measuring one dimension. If we had included the third health care item on the importance of recruiting and retaining doctors in Mississippi, we would have still ended up with one dimension, but the loading of that item on the factor was only .47, clearly the lowest of the factor loadings. This suggests that that item does not measure the one dimension very well, so we excluded it from the scale.

2006-2012 state spending programs example.

Rotated Component Matrix

Spending Program             Component 1   Component 2
Environment                  .538          .128
Health Care                  .752          .108
Industrial Development       .098          .647
Police Forces                .232          .503
Poverty Programs             .730          .015
Streets and Highways         .157          .582
Elem.-Secondary Education    .687          .127
Attracting Tourism           -.017         .709
Higher Education             .597          .247

 

As you can see with this factor analysis, varimax rotation, there are clearly two different dimensions- the education-welfare dimension with 5 programs, and the economic development dimension with 4 programs. That would probably be the most defensible conclusion, though using the entire correlation matrix would permit you to make an argument for using one or three dimensions as well.
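One simple way to read the rotated matrix is to assign each program to the component on which it loads most heavily. This is a minimal sketch of that reading rule, using the loadings reported above; the dimension labels are the ones used in these notes and the function name is illustrative.

```python
# Rotated loadings from the 2006-2012 matrix: (Component 1, Component 2).
LOADINGS = {
    "Environment": (.538, .128),
    "Health Care": (.752, .108),
    "Industrial Development": (.098, .647),
    "Police Forces": (.232, .503),
    "Poverty Programs": (.730, .015),
    "Streets and Highways": (.157, .582),
    "Elem.-Secondary Education": (.687, .127),
    "Attracting Tourism": (-.017, .709),
    "Higher Education": (.597, .247),
}

LABELS = ("education-welfare", "economic development")


def assign_dimensions(loadings):
    """Place each program on the component with its largest absolute loading."""
    groups = {label: [] for label in LABELS}
    for program, (c1, c2) in loadings.items():
        label = LABELS[0] if abs(c1) >= abs(c2) else LABELS[1]
        groups[label].append(program)
    return groups


for label, programs in assign_dimensions(LOADINGS).items():
    print(label, len(programs), programs)
```

The grouping yields five education-welfare programs and four economic development programs, matching the count in the text.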

It is interesting to correlate the factor scores of these two dimensions with other theoretically relevant factors. The education-welfare factor score is correlated -.35 with ideology identification and -.44 with party identification, indicating that conservatives and Republicans are more likely to want to spend less on these programs than liberals and Democrats. The economic development factor score is uncorrelated with ideology identification, and has a slight positive (.05 correlation) relationship with party identification, indicating that Republicans are slightly more supportive of these programs than are Democrats.

TOPIC ELEVEN: DIMENSIONAL ANALYSES- FACTOR ANALYSIS

Discuss factor analysis as data reduction tool- it reduces number of variables into a smaller number of concepts.

TOPIC TWELVE: SURVEY RESEARCH--SAMPLING AND SURVEY TYPES

Historic Problems with Polls:

Sample Error Correlates:

TABLE OF SAMPLE ERROR

(Source of table: Survey Research Methods, by Earl R. Babbie, Wadsworth Publishing Co., 1973, page 376)

HOMOGENEITY OF POPULATION

SAMPLE SIZE 50/50 60/40 70/30 80/20 90/10
100 10 9.8 9.2 8 6
200 7.1 6.9 6.5 5.7 4.2
300 5.8 5.7 5.3 4.6 3.5
400 5 4.9 4.6 4 3
500 4.5 4.4 4.1 3.6 2.7
600 4.1 4 3.7 3.3 2.4
700 3.8 3.7 3.5 3 2.3
800 3.5 3.5 3.2 2.8 2.1
900 3.3 3.3 3.1 2.7 2
1000 3.2 3.1 2.9 2.5 1.9
1100 3 3 2.8 2.4 1.8
1200 2.9 2.8 2.6 2.3 1.7
1300 2.8 2.7 2.5 2.2 1.7
1400 2.7 2.6 2.4 2.1 1.6
1500 2.6 2.5 2.4 2.1 1.5
1600 2.5 2.4 2.3 2 1.5
1700 2.4 2.4 2.2 1.9 1.5
1800 2.4 2.3 2.2 1.9 1.4
1900 2.3 2.2 2.1 1.8 1.4
2000 2.2 2.2 2 1.8 1.3

Note: Cell entries are sample error figures.
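The entries in the table appear consistent with two standard errors of a proportion, expressed in percentage points (roughly a 95% confidence level). That formula is an inference from the table's values, not something stated in these notes, so treat the sketch below as an assumption; the function name is illustrative.

```python
from math import sqrt


def sample_error(sample_size, share):
    """Approximate sample error in percentage points.

    share is the larger side of the population split as a proportion,
    for example 0.5 for a 50/50 split or 0.9 for a 90/10 split.
    Assumes the error equals two standard errors of a proportion.
    """
    return 2 * sqrt(share * (1 - share) / sample_size) * 100


print(round(sample_error(100, 0.5), 1))   # 10.0
print(round(sample_error(400, 0.8), 1))   # 4.0
print(round(sample_error(1000, 0.5), 1))  # 3.2
print(round(sample_error(2000, 0.9), 1))  # 1.3
```

The formula shows the two correlates at work: error shrinks as the sample grows and as the population becomes more homogeneous.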

Types of Surveys: In-person; Telephone; Mail; Mixed Methods; briefly discuss each.

For further information, see Mail and Telephone Surveys, by Don Dillman, John Wiley and Sons Co, 1978.

PROS AND CONS OF SURVEY TYPES

In-person-- pros:
1) Observe and clear up R's confusion
2) Obtain objective information about R's (respondent) lifestyle
3) Visual Aids use
4) Establish rapport? High response rate?

In-person-- cons:
1) Expensive
2) Safety of interviewer
3) Interviewer fraud

Telephone-- pros:
1) Quick
2) Cost effective
3) Centralized interviewing- no fraud
4) Interviewer safety

Telephone-- cons:
1) Excludes those without telephones
2) No visual aids-- voice dependent

Mail-- pros:
1) Cheap
2) Use with specialized population

Mail-- cons:
1) Excludes illiterates
2) Can't control who answers survey
3) Can't control order of questions answered
4) Slow
5) Incomplete forms
6) Low response rate?

Probability Sampling. Definition of probability sample: each population unit has some chance of being in the sample, and that chance can be calculated. Types of probability samples:

Telephone Sampling Techniques:

Sampling within the household:
1) Kish method, ask household resident to list first names of all adults, then toss dice to select adult to interview;
2) Carter-Trodahl method: multiple selection tables asking number of adults and number of men in household;
3) Sociological last birthday method; problem that it oversamples women.

Demographic Groups Undersampled in Surveys, especially Telephone Surveys:

Weighting the Sample:

In the 2012 and 2014 Mississippi Polls, we included cell phones in our sampling frame, so underrepresenting the young was no longer as huge a problem as it had been in 2010. Check out how representative the three polls were, and how each was weighted to compensate for demographic groups underrepresented, by clicking on the following links:

TOPIC THIRTEEN: SURVEY RESEARCH-QUESTIONNAIRE CONSTRUCTION, IMPLEMENTATION

ACTUAL EXAMPLES:
(From Survey Research for Public Administration, by David H. Folz, Sage Publishers)

1) Perceptions of local problems- p. 5, 22, 107
A) No problem, Minor Problem, Major Problem
B) Most serious problem
C) Agree-disagree with problem statements

2) Quality of local services- p. 8
A) Excellent, good, fair, poor

3) Policy preferences- p. 5, 22
A) Single most important change
B) How improve quality of life- not important, somewhat important, very important
C) One policy- oppose or favor, strong or some.

4) Funding priorities- p. 5, 22
A) Single choice, reduce funding first
B) City spending- too little, about right, too much

5) Tax hike backing- p. 20
A) Specific increase for specific policy

6) Citizen usage satisfaction- p. 8
A) Filter question, did they use service?
B) Satisfied or dissatisfied, very or somewhat
C) How often policy met expectations

7) Business usage satisfaction- p. 6
A) Survey gov't workers about complaints heard
B) Survey businesses about specific problems, Overall satisfaction

8) Wording problems- p. 99
A) Loaded or leading
B) Double barreled
C) Too complex, double negative (Miss Poll)
D) Unbalanced alternatives (Blacks treated same as whites or worse)
E) Acquiescence bias (agreement bias)- especially on agree-disagree items
F) Sensitive items- use income categories
G) Social desirability- race items

Read Some 2008 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2008.

Read Some 2010 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2010.

Read Some 2012 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2012.

Read Some 2014 Mississippi Poll Results, by Stephen D. Shaffer, SSRC, MSU, 2014.

TOPIC FOURTEEN: REVIEW OF MATERIAL FOR FIRST EXAM

FIRST ESSAY EXAMINATION

TOPIC FIFTEEN: DESCRIPTIVE STATISTICS

DESCRIPTIVE STATISTICS are Univariate Statistics, dealing with one variable.

CENTRAL TENDENCY- typical case

DISPERSION- diversity, how divided or united the cases are, the form of the distribution (interval level)

Identify the mode and median categories in each of the following examples drawn from the 2008 Mississippi Poll:

Punishment favored in cases of first-degree murder:
Death penalty............... = 48%
Life without parole......... = 44%
A shorter jail term than life = 8%

How rate President Bush's job performance:
Excellent = 9%
Good .... = 24%
Fair .... = 33%
Poor .... = 34%

Ideological self-identification:
Very Liberal......... = 2%
Somewhat Liberal..... = 17%
Moderate............. = 27%
Somewhat Conservative = 34%
Very Conservative.... = 20%

Education Level:
High School Dropout. = 27%
High School Graduate = 29%
Some College........ = 28%
College Graduate.... = 13%
Some Graduate Work.. = 3%

Annual Family Income
Under $10,000 = 16%
$10-20,000... = 15%
$20-30,000... = 15%
$30-40,000... = 10%
$40-50,000... = 9%
$50-60,000... = 8%
$60-70,000... = 7%
Over $70,000. = 20%

Likelihood of Living in the Current Community in Five Years:
Definitely No. = 8%
Probably No... = 14%
Probably Yes.. = 32%
Definitely Yes = 46%

Population of the Community You Live In:
Farm or ranch = 10%
Rural area... = 25%
Under 2,500.. = 14%
2,500-10,000. = 22%
10,000-50,000 = 27%
Over 50,000.. = 2%
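A minimal sketch of how the mode and median categories can be located from a percentage distribution, applied to two of the examples above. The median rule used here (the first category at which the cumulative percentage reaches 50) only makes sense when the categories are listed in order; the function names are illustrative.

```python
bush_rating = [("Excellent", 9), ("Good", 24), ("Fair", 33), ("Poor", 34)]
ideology = [
    ("Very Liberal", 2), ("Somewhat Liberal", 17), ("Moderate", 27),
    ("Somewhat Conservative", 34), ("Very Conservative", 20),
]


def mode_category(distribution):
    """Category holding the largest percentage of cases."""
    return max(distribution, key=lambda pair: pair[1])[0]


def median_category(distribution):
    """First category, in listed order, where the cumulative percent reaches 50."""
    cumulative = 0
    for category, percent in distribution:
        cumulative += percent
        if cumulative >= 50:
            return category
    return None


print(mode_category(bush_rating), median_category(bush_rating))  # Poor Fair
print(mode_category(ideology), median_category(ideology))
```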

MEANS- What follows is a verbal interpretation of means, using the ideological self-identification and ideological perception questions.

Question wording: "What about your political beliefs? Do you consider yourself: very liberal, somewhat liberal, moderate or middle of the road, somewhat conservative, or very conservative?" Question wordings: "Please label the following political figures as very liberal, somewhat liberal, moderate (or middle of the road), somewhat conservative, or very conservative." "Democratic Presidential hopeful Hillary Clinton." "Democratic Presidential hopeful Barack Obama." "Republican Presidential hopeful John McCain." Ideological perception questions were not asked for the U.S. senate candidates. However, such questions were asked in previous years' polls for Musgrove, for when he was lieutenant governor (1998) and governor (2000, 2002 polls). For comparison purposes, we also include the perceptions of previous Democratic presidential candidates, asked in previous Mississippi polls.

The values below are "means" or averages for the ideological variables, all of which are coded as 1 for very liberal, 2 for somewhat liberal, 3 for moderate, 4 for somewhat conservative, and 5 for very conservative.

RANGE is distance between extreme categories. It requires an interval level measurement. Thus, merely subtract the lowest number representing the category at one end of the indicator from the highest number representing the category at the other end of the indicator. Examples follow:

A test of your knowledge of VARIANCE. Remember the example of Mississippi's party organization members. The mean for Democrats was 2.69, which was between somewhat liberal and moderate, but closer to moderate. The mean for Republicans was 4.45, which was between somewhat conservative and very conservative, but closer to somewhat conservative. However, remember the form of the distribution. Nearly 10% of Democrats were very conservative, and almost 20% were somewhat conservative, so there was considerable diversity or dispersion of ideologies in the Democratic party. Therefore, the variance of Democrats' ideology scores was a relatively higher number, a variance of 1.351. For Republicans on the other hand, less than 2% of them were very liberal or somewhat liberal. So there was much unity and clustering of ideological scores for the Republicans, and little diversity or dispersion of scores. Therefore, the variance of Republicans' ideology scores was a relatively low number, a variance of .493. Therefore, Democrats were more divided in ideology (a higher variance), and Republicans were more united on ideology (a lower variance).
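To see why a spread-out distribution produces a higher variance, here is a minimal sketch using two hypothetical percentage distributions on the 1 to 5 ideology scale. The percentages are invented for illustration and are not the party organization data; the sketch also assumes the population form of variance, which the notes do not specify.

```python
def mean_and_variance(distribution):
    """Mean and population variance from {score: percent of cases}."""
    total = sum(distribution.values())
    mean = sum(score * pct for score, pct in distribution.items()) / total
    variance = sum(pct * (score - mean) ** 2
                   for score, pct in distribution.items()) / total
    return mean, variance


# Hypothetical distributions: 1 = very liberal ... 5 = very conservative.
divided_group = {1: 15, 2: 30, 3: 30, 4: 17, 5: 8}   # spread across the scale
united_group = {1: 0, 2: 1, 3: 9, 4: 35, 5: 55}      # clustered at one end

print(mean_and_variance(divided_group))
print(mean_and_variance(united_group))
```

The group whose scores are spread across the whole scale gets the larger variance, and the clustered group gets the smaller one.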

Examples of variance follow:

4A. (5 points) The following two questions are based on recent Mississippi Polls, all conducted in the 21st century. Using the statistic of variance, are Democrats or Republicans most divided on each of the following five variables:

4B. (5 points) Using the statistic of variance, are whites or blacks most united on each of the following five variables:

TOPIC SIXTEEN: CONTINGENCY TABLES

Contingency tables can be used with nominal level measures, though we usually employ ordinal or interval level data having a limited number of categories. Contingency tables permit you to view the data in an easily interpretable and understood manner.

Percentage Difference is a measure of strength of the relationship. It ranges from a low of 0 to a high of 100. Always put the independent variable at the top of the table, and the dependent variable at the side. Then, calculate the column percentages. For ordinal and interval level indicators, compare the column percents (for the two extreme categories of the predictor) across the same category of your dependent variable. Make this comparison for the two extreme categories of your dependent variable, and take the average. If one of these comparisons is contrary to your hypothesis, make the difference a negative.
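A minimal sketch of the percentage difference calculation, under one reading of the steps above. It uses the column percentages from the 2006-2010 ideology table on health care spending (Table 3 later in this topic) and the hypothesis that liberals favor more spending than conservatives; the function name and argument layout are illustrative.

```python
# Column percentages for the two extreme categories of the predictor.
liberal = {"Less": 3, "Same": 15, "More": 82}
conservative = {"Less": 12, "Same": 31, "More": 57}


def percentage_difference(high_group, low_group, high_row, low_row):
    """Average of the two extreme-row comparisons.

    high_group is the predictor category hypothesized to score higher on
    high_row (and therefore lower on low_row). A comparison that runs
    against the hypothesis comes out negative.
    """
    diff_high = high_group[high_row] - low_group[high_row]
    diff_low = low_group[low_row] - high_group[low_row]
    return (diff_high + diff_low) / 2


print(percentage_difference(liberal, conservative, "More", "Less"))  # 17.0
```

Here the "More" comparison is 82 - 57 = 25, the "Less" comparison is 12 - 3 = 9, and their average is 17.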

Other Measures of Association to use (Source: Research Methods in Political Science: An Introduction Using MicroCase, 2nd edition, by Michael Corbett; p. 139-144; copyrighted by MicroCase Corporation):

All measures range from 0 for no relationship to 1 for perfect relationship. A positive or negative sign is a function of the direction of the coding of the variables and whether your hypothesis is upheld.

The following are nine examples of bivariate tables. In class, we will review three features of each table.
1) Is the relationship statistically significant? Is Chi-squared significant at the .05 level or below?
2) What is the magnitude of the relationship? That is, what is the gamma value? To determine the relative importance of the predictors (which predictor is most and least important), use the absolute value of the gamma, and ignore the sign.
3) What is the direction of the relationship? That is, devise a hypothesis for each table that reflects how the two variables are related. Example for table 1: People younger in age are more likely to favor spending more on health care, compared to people older in age.
Note: The tables in your research paper should look like these tables in format.
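For readers who want to see where a gamma value comes from, here is a minimal sketch of the usual formula, gamma = (concordant pairs - discordant pairs) / (concordant pairs + discordant pairs). It is applied to the race and sex tables on computer access that follow (Tables 7 and 8), using the column percentages in place of raw counts. That shortcut is safe only because those tables have two columns, where rescaling a column leaves gamma unchanged; with three or more columns, use the raw cell counts. The function name is illustrative.

```python
def gamma(table):
    """Goodman and Kruskal's gamma for a table given as a list of rows."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):       # only cells in lower rows
                for m in range(cols):
                    if m > j:                  # below and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                # below and to the left
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


# Rows: Yes, No. Columns: the two categories of the predictor.
pc_access_by_race = [[74, 69], [26, 31]]   # White, African-American
pc_access_by_sex = [[74, 70], [26, 30]]    # Men, Women

print(round(gamma(pc_access_by_race), 2))  # 0.12
print(round(gamma(pc_access_by_sex), 2))   # 0.1
```

Reversing the order of the rows or the columns flips the sign but not the size of gamma, which is why the notes say to compare predictors on the absolute value.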

Table 1

Age Differences in State Spending Preferences for Health Care

                        AGE
STATE SPENDING
DESIRED:        18-35     36-55     56 and Over
Less            10%       7%        8%
Same            18%       18%       34%
More            72%       75%       58%
N Size          (555)     (571)     (524)

Gamma = -.16
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 2

Income Differences in State Spending Preferences for Health Care

                        FAMILY INCOME
STATE SPENDING
DESIRED:        < $20,000   $20-40,000   $40-60,000   > $60,000
Less            10%         4%           7%           10%
Same            13%         17%          30%          36%
More            77%         79%          63%          54%
N Size          (365)       (363)        (222)        (333)

Gamma = -.28
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 3

Ideological Differences in State Spending Preferences for Health Care

                        SELF-IDENTIFIED IDEOLOGY
STATE SPENDING
DESIRED:        Liberal   Moderate   Conservative
Less            3%        6%         12%
Same            15%       17%        31%
More            82%       77%        57%
N Size          (262)     (495)      (808)

Gamma = -.41
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 4

Race Differences in State Spending Preferences for Health Care

                        RACE
STATE SPENDING
DESIRED:        White     African-American
Less            10%       3%
Same            31%       10%
More            59%       87%
N Size          (1050)    (555)

Gamma = .63
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 5

Sex Differences in State Spending Preferences for Health Care

                        SEX
STATE SPENDING
DESIRED:        Men       Women
Less            12%       5%
Same            27%       20%
More            61%       75%
N Size          (772)     (889)

Gamma = .33
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 6

Income Differences in Having Access to a Personal Computer

                        FAMILY INCOME
HAVE ACCESS
TO A PC?        < $20,000   $20-40,000   $40-60,000   > $60,000
Yes             54%         67%          85%          94%
No              46%         33%          15%          6%
N Size          (370)       (368)        (232)        (341)

Gamma = -.59
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 7

Race Differences in Having Access to a Personal Computer

                        RACE
HAVE ACCESS
TO A PC?        White     African-American
Yes             74%       69%
No              26%       31%
N Size          (1084)    (560)

Gamma = .12
Chi-squared significance < .05
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 8

Sex Differences in Having Access to a Personal Computer

                        SEX
HAVE ACCESS
TO A PC?        Men       Women
Yes             74%       70%
No              26%       30%
N Size          (790)     (910)

Gamma = .10
Chi-squared significance < .06; Not Significant at .05 level.
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

Table 9

Age Differences in Having Access to a Personal Computer

                        AGE
HAVE ACCESS
TO A PC?        18-35     36-55     56 and Over
Yes             82%       79%       55%
No              18%       21%       45%
N Size          (564)     (585)     (538)

Gamma = .41
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 2006, 2008, 2010 Mississippi Poll.

The following nine examples of bivariate tables are from earlier years of the Mississippi Poll. How have demographic differences in attitudes towards health care, and in access to personal computers, changed over the years? Explain.

Table 1

Age Differences in State Spending Preferences for Health Care

                                      AGE
STATE SPENDING DESIRED:    18-35      36-55      56 and Over
More                       78%        74%        69%
Same                       18%        20%        26%
Less                       4%         6%         5%
N Size                     (630)      (720)      (470)

Gamma = .131
Chi-squared significance < .01
Note: Cell entries total 100% down each column.
Source: 1998, 1999, 2000 Mississippi Poll.

Table 2

Income Differences in State Spending Preferences for Health Care

                                           FAMILY INCOME
STATE SPENDING DESIRED:    < $20,000    $20-40,000    $40-60,000    > $60,000
More                       84%          78%           65%           53%
Same                       13%          17%           30%           40%
Less                       3%           5%            5%            7%
N Size                     (418)        (499)         (280)         (222)

Gamma = .371
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 1999, 2000 Mississippi Poll.

Table 3

Ideological Differences in State Spending Preferences for Health Care

                                  SELF-IDENTIFIED IDEOLOGY
STATE SPENDING DESIRED:    Liberal    Moderate    Conservative
More                       79%        79%         67%
Same                       15%        18%         27%
Less                       6%         3%          6%
N Size                     (630)      (720)       (470)

Gamma = .223
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 1999, 2000 Mississippi Poll.

Table 4

Race Differences in State Spending Preferences for Health Care

                                 RACE
STATE SPENDING DESIRED:    White      African-American
More                       67%        89%
Same                       27%        9%
Less                       6%         2%
N Size                     (1206)     (570)

Gamma = -.574
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 1999, 2000 Mississippi Poll.

Table 5

Sex Differences in State Spending Preferences for Health Care

                                 SEX
STATE SPENDING DESIRED:    Men        Women
More                       70%        77%
Same                       25%        18%
Less                       5%         5%
N Size                     (825)      (995)

Gamma = -.174
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 1999, 2000 Mississippi Poll.

Table 6

Income Differences in Having Access to a Personal Computer

                                        FAMILY INCOME
HAVE ACCESS TO A PC?    < $20,000    $20-40,000    $40-60,000    > $60,000
Yes                     28%          61%           76%           87%
No                      72%          39%           24%           13%
N Size                  (280)        (334)         (187)         (163)

Gamma = -.634
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 2000 Mississippi Poll.

Table 7

Race Differences in Having Access to a Personal Computer

                                 RACE
HAVE ACCESS TO A PC?    White      African-American
Yes                     66%        45%
No                      34%        55%
N Size                  (806)      (376)

Gamma = .412
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 2000 Mississippi Poll.

Table 8

Sex Differences in Having Access to a Personal Computer

                                 SEX
HAVE ACCESS TO A PC?    Men        Women
Yes                     64%        55%
No                      36%        45%
N Size                  (547)      (668)

Gamma = .189
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 2000 Mississippi Poll.

Table 9

Age Differences in Having Access to a Personal Computer

                                      AGE
HAVE ACCESS TO A PC?    18-35      36-55      56 and Over
Yes                     72%        62%        38%
No                      28%        38%        62%
N Size                  (409)      (482)      (321)

Gamma = .410
Chi-squared significance < .001
Note: Cell entries total 100% down each column.
Source: 1998, 2000 Mississippi Poll.

Multivariate crosstabulations:
Multivariate analysis involves one dependent variable and more than one independent variable (predictor).

Controlling- multivariate tables permit you to examine the relationship between a predictor and a dependent variable, after taking into account the impact of a second predictor.

For example, African-Americans tend to have a lower turnout than whites. A possible control variable is socioeconomic status (SES). Perhaps African-Americans have a lower average turnout than whites because of the lower average socioeconomic status of African-Americans, and we know that people of all races having a lower SES tend to have lower turnout compared to people of all races having a higher SES. To determine whether a lower SES level explains why African-Americans tend to have lower turnouts than whites, we examine the relationship between race and turnout, controlling for SES. Do whites and blacks of the same SES level have the same turnout level? If so, SES is more important than race in shaping turnout.

RACE ..............> SES ..............> TURNOUT
RACE ........................................> TURNOUT

Three types of variables that one would control for:
1) Outside variables- a variable that has an effect on one of your predictors and on your dependent variable. Here, race is an outside variable. You would control for it to determine if SES has a direct, causal effect on turnout, or whether the SES-turnout relationship is spurious. If spurious, then race directly affects or causes both SES and turnout, but SES does not have a direct causal effect on turnout.
2) Intervening variable- a variable that is located between a predictor and a dependent variable, and that explains why the "early" predictor is related to the dependent variable. SES is an intervening variable here, as it explains why race is related to turnout.
3) Specifying or Conditional variables- a predictor that changes the relationship between another predictor and the dependent variable. That is, the relationship has a different direction or magnitude for different categories of the specifying variable. If a race gap in turnout exists only among college grads in Mississippi but not among other educational groups, then education is the specifying variable.

Examples of Multivariate Tables

MODEL TESTED FOR ALL THREE SCENARIOS

RACE....................................> SES ...................................................> PARTICIPATION

RACE .................................................................................................> PARTICIPATION



SCENARIO 1:

BIVARIATE (includes low, medium, and high SES groups):
White Race Black Race
Low Participation 40% 60%
High Participation 60% 40%
Column % Totalled 100% 100%

MULTIVARIATE (Low SES group only):
White Race Black Race
Low Participation 70% 70%
High Participation 30% 30%
Column % Totalled 100% 100%

MULTIVARIATE (Medium SES group only):
White Race Black Race
Low Participation 50% 50%
High Participation 50% 50%
Column % Totalled 100% 100%

MULTIVARIATE (High SES group only):
White Race Black Race
Low Participation 20% 20%
High Participation 80% 80%
Column % Totalled 100% 100%

RACE ...................................> SES .....................................> PARTICIPATION



SCENARIO 2:

BIVARIATE (includes low, medium, and high SES groups):
White Race Black Race
Low Participation 40% 60%
High Participation 60% 40%
Column % Totalled 100% 100%

MULTIVARIATE (Low SES group only):
White Race Black Race
Low Participation 40% 60%
High Participation 60% 40%
Column % Totalled 100% 100%

MULTIVARIATE (Medium SES group only):
White Race Black Race
Low Participation 40% 60%
High Participation 60% 40%
Column % Totalled 100% 100%

MULTIVARIATE (High SES group only):
White Race Black Race
Low Participation 40% 60%
High Participation 60% 40%
Column % Totalled 100% 100%

RACE............................................................> SES

RACE............................................................> PARTICIPATION



SCENARIO 3:

BIVARIATE (includes low, medium, and high SES groups):
White Race Black Race
Low Participation 40% 70%
High Participation 60% 30%
Column % Totalled 100% 100%

MULTIVARIATE (Low SES group only):
White Race Black Race
Low Participation 70% 80%
High Participation 30% 20%
Column % Totalled 100% 100%

MULTIVARIATE (Medium SES group only):
White Race Black Race
Low Participation 50% 60%
High Participation 50% 40%
Column % Totalled 100% 100%

MULTIVARIATE (High SES group only):
White Race Black Race
Low Participation 30% 40%
High Participation 70% 60%
Column % Totalled 100% 100%



................................................................(40% multivariate)

RACE .........................................> SES .................................> PARTICIPATION

RACE .....................................................................................> PARTICIPATION

.......................(30% bivariate; 10% multivariate)



MODEL OF GENDER AND SENIORITY AFFECTING JOB SECURITY

GENDER ....................................> SENIORITY ..............................> JOB

GENDER ...........................................................................................> SECURITY



BIVARIATE: Gender .......> Job Security
Men Women
Fired 22% (55) 50% (75)
Kept Job 78% (195) 50% (75)
100% (250) 100% (150)

BIVARIATE: Gender .......> Seniority
Men Women
Low Seniority 20% (50) 67% (100)
High Seniority 80% (200) 33% (50)
100% (250) 100% (150)

BIVARIATE: Seniority .......> Job Security
Low Seniority High Seniority
Fired 70% (105) 10% (25)
Kept Job 30% (45) 90% (225)
100% (150) 100% (250)

MULTIVARIATE (Low Seniority Group Only):
Men Women
Fired 70% (35) 70% (70)
Kept Job 30% (15) 30% (30)
100% (50) 100% (100)

MULTIVARIATE (High Seniority Group Only):
Men Women
Fired 10% (20) 10% (5)
Kept Job 90% (180) 90% (45)
100% (200) 100% (50)



GENDER ................... > SENIORITY ........................> JOB SECURITY
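
The gender and seniority tables above can be checked with a few lines of Python. This is a minimal sketch using the cell counts printed in the two multivariate tables. The dictionary layout and function names are hypothetical choices made for illustration.

```python
# Cell counts from the two multivariate tables above:
# (gender, seniority) -> (number fired, number who kept their job)
counts = {
    ("Men", "Low"): (35, 15),
    ("Men", "High"): (20, 180),
    ("Women", "Low"): (70, 30),
    ("Women", "High"): (5, 45),
}


def percent_fired(cells):
    """Percentage fired across the given (fired, kept) cells."""
    fired = sum(f for f, _ in cells)
    total = sum(f + k for f, k in cells)
    return 100.0 * fired / total


def fired_by_gender(seniority=None):
    """Percent fired for each gender, optionally within one seniority level."""
    result = {}
    for gender in ("Men", "Women"):
        cells = [value for (g, s), value in counts.items()
                 if g == gender and (seniority is None or s == seniority)]
        result[gender] = percent_fired(cells)
    return result


bivariate = fired_by_gender()        # ignores seniority
low = fired_by_gender("Low")         # controls for seniority: low group
high = fired_by_gender("High")       # controls for seniority: high group
print(bivariate, low, high)
```

The gender gap that appears when seniority is ignored disappears within each seniority group, which is the pattern expected when seniority is an intervening variable between gender and job security.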



MODEL OF PARTY ID AND ATTITUDE TOWARD NIXON PARDON AFFECTING VOTE

PARTY ID ....................................> PARDON .... ..............................> 1976 PRESIDENTIAL

PARTY ID ...........................................................................................> VOTE



BIVARIATE: Party Id .......> Presidential Vote
Democratic Party Id Republican Party Id
Carter (Dem) Vote 80% (400) 10% (30)
Ford (Rep) Vote 20% (100) 90% (270)
100% (500) 100% (300)

BIVARIATE: Party Id .......> Attitude toward Ford Pardon of Nixon
Democratic Party Id Republican Party Id
For Pardon 10% (50) 83% (250)
Against Pardon 90% (450) 17% (50)
100% (500) 100% (300)

BIVARIATE: Attitude to Pardon .......> Presidential Vote
For Pardon Against Pardon
Carter (Dem) Vote 22% (65) 73% (365)
Ford (Rep) Vote 78% (235) 27% (135)
100% (300) 100% (500)

MULTIVARIATE (Among Democrats Only)
For Pardon Against Pardon
Carter (Dem) Vote 80% (40) 80% (360)
Ford (Rep) Vote 20% (10) 20% (90)
100% (50) 100% (450)

MULTIVARIATE (Among Republicans Only)
For Pardon Against Pardon
Carter (Dem) Vote 10% (25) 10% (5)
Ford (Rep) Vote 90% (225) 90% (45)
100% (250) 100% (50)

PARTY.......................................> Attitude to Pardon

IDENT........................................>Presidential Vote

TOPIC SEVENTEEN: STATISTICAL INFERENCE

Statistical inference is our ability to generalize a relationship found in a sample to the entire population from which that sample was drawn. That is, can we infer population characteristics from sample data? If our statistical inference test suggests that in the population the relationship between the two variables is nonrandom, the relationship is said to be statistically significant.

For example, our 1996 Mississippi Poll sampled only 601 adult Mississippians from a population of over two million. We found a definite relationship in the sample between gender and seat belt use. 60% of women said they "always" used their seat belts, compared to only 42% of men. 9% of men said they "never" used their seat belts, compared to only 4% of women. The magnitude of this relationship between gender and seat belt use was 12%: [(60-42) + (9-4)] / 2. But can we generalize this relationship found in the sample to the entire population? Is there a relationship between gender and seat belt use in the entire population? Statistical inference is the procedure we use to determine if any relationship exists in the entire population.
In this example, the Pearson chi-squared is 22.9 with 3 df, which is significant at the .001 level. If no relationship existed in the population, there would be less than 1 chance in a thousand of drawing a sample showing a relationship this strong.

Two tests of statistical inference:

1) Chi-squared is for nominal level variables. Hence, it does not provide information about the direction of the relationship, it simply indicates that a relationship exists in the population. Since the value of chi-squared tends to increase as sample size increases, it does not measure the strength of the association between variables.

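Chi-squared: $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$, where $f_o$ is the observed frequency and $f_e$ is the expected frequency of each cell, and the sum is taken over all cells of the table.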
For the expected frequency for each cell, multiply the column total and the row total for that cell, and divide by the table total.
Degrees of freedom equal the number of columns minus 1 multiplied by the number of rows minus 1.
Consult a Chi-squared chart.
On the SPSS output, use the Pearson chi-squared, which is the most widely used form.

Warning: chi-squared should not be used if any cell has an expected value less than 1, or if more than 20% of the cells have expected values less than 5.

Example from Berman, Evan M., Public Administration Review, March/April 1997, Vol. 57 Issue 2, pages 105-113, "Dealing with Cynical Citizens" article, table 3, where he examines whether there is a link between the number of strategies that cities use to keep people informed about local government's actions and how much trust they have in city government.

-- OBSERVED FREQUENCIES--

                                    NUMBER OF STRATEGIES
                        Few Strategies    Some or Many Strategies    Row N Sizes
Low Trust               37 (50.7%)        65 (28.1%)                 102
Medium or High Trust    36 (49.3%)        166 (71.9%)                202
Column N                73                231                        304


-- EXPECTED FREQUENCIES--

                                    NUMBER OF STRATEGIES
                        Few Strategies    Some or Many Strategies    Row N Sizes
Low Trust               24.5              77.5                       102
Medium or High Trust    48.5              153.5                      202
Column N                73                231                        304

The chi-squared computation for each cell is:
(37-24.5)^2/24.5 = 156.25/24.5 = 6.4
(65-77.5)^2/77.5 = 156.25/77.5 = 2.0
(36-48.5)^2/48.5 = 156.25/48.5 = 3.2
(166-153.5)^2/153.5 = 156.25/153.5 = 1.0
Sum these four cell results: 6.4 + 2.0 + 3.2 + 1.0 = 12.6

Chi-squared value is 12.6 with 1 degree of freedom. (2-1) * (2-1) = 1 df.
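
The hand calculation above can be reproduced in Python. This is a minimal sketch in plain Python rather than SPSS. The function name `chi_squared` is hypothetical, and the critical value of 3.841 (.05 level, 1 df) is taken from a standard chi-squared chart. Because the code does not round the expected frequencies, its result may differ slightly from the hand-rounded figure.

```python
def chi_squared(observed):
    """Return (chi-squared, degrees of freedom, expected frequencies).

    `observed` is a list of rows of observed cell counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected frequency = (row total * column total) / table total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Observed frequencies from the Berman (1997) table above.
observed = [[37, 65],     # low trust
            [36, 166]]    # medium or high trust
chi2, df, expected = chi_squared(observed)

CRITICAL_05_DF1 = 3.841   # chi-squared chart value for .05 level, 1 df
significant = chi2 > CRITICAL_05_DF1
print(round(chi2, 1), df, significant)
```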

2) The t-test is an interval statistic (dependent variable must be interval). It tests the hypothesis that two groups have different means, and that the inter-group difference can be generalized to the population.

Two-sample t-test (SPSS-independent sample) means that each group is considered a sample.

A one-tailed t-test means that your hypothesis specifies a direction for the relationship. A two-tailed t-test is used to test nondirectional hypotheses. A two-tailed test is stricter, and SPSS reports only the two-tailed test; hence, if your results are significant for the 2-tailed test and the difference is in the hypothesized direction, they will also be significant for the 1-tailed test.

Two statistics are reported-- for two populations having equal variances, or unequal variances.

The t-value equals the difference between the two group means divided by the standard error of that difference.
Degrees of freedom equal the sum of the two sample sizes minus two (for the equal variances test).

t-value must be larger than table entry to be significant at the specified level.
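
As an illustration of the equal variances version of the two-sample t-test, the sketch below computes the t-value and degrees of freedom in plain Python. The two groups are made-up numbers, not Mississippi Poll data, and the function name `pooled_t` is hypothetical. The critical value 2.306 is the standard t table entry for 8 df at the .05 level, two-tailed.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Two-sample t-value and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_error, n_a + n_b - 2


# Hypothetical interval-level scores for two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t_value, df = pooled_t(group_a, group_b)

CRITICAL_05_DF8 = 2.306   # t table entry, 8 df, .05 level, two-tailed
significant = abs(t_value) > CRITICAL_05_DF8
print(t_value, df, significant)
```

The sign of the t-value only reflects which group was listed first; significance is judged on its absolute value.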

Using the SPSS program: use the Compare Means- Independent Samples Statistics Menu. Your Test Variable is your dependent variable, which should be interval level. Your Grouping Variable should be a dichotomous independent variable (recode it, when necessary). Use Levene's test to choose between the two rows of output: if its significance is greater than .05, use the equal variances row; if it is .05 or less, the variances differ, so use the unequal variances row. Cite the t-value and 2-tail sig. in papers. The significance level must be <= .05.

Example of a t-test problem (drawn from 2004-2008 Mississippi Poll data).

Examining predictors of family income. Family income is an interval-level variable, coded from a low of 1 for under $10,000 to a high of 8 for over $70,000. The following indicates what the average income codes are for pairs of categories of each predictor, as well as what the t-test significance level is. Answer the following two questions: For each predictor, which group has the higher family income? Is the t-test statistically significant for each of the following five predictors (remember, it must be significant at least at the .05 level)?

TOPIC EIGHTEEN: REGRESSION- BIVARIATE AND MULTIPLE REGRESSION

BIVARIATE REGRESSION

This technique finds the best fitting straight line through a set of points. Best fitting is defined by minimizing the sum of squared distances between the points and the regression line.

The equation of the line is Y = a + (b * X), where Y is the dependent variable, X is the independent variable, a is the Y intercept, and b is the slope of the line, or (change in Y)/(change in X).

R-squared is explained variance, the variance in Y explained by the independent variable's regression line.

R-squared = (Total Variation - Unexplained Variation) / Total Variation

Total Variation = sum of squared distances between the mean of Y and each case's Y value

Unexplained Variation (Residual) = sum of squared distances between each case's Y value and each case's predicted Y value (from the regression equation)

Explained Variation = sum of squared distances between each case's predicted Y value and the mean of Y.

b = unstandardized regression coefficient = slope = (change in Y) / (change in X)

Beta = standardized regression coefficient = b * (sdx/sdy), where sd means standard deviation. It adjusts for the differing ranges and scales of the variables.

Beta ranges from -1 to +1 with 0 being no relationship between the independent and dependent variables. The sign depends on the direction of the coding of your variables. A +1 or -1 is a perfect relationship. b values have a greater range which is not confined to 1 or -1.

Pearson R is the correlation coefficient. It equals the Beta in the bivariate case only. See p. 462 of text (p. 466 for 3rd edition) for formula used in calculating R.

R-squared is the explained variation. It is the predictive ability of your independent variable.

Adjusted R-squared shrinks the value of R-squared by penalizing for each additional independent variable, and is statistically preferable to R-squared. See p. 439 of text (p. 440 of 3rd edition).

The F statistic tests the statistical significance of the regression equation as a whole; its significance level must be below .05. See p. 442 of text (p. 443 of 3rd edition).

Problem of outlying or deviant cases. See faculty example.
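
The quantities defined above (a, b, R-squared, Beta, and Pearson R) can be computed directly from their definitions. The Python sketch below is an illustration using five made-up data points, not the class data. The function name `bivariate_regression` and the returned dictionary keys are hypothetical.

```python
import math


def bivariate_regression(x, y):
    """Least-squares line Y = a + b*X, plus R-squared and Beta."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    total_variation = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    b = sxy / sxx                 # slope (unstandardized coefficient)
    a = mean_y - b * mean_x       # Y intercept
    unexplained = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = (total_variation - unexplained) / total_variation
    # Beta = b * (sdx / sdy); the (n - 1) terms in the two sd's cancel.
    beta = b * math.sqrt(sxx / total_variation)
    return {"a": a, "b": b, "r_squared": r_squared, "beta": beta}


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = bivariate_regression(x, y)
predicted_at_6 = result["a"] + result["b"] * 6   # predicted Y when X = 6
print(result, predicted_at_6)
```

In the bivariate case Beta equals Pearson R, so squaring the returned Beta gives back R-squared.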

Example of calculating a bivariate regression problem.

You are asked to examine the relationship between years of service since receiving a PhD degree, and nine-month salaries of ten history professors. You need to plot the following points on graph paper, and then calculate the b value (unstandardized regression coefficient value or slope) and the y-intercept. Also calculate what salary would have to be given to a senior professor with 30 years of service since their PhD who is hired from another university, as well as what the starting salary would be (for someone with zero years of service who just got their PhD):

MULTIPLE REGRESSION

Multiple Regression is linear regression applied to more than one independent variable. With two independent variables, the predicted values comprise a plane (instead of a line in the one independent variable case).

Equation:
$Y = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4$

b value is the unstandardized regression coefficient, controlling for the effects of all other predictors. It is used to predict the value of the dependent variable from the known values of the independent variables.

b value is also used in making comparisons across subsamples. For example, it can show whether an independent variable is more important in affecting the dependent variable among men or among women.

Beta is the standardized regression coefficient, controlling for the effects of all other predictors. It tells the relative importance of the independent variables in influencing the dependent variable. Its absolute value normally ranges from 0 to 1, with values near 1 being most important and values near 0 being least important. The sign reflects the direction of the relationship, which depends on the direction of the variables' coding.

Multiple R is the correlation between the actual Y value and the predicted Y value from the multiple regression equation.

R-squared is the variance in the dependent variable explained by all of the independent variables.

Class lecture on regression assumptions and problems; class examples
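
Two of the ideas above, predicting Y from the regression equation and adjusting R-squared for the number of predictors, can be sketched in Python. The coefficients and sample sizes below are hypothetical, chosen only to show the arithmetic, and the function names are illustrative. The adjustment uses the standard formula 1 - (1 - R-squared)(n - 1)/(n - k - 1), where n is the number of cases and k the number of independent variables.

```python
def predict(intercept, coefficients, values):
    """Predicted Y = a + b1*x1 + b2*x2 + ... for one case."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Shrink R-squared to penalize each additional independent variable."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)


# Hypothetical equation: Y = 1.0 + 0.5*x1 - 0.2*x2
y_hat = predict(1.0, [0.5, -0.2], [4, 3])

# Hypothetical model: R-squared of .50 with 101 cases and 4 predictors.
adj = adjusted_r_squared(0.50, 101, 4)
print(y_hat, round(adj, 3))
```

Adjusted R-squared is always at or below R-squared, and the gap widens as predictors are added relative to the number of cases.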

Example of a multiple regression equation problem (taken from the 2004-2008 Mississippi Polls).

Predicting who believes they have been racially profiled. This dependent variable is coded 1 for not profiled, and 2 for being profiled. The independent variables and their coding follow:

The Betas or standardized regression coefficients for these predictors follow:

The significance levels for each of these regression coefficients follow:

The adjusted R-squared for this regression equation is 12%.

Using the above information, answer the following questions:

CAUSAL MODELING

Multiple regression provides only the direct effects that independent variables exert on dependent variables. Yet outside variables may also affect the dependent variable by affecting an intervening variable in the model. Hence, an outside variable may exert an indirect effect on the dependent variable.

Total effects of an independent variable are equal to the sum of the direct effect of that variable and all of its indirect effects. Each indirect effect is the product of the effect that that outside variable has on an intervening variable, and the effect that the intervening variable has on the dependent variable.

Causal Modeling procedures.
1) Devise a model that shows temporal-causal ordering of the variables
2) Use multiple regression SPSS program and regress each dependent variable in the model on all of the independent variables that are "earlier" than it is
3) Draw arrows for all statistically significant linkages. Put Betas just above each line.
4) Indirect effects involve multiplying the relevant Betas together
5) Total effect = direct effect + indirect effects
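
Steps 4 and 5 amount to simple arithmetic on the Betas. The Python sketch below uses made-up Betas for a three-variable model (RACE, SES, PARTICIPATION); the values are hypothetical and only illustrate how direct, indirect, and total effects are combined.

```python
# Hypothetical Betas for each statistically significant arrow in the model.
betas = {
    ("RACE", "SES"): 0.30,
    ("SES", "PARTICIPATION"): 0.40,
    ("RACE", "PARTICIPATION"): 0.10,
}

# Direct effect of RACE on PARTICIPATION.
direct = betas[("RACE", "PARTICIPATION")]

# Indirect effect through SES: multiply the Betas along the path.
indirect = betas[("RACE", "SES")] * betas[("SES", "PARTICIPATION")]

# Total effect = direct effect + indirect effects.
total = direct + indirect
print(round(direct, 2), round(indirect, 2), round(total, 2))
```

With more than one intervening variable, each separate path contributes its own product of Betas, and all of the products are added to the direct effect.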

TOPIC NINETEEN: EXPERIMENTAL DESIGNS

Classical Experimental Design:

Pre-test ---------------> Stimulus ----------------> Post-test
Experimental Group

Pre-test ---------------------------------------------> Post-test
Control Group

Also, both groups must be equal in composition. Ensure equality by: matching; random assignment.
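
Random assignment can be sketched in a few lines of Python. This is only an illustration: the subject ID numbers, the seed, and the function name `randomly_assign` are hypothetical, and real studies may use other randomization procedures.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seed makes the assignment repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical subject ID numbers.
experimental, control = randomly_assign(range(1, 21), seed=42)
print(sorted(experimental), sorted(control))
```

Because chance alone decides who lands in each group, the two groups should be similar on average in composition, including on characteristics the researcher did not think to measure.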

Internal invalidity problems- inferences (conclusions) drawn are not an accurate reflection of what actually happened.

External Invalidity Problems- unable to generalize to a population

(Note: internal and external invalidity problems derived from Donald T. Campbell and Julian C. Stanley's Experimental and Quasi-Experimental Designs for Research, Houghton Mifflin Co., 1963, pages 5-6.)

Solomon 4 Group Design- use the same two groups from the classical experimental design, and include two more groups: one having the stimulus and posttest only, and another having only the posttest. Must have equal groups, which then assumes equal pre-test scores.

Post-Test Only Design- one experimental and one control group, no pre-tests. Groups must be equal, use randomization

Factorial Designs- used with 2 or more stimuli

Classical Experimental Design is strong on internal validity, but weak on external validity

TOPIC TWENTY: QUASI-EXPERIMENTAL DESIGNS

Quasi-experimental designs are only moderate on internal validity, since they are naturally occurring experiments, and people cannot be randomly assigned to the groups.

Two major types of quasi-experiments:

1) Time Series Design- multiple pre-tests before stimulus; multiple post-tests after stimulus; no control group. This design fails to control for numerous threats to internal validity.

2) Control Series Design- two time series, one for experimental group, one for control group. Must have groups that are as comparable as possible. Controls for many internal validity problems.

Correlational Design- extensive social science research, such as survey research.
This is a post-test only design, with statistical controls used to simulate experimental and control groups. However, random assignment is not used to create groups.

One shot case study is a pre-experiment weak on both internal and external validity. It consists of a stimulus and a post-test.

TOPIC TWENTY ONE: REVIEW OF STATISTICAL MATERIAL FOR SECOND EXAM

TOPIC TWENTY TWO: CLASS WORK ON RESEARCH PAPER

SECOND ESSAY EXAMINATION

TOPIC TWENTY THREE: DATA SOURCES, ECOLOGICAL FALLACY, PANEL DESIGNS

SECONDARY DATA ANALYSIS: an example of the first paragraph of the methods section of your research paper

To test the hypotheses in my model, I used the 1998 telephone survey conducted by the Survey Research Unit of the Social Science Research Center at Mississippi State University. A random sampling technique was used to select the households, and a random method was employed to select one individual in each household to interview. Six hundred eight adult Mississippi residents were interviewed from April 14 to April 26, 1998. The response rate was 64%. The sample was adjusted by demographic characteristics (education, sex, race, adults, phone numbers) to ensure that all social groups were adequately represented in the survey. Census data for 1996 were used to obtain population estimates for education, and census data from 1990 were used for race and sex population estimates. With 608 people surveyed, the sample error is plus or minus 4%, which means that if every Mississippi resident had been interviewed, the results could differ from those reported here by as much as 4%.
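
The plus or minus 4% sample error quoted above can be checked with the usual formula for the margin of error of a proportion. The Python sketch below assumes simple random sampling, a 95% confidence level (z = 1.96), and the most conservative case of a 50/50 split; none of these assumptions are stated in the paragraph itself, and the function name is hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p from a simple random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(608)
print(round(100 * moe, 1))   # margin of error in percentage points
```

Because the sample size sits under a square root, cutting the margin of error in half requires roughly four times as many interviews.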

Information on the methods used in each year of the Mississippi Poll is provided here.

AGGREGATE DATA (ECOLOGICAL FALLACY)

Ecological fallacy is the incorrect assumption that relationships existing at the aggregate level also exist at the individual level.

Example of religion and presidential vote in the 1940s. Two tables showing individual level relations and aggregate marginal results.

First example from 1990 census- foreign born and college degrees aggregate relationship
STATE.....% FOREIGN BORN.....% COLLEGE DEGREE
Mass...................9%......................20%
N.H....................5%......................18%
Vermont................4%......................19%
N.Y...................14%......................18%
N.J...................10%......................18%
Alab...................1%......................12%
Ark....................1%......................11%
La.....................2%......................14%
Miss...................1%......................12%
Ga.....................2%......................15%
S.C....................2%......................13%

The above table suggests that the foreign born are more likely to have college degrees than are U.S.-born adults. Such a conclusion would be committing the ecological fallacy. In reality, the data are merely indicating that states (not people) with a higher percentage of foreign born residents are also states that happen to have a population that contains a greater percentage of college educated adults, compared to states with a lower percentage of foreign born residents. The relationship between foreign born and education is a spurious (non-causal) one; states with well-funded education systems tend to be located in the Northeast and Midwest, and those are the same states where many immigrants settle.

Second example from 1990 census- % black and % Republican presidential vote at state level of analysis
STATE.....% BLACK.....% REPUBLICAN PRES. VOTE IN 1988
Alabama........25%.............59%
Georgia........27%.............60%
Miss...........36%.............60%
Virginia.......19%.............60%
Iowa............2%.............44%
Minn............2%.............46%
Penn............9%.............51%
Wash............3%.............48%
Wisc............5%.............48%

The above table suggests that African-Americans are more likely to vote Republican for President than are whites. Such a conclusion would be committing the ecological fallacy, since the table provides aggregate data, not individual-level data. The table in reality is merely showing that states having a high percentage of African-Americans are also states that just happen to be more likely to vote Republican for President, compared to states having a lower percentage of African-Americans. The relationship between race and vote at the state unit of analysis is a spurious, non-causal one. African-Americans merely happen to be concentrated in southern states, since such states historically relied on slavery on large plantations, and southern whites tend to be more conservative politically than are whites in the north.
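
The strength of the state-level association in the 1988 table can be measured with a Pearson correlation across the nine states. The Python sketch below is illustrative only (the function and variable names are hypothetical). Note what it does and does not show: the unit of analysis is the state, so even a very strong correlation here says nothing about how individual African-American or white voters cast their ballots.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# State-level figures from the second example above, in table order:
# Alabama, Georgia, Miss, Virginia, Iowa, Minn, Penn, Wash, Wisc.
pct_black = [25, 27, 36, 19, 2, 2, 9, 3, 5]
pct_rep_1988 = [59, 60, 60, 60, 44, 46, 51, 48, 48]

r_states = pearson_r(pct_black, pct_rep_1988)
print(round(r_states, 2))
```

Testing the individual-level relationship requires individual-level data, such as a survey that records each respondent's race and vote.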

Third example from 2010 census- % black and % Republican presidential vote at state level of analysis
STATE.....% BLACK.....% REPUBLICAN PRES. VOTE IN 2008
Alabama........26%.............60%
Arkansas.......15%.............59%
Georgia........31%.............52%
Miss...........37%.............56%
Iowa............3%.............45%
Minn............5%.............44%
Penn...........11%.............44%
Wash............4%.............41%
Wisc............6%.............42%

Example from Joe Parker book on Mississippi electoral patterns, 1st edition

PANEL STUDIES

Problems with cross-sectional surveys that gather data at only one time point:
1) Inability to study change
2) Hard to make recursive causal assumptions

Panel design definition: the same people, asked the same questions, at two or more time points. Each time point is called a wave.

Problems with panel designs:

Examples of panel studies:
1) National election studies panels of 1956-58-60, 1972-74-76, and 1992-94-96. The second panel study was able to study the effects of Watergate in the 1970s. The third panel study examined how party identification and issue attitudes had reciprocal effects, each affecting the other. It also found reciprocal effects for external efficacy and turnout.
2) 1980, 4 wave U.S. national election study. It examined the effects of campaigns on voters. It found that dissatisfaction with President Carter's leadership caused his defeat.
3) The M. Kent Jennings panel of high school seniors and their parents. Wave 1 was in 1965, wave 2 in 1973, and wave 3 in 1982. Subject was socialization and persistence of attitudes over time. It found that political orientations tended to stabilize by the time people were 30 years old.

TOPIC TWENTY-FOUR: UNOBTRUSIVE MEASURES AND CONTENT ANALYSIS

Problems with obtrusive measures: (derived from the book Unobtrusive Measures: Nonreactive Research in the Social Sciences, by Eugene Webb, Donald T. Campbell, Richard D. Schwartz, and Lee Sechrest; Rand McNally Co., 1972)
1) Guinea Pig or Testing Effect: subjects may feel they must leave a good impression, or the test may make them interested in the subject
2) Role Selection: a nonrepresentative role is selected, especially by those who are less educated and less familiar with the subject of the test
3) Response Sets, such as acquiescence bias, sequence, wording: Mississippi Poll, spending items
4) Interviewer Effect: race, age, and sex of interviewer may affect responses

Unobtrusive Measures directly remove the researcher from the research setting.

Types of Unobtrusive Measures (Source: Research Methods in the Social Sciences, 5th edition, by Chava Frankfort-Nachmias and David Nachmias; St. Martin's, 1996, pages 315-324)

RESEARCH PAPERS DUE

TOPIC TWENTY-FIVE: STUDENT ORAL REPORTS ON THEIR RESEARCH PAPERS