STATISTICAL INFERENCE learning module

Statistical Inference

We use public opinion polls a lot, because unlike the Census we don’t have enough money to talk to everyone. Even with a randomly drawn sample, our estimates of public opinion will not be as exact as if we had interviewed everyone in the population. Statistical inference measures tell us something very simple: whether a relationship between two variables that exists in the poll also exists in the population. So, if you find that women are more liberal than men in your poll, does that relationship also exist in the population? Statistical tools are limited in the information they provide, in the sense that they are just looking for whether any relationship of any strength exists in the population. So, for example, you might find that in the sample, women vote 5% more Democratic than do men. If your findings are statistically significant, that just means that a relationship in this direction very likely exists in the entire population. But the percentage difference in the population might be only 1%, or it might be 10%. Either way, the statistics we use (chi-squared, t-tests) would say that your results are statistically significant. Because of the limited information these statistics provide, some researchers do not find them very valuable, and we are not talking about them until the tail end of the course. They are important, however, because they must be reported in all tables and mentioned in the text of your paper. And they do at least tell you that your relationship is not limited to your sample, but exists in the population.

To recap: Statistical inference is our ability to generalize a relationship found in a sample to the entire population from which that sample was drawn. That is, can we infer population characteristics from sample data? If our statistical inference test suggests that in the population the relationship between the two variables is nonrandom, the relationship is said to be statistically significant.

An example of statistical inference is also drawn from my Class Notes. Our 2010 Mississippi Poll sampled only 601 adult Mississippians from an adult population of over two million. We found a definite relationship in the sample between gender and seat belt use. 83% of women said they "always" used their seat belts, compared to 76% of men. 12% of men said they "never" or "seldom" used their seat belts, compared to only 5% of women. The magnitude of this relationship between gender and seat belt use was 7%: [(83-76) + (12-5)] / 2. But can we generalize this relationship found in the sample to the entire population? Is there a relationship between gender and seat belt use in the entire population? Statistical inference is the procedure we use to determine if any relationship exists in the entire population.
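The magnitude calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `magnitude` is made up for this example, and the percentages are the ones quoted from the 2010 Mississippi Poll.

```python
def magnitude(differences):
    """Average the percentage differences between two groups."""
    return sum(differences) / len(differences)


# Percentages quoted above (women compared to men).
always_gap = 83 - 76   # women are more likely to "always" use seat belts
seldom_gap = 12 - 5    # men are more likely to "never" or "seldom" use them

print(magnitude([always_gap, seldom_gap]))  # 7.0
```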

In this example, the chi-squared (Pearson) value is 10.8 with 3 df, and is significant at the .05 level. This means that if no relationship existed in the population, we would find a sample relationship this strong fewer than 5 times in one hundred; thus, we can be about 95% confident that this relationship does exist in the entire population.

In other words, the Chi-Squared significance level is one of those statistics where the lower the value, the better it is. A .01 significance level indicates that there is only one chance in one hundred of finding a relationship this strong in the sample if no relationship exists in the population. A .001 level of significance indicates that there is only one chance in one thousand.
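For those curious where a significance level comes from, here is a minimal sketch that converts a chi-squared value and its degrees of freedom into a significance level, using only Python's standard library. The function name `chi2_sig` is an assumption for this example; it is not what SPSS runs internally, only a standard closed-form calculation that works for whole-number degrees of freedom.

```python
import math


def chi2_sig(value, df):
    """Upper-tail probability of a chi-squared value (whole-number df)."""
    half = value / 2.0
    if df % 2 == 0:
        # Even df: a short closed-form sum, starting from the i = 0 term.
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    # Odd df: complementary error function plus a finite series.
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


# The seat belt example: chi-squared of 10.8 with 3 df.
p = chi2_sig(10.8, 3)
print(round(p, 3))  # about .013: below .05, but not below .01
```

The result falls between .01 and .05, which is why the seat belt relationship is reported as significant at the .05 level rather than the .01 level.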

How do you find the Chi-Squared significance level in your computer output? Take a look at one of your crosstabs tables. Under it are two more tables, one for chi-squared statistics and one for gamma.

Take a look at the computer output of one student from the 2020 class, who looked at the relationship between religiosity and ideology. There is a clear relationship in the poll between these two variables, since 68.0% of weekly church attenders are self-described conservatives, compared to only 42.4% of yearly church attenders. Conversely, 24.8% of the yearly church attenders are liberals, compared to only 12.0% of weekly church attenders. How statistically significant is this relationship?

ideology1 Ideology * religfre1 Religiosity recoded Crosstabulation
			religfre1 Religiosity recoded			Total
			1.00 Weekly	2.00 Monthly (codes2,3)	3.00 Yearly (codes4,5)	Total
ideology1 Ideology	1.00 Liberal	Count	27	36	31	94
	1.00 Liberal	% within religfre1 Religiosity recoded	12.0%	25.2%	24.8%	19.1%
	2.00 Moderate	Count	45	48	41	134
	2.00 Moderate	% within religfre1 Religiosity recoded	20.0%	33.6%	32.8%	27.2%
	3.00 Conservative	Count	153	59	53	265
	3.00 Conservative	% within religfre1 Religiosity recoded	68.0%	41.3%	42.4%	53.8%
Total		Count	225	143	125	493
Total		% within religfre1 Religiosity recoded	100.0%	100.0%	100.0%	100.0%

Chi-Square Tests
	Value	df	Asymptotic Significance (2-sided)
Pearson Chi-Square	34.359^a	4	.000
Likelihood Ratio	34.946	4	.000
Linear-by-Linear Association	23.922	1	.000
N of Valid Cases	493
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 23.83.

Using a formula to calculate chi-squared, you (the computer program) get a value of 34.359 at 4 degrees of freedom. Rows and columns denote the number of categories of each variable. You subtract one from each number (3 rows minus 1 = 2; 3 columns minus 1 = 2) and multiply the results together: 2 times 2 gives you four degrees of freedom. A table in a textbook or on-line source, or in your case your computer program, gives you the significance level of this statistic. In this case, the results are so statistically significant that the significance level rounds to zero (.000). In published papers, we typically report only one of four values. The three significant ones, from best to worst, are < .001, < .01, and < .05, where the symbol < means “less than.” A rejected hypothesis is reported as > .05, where the symbol > means “greater than.” In this case, .000 is less than .001, which is the best case scenario, so in your tables and text you just report Chi-squared sig. < .001.
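If you want to double check the computer, the steps just described can be sketched in Python from the counts in the crosstab above. The function names `pearson_chi2` and `report_sig` are made up for this illustration, and this is not SPSS's own code, just the textbook Pearson formula applied to the student's table.

```python
def pearson_chi2(table):
    """Return (chi-squared, df, smallest expected count) for a crosstab."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if the variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    # (rows minus 1) times (columns minus 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, min_expected


def report_sig(p):
    """Turn a significance level into the label used in tables and text."""
    if p < 0.001:
        return "< .001"
    if p < 0.01:
        return "< .01"
    if p < 0.05:
        return "< .05"
    return "> .05"


# Counts from the ideology by religiosity crosstab.
# Rows: Liberal, Moderate, Conservative. Columns: Weekly, Monthly, Yearly.
counts = [
    [27, 36, 31],
    [45, 48, 41],
    [153, 59, 53],
]

chi2, df, min_expected = pearson_chi2(counts)
print(round(chi2, 3), df, round(min_expected, 2))
print("Chi-squared sig.", report_sig(0.000))  # SPSS printed .000 for this table
```

The value, degrees of freedom, and minimum expected count should agree with the Chi-Square Tests table and its footnote.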

Symmetric Measures
		Value	Asymptotic Standard Error^a	Approximate T^b	Approximate Significance
Ordinal by Ordinal	Gamma	-.331	.058	-5.481	.000
N of Valid Cases		493
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

This third table gives the gamma value, which is in the first column. Remember with gamma values, the higher the absolute value of the number, the better; the highest possible is 1.0 or -1.0. The value in the first column of this Gamma table is what you report in your tables and the text of your paper. I have done that also in your lab, so just double check my work for your paper.
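As a rough sketch of where the gamma value comes from, the Python below counts concordant pairs (two respondents ranked in the same order on both variables) and discordant pairs (ranked in opposite orders), then takes (concordant - discordant) / (concordant + discordant). The function name `goodman_kruskal_gamma` is made up for this illustration, and the sketch assumes the rows and columns of the table are listed in rank order.

```python
def goodman_kruskal_gamma(table):
    """Gamma for a crosstab whose rows and columns are both in rank order."""
    concordant = 0
    discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows below row i
                for m in range(n_cols):
                    if m > j:                   # below and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                 # below and to the left
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


# Counts from the ideology by religiosity crosstab.
# Rows: Liberal, Moderate, Conservative. Columns: Weekly, Monthly, Yearly.
counts = [
    [27, 36, 31],
    [45, 48, 41],
    [153, 59, 53],
]

gamma = goodman_kruskal_gamma(counts)
print(round(gamma, 3))  # -0.331, matching the Symmetric Measures table
```

The negative sign simply reflects how the categories are coded: the higher religiosity codes (less frequent attendance) go with the lower share of conservatives.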

Normally, we won’t be reporting the significance levels for gamma (the last columns). The only exception is in certain cases where the significance levels for gamma and chi-squared are different, and that difference reflects something important. Chi-squared is a nominal level measurement, so it reports any deviation from chance for all of the cells in your table, even for the middle categories. However, your hypotheses are all directional, meaning that you posit that one category of one variable is related to one category of another variable. So, occasionally, gamma significance might be worth reporting.

This student example looks at sex differences in educational level. Historically, you might hypothesize that men tend to have a higher education level than do women. And indeed, 21.1% of men are college graduates, compared to 16.9% of women, which is consistent with your hypothesis. However, 21.3% of men are high school dropouts, compared to only 18.9% of women. This is the opposite of the hypothesis. So what happens?

educate1 Education Level * sex Gender Respondent Crosstabulation
			sex Gender Respondent		Total
			1 MALE	2 FEMALE	Total
educate1 Education Level	3.00 < Hi Sch	Count	137	140	277
	3.00 < Hi Sch	% within sex Gender Respondent	21.3%	18.9%	20.0%
	4.00 Hi Sch Grad	Count	213	236	449
	4.00 Hi Sch Grad	% within sex Gender Respondent	33.1%	31.9%	32.4%
	5.00 Some College	Count	158	239	397
	5.00 Some College	% within sex Gender Respondent	24.5%	32.3%	28.7%
	6.00 College Grad + >	Count	136	125	261
	6.00 College Grad + >	% within sex Gender Respondent	21.1%	16.9%	18.9%
Total		Count	644	740	1384
Total		% within sex Gender Respondent	100.0%	100.0%	100.0%

Well, as the Chi-Square Tests table below shows, the relationship is significant at the .01 level. But would you really say that the hypothesis was upheld? After all, there really are no substantively significant differences between the sexes in education levels.

Chi-Square Tests
	Value	df	Asymptotic Significance (2-sided)
Pearson Chi-Square	11.598^a	3	.009
Likelihood Ratio	11.654	3	.009
Linear-by-Linear Association	.093	1	.760
N of Valid Cases	1384
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 121.45.

The gamma table in this case is very informative. Note that the value is only .019. That value, which measures the strength of this relationship between ordinal variables, is NOT statistically significant (its significance level is .645). In this rare case (and I have identified them for you in your attached Tables), I would conclude that the hypothesis was rejected. And I would explain that there is a statistically insignificant curvilinear relationship between sex and education level. Men are both slightly more educated and slightly less educated than women, while women are more likely than men to be in a middle category of having “some college.” But the overall relationship between sex and education is so weak that it is statistically insignificant.

Symmetric Measures
		Value	Asymptotic Standard Error^a	Approximate T^b	Approximate Significance
Ordinal by Ordinal	Gamma	.019	.041	.461	.645
N of Valid Cases		1384
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
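The contrast between the two statistics in this example can be reproduced from the counts in the education by sex crosstab. The sketch below is only an illustration (the function names are made up, and this is not SPSS's own code): chi-squared picks up any departure from chance in any cell, while gamma only picks up a consistent upward or downward pattern.

```python
def pearson_chi2(table):
    """Pearson chi-squared for a crosstab of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )


def goodman_kruskal_gamma(table):
    """Gamma from concordant and discordant pairs (categories in rank order)."""
    concordant = discordant = 0
    for i in range(len(table)):
        for j in range(len(table[0])):
            for k in range(i + 1, len(table)):
                for m in range(len(table[0])):
                    if m > j:
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


# Rows: < Hi Sch, Hi Sch Grad, Some College, College Grad +. Columns: Male, Female.
education_by_sex = [
    [137, 140],
    [213, 236],
    [158, 239],
    [136, 125],
]

chi2 = pearson_chi2(education_by_sex)
gamma = goodman_kruskal_gamma(education_by_sex)
print(round(chi2, 3))   # close to the 11.598 in the Chi-Square Tests table
print(round(gamma, 3))  # 0.019, essentially no ordinal relationship
```

Most of the chi-squared value comes from the “some college” row, where women are overrepresented, which is why chi-squared is significant even though gamma is close to zero.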

Note that further information about Chi-Squared and how it is calculated is available in the Class Notes that I previously sent you.

A third test of statistical inference that is often found in published articles is the t-test for differences between means. From my Class Notes:

The t-test is an interval statistic (dependent variable must be interval). It tests the hypothesis that two groups have different means, and that the inter-group difference can be generalized to the population.

Two-sample t-test (SPSS-independent sample) means that each group is considered a sample.

A one-tailed t-test means that your hypothesis has a direction for the relationship. A two-tailed t-test is used to test nondirectional hypotheses. SPSS reports only the two-tailed test, which is stricter; hence if your results are significant for the 2-tailed test (and the difference is in the direction you hypothesized), they will also be significant for the 1-tailed test.

Two versions of the statistic are reported by the SPSS program: one for two populations having equal variances, and one for unequal variances.

The t-test is computed using the formula in textbooks or on-line.
Degrees of freedom equals the sum of the two sample sizes minus two.

The t-value must be larger than the table entry to be significant at the specified level. See page 576 of your textbook.
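For reference, here is a minimal sketch of the equal-variances version of the calculation in Python. The function name `two_sample_t` and the two lists of income codes are made up for illustration (they are NOT Mississippi Poll data); the comparison value of 2.306 is the standard two-tailed table entry for 8 degrees of freedom at the .05 level.

```python
import math
import statistics


def two_sample_t(group_a, group_b):
    """Return (t, df) for the equal-variances two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2  # sum of the two sample sizes minus two
    # Pool the two sample variances, weighting each by its own df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical income codes for two small groups (made up for this sketch).
group_one = [1, 2, 3, 4, 5]
group_two = [3, 4, 5, 6, 7]

t, df = two_sample_t(group_one, group_two)
TABLE_ENTRY = 2.306  # two-tailed .05 level, 8 df
print(round(t, 3), df)
print("significant at .05:", abs(t) > TABLE_ENTRY)
```

With these made-up numbers the group means differ by two income codes, but with only five cases per group the t-value does not exceed the table entry, so the difference would not be statistically significant.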

Using the SPSS program: use the Compare Means - Independent Samples option (Statistics menu). Your Test Variable is your dependent variable, which should be interval level. Your Grouping Variable should be a dichotomous independent variable (recode it, when necessary). Check Levene's test for equality of variances: if its significance is > .05, use the equal variances row; otherwise (p <= .05), use the unequal variances row. Cite the t-value and 2-tail sig. in papers. The significance level must be <= .05.

You don’t have to understand the mechanics of this, except for the following example of a test question. My test questions are very straightforward.

Example of a t-test problem (drawn from 2008-2010 Mississippi Poll data).

Examining predictors of family income. Family income is treated as interval data, coded from a low of 1 for under $10,000 to a high of 8 for over $70,000. The following indicates the average income code for each pair of categories of each predictor, as well as the t-test significance level. Answer the following two questions: For each predictor, which group has the higher family income? Is the t-test statistically significant for each of the following five predictors (remember, it must be significant at least at the .05 level)?

Education: high school dropout income mean is 2.68; college graduate income mean is 6.31; t-test is statistically significant at .001 level.
Sex: male income mean is 4.68; female income mean is 4.20; t-test is statistically significant at .01 level.
Race: white income mean is 4.92; black income mean is 3.15; t-test is statistically significant at .001 level.
Ideology: moderates' income mean is 4.26; conservatives' income mean is 4.75; t-test is statistically significant at .05 level.
Number of adults living in household: 1 adult households' income mean is 3.16; 2 adult households' income mean is 4.84; t-test is statistically significant at .001 level.
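One way to work through the two questions is sketched below in Python, using only the means and significance levels listed above. The helper name `answer` is made up for this illustration.

```python
# Each entry: (predictor, group A, mean A, group B, mean B, t-test significance level)
predictors = [
    ("Education", "high school dropouts", 2.68, "college graduates", 6.31, 0.001),
    ("Sex", "males", 4.68, "females", 4.20, 0.01),
    ("Race", "whites", 4.92, "blacks", 3.15, 0.001),
    ("Ideology", "moderates", 4.26, "conservatives", 4.75, 0.05),
    ("Adults in household", "1 adult households", 3.16, "2 adult households", 4.84, 0.001),
]


def answer(row):
    """Return (predictor, group with the higher mean, significant at .05?)."""
    name, group_a, mean_a, group_b, mean_b, sig = row
    higher = group_a if mean_a > mean_b else group_b
    return name, higher, sig <= 0.05


answers = [answer(row) for row in predictors]
for name, higher, significant in answers:
    print(f"{name}: {higher} have the higher income code; significant: {significant}")
```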

Lab work, focusing on your individual papers:

The next Findings and Tables section of your paper is the most critical part, since it basically counts for half of your overall paper grade. The literature review counts for one-fourth of your overall paper grade. So if you get pressed for time, you might want to put more time into the Findings and Tables section.

The bivariate part of the Findings section is pretty straightforward, as you can see from the sample student paper I sent you at the start of the class. You have one paragraph for each of your hypotheses. It is probably most readable to put the table first, and then have the text paragraph. Put the next table in, then have the next text paragraph. And so on. Feel free to renumber the table numbers to conform with the hypothesis numbers in your model and hypotheses, and literature review sections. I just gave them a number based on the computer output. You will each have 5 of these bivariate tables, testing each of your hypotheses.

The most complicated part of the paper is the multivariate section. First, you can double check the tables that I sent you with the computer output that was provided earlier in the course. If nothing else, you can see how the printed computer output is so hard to make sense of, and how the tables in your paper are MUCH more readable to a reader.

We talked about multivariate tables and why we do these analyses previously in this class. But here are some examples from your own papers on the value of multivariate analyses.

One project in 2020 looked at Abortion as the dependent variable, with sex and religiosity as independent early variables and ideology as the middle, intervening variable. Interestingly enough, sex may not affect attitudes toward abortion (don’t worry about rejected hypotheses; as in this case, such findings are still very interesting and valuable to know). However, both religiosity and ideology may affect abortion attitudes, with liberals and seculars being more pro-choice than conservatives and the most religious. Now the question is, do these bivariate relations exist in a multivariate sense? That is, are both of these predictors important? Or is ideology the only important predictor, and highly religious people are more pro-life only because they are more conservative than the seculars? This is when we have multivariate tables.

In this case, we produced three multivariate tables that broke up the sample into three groups (weekly church attenders, monthly church attenders, and rarely attenders), and for each group we looked at whether ideology affected abortion attitudes. It looks like ideology does affect abortion attitudes for each of these three religiosity groups examined separately. So ideology is important in your final model. Then, we produced three more tables, this time separating the sample into three ideology groups (liberal, moderate, conservative), and looked at whether religiosity is important in affecting abortion attitudes. It looks like religiosity is important in affecting abortion attitudes for each of these three ideology groups examined separately. So in your final model, religiosity is also important.

This final model only has to be redrawn for the Conclusion section, which is not due until the entire rewritten paper is turned in.
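The logic of these two sets of control tables can be sketched in Python. Everything here is an assumption made for illustration: the function name `controlled_crosstabs`, the variable names, and especially the eight respondents, which are MADE UP and are not Mississippi Poll data. The point is only to show how the same respondents get split first by one predictor and then by the other.

```python
from collections import Counter, defaultdict


def controlled_crosstabs(records, control, predictor, outcome):
    """Build one percentage table (outcome by predictor) per control group."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[(rec[control], rec[predictor])][rec[outcome]] += 1

    tables = defaultdict(dict)
    for (control_group, predictor_group), answers in counts.items():
        total = sum(answers.values())
        tables[control_group][predictor_group] = {
            answer: round(100 * n / total, 1) for answer, n in answers.items()
        }
    return tables


# MADE-UP respondents for illustration only:
# (religiosity, ideology, abortion attitude)
rows = [
    ("weekly", "liberal", "pro-choice"),
    ("weekly", "liberal", "pro-life"),
    ("weekly", "conservative", "pro-life"),
    ("weekly", "conservative", "pro-life"),
    ("yearly", "liberal", "pro-choice"),
    ("yearly", "liberal", "pro-choice"),
    ("yearly", "conservative", "pro-choice"),
    ("yearly", "conservative", "pro-life"),
]
records = [dict(zip(("religiosity", "ideology", "abortion"), row)) for row in rows]

# First set: does ideology matter within each religiosity group?
by_religiosity = controlled_crosstabs(records, "religiosity", "ideology", "abortion")
# Second set: does religiosity matter within each ideology group?
by_ideology = controlled_crosstabs(records, "ideology", "religiosity", "abortion")

for group, table in by_religiosity.items():
    print(group, table)
```

If the percentages still differ across the predictor's categories inside every control group, as they do in this toy data, that predictor stays in the final model.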

You can take a look at the student sample paper sent earlier for some ideas on how to write up each of your multivariate tables. It would probably be easiest to just talk about each table separately. Then, have a sum up paragraph on what they all tell you, as the sample paper does.

You might ask, why reproduce the same percentages but in a slightly different format in these two sets of multivariate tables. Isn’t that redundant? Yes, but like in the test you just took, it is easier for you and a reader to follow the results. You don’t have to compare across three multivariate tables. You can just look at each table separately. So many of you will have these types of multivariate tables.

Another project from 2020 looked at a complex but valuable subject like support for defense spending, and might examine the important variables of sex, age, and ideology. It is just as valuable to have rejected hypotheses, since we may learn to our surprise that in Mississippi at this time, maybe there were no sex or even ideological differences in support for defense spending. But age may be a critical factor. In that case, we might control for ideology and sex separately to see whether age is still important for each ideological and sex group. The information is valuable, as the attitudes of the older generation (policy makers) and the younger generation (future leaders) are very important to know. The results may show that the generation gap in support for defense spending exists for only two (but numerous) ideological groups, and the gap may be especially strong for one sex group (but also exist for the other).

So some of you have multivariate tables that take one significant predictor and control for the other two variables. One project looked at abortion attitudes being shaped by sex, age, and education, and that was another example of this type of series of multivariate tables. This student in a previous lab had already identified some complex patterns that result, and I found them very interesting and unique. Good eye!!

I will leave you all now to think about all of this. Take a look at your own project and the Tables, and try to understand what they are telling you. E-mail me with any questions that you have about your own project and tables. I will give you some advice individually, and may mention your own ideas in the next learning module.