WEEK 7: DESCRIPTIVE STATISTICS
DESCRIPTIVE STATISTICS are Univariate Statistics. Univariate means
that they deal with only one variable.
There are two main groups of descriptive statistics: measures of central
tendency and measures of dispersion.
CENTRAL TENDENCY refers to the typical case. That is, what are the characteristics or attitudes
of the “typical person” in a poll? There are three ways of measuring central
tendency.
- 1. Mode. Mode is the
category with a plurality (the greatest single number of cases). It only
requires that an indicator be measured at the nominal level. For example,
if the variable studied is how people voted in the 1968 presidential
election, and 44% of the sample voted for Republican Nixon, 43% voted for
Democrat Humphrey, and 13% voted for Independent Alabama Governor George
Wallace, the mode is the category of Nixon. It is the category with the
plurality, the single largest number of people, which in this case is 44%.
So the typical case, the typical voter in 1968, voted for Nixon, and he
became President. You only need variables measured at the Nominal level
when studying Mode.
- 2. Median. Median is the
category of the variable that contains the "middle" case. The “middle
case” means that half of the cases have scores that are below this case,
and half of the cases have scores that are above the median. So, basically
you line everyone up from highest to lowest scores, and whoever is in the
middle, that is the median case. Let’s say that we were studying the age
of students in the class, and we had 11 students. Let’s say that their
ages ranged from 20 thru 30, and one student fell into each age group. The
median age would be 25. Half of the students (ages 20-24, 5 students)
would be below age 25, and half would be above it (ages 26-30, 5
students). It’s a little more complicated when you have more than one
person in each category, which is called “grouped data.” In that case, you
again order the variable’s categories from lowest to highest, add up the
cumulative percentages of cases, and when you get to the category at which
the cumulative percent reaches 50% (the middle case), that is the median.
Also, reverse the ordering and add up the cumulative percentages starting
from the opposite end of the variable, and the 50% level should give you
the same median category. That is how you can check your work. Median
requires variables measured at the ordinal level, since you have to be
able to rank the variable’s categories from lowest to highest scores.
- 3. Mean. It just means
the average. You add up all of the cases' scores, and divide by the number
of cases. For example, take the grades in one of my previous classes. If I required two
tests and a research paper, and if I said that each counts equally, then I
calculate your mean or average grade. If you got a C on the first test,
that is a score of 2. A B on the paper would be a score of 3. An A on the
final test would be a score of 4. The mean of these three grades would be
(2+3+4)/3, each score added together and divided by the number of grades.
The mean of your grades would be 9/3 = 3, which would be a B. It is more
complicated with grouped data, as you have more than one case in every
category. So I would have to weight each category by the number of people
falling into that category. A calculator or the SPSS computer program
calculates the means for our variables, so I will ask an applications
question on the test, and not require that you calculate the mean. Means require
an interval level of measurement, since the scores of the category have a
real meaning and do not merely denote how they are ordered from low to
high.
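The three measures above can be sketched in a few lines of Python. This is only an illustration, not how SPSS computes them. The vote shares, ages, and grades come from the examples above. The function name grouped_mean and the counts in hypothetical_counts are made up here to show the weighting idea for grouped data.

```python
from statistics import mean, median

# Mode: 1968 vote shares from the example above (percent of the sample).
vote_share = {"Nixon": 44, "Humphrey": 43, "Wallace": 13}
modal_category = max(vote_share, key=vote_share.get)  # category with the plurality

# Median: eleven students, one at each age from 20 through 30.
ages = list(range(20, 31))
median_age = median(ages)  # the middle case once the ages are lined up

# Mean: C = 2, B = 3, A = 4, each counting equally.
grades = [2, 3, 4]
mean_grade = mean(grades)


def grouped_mean(counts):
    """Weight each category score by the number of cases in that category."""
    total_cases = sum(counts.values())
    return sum(score * n for score, n in counts.items()) / total_cases


# Hypothetical grouped data (not from the notes): grade score -> number of students.
hypothetical_counts = {2: 5, 3: 10, 4: 5}

print(modal_category, median_age, mean_grade, grouped_mean(hypothetical_counts))
```

With grouped data, the weighted sum simply counts each category's score once for every person in that category, which is the same as listing every person's score and averaging.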
Identify the mode and median categories
in each of the following examples drawn from the 2010 Mississippi Poll. Do the
mode first, for all of the questions. Then, go back and do the median. Most
people have a greater problem with median than mode, so let’s go through the Obama
example. How do you find median? Well, start at the top and keep adding the percents
together until you exceed 50%. So, 14% + 24% = 38%. You have to keep going. Now,
take the 38% and add the next category’s size of 23% to it. You now have a
cumulative 61%, which exceeds 50%. Therefore, that last category (that has the
additional 23% of people) is your median. That category is Fair. Now, check
your work by starting from the bottom and calculating the cumulative percents.
39% is not yet 50%, so you have to add the next category to it. So, 39% + 23% =
62%, and now you have passed 50%. So that category that has that last
percentage of 23% is your median. That category is Fair. So, you are now certain
that your median category for President Obama’s job performance rating is Fair.
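The cumulative-percent procedure just described can be sketched in Python as follows. The function name median_category is made up for this illustration, and the percentages are the Obama job rating figures used in the walk-through. One assumption is labeled in the code: a cumulative total of exactly 50% is treated as having reached the middle case. In that borderline situation the top-down and bottom-up passes would disagree, which is exactly what the check is meant to reveal.

```python
def median_category(percent_by_category):
    """Return the category at which the cumulative percent first reaches 50.

    percent_by_category is a list of (category, percent) pairs, already
    ordered from one end of the variable to the other.
    Assumption: hitting exactly 50 counts as reaching the middle case.
    """
    cumulative = 0
    for category, percent in percent_by_category:
        cumulative += percent
        if cumulative >= 50:
            return category
    raise ValueError("percentages never reach 50")


# Obama job performance rating, 2010 Mississippi Poll (from the notes).
obama = [("Excellent", 14), ("Good", 24), ("Fair", 23), ("Poor", 39)]

from_top = median_category(obama)                       # 14, 38, 61 -> stops at Fair
from_bottom = median_category(list(reversed(obama)))    # 39, 62 -> stops at Fair

print(from_top, from_bottom)
```

If the two passes return the same category, as they do here, you can be confident of the median.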
Punishment favored in cases of first-degree murder:
Death penalty............... = 51%
Life without parole......... = 42%
A shorter jail term than life = 7%
How rate President Obama's job performance:
Excellent = 14%
Good .... = 24%
Fair .... = 23%
Poor .... = 39%
Ideological self-identification:
Very Liberal......... = 6%
Somewhat Liberal..... = 8%
Moderate............. = 34%
Somewhat Conservative = 26%
Very Conservative.... = 26%
Education Level:
High School Dropout. = 23%
High School Graduate = 30%
Some College........ = 29%
College Graduate.... = 13%
Some Graduate Work.. = 5%
Annual Family Income
Under $10,000 = 14%
$10-20,000... = 11%
$20-30,000... = 14%
$30-40,000... = 16%
$40-50,000... = 10%
$50-60,000... = 6%
$60-70,000... = 8%
Over $70,000. = 21%
Likelihood of Living in the Current Community in Five Years:
Definitely No. = 8%
Probably No... = 13%
Probably Yes.. = 30%
Definitely Yes = 49%
Population of the Community You Live In:
Farm or ranch = 11%
Rural area... = 30%
Under 2,500.. = 12%
2,500-10,000. = 18%
10,000-50,000 = 22%
Over 50,000.. = 7%
Now identify the mode
and median, in each year, for some of the other results of the Mississippi Poll. In these
tables, however, the percentages total 100% across each row rather than down a column.
Presidential job rating over the years:
|        | Excellent | Good | Fair | Poor |
| Reagan |           |      |      |      |
| 1981   | 22        | 31   | 29   | 18   |
| 1982   | 10        | 30   | 34   | 26   |
| 1984   | 21        | 33   | 22   | 24   |
| Bush 2 |           |      |      |      |
| 2002   | 36        | 33   | 19   | 12   |
| 2004   | 25        | 24   | 25   | 26   |
| 2006   | 15        | 30   | 27   | 28   |
PARTY IDENTIFICATION:
| YEAR | DEMOCRATS | INDEPENDENTS | REPUBLICANS |
| 1981 | 61%       | 7%           | 32%         |
| 1992 | 47%       | 13%          | 40%         |
| 2014 | 44%       | 8%           | 48%         |
“Overall, how would you rate Mississippi as a place
to live- excellent, good, fair, or poor?”
| YEAR | EXCELLENT | GOOD | FAIR | POOR |
| 1981 | 37%       | 40%  | 18%  | 5%   |
| 2014 | 32%       | 39%  | 19%  | 10%  |
“Do you strongly agree, agree, disagree, or
strongly disagree with the following statement: By law, a woman should be able
to have an abortion as a matter of personal choice.”
| YEAR | Strongly Agree | Agree | Undecided | Disagree | Strongly Disagree |
| 2000 | 9%             | 34%   | 5%        | 32%      | 20%               |
| 2012 | 19%            | 32%   | 4%        | 30%      | 15%               |
| 2014 | 16%            | 34%   | 4%        | 21%      | 25%               |
MEANS. This question on
the second test will be an application of means. It will ask you to come up
with the verbal interpretation of means, using an ordinal scale.
The example below is the ideological self-identification and ideological
perception questions.
Self-identification question
wording: "What about your political beliefs? Do you consider yourself:
very liberal, somewhat liberal, moderate or middle of the road, somewhat
conservative, or very conservative?" Candidate perception question
wordings: "Please label the following political figures as very liberal,
somewhat liberal, moderate (or middle of the road), somewhat conservative, or
very conservative." The presidential candidates asked about were: "Democratic
Presidential hopeful Hillary Clinton," "Democratic Presidential
hopeful Barack Obama," "Republican Presidential hopeful John
McCain." Ideological perception questions were not asked for the U.S.
senate candidates. However, such questions were asked in previous years' polls
for Mississippi public official Ronnie Musgrove, for when he was lieutenant
governor (1998) and governor (2000, 2002 polls). For comparison purposes, we
also include the perceptions of previous Democratic presidential candidates,
asked in previous Mississippi polls.
The values below are
"means" or averages for the ideological variables, all of which are
coded as 1 for very liberal, 2 for somewhat liberal, 3 for moderate, 4 for
somewhat conservative, and 5 for very conservative. Refer to these numbers in
order to come up with the words that describe what the means are. You will
always say that the typical person fell between two categories, but was closer
to a specific category. The test answers are given right after the means, so
you might want to cover up these answers with your hand and stop reading when
you get to the number, and try to answer in your own words, and then check the
answer to see if you were right.
2008 Mississippi Poll:
- Hillary Clinton's perceived ideological mean = 2.21.
Mississippians perceived Hillary Clinton's ideology as between somewhat
liberal and moderate, but closer to somewhat liberal.
- Obama's perceived mean = 2.08. Mississippians perceived
Barack Obama's ideology as between somewhat liberal and moderate, but
closer to somewhat liberal.
- McCain's perceived mean = 3.59. Mississippians
perceived John McCain's ideology as between somewhat conservative and
moderate, but closer to somewhat conservative.
- Average Mississippian's own mean = 3.52. The average
Mississippian's self-identification was between somewhat conservative and
moderate, but closer to somewhat conservative.
- Who was the average Mississippian ideologically closer
to? Obama or McCain? The answer of course is McCain.
Previous Democratic presidential nominees, 1988-2004 Mississippi Polls:
- Kerry = 2.12. Mississippians perceived John Kerry's
ideology in 2004 as between somewhat liberal and moderate, but closer to
somewhat liberal.
- Gore = 2.35. Mississippians perceived Al Gore's
ideology in 2000 as between somewhat liberal and moderate, but closer to
somewhat liberal.
- Bill Clinton (1996) = 2.20. Mississippians perceived
Bill Clinton's ideology in 1996 as between somewhat liberal and moderate,
but closer to somewhat liberal.
- Bill Clinton (1992) = 2.50. Mississippians perceived
Bill Clinton's ideology in 1992 as midway between somewhat liberal and
moderate.
- Dukakis = 2.20. Mississippians perceived Michael
Dukakis' ideology in 1988 as between somewhat liberal and moderate, but
closer to somewhat liberal.
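The rule for putting a mean into words (find the two categories it falls between, then say which one it is closer to) can be sketched in Python as follows. The names describe_mean and closest_figure are made up for this illustration. The category codes are the 1 to 5 coding given above, and the sample values are ones already answered above.

```python
import math

# Coding from the notes: 1 = very liberal ... 5 = very conservative.
LABELS = {
    1: "very liberal",
    2: "somewhat liberal",
    3: "moderate",
    4: "somewhat conservative",
    5: "very conservative",
}


def describe_mean(value):
    """Translate a mean on the 1-5 ideology scale into words."""
    if not 1 <= value <= 5:
        raise ValueError("mean must lie between 1 and 5")
    lower = math.floor(value)
    if value == lower:
        return f"exactly {LABELS[lower]}"
    upper = lower + 1
    between = f"between {LABELS[lower]} and {LABELS[upper]}"
    if value - lower == upper - value:
        return f"midway {between}"
    closer = lower if value - lower < upper - value else upper
    return f"{between}, but closer to {LABELS[closer]}"


def closest_figure(own_mean, perceived_means):
    """Name the figure whose perceived mean is nearest to own_mean."""
    return min(perceived_means, key=lambda name: abs(perceived_means[name] - own_mean))


print(describe_mean(2.21))  # Hillary Clinton, 2008
print(describe_mean(2.50))  # Bill Clinton, 1992
print(closest_figure(3.52, {"Obama": 2.08, "McCain": 3.59}))
```

The comparison in closest_figure is just the distance between two means: 3.52 is 0.07 away from 3.59 but 1.44 away from 2.08.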
Ronnie Musgrove's perceived ideology means in previous Mississippi Polls were:
- Lieutenant Governor Musgrove (1998 poll) = 3.02
- Governor Musgrove (2000 poll) = 2.71
- Governor Musgrove (2002 poll) = 2.68
Now identify the means
in words of the perceived ideologies of candidates in the more recent
Mississippi polls:
- Adult Mississippi resident (2012 poll)= 3.56
- President Barack Obama (2012 poll)= 2.07
- Republican Mitt Romney (2012 poll)= 3.47
- Governor Phil Bryant (2012 poll)= 3.62
- Attorney General Jim Hood (2012 poll)= 2.97
- Adult Mississippi resident (2014 poll)= 3.52
- President Barack Obama (2014 poll)= 2.04
- Republican Chris Christie (2014 poll)= 3.34
- Senator Thad Cochran (2014 poll)= 3.55
- Governor Phil Bryant (2014 poll)= 3.81
- Attorney General Jim Hood (2014 poll)= 3.15
You can see that these
elected Mississippi Democrats, such as Musgrove and Hood, were viewed by
Mississippians as less liberal than Democratic presidential candidates were
viewed. Thus, they were seen as being ideologically closer to average
Mississippians than were the party’s presidential candidates. Hence, they were more
electable.
DISPERSION: diversity, how divided or united the cases
are, the form of the distribution (an interval-level variable is required)
- 1. Range. Range is the distance between the extreme categories of a
variable. Thus, merely subtract the lowest number representing the
category at one end of the indicator from the highest number representing
the category at the other end of the indicator. The resulting number is
the range. Since these categories’ numbers have real meaning, range requires
that variables be measured at the interval level of measurement.
- 2. Variance is the average squared deviation of each case from the
mean. Click on this link for the formula and
an example of how variance is calculated. Note that the lower the variance
score, the more united and homogeneous the cases are on the variable. The
higher the variance score, the more divided the cases are. In this example
of the ideologies of Mississippi’s local party activists at the turn of
the century, you could see that the Republican variance was only .493,
while the Democratic variance score was 1.351. That was because
Republicans were much more united and homogeneous than were the Democrats.
Over 90% of Republicans fell into only two categories- the “somewhat
conservative” and the “very conservative.” Democrats on the other hand
were more dispersed and divided, as 90% of them fell into four categories-
very liberal, somewhat liberal, moderate, and even somewhat conservative.
For a variance example on the test, you will just compare two variance
scores for two groups, and indicate which group is more united (has the
lower variance score) or which group is more divided (has the higher
variance score), whatever term I ask for. You will not have to calculate
the Variance scores yourself, since the computer program does that for us.
- 3. Standard Deviation is merely the square root of the Variance, and will not
be asked about on the test.
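The three dispersion measures above can be sketched in Python as follows. This follows the definition given here, the average squared deviation from the mean, so it divides by the number of cases. Statistical packages often divide by the number of cases minus one instead, so their output can differ slightly. The function names and both groups of scores are made up for illustration and are not the party activist data.

```python
from math import sqrt


def score_range(scores):
    """Distance between the highest and lowest scores."""
    return max(scores) - min(scores)


def variance(scores):
    """Average squared deviation of each case from the mean (divides by N)."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))


# Hypothetical ideology scores on the 1-5 scale (invented for illustration).
united_group = [4, 4, 5, 5, 4, 5]    # clustered in two adjacent categories
divided_group = [1, 2, 3, 4, 5, 3]   # spread across all five categories

print(score_range(range(1, 8)))      # seven-point party identification scale
print(variance(united_group), variance(divided_group))
print(standard_deviation(united_group))
```

The clustered group produces the smaller variance, which is the pattern to look for when asked which group is more united.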
Examples of calculating
the Range from previous tests follow. The answers are given as
the last sentence in each paragraph example, so again you might want to put
your hand over it and try to calculate these numbers yourself.
- Party identification ranges from a low of 1 for Strong
Democrats to a high of 7 for Strong Republicans. Other categories are: 2
for Weak Dems; 3 for Independent Dems; 4 for Pure Independents; 5 for
Independent Reps; 6 for Weak Reps. The range is 7-1 = 6.
- Ideology ranges from a low of 1 for Strong Liberals to
a high of 5 for Strong Conservatives. Other categories are: 2 for Weak
Liberals; 3 for Moderates; 4 for Weak Conservatives. The range is 5-1 = 4.
- A feeling thermometer for President Barack Obama ranges
from a 0 for feeling cold towards him to a 100 for feeling hot towards
him, with 50 being a midpoint of indifference towards him. The range is
100-0 = 100.
- Annual family income, reported in an open-ended
fashion, ranges from $5,000 to $500,000. The range is $500,000-$5,000 =
$495,000.
- Years of formal education, ranges from a low of 4th
grade or 4 years, to a high of having a PhD, which is equivalent to 20
years. The range is 20-4 = 16.
- Years lived in Mississippi, range from someone who just
moved to the state, which is coded as 0, to a 99-year-old person who has
lived here all their lives, which is equivalent to 99 years. The range is
99-0 = 99.
- Do you strongly agree (1 code), agree (2 code),
disagree (3), or strongly disagree (4) with the following statement:
Rather than being tried in court, suspected terrorists should be
imprisoned indefinitely. The range is 4-1 = 3.
- How would you rate the job performance of President
Obama: excellent (4), good (3), fair (2), or poor (1)? The range is 4-1 =
3.
- What is your age? The lowest age in the survey is 18,
and the highest age is 90. The range is 90-18 = 72.
- Do you think that more (3), the same (2), or less (1)
should be spent on public grade schools and high schools? The range is 3-1
= 2.
Let’s briefly refresh
your memory of terms like mean and variance by going back to the example of
Mississippi's party organization members in 2001. The mean for Democrats was
2.69, which was between somewhat liberal and moderate, but closer to moderate.
The mean for Republicans was 4.45, which was between somewhat conservative and
very conservative, but closer to somewhat conservative. Remember the form of
the distribution. Nearly 10% of Democrats were very conservative, and almost
20% were somewhat conservative, so there was considerable diversity or
dispersion of ideologies in the Democratic party. Therefore, the variance of
Democrats' ideology scores was a relatively higher number, a variance of 1.351.
For Republicans on the other hand, less than 2% of them were very liberal or
somewhat liberal. So there was much unity and clustering of ideological scores
for the Republicans, and little diversity or dispersion of scores. Therefore,
the variance of Republicans' ideology scores was a relatively low number, a
variance of .493. Therefore, Democrats were more divided in ideology (a higher
variance), and Republicans were more united on ideology (a lower variance).
Examples of previous test
questions on variance follow. The answers are immediately given in
each example A-E.
Test Question 4A. (5
points) The following two questions are based on the last three Mississippi
Polls, all conducted in the 21st century. Using the statistic of variance, are
Democrats or Republicans more divided on each of the following five variables:
- A. Affirmative Action: Democrats' variance is .813.
Republicans' variance is .544. Who is more divided, having a higher
variance? Democrats.
- B. Improving blacks' socio-economic conditions:
Democrats' variance is .480. Republicans' variance is .625. Who is more
divided? Republicans.
- C. Death penalty for murder: Democrats' variance is
.330. Republicans' variance is .184. Who is more divided? Democrats.
- D. Liberal-conservative ideology: Democrats' variance
is 1.340. Republicans' variance is 1.018. Who is more divided? Democrats.
- E. Rate President Bush's job performance: Democrats'
variance is .897. Republicans' variance is .715. Who is more divided?
Democrats.
Test Question 4B. (5
points) Using the statistic of variance, are whites or blacks more united on
each of the following five variables:
- A. Death penalty for murder: Whites' variance is .227.
Blacks' variance is .297. Who is more united, having a lower variance?
Whites.
- B. Liberal-conservative ideology: Whites' variance is
1.077. Blacks' variance is 1.565. Who is more united? Whites.
- C. Income: Whites' variance is 5.325. Blacks' variance
is 3.445. Who is more united? Blacks.
- D. Party Identification: Whites' variance is .207.
Blacks' variance is .106. Who is more united? Blacks.
- E. Support for government providing jobs and a good
living standard for all: Whites' variance is .776. Blacks' variance is
.441. Who is more united? Blacks.
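The comparison used in these test questions can be sketched in Python as follows: the group with the higher variance is more divided, and the group with the lower variance is more united. The function names more_divided and more_united are made up for this illustration. The variances are item A of Question 4A and item A of Question 4B.

```python
def more_divided(variances):
    """Group with the higher variance score."""
    return max(variances, key=variances.get)


def more_united(variances):
    """Group with the lower variance score."""
    return min(variances, key=variances.get)


# Question 4A, item A: affirmative action.
affirmative_action = {"Democrats": 0.813, "Republicans": 0.544}

# Question 4B, item A: death penalty for murder.
death_penalty = {"Whites": 0.227, "Blacks": 0.297}

print(more_divided(affirmative_action))
print(more_united(death_penalty))
```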