WEEK 5: SAMPLING AND SURVEY TYPES

 

A poll is a sample of the population that is designed to be representative of the entire population. As such, you may read about a national poll of only 1,000 people, but if it is conducted with correct scientific methods, it should be a pretty accurate representation of the entire population. Historically, though, pollsters have faced at least four problems (the first two of which are especially important to remember):

1) Biased samples. The sample does not accurately represent important characteristics of the population because it over-represents or under-represents a particular group. For example, the Literary Digest poll had been pretty accurate in previous presidential elections, but in 1936 it sampled its readers and car owners, and those groups were of a higher socioeconomic status (SES, with higher incomes and education levels) than the general population. President Franklin D. Roosevelt (FDR) was running for re-election on pioneering liberal economic policies that were popular with the lower but not the upper SES, so the poll had the Republican Landon beating him in a landslide. In fact, FDR ended up winning in a landslide, as the poll had a biased sample. As such, today we check our polling samples against census data and then weight the results by demographic characteristics to ensure that we have representative samples.

2) Time-bound polls. Polls are only accurate for the exact time that they were conducted, since people can change their opinions over time. Some notoriously inaccurate polls came in 1948, when pollsters stopped polling weeks before the election, at a time when incumbent Democrat Harry Truman was behind because of high post-World War II unemployment and the Russian occupation of Eastern Europe. Truman proceeded to conduct an aggressive whistle-stop campaign in which he blasted Republicans as caring only for the rich and big business and reminded voters that FDR and the Democrats had given them Social Security and protected their right to join labor unions. Truman won an upset election victory, and the next day he laughingly held up a Republican newspaper's (the Chicago Tribune) erroneous headline, "Dewey Defeats Truman." (Pollsters also made the mistake of using a quota sampling method, which underrepresented the lower SES, since it gave interviewers too much freedom to decide whom to interview.) Another example of a time-bound poll was in 1980, when Republican Ronald Reagan had only a slight lead in the polls over incumbent Jimmy Carter (who faced a bad economy and international disasters). Over the last weekend, the one debate sank into the thoughts of the undecideds, as Reagan rebutted Carter's claims that he was a conservative extremist by saying, "There you go again, Mr. President. Golly shucks. I didn't support Medicare in the 1960s because I supported the alternative free market plan called Eldercare." Reagan's closing statement was: "Are you better off today than you were four years ago? Can your paycheck buy as many groceries as it could four years ago? If so, vote for the incumbent. If not, vote for a change; give my program a chance." Also, the day before the election was the one-year anniversary of the Iranian hostage crisis, and the network news programs proclaimed: "Day 365. America held hostage." So the national polls that had stopped polling the Friday before the election were wrong, as Reagan won in a landslide and even brought in a Republican-controlled Senate. Two pollsters were correct, however: the candidates' own pollsters, who kept polling until the night before the election. And so when President Carter was flying down to Plains, Georgia, to vote in his hometown, his pollster walked up to him and said, "Mr. President, we've just finished our polling, and you have lost. You have lost big. Your party is going to lose the Senate, maybe even the House." And so poor Jimmy Carter got off the plane and tearfully told the crowd, "I hope I haven't let you down." So the moral is: keep polling right up to election day.

3) It is hard to estimate likely voters. Some people will lie and say they plan to be good citizens and vote, when they will not really vote. So pollsters ask a series of questions to determine who is likely to vote (called "likely voters"). They can ask about interest in the campaign, knowledge of where one's polling place is, and similar political interest and knowledge questions. The Mississippi Poll asks three questions to determine likely voters: reported likelihood of voting, with only the Definitely Will Vote category counted as likely voters (not the Somewhat Likely category); interest in the political campaigns; and ability to recall their U.S. House incumbent's name (only about half can recall the name without prompting). A further problem is determining who is likely to vote in a party's primary, when you have to ask about their party primary vote intention. In the 1991 Mississippi gubernatorial contests, one pollster estimated that about one-third of those voting on primary day would vote in the Republican primary, and it predicted that a moderate conservative (party-switcher state auditor Pete Johnson) would beat a strong conservative (life-long GOP activist and construction company owner Kirk Fordice). Consistent with history at that time, only about 10% of the votes were cast in the GOP primary, and they came from the most conservative and partisan Republicans, so Fordice won the nomination. The poll probably relied on a party identification question to determine the primary voted in, and since the real contest at that time was in the Democratic primary, many "weak" and "independent" Republicans voted in the Democratic primary (which was a slugfest between incumbent governor Ray Mabus and his challenger, Congressman Wayne Dowdy). If the poll had counted only strong Republicans as GOP primary voters, it would have gotten down to about the 10% turnout level, that would have been a more conservative group, and Fordice would have been favored. Fordice went on to upset Mabus in the general election, becoming the first Republican governor of Mississippi since Reconstruction (his blunt speaking demeanor was somewhat like Trump's).

4) Social desirability response bias. Many polls, particularly in the Rust Belt, underestimated Trump's vote in 2016. It is likely that Trump was so controversial that some of his voters weren't willing to admit that they planned to vote for him, so they said they were undecided or even said they planned to vote for Clinton. The press and Democratic politicians were labeling him a bigot (and continued to do so throughout his presidency). One of my students said that her father had told a pollster that he planned to vote for Hillary Clinton. "Why did you say that, Dad? You know you're going to vote for Trump." "Uh, I don't want anyone to get the wrong impression," he responded. This problem continued into 2020, as again some polls underestimated Trump's support. Our Mississippi Poll had a similar problem in 2014, as we greatly underestimated conservative activist Chris McDaniel's support in the GOP primary against incumbent Senator Thad Cochran. In that case, we may have had a biased response rate, as some conservatives simply refused to answer a poll from a perceived "liberal university," so McDaniel's supporters may have been underrepresented in the sample. After McDaniel led by 1% in the first GOP primary, Cochran had to fight for his life to barely win the runoff, and he then went on to easily win the general election against a Democrat (our poll correctly called the general election outcome). This is a hard problem to deal with, and the media and politicians who keep engaging in name-calling against candidates just make the lives of pollsters harder. Indeed, these kinds of problems led me to discontinue the Mississippi Poll project until we could get a better handle on this problem. The polls in the 2022 midterm election were more accurate, but polling experts are still skeptical of the accuracy of today's polls for reasons we will discuss later.

 

Sampling Error. Fox News conducted an election poll on December 10-13, 2023, and its results were similar to other polls taken at the end of that year. It sampled 1,007 registered voters and had a sample error of plus or minus 3%. The poll asked respondents: "If the 2024 presidential election were held today, how would you vote if the candidates were Democrat Joe Biden and Republican Donald Trump?" The results were 50% for Trump and 46% for Biden, with 4% giving other responses. With a 95% confidence level, this means that if the entire population of registered voters in the United States had been polled, it is 95% likely that Trump's support would have been between 47% and 53%, and Biden's support would have been between 43% and 49%. Given the closeness of the poll and the size of the sample error, the race is too close to call. While Trump is favored, it is possible that Biden is actually ahead in the entire population, since he might have as much as 49% support and Trump might have as little as 47% support. Also, since we use the electoral college system, pollsters need to conduct polls in each American state. And remember that public opinion changes over time, as trailing Harry Truman's case in 1948 showed.
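
To make the arithmetic concrete, here is a minimal sketch (in Python, as an illustration; the reported plus-or-minus 3% is the rounded worst-case value) of how that sample error comes out of the standard margin-of-error formula at a 95% confidence level, using the poll's own numbers:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for a proportion p with sample size n (z = 1.96 for 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1007  # registered voters in the Fox News poll
for name, p in [("Trump", 0.50), ("Biden", 0.46)]:
    moe = margin_of_error(p, n)
    print(f"{name}: {p:.0%} +/- {moe:.1%} (interval {p - moe:.1%} to {p + moe:.1%})")
# Trump: 50% +/- 3.1% (interval 46.9% to 53.1%)
# Biden: 46% +/- 3.1% (interval 42.9% to 49.1%)
```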

Three things affect the level of sample error:

1) Sample size. The larger the sample size, the smaller the error. So it is better to interview 1,100 people (yielding only 3% error) than 400 people (5% error). It's like flipping a coin: the more tosses, the closer you get to the 50-50 split of a two-sided coin, heads and tails.

2) Homogeneity of population. The more united the population is on an issue, the smaller the sample error. A sample of 400 from a population evenly divided on an issue (half want to vote Republican and half Democratic) yields a 5% sample error. A sample of 400 from a population that is overwhelmingly on one side of an issue, by a 90-10 split, yields only a 3% sample error. (Such a question might be: are you proud of your nation?) See the sample error chart previously mentioned.

3) A cluster sample produces higher sample error, about 20% higher. A cluster sample is one in which the people in your poll sample are not independently selected from each other. For example, in-person surveys (and even two-stage random digit dialing phone surveys) may randomly choose up to 5 people on the same city block, and since those people may share similar SES and race characteristics, they are not independently selected. As such, your effective sample size is not as great as you think, so you have more sample error. When our Mississippi Polls used two-stage random digit dialing, we began by sampling 600 people. The chart, at a 50-50 worst-case split, gave a 4.1% sample error. We added 20% to that: 4.1 X .2 yields .82 additional error. So the total sample error for our cluster sample was about 5% (4.1 + .82 = 4.92), not 4.1%, and 5% is what we reported. A short code sketch after this list illustrates all three factors.
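
Here is a short sketch illustrating all three factors with the same margin-of-error formula; the flat 20% cluster penalty follows the Mississippi Poll practice described above (the chart's 4.1% for n = 600 reflects slightly different rounding than the formula's 4.0%):

```python
import math

def moe(p, n, cluster=False, z=1.96):
    """95% sample error for proportion p and sample size n; +20% if clustered."""
    base = z * math.sqrt(p * (1 - p) / n)
    return base * 1.2 if cluster else base

print(f"n=400,  50-50 split:      {moe(0.5, 400):.1%}")                # ~4.9%, about 5%
print(f"n=1100, 50-50 split:      {moe(0.5, 1100):.1%}")               # ~3.0%
print(f"n=400,  90-10 split:      {moe(0.9, 400):.1%}")                # ~2.9%, about 3%
print(f"n=600,  50-50, clustered: {moe(0.5, 600, cluster=True):.1%}")  # ~4.8%, close to the ~5% reported
```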

 

Advantages and disadvantages of three major types of surveys historically used by pollsters:

In-person surveys (where pollsters go door-to-door to survey people). They use a multi-stage cluster design, discussed later.

Advantages:

1) You can observe the respondent, and clear up any confusion that shows on their face. You can slowly ask the question a second time, but you can’t add any more information to the question. So I see this as a limited advantage.

2) You can obtain objective information about the respondent (for example, if they say they have a low income but there is a new Cadillac in their driveway, you can note their likely untruthfulness). Again, you get limited information, however, since few important issues can be physically observed.

3) You can use visual aids. When respondents rank candidates on a 100-point feeling thermometer in terms of how warm (like them) or cold (dislike them) they feel toward each candidate, you can show the respondent an actual thermometer. You can do a card sort for five ideological groups: have five boxes with labels ranging from very liberal to very conservative, give respondents cards with the pictures of the candidates, and then have them put each card in the appropriate box to measure their perception of each candidate's ideology. We have been able to use the ideological self-identification question and the perception of candidates' ideologies questions in our Mississippi telephone polls by having interviewers slowly repeat the questions, so I see this also as only a limited advantage.

Disadvantages of in-person surveys (big problems in my view):

1) Very Expensive. You have to pay the cost of interviewers traveling across the country (for national surveys), the cost of their staying at motels, and the cost of their individual time doing all of that. Plus, this process can take up to two months for all of the interviews to be conducted. Such national in-person surveys can cost hundreds of thousands of dollars, or even millions.

2) Safety of the interviewer may be endangered. Your interviewers are going into all kinds of neighborhoods that may have high crime, neighborhoods that even a pizza delivery company may refuse to enter. You may face legal liability if they are hurt.

3) Interviewer fraud. It may be hard to monitor your interviewers, as they work on their own. They may falsify some of their surveys. If someone is interviewing in New Orleans, they may be in a bar on Bourbon Street and filling out the forms themselves (Uh, this is a liberal, Democrat, voted for Biden. Uh, this next survey is a conservative Republican, loves Trump. Etc.). You have to have some way of verifying that they actually interviewed a real person, perhaps by having them report the phone number or address of each respondent, then contacting that person to ensure that the survey was actually done (You then have to remove such identifying information from each survey to ensure that the results are anonymous.).

Telephone surveys (where pollsters sit in a room, and call people). The specific methods used are discussed later.

Advantages (very good advantages, so today most surveys are done by phone):

1) They are Quick. You can complete your entire survey in a few days or a few weeks. You can use as many phones as the room will contain and as many interviewers as you can hire. Typically, you phone from 5:30-9:30 PM on weeknights and most of the day on Saturdays and Sundays. You could technically even complete a survey in one day, but that is not advisable, since you need to call back people whom you couldn't reach so that you have a representative sample (don't leave out people who are busy with work or with their social lives, as they may have distinctive attitudes).

2) They are very Cost Effective. You do not have to pay travel expenses; you are just paying for the interviewers' time when they are on the phone or dialing. You also have the cost of the computers and their upkeep, the cost of buying the phone numbers from a marketing firm, and the expense of the telephone calls themselves. When we did the Mississippi Poll, our cost was as little as $3,000 (for the marketing firm phone numbers plus the long-distance charges); students in Political Analysis did the calls as their lab requirement, and two professors donated their time supervising and maintaining the computer equipment.

3) You can Eliminate Fraud, due to the centralized interviewing. As supervisor of the Mississippi Poll, I would walk around the room and listen to the student interviewers, so it was clear that they were actually doing the interviews with real people. If you hire a supervisor, make sure that they do their job, and that they don’t study for their graduate courses or socialize with a best friend.

4) Interviewer Safety. Interviewers work in one (or two) rooms, so there is no danger of their going into unsafe neighborhoods. Our Mississippi Poll polling was done at the SSRC (Social Science Research Center) in the Research Park at MSU, and interviewers could park literally 20 feet from their polling room, so even the walk from their car to the building was safe (watch out turning into the Research Park, though, as it is a high-traffic area).

Disadvantages of telephone surveys:

1) Historically, we worried about excluding people without telephones or, later, those with only cell phones. When we started polling in 1981, only about 80% of Mississippi households had telephones, so we worried about undersampling the lower SES. By the turn of the century, phone coverage was up to 98% of households, even in Mississippi. But our methodology sampled only landlines, so the rise of cell phones eventually led us to greatly undersample young adults. Our current methodology includes cell phones as well as landlines, so this is no longer a big disadvantage.

2) You can't use visual aids, so you are entirely dependent on the voice of the interviewer. This hasn't been a real problem for the Mississippi Poll, as the interviewer just slowly repeats the complex questions or their response categories, such as very liberal, somewhat liberal, moderate or middle of the road, somewhat conservative, very conservative. Race and gender of the interviewer have not been a real problem in causing social desirability responses or refusals. I thought we'd have a problem with a real "country" white guy who yelled into the phone when asking questions, but he ended up having a high response rate (I guess he sounded like a regular guy: hunter, fisherman, whatever). So this is also not a big disadvantage. As you can see, telephone interviewing has many advantages and few disadvantages.

Mail surveys (in our NSF grassroots party activists study, we used bulk mail and relied on three waves of mailings if people didn't return them).

Advantages:

1) They are Cheap. There is no cost for interviewers. You pay only for paper, printing, envelopes, and postage (including return postage). If you can require that respondents use a number 2 pencil, then the returned forms can be read into a machine that produces an electronic database, so there is no cost of typing in all of the survey responses. You might add the cost of a graduate student to process the mailings, which we did for our NSF grants studying grassroots party activists in Mississippi.

2) You can use mail surveys for some Specialized Populations. A specialized population is more interested in the subject of the survey, so they are more likely to complete the surveys. Thus, we used the mail surveys for the NSF grants and sent them to each county party’s chair and its committee members. It is not generally advisable to use mail surveys for the general population, for the following reasons cited under disadvantages.

Disadvantages (big problems with mail surveys):

1) They exclude illiterate people, unless someone in the home reads the survey to them, which may seldom happen. So you can end up with a biased sample.

2) You can’t control who answers the survey. You may want to randomly select adult men and women, but a man may get the survey, and he may just give it to the woman of the house to answer. Or the adult may give the survey to their teenage kid (“Hey, you’re taking civics. Why don’t you fill out this thing?”). So, the person answering the survey may not even be eligible, age-wise.

3) You can’t control the order of the questions asked. We may prefer that respondents answer general life questions first, then a vote choice item, then specific issues and candidate trait items, and lastly sensitive personal items such as their income and race. We want them to answer the issues and candidate trait items after the vote choice, so that their vote is not influenced by what the researcher thinks are the important issues. But unlike the other two methods of surveying, in mail surveys the respondent can look through the entire questionnaire before answering, so they can answer out of order, so to speak. That can influence the results.

4) Mail surveys are slow, as they take months to complete. With a 3-wave method to increase your response rate, you may send a second mailing to those who don't respond to the first mailing, and that second mailing may go out a month after the first. The third wave may be sent a month after the second. So it is a slow process. Indeed, how valid is the process, given that public opinion can change over time, so that different people are responding at different time points?

5) Incomplete forms. What do you do if they leave out a whole page, or skip individual items? If they skipped items, we assumed that they just had no opinions on those items, or refused to answer. If they left out a whole page, we’d xerox that page, and send it back to them, and politely ask them to complete it and send it back to us. Only half at best sent it back to us. A general low response rate was not a problem for us, however, since we used a 3-wave mailing system; on the first wave we might get a 35% response rate; the second wave might yield an additional 10%; the third wave might yield 5%; so the total response rate might be 50%, which was pretty good back at the turn of the century (when we did the NSF study).

 

Sampling Techniques:

1) Multi-stage Cluster Sampling. It is usually used in in-person national surveys, where a national list of the population is typically not available. The stages are as follows: start with PSUs (Primary Sampling Units), such as U.S. House districts, and randomly select perhaps 100 of them; within each, choose a city (or cities) or a rural area; for each city, randomly select 3 city blocks; for each city block, send the interviewer to list every housing unit and randomly select 5 of them; the interviewer then visits each selected household, gets the first name of each adult in that household, randomly selects 1 to interview, and comes back if that person is not home. In this example, the total sample size is 1,500 (100 X 3 X 5 X 1 = 1,500).
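
A minimal sketch of that selection logic, assuming a hypothetical nested sampling frame (districts -> blocks -> households -> adults); a real frame would come from census maps and the interviewer's on-site listing of housing units:

```python
import random

def multistage_sample(districts, n_psu=100, n_blocks=3, n_households=5, seed=None):
    """Randomly select PSUs, then blocks, then housing units, then one adult each."""
    rng = random.Random(seed)
    respondents = []
    for psu in rng.sample(sorted(districts), min(n_psu, len(districts))):
        blocks = districts[psu]  # city blocks within this PSU
        for block in rng.sample(sorted(blocks), min(n_blocks, len(blocks))):
            households = blocks[block]  # housing units listed by the interviewer
            for hh in rng.sample(households, min(n_households, len(households))):
                respondents.append(rng.choice(hh["adults"]))  # one adult per household
    return respondents  # with full-size frames: 100 X 3 X 5 X 1 = 1,500
```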

2) Telephone Survey Sampling. Historically, with telephone directory sampling, you might have a directory of 50 pages with 4 columns on each page; if you want a sample size of 400, you randomly select 2 names from each column (50 X 4 X 2 = 400). The problem with phone directory sampling is that unlisted numbers are not included, nor are people who just moved into the community, so you can have some sampling bias. We corrected for this problem by using random digit dialing, based on a table of random numbers or computer-generated random numbers, which includes those left out by directory sampling. The Mississippi Poll in the 1980s and 1990s used Two-Stage Random Digit Dialing, whereby phone numbers in eligible telephone exchanges (the first 3 digits of a phone number; about 350 exchanges were used in the state) were randomly generated and dialed, and 80% of the time the result would be a non-existent number. But when you got a real household, you would then reuse the first five digits up to five more times, with new random last two digits attached to each; this method was used because the phone company tended to assign phone numbers in adjacent blocks, so non-existent numbers dropped to 25%. This was more efficient in terms of interviewer time, but it is a form of cluster sampling, as numbers sharing the same first five digits come from the same community. After the turn of the century, we just purchased "working household numbers" from a marketing firm, which guaranteed that 90% of the phone numbers would indeed be real households.
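
Here is a rough sketch of that two-stage procedure; the exchange list and the is_working_household function are hypothetical stand-ins for the real exchange list and for actually dialing the number:

```python
import random

EXCHANGES = ["324", "325", "662"]  # illustrative eligible 3-digit exchanges

def two_stage_rdd(is_working_household, cluster_size=5, rng=random):
    """Stage 1: dial fully random numbers in eligible exchanges until a real
    household answers (~20% hit rate). Stage 2: keep the first five digits and
    attach new random last two digits (~75% of these are working numbers)."""
    while True:
        number = rng.choice(EXCHANGES) + f"{rng.randint(0, 9999):04d}"
        if is_working_household(number):  # stage-1 hit
            stem = number[:5]             # first five digits of the 7-digit number
            cluster = [stem + f"{rng.randint(0, 99):02d}" for _ in range(cluster_size)]
            return [number] + cluster     # one cluster of numbers to dial
```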

3) Sampling Within the Household. The Troldahl-Carter method ensures a random selection among the adults in each household by asking two questions: how many adults live in the household, and how many of them are men. It has at least 4 versions of a selection table (in a 2-adult household with one male and one female, 2 of the versions would tell the interviewer to interview the man, and 2 would say to interview the woman). The Mississippi Poll used 12 versions of the selection table, thereby permitting up to a 4-adult household to be accurately represented. The biggest problem with this method was interviewee resistance to providing such personal information at the outset of the survey, so interviewers took a few minutes to explain why we needed that information. We then turned to a method that sociologists used, the Last Birthday Method, which asks to speak to the adult of the household who had the most recent birthday. The problem with that method is that most people would not take the time to figure out who had the most recent birthday, so the person answering the phone would just say "It's me" about 90% of the time, and since women more often answered the phone, we got an oversampling of women. We then corrected for that problem by Weighting the Sample, which comes later.
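
The Last Birthday Method's selection logic is simple enough to sketch; the household data here are hypothetical, and in practice the respondent (not a computer) does the selecting:

```python
from datetime import date

def last_birthday_adult(adults, today):
    """Select the adult whose birthday most recently passed."""
    def days_since_birthday(bday):
        anniversary = bday.replace(year=today.year)
        if anniversary > today:  # birthday hasn't happened yet this year
            anniversary = anniversary.replace(year=today.year - 1)
        return (today - anniversary).days
    return min(adults, key=lambda a: days_since_birthday(a["birthday"]))

household = [{"name": "Pat", "birthday": date(1970, 3, 14)},
             {"name": "Lee", "birthday": date(1965, 11, 2)}]
print(last_birthday_adult(household, today=date(2014, 4, 1))["name"])  # Pat
```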

 

Demographic Groups Historically Undersampled by Telephone Polls:

1) High school dropouts. They are very busy at blue-collar jobs, so they often lack time, and they may lack interest in or knowledge of politics and government. They may also feel that their opinions aren't important or valued by college-educated pollsters. As a result, we end up oversampling the college educated by nearly double. In the 2014 poll, 38% of our sample were college grads or higher, compared to only 21% of the adult population (census data). Our sample consisted of 12% high school dropouts and 23% high school educated, compared to census figures of 18% dropouts and 30% high school grads. (In 2010, though, we accurately sampled high school grads, but not dropouts.)

2) Lower income people. Again, they are very busy at work and often lack political interest or knowledge. Years ago, they were less likely to have the money for a phone. They tend to have lower education levels.

3) African Americans and other racial minorities. These groups tend to have a lower SES level. I was initially concerned that in Mississippi there might be a fear of answering sensitive political questions given the state's troubled racial history, but by the 1981 poll this did not seem to be a problem. In the 2014 poll, 29% of our sample was African American, compared to 35% of the adult population, and this level of undersampling minorities was about the same in previous polls.

4) Men are repeatedly undersampled. They are hard to get: less likely to be home, often less verbal, and more assertive in refusing to answer surveys. Plus, the last birthday method gets fewer men. In the 2014 poll, only 39% of the sample were men, compared to 47.5% of the adult population.

5) Young adults, as they are socially active, busy, mobile in residency, and less interested in politics than older groups. In the 2014 poll, 16.5% of the sample were under age 30, compared to 23% of the adult population. The last survey in which we relied solely on landlines saw a huge problem, as only 6% of the sample were under 30, compared to 23.5% of the population. So today we combine both landlines and cell phones, as do other pollsters.

6) Older adults historically were undersampled, as they were more likely to be deaf, to have health problems preventing long phone conversations, or to be institutionalized in nursing homes. Today this is less of a problem, as it is easier to represent those over 60 than those under 30. Indeed, in 2014, 32% of our sample was over 60, compared to 26% of the adult population.

 

Weighting the Sample to ensure a representative sample. This involves multiple stages.

1) You should determine during the survey how many phone numbers the respondent could have been reached at, and how many adults had access to those phones. Historically, you would create a Weight variable using the SPSS computer package equal to the number of adults in the household divided by the number of different telephone numbers. Thus, in a two-adult household, the numerator would be 2, twice that of a one-adult household; that compensates for each person in the 2-adult household having only half the chance of being included in the survey compared to the person in the 1-adult household, since we interview only one person in each household. If a person's household could be reached by dialing two different phone numbers, the number 2 would go into the denominator; that person had double the chance of being called compared to a person with only one telephone number (only 5% or less of the population had more than one phone number, however), so their response would be cut in half. So that is the first weight variable.
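
A minimal sketch of that first-stage weight, translated into pandas for illustration (the poll itself used SPSS; the column names here are hypothetical):

```python
import pandas as pd

# Hypothetical respondent data gathered during the survey.
df = pd.DataFrame({
    "respondent": ["A", "B", "C"],
    "adults_in_household": [2, 1, 2],  # adults with access to the phone(s)
    "phone_numbers": [1, 1, 2],        # distinct numbers that reach the household
})
df["weight1"] = df["adults_in_household"] / df["phone_numbers"]
print(df)
# A gets weight 2.0 (two adults, one line); C's second line cuts that back to 1.0.
```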

2) You now compare the weighted sample with census data to see how representative your sample is. You then create a Weight2 variable that is equal to Weight1 * (a correction factor based on the category of the variable that you are correcting for). In 2010 we so undersampled young adults that we corrected for age first. We used the SPSS Transform, Compute menu. The target variable was Weight2. The numeric expression was Weight1 * (the population % of the age category divided by the sample % of the age category). At bottom left, use If, then Include If for whatever age group you are correcting for. For example, if age < 30, the factor is 23.5/5.9, the population percent divided by the sample percent. That value gives each young adult in the sample close to 4 votes, so the sample percent of young adults will equal the population percent of young adults. You have to do that for each age category you are using. You then compare the Weight2 demographic frequencies of your sample with the census data, and you will find that age is corrected for. You then have to correct for the other problem demographics. We did sex next, so Weight3 = Weight2 X (male or female correction). Weight4 corrected for education, and Weight5 corrected for race. Weight6 just took Weight5 and divided it by whatever constant was necessary to get the sample size back to what it originally was. For instance, if you originally had 600 people in your sample, but your weighted sample now had 1,200 (as it would if you had all 2-adult households), Weight6 would be Weight5 X .5. That ensures that you don't mislead a reader or get statistically significant results that don't really exist (because of an inaccurately inflated sample size).
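
The same sequential corrections, translated from the SPSS menu steps into a pandas sketch; the helper function and the census shares in the commented calls are illustrative, not the poll's actual code:

```python
import pandas as pd

def correct(df, weight_in, weight_out, column, population_shares):
    """Multiply each group's weight by (population % / weighted sample %)."""
    sample_shares = df.groupby(column)[weight_in].sum() / df[weight_in].sum()
    factors = {g: population_shares[g] / sample_shares[g] for g in population_shares}
    df[weight_out] = df[weight_in] * df[column].map(factors)
    return df

# Sequential corrections mirroring the Weight2-Weight6 steps in the text
# (the share dictionaries would come from census data; under-30s in the
# 2010 example get roughly 23.5/5.9, or about 4 votes each):
# df = correct(df, "weight1", "weight2", "age_group", {"under30": 0.235, ...})
# df = correct(df, "weight2", "weight3", "sex", {...})
# df = correct(df, "weight3", "weight4", "education", {...})
# df = correct(df, "weight4", "weight5", "race", {...})
# df["weight6"] = df["weight5"] * len(df) / df["weight5"].sum()  # restore original N
```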

 

How accurate are polls? Despite the problems with recent Trump-era polls, polls in this century have generally been pretty accurate. We often gauge accuracy from presidential election results, since that is a measurable event that we can compare poll results with. In examining state polls of the presidential race in 2004, for example, polls were most accurate if they were conducted as close to the election as possible, such as within five days of it. Another way of increasing accuracy is to combine the results from multiple polls, since you are basically increasing your sample size and thereby reducing your sample error. That is why in this class, the student papers will usually combine the three most recent state polls.
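
A quick sketch of why combining polls helps: pooling three polls of about 1,000 respondents each behaves roughly like one poll of 3,000 (this treats the polls as comparable simple random samples, which is a simplification):

```python
import math

def moe(p, n, z=1.96):
    """95% margin of error for a proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"One poll of n=1000:         +/- {moe(0.5, 1000):.1%}")  # ~3.1%
print(f"Three polls pooled, n=3000: +/- {moe(0.5, 3000):.1%}")  # ~1.8%
```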

 

Check out the Mississippi Poll website. One informative link on that page is a summary of the methods (already mentioned) used in the polls, which extend from 1981 through 2014. It shows yet another problem with polling: declining response rates. From 1981 through 1990, the response rates were over 70%. From 1992 through 1999, they were in the 60s. From 2000-2006, they were about 50% (meaning that half of the people we wanted to interview refused to even talk to us). In 2008-10, the rate was about 41%. By 2012-14 it had fallen to 26-31%. A low response rate can cause validity problems, since one wonders whether the sample is indeed representative of the population. Of course, you can weight the sample by demographic characteristics, but is a weighted male who participates the same in attitudes as a male who refuses to be interviewed, for example? My Public Opinion class textbook pointed out that this is a major and increasing problem nationally, as polling response rates have fallen as low as 9%. One recent national poll even had a response rate of 1.5%! Polls have become so controversial that the RealClearPolitics website has started to rate the accuracy of polling organizations. Most organizations do not clearly explain their methodology, such as weighting or determining likely voters, and I suspect that the 2022 polls were more accurate because they weighted by the expected turnout of party identification groupings. In short, polling is a very complex subject, though I have had students who got jobs that used campaign or issue polling or analysis of polls.

Oh, by the way, an excellent source of political polls is the RealClearPolitics website. That website also has an informative daily sampling of ideologically diverse news stories and analyses.