Surveys – Why Averages Aren’t Enough

We recently had the opportunity to review a survey performed by a group of graduate students, who had volunteered, in fulfillment of a class assignment, to help a prominent nonprofit organization. The survey featured a series of appropriately worded Likert items, in which respondents were asked their level of agreement with several statements.

Likert questions are commonly used in surveys because they measure a respondent’s agreement or disagreement with a statement in balanced and relatively equal increments on both sides of a “neutral” response. This helps determine not only whether a respondent agrees or disagrees with a statement, but also how strongly the respondent feels in either direction. A typical Likert-style survey item might be “The classic American doll ‘Barbie’ presents women in a favorable light.” Typical wording in a 5-point series of response options might read: “strongly disagree,” “disagree,” “neither agree nor disagree,” “agree,” and “strongly agree.”

It was obvious that the items in this survey had been very carefully worded; the meaning of each item was quite clear. Imagine, for the sake of this blog post, that a statement might have read something like this: “At the end of the Dr. Seuss book How the Grinch Stole Christmas, the Grinch is portrayed in a favorable light.” (It’s pretty easy to decide “absolutely” or “yeah, maybe” or “no way!” on that one, huh?)

The survey was fielded with SurveyMonkey, and invitations were sent only to people who had opted into receiving communications from this particular nonprofit. Several hundred people responded, and the demographic profile of this group of respondents was very similar to that of the general population that the nonprofit serves. So far, so good.

But when it came time to analyze the data, the students reported the results exclusively in terms of mean responses. To see what that means: if the response options are numbered one through five, and the same number of people choose each option, then the “average” response is 3, and “3” would indicate that the group, overall, is neutral. The students then based their recommendations on the statements that had the highest mean responses, that is, those that respondents rated most favorably. Big mistake.

The problem with their analysis was that it failed to identify issues that subgroups felt very strongly about.

Imagine a survey fielded by a neighborhood restaurant that is trying to decide which items to put on its breakfast menu. Suppose the mean response to the statement “I would order pancakes regularly if they were on the menu” was “3,” or “neither agree nor disagree.” These students might have recommended pulling pancakes from the menu in favor of a food item whose mean response was 3.75. But imagine that, out of 100 respondents, 50 said “strongly disagree” and 50 said “strongly agree.” In that case, pancakes are a sure bet, because 50% of your customers would order them.
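For readers who like to see the arithmetic, here is a minimal Python sketch of the pancake example. The 50/50 pancake split comes from the scenario above; the distribution for the competing item (25 “neither” and 75 “agree,” which happens to average 3.75) is purely an assumption made up for illustration, as are the names `summarize`, `pancakes`, and `other_item`.

```python
from collections import Counter
from statistics import mean

# 5-point Likert scale, coded 1 through 5.
LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neither agree nor disagree",
    4: "agree",
    5: "strongly agree",
}


def summarize(responses):
    """Return the mean response and the share of respondents per option."""
    counts = Counter(responses)
    total = len(responses)
    shares = {score: counts.get(score, 0) / total for score in LABELS}
    return mean(responses), shares


# Pancake item from the example: 50 strongly disagree, 50 strongly agree.
pancakes = [1] * 50 + [5] * 50

# Hypothetical competing item (assumed, not from any real survey):
# 25 neutral and 75 agree, which gives a mean of 3.75.
other_item = [3] * 25 + [4] * 75

for name, responses in [("pancakes", pancakes), ("other item", other_item)]:
    avg, shares = summarize(responses)
    print(f"{name}: mean = {avg}")
    for score, share in shares.items():
        print(f"  {score} ({LABELS[score]}): {share:.0%}")
```

Ranked by mean alone, the other item wins. Looking at the full distribution, though, half of the respondents strongly agree about pancakes, while nobody feels strongly about the other item in this made-up case.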

The point is this: the purpose of surveys is not only to identify how a group of people feels, on average. It is also to find out which issues are strongly supported (or opposed) by subgroups. Then, by factoring the strength of support for a given issue and the size of the market represented by that subgroup, you can size the opportunity. And you’ll never get there if you work only from averages.
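As a rough illustration of sizing the opportunity, the sketch below multiplies the share of respondents at or above a chosen level of agreement by the size of the market. Everything here is an assumption for illustration: the function name `size_opportunity`, the choice to count only “strongly agree” responses, and the market of 2,000 customers are all hypothetical, and a real analysis would also need to consider how well the respondents represent that market.

```python
def size_opportunity(responses, market_size, threshold=5):
    """Estimate how many people in a market support an item.

    responses   -- Likert scores coded 1 through 5
    market_size -- assumed number of people in the market
    threshold   -- lowest score counted as support (5 = strongly agree only)
    """
    supporters = sum(1 for score in responses if score >= threshold)
    share = supporters / len(responses)
    return share, share * market_size


# Pancake example: 50 strongly disagree, 50 strongly agree.
pancakes = [1] * 50 + [5] * 50

# Hypothetical market of 2,000 customers (an assumed figure).
share, estimated_customers = size_opportunity(pancakes, market_size=2000)
print(f"Share who strongly agree: {share:.0%}")
print(f"Estimated customers: {estimated_customers:.0f}")
```

Lowering `threshold` to 4 would count “agree” responses as well, which is a judgment call about how much strength of feeling you want to require before counting someone as a likely customer.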