Faunalytics worked with Animal Charity Evaluators to compile the following set of best practices and advice for animal advocates when conducting surveys and other research.
There are two main areas of consideration when designing studies and accompanying surveys:
- Study Design
- Survey Design
Study design is the process of planning a survey or other data-collection effort. It considers who will be surveyed, when they will be surveyed, how they will be approached, how the data will be analyzed, and other procedural issues. Study design can be complex, for instance if the research goals involve following a group of participants over a ten-year period, or quite simple, if the goal is to get immediate feedback about what most interested viewers of a presentation.
Every aspect of a study should be guided in part by the purpose of the study. It is often easiest to think of the purpose of the study in terms of a list of questions that it should answer. Examples include:
- “How many animals are we saving with this activity?”
- “Should we include a health argument in our new video?”
- “Are our Facebook followers really involved with our organization?”
If there are many questions that the study should answer, it is helpful to rank them in order of priority. The study can be designed to answer the most important questions as accurately as possible, and other questions can be addressed within that framework.
If you want to answer questions about the long-term impacts of a program, consider surveying the same group of participants multiple times over the study period. This is called longitudinal research. Because very little longitudinal research has been done on vegetarianism and related issues, conducting this sort of study would be especially helpful to the movement as a whole.
The population of (potential) participants is the group or groups of people who need to be surveyed in order to fulfill the purpose of the study. It is natural to start with the pool of people who are easiest to reach. An organization looking to evaluate the effects of an online video might start with the idea of polling people on its email list, because these people are easy to reach. If the goal were to find out how many people on the email list forwarded the video to friends, this would be a good choice of participant population. However, if the goal were to understand what percentage of people who watched the video took an offline action recommended by it, this choice would give incomplete and possibly misleading information. And even if the procedure were amended to somehow include all viewers, it still couldn't determine how many took the action because of the video and how many took it by coincidence. To know this, a random sample of potential viewers should be assigned not to see the video, and surveyed anyway. (This is a control group.)
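To illustrate, here is a minimal sketch of random assignment in Python; the subscriber list and the even split are hypothetical choices, but randomly dividing the pool before anyone sees the video is the essence of creating a control group:

```python
import random

# Hypothetical list of subscribers who would otherwise all receive the video.
subscribers = ["a@example.org", "b@example.org", "c@example.org", "d@example.org"]

random.seed(42)            # fixed seed so the assignment can be reproduced
shuffled = subscribers[:]  # copy, so the original list is left untouched
random.shuffle(shuffled)

half = len(shuffled) // 2
treatment = shuffled[:half]  # shown the video, then surveyed
control = shuffled[half:]    # not shown the video, but surveyed anyway

print("treatment:", treatment)
print("control:  ", control)
```

Because the split is random, differences between the two groups' later responses can be attributed to the video rather than to pre-existing traits of the people who happened to watch it.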
Some questions that might help in determining whether a participant population is appropriate include:
- “Does this population include everyone we are trying to influence or whose opinion we care about?”
- “If we are trying to compare two activities, do the people we are asking about the first activity differ in relevant ways, before the activities occur, from the people we are asking about the second?”
- “If we’re trying to measure the total effect of our actions, how do we know what would be happening if we did not act?”
- “Are we giving people a choice of whether to be in the population or not? Does giving them that choice suit our purpose?”
The timing of a survey also affects the information it provides. People are most accurate when answering questions about the present or the recent past, not the future or the distant past. Therefore, a study seeking to understand changes in behavior over the course of a year would ideally survey the same participants at the start and end of the year, to reduce the need for accurate recall of the distant past. If only one survey could be conducted, preferably it would be at the end of the year, since people are better able to recall the past than to predict the future.
The timing of the survey can also affect other factors, like the response rate and the perception of the study’s purpose. A survey administered to a captive audience will have a high response rate, making it less subject to response bias and more representative. However, it may also be more difficult to obscure the purpose of such a survey, leading to biased responses.
Whether a study uses a survey given in person, by telephone, or online is often determined by logistical considerations. However, sometimes a choice is available. Each method will reach a slightly different population. Respondents may be more frank in an online survey or automated telephone survey, and less frank when a real person is surveying them. For some involved questions, a real surveyor can provide cues and clarifications that allow respondents to give more accurate answers. The medium of the survey can also affect the response rate: in-person surveys (especially of captive audiences in lines or in classes) have higher response rates than online surveys, which respondents can delay or decline without disappointing a visible person.
When a survey is conducted, subjects are often deliberately provided with certain information about the survey, such as the identity of the organization conducting it and its purpose. If this information is not provided explicitly, respondents will often develop their own guesses. Since answers and response rates can be affected by what subjects perceive to be the purpose of the study and by how they view the organization conducting it, it is important to consider what information given to respondents explicitly answers these questions or hints at answers to them. To reduce potential sources of bias, organizations can adjust how surveyors present themselves to potential respondents and what materials and information, if any, are provided immediately before the survey, as well as the text and questions of the survey itself. In particular, it may be helpful to avoid giving the impression that the surveyors are connected to an animal advocacy group.
For large-scale studies, it may be helpful to conduct a pilot before investing large amounts of effort in carrying out an unfamiliar study design. A pilot is a small trial of the study, using the intended protocol on just enough respondents to provide an estimate of the response rate and rough estimates of whatever the study is intended to measure. Especially when a study involves significant resources, a pilot can help prevent waste by identifying problems in the study design. A pilot can also help determine how large the study needs to be, by providing estimates of the size of the effects the study is trying to detect and of the number of subjects who will eventually respond.
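As a sketch of how pilot estimates can feed into sizing the full study, the following Python example uses the statsmodels package; the 4% and 7% figures are invented pilot estimates, not data from any real program:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical pilot estimates: 4% of the control group and 7% of the
# treatment group report the behavior the study is trying to detect.
p_control, p_treatment = 0.04, 0.07

# Convert the difference in proportions into Cohen's h, a standard effect size.
effect = proportion_effectsize(p_treatment, p_control)

# Respondents needed per group for 80% power at a 5% significance level.
n_per_group = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"roughly {n_per_group:.0f} respondents needed per group")
```

If the required sample turns out to be larger than the organization can reach, that is itself a useful finding from the pilot: the design may need a larger pool, a bigger expected effect, or a different question.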
The analysis phase of a study is the phase after data collection, when the results are used to answer the original questions of the study. There are two main study purposes, and they call for different analysis styles.
An exploratory study seeks to discover completely new information. For instance, if an organization has no idea what aspects of its leaflets are most interesting to readers, it might conduct an exploratory study to get some ideas about how readers perceive the leaflets. In this case, many of the questions asked would be open-ended, and the data analysis would also be open-ended. An analysis procedure might not be specified ahead of time, and typically a large number of possible findings would be considered. As a result, any relationships that appear significant should still be considered somewhat tentative until confirmed with further information. Qualitative research methods are often useful in this situation: free-response questions and focus group discussions are hard to tabulate, but they allow responses that might not be possible to collect through multiple-choice questionnaires.
Instead of seeking to discover completely new information, a study could begin with a hypothesis (perhaps generated in a previous exploratory study) that it seeks to test. In this case, it is important not to run an open-ended data analysis. For instance, if a study intended to show that students who heard a humane education lecture were more likely to go vegetarian than students who didn't, it would not be appropriate to test the effects separately for every possible major a student could have. Instead, tests should be run only for groups believed likely to behave differently for specific theoretical reasons: for instance, students in a school of agriculture might be analyzed separately from students in a school of engineering, since they would likely have different amounts of background knowledge of the facts presented. When the number and kind of tests are carefully considered, the results of a study can be taken with more confidence than when an analysis is performed on every possible combination of responses and characteristics.
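To make the contrast with open-ended analysis concrete, here is a hypothetical sketch of such a confirmatory analysis in Python, using statsmodels and invented counts: one pre-specified lecture-versus-control comparison, run separately for only the two subgroups that theory suggested might differ:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: (number who went vegetarian, number surveyed).
groups = {
    "school of agriculture": {"lecture": (12, 200), "control": (5, 210)},
    "school of engineering": {"lecture": (18, 190), "control": (9, 205)},
}

# One pre-specified test per subgroup, rather than a separate test for
# every possible combination of responses and student characteristics.
for school, data in groups.items():
    counts = [data["lecture"][0], data["control"][0]]
    nobs = [data["lecture"][1], data["control"][1]]
    stat, pval = proportions_ztest(counts, nobs)
    print(f"{school}: z = {stat:.2f}, p = {pval:.3f}")
```

Because only two tests were planned in advance, the resulting p-values are far easier to interpret than those from dozens of unplanned comparisons.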
Questions should be selected that will address the goals of the study. Where possible, using questions identical to those that have been asked in previous surveys will simplify comparison to those results. However, for some purposes, it will be necessary to modify existing questions or write new ones.
Questions asking respondents directly whether they are vegetarian or vegan should be avoided, or treated as bearing on respondents' beliefs rather than their behaviors, due to very high rates of misreporting: in some surveys, over half of self-reported vegetarians also reported having eaten meat on at least one of two specific days. We believe that measuring dietary change is the most important way to assess veg advocacy programs, so specific dietary questions may need to be privileged over others; if respondent fatigue is a concern, this may mean asking fewer complementary questions. A substantial Food Frequency Questionnaire will often be more useful for evaluating program success than a survey of comparable length that touches on a variety of questions.
Questions where respondents choose a single answer from a list are usually the easiest to analyze. Questions where respondents choose all relevant answers can be ambiguous, since there is no means of determining which answer was most relevant. Questions where respondents rank answers in order of importance can provide more information during analysis, though they can be more difficult to analyze. Questions where respondents write in their own answers are harder to analyze and should be used mostly for exploratory purposes.
To avoid indicating a desired response, scale questions should have equally many response options on either side of the neutral answer, if applicable. For instance, the scale Disagree, No Opinion, Agree Somewhat, Strongly Agree is unbalanced and should be corrected to Strongly Disagree, Disagree Somewhat, No Opinion, Agree Somewhat, Strongly Agree.
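One way to check that a scale is balanced is to assign symmetric numeric codes to the options, as in this small illustrative snippet (the -2 to +2 coding is a common convention, not a fixed standard):

```python
# A balanced five-point scale maps onto codes symmetric around zero.
balanced_scale = {
    "Strongly Disagree": -2,
    "Disagree Somewhat": -1,
    "No Opinion": 0,
    "Agree Somewhat": 1,
    "Strongly Agree": 2,
}

responses = ["Agree Somewhat", "No Opinion", "Strongly Disagree"]
codes = [balanced_scale[r] for r in responses]
print(f"mean response: {sum(codes) / len(codes):+.2f}")  # 0 marks neutrality
```

The unbalanced scale above admits no such symmetric coding: it offers two degrees of agreement but only one of disagreement, nudging respondents toward agreeing.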
If keeping the purpose of the study unclear is a goal, some questions may be included that will not be used in the final analysis. For instance, if a survey includes many questions about diet already, questions could be added about caffeine, sugar, or alcohol intake, to suggest that the purpose of the survey is related to health outcomes.
Social desirability bias is a serious concern with studies conducted by animal advocacy organizations. Survey results regularly come back with unreasonably high rates of success, due to a combination of respondents incorrectly identifying as vegetarian, higher rates of engagement and response from the participants most moved by an activity (also known as response bias), and responses exaggerating the aspects of behavior that are believed to be pleasing to the surveyor (social desirability bias).
The most reliable way to control for social desirability bias is to avoid giving the respondent clues about which answers the surveying organization would prefer. Since eating meat and disregarding the welfare of farmed animals are social norms, if the respondent does not know the surveying organization’s agenda, their responses to questions will likely not be much influenced by social desirability bias. (The exception is that respondents may under-report consumption of red meat, especially if they are in groups which are expected to be concerned about their weight or dietary habits.) This approach can be strengthened by including a control group and considering their responses as a baseline – attitudes and behaviors of participants in a program are attributed to the effects of the program only insofar as they differ from what was reported by the control group.
If respondents to a survey will unavoidably know which answers the surveying organization would prefer, social desirability bias can be addressed by including a set of questions designed to measure which respondents are most likely to be answering in ways that reflect what they believe the surveyor wants to hear, rather than the truth. Several instruments for this purpose have been developed in the psychological literature. A high correlation between scores on the social desirability instrument and responses to other questions on the survey would suggest that answers to those questions may be driven in part by socially desirable responding, rather than by respondents’ true beliefs and behaviors. We give a suggested instrument and method for using it in our page on social desirability bias.
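As a rough sketch of this check (in Python with scipy, using entirely invented scores), one could correlate respondents' scores on the social desirability instrument with their substantive answers:

```python
from scipy.stats import pearsonr

# Hypothetical per-respondent scores on a social desirability instrument
# and their self-reported weekly meat-free meals after the program.
desirability_scores = [3, 9, 5, 12, 7, 2, 10, 6, 11, 4]
reported_change = [0, 5, 1, 7, 3, 0, 6, 2, 6, 1]

r, p = pearsonr(desirability_scores, reported_change)
print(f"r = {r:.2f}, p = {p:.3f}")
# A strong positive correlation suggests the reported changes are driven
# partly by socially desirable responding rather than by actual behavior.
```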
Question order can affect both the response rate and the responses themselves. In general, neutral questions and very important questions should be placed early in the survey, so that respondents answer them before getting tired. Questions that might influence answers to other questions should be asked later. For instance, a question about whether the respondent has seen a video about factory farming should come after questions about their diet and attitudes towards animals, because being reminded of the video might affect their answers. Finally, demographic questions should come last, especially potentially sensitive questions about topics like race, education, and household income, so that respondents who don't want to answer them have already completed the rest of the survey. Responses to demographic questions should not be required.
The survey should be tested before administration, even in a pilot, to ensure that the instructions are clear and the survey length is reasonable. If members of the target audience are not available, several people who were not involved in writing the survey should take it, noting any unclear points and how long it took them.
Conducting survey research can require significant expenditure of resources, so careful design is essential to ensure the study produces useful results. Many resources are available to help survey designers address important concerns. Survey designers should be especially careful to ensure that responses to the most vital questions will be useful. Multiple methods are available to help reduce bias, including using control groups, disguising the purpose of the survey, and choosing a method of administration that increases response rate. Survey designers should consider these and other factors when deciding how to conduct a study that provides the best value to their organization.
1. Control groups are a major tool used to allow researchers to conclude that variables are causally related instead of only correlated. Without a control group, any finding that the population who filled out the survey differs from the general population could be attributed to a pre-existing characteristic of the people who chose to engage in the program being tested or to take the survey.
2. While reports of even the very recent past can be inaccurate, problems only grow with time. “[A] recent review of the survey literature found reduced levels of reporting or reduced reporting accuracy for hospital stays, health care visits, medical conditions, dietary intake, smoking, car accidents, hunting and fishing trips, consumer purchases, and home repairs as the length of the retention interval increased.” Tourangeau, R. (1999). Remembering what happened: Memory errors and survey reports. In A. A. Stone et al. (Eds.), The Science of Self-Report: Implications for Research and Practice. Psychology Press. The review cited is Jobe, J. B., Tourangeau, R., & Smith, A. F. (1993). Contributions of survey research to the understanding of memory. Applied Cognitive Psychology, 7(7), 567-584. Because the future is inherently uncertain, predictions are typically less accurate than recollections; see the next footnote for an example.
3. For instance, in the context of voting, about 10-20% of people incorrectly report whether they have voted in a recent election (Tittle, C. R., & Hill, R. J. (1967). The accuracy of self-reported data and prediction of political activity. The Public Opinion Quarterly, 31(1), 103-106). When polled before an election and asked whether they will vote, a much higher percentage predict their behavior inaccurately: in one study 83% predicted they would vote, but only 43% did so (Smith, J. K., Gerber, A. S., & Orlich, A. (2003). Self-prophecy effects and voter turnout: An experimental replication. Political Psychology, 24(3), 593-604). Since both effects are hypothesized to be caused in part by social desirability bias, errors may be smaller in situations where desirability operates less strongly than it does with voting.
4. For examples of the literature comparing modes of survey administration, see Fricker, S., Galesic, M., Tourangeau, R., & Yan, T. (2005). An experimental comparison of web and telephone surveys. The Public Opinion Quarterly, 69(3), 370-392, and Chang, L., & Krosnick, J. A. (2009). National surveys via RDD telephone interviewing versus the Internet: Comparing sample representativeness and response quality. The Public Opinion Quarterly, 73(4), 641-678. For an example of a case in which a real surveyor has been thought to be crucial to the quality of the data collected, see the 24-hour diet recall, described in Thompson, F. E., & Byers, T. (1994). Dietary assessment resource manual. The Journal of Nutrition, 124(11 Suppl), 2245s-2317s. (Some 24-hour diet recalls are now conducted with specialized computer tools.)
5. If multiple tests are conducted, statistical procedures can be used to account for the increased likelihood of finding apparent patterns by chance.
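(As a short Python sketch of one such procedure, with invented p-values: statsmodels' multipletests function applies standard corrections such as Bonferroni or Benjamini-Hochberg.)

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from several subgroup tests in one study.
p_values = [0.003, 0.020, 0.045, 0.310, 0.720]

# The Benjamini-Hochberg correction controls the false discovery rate.
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
for p, p_adj, sig in zip(p_values, p_adjusted, reject):
    print(f"raw p = {p:.3f} -> adjusted p = {p_adj:.3f}, significant: {sig}")
```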
6. In a thorough national diet study, almost two-thirds of self-described vegetarians reported eating significant amounts of meat during one or both 24-hour diet recalls. Haddad, E. H., & Tanzman, J. S. (2003). What do vegetarians in the United States eat? The American Journal of Clinical Nutrition, 78(3), 626S-632S.
7. For more general tips on survey design and especially question design, see this guide. Although it is aimed primarily at academic programs, many of the suggestions are broadly useful.
8. For instance, Justice for Animals conducted a survey of students exposed to humane education lectures and found that 9% of respondents reported having been vegan or vegetarian before the lecture. Since the students had not had a choice about whether to see the lectures, the divergence of this number from the rate of vegetarianism (or even self-reported vegetarianism) in the general population was most likely due in large part to a combination of response bias and social desirability bias. In the same survey, 14% of respondents reported having become vegetarian or vegan after the lecture; this rate is probably also affected by social desirability and response biases.
9. For a more thorough discussion of biases in responses to general dietary surveys, see Thompson, F. E., & Subar, A. F. (2008). Dietary assessment methodology. Nutrition in the Prevention and Treatment of Disease, 2, 3-39.
10. Several social desirability scales have been developed and validated with the intent of measuring which respondents answer questions based on what is socially acceptable and which questions are particularly affected by such response styles. We include one such scale in our suggested questions document. However, some evidence suggests that these scales may instead measure a personality type that is particularly likely to actually behave in a socially desirable way. For this reason, we suggest dealing with social desirability bias through other means if possible. For more on problems with controlling for socially desirable responding through use of a scale, see Barger, S. D. (2002). The Marlowe-Crowne affair: Short forms, psychometric structure, and social desirability. Journal of Personality Assessment, 79(2), 286-305.