The goal of these checks is to catch two types of fraudulent survey completion: bots and survey farmers (people who “farm” paid surveys, often but not always from foreign countries with lower hourly rates of pay; see TurkPrime, 2018). Except where otherwise noted, these checks can catch either type of scam, though some can be passed by more sophisticated scammers using VPNs or by bots working in conjunction with humans. The checks we previously had in place were not sufficient to catch these more sophisticated scammers.
For each study, our pre-registration plan will indicate which specific metrics and pass/fail criteria will be used.
Usage Guidelines and Examples
General Usage Guidelines
Data quality checks that stem from responses to survey questions can be very helpful, but it is important to remember that they are subject to human error. That is, even an honest responder may make a mistake from time to time if they misread something, are sleepy, or briefly aren’t paying attention. We recommend using these checks in combination—for example, with a two- or three-strike rule—to avoid accidentally excluding honest responders.
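The two- or three-strike rule described above can be sketched in code. This is a minimal illustration only; the flag names and the strike threshold are hypothetical and should be adapted to your own study's checks.

```python
# Sketch of a strike-based exclusion rule. The failed_* flag names and the
# threshold are hypothetical; adapt them to your own study's checks.

def count_strikes(respondent, check_flags):
    """Count how many data-quality checks this respondent failed."""
    return sum(bool(respondent[flag]) for flag in check_flags)

def should_exclude(respondent, check_flags, strikes_allowed=1):
    """Exclude only when failures exceed the allowed number of strikes,
    so a single honest mistake does not disqualify a respondent."""
    return count_strikes(respondent, check_flags) > strikes_allowed

checks = ["failed_consistency", "failed_attention", "failed_comprehension"]
one_slip = {"failed_consistency": True, "failed_attention": False,
            "failed_comprehension": False}
# Under a two-strike rule, one failed check alone does not exclude.
```

The point of the threshold is exactly the one made above: a single failure may be an honest slip, but multiple failures across independent checks are much harder to explain away.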
Edit (8/29/19): Humans who farm surveys with bot assistance are likely to go through the survey once manually to program the bot with correct answers to these types of questions. It is best to have a larger bank of items for each check type, from which you can randomly select a question, so that not all participants receive the same one.
Consistency Checks
This category of check refers to asking the same question twice in different ways (preferably not close together in the survey) or with one version reverse-coded so that it says the opposite of an earlier question. The goal is to catch people who aren’t answering honestly—if they are, their answers should be consistent.
- An example of asking the same thing in two different ways is requesting age and birth year. Although this failed for us, it may work if the questions are placed farther apart in the survey.
- An example of asking the same thing with reverse-coding is “Do you usually make decisions quickly?” and “Do you generally take a long time to make a decision?” If the respondent provides the same answer to both questions (e.g., “Strongly agree”), that should be flagged as suspicious.
- Another example with reverse-coding is to ask people to check boxes for feedback at the end of a survey. If they check, for example, both “annoying” and “enjoyable,” that response should be flagged as suspicious.
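The three consistency checks above can be automated along these lines. The specifics here are illustrative assumptions, not fixed recommendations: a 1-to-5 agreement scale, one year of tolerance for the age/birth-year pair, and made-up option labels.

```python
# Illustrative flags for the consistency checks above. Scale endpoints,
# tolerance, and option labels are assumptions, not fixed recommendations.

def flag_age_birthyear(age, birth_year, survey_year, tolerance=1):
    """Flag when stated age and birth year disagree by more than
    `tolerance` years (allowing for birthdays around the survey date)."""
    return abs((survey_year - birth_year) - age) > tolerance

def flag_reverse_coded(item, reversed_item, scale_max=5):
    """On a 1..scale_max agreement scale, answers to a reverse-coded pair
    should fall on opposite sides of the midpoint; flag when both fall on
    the same side (e.g., 'Strongly agree' to both)."""
    midpoint = (scale_max + 1) / 2
    return (item - midpoint) * (reversed_item - midpoint) > 0

def flag_contradictory_boxes(checked, contradictory_pairs):
    """Flag when both members of a contradictory pair of feedback boxes
    (e.g., 'annoying' and 'enjoyable') are checked."""
    return any(a in checked and b in checked for a, b in contradictory_pairs)
```

Note that the reverse-coded flag treats the scale midpoint as neutral, so a respondent who answers "Neither agree nor disagree" to one item is not flagged.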
Attention Checks
This category of check is intended to catch people who are answering randomly, too quickly, or selecting as many options as possible to try to pass a screening question.
- An example of an attention check is: “Which of the following have you done in the past 12 months? Please select all that apply.”
- Ran a half-marathon or marathon
- Purchased a new television
- Used the internet ***flag if unchecked; they’re using one for the survey***
- Ran a mile in less than 2 minutes ***flag if checked; the world record is 3:43***
- Ate a pre-packaged frozen dinner
- Read a book
- Typed faster than 220 words per minute ***flag if checked; the world record is 216 WPM***
- Donated to charity
- Applied for a patent
- None of the above
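A scoring sketch for the example above. The item keys are hypothetical stand-ins for however your survey platform labels the options; the flagging logic follows the annotations in the list.

```python
# Hypothetical item keys for the 'select all that apply' example above.
MUST_CHECK = {"used_internet"}           # respondents are online to take the survey
IMPOSSIBLE = {"mile_under_2_minutes",    # world record is 3:43
              "typed_over_220_wpm"}      # world record is 216 WPM

def flag_attention_check(selected):
    """Return True (suspicious) if a must-check item is missing, an
    impossible item is checked, or 'None of the above' is combined with
    other selections."""
    selected = set(selected)
    if MUST_CHECK - selected:
        return True
    if IMPOSSIBLE & selected:
        return True
    if "none_of_the_above" in selected and len(selected) > 1:
        return True
    return False
```

The plausible filler items (marathon, television, frozen dinner, and so on) are deliberately left unscored: they exist so that the correct response pattern is not obvious.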
More overt attention checks (e.g., “Please select strongly agree”), while formerly common, have more recently been criticized as introducing demographic and social desirability bias into a sample (e.g., Clifford & Jerit, 2015; Vanette, 2016), so Faunalytics no longer recommends using them.
Basic Comprehension Checks
This category of check asks about the meaning of a simple sentence in order to flag possible foreign workers. However, if your sample may legitimately include people with a low level of reading comprehension, it would be both unethical and methodologically unsound to exclude people for failing only this check.
- An example of a comprehension check is: “Sally’s blue dress is her favorite. What does the previous sentence imply?”
- Sally only has one dress
- Sally has a favorite dress ***flag if not checked***
- Sally doesn’t wear dresses
- Sally only wears blue
Open-Ended Question Coding
Like the basic comprehension checks described above, this type of check is meant to flag foreign workers and survey farmers who just want to get through your survey as quickly as possible. We have adopted it based on the findings of TurkPrime (2018), which found that 95% of non-farmers passed this check but only 8% of farmers did. As with the basic comprehension checks above, however, people with low reading comprehension may also fail, so this check should only be used in conjunction with others.
Once you have your first batch of data to check, skim the full set of open-ended responses to get a sense of what the answers are like. Then, for each response, consider whether it:
- Is a duplicate of another response
- Has poor grammar
- Is hard to understand (i.e., not perfectly intelligible)
- Is clunky (e.g., “You are examining the mood of the complex moments”)
- Does not answer the question (e.g., “Nice survey” in response to a question about food)
If any of the above are true, flag the response as suspicious.
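Of the criteria above, only the duplicate check is straightforward to automate. Here is a sketch that applies light normalization so trivial variations (capitalization, extra spaces) still match; the remaining criteria, such as grammar, intelligibility, and relevance, call for human coding.

```python
from collections import Counter

def _normalize(text):
    """Lowercase and collapse whitespace so near-identical copies match."""
    return " ".join(text.lower().split())

def flag_duplicate_responses(responses):
    """Return indices of open-ended responses that duplicate another
    response after light normalization. Grammar, intelligibility, and
    relevance to the question still need human judgment."""
    counts = Counter(_normalize(r) for r in responses)
    return [i for i, r in enumerate(responses) if counts[_normalize(r)] > 1]
```

Flagging all copies of a duplicated answer, rather than all but the first, is intentional: with farmed responses there is no reason to assume the first submission is the honest one.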