Incentivizing Responses to Self-report Questions in Perceptual Deterrence Studies: An Investigation of the Validity of Deterrence Theory Using Bayesian Truth Serum

  • Original Paper
  • Published in: Journal of Quantitative Criminology

Abstract

Objective

Criminological researchers want people to reveal considerable private information when utilizing self-report surveys, such as involvement in crime, subjective attitudes and expectations, and probability judgments. Some of this private information is easily accessible to subjects, requiring only that individuals be honest, while other information requires mental effort and cognitive reflection. Though researchers typically provide little or no incentive for honesty and thoughtfulness, it is generally assumed that subjects provide honest and accurate information. We assess the accuracy of deterrence measures by employing a scoring rule known as the Bayesian truth serum (BTS), which incentivizes honesty and thoughtfulness among respondents.

Method

Individuals are asked to report their own offending and estimates of risk after being assigned to one of two conditions: (1) a group with a financial incentive for participation only, and (2) a BTS group in which individuals are financially incentivized to be honest and thoughtful.

Results

We find evidence that there are some important differences between the groups in responses to self-reported offending items and in estimates of the probability of getting arrested. Individuals in the BTS condition report a greater willingness to offend and lower estimates of perceived risk for drinking and driving and cheating on exams. Moreover, we find that the negative correlation between perceived risk and willingness to offend that is often observed in scenario-based deterrence research does not emerge in conditions where respondents are incentivized to be accurate and thoughtful in their survey responses.

Conclusion

The results raise some questions about the accuracy of survey responses in perceptual deterrence studies, and challenge the statistical relationship between perceived risk and offending behavior. We suggest further exploration within criminology of both BTS and other scoring rules and greater scrutiny of the validity of criminological data.


Notes

  1. We use an example from deterrence research, but our discussion of the need to motivate research subjects to provide truthful and thoughtful information, which requires some effort on their part to retrieve and deliver, cuts across criminological subfields.

  2. To reiterate our point in note 1 about the generality of the issue we discuss, researchers also ask questions about how closely research subjects are attached to their parents or friends, how fearful they feel in their neighborhoods, and how much they trust their neighbors, all of which require effort on the part of research subjects to accurately convey their thoughts and feelings.

  3. Our purpose here is not to critique issues with deterrence model specifications, the inclusion of appropriate controls/confounders, or functional forms. Others have done this quite extensively (e.g., Bachman et al. 1992; Grasmick and Bursik 1990; Loughran et al. 2012; Nagin 1998; Nagin and Pogarsky 2001; Paternoster et al. 1983; Williams and Hawkins 1986). In the discussion, however, we do comment on the utility of this line of work going forward.

  4. Social desirability bias has been shown to be prevalent in the self-reporting of other illicit and otherwise taboo behaviors (e.g., Kreuter et al. 2008; Tourangeau and Yan 2007), and research has shown that its presence differs across survey methods, such as face-to-face versus telephone questioning (Holbrook et al. 2003).

  5. In many studies that use self-reported offending there is an issue of both unrealistic proportions of cases at zero and extreme upper outliers. As just one example, in the third wave of the National Youth Survey (NYS), when respondents were between the ages of 13 and 19, nearly 90 % of the respondents reported that they had never vandalized property (school, home or other), and 85 % reported never stealing anything worth $5 or less. However, nine respondents reported engaging in sexual intercourse 200 times or more in a year, five had sold marijuana and ten reported carrying a weapon more than 200 times a year.

  6. Several scholars have cleverly maneuvered around the problems with reported offending behavior or intentions by designing experiments in which participants can be directly observed either engaging in or refraining from a certain delinquent behavior such as cheating to earn a larger monetary incentive (Paternoster et al., 2013) or visiting a website for the purposes of downloading illegal music (Exum et al. 2011). However, while these types of studies are very important in that they allow for causal interpretation of certain elements of the decision process, they are ultimately limited in the types of behaviors that can be reasonably studied, and they do not directly address the issues with measurement error of prior offending behavior.

  7. For instance, questions studied involving subjective expectations include future income earnings expectations (Dominitz and Manski 1997), beliefs about social security (Dominitz et al. 2003), college major choices (Zafar 2011), experimental psychology (e.g., Erev et al. 1994), polling and elections (Manski, 1990) and general belief updating (Nyarko and Schotter 2002).

  8. Interestingly, Manski (2004) covers a timeline of events in the 1940s through the 1960s which he argues are the basis for the general distrust many academic economists have of using subjective data to study individual behavior.

  9. Only recently, Loughran et al. (2011) explored the idea that an individual draws his or her risk perception from the mean of a subjective distribution and argue that other moments of this distribution could also play a role in the deterrence process. Yet, this argument is only predicated on variability within—not accuracy of—one’s subjective distribution.

  10. Of course, one could argue that if individuals are overconfident but act on this flawed perception, then this is not a measurement issue in the deterrence context. If, however, their stated intentions are not what they truly believe (i.e., they are overconfident in their reporting but this is not reflected in their behavior), then it is a measurement issue. At a minimum, we believe this possibility is a reason to question the validity of reported risk measures.

  11. The issue of risk perception formation is a complicated and understudied one which we do not fully engage here (for a recent overview see Apel and Nagin 2011), but we do note that one’s current risk perceptions are likely a combination of many things, including but not limited to one’s own offending experiences (Anwar and Loughran 2011), the experiences of peers, neighborhood conditions, and police activity in the area. In other words, we think that arriving at one’s true current perception of risk requires some important thought and reflection.

  12. This idea of ‘satisficing’ actually predates its appearance in the psychology literature on cognition by several decades—as Krosnick (1991) notes, the term was first introduced by Simon (1957) in the economic literature on choice as a challenge to the traditional assumption that individuals expend whatever energy is required to maximize expected utility.

  13. Recently, econometricians have begun to model such so-called ‘focal’ responses of 0, 50 and 100 as indicators of second-order uncertainty, or what is commonly called ambiguity, in belief formation (Hudomiet and Willis 2012; Lillard and Willis 2001; Manski and Molinari 2010).

  14. Years ago, for example, critics described deterrence research as the “science of sophomores” because of its reliance on college undergraduate samples (Gibbs 1975; Jenson et al. 1978).

  15. These rules have been referred to in the econometric measurement literature as proper scoring rules. To be absolutely clear, it is worth noting that the term “proper” does not suggest that this method is the correct or best manner in which researchers can elicit accurate responses from study participants; rather, “proper scoring rule” is simply the name for a subset of scoring rules, given by Savage (1971), in which participants mathematically maximize their expected reward by reporting honestly and accurately about subjective probability distributions.
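  To illustrate the defining property of a proper scoring rule (this is our own minimal Python sketch, not a method or code used in the study), consider the quadratic (Brier) rule: whatever a respondent's true subjective probability, the report that minimizes his or her expected loss is that true probability itself. The function names and the grid search below are illustrative choices.

```python
def expected_loss(report, true_p):
    """Expected quadratic (Brier) loss of reporting `report` when the
    respondent's true subjective probability of the event is `true_p`."""
    return true_p * (report - 1) ** 2 + (1 - true_p) * report ** 2

# Grid-search the report that minimizes expected loss for a belief of 0.3:
true_p = 0.3
grid = [i / 100 for i in range(101)]
best = min(grid, key=lambda r: expected_loss(r, true_p))
print(best)  # the honest report 0.3 is optimal
```

Any deviation from the truthful report raises expected loss, which is exactly what makes the rule "proper" in Savage's sense.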

  16. However, Johnson et al. (1990:873) present what is unquestionably the most elegant argument for using proper scoring rules to elicit private respondent information: “it is natural to introduce monetary incentives, given the substantial drawbacks and social inefficiencies of alternatives such as sodium pentothal, mind control, and torture.”

  17. We say “competing” because the financial bonus was based upon respondents’ placement in the upper one-third of the information score distribution.

  18. Prelec (2004: 463) shows that under the BTS score rule, truth-telling for each respondent represents the Bayesian Nash equilibrium strategy, and hence it is the optimal thing to do. In game theory, a Nash equilibrium is a solution set of strategies in which no player has an incentive to change his or her strategy or behavior after considering the possible strategies and payoffs for all other players. It is appropriate for static (i.e., simultaneous move) games of complete information. A Bayesian Nash equilibrium is the analog solution concept for static games of incomplete information, which this scoring system creates.
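  The structure of the BTS score can be sketched for a single binary (No = 0 / Yes = 1) question: each respondent receives an information score, rewarding answers that are more common than collectively predicted, plus a prediction score penalizing the divergence of his or her predicted frequencies from the actual ones. The sketch below follows the published structure of Prelec’s (2004) formula, but the function name, the clipping constant, and the default weight alpha are our own illustrative choices; the scoring software actually used in the study is not reproduced here.

```python
from math import exp, log

def bts_scores(answers, predictions, alpha=1.0):
    """Bayesian truth serum scores for one binary (No=0 / Yes=1) question,
    following the structure of Prelec's (2004) formula: an information
    score log(xbar_k / ybar_k) for one's own answer k, plus a prediction
    score proportional to the negative divergence of one's predicted
    frequencies from the actual ones.

    answers:     list of 0/1 personal answers, one per respondent
    predictions: list of each respondent's predicted fraction answering 1
    """
    n = len(answers)
    eps = 1e-6  # clip frequencies/predictions so every log stays finite

    def clip(p):
        return min(max(p, eps), 1 - eps)

    # Empirical endorsement frequencies xbar_k, k in {0, 1}
    xbar = [clip(answers.count(k) / n) for k in (0, 1)]
    # Geometric means of the predicted frequencies, ybar_k
    ybar = [
        exp(sum(log(clip(1 - p)) for p in predictions) / n),
        exp(sum(log(clip(p)) for p in predictions) / n),
    ]
    scores = []
    for a, p in zip(answers, predictions):
        y = [clip(1 - p), clip(p)]      # this respondent's predicted distribution
        info = log(xbar[a] / ybar[a])   # reward for "surprisingly common" answers
        pred = alpha * sum(xbar[k] * log(y[k] / xbar[k]) for k in (0, 1))
        scores.append(info + pred)
    return scores
```

Under this rule a respondent whose prediction is closer to the actual answer distribution earns a higher prediction score, and (as Prelec shows) truthful answering and predicting is the Bayesian Nash equilibrium strategy.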

  19. We credit multiple reviewers for helping us to clarify this important point.

  20. In the state in which data collection took place, texting while driving is illegal. Cheating on a college exam is not, of course, a criminal act, but it is a serious violation of the honor code of the university where this study took place. Moreover, McCabe and Trevino (1996) have found that in schools with well-developed honor codes (like the one where the survey took place) students view cheating as “socially unacceptable” and feel that there are environmental pressures at both the university and student level not to cheat.

  21. For instance, for the drunk-driving scenario, the self-report question was asked as ‘Have you driven while knowing you were probably legally drunk in the past year? (Y/N)’, immediately followed by the percentage-of-others question: ‘Please estimate the % of participants in this session who you believe have driven while they were legally drunk in the past year.’

  22. As a validation check, note that overall, 92.7 % of all subjects thought that the drunk driving scenario was either ‘highly realistic’ or ‘somewhat realistic’, and this total varied little across conditions. Similarly, the overall percentages of subjects who believed the exam cheating, marijuana and texting while driving scenarios to be either ‘highly realistic’ or ‘somewhat realistic’ were 89.7, 87.6 and 95.6 %, respectively, each with little variability across conditions.

  23. The computer lab was required because in the BTS condition we needed to quickly calculate the complicated total information score in order to provide the financial bonus for those who scored in the upper third of the distribution. Computer software was created which allowed us to do that.

  24. We took multiple extra steps to convince the subjects in the BTS condition that both the BTS scoring method and the ‘truth-telling’ strategy were legitimate and not a trick, including showing them the complicated scoring formula (but explaining that knowledge of how it worked would not be an advantage), going through the SAT scoring analogy shown in the instructions, and providing a website on the informed consent form that subjects could visit afterwards to learn more about the study and methodology.

  25. Because the Wilcoxon rank-sum test requires at least ordinal data, we do not use this test for the P(WTO) and self-reported offending responses, which are both Bernoulli-distributed.
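  For readers unfamiliar with the rank-sum test, a minimal pure-Python sketch (our own illustration with a large-sample normal approximation and no tie correction, not the software used in the study) is:

```python
from math import erf, sqrt

def rank_sum_test(x, y):
    """Wilcoxon rank-sum test of two independent samples, using midranks
    for tied values and a large-sample normal approximation for the
    two-sided p-value (no tie correction in the variance)."""
    combined = sorted(x + y)
    ranks = {}          # value -> midrank shared by all tied observations
    i, pos = 0, 1
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1      # extend over the run of tied values
        ranks[combined[i]] = (pos + (pos + j - i - 1)) / 2
        pos += j - i
        i = j
    w = sum(ranks[v] for v in x)            # rank sum of the first sample
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2           # E[W] under the null
    var = n1 * n2 * (n1 + n2 + 1) / 12      # Var[W] under the null
    z = (w - mean) / sqrt(var)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided p-value
    return w, z, p

# Hypothetical perceived-risk responses in two conditions:
print(rank_sum_test([20, 35, 40, 55], [50, 60, 70, 85]))
```

The normal approximation is coarse at the small sample sizes shown here; it is meant only to convey the mechanics of the statistic.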

  26. We could have run a Chi-square test, but a large proportion of the cells contained zero cases. When we instead ran Fisher’s exact test, it too revealed that the perceived risk distributions for texting and driving were significantly different between the two groups.

  27. We did not examine the zero-order correlations between perceived risk and prior behavior for several reasons. First, the temporal ordering of the constructs makes the examination of a deterrent effect illogical, as one’s current risk perceptions cannot be used to predict prior behavior. We could potentially assess the relationship with regard to an experiential effect; however, such an analysis would be uninformative, as we do not know whether respondents’ prior behavior resulted in a punishment, nor do we know the frequency of the activity needed to determine a rate.

  28. The underreporting of a willingness to offend and overreporting of risk perception estimates may be even greater when there are no incentives, a data collection condition characteristic of the vast majority of perceptual deterrence research. We tentatively explore this conjecture in the following section.

  29. We also gave the same scenarios in paper and pencil form to a large introductory class (N = 156) that was randomly assigned to this condition prior to any recruiting efforts. We fully recognize that the lack of random assignment at the respondent level introduces confounders that raise serious concerns about the direct comparability of these individuals to the respondents in the BTS and RI conditions, and we caution readers not to draw any firm conclusions about the accuracy of paper and pencil surveys based on this study. Nevertheless, we have chosen to present the results of this condition because of the commonality of paper and pencil surveys in perceptual deterrence research. In other words, our purpose in having this condition was to replicate the procedure that has been common in past perceptual deterrence research data collection efforts. Appendix C provides the full details of the results of this survey, and the comparisons between this classroom condition and the BTS and RI groups, but we will describe the substantively important findings here. In general, the results indicate that the no-incentive (NI) condition records similar responses to the RI (regular incentive) condition. Compared to the BTS condition, those in the NI condition report a significantly lower WTO for drunk driving and cheating on exams. There is also evidence that individuals in the NI condition are less likely to report that they have driven drunk or cheated on an exam in the past year when compared to the BTS group. Moreover, individuals in the NI condition report higher levels of perceived risk for offending for drunk driving, cheating on an exam and texting while driving when compared to the BTS condition. Finally, whereas the correlations between WTO and perceived risk are generally small and indistinguishable from zero in the BTS condition, in the NI condition they are large, statistically significant, and comparable to the sizes found using the RI condition.
Again, however, we wish to reiterate that these findings are not directly comparable to those in the regular and BTS incentive conditions given the lack of randomization of the respondents. Readers can make of them what they wish.

  30. We also studied the nonresponse patterns in the data in each of the three conditions for sources of possible bias. In the NI (normal classroom) condition, a total of five students failed to complete the survey outright (as we will detail below, this does not include those who skipped single questions). Two of these students filled out the demographics but did not answer any of the questions. One student provided information on the first scenario and nothing else. Two more of the students left entire scenarios blank. Multiple students came in late to the study and had to finish the survey after the lecture began. Furthermore, as expected there was not perfect attendance (likely nonrandom) in the class section where we conducted the study. Only 156 of the 249 students enrolled in the class, or 62.6 %, took the survey (this excludes the individuals whose data we discarded). Of those who did complete the survey, multiple students refused to answer some of the specific questions. Most notably, several refused to answer the question with regard to whether or not they cheated on an exam in the prior year. Finally, it became quite apparent in reviewing the data that some of the students in the NI condition likely did not fully read the questionnaire. All of this suggests that the ‘usable’ data from this survey condition likely suffered from sample selection bias, systematic nonresponse bias, and a lack of thoughtfulness and/or truthfulness on the part of the subjects. In stark contrast, as subjects were told that they would only be eligible for the bonus if they answered each question on the survey, every subject completed all questions in both the RI and BTS conditions.

  31. We credit a helpful reviewer for this observation.

References

  • Ajzen I, Fishbein M (1980) Understanding attitudes and predicting social behavior. Prentice-Hall, Englewood Cliffs
  • Anwar S, Loughran TA (2011) Testing a Bayesian learning theory of deterrence among serious juvenile offenders. Criminology 49:667–698
  • Apel R, Nagin DS (2011) General deterrence: a review of recent evidence. In: Tonry M (ed) The Oxford handbook of crime and criminal justice. Oxford University Press, New York, pp 179–206
  • Bachman R, Paternoster R, Wald S (1992) The rationality of sexual offending. Law Soc Rev 26:343–372
  • Barrage L, Lee MS (2010) A penny for your thoughts: inducing truth-telling in stated preference elicitation. Econ Lett 106:140–142
  • Baumrind D (1985) Research using intentional deception: ethical issues revisited. Am Psychol 40:165–174
  • Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78:1–3
  • Bruine de Bruin W, Fischhoff B, Millstein SG, Halpern-Felsher BL (2000) Verbal and numeric expressions of probability: “It’s a fifty–fifty chance”. Organ Behav Hum Decis Process 81:115–131
  • Bruine de Bruin W, Fischbeck PS, Stiber NA, Fischhoff B (2002) What number is “fifty–fifty”? Redistributing excessive 50% responses in elicited probabilities. Risk Anal 44:713–723
  • Chiricos TG, Waldo GP (1970) Punishment and crime: an examination of some empirical evidence. Soc Probl 18:200–217
  • Davis B, Dossetor K (2010) (Mis)perceptions of crime in Australia. Australian Institute of Criminology, No. 396, July 2010
  • Dominitz J, Manski CF (1997) Using expectations data to study subjective income expectations. J Am Stat Assoc 92:855–867
  • Dominitz J, Manski CF, Heinz J (2003) “Will social security be there for you?”: How Americans perceive their benefits. National Bureau of Economic Research Working Paper 9798
  • Duffy B, Wake R, Burrows T, Bremner P (2008) Closing the gaps: crime and public perceptions. Int Rev Law Comput Technol 22:17–44
  • Dunning D, Heath C, Suls JM (2005) Picture imperfect. Sci Am 2(4):20–27
  • El-Gamal MA, Grether DM (1995) Are people Bayesian? Uncovering behavioral strategies. J Am Stat Assoc 90:1137–1145
  • Erev I, Wallsten TS, Budescu DV (1994) Simultaneous over- and underconfidence: the role of error in judgment processes. Psychol Rev 101:519–527
  • Erickson ML, Gibbs J (1978) Objective and perceptual properties of legal punishment and the deterrence doctrine. Soc Probl 25:253–264
  • Exum ML, Bouffard JA (2010) Testing theories of criminal decision making: some empirical questions about hypothetical scenarios. In: Piquero AR, Weisburd D (eds) Handbook of quantitative criminology. Springer, New York, pp 581–594
  • Exum ML, Turner MG, Hartman JL (2011) Self-reported intentions to offend: all talk and no action? Am J Crim Justice 37:523–543
  • Fischhoff B, Bruine de Bruin W (1999) “Fifty–fifty” = 50%? J Behav Decis Mak 12:149–163
  • Fischhoff B, Beyth-Marom R (1983) Hypothesis evaluation from a Bayesian perspective. Psychol Rev 90:239–260
  • Fishbein M, Ajzen I (1975) Belief, attitude, intention, and behavior: an introduction to theory and research. Addison-Wesley, Boston
  • Fisher RJ (1993) Social desirability bias and the validity of self-reported values. Psychol Mark 17:105–120
  • Gardenfors P, Sahlin NE (1983) Decision making with unreliable probabilities. Br J Math Stat Psychol 36:240–251
  • Geerken MR, Gove WR (1975) Deterrence: some theoretical considerations. Law Soc Rev 9:497–513
  • Gibbs JP (1975) Crime, punishment and deterrence. Elsevier North-Holland, New York
  • Gold M (1970) Delinquent behavior in an American city. Brooks/Cole, Belmont
  • Grasmick HG, Bursik RJ (1990) Conscience, significant others, and rational choice: extending the deterrence model. Law Soc Rev 24:837–862
  • Hindelang MJ, Hirschi T, Weis JG (1981) Measuring delinquency. Sage, Beverly Hills
  • Holbrook AL, Green MC, Krosnick JA (2003) Telephone versus face-to-face interviewing of national probability samples with long questionnaires: comparisons of respondent satisficing and social desirability response bias. Public Opin Q 67:79–125
  • Howie PJ, Wang Y, Tsai J (2011) Predicting new product adoption using Bayesian truth serum. J Med Mark 11:6–16
  • Hudomiet P, Willis RJ (2012) Estimating second-order probability beliefs from subjective survival data. Unpublished working paper. http://www.nber.org/papers/w18258
  • Huizinga D, Elliott DS (1986) Reassessing the reliability and validity of self-report delinquency measures. J Quant Criminol 2:293–327
  • Jenson GF (1969) “Crime doesn’t pay”: correlates of a shared misunderstanding. Soc Probl 17:189–201
  • Jenson GF, Gibbs JP, Erickson M (1978) Perceived risk of punishment and self-reported delinquency. Soc Forces 57:57–78
  • John L, Prelec D, Loewenstein G (2012) Measuring the prevalence of questionable research practices with incentives for truth telling. Psychol Sci 23:517–523
  • Johnson S, Pratt JW, Zeckhauser RJ (1990) Efficiency despite mutually payoff-relevant private information: the finite case. Econometrica 58:873–900
  • Jones EE, Sigall H (1971) The bogus pipeline: a new paradigm for measuring affect and attitude. Psychol Bull 76:349–364
  • Kahneman D (2010) Thinking, fast and slow. Farrar, Straus and Giroux, New York
  • Keren GB (1991) Calibration and probability judgments: conceptual and methodological issues. Acta Psychol 77:217–273
  • Kleck GD, Barnes JC (2008) Deterrence and macro-level perceptions of punishment risks: is there a “collective wisdom”? Crime Delinq 58:1006–1035
  • Kleck G, Sever B, Li S, Gertz M (2005) The missing link in general deterrence research. Criminology 43:623–660
  • Kreuter F, Presser S, Tourangeau R (2008) Social desirability bias in CATI, IVR, and web surveys: the effect of mode and question sensitivity. Public Opin Q 72:847–865
  • Krosnick JA (1991) Response strategies for coping with the cognitive demands of attitude measures in surveys. Appl Cogn Psychol 5:213–236
  • Lillard L, Willis RJ (2001) Cognition and wealth: the importance of probabilistic thinking. Unpublished working paper. http://deepblue.lib.umich.edu/handle/2027.42/50613
  • Lochner L (2007) Individual perceptions of the criminal justice system. Am Econ Rev 97:444–460
  • Loughran TA, Paternoster R, Piquero AR, Fagan J (2013) A good man always knows his limitations: the role of overconfidence in criminal offending. J Res Crime Delinq 50(3):327–358
  • Loughran TA, Paternoster R, Piquero AR, Pogarsky G (2011) On ambiguity in perceptions of risk: implications for criminal decision making and deterrence. Criminology 49:1029–1061
  • Loughran TA, Pogarsky G, Piquero AR, Paternoster R (2012) Reassessing the functional form of the certainty effect in deterrence theory. Justice Q 29:712–741
  • Manski CF (1990) The use of intentions data to predict behavior: a best case analysis. J Am Stat Assoc 85:934–940
  • Manski CF (2004) Measuring expectations. Econometrica 72:1329–1376
  • Manski CF, Molinari F (2010) Rounding probabilistic expectations in surveys. J Bus Econ Stat 28:219–231
  • Matsueda RL, Kreager DA, Huizinga D (2006) Deterring delinquents: a rational choice model of theft and violence. Am Sociol Rev 71:95–122
  • McCabe D, Trevino L (1996) What we know about cheating in college. Change 28:28–33
  • McClelland A, Bolger F (1994) The calibration of subjective probabilities: theories and models 1980–1994. In: Wright G, Ayton P (eds) Subjective probability. Wiley, New York, pp 453–481
  • Nagin DS (1998) Criminal deterrence research at the outset of the twenty-first century. In: Tonry M (ed) Crime and justice: a review of research, vol 23. University of Chicago Press, Chicago
  • Nagin DS, Pogarsky G (2001) Integrating celerity, impulsivity, and extralegal sanction threats into a model of general deterrence: theory and evidence. Criminology 39:865–892
  • Nyarko Y, Schotter A (2002) An experimental study of belief learning using elicited beliefs. Econometrica 70:971–1005
  • Offerman T, Sonnemans J, van de Kuilen G, Wakker PP (2009) A truth serum for non-Bayesians: correcting proper scoring rules for risk attitudes. Rev Econ Stud 76:1461–1489
  • Paternoster R, McGloin JM, Nguyen H, Thomas KJ (2013) The causal impact of exposure to deviant peers: an experimental investigation. J Res Crime Delinq 50:476–503
  • Paternoster R, Saltzman LE, Waldo GP, Chiricos TG (1983) Perceived risk and social control: do sanctions really deter? Law Soc Rev 17:457–480
  • Pogarsky G (2002) Identifying deterrable offenders: implications for research on deterrence. Justice Q 19:431–452
  • Pogarsky G (2004) Projected offending and contemporaneous rule violation: implications for heterotypic continuity. Criminology 42:111–136
  • Prelec D (2004) A Bayesian truth serum for subjective data. Science 306:462–466
  • Savage LJ (1971) Elicitation of personal probabilities and expectations. J Am Stat Assoc 66:783–801
  • Seidenfeld T (1985) Calibration, coherence, and scoring rules. Philos Sci 52:274–294
  • Simon HA (1957) Models of man. Wiley, New York
  • Thornberry TP, Krohn MD (2000) The self-report method for measuring delinquency and crime. In: U.S. National Institute of Justice (ed) Measurement and analysis of crime and justice: criminal justice series, vol 4. National Institute of Justice, Washington, DC, pp 33–83
  • Tibbetts SG (1997) Shame and rational choice in offending decisions. Crim Justice Behav 24:234–255
  • Tittle CR (1980) Sanctions and social deviance: the question of deterrence. Praeger, New York
  • Tourangeau R, Yan T (2007) Sensitive questions in surveys. Psychol Bull 133:859–883
  • Tversky A, Kahneman D (1974) Judgment under uncertainty: heuristics and biases. Science 185:1124–1131
  • Weaver R, Prelec D (2013) Creating truth-telling incentives with the Bayesian truth serum. J Mark Res 50:289–302
  • West DJ, Farrington DP (1977) The delinquent way of life. Heinemann, London
  • Williams KR, Hawkins R (1986) Perceptual research on general deterrence: a critical review. Law Soc Rev 20:545–565
  • Wright RT, Decker SH (1997) Armed robbers in action: stickups and street culture. Northeastern University Press, Boston
  • Yates JF (1990) Judgment and decision making. Prentice-Hall, London


Author information

Correspondence to Thomas A. Loughran.

Appendices

Appendix 1: The Four Scenarios

Drinking and Driving

First, please imagine a hypothetical scenario in which you drove by yourself one night to meet some friends at a party. The party is located approximately 2 miles from your apartment. You have been casually drinking throughout the evening, and now you are ready to leave. You also remember that you have to be at work early the next morning, and your boss will have a fit if you are late. You can either drive home yourself or find another ride. However, if you find another ride tonight and leave your car, you will have to return early the next morning before work to pick it up.

For the purposes of this hypothetical situation, assume that at this point you are ‘on the margin’ of legal intoxication, meaning your blood alcohol level (BAC) has just barely exceeded the legal limit. However, if pulled over assume you will definitely fail a breathalyzer test, meaning you will be subject to a penalty, including maybe having to pay a fine and/or spend a night in a local jail or police station.

Cheating

Next, please imagine another hypothetical scenario in which you are taking an important exam in one of your required courses. When you begin taking the exam, you notice that most of the questions on the exam cover material that you did not prepare for because you did not have the time to study that material. You also notice that you have a clear view of the answer sheet of the student sitting directly beside you, who you know is always well-prepared for class and has done very well on past exams. You know that if you fail this exam, you will have to retake the class.

If you are caught cheating, you will have to appear before a student honor council that may choose to impose any one of several possible sanctions including, but not limited to, a grade of XF on your academic transcript that reflects both course failure and academic dishonesty; suspension for one or more semesters; or expulsion from the university.

Marijuana Use

Now please imagine that you are in a public place, such as a concert, a beach, or the front of your apartment, with some friends after a stressful day in which you took a very difficult exam for which you had been studying for some time. You find that your roommate/friend has brought some marijuana, and s/he suggests that the two of you ‘smoke a bowl’ to relax and unwind. You also know that after studying hard all week, you really have nothing to do tonight or tomorrow.

If you are caught smoking marijuana you will be charged with misdemeanor possession which could result in a fine of up to $250, community service, and a drug possession charge on your criminal record.

Texting and Driving

Now imagine that you are driving to meet your boyfriend/girlfriend for dinner for a date you have been planning for a couple of weeks. Unfortunately, you underestimated the amount of time it would take to get to the restaurant and will be at least 20 minutes late to meet your boyfriend/girlfriend. Suppose that if you text your boyfriend/girlfriend while driving to inform them that you are running late, they will be much less upset about the situation and your date will be much more pleasant throughout the night.

However, if you are caught texting and driving you will be pulled over and issued a ticket for $100.

Appendix 2: Instructions Read to Subjects Under Each Experimental Condition

BTS Condition Introduction

Thank you for agreeing to participate in this brief survey. For your participation, you will be paid $10. In addition, if you answer every question, you will also be eligible for a $25 bonus payment. One-third of the participants in this session will be awarded this bonus. The bonus will be given to those individuals with the highest BTS score (which we will explain shortly), and it will be paid at the end of the session.

Your score on this survey is based on a mathematical algorithm known as the ‘BTS’, which was recently invented by an MIT professor and published in the elite academic journal Science. This algorithm scores both your personal answers to each question and your predictions about the answers of others. To maximize your BTS score (and hence maximize your chances of receiving the $25 bonus), your best strategy is to make sure that your answers are completely truthful, that is, that they accurately reflect your own opinions and beliefs, and that your predictions are your best guess of how others will answer.

How Does this ‘Truth Serum’ Work?

The algorithm depends on the mathematical theory of information. The formula for calculating the BTS score is based on the actual percentage of respondents who choose each of the m possible answers, x, and the predicted percentage who will choose each answer, y:

$$ \log \frac{{\bar{x}}}{{\bar{y}}} + \mathop \sum \limits_{i = 1}^{m} \bar{x}_{i} \log \frac{{y_{i} }}{{\bar{x}_{i} }} $$

Do You Need to Understand the Formula to do Better?

No! Understanding how the formula works will provide absolutely no advantage. It is simply a proven mathematical theorem which guarantees that, regardless of your opinion, providing an honest answer will maximize your information score. That is all you need to know to use it.

Is the BTS an Actual Lie Detector?

No. BTS cannot tell whether any single answer is honest or not, nor does a low score imply that you are lying. With that said, your BTS score will be maximized if you respond as truthfully as you can. It works similarly to the way the SAT exam is scored. Your best strategy on the SAT exam is to provide your best guess at the answers you think are correct. If you were careless or chose answers you knew were not correct, your SAT score would be much lower.

Unlike the SAT, which reflects aptitude, scoring high on the BTS requires only your own honesty and effort. Simply be truthful and you will maximize your BTS score. Also, unlike the SAT exam, there is no ‘correct answer key’. No one, including the test proctors, knows in advance which scores will be highest!

Does it Matter If I Have an Unusual Opinion?

No. The scoring algorithm is not biased in any way for or against unusual answers. So don’t worry if your answer is not typical, just make sure that it is truthful.

Because only the top one-third of the BTS scorers in this session will receive the bonus, you will maximize your chances of earning the extra $25 bonus if you answer each of the questions as truthfully as you can. In this case, ‘truthfully’ means not only being honest, but also considering each question thoroughly and thinking carefully about your answer before responding, as well as taking the time to avoid careless mistakes (such as making a typographical error when entering your answer).

Finally, because some of the questions we will be asking involve sensitive subjects relating to breaking the law, you should be aware that your answers will remain completely confidential and the only thing the experimenter will see is your total BTS score. The individual answers that you provide will merely be aggregated with other individuals’ answers and will not be connected to you in any way. So again, there is no reason not to be as truthful as possible.

Regular Incentive Condition Introduction

Thank you for agreeing to participate in this brief survey. For your participation, you will be paid $10. In addition, if you answer every question, you will also be eligible for a $25 bonus that will be awarded to one-third of the participants who answer all of the questions. We will randomly draw numbers to determine who will receive it, but you must answer each question to be eligible for this drawing.

Please try to answer each of the questions as honestly and thoughtfully as you can.

Because some of the questions we will be asking involve sensitive subjects relating to breaking the law, you should be aware that your answers will remain completely confidential. The individual answers that you provide will merely be aggregated with other individuals’ answers and will not be connected to you in any way. So there is no reason not to be as truthful as possible.

Appendix 3: The Non-randomized No Incentive Control Condition

Instructions to Subjects for the No Incentive Condition Introduction

Thank you for agreeing to participate in this brief survey. Please try to answer each of the questions as honestly and thoughtfully as you can.

Because some of the questions we will be asking involve sensitive subjects relating to breaking the law, you should be aware that your answers will remain completely confidential. The individual answers that you provide will merely be aggregated with other individuals’ answers and will not be connected to you in any way. So there is no reason not to be as truthful as possible.

Control Condition Results

See Tables 5, 6 and 7.

Table 5 Mean responses for the control condition for four behaviors and the perceived risk of arrest for self
Table 6 Z-values based on Wilcoxon rank-sum test of the WTO distribution between control and BTS group
Table 7 Z-values based on Wilcoxon rank-sum test of the perceived risk distribution between control and BTS group

Appendix 4: The BTS Scoring Rules

Briefly, here’s how the BTS scoring works. A group of individuals is simultaneously asked to respond to a pair of questions for each construct the researcher wants to measure, and the method then uses a Bayesian strategy to empirically examine the relationship between the pair of questions. Each respondent provides: (1) his or her own personal answer to the question, and (2) a prediction of the empirical distribution of answers, that is, how he or she believes everyone else in that particular data collection session will respond to that question. For example, after reading a hypothetical scenario involving drinking and driving we asked our respondents to estimate: (1) “On a scale of 0-100, where 0 indicates that there is no chance at all (0 % chance) you would drive your own car home and 100 indicates that you are dead certain (100 % chance) that you would drive your own car home, how likely is it that you would drive home under these conditions?”, and (2) “Using the same 0-100 scale, please indicate your best guess as to the percentage of individuals participating in this session right now who would drive their own car home under these conditions.” In essence, then, each person provides both an answer about themselves and a prediction of the empirical distribution of answers in the group.

After completion of the two items, each individual receives an information score, the formula for which is predicated on giving answers that are “surprisingly common” (Prelec 2004: 462) or, more specifically, answers whose actual frequency exceeds the frequency predicted by the same group of respondents. So, for example, a response “endorsed by 10 % of the population against a predicted frequency of 5 % would be surprisingly common and would receive a high information score; if predictions averaged 25 %, it would be a surprisingly uncommon answer, and hence receive a low score” (Prelec 2004: 462). The information score for a question is calculated as the log of the ratio of the actual relative frequency of one’s own answer to the (geometric) mean predicted frequency of that answer (Table 8; Fig. 6).
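To make the scoring concrete, the rule can be sketched in a few lines of Python. This is an illustrative implementation of Prelec's (2004) published formula for discrete answer choices, not the scoring code used in this study; the function name and the α weight on the prediction term are our own choices.

```python
import math

def bts_scores(answers, predictions, alpha=1.0):
    """Compute Bayesian truth serum scores for a group of respondents.

    answers[r]     -- index (0..m-1) of respondent r's own answer
    predictions[r] -- respondent r's predicted relative frequency for each
                      of the m answers (strictly positive, summing to 1)
    Returns one total score per respondent:
    information score + alpha * prediction score.
    """
    n = len(answers)
    m = len(predictions[0])

    # Actual relative frequency of each answer (x-bar in the text)
    x_bar = [answers.count(k) / n for k in range(m)]

    # Geometric mean of the predicted frequencies (y-bar in the text)
    y_bar = [math.exp(sum(math.log(predictions[r][k]) for r in range(n)) / n)
             for k in range(m)]

    scores = []
    for r in range(n):
        k = answers[r]
        # Information score: log of actual over (geometric) mean predicted
        # frequency of one's own answer -- rewards "surprisingly common" answers
        info = math.log(x_bar[k] / y_bar[k])
        # Prediction score: penalizes inaccurate predictions of the
        # group's empirical distribution (a KL-divergence-style term)
        pred = sum(x_bar[j] * math.log(predictions[r][j] / x_bar[j])
                   for j in range(m) if x_bar[j] > 0)
        scores.append(info + alpha * pred)
    return scores
```

With three respondents who all predict a 50/50 split but where two of the three actually endorse the first answer, the two who gave the more common answer receive identical, higher scores than the third, since their answer is "surprisingly common" relative to the prediction.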

Table 8 Zero-order Pearson correlations between perceived risk and willingness to offend in the control group
Fig. 6
figure 6

Estimate of perceived self risk of arrest for four scenario conditions for the control (no incentive) group

Cite this article

Loughran, T.A., Paternoster, R. & Thomas, K.J. Incentivizing Responses to Self-report Questions in Perceptual Deterrence Studies: An Investigation of the Validity of Deterrence Theory Using Bayesian Truth Serum. J Quant Criminol 30, 677–707 (2014). https://doi.org/10.1007/s10940-014-9219-4
