Artificial Intelligence

Volume 240, November 2016, Pages 65-103

Comparing human behavior models in repeated Stackelberg security games: An extended study

https://doi.org/10.1016/j.artint.2016.08.002

Abstract

Several competing human behavior models have been proposed to model boundedly rational adversaries in repeated Stackelberg Security Games (SSGs). However, these existing models fail to address three main issues that are detrimental to defender performance. First, while they attempt to learn adversary behavior models from adversaries' past actions (“attacks on targets”), they fail to take into account how adversaries adapt in the future based on the successes or failures of those past actions. Second, existing algorithms fail to learn a reliable model of the adversary unless sufficient data have been collected by exposing enough of the attack surface – data that are often unavailable in the initial rounds of a repeated SSG. Third, current leading models fail to include probability weighting functions, even though it is well known that human weighting of probability is typically nonlinear.

To address these limitations of existing models, this article provides three main contributions. Our first contribution is a new human behavior model, SHARP, which mitigates these three limitations as follows: (i) SHARP reasons about the success or failure of the adversary's past actions on exposed portions of the attack surface to model adversary adaptivity; (ii) SHARP reasons about the similarity between exposed and unexposed areas of the attack surface, and also incorporates a discounting parameter to mitigate the adversary's lack of exposure to enough of the attack surface; and (iii) SHARP integrates a nonlinear probability weighting function to capture the adversary's true weighting of probability. Our second contribution is a first “repeated measures study” – at least in the context of SSGs – of competing human behavior models. This study, in which each experiment lasted multiple weeks with an individual set of human subjects on the Amazon Mechanical Turk platform, illustrates the strengths and weaknesses of different models and shows the advantages of SHARP. Our third contribution is to demonstrate SHARP's superiority through real-world human subjects experiments at the Bukit Barisan Selatan National Park in Indonesia against wildlife security experts.
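To make contribution (iii) concrete, the minimal sketch below implements a standard two-parameter probability weighting function from the behavioral literature (Gonzalez and Wu, 1999). The abstract does not specify SHARP's exact functional form, so this should be read as an illustration of nonlinear probability weighting rather than the precise function used in the article; the parameter values are assumptions chosen only to show the two curve shapes.

```python
import numpy as np

def weight_probability(p, delta=0.7, gamma=1.5):
    """Two-parameter probability weighting function (Gonzalez & Wu, 1999):
    w(p) = delta * p^gamma / (delta * p^gamma + (1 - p)^gamma).

    delta controls elevation; gamma controls curvature. gamma > 1 yields an
    S-shaped curve (low probabilities underweighted, high ones overweighted);
    gamma < 1 yields the inverse-S shape classically assumed in the literature.
    """
    p = np.asarray(p, dtype=float)
    num = delta * p ** gamma
    return num / (num + (1.0 - p) ** gamma)

coverage = np.linspace(0.0, 1.0, 6)
print(weight_probability(coverage))             # S-shaped perception
print(weight_probability(coverage, gamma=0.6))  # inverse-S perception
```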

Keywords

Game theory
Repeated Stackelberg games
Human behavior modeling


This journal article extends a full paper that appeared in AAMAS 2015 [49] with the following new contributions. First, we test our model SHARP in human subjects experiments at the Bukit Barisan Selatan National Park in Indonesia against wildlife security experts and provide results and analysis of the data (Section 13.1). Second, we conduct new human subjects experiments on Amazon Mechanical Turk (AMT) to show the extent to which past successes and failures affect the adversary's future decisions in repeated Stackelberg games (Section 8.1). Third, we conduct new analysis of our human subjects data and illustrate the effectiveness of SHARP's modeling considerations, as well as the robustness of our experimental results, by: (i) showing how SHARP-based strategies adapt to the adversary's past successes and failures, while existing competing models like P-SUQR converge to one particular strategy (Section 12.4); (ii) comparing a popular probability weighting function from the literature (Prelec's model) against the one used in SHARP, and showing that the function used in SHARP yields superior prediction performance even though the shapes of the learned curves are the same (Sections 3.2 and 12.2.1); (iii) comparing an alternative prospect-theoretic subjective utility function, in which the values of outcomes are weighted by the transformed probabilities, against the weighted-sum-of-features approach used in SHARP (both forms are sketched below) – the alternative yields the same surprising S-shaped probability weighting curves, but the weighted-sum-of-features form achieves better prediction accuracy (Sections 7.2 and 12.2.2); and (iv) proposing a new descriptive reinforcement learning (RL) model for SSGs, based on a popular RL model for simultaneous-move games, and comparing it against SHARP – although the RL model learns from feedback on past actions, it performs poorly compared to SHARP (Sections 11 and 12.1). Fourth, we provide methodological contributions towards conducting repeated measures experiments on AMT and show the effects of various strategies on participant retention rates in such repeated experiment settings (Section 6). Fifth, we discuss additional related work (Section 3) and directions for future work (Section 14), and provide additional detailed explanations, proofs of theorems, and feedback from participants who played our games.
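For concreteness, the following sketch contrasts the two subjective utility forms compared in item (iii) above, using Prelec's one-parameter weighting function from item (ii). The functional forms follow the cited literature (Prelec, 1998), but the feature weights, the Prelec parameter, and the example numbers are illustrative assumptions, not the values learned in the article.

```python
import math

def prelec(p, alpha=1.4):
    """Prelec's one-parameter weighting function: w(p) = exp(-(-ln p)^alpha).
    alpha > 1 produces an S-shaped curve, matching the shape reported in this
    article; alpha < 1 produces the inverse-S shape classically assumed."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))

# Illustrative weights; the article learns such weights from human-subject data.
W_COVERAGE, W_REWARD, W_PENALTY = -8.0, 0.35, 0.25

def su_weighted_sum(coverage, reward, penalty):
    """Weighted-sum-of-features subjective utility (the style used in SHARP):
    the nonlinearly transformed coverage probability enters as one linear
    feature alongside the attacker's reward and penalty at the target."""
    return (W_COVERAGE * prelec(coverage)
            + W_REWARD * reward
            + W_PENALTY * penalty)

def su_prospect(coverage, reward, penalty):
    """Prospect-theoretic alternative: outcome values are multiplied by the
    transformed probabilities of the attack succeeding or failing."""
    return (prelec(1.0 - coverage) * reward   # attack succeeds
            + prelec(coverage) * penalty)     # attacker is caught

# Example: a target with 40% coverage, reward 8, and penalty -5 for the attacker.
print(su_weighted_sum(0.4, 8.0, -5.0))
print(su_prospect(0.4, 8.0, -5.0))
```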