Introduction

Crime and Place

A substantial body of research shows that crime occurs in certain places for a reason. This literature on ‘crime and place’ has gained a great deal of recognition in recent years (Weisburd et al. 2009a, b). It seems that now, more than ever, the focus on geography, terrain and ‘dots and lines on the map’ in the study of crime patterns and the causes of crime, is acknowledged as a crucial pillar of any crime theory (Wortley and Mazerolle 2008; Short et al. 2010; Wikström et al. 2012; Weisburd et al. 2012; Sampson 2013). Virtually everything is somehow connected to a ‘place’ at its most rudimentary and crime is no exception: antisocial behavior must also take place somewhere on the spatiotemporal spectrum. For this and other reasons, criminologists now widely accept that attention must be focused on the routine, ecological and physical environment of where crime takes place (Sherman et al. 1989; Weisburd and Eck 2004; Weisburd et al. 2012).

The contemporary version of ‘crime and place’ has signaled to criminologists that the unit of analysis should be as small as possible—less about the neighborhood and the overall community levels (Pawson and Tilley 1997), but rather at the ‘micro’ level, or what are often referred to as ‘hotspots of crime’. An overwhelming body of research shows the disproportionate concentration of criminal events at street segments, road intersections, city blocks or unique addresses—in virtually any given city in which crime data are collated (Weisburd et al. 2009a, b, 2010; Weisburd 2015). This global evidence cuts across topographies, cities and urban structures (Sherman et al. 2014; Weisburd and Amram 2014). The most celebrated finding in this area is attributed to Sherman et al. (1989) who were the first to show that 50 % of calls for service to the police occur in less than 5 % of places, coining the phrase ‘criminal careers of places’ for this phenomenon. More recently, a growing body of evidence has suggested that hotspots remain consistently hot over an impressively long period of time (Weisburd et al. 2004). The stability of the evidence has led Weisburd and his colleagues to refer to this phenomenon as the ‘law of concentration of crime in place’ (Weisburd et al. 2012; Weisburd 2015). Notably, with these spatial concentrations there are also temporal concentrations in terms of times of the day, days of week and certain months of the year (Farrell and Pease 1994; Ratcliffe 2004; Johnson et al. 2008; Townsley 2008)—which means that the more precise characterization of the phenomenon should be the ‘law of concentration of crime in place and time.’

Certain attributes of the places of crime are correlated with higher frequencies of incidents. For instance, it is well established that a lack of a capable guardian is very often associated with more crime (Cohen and Felson 1979). More recently (Caplan et al. 2011), the literature has listed five risk factors found to be correlated with crime, particularly violent crime: gang members (Kennedy et al. 1996; Braga 2004); schools (Roncek and LoBosco 1983); public housing (Newman 1972; Roncek et al. 1981; Eck 1987); facilities such as bars, clubs, fast food restaurants, and liquor stores (Roncek and Bell 1981; Roncek and Maier 1991; Block and Block 1995; Brantingham and Brantingham 1995); and bus stops (Golledge and Stimson 1997; Loukaitou-Sideris 1999; Roman 2005; Van Wilsem 2009; Yu 2009; Hart and Miethe 2014; Stucky and Smith 2014). Whilst most of these risk factors are too broad to be dealt with in one operation, they signal the direction in which to look for these antecedents of crime.

Policing Hotspots of Crime and Disorder

While the work reviewed thus far can be helpful in predicting which places to target, another body of work has looked at testing preventative initiatives to cool down such hotspots, mostly through formal social control interventions. ‘Hotspot policing’—crudely a tactic of placing ‘cops on the dots’—has been rigorously tested dozens of times. A recent Campbell Collaboration systematic review showed that most tests of hotspot policing were associated with a significant reduction in crime in the treatment hotspots, compared to control conditions (Braga et al. 2012). The list of hotspots experiments is continuously growing (Ariel et al. 2016; Ratcliffe et al. 2011; Rosenfeld et al. 2014) and collectively reflects a ‘strong body of evidence [which] suggests that taking a focused geographic approach to crime problems can increase the effectiveness of policing’ (Skogan and Frydl 2004:247).

There is also evidence to suggest that crime is usually not spatially displaced to adjacent areas, in the vicinity of the targeted hotspots, as a result of hotspot policing (Weisburd et al. 2006; Bowers et al. 2011; Johnson et al. 2014). There can, instead, be a ‘diffusion of benefits of these social control mechanism[s]’ to surrounding areas (Clarke and Weisburd 1994), or ‘radiation’ of the treatment effect (Ariel 2014), not only ‘around the corner’ from the targeted hotspots (Weisburd et al. 2006), but also to larger geographic areas (Telep et al. 2014a, b).

The deterrent effect of police presence is evident on mass transit systems as well—albeit the evidence thus far has been less rigorous than in other hotspots studies. Crime rates on New York’s subway dropped when officer numbers were increased at certain times of the day with a residual deterrent effect at other times (Chaiken et al. 1974). Police patrols on buses reduced crime up to 400 m from bus routes in a study in Liverpool and London (Newton et al. 2004).

Thus, the evidence on hotspots policing is clear: when police officers focus on hotspots, they are able to reduce crime and disorder compared to control conditions. Directing the police to micro-places, so that officers may apply social control mechanisms, prevents crime. It is still an open question as to which is the best approach to dealing with hotspots, on a tactical level (Koper 2014). While some recent studies continue to reaffirm Sherman and Weisburd’s (1995) original finding, that the saturated police presence at hotspots reduces crime and disorder (Telep et al. 2014a, b), others have begun to look more closely at precisely what type of police presence prevents crime. For example, some have looked at problem-oriented policing (Weisburd and Green 1995; Braga et al. 1999; Braga and Bond 2008; Taylor et al. 2011); drug enforcement operations (Weisburd and Green 1994, 1995); increased gun searches and seizures (Sherman and Rogan 1995); foot patrol (Ratcliffe et al. 2011); crackdowns (Sherman and Rogan 1995); ‘zero-tolerance’ policing or ‘broken windows tactics’ (Caeti 1999; Weisburd et al. 2011); ‘soft policing’ (Ariel et al. 2016); and intensified engagement (Rosenfeld et al. 2014). Despite these treatment variations, there are nevertheless common attributes to all hotspots policing approaches. First, it seems that the police must target these micro-places of crime and disorder. In all studies of police initiatives that target hotspots with high spatial concentrations of events, officers have consciously focused both resources and efforts on these places. Once officers are tasked with applying any sort of intervention, crime generally goes down, compared to hotspots not exposed to these focused treatments.

The Role of Deterrence in Hotspots Policing

The second common, and crucial, theme is the clear and manifested deterrent presence of ‘sentinels’ in the hotspots (Nagin 2013a, b). As opposed to police acting in this role as apprehension agents [incidentally, apprehension risk is probably not materially increased by improved investigations (Braga et al. 2011)], police officers are primarily ‘crime preventers’ when they are visible to the public. In some ways, this view of police officers as predominantly guardians had already been raised in Cohen and Felson’s (1979) situational crime prevention approach: the police in their role as sentinels act as guardians who reduce opportunities for committing a crime (Nagin et al. 2015), because a drug store, with a police officer standing outside, is not an attractive criminal target. As such, even when they are tasked to problem solve, engage through neighborhood policing or anything else, officers are nevertheless uniform-wearing, often gun-carrying power-holders that exercise the authority of the state by their presence. This quale, universally symbolized by police insignia, carries a literal threat of apprehension which sends an unequivocal message: Beware! No matter the tactic applied, the presence of officers intensifies the cognitive perception of plausible apprehension for any transgression of the law, including among risk-takers such as offenders. Even ‘softer’ police approaches, for example community policing, still contain an ingredient of deterrence, at the very least when officers are physically situated within the hotspots (Ariel et al. 2016). To be sure, this presumption of effective threat is not just theoretical; based on interviews with 589 arrestees in New York City following the police’s quality of life initiatives, ‘the most important factor’ behind behavioral change (that is, reductions in the likelihood of committing crime and disorder) was police presence (Golub et al. 2003: 690). Wright and Decker (1994) reported similar results: offenders appear to be aware of police presence when they select their targets and avoid neighborhoods with increased police presence when making a decision to commit robbery.

There is, then, ample evidence that the perceived certainty of punishment is causally associated with less crime (McCarthy 2002; Lochner 2007; Bushway and Reuter 2008; Tonry 2008; Berk and MacDonald 2010; Paternoster 2010; Loughran et al. 2012). Increasing the likelihood of being caught rather than the severity or celerity of sanctions (Von Hirsch et al. 1999; Nagin 2013a, b) is inversely linked to the likelihood of committing an offense. This ‘certainty effect’ carries wide probabilities, over a range of settings in which the criminal justice system attempts deterrence.

Given the available evidence on crime concentrations within hot spots, as well as the efficacy of hot spots policing, we raise both a practical as well as a theoretical consideration: can the police reduce crime and disorder in mass-transit systems? Thus far, the crime and place literature focused on street segments, blocks or wider areas characterized by a disproportionate volume of problems. Yet the prospects of reducing crime at hot spots characterized by a disproportionate volume of people as well, has generally gone untested. In this study, we focus on bus stops as the unit of analysis.

Such a focus raises another theoretical consideration beyond the potential efficacy of policing mass transit environments: given the size of the bus stop, and the extent to which events can be attributed to these places, how effective could hot spots policing be in such a “micro–micro-place?” Conceptually, the deterrent effect should be elevated, as the perceived risk of sanction is exacerbated when the likelihood of apprehension is very high: naturally, the geospatial terrain is very small; therefore, it is unlikely that offenders would commit crime right next to an officer. However, there is no available evidence on this effect.

In order to test this approach, we conducted a randomized controlled trial with London’s ‘hottest’ 102 bus stops. We assigned uniformed officers to half of these bus stops, three times per shift, for a duration of 15 min, over a 6-month period. Given the size of London and the distance between bus stops, officers arrived and departed their assigned bus stops on a bus, with significantly less time spent outside the bus stop setting (we tracked their movements with GPS tags). In order to measure the treatment effect, outcomes were measured in terms of victim generated crimes reported to the police and bus driver incident reports (DIRs), within targeted and catchment areas. Overall, we have found that while DIRs went down significantly compared to control conditions, victim generated crimes—the primary outcome measured in nearly all hot spots policing experiments—increased.

Below, we present these findings and their meaning. We begin by reviewing the methods we used and then move on to present the outcomes. We then discuss this backfiring effect, by focusing on the concept of bounded rationality in everyday policing. The findings seem to deviate from the available evidence (Braga et al. 2014) because previous experiments focused on clusters of addresses, which allowed the patrolling officers to ‘roam’ unpredictably within the boundaries of the hot spot. However, the officers who participated in the present test were ‘over-focused’ within a few meters of the bus stops. This seems to have enabled offenders to predict their movements. We therefore defend the claim that hot spots policing can backfire under these conditions.

The Present Study: The London Hot Bus Stops Experiment

Mass transit stops, as well as their immediate surrounding environments, can be considered ‘places’ in the hotspots construct, as they fit both the physical and sociological definitions of such. The transit stop environment encompasses its own set of behaviors. This set of behaviors can be considered the routine activities of the transit stop environment. These activities add to the potential criminality of the area, and increase the number of potential targets. Additionally, transit stops are often located in areas with high amounts of activity. This adds to the density of victims in the surrounding environment (Piza and Kennedy 2003).

The concentration and stability of crime, at the micro-level, has also been identified in parts of the public transport environment (Smith and Clarke 2000; Loukaitou-Sideris et al. 2002; Newton 2004; Smith and Cornish 2006; Newton and Bowers 2007). In her study of Los Angeles, Loukaitou-Sideris (1999) found that ten bus stops accounted for 18 % of all bus stop crime, whilst Block and Block (2000) showed that street robberies in the Bronx, New York, were concentrated in the ‘environs of rapid transport stations’. Such bus stops and stations can be described as crime generators (Brantingham and Brantingham 1995) because they attract large crowds of vulnerable people who are often preoccupied or unfamiliar with the area. Pickpockets typically exploit the crowds gathered in mass transit settings, matching its bumping and jostling whilst scanning for unzipped bags and unguarded pockets. Thus, as shown by Block and Block (2000), mass transit stations and bus stops serve as behavior settings conducive to criminal activity: motivated offenders and targeted victims both use them regularly, so these places ‘set the stage’ for criminal events and their users are at greater risk of crime. As found in previous research, particularly on violent crimes, targeted victims are most vulnerable when they arrive at or leave bus stops (Golledge and Stimson 1997; Roman 2005), and this, therefore, must be an important consideration in any crime prevention policy on bus crime.

Furthermore, Pearlstein and Wachs (1982) found that 88 out of 223 bus routes (less than 40 %) in Southern California experienced serious incidents of crime. The same study found that bus routes that reported high levels of crime were more likely to serve areas with correspondingly high levels of crime. More recently, Newton’s (2008) study confirmed that ‘en route’ bus-related crime is positively associated with crime in the area it passes through, but also that the risk of crime is elevated on routes that have multiple entry and exit points to these high crime areas. Apart from these two systematic studies, very little quantitative evidence is found in the literature on crime events that occur on moving public transport vehicles.

Methods

Settings and Design

Our study took place in London, United Kingdom. London is the capital of the UK and its most populous metropolitan area, with 8.63 million residents in an area of nearly 607 square miles. Overall, London is a safe city compared to other metropolitan areas worldwide, with 8.3 violence with injury offences; 1.7 sexual assaults; 2.6 robberies and 8.8 residential burglaries per 1000 residents during 2014/15 (Statistics 2015). The Metropolitan Police Service (MPS) has territorial jurisdiction for law enforcement in Greater London (excluding the ‘square mile’ of the City of London, which is the responsibility of the City of London Police). The MPS is the largest UK police force, with more than 30,000 sworn police officers and nearly 4000 non-sworn police community support officers. The MPS is also in charge of policing the bus system in Greater London.

In terms of the bus network, London hosts one of the largest systems in the world, with over 9000 buses, 675 bus routes and 19,000 bus stops (TFL 2015a). The bus network attracts over 2 billion commuter trips per year. The internationally recognized red double-decker bus has been a London icon for many years. The MPS recorded 7.2 crimes per million passenger journeys during 2014/15, making the system a relatively safe environment (TFL 2015b).

In order to test the effectiveness of policing at bus stops, we identified the hottest stops in the Greater London area, and randomly assigned them into treatment and control conditions. The precise locations of the treatment bus stops were communicated to local commanders directly: they were given their assigned hotspots and were informed that ‘these are the hottest bus stops’ in the city, but at no point during the experiment were they informed of the location of the control hotspots, in order to avoid contamination—thus maintaining a partially-blinded experimental design.

Bus Stops as Hotspots

For the purpose of this experiment, a ‘hotspot’ was defined as a bus stop that had a disproportionately higher count of driver incident reports compared to the other 19,000 bus stops in London. We rank ordered all bus stops in London, and our inclusion criterion was that there had been at least 3 incidents associated with the bus stop in 6 months. While this seems a relatively low threshold compared to other studies (Sherman and Weisburd 1995), the nature of the crime problem for London’s buses was such that most bus stops experienced no reported incidents at all, at any given time.

Furthermore, around the bus stops we ‘drew’ 50 m buffers in order to test for the treatment effect, as the effect of a police presence at the bus stops was hypothesized to take place in the near vicinity as well (based on the ability of the eye to detect objects; see Woodman and Tidy 1877; Loftus and Harley 2005). As explained below, officers were not instructed to patrol beyond the bus stop vicinity, and therefore we drew concentric buffer zones around these epicenters (i.e. the bus stops). We also drew additional ‘cushions’ around the bus stops, of up to 150 m, in order to detect the displacement or diffusion of benefits (see below), as well as 100 m buffer zones. The buffer zones ensured that no two hotspots and their cushions overlapped. This criterion helped to avoid a situation in which one treatment and one control hotspot, for instance, are next to each other and the treatment effect spills over to the control area. The minimal distance between bus stops was therefore no less than 400 m. Further tests for the presence of spatial autocorrelation reduced the likelihood of obtaining a false positive result (Type I error). For example, if a highly prolific pickpocket was arrested, the consequence might be a drop in theft at a number of bus stops because the offender operated at all of them. Spatial separation of the hotspots therefore helped ensure that the sampled locations were independent and the number of bus stops was actually 102. Eliminating spatial autocorrelation was a necessary procedure, yet it greatly reduced the number of eligible hotspots.
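The separation rule described above can be illustrated with a short sketch. It assumes that bus stop locations are available as projected coordinates in metres and that straight-line distance is the relevant measure; the function name and the example coordinates are hypothetical and are not taken from the study data.

```python
from itertools import combinations
from math import hypot

MIN_SEPARATION_M = 400.0  # minimum distance between stops reported in the text


def too_close_pairs(stops, min_sep=MIN_SEPARATION_M):
    """Return pairs of stop ids whose straight-line distance is below min_sep.

    `stops` maps a stop id to (easting, northing) in metres. Using projected
    metre coordinates is an assumption of this sketch.
    """
    pairs = []
    for (id_a, (xa, ya)), (id_b, (xb, yb)) in combinations(sorted(stops.items()), 2):
        if hypot(xa - xb, ya - yb) < min_sep:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical coordinates, for illustration only.
example_stops = {"A": (0.0, 0.0), "B": (300.0, 0.0), "C": (0.0, 450.0)}
violations = too_close_pairs(example_stops)
```

In this toy example only stops A and B are flagged, so one of them would have to be dropped before random assignment.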

As noted, bus stops were identified for random assignment based on the frequency of incidents recorded over a 6-month period during 2013. Measuring the frequency of 999 calls for service at bus stops is not straightforward. In the UK, police recorded crimes are typically geocoded to property addresses using a gazetteer (Chainey and Ratcliffe 2005). Crimes that occurred at non-addressable locations like bus stops are therefore assigned the coordinates of the nearest property or road junction rather than the bus stop itself. When there are two or more bus stops outside an address, it is not possible to determine at which bus stop the crime occurred. To clarify, previous studies have tended to use buffers or administrative units to measure the incidence of bus stop crime. Buffers (Liggett et al. 2001), Thiessen polygons (Yu 2009), grid squares (Stucky and Smith 2014), and administrative areas (Kooi 2007, 2013) have all been used. These units of analysis are appropriate for obtaining aggregate measures of crime in and around bus stops, but unhelpful when measuring the frequency of crime at individual and well-designated bus stops.

Instead, we adopted a proxy measure of crime at bus stops rather than use spatially aggregated police recorded crime. Bus Driver Incident Reports are instances of criminal damage, fare evasion, and passenger disturbance on London’s bus network. They represent a self-reported measure of bus-related crime and disorder that is often crimed by the Metropolitan Police Service, but only when an emergency response is required. Like police recorded crime, Driver Incident Reports are geocoded to property addresses or street intersections, but importantly they also contain additional information: bus route and bonnet numbers can be matched with data from “i-Bus”, which is Transport for London’s Automatic Vehicle Location system that contains the bus’s direction of travel, and specifically the bus stop number. This contextual information enables Driver Incident Reports to be re-geocoded to the nearest bus stop that is served by the bus reporting the incident.

Furthermore, London Buses actively encourages drivers to report all incidents, through the Driver Incident Report system. The London Buses control room, CentreComm, deals with real-time incidents, assesses whether assistance is required and, if so, passes this information to the Metropolitan Police Service (MPS) control room (MetroComm). Since DIRs are a reliable self-reported measure and most calls are made when the bus is stationary, they represent a strong proxy measure for disorder in and around bus stops and on the bus network.

Formally, Feature Manipulation Engine (F.M.E.) was used to carry over attribute data from the nearest bus stop to each Driver Incident Report, within a tolerance level of 100 m. The ‘NeighbourFinder’ transformer matched attribute data firstly by route run then by route number alone. This maximized the likelihood that each DIR was updated with attribute data from the nearest appropriate bus stop. A 100 m tolerance was chosen to capture DIRs which have been geocoded to property addresses offset from the road.
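The two-pass matching logic can be sketched outside FME as follows. This is an illustration only, not the FME workflow used in the study: the record layout (`x`, `y`, `route`, `run`, `stop_id`), the reading of ‘route run’ as a route plus direction of travel, and the use of metre coordinates are all assumptions.

```python
from math import hypot

TOLERANCE_M = 100.0  # tolerance reported in the text


def match_dir(dir_rec, stops, tol=TOLERANCE_M):
    """Return (stop_id, match_type) for one DIR, or (None, None) if unmatched."""

    def nearest(candidates):
        # Closest candidate stop within the tolerance, if any.
        best = None
        for s in candidates:
            d = hypot(s["x"] - dir_rec["x"], s["y"] - dir_rec["y"])
            if d <= tol and (best is None or d < best[0]):
                best = (d, s["stop_id"])
        return best[1] if best else None

    # Pass 1: stops served by the same route and the same run.
    same_run = [s for s in stops
                if s["route"] == dir_rec["route"] and s["run"] == dir_rec["run"]]
    hit = nearest(same_run)
    if hit is not None:
        return hit, "route_run"

    # Pass 2: fall back to stops served by the same route only.
    same_route = [s for s in stops if s["route"] == dir_rec["route"]]
    hit = nearest(same_route)
    if hit is not None:
        return hit, "route"
    return None, None


# Hypothetical stops and reports, for illustration only.
stops = [
    {"stop_id": "S1", "route": "25", "run": 1, "x": 0.0, "y": 0.0},
    {"stop_id": "S2", "route": "25", "run": 2, "x": 50.0, "y": 0.0},
    {"stop_id": "S3", "route": "8", "run": 1, "x": 10.0, "y": 0.0},
]
by_run = match_dir({"route": "25", "run": 1, "x": 40.0, "y": 0.0}, stops)
by_route = match_dir({"route": "25", "run": None, "x": 40.0, "y": 0.0}, stops)
unmatched = match_dir({"route": "25", "run": 1, "x": 500.0, "y": 0.0}, stops)
```

Note that the first report is assigned to S1 even though S2 is physically closer, because the run-level match takes priority over raw proximity.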

Following this procedure, 67.3 % (6,263) of Driver Incident Reports were matched to nearby bus stops within this tolerance level. Of these, 58.4 % (3,657) were matched using route run information and the remaining 41.6 % (2,606) by route information. Bus stops were then ranked by the frequency of DIRs, and bus stops were removed from the list (prior to random assignment) if they did not meet the following criteria: (a) bus stops must not be spatially auto-correlated, as assessed with K nearest Local Indicators of Spatial Association (LISA; K = 2); (b) the coefficient of variation (CoV) must be less than or equal to 0.5 (Johnson et al. 2008), where a CoV close to zero indicates a temporally stable pattern of DIRs between 2012 and 2013; and (c) police assistance must have been required in at least 33.3 % of the DIRs at each bus stop. These criteria ensured that high frequency bus stops were not too close to each other, were temporally stable and indicated the presence of police recorded crime.
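Criteria (b) and (c) lend themselves to a brief sketch. The text does not state how the coefficient of variation was computed, so the version below assumes the population standard deviation of DIR counts across periods divided by their mean; the function names and example counts are hypothetical.

```python
from statistics import mean, pstdev

MAX_COV = 0.5             # criterion (b) threshold reported in the text
MIN_POLICE_SHARE = 0.333  # criterion (c): at least 33.3 % of DIRs


def coefficient_of_variation(counts):
    """Standard deviation over mean of per-period counts (assumed definition)."""
    m = mean(counts)
    if m == 0:
        return float("inf")  # undefined for an all-zero series; treat as unstable
    return pstdev(counts) / m


def is_eligible(period_counts, n_police_assisted, n_dirs):
    """Apply the temporal stability and police assistance criteria to one stop."""
    if n_dirs == 0:
        return False
    stable = coefficient_of_variation(period_counts) <= MAX_COV
    police_share = n_police_assisted / n_dirs
    return stable and police_share >= MIN_POLICE_SHARE


# Hypothetical stops: (counts per period, police-assisted DIRs, total DIRs).
stable_stop = is_eligible([10, 12], 8, 22)
unstable_stop = is_eligible([2, 14], 8, 16)
low_police_stop = is_eligible([10, 12], 5, 22)
```

The spatial autocorrelation criterion (a) is not sketched here, since it depends on the LISA implementation used.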

Finally, we note that our baseline temporal analyses suggested that there are two ‘hot hours’ and ‘hot days’ peaks for the bus network: Monday through Friday, between 12:00 and 20:00, and nighttime over weekends. We focused on the former because police shifts did not coincide with the weekend peak times. Monday through Friday, between 12:00 and 20:00, were the patrol hours for the police officers, and within these temporal boundaries we measured the direct treatment effect.

Random Assignment and Partial Blinding

The exclusion criteria we used generated a list of 102 eligible bus stops across the entire city, shown in Fig. 1. We conducted simple random assignment, which resulted in a 1:1 split in the number of hotspots (Table 2). As noted earlier, the police were not given the full list of hotspots: they were informed of the location of their treatment bus stops, but they were not told the location of the control bus stops. This blinding process decreased the chance of contamination and reduced the risk of violating the Stable Unit Treatment Value Assumption (SUTVA), that is, the assumption that the effect of the treatment condition on each unit (treatment or control) is independent of the effects of treatment on any other units (Sampson 2010).
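One way to obtain an even split of 102 units is to shuffle the list and cut it in half. The sketch below does exactly that; it is an assumption about the mechanics, not a description of the procedure actually used, and the seed and stop identifiers are hypothetical.

```python
import random


def assign(stop_ids, seed=2014):
    """Shuffle the eligible stops and split them into two equal-sized groups.

    The seed is a hypothetical value used only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(stop_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical identifiers 1..102 standing in for the eligible bus stops.
groups = assign(range(1, 103))
```

Shuffling and splitting guarantees 51 units per arm, whereas independent coin flips for each stop would only produce an even split by chance.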

Fig. 1 Treatment and control sites

Treatment

Prior to the implementation of the experimental protocol, all participating officers were briefed, during a one-day training course, about the merits of the experiment. The intervention in this experiment was carried out and delivered by teams of two uniformed officers, who were tasked to ‘be visible’ and deter crime and anti-social behavior at the bus stops. They arrived at the bus stops by bus during the ‘hot hours’—meaning that they were physically present at the bus stop at its ‘peak’ moment in terms of crime opportunities—i.e., when passengers embark or disembark the bus.

At any given moment during the lifecycle of the experiment (6 months), there were about 32 officers conducting these patrols during these hot hours. Each patrol unit had ownership of about 2–4 hotspots, depending on the travel distance between the bus stops: as the study was citywide, some bus stops could be a great distance apart, particularly given that the average traffic speed in central London is around 8.98 mph due to heavy daily traffic, particularly during the hours of the experiment. The officers were actively in charge of cooling down these bus stops through saturated presence only, compared to the control hotspots. This intervention, therefore, sits squarely within deterrence theory, because officers were not tasked to problem-solve (Goldstein 1979), conduct community policing in the classic sense (Skogan and Hartnett 1997) or carry out targeted patrols for any particular social or crime problem (Sherman and Rogan 1995; McGarrell et al. 2001). The primary nature of their task was ‘visibility’, with the aim of causing a change in the risk perceptions of prospective offenders. The aim was simply to deter.

Practically, there is very little that could, in fact, have been achieved in terms of active engagement within the 15-min patrols allocated to each hotspot. Each officer was accountable for 2–4 hotspots and the distance between the hotspots, often a mile apart, made the experiment operationally challenging in terms of interactions with the public, not least with potentially offending parties. The officers repeatedly told the research team that they were continuously pressured to ‘beat the clock’. This is not to say that officers were instructed not to deal with events as they occurred. If members of the public required assistance or the officers encountered crime or disorder, they were still required to report the event. In terms, however, of ordinary allocated time during the experiment, these officers focused on preventative saturated presence compared to control conditions. The no-treatment bus stops were not exposed to these preventative directed patrols, and we are able to describe these dosages—for both treatment and control conditions—given the GPS data made available for this study (see Ariel et al. 2016).

Officers were provided with patrol cards that contained maps of the designated bus stops and the bus segments/lines associated with those bus stops that would take them to their next bus stops. Officers were instructed to follow the rigid patrol plan, without deviations, and therefore police presence in the experimental hot spots coincided with actual bus arrivals, as they arrived at the bus stops by bus.

Finally, we note that inquiries were made with TfL and the Metropolitan Police Service as to whether extra police activities had taken place during the experimental period, in the form of additional police presence at hotspots during hot hours. To the best of our knowledge, no officer beyond our team conducted proactive patrols in the hotspots during the hot hours, mainly because all available resources were seconded to the experiment. The Metropolitan Police Service would normally not ride on buses, as there is a jurisdictional divide between officers seconded to patrol the buses and the bus stops (who took part in the experiment) and all ‘other’ resources. We also asked the Metropolitan Police Service whether any special operations were conducted within the hotspot areas (both treatment and control), and we were assured that no such activities had taken place, beyond routine response policing. We finally note that, had there been any such activities, they should have been randomly distributed between treatment and control conditions, in particular as the locations of the hotspots were not communicated at any point during the experiment to non-participating policing units.

Dependent Variables

We used two outcome measures in order to assess the treatment effect. We compared changes in these two outcome types between the period before and after the beginning of the trial, and then compared this difference between the two study groups (treatment and control conditions). First, we collected the number of self-reported calls-for-emergency assistance to CentreComm and Control Center (approximately 48,000 incidents per year). London Buses actively encourages drivers to report all incidents, through the Driver Incident Report (DIR) system. CentreComm uses the system to deal with real-time incidents, simultaneously assessing if emergency assistance is required and, if so, passing this on to the Metropolitan Police Service control room, MetroComm. Some DIRs are internal and routine (i.e., the reporting of technical problems or route issues), so we looked at DIRs that were flagged ‘Community Safety’. These included instances of a disturbance on a bus (54 %), criminal damage to the bus (9.6 %), a passenger refusing to pay (30 %), a passenger threatening violence (4.3 %), theft (1.6 %) and robbery (0.5 %). Not all DIRs result in a recorded crime, which therefore adds an interesting complexity to the study of hotspots: self-reports. We counted the number of baseline DIRs 6 months before the RCT (February to July 2013) and then for the 6-month experimental period (February to July 2014). We also broke down the data based on the time of the intervention (Monday–Friday, between 12:00 and 20:00), and outside these hours, to test for temporal displacement, and measured the same for the treatment area (50 m around the bus stop), between 50 and 100 m around the bus stop, and then again within 100–150 m around the bus stops.
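The spatial and temporal breakdown described above amounts to labelling each incident with a distance band and a hot-hours flag. A minimal sketch follows; the function names are hypothetical, and the treatment of boundary values (50 m counted in the inner band, 20:00 excluded from hot hours) is an assumption, since the text does not specify it.

```python
from datetime import datetime

# Upper bound of each distance band in metres, with its label.
RINGS = ((50.0, "0-50"), (100.0, "50-100"), (150.0, "100-150"))


def ring(distance_m):
    """Return the distance band for an incident, or None beyond 150 m."""
    for upper, label in RINGS:
        if distance_m <= upper:
            return label
    return None


def in_hot_hours(ts):
    """True for Monday to Friday, from 12:00 up to but not including 20:00."""
    return ts.weekday() < 5 and 12 <= ts.hour < 20


# Hypothetical incidents, for illustration only.
weekday_afternoon = in_hot_hours(datetime(2014, 2, 3, 13, 0))  # a Monday
weekend_afternoon = in_hot_hours(datetime(2014, 2, 8, 13, 0))  # a Saturday
```

Counts can then be tabulated by band and flag separately for treatment and control stops, and for the baseline and experimental periods.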

Second, we counted the number of victim-generated crimes reported to the Metropolitan Police Service, in the same manner described earlier for DIRs. Police-generated crimes—that is, crimes that are essentially police outputs rather than treatment outcomes—such as proactive searches for drugs offences, stop-and-searches, and traffic stops—were excluded from the data (Sherman and Weisburd 1995). Crime data were considered our primary outcome variable and DIRs as secondary outcome variables.

GPS Tracking Data

Every team was equipped with a hand-held GPS tracker that could track the movement of the officers at any given moment. The GPS trackers were used to measure how much time officers spent in particular areas (duration) and how many visits were made (frequency). Every tracker was set to transmit a ‘ping’, giving spatiotemporal coordinates (latitude, longitude and a timestamp) of the tracker (Wain and Ariel 2014). For the purposes of the experiment, the system was set to ping every 5 min. Importantly, the GPS back-office systems could be used to ‘geo-fence’ areas of land, and by counting how many ‘pings’ the trackers sent from within these geo-fenced areas, we were able to measure, with high precision and accuracy, how many visits each officer made to these geo-fenced areas and for how many minutes. This ‘point in polygon’ analysis was applied to all participating hotspots, which were geo-fenced in such a way as to allow us to accurately measure dosage delivery.
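The ping-counting logic described above can be illustrated with a minimal sketch. It is not the back-office system used in the trial: the fence is simplified to a 50 m circle around a stop rather than an arbitrary polygon, a "visit" is assumed to be an unbroken run of in-fence pings, and minutes are approximated as pings multiplied by the 5-min interval. The coordinates and the names `haversine_m` and `dosage` are hypothetical.

```python
import math
from datetime import datetime, timedelta

PING_INTERVAL_MIN = 5          # ping interval reported in the text
RADIUS_M = 50.0                # treatment-area radius reported in the text
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dosage(pings, stop, radius_m=RADIUS_M, interval_min=PING_INTERVAL_MIN):
    """Return (visits, minutes) for one tracker at one circular geo-fence.

    pings: time-ordered list of (timestamp, lat, lon) tuples.
    Assumption: a visit is a run of consecutive pings inside the fence.
    """
    visits, minutes, inside_prev = 0, 0, False
    for _, lat, lon in pings:
        inside = haversine_m(lat, lon, stop[0], stop[1]) <= radius_m
        if inside:
            minutes += interval_min
            if not inside_prev:
                visits += 1
        inside_prev = inside
    return visits, minutes


# Hypothetical example: two separate visits to one stop.
STOP = (51.5074, -0.1278)      # hypothetical bus-stop coordinates
NEAR = (51.5076, -0.1278)      # roughly 22 m from the stop
FAR = (51.5094, -0.1278)       # roughly 222 m from the stop
start = datetime(2014, 2, 3, 12, 0)
route = [NEAR, NEAR, STOP, FAR, FAR, NEAR, STOP]
example_pings = [
    (start + timedelta(minutes=PING_INTERVAL_MIN * i), lat, lon)
    for i, (lat, lon) in enumerate(route)
]
visits, minutes = dosage(example_pings, STOP)
print(visits, minutes)
```

With a 5-min ping interval, any stay shorter than the interval can be missed entirely, so counts of this kind are best read as approximate dosage rather than exact time on site.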

Statistical Procedure

Our DIR and victim-generated crime data consist of counts. Therefore, the most appropriate statistical analyses require either a Poisson or negative binomial regression framework. Our analyses incorporated 12 models across 102 total units (51 T and 51 C hotspots); each model looks at a different spatial or temporal unit of count data, across two data sets (DIRs and victim-generated crime data). However, under all models, there was suspicion of over-dispersion (see Cameron and Trivedi 2013), which is not uncommon in criminology (MacDonald and Lattimore 2010).

One way to fix this is to analyze the data using negative binomial models, which have the same mean structure as Poisson regression, but they have an extra parameter to model the over-dispersion (see Osgood 2000). Another way to address the over-dispersion is to apply a generalized linear model with adjusted Poisson models (McCullagh and Nelder 1989: pp. 124–135). In this procedure, an adjusted Poisson distribution is created using a Pearson Chi Square Scale Parameter Method, within the Generalized Linear Model. This procedure corrects for over- or under-dispersion in regression distributions, and by implication corrects the standard errors of the estimates. The standard errors of the parameter estimates are multiplied by the square root of the new scale statistic, making the statistical tests more conservative.
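The Pearson scale correction can be sketched by hand for the simplest possible case. The example below is a deliberate simplification, not the models fitted in the study: it assumes a Poisson model with a single treatment indicator and no baseline covariate, for which the fitted means are the group means and the standard error of the log rate ratio has a closed form. The counts and the function name `pearson_scaled_poisson` are hypothetical.

```python
import math


def pearson_scaled_poisson(treat_counts, control_counts):
    """Log rate ratio with naive and Pearson-scaled standard errors.

    Simplified model: log(mu) = b0 + b1 * group, no baseline covariate.
    Returns (b1, se_poisson, scale, se_adjusted).
    """
    n_t, n_c = len(treat_counts), len(control_counts)
    sum_t, sum_c = sum(treat_counts), sum(control_counts)
    mu_t, mu_c = sum_t / n_t, sum_c / n_c      # fitted means under this model

    b1 = math.log(mu_t / mu_c)                 # treatment coefficient
    se_poisson = math.sqrt(1 / sum_t + 1 / sum_c)

    # Pearson chi-square: squared residuals divided by the Poisson variance (= mean)
    chi2 = sum((y - mu_t) ** 2 / mu_t for y in treat_counts)
    chi2 += sum((y - mu_c) ** 2 / mu_c for y in control_counts)

    df = n_t + n_c - 2                         # observations minus parameters
    scale = chi2 / df                          # above 1 signals over-dispersion
    se_adjusted = se_poisson * math.sqrt(scale)
    return b1, se_poisson, scale, se_adjusted


# Hypothetical over-dispersed counts for eight hotspots per group.
treat = [0, 1, 0, 7, 2, 0, 5, 1]
control = [3, 0, 9, 1, 0, 6, 2, 3]
b1, se_poisson, scale, se_adjusted = pearson_scaled_poisson(treat, control)
print(round(b1, 3), round(se_poisson, 3), round(scale, 3), round(se_adjusted, 3))
```

The coefficient itself is unchanged by the correction; only the standard error is rescaled, which is why the adjusted tests are more conservative whenever the scale statistic exceeds 1.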

We explored the two models, yet when we compared them using the Bayesian Information Criterion (BIC) (Schwarz 1978), we found that the most appropriate functional form of the variance was that of the adjusted Poisson regression models: the differences ranged between −1 and 8 % in favor of the adjusted Poisson model. In some comparisons there were also signals of under-dispersion under the negative binomial models, which was another reason for us to choose the adjusted regression model in order to estimate the differences between experimental and control groups in terms of DIRs, and then in terms of victim-generated crime counts.

Group assignment [‘treatment’ (1)/‘control’ (0)] was the predictor, the baseline outcome data served as a covariate, and a Pearson scale parameter provided the over-dispersion correction, all within Generalized Linear Models. We present the estimated marginal means (for more on marginal means, see McCulloch et al. 2008), in order to report the mean responses for the treatment effect, adjusted for the baseline covariate in the model. We applied the same statistical method to estimate the treatment effect (a) in the vicinity of the bus stop, (b) within the 50–100 m catchment area, and (c) within the 100–150 m catchment area, both during the experimental temporal period and outside of these times. We conducted these analyses for both DIRs and crimes.
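To make the idea of an estimated marginal mean concrete, the following sketch evaluates a log-link model of the form described above at the mean baseline value, once for each group. Only the treatment coefficient (−0.469, reported in the Results for DIRs within 50 m) and the approximate baseline mean (4.1) come from the text; the intercept and baseline coefficient are hypothetical placeholders, so the two means themselves are illustrative and only their ratio is meaningful.

```python
import math

B_GROUP = -0.469        # treatment coefficient reported in the text (DIRs, 50 m)
BASELINE_MEAN = 4.1     # approximate mean baseline DIRs per hotspot, from the text
INTERCEPT = 0.60        # hypothetical: not reported in the text
B_BASELINE = 0.10       # hypothetical: not reported in the text


def marginal_mean(group, baseline=BASELINE_MEAN):
    """Expected count for a group, holding the baseline covariate at its mean."""
    linear_predictor = INTERCEPT + B_GROUP * group + B_BASELINE * baseline
    return math.exp(linear_predictor)   # inverse of the log link


emm_control = marginal_mean(group=0)
emm_treatment = marginal_mean(group=1)
ratio = emm_treatment / emm_control

print(round(emm_control, 2), round(emm_treatment, 2), round(ratio, 3))
```

Because the covariate is held at the same value in both groups, the intercept and baseline terms cancel in the ratio, which depends only on the treatment coefficient.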

Measures of Displacement

In order to account for the displacement or diffusion of benefits, we measured the number of DIRs, as well as the number of victim-generated crimes, that took place within contiguous radii expanding from each bus stop (50 m, 50–100 m, and 100–150 m), in both treatment and control conditions. Given our experimental design, the same type of statistical analysis was conducted for each area (see below), without resorting to more complicated measures (Guerette 2009). Following Kondo et al. (2015), we observed the aggregated counts of DIRs and crimes with difference-in-differences effects found in the regression models described above: a reduction in counts immediately surrounding the bus stops, but an increase in counts in catchment areas, would indicate a spatial displacement.
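The reading of displacement versus diffusion described above can be sketched with raw counts. This is a simplification of the regression-based estimates used in the study: it uses an unadjusted difference-in-differences on totals, half-open distance bands (an incident exactly 50 m away falls in the 50–100 m band), and hypothetical counts. The function names are illustrative only.

```python
BANDS_M = [(0, 50), (50, 100), (100, 150)]   # radii reported in the text


def count_by_band(distances_m):
    """Count incidents per distance band; incidents at 150 m or beyond are ignored."""
    counts = {band: 0 for band in BANDS_M}
    for d in distances_m:
        for lo, hi in BANDS_M:
            if lo <= d < hi:
                counts[(lo, hi)] += 1
                break
    return counts


def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in the treatment group minus change in the control group."""
    return (treat_post - treat_pre) - (control_post - control_pre)


def interpret(did_core, did_catchment):
    """Label the pattern implied by the core and catchment estimates."""
    if did_core < 0 and did_catchment > 0:
        return "spatial displacement"
    if did_core < 0 and did_catchment < 0:
        return "diffusion of benefits"
    return "no clear pattern"


# Hypothetical incident distances (metres) from one bus stop.
print(count_by_band([10, 49.9, 50, 120, 150, 300]))

# Hypothetical pre/post totals for the core (0-50 m) and one catchment band.
core = diff_in_diff(treat_pre=20, treat_post=12, control_pre=20, control_post=19)
catchment = diff_in_diff(treat_pre=15, treat_post=22, control_pre=15, control_post=16)
print(core, catchment, interpret(core, catchment))
```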

Statistical Power

A total of 102 target bus stops were used in this experiment. This created a study with medium statistical power. Statistical power was defined by Bornstein and Cohen (1988) as the probability of detecting a statistically significant outcome in an experiment, given the true difference between the treatment group and the control group. By using Optimal Design (Spybrook et al. 2009), we estimated that this sample size is large enough to detect effects of 0.5 at a significance level of 0.05, with non-directional hypotheses and an estimated power of 0.80.

Results

The Sample

The total numbers of DIRs and victim-generated crimes associated with treatment and control bus stops, before and during the experimental period, are shown in Table 1 below. The distribution of events at baseline values after random assignment (24 months prior to the study), for crimes and calls for service, is presented in the table as well. The table illustrates that while the threshold for ‘being a hotspot’ was set at 3 incidents in a 6-month period, the mean number of DIRs per hotspot at baseline was about 4.1. Table 1 also shows, by implication, the value of the random assignment: none of the pre-treatment between-groups comparisons were significant.

Table 1 Baseline DIRs and crime data (Feb–Jul 2013)—within three contiguous radii around the targets (treatment and control bus stops)

Manipulation Checks

The GPS data are presented next (Table 2). GPS data provide precise measures of dosage delivery, in both treatment and control conditions, as a way of proving the authenticity of an experiment. When the officers entered the geo-fenced areas (the hotspots), they left two types of ‘digital footprint’: the patrol time in the hotspots (measured in minutes), as well as the number of patrol visits per day, or total for the year of the RCT.

Table 2 GPS tracking data: 50 m, 50–100 m, 100–150 m around the target

Officers were tasked to make 3 × 15 min visits per day to all hotspots. The integrity of this delivery was maintained throughout the RCT. Overall, officers spent 19 min per bus stop (SD = 8.20) in the treatment hotspots, while in the control hotspots officers spent 0.08 min (SD = 0.24). Officers overdosed the hotspots with 3.8 visits per day (SD = 1.64), while almost never visiting the control hotspots (M = 0.02; SD = 0.04). These amount to 23,280 visits to the 51 treatment bus stops (M = 456.5; SD = 196.7) and 94 visits to the 51 control bus stops (M = 1.84; SD = 5.79). All between-group differences were statistically significant at the p ≤ 0.001 level, when measured using independent sample t-tests (see Table 2).
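As a rough check on the scale of these differences, a t statistic can be recomputed from the summary statistics quoted above. The sketch assumes 51 hotspots per group and uses the unpooled standard error; with equal group sizes this gives the same statistic as the pooled version. It is not a reproduction of Table 2, and the function name `t_from_summary` is hypothetical.

```python
import math

N_PER_GROUP = 51   # treatment and control hotspots, from the text


def t_from_summary(m1, sd1, m2, sd2, n1=N_PER_GROUP, n2=N_PER_GROUP):
    """Two-sample t statistic computed from group means and standard deviations."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / standard_error


# Means and SDs as quoted in the text (treatment first, control second).
t_minutes = t_from_summary(19.0, 8.20, 0.08, 0.24)
t_visits_per_day = t_from_summary(3.8, 1.64, 0.02, 0.04)
t_total_visits = t_from_summary(456.5, 196.7, 1.84, 5.79)

for label, t in [
    ("minutes", t_minutes),
    ("visits per day", t_visits_per_day),
    ("total visits", t_total_visits),
]:
    print(f"{label}: t = {t:.1f}")
```

All three statistics come out at roughly 16.5, far beyond the critical value for p ≤ 0.001 at this sample size, which is consistent with the significance levels reported above.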

The GPS data also indicate how much time officers spent in the catchment buffers beyond the 50 m radius from the bus stops. As shown (Table 2), while officers were specifically requested to stay within the vicinity of the bus stops, we suspect that operational necessities drove them to spend 11.6 min (SD = 6.12) in the 50–100 m buffer zones, and 6.5 min (SD = 6.66) in the 100–150 m buffer zones around the treatment bus stops. These unintended treatment deliveries (stepping outside the assigned treatment area boundaries) were reduced the further the buffers were from the epicenter, and these differences were statistically significant when measured using analysis of variance (F(50,2) = 44.970; p ≤ 0.001), with all subgroup pairwise comparisons using Tukey’s Honestly Significant Differences significant at the 0.001 level as well. Some time was spent by officers in the control catchment zones (Table 2), although only marginally and without significant differences between the three contiguous radii around the control bus stops (F(50,2) = 0.919; p ≥ 0.10).

Estimating the Treatment Effect

Table 3 shows the coefficient estimates (b) and their associated standard errors (SE) for up to 50 m around the bus stops, and for the 50–100 m and 100–150 m catchment areas. The table lists the factors, the baselines, and the intercepts for DIRs during the experimental time (Mondays through Fridays, between 12 PM and 8 PM) and in the remaining times (outside the experimental time), in order to test for a temporal displacement effect, and then the narrow time bands for victim-generated crimes.

Table 3 Difference-in-differences estimates of treatment effect on DIRs and victim-generated crimes, by radii and time (London): parameters, standard errors (SE)

As shown (Table 3), a police presence in the immediate vicinity of the bus stops significantly reduced the numbers of DIRs (b = −0.469; p ≤ 0.10), compared to control conditions. The treatment effect on DIRs carried through to the surrounding catchment area (50–100 m radius; b = −0.513; p ≤ 0.05), but was non-significant in the outer radius of 100–150 m, although in the same direction (b = −0.108; p ≥ 0.10).

Temporally, the treatment effect was in the hypothesized direction during out-of-hours as well, in the near vicinity of the bus stops (b = −0.348; p ≤ 0.10). This means that, compared to control conditions, the number of DIRs was significantly lower even on days and hours when the patrols were not conducted. However, a backfiring effect emerged in the 50–100 m radius, with significantly more DIRs reported during treatment conditions than control conditions (b = 0.387; p ≤ 0.01). As with the patrol hours and days, no apparent effect was detected for the 100–150 m radius outside the patrol days and times (b = 0.118; p ≥ 0.10).

In terms of victim-generated crimes, an overall backfiring effect was detected in all geographic comparisons. Under treatment conditions the number of crimes was significantly higher in the near vicinity of the bus stops (b = 0.225; p ≤ 0.10), higher between 50 and 100 m around the bus stops (b = 0.204; p ≤ 0.01), as well as in the outer catchment area of 100–150 m (b = 0.112; p ≤ 0.01).

Table 4 illustrates the findings in more accessible terms, with the estimated marginal means. These mean responses suggest significant overall 37 and 40 % reductions in DIRs in the 50-m vicinity and 50–100 m buffers during patrol days and hours. Conflicting findings emerge in the 50–100 m buffer areas when the officers did not patrol the hotspots, with a 47 % increase in DIRs compared to control conditions. In both temporal moments, the percent changes are not significant at the 100–150 m buffers (−10 and +13 %, respectively). The backfiring effect in terms of victim-generated crimes is shown again here, with 25, 23 and 11 % increases compared to control conditions, as we move away from the bus stops (50 m, 50–100 m and 100–150 m, respectively). In raw figures, we counted 4227 crimes in treatment conditions (all areas) and 3962 in control conditions, post random-assignment (265 additional crimes committed as a result of the intervention); 143 and 162 DIRs were recorded in treatment and control conditions, respectively, during patrol days and hours, and 664 and 636, respectively, when officers did not conduct patrols. Collectively (the two temporal moments), a marginal increase of 9 DIRs was recorded (807 during treatment and 798 during control conditions). These raw counts are shown in Figs. 2 and 3 below.
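The percentages above follow from the log-link coefficients quoted earlier: under a log link, a coefficient b corresponds to a proportional change of exp(b) − 1. The sketch below recomputes these figures from the coefficients and percentages given in the text, and checks the raw-count arithmetic. Small gaps of under one percentage point are expected, since the published percentages derive from the estimated marginal means in Table 4 and are rounded.

```python
import math

# (label, coefficient b from the text, per cent change from the text)
REPORTED = [
    ("DIRs, patrol hours, 0-50 m", -0.469, -37),
    ("DIRs, patrol hours, 50-100 m", -0.513, -40),
    ("DIRs, patrol hours, 100-150 m", -0.108, -10),
    ("DIRs, out of hours, 50-100 m", 0.387, 47),
    ("DIRs, out of hours, 100-150 m", 0.118, 13),
    ("Crimes, 0-50 m", 0.225, 25),
    ("Crimes, 50-100 m", 0.204, 23),
    ("Crimes, 100-150 m", 0.112, 11),
]


def pct_change(b):
    """Per cent change implied by a log-link coefficient."""
    return 100 * (math.exp(b) - 1)


computed = {label: pct_change(b) for label, b, _ in REPORTED}
for label, b, reported in REPORTED:
    print(f"{label}: b = {b:+.3f} -> {computed[label]:+.1f} % (reported {reported:+d} %)")

# Raw-count arithmetic quoted in the text.
extra_crimes = 4227 - 3962
dirs_treatment = 143 + 664
dirs_control = 162 + 636
print(extra_crimes, dirs_treatment, dirs_control, dirs_treatment - dirs_control)
```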

Table 4 Estimated marginal means, standard errors (SE) and per cent changes by radii and time

Fig. 2 Post-random assignment raw DIR counts

Fig. 3 Post-random assignment raw victim-generated crime counts

Discussion

In this experiment, we sought to replicate in England and Wales the findings published in US studies on the effect of hotspots policing on crime and disorder. While the study looked at unique settings (bus stops), the theoretical mechanisms behind deterrence theory were hypothesized to operate here as well. Following a six-month experimental intervention, our results suggest that we were able to repeat the same findings only partially: self-reported incidents by the bus drivers (DIRs) went down significantly by 37 % (p = 0.07) in the near vicinity of the bus stops (50 m), by 40 % in the 50–100 m catchment area (p = 0.04), and marginally and non-significantly in the farthest catchment (10 %; p = 0.66), compared to control conditions. However, victim-generated crimes, the primary outcome measured in previous experiments, increased by 25 % (p = 0.10) in the near vicinity, and by 23 % (p = 0.08) and 11 % (p ≤ 0.001) within the 50–100 m and 100–150 m catchment areas, respectively.

What explains the overall backfiring effect of the intervention, in terms of victim-generated crimes, and the evident divergence from existing evidence on hotspots policing? We offer an explanation, with direct implications for policy and future research. The findings illustrate the importance of bounded rationality in everyday policing: not sustaining the unpredictability that characterizes effective hotspot police patrols can lead offenders to accurately calculate a lower risk of apprehension.

Rational choice theory (Clarke and Cornish 1985) maintains that the entire process of crime is rooted in rationality. Everything from the decision to commit a crime, through the search and selection of a suitable opportunity, to the actual perpetration of the crime itself can be viewed as a process of rational decision making (La Vigne 2015). While the exact mechanism by which ‘rationality’ operates is still debated, human behavior is strongly directed towards economic gains, which requires some form of rational calculation, and criminal behavior is no different. The identification and selection of a target is fairly rational and guided by the offender’s (and victim’s) everyday activities (Cohen and Felson 1979). Whether the action will be successful or not, and whether and how likely it is that the offender will be apprehended, are all part of a rational cost-benefit analysis, which every one of us ‘computes’, regardless of how morally wrong the selected action might be.

A motivated offender is more likely to decide to commit a crime if the benefits outweigh the perceived costs. However, offender decision-making is not perfectly rational but is limited, or rather ‘bounded’ (Simon 1982), by the individual’s ability to process the information in their environment. Constrained by the processing limitations of the human mind, offenders will tend to rely on cues that help simplify the complex world around them (Kahneman 1973). They will use rules of thumb or ‘heuristics’ to simplify their target selection decision-making processes (Kahneman and Tversky 1973; Gialopsos and Carter 2015). Under conditions of uncertainty, with limited information about their situation, these heuristics simplify the decision of whether or not to offend.

We argue that the success of hotspots policing depends in large part on exploiting the condition of uncertainty under which offenders, with little information about their environment, decide whether to commit crime. Weisburd and Braga (2006) have observed from interviews with offenders arrested during a hotspots policing operation that they often did not have a “clearly defined understanding of the geographic scope of police activities” (p. 580). Offenders operating under bounded rationality assumed that “the crackdowns were not limited to the target areas but instead were part of a more general increase in police enforcement” (ibid.). With limited information on police deployment patterns, the offender will likely use police presence as a heuristic. When the police are present (or perceived to be present), offending is less attractive and therefore less probable. The presence of a police officer in the spatiotemporal vicinity of a suitable target is read through a simple rule of thumb: do not offend! Alternatively, when the police officer is not present, a heuristic approach is also employed: offend! (see further in Ariel et al. 2016).

The heuristic approach also explains the conditions under which hotspots policing would have no effect or, as in our case, even backfire. When the unpredictability of police patrols is not sustained, and rationality is no longer bounded by a lack of information, hotspots policing will inevitably be less effective. In this study, the police targeted a very small spatial area (bus stops) for a relatively long period of time (6 months), and their comings and goings were highly predictable: all the offenders had to do was to see the police leave on the bus to know that the risk of apprehension had substantially reduced. The search for targets is not disorganized. It follows specific paths with familiar awareness spaces in the vicinity of crime generators or crime attractors (Brantingham and Brantingham 1981; Brantingham et al. 1991; Brantingham and Brantingham 1993, 1995). Victimization will soon follow when the targets are unguarded (LaGrange 1999), particularly when the offender can predict, with a fairly high degree of accuracy, that the likelihood of apprehension by a capable guardian is low.

Limitations

There were four key limitations to this study. First, Driver Incident Reports were used as a proxy for crime because police-recorded crimes are not geocoded to non-addressable locations like bus stops. Often a crime at a bus stop will be geocoded to a property address like a shopping center, which may be surrounded by ten or more bus stops. We regarded Driver Incident Reports as a good proxy measure because a significant proportion result in a recorded crime and they can be matched to the nearest bus stops using route and direction of travel information. At the same time, in certain cases, this can produce a false positive, as the actual crime incident may have little to do with the bus stop to which the point was attributed. It is possible, though not evidenced, for an incident to occur when a bus is in motion but not be reported until the bus stops.

Second, we cannot accurately characterize the interventions, if any, that took place in the hotspots. Unlike Rosenfeld et al. (2014), we do not have data on the types of engagements that the officers undertook while on patrol, in either treatment or control conditions. Future research should look at these dimensions as well.

Third, our study is limited in time, without a follow-up period beyond the 6 months of the experiment. We do not know to what extent there is any residual deterrence (Sherman 1990), or nil effect beyond the experimental period. Future experiments need to focus on these temporal issues—and by this we echo a similar conclusion made by Smith and Clarke (2000:219) who observed that, particularly in the framework of crime and public transport—“what is needed are long-term programs of study carried out by criminologists knowledgeable about the transport environment who are able to use information about victimization, offender decision making, and the distribution of crime in order to address these problems”.

Fourth, the absence of any insights into offender decision making means that the heuristic approach under which we believe hotspots policing is effective remains hypothetical. The argument is logically convincing, but further research in this area is needed to test it.

Conclusions

This study has shown that targeted deployments to micro-places like bus stops can lead to an increase in crime because the risk of apprehension diminishes when police patrol patterns are in plain view of offenders. The knowledge that police activity is confined to a bus stop rather than part of a wider increase in enforcement indicates to the offender that the risk of apprehension is reduced in the immediately surrounding area.

Does this mean that hotspots policing in micro-places is ineffective? No. A recent study in Sacramento, California demonstrated that random police deployments could be effective at micro-spatial scales (Telep and Weisburd 2012); randomly rotating officers between treatment group street blocks resulted in significant overall reductions in both calls for service and Part 1 crime incidents. Random preventive patrols have been routinely described as an ineffective crime control measure over large areas (Kelling et al. 1974; Weisburd and Eck 2004; Telep and Weisburd 2012), but the rotating deployment of officers increases the unpredictability of enforcement at locations whose small size would otherwise make police behavior predictable.