1 Introduction

One hundred years after the 1918 Spanish flu pandemic, we are struggling to contain the inexorable spread of the novel coronavirus (SARS-CoV-2). In December 2019, a cluster of pneumonia cases of unknown etiology, some of which were connected to a food market, emerged in the city of Wuhan in the Chinese province of Hubei (WHO 2020a). On January 9, 2020, the Chinese health authorities and the World Health Organization (WHO 2020a) officially announced the discovery of a novel coronavirus, 2019-nCoV, now known as SARS-CoV-2 (Zhou et al. 2020; Zhu et al. 2020). The suspected food market had already been shut down on January 1, but with an increasing number of cases, the Chinese authorities put Wuhan and subsequently all 15 cities in Hubei province into a lockdown of unprecedented scale, affecting about 57 million people for over two months. Such an unparalleled public health response was likely influenced by the two previous novel coronavirus outbreaks: the severe acute respiratory syndrome coronavirus (SARS-CoV) in 2003 and the Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012, which had estimated case fatality ratios of 11% and 35%, respectively (WHO 2003, 2019). SARS-CoV-2 then spread globally and, faced with rapidly overburdened health systems, countries worldwide for the most part followed suit by imposing increasingly stringent confinement policies, leaving governments facing the challenge of balancing immediate health and economic costs (Anderson et al. 2020; Fisher and Wilder-Smith 2020; Gates 2020; McKee and Stuckler 2020). The implementation of such extreme and widespread lockdowns is unprecedented and yet, along with individual social distancing measures, these and other non-pharmaceutical interventions (NPIs) have a rich history in the public health response to infectious diseases. 
Despite our technological, scientific, and medical progress, we still have recourse to centuries old methods to mitigate against COVID-19 while awaiting the development of effective drugs and/or vaccines. As WHO Director-General Dr. Tedros stated: “we have a long way to go. This virus will be with us for a long time” and “the world cannot go back to the way things were. There must be a ‘new normal’—a world that is healthier, safer and better prepared” (WHO 2020b). In this chapter, we discuss how integration of a social sciences perspective into the more classical epidemiological approach to study infectious diseases could contribute to the development of a more efficient framework to mitigate against such large-scale outbreaks.

2 Public Health Intervention Measures

The current widely implemented NPIs include measures to reduce contact between individuals in an attempt to prevent any potential transmission at several scales: (1) a mass scale (sanitary cordons to prevent people leaving or entering the infected area); (2) an intra-city scale (closures of educational establishments, businesses, and restaurants); (3) a more familial level (banning of social gatherings and mixing between houses); and (4) an individual level (testing of incoming travellers, tracing of contacts of cases, isolation, and self-quarantine).

2.1 Testing, Track and Trace

Considerable attention has been paid to their relative utility and efficacy in reducing the disease burden and when to implement which approach—what is the best strategy. Robust testing capacity, prompt isolation of cases, and timely and effective tracing of individuals who have had contact with a known case and their subsequent quarantine, along with social distancing, are currently the main pillars of the European COVID-19 public health response (ECDC 2020a). Over the course of the COVID-19 pandemic, testing strategies across Europe focussed primarily on individuals with symptoms that were clinically compatible with SARS-CoV-2 infection. These included many symptoms widely shared by respiratory infections (fever, diarrhoea, cough, dyspnoea, headaches, and confusion), as well as unusual symptoms, notably odynophagia (pain when swallowing), ageusia (loss of sense of taste), and anosmia (loss of sense of smell). In addition, testing has been widely implemented for individuals in high-risk groups (for example, healthcare workers, staff and residents of long-term care facilities), in accordance with the ECDC case definition, surveillance strategy and long-term care facility surveillance guidance (ECDC 2020b, c, d). Quantifying the utility of a range of different testing, isolation, tracing, and physical distancing scenarios in the UK revealed that combined isolation and tracing strategies would reduce transmission more than mass testing or self-isolation alone (Kucharski et al. 2020). If combined with moderate physical distancing measures, self-isolation and contact tracing would be the most likely approach to achieve control (Peak et al. 2020). However, it has been estimated that a high proportion of cases would need to self-isolate and a very large proportion of their contacts to be successfully traced to reduce the basic reproductive number, R0 (the number of secondary infections from a single primary infection) to less than 1 in the absence of other measures. 
Although there is at present no global consensus, it is increasingly agreed, however, that contact tracing and testing is simply not possible when the epidemic is in full flow and that it can only be implemented when the number of infections is relatively low (Ferguson et al. 2020). Approaches that are more passive have been developed to aid in identifying contacts of cases in the form of phone apps (eHealth Network 2020). Attempts to encourage the general public to download and use phone apps have, however, had considerable teething problems and been met with little success and the number of asymptomatic infections throws into doubt to what extent trace and test is of use (Chidambaram et al. 2020). Testing of household members of cases is common, but random sampling from households, irrespective of whether a case was identified or not, has also been performed to assess the overall prevalence rates and estimate the extent to which infections are asymptomatic (Buitrago et al. 2020). One additional challenge in contact tracing is how quickly to test. The latency period (prior to detectable/transmissible virus in the individual) lasts 5–6 days (range 4–12 days) (Backer et al. 2020; Lauer et al. 2020; Petersen et al. 2020). Testing too early might therefore miss incubating infections. Crucial to the utility of isolation is knowledge of the latency and infectious periods, which consequently determine the length of the isolation period. Such information can only be gathered with experience, hence the initial uncertainty about the required period of self-isolation for individuals either coming from an infected area or having had contact with a confirmed case. Mass testing or population-wide testing has been carried out in many countries in several different ways. Testing of incoming travellers is common. 
Some countries have also set up drive-in testing centres for all, but with increasing numbers of cases, the logistics and cost of such an approach have rapidly become unviable and tests are now often restricted to those with COVID-like symptoms and contacts thereof. However, despite the logistical challenges, a pilot mass-testing approach was implemented in Liverpool, with the intention of rolling it out across the whole of England (Iacobucci 2020).

2.2 Quarantine

The isolation of infected individuals has been documented since biblical times and throughout the Middle Ages, prior to the development of germ theory. The most extreme example of individual isolation is that of Typhoid Mary, a typhoid superspreader (see below), who was arrested and spent the last 20 years of her life in a prison hospital. Long-term isolation of individuals infected with leprosy is well known, and enforced sanitary cordons around infected villages, towns and cities have been implemented since the fourteenth century when the first plague pandemic occurred (Porter 1998). Although often used interchangeably with cordon sanitaire (sanitary cordon), quarantine (derived from quarantena, the Venetian language form, meaning “40 days”) was first used as a measure of disease prevention related to the plague (Sehdev 2002). Individuals on arriving ships had to spend 30 days (later extended to 40) in specified isolation sites, waiting to see whether the symptoms of plague would develop. This practice was subsequently used in the later plague epidemics and in an attempt to ward off the yellow fever and cholera epidemics of the nineteenth century (Sehdev 2002). More recently, thousands of individuals were put into quarantine during the 2003 SARS epidemic and home-based quarantine was carried out during the Ebola epidemic. This form of individual isolation, based on testing for symptoms and/or contacts with known cases, is currently used for the COVID-19 pandemic, and its efficacy was demonstrated in the case of the quarantine on the Diamond Princess cruise ship. It is thought that an infected individual boarded the vessel from Hong Kong in late January, developed symptoms, and tested positive for SARS-CoV-2. On February 1, the ship was placed in quarantine and by February 15 there were 285 cases. At this point, the passengers were disembarked and placed in quarantine units before repatriation. 
The final case count exceeded 700 out of a total of 3,711 passengers and staff and, through testing, 18% of infections were identified as being without symptoms (that is, asymptomatic) or with few symptoms (paucisymptomatic). The contagiousness of the virus was estimated to be four times higher than that in Wuhan, at least before passengers were confined to their cabins (Rocklöv et al. 2020). This underlines how contact rates (and conversely quarantine) can have a very large impact upon the spread of the virus and lead to a superspreading event (see below).

3 Fundamental Epidemiological Parameters and Superspreaders

Arguably, the most important measure in infectious disease epidemiology is the basic reproductive number, R0: how many individuals will become infected by a single infected person (in a completely immune-naïve population). This gives a measure of the contagiousness of the pathogen and provides a metric to assess the efficacy of an intervention. Ideally, the aim of any intervention strategy is to reduce R to below one, at which point the pathogen would eventually die out. The R0 of SARS-CoV-2 has been estimated to be 2.5–4 in China (Li et al. 2020a; Wu et al. 2020), Italy (Remuzzi and Remuzzi 2020), France (Salje et al. 2020) and across Europe generally (Flaxman et al. 2020). This means that transmission would need to be reduced by more than 60% to reduce R below 1 (Petersen et al. 2020). A study in the UK estimated that the observed 74% reduction in number of contacts per person during lockdown enabled a reduction of R from the pre-lockdown value 2.6 to below 1 (Jarvis et al. 2020). However, much higher values of R have been estimated in Oman, UK, and through a re-analysis of Wuhan, with values reaching as high as six (Al Wahaibi et al. 2020; Dropkin et al. 2020; Sanche et al. 2020).
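
The arithmetic behind these thresholds can be sketched as follows. This is a minimal illustration, assuming that R scales in direct proportion to transmission (or to the number of contacts), which is a simplification; the function names are ours and do not come from the cited studies.

```python
def required_reduction(r0: float) -> float:
    """Fraction by which transmission must fall for R to drop to 1.

    Assumes R scales proportionally with transmission: R = R0 * (1 - reduction).
    """
    if r0 <= 1:
        return 0.0  # already at or below the threshold
    return 1.0 - 1.0 / r0


def reduced_r(r0: float, reduction: float) -> float:
    """R after cutting transmission (or contacts) by `reduction`."""
    return r0 * (1.0 - reduction)


# Range of R0 estimates quoted in the text
for r0 in (2.5, 4.0):
    print(f"R0 = {r0}: reduction needed = {required_reduction(r0):.0%}")

# UK example from the text: pre-lockdown R of 2.6, 74% fewer contacts
print(f"R after 74% fewer contacts: {reduced_r(2.6, 0.74):.2f}")
```

Under this simplification, R0 values of 2.5 and 4 require reductions in transmission of 60% and 75% respectively, and a 74% reduction in contacts brings an R of 2.6 to roughly 0.7, which is consistent with the statement that R fell below 1.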

Crucial parameters that determine R0 are the duration of an infection and the infectious period. Infectiousness is estimated to start at the onset of symptoms and last 5–6 days, although viral shedding in hospitalized patients has been found to last as much as three weeks (Petersen et al. 2020; Zhou et al. 2020). However, there are high numbers of asymptomatic/paucisymptomatic infections and little is known about their infectiousness or the duration of such infections. A single point estimate in households suggests that not only are asymptomatics 65% as infectious as symptomatic individuals, but also they tend to lead to other asymptomatic infections (Kucharski et al. 2020). This may reflect a lower viral load in asymptomatic infections, but could also be influenced by human genetics, as household members are related.

Importantly, whilst R0 gives an overall average of the contagiousness of the pathogen, the actual number of people infected by a single infected person will vary enormously. This can occur for a variety of reasons, whether due to individual innate characteristics (human genetics), viral load, or simply because some people have more contacts than others (Kucharski et al. 2020; Petersen et al. 2020). This latter is probably the most important feature determining pathogen transmission and likely influenced by age, behavioural gregariousness, type of work, and type of transport used. Thus while R0 is a useful metric, it belies the actual transmission structure. Understanding the contribution of superspreading individuals, events, or places to the global case count is essential for developing appropriate intervention strategies and assessing the utility of approaches targeting potential reservoirs of infections.

The role of superspreading events is well documented in the literature and plays an important but variable role in pathogen transmission (Lloyd-Smith et al. 2005). During the 2003 SARS epidemic, one index case led to 292 secondary cases living in the same block of flats, most likely transmitted through the ventilation system (Yu et al. 2004). Superspreading events during the current pandemic have been notable, with many types of social gatherings (music festivals, church services) generating a large number of COVID cases (Ghinai et al. 2020; Hamner et al. 2020). Smaller clusters also occur within households, as might be expected from exposure in an enclosed space, and are probably more common than thought due to the large number of asymptomatic cases. As previously found for SARS in 2003, nosocomial transmission is common, as is transmission in care homes (Leclerc et al. 2020). While such superspreading events have led to the banning of public events, the tendency for pathogens to generate clusters varies significantly according to the pathogen in question, suggesting that both human behaviour and the nature of the infection itself are important. The extent to which infections occur in clusters can be quantified by an aggregation parameter k, with small values (<1) indicating high clustering and values of 1 or larger an increasingly random distribution. For example, the k of SARS was estimated to be 0.16 and that of MERS 0.25, indicating clustering. Estimates of k for SARS-CoV-2 are currently uncertain, but may be of the same order of magnitude, with an estimated 10% of the cases contributing to 80% of the spread (Endo et al. 2020). This disproportionate contribution to viral spread has been backed up by a recent study of more than half a million people in India who were exposed to SARS-CoV-2 (Laxminarayan et al. 2020). The report suggests that the continued spread of the virus was driven by only a small percentage of those who became infected. 
An estimated 71% of infected individuals did not infect any of their contacts, while a mere 8% of infected individuals accounted for 60% of new infections. The same study found that close-contact transmission in the community was less important (2.6% transmission probability) than household transmission (9% probability). However, although high-risk contacts (those spending > 6 h in the same public transport as a case) did have a higher risk of infection than low-risk contacts, the difference was only marginal (10% vs 5% infection probability). Another study in the state of Georgia, US, estimated that 2% of infections generated 20% of new infections and that individuals under 60 years old contributed more to transmission than older individuals (Lau et al. 2020).
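
To illustrate how a small k translates into such skewed transmission, the sketch below assumes that the number of secondary cases per infected individual follows a negative binomial distribution with mean R0 and dispersion k, the formulation used by Lloyd-Smith et al. (2005). The values R0 = 2.5 and k = 0.1 are illustrative choices, not estimates taken from the studies cited above, and the function names are ours.

```python
def nb_pmf(r0: float, k: float, n_max: int = 2000) -> list:
    """Negative binomial probabilities P(0), ..., P(n_max) for the number of
    secondary cases, with mean r0 and dispersion (aggregation) parameter k."""
    q = r0 / (r0 + k)
    p = (k / (r0 + k)) ** k          # P(0): probability of infecting nobody
    pmf = [p]
    for n in range(n_max):
        p *= (n + k) / (n + 1) * q   # recurrence P(n+1) / P(n)
        pmf.append(p)
    return pmf


def share_from_top(r0: float, k: float, top_fraction: float) -> float:
    """Share of all secondary cases generated by the most infectious
    `top_fraction` of infected individuals."""
    pmf = nb_pmf(r0, k)
    remaining = top_fraction
    cases = 0.0
    # Walk down from the largest offspring counts until the quota is filled
    for n in range(len(pmf) - 1, 0, -1):
        take = min(pmf[n], remaining)
        cases += take * n
        remaining -= take
        if remaining <= 0:
            break
    return cases / r0


r0, k = 2.5, 0.1  # illustrative values (assumption)
print(f"Infect nobody: {nb_pmf(r0, k)[0]:.0%}")
print(f"Share of transmission from top 10%: {share_from_top(r0, k, 0.10):.0%}")
```

With these illustrative values, about 72% of infected individuals infect nobody and the most infectious 10% account for roughly 80% of transmission, which is of the same order as the figures reported above. Raising k towards 1 or beyond flattens the distribution and shrinks the share attributable to the top 10%.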

At local scales (that is, city, town, hospital, house), the significance of such heterogeneous infectiousness can be relatively easily understood and to some extent managed. Over a larger scale (county, region, country), however, it is less clear what this means for the spread of the pathogen and thus what measures should be taken.

4 Travel Restrictions and Mobility—The Value of Social Media and Mobile Phone Data

To this point, we have addressed how NPIs have been designed to slow the rate of viral transmission predominantly at a local level. One important feature of the response to the first wave was implementation of the sanitary cordon and placing draconian travel restrictions on the population to prevent the spread of the pathogen. The efficacy of this strategy lies in the timing of the implementation. The travel restrictions in Hubei have been estimated to have delayed the epidemic progression throughout China by only 3–5 days, probably because the virus had already been spread widely prior to the lockdown because of extensive travel linked to the Lunar New Year holiday (Gibbs et al. 2020). The sanitary cordon was likely therefore implemented too late and although human mobility data correlated well with the spatial spread of the pathogen prior to the travel restrictions, social distancing, and local measures to reduce contact became more important post lockdown (Kraemer et al. 2020). On a greater scale, however, the Hubei lockdown was estimated to have significantly delayed the spread of the virus internationally, where case importation was reduced by nearly 80% until mid-February (Chinazzi et al. 2020).

Key to understanding the efficacy of reduced contact intervention measures, whether travel restrictions or social distancing (including self-quarantine, reduced gatherings, stay-at-home orders), is being able to quantitatively measure the extent to which they are being adopted by the population and if there is variation in their adoption, whether by place (city size, rural vs. urban) or population strata (age, gender, economic status). The value of being able to quantitatively visualize mobility data at all geographical scales is many-fold and especially clear with respect to the COVID measures taken by governments. Mobility data and changes thereof can inform how people respond to imposed and imminent announcements of measures to be taken. For example, in many affected cities, there was a large efflux just before the announcement of the lockdown and particularly in those areas with highest rates (Pullano et al. 2020). In France, after the announcement of school closures, there was a considerable exodus from Ile-de-France to the countryside in neighbouring Normandy (Pullano et al. 2020). This represented the wealthier population with second homes or those with families in the countryside, with the intention to wait out the epidemic in more comfortable settings. Secondly, in near real time, the extent to which the lockdown is actually being effective can be observed and how the effects differ across geographical areas. Thirdly, the extent to which short and long (>100 km) distances moved can be differentiated: large distance movements will contribute to the spread of the epidemic across the region and country, whereas small distance movements will contribute to within-region epidemic levels. This in turn can inform about the potential flux of infectious individuals, in so far as the majority of the mobile working population will be part of the age range that is more likely to have subclinical infections.

One interesting example is Tokyo, Japan, where the legal system did not allow a mandatory lockdown. The government announced a non-compulsory remote working request at the end of February, followed quickly by closure of public schools. This led to a significant decrease in mobility, as estimated through mobile phone data, even though there was no mandatory lockdown (Yabe et al. 2020). Subsequently, on April 7, a state of emergency was announced, and this was followed by a rapid decrease in mobility both within Tokyo and over longer distances. However, calculations of R suggested that the value was already low (~0.3) prior to the announcement, raising the question of the added value of this announcement and the consequent public reaction. This would suggest a more nuanced response is warranted, although hindsight is of little value for decision-making. For this reason, worldwide, there was a tremendous effort to increase testing capacity to improve knowledge of the state of the epidemic and to develop app-based tools to improve the tracing and testing of case contacts.

In our hyper-connected world there is extensive use of mobile phones, internet providers, and social media, and there is currently much discussion on the use of such data to guide the public health response to COVID (Grantz et al. 2020). Within strict guidelines concerning personal privacy, several mobile phone operators, social networks and internet service providers across the world have offered their data at an aggregated level to be able to visualize patterns of mobility and contact (Buckee et al. 2020; Pollina and Busvine 2020; Romm et al. 2020). Several different types of data can be generated, enabling varying levels of mobility and contact rates to be addressed. Firstly, call data records (CDRs) are routinely collected by operators and inform on an individual’s movement (origin/destination) as indicated by proximity to the closest telephone tower. These capture the major components of human mobility: total trip distance (TTD), radius of gyration (RG) (the characteristic distance travelled by a person), and number of sites visited. Aggregated across a population, these data can reveal important features of larger-scale movement patterns and how the distribution of TTD, RG, and sites visited alters over time and place. Aggregated, anonymized, passively collected mobile phone data have previously been used in modelling many infectious diseases, including cholera (Bengtsson et al. 2015; Finger et al. 2016), dengue (Wesolowski et al. 2015), Ebola (Peak et al. 2018), malaria (Wesolowski et al. 2012) and measles (Wesolowski et al. 2018), and more recently SARS-CoV-2 (Kraemer et al. 2020). The disadvantages of such data are their local imprecision, potential population bias, and dependency on telephone tower densities (Wesolowski et al. 2016). Secondly, many phone applications generate GPS location data, which offer many of the same advantages as CDRs but with more spatial precision. 
The disadvantage of such data is the bias in the user population (Wesolowski et al. 2016). Facebook falls into this category. These first two types of data are useful for quantifying population mobility patterns at large geographical scales, predicting how the spread of a disease may occur, and tracking how the components of mobility change over time and place. The third type of data is Bluetooth data, which record connections between two Bluetooth users. This offers much more detail on fine-scale population clustering and how, for example, an intervention impacts upon social gatherings. Bluetooth data are clearly limited by user bias. Finally, many governments have endorsed opt-in applications that signal whether an individual has been in close proximity to a known infected case. This Bluetooth-dependent approach was intended to be of value for contact tracing but requires significant compliance by the population, and the extent to which it will work is debatable. The EU drafted a recommendation on April 8, 2020 to develop a common EU toolbox for the use of technology and data to combat and exit from the COVID-19 crisis, in particular concerning mobile applications and the use of anonymized mobility data (European Commission 2020). An eventual aim of linking apps from all European countries was to enable better management of COVID across Europe by accounting for inter-country population mobility. To date, 19 of 27 countries have developed an app; the remaining eight either have none or have one under development. Of all these existing or developing apps, 20 are potentially interoperable (can communicate with apps from other EU countries), but only six currently are. The extent to which a Europe-wide system will actually be feasible before the end of the current pandemic remains to be seen, but at least the infrastructure will have been developed for future extreme events.
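
The mobility components mentioned above (TTD, RG, and number of sites visited) can be computed from a sequence of tower locations along the following lines. This is a minimal sketch under stated assumptions: positions are given as planar (x, y) coordinates in kilometres, whereas real CDR data would carry latitude and longitude and need great-circle distances, and RG is taken as the root-mean-square distance of the recorded positions from their centre of mass. The function name and input format are hypothetical.

```python
import math


def mobility_metrics(trajectory):
    """Summarise one user's movement.

    trajectory: ordered list of (x_km, y_km) tower positions, one per record.
    Returns (total trip distance, radius of gyration, number of distinct sites).
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one position")

    # Total trip distance: sum of straight-line hops between consecutive records
    ttd = sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))

    # Centre of mass of all recorded positions
    cx = sum(x for x, _ in trajectory) / len(trajectory)
    cy = sum(y for _, y in trajectory) / len(trajectory)

    # Radius of gyration: root-mean-square distance from the centre of mass
    rg = math.sqrt(
        sum(math.dist(p, (cx, cy)) ** 2 for p in trajectory) / len(trajectory)
    )

    sites = len(set(trajectory))
    return ttd, rg, sites


# Hypothetical user recorded at two towers 5 km apart
print(mobility_metrics([(0.0, 0.0), (3.0, 4.0)]))
```

Computed per user and then aggregated, the distributions of these three quantities can be compared before and after an intervention without any individual trajectory being released.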

With so much quantitative data on movement and available case data on COVID infections from many countries, it should be possible to analyze the actual importance of reductions in mobilities on the spread of infection and whether light or hard lockdown should be implemented and under what circumstances (Benkimoun et al. 2020; Telle et al. 2021).

5 Quantifying the Impact of Travel Restrictions Using Facebook—Worked Examples

The majority of studies to date that have used mobility data have focussed on changing patterns of human mobility associated with confinement measures and a few have attempted to correlate this with alterations in key epidemiological parameters such as R0. While useful, these approaches are indirect and yet it is possible to generate a much-improved understanding on the impact of human mobility on the spread and extent of the epidemic at geographical scales pertinent to public health responses. In the majority of countries worldwide, while there is general guidance by the central government to shape national health policies, governance and implementation of health measures is taken at the local scale, at an intermediary geographical scale. For the most part, however, government advisory panels at best categorize local administrative units into severity scales based on current numbers of infections and stress on the local health system without recourse to local inter-administrative unit mobility: that is, the classification of local epidemic severity and resilience of the local health system is neither dynamic nor predictive. This is surprising as we know that a significant proportion of infections are asymptomatic/paucisymptomatic, people will move across borders and thus will carry the infection with them. Although local infections will contribute a substantial reservoir of infection, their importance relative to an influx of immigrating infections needs to be addressed and on a regular basis as it will certainly alter over the course of the epidemic at the global and local scale. Thus, here we address this across several countries that imposed differing levels of confinement and assess the added value of inclusion of mobility data in explaining and predicting the epidemic dynamics at the local scale.

5.1 Methods

Using data from the “Facebook data for Good” program, which provides data on all users of the application, we first assessed the impact of the lockdown strategy on mobility patterns among administrative units in four different countries: Colombia, France, the US (west coast), and Sweden. These countries were chosen to cover the range of degrees of lockdown enforced, from none (Sweden) to mild (US) and increasingly rigid (France and then Colombia). The administrative units in these countries are county (Sweden, US), department (France), and municipio (Colombia). We then assessed the contribution of these mobility fluxes to the diffusion of SARS-CoV-2 among administrative units of the same territories.

Case data for the US, Sweden, and Colombia, and positive viral tests for France, at the appropriate administrative scales and for the dates for which Facebook data were available, were retrieved from national public health data websites. All mobility data were fully anonymized and no individually identifiable information was provided. Facebook movements from one administrative unit to another were then summed by week for every administrative unit. The number of cases in the unit of origin was then summed for the same week and multiplied by the summed Facebook movements from unit of origin to unit of destination. This generated a potential incoming flux of infection (FoI) for every unit of destination from every unit of origin. These values were then log-transformed and a statistical regression analysis (log-linear Poisson regression) was performed to assess the added value, on top of the case number recorded in each locality the previous week, of including such an influx on the number of cases or infections in the unit of destination the following week. In addition, because one might expect viral import to be more important at the start of the epidemic and local case number to be more important once the virus was seeded in a place, we modelled the interaction between the two variables (FoI and case number) on the number of cases the following week. The log(e) of the administrative unit was used in the analysis as an offset term. Thus the fitted regression model was:

$$\log_{e}\left(Y_{t}\right) = \beta_{0} + \beta_{1}\,\text{Cases}_{t-1} + \beta_{2}\,\text{FoI}_{t-1} + \beta_{3}\left(\text{Cases}_{t-1} \times \text{FoI}_{t-1}\right) + \varepsilon$$
(1)

where Y is the number of cases, subscript t is time (here week) and FoI the incoming flux of infection.
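
The construction of the FoI covariate and the linear predictor of Eq. (1) can be sketched as below. Several details are assumptions made for illustration and are not stated in the text: the flux into a destination is summed over all units of origin, within-unit movements are excluded, log(1 + x) is used so that a zero flux remains defined, and the unit names, coefficient values, and function names are hypothetical. The offset term and the model fitting itself are omitted.

```python
import math
from collections import defaultdict


def incoming_foi(movements, cases):
    """Log-transformed incoming flux of infection per destination for one week.

    movements: {(origin, destination): summed Facebook movements that week}
    cases:     {unit: cases recorded in that unit the same week}
    """
    flux = defaultdict(float)
    for (origin, dest), n_moves in movements.items():
        if origin == dest:
            continue  # assumption: within-unit movement is not viral import
        flux[dest] += n_moves * cases.get(origin, 0)
    # assumption: log(1 + x) so that units with zero incoming flux stay defined
    return {dest: math.log1p(total) for dest, total in flux.items()}


def linear_predictor(b0, b1, b2, b3, cases_prev, foi_prev):
    """Right-hand side of Eq. (1) without the error term: expected log case count."""
    return b0 + b1 * cases_prev + b2 * foi_prev + b3 * cases_prev * foi_prev


# Hypothetical week with three units A, B, C
movements = {("A", "B"): 10, ("C", "B"): 5, ("B", "A"): 2}
cases = {"A": 3, "B": 0, "C": 4}
foi = incoming_foi(movements, cases)
print(foi)

# Hypothetical coefficients, for illustration only
print(math.exp(linear_predictor(0.1, 0.02, 0.3, -0.001, cases["B"], foi["B"])))
```

In practice the coefficients would be estimated with a Poisson regression routine from a statistics package, fitted separately for each week or period of interest.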

To quantify the contributions of these variables to COVID cases we estimate the increased relative risk associated with increased FoI. Relative risk is the ratio of the probability of COVID incidence in an exposed group to the probability of incidence of COVID in an unexposed group. Here exposure and non-exposure is a continuous scale determined by incoming flux of virus to an area. Thus, more precisely we are assessing how case number is influenced by different levels of exposure as defined by the incoming flux of infection. Relative risk measures how much increased risk of COVID is associated with a unit increase in the incoming flux of infection (note that incoming flux was log transformed and thus we are assessing the increased risk associated with an increase of one log of the flux of infection). For example, a relative risk of 2 means twice the incidence rate of COVID in an area that has one log increase in the incoming flux of infection. We also calculate the percentage of variation in COVID cases explained by the variables in the regression to provide an additional measure of the contribution of each of the variables.
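
In a regression with a log link such as Eq. (1), the relative risk associated with a one-unit increase in a covariate is the exponential of its coefficient. The sketch below shows this conversion, including how a negative interaction term lowers the relative risk of the FoI as the local case number grows. The coefficient values are hypothetical and chosen only to reproduce the example of a relative risk of 2; `cases_prev` is on whatever scale the case number entered the model.

```python
import math


def relative_risk(beta_foi, beta_interaction=0.0, cases_prev=0.0):
    """Relative risk for a one-log-unit increase in incoming flux of infection.

    With an interaction term, the effect of FoI depends on the previous
    week's local case number: RR = exp(beta_foi + beta_interaction * cases_prev).
    """
    return math.exp(beta_foi + beta_interaction * cases_prev)


beta_foi = math.log(2)  # hypothetical coefficient giving a relative risk of 2
print(relative_risk(beta_foi))

# Hypothetical negative interaction: import matters less where cases are already high
for cases_prev in (0, 10, 50):
    print(cases_prev, round(relative_risk(beta_foi, -0.01, cases_prev), 2))
```

A relative risk that shrinks towards 1 as `cases_prev` increases is what one would expect if viral import matters most in places where the virus has not yet been seeded.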

5.2 Results

5.2.1 Colombia

On March 6, 2020, Colombia recorded its first case in an individual who had recently returned from Italy (minsalud.gov.co). Over the next week, 15 more cases were detected in several of the largest cities throughout the country and public health authorities declared a health emergency, suspending all public events involving more than 500 people (Reuters 2020a). Days later, educational establishments were shut and a state of national emergency was announced. A week later, on March 24, a national lockdown was announced and this was repeatedly extended until August 1. Following lockdown, mobility, as discerned through inter-administrative unit Facebook data, initially decreased by 60–70% relative to pre-lockdown values before steadily rising again to only a 30% decrease by the epidemic peak (Fig. 1a). During the period March 16 until July 27, Colombia recorded 324,451 cases, peaking in the week starting July 13 with 55,936 cases. The number of affected municipios (among those for which Facebook data were also available, that is, 816 of the total of 1123) increased gradually during the initial phases of the epidemic (March 15 until May 10), at which point the number increased more rapidly before accelerating from June 14 up to the peak in the week of July 6 (Fig. 1b). This corresponds to the period when mobility was on the increase again. The peak number of affected municipios occurred one week before the peak in cases. The previous week’s case number explained the majority (30–70%) of the variation in the current week’s case number among municipios. However, the FoI initially also explained a considerable amount of the variation in unit case number (5–15%), before steadily decreasing to become insignificant during the middle weeks of the epidemic (May 25 to June 22) and then increasing again to explain 1–8%. 
This is reflected in the significant relative risk associated with this FoI at the end of the epidemic when mobility was on the increase again and the virus being increasingly spread throughout the country (Fig. 1b). In addition, there was strongly significant risk associated with the FoI at the beginning of the epidemic. Coupled with a small but statistically significant negative interaction with the local case number, this would suggest that viral import, as estimated through Facebook mobility patterns, was more significant in explaining the local case number when the case number in the same municipio the previous week was small. This is thus consistent with the idea that viral spread is being captured by Facebook mobility, but once seeded the virus rapidly expands locally. Surprisingly, however, despite the virus being widely distributed in many hundreds of municipios towards the end of the epidemic, the FoI still remained an important factor. That is, viral import was still contributing a substantial fraction of the observed cases.
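The effect of a negative interaction can be illustrated with a short sketch. It assumes, purely for illustration, a log-link model whose linear predictor contains a main effect of log flux and an interaction between log flux and the log of the previous week's local case number. The model form, the function name `flux_relative_risk`, and the coefficient values are all hypothetical and are not the fitted values from this analysis.

```python
import math

def flux_relative_risk(beta_flux, beta_interaction, local_cases):
    """RR for a one log unit increase in flux, given last week's local cases.

    Assumes the linear predictor contains
    beta_flux * log(flux) + beta_interaction * log(flux) * log(1 + local_cases),
    which is an illustrative model form, not the chapter's specification.
    """
    return math.exp(beta_flux + beta_interaction * math.log1p(local_cases))

# Hypothetical coefficients: positive main effect, small negative interaction.
for cases in (0, 10, 100, 1000):
    print(cases, round(flux_relative_risk(0.7, -0.08, cases), 2))
```

With a negative interaction coefficient, the relative risk attached to incoming flux shrinks as the local case number grows, which is the pattern described above: import matters most where few local cases are present.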

Fig. 1

a Total Facebook inter-municipio mobilities (blue line) and number of cases (orange line) by week in Colombia b Total number of weekly cases (orange line), number of municipios affected (grey line) out of the 816 municipios and the relative risk (+95% confidence intervals) associated with a log unit increase in Facebook incoming flux of infection (blue bars) in Colombia. See Sect. 5.1 Methods for details

5.2.2 France

In France, the government imposed the first national lockdown on March 17, which lasted until May 11. The impact of lockdown on mobility was high, with a 70–80% decrease over the entirety of this period (Fig. 2a). From March 18 until July 28, there were 68,652 positive tests out of over three million tests performed. The number of positive tests peaked in the week of March 25, even though the number of tests performed increased ten-fold from May 27. Although the virus was initially identified on French soil in Chinese tourists on January 24, retrospective testing of pneumonia cases found one individual positive on December 27, 2019. During February, the number of clusters of cases increased rapidly, and during the first week of the lockdown 51 departments recorded positive cases despite a low level of testing. The number of departments recording positive cases every week escalated rapidly to include almost every one of the 94 metropolitan departments by the beginning of April. The FoI explained nearly 20% of the variation in positive test number in the first week of lockdown, and its explanatory power decreased only very slowly during the lockdown before once again attaining 10% post-lockdown. Thus, despite the fact that the virus was already established throughout the country, the FoI explained a considerable amount of variation in positive test number, irrespective of the number of tests being performed. The importance of viral import is underlined by the estimated relative risk associated with this FoI, which remained significantly above 1 throughout the entire study period (Fig. 2b). Once again, there was a significant and negative interaction effect between FoI and department infection number on the number of infections in the following week. This would again suggest that although the virus was widespread, viral import was still playing a major role, especially where and when infection numbers were low.

Fig. 2

a Total Facebook inter-department mobilities (blue line) and number of positive tests (orange line) by week in France b Total number of weekly positive tests ×100 (orange line), number of departments affected out of the 94 departments (grey line) and the relative risk (+95% confidence intervals) associated with a log unit increase in Facebook incoming flux of infection (blue bars) in France. See Sect. 5.1 Methods for details

5.2.3 US (West Coast)

In the US states of Arizona, California, Idaho, Montana, Nevada, Oregon, Utah, and Washington, from March 10 until July 21, there were over one million cases, with the epidemic showing no sign of slowing down. Of the 3194 counties in these states for which Facebook data were available, cases were detected in 461 counties in the week starting March 1, before rapidly rising within a few weeks to cover over 2000 counties. This number remained relatively stable until June 1, with over 2300 counties detecting cases regularly, and reached 2900 counties affected by July 14. Lockdown was a very patchy affair in the US, with local officials often delivering stay-at-home orders even prior to any individual state responses. California issued such orders state-wide before March 22; Idaho, Montana, Oregon, and Washington followed suit before March 29, and Nevada and Arizona by April 12. Utah did not issue a state-wide order, but Salt Lake City, the capital, did. Patterns of mobility thus varied substantially across the region, but overall mobility initially decreased by 30–50% relative to pre-lockdown values before steadily rising again to only a 30% decrease by the epidemic peak in the month of May across the whole of the region (Fig. 3a). The majority of the variation in the current month’s number of cases was explained by the previous month case number (50–80%). However, again the FoI also explained a considerable amount of the variation in county case number. After initially explaining 15% of the variation, the explanatory role of the FoI dropped and stabilized at ~6% during the months of May and early June, when mobility was reduced. However, as mobility increased again, so did the contribution of FoI. This reiterates the important role that mobility is playing in viral import despite the virus already circulating locally. 
Likewise, after an initial drop, the relative risk associated with this FoI increased again to pre-lockdown levels and beyond, especially when the weekly case number was decreasing during the mid-part of the epidemic (Fig. 3b). This, coupled with the negative interaction effect as described above, reinforces the hypothesis that the incoming virus is of most significance when the local case number is low.

Fig. 3

a Total Facebook inter-county mobilities (blue line) and cases (orange line) by week in the US b Total number of weekly cases ×100 (orange line), number of counties affected out of the 3194 counties (grey line) and the relative risk (+95% confidence intervals) associated with a log unit increase in Facebook incoming flux of infection (blue bars) in US. See Sect. 5.1 Methods for details

5.2.4 Sweden

Sweden is a country apart in that no lockdown was imposed, although higher educational establishments were shut and people were encouraged, when possible, to work from home. Sweden is sub-divided into 290 municipalities, but only ten municipalities house almost 30% of the total population. The first case of COVID was detected on January 31. Subsequently, from April 13 until July 20, there were 66,184 cases of COVID. The epidemic curve in Sweden was much flatter, and remained high for longer, than that of France, which would be expected in the absence of lockdown, itself evident from the absence of any decrease in mobilities (Fig. 4a, b). Intriguingly, there was no association of FoI with case number at any time during the epidemic. In addition, there was no interaction (between weekly case number and FoI) as seen in the other three countries. By contrast, the previous week’s cases explained 70–80% of all the variance, in line with findings from the other countries. The absence of any mobility associations with case number may be a consequence of the virus already being homogeneously spread throughout the country. However, the incidence rate distribution across Sweden suggests this not to be the case (Fig. 5), although the high proportion of asymptomatic infections may generate a deceptive picture, and age-specific hospitalization data may be more informative as to the true distribution of the virus (O’Driscoll et al. 2020). Interestingly, a serological survey carried out at the end of April found that only 7% of the population carried antibodies, a number similar to that estimated elsewhere (Reuters 2020b; Salje et al. 2020). Achieving such low levels of contamination without lockdown has been suggested to reflect a natural tendency in Swedes to reduce social contacts despite no imposed regulations, akin to observed behavioural changes in Tokyo (Yabe et al. 2020). 
While this would contribute to local viral transmission, there was no noticeable change in Facebook mobility at the inter-municipality level suggesting no significant reduction in population movements at this scale.

Fig. 4

a Total Facebook inter-county mobilities (blue line) and cases (orange line) by week in Sweden b Total number of weekly cases ×100 (orange line), number of counties affected out of the 22 counties (grey line) and the relative risk (+95% confidence intervals) associated with a log unit increase in Facebook incoming flux of infection (blue bars) in Sweden. See Sect. 5.1 Methods for details

Fig. 5

Source https://fr.wikipedia.org/wiki/Pand%C3%A9mie_de_Covid-19_en_Su%C3%A8de#cite_note-FHM_Official_stats-1

Covid-19 incidence rate per 100,000 individuals per county in Sweden.

5.3 Discussion

Across three of the four study countries there was a significant relative risk associated with population mobility, despite quite different degrees of confinement. In all three cases, this mobility-associated risk remained relatively stable over time. Only in Colombia was there some evidence, after more than two months of lockdown, that mobility patterns no longer contributed to increased risk of case number. Although not directly comparable, because of differing case definitions, the stability and similarity of the relative risk over time within the US and France, with intermediate vs. high levels of lockdown, would suggest that a mass quarantine strategy has to be very rigorously applied for it to work. A priori, crude comparison of countries with very different societal structures and pre-COVID mobility patterns would not be expected to generate such similar findings. Two features of the mobility data may help to explain them. Firstly, pre-COVID mobility patterns were highly concentrated in administrative units neighbouring the large cities. This was especially the case for Colombia. Upon lockdown in France, the US, and Colombia, the largest proportion of the decrease in mobilities occurred at longer geographical scales, while local movement among neighbouring units decreased far less. Such local-scale movement would thus generate a forest-fire-type signature of viral seeding and expansion among neighbours. Secondly, measuring overall reductions in mobility masks the true nature of the change in population mobility. It is likely that a disproportionate number of the individuals who maintained their mobility patterns among neighbouring units were actively involved in commercial activities and had a higher average number of contacts per person than the general population. Thus, although not superspreaders per se, such individuals would create the opportunity for viral spread. 
This then raises the question of the utility of imposing lockdown on the general population when the vast majority, if using local social distancing measures, would have few contacts and contribute little to viral spread.

There are clear limitations to the utility of Facebook data and inter-country comparisons. Firstly, Facebook users may not be representative of the general population. Children and the elderly use Facebook less but are more likely to have reduced mobility at all times. The adult working population are thus more likely to be Facebook users and more likely to travel at all times, including as essential workers permitted to travel during lockdown. In so far as the epidemiological role of children is now believed to be less than that of adults (Li et al. 2020b), the Facebook data may actually give a more representative picture of the flow of the epidemiologically important sector of the population. Secondly, different countries use Facebook to varying extents. Rough estimates, based simply on the total number of inter-administrative unit mobilities pre-lockdown divided by the total population, generate values of 1 in 7 for the US, 1 in 20 for France, 1 in 100 for Sweden, and 1 in 1000 for Colombia. This reflects many differences between the countries, including not only actual Facebook use but also how society is structured. For Colombia and Sweden, for example, most activity is always focussed at a very local scale, directly around the cities. This is less the case in the US and France. That being said, it is remarkable that such a simple measure of mobility yields consistent results with respect to risk attributed to mobility across three very different settings. Finally, case and/or infection data will always be subject to uncertainty, even within the same country, because of changes in case definitions and extent of testing. Such under- or mis-reporting would impact upon the total number of cases/infections. However, over time within France, the US, and Colombia, the contribution of mobility to local case count remained relatively stable despite increased overall incidence rate, which would suggest that mis-reporting is not a significant issue.

6 Incorporating Mobilities for Public Health Preparedness and Response

In contrast to the first wave, European governments have now implemented less draconian measures, classifying administrative regions (for example departments in France and boroughs in the UK) into levels or tiers of vulnerability. These tiers are based on epidemiological indicators, including overall incidence rates, case counts specifically in the elderly population, hospital bed and intensive care unit occupancy, and the predicted increase/decrease of the epidemic curve. The epidemiological models currently used for predicting how incidence rate and the other indicators will evolve are based on classical models for respiratory viruses, largely developed for influenza and fine-tuned during the first wave of the pandemic. While these models have generally been successful in estimating the dynamics of the epidemic, ignoring the importance of inter-regional population flux will lead to poor predictive power in the short term. This in turn will lead to administrative units rapidly having to alter their severity level, thereby unnecessarily creating confusion and potential societal unrest. In light of our findings that incorporating mobility significantly improves the statistical model fit, its inclusion in predictive models is warranted. One way of approaching this is through a metapopulation approach.

Metapopulation theory considers populations as patches of populations that are loosely connected through migrational events, where populations are locally governed by intrinsic population dynamics but which are globally connected. At the metapopulation scale, local patches are colonized at a rate governed by immigration of the invasive organism, but which then may become extinct locally. Metapopulation approaches, originally developed for ecological systems, have been applied to infectious diseases (Grenfell and Harwood 1997; Teissier et al. 2020). A key feature of the metapopulation approach is that while local populations of hosts and pathogens may be unstable, with for example local extinction, loosely connected populations, at whatever scale, may enable pathogen survival at the global scale. Theoretical work, to some extent supported by empirical data, has highlighted several key requisites for a metapopulation approach to be useful: notably a small (ish) R0 and a short period of infectiousness. SARS-CoV-2 fulfils these two criteria. During the epidemic phase within a local population, the incidence rate will be largely governed by deterministic dynamics that can be estimated using classical susceptible-infectious-resistant (SIR) models and the influence of the metapopulation is small. Subsequently, there will be a density-dependent deceleration in the number of new cases and eventually a decline, which may or may not lead to local pathogen extinction. Persistence of the pathogen will be strongly affected by stochastic factors, notably the number of susceptible hosts, as well as the connectedness to other populations and the consequent influx of new pathogens. Metapopulation models explicitly take into account space, allowing for the investigation of the effect that spatial heterogeneity, such as environmental stochasticity and change, has on the dynamics of the system. 
Recent investigations of complex social networks indicate that physical distances do not always map well onto the actual contacts among individuals, and distant populations can exhibit stronger interactions than expected. Therefore, complementary to deterministic modelling approaches and with the recognition of the importance of spatial structure in viral transmission, metapopulation models (of coupled SIRs) offer a highly appropriate method to model SARS-CoV-2 dynamics at a country scale. Thanks to the real-time availability of Facebook and potentially other social media data, such modelling can be performed in real time. Perhaps more importantly, because we have now accumulated data from the start of the epidemic, we have gained knowledge as to how mobilities actually change under specific conditions, and thus these fluxes can be used predictively.
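To make the idea of coupled SIRs concrete, the following is a minimal sketch of a deterministic, discrete-time metapopulation SIR model in which patches are linked by a mixing matrix. It is an illustration only: the function name `simulate_metapop_sir`, the parameter values, and the three-patch mixing matrix are hypothetical, the matrix is not derived from Facebook data, and a model intended for prediction would also need stochasticity to capture local extinction and re-seeding.

```python
import numpy as np

def simulate_metapop_sir(N, I0, M, beta=0.3, gamma=0.1, days=200):
    """Discrete-time (daily) deterministic SIR on coupled patches.

    N  : resident population of each patch
    I0 : initial number of infectious residents in each patch
    M  : row-stochastic mixing matrix; M[i, j] is the fraction of contacts
         that residents of patch i make in patch j (hypothetical input,
         e.g. something that could be derived from a mobility matrix)
    """
    N = np.asarray(N, dtype=float)
    I = np.asarray(I0, dtype=float)
    M = np.asarray(M, dtype=float)
    S = N - I
    R = np.zeros_like(N)
    history = [I.copy()]
    for _ in range(days):
        # People, and infectious people, effectively present in each patch.
        present_pop = M.T @ N
        present_inf = M.T @ I
        prevalence = present_inf / present_pop
        # Infection pressure on residents of patch i, over the places they mix.
        pressure = beta * (M @ prevalence)
        new_inf = S * (1.0 - np.exp(-pressure))
        new_rec = I * (1.0 - np.exp(-gamma))
        S = S - new_inf
        I = I + new_inf - new_rec
        R = R + new_rec
        history.append(I.copy())
    return np.array(history), S, R

# Three patches in a chain: patch 2 is linked to patch 0 only through patch 1.
M = np.array([[0.95, 0.05, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.05, 0.95]])
history, S, R = simulate_metapop_sir(N=[1e5, 5e4, 2e4], I0=[10, 0, 0], M=M)
print("day of peak prevalence per patch:", history.argmax(axis=0))
```

In this sketch the off-diagonal entries of `M` play the role of inter-unit mobility: setting them to zero isolates the patches, so an unseeded patch never records a case, whereas even weak coupling is enough to seed it.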

7 Concluding Remarks

The increasing amount of mobile phone and social media data is providing us with an ever more detailed description of human mobilities. Incorporating such information into models of infectious diseases will vastly improve our capacity to identify sources and pathways of the spread of infection. This in turn will enable a much more targeted approach for implementing meso-scale interventions without the need to impose blanket lockdowns. Although perhaps too late for the current pandemic, generating nationwide matrices describing mobility patterns will serve as frameworks within which the spread of other pathogens can be addressed, enabling predictability for preparedness and intervention.