Introduction

Problem of tracking the fracture growth as a possible way for fluid migration in rock mass is of scientific-wide concern in studies dealing with the exploitation of geo-resources. There are two aspects of this interest: optimizing technological production (e.g. Al-Dhamen et al. 2013; Warpinski 2013) and assessing the environmental impact of the production (e.g. Chen et al. 2014; Zhang and Yang 2015, SHEER project). The new developing technologies of geo-engineering operations, e.g. hydraulic fracturing, geothermal energy production by water injection and underground wastewater storages, have increased the concerns and debates on their potential to induce seismic activity, to reactivate tectonically preloaded faults or to contaminate the groundwater and pollute the air. The mentioned technologies usually involve transmission of fluid pressure to create new cracks or to open existing cracks in rocks and open permeability paths to exploit the trapped gas, oil or geothermal energy. Most of the possible environmental impacts and risks are linked to undesired evolution of the fracturing process and enhancement of permeability. Understanding how far this enhanced permeability pattern can develop in space and time can be approached by the description of the possible fracture network growth and fracture network connectivity (e.g. Berkowitz 2002; Davy et al. 2013; Hope et al. 2015). This process can be strongly supported by the seismic activity analysis. A key feature of seismicity is its spatio-temporal clustering (e.g. Zaliapin and Ben-Zion 2013a, b; Martinez-Garzón et al. 2018). Monitoring of the seismic activity induced by technological process is not only a way to assess earthquake hazard, but it is also a way to track the course of fractures’ development (e.g. Chorozoglou et al. 2018). Seismic information, rock mechanics data and a deep insight into time pattern of operational activity are the most relevant information to work out a physical modelling of the evolution of the fracture process (e.g. Sausse et al. 2010).

SHale gas Exploration and Exploitation Risks project (SHEER, www.sheerproject.eu) aims at assessing short- and long-term risks associated with groundwater contamination, air pollution and induced seismicity, in connection to shale gas exploration and exploitation. In the framework of this project, we have been studying the problem of development of fracture networks towards the potential for forming the pathways for fluid migration. In these studies, we have been using the equivalent dimensions approach (Lasocki 2014). The transformation to equivalent dimensions converts a set of parameters of seismic events into a set of their equivalent dimensions, which are strictly comparable. Thanks to that it is possible to introduce any complex, multidimensional criteria of potential for linking fractures. We consider seismic event as a fracture segment in a fracture array and use the introduced criteria to identify fracture networks. In this work, we present an example of this way of fracture system identification applied to the injection-induced seismicity data from the north-western part of The Geysers (TG) geothermal field in California, USA. Seismicity of The Geysers results from thermoelastic and poroelastic effects that change the local stress field in the reservoir. The seismic activity in TG highly correlates with the injection operations. Some authors indicated also temporal variations of seismic focal mechanisms in accordance with the changes in flow rate in TG (e.g. Martinez-Garzón et al. 2014a, b; Kwiatek et al. 2015). The analysed north-western seismicity was isolated from the rest of seismic activity in the broader area of TG field. The seismic catalogue for this specific seismicity cluster (Martinez-Garzón et al. 2014a, b; Kwiatek et al. 2015) in addition to occurrence time, location and magnitude contains seismic moment tensors and spectral parameters. Thanks to that and the application of equivalent dimension transforms, we can formulate criteria for possible linking of fractures in a prescribed direction, based on (i) fault plane orientations of fractures, (ii) locations of hypocentres with respect to the injection well, (iii) sources radii and (iv) angles between the fault planes of fracture segments. To identify fracture arrays, we use the hierarchical clustering. The potential of an array to link fractures into longer crevices is quantified by the number of fractures’ intersections.

Data

The Geysers (TG) geothermal field in California, USA, is one of the most productive geothermal fields. It has been operating since the 1960s. It contains also the largest group of geothermal power plants in the world. The geothermal energy is received in vapour-dominated form. TG is also one of the most seismically active geothermal fields in the world. The largest recorded event so far occurred in the 1980s and reached magnitude 4.6. That time fluid production at TG was at its peak (Majer et al. 2007; Cladouhos et al. 2010).

The stress regime at TG geothermal field is extensional, with regional stress field prevailing over local stresses. Based on the fault plane solutions of seismicity recorded in the area, the N/NE orientation of the maximum horizontal stress was established, indicating a combined normal and strike-slip faulting regime (Boyle and Zoback 2014). In spite of the occurrence of several faults in the area, there are no mapped faults active in the last 10 000 years that cross the geothermal field (Oppenheimer 1986). The reservoir is built of highly fractured and hydrothermally altered greywacke sandstone (Lockner et al. 1982) with low total porosity of about 1–1.5 per cent (Rutqvist et al. 2015).

The reservoir pressure in The Geysers has been being slowly reduced over the years since the maximum production period. This has caused a decrease of steam production. Then, one of the two major treated wastewater injection projects, namely Santa Rosa Geysers Recharge Project (SRGRP, since 2003), resulted in approximately tripling the injection rates in the NW part of TG. Following, induced seismic activity also increased (Kwiatek et al. 2015).

We studied a small volume in the north-western part of TG geothermal field, which contains two injection wells: Prati-9 and Prati-29. Injections into Prati-9 well started in 1989, six years after drilling of the well. After well completion in 2001, the fluid injection in this well was renewed in 2007. In April 2010, the second well became operational in this area, namely Prati-29, which was located a few hundred metres NE from Prati-9 (Viegas and Hutchings 2011).

Induced seismicity in this area, an immediate response of the reservoir due to fluid injection, was already studied in detail in many papers (e.g. Viegas and Hutchings 2011; Martinez-Garzón et al. 2014a, b; Kwiatek et al. 2015, Leptokaropoulos et al. 2017). Following Martinez-Garzon and also Jeanne (Jeanne et al. 2014), it was suggested that the seismicity there is governed by thermoelastic and poroelastic effects that change the local stress field in the reservoir. A number of studies indicated correlation between the fluid injection volume, steam production and induced seismicity rates at The Geysers (e.g. Smith et al. 2000; Stark et al. 2005; Martinez-Garzón et al. 2014a, b). Leptokaropoulos et al. 2017 ascertained this correlation statistically.

We analysed all seismic events, which occurred in the considered part of The Geysers geothermal field between 10 December 2007 and August 23 2015. Altogether, these were 1254 seismic events. They located in a close surrounding of Prati-9 and Prati-29 injection wells (Fig. 1).

Fig. 1
figure 1

Analyzed seismicity from NW part of The Geysers geothermal field. Locations of a epicentres and b hypocenters of seismic events

Method

Our study of fracture network identification was based on cluster analysis of seismic events. Every seismic event was represented by a point in a parameter space, quantified by the vector of its parameters’ values.

Figure 2 presents a schematic explanation of the earthquake parametrization, which was used to assess possibility of fracture linking. If we consider hydraulically induced fracturing as shown in Fig. 2, the information about fracture segment location and fracture plane orientation is necessary to map a build-up of the fracture network. In this connection, we selected three parameters of earthquake. The first parameter, connected with hypocentre location, was the distance between the open hole of Prati-9 well and the event hypocentre, r. This parameter allowed to track the reaction of rock mass to driving stresses of fracturing, which are assumed to propagate from the well out.

Fig. 2
figure 2

Schematic explanation of the used earthquake parametrizations and their usefulness to assess possibility for fracture linking

The fracture growth depends also on principal stress orientation. This second factor can result in an elongation of fracture network. Usually, an observed shape of the anthropogenic seismicity clusters is consistent with the regional stress field. The spatial pattern of seismicity recorded in the north-western part of TG was shown to be elongated subparallel to SHmax (Martinez-Garzón et al. 2014a, b). The direction of preferred fault planes turned out to be also parallel to the maximum principal stress orientation (Boyle and Zoback 2014). Such alignment of seismicity with SHmax was also observed in other geothermal fields, e.g. in Soultz-sous-Forêts (Cuenot et al. 2006; Cornet et al. 2007).

Following up, for proper identification of activation of fracture networks potentially comprising linked fractures, which were probably preferentially aligned parallel to the stress field orientation, we introduced a deflection parameter. The deflection, def, was the angle between the position vector of hypocentre of seismic event and SHmax direction as shown in Fig. 2 (e.g. Mirek et al. 2010).

The third parameter was 3D rotation, rot. This parameter was connected with the orientations of P, T and B axes representing the focal mechanisms of earthquakes. 3D rotation is the angle by which one double-couple (DC) earthquake source is turned into another arbitrary DC (Kagan 2007). Based on P and T axes’ orientations, we were calculating the rotations (Kagan 2007, software MSATSI, Martinez-Garzón et al. 2013; MATLAB extension to SATSI library of Hardebeck and Michael 2006) of DC sources of events to the fault plane strike/dip/rake: 26/45/-45, which compromises the prevailing orientations of fault planes in the study area. The rotations were expected to be in general consistent with the stresses promoting failures. Similar rot values indicated a similarity of fault plane orientations.

The parameter space {r, def, rot} was not Euclidean; the distances between earthquakes in this space could not be straightforwardly determined. To remedy this metric problem, we applied the solution proposed by Lasocki (2014) and transformed the parameter values into equivalent dimensions. The transformation is based on a concept of probabilistic equivalence of parameters. Under this concept, intervals of different parameters have equivalent lengths if the probability for each of the parameters to take values from its respective interval is the same. The equivalent dimension of a parameter is its cumulative distribution function (CDF). Parameters transformed to equivalent dimensions are uniformly distributed in [0, 1] interval, and the distance between parameter values and points in equivalent dimension space is Euclidean.

Because in general the probabilistic models, CDF, for parameters are not known, it is recommended to use instead of the nonparametric kernel estimators of CDF (e.g. Silverman 1986 and the references therein). Given a sample of parameter values \(\left\{ {p_{i} , i = 1,2, \ldots ,n} \right\}\), the nonparametric, kernel estimate of a probability density function (PDF) is constructed by summing up symmetric kernel densities, of the expected values \(p_{i}\) and the adapted standard deviations.

To transform r, def, rot to equivalent dimensions, we used an adaptive kernel estimator of CDF with the Gaussian kernel function:

$$\hat{F}_{P} \left( {p,\left\{ {p_{i} ,n} \right\}} \right) = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \varPhi \left( {\frac{{p - p_{i} }}{{\omega_{i} h}}} \right),$$
(1)

where \(\varPhi \left( \cdot \right)\) denotes the standard Gaussian cumulative distribution, \(h\) is the smoothing factor automatically selected from the data using the least squares cross-validation technique (Bowman 1984; Silverman 1986), and \(\omega_{i} , i = 1, \ldots ,n\) are the local bandwidth factors, which adapt the smoothing factor to irregular data density along the range of parameter values. After transformation, fracture network was characterized by distances between all pairs of seismic events (i, j):

$$D\left( {i,j} \right) = \sqrt {\left( {F_{ri} \left( r \right) - F_{rj} \left( r \right)} \right)^{2} + \left( {F_{roti} \left( {rot} \right) - F_{rotk} \left( {rot} \right)} \right)^{2} + \left( {F_{defi} \left( {def} \right) - F_{defj} \left( {def} \right)} \right)^{2} }$$
(2)

where \(F_{r} \left( r \right)\), \(F_{rot} \left( {rot} \right)\), \(F_{def} \left( {def} \right)\) are the transformed to equivalent dimensions r, def, rot, respectively.

To identify fracture structures within the data, we used hierarchical clustering with agglomerative approach. The distance between events was the indicator of the fracture network development—the smaller distance between events the stronger link between them. We used the linkage method of Ward (1963), which takes into consideration the sum of the squared error (SSE), summing squared distances over all parameters, and all of the events within each fracture network and cluster, as well as between clusters. The distances are calculated between cluster members and centroid. The centroid of the cluster is defined as the point at which the sum of squared distances D(i, j) between the points itself and each other points in the cluster is minimized, and a sum of all points in a cluster is divided by the number of these points. Ward’s method attempts to minimize SSE of any cluster and at the same time to maximize SSE between clusters.

Having identified event clusters, we were analysing characteristics of each cluster, specifically possibilities of fracture linking. We attributed to each seismic event the section of strike line of length equal to two times source radius and centred at the event epicentre. As proxy of the number of connections among fractures, we used the number of intersections among these sections. We call these intersections as nodes. We also calculated the connectivity coefficient C. This is the ratio between the actually observed number of nodes in a cluster and the number of possible nodes within the cluster. The number of possible nodes is f(f − 1)/2, where f is the number of fractures building the cluster (Watts and Strogatz 1998; Albert and Barabasi 2002). Acceptable performance of these two 2D measures of fracture linkage stems from that, that fractures are usually much more extended horizontally than vertically. As the last characteristic of clusters, we calculated the extensions of clusters that is the maximum distances between hypocenters in the clusters.

Results

Results of hierarchical clustering of 1254 studied events, represented by r, def and rot parameters transformed to equivalent dimensions are shown in Fig. 3. From the resultant dendrogram, we extracted 30 clusters of events, characterized by the lowest linkage distances. The threshold linkage distance was at about 0.5.

Fig. 3
figure 3

Hierarchical tree of 1254 events parameterized by r, def and rot transformed to equivalent dimensions

In 2D, fracture arrays can be visualized by a set of lines, whose lengths are equal to doubled source radii of seismic events in a cluster. Examples of fracture arrays are presented in Fig. 4 for clusters 1, 2 and 3, respectively. For 3D visualization of fracture arrays, at each seismic event its focal mechanism and P/T/B axes are plotted. The size of the graphical representation of the focal mechanism corresponds to the event magnitude Mw and the axes lengths are equal to the doubled seismic source radius. Examples of 3D visualization of fracture arrays are presented in Fig. 5a–c, for clusters 1–3, respectively.

Fig. 4
figure 4

2D visualizations of fracture arrays from the first three clusters. Orange—31 fractures forming cluster 1, green—49 fractures forming cluster 2, blue—61 fractures forming cluster 3

Fig. 5
figure 5

3D visualizations of fracture arrays from the first three clusters

Information about the number of fractures, the extension of fracture array and the maximum observed magnitude, Mw, in the identified clusters is summarized in Table 1.

Table 1 Identified clusters of events

To get an insight into a relation between connectivity within fracture networks and injection rate, we were extracting from the identified clusters those events, which had occurred during two injection cycles described by Martinez-Garzón et al. (2014a, b). Each cycle was divided into three stages: before, during and after peak injection. The analysed time series of injection rate was filtered using the 10 samples moving average filter to remove artificial high-frequency changes. The resulting maximum values in the particular stages were cycle 1—stage 1: 5.2 × 103m3/day, stage 2: 9.2 × 103m3/day, stage 3: 5.2 × 103m3/day; cycle 2—stage 1: 10.3 × 103m3/day, stage 2: 12.7 × 103m3/day, stage 3: 8.5 × 103m3/day.

The breakdown of the groups of extracted events into subgroups respective for these three stages resulted in very poorly populated subgroups. To increase the number of events in such subgroups, we revisited our hierarchical tree (Fig. 3) and applied the linkage cut off equal to 3. This resulted in identification of 13 fracture families (clusters).

We broke down the event groups that built the newly singled out fracture families into the subgroups linked in time with the injection stages. We calculated the connectivity characteristics separately in these subgroups. Table 2 presents data on the number of fractures and nodes, the maximum possible number of nodes and the coefficient C in these subgroups. Each row relates to one of the families, respectively. The last row presents the above-mentioned numbers obtained for the aggregations of all subgroups related to the particular injection stages. Figure 6 presents the distribution of the identified fractures together with their connections.

Table 2 Characteristics of connectivity in clusters linked to the three injection stages: before, during and after the peak injection
Fig. 6
figure 6

Connections among fractures from the extracted 13 clusters in the three stages of injection. Fractures occurred during the stages before, during and after the peak injection are plotted in blue, pink and green, respectively. Intersections (nodes) are marked with green dots. Open holes of Prati-9 and Prati-29 injection wells are shown as yellow and pink diamonds, respectively

Discussion

Our study shows that when parametrization of seismicity includes also mechanism and spectral parameters, it opens a multitude of possibilities to track various features of the seismicity. The content of the seismic catalogue from the north-western part of The Geysers geothermal field is an example of such rich parameterization.

We have assumed that fractures having similar directions of extension, similar orientations of fracture planes and similar distances from the injection well can link to form longer crevices, which eventually may become pathways for fluid migration. This similarity assumption by no means represents fully the potential for linking the seismic fractures. Studies with the use of other criteria on so-called fracture families are on the way. Nevertheless, we can extract the families fulfilling this or any other this kind criterion only when the earthquake parameters are transformed into equivalent dimension. This transformation ensures the Euclidean metric in the space built by the transformed parameters, thus enables unequivocal definition of distance between the points/events in this space.

Based on the assumed similarity criterion, the equivalent dimension approach, and the hierarchical clustering method we have been able to identify the event families in the seismicity from the north-western part of The Geysers field. The value of connectivity parameter, C, in the families distinctly changes between stages of injection in which the events in the family occurred. For the subgroup of events related to the peak injection stage, this value seems to be smaller than the respective values related to the other two injection stages.

C is a proportion, and the test statistics for the difference between proportions p1, p2 is

$$Z = \frac{{\hat{p}_{1} - \hat{p}_{2} }}{{\sqrt {p^{ *} \left( {1 - p^{ *} } \right)\left( {1/n_{1} + 1/n_{2} } \right)} }}$$
(3)

where \(\hat{p}_{1} ,\hat{p}_{2}\) are the sample estimates of p1, p2, respectively, n1, n2 are the sizes of the respective samples and \(p^{*} = \frac{{x_{1} + x_{2} }}{{n_{1} + n_{2} }}\), where x1, x2 are the numerators in the proportions’ sample estimates. Z distribution is approximately normal.

To fulfil the approximate normality assumption, we test the null hypothesis on proportions only for the aggregated data, provided in the last row of Table 2. We obtain the following results:

  • C in stage 1 vs. C in stage 2—Z = 3.64, p value = 0.00014

  • C in stage 3 vs. C in stage 2—Z = 4.18, p value < 0.00001

  • C in stage 1 vs. C in stage 3—Z = 0.47, p value = 0.32

The test indicates that the connectivity values in the stages before and after the peak injection are significantly greater than the connectivity in the peak injection stage. The differences between C values in the stages before and after the peak injection have not been found significant.

The test results suggest that lower injection rates favour linking fractures, whereas higher injection rates inhibit such linking tendency. This conclusion is limited to:

  • the studied injection-induced seismicity in The Geysers;

  • the values and variation of amplitudes of injection rate, which were studied;

  • the used measure of linking potential, C

but the obtained result encourages further studies of this problem.