Introduction

The use of remotely sensed (RS) data in forestry is motivated by efforts to increase cost efficiency, precision and timeliness of forest information [1]. Unlike traditional field-based sampling, the availability of full-coverage RS data enables the production of maps of key forestry variables, which are useful for forest management purposes. The first examples of aerial imagery usage for forestry purposes date back to the beginning of the 1920s [2, 3]. Over the past century, there has been tremendous growth in the number of RS data sources available for the assessment and monitoring of forests. Three-dimensional (3D) RS data, which can describe tree or canopy height, have shown great potential for forest inventory [4]. In the past 20 years, airborne laser scanning (ALS) has been widely used for forest inventory purposes and has become the standard data source for operational forest inventories in many countries around the world [5,6,7]. Nevertheless, the acquisition of ALS data requires a degree of planning and investment, making these data sources cost-effective only on a relatively large scale [8]. Up to the beginning of 2010, there were no cost-effective means of acquiring high-resolution 3D RS data for smaller areas, such as single forest properties or forest stands. Furthermore, in those cases where ALS-based forest management is implemented, surveys are carried out infrequently, e.g. at intervals of 10–20 years [5]. Hence, for some forest stands, the information may be too unreliable for decision-making. Timeliness is a key requirement to enable the adoption of precision forestry practices. This is especially true when the forest structure is changing rapidly, as is the case in fast-growing regeneration forests, or when growth is hindered by biotic or abiotic disturbances.

Photogrammetric approaches to obtain 3D information on forest structure have become popular, offering substantial cost savings in the case of aerial photogrammetry compared with ALS [9, 10]. Photogrammetry is limited to the reconstruction of surfaces visible in the image data, providing ground information only where large vegetation gaps exist. However, photogrammetric data can be combined with pre-existing ground data, derived from light detection and ranging (LiDAR) data for example. This data synergy has been thoroughly discussed by Goodbody et al. [11], indicating the potential for cost-efficient forest inventory updates. Similarly, Kangas et al. [6] suggest an equal value of photogrammetric and ALS data in forest management planning, given that ALS ground information is available from previous campaigns. In addition to the proven complementary use of LiDAR and photogrammetric data [9, 11, 12], recent attempts at deriving inventory-relevant forest metrics from photogrammetric data alone show potential for aerial [13•, 14] and terrestrial [15•, 16] acquisitions. Further standalone use of photogrammetry was shown for forest health monitoring [17, 18•, 19], species classification [20] and biodiversity assessments [21, 22].

In the last decade, a photogrammetric approach offering flexible and cost-effective acquisition of combined 3D and spectral RS data has found wide application and acceptance in physical geography [23]: Structure from Motion (SfM), paired with multi-view stereo (MVS) algorithms (SfM-MVS, commonly abbreviated to just SfM). SfM is based on computer vision and facilitates photogrammetric reconstruction from images alone. Contrary to traditional stereophotogrammetry, 3D information can be computed from overlapping images without the need for prerequisite information on camera location and orientation, camera calibration and/or surveyed reference points in the scene. This allows the use of inexpensive imaging platforms for both aerial and terrestrial applications.

SfM photogrammetry has been comprehensively reviewed in the geosciences [24, 25, 26••], where it has been gaining prominence for topographical surveys. We complement these findings with a summary of SfM photogrammetry use specific to forestry. We present an overview of the theoretical principles of a SfM-MVS workflow and its applications in forestry by reviewing a representative sample of key research in this field. Challenges and technical considerations are discussed, concluding with opportunities and practical implications for operational use of SfM by forest practitioners.

Structure from Motion: Theoretical Principles

Traditional stereophotogrammetry methods are based on an analogy of the binocular human vision. Depth can be perceived from two points whose relative position is known. However, depth, volumes or 3D features can also be perceived from a single observing point if either the observer or the object is moving [27, 28]. SfM is a photogrammetric technique that is based on both these principles: (i) the binocular vision and (ii) the changing vision of an object that is moving or observed from a moving point [29]. SfM is used for estimating 3D models from sequences of overlapping 2D images. It gained popularity in recent years due to its ability to deal with sets of unordered and heterogeneous images without prior knowledge of the camera parameters [30]. SfM differs from traditional photogrammetry mainly in three aspects: (i) features can be automatically identified and matched in images at differing scales, viewing angles and orientations, which is of particular benefit when small unstable platforms are considered; (ii) the equations used in the algorithm can be solved without information of camera positions or ground control points, although both can be added and used and (iii) camera calibration can be automatically solved or refined during the process. SfM can thus automatically deliver photogrammetric models without requiring rigorous homogeneity in overlapping images, camera poses and calibrations [31,32,33].

‘SfM’ photogrammetry is commonly used to define the entire reconstruction workflow, from image set to dense point cloud; however, strictly speaking, SfM only refers to a specific step in the workflow that provides camera parameters and a sparse point cloud (see Fig. 1). Although some studies use the sparse point cloud as a final product [31, 34], in most cases, dense image matching algorithms, such as MVS, are used in a subsequent step to densify the point cloud. The whole process can thus be referred to as SfM-MVS. Figure 1 contains a schematic workflow of the whole SfM-MVS process, and Fig. 2 shows a graphic diagram of the main three steps.

Fig. 1

Schematic workflow of the SfM-MVS process resulting in a dense point cloud from image sets. The point cloud is georeferenced by providing positional information for images and/or ground control points

Fig. 2

The three key stages in a SfM-MVS workflow illustrated on two hypothetical images of a Canary Island pine forest: (1) keypoint identification and matching (e.g. SIFT), (2) SfM with camera parameters and a sparse point cloud as output and (3) the densified point cloud following MVS

The SfM-MVS process starts with the automatic extraction of keypoints (i.e. points or sets of pixels with distinctive contrast or texture) in the images. The keypoints are identified in all images and then tied (matched) across images where they appear. The scale-invariant feature transform (SIFT [35]) and its variations are the most common algorithms for keypoint identification and matching in SfM [26••]. SIFT produces numerical descriptors for each point in each image. These descriptors are invariant to scale and orientation, thus suitable for identifying points or objects in pictures taken from different perspectives and under different conditions. Then, coherence of keypoint matches is checked using a coarse reconstruction of the geometry of the images and the relative position of the keypoints on them (Figs. 1 and 2).
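The matching step described above can be illustrated with a minimal sketch. It assumes that each keypoint already carries a numerical descriptor (SIFT produces 128-dimensional descriptors; short toy vectors are used here) and pairs keypoints by a nearest-neighbour search with a distance-ratio test, a commonly used filter for ambiguous matches. The function name `match_keypoints` and the 0.75 threshold are illustrative assumptions and are not taken from any particular SfM software.

```python
import math

def match_keypoints(desc_a, desc_b, ratio=0.75):
    """Match descriptors of image A to image B (nearest neighbour + ratio test).

    Returns a list of (index_in_a, index_in_b) pairs. The ratio threshold is
    an assumed, illustrative value.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Euclidean distance from this descriptor to every descriptor in B.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (best, j_best), (second, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is clearly closer
        # than the runner-up; otherwise the match is ambiguous.
        if best < ratio * second:
            matches.append((i, j_best))
    return matches

# Toy descriptors (hypothetical values) for three keypoints in image A
# and four keypoints in image B.
image_a = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
image_b = [(1.0, 0.1, 0.0), (0.0, 0.9, 0.1), (0.5, 0.5, 0.6), (0.5, 0.6, 0.5)]

matches = match_keypoints(image_a, image_b)
print(matches)
```

In this toy case, the third keypoint of image A has two equally close candidates in image B and is therefore discarded, which mirrors how repetitive texture (e.g. foliage) can reduce the number of usable matches.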

Given a sufficient number of images and keypoint matches, SfM performs bundle adjustments to simultaneously compute camera poses and parameters, and a sparse 3D point cloud of the scene (consisting of the position of keypoints matched in different images). The bundle adjustment is solved using (i) initialization values obtained from sequences of randomly selected matched keypoints and, complementarily, parameters from the cameras and poses and (ii) a non-linear refinement [36]. Then, the outputs of SfM are scaled and georeferenced based on ground control points (GCPs) and/or data from navigation devices from the camera or its platform (Figs. 1 and 2).
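The quantity that a bundle adjustment typically minimizes is the reprojection error, i.e. the image-space distance between observed keypoints and the projection of their estimated 3D positions. The sketch below only evaluates this error for a strongly simplified camera (an assumed pinhole model looking along the Z axis, without rotation or lens distortion); it is not a bundle adjustment itself, and all names and values are hypothetical.

```python
import math

def project(point, cam_pos, focal_px):
    """Project a 3D point with a simplified pinhole camera looking along +Z."""
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    return (focal_px * x / z, focal_px * y / z)

def reprojection_rmse(points, observations, cam_pos, focal_px):
    """Root mean square distance (pixels) between observed and projected points."""
    squared = 0.0
    for pt, (u_obs, v_obs) in zip(points, observations):
        u, v = project(pt, cam_pos, focal_px)
        squared += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(squared / len(points))

# Hypothetical sparse points (metres) and their observed image coordinates
# (pixels, relative to the principal point).
points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 2.0, 20.0)]
observations = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]

good_pose = reprojection_rmse(points, observations, (0.0, 0.0, 0.0), 1000.0)
bad_pose = reprojection_rmse(points, observations, (0.1, 0.0, 0.0), 1000.0)
print(good_pose, bad_pose)
```

A camera position that is off by 10 cm produces a clearly larger error than the consistent one; the non-linear refinement adjusts camera and point parameters jointly to drive this error down across all images.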

The camera poses and parameters obtained from SfM are then applied to generate a densified point cloud using MVS algorithms. Prior to the MVS densification, and for computational efficiency or even viability, images are clustered based on their location [37]. In this way, the dense point cloud of each cluster (i.e. group of images) is computed separately (Figs. 1 and 2).

A dense point cloud, with colour/spectral information derived from the input images, represents the primary output of the SfM-MVS workflow. Subsequent processing steps (for aerial surveys) typically involve the derivation of a digital surface model (DSM) and an orthomosaic. A canopy height model (CHM) can be obtained by height normalization (i.e. conversion from height above sea level to height above ground) with a pre-existing digital terrain model (DTM). When SfM-derived surface data are height normalized in such a way, forest metrics like those commonly derived from ALS (e.g. height, timber volume, biomass) can be calculated. Additionally, image metrics like radiance/reflectance values and texture may be extracted [13•, 18•, 20, 38, 39]. Finally, rasterization can offer opportunities to explore the sensed information in more depth when statistics are calculated for every cell (e.g. height percentiles, surface roughness, spectral indices) [40,41,42].
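Height normalization and a simple cell statistic can be sketched as follows. The example assumes that the DSM and DTM are already co-registered rasters of equal size, stored as nested lists of elevations in metres; clipping negative heights to zero and the linear-interpolation percentile are assumed conventions, and all values are hypothetical.

```python
def canopy_height_model(dsm, dtm):
    """Cell-wise DSM minus DTM; negative heights are clipped to zero (assumption)."""
    return [
        [max(surface - terrain, 0.0) for surface, terrain in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]

def height_percentile(heights, q):
    """Percentile (0-100) of a list of heights using linear interpolation."""
    ordered = sorted(heights)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

# Hypothetical 2 x 2 rasters: elevations above sea level in metres.
dsm = [[312.0, 318.5], [305.0, 321.0]]
dtm = [[300.0, 301.0], [305.5, 302.0]]

chm = canopy_height_model(dsm, dtm)
cell_heights = [h for row in chm for h in row]
p50 = height_percentile(cell_heights, 50)
print(chm, p50)
```

In practice, such percentiles would be computed per grid cell or plot from all normalized points or CHM pixels falling inside it, and then used as predictor variables.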

SfM Photogrammetry in Practice

With photogrammetry being a passive technique, results are highly influenced by the input image data. SfM photogrammetry, employing an automated process to identify and match features by computer vision, is fundamentally dependent on image quality. Sensors, settings and acquisition designs should be considered with great care.

In every circumstance, the camera settings need to be considered to ensure optimal image data is acquired given a set of constraints, namely (i) those from the environment (lighting conditions), (ii) the platform (UAV, pole, tripod or handheld) and (iii) the camera and lens combination (the exposure triangle, focal length, sensor size). Acquiring high-quality image data has been discussed in O’Connor et al. [43••] and Mosbrucker et al. [44], with key rules-of-thumb including keeping the motion of the camera to a minimum, and increasing ISO (i.e. the sensor’s sensitivity) to account for potential underexposure (Fig. 3). Capturing RAW image data is preferable, as it retains the raw pixel values acquired by the camera prior to quantization and compression [45].

Fig. 3

Image quality issues illustrated by simulated degradations on UAV image: a adds motion blur, which has negative impacts on the quality of photogrammetric reproduction (Sieberth et al., 2014); b adds JPEG compression, which smoothes subtle contrast changes across an image; c adds noise, which can rapidly degrade image quality at high-ISO values; d adds overexposure, where the image sensor was exposed for too long a time and e underexposure, where the image sensor was exposed for too little time

Image network geometry has an impact on the quality of reproduction, and for every survey, a ‘convergent’ imaging geometry should be sought, in which the principal axes (perpendicular to the image sensor) of the images used converge so that systematic error is minimized [46, 47]. In UAV imaging, James and Robson [46] suggest surveying with gently banked turns when using fixed-wing UAVs in order to achieve this. With rotary UAVs, a similar result can be achieved by angling the camera on the gimbal on which it is mounted. For terrestrial imaging, a convergence of images on areas of interest (AOIs) is advised, as presented in Mosbrucker et al. [44].

Within image acquisition and SfM photogrammetric workflows, users have many parameters which they can vary depending on the equipment and software used. For some, users can have near full control (e.g. the ‘exposure triangle’; ISO, shutter speed and aperture), though there are several which will only be estimated prior to performing a survey (such as the exact camera positions images will be acquired from). Other influential factors, which cannot be manipulated (e.g. light conditions), will have to be carefully considered when planning a SfM-based survey. The success of reconstruction is ultimately dependent on factors that can be broken down into five categories, as presented in Table 1. The accuracy of the position and scale of a survey is then determined by the referencing approach (e.g. GCPs, direct georeferencing, manual scaling).

Table 1 Overview of variables influencing the results of a SfM survey

To apply SfM photogrammetry in forestry, the important aspects of a successful survey are as follows: (i) the scene is covered with overlapping images from multiple locations and angles (high overlap to increase redundancy and multiple viewing angles of the same object to reduce occlusions and systematic errors), (ii) any feature to be reconstructed is visible in at least three images (five or six images for dense vegetation) and the angular divergence between neighbouring images does not exceed 10–20°, (iii) the scene is sufficiently illuminated (constant lighting is preferable, e.g. overcast or cloud-free conditions) and (iv) the object of interest is fixed (preferably no movement from branches in wind).
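For a nadir-looking aerial survey, these requirements can be translated into rough planning figures. The sketch below uses standard pinhole-camera relations for ground sampling distance (GSD), exposure spacing and motion blur, assuming flat terrain and a flying height measured above the imaged surface (over forest, the canopy rather than the ground). The camera and flight values are hypothetical and do not refer to a specific sensor or study.

```python
def ground_sampling_distance(sensor_width_mm, focal_length_mm, image_width_px, height_m):
    """Ground size of one pixel in metres for a nadir image."""
    return (sensor_width_mm * height_m) / (focal_length_mm * image_width_px)

def exposure_spacing(gsd_m, image_height_px, forward_overlap):
    """Distance (m) between consecutive exposures for a forward overlap in [0, 1)."""
    footprint_m = gsd_m * image_height_px
    return footprint_m * (1.0 - forward_overlap)

def images_per_ground_point(forward_overlap):
    """Approximate number of consecutive images in which a ground point appears."""
    return 1.0 / (1.0 - forward_overlap)

def motion_blur_px(speed_m_s, exposure_time_s, gsd_m):
    """Distance travelled during the exposure, expressed in pixels."""
    return speed_m_s * exposure_time_s / gsd_m

# Hypothetical camera (13.2 mm sensor width, 8.8 mm lens, 5472 x 3648 px)
# flown 100 m above the canopy at 5 m/s with a 1/1000 s shutter speed.
gsd = ground_sampling_distance(13.2, 8.8, 5472, 100.0)
spacing = exposure_spacing(gsd, 3648, 0.80)
views = images_per_ground_point(0.80)
blur = motion_blur_px(5.0, 1.0 / 1000.0, gsd)
print(round(gsd, 4), round(spacing, 1), round(views, 1), round(blur, 2))
```

With 80% forward overlap, each point is seen in about five consecutive images along the flight line, which is in the range suggested above for dense vegetation; side overlap and oblique images add further views.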

The Current Status of SfM in Forestry

With the ability to produce highly detailed 3D information from a set of images alone, SfM photogrammetry puts a powerful tool in the hands of anyone looking to collect their own fit-for-purpose RS data. Owing particularly to the potential of using off-the-shelf cameras and the availability of affordable user-friendly software, the application of SfM photogrammetry in physical geography has increased rapidly [26••, 48]. With SfM photogrammetry being scale independent, images may be acquired from a multitude of platforms ranging from ground-based, handheld or pole-mounted options, to unmanned aerial vehicles (UAVs) and manned aircraft. UAVs have enabled geospatial data to be acquired in new ways. Flexibly deployed at scales from several hectares to square kilometres [49], they allow forest practitioners to collect their own aerial information. In fact, there is an increasing interest in UAV forest surveys that can arguably be attributed to SfM-based photogrammetric processing [26••].

The rapid adoption of SfM photogrammetry is indicated by a growing number of scientific publications in forestry that utilize this photogrammetric technique. We conducted a search for peer-reviewed studies indexed by the Web of Knowledge database using the keywords ‘Structure from Motion’, ‘UAV’ and ‘Forestry’ (and their most common variations). The search results were manually filtered to retain only forestry-related studies applying a SfM-based workflow. We further categorized results into research on aerial and terrestrial inventory, forest health and proof-of-concept studies. These results are presented in Table 2 and reveal a steady rise of publications on forest remote sensing with SfM photogrammetry.

Table 2 Number of publications on SfM photogrammetry for forest/tree remote sensing per year with manually assigned sub-categories

SfM photogrammetry applications aimed at forest inventory are currently the most studied (Table 2). Here, a distinction between aerial and terrestrial approaches can be made. An aerial approach typically utilizes a canopy surface model derived from SfM and/or associated spectral properties to estimate inventory relevant parameters [11]. Terrestrial acquisitions, also termed close-range photogrammetry (CRP), focus on the reconstruction of stems within sample plots or the reconstruction of individual trees [15•].

A further field of research is the assessment and monitoring of forest health condition. For SfM-based mapping of the canopy, aerial acquisition of image data with multispectral sensors, most commonly by UAV, prevails. Studies dealing with forest health often make use of the 3D information and derived 2D spectral products that SfM photogrammetry delivers [18•, 50]. The following sections describe the research on SfM-based forest inventory and health assessments to date in more detail.

Inventory

Forest inventory holds a central role in forest research. Sustainable management of forests relies on knowledge of their structure, distribution and dynamics over time [51]. The collection of field data for inventory purposes is labour intensive, time-consuming and expensive, and cannot be applied to large areas, which drastically limits the number of field inventories that can be afforded [52, 53]. Efforts to improve the efficiency of inventory practices therefore drive research in this field [53]. Amongst RS technologies, SfM photogrammetry offers a low-cost and flexible approach to collect information on forest structure; naturally, there has thus been an increase in interest in using such data for forest inventory.

Within the context of forest inventory, the main use of SfM photogrammetry has been its application on UAV image data to produce wall-to-wall auxiliary information in a similar fashion to ALS data. As such, UAV-SfM data has been shown to be suitable for the estimation of inventory relevant biophysical parameters such as height, density and biomass [11, 12, 34, 54,55,56]. Even though SfM has mostly been applied to aerial image data, in recent years, there has been an increasing effort in developing terrestrial SfM applications to replace or augment field data collections. The focus of studies incorporating CRP lies on estimating diameter at breast height (DBH), tree position and stem curves. The following sub-sections elaborate further on the developments up to today regarding aerial and terrestrial SfM and highlight some of the key work on using these photogrammetric data for inventory purposes.

Aerial Inventory

The use of SfM techniques applied to aerial image data for forest inventory was pioneered by Dandois and Ellis in 2010 [54]. These authors were the first to use a series of unordered but overlapping images acquired using a consumer-grade camera mounted on a kite to produce a dense 3D point cloud representing the forest canopy. The first attempts to model forest biophysical properties using UAV-SfM data were made by Dandois and Ellis in 2013 [34] and Lisein et al. in 2013 [55]. Both studies found that even though the results were not consistent in all the studied areas, there was a correlation between UAV-SfM data and variables such as dominant height (R² = 0.07–0.91) or aboveground biomass (R² = 0.27–0.73). A more comprehensive evaluation of the possibilities to use UAV-SfM for forest inventories came with the studies by Puliti et al. in 2015 [12] and Tuominen et al. in 2015 [56], who extended their evaluation to the range of biophysical variables commonly used in forest management. Their results in terms of RMSE% for dominant height (3.5%), Lorey’s height (13.3–14.4%), stem density (38.6%), basal area (15.4–23.9%) and timber volume (14.9–26.1%) were found to be similar to errors associated with ALS-based forest inventories. While these two studies set an important benchmark, they were both conducted in even-aged managed boreal forests and thus provided limited information on how UAV-SfM may perform in different forest types and forest developmental stages.
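The accuracy measures quoted above can be computed as in the following sketch. It assumes that RMSE% is the root mean square error expressed relative to the mean of the field observations and that R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares); individual studies may define either measure slightly differently, and the plot values are hypothetical.

```python
import math

def rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical plot-level dominant heights (m): field reference vs. model output.
field_height = [20.0, 22.0, 25.0, 30.0]
model_height = [21.0, 21.0, 26.0, 29.0]

print(round(rmse_percent(field_height, model_height), 2),
      round(r_squared(field_height, model_height), 3))
```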

Since the early days of UAV-SfM, the rapid growth in computing capabilities, availability of UAVs and SfM software triggered increased interest in the scientific community (see Table 2). This led to a widespread evaluation of UAV-SfM technology over a variety of forest types and forest developmental stages. UAV-SfM data has been consistently proven to be useful for forest inventories in a large variety of forest types, including temperate European beech forests in Italy [13•], mangrove forests in Malaysia [57], tropical forests in Guyana [58], mixed conifer-broadleaved forest in Japan [59], sparse sub-alpine coniferous forests in China [60], tropical woodlands in Malawi [41] and various plantations around the globe [61,62,63]. From these studies, a conclusion can be drawn that the accuracy of UAV-SfM models is consistent across many different forest types and on a similar scale to ALS models. All of the aforementioned studies dealt with mature to nearly mature forest, while there has been little effort dedicated to estimating biophysical variables for forests under regeneration [64, 65, 66•]. Nevertheless, the use of UAV-SfM data for regeneration forests may outperform alternative data sources such as field assessments or the use of ALS data in terms of costs and accuracy. Goodbody et al. [64] demonstrated the possibility to discriminate coniferous and deciduous species (overall accuracy of 86–95%). Puliti et al. [66•] showed that UAV-SfM data could be used to accurately model stem density and height (RMSE% = 21.8% and 23.6%). Such results represent a substantial increase in accuracy over ALS forest inventories and field assessment. Furthermore, their study reported that data acquired using UAV-SfM techniques halved the amount of time required for traditional field surveys that are commonly performed in regeneration stands. Thus, the use of UAV-SfM for regeneration forest may be particularly interesting since it allows a simultaneous increase in the precision of the inventory while reducing its costs.

Different methodological approaches have been applied to UAV-SfM data, similarly to ALS data. The methods can be categorized into area-based approaches (ABA) [67] and individual tree crown (ITC) approaches [68, 69]. In the former case, the population units are represented by grid cells of area equal to that of the field plots; in the latter, they are polygons representing single-tree crowns. In both cases, the UAV-SfM data, corresponding either to the grid cells or the single-tree crowns, are then linked through models to a sample of field observations, either for field plots or for single trees. These models are then applied to all the population units for estimation of parameters at stand level or for forest-level mapping. The results of ABA methods have been presented in the previous paragraph. The adoption of ITC approaches to UAV-SfM has been found to be useful for detecting single trees with 25–90% detection accuracy [63, 70, 71], classifying them according to tree species with overall accuracies up to 95% [71], and measuring their height with RMSEs in the range of 0.5–2.84 m [55, 63]. In addition to rather large variability in the accuracy of some of these variables, the results of UAV-SfM ITC approaches vary according to forest types since they remain limited to the detection of the dominant tree layer, while smaller and dominated trees remain mostly undetected.
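A common starting point for ITC approaches is the detection of treetops as local maxima of the CHM. The sketch below is one simple variant (a 3 × 3 neighbourhood with an assumed minimum height threshold); it is not the specific method of any of the cited studies, and the raster values are hypothetical.

```python
def detect_treetops(chm, min_height=2.0):
    """Return (row, col, height) for cells higher than all 8 neighbours."""
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:
                continue  # ignore ground and low vegetation
            neighbours = [
                chm[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(height > n for n in neighbours):
                tops.append((r, c, height))
    return tops

# Hypothetical CHM (metres) containing two dominant crowns.
chm = [
    [0.0, 1.0, 2.0, 1.0, 0.5, 0.0],
    [1.0, 9.0, 12.0, 6.0, 3.0, 1.0],
    [2.0, 12.5, 18.5, 7.0, 8.0, 4.0],
    [1.0, 6.0, 9.0, 6.5, 13.0, 7.0],
    [0.0, 1.0, 3.0, 5.0, 9.0, 6.0],
]

treetops = detect_treetops(chm)
print(treetops)
```

Because only cells that dominate their neighbourhood are returned, trees growing beneath or directly beside a taller crown are missed, which corresponds to the limitation to the dominant tree layer noted above.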

Terrestrial Inventory

Currently, terrestrial laser scanning (TLS) is the most accurate non-contact method of measurement to derive detailed forest inventory information at the plot level [15•]. The main drawbacks of this technology are the high hardware costs [53], and the time required for multiple scans mitigating occlusions along with post-processing to provide full coverage of a plot [72]. Mobile laser scanning systems reduce acquisition time but high costs remain [73].

The deficiencies of traditional field data collection and the need for reducing the cost of alternative laser scanning solutions have encouraged the application of terrestrial photogrammetry for forest inventory. Efforts to utilize terrestrial photogrammetric point clouds for deriving forest parameters are motivated by the low cost of the equipment for data collection, the automated SfM-based data processing and the potentially simple and fast data acquisition [74]. Requiring only a camera, typically handheld or mounted to a pole or tripod, terrestrial SfM photogrammetry makes such a system highly mobile, reducing the risk of occlusion yet providing a level of detail comparable to TLS [75].

Studies on terrestrial SfM for forestry purposes have become more frequent in recent years (Table 2) and mainly focus on linear rather than volumetric tree metrics. Studies vary according to (i) the scale of application, i.e. at plot level and individual tree reconstruction; (ii) the measured forest parameters like tree position, DBH, height and stem curve; (iii) the resolution of the sensor, e.g. video, mobile phone and SLR camera; (iv) the camera configuration and photographic path and (v) the equipment used to acquire the images, e.g. pole, tripod, camera rig and backpack. Based on these aspects, an overview of key work on terrestrial SfM applications, together with the obtained accuracies, acquisition method and georeferencing approach, is provided.

Most of the recent studies on photogrammetric measurements of forest parameters are based on the single-camera technique, according to which overlapping images are acquired around the plot (Fig. 4).

Fig. 4

Example of a terrestrial SfM survey [51] in an open forest plot showing a the configuration of camera positions and orientations, dense point cloud and ground control points; b an example of an image and the dense point cloud from the same point of view and c the dense point cloud of a single stem without RGB colouring together with a 10-cm cross-section at 1.3 m (light blue point cloud)

Terrestrial photogrammetry has been evaluated in several studies in the past few years at plot scales [14, 15•, 16, 74, 76,77,78]. In these studies, DBH and tree locations were estimated in circular plots with diameters ranging from 12 to 30 m. The reported RMSE of the DBH ranged from 0.88 to 6.80 cm compared with either field or TLS DBH measurements. Tree detection ranged between 60 and 98%. Results were influenced by the complexity of the forest plot, the acquisition path and mode.
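One way in which DBH can be derived from a terrestrial point cloud is to extract a thin horizontal slice around breast height (1.3 m) and fit a circle to the stem points. The sketch below uses an algebraic least-squares circle fit as an assumed, illustrative choice; the reviewed studies use a variety of fitting methods. It further assumes a scaled, height-normalized point cloud of a single stem, and the synthetic points are hypothetical.

```python
import math

def breast_height_slice(points, centre_m=1.3, thickness_m=0.10):
    """Keep the (x, y) of points whose height lies within the slice."""
    half = thickness_m / 2.0
    return [(x, y) for x, y, z in points if centre_m - half <= z <= centre_m + half]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_circle(points_xy):
    """Algebraic least-squares circle fit; returns (centre_x, centre_y, radius)."""
    n = len(points_xy)
    mx = sum(x for x, _ in points_xy) / n
    my = sum(y for _, y in points_xy) / n
    # Centre the coordinates to keep the linear system well conditioned.
    pts = [(x - mx, y - my) for x, y in points_xy]
    z = [x * x + y * y for x, y in pts]
    sx, sy = sum(x for x, _ in pts), sum(y for _, y in pts)
    sxx, syy = sum(x * x for x, _ in pts), sum(y * y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    rhs = [sum(x * zi for (x, _), zi in zip(pts, z)),
           sum(y * zi for (_, y), zi in zip(pts, z)),
           sum(z)]
    # Normal equations of x^2 + y^2 = D*x + E*y + F, solved with Cramer's rule.
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    det = det3(m)
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][k] = rhs[r]
        solution.append(det3(mk) / det)
    d, e, f = solution
    cx, cy = d / 2.0, e / 2.0
    return cx + mx, cy + my, math.sqrt(f + cx * cx + cy * cy)

# Synthetic stem: points on a partial arc (one side occluded) of a 30 cm stem,
# plus two points outside the slice that must be ignored.
stem = [(2.0 + 0.15 * math.cos(math.radians(20 * k)),
         3.0 + 0.15 * math.sin(math.radians(20 * k)),
         1.26 + 0.008 * k) for k in range(11)]
stem += [(2.5, 3.0, 0.5), (2.0, 3.4, 2.0)]

cx, cy, radius = fit_circle(breast_height_slice(stem))
dbh_cm = 2.0 * radius * 100.0
print(round(cx, 3), round(cy, 3), round(dbh_cm, 1))
```

With real data, noise and outliers (branches, understorey) would normally call for a robust variant or prior filtering of the slice.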

The impact of photographic path on the accuracy of forest metrics derived from terrestrial SfM point clouds was first investigated by Liang et al. in 2014 and 2015 [74, 75], followed by Mokroš et al. in 2018 [78]. According to Mokroš et al. [78], the optimal acquisition solution consisted of portrait images, stop and go shooting mode and a path leading around the plot with two diagonal paths through the plot. In contrast, Liang et al. [75] concluded that the image matching results of landscape images were optimal together with a photographing path based on inside and outside of an inner circle (Fig. 4). For complex forest plots, Piermattei et al. [15•] found that the optimal acquisition path was a combination of the solutions found by Liang et al. [75] and Mokroš et al. [78]: landscape images, stop and go mode around the plot pointing in, followed by an inner circle pointing out of the plot and two diagonals. This solution allowed reconstruction of stems with an accuracy of a few centimetres up to a few metres above ground. For low-density and medium-density forests, Liu et al. [77] proposed a system that combines two pole-mounted cameras with a RTK GNSS for continuous capturing. Compared with total station measurements, their automatic determination of tree position, DBH and height achieved RMSEs of 0.16–0.2 m, 0.92–1.13 cm and 2.41–2.51 m respectively.

Most investigations on the use of terrestrial SfM were performed reconstructing single trees (i.e. not the entire forest plot) [16, 76, 79,80,81,82,83]. In those studies, DBH was the most frequently estimated parameter and often compared with TLS data for accuracy assessment. Although sub-centimetre accuracy was achieved in all cases, the obtained RMSEs differed according to the approaches used, forest types and survey conditions, i.e. natural forest and controlled field settings.

Not all the research studies report the time required for collecting the images. This can range from around 10 min to 2 h depending on the system used, parameters to be estimated, plot size and survey configuration, excluding the time to acquire scaling measurements. However, the accuracy of the scaling factor is crucial for forest plot and individual tree reconstruction [74]. To scale the photogrammetric point cloud, most of the studies used targets surveyed by total station, implying that additional equipment needs to be carried into the field, consequently increasing the total acquisition time per plot and reducing the portability of the entire surveying system. Aside from systems requiring a GNSS solution [77], currently, only Liang et al. [74] tested natural reference objects, e.g. tree stems, for the determination of correct scale. Their results showed that both natural reference objects and artificial targets worked effectively.
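The role of the scaling factor can be made concrete with a minimal sketch. It assumes that two targets are identified in the unscaled model and that the true distance between them is known from a field measurement; the function names and coordinates are hypothetical.

```python
import math

def scale_factor(model_a, model_b, surveyed_distance_m):
    """Ratio between the surveyed distance and the same distance in model units."""
    return surveyed_distance_m / math.dist(model_a, model_b)

def apply_scale(points, factor):
    """Scale every coordinate of every point by the same factor."""
    return [tuple(factor * coordinate for coordinate in point) for point in points]

# Two targets in arbitrary model units, surveyed to be 2.0 m apart in the field.
target_a, target_b = (0.0, 0.0, 0.0), (0.5, 0.0, 0.0)
factor = scale_factor(target_a, target_b, 2.0)
scaled_cloud = apply_scale([(1.0, 2.0, 3.0)], factor)
print(factor, scaled_cloud)
```

Because all lengths are multiplied by the same factor, a relative error in the reference distance carries over directly to every derived length: a 1% error in scale results in a 1% error in DBH.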

Health Assessment and Monitoring

As part of sustainable forest management, assessment and monitoring of forest health condition play a crucial role. With threats to forests expected to increase globally [84], the identification of declining forest health induced by biotic, abiotic and anthropogenic stress agents becomes imperative. RS approaches offer rapid, spatially inclusive and objective ways to monitor forest health when compared with field assessments. With the aim of identifying and observing stress in plants, multi- and hyperspectral sensors are capable of capturing information outside the visible spectrum, which allows for the estimation of biochemical plant traits like chlorophyll, leaf pigments and canopy water content [85]. Spatially continuous spectral mapping used to be exclusive to the manned airborne surveying domain. However, in recent years, lightweight sensors with discrete narrow spectral bands suitable for UAV mounting have become commercially available, allowing researchers to collect their own aerial spectral data [17, 86,87,88]. Such 2D spectral imagers may be used for SfM-based photogrammetric reconstruction and orthophoto generation similarly to RGB cameras, although they typically exhibit lower resolution.

In contrast to the field of precision agriculture, where SfM-based processing of spectral image data is widely applied [89], in forestry there are currently only a few examples where SfM-derived mapped spectral products have found application. Early UAV/SfM-based studies of forest health made use of off-the-shelf RGB cameras, modified to capture near-infrared (NIR) images. Lehmann et al. [19] and Michez et al. [18•] used an object-based image analysis (OBIA) approach to segment and classify their scenes in order to identify declining tree health caused by biotic stress agents on both alder and oak. They achieved good overall classification accuracies (79.5–84.1% and 82.5–85% at their respective study sites) across five classes. However, they pointed out the limitations of NIR-modified RGB cameras, namely that visible and NIR spectra are not separable on the same sensor, spectral contamination due to broad and overlapping bands (see also Pauly [90]) as well as the inability to correct for changing light conditions (as downwelling irradiance is not captured).

Näsi et al. [86] used hyperspectral image data combined with the SfM-derived DSM for bark beetle damage detection at the individual tree level, achieving an overall classification accuracy of 76% when distinguishing between healthy, infested and dead trees. In a follow-up study, Näsi et al. [91] concluded that the individual tree-based approach, facilitated by the combination of 3D and spectral data, provides a promising and cost-efficient alternative to field-based assessments of pest infestation. Minařík and Langhammer [92] also used a UAV-SfM-based mapping approach to map bark beetle forest disturbance and found that bands from the red edge and NIR part of the spectrum were most suited for stress detection. These findings are in line with the results from Dash et al. [17], who assessed the potential of a commercially available multispectral sensor, the Micasense RedEdge (Micasense Inc., Seattle, WA, USA), for the detection of early signs of stress during a simulated disease outbreak in a pine plantation. In the applied random forest classification of time-series data, the normalized difference vegetation index (NDVI) proved to be the best-performing predictor variable to map physiological stress symptoms along with declining tree health. Further recent examples of forest health monitoring are Baena et al. [93] and Brovkina et al. [50], both successfully applying an OBIA approach on SfM-mapped NIR image data stemming from a modified consumer RGB sensor to separate between dead and living trees.
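NDVI is computed from the red and NIR bands as (NIR − Red) / (NIR + Red). A minimal sketch is given below, assuming per-crown mean reflectance values (between 0 and 1) taken from a calibrated orthomosaic; the numbers are hypothetical and only illustrate the direction of change expected under stress.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for a pair of reflectance values."""
    total = nir + red
    if total == 0:
        return 0.0  # assumed convention for empty or masked pixels
    return (nir - red) / total

# Hypothetical mean crown reflectances (NIR, red).
healthy_crown = ndvi(0.50, 0.05)
stressed_crown = ndvi(0.30, 0.10)
print(round(healthy_crown, 3), round(stressed_crown, 3))
```

Stress typically lowers NIR reflectance and raises red reflectance, so the index decreases; in a time series, such per-crown values could serve as predictor variables for a classifier.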

Discussion

Used as a complement to existing RS data (e.g. LiDAR) or on its own, SfM photogrammetry has shown great potential for forestry. Particularly attractive is the ability to use uncalibrated cameras paired with unstable or handheld platforms, enabling the use of low-cost and non-expert equipment. Ground and aerial SfM surveys can be carried out with high flexibility, offering the option of more frequent RS surveys that incorporate, e.g. phenological changes in the analysis [59, 65, 71]. The implementation of SfM algorithms in modern photogrammetric software enables on-demand processing with little required user input. SfM photogrammetry thus presents a highly accessible and versatile solution to the acquisition of very high-resolution 3D data. In this regard, SfM empowers common forestry practitioners to produce near real-time data analytics with minimal investment in hardware and software.

Additional value in a SfM-based processing chain derives from the ability to provide multiple geospatial data products (i.e. 3D models and orthomosaics) from a single sensor. Spectral information is inherently linked to the reconstructed structural data and derived products as these are generated directly from the input imagery. Studies on forest health particularly highlight the benefits of using the fused structural and spectral information that SfM-based processing of UAV image data provides [18•, 50].

Point clouds generated from high-resolution images can exhibit point densities greater than those of LiDAR, providing more detailed information on the visible surface of forests. The increased spectral variation stemming from such high-resolution data may thereby provide another valuable source of information, namely texture, as is the case in an OBIA approach [94]. Alongside the computational analysis, high-resolution SfM-generated models appear visually realistic, providing experts with a near-true depiction of the scene. Being intuitive to understand, SfM models thus hold an important advantage over coarser remote sensing methods by enabling rapid visual assessment and/or validation.

As with all RS data, SfM-derived products will only ever be an approximation of the Earth’s surface, and some limitations always remain. With SfM photogrammetry being a new technology, the boundaries of these limitations have not yet been fully tested. Some of the main current challenges with SfM photogrammetry for forest applications relate to the following:

  1. Reproducibility:

    With SfM photogrammetry enabling frequent surveys, variations in illumination, atmospheric and seasonal conditions are inevitable between acquisitions. Because SfM photogrammetry is a passive sensing technique, these variations are directly reflected in the data and thus in the replicability of analyses. To ensure that SfM data can be used on demand, with acquisitions at different times of the year, it is therefore crucial to develop protocols for varying conditions and models that account for variations in the data.

  2. Availability of accurate DTMs:

    Most airborne inventory studies presented here adopted highly accurate DTMs (e.g. ALS-based DTMs) to normalize UAV-SfM data. Such DTMs are relatively rare around the globe, which potentially limits the area of application of UAV-SfM. To overcome this issue, some authors proposed the use of DTMs generated from the UAV-SfM data themselves [41] or the use of coarse resolution global DTMs such as shuttle radar topography mission data (SRTM) [41]. While the former type of DTM is obtainable only in open forests, the latter source was found to be unsuitable for estimation of aboveground biomass. A conceptually novel approach came with the study by Giannetti et al. [13•] who, to overcome the abovementioned limitations, proposed the use of UAV-SfM data-derived variables without prior normalization (i.e. DTM-independent variables). Their results showed that models fitted to raw UAV-SfM data alone predicted stem volume with similar accuracy to ALS data, even in highly productive broadleaf forest in steep terrain. Despite such encouraging results, it remains essential to test the method by Giannetti et al. [13•] across a wider variety of forest types and response variables. Furthermore, the greater complexity of DTM-independent variables compared with more traditional explanatory variables could limit the transferability of the models through space and time.

  3. Lack of acquisition and processing protocols:

    The success of a SfM-based photogrammetric acquisition depends largely on the sensor used, the photographic path and viewing angles, the chosen image overlap and the composition of the scene. Adjustments to the acquisition approach to ensure quality data are currently made on the basis of the surveyor’s experience. Protocols that ensure reliable SfM outputs across forest types and phenological stages, while minimizing acquisition effort, need to be established. Eltner et al. [24] suggested a protocol for the collection of image data in geoscientific studies, which should be extended to take into account forestry-specific factors. Additional research is required on the parametrization of SfM-based photogrammetric software for vegetated scenes specifically. Processing protocols designed to deliver data adequate to the research question and to optimize processing speed are needed.

  4. Image matching issues:

    Forests can be challenging scenes for the feature matching algorithms underlying a SfM workflow. Their fine uniform texture, repeating patterns and potential movement (e.g. branches in wind) can introduce uncertainty in matching, consequently leading to incomplete reconstruction and/or noisy point clouds. In such cases, the likelihood of identifying visual similarities in overlapping images can be raised by increasing the distance to the area of interest (AOI), which increases the number of features per image and decreases perspective distortions. Coarser ground sampling distances (GSDs) and higher image overlaps were shown to positively influence image matching [95, 96]. The overlap should thus be increased when decreasing the GSD (i.e. images with finer detail). Other potential mitigation strategies for reconstruction errors, like the use of high-accuracy position and orientation information for reduction of matching uncertainty, have yet to be studied.
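
The relationship between camera distance, GSD and overlap described under image matching can be sketched with the textbook relations for a nadir-looking pinhole camera. This is a planning aid under simplifying assumptions (flat surface, nadir view, image height aligned with the flight direction), not a protocol from the cited studies; the function and parameter names are chosen for this example.

```python
def ground_sampling_distance(sensor_width_mm: float, focal_length_mm: float,
                             distance_m: float, image_width_px: int) -> float:
    """Size of one pixel on the imaged surface (m/px) for a nadir image.

    distance_m is the camera-to-surface distance; over forest, the canopy
    top is closer to the camera than the ground is.
    """
    return (sensor_width_mm * distance_m) / (focal_length_mm * image_width_px)


def exposure_spacing(gsd_m: float, image_height_px: int,
                     forward_overlap: float) -> float:
    """Distance between consecutive exposures (m) for a given forward overlap.

    Assumes the image height is aligned with the flight direction.
    """
    if not 0.0 <= forward_overlap < 1.0:
        raise ValueError("forward_overlap must be in [0, 1)")
    footprint_m = gsd_m * image_height_px  # along-track footprint of one image
    return footprint_m * (1.0 - forward_overlap)


if __name__ == "__main__":
    # Illustrative camera values only, not taken from any cited study.
    gsd = ground_sampling_distance(13.2, 8.8, 100.0, 5472)
    print(f"GSD: {gsd:.4f} m/px")
    print(f"Spacing at 80% overlap: {exposure_spacing(gsd, 3648, 0.8):.1f} m")
    print(f"Spacing at 90% overlap: {exposure_spacing(gsd, 3648, 0.9):.1f} m")
```

Doubling the camera distance doubles the GSD, and raising the overlap shortens the spacing between exposures. Because the canopy surface lies closer to the camera than the ground does, the overlap achieved at canopy height is lower than the value planned at ground level.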

To widely employ SfM photogrammetry in operational forestry, future research needs to tackle the abovementioned hurdles. It is essential to develop a consensus on acquisition protocols and on parametrizations of SfM photogrammetry software that are suited to answering specific research questions across forest types and environmental conditions. We have started to gain some understanding of how image quality, overlap, GSD and photographic path influence SfM-based reconstruction [15•, 43••, 95,96,97]. However, prior to processing, uncertainty remains in predicting the completeness of these photogrammetric models. More in-depth work on these influential parameters is needed in conjunction with the development of methods that allow for reliable quality estimation of SfM-based outputs. Towards the quantification of data quality, James et al. [98] presented a method for estimating the precision of each point produced within the SfM pipeline by repeatedly running bundle adjustments on a set of input images. These ‘precision maps’ allow practitioners to describe the spatial variability of precision within SfM-derived products and gain insight into limitations in a given survey (such as image quality or control-point measurements). To our knowledge, ‘precision maps’ have not been applied in the context of forested scenes. In forestry, future studies would benefit from this method to objectively describe the data quality of SfM-derived products and thereby reduce uncertainty in subsequent analysis.
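
The idea behind such precision maps can be illustrated with a short sketch: given the coordinates of the same tie points from several repeated bundle adjustments, the per-point standard deviation along each axis serves as a precision estimate. The sketch covers only this final summary step. The repeated bundle adjustments themselves are assumed to be run in external photogrammetric software, and the input layout and function name are assumptions for this example rather than the implementation of James et al. [98].

```python
from statistics import stdev
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def point_precision(runs: Sequence[Sequence[Point]]) -> List[Point]:
    """Per-point standard deviation (sx, sy, sz) across repeated runs.

    runs[k][i] holds the (x, y, z) estimate of point i in run k. All runs
    are assumed to list the same points in the same order.
    """
    if len(runs) < 2:
        raise ValueError("at least two runs are needed to estimate precision")
    n_points = len(runs[0])
    if any(len(run) != n_points for run in runs):
        raise ValueError("all runs must contain the same number of points")
    precisions = []
    for i in range(n_points):
        # Sample standard deviation of each coordinate across the runs.
        sx = stdev([run[i][0] for run in runs])
        sy = stdev([run[i][1] for run in runs])
        sz = stdev([run[i][2] for run in runs])
        precisions.append((sx, sy, sz))
    return precisions
```

Interpolating these per-point values onto a grid would yield a map of how precision varies across the surveyed stand, for example between open ground and dense canopy.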

Conclusions

A camera and a computer are the basic requirements for SfM photogrammetry. As the examples given here show, and considering the valuable information that can be extracted from SfM-derived data, SfM photogrammetry has great potential for forest practitioners and researchers. When UAVs are added for the acquisition of aerial image data, the canopy of a forest can be mapped nearly in real time, responding rapidly to management needs. The temporal and spatial dimensions that SfM photogrammetry can provide enable the assessment and monitoring of forests in an economical way that was not previously possible.

However, the constraints inherent in SfM photogrammetry as a passive optical method will remain. Influencing factors like viewing geometry, lighting and the availability of static texture have to be carefully considered prior to every survey. We suggest that, in order to enable the collection of fit-for-purpose data with predictable quality, further work is needed on acquisition and processing protocols specific to forestry. Further progress in these areas will facilitate the move away from proof-of-concept studies and towards the operational application of SfM photogrammetry in forestry.