Introduction

Tree height is an important factor when investigating forests and forest density, and an important variable for estimating canopy structures and mean stem diameters (McCombs et al. 2003). Tree height is also becoming an increasingly important input variable for analyzing the effects of disasters such as the spread of wild fires, and for screening forests. Several studies have been conducted using remote sensing techniques to measure tree heights, since they are considered some of the most accurate and efficient methods for analyzing forests over a wide area (Næsset 2002; Popescu and Wynne 2004). In atmospheric environment studies, for example, the surface roughness of the land is an essential variable for parameterizing momentum and heat exchange models. In this case, the effects of tree height on roughness become very important (Colin and Faivre 2010).

Studies have also been conducted to extract forest information, where tree height was used to estimate the forest biomass as a source of absorption (Lee and Ru 2012). Forests grow over time and can be affected by natural disasters such as wildfires. As a result, it is essential to update forest information in order to ensure its usability (Chang et al. 2011). Several studies have been conducted to economically determine the tree height distribution in a wide forest area using remote sensing methods. Reinartz et al. (2006) compared the digital elevation models (DEMs) derived from the optical stereo imagery of SPOT-5 high-resolution stereo (HRS) with DEMs derived from the Shuttle Radar Topography Mission (SRTM). The forest canopy and the height of terrain were not modeled over the forest area using either method. The height difference between the two DEMs provided an estimate of the canopy heights of the trees, even though the scattering process (when radar was used) and the reflection and matching technique (when the optical technique was employed) produced lower values than the actual height of the trees. A better estimate of the actual mean tree height is plausible if additional knowledge of tree parameters such as the species and density is provided.

Cornet et al. (2006) compared a three-dimensional database of geographic objects such as roads, trees, and buildings with the height data generated from SPOT-5 imagery, and precisely computed a digital surface model (DSM) ranging from 9 to 12 m utilizing the land-cover characteristics. Simard et al. (2006) determined biomasses by estimating tree height using SRTM data, airborne LIDAR data, and high-resolution United States Geological Survey (USGS) DEM data.

Unless specifically referred to as a DSM, a generic DEM only includes the elevation of the terrain (bare earth z values) without vegetation and manmade features. A bare earth DEM is generally synonymous with a digital terrain model (DTM) (Maune 2001). A DSM is similar to a DEM or DTM, except that a DSM includes the elevations of the top (reflective) surfaces of buildings, trees, towers, and other features above the bare earth. Theoretically, the heights of features on the ground can be calculated by subtracting the height of the DTM from that of the DSM. The tree heights can be calculated using the same approach (St-Onge and Achaichia 2001).

Photogrammetric approaches using stereo satellite imagery can produce a DSM of a target area, even for inaccessible areas. However, DSMs produced using this approach can contain many abnormal elevations due to matching errors in the stereo satellite images, which require lots of time and effort to correct. Baltsavias et al. (2006) examined the potential use of automated image matching methods for DSM generation and the derivation of tree parameters, especially the heights and growth rates of trees, in a mire environment. However, image matching can still lead to many complex and difficult problems. Difficulties when performing image matching from the stereo data sets are caused by, for example, occlusions, repetitive patterns, shadows, perspective differences, surface roughness, surface discontinuities, and mixed surfaces. Improvements in matching can be achieved among stereo images by applying geometric constraints (e.g., exploiting known image orientations), sophisticated matching methods, automatic error detection, density matching, and the use of a small ground pixel size to reduce surface smoothing (Potůčková 2004; Elaksher and Bethel 2010).

The phase correlation method (Kuglin and Hines 1975; Foroosh et al. 2002), which exploits the invariant properties of the Fourier transform, is preferable to matching stereo images acquired under different circumstances, such as under varying lighting or varying atmospheric conditions. It is specifically well suited to images with low-frequency or frequency-dependent noise, such as in forest areas. Stone et al. (2001) and Stone (2011) showed a method based on phase differences in the Fourier transforms of similar images for determining the difference in translation between two images to within a tenth of a pixel. He assumed that the images have identical orientations and scales and differ only in their translations. This method can be applied to image sets with different translations. For this reason, the phase correlation method is suitable for use when there are matching problems in which the images are misaligned by relatively small displacements, as in orthoimages.

Georgopoulos and Skarlator (2003) used a pair of stereo images to refine coarse DEMs. Two orthoimages were generated—one for each input image. The parallaxes between the two orthoimages were computed with image correlation methods using reference and search templates taken from the two orthoimages, respectively. The displacement of the best-matched point was used to compute elevation errors in the input coarse DEM. The errors were then used to refine the DEM. A similar iterative approach was applied until the parallax dropped below a certain threshold (Höhle and Potuckova 2005; Amitabh et al. 2005).

More stable DEMs were generated by refining the existing DEM by iteratively measuring the parallaxes between the orthoimages (Schenk 1989; Norvelle 1996; Yeu et al. 2000; Amitabh et al. 2005). A DEM with the characteristics of a DSM can be produced from a DEM with the characteristics of a DTM by applying the DEM-refinement method to the stereo orthoimages. It is also possible to calculate the height above the bare earth of any feature by evaluating the difference between the newly generated DSM and the existing DTM. In particular, Vector Interim Terrain Data (VITD) has provided information about forested areas and tree heights (NGA 1995). We propose that the tree heights of VITD can be updated by estimating the differences between the DSM and DTM inside the target area.

In the study described in the present paper, we generated DSMs using the DEM-refinement method with satellite stereo orthoimages, and the heights of features were calculated by evaluating the differences between the DSM and DTM. Our approach of using satellite imagery differs from previous studies which used aerial photographs. We focused on phase correlation matching methods, which have already been shown to yield good results when measuring earthquakes from optical satellite images (Leprince et al. 2007). Our DEM-refinement method based on phase correlation matching using stereo orthoimages is also different from conventional automatic terrain extraction (ATE) methods of using stereo images. Our approach allows us to avoid errors and spikes in terrain elevations due to mismatches, which leads to more stable automatic generation of DSMs. It is also possible to selectively refine the DEM inside the target area utilizing the existing geographic information system (GIS) data.

The approach we used in this study was as follows. First, DSMs were generated from an existing DEM by applying the DEM-refinement method to the stereo orthoimages, and the tree heights in a forested area were calculated by comparing the differences between the DSM and DTM. Second, the accuracies of the calculated tree heights were analyzed. Third, information on the tree heights in a forested area was extracted from VITD, and a method for updating the attributes of the tree heights in the VITD was developed accordingly.

Study area and approaches

Study area

As shown in Fig. 1, the northeast forested area of the demilitarized zone (DMZ) located in the middle of the Korean Peninsula (37°00′N, 127°00′E to 38°00′N, 129°00′E) was selected to test our ideas. This site is a mountainous area with a maximum terrain elevation of 1,570 m, an average terrain elevation of 466 m, and an average terrain slope of 17°. This area mainly consists of deciduous, evergreen, and mixed trees.

Fig. 1
figure 1

Geographical location and VITD of the study site

The VITD of the study area was produced using a paper map, and the data were updated in 1998. The map scale of the VITD was 1:50,000. The VITD provided us with information about the tree types and tree heights of the study area. The feature and attribute classifications are based on the Digital Geographic Information Exchange Standard (DIGEST) (DGIWG 2000) and the Feature and Attribute Coding Catalogue (FACC), respectively. The tree height attribute in FACC is predominant height (PHT). The PHT is the greatest height attained by over half of the features, although the average height of the features was used if the PHT was not available.

The DEM used in this study was digital terrain elevation data (DTED) produced using the contour information within a topographic map of scale of 1:50,000 by the Korean Army Mapping Agency. The grid size of the DEM was one arcsecond. The coordinate system of the DEM is the geographic latitude and longitude based on the World Geodetic System 1984 (WGS84) datum. The height of the DEM was the height of the bare terrain, which was used as a reference bare earth height in this study. The DEM ranged from 38° to 40° in latitude and from 127° to 129° in longitude, which corresponds to a total of four cells according to DTED specification.

The ground control points (GCPs) were accurately selected by a skillful operator manually using an analytical photogrammetric plotter. It is estimated that the manually selected accuracy of GCPs using the analytical plotter is within a subpixel for SPOT-5 images. A total of 125 GCPs were collected from the study area at intervals of about 10 km.

SPOT-5 high-resolution geometric (HRG) panchromatic stereo images were used in this study. SPOT-5 images were acquired between 4 October 2003 and 25 May 2004. The processing level of each image was 1A, which leaves the image data in almost raw form, with only radiometric corrections performed. The stereo pairs consisted of right images with look angles of 8.7°–26.2° and left images with look angles of −30.1° to −16.2°. The ground sample distance (GSD) was 2.5 m. Figure 2 represents the coverage map of satellite images of the study area and the distribution of GCPs.

Fig. 2
figure 2

SPOT-5 satellite images and the distribution of GCPs at the study site

Method used to calculate tree heights

Workflow

Figure 3 illustrates the overall workflow during the experiment. The workflow can be categorized into four stages. (1) Orthoimages were generated from satellite images using a georeferencing model. (2) Stereo matching was performed to determine the parallax between the stereo orthoimages, and feature heights on the ground were calculated based on the parallax data. (3) A DSM was generated from an existing DTM using a DEM-refinement method. (4) Information on the tree heights in a forested area were extracted from forest GIS data and the existing tree heights were updated based on the difference between our newly generated DSM and the DTM.

Fig. 3
figure 3

Workflow of the method used during the experiment

Generation of orthoimages

An orthoimage is remotely sensed image data in which the displacements of features in the image caused by terrain relief and sensor orientation have been mathematically removed (FGDC 1999). Orthoimages were generated for each image using stereo satellite images acquired with different look angles at the study area. The orthoimages were generated by calculating the external orientation parameters (EOPs), which describe the location and orientation of the bundle of rays in the object coordinate system with the six parameters: the projection center coordinate and the rotations around the three axes, via bundle adjustments after measuring the GCPs of the stereo satellite images.

The results of the bundle adjustments showed high position accuracies to generate orthoimages using the adjusted EOPs and the existing DEM. The ERDAS Leica Photogrammetry Suite (LPS) of software was used to measure GCPs in SPOT-5 images only, and our own custom-made tools were employed to generate orthoimages in all other processes.

DEM refinement

In many applications of DEMs, surface elevation measurements must be corrected for the elevations above the bare earth. The measurement of surface elevation involves the identification of areas of vegetation and manmade features and the correction of elevation measurements above the bare earth. Areas of vegetation can be identified using a combination of imagery, feature maps, and existing GIS data. Corrected surface elevations can be used to determine the elevations of features on the bare earth, while a method can also be used that adds the heights of features to the elevations in DEMs.

Orthoimages are generated using DEMs that do not include the elevations of manmade features such as buildings or natural objects such as trees on the bare earth, and accordingly the images contain parallaxes that are proportional to the heights of these features. However, the heights of features can be calculated if the displacement among orthoimages is correctly measured, as shown in Fig. 4. Then the relationship between the x-parallax among orthoimages and the feature height above the bare earth can be described as

$$ dh = dx \cdot \frac{h}{B}, $$
(1)

where dh is the height correction, dx is the difference in x (in ground units) between two orthoimages created from the left and right images of the stereo pair, B is the base, and H is the flying height.

Fig. 4
figure 4

Parallaxes and height differences in the DTM and DSM caused by tree features

A DEM-refinement method using orthoimages can determine the displacement between orthoimages generated from stereo images using an image matching method. During the processing, an existing DEM can be refined by recalculating the accurate elevation values from the measured displacement. Once measurements of the displacements between orthoimages have been applied to the manmade and natural features in the images, the refined DEM may represent the characteristics of a DSM. Our DSM-generation process based on the refinement of orthoimages was as follows:

  1. 1.

    The initial orthoimages were generated using the existing DEM (the reference DEM) and GCPs using satellite images. In our study, SPOT-5 satellite imagery was used.

  2. 2.

    The parallax between two orthoimages was determined by phase correlation image matching. A similarity measurement and limit conditions were used to determine accurate matching points. Two images were considered to be matched when their phase correlation attained its maximum value. Best-match conditions were evaluated by considering the relative displacement between the matching points, the grid spacing, and horizontal accuracy of the reference DEM, e.g., geometric constraints.

  3. 3.

    The elevation errors in the existing DEM were determined based on the parallax of the matching points in the orthoimages. A refined DEM was generated by correcting errors in the existing DEM.

  4. 4.

    Since the probability of exterior orientation errors is less than that of the DEM, it was not necessary to iterate the previous steps until the orthoimages became similar. However, if necessary, a new orthoimage pair could be generated using the refined DEM. Steps 2–3 were then iterated until all of the parallax values were below a pre-defined threshold value, e.g., subpixel displacement.

During the image matching, forested areas may lead to inaccurate results due to occlusions, repetitive patterns, shadows, and perspective differences between images. To reduce these effects, the phase correlation matching method was used in our study. Since the phase correlation matching method measures matching points based on the invariant properties of the Fourier transform, it is preferable when images are acquired under varying lighting circumstances and under small translations and rotations.

It is advantageous to perform orthoimage matching, which was the case in this study. Orthoimage matching can exploit various geometric conditions based on the orientation during the modeling process, while the matching method can apply specific measurement conditions. It is also possible to remove errors automatically from the matching process based on the resolution and accuracy of the reference DEM, which produces more stable results than conventional approaches.

The SPOT-5 satellite images used in this study were pushbroom scanning images and were not geocoded. To apply our proposed methods, input satellite images were orthorectified using bundle adjustment. To apply the DEM refinement method—which has been applied to aerial photographs—to the pushbroom satellite imagery, the method had to be modified. Input imagery is acquired line by line in pushbroom scanning processes, but aerial photographs are acquired in a single exposure. This geometric difference leads to different parallax calculations in pushbroom and aerial photographs. We used the results of the bundle adjustment, rather than applying a horizontal difference to correct elevation errors, which is the case for aerial photographs. The source image coordinates were calculated from the ground coordinates of the left and right orthoimages, whereas the DEM elevations in given ground coordinates were determined by recalculating the coordinates based on the parallax between the source image coordinates. These elevation values were then used to correct the height of the DEM (Yeu et al. 2000).

A DEM was used to produce the initial orthoimages. The features, such as buildings and trees, were not reflected in the input DEM. Since these features represent the parallaxes of each other in the orthoimages, the elevation differences between the orthoimages were calculated using a refinement method. The calculated elevation difference represents the height of the features in the DEM, which facilitated the generation of the DSM.

Calculation of canopy height and tree-height updating

A forest stand is a contiguous area that contains a number of trees that are relatively homogeneous or have a common set of characteristics (Næsset 1997). Since the results of matching over the forest area are expected to produce different height levels of treetops, the maximum height of treetops for a given area could be used to estimate the mean tree heights. Estimates of mean heights for entire stands were obtained by subdividing the stands into many cells of a fixed size using regular grids. This is the grid-based approach for mean height estimation of forest stands.

Sets of points corresponding to the tree heights on the ground were obtained by comparing the difference between the DSM and DTM. The points of the forest canopy were selected using a GIS method and forested polygons of the VITD. The canopy height was calculated using the average height of the point sets within the polygons of the forested area. The PHT (the tree height in the VITD) represents the medium value for a given region, not the individual tree height. Table 1 shows the reference values used to determine the PHT attribute values. The field PHT values in the attribute table of forest features in the VITD were updated with the calculated canopy heights.

Table 1 Reference values used to determine the PHT attribute values

Software tools

The ERDAS IMAGINE LPS (Leica Geosystems 2005) was used to measure GCPs in SPOT-5 images. In all other steps, we used our own custom tools that had been developed using MATLAB codes, including the following program modules:

  1. 1.

    The orthoimage generation module computed the EOPs of SPOT-5 stereo images using GCPs and generated orthoimages using the EOPs and the DEMs

  2. 2.

    In relation to stereo matching and feature height calculations, there was a module to produce parallaxes using the stereo orthoimages with phase correlation matching methods, a module to remove mismatch points with similarity measures, a module to calculate ground coordinates and corresponding image coordinates by back-tracing, and a module to compute feature heights by the space intersection method

  3. 3.

    The DEM refinement and DSM generation modules removed abnormal data from the feature heights using planimetric limitation conditions, refined the existing DEM using the newly generated feature heights, and finally generated a DSM

In addition, a tree height updating model was developed using ESRI ArcGIS Model Builder to extract the tree heights from each forested area, to update the tree heights, to compute the PHT, and to update the attribute values of the GIS data. All processes shown in the workflow of Fig. 3 were performed automatically, without any user’s intervention, using our own custom tools.

Experimental results

DSM generation

Production of SPOT-5 satellite orthoimages

The SPOT-5 satellite images of the study site were modeled using the bundle adjustment. The accuracy of the modeling of regions was checked by comparing the root-mean-square error for X and Y with the GCPs. The final result was at the subpixel level. Orthoimages were generated using the results of the bundle adjustment and existing DEMs of the study area corresponding to the SPOT-5 stereo coverage area. The grid interval of the DEM for orthoimage generation was 30 m and the GSD of the orthoimages was 2.5 m. A UTM coordinate system was used to evaluate the accuracy with a metric unit.

Refinement of DEM data

To determine the parallax between orthoimages, matched points were identified using a phase correlation image-matching method. The sizes of the reference and search templates were determined by considering the grid intervals of the refined DEM (30 m) and the GSD of SPOT-5 orthoimages (2.5 m). The size of the template was chosen to be big enough to cover adjacent DEM intervals (2 × 30 m = 60 m), which is bigger than 24 pixels (60 m/2.5 m = 24 pixels). The threshold of the phase correlation coefficient for verifying the accuracy of the matching was determined as <0.2. Another method used to identify mismatching was as follows: if the parallax of a matching point between orthoimages was greater than the estimated range, it was considered a mismatch.

After calculating the ground coordinates using the matching points between two orthoimages, the source stereo image coordinates were obtained by back-tracing in the SPOT-5 image. The three-dimensional coordinates were recalculated using the space intersection method with the image coordinates of the stereo pair. The upper layer of Fig. 5 shows the point sets of part of the study area where the ground coordinates were recalculated using the DEM-refinement method. The lower layer represents the corresponding orthoimage.

Fig. 5
figure 5

Set of points of the ground coordinates corresponding to the parallax of an orthoimage

DSM generation

A DSM that included the tree heights was generated by adding the elevation differences computed from the orthoimage parallax to the elevation values of the existing DEM. The final refined elevation of the DSM was calculated using

$$ H_{\text{DSM}} = \, H_{\text{DEM}} + \varDelta H, $$
(2)

where H DEM is the elevation of the DEM, ΔH is the correction value of the elevation, and H DSM is the elevation of the DSM, i.e., the elevation of the refined DEM.

Tree height updating

Calculation of canopy height

To determine the dominant tree height, the forest area was divided into many small equally sized cells (e.g., 100 m × 100 m). After gridding, the dominant tree heights in each grid cell were estimated. The mean height of the forest stand became the mean value of all dominant tree heights inside the grid cells.

Sets of points were obtained as the tree heights by evaluating the differences between the DEM and the DSM. Points corresponding to the forest area within the VITD were extracted from the point set. Figure 6 shows the canopy heights determined for part of the forested area at the study site.

Fig. 6
figure 6

Point set of canopy heights in a forested area. Each grid square represents 1 km2

Updating the tree heights

The tree heights of the forested areas within 859 polygons were calculated as the average height of the points in each polygon according to the PHT attribute values defined in the VITD specification (Table 1). The PHT attribute values of the forest features in the VITD were updated using the computed tree heights for the forested areas. Height changes in the PHT of the VITD based on the determined tree heights are shown in Fig. 7, while the distribution of changes is shown in Fig. 8.

Fig. 7
figure 7

a Before updating the PHT and b after updating the PHT

Fig. 8
figure 8

Distribution of changes in the PHT

Discussion

It was expected that the elevations of the refined DEM would be higher on average than those in the reference DEM, since the tree heights were included. To verify this, areas identified as forests in the VITD were extracted from the elevations of the refined DEM. To check the accuracy of the extracted tree-height estimates, the elevations of 374 points in the stereo images of the study area were measured manually and used as reference points. To check the height accuracy of each DEM, we created three orthoimages based on three DEM sources, i.e., the refined DEM, the SRTM DTED2, and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) global digital elevation model (GDEM2). The horizontal positions of the reference points were acquired from each corresponding orthoimage so that there was no confusion over horizontal locations. The heights of reference points were obtained in each original source DEM. In this way, we were able to compare the accuracies of the three DEMs at the same horizontal locations. The manually measured tree heights were compared with the heights in the SRTM DTED2, the ASTER GDEM2, and the refined DEM proposed in this study. The tree heights in the DEM were calculated using the interpolated heights relative to the manually measured points. Table 2 summarizes the results of the height comparison between the reference points and the SRTM DTED2, the ASTER GDEM2, and our refined DEM.

Table 2 Differences (m) between the DEMs and 374 manually measured reference points (interpolated tree height in the DEM − reference height)

As shown in Table 2, the average height differences between the reference heights (manually measured) and the DEMs increased in the following order: the refined DEM, the SRTM DTED2, and the ASTER GDEM2. The tree heights proposed in this study were the most similar to the manually measured tree heights. The differences between the tree heights gauged using the proposed method and the manually measured heights can be explained by the fact that the elevation of the proposed method was an interpolated height, which would be lower than the reference value. The standard deviation of the proposed method is the smallest among the DEM sets used worldwide. It is believed that the DEM-refinement method using orthoimages can produce more stable elevation values than DEM generation based on conventional stereo images. The distribution of the standard deviation of each forested area calculated using the updated tree heights is shown in Fig. 9.

Fig. 9
figure 9

Standard deviation of the PHT for each forested area

Table 3 provides a comparison of the average and standard deviation of the PHT within the forested polygons between VITD and the 374 points obtained from the proposed method. The standard deviations shown in Fig. 9 and those in Table 3 show that there were large differences in the calculated tree heights.

Table 3 PHT statistics (m)

This indicated that a more detailed classification method is required for tree height estimation, since the same tree types in the VITD were classified as having the same heights. Figure 10 shows a more detailed classification of the forested area of VITD based on the segmentation of tree height groups. Manually drawing forested areas during the production of VITDs may result in errors due to fatigue or the subjectivity of workers (Ma et al. 2001). These sources of error could be removed by using the automated method proposed in this study.

Fig. 10
figure 10

Segmentation of forested areas based on variation in the PHT

We speculated that there was a relationship between tree growth and tree type upon comparing variations in the PHT. Figure 11 shows the distribution of tree growth over the study area. It can be seen that different patterns of tree growth over the study period can be identified for deciduous and mixed and evergreen trees.

Fig. 11
figure 11

Comparison of changes in tree heights and tree types in the forested region

Conclusion

We demonstrated that a refined DEM method using orthoimages can be used to estimate changes in tree heights. This approach produced more stable DSMs than DSM generation using conventional image matching. Our proposed method can be used to determine tree heights and other features. Our selection of phase correlation matching methods and application of geometric constraints to areas that were difficult to match, such as forested areas, resulted in more accurate height estimation. This approach could facilitate selective data processing by combining the refined DEM with georeferenced GIS data. Our method, orthoimage-refined DEM, was shown to allow canopy heights over a wide area to be updated. We also identified that there is a relationship between the area classification in the VITD based on conventional tree types and tree height. It appears that a more detailed classification method based on tree heights is required. The automatic processing of tree height estimation proposed here yielded more objective results than conventional methods such as VITD.

In future studies, it would be useful to apply our DEM-refinement method using orthoimages to much higher resolution satellite images. Further studies should also improve the limit conditions by using high-resolution images for image matching over the forested areas. Moreover, variations in the actual growth of trees may be evaluated using images acquired at different times. It should be possible to identify changes in tree growth using a fusion method where tree classification is based on multispectral images.