Estimation of diameter and height of individual trees for Pinus sylvestris L . based on the individualising of crowns using airborne LiDAR and the National Forest Inventory data

Aim of study: The objective of this study is to test the validity of the DBH and total height allometric models fitted to the crown polygon data obtained by the application of a crown delineation and individualisation algorithm which uses the geometrical relationships between the points in the original LiDAR point clouds in the Pinus sylvestris L. stands. Area of study: The study area is located in the province of Álava in the Autonomous Community of the Basque Country. Material and Methods: The crowns are delineated using data from airborne LiDAR point clouds obtained in the 2008 overflight of the Basque Autonomous Community. The DBH and total height data for field trees are obtained from the plots in the 4th National forest inventory. Main Results: For the adjusted total height and DBH models coefficients of determination of 0.87 and 0.74 respectively were obtained. The root mean squared errors were 10.67% and 18.97% respectively. The distributions of obtained DBH and total height fitted values and the distributions of the DBH and total height of the field trees are very similar except for the DBH below 15 cm. Research highlights: For stands of Pinus sylvestris L. in Álava, the geometrical relationships between the points that correspond to laser signal echoes obtained with airborne LiDAR sensors can be used directly to delineate approximations of the horizontal projections of the crowns of the trees. Although the procedure set out here was developed for stands of P. sylvestris L. in Álava, it can be applied to other conifers in regular stands by adjusting the working parameters of the function which delineates the crowns on the basis of the point cloud.


Introduction
Over the past 10 years the availability of data obtained using airborne laser sensors for the assessment of forestry resources has enabled several methods to be developed that are capable of estimating variables related to stand structures (Cuasante & García, 2009;Condes & Riaño, 2005;Kini & Popescu, 2004;Palomino, 2009;Zhao & Popescu, 2007).Methods have also been developed using the same data for calculating the characteristic variables of individual trees such as total height and individual crown (Koch et al., 2006; Rahnman & Gorte, 2 ised using a grid whose dimensions vary from one individual method to another.The cell size in the grid used for analysis is usually greater than 200 m 2 (Naesset 2002; Andersen & Breidenbach 2007;Gobakken & Naesset 2008;Breidenbach et al., 2008).Forestry variables are obtained for each cell in the grid by regressing the values from the field plots with the descriptive values of the LiDAR point clouds for the same plots.The main advantage of using these methods is the low pulse density at which they can be used.Some authors find that densities as low as 0.1 pulses per square meter suffice (Naesset, 1997;Holmgren, 2004).
Under the individual tree approach, some authors resort directly to the point cloud to segment the crown of each tree and use this information to obtain an estimate of the DBH, total height, volume of wood, basimetric area or biomass (Holmgren et al., 2012;Holmgren & Lindberg, 2013;Heurich et al., 2004;Hyyppä et al., 2001;Maltamo et al., 2009;Vaunhkonen, 2010;Vaunhkonen et al., 2010;Yao et al., 2012).Others introduce modifications and use not just the delineated crown of each tree but information on its nearest neighbours too (Breidenbach et al., 2010).There are also authors who model the crown of each tree with a revolution curve that varies depending on each species and then calculate allometric estimates of the DBH and total height (Kopela, 2007;Kopela et al., 2007).
Taking the approach of using the complete data cloud, this paper applies LiDAR data from the 2008 overflight of the Autonomous Community of the Basque Country to a new procedure for individualising crowns in stands of Pinus sylvestris L. For each crown polygon delineated, the DBH and total height of the tree in question are estimated by adjusting the regressions whose dependent variables are the field data obtained for the plots in the 4 th National Forestry Inventory (IFN4) of the province of Álava and whose independent variables are the height and crown diameter of the polygons delineated.

Material and methods
The study area is located in the province of Álava in the Autonomous Community of the Basque Country.Specifically, it comprises the area covered by stands of P. sylvestris.According to the Forestry Map of the Autonomous Community of the Basque Country (CAPV) for 2010, P. sylvestris covers 16,862 ha in Álava, a larger surface area than any other conifer in the province.The procedure does not discriminate between natural and man-made forests.
The field data used were gathered during the fieldwork for IFN4, which took place between October 2010 and June 2011.The measurements taken for each tree in the plots considered were: azimuth, distance, DBH and total height.According to IFN methodology each plot is made up of 4 subplots with a radio of 5, 10, 15 and 25 meters.In these four subplots, the trees were measured with DBH greater than 7.5, 12.5, 22.5 and 42.5 cm.respectively .This plot inventory design requires the use of expansion factors in order to be able to refer the data to the area.When the unit area is the hectare the expansion factors are 127.32 for the trees with DBH between 7.5 and 12.51 cm, 31.83 for the trees with DBH between 12.51 and 22.5 cm, 14.15 for the trees with DBH between 22.51 and 42.5 cm and 5.09 for the trees with DBH greater than 42.5 cm.The plots where more than 75% of the trees measured were P. sylvestris, were selected.The plots where more than 80% of the trees measured were conifers and more than 50% of the total amount of trees measured was P. sylvestris, were also selected.Three plots where P. sylvestris accounted for only around 70% were also included because of the small sizes of those trees which were not Pinus.Those selected plots which met the above conditions but where the average height did not exceed 6 m were then eliminated from the study because the crown delineation algorithm works by exploring the point cloud in a downward direction to a height of 4 m above the ground so as to avoid any distorting effects caused by bushes in the final results.Taking into account that the LiDAR data were gathered three years before the field data, it was decided to leave a safety margin of 1 m so that the growth of the trees on each plot did not invalidate the minimum height premise used in developing the working algorithm.Those plots which had been thinned between the moment of the LiDAR overflight and the taking of the field measurements were also eliminated .On this basis 50 plots were selected .On these plots, those trees whose DBH was less than 10 cm and whose total height was less than 6 m were eliminated.Figure 1 shows the location and distribution of the plots selected.
The centres of the plots selected were measured in the field using GPS with margins of error of less than 15 m.During the process it was necessary to translate some of the plots to adjust them to their true locations.
The LiDAR data used in this study are from the point cloud that resulted from flights over Álava made between 18/06/2008 and 10/07/2008.A laser scanner RIEGL LMS Q560 was used in the data collection.The points obtained were processed using TERRASOLID TerrMatch V8 and TerraScan V8.The result is an irregular cloud of classified points with an average density of 3.18 points/m 2 .The sensor recorded between 1 and 5 returns for each laser pulse.Table 1 shows LiDAR data gathering parameters.The algorithm at the basis of this study was applied to each set of points in a plot.This algorithm is written as a function in SQL language for use in postgreSQL 9.2 with the POSTGIS 2.0 add-on, which gives the database manager geometrical and other GIS functions.The algorithm delineates the horizontal projection of each crown as a polygon that takes in all the points which impact on each tree.The result takes the form of an .shplayer that contains each crown polygon for each tree in graphic form and an associated table with the following data for each crown: maximum height, minimum height, polygon surface area, number of points contained and density measured in trees per hectare calculated from the centroid of the polygon using the 6 th tree method (Prodam, 1968).A further column was added with the heading "LiDAR crown diameter" which shows the diameter of a circle with the same surface area as the crown polygon delineated.To reach this result the algorithm divides the point cloud into layers half a metre deep and analyses the distance from each point to the crown polygons already delineated in each layer, in a downward direction.
The function above mentioned implies that all the points of the slice are projected on its middle horizontal plane In this way the distance in two dimensions is measured instead of three dimensions.As previous crown polygons have not been created, the points of the highest slice should be considered tree apices.Around each apex a 0.75m buffer is created.When these areas are intersected, they are considered parts of the same tree and because of this, they dissolve into one polygon.The radio of the buffer is 0.75 m as the maximum tree density found for P.sylvestris in the study area is 4000 trees/ha .Therefore the minimum distance between apices is 1.50 m. approximately.Having delineated the crown polygons of the first slice the function starts a loop that is repeated with all the points of each slice.The points of the next slice are selected.All these points and the previously delineated crown polygons are projected on the middle plane of the slice.
The point files (in .LAS format) that contained the IFN4 plots selected were compiled, and with the points classed as ground, DTMs were drawn up with a resolution of 0.5 x 0.5 m.The centres of the plots were selected and the LiDAR points within a radius of 60 m around each centre were then cropped.Each of the 50 crops obtained in this way was converted to an ".shp" file (shapefile), a format that can be handled by most Geographical Information Systems (GIS).The files were then normalised by subtracting the reading corresponding to the DTM created in the previous step from each point.In each crop of normalised points, those with heights below 4 m and above 50 m were then eliminated.Points with readings below 4 m were eliminated because this was considered the lowest level at which a distinction could be drawn between bushes and trees.Points with heights in excess of 50 m can be considered as process errors, as there are no trees that high in the study area.The 60 m radius gives an area that is large enough to prevent the shift due to translation of plots from IFN4 from entailing any loss of data.Moreover, the edge effect in the working area does not affect the delimitation of crowns in the inventory plots.Fifty normalised point clouds were thus obtained in .shpformat, corresponding to the fifty IFN4 plots selected.To determine the threshold distance (OA), the algorithm was run on each inventory plot LIDAR point cloud .During the process, the value of the parameter OA was changed to the point where the number of trees which was obtained with the algorithm and the number of trees in the field plot was the same.Once the OA value for each inventory plot was obtained, several regression models were elaborated to relate this variable OA with variables of the field plot trees .The model with the Using a nearest neighbor algorithm the distance between each point and its nearest delineated crown polygon is measured.When the measured distance is bigger than specified threshold, the point is considered to be a new apex and a 0.75 m buffer is established.The polygons of the new delineated crowns are added to the existing polygons.The unselected points should belong to one of the already delineated crown polygon.Using once more the nearest neighbor algorithm, the nearest crown polygon is located for each of these points.Each of them copies the nearest crown polygon identifying code.Once the points have been asigned to the nearest polygon code, the mínimum convex-hull Both the trees according to the field study and the LiDAR crown polygons were then weighted according to the expansion factor for the DBH of the trees in the field study in each case (Bravo et al., 2002), divided by the smallest of the above factors (5.093).The descriptive statistical data for DBH and total height of the whole set of Pinus sylvestris L. in the IFN4 plots selected are shown in Table 2.
smaller Root mean square error (RMSE) value and the greater R 2 value was selected.The model fulfilled the homoscedasticity and not auto-correlation hypotheses.
It can be shown experimentally that over 90% of the polygons that contain fewer than 40% of the average number of points found within 20 m are false crowns, usually caused by the function taking protruding branches as new treetops.These branches have no crown beneath them that could provide a sufficient number of returns, so the number of points is low.Once the initial delineation of polygons is completed the next step is to correct the data by eliminating these false crowns and reassigning their points to the nearest crowns.The crowns affected must be re-delineated and recalculated to include these points.Once the final set of crowns is obtained, the function calculates the density in trees per hectare of each one using the 6 th tree method.The final result of this process provides the data table for each crown described above.
Each group of crowns delineated is entered in a GIS, and the locations of the trees in the plots from IFN4 are superimposed on it.The centre of each plot is determined in the field with a margin of error of +/-15 m, so a 2-D translation of the trees according to IFN4 is required to get them to coincide largely with the delineated crowns by approximation of their heights and spatial structure.In this correction, ortho-photos from 2008 were also used.Following the correction, the location of the centre of the plot is approximated accurately enough to provide reliable values for the variables.Figure 3 shows an example of the final result of the delineation of crowns in a plot with a radius of 60 m around the centre of an IFN4 plot with trees located according to field data, before and after the translation process.
Translations were also applied to the trees contained in plots, and those crown polygons that exactly matched tree locations established in fieldwork were selected.Those trees which occupied intermediate positions between different polygons and those polygons which contained more than one tree were not selected.This resulted in a selection of 958 trees out of the total of 1477 contained in the IFN4 plots IFN4 (65%).The Pinus sylvestris L. were then selected, giving a total of 916 trees.

6
LiDAR crown diameter, understood as the diameter of a circle with the same surface area as the crown polygon delineated.To correct problems of auto-correlation, heteroscedasticity and non-linearity, the five regressions with the largest R2 out of all the combinations of variables and their Napierien logs and square roots were tried.The models tested are shown in Table 4.
As an additional check, once the regressions were calculated for the total heights and the DBH, the equations were applied to the crown polygons resulting from the LiDAR crown delineation in all 50 plots in the inventory, weighted for the relevant expansion factors.The average DBH and height was calculated for all the trees with delineated crowns and the results were compared with the average DBH and total height of trees measured in the field, also weighted for their expansion factors.The results are shown in Table 5.
The distribution of diameters was checked using 5 cm diameter class brackets for both the trees from the field study and those from the LiDAR survey.The distribution results are shown in Figure 4.
Once the positions of the trees in the field study and the LiDAR crowns were matched, the regressions for DBH and total height were adjusted as functions of the variables of the crown polygons delineated.
To determine which variables were relevant in the total height & DBH regression the "select regression model" function of STATGRAPHIC was used and it was checked that the coefficient of determination R 2 did not increase with more than one independent variable.The variables in which that coefficient was the largest were used.
After selection, models with these variables and their logs and square roots were tried out.The validity of the models tried was checked using the hypotheses of homoscedasticity, auto-correlation and non linearity, applying the Breusch-Pagan tests (Breusch & Pagan, 1979), Durbin-Watson (Durbin &Watson, 1971 andReset (Ramsey, 1969).
For the total height model it was found that the coefficient of determination did not increase with more than one independent variable.The maximum R 2 was obtained with the maximum height of the points contained in each delineated crown polygon (LiDAR height).The initial linear model, whose dependent variable was the field tree height and whose independent variable was the LiDAR height, proved to be heteroscedastic and auto-correlated, and failed to pass the RESET test for non-linearity.Several models were tested in an attempt to solve these problems, including logs, squares and square roots instead of the original variables.The models tested are shown in Table 3.
Similar calculations were made to establish which model was most suitable for the DBH.It was found that the coefficient of determination did not increase from two independent variables upward and that the maximum R2 was obtained using the maximum height of the points contained in each crown polygon and the  OA: value of threshold distance in meters dph: number of field trees per hectare After testing multiple relationships between the total height of each tree variable (Ht) and the variables for the LiDAR crown polygons and combinations of the latter, the model selected was the only one that corrected the problems of heteroscedasticity and non-linearity, i.e. the one in which the dependent variable is the square root of the total height and the independent variables are the LiDAR height and its square root.None of the models corrects the problem of auto-correlation.
The following model was selected: Mean absolute error = 0.14 (0.98 m after the transformation is undone) 7.93% where: Ht: total height of each field tree in metres.Hl: LiDAR height of the tree in metres considered as the height of the highest point contained in the crown polygon.
To correct the systematic bias introduced when the square root transformation is undone, the factor proposed by Duan (1983) must be added.
The match in height distributions was checked using 3 m height class brackets.The distribution results are shown in Figure 5.
DCl: LiDAR crown diameter in metres, considered as the diameter of a circle with the same surface area as the crown polygon delineated.
Figure 7 shows the graphs for the field measurement individual trees DBH model.
Once the diameter was calculated for each LiDAR crown delineated according to the individual tree diameter here, the diameter distribution of trees according to the LiDAR survey was found to underestimate the number of trees in diameter class 10 cm (-61%) and 15 (-12%) though that underestimation is offset by an overestimation of the number of trees in diameter classes 20 cm and 25 cm (26% & 30% respectively).
The height distribution of trees according to the LiDAR survey calculated using the total height model for individual trees and trees grouped into height class brackets of 3 m closely matches the height distribution of trees according to the field study, with error levels of less than 10% in all brackets except 25.5, where the error level is 28%.
To validate the process in an independent sample, 183 trees were measured which were in three plots apart from the already measured.The results of validation are shown in table 6.
where n is the number of data and Ɛ i is the residue from observation i.
For the equation for the square root of the height, CF= 0.0378 Once the transformation is undone the equation looks like this: Ht = 0.0378 + (1.267 + 0.06984*Hl + 0.4087* Hl ) 2 Figure 6 shows the graphs for the field measurement individual trees height model.
For DBH the model selected was the only one that met the conditions of homoscedasticity and linearity.None of the models met the no auto-correlation condition.The model finally selected was the one in which the DBH depends on the square roots of the LiDAR height and the LiDAR crown diameter.Diameter & height based on crown individualisation using LiDAR data while our own study covers the whole province of Álava: the models are therefore more general in their application and necessarily less precise.
The main problem of the models obtained is that they do not meet the requirement of no auto-correlation.This is because the results are shown grouped by plots.The mean residue and its amplitude are different in each plot.
To check whether any variable excluded from the height model might help to explain the difference in the residues on each plot, the correlations of the residues from the regression were checked against the rest of the variables for the field study and the LiDAR survey on each plot.The only one of these correlations which was significantly different from 0 was the difference between field study height and LiDAR height.The linear correlation between the regression residue and the difference between field study height and LiDAR height is almost perfect.It was then checked whether this difference was linearly correlated with any variable in the field study or LiDAR survey, and again no correlations significantly different from 0 were found.
The residues do not depend on any of the plot variables, so the differences between them may stem from differences in the reliability of the DTM depending on the number of points used to draw them up.This problem cannot be corrected, and during the point cloud normalisation process it will give rise to average residues with systematic biases which are different for each plot, depending on whether the surface of the terrain has been interpolated above or below the actual surface.
With regard to the DBH distribution, the diameter class 10 and 15 undersestimation is compensated for with the overestimation of the two following diametric classes, that is to say, that the procedure tends to homogenize the dasometric characteristics of the smaller trees .As the wood percentage obtained in the smallest trees is low, its influence in the estimation of the stock of wood is limited The height distribution only presents bigger errors than 10% in the bracket of 25.5 m.Although the error level is 28%, in absolute terms, it only affects 109 trees.So its influence in the total amount is very small.
Although the procedure set out here was developed for stands of P. sylvestriss L. in Álava, it can be applied to other conifers in regular stands by adjusting the

Discussion
No other results based on crown individualisation methods have been found for Spanish forest stands, but publications have been found concerning stands of other species elsewhere.
In Sweden (Holmgren & Lindberg, 2013) a new crown delineation algorithm was applied to 6 plots of 6400 m 2 with regular, almost mono-specific stands of Picea abies L. In all 2276 trees were measured.The LiDAR data density was 30 points/m 2 .RMSEs of 3.8% for total height and 6.3% for DBH were obtained.
In Finland (Maltamo et al., 2009) 14 plots of P. sylvestris were measured in which 133 trees were located, with DBH measurements of between 2.2 and 37.6 cm. and heights of between 19.5 and 27.2 m.A crown individualisation algorithm was applied and the results were used to estimate total heights and DBH.RMSEs of 1.95% for total height and 5.6% for DBH were obtained.Also in Finland, crowns have been reconstructed in highly localised studies (Vauhkonen, 2010) using computational geometry on point clouds.Estimates of DBH and total height were obtained with data on individualised crowns.The RMSEs were 13% for DBH and 3% for total height.
In Scotland (Suarez et al., 2009) field measurements and LiDAR data were taken for 12 plots of Picea sitchensis (Bongard) in 2002 and 2006.The densities found were between 3 & 4 points /m 2 in 2002 and between 10 & 17 points/m 2 in 2006.Crowns were delineated on digital crown models using image segmentation programs.Once the crown of each tree was delineated the height and crown surface area delineated were associated with it.These variables were then used to run regressions on field study figures for DBH and total height.Only the R 2 of the relevant ratios are shown: for 2002 the figures are 96% for the height model and 88% for the diameter model.For 2006 they are 93% and 67% respectively.The degrees of error are not shown.
In all the above cases the RMSEs reported for both total height and DBH are lower than the 18.97% for DBH and 10.67% for height obtained with the method described here.It must be considered, however, that all the studies indicated obtain their results from samples that are highly concentrated in terms of space, For applying the method to stands of broadleaf trees, it must be taken into account that the crown delineation algorithm needs to find points that can be considered as clearly defined apices.Because of the shape of their crowns, broadleaf species generally form stands in which the points that make up the apices of the crowns are not very different from those around them.This makes it harder to apply the method to such stands.An in-depth adaptation of the algorithm would need to be developed before it could be applied in such conditions.

Conclusions
For stands of Pinus sylvestris L. in Álava, the geometrical relationships between the points that correspond to laser signal echoes obtained with airborne LiDAR sensors can be used directly to delineate approximations of the horizontal projections of the crowns of the trees from which those echoes were obtained, with no need to resort to simplifications such as rasterisation to obtain digital models that reproduce the shape of the surface area of the canopy formed by the crowns.
The heights of the trees located by delineating their crowns with the procedure described can be estimated using the height of each crown polygon above the DTM, with "crown height" being understood as the height of the highest point according to a model with a coefficient of determination of 0.87 with RMSE= 1.35 m.The DBH can also be estimated using the height and diameter of the crown polygon delineated, according to a model with a coefficient of determination of 0.74 with RMSE = 4.98 cm.
The distributions of obtained DBH and total height fitted values and the distributions of the DBH and total height of the field trees are very similar except for the DBH below 15 cm.These diameters are strongly underestimated and they are incorporated to the following diametric classes.

3
Diameter & height based on crown individualisation using LiDAR data

Figure 1 .
Figure 1.Location of the study area & plots used to adjust models.

Figure 2 .
Figure 2. Crown delineation process using the POSTGRES function.

Figure 3 .
Figure 3. Result of crown delineation on plot 285 according to IFN4 and the plot after a 2D translation of +8 m on the X-axis and -5m on the Y-axis.The circles represent the variable-radius plot of IFN.

Figure 6 .
Figure 6.Graphs for the field measurement individual trees height model.

Figure 7 .
Figure 7. Graphs for the field measurement individual trees DBH model. 9

Table 1 .
LiDAR data gathering parameters the elements with the same code, is drawn.This way each polygon crown grows adapting to the cloud points geometry.A working diagram of the algorithm is shown in Figure2.

Table 2 .
Statistical data for all Pinus sylvestris L. on the plots included in the study & for selected trees DBH: Diameter at Breast Height in cm.Ht: Total height in m.

Table 3 .
Models for total height Hc: total tree height in m.LN(Hc): Naperian log of tree height in m.Hl: Maximum height in m of points contained in LiDAR crown polygon.LN(Hl): Naperian log of máximum height of points contained in LiDAR crown polygon.

Table 4 .
Models for DBH LiDAR crown diameter in m, understood as the diameter of a circle with the same surface area as the LiDAR crown polygon.LN(DCL): Naperian log of LiDAR crown diameter in m.Hl: Maximum height in m of points contained in LiDAR crown polygon.LN(Hl): Naperian log of máximum height of points contained in LiDAR crown polygon.

Table 5 .
Average results of regressions

Table 6 .
Results of the validation plots

Trees (field study) Trees (LiDAR) Absolute error Relative error
parameters of the POSTGIS function which delineates the crowns on the basis of the point cloud.It may be of particular interest for managing stands of Pinus radiata D. Don, a species which accounts for 64% of the standing trees in the Autonomous Community of the Basque Country and 80% of those harvested.Future research should check for increased precision in models adjusted with data obtained from more homogeneous managed and natural forestry situations consistent with these productive species.The delineation function detects secondary máximum which are not trees but once found they cause a delineation of a a new Crown polygon.Although the adjustment of OA parameter minimizes this problem it cannot solve it completely working