Research Article


Comparison of height-diameter models based on geographically weighted regressions and linear mixed modelling applied to large scale forest inventory data


María Quirós-Segovia

Department of Forest Management and Applied Geoinformatics, Faculty of Forestry and Wood Technology, Mendel University in Brno, Zemědělská 3, 61300 Brno, Czech Republic

Sonia Condés-Ruiz

Department of Natural Systems and Resources, School of Forestry, Technical University of Madrid, Ciudad Universitaria s/n, 28040 Madrid, Spain

Karel Drápela

Department of Forest Management and Applied Geoinformatics, Faculty of Forestry and Wood Technology, Mendel University in Brno, Zemědělská 3, 61300 Brno, Czech Republic



Aim of the study: The main objective of this study was to test Geographically Weighted Regression (GWR) for developing height-diameter curves for forests on a large scale and to compare it with Linear Mixed Models (LMM).

Area of study: Monospecific stands of Pinus halepensis Mill. located in the region of Murcia (Southeast Spain).

Materials and Methods: The dataset consisted of 230 sample plots (2582 trees) from the Third Spanish National Forest Inventory (SNFI) randomly split into training data (152 plots) and validation data (78 plots). Two different methodologies were used for modelling local (Petterson) and generalized height-diameter relationships (Cañadas I): GWR, with different bandwidths, and linear mixed models. Finally, the quality of the estimated models was compared throughout statistical analysis.

Main results: In general, both LMM and GWR provide better prediction capability when applied to a generalized height-diameter function than when applied to a local one, with R2 values increasing from around 0.6 to 0.7 in the model validation. Bias and RMSE were also lower for the generalized function. However, error analysis showed that there were no large differences between these two methodologies, evidencing that GWR provides results which are as good as the more frequently used LMM methodology, at least when no additional measurements are available for calibrating.

Research highlights: GWR is a type of spatial analysis for exploring spatially heterogeneous processes. GWR can model spatial variation in tree height-diameter relationship and its regression quality is comparable to LMM. The advantage of GWR over LMM is the possibility to determine the spatial location of every parameter without additional measurements.

Keywords: Spatial analysis; Pinus halepensis Mill; forest modelling.

Abbreviations used: GWR (Geographically Weighted Regression); LMM (Linear Mixed Model); SNFI (Spanish National Forest Inventory).

Citation: Quirós-Segovia, M., Condés-Ruiz, S., Drápela, K. (2016). Comparison of height-diameter models based on geographically weighted regressions and linear mixed modelling applied to large scale forest inventory data. Forest Systems, Volume 25, Issue 3, e076.

Received: 08 Apr 2016. Accepted: 14 Sep 2016.

Copyright © 2016 INIA. This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial (by-nc) Spain 3.0 Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Funding: This research was supported by the Internal Grant Agency of the Faculty of Forestry and Wood Technology, Mendel University in Brno within the development project IGA LDF_VP_2015009.

Competing interests: The authors have declared that no competing interests exist.

Correspondence should be addressed to María Quirós-Segovia:





Material and methods







In forestry, spatial heterogeneity is theorized as one of the major drivers of biological diversity (Wiens, 1976). Spatial heterogeneity results from the spatial interactions between a number of biotic and abiotic factors and the differential responses of organisms to these factors (Milne, 1991). It may have significant influences on many ecosystem processes at multiple spatial scales (Turner, 1989). The spatial heterogeneity of vegetation patterns (i.e., landscape heterogeneity) is a structural property of landscapes (Li & Reynolds, 1994) that can be defined by the complexity and variability of ecological system properties in space. The spatial heterogeneity of a tree variable (height, diameter, etc.) in a forest stand results from the complex historical and environmental mosaic imposed by competition and systematic environmental heterogeneity (Zhang et al., 2004); which implies a spatio-temporal heterogeneity. Forest researchers have realized that the spatial pattern of tree locations strongly affects (1) competition among neighbouring trees, (2) size variability and distribution, (3) growth and mortality and (4) crown structure (Zhang et al., 2004). Ignoring spatial heterogeneity in forest modelling causes biased parameter estimates, misleading significance tests, and sub-optimal prediction (Anselin & Griffith, 1988).

Global and local regression models have traditionally been used for studying spatially heterogeneous data (Fotheringham et al., 2002). There are important differences between these models, for example, global regression models summarize the characteristics of the spatial pattern over the whole study area and attempt to identify homogeneity (i.e. Ordinary Least Squares). On the other hand, local regression models make explicit the differences in the pattern observed among parts in the study area (i.e. Geographically Weighted Regression) (Fortin & Dale, 2005). This aspect makes global models easy to compute. However, these models are not realistic and the results obtained may not be correct. Local models attempt to identify exceptions and are variable throughout the space. Hence, these models are suitable for specific sample plots but are of limited use for large populations (Fotheringham et al., 2002).

There are several methods for solving the heterogeneity problem, for example, models based on heteroscedasticity specification, generalized additive models, classification and regression trees, linear mixed models (LMM) and geographically weighted regression (GWR). In this document the latter two methods were analysed.

LMM has been widely used for height-diameter modelling (Lappi, 1997; Mehtätalo, 2004; Calama & Montero, 2004). This method estimates fixed and random parameters simultaneously for the same model. The introduction of random parameters into the model, specific to each sampling unit, enables us to model the variability detected for a given phenomenon among different locations, after defining a common fixed functional structure (Lindstrom & Bates, 1990). Mixed models give an unbiased and efficient estimation of the fixed parameters of the model. Furthermore, to correctly apply mixed models in unsampled areas a calibration is required, which is of critical importance when making local predictions (Meng & Huang, 2009).

The second methodology analysed, GWR, is used to explore spatially heterogeneous processes (Brunsdon et al., 1996). The underlying idea of GWR is that parameters may be estimated anywhere in the study area given a dependent variable and a set of one or more independent variables which have been measured at known locations (Charlton & Fotheringham, 2009). Finally, no additional measurement is needed to predict the value of parameters for unsampled locations. GWR extends ordinary least squares regression models by accounting for spatial structure and estimates a separate model for each geographic location within the studied area (Matthews & Yang, 2012). It can be said that GWR is similar to a ‘spatial microscope’ in reference to its ability to measure and visualize variations in relationships, unobservable for non-spatial global models (Matthews & Yang, 2012). GWR was originally used in relation to the economy and civil engineering. In the field of ecology, GWR has been applied to remotely sensed data counts with some interesting results (Wang et al., 2005; Salas et al., 2010; Chen et al., 2012). However, in forest management, the GWR method has predominantly been used for small study areas although it was developed for use in large scale studies.

Tree height is an important variable for forest management in several ways: (1) total height together with diameter at breast height allows us to estimate the total volume of the tree, (2) in regular stands, dominant height together with age are used to estimate the site index and (3) the distribution of tree heights in the stand is very important to describe the vertical structure.

Nevertheless, height measurement can be time consuming and consequently expensive, therefore it is common to use height-diameter models instead. There are two kinds of height-diameter models: local models and generalized models.

Local models are designed specifically for forest stands and are the most precise curves for a particular forest stand (Diéguez-Aranda et al., 2009). However, they have several disadvantages, for example: (1) they require the heights of many trees in the stand to be measured, therefore are very expensive and (2) due to forest heterogeneity (age, quality, density) and the silvicultural state of stands within a given forest, a local function cannot be adjusted to all situations (Diéguez-Aranda et al., 2009). Generalized height-diameter models could improve the results in such forests. These models include stand variables for obtaining precise and unbiased estimations (Gaffrey, 1988). Dominant height is usually included because it is easy to measure and is independent of silvicultural treatments. Other variables such as basal area, density or quadratic mean diameter could also be included (Sloboda et al., 1993; Páscoa, 1987).

The main aim of this paper is to analyse the suitability of GWR for modelling height-diameter relationships in a large area and to compare it with LMM, which is one of the most frequently used methods for including spatial heterogeneity in these models.

To test whether GWR provides as good results as the more frequently used LMM, we applied both methodologies to two different functions: the Petterson (1955) local function and the Cañadas et al., (1999) generalized function, and compared their ability to predict heights when applied to an independent dataset.

Material and methodsTop

Study area

To study GWR, we selected mono-specific stands of Aleppo pine (Pinus halepensis Mill.) in the Region of Murcia, in South-East Spain (between latitude 37º 47’ 36’’ N and 38º 6’ 11’’ N, and longitude 1º 53’ 48’’ W and 1º 26’ 17’’ W). The study area covered around 1 300 km² in Espuña, Cambrón, Burete, Lavia and Quipar mountain ranges (Fig. 1). In this region, Aleppo pine was growing in low-density stands with relatively low dominant heights, probably reflecting medium to poor site qualities (Condés & Sterba, 2008).

Figure 1. Study area & subsamples.

Data were obtained from 230 sample plots with 2582 trees belonging to the Third Spanish National Forest Inventory (SNFI). SNFI design is based on permanent sample plots located at the nodes of one kilometre square grid, re-measured in an inventory cycle of ten years. The sample plots consisted of four concentric circles of 5, 10, 15 and 25 meter radius, in which the diameters and heights of all trees with a breast height diameter over 7.5, 12.5, 22.5 and 42.5 cm respectively were measured. Furthermore, plot centre coordinates, together with the polar coordinates of each measured tree, i.e. distance and azimuth form the plot centre, were recorded (MAGRAMA, 2015).

Data in the studied area were randomly split into two subsets. The first consisted of 152 plots with 1697 trees and was used to develop the model (training data). The second was used to validate the model (validation data) and consisted of 78 plots with 885 trees (Fig. 1). Table 1 shows the main stand variables for both subsets estimated by weighting individual tree data according to the area of the concentric SNFI subplots.

Table 1. Statistical analysis of data

Modelling height-diameter relationships

Two different height-diameter functions were selected for the study. The first of these was the local height-diameter Petterson’s function (Petterson, 1955), (eq. 1), which has been successfully used in similar, previously published studies (e.g. Juárez de Galíndez et al., 2007; Drápela, 2011; Adamec, 2014; Adamec & Drápela, 2015)

where h is the total height (m), d is the diameter at breast height (cm) and a0 and a1 are the parameters to be estimated.

The second selected function was the Cañadas I generalized height-diameter function (Cañadas et al., 1999), (eq. 2). This function was selected because it offered the possibility to perform the computation with GWR together with the fact that it had previously provided good results for Aleppo pine forests (Cabanillas, 2010).

where h is total height (m), d is diameter at breast height (cm), Ho is dominant height (m) obtained according to the Hart criteria (Hart, 1928), D0 is dominant diameter, calculated as the average of the 100 thickest trees per hectare, and b0 is the parameter to be estimated.

As GWR is basically a technique developed for linear regressions (Fotheringham et al., 2002), to fit these equations it was necessary to linearize them. Consequently, to fit these equations it was necessary to linearize them, which was done as follows in equations 3 and 4:


Cañadas I:

These two height-diameter functions were fitted following the two different methodologies detailed below.

Linear Mixed Models

In a first step, linear mixed models were used for fitting the height-diameter equations. As data came from a hierarchical structure, with several trees measured in the same plot, a mixed model was used with the plot as the grouping structure of the random effects. We included random effects in the a0and a1term of Petterson local model (eq. 3) and in b0for the Cañadas I (eq. 4).

The models were fitted using the maximum likelihood method of the lme procedure (R Development Core Team, 2014) to allow the comparison of results. A level of p = 0.05 was used for significance testing of the variables in the model, and Akaike’s information criterion (AIC) was used to compare results. The significance of random effects was tested by comparing the models with similar ones developed using a constant as grouping variable. Goodness-of-fit measures for linear mixed models were calculated using the lmmR2 procedure (R Development Core Team, 2014).

Geographically Weighted Regression

In a second step, the models were fitted using the GWR with three different bandwidths and an optimal number of trees.

Two of these bandwidths, calculated taking into account the SNFI grid, were 1000 m and 500 m. These two values were used to determine whether the SNFI design influenced the GWR results. The third bandwidth was calculated according to the calibration recommended by Fotheringham et al., (2002), which was based on the optimization of the studied formula according to a kernel and a weighted function. In this case, the use of a fixed kernel plus a gaussian weighted function was recommended. A fixed kernel assumes that the bandwidth at each centre i is a constant across the study area. The gaussian weighted function (eq. 5) is a continuous function of the distance between two observation points (Gollini et al., 2015).

Gaussian weighted function:

where wik is the k-th element of the diagonal of the matrix of geographical weights, and dik is the distance between observations i and k (in our case observations were trees) and b is the bandwidth.

Finally, the optimal number of trees was calculated according to the calibration recommended by Fotheringham et al., (2002). This was based on the optimization of the number of trees, which is to be applied in every height-diameter formula. As in the case of bandwidth, this optimization is based on the type of kernel and weighted function. For this situation, an adaptive kernel together with a bi-square weighted function was recommended. This kernel adapts itself in the size to the data density. The bi-square weighted function (eq. 6), is a discontinuous function, giving null weights to observations with a distance greater than a specific point (Gollini et al., 2015).

Bisquare weighted function:

where wij is the j-th element of the diagonal of the matrix of geographical weights, and dij is the distance between observations i and j, and n is the number of neighbourhoods (number of trees).

GWR provides a range of coefficients geographically located in the study area, and a continuous GWR map of coefficients can be obtained by the coefficients interpolation. The Inverse Distance Weighting (Watson & Philip, 1985) was selected as interpolation methodology because this interpolation is widely used in natural sciences (Chen et al., 2012) and is easily applicable to GWR (Tardanico, 2006). See for instance Fotheringham et al., (2002) for further description of GWR.

GWR was fitted using the GWModel package in R language (Binbin et al., 2015; R Development Core Team, 2014). A level of p = 0.05 was used for significance testing of the variables in the models, and Akaike’s information criterion was used to compare results.

Model validation

Models were validated by applying GWR as well as LMM to the validation dataset, obtaining a set of predicted heights. In the case of GWR, the predicted values were obtained through the GWR map of coefficients.

In order to validate LMM, it was necessary to calibrate the model. Very few measurements can be used in the calibration, even less than the number of random parameters to be predicted (Calama & Montero, 2004). Four trees per plot were used for calibrating the model in this study, this is a very common practice in forest inventories in Spain (Calama & Montero, 2004). The technique of best linear unbiased predictor (BLUP) (Robinson, 1991) was used to estimate of random parts of the parameters.

The validation was performed by comparing observed and predicted heights through a statistical analysis of the residuals. Mean error (ME), mean absolute error (MAE), root mean square error (RMSE) and coefficient of determination (R2) were calculated according to the formula in Table 2.

Table 2. Residual formulas


Table 3 shows the estimated coefficients for equations 3 and 4, firstly for the LMM and GWR when applied to the Petterson height-diameter function, and secondly when they were applied to the Cañadas I height-diameter function. In this table, the minimum, mean, standard deviation and maximum values of the coefficients obtained for the whole study area (considering the map of coefficients for GWR and the fixed and random effects for LMM) were detailed together with R-squared (R2) and the Akaike information criterion (AIC).

Table 3. Coefficient distribution of models - In LMM and GWR using linearized equations (eq. 3 and 4)

For the Petterson height-diameter function, the coefficient a0 mean value varied between 0.37 and 0.55 for LMM and also for GWR independently of the bandwidth (between 0.34 and 0.52 for a bandwidth of 1000, between 0.32 and 0.69 for a bandwidth of 500, between 0.34 and 0.71 for a Fixed Kernel bandwidth, and between -0.19 and 0.64 for an adaptive kernel bandwidth). The range of this parameter was lowest when obtained by LMM or by GWR with a bandwidth of 1000 m, but increased for the bandwidth of 500 and the fixed kernel, and reached a maximum for the adaptive kernel. The coefficient a1, i.e. the slope of the curve, showed higher variation with larger standard deviations and wider ranges than the a0. So, for LMM the standard deviation of a1 was 0.48 while for GWR were from 0.65 to 1.39. However, mean values along the study area were similar for all the methodologies, from 1.79 to 2.06. Once again, the largest variations were found for the GWR fitted with fixed or adaptive kernel, proving to be more sensitive to the spatial variability of the data. Moreover, the smallest AIC was obtained with the fixed kernel, representing an optimal bandwidth of 363 m; and with the adaptive kernel, representing an optimal number of trees of 26. However, all GWR methods showed certain incongruent results such as negative or very large parameter values. An example of estimation using the Petterson map of coefficients is presented in Fig. 2, which provides a graphical representation of the previously explained parameter distribution for GWR with a predefined bandwidth of 1000 m.

Figure 2. GWR map of regression coefficients. Example of Petterson height-diameter model (1000 m predefined bandwidth), coefficient a0 and a1.

In the case of the Cañadas-I height-diameter function it can be seen that the coefficient b0 had slightly larger standard deviations and ranges for GWR than for LMM. Additionally, the mean values were almost the same between both methods. The AIC for this function was smaller when GWR was fitted with a predefined bandwidth of 500 m. For this generalized height-diameter function, the optimal bandwidth calculated by fixed kernel was 755 m, and the optimal number of trees in the adaptive kernel was 65. Again, some incongruent minimum values can be seen for the GWR methodology in Table 3. A representative map of coefficients for the Cañadas I estimation is shown in Fig. 3, and the parameter distribution of GWR is presented with a predefined bandwidth of 1000m.

Figure 3. GWR map of regression coefficients. Example of Cañadas I height-diameter model (1000 m predefined bandwidth), coefficient b

As response variables in the Petterson and Cañadas-I functions were different, the AIC and R2 values were presented in Table 3. However, AIC were not comparable between models.

Model validation

Table 4 shows the errors obtained for the validation dataset when different methodologies were used. Predicted values were estimated according different assumptions: (1) LMM with fixed effects, (2) LMM with random effects and (3) GWR, including all the analysed variations.

Table 4. Comparison of errors obtained for the validation dataset

The validation revealed that the Petterson height-diameter function had poorer predictive capacity than the Cañadas I generalized function. Consequently, it can be seen that the R2 for the LMM with only fixed effects increased from 0.27 to 0.71, when calibration the improvement was from 0.63 to 0.72 and for GWR the R2 ranged between 0.29 and 0.35 for Petterson while for Cañadas was 0.71.

When analysing the errors obtained from Petterson model, it was observed that, independently of bandwidth or kernel, GWR resulted in lower bias than LMM with only fixed effects (ME=0.6) or in similar than the calibrated (ME=0.1). Moreover, MAE and RMSE were very similar for GWR and LMM with only fixed effects (MAE=1.3 and RMSE=1.7 for LMM without calibration; and MAE=1.3 - 1.4 and RMSE=1.7 for GWR). However both MAE and RMSE were smaller when calibrating LMM (MAE= 0.9 and RMSE=1.3). Regarding the Cañadas-I height-diameter function, the error analysis showed similar results for all the GWR and LMM methods with and without calibration (Table 4).

The standardized residual plots showed that residuals are symmetrically distributed in a random pattern around the horizontal axis for any of the functions and fitting methodologies (Figures 4 and 5). Judging from the errors, it can be stated that GWR applied to generalized function works as well as the more frequently used LMM methodology.

Figure 4. Standardized Residuals versus predicted height for the LMM and the different GWR methodologies when applied to Petterson equation.

Figure 5. Standardized Residuals versus predicted height for the LMM and the different GWR methodologies when applied to Cañadas I equation.


Tree height is an important variable which is used for estimating stand volume or site quality (Diéguez-Aranda et al., 2009). Forest variables are typically highly heterogeneous and in this paper, two different methods were presented for modelling height-diameter relationships and their spatial heterogeneity. The first method was LMM, in which the height-diameter curves use fixed and random coefficients and which gives very accurate predictions when random effects can be predicted (Nanos et al., 2004). However, to calibrate the LMM prior to use, additional height measurements of trees within the stand are required (Lappi, 1997). The second method was GWR, which is a geostatistical method that can predict height-diameter models without the necessity for additional measurements. This effect is very interesting in case of large inventory areas such as national forest inventories, because the only available data are those collected in the inventory plots. These two methodologies were tested according to a local (Petterson) and a generalized (Cañadas I) height-diameter function as well as with different bandwidths in the case of GWR.

In the case of the Petterson model, it was observed that this method had insufficient predictive quality when GWR or uncalibrated LMM was applied. It is possible that the large heterogeneity of the studied area made it difficult for the Petterson function to reach the necessary quality. This could be due to the fact that the Petterson formula only had diameter at breast height as an independent variable and would appear to be too simple to reproduce the forest heterogeneity. Nevertheless, when Petterson equation was validated with calibrated LMM, a high improvement in the predictive quality was observed. In the case of height-diameter model, random component represents the specific conditions of every stand in the model. It causes better precision of the model for particular location in comparison with the model with fixed parameters. This could be important in a local model such as Petterson, designed specifically for giving precise prediction in individual forest stands (Diéguez-Aranda et al., 2009).

In contrast, the Cañadas I model revealed high stability in the model fitting and high predictive quality in the validation in both LMM and GWR. In these cases, R2 values in the validation improve from local to generalized regressions from 0.63 to 0.72 in calibrated LMM. Similarly, MAE and RMSE were lower for the generalized models. The comparison between local and generalized equations is analogous with previous studies which concluded that the inclusion of stand-density variables such as dominant height or dominant diameter in the base height–diameter function increased the accuracy of prediction (Temesgen & Gadow, 2004; Diéguez-Aranda et al., 2009).

The predictive quality of the model was improved when few additional height measurements were used for estimating the random parameters for the validation dataset. Calibration using small subsamples of trees is one of the main utilities of mixed models in forest applications (Calama & Montero, 2004). However, when additional measurements are not available GWR provides predictions with higher quality than the LMM.

In some of the GWR models the presence of odd, extreme values was observed. With the Petterson equation these values appear in every GWR model and did not correspond to specific plots but rather, were scattered randomly in the study area. This could be another indicator of the weakness of these models. However, in the case of Cañadas I, these odd values only appeared in the models with a bandwidth of 500 m and adaptive kernel. Specifically, this odd value was located in plot number 1473, which contained only 3 trees and was located in a relatively isolated area. Odd values have also been observed in similar height-diameter modelling studies (Zhang et al., 2004; Zhang et al., 2008; Zhang et al., 2009). It is important to be aware of the presence of these odd values in order to select the correct model. The inclusion of these values in the interpolation could lead to erroneous results in the GWR coefficient map.

In our study, the observed GWR models gave the same residuals regardless of the calibration used. However, according to Zhang et al., (2009), the GWR with adaptive kernel was expected to produce more desirable model residuals in terms of spatial autocorrelation and heterogeneity than the GWR with fixed kernel model for the plots. There are a wide range of calibration options so it is possible that future modifications of settings could improve the current results. For instance, it would be possible to test different bandwidths, or to add robustness to GWR (Fotheringham et al., 2002), or even to calculate the optimal bandwidth differently or the optimal number of trees with different kinds of weighted functions (Gollini et al., 2015).

The maps of coefficient in Figures 2 and 3 illustrate the spatial heterogeneity of the height-diameter relationship. In both figures it can be seen that there are greater slopes in both models in the upper-left area and in the bottom-right area. Additionally, more flat areas exist in the centre of the studied areas. In Fig. 2, two areas can be observed where a0 takes the highest values and a1 the lowest; therefore in Fig. 3, only the left side is visible. It would be interesting to compare these contours with local terrain layers (slopes, kind of soil, vegetation), in order to determine whether there is a relationship between fittings and local conditions. Since the spatial variation of local conditions and competition were taken into account in the GWR model, it showed better fitting results for the height-diameter regression relationship (Zhang et al., 2004).

When comparing the LMM and GWR models in this study, it was found that GWR estimation produces similar errors to the LMM when generalized equation was applied. Analogous results have been found in other studies, supporting our finding that GWR gives similar quality estimations to the LMM methodology and predicts the response variable accurately (Zhang et al., 2005; Zhang et al., 2008, Zhang et al., 2009). However, GWR showed not so good predictive quality with the Petterson function, showing no important improvement over LMM without calibration. We suppose that GWR quality could be improved by more sophisticated calibration, which needs a deeper investigation.

LMM and GWR provide different perspectives and insights for data analysis and model prediction. In LMM, the fixed component explains the impact of different variables as with the ordinary least squares regression (Yang & Huang, 2011), and the random component explains the heterogeneity and randomness by both known and unknown factors (Vonesh & Chiinchilli, 1997). LMM is able to characterize the spatial covariance structures in the data with different geostatistic models (Zhang et al., 2009), and produces accurate predictions for the response variable by accounting for the effects of spatial autocorrelation through the empirical best linear unbiased predictors (Littell et al., 1996; Schabenberger & Pierce, 2002). However, GWR has several very important advantages over LMM, which include the following: (1) It is a useful tool for exploring spatial nonstationarity at different scales (Jetz et al., 2005), which was achieved by changing the bandwidth of study. (2) Its capacity for testing the heterogeneity of the study area (Danlin & Yehua, 2013). The heterogeneity is shown in the coefficient maps (Figures 2 and 3) and can be tested numerically by applying the relative spatial heterogeneity index (Zhang et al., 2009). (3) GWR is able to work with spatial data and every kind of attribute data (Danlin & Yehua, 2013); this is particularly useful in the case of Forest Inventories where attributes of measured trees and their location are known. (4) The residuals for GWR are at least similar as for LMM (Zhang et al., 2004; Zhang & Shi, 2004; Zhang et al., 2009), as confirmed in our research. (5) GWR results, unlike LMM model results, are mappable; these maps of coefficients facilitating the interpretation based on spatial context and known characteristics of the study area (Goodchild & Janelle, 2004). The maps of coefficients in our study are shown in Figures 2 and 3. (6) In contrast to the LMM method, the GWR regression does not require any additional measurements or calibrations.


Two different methodologies were used to model local and generalized height-diameter relationships: the geostatistical method GWR and LMM methodology. Additionally, two different height-diameter functions were selected to fit these models: a local function and a generalized function. Finally, the quality of the estimated models was compared through statistical analysis.

As expected, it was observed that in general, both LMM and GWR provide better prediction quality when applied to a generalized height-diameter equation. However, error analysis revealed that there were no great differences between these two methodologies, evidencing that GWR applied to a generalized function provides results which are at least as good as those obtained using the calibrated LMM methodology. However, GWR has a number of advantages over LMM. For instance, GWR provides information on a continuous map of regression coefficients within the studied area. Consequently, is possible to calculate tree heights in every single location without the need for additional measurements. Furthermore, the GWR method can help us to understand spatial processes and may have interesting applications in the area of forest yield and production.


The authors would like to thank two anonymous reviewers and handling editor for their valuable comments that helped improve the quality of this study.


Adamec Z, 2014. Využití moderních regresních metod pro modelování výškové křivky. Doctoral thesis. Mendelova Univerzita v Brně. Czech Republic.
Adamec Z, Drápela K, 2015. Generalized additive models as an alternative approach to the modelling of the tree height-diameter relationship. J For Sci 61(6): 235-243.
Anselin L, Griffith DA, 1988. Do spatial effects really matter in regression analysis. Papers in Regional Science Association 65: 11-34.
Binbin L, Harris P, Charlton M, Brunsdon C, Nakaya T, Gollini I, 2015. R-CRAN Package “GWModel”. Available at:
Brunsdon C, Fotheringham S, Charlton M, 1996. Geographically Weighted Regression: A method for explore spatial non-stationarity. Geogr Anal 28 (4): 281-298.
Cabanillas A, 2010. Bases para la gestión de masas naturales de (Pinus halepensis Mill.) en el Valle del Ebro. Doctoral Thesis. Dept. de Ingeniería Forestal, Univ. Politécnica de Madrid, Spain.
Calama R, Montero G, 2004. Interregional nonlinear height-diameter model with random coefficients for stone pine in Spain. Can J For Res 34:150-163.
Cañadas N, García C, Montero G, 1999. Relación altura-diámetro para Pinus pinea L. en el Sistema central. Actas del Congreso de Ordenación y Gestión Sostenible de Montes, Santiago de Compostela, Spain. I: 139:153.
Charlton M, Fotheringham S, 2009. Geographically Weighted Regression. White Paper for Science Foundation, Ireland: 1-14.
Chen G, Zhao K, McDermid G, Hay G, 2012. The influence of sampling density on geographically weighted regression: a case study using forest canopy height and optical data. Int J Remote Sens 33 (9): 2909-2924.
Condés S, Sterba H, 2008. Comparing an individual tree growth model for Pinus halepensis Mill. in the Spanish region of Murcia with yield tables gained from the same area. Eur J Forest Res 127: 253-261.
Danlin Y, Yehua DW, 2013. Geographically Weighted Regression.
Diéguez-Aranda U, Rojo Alboreca A, Castedo-Dorado F, Álvarez González JG, Barrio-Anta M, Crecente-Campo F, González González JM, Pérez-Cruzado C, Rodríguez Soalleiro R et al., 2009. Herramientas selvícolas para la gestión forestal sostenible en Galicia. Dirección Xeral de Montes, Consellería do Medio Rural, Xunta de Galicia, Spain. 259 pp.
Drápela K, 2011. Regresní modely a možnosti jejich využití v lesnictví. Habilitační práce. Mendelova univerzita v Brně. Czech Republic.
Fortin MJ, Dale M, 2005. Spatial Analysis: A Guide for ecologist. First ed. Cambridge: Cambridge University Press, UK.
Fotheringham S, Brunsdon C, Charlton M, 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships, John Wiley & Sons, New York, NY, 269 pp.
Gaffrey D, 1988. Forstamts-und bestandesindividuelles Sortimentierungsprogramm als Mittel zur Planung, Aushaltung und Simulation. Diplomarbeit Forscliche Fakultät, Univ. Göttingen. Germany.
Gollini I, Binbin L, Charlton M, Brunsdon C, Harris P, 2015. GWmodel: An R Package for Exploring Spatial Heterogeneity Using Geographically Weighted Models. J Stat Sof 63 (17): 1-50.
Goodchild MF, Janelle DG, 2004. Spatially integrated social science. New York, NY: Oxford University Press.
Hart HMF, 1928. Stamtal en dunning; een orienteerend onderzoek naar de beste plantwijdte en dunning swijze woor den djati. Veenman & Zonen. Wageningen. Netherlands.
Jetz W, Ahbek CR, Ichstein JWL, 2005. Local and global approaches to spatial data analysis in ecology. Global Ecol Biogeogr 14: 97-98.
Juárez de Galíndez M, Pece MG, Gaillard de Benítez C, Sanguedolce J, Mariot V, Mazzuco R, 2007. Ecuaciones altura-diámetro en Prosopis nigra (Griseb) Hieron (algarrobo negro) en Santiago del Estero, Argentina. Foresta Veracruzana 9 (1): 9-14. Recursos Genéticos Forestales. Xalapa, México.
Lappi J, 1997. A longitudinal analysis of height/diameter curves. Forest Sci 43:555-570.
Li H, Reynolds JF, 1994. A simulation experiment to quantify spatial heterogeneity in categorical maps. Ecology 75: 2446-2455.
Lindstrom MJ, Bates DM, 1990. Nonlinear mixed effects for repeated measures data. Biometrics 46: 673-687.
Littell RC, Milliken GA, Stroup WW, Wolfinger RD, 1996. SAS System for Mixed Models. SAS Institute Inc, Cary, NC.
MAGRAMA, 2015. Ministerio de Agricultura, Alimentación y Medio Ambiente. Ministra: Isabel García Tejerina.
Matthews S, Yang TC, 2012. Mapping the results of local statistics: Using geographically weighted regression. Demographic Res 26(6): 151-166.
Mehtätalo L, 2004. A longitudinal height-diameter model for Norway spruce in Finland. Can J Forest Res 34(1): 131-140.
Meng SX, Huang S, 2009. Improved calibration of nonlinear mixed-effects models demonstrated on a height growth function. Forest Sci 55(3): 238-248.
Milne BT, 1991. Heterogeneity as a multi-scale characteristic of landscape. In: J Kolasa and TA Pickett, Ecological Heterogeneity: 69-84. Springer-Verlag, New York.
Nanos N, Calama R, Montero G, Gil L, 2004. Geostatistical prediction of height/diameter models. Forest Ecol Manag 195: 221-235.
Páscoa F, 1987. Estructura, crescimento e produçao em povoamentos de pinheiro bravo. Um modelo de simulaçao. Tesis doctoral. Universidade Técnica de Lisboa. Portugal.
Petterson H, 1955. Barrskogens volymproduktion. Meddelanden från Statens skogsforskningsinstitut 45 (1A): 1-391.
R Development Core Team, 2014. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Robinson GK, 1991. That BLUP is a Good Thing: The Estimation of Random Effects. Statistical Sci 6(1): 15-32.
Salas C, Ene L, Gregoire T.G, Naesset E, Gobakken T, 2010. Modelling tree diameter from airborne laser scanning derived variables: A comparison of spatial statistical models. Remote Sens Environ 144: 1277-1285.
Schabenberger O, Pierce FJ, 2002. Contemporary Statistical Models for the Plant and Soil Sciences. CRC Press, Boca Raton, FL, 738.
Sloboda VB, Gaffrey D, Matsumura N, 1993. Regionale und lokale Systeme von Höhenkurven für gleichaltrige Waldbestände. Allg Forst-u J-Ztg 164: 225-228.
Tardanico R, 2006. GWR&2D Mapping. Florida International University.
Temesgen H, Gadow K, 2004. Generalized height-diameter models—an application for major tree species in complex stands of interior British Columbia. Eur J For Res 123: 45-51.
Turner MG, 1989. Landscape ecology: the effect of pattern on process. Ann Rev Ecol Syst 20: 171-197.
Vonesh EF, Chiinchilli VM, 1997: Linear and nonlinear models for the analysis of repeated measurements. Boca Raton, Chapman & Hall/CRC: 560.
Wang Q, Ni J, Tenhunen J, 2005. Application of a geographically weighted regression to estimate net primary production of Chinese forest ecosystem. Global Ecol Biogeogr 14: 379-393.
Watson DF, Philip GM, 1985. A Refinement of Inverse Distance Weighted Interpolation. Geoprocessing 2:315-327.
Wiens JA, 1976. Population responses to patchy environments. Ann Rev Ecol Syst 7: 81-120.
Yang Y, Huang S, 2011: Comparison of different methods for fitting nonlinear mixed forest models and for making predictions. Can J Forest Res 41: 1671-1686.
Zhang L, Bi H, Cheng P, Davis CJ, 2004. Modelling spatial variation in tree diameter-height relationships. Forest Ecol Manag 189: 317-329.
Zhang L, Gove JH, Heath LS, 2005. Spatial residual analysis of six modelling techniques. Ecol Model 186: 154-177.
Zhang L, Ma Z, Guo L, 2008. Spatially assessing model errors of four regression techniques for three types of forest stands. Oxford Journals 81(2): 209-225.
Zhang L, Ma Z, Guo L, 2009. An evaluation of spatial autocorrelation and heterogeneity in the residuals of six regression models. Forest Sci 55: 533-548.
Zhang L, Shi H, 2004. Local Modelling of Tree Growth by Geographically Weighted Regression. Forest Sci 50 (2): 225-244.