Short Communication. A comparison of estimation methods for fitting Weibull and Johnson’s SB functions to pedunculate oak (Quercus robur) and birch (Betula pubescens) stands in northwest Spain

Aim of study. In this study we compared the accuracy of the Weibull and the Johnson’s SB functions for describing diameter distributions in pedunculate oak (Quercus robur L.) and birch (Betula pubescens Ehrh.) stands. Material and Methods. A total of 172 diameter distributions in pedunculate oak and 202 in birch stands were evaluated. We compared the accuracy of three commonly used estimation methods for the Weibull function and four estimation methods for the Johnson’s SB function for describing these diameter distributions. Main results. For Quercus robur L. stands, the most suitable methods were Percentiles followed by Maximum Likelihood for the Weibull PDF and the method of Moments for the Johnson’s SB PDF. For Betula pubescens Ehrh. stands, the fits obtained with the Percentiles and Maximum Likelihood methods were also superior to those of the method of Moments for the Weibull PDF, whereas the Conditional Maximum Likelihood and the method of Moments provided the best results for the Johnson’s SB PDF, depending on the statistic and the value of the location parameter considered. Research highlights. Both distributions were suitable. The results were better for pedunculate oak than for birch stands.


Data set
In total, 172 diameter distributions in pedunculate oak and 202 in birch stands were evaluated in the present study. Plot size ranged from 200 m² to 1,000 m² in birch stands and from 225 m² to 1,345 m² in pedunculate oak stands (depending on the stand density), so that each plot contained a minimum of 30 trees in both species.
Two perpendicular diameters at breast height were measured with calipers, to the nearest 0.1 cm, and the arithmetic average was calculated. In many cases the empirical data represent left-truncated distributions, as the smallest diameter measured in the field was 5 cm. In total, 16,210 diameter measurements in birch stands and 10,248 measurements in oak stands were available for analysis. A summary of the main stand variables for the study stands is shown in Table 1.

Model fitting
The Weibull function
The three-parameter Weibull CDF is obtained by integrating the Weibull PDF, and has the following expression for a continuous random variable x:

$$F(x) = 1 - \exp\left[-\left(\frac{x - a}{b}\right)^{c}\right]$$

where F(x) is the cumulative relative frequency of trees with diameter equal to or smaller than x, a is the location parameter, b is the scale parameter and c is the shape parameter. Three methods of estimating the parameters of the Weibull distribution were compared: Percentiles (PW), Maximum Likelihood (ML) and Moments (MMW), as used by Zhang et al. (2003).
The location parameter a of the function was set in all fitting methods to d_min - c, where c (here an offset, not the shape parameter of the Weibull function) was 5%, 10%, 15% or 20% of the minimum diameter observed in each distribution. Zhang et al. (2003) proposed similar values when diameters of 10 cm are considered.
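As a minimal sketch of how these quantities could be computed for a single plot, the Python code below sets a = d_min - c for a chosen percentage and then estimates b and c by matching the mean and variance of x - a. The function names, the bisection bounds and the sample diameters are illustrative assumptions. The moment-matching step is one common formulation and is not necessarily the exact MMW procedure of Zhang et al. (2003).

```python
import math


def weibull_cdf(x, a, b, c):
    """Three-parameter Weibull CDF (a: location, b: scale, c: shape)."""
    if x <= a:
        return 0.0
    return 1.0 - math.exp(-(((x - a) / b) ** c))


def weibull_pdf(x, a, b, c):
    """Three-parameter Weibull PDF, the derivative of weibull_cdf."""
    if x <= a:
        return 0.0
    z = (x - a) / b
    return (c / b) * z ** (c - 1.0) * math.exp(-(z ** c))


def location_from_min(diameters, offset_fraction):
    """Location a = d_min - offset, the offset being a fraction of d_min."""
    d_min = min(diameters)
    return d_min - offset_fraction * d_min


def fit_moments_fixed_location(diameters, a):
    """Estimate (b, c) with a fixed by matching mean and variance of x - a.

    Assumption: generic moment matching, solved for c by bisection.
    """
    shifted = [d - a for d in diameters]
    n = len(shifted)
    mean = sum(shifted) / n
    var = sum((s - mean) ** 2 for s in shifted) / n
    cv2 = var / mean ** 2  # squared coefficient of variation

    def cv2_of_shape(c):
        g1 = math.gamma(1.0 + 1.0 / c)
        g2 = math.gamma(1.0 + 2.0 / c)
        return g2 / g1 ** 2 - 1.0

    lo, hi = 0.2, 50.0  # assumed search interval for the shape parameter
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # cv2_of_shape decreases as c grows, so move towards larger c
        # whenever the candidate still gives too much relative spread.
        if cv2_of_shape(mid) > cv2:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    b = mean / math.gamma(1.0 + 1.0 / c)
    return b, c


if __name__ == "__main__":
    # Hypothetical diameters (cm) for one plot, not data from the study.
    plot = [6.2, 7.5, 8.1, 9.4, 10.0, 11.3, 12.8, 14.1, 16.5, 19.0, 24.5]
    a = location_from_min(plot, 0.10)  # offset of 10% of d_min
    b, c = fit_moments_fixed_location(plot, a)
    print(a, b, c)
```

Fixing a in advance reduces the fit to two unknowns, which is why a one-dimensional search over the shape parameter is enough here.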

The Johnson's SB function
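The model of the SB PDF (Johnson, 1949) has the following expression for a continuous random variable x:

$$f(x) = \frac{\delta}{\sqrt{2\pi}} \, \frac{\lambda}{(x - \varepsilon)(\varepsilon + \lambda - x)} \exp\left\{-\frac{1}{2}\left[\gamma + \delta \ln\left(\frac{x - \varepsilon}{\varepsilon + \lambda - x}\right)\right]^{2}\right\}, \qquad \varepsilon < x < \varepsilon + \lambda$$

The model is characterized by the location parameter ε, the scale parameter λ, and the shape parameters γ and δ (asymmetry and kurtosis parameters, respectively). Four methods of estimating the Johnson's SB parameters were compared: Conditional Maximum Likelihood (CML), Moments (MMJ), Mode (MJ) and Knoebel and Burkhart's (KB) method (Knoebel and Burkhart, 1991).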
The location parameter ε of the function was set in the CML, MMJ and MJ methods to d_min - c, with c = 5%, 10%, 15% and 20% of the minimum diameter observed in each distribution. In these fitting methods, the scale parameter λ of the function was set to the maximum diameter observed in each distribution (d_max).
In the fits with the KB method, the location and scale parameters (ε and λ) were predetermined according to Knoebel and Burkhart (1991).
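The conditional approach can be illustrated with a short Python sketch. It assumes the textbook closed-form estimators for γ and δ when ε and λ are fixed in advance (based on the mean and standard deviation of the logit-transformed diameters), which may differ in detail from the implementation used in the study. The function names and the sample diameters are hypothetical.

```python
import math


def johnson_sb_pdf(x, eps, lam, gamma, delta):
    """Johnson's SB PDF with location eps, scale lam, shapes gamma and delta."""
    if x <= eps or x >= eps + lam:
        return 0.0
    z = gamma + delta * math.log((x - eps) / (eps + lam - x))
    scale = lam / ((x - eps) * (eps + lam - x))
    return delta / math.sqrt(2.0 * math.pi) * scale * math.exp(-0.5 * z * z)


def fit_sb_cml(diameters, eps, lam):
    """Conditional ML estimates of (gamma, delta) for fixed eps and lam.

    Assumption: f_i = ln((x_i - eps) / (eps + lam - x_i)),
    delta = 1 / s_f and gamma = -mean(f) / s_f, with s_f computed
    using n (not n - 1) in the denominator.
    """
    f = [math.log((d - eps) / (eps + lam - d)) for d in diameters]
    n = len(f)
    mean_f = sum(f) / n
    sd_f = math.sqrt(sum((v - mean_f) ** 2 for v in f) / n)
    delta = 1.0 / sd_f
    gamma = -mean_f / sd_f
    return gamma, delta


if __name__ == "__main__":
    # Hypothetical diameters (cm) for one plot, not data from the study.
    plot = [6.2, 7.5, 8.1, 9.4, 10.0, 11.3, 12.8, 14.1, 16.5, 19.0, 24.5]
    eps = min(plot) - 0.10 * min(plot)  # d_min minus 10% of d_min
    lam = max(plot)                     # scale set to d_max, as in the text
    gamma, delta = fit_sb_cml(plot, eps, lam)
    print(gamma, delta, johnson_sb_pdf(10.0, eps, lam, gamma, delta))
```

With ε > 0 and λ = d_max, the upper bound ε + λ lies above the largest observed diameter, so the logarithm is defined for every tree in the plot.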

Model comparison
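The consistency of the model and the fitting method used was evaluated by the bias, mean absolute error (MAE), and mean square error (MSE), with the following expressions:

$$\text{Bias} = \frac{\sum_{i=1}^{N} (Y_i - \hat{Y}_i)}{N}, \qquad \text{MAE} = \frac{\sum_{i=1}^{N} |Y_i - \hat{Y}_i|}{N}, \qquad \text{MSE} = \frac{\sum_{i=1}^{N} (Y_i - \hat{Y}_i)^{2}}{N}$$

where Y_i is the observed relative frequency of trees in each diameter class, Ŷ_i is the theoretical value predicted by the model, and N is the number of data points.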
The bias, MAE and MSE values were calculated for each fit from the relative frequency of trees, averaged over all combinations of diameter classes (1 cm) and plots. The Weibull PDF was used instead of the CDF for reliable comparison of results.
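The three error statistics can be sketched in a few lines of Python. The example assumes that the residual is taken as observed minus predicted and that both inputs are relative frequencies per 1-cm diameter class; the function name and the numbers are illustrative only.

```python
def fit_statistics(observed, predicted):
    """Return (bias, MAE, MSE) for paired relative frequencies.

    Assumption: residuals are observed minus predicted.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    bias = sum(residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    return bias, mae, mse


if __name__ == "__main__":
    # Hypothetical relative frequencies for three diameter classes.
    observed = [0.20, 0.50, 0.30]
    predicted = [0.25, 0.45, 0.30]
    print(fit_statistics(observed, predicted))
```

In this toy case the positive and negative residuals offset each other in the bias while still contributing to MAE and MSE, which is why the bias alone can understate the error of a fit.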
The Kolmogorov-Smirnov (KS) statistic (D_n) for a given cumulative distribution function F(x) was also used to evaluate and compare the results, following Cao (2004):

$$D_n = \sup_x \left| F_n(x) - F_0(x) \right|$$

where sup_x is the supremum of the set of distances between the cumulative observed frequency F_n(x) and the cumulative estimated frequency F_0(x).
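A minimal Python sketch of the KS statistic follows. It assumes that the empirical CDF is built from individual tree diameters rather than from grouped diameter classes, which may not match the procedure used in the study. The function name and the uniform example are hypothetical.

```python
def ks_statistic(diameters, cdf):
    """One-sample KS distance between the empirical CDF and a fitted CDF.

    The empirical CDF jumps at each observation, so the distance is
    checked just before and just after every jump.
    """
    xs = sorted(diameters)
    n = len(xs)
    d_n = 0.0
    for i, x in enumerate(xs):
        f0 = cdf(x)
        d_n = max(d_n, abs((i + 1) / n - f0), abs(f0 - i / n))
    return d_n


if __name__ == "__main__":
    # Hypothetical check against a uniform CDF on [0, 1].
    def uniform_cdf(x):
        return min(max(x, 0.0), 1.0)

    print(ks_statistic([0.25, 0.50, 0.75], uniform_cdf))
```

Any fitted CDF can be passed in as a callable of x, for example a Weibull or Johnson's SB CDF with the estimated parameters bound in advance.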

Results and discussion
The mean values of bias, mean absolute error (MAE) and mean square error (MSE) in relative frequency of trees, and the mean value and the standard deviation of the Kolmogorov-Smirnov statistic (D_n) for the fits in Quercus robur stands (N = 172 plots) and the corresponding statistics for Betula pubescens stands (N = 202 plots) are shown in Table 2. The number of trees per ha observed and fitted by four methods for the Johnson's SB PDF and three methods for the Weibull PDF, with c = 10% of the minimum observed diameter in four plots of Quercus robur and Betula pubescens stands, are shown in Fig. 1.
Both functions were suitable for fitting diameter distributions in pedunculate oak and birch stands in northwest Spain. The Johnson's SB PDF provided the best results for the KS statistic (D_n) in both species, except for the MJ method, while the Weibull PDF generally provided the best fits in terms of MAE and MSE. Bias may be less important in the comparison of results because errors with different signs can cancel each other out. The results were more accurate in pedunculate oak than in birch stands for all statistics compared.
For the fits of the Weibull PDF to Quercus robur L. stands, the most accurate results were generally obtained with the method of Percentiles (PW), in terms of the MAE, MSE and D_n statistics and considering all four values of the location parameter compared. However, the lowest value of the D_n statistic was obtained with the Maximum Likelihood (ML) approach, considering c = 5% of the minimum observed diameter. The most suitable value of c was 10% in terms of MSE in all three fitting methods. For the D_n statistic, the best results were obtained with c = 20% in PW, c = 5% in ML and c = 10% in MMW. Different values of the location parameter a of the Weibull PDF have been considered in several studies (Río, 1999; Zhang et al., 2003; Cao, 2004; Palahí et al., 2007; Gorgoso et al., 2012).
For the Johnson's SB PDF, the lowest values of D_n were obtained with the Moments approach (MMJ), with c = 5% and c = 10%. The Mode (MJ) method clearly provided the poorest results. Good fits with this method in some plots are shown in Fig. 1; however, in other cases the fits are clearly more biased than with the other methods (see the B. pubescens plot 1BAR). Another problem with this method was the difficulty of determining the mode value in some plots. Finally, the KB method was slightly inferior to MMJ and CML in terms of D_n, but values of MAE and MSE were similar to those obtained by CML when c = 10%. Different values of the location parameter in Johnson's SB function have been compared in several studies (Knoebel & Burkhart, 1991; Zhou & McTague, 1996; Zhang et al., 2003; Scolforo et al., 2003; Parresol, 2003; Palahí et al., 2007; Fonseca et al., 2009; Gorgoso et al., 2012).
In the fits of the Weibull PDF to Betula pubescens Ehrh. stands, the most accurate results were obtained with the ML method in terms of the KS statistic (D_n), with c = 5% and c = 10% of the minimum observed diameter in the plot. However, in all cases, the best results for MAE and MSE were obtained by the Percentiles method (PW), i.e. the results were similar to those for pedunculate oak stands. The most appropriate values of the location parameter in terms of MSE were obtained when c = 15% in the PW and ML approaches and when c = 20% with MMW. Nevertheless, higher values of the KS statistic were obtained with low values of the location parameter, except in the case of PW, for which the smallest value was obtained with c = 20%.
In the case of the Johnson's SB PDF, the lowest values of D_n were obtained with the Conditional Maximum Likelihood approach (CML), with c = 5% followed by c = 10%. However, the best results in terms of MAE and MSE were obtained with the method of Moments (MMJ) with c = 20%. The method of the Mode (MJ) was not suitable for these stands, and in 3 of the plots the mode value could not be computed. The KB method was slightly inferior to MMJ and CML in terms of D_n but provided values of MAE and MSE similar to those of CML and MMJ when c = 10% in both cases.
In both species, results in terms of the KS statistic were better with the Johnson's SB function, although this four-parameter model is more complex than the three-parameter Weibull distribution.