Modelling of the leaf area for various pear cultivars using neuro computing approaches

Aim of study: Leaf area (LA) is an important variable for many stages of plant growth and development such as light interception, water and nutrient use, photosynthetic efficiency, respiration, and yield potential. This study aimed to determine the easiest, most accurate and most reliable LA estimation model for the pear using linear measurements of leaf geometry and to compare its performance with that of artificial neural networks (ANN). Area of study: Samsun, Turkey. Material and methods: Different numbers of leaves were collected from 12 pear cultivars to measure leaf length (L) and width (W) as well as LA. Multiple linear regression (MLR) was used to predict LA from L and W. ANN models with different numbers of neurons were trained and used to predict LA. Main results: The general linear regression LA estimation model was found to be LA = -0.433 + 0.715LW ($R^{2}$ = 0.987). In each pear cultivar, ANN models were found to be more accurate than MLR models in both the training and testing phases. Research highlights: In the prediction of LA for different pear cultivars, ANN can thus be used in addition to MLR as an effective tool to circumvent difficulties met in the direct measurement of LA in the laboratory. Additional keywords: Pyrus communis L.; artificial neural networks; multiple linear regressions; model estimation. Abbreviations used: ANN (artificial neural network); L (length); LA (leaf area); LM (Levenberg-Marquardt); MAD (mean absolute deviation); MAPE (mean absolute percentage error); MLR (multiple linear regression); MSE (mean square error); $R^{2}$ (determination coefficient); VIF (variance inflation factors); W (width). Authors’ contributions: Conceived, designed and performed the experiments: OA and DH. Analyzed the data: CB and KE. All authors wrote and approved the final manuscript. Citation: Ozturk, A; Cemek, B; Demirsoy, H; Kucuktopcu, E (2019). 
Modelling of the leaf area for various pear cultivars using neuro computing approaches. Spanish Journal of Agricultural Research, Volume 17, Issue 4, e0206. https://doi.org/10.5424/sjar/201917414675 Received: 08 Feb 2019. Accepted: 11 Dec 2019. Copyright © 2019 INIA. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC-by 4.0) License. Funding: The authors received no specific funding for this work. Competing interests: The authors have declared that no competing interests exist. Correspondence should be addressed to Ahmet Öztürk: ozturka@omu.edu.tr


Introduction
The common pear (Pyrus communis L.) is the second largest temperate fruit tree crop globally and is cultivated worldwide. World pear production reached 24,168,309 tonnes in 2017, with Turkey producing 503,004 tonnes and thus ranking fifth in the world that year (FAO, 2019). According to the FAO report on the state of the world's plant genetic resources for food and agriculture, at least 1,140 pear accessions are present in ex situ collections worldwide (Federico et al., 2008), and more than 600 cultivars have been reported in Turkey (Özbek, 1978).
Leaf area (LA) is commonly evaluated in fruit physiology experiments. The determination of LA is an important criterion in understanding respiration, transpiration, evapotranspiration, photosynthesis, light interception, water and nutrient use, fertilization, irrigation, plant growth, flowering, fruit set, yield, and fruit quality (Smith & Kliewer, 1984; Smart, 1985). LA prediction models provide many benefits to researchers in horticultural experiments. In particular, these models allow researchers to measure LA on the same plants throughout the growth period and may reduce variability in experiments (Gamiely et al., 1991; NeSmith, 1992; Demirsoy, 2009).
LA can be determined by both direct and indirect methods. The direct methods, called destructive methods, require destroying all the leaves in the plant canopy; therefore, it is not possible to take successive measurements from the same leaf (Spann & Heerema, 2010). Direct methods are generally simple and accurate, but the measurements involved can be laborious and time-consuming, which makes it difficult to obtain a representative spatial sample and makes large-scale implementation only marginally feasible (Rouphael et al., 2006). Recently, new instruments, tools and machines such as hand-scanners and laser-optic devices have been developed for some LA measurements. However, these are too expensive and complex for basic and simple studies. In contrast, indirect methods, called non-destructive methods, are simple, rapid and inexpensive, save labour relative to direct measurements, and can provide precise LA estimates without damaging the plant (Robbins & Pharr, 1987; Ahmadian-Moghadam, 2012).
In recent years, a number of artificial neural network (ANN) studies have been performed in agricultural fields and successful results have been achieved. Yuan et al. (2017) evaluated the performance of random forest (RF), ANN, and support vector machine (SVM) regression models, comparing them with a partial least-squares regression (PLS) model for heterogeneous soybean crops. The results showed that the ANN model could be used for estimating leaf area index (LAI). Kumar et al. (2017) developed an ANN model with L and W as inputs and compared it with regression models. They found that the ANN model was more accurate than the regression model. Küçükönder et al. (2016) used ANN and regression analysis techniques to develop the best LA estimation model and concluded that the ANN approach can be used as an alternative method for estimating LA. Shabani et al. (2017) used ANN to estimate the LA of different plants and stated that ANN provides a good estimation of LA. They emphasized that ANN is applicable to all plant species, whereas other models require a specific equation to be established for each plant.
Although many different equations have been derived in the literature for various plants and cultivars, the coefficients, as well as the kinds of equations, are plant specific (Kumar & Sharma, 2013). Some studies that included pear cultivars (Pyrus pyrifolia cv. 'Nijiseiki') were carried out to determine the LA of shoots and spurs (Kumar et al., 1977; Spann & Heerema, 2010), but there have been no attempts to develop a non-destructive LA prediction model and an ANN for the pear alone. In the previous studies, pears were examined together with other fruit species. Developing an individual LA model for a single species is more important for obtaining a basic and correct equation for individual LA. Thus, this study aims to address this gap by developing ANN models for the pear from linear measurements of leaf geometry and comparing their performance with that of multiple linear regression (MLR).

Material and methods
The study was carried out on 12 pear cultivars in Samsun, Turkey, from 2014 to 2015 to develop an LA prediction model and ANN models. The examined pear cultivars were grafted on 'BA 29' quince rootstock. For model development and validation, these 12 commonly grown and economically important pear cultivars were selected, with different numbers of leaves sampled from each (Table 1). Fully expanded leaf samples of different sizes were randomly taken from the tree canopy during the active growing season at one-month intervals (three months: June, July, August) in the two investigation years, giving a total of 2975 leaves.
Initially, each leaf was placed on an A4 sheet and copied (at a 1:1 ratio) with a photocopier. A Placom digital planimeter (Sokkisha Planimeter Inc., Model KP-90) was used to measure the actual leaf area of the copy. The leaf width (W) and length (L) of the leaf samples were also measured for model construction. W (cm) was measured from tip to tip at the widest part of the lamina and L (cm) was measured from the lamina tip to the point of petiole intersection along the midrib (Fig. 1). All values were recorded to the nearest 0.1 cm.

Model construction for multiple linear regressions (MLR)
The most common method used in the estimation of LA is MLR. To determine the model, MLR analysis using the stepwise method was performed on the observed data from the 12 pear cultivars. The analysis was conducted with various subsets of the independent variables, namely $L$, $L^{0.5}$, $L^{2}$, $W$, $W^{0.5}$, $W^{2}$, $LW$, $L^{2}W^{2}$, $(L+W)$ and $(L+W)^{2}$, to develop the best model for predicting LA, using the Microsoft Office 2015 Excel package program. The MLR analysis was carried out until the deviation sum of squares was minimized.
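As an illustration of the simplest candidate form, LA = a + b·LW, the following minimal sketch fits a single-predictor least-squares line in plain Python. It is not the stepwise Excel procedure used in the study; the leaf measurements are hypothetical, and the synthetic areas are generated from the general model reported in the abstract so that the fit can be checked against known coefficients.

```python
def fit_simple_linear(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical leaf lengths and widths in cm (not data from the study).
lengths = [5.2, 6.1, 7.4, 8.0, 9.3]
widths = [3.9, 3.1, 4.6, 3.8, 4.9]

# Predictor LW, and synthetic areas built from LA = -0.433 + 0.715*LW.
lw = [l * w for l, w in zip(lengths, widths)]
areas = [-0.433 + 0.715 * p for p in lw]

a, b = fit_simple_linear(lw, areas)
print(f"LA = {a:.3f} + {b:.3f} * LW")
```

Because the synthetic areas lie exactly on the line, the fit recovers the intercept and slope used to generate them; real planimeter data would scatter around the fitted line.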
When two or more variables in the model (i.e. L and W) are correlated, the problem of collinearity may occur, because collinear variables provide redundant information about the response. In order to detect collinearity, the variance inflation factor (VIF) of each predictor was calculated (Mansfield & Helms, 1982). The VIF was calculated as

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}$$

where $R_j^{2}$ is the coefficient of determination of the model that includes all predictors except the $j$th predictor. If VIF values are less than 10, then there is no problem with collinearity.
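The collinearity check can be sketched as follows, assuming the standard definition VIF = 1 / (1 - R²). With only two predictors (L and W), the R² obtained by regressing one predictor on the other equals the squared Pearson correlation between them, so both predictors share the same VIF. The function names and sample values are illustrative only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical leaf lengths and widths in cm (not data from the study).
lengths = [5.2, 6.1, 7.4, 8.0, 9.3]
widths = [3.9, 3.1, 4.6, 3.8, 4.9]

vif = vif_two_predictors(lengths, widths)
print("VIF:", round(vif, 3), "collinearity concern:", vif >= 10)
```

A VIF of 1 corresponds to uncorrelated predictors; the value grows without bound as the correlation approaches ±1.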

Artificial neural networks (ANN)
The multilayer perceptron (MLP) is the most broadly used neural network model (Fig. 2). The MLP comprises a set of simple interconnected units (neurons or nodes). The nodes are linked by weights, and the output signal of each node is a function of the sum of its inputs modified by a simple nonlinear transfer, or activation, function (Haykin, 1994; Gardner & Dorling, 1998).
The ANN models were built with the MATLAB (R2010b) software program. Levenberg-Marquardt (LM) back propagation was employed as the learning algorithm to train the network. This algorithm was selected because it is more effective and easier to train for complex networks than standard back propagation (Sapna et al., 2012). The ANN structure used in this study contains only one hidden layer, because many theoretical and experimental results confirm that one hidden layer can be enough for forecasting problems (Cybenko, 1989; Hornik et al., 1989). Neurons in the input layer had no transfer function, while the hidden and output nodes used tangent sigmoid (tansig) and linear (purelin) transfer functions, respectively. Hidden layers with 3, 5 and 7 neurons were tested to find the optimal network. To determine the final network architecture with the most suitable number of neurons, a trial-and-error procedure was undertaken, with the number of iteration steps increased from 20 to 100 in increments of 10.
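To make the network structure concrete, here is a minimal sketch of the forward pass of an MLP with two inputs (L, W), three hidden neurons and one output. It assumes that tansig is mathematically equivalent to the hyperbolic tangent and that purelin is the identity. The weights and biases below are placeholders chosen for illustration, not the trained values from the study, and Levenberg-Marquardt training itself is not shown.

```python
import math

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh (tansig) hidden layer, linear (purelin) output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Placeholder parameters for a (2, 3, 1) network: hypothetical values only.
w_hidden = [[0.8, -0.2], [0.1, 0.9], [-0.5, 0.4]]  # 3 hidden neurons x 2 inputs
b_hidden = [0.0, -0.1, 0.2]
w_out = [0.6, 0.7, -0.3]
b_out = 0.05

# Inputs are assumed to be already scaled to [0, 1], as in the pre-processing step.
scaled_l, scaled_w = 0.55, 0.40
print(mlp_forward([scaled_l, scaled_w], w_hidden, b_hidden, w_out, b_out))
```

The output is in the scaled range as well and would need to be mapped back to cm² by inverting the scaling applied to LA.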

Data pre-processing
The data were divided into 70% for training and 30% for testing both MLR and ANN. To bring all the input and output data into a comparable range, the training and testing data were standardized to fall in a range between 0 and 1, as given in formula (1):

$$X_n = \frac{X_i - X_{min}}{X_{max} - X_{min}} \qquad (1)$$

where $X_n$ is the standardized value; $X_i$ is the observed value; and $X_{min}$ and $X_{max}$ are the minimum and maximum values of the training and testing data, respectively.
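The pre-processing step can be sketched as below: min-max scaling to the range 0 to 1, followed by a 70/30 split. The random shuffling with a fixed seed is an assumption made for reproducibility; the source does not state how leaves were assigned to the training and testing sets.

```python
import random

def min_max_scale(values):
    """Scale values to [0, 1] using (x - min) / (max - min)."""
    v_min, v_max = min(values), max(values)
    return [(v - v_min) / (v_max - v_min) for v in values]

def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split it 70% / 30%."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10
    return shuffled[:cut], shuffled[cut:]

# Hypothetical leaf lengths in cm (not data from the study).
lengths = [5.2, 6.1, 7.4, 8.0, 9.3, 4.8, 6.6, 7.1, 8.7, 5.9]
scaled = min_max_scale(lengths)
train, test = split_70_30(scaled)
print(len(train), "training values,", len(test), "testing values")
```

Here the minimum and maximum are taken over the whole data set before splitting, which follows the description of $X_{min}$ and $X_{max}$ as extremes of the training and testing data.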
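Standardization has the advantage of eliminating the effect of absolute magnitude, to which neural networks are sensitive. The performances of the different models were evaluated by using the determination coefficient ($R^{2}$), mean square error (MSE), mean absolute deviation (MAD) and mean absolute percentage error (MAPE):

$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(LA_{mea,i} - \overline{LA}_{mea}\right)\left(LA_{est,i} - \overline{LA}_{est}\right)\right]^{2}}{\sum_{i=1}^{n}\left(LA_{mea,i} - \overline{LA}_{mea}\right)^{2}\,\sum_{i=1}^{n}\left(LA_{est,i} - \overline{LA}_{est}\right)^{2}}$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(LA_{mea,i} - LA_{est,i}\right)^{2}$$

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|LA_{mea,i} - LA_{est,i}\right|$$

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{LA_{mea,i} - LA_{est,i}}{LA_{mea,i}}\right|$$

where $LA_{mea}$ is the measured LA value; $LA_{est}$ is the estimated LA value; $\overline{LA}_{mea}$ and $\overline{LA}_{est}$ represent the average values of measured and estimated LA, respectively; and $n$ is the number of data considered. Higher values of $R^{2}$ and lower values of MSE and MAD indicate better forecast accuracy. $R^{2}$ measures the extent of the linear association between two variables and ranges from 0 to 1. A value of 1 indicates perfect agreement between the measured and predicted values, and a value of 0 indicates no agreement, that is, poor model performance. If MAPE values are less than 10%, the model is considered to have a high degree of accuracy (Lewis, 1982).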
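The four evaluation criteria can be computed as in the sketch below. It assumes the usual textbook definitions: $R^{2}$ as the squared Pearson correlation between measured and estimated LA, MSE as the mean squared difference, MAD as the mean absolute difference, and MAPE as the mean absolute relative difference in percent. The sample values are hypothetical.

```python
def mse(measured, estimated):
    """Mean square error."""
    return sum((m - e) ** 2 for m, e in zip(measured, estimated)) / len(measured)

def mad(measured, estimated):
    """Mean absolute deviation."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)

def mape(measured, estimated):
    """Mean absolute percentage error (measured values must be non-zero)."""
    total = sum(abs((m - e) / m) for m, e in zip(measured, estimated))
    return 100.0 * total / len(measured)

def r_squared(measured, estimated):
    """Squared Pearson correlation between measured and estimated values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_m * var_e)

# Hypothetical measured and estimated leaf areas in cm^2.
la_mea = [10.0, 20.0, 30.0]
la_est = [12.0, 18.0, 33.0]
print(mse(la_mea, la_est), mad(la_mea, la_est),
      mape(la_mea, la_est), r_squared(la_mea, la_est))
```

Under the threshold cited from Lewis (1982), a MAPE below 10% would indicate a highly accurate model.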

MLR model
Collinearity between L and W was evaluated before the LA models were developed for the different pear cultivars. As the VIF values were less than 10, there was no collinearity problem between L and W. Thus, both variables could be included in the LA models. The best MLR models for the different pear cultivars are presented in Table 2. As seen in Table 2, both L and W are required to estimate pear LA. The results showed that all MLR models provide a high degree of correlation between LA and L and W. The $R^{2}$ values were between 0.932 and 0.992, MAPE values between 3.043% and 9.235%, MSE values between 0.781 and 4.151, and MAD values between 0.641 and 1.666 in all MLR models. Among these MLR models, the model for the 'Deveci' pear cultivar [LA = -0.837 + 0.722LW] was the best model based on the selection criteria (higher $R^{2}$, and lowest

ANN model
The second approach to estimating LA is the ANN method. The best ANN models for training and testing are presented in Table 3. According to the findings, MLP (2,3,1, LM) gives the best results for 'Deveci', with MAPE values of 4.293 and 3.426, MSE values of 0.524 and 0.530, MAD values of 0.539 and 0.500, and $R^{2}$ values of 0.995 and 0.986 for training and testing, respectively. This means that the ANN model was able to explain 99.50% of the variability in LA in the training data and 98.60% of the variability in the testing data when L and W were used as inputs. Results for the training and testing phases of all pear cultivars and of the 'Deveci' cultivar are also given graphically in Fig. 4. As can be seen in Table 3, 'Akça' gives the worst estimate, with the highest MSE and MAD. The weight and bias values used in the formulation of MLP (2,3,1, LM) are given in Tables 4 and 5.

Discussion
Leaf area is associated with many agronomical and physiological processes, including photosynthesis, respiration, transpiration, canopy light interception, irrigation, fertilization, pruning, fruit thinning and water use efficiency. As there is an integral relationship between leaves and many of these physiological processes, it is often necessary to estimate LA non-destructively in order not to disturb the system, or to allow for multiple measurements to be made. With direct methods, by contrast, the excision of leaves from the plants is necessary. Furthermore, it is then not possible to make successive measurements of the same leaf, and the damage to the plant canopy causes problems in other measurements (Tsialtas & Maslaris, 2005). This is why several combinations of measurements and models relating L and W to LA have been developed for several fruit trees; however, until now, no models have been developed for pears.
In this study, we developed a simple, easy and reliable LA prediction model for pear cultivars using MLR. There were no significant differences between the predicted and actual LA in any of the pear cultivars. For this reason, the model can be reliably used in pear physiological studies and in research aiming to develop more productive and efficient pruning and training methods. In previous studies, Kumar et al. (1977) and Spann & Heerema (2010) developed LA prediction models for some fruit species including pears. Kumar et al. (1977) determined that the K factor in the pear was between 0.706 and 0.756 and showed that the formula can be used reliably. Spann & Heerema (2010), who tried to develop an LA prediction model to determine the total LA on a shoot for some fruit species, reported an $R^{2}$ of 0.93 for the model developed for the 'Nijiseiki' pear cultivar. In our study, the confidence coefficient of the LA model was determined both for individual cultivars and in general, and the model was found to be more reliable than those in the previous studies. In addition, the size of the leaves on the plant may vary depending on the position of the leaf on the shoot or spur. Since the leaves on the spurs are closely spaced, the necessity of determining their area individually has been emphasized (Heerema et al., 2008). The estimation model obtained in this study was more effective than those of the previous studies on pears in predicting the area of leaves from all parts of the plant, and the study included only pears. In addition, the models identified in this study were validated. Validation of LA estimation models is very important for their reliability (Demirsoy & Lang, 2010).
Confidence in the precision of these models would provide researchers with a reasonably fast and inexpensive method to use in studies on plant physiology such as respiration, transpiration and photosynthesis (Demirsoy & Demirsoy, 2003), grape (Williams & Martinson, 2003; Tsialtas et al., 2008), peach (Demirsoy et al., 2004), strawberry (Mandal et al., 2002; Demirsoy et al., 2005), chestnut (Serdar & Demirsoy, 2006), hazelnut, kiwi (Zenginbal et al., 2007), medlar, persimmon, pecan (Torri et al., 2009), citrus (Mazzini et al., 2010), pomegranate (Meshram et al., 2012), walnut (Keramatlou et al., 2015), and apricot (Cirillo et al., 2017). ANNs are applied to many objectives in agricultural research, such as crop yield and fruit weight prediction, evapotranspiration prediction, soil parameter estimation, water demand forecasting and hydrological forecasting (Shabani et al., 2017). ANNs are therefore becoming a popular tool for modelling complex input-output dependencies (Maren et al., 1990). Different researchers have shown that ANN models often give better results than traditional methods (Moosavi & Sepaskha, 2012). This ability stems from modelling the complexity and nonlinearity that traditional statistical regression models overlook, and from the architecture of an ANN, which allows highly correlated inputs to be used to enhance its modelling capability. ANN models are thus becoming a widespread prediction method, because some statistical assumptions that are essential for regression models are not required (Ercanlı et al., 2018).
In addition to MLR, we developed ANN models to estimate LA. When the results of the two methods were compared, ANN models with L and W as inputs were better than MLR for LA estimation according to the MSE, MAD, MAPE and $R^{2}$ criteria in both the training and testing phases (Tables 2 and 3). As the ANN models use non-linear relations between input and output data, they estimated LA with a higher degree of accuracy than MLR. In the prediction of LA for different pear cultivars, ANNs can thus be used in addition to MLR as an effective tool to circumvent difficulties in the direct measurement of LA in the laboratory. Similar results were obtained by Ahmadian-Moghadam (2012), Küçükönder et al. (2016), and Kumar et al. (2017). From these results, it can be concluded that ANN can be used for all plants, whereas other methods require a specific equation to be prepared for each plant.
In conclusion, different learning algorithms of ANNs and MLR were employed in this study, and their performance in estimating LA from L and W was assessed. The results of the ANNs were compared statistically with those of MLR and with one another using $R^{2}$, MSE, MAD and MAPE. Comparative analysis of the results indicated that all models generally performed well for LA estimation during the training and testing phases. The predictions of all ANN and MLR models were good, with ANN models being slightly better than MLR in predicting LA. Consequently, in the prediction of LA for different pear cultivars, ANNs can be used in addition to MLR as an effective tool to circumvent difficulties met in the direct measurement of LA in the laboratory.