An artificial neural network model to predict the effective work time of different agricultural field shapes

The aim of this study was to find a model able to extract the net time per unit of net worked area for different basic agricultural field shapes (square, circle, rectangle and triangle), considering the following variables: field gross area, working speed, number of turnings (which depends on the effective working width), side length parallel and orthogonal to the working direction, and working direction type. Since this is a non-linear problem, an approach based on artificial neural networks is proposed. The model was trained using an artificial dataset calculated for the various shapes (internal test) and then tested on 47 different agricultural operations extracted from a real field dataset for the estimation of the net time (external test). The net time records obtained from the trained model and from the external test were correlated, and the performance parameter r was extracted. The correlation coefficients (r) for both the training and the internal test were excellent, being equal to 0.98, compared with the traditional linear approach (0.13). The variable "number of turnings" had the highest impact on the net time estimation, with a value of 44.34%. Finally, the correlation coefficient r for the external test was high (0.80). This information is very valuable for the use of information management systems in precision agriculture.
Additional keywords: multivariate statistics; precision agriculture; non-linear modelling; logistics; agricultural productivity.
Abbreviations used: ANN (artificial neural networks); CIOSTA (Commission Internationale de l’Organisation Scientifique du Travail en Agriculture); ET (effective work time); EWW (effective working width); FGA (field gross area); FS (field shape); GIS (Geographic Information Systems); GRNN (generalized regression neural network); Harea (headland area); HW (headland width); MLR (multiple linear regression); NT (net time); NW (Net Worked area); PN (passes number); RDP (Rural Development Programs); RMSE (Root Mean Squared Error); RWWS (return without working speed); SL (side length); SLO (opposite side length); SPXY (sample set partitioning based on joint x-y distances); TAT (tractor turning-around time); TL (turning length); TN (turnings numbers); TPL (total pass length); TS (turning speed); WDT (working direction type); WS (working speed).
Authors’ contributions: All the authors equally contributed to the writing of the paper and to its content.
Citation: Fedrizzi, M.; Antonucci, F.; Sperandio, G.; Figorilli, S.; Pallottino, F.; Costa, C. (2019). An artificial neural network model to predict the effective work time of different agricultural field shapes. Spanish Journal of Agricultural Research, Volume 17, Issue 1, e0201. https://doi.org/10.5424/sjar/2019171-13366
Received: 23 Apr 2018. Accepted: 07 Mar 2019.
Copyright © 2019 INIA. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC-by 4.0) License.
Funding: Italian Ministry of Agriculture, Food and Forestry Policies, MiPAAF (project AGROENER, D.D. n. 26329).
Competing interests: The authors have declared that no competing interests exist.
Correspondence should be addressed to Costa Corrado: corrado.costa@crea.gov.it


Introduction
The shape and structure of agricultural fields significantly affect farm profitability. The proper setup and arrangement of farm parcels is the primary criterion enabling an efficient production strategy (Gawroński & Jasnowska, 2007). Growing spatial irregularities in field structure have negatively affected agricultural operations, increasing the time required and forcing a rearrangement of the relevant works (Schultz, 1964). The availability of technology now allows the use of advanced solutions to improve the efficiency of field operations (Kwinta & Gniadek, 2017). For example, most agricultural field operations involve a number of highly interconnected tasks executed by co-operating heterogeneous agricultural machines. Such a multiple-machinery system is normally involved both in output material flow operations (such as harvesting) and in input material flow ones (e.g., spraying and fertilizing). These operations require considerable effort in terms of management and planning (Bochtis & Sørensen, 2009). Some research works, mainly related to the implementation of automated agricultural navigation systems, tackle the optimization of the working time through linear and non-linear algorithms (Backman et al., 2012; Kraus et al., 2013; Kayacan et al., 2014). The implementation of technologically advanced solutions proposed by current research studies (i.e., modern data processing algorithms contained in GIS software) can help to build decision-making tools that improve the arrangement of agricultural work.
The efficiency of field operations has traditionally been analyzed by manually measuring the time spent on each agricultural field operation, together with factors such as the number of turnings, the distance travelled in headlands and the vehicle characteristics (for example, the larger the turning radius, the larger the headlands and the overhead). The estimation of the operating costs of agricultural and forestry machinery is a key factor in both planning agricultural policies and farm management, and the time spent in field operations is strictly necessary for this estimation (Guerrieri et al., 2016).
Generally, agricultural fields are not uniform in shape. As reported by Oksanen (2013), for rectangular fields the analysis of agricultural operation efficiency and the path planning for machines are fairly straightforward and can be copied from field to field. However, in regions where agricultural lands are not rectangular, no two identical shapes exist, and the coverage path planning has to be specific to each agricultural unit. Nevertheless, complex shapes can often be approximated by a combination of simple shapes.
In this study, only some items of the Commission Internationale de l'Organisation Scientifique du Travail en Agriculture (CIOSTA) method (Manfredi, 1971; Biondi, 1999) were taken into consideration for the estimation of agricultural operation efficiency starting from field shape. These items are the effective work time (ET) and the tractor turning-around time (TAT), which together represent the net time (NT). These variables (ET, TAT and NT) were determined by means of theoretical methods for agricultural fields with different surface areas (from 1 ha to 50 ha) and shapes: square, circle (quarter of a circle and semicircle), rectangle and triangle (isosceles and scalene).
The aim of this study was to find a model able to extract the NT from agricultural fields of different basic shapes, using variables that can be easily collected by the farmer. Since this is a non-linear problem, an artificial neural network (ANN) approach is proposed. This method allows a higher performance in predicting NT than a multivariate linear approach and could serve to implement other similar multivariate approaches used to estimate costs, consumption and emissions. The model was trained using an artificial dataset calculated for different shapes and was then tested on a real field dataset.

Data collection
An artificial model was constructed to calculate the NT. The concept of an artificial model was developed by Abramo et al. (2015) and consists in combining fixed values (qualitative or quantitative) of each variable in a combinatorial fashion to cover the potential variability of a real dataset. This dataset is composed of four independent quantitative variables, two qualitative variables and three constants. The quantitative variables are: field gross area (FGA; m²), working speed (WS; m s⁻¹), effective working width (EWW; m) and the tractor speed during the return without working (RWWS; m s⁻¹). The qualitative variables are: field shape (FS; Fig. 1) and working direction type (WDT; one-way or not, where one-way means that work is carried out only on the outward pass and not on the return pass). The constants are: turning speed (TS; 0.84 m s⁻¹), turning length (TL; 15 m) and headland width (HW; 5 m). For the sake of simplicity, each turning was considered to have a length of 15 m for any type of work, a value empirically estimated from the tractor manoeuvring in the headland. All these independent variables and constants were used to calculate the NT per unit of net worked area (h ha⁻¹). Ten basic shapes were chosen. The fixed values for each independent variable/constant used for the construction of the artificial dataset are reported in Table 1.
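The combinatorial construction described above can be sketched as a Cartesian product of the fixed levels of each variable. This is only a minimal illustration: the levels listed for area, speeds and working width are hypothetical placeholders (they are not the Table 1 values), and the names `build_grid` and the dictionary keys are assumptions introduced here. Only the three constants come from the text.

```python
from itertools import product

# Hypothetical levels, for illustration only (NOT the Table 1 values).
FIELD_SHAPES = list(range(1, 11))                 # ten basic shapes (Fig. 1)
WORKING_DIRECTION_TYPES = ["one-way", "not one-way"]
FIELD_GROSS_AREAS_M2 = [10_000, 50_000, 100_000]  # assumed levels
WORKING_SPEEDS_MS = [1.0, 2.0]                    # assumed levels
EFFECTIVE_WORKING_WIDTHS_M = [2.0, 4.0]           # assumed levels
RETURN_SPEEDS_MS = [2.0, 3.0]                     # assumed levels

# Constants stated in the text.
TURNING_SPEED_MS = 0.84
TURNING_LENGTH_M = 15.0
HEADLAND_WIDTH_M = 5.0


def build_grid():
    """Return one record per combination of the fixed levels."""
    keys = ("field_shape", "wdt", "fga_m2", "ws_ms", "eww_m", "rwws_ms")
    levels = (
        FIELD_SHAPES,
        WORKING_DIRECTION_TYPES,
        FIELD_GROSS_AREAS_M2,
        WORKING_SPEEDS_MS,
        EFFECTIVE_WORKING_WIDTHS_M,
        RETURN_SPEEDS_MS,
    )
    return [dict(zip(keys, combo)) for combo in product(*levels)]


records = build_grid()
print(len(records))  # number of artificial records for these placeholder levels
```

With these placeholder levels the grid size is simply the product of the number of levels per variable; the real dataset size depends on the levels actually reported in Table 1.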
For each basic shape, the artificial estimation of NT was based on the following formulas and on Fig. 2, which represents an example field shape (a rectangle with side ratio 1:2 and working direction parallel to the longer side, shape 3 in Fig. 1):

(1)
where PN is the number of passes and SL is the side length orthogonal to the working direction (AB in Fig. 2).

(2)
where TN is the number of turnings.

(3)
where Harea is the headland area, SLO is the opposite side length orthogonal to the working direction (CD in Fig. 2) and HW is AA₁ or CC₁ (equal to 5 m) in Fig. 2.

(4)
where NWarea is the net worked area (area A₁B₁C₁D₁ in Fig. 2) and gross area is the total field surface (area ABCD in Fig. 2).

(5)
where TPL is the total pass length.
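The equation bodies are not reproduced in this text. As a minimal sketch, one reading that is consistent with the variable definitions above can be written for a rectangular field worked back and forth (not one-way). Every formula in the docstring is an assumption made for illustration, not the authors' stated equation; only the constants TS, TL and HW and the statement that NT is the sum of ET and TAT come from the text. The function name `net_time_per_hectare` is hypothetical.

```python
import math

# Constants stated in the text.
TURNING_SPEED_MS = 0.84   # TS
TURNING_LENGTH_M = 15.0   # TL
HEADLAND_WIDTH_M = 5.0    # HW


def net_time_per_hectare(sl_m, length_m, eww_m, ws_ms):
    """NT per unit of net worked area (h/ha) for a rectangular field.

    ASSUMED formulas (illustrative, not taken from the paper):
      PN     = ceil(SL / EWW)
      TN     = PN - 1
      Harea  = (SL + SLO) * HW, with SLO = SL for a rectangle
      NWarea = gross area - Harea
      TPL    = PN * (length - 2 * HW)
      NT     = ET + TAT = TPL / WS + TN * TL / TS
    """
    pn = math.ceil(sl_m / eww_m)
    tn = pn - 1
    headland_area_m2 = (sl_m + sl_m) * HEADLAND_WIDTH_M
    net_worked_area_m2 = sl_m * length_m - headland_area_m2
    tpl_m = pn * (length_m - 2.0 * HEADLAND_WIDTH_M)
    effective_time_s = tpl_m / ws_ms                            # ET
    turning_time_s = tn * TURNING_LENGTH_M / TURNING_SPEED_MS   # TAT
    net_time_h = (effective_time_s + turning_time_s) / 3600.0
    return net_time_h / (net_worked_area_m2 / 10_000.0)


# A 2 ha rectangle (100 m x 200 m) worked parallel to the longer side.
nt = net_time_per_hectare(sl_m=100.0, length_m=200.0, eww_m=4.0, ws_ms=2.0)
print(f"{nt:.3f} h/ha")
```

Under these assumptions the turning term grows with the number of passes, so narrower working widths or wider fields (larger SL) increase NT even when the gross area is unchanged.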
In some field shapes (1 to 4, 6, 8 and 9 in Fig. 1), it was necessary to create two headlands on opposite sides, while in the semicircle (5 in Fig. 1) and in the right-angled triangles with working direction orthogonal to the hypotenuse (7 and 10 in Fig. 1), the headlands develop along the field margins. In the triangular shapes (6 to 10 in Fig. 1), the quarter circle (2 in Fig. 1) and the semicircle (5 in Fig. 1), where the pass length is progressively reduced, work was assumed to stop when the length of the straight pass became less than 15 m.
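The stopping rule for shapes with progressively shorter passes can be illustrated with a short sketch for a right-angled triangle worked parallel to one cathetus. The geometry is a simplifying assumption made here: pass lengths are measured along the pass centre line, passes start from the cathetus parallel to the working direction, and headlands are ignored. Only the 15 m threshold comes from the text; the function name is hypothetical.

```python
MIN_PASS_LENGTH_M = 15.0  # threshold stated in the text


def triangle_pass_lengths(parallel_cathetus_m, orthogonal_cathetus_m, eww_m):
    """Lengths of successive passes in a right-angled triangle.

    ASSUMPTIONS (illustrative): centre-line pass length, first pass next
    to the cathetus parallel to the working direction, no headlands.
    """
    lengths = []
    k = 0
    while True:
        # Distance of the k-th pass centre line from the starting cathetus.
        offset_m = (k + 0.5) * eww_m
        # Similar triangles: the pass shortens linearly with the offset.
        length_m = (
            parallel_cathetus_m
            * (orthogonal_cathetus_m - offset_m)
            / orthogonal_cathetus_m
        )
        if length_m < MIN_PASS_LENGTH_M:
            break  # work stops once a pass would be shorter than 15 m
        lengths.append(length_m)
        k += 1
    return lengths


passes = triangle_pass_lengths(100.0, 100.0, 4.0)
print(len(passes), passes[0], passes[-1])
```

The number of passes returned this way is smaller than the simple ratio of side length to working width, because the passes near the acute corner are discarded.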
For the construction of the artificial dataset used to build the model, some fixed values were assigned to each variable/constant (Table 1). These values were chosen on the basis of direct experience in Italy and cover the most realistic range of observations. The full combination of these fixed values yields a total of 8400 artificial records.
Table 1 (partial)
Working direction type: one-way; not one-way
Turning length (m): 15
Headland length (m): 5
Fig. 1 shows the ten different agricultural field shapes and the working directions considered in this study. In particular: (1) square with working direction parallel to the side; (2) quarter of a circle with working direction parallel to the vertical radius; (3) rectangle with side ratio 1:2 and working direction parallel to the longer side; (4) rectangle with side ratio 1:2 and working direction parallel to the shorter side; (5) semicircle with working direction orthogonal to the diameter; (6) isosceles right-angled triangle with working direction parallel to a cathetus; (7) isosceles right-angled triangle with working direction parallel to the height on the hypotenuse; (8) scalene right-angled triangle with working direction parallel to the longer cathetus; (9) scalene right-angled triangle with working direction parallel to the shorter cathetus; (10) scalene right-angled triangle with working direction parallel to the height on the hypotenuse.
Figure 2. A rectangular field shape example (3 in Figure 1) used to explain the formulas for the NT calculation. The areas below A₁ and B₁ and above C₁ and D₁ represent the headlands; the central area represents the net worked area (NWarea).

Artificial neural networks (ANNs)
From the calculations performed on the artificial dataset, only seven variables were chosen. These variables are not directly related to the NT estimation, but they were chosen to build the model because they are easily known by farmers or easily determined empirically (e.g., speed or field size). In detail, five quantitative variables were selected (field gross area, working speed, number of turnings, side length parallel to the working direction and side length orthogonal to the working direction) together with two qualitative ones [field shape (Fig. 1) and working direction type (one-way or not, where one-way means that work is carried out only on the outward pass and not on the return pass)]. The seven independent variables represent the x-block. This artificial dataset (7 variables and 8400 records) was used to build a model to estimate the NT per unit of net worked area (h ha⁻¹) (y-block). The NT estimations of basic shapes could be combined to obtain the NT of complex ones. The model for NT estimation was built using an ANN approach, a non-linear regression solution. Since the database is composed of a series of qualitative and quantitative variables, a non-linear approach is the best way to find a regression solution. The ANN was built on the input layer (x-block) to estimate the output layer (y-block). Between the input and the output layers, one or more hidden layers were built by the ANN procedure according to its architecture. The optimal number of neurons in the hidden layers is usually determined iteratively according to the type and complexity of the process or experiment (Gupta, 2013).
The ANN model was developed using a generalized regression neural network (GRNN) structure, a method often used for function approximation (Specht, 1991). The probability density function used in the GRNN is the normal distribution. The GRNN was trained with a back-propagation learning algorithm. To avoid overfitting, only 6720 of the 8400 artificial dataset records (80%) were used to construct the GRNN model. The remaining 1680 samples (20%) were then used to test the performance of the GRNN model (internal test). The partitioning was conducted using the SPXY algorithm (Harrop Galvao et al., 2005), which considers the variability in both the x- and y-spaces. The GRNN was trained using a learning rate of 0.5 and a momentum of 0.1. The training procedure was repeated 1,000,000 times and the best performing GRNN was selected based on the independent test set. The final architecture of the GRNN includes 7 nodes in the hidden layer. Performance parameters, such as the correlation coefficient r between observed and predicted values and the root mean squared error (RMSE), were reported for both the training and test sets. A neural network variable impact analysis was performed to assess the relative importance of each variable (Abdou et al., 2012). Operationally, this index is similar to the Variable Importance in the Projection scores used in linear regression (Chong & Jun, 2005; Febbi et al., 2015).
The GRNN model performance (correlation coefficient r and RMSE) on the artificial dataset (training and internal test) was compared with that of a multiple linear regression (MLR) model applied to the same partitioned dataset. Ordinary linear regression approaches, such as MLR, are widely used in the agricultural and forestry frameworks for the estimation of quantitative parameters (Costa et al., 2012). MLR is the most common form of linear regression analysis, generally used to explain the relationship between one dependent variable (y-block) and two or more independent variables (x-block). The trained GRNN was tested on 47 different agricultural operations (external test) collected directly in the field.

Results
Table 3 shows the performance of the GRNN model (training and test) in estimating the NT, constructed from the artificial dataset. The correlation coefficients (r) of both the training (80% of the sample size) and the internal test (20% of the sample size) are excellent, being equal to 0.98 (R² = 0.98). The RMSE values (training and test) were also very low, being equal to 1.008 and 1.020 respectively. The very low r values obtained by the MLR model (0.14 for the training dataset and 0.13 for the internal test) confirm that the GRNN model performed best, that the problem was non-linear, and that the x-block was not strictly related to the y-block (i.e., NT). Fig. 3 shows the scatter plots of the observed vs predicted NT obtained from the GRNN model for both training (left side of the figure) and test. For both the training and the internal test regressions between observed and GRNN-predicted NT values, the records lie very close to the bisector (i.e., perfect attribution), confirming the high r values (Table 3).
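The comparison between a GRNN and a linear model, summarized by r and RMSE, can be illustrated on a toy one-dimensional problem. This sketch is not the authors' model: it uses the GRNN in its standard kernel-regression form (a Gaussian-weighted mean of the training targets, after Specht, 1991) and does not reproduce the back-propagation training, learning rate, momentum or hidden-layer size reported in the methods. The synthetic target y = x², the smoothing parameter `sigma` and all function names are assumptions introduced for illustration.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Gaussian-kernel weighted mean of the training targets (1-D input)."""
    weights = [
        math.exp(-((x - query) ** 2) / (2.0 * sigma ** 2)) for x in train_x
    ]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Synthetic non-linear target (assumed for illustration): y = x^2.
train_x = [-1.0 + 0.1 * i for i in range(23)]
train_y = [x * x for x in train_x]
test_x = [-0.95 + 0.1 * j for j in range(22)]
test_y = [x * x for x in test_x]

grnn_pred = [grnn_predict(train_x, train_y, x, sigma=0.1) for x in test_x]
slope, intercept = fit_line(train_x, train_y)
linear_pred = [intercept + slope * x for x in test_x]

r_grnn, r_linear = pearson_r(test_y, grnn_pred), pearson_r(test_y, linear_pred)
rmse_grnn, rmse_linear = rmse(test_y, grnn_pred), rmse(test_y, linear_pred)
print(f"GRNN:   r={r_grnn:.3f}  RMSE={rmse_grnn:.3f}")
print(f"Linear: r={r_linear:.3f}  RMSE={rmse_linear:.3f}")
```

On a target of this kind the linear fit captures little of the variation, while the kernel-based estimate follows the curve closely, which is the qualitative pattern the r values in Table 3 describe.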
Regarding the variable impact on the GRNN model (Fig. 4), it should be noted that the "number of turnings" has the highest impact (44.34%) on the NT estimation. This variable is followed by the "side length orthogonal to working direction" and the "side length parallel to working direction" (18.49% and 11.04% respectively), which also have a high impact. Field gross area made a lower but still relevant contribution (3.70%). Fig. 5 shows the scatter plot of the observed vs predicted NT obtained from the GRNN model applied to the external test. As also reported in Fig. 5, the coefficient of determination is fairly high (R² = 0.63). The points that lie farthest from the bisector belong to a shredding operation and two kinds of plowing (simple and two-furrow plow, one-way working). This is probably because the work was carried out on a small area with a particular shape (quarter circle).
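How a percentage impact per variable can be obtained is sketched below with permutation importance. This is a swapped-in, generic technique used here only for illustration; it is not the variable impact analysis of Abdou et al. (2012) applied in the study. The toy model, the data and the function name `permutation_impact` are assumptions.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def permutation_impact(predict, rows, targets, seed=0):
    """Percentage impact of each input column, from the RMSE increase
    observed when that column alone is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(targets, [predict(row) for row in rows])
    increases = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + (v,) + row[j + 1:] for row, v in zip(rows, column)]
        error = rmse(targets, [predict(row) for row in shuffled])
        increases.append(max(error - baseline, 0.0))
    total = sum(increases)
    return [100.0 * inc / total if total > 0 else 0.0 for inc in increases]


# Toy model (assumed): depends strongly on column 0, weakly on column 1,
# and not at all on column 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]


rows = [(float(i), float((7 * i) % 5), float((3 * i) % 4)) for i in range(20)]
targets = [toy_model(row) for row in rows]
impact = permutation_impact(toy_model, rows, targets)
print([round(v, 2) for v in impact])
```

The percentages sum to 100, so they can be read in the same way as the shares reported in Fig. 4, although the underlying index is different.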

Discussion
This work makes an important contribution to predicting the effective working time needed, considering different basic field shapes and the main agricultural operations. The working time calculation is directly related to farm productivity. The ability to predict in advance the operational agricultural efficiency (i.e., fuel consumption, emissions, pollution, labor, costs, consumption, production, etc.) is an important task that contributes to optimizing the use of resources. This is a major aim in advanced precision agriculture (Lundström & Lindblom, 2016). The estimation of operating costs of agricultural machinery and the definition of economic competitiveness gap (conditionality standards on agricultural farms and short- and medium-term business [...] (Guerrieri et al., 2016). Furthermore, the work shows the advantage of using a non-linear approach compared with a linear one. As confirmed by the comparison of the ANN approach with the MLR one, the relationship between the independent variables and the dependent one (i.e., NT) is non-linear. This is an unexpected result considering the basic geometrical nature of the problem. A linear multivariate approach (MLR) applied to the artificial dataset returned poor performance in predicting the NT (r values lower than 0.15). The non-linear (ANN) approach, on the other hand, returned excellent performance (r values equal to 0.98 and 0.80 for the internal and external tests respectively), which is also interesting from an applicative point of view. These applications could involve the precision agriculture framework. The introduction of automatic guidance of agricultural production machines to improve the accuracy of field operations (Kayacan et al., 2014) requires the implementation of specific algorithms for the optimization of path planning and, more generally, of the working time. Some authors have hence identified non-linear approaches to better fulfil these aims (Backman et al., 2012; Kraus et al., 2013; Kayacan et al., 2014).
Generally, the shape analysis of agricultural fields is of interest in many agricultural and farm management areas, and most of the few studies in the literature concern the analysis of land fragmentation for rural control and administration purposes. Janus & Taszakowski (2015) reported that the functioning conditions of agriculture are closely related to the spatial structure of rural areas, one of the most important factors influencing the profitability of agricultural production. For example, Demetriou et al. (2013) presented a new parcel shape index which integrates GIS with a decision-making method. The study of Gąsiorowski & Bielecka (2014) contributed an area-wide quantitative statistical description and classification of morphometric parameters such as the shape, area and slope of existing agricultural parcels. In this light, the present study contributes by providing a model for extracting the NT, which is directly proportional to the agricultural efficiency. In detail, the analysis starts from basic field shapes (square, circle, rectangle and triangle) and from other easily collected parameters (i.e., field gross area, working speed, number of turnings, side length parallel and orthogonal to the working direction, and working direction type). These variables are easy for operators to collect and, for this reason, the approach may prove extremely useful for both farmers (in terms of economic advantages) and policy makers at the institutional level.
The shapes taken into consideration are very simple (Fig. 1), but they could be combined with each other to analyze complex ones. A complex shape could be represented as a combination of many simple shapes, and its NT could be obtained by adding the NTs calculated on the simple shapes. As reported in Fig. 4, the proposed applicative model is mainly influenced (variable impact in the GRNN) by parameters such as the "number of turnings" (44.34%), the "side length orthogonal to working direction" (18.49%) and the "side length parallel to working direction" (11.04%). This means that the number of turnings is more important than the other variables because it depends on the working width. For this reason, the more complex the shape, the more passes and turnings (TAT) are required. The field shape variable could also affect the accuracy of the NT estimation in the model, albeit with a smaller but still relevant percentage (7.56%; Fig. 4), with respect to the one-way working type and the field gross area.
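The combination of simple shapes can be made concrete with a short sketch. Since NT is expressed per unit of net worked area (h ha⁻¹), the sketch assumes that the working times of the parts are added after multiplying each NT by its own net worked area; this area weighting is an assumption made here, as are the function name and the example values.

```python
def combine_simple_shapes(parts):
    """Combine simple shapes into one complex field.

    parts: list of (nt_h_per_ha, net_worked_area_ha) tuples, one per
    simple shape. ASSUMPTION: total time is the area-weighted sum.
    Returns (total_time_h, combined_nt_h_per_ha).
    """
    total_time_h = sum(nt * area for nt, area in parts)
    total_area_ha = sum(area for _, area in parts)
    return total_time_h, total_time_h / total_area_ha


# Hypothetical field split into a rectangle and a triangle.
total_h, combined_nt = combine_simple_shapes([(0.5, 2.0), (1.0, 1.0)])
print(f"{total_h:.2f} h in total, {combined_nt:.3f} h/ha")
```

The combined NT always lies between the lowest and highest NT of the parts, weighted towards the larger ones.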
Figure 5. Scatter plot of the observed vs predicted net time (NT) obtained from the GRNN model applied to the external test real field data, reporting the R². The line represents the bisector (i.e., perfect attribution). For the identification of the individual agricultural operations, the average time per hectare values are reported in Table 2.
In conclusion, the present study provides a model for extracting the NT (directly proportional to the previously mentioned agricultural efficiency) starting from basic field shapes (square, circle, rectangle and triangle) and from other easily collected parameters (i.e., field gross area, working speed, number of turnings, side length parallel and orthogonal to the working direction, and working direction type). The results of the study could be used to implement models predicting fuel consumption and costs. Indeed, NT is estimated with high precision through a model built with easily collectable data. This kind of modelling approach could be implemented on a web platform and made available to all the stakeholders