Multiple Linear Regression (MLR), Artificial Neural Network (ANN) and Rosetta models were employed to develop pedotransfer functions (PTFs) for predicting soil moisture from readily available soil properties for soils of northern Iran. The Rosetta model, which is based on ANN, uses a hierarchical approach to predict water retention curves. For this purpose, 240 soil samples were collected from the Gilevan region in the south of Guilan province, northern Iran. The data set was divided into two subsets for calibration and testing of the models. The general performance of the PTFs was evaluated using the coefficient of determination (R^{2}), root mean square error (RMSE) and mean bias error (MBE) between the observed and predicted values. Results showed that ANN with two hidden layers, Tan-sigmoid and linear functions for hidden and output layers respectively, performed better than the others in predicting soil moisture. Because ANN can model non-linear functions, it performed better than MLR. After ANN, MLR had better accuracy than Rosetta. The developed PTFs gave more accurate estimates at matric potentials of 100, 300, 500, 1000 and 1500 kPa, whereas the Rosetta model gave slightly better estimates than the derived PTFs at a matric potential of 33 kPa. This research can provide a scientific basis for the study of soil hydraulic properties and can help in estimating soil water retention in other places with similar conditions.

Soil hydrodynamic properties drive the flow of water in the soil-plant-atmosphere system, and hence control processes such as aquifer recharge or nutrient fluxes between soil and vegetation. Knowledge of soil hydrodynamics is important for modeling physical processes related to soil water content. Despite great advances in measurement methods, it is still difficult to determine soil hydraulic properties accurately, especially for undisturbed soils and in the dry range. Moreover, measuring soil hydraulic properties is time-consuming, labor-intensive and expensive (

Recently, an alternative approach, the indirect estimation of soil hydraulic properties from widely available or more easily measured basic soil properties using pedotransfer functions (PTFs), has attracted considerable attention from researchers in a variety of fields, including soil scientists, hydrologists, and agricultural and environmental engineers (

PTFs are classified in three main groups,

At present, there are two common methods to develop PTFs for point and/or function estimations, which are the Multiple Linear Regression (MLR) method (

An advantage of ANN, as compared to other methods, is that it does not require a priori models. The optimal, possibly nonlinear, relations which link the input data (bulk density, particle-size data, etc.) to output data (soil water retention, FWC, etc.) are obtained and implemented in an iterative calibration procedure (

However, this method also has some significant disadvantages which must be taken into consideration (

In order to make the PTFs as widely applicable as possible,

Although there are many studies on developing and using PTFs as listed above, there is no universal method for the prediction of soil hydraulic parameters. Moreover, the existing PTFs for the estimation of soil hydraulic properties in the literature were not always applicable in other regions with acceptable accuracy (

The study area is located in the Gilevan region in the south of Guilan province, northern Iran (36° 54′ 10″ to 36° 50′ 00″ N, 49° 02′ 30″ to 49° 16′ 08″ E) (

In total, 240 samples were taken from the 0–30 cm depth and air dried. Some clods were used to measure soil bulk density by the clod method. Samples were passed through a 2 mm sieve, and particle-size distribution was determined by the pipette method in combination with sieving. The organic matter content was analyzed with the Walkley-Black method. Calcium carbonate was determined by the calcimetry method (

The most common method used in point PTFs is to employ MLR.

where θ_{p} is the soil water content at a specific matric potential; a_{0} is the regression constant; a_{1}, a_{2}, a_{3}, a_{4} and a_{5} are the regression coefficients; and OC and BD represent the organic carbon content and bulk density, respectively.
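A point PTF of this linear form can be fitted by ordinary least squares. The sketch below is illustrative only: the function names, the assumption of five predictors (one per coefficient a_{1} to a_{5}), and the synthetic numbers are hypothetical and are not the study's data or its fitted coefficients.

```python
import numpy as np

def fit_point_ptf(X, theta):
    """Fit theta = a0 + a1*x1 + ... + ak*xk by ordinary least squares.

    X     : (n_samples, n_predictors) array of soil properties
    theta : (n_samples,) water contents at one matric potential
    Returns the coefficient vector [a0, a1, ..., ak].
    """
    X = np.asarray(X, dtype=float)
    theta = np.asarray(theta, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])   # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, theta, rcond=None)
    return coef

def predict_point_ptf(coef, X):
    """Apply fitted coefficients to new samples."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Synthetic, noise-free demonstration with five hypothetical predictors.
rng = np.random.default_rng(0)
X_demo = rng.uniform([5, 40, 10, 0.2, 1.28], [40, 84, 50, 2.0, 1.60], size=(50, 5))
true_coef = np.array([0.10, 0.001, 0.002, 0.003, 0.010, -0.050])  # made up
theta_demo = true_coef[0] + X_demo @ true_coef[1:]
coef_demo = fit_point_ptf(X_demo, theta_demo)   # should recover true_coef
```

With real texture data, sand, silt and clay sum to 100%, so using all three together with an intercept makes the design matrix rank-deficient; `np.linalg.lstsq` still returns a minimum-norm solution, but dropping one fraction gives coefficients that are easier to interpret.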

Neural networks comprise a large class of different architectures. In many cases, the issue is approximating a static nonlinear mapping defined on ℝ^{K}. The most useful neural network for function approximation is the Multi-Layer Perceptron (MLP) network. An MLP consists of an input layer, several hidden layers, and an output layer. Node

It includes a summer and a nonlinear activation function. The inputs x_{k} are multiplied by weights w_{ki} and summed up together with the constant bias term θ_{i}. The resulting

The output of node

Connecting several nodes in parallel and series, a MLP network is formed (
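The node computation described above (a weighted sum of the inputs plus a bias, passed through an activation function) can be sketched as a forward pass through a small MLP. This is a minimal illustration, not the trained network from this study: the 3-10-1 layout, the random weights and the input values are placeholders, and `np.tanh` is used for the hidden layer because it is mathematically equivalent to the tan-sigmoid function.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: tanh hidden nodes, linear output node(s)."""
    hidden = np.tanh(W1 @ x + b1)   # each hidden node: weighted sum + bias, then activation
    return W2 @ hidden + b2         # linear output layer

# Hypothetical 3-10-1 network with random (untrained) weights.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(10, 3)), rng.normal(size=10)
W2, b2 = rng.normal(size=(1, 10)), rng.normal(size=1)

x_demo = np.array([0.3, 0.5, 0.2])          # three scaled inputs, illustrative only
y_demo = mlp_forward(x_demo, W1, b1, W2, b2)
```

In practice the weights and biases would be obtained by training on the calibration subset; the forward pass itself stays the same.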

To facilitate application of the PTFs,

where θ(h) is the measured volumetric water content (cm^{3}/cm^{3}) at the suction h (cm, taken positive for increasing suctions). The parameters θs (cm^{3}/cm^{3}) and θr (cm^{3}/cm^{3}) are saturated and residual water contents respectively; α>0 (in cm^{−1}) is related to the inverse of the air entry suction; and n>1 is a measure of the pore-size distribution (
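The parameters defined above are those of the van Genuchten retention function, which, under the commonly used restriction m = 1 − 1/n, reads θ(h) = θr + (θs − θr) / [1 + (αh)^n]^m. The sketch below evaluates this form at the matric potentials used in this study. It assumes the m = 1 − 1/n restriction; the parameter values are made up for illustration and are not Rosetta output for these soils.

```python
import numpy as np

KPA_TO_CM = 10.197  # approximate cm of water column per kPa

def van_genuchten(h, theta_r, theta_s, alpha, n):
    """Volumetric water content (cm3/cm3) at suction h (cm, positive)."""
    h = np.asarray(h, dtype=float)
    m = 1.0 - 1.0 / n                      # assumed restriction m = 1 - 1/n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m

# Illustrative parameter values only.
potentials_kpa = np.array([10, 33, 100, 300, 500, 1000, 1500])
theta_vg_demo = van_genuchten(potentials_kpa * KPA_TO_CM,
                              theta_r=0.08, theta_s=0.45, alpha=0.01, n=1.4)
```

Because the suction h is expressed in cm, potentials given in kPa must be converted before evaluating the curve.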

The accuracy of the MLR and Rosetta methods for derivation of PTFs was evaluated using the coefficient of determination (R^{2}), root mean square error (RMSE) and mean bias error (MBE) between the measured and predicted values of a given hydraulic parameter. The R^{2}, RMSE and MBE are expressed as:

where y_{i} denotes the measured value, ŷ_{i} is the predicted value, ȳ is the average of the measured values, and
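The three evaluation criteria can be computed as in the following sketch. Two conventions are assumed here because the original expressions are not reproduced in this text: R^{2} is taken as one minus the ratio of the residual sum of squares to the total sum of squares about the measured mean, and MBE is taken as predicted minus measured, so that a negative value indicates underestimation. The function name and the sample numbers are hypothetical.

```python
import numpy as np

def evaluate_ptf(measured, predicted):
    """Return R2, RMSE and MBE for paired measured/predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = predicted - measured
    rmse = np.sqrt(np.mean(residual ** 2))
    mbe = np.mean(residual)                 # negative => underestimation on average
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot              # assumed definition of R2
    return {"R2": r2, "RMSE": rmse, "MBE": mbe}

# Made-up water contents (cm3/cm3): every prediction is 0.01 too low.
stats_demo = evaluate_ptf([0.30, 0.25, 0.20, 0.15], [0.29, 0.24, 0.19, 0.14])
```

For this toy input the RMSE is 0.01 and the MBE is −0.01, which shows how the two measures separate the size of the error from its direction.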

In this paper, the use of MLR, ANN and Rosetta models for the prediction of soil water content was described and compared. Descriptive statistics for soil properties are summarized in ^{3} with values varying between 1.28 and 1.6 g/cm^{3}.

Correlation coefficients between the water content at each potential and the soil physical and chemical properties were calculated and are reported in _{m} in agreement with _{m} and positive for high h_{m} (_{m} and increased it at high h_{m}. _{33} kPa and BD for western Nigerian soils. However, _{m} values due to their effects on water retention surfaces. But the effect of silt was not significant at the 0.01 level. This is in agreement with _{33} kPa with silt and clay contents and of θ_{1500} kPa with clay content. _{33} kPa and θ_{1500} kPa and effect of sand content on θ_{1500} kPa were not significant but that sand content significantly affected θ_{33} kPa. _{FC} and θ_{PWP}. _{PWP} and clay content. The decreasing effect of sand content on the SWR was obvious and had a descending trend from θ_{10} to θ_{1500} kPa. Its effect on θ_{1000} and θ_{1500} kPa was not significant. The influence of CaCO_{3} was minor and had no significant effect on SWR. It increased the SWR due to its impacts on aggregation and flocculation (by soluble Ca^{2+}). _{3} content was the second most important independent variable entering PTFs for θ_{1500} kPa. _{m} because the carbonates with clay size behave like silt in water retention. The effect of OC on the SWR was not significant. The increasing effect of organic matter on the SWR is dominant at low h_{m}. _{m}. _{PWP} and organic matter in four Mexican soils.

Based on the collected training data, the following PTFs were developed using the MLR method and are listed in R^{2}_{adj}) given in cm^{3}/cm^{3} and wilting point ≅ 0.20 cm^{3}/cm^{3}). This feature can be related to the high content of silt- and clay-sized particles (

Clay type plays an important role in the water retention properties of soils. Consequently, soils in the humid tropics can have a much lower capacity to retain water than soils in temperate regions with the same clay content but a different clay type (

Comparison among different methods for prediction of soil water content from testing data is presented in Suppl. Figs. S2, S3 and S4 [pdfs online] and summarized in

These figures and

The RMSE values for the ANN were smaller than those for the derived point PTFs and the Rosetta model at all matric potentials. MLR and Rosetta hold the second and third places, respectively. The RMSE values for the derived point PTFs were smaller than those for the Rosetta model, except at 33 kPa, where the Rosetta model gave a better estimate than regression (R^{2} values of ANN for all potentials were greater than those of the regressions, and those of the regressions were greater than those of Rosetta, except at 33 kPa, where the Rosetta model had a greater value than regression. So, the accuracy of the ANN is better than that of the derived point PTFs and the Rosetta model. This result is in line with the work done by

The superior efficiency of the ANN models compared with the basic regression equations is probably because PTFs derived from different areas have different efficiencies. On the other hand, according to the hypothesis of

The RMSE values of different ANN-PTFs and regression-PTFs were lower in the prediction of volumetric water content at PWP than the others. Likewise, based on MBE values, all these PTFs especially at PWP, showed slight underestimation of volumetric water content. But this underestimation of ANN model was very low and could be ignored. Also, in the evaluation study of

As matric potential decreased (suction increasing from 10 kPa to 1500 kPa), the correlation coefficient increased for all three methods. In other words, prediction accuracy increased with increasing soil suction. This is because the soils of the study area had a low organic matter content and therefore a very weak soil structure. In addition, the hydrodynamic behavior of the fine silt fraction, which can be similar to that of clay-sized particles, would increase water retention near the permanent wilting point. The fine silt content of the soils was between 57% and 84%.

In this study, multivariate linear regression and neural network models were employed to develop pedotransfer functions for predicting soil moisture using available soil properties. The neural network consisted of three hidden layers, with a sigmoid activation function in the hidden layers and a linear function in the output layer. Results showed that the artificial neural network gave the best model with the Levenberg–Marquardt learning algorithm and the tangent sigmoid (tansig) transfer function. The Multi-Layer Perceptron architectures for the different matric potentials, expressed as number of inputs, number of neurons in the hidden layer and number of outputs, were 3-10-1, 3-5-1, 2-8-1, 2-4-1, 2-10-1, 1-9-1 and 1-7-1 for 10, 33, 100, 300, 500, 1000 and 1500 kPa, respectively. The ANN model captured the non-linear relationships between variables better than multivariate regression and Rosetta. According to the evaluation criteria, the ANNs were superior to the basic regression equations for predicting soil water content. This is a crucial result because, since ANN-PTFs formed from local data produce more accurate predictions than those built from data spread over a wider area, the concept of data conservation becomes a critical factor in ANN-PTF construction (

This paper is part of the M.Sc. thesis of the corresponding author. The authors thank the technicians of the Soil Science Department at the University of Guilan for laboratory assistance. The authors are also grateful to the anonymous reviewers, whose comments considerably improved the quality of the manuscript.