Short communication. Persistence of point crop yield subjective estimates

The aim of this research was to investigate the persistence of annual crop yield point value subjective estimates, elicited from a series of interviews carried out on a wide group of farmers. Time persistence is a necessary condition of coherence and reliability in subjective crop yield probability estimation. The interviewed subjects gave estimates for point crop yield (mean, highest, most frequent and lowest possible). Limited relative differences for all variables were found, except for the lowest possible crop yield estimates, which had a broad dispersion. The results are deemed valuable in order to determine the level of trust in the techniques applied in obtaining data, and in their effectiveness in designing a farm decision support system (DSS) to enhance the outcome of farmers' decision making.

(accuracy, reliability, acceptability and predictive accuracy; see Norris and Kramer, 1990). This research follows these guidelines.
The main objective of this research was to determine the persistence of the point value estimates of crop yield given by farmers. Persistence is a necessary condition for establishing the coherence and reliability of the answers used to estimate a crop yield probability density function (PDF). In the literature, beta and triangular PDFs are usually estimated from three point values: the highest possible (H), the most frequent (M) and the lowest possible (L).
Persistence was evaluated from the crop yield values that farmers gave at two different points in time. The farmers came from a wide range of Spanish geographical areas, with very different environmental and technological conditions from farm to farm. However, since, as Pease (1992) has pointed out, «geographical location plays a larger role than crop in comparison of relative variability of yields», the aim of this research was not to determine an operational PDF for each of the crops analyzed, but to verify the persistence of the responses.
A program for systematic data collection has been set up in order to accumulate experience on crop yield PDF. In 1999, 52 farmers were interviewed for the first time. Two agronomy students interviewed each farmer. The first student carried out the interview with what was called the «first day questionnaire». Approximately two weeks later, another student interviewed the same farmer with the «second day questionnaire», which was organized in a different way (questionnaires available from the authors). A total of 104 interviews were carried out with 52 farmers. The interviews carried out in the year 2000 followed the same methodology. Forty-four different farmers were questioned, providing a total of 88 new interviews.
Each farmer indicated the annual crop he would provide information for, depending on his own experience. Of all the crops reported, only those with the greatest number of responses (five or more) were taken into account (Table 1).
Subjects who were interviewed gave estimates for mean (m), highest possible (H), most frequent (M) and lowest possible (L) point crop yields. To evaluate persistence, a concept of persistence (which may be called «time persistence») was used, based on measuring the difference between the estimates declared at two different points in time. To avoid the biases described by Bland and Altman (1995, 1999), if d1 and d2 are the values to be compared (for example, the estimates made by a decision maker on the first and second days), relative differences throughout this research will be expressed thus:
Ideally, the crop yield data given by a decision maker (farmer) at two different points in time should be the same and, thus, a null difference would be interpreted as an argument for persistence. Nevertheless, a discrepancy between the crop yield values declared by a decision maker on the first and the second day should not be taken to imply a violation of the persistence hypothesis, since certain cognitive mechanisms may be at work that would account for this discrepancy in the voiced information. Some explanations for this discrepancy might be: the tendency to give rounded figures (and, with some intermediate figures, it would be possible to round either up or down); the existence of a range of values which subjects believe to be equivalent; or the voicing of uncertainty through an interval, out of which a point figure is declared.
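The comparison of first-day and second-day estimates can be illustrated with a short Python sketch. The formula below is an assumption: it divides the difference by the average of d1 and d2, a symmetric form in the spirit of Bland and Altman, and is not claimed to be the expression the authors used. The function names and the example yields are hypothetical.

```python
from statistics import mean, median, stdev


def relative_difference(d1: float, d2: float) -> float:
    """Percent difference between two estimates, scaled by their average.

    ASSUMPTION: scaling by the average of d1 and d2 (rather than by d1 or d2
    alone) is one symmetric choice; the study's exact expression may differ.
    """
    average = (d1 + d2) / 2.0
    if average == 0:
        raise ValueError("undefined when the two estimates average to zero")
    return 100.0 * (d1 - d2) / average


def summarise(pairs):
    """Mean, median and sample SD of the relative differences of (day1, day2) pairs."""
    diffs = [relative_difference(d1, d2) for d1, d2 in pairs]
    return {"mean": mean(diffs), "median": median(diffs), "sd": stdev(diffs)}


# Hypothetical yields (kg/ha) declared by three farmers on the two survey days.
example_pairs = [(3000, 3000), (2500, 2400), (1000, 1200)]
summary = summarise(example_pairs)
```

With this form, swapping the two days only changes the sign of the result, so neither interview is treated as the reference value.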
The rounding hypothesis needs no further explanation. Farmers tend to answer using rounded values, which may lead to a discrepancy between the amounts declared on one day or another. In the case of the existence of a range of equivalent values, it can be argued that there is sufficient evidence in many areas that people do not distinguish certain values in a continuous way, but rather that their perception of «similar» values is represented by one value (perception or segregation thresholds).
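A small numerical sketch shows how rounding alone can produce a sizeable discrepancy. The figures are hypothetical, and the relative difference is again assumed to be scaled by the average of the two declared values.

```python
import math


def rounding_candidates(value: float, step: float):
    """Return the multiples of `step` just below and just above `value`."""
    lower = math.floor(value / step) * step
    upper = math.ceil(value / step) * step
    return lower, upper


# Hypothetical case: a farmer has about 2,750 kg/ha in mind but answers in
# multiples of 500 kg/ha, rounding down on one day and up on the other.
lower, upper = rounding_candidates(2750, 500)

# ASSUMPTION: relative difference scaled by the average of the two answers.
gap_percent = 100.0 * (upper - lower) / ((upper + lower) / 2.0)
```

In this example the two rounded answers differ by roughly 18% although the underlying belief has not changed.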
Both in 1999 and in 2000, farmers were asked to estimate the L, M and H possible crop yields. Pairs of data generated on both survey days are available. Generally, previous literature suggests that subjects have difficulty in understanding the meaning of subjective estimates of L and H possible values. The results of the analysis are summarised in Table 2. The high standard deviation (SD) for lowest possible crop yields is due to some extreme differences, in some cases of 75% or more. Although the means and medians of the relative differences in estimates of the lowest possible crop yields are small, the SDs point to a probable misunderstanding of, or indeterminacy surrounding, this concept. It is conjectured that this indeterminacy could arise because lowest possible crop yields are not readily accessible, as they do not play a major part in farmers' estimates or goals.
Table 3 shows the results of the Kolmogorov-Smirnov normality test and of Wilcoxon's test for independent population groups (first day vs. second day), at a significance level of 5%, for the relative differences of the point estimates of lowest possible, highest possible and most frequent yields.
Most of the Kolmogorov-Smirnov tests suggest that the distributions are not normal (at the 5% level), except for some values corresponding to the year 2000. On the other hand, Wilcoxon's test does not allow the hypothesis of equal means to be rejected in any case, except for the most frequent yield in irrigated crops for the year 2000.
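The normality check can be sketched with the Python standard library alone. The code below computes the Kolmogorov-Smirnov distance between a sample and a normal curve fitted to it, then compares it with the large-sample 5% critical value. It is an illustration under stated assumptions, not the procedure the authors ran: the function names are hypothetical, and the threshold 1.36 / sqrt(n) strictly applies to a fully specified distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(sample):
    """Kolmogorov-Smirnov distance between the sample and a fitted normal curve."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def rejects_normality(sample, coefficient=1.36):
    """Compare D with the asymptotic 5% critical value coefficient / sqrt(n).

    ASSUMPTION: this threshold is for a fully specified distribution. With the
    mean and SD estimated from the same data it is conservative; a Lilliefors
    correction would reject more readily.
    """
    return ks_statistic_normal(sample) > coefficient / sqrt(len(sample))
```

For the small groups described here (five or more responses per crop), exact tables or a dedicated statistics package would be a safer choice than the asymptotic threshold.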
No significant correlation (Pearson, Spearman or Kendall) was found between lowest possible and highest possible crop yields. It is therefore unlikely that the same people make poor estimates for both types of crop yield. This result could indicate that there is no personal bias in the differences observed in these estimations.
The aforementioned results indicate a higher persistence in the estimated highest possible crop yield values than in the lowest possible ones. The greater differences found in the estimation of the latter may be due to the fact that the margin for the highest «possible» crop yields may be subjectively more restricted than the margin for the lowest possible crop yields.
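As an illustration of one of the coefficients named above, the following sketch computes Spearman's rank correlation as the Pearson correlation of average ranks. The function names are hypothetical. Which quantities were paired in the study (for instance, each farmer's relative differences for L and for H) is an assumption left to the caller.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # average of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation applied to average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A coefficient near zero for the paired L and H discrepancies would be consistent with the reading that large errors on one variable do not go together with large errors on the other.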
In general, farmers talk about mean crop yields when they compare results from a single crop or from different crops. It seems that farmers are familiar with mean crop yields, which are readily available to them cognitively. Mean yield could be an anchoring value in the estimation of other crop yield parameters (availability and anchoring are used in the sense discussed by Tversky and Kahneman (1974)). When farmers were asked for a crop mean yield value, their response was identified as the «declared mean crop yield» (m). The values of m given by each farmer were taken into account in the year 2000 interviews. Answers for the first and second days, both for irrigated and non irrigated crops, seem to belong to the same population (Wilcoxon test, α = 0.05). The mean of the relative difference is 1.8%, with a standard deviation (SD) of 11.5%, and with extreme differences ranging between 27% and 29%. For irrigated crops, the mean relative difference is 3.1%, with an SD of 9.3% and with extreme differences ranging between 15% and 20%. The median of the relative differences is zero for both irrigated and non irrigated crops.
Values of m given by each farmer in the 1999 (second day) and 2000 (first and second days) surveys were compared with «calculated mean crop yields», determined by the Triangular approximation, using Equation [1], and by the Beta approximation, using Equation [2].
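The first-day versus second-day comparison can be sketched as a two-sided Wilcoxon rank-sum test with a normal approximation. Which Wilcoxon variant the authors ran is an assumption: the rank-sum form is used here because the text refers to independent population groups, whereas a signed-rank test would be the usual choice for paired answers. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(first, second):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Returns (z, p_value). ASSUMPTION: no tie correction is applied to the
    variance, so the result is only approximate when many values coincide.
    """
    pooled = sorted(list(first) + list(second))
    # Assign every distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n1, n2 = len(first), len(second)
    w = sum(rank_of[v] for v in first)          # rank sum of the first group
    expected = n1 * (n1 + n2 + 1) / 2.0         # mean of W under the null
    variance = n1 * n2 * (n1 + n2 + 1) / 12.0   # variance of W under the null
    z = (w - expected) / sqrt(variance)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value
```

A p-value above 0.05 would correspond to the finding that the two days' answers cannot be distinguished at α = 0.05. With very small groups, exact tables are preferable to the normal approximation.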
where T and B are the calculated mean crop yields in the Triangular and Beta approximation, respectively; H = highest possible crop yield point estimation; L = lowest possible crop yield point estimation; M = most frequent crop yield point estimation.
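A minimal sketch of the two calculated means follows. It relies on two assumptions that the text does not confirm. First, Equation [1] is taken to be the standard mean of a triangular distribution, (L + M + H) / 3. Second, Equation [2] is taken to be the common PERT-style beta approximation, (L + 4M + H) / 6. The function names are hypothetical.

```python
def check_order(L: float, M: float, H: float) -> None:
    """Require lowest <= most frequent <= highest before computing a mean."""
    if not (L <= M <= H):
        raise ValueError("expected L <= M <= H")


def triangular_mean(L: float, M: float, H: float) -> float:
    """Mean T of a triangular distribution with minimum L, mode M, maximum H."""
    check_order(L, M, H)
    return (L + M + H) / 3.0


def beta_mean(L: float, M: float, H: float) -> float:
    """Mean B under the PERT-style beta approximation.

    ASSUMPTION: the weights (1, 4, 1) are the usual PERT choice and may not
    match the form of Equation [2] in the study.
    """
    check_order(L, M, H)
    return (L + 4.0 * M + H) / 6.0


# Hypothetical estimates in kg/ha: the lowest value lies far below the mode.
T = triangular_mean(1000, 3000, 3500)
B = beta_mean(1000, 3000, 3500)
```

Under these assumed forms, the beta mean weights M more heavily, so it stays closer to the most frequent value than the triangular mean when L and H are asymmetric around M.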
For the Triangular approximation, in the case of non irrigated crop yields, the mean of the relative difference lies between -2% and -1.7% and the median between -2% and 0%. In irrigated crops, the mean relative difference between the declared mean and the mean calculated by the Triangular method lies between -1% and -0.4%, with the median lying between -2.3% and 0%. SD ranged from 7.5% to 11.8% for non irrigated crops, and from 5.8% to 9.9% for irrigated crops.
With regard to the Beta approximation, the mean of the relative differences ranged from -2.6% to -1.5% for non irrigated crops and from -2.9% to -1.5% for irrigated crops. The median for both non irrigated and irrigated crops lay between -2.3% and 0%. SD varied from 8.6% to 10.3% for non irrigated crops and from 5.4% to 8.7% for irrigated crops.
The negative values found for the mean of relative differences, under both the Triangular and Beta approximations, seem to indicate a slight overestimation by the calculated mean crop yield relative to the directly declared mean.
Comparing the Triangular and Beta methods holds no empirical interest, because the differences between the mean and variance values, which are estimated from the same set of L, M and H crop yield values, are mathematically determined by the functional forms.
The difference between M and m for 1999 (second day) and 2000 (first and second days) has been analysed; the results are summarised in Table 4. The differences between the two variables are similar to the differences found for the same variable on different days. Wilcoxon's test (α = 0.05) for the mean crop yield estimates and the most frequent estimates does not allow the hypothesis of equal means to be rejected in any case. The correlation between these two variables is positive and significant in all cases (Pearson, Kendall, Spearman).
From the previous results it can be deduced that the declared mean crop yield (m) and the most frequent crop yield (M) figures appear to belong to the same population, an argument which would support the idea that one of the values is anchored on the other.
Establishing the threshold beyond which the persistence of values may be accepted should possibly be judged by the economic impact of the relative differences, and remains an unresolved line of research. In this preliminary investigation, which makes no reference to this economic impact, the estimates studied seem to satisfy the persistence criterion for all the variables analyzed (except for the lowest possible crop yields), at least in a high percentage of the surveyed population.
It is conjectured that: (1) the relative differences in the highest possible and lowest possible crop yield estimates are due to the fact that the highest possible crop yields are more accessible than the lowest possible crop yields; (2) the lowest possible crop yields do not play an important role in the calculations made by farmers, whereas the highest possible crop yields may be acting as a goal, against which annual results are measured; and (3) farmers truncate the left tail of the crop yields distribution curve.
Using the Triangular and Beta approximations, a slight overestimation of the calculated mean crop yield, relative to the declared mean crop yield, was found. This overestimation of the calculated mean crop yield could be interpreted as the result of a broader range, contrary to opinions reported in the literature regarding variance underestimation. The values are so small that it is neither possible nor useful to confirm that they are anything other than zero (meaning that the declared and the calculated crop yields coincide). Nevertheless, it is significant that no general trend of positive relative differences was detected, which could have endorsed the tendency to underestimate the variance.
A large number of farmers show great accuracy and reliability in their responses in the first and second day interviews. This is interpreted as an indication of good knowledge of the PDF, although this hypothesis needs additional research.

Table 1. Response from selected crops

Table 2. Relative differences in declared crop yields on survey days (%)

Table 4. Relative differences between mean and most frequent declared crop yields. NIC: non irrigated crops. IC: irrigated crops.