Research Article
Relations between zero-inflated variables in trials with horticultural crops
Federal University of Santa Maria, Rural Science Center, Crop Science Department. Santa Maria, Rio Grande do Sul, Brazil
Polytechnic Institute of Bragança, School of Agriculture, Mountain Research Centre (CIMO). Bragança, Portugal
University of Lisbon, Agronomy Institute, CEABN. Lisbon, Portugal
University of Cruz Alta, Cruz Alta, Rio Grande do Sul, Brazil
Abstract
Certain characteristics of some vegetable crops allow multiple harvests during the production cycle; however, to our knowledge, no study has described the behavior of fruit production with progression of the production cycle in vegetable crops with multiple harvests that present data overdispersion. We aimed to characterize the data overdispersion of zero-inflated variables and identify the behavior of these variables during the production cycle of several vegetable crops with multiple harvests. Data from 11 uniformity trials were used without applying treatments; these comprise the database from the Experimental Plants Group at the Federal University of Santa Maria, Brazil. The trials were conducted using four horticultural species grown during different cultivation seasons, cultivation environments, and experimental structures. Although at each harvest, a larger number of basic units with harvest fruit was observed than units without harvest fruit, the basic unit percentage without fruit was high, generating an overdispersion within each individual harvest. The variability within each harvest was high and increased with the evolution of the production cycle of Capsicum annuum, Solanum lycopersicum var. cerasiforme, Phaseolus vulgaris, and Cucurbita pepo species. However, the correlation coefficient between the mean weight and number of harvest fruits tended to remain constant during the crop production cycle. These behaviors show that harvest management should be done individually, at each harvest, such that data overdispersion is reduced.
Additional key words: multiple harvests; data overdispersion; experimental planning.
Abbreviations used: BU (basic unit); CV (coefficient of variation).
Authors’ contributions: Conceived and designed the experiments: ADL, LFN and FR. Performed the experiments: MPBP. Analyzed the data: ADL. Wrote the paper: ADL, LFN, FR and MPBP.
Citation: Lúcio, A. D.; Nunes, L. F.; Rego, F.; Pasini, M. P. B. (2016). Relations between zero-inflated variables in trials with horticultural crops. Spanish Journal of Agricultural Research, Volume 14, Issue 2, e0906. http://dx.doi.org/10.5424/sjar/2016142-8175.
Received: 17 Jun 2015. Accepted: 12 May 2016
Copyright © 2016 INIA. This is an open access article distributed under the Creative Commons Attribution License (CC by 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Funding: The Brazilian Ministry of Education’s Graduate Education Support Agency (CAPES) awarded an overseas post doctorate scholarship (Process BEX 1457/14-4).
Competing interests: The authors have declared that no competing interests exist.
Correspondence should be addressed to Alessandro D. Lúcio: adlucio@ufsm.br.
Introduction
In some vegetable species, certain specific characteristics allow multiple harvests during the production cycle. The realization of such multiple harvests is defined in a subjective manner and varies with the season and with each cultivated species. In experiments on species with multiple harvests all over the world, the above variations should be considered together with interference among these variables. Such interferences can inflate any residual variance and induce inadequate estimates in the experimental design because of the lack of adequate information at harvest, favoring overdispersion in the database, with tabulation of a large number of null values.
In several studies, strategies have been developed to identify the most appropriate procedures to minimize data variability in experiments with vegetable crops. Among these are the studies by Lopes et al. (1998), Lúcio et al. (2006, 2008), Carpes et al. (2010), Santos et al. (2010), and Haesbaert et al. (2011). These studies sought to improve the quality of experiments through the following strategies: determining the plot and sample sizes (Souza et al., 2002; Mello et al., 2004; Lorentz et al., 2005; Lorentz & Lúcio, 2009; Lúcio et al., 2010; Santos et al., 2010; Haesbaert et al., 2011; Storck et al., 2014), adjusting for the variability of experimental areas and of each crop (Lúcio et al., 2006; Carpes et al., 2008, 2010), determining the behavior of variability between plant rows and between harvests (Lúcio et al., 2006, 2008; Benz et al., 2015), studying data transformations (Couto et al., 2009), and using the Papadakis method to minimize the effects of excess zeros and resultant data overdispersion (Lúcio et al., 2016).
Lopes et al. (1998), Lorentz et al. (2005), Carpes et al. (2008) and Lúcio et al. (2008) have pointed out significant variability between crop rows and harvests, regardless of the species used, and that such variability significantly alters the estimates of sample sizes, types of sampling, plot size and shape, experimental design, and number of harvests needed to adequately differentiate the study treatments.
The relationship between the observed variables (number and weight of fruits harvested in experiments with vegetable crops) and the behavior of these species during the production cycle are important, as they generate information on how multiple harvesting should be planned and carried out. One of the problems associated with repeated measurements is the excess of zero values in the observed variables. An interesting strategy to reduce this problem is to estimate the ideal plot size so that the majority of results have values greater than zero, subsequently reducing the variance. Another strategy is to estimate the ideal plot size that provides the smallest variance between the evaluated plots, because often researchers solve this problem empirically, based on practical sizes for conducting the experiment, available area, or from experience.
In agricultural research, it is common to evaluate the full cycle of a particular species or compare different treatments for crop development. However, to our knowledge, no studies have described the behavior of fruit production with progression of the production cycle in vegetable crops with multiple harvests that present data overdispersion.
This study aimed to characterize the data overdispersion of the zero-inflated variables and identify the behavior of these variables during the production cycle of several vegetable crops with multiple harvests.
Material and methods
Data from 11 uniformity trials were used without applying treatments. These comprise the database from the Experimental Plants Group at the Federal University of Santa Maria, Brazil. The trials were performed on four horticultural species (hybrids) grown in different cultivation seasons, cultivation environments, and experimental structures (Table 1). Each experimental basic unit (BU) was composed of a single plant in each row of plants, except for trials with Phaseolus vulgaris, where each BU consisted of two plants because of the indeterminate growth characteristic of the species and the tendency to climb onto adjacent plants.
Table 1. Uniformity trials without treatment application used in the study.
During each harvest, the number and weight (in grams) of fruits harvested from each BU were observed, except for trials with P. vulgaris, where only the harvest weight was noted. In the Solanum lycopersicum var. cerasiforme trials, the number of bunches harvested per BU was also noted.
In each harvest, for number and weight of fruits and number of bunches, an initial descriptive statistical analysis was conducted from which we obtained the percentage of BUs with zero values, the minimum and maximum values, medians, means, coefficient of variation (CV, in %), and degrees of asymmetry and kurtosis. Box-plots were constructed for the number and weight of fruits of each harvest, in order to identify the variability and average behavior of these variables with progression of the production cycle of the species evaluated. Further, we compared the proportions of BUs with and without harvest fruits, adopting a 50% probability for presence or absence of fruits ready to be harvested.
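The per-harvest descriptive summary described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' Action workflow: the function name `describe_harvest`, the example counts, and the choice of moment-based skewness and excess kurtosis are assumptions, since the paper does not state which estimators were used.

```python
import statistics


def describe_harvest(values):
    """Summarise one harvest; `values` holds one observation per basic unit (BU)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    # Central moments (population form). The estimator choice is an assumption.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "pct_zero": 100.0 * sum(1 for v in values if v == 0) / n,
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": mean,
        # CV is undefined when the mean is zero (no fruit harvested at all).
        "cv_pct": 100.0 * sd / mean if mean else float("nan"),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical fruit counts for ten BUs in a single harvest.
example = [0, 0, 0, 1, 2, 0, 3, 1, 0, 5]
summary = describe_harvest(example)
print(summary)
```

With half of the hypothetical BUs at zero, the CV exceeds 100% and the skewness is positive, which is the pattern the descriptive tables are meant to expose.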
A linear correlation analysis between the mean weight and the number of fruits per BU was also performed. For Solanum lycopersicum var. cerasiforme trials, we also estimated the correlation coefficient between the mean weight of fruits and number of bunches per BU for individual species and cultivation season. Next, for each variable, the Shapiro–Wilk test was performed to identify data adherence to a normal distribution and the Levene test to identify variance homogeneity. For all the statistical analyses performed, a probability of error of 5% was adopted, using Action software, version 2.7.
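As an illustration of the linear correlation step, the sketch below computes the Pearson coefficient between mean fruit weight and fruit number per BU, together with the usual t statistic for testing a zero correlation. The data are hypothetical, the helper names are illustrative, and assigning a mean weight of 0 to BUs without fruit is an assumption; the paper does not say how such units were handled.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical data for one harvest: fruit number and total weight (g) per BU.
n_fruits = [0, 1, 2, 0, 3, 1, 4, 2]
weight_g = [0.0, 95.0, 230.0, 0.0, 390.0, 110.0, 560.0, 250.0]
# Mean fruit weight per BU; BUs without fruit are given 0 here (an assumption).
mean_weight = [w / k if k else 0.0 for w, k in zip(weight_g, n_fruits)]

r = pearson_r(mean_weight, n_fruits)
print(round(r, 3), round(t_statistic(r, len(n_fruits)), 3))
```

The t statistic would then be compared with the critical value of Student's t distribution at the 5% level adopted in the study.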
Results
Lack of adherence to a normal distribution was identified within each harvest along with variance heterogeneity among the multiple harvests, independent of species, season, cultivation environment, and observed variable, because of the high variance estimates, and consequently, the CV (Tables 2 to 6). When plotting the weight and number of fruits (bunches in one case) variability in each of the multiple harvests, independent of the above conditions, we could not identify similar behavior of variability with progression of the production cycle of the species (Figs. 1 to 4).
Table 2. Descriptive statistics for weight (grams per basic unit) and number of fruits harvested per basic unit in uniformity trials of Capsicum annuum cultivated in different growing seasons.
Table 3. Descriptive statistics for weight of fruit (grams per basic unit) and number of fruits and bunches harvested per basic unit in uniformity trials of Solanum lycopersicum var. cerasiforme cultivated in the spring-summer seasons under different environmental conditions.
Table 4. Descriptive statistics for weight of fruit (grams per basic unit) harvested in uniformity trials of Phaseolus vulgaris cultivated in different growing seasons and environmental conditions.
Table 5. Descriptive statistics for weight (grams per basic unit) and number of fruits harvested per basic unit in the uniformity trials of Cucurbita pepo grown in the autumn-winter season.
Table 6. Descriptive statistics for weight (grams per basic unit) and number of fruits harvested per basic unit in the uniformity trials of Cucurbita pepo grown in the spring-summer season.
When comparing the proportion of BUs with and without harvest fruits, within each of the multiple harvests, in 13.3% of the harvests (10 of 75 harvests under all study conditions), the proportions did not differ; that is, they had statistically the same number of BUs with and without harvest fruits in the specific season. In 15 (23.1%) of the 65 harvests the BU proportion without harvest fruit was significantly greater than that with harvest fruit (Figs. 5 to 8). This result is interesting and indicates that in 56 of the 75 study harvests (74.7%), a significant difference was noted, with a greater number of BUs with harvest fruits than without harvest fruits.
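One way such a comparison of proportions against 50% could be carried out is an exact two-sided binomial test, sketched below. The paper does not name the test that was used, so the choice of test, the function `binomial_two_sided_p`, and the counts in the example are all assumptions made for illustration.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical harvest: 48 BUs, 33 of which had fruit ready to harvest.
n_bu, with_fruit = 48, 33
p_value = binomial_two_sided_p(with_fruit, n_bu)
differs = p_value < 0.05  # 5% error probability, as adopted in the paper
print(round(p_value, 4), differs)
```

A harvest would be counted as having equal proportions when the p-value is at or above 0.05, and otherwise assigned to whichever group (with or without fruit) is larger.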
Within each harvest, significant correlation coefficients were noted between the mean weight and number of fruits and/or bunches harvested per BU, with estimates of around 0.6 for C. annuum and S. lycopersicum var. cerasiforme (Figs. 5 and 6). For C. pepo, the estimates varied as the production cycle progressed, were significant, and reached maximum values of around 0.8 (Fig. 8). As previously described, C. pepo presented different characteristics during fruit maturation, which generated results different from those obtained with the other studied species, which showed similar correlation coefficient estimates across the multiple harvests.
Discussion
The high variability and overdispersion identified in the data is a direct consequence of the high number of BUs with observed values equal to zero. This fact changes the entire behavior of the descriptive statistics estimates, such as the asymmetry and degree of kurtosis (Tables 2 to 6). This situation means that in most cases, the data show a positive asymmetrical distribution and high degree of kurtosis with a platykurtic distribution.
Several factors increase the variability in fruit number and weight, causing data overdispersion and consequent variations in the statistical analysis: the appearance of fruits on the plants on different days, which causes variation in growth among them; the early or late maturation of some fruits; and the lack of uniformity in fruit size at harvest, which makes it impossible to define the ideal harvest point. Cargnelutti Filho et al. (2004) obtained higher CV% values at the beginning and end of the tomato harvest, because the beginning and end of the fruit production were not uniform among the plants. In another study on tomatoes, Lúcio et al. (2010) found that the largest production of fruits occurred mid-way and at the end of the production cycle, and that the variability increased due to physiological aspects of the plants as they entered senescence.
With a larger number of harvests, an increase in variability is noted. Souza et al. (2002), Oliveira et al. (2005), and Lúcio et al. (2006) recommend that homogeneous variances be maintained during the production cycle of the crop. Further, they suggest that researchers clearly define the ideal number of harvests to be performed, which then must be planned and executed by considering each row as a block, thereby allowing experimental repetition. Variabilities were also noted in the studies by Lúcio et al. (2004), Mello et al. (2004), and Lorentz & Lúcio (2009) on C. annuum; Carpes et al. (2008) and Lúcio et al. (2008) on C. pepo; and Storck et al. (2014) on Passiflora edulis. According to these authors, an increase in the number of replicates is recommended, along with possibly increasing the plot size.
The non-adherence to a normal distribution of the data in the C. pepo trials can be explained by the number of harvests, which was larger than those of C. annuum and S. lycopersicum var. cerasiforme. With the increased number of harvests in these trials, a smaller number of fruits per BU was observed within each individual harvest; consequently, the data tended not to adhere to the normal distribution, although with smaller overdispersion than for species with fewer harvests and a greater number of harvested fruits in the individual harvests (Tables 2 to 6).
These variance behaviors show that harvest management should be done individually, at each harvest, such that data overdispersion is reduced. Appropriate definition of each fruit harvest time can be a practical alternative, as well as defining time intervals for each harvest rather than identifying a specific day. Thus, the number of BUs with fruit ready for harvest can be increased, with a reduction in the data amplitude within each of the multiple harvests.
The main characteristic of the studied species, in which some plants had no fruits ready to be harvested at given points of the production cycle, was that variability did not decrease during the course of the production cycle. Even without fruit harvest in a BU, the variability of the data remained high and kept increasing, because in this particular case, the value at harvest n remained identical to the value obtained at harvest n−1, while in the BUs with harvested fruits, the value increased; thus, variability in the values in each harvest tended to increase (Figs. 1 and 4).
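The accumulation argument in this paragraph can be illustrated with a small constructed example. It assumes that the values being compared are accumulated totals per BU, which is one reading of the passage, and all weights below are hypothetical: a BU without fruit at a given harvest keeps its previous total while the others grow, so the spread among BUs widens.

```python
import statistics
from itertools import accumulate

# Hypothetical per-harvest fruit weight (g) for four BUs over four harvests.
# A zero means the BU had no fruit ready at that harvest.
per_harvest = {
    "BU1": [120, 0, 150, 140],
    "BU2": [0, 0, 0, 130],
    "BU3": [110, 130, 160, 150],
    "BU4": [0, 125, 0, 0],
}

# Accumulated production per BU: a BU without fruit keeps its previous total.
cumulative = {bu: list(accumulate(v)) for bu, v in per_harvest.items()}

# Sample variance among BUs of the accumulated value, harvest by harvest.
n_harvests = 4
variances = [
    statistics.variance([cumulative[bu][h] for bu in cumulative])
    for h in range(n_harvests)
]
print(variances)
```

In this constructed case the variance among BUs rises at every harvest, mirroring the increasing variability described for the production cycle.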
One way to reduce data variability, and thus overdispersion, is to increase the number of BUs with harvested fruits within each harvest, since this will also increase the number of harvested fruits and the total weight of fruit within each BU. As previously mentioned, a practical and viable way to promote this situation is to clearly define the harvest point and identify time intervals between each harvest.
In summary, within each harvest, there were more basic units (BU) with than without harvest fruit. However, the BU percentage without fruits was high, generating data overdispersion within each harvest. The variability within each harvest is high and increases as the production cycle progresses in C. annuum, S. lycopersicum var. cerasiforme, P. vulgaris, and C. pepo. The correlation coefficient values between the average fruit weight and number of harvested fruits tended to remain constant during the crop production cycle. These behaviors show that harvest management should be done individually, at each harvest, such that data overdispersion is reduced.
References
Benz V, Lúcio AD, Lopes SJ, 2015. The spatial and temporal independence of Italian zucchini production. Acta Scientiarum 37: 257-263. http://dx.doi.org/10.4025/actasciagron.v37i2.19398.
Cargnelutti Filho A, Radin B, Matzenauer R, Storck L, 2004. Número de colheitas e comparação de genótipos de tomateiro cultivados em estufa de plástico. Pesquisa Agropecuária Brasileira 39: 953-959. http://dx.doi.org/10.1590/S0100-204X2004001000002.
Carpes RH, Lúcio AD, Storck L, Lopes SJ, Zanardo B, Paludo AL, 2008. Ausência de frutos colhidos e suas interferências na variabilidade da fitomassa de frutos de abobrinha italiana cultivada em diferentes sistemas de irrigação. Ceres 55: 590-595.
Carpes RH, Lúcio AD, Lopes SJ, Benz V, Haesbaert FM, Santos D, 2010. Variabilidade produtiva e agrupamentos de colheitas de abobrinha italiana cultivada em ambiente protegido. Ciência Rural 40: 294-301. http://dx.doi.org/10.1590/S0103-84782010005000007.
Couto MRM, Lúcio AD, Lopes SJ, Carpes RH, 2009. Transformação de dados em experimentos com abobrinha italiana em ambiente protegido. Ciência Rural 39: 1701-1707. http://dx.doi.org/10.1590/S0103-84782009005000110.
Haesbaert FM, Santos D, Lúcio AD, Benz V, Antonello BI, 2011. Tamanho de amostra para experimentos com feijão-de-vagem em diferentes ambientes. Ciência Rural 41: 38-44. http://dx.doi.org/10.1590/S0103-84782011000100007.
Lopes SJ, Storck L, Heldwein AB, Feijó S, Ros CA, 1998. Técnicas experimentais para tomateiro tipo salada sob estufas plásticas. Ciência Rural 28: 193-197. http://dx.doi.org/10.1590/S0103-84781998000200002.
Lorentz LH, Lúcio AD, 2009. Tamanho e forma de parcela para pimentão em estufa plástica. Ciência Rural 39: 2380-2387. http://dx.doi.org/10.1590/S0103-84782009005000202.
Lorentz LH, Lúcio AD, Boligon AA, Lopes SJ, Storck L, 2005. Variabilidade da produção de frutos de pimentão em estufa plástica. Ciência Rural 35: 316-323. http://dx.doi.org/10.1590/S0103-84782005000200011.
Lúcio AD, Mello RM, Storck L, Carpes R, Boligon AA, Zanardo B, 2004. Estimativa de parâmetros para planejamento de experimentos com a cultura do pimentão em área restrita. Horticultura Brasileira 22: 766-770. http://dx.doi.org/10.1590/S0102-05362004000400020.
Lúcio AD, Lorentz LH, Boligon AA, Lopes SJ, Storck L, Carpes RH, 2006. Variação temporal da produção de pimentão influenciada pela posição e características morfológicas das plantas em ambiente protegido. Horticultura Brasileira 24: 31-35. http://dx.doi.org/10.1590/S0102-05362006000100007.
Lúcio AD, Carpes RH, Storck L, Lopes SJ, Lorentz LH, Paludo AL, 2008. Variância e média da massa de frutos de abobrinha-italiana em múltiplas colheitas. Horticultura Brasileira 26: 333-339. http://dx.doi.org/10.1590/S0102-05362008000300009.
Lúcio AD, Carpes RH, Storck L, Zanardo B, Toebe M, Puhl OJ, Santos JRA, 2010. Agrupamento de colheitas de tomate e estimativas do tamanho de parcela em cultivo protegido. Horticultura Brasileira 28: 190-196. http://dx.doi.org/10.1590/S0102-05362010000200009.
Lúcio AD, Santos D, Cargnelutti Filho A, Schabarum DE, 2016. Método de Papadakis e tamanho de parcela em experimentos com a cultura da alface. Horticultura Brasileira 34: 66-73. http://dx.doi.org/10.1590/S0102-053620160000100010.
Mello RM, Lúcio AD, Storck L, Lorentz LH, Carpes RH, Boligon AA, 2004. Size and form of plots for the culture of the Italian pumpkin in plastic greenhouse. Scientia Agricola 61: 457-461. http://dx.doi.org/10.1590/S0103-90162004000400017.
Oliveira SJR, Storck L, Lopes SJ, Lúcio AD, Feijó S, Damo HP, 2005. Plot size and experimental unit relationship in explanatory experiments. Scientia Agricola 62: 585-589. http://dx.doi.org/10.1590/S0103-90162005000600012.
Santos D, Haesbaert FM, Puhl OJ, Santos JRA, Lúcio AD, 2010. Suficiência amostral para alface cultivada em diferentes ambientes. Ciência Rural 40: 800-805. http://dx.doi.org/10.1590/S0103-84782010000400009.
Souza MF, Lúcio AD, Storck L, Carpes RH, Santos PM, Siqueira LFF, 2002. Tamanho da amostra para peso de massa de frutos, na cultura da abóbora italiana em estufa plástica. Revista Brasileira de Agrociência 8: 123-128.
Storck L, Lúcio AD, Krause W, Araújo DV, Silva CA, 2014. Scaling the number of plants per plot and number of plots per genotype of yellow passion fruit plants. Acta Scientiarum 36: 73-78. http://dx.doi.org/10.4025/actasciagron.v36i1.17697.
