Improving the monitoring of corn phenology in large agricultural areas using remote sensing data series

Aim of study: Mexico's large irrigation areas demand non-structural actions to improve the irrigation service, such as monitoring crop phenology; however, its application has been limited by the large volumes of field information generated, diversity of crop management and climatic variability. The objective of this study was to generate and validate a methodology to monitor corn (Zea mays L.) phenology from the historical relationship of the vegetation indexes (VIs), EVI and NDVI, with the phenological development (PD) of corn grown in large irrigation zones.Area of study: Irrigation District (ID) 075 “Valle del Fuerte”, northern Sinaloa, Mexico.Material and methods: We used a database of 20 years of climate, field crop growth and crop phenology data, and Landsat satellite images. A methodology was proposed on a large scale supported with GIS and remote sensing data series.Main results: The methodology was validated in 19 plots with an acceptable correlation between observed PD and estimated PD for the two VIs, with slightly better values for EVI than for NDVI. NDVI and EVI models agreed with experimental PD observations in 92.1% of the farms used to validate the methodology, in 2.5% only the NDVI model coincided with the real, in 3.1% only the EVI model coincided, and in 2.3% both models disagreed with observation, generated a stage out of phase with respect to the real phenological stage.Research highlights: is possible to generalize the methodology applied to large irrigation zones with remote sensing data and GIS.


Introduction
In Mexico, more than six million hectares are annually irrigated, representing almost a third of the agricultural area cultivated, and generating 50% of the total value of agricultural production (SIAP, 2019). Mexican irrigation zones (IZs) institutionally are classified in 86 Irrigation Districts (IDs) integrated by Irrigation Modules (IMs) and managed by Water Users Associations (WUAs), and 50735 Irrigation Units (IUs) that users operate autonomously (CONAGUA, 2019a,b).
One of the main problems of these large areas is the low irrigation efficiencies (30 to 45%) at farm level, associated with poor irrigation service by WUAs and poor conduction efficiency (Sifuentes et al., 2015). To improve irrigation efficiency and service, detailed knowledge of the phenological development of crops is required to better couple irrigation to crops' water demand that change with their phenological development (Espinosa et al., 2017). Although efforts have been made to implement farm monitoring of crop phenology in Mexico's IDs, its application has been limited due to the large quantity of irrigated areas, diversity of crops and high volumes of field information that are managed (Ojeda-Bustamante et al., 2007). In a scenario of high competition for limited resources, coupled with food security and the effects of climate change, agricultural monitoring becomes very important (Heupel et al., 2018).
The Growing Degree Day (GDD) concept has been used for several decades to estimate the phenological crop stages for corn (Zea mays L.) (Ojeda-Bustamante et al., 2006), wheat (Triticum aestivum L.) (Kimball et al., 2012) and potato (Solanum tuberosum L.) (Flores et al., 2012), among others; however, its application depends on the knowledge of accurate agronomic and environmental information such as sowing date, type and variety crop, air daily temperature (Ta) and a non-restrictive development management such as the absence of water, nutritional or environmental stress (Roth & Yocum, 1997;Ghamghami et al., 2019;Hufkens et al., 2019).
The use of remote sensing to identify and classify crops is currently a common task, using their spectral characteristics (Heupel et al., 2018). However, its practical application to estimate other agricultural variables remains a major challenge due to the high heterogeneity and size of cultivated farms, the presence of clouds and spatial resolution of images (Burke & Lobell, 2017;Hufkens et al., 2019). Vegetation Indices (VIs), defined as mathematical combinations or transformations of two or more bands of the electromagnetic spectrum to maximize vegetation characteristics (Tsouros et al., 2019), are the most widely used products of remote sensing to estimate crop phenology (Kamble et al., 2013;Zhong et al., 2014). The VIs most commonly used have been reported to be the Normalized Difference Vegetation Index (NDVI) and the Enhanced Vegetation index (EVI) (Matsushita et al., 2007). De Bernardis et al. (2016) fitted functions of phenological development data and NDVI values to predict the rice (Oryza sativa L.) phenology. Reed et al. (1994) used historical series of NDVI obtained from Advanced-Very-High-Resolution-Radiometer images to predict wheat phenology; Wei et al. (2019) used EVI and NDVI indices to monitor the phenology of 12 crops in large areas reporting an accuracy of 85% and 81%, respectively.
Remote sensing also allows monitoring the actual crop conditions in the field with adequate spatial and temporal resolution. When the cumulative growing degree days (CGDD) are coupled with the VIs, phenological monitoring is significantly improved (Mavi & Tupper, 2004;Teal et al., 2006). Viña et al. (2004) reported a good fit of the Visible Atmospherically Resistant Indices (VARIs), VA-RI green and VARI red-edge , and NDVI with the corn CGDD and chlorophyll content. The VARIs were more sensitive than the NDVI with respect to the green fraction of vegetation cover, improving the detection of senescence, vegetative development and presence of various types of stress. Liao et al. (2019) used the cumulative temperature in the parameters related to a phenology model and information extracted from Landsat-8 and MODIS images to simulate growth and biomass production in corn and soybeans (Glycine max L.).
Landsat satellite image time series is an alternative for monitoring crops in large agricultural areas, with a moderate spatial and temporal resolution (30 m and 16 days respectively). They are freely accessible with enough information for use and application (Ghamghami et al., 2019;NASA, 2019). However, in Mexico studies focused on developing practical methodologies for monitoring crop phenology are scarce. Moreover, the sowing period and corn genotypes have changed drastically in the last decade in the northwest of Mexico, so this type of tool to monitor crop phenology is a recurrent demand. The objective of this study was to generate and validate a methodology to monitor corn phenological using vegetation indices, validated with historical data obtained from field and meteorological data and Landsat satellite images of 20 agricultural years; this was done in the largest Mexican ID, where corn is the dominant crop.

Study area
This study was carried out in an area of approximately 20,000 ha located at the union of the IMs "Santa Rosa" and "Batequis" in ID075 "Valle del Fuerte" in the northern region of the state of Sinaloa Mexico, located in the coordinates 25°45'N and -108° 51'W ( Fig. 1). The ID075 is the largest in the country, with more than 260,000 ha harvested annually and a production value about 850 million U.S. dollars, with more than 170,000 ha of corn with an average yield of 12.7 t ha -1 and a production value of 400 million U.S. dollars (SIAP, 2019), the study area records an average annual rainfall of 352 mm, average annual temperature of 25 °C and an altitude of 15 m.a.s.l. The soils are flat, deep, with no problem of salts and mostly with clay and 3 Monitoring corn phenology in large agricultural areas using remote sensing data series loam clay textures with available moisture in the range of 0.143 to 0.155 cm 3 cm -3 (UAS, 2014).
Thirty commercial corn farms that had a detailed historical record of at least twenty agricultural years  were selected as reference in the study area, 20 farms located in the IM Santa Rosa (240.8 ha) and 10 in the IM "Batequis" (120 ha).

Climatic seasonal analysis
This arid region shows high climatic variability, which impacts the phenology and season duration of crops. To obtain more information on this effect, an analysis of seasonal climatic variability in the corn phenological was carried out with respect to a typical growing season from November 12 to April 14 of the following year. This approach was performed to classify each agricultural season: Hot (H), Neutral (N) or Cold (C), according to the CGDD during this fixed period of 152 days, wich corresponds to the duration in CGDD from sowing to maturation in the study area with November 12 as the sowing date (INIFAP, 2017). In this way, a cold season is defined as when the CGDD value in the 152 days was from 1100 to 1200, a neutral season when the value was from 1201 to 1300 GDD and for a hot season when the value was more than 1300 CGDD.

Historical series of Landsat images
A total of 116 Landsat satellite images (5, 7 and 8) were downloaded for the analysis period (USGS, 2018) (https://glovis.usgs.gov/), which were corrected radiometrically and atmospherically to improve both, position and radiometric quality, according to the methodology proposed by Chander et al. (2009) for Landsat 5 and 7, and USGS (2019) for Landsat 8 (OLI/TIRS). The annual period analyzed was seven months (November to May), which corresponds to the main agricultural season Autumn-Winter (AW) of the study area, where the presence of clouds is reduced. Fig. 2 shows the type and number of Landsat images used.

Historical relationship VIs-CGDD
In order to analyze the historical behavior of VIs-CG-DD, the first step was to estimate the CGDD value in each reference farm for each available satellite image. Complementary data, as sowing, harvest and irrigation data, as well as agronomic and management data, were extracted from the SPRITER system as detailed and applied by Ojeda-Bustamante et al. (2007) in the study zone. To estimate the CGDD, a historical series (1998-2018) of air daily mean temperature (Ta) was used from an automated meteorological station operated by the Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias (INI-FAP) located in the center of the study area. The GDD calculation was performed using the following equation applied in the study area (Ojeda-Bustamante et al., 2006): where GDD are the daily growing degree days (ºC day -1 ), T a is the average daily air temperature calculated from records of 15 minutes (ºC), Tc max is the maximum critical temperature of the corn crop (30 ºC) and Tc min is the minimum critical temperature of the corn crop (10 ºC). With CGDD values, the phenological development (PD) was calculated as a development ratio using equation (2): where CGDD ID are the GDD accumulated from the sowing date to the interest date (ºC) and GDD Maturity is the GDD accumulated from the sowing date to crop physiological maturity (1451 CGDD) (ºC).
After that, using the raster calculation and "Area Statistics Calculation" tools of the QGIS (2019), the VI averages for each reference farm on each image date were calculated to be associated with the CGDD. The expressions for the calculation of the VIs used were those indicated by Rouse et al. (1974) for the NDVI (3) and by Huete et al. (2002) for the EVI (4).
where ρ NIR is the average reflectance in the near-infrared band (840 to 880 nm), ρred is the average reflectance in the red band (620 to 670 nm) and ρ blue the average reflectance in the blue band (460 to 480 nm).
Finally, a statistical fitting analysis was done among the values of the two VIs with PD. Table 1 shows the CGDD intervals with their corresponding phenological phases and PD values for each range for most hybrids in the study area, characterization based on the methodology reported by Ritchie et al. (1992) and in the studies carried out by Ojeda-Bustamante et al. (2006).

Methodological approach
The proposed methodology is based on the estimation of PD as a function of VIs, using historical series of Landsat satellite images and phenological field data. The step sequence was the following (as shown in Fig. 3): 1. Compilation of climate data and reference farm data. 2. Download the Landsat image series for the dates of interest from a remote sensing website with historical satellite images (i.e. https://glovis.usgs.gov/). 3. Calibration and radiometric correction of satellite images according to the methodology described previously.

Calculation of VIs with Geographic Information
Systems (GIS) software for each reference farm data using the corresponding multispectral-satellite image (i.e. QGIS). 5. Estimation of PD using reference-farm data based on CGDD with equation (2) and Table 1 using several agricultural years (Table 2). 6. Estimation of PD as a function of VIs using the fitted quadratic equations (Table 3) or using Table 4. Make sure to locate PD in the correct zone of crop development, in the ascending zone (I) if current VI (VI i ) is greater than previous VI (VI i-1 ), in the descending zone (II) if current VI (VI i ) is less than previous VI (VI i-1 ).

Validation
The phenological phases determined with CGDD (from sowing to image date) were compared against the stages observed weekly in the fields. This was done for 19 of the 30 reference plots established with corn during the AW 2017-2018 season. Field information from these plots (farms) was also used to validate the spectral models. The accuracy of the large-scale methodology was determined in the total of corn farms in two different dates of the IM Batequis for the agricultural season AW 2017-2018. It was determined by comparing the observed PD with the predicted by the model.

Results and discussion
Historical behavior of crop development Fig. 4 shows the behavior of the historical development of the corn crop in the reference farms associated with CGDD. Each point is the value of CGDD from sowing to the date of the image. The high variability in the behavior of the series is related to climatic variability among years and sowing dates, which is consistent with Ojeda-Bustamante et al. (2014), who estimated increase of at least 1.2 ºC in the annual mean temperature for northern Sinaloa for the first third of this century with respect to the base period , as a result of a possible effect of climate change. In C (Cold) years, the series' trend showed a lower slope due to a lengthening of the season, in H (Hot) years the slope is more vertical as a result of a shortening of the season, while for N (Neutral) Harvest stage (H) 1600-1700 1.10-1.17 Figure 3. Sequential scheme of the generated methodology seasons, the slope is intermediate. Table 2 shows a more detailed historical analysis, where the effect of the sowing date and temperature on the duration of crop phenological season is observed. For early (October), typical (November) and late (December) sowing periods, it was possible to quantify the CGDD historical variation and the mean temperatu-re (Tm), calculated for a fixed interval of 152 days from sowing to physiological maturity, as well the elapsed time expressed in days after sowing to maturity (DAS). In the early and late sowing periods, high CGDD values were observed due to the effect of high temperature at the beginning and the end of crop seasons respectively, which reduced the duration of crop phenological season. On the Table 3. Parameters for equation PD=f(VI)=α 1 ±[α 2 +(α 3 *VI)] 0.5 form, for VI≤VImax , otherwise PD=PD VT , to estimate phenological development (PD) as a function of vegetation indices (VIs) considering two zones of corn development. Zone I: equation with negative sign (-) for the square root. Zone II: equation with positive sign (+) for the square root.  Monitoring corn phenology in large agricultural areas using remote sensing data series other side, the typical or conventional sowing dates produced the largest crop duration due to colder temperatures. The warming trend found in the last five years compared to previous 15 agricultural years, had a reduction of up to 24 days, which represents a 13% reduction when compared to the phenological duration at the beginning of the analyzed period; this effect was also reported by Castillo & Ibáñez (2017). In consequence, there is an inverse relation between Tm and CGDD, the more Tm is related to a lesser CGDD. The high inter and intra-seasonal variability suggest an improvement crop phenology monitor to better coupling crop water demands, in large irrigation zones. The comparison between the phenological development estimated with CGDD in the dates of the images and the observed growth in the sowings of 19 farms for the AW season 2017-2018 is shown in Fig. 5, where a high correlation can be observed between the values observed with those estimated with an RMSE of 0.029 corresponding to 2.9% (4.5 days) of the total development; the value of one corresponds to the physiological maturity of the crop and values higher than this are farms close to harvest.  Spectral models Fig. 6 shows the PD as a function of the two VIs. It is observed a points dispersion due to variability in crop management and crop stress; Viña et al. (2004) mentioned that the use of VIs through remote sensors makes it possible to detect some symptoms of crop stress that most crop growth and development models do not recognize. A higher dispersion range can be observed in the zone of maximum development or peak zone in EVI (0.6-1.0) than in NDVI (0.6-0.8), probably due to the higher capacity of EVI for the detection of stressed fields in the same phase of crop development. It is also observed that after the maximum value (PD = 0.61), the VIs values have a decreasing behavior and are similar to the values before the maximum PD. To express PD as a unique function of VIs, the curve was divided into two zones (Fig. 6): zone I (upward, from sowing to VI peak) and zone II (falling, from VI peak to harvest). The fitted model for the calculation of PD as a function of VIs for the two zones were derived from the roots of a second-order equations (obtained from PD-VIs data), where VI corresponds to the independent variable and the PD to the dependent variable. Equation parameters for both zones are indicated in the Table 3.

PD = f (VI)
Another option to calculate PD as a function of VI is by using a tabular format. Table 4 shows the interval of NDVI and EVI values for each interval of phenological stages as well as the values of PD in the interval, which were obtained from Fig. 6; the location zone is also indicated. The peak VIs, estimated with the fitted model, are also reported. The first four phases the values of NDVI are slightly higher than those of EVI, while from the fifth to the ninth phase EVI values are greater than NDVI values; while in the last four phases EVI was lower than NDVI. This behavior was also reported by Wei et al. (2019), indicating that it was due to the higher sensitivity of the EVI to high biomass, bare soil, and senescence than the NDVI. EVI showed higher response with high biomass for PD = 0.54 (800 CGDD) and in the senescence stage of the crop from PD = 0.69 after 1000 CGDD. This is also seen in Fig. 7 representing the relationship of the two indices studied, as suggested by Huete et al. (2002) and Jensen (2007). This figure also indicates the relevance of estimating either of the two indices from the value of one of these, due to its high determination coefficient.

GIS-based phenology monitoring
To implement the spectral models in a GIS an algorithm was generated, to automatically estimate PD for each cornfield of the IM, from VIs values calculated using multispectral-satellite images. To find out the zone ( Fig. 6 and Table  4), an indicator was obtained when subtracting the value of a previous date image (VI i-1 ) with the current VI i ; positive values correspond to Zone I and negative values correspond to Zone II. An algorithm was implemented in Visual Basic language to calculate the PD in the total of cornfields as a function of the VIs, subsequently these results were incorporated into the GIS to visualize them spatially.

Methodology precision
The validation results of the models in the 19-reference farms for AW 2017-2018 season was done by using the  Table 3 and shown in Fig.  8. A determination coefficient (R 2 ) of 0.9154 for the NDVI model and of 0.9124 for EVI, with RMSE values of 0.1307 and 0.1219, respectively, were estimated. The values indicated good precision for both models, however the effect of the point´s dispersion and the spatial and temporal variability in the 19 farms distributed in the 20,000 ha of the study area was also reflected, mainly when the value was greater than peak value estimated by the model. This influenced the precision to estimate phenology as reported by Burke & Lobell (2017) and Hufkens et al. (2019). Fig. 9 shows the actual phenology (a) and the validation of the spectral models (b), at IM to two contrasting moments of crop development (beginning and end of the season) generated with a GIS, and using Table 4 in the algorithm. At the first moment (January 11, 2018), Fig.  9 (above), the corn farms were predominantly in phases V6-V8, V8-V11 and V11-VT. In a scenario of normal water availability, this information is very useful for the authorization by the WUA for the second and third irrigation since these should coincide with stages V6-V8 and V11-VT respectively as recommended by INIFAP (2017) and Ojeda-Bustamante et al. (2006); under restricted water availability scenarios, this approach can have greater relevance since the IM operated by a WUA has the recurrent need to optimize the number of needed irrigations  and distribute them in the most sensitive phases to water stress without affecting the yield. The validation in this date indicated that compared with the actual phenology, 92.1% of the farms coincided with that generated by both spectral models, 2.5% with NDVI model, 3.1% with EVI model and 2.3% with neither model. On the second date (April 17, 2018) as Fig. 9 (below) shows, the predominant phases were from R2-R3 to R6 and farms close to being harvested. This is important to schedule the last irrigation and to estimate farms close to phenological maturity. The validation indicated that compared with the actual phenology, 86.6% of the monitored farms coincided with the phenology generated from both models, 2.7% with the NDVI model, 1.8% with EVI model and 8.9% with neither model. This decrease in precision of the models and the increase of the error with respect to the first date may be due to the presence of har-vested farms o in senescence, with weeds and the effect of clouds, as reported by Whitcraft et al. (2015).

Conclusions
The spatio-temporal inter and intra-seasonal variability of corn phenological development, justifies the use of remote sensing data to monitor crops over large agricultural areas. The historical behavior of two VIs (NDVI and EVI), associated with the phenological development (PD), showed differences in the early and late stages of development, having EVI higher sensitivity. An algorithm and methodology for monitoring corn phenology on large scale using both field data and remote sensing data were generated and integrated with the GIS-based phenology monitoring. The validation of the methodology in reference fields indicated an acceptable correlation between the observed PD and the estimated PD for the two VIs with R 2 = 0.9154 and 0.9124 for NDVI an EVI respectively, and RMSE values of 0.1307 for NDVI and 0.1219 for EVI, however, the effect of point´s dispersion and spatial variability was also observed, which can be reduced by using homogeneous areas. On the other hand, the validation of methodology at large-irrigated scale, using field data on two contrasting dates, was more practical and precise due to the points dispersion effect was reduced by using values interval. This model indicated that at the beginning of the crop season the precision to monitoring the phenological phases was of more than 92% with the two VIs and decreased to 86.6% in the end crop season, mainly due to the presence of harvested farms with bare soil o in senescence.