Vineyard area estimation using medium spatial resolution satellite imagery

The European Union requires member states to estimate their wine growing potential. For this purpose, most member states have developed or updated vineyard registers. The present study suggests locating vineyards using medium spatial resolution satellite imagery. The work was carried out using Landsat images that were validated for the Designation of Origin «Bierzo», León, Spain. The methodology described in this paper yields a producer's accuracy of 0.88 and a user's accuracy of 0.63. The vineyard areas for each municipality were estimated from the classified images by linear regression, with fits of R > 0.80. The method gives good results at the municipal scale.

Additional key words: remote sensing, vineyard area estimation, vineyard inventorying, vineyard register.


Introduction
The vineyard register (VR) is a tool for managing and controlling the Common Organisation of the Market in Wine (COMW) in the European Union (EU). The purpose of this tool is twofold: to compile an exhaustive inventory of wine production potential and to meet administrative and control requirements. In order to better identify vineyard parcels, some Member States, including Spain, decided to use aerial photographs in addition to the existing cadastre or land registry maps. The data contained in the VR should provide Member States and EU authorities with information to better manage the COMW by estimating potential production, actual production and current stocks of wine, and by keeping records of the declarations and controls of measures (Masson and Leo, 2002). The VR must be maintained and regularly updated to serve its main aim. The present work focuses on providing a tool for updating the VR using remote sensing imagery.
Vineyard identification by remote sensing is beset with difficulties, such as the presence of discontinuous cover within areas planted with a given type of crop. In these areas, the influence of the soil on the image is very noticeable. The arrangement of vines requires the use of high spatial resolution sensors, at a high cost. However, experiments have been performed using medium spatial resolution imagery with good results (Hall et al., 2002). The results obtained from remotely sensed imagery were strongly influenced by local conditions, and the spectral response for the type of vine was influenced by the physical and chemical characteristics of the soil (Arán et al., 2001).
One of the first experiments that classified vines using Landsat images was performed in the state of New York (Trolier et al., 1989). In this experiment, the maximum likelihood classifier was applied to four Landsat images acquired at the beginning and end of the growth period (June-August). Accuracy ratings of 0.88 were achieved. However, the experiment discriminated only six types of soil cover and was carried out in areas with large vineyards. The authors recommended that classifications be made using images acquired at the period of maximum growth (mid-July), and that the likelihood classifiers be supplemented with additional techniques such as the use of masks to incorporate land use cartography. Williamson (1989) compared two different images of irrigated vineyards located at Riverland, Australia, acquired with SPOT HRV (pixel size of 20 m) and Daedalus 1268 ATM (pixel size of 10 m and 11 bands). The study showed that the results obtained with SPOT under these conditions (accuracy rate of 0.85-0.95) were as good as the results obtained for the Daedalus 1268 ATM image. Average plot size (16 ha), image acquisition date and crop type were among the factors that controlled classification accuracy, and most of the information in the images was contained in the red and infrared bands. Therefore, the use of three or more bands did not improve accuracy.
Another way to tackle the problem is to use multitemporal studies, change analysis and a combination of multispectral bands. In this regard, mask classification methods have been applied to Landsat-5 imagery (García and García, 2001; Lanjeri et al., 2001, 2004). The methodology consists in classifying the different land uses in successive stages: non-irrigated crops are determined by analyzing principal components; fallow land, woodland and olive groves are detected through supervised classifications of an image acquired in May; irrigated crops, urban areas and small lakes are detected using supervised classification; and the remainder is considered as vineyard. Therefore, each step in the process produces a mask for each discriminated cover.
Other classifications are based on the search for band combinations that optimize the separability of vines from other crops (Rubio et al., 2001). This technique has proved effective for vineyards in Tomelloso, Ciudad Real (Spain), with an accuracy of 0.91 that could be improved by using seasonal image analysis.
In Portugal, experiments have been carried out to locate vineyards by remote sensing (Bessa, 1994). The aim of that work was to develop a method for determining the area under vines using Landsat imagery and false-colour infrared photographs. The classified maps were combined in a Geographical Information System (GIS) with layers of the physical environment (topography, petrology, climatology, etc.). This methodology generated a map of the potential wine growing area in the region around Oporto. Along the same line, studies on crop evolution over a period of time have been developed, with the main aim of assessing the impact of the European Common Agricultural Policy (García and García, 2001; Lanjeri et al., 2004).
Other authors have used active sensors (radar) to classify vineyards (Company et al., 1994; Bugden et al., 1999). The results are quite good, with an accuracy of 80% for vineyard classification. However, these methods are very sensitive to the vine training system (goblet pruning, cordon, trellis, etc.). So far, the best results have been obtained by applying Fourier-transform based techniques to high-resolution aerial colour photographs (Ranchin et al., 2001; Robbez-Masson et al., 2001; Wassenaar et al., 2001), with overall accuracies over 0.82 (Kappa = 0.64), user's accuracies of 0.96 and producer's accuracies of 0.74. These techniques identify vines by their regular planting pattern, i.e. by the repetition of patterns such as vine shape, texture and orientation, rather than by their spectral response. The great advantage of this group of classifiers is that they allow for some automation of the process, thus avoiding the need for the photo-interpreter to carry out inconvenient visual analyses, point out training areas, etc.
More recently, Gong et al. (2003) proposed a method using airborne multispectral digital camera imagery. Raw images are processed to generate feature images including grey level co-occurrence based texture measures, low pass and Laplacian filtering results, Gram-Schmidt orthogonalization, principal components, and NDVI. The procedure averaged an overall accuracy of 0.81 for the six digital images tested. Rabatel et al. (2008) proposed an automatic methodology for vineyard detection in aerial images (pixel size: 0.5 m) based on Gabor filters. The detection process was assessed in a vine cultivation zone of France and more than 84% of vineyards were detected.
Given the need for vineyard inventories, the EU started to apply space technologies for providing independent and timely information on crop areas and yields. In the 1990s, the «Vinident Study» (included in the Monitoring of Agriculture with Remote Sensing, MARS, project) started the evaluation of vineyard identification using aerial photographs (Masson and Leo, 2002). The «Bacchus Project» is a more recent work funded by the European Community, whose main scientific aim is to provide methodologies for vine area location, parcel identification and vine description based on the use of very high resolution remote sensing data and GIS (Fuso et al., 2004; Montesinos and Quintanilla, 2006). The expected results are optimal, but the maintenance cost of using very high resolution imagery makes the Bacchus procedure unfeasible for extensive areas.
The general aim of this study is to propose a method to improve the current methods used in vineyard registers by using medium-spatial-resolution multispectral satellite imagery. Such a method must be simple, feasible, economically viable and applicable to other wine growing areas, since vineyard registers need continuous updating. The proposed method was applied to the Designation of Origin «Bierzo» (DO Bierzo) (Fig. 1).

Study area description
The study area was the district of El Bierzo (Fig. 1), which is located to the West of the province of León, Spain, and covers 2,954 km² (18% of the province). Because El Bierzo is the most important district for wine production in the province, the wine produced in the district was awarded the status of quality wine grown in specific regions on December 11, 1989, whereby the DO Bierzo was established.
From 2003 to 2007, the average production has increased to 22,458,828 kg of grapes and 15,721,180 L of wine per year. There are 4,183 wine growers and about 3,890 ha of vineyards registered in the DO Bierzo. Furthermore, the 55 wineries registered in the district have sold 6,899,368 bottles of wine per year, 3% of which have been exported to Denmark, Germany and the USA, among other countries.
Environmental conditions in El Bierzo allow for growing wine grapes: an average altitude of 600 m and an Atlantic microclimate (average temperature of 12.3°C and average rainfall of 721 mm) favour the quality of cultivars such as Mencía and Godello. Bud break usually occurs in mid-March and bloom occurs one month later. By the first of June, leaves are fully expanded and maximum growth occurs 25 days later. Crops are harvested during early September or October depending on seasonal climate conditions. Senescence begins in October and the dormant period extends until the end of February.
The standard vine spacing is 1.2-1.5 m and plant density is generally 4,500-6,900 vines ha⁻¹ depending on the training system. Vineyards are very old in this area, and 90% of them have existed for over 40 years. Yields are fairly high compared with the yields in the rest of the province, with an average of 6,000-8,000 kg ha⁻¹.
Wine grape growing in El Bierzo has serious structural difficulties because the systems used and the spacing of vines are not appropriate for the application of new viticulture techniques, and because the average parcel size is 0.20 ha. Yet, in the last few years, small wineries have specialized in quality wines and compete with other wines offering good value. In new vineyards, vines are planted at a spacing of 3 × 1.2 m, grown using a trellis system and trained with unilateral pruning cordon.

Images and maps
Two Landsat subscenes were used: a Thematic Mapper (TM) subscene acquired on June 25, 2000 (Image 1) and an Enhanced Thematic Mapper Plus (ETM+) subscene acquired on September 5, 2000 (Image 2). Both images (Table 1) belong to path 203/row 30, and are Level 1G System Corrected products. These images (not included) allowed monitoring the twenty municipalities included in the DO Bierzo (Fig. 1), which account for 97% of the vineyards in the district.
Digital vector maps and colour orthophotos were used to georeference images, select training areas and validate the classifications. Cartographic data were projected in UTM coordinates on the European Datum 1950 (ED-1950; UTM zone 30N).

Image pre-processing
The Landsat images were calibrated: raw digital values were converted to reflectance using the information included in the header files, following standard procedures (Markham and Barker, 1986). The atmospheric effects were corrected by obtaining normalisation functions from the radiometric responses of invariant cover classes for both images. After atmospheric correction, the images were co-registered and georeferenced using the control point procedure and resampling by nearest neighbour. Root mean square (RMS) errors for correction were lower than one pixel for both images, which ensures the spatial integrity required by the proposed method. All the images were processed using ENVI 3.4 software (Research Systems, Inc.; www.RSInc.com).
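The two radiometric steps described above can be illustrated with a minimal sketch. It assumes the usual two-stage conversion (digital number to radiance with a band gain and bias, then radiance to top-of-atmosphere reflectance) and a simple least-squares line as the normalisation function; the function names, parameter names and sample values are hypothetical and are not taken from the paper or from ENVI.

```python
import numpy as np


def dn_to_toa_reflectance(dn, gain, bias, esun, sun_elevation_deg, earth_sun_dist_au):
    """Convert raw digital numbers (DN) of one band to top-of-atmosphere reflectance."""
    # Step 1: DN -> at-sensor spectral radiance using the band gain and bias
    radiance = gain * np.asarray(dn, dtype=float) + bias
    # Step 2: radiance -> reflectance, correcting for solar geometry and distance
    sun_zenith = np.deg2rad(90.0 - sun_elevation_deg)
    return np.pi * radiance * earth_sun_dist_au ** 2 / (esun * np.cos(sun_zenith))


def fit_normalisation(subject_values, reference_values):
    """Fit reference = slope * subject + intercept over invariant-cover pixels."""
    slope, intercept = np.polyfit(subject_values, reference_values, deg=1)
    return slope, intercept


# Hypothetical reflectances of the same invariant targets in the two images
september = np.array([0.05, 0.12, 0.30, 0.41])
june = np.array([0.06, 0.13, 0.31, 0.42])

slope, intercept = fit_normalisation(september, june)
september_normalised = slope * september + intercept
```

In practice the gain, bias, solar irradiance and sun elevation would come from the image header files, and one normalisation line would be fitted per band.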

Image classification
Visual analysis (3/2/1 and 4/3/2 compositions) allowed locating training areas and identifying land cover. Different supervised classification algorithms were tested for both images, including hard classifiers (parallelepiped, minimum distance, Mahalanobis distance and maximum likelihood) and thresholds determined by vegetation indices. The vegetation indices considered were the Normalized Difference Vegetation Index (NDVI; Rouse et al., 1974) and the Soil-Adjusted Vegetation Index (SAVI; Huete, 1988). NDVI [Eq. 1] is related to vegetation greenness and leaf area index:

$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{R}}{\rho_{NIR} + \rho_{R}} \tag{1}$$

where $\rho_{NIR}$ and $\rho_{R}$ are reflectances in the near-infrared and red bands, respectively. SAVI was developed to decrease the noise in vegetation response due to soil background effects, which was expected to improve vineyard identification. The SAVI equation [Eq. 2] introduces a soil-brightness dependent correction factor (L) that compensates for the difference in soil background conditions:

$$\mathrm{SAVI} = \frac{(\rho_{NIR} - \rho_{R})\,(1 + L)}{\rho_{NIR} + \rho_{R} + L} \tag{2}$$

Surveys and visual analysis defined the training areas for each cover, homogeneously distributed throughout the orthophotographs and overlapping images. Altogether, 1,303 training areas with an average area of 2.02 ha were defined. For vineyards, there were 23 newly planted areas, 72 areas with wide-distance planting and 172 with short-distance planting, which is the commonest type in the district. To compare results, the same training areas were chosen from the two images. The separability of the classes was determined by calculating the transformed divergence and the Jeffries-Matusita distance for each pair of covers (Jensen, 2000). The values of these parameters were acceptable (between 1.5 and 1.8) for most of the pairs studied, except for the separability of the three classes of vineyards from bushes and from one another.
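As an illustration of the two vegetation indices, the sketch below computes NDVI and SAVI from red and near-infrared reflectances using the standard published forms of both indices. The value L = 0.5 is the commonly used default and is an assumption here, since the text does not state the value applied; the sample reflectances are hypothetical.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectances."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor is the correction factor L."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical reflectances for a single sparsely vegetated pixel
example_ndvi = ndvi(0.5, 0.1)
example_savi = savi(0.5, 0.1)
```

Both functions accept whole band arrays as well as scalars, so a threshold classification reduces to a comparison such as `ndvi(nir_band, red_band) > threshold`, where the threshold would have to be chosen from the training areas.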
Then, the training areas were overlaid with the images to define the spectral signature for all eighteen classes; therefore, georeferencing had to be accurate and completed prior to classification.
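The Jeffries-Matusita separability between two spectral signatures can be sketched as follows. This is a minimal illustration that assumes each class is normally distributed and uses the usual Bhattacharyya-based formulation, in which the distance ranges from 0 (identical classes) to 2 (fully separable); the training samples are synthetic and the function name is hypothetical.

```python
import numpy as np


def jeffries_matusita(samples_a, samples_b):
    """JM distance between two classes given (n_pixels, n_bands) training arrays."""
    mean_a, mean_b = samples_a.mean(axis=0), samples_b.mean(axis=0)
    cov_a = np.cov(samples_a, rowvar=False)
    cov_b = np.cov(samples_b, rowvar=False)
    cov_mean = (cov_a + cov_b) / 2.0
    diff = mean_a - mean_b

    # Bhattacharyya distance: a mean-separation term plus a covariance term
    mean_term = diff @ np.linalg.solve(cov_mean, diff) / 8.0
    _, logdet_mean = np.linalg.slogdet(cov_mean)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    cov_term = 0.5 * (logdet_mean - 0.5 * (logdet_a + logdet_b))

    return 2.0 * (1.0 - np.exp(-(mean_term + cov_term)))


# Synthetic three-band training samples for two hypothetical cover classes
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(200, 3))
class_b = class_a + 100.0  # same spread, very different mean

jm_identical = jeffries_matusita(class_a, class_a)
jm_separable = jeffries_matusita(class_a, class_b)
```

Applied to every pair of the eighteen signatures, such a function would yield the pairwise table from which poorly separable pairs, such as the vineyard classes and bushes, can be identified.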
To improve the expected results, the most accurate classifications of the June and September images were combined (called Image 1&2), such that each pixel was assigned a cover class if the cover class was the same for both classifications.
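The agreement rule used to build Image 1&2 can be expressed in a few lines. The sketch below assumes both classifications are co-registered integer rasters of the same shape and that the code 0 marks unclassified pixels; that code and the function name are hypothetical.

```python
import numpy as np

UNCLASSIFIED = 0  # hypothetical code for pixels left without a class


def combine_classifications(class_june, class_september, unclassified=UNCLASSIFIED):
    """Keep a pixel's class only where both classified images agree."""
    june = np.asarray(class_june)
    september = np.asarray(class_september)
    if june.shape != september.shape:
        raise ValueError("both classified images must have the same shape")
    return np.where(june == september, june, unclassified)


combined = combine_classifications([[1, 2], [3, 4]], [[1, 5], [3, 0]])
```

Because every disagreement becomes an unclassified pixel, the rule trades coverage for reliability, and any misregistration between the two dates directly increases the unclassified count.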
The classified images were subjected to an accuracy assessment to compare different classification algorithms and to obtain a level of confidence for each method. The sampling unit used was the pixel. The size of the sample was calculated at 0.95 probability ($z_{\alpha/2} = 1.96$), with a maximum allowable error (e) of 0.05, assuming that the error/accuracy prediction possibility (p = q) was 0.5 [Eq. 3]. The total number of pixels to sample (n) was 385:

$$n = \frac{z_{\alpha/2}^{2}\, p\, q}{e^{2}} \tag{3}$$

Test pixels were defined by a 700 × 700 m grid on the classified image that was then overlapped with orthophotos to check the cover class assigned. The total number of pixels sampled was 2,559, of which 120 were vineyard, which is a sufficiently representative number for the validation designed.
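The minimum sample size can be checked numerically. The sketch assumes that Eq. 3 is the usual sample-size expression for a proportion, n = z²pq/e², rounded up to the next whole pixel; the function name is hypothetical.

```python
import math


def minimum_sample_size(z=1.96, p=0.5, max_error=0.05):
    """Sample size for estimating a proportion p within max_error at the given z."""
    q = 1.0 - p
    return math.ceil(z ** 2 * p * q / max_error ** 2)


n_required = minimum_sample_size()
```

With the values given in the text the expression evaluates to 384.16, which rounds up to the 385 pixels reported; the 2,559 pixels actually sampled comfortably exceed this minimum.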

Vineyard area estimation
With a view to assessing the usefulness of the proposed method to estimate vineyard area, and to describe and analyse the time series for the crop, the classifications were compared with the official statistics for the municipalities. This information was required for use in procedures that allowed for a rapid estimation of vineyards and their evolution over time. The methodology proposed to achieve this goal consisted in determining whether there was any correlation between the vineyard areas obtained from the classifications and the areas obtained from official statistics. The municipal area of vineyards was estimated by applying a mask for each municipality to each classified image.
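The municipal masking step can be sketched as a pixel count per municipality converted to hectares. The example assumes a 30 m pixel (the nominal Landsat TM multispectral resolution, which is not stated in the text), a raster of municipality identifiers aligned with the classified image, and 0 as the code for pixels outside the study area; all names and class codes are hypothetical.

```python
import numpy as np


def vineyard_area_by_municipality(classified, municipality_ids, vineyard_classes,
                                  pixel_size_m=30.0, outside_id=0):
    """Return {municipality id: vineyard area in ha} from two aligned rasters."""
    classified = np.asarray(classified)
    municipality_ids = np.asarray(municipality_ids)
    pixel_area_ha = pixel_size_m ** 2 / 10_000.0
    is_vineyard = np.isin(classified, list(vineyard_classes))

    areas = {}
    for muni in np.unique(municipality_ids):
        if muni == outside_id:
            continue
        # Mask of this municipality applied to the vineyard pixels
        n_pixels = np.count_nonzero(is_vineyard & (municipality_ids == muni))
        areas[int(muni)] = n_pixels * pixel_area_ha
    return areas


# Tiny hypothetical rasters: classes 1 and 2 stand for vineyard classes
classified_image = [[1, 1, 2], [3, 1, 2]]
municipalities = [[1, 1, 1], [2, 2, 0]]
areas_ha = vineyard_area_by_municipality(classified_image, municipalities, {1, 2})
```

The resulting per-municipality areas are the values that would then be compared with the official statistics.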
Comparisons were made for each municipality.If strong correlations were found, the next step was to fit the equations for vineyard area estimation as a function of the results of the classification.
An ANOVA verified the proposed fits statistically. The Durbin-Watson statistic was calculated to test the hypothesis of no autocorrelation in the regressions between the estimated vineyard areas and the official statistics.
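A minimal sketch of the regression step is given below. It assumes ordinary least squares with the official area as the dependent variable and the classified area as the predictor, and it computes the Durbin-Watson statistic on the residuals in the order the municipalities are listed; the data are invented and the function name is hypothetical.

```python
import numpy as np


def fit_area_regression(classified_ha, official_ha):
    """Fit official = a + b * classified and report R^2 and Durbin-Watson."""
    x = np.asarray(classified_ha, dtype=float)
    y = np.asarray(official_ha, dtype=float)
    b, a = np.polyfit(x, y, deg=1)  # slope first, then intercept
    residuals = y - (a + b * x)

    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    # Durbin-Watson: values near 2 suggest no first-order autocorrelation
    durbin_watson = np.sum(np.diff(residuals) ** 2) / ss_res
    return {"a": a, "b": b, "r2": r_squared, "dw": durbin_watson}


# Invented areas (ha) for five hypothetical municipalities
fit = fit_area_regression([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 8.8, 11.0])
```

The Durbin-Watson statistic lies between 0 and 4, and its interpretation depends on the ordering of the observations, which for municipalities has no natural sequence; this is a limitation of the sketch rather than a statement about the authors' procedure.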

Results and Discussion
The results of the different classification methods are listed in Table 2. The best overall accuracies were obtained by maximum likelihood, so the maximum likelihood classifications were used to produce Image 1&2. With regard to vineyard classification, producer's accuracy increased significantly but commission errors were over 0.70 for all six methods (results not shown). Neither NDVI nor SAVI improved the classification results because both indices were statistically similar for vineyards and pastures, bushes and reforestation areas in the June image, and for non-irrigated crops and pastures in the September image (Rodríguez-Pérez et al., 2002; Rodríguez-Pérez, 2004).
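For reference, the maximum likelihood decision rule can be sketched as follows. The example assumes Gaussian class signatures (mean vector and covariance matrix estimated from the training pixels) and equal prior probabilities for all classes; it is an illustration of the general technique, not the ENVI implementation, and the training data are synthetic.

```python
import numpy as np


def train_signatures(samples_by_class):
    """Estimate (mean, covariance) per class from (n_pixels, n_bands) arrays."""
    return {label: (x.mean(axis=0), np.cov(x, rowvar=False))
            for label, x in samples_by_class.items()}


def classify_max_likelihood(pixels, signatures):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    pixels = np.asarray(pixels, dtype=float)
    labels = list(signatures)
    scores = []
    for label in labels:
        mean, cov = signatures[label]
        diff = pixels - mean
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        scores.append(-0.5 * (logdet + maha))
    best = np.argmax(np.stack(scores, axis=1), axis=1)
    return np.asarray(labels)[best]


# Synthetic two-band training pixels for two hypothetical classes
rng = np.random.default_rng(0)
training = {
    1: rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2)),
    2: rng.normal(loc=[10.0, 10.0], scale=1.0, size=(50, 2)),
}
signatures = train_signatures(training)
labels = classify_max_likelihood([[0.1, -0.2], [9.8, 10.3]], signatures)
```

Classes whose signatures overlap strongly, as reported here for vineyards and grasslands, produce similar likelihoods, which is one way to understand the high commission errors.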
In order to analyze the contribution of the different cover classes, the confusion matrices that resulted from classifying the June and September images by maximum likelihood are shown in Tables 3 and 4, respectively. As suggested in both tables, vineyards were difficult to identify because crops (classes 1 and 2), grasslands (class 3), pastures (class 5), bushes (class 6) and bushes & trees (class 7) had similar spectral signatures. In the June image, field crops were confused with vineyards because they were spectrally similar, but the classification improved in September, when crop senescence began.
Grasslands kept spectral similarity with vineyards in both periods; in fact, the most important source of error for vineyard identification was grassland cover: 43 grassland pixels were classified as vineyard based on the June image and 42 grassland pixels were classified as vineyard based on the September image. Cover class 7 (bushes & trees) was the second source of error for vineyard confusion in September. Similarly, the algorithms studied were not useful in separating the three different classes of vineyards.
Table 5 presents the validation results obtained by combining the June and September classified images by maximum likelihood. Overall accuracy increases, as well as the user's and producer's accuracies for vineyards. The drawback of this method is that the number of unclassified pixels increases considerably, and that a very precise geometric correction is required to ensure the overlay of the two images.
Table 6 summarizes the accuracy indices derived from Tables 3, 4 and 5. Overall classification accuracy is 0.36 for Image 1 and 0.37 for Image 2. The κ coefficient is a measure of how well the classification agrees with the reference data. The κ values obtained were 0.30 for Images 1 and 2 and 0.50 for Image 1&2. With regard to vineyard classifications, 73% of the vines were well-classified using Image 1, 62% were well-classified using Image 2 and 88% were well-classified combining both classifications.
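The accuracy indices summarized above can be derived from an error matrix as sketched below. The layout follows the convention stated in the table captions (rows: classes from the imagery classification; columns: classes from the reference data) and assumes a square matrix; the matrix values are invented and the function name is hypothetical.

```python
import numpy as np


def accuracy_indices(error_matrix):
    """Overall, producer's and user's accuracies and kappa from an error matrix."""
    m = np.asarray(error_matrix, dtype=float)
    total = m.sum()
    diagonal = np.diag(m)
    row_totals = m.sum(axis=1)  # pixels assigned to each class by the classifier
    col_totals = m.sum(axis=0)  # pixels of each class in the reference data

    overall = diagonal.sum() / total
    producers = diagonal / col_totals  # 1 - omission error
    users = diagonal / row_totals      # 1 - commission error
    expected = np.sum(row_totals * col_totals) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return {"overall": overall, "producers": producers, "users": users, "kappa": kappa}


# Invented two-class error matrix (rows: classified, columns: reference)
indices = accuracy_indices([[50, 10], [5, 35]])
```

A class can therefore combine a high producer's accuracy with a low user's accuracy, which is the situation described for vineyards: most true vineyard pixels are found, but many pixels labelled as vineyard belong to other covers.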
These ratings are lower than the ratings obtained by other authors (Company et al., 1994; Bugden et al., 1999; Ranchin et al., 2001; Robbez-Masson et al., 2001; Wassenaar et al., 2001). Yet, these authors worked with high spatial resolution images. In addition, it must be taken into consideration that we have sought to differentiate among many types of cover and that the average vineyard plot size is very small, with over 97% of vineyards being smaller than 0.05 ha, 85% of which [...] of validation with isolated pixels: κ increased to 0.70-0.92 by conducting another validation with directed sampling using parcels (Rodríguez-Pérez, 2004). The producer's accuracy for vineyard classification was always over 0.62 (Table 6), although the commission error was quite high. Therefore, the probability that a pixel classified as vineyard was not actually vineyard was over 0.60. The main confusions between vineyard and other uses were observed for grasslands, pastures, croplands and bushes. Combining both images (Image 1&2) offered the highest overall accuracy (0.57) and κ (0.50), with one drawback: the number of unclassified pixels and the sample error increased. By combining both classified images, commission errors were reduced to 0.37, and producer's accuracy reached 0.88.
The areas occupied by the three classes of vineyards (by maximum likelihood algorithm) were determined (Table 7). Some differences were observed between the vineyard area estimates obtained from the June and September images. Such variations were due to the different dates and conditions under which the images were acquired.
Vineyard area was overestimated by the classifications based on either Image 1 or Image 2 (Table 8). According to data available from the National Institute of Designations of Origin (NIDO) (MAPA, 1999; JCyL, 2004), the vineyard area inventoried for the DO Bierzo (6,982.6 ha) was much lower than the area obtained by classification (11,538.8 ha). The main reason for such an overestimation was the confusion between new vines and bare soil, and between widely planted vines and other crops and bushes. In Image 1&2, the area obtained for each cover was smaller because there were many unclassified pixels (Table 8), and the total area estimated from the combination of the classified images was 3,612.2 ha; in absolute terms, the vineyard area estimated by this method was 3,370.4 ha (48%) smaller than the area reported in the official source.
Regarding the vineyard area at the municipality level, the correlations between the results of the classifications and the official data were calculated and their statistical significance was checked. Table 9 reports the Pearson correlation coefficients for the correlation analysis and shows a noticeably high level of correlation between the results obtained by classification and the official sources consulted (always over 0.90): by using the combination of both images, a very strong correlation (R = 0.95) with the municipal winegrowing area was observed.
Because of the strong correlation found between the winegrowing statistics and the results of the classifications, linear regressions were fitted between the official data and the data obtained from the classifications. Table 10 shows the most significant parameters of the regressions obtained. R² coefficients of over 0.80 were obtained for all the regressions, which suggests that the proposed linear regressions fit well.
The best fits were obtained by combining the two images (Image 1&2), achieving R² = 0.90. Regarding the Durbin-Watson test, there was no autocorrelation between each pair of variables considered in the different regressions (Table 10). For Image 1 and Image 2, the linear fits between each pair of actual and estimated areas were good, as the regressions show very high R² values. The main shortcoming of the proposed models is that the mean estimation errors were quite high.
Figure 2 shows the linear fits between the estimates of municipal vineyard area obtained from Images 1, 2 and 1&2, and the vineyard areas reported in NIDO (actual areas). Using Image 1 or Image 2 (Fig. 2A, 2B), the vineyard areas were overestimated because the commission errors were high (0.60 and 0.66, respectively). With regard to Image 1&2 and NIDO (Fig. 2C), the regression line shows a good fit. However, the line slope (b = 2.02) indicates that Image 1&2 underestimates the vineyard area by a factor of two. Some municipalities, such as Vega de Espinareda (18) and Noceda del Bierzo (14), did not fit the model, although the absolute vineyard areas estimated for both were good. The information for Cabañas Raras (4), Cacabelos (5) and Camponaraya (6) did not fit because of the lack of statistical data. Conversely, the areas under vine for Corullón (11), Ponferrada (15), Priaranza del Bierzo (16), Villadecanes (19) and Villafranca del Bierzo (20) were underestimated because many vineyards were classified as other croplands and because of how Image 1&2 was made (a pixel was classified as vineyard only if this was true in both Image 1 and Image 2).
Currently, vineyard inventorying requires comprehensive field surveys that must be carried out as required by European standards, which is costly in terms of time and money. The results shown indicate that the proposed methodology would considerably reduce the resources needed, since its use would allow field surveys to be restricted.
The main drawback of the model is that it cannot estimate the absolute value of vineyard area per municipality, because the standard errors of the estimate were 130.73 ha, 166.09 ha and 120.28 ha for Image 1, Image 2 and Image 1&2, respectively. Nevertheless, the proposed method is useful for estimating vineyard area in relative terms and for estimating variations in vineyards over time, since vineyard area variation data can be updated by classifying Landsat images and by applying the proposed model. Such a prediction would be used as an aid to field surveys, which could be restricted to areas of interest, requiring less time. Thus, the vineyard inventory process would become more efficient. In addition, the proposed method is quicker than the current digitization process. This method is useful for estimating vineyard areas at large scale, or for estimating the future evolution of the winegrowing area in El Bierzo.

Conclusions
This study proposes a methodology for estimating vineyard area using Landsat satellite imagery. The proposed methodology should be useful in developing a vineyard register for El Bierzo.
Accuracy improves by up to 30% by classifying two Landsat images of the same area acquired at two different times of the year, and by combining both images. Although some pixels remain unclassified, the proposed methodology allows for the estimation of vineyard area at the municipal level.
The vineyard area estimated for the municipalities included in the DO Bierzo is strongly correlated with actual vineyard areas from official data. Such a strong correlation enables the calculation of linear regressions that accurately estimate municipal vineyard area based on image classifications.
The proposed method has some advantages over the methods used currently for developing the vineyard register in terms of human and material resources, and enables immediate register update.
Future research work for improving this methodology will consist in using high spatial resolution imagery in order to enable vineyard area estimation at the parcel level.

Figure 1. The study area is the Designation of Origin Bierzo (DO Bierzo). This district is located to the North-West of the Region of Castilla y León, and to the North of León province (Spain).

Figure 2. Scatter plots showing relationships between official statistics data and estimated vineyard areas at the municipal level using Image 1 (A), Image 2 (B) and Image 1&2 (C). Continuous lines in plots represent the fitted linear models with their associated R², and dashed lines represent the ideal prediction (1:1). Numbers represent the names of the municipalities (see Fig. 1 and Table 8).

Table 1. Characteristics of the satellite images

Table 2. Summarized accuracies of classification methods

Table 3. Error matrix for Image 1 (in columns: classes from reference data; in rows: classes from imagery classification)

Table 4. Error matrix for Image 2 (in columns: classes from reference data; in rows: classes from imagery classification)

Table 5. Error matrix for Image 1&2 (in columns: classes from reference data; in rows: classes from imagery classification)

Table 6. Summarized accuracies of images classified by maximum likelihood

Table 7. Vineyard area obtained from the classifications (ha)

Table 9. Correlation between official statistics (NIDO) and classified vineyard area. ** Confidence level of 95%.

Table 10. Regressions between official statistics and classified vineyard area. a, b: linear correlation coefficients. R: Pearson correlation coefficient. R²: adjusted R². SE: standard error. D-WS: Durbin-Watson statistic at a confidence level of 95%.