Assessment of post fire vegetation cover using spectral mixture analysis . Application and comparison of different endmember characterization methods

The analysis of satellite images allows one to monitor the regeneration of vegetation after a fire. In this work, a methodology for quantifying post fire vegetation cover was developed. The proposed methodology is based on the examination of Landsat 7 ETM+ images by using Spectral Mixture Analysis (SMA) and involves the following steps: a) pre-processing, b) inherent dimensionality image determination c) endmember characterization following two methods that thus lead to different models: traditional method based on the knowledge of the area worked as well as Minimum Noise Fraction and Pixel Purity Index method, d) model inversion and combination, e) comparison between the vegetation cover estimated by each model and that measured in field, and f) selection of the most accurate model and mapping of the vegetation cover for the study area. The cover estimated provided by the different models exhibited a high correlation (Spearman’s correlation coefficient >0.89). The average absolute difference between the estimated and field-measured vegetation cover obtained with the most accurate model for each fire never exceeded 6%.


Introduction
Fire, whether natural or man-made, has shaped most Mediterranean ecosystems.Wildfires have diverse, varying effects on the ecosystem depending on the characteristics (physiographic, climatic, and biotic) of the medium, the fire pattern (intensity, frequency, time of year, fire type) and the weather conditions after the fire.Accurate assessments of vegetation recovery in burnt areas requires not only a qualitative analysis (species, communities), but also the determination of abundance (cover, Leaf Area Index, biomass).
The analysis of satellite images allows one to monitor post fire vegetation cover.A number of authors have sought a correlation between vegetation cover and some index derived from the combination of the image bands of a multi-spectral sensor, generally of the Landsat or NOAA sensors (Viedma et al., 1997;Díaz and Pons 1999).The most widely used index for studying regeneration processes is the NDVI (Normalised Difference Vegetation Index); some authors, however, have obtained good results with an index derived from TM4 and TM5 bands (Fiorella and Ripple 1993) or TM4 and TM7 Landsat image bands (Lopez and Caselles 1991).In the quantitative research of vegetation in Mediterranean areas, where the scattered type is abundant, the contribution of the soil to the reflectivity of the pixel is essential, and this limits the use of the traditional vegetation index (Mustard andSunshine 1999, McGwire et al., 2000).
Spectral Mixture Analysis (SMA) is a model based on the linear mixing of two or more pure spectral endmembers (Adams et al., 1986, Gillespie and Smith 1990, Adams et al., 1993, Roberts et al., 1997a;2002).These techniques assume that a pixel can be defined in terms of the proportion of each pure component (endmember) it is composed of and have some advantages over the use of vegetation indexes, especially prominent among which is the fact that the fraction of endmember «vegetation» determined by SMA is consistent with the green vegetation cover projected (McGwire et al., 2000) and that the ratio between both variables is linear and less sensitive to the effect of soil than the NDVI is (Mustard et al., 2001, Small 2001).SMA has become an essential tool for remote sensing vegetation analysis.It has been used to derive the fractional contributions of endmember materials to image spectra in a wide variety of applications as vegetation cover (Cross et al., 1991), mapping land degradation (Metternicht and Fermont 1998;Haboudance et al., 2002), land cover change (Elmore et al., 2000, Roberts et al., 1997a, 1997b, 2002, Rogan et al., 2002) and regeneration after disturbance (Riaño et al., 2002) from images obtained by using mostly multispectral and, to a lesser extent, hyperspectral, sensors (Rambal et al., 1990, Adams et al., 1993, Mustard et al., 2001).Knowledge of biophysical variables (e.g.cover, biomass, Leaf Area Index) is required (Fiorella and Ripple 1993).Recent studies have shed some light on the relationship between such variables and the cover estimate provided by SMA in a specific situation (Small, 2002).
The main purpose of this work was to develop a methodology for the mapping of post-fire vegetation cover from Landsat ETM+ images by using Spectral Mixture Analysis.Specifically, we examined various endmember characterisation methods.Also, we assessed the feasibility of applying them to lithologically non-uniform areas.

Study areas
Two different areas, both of which had been devastated by wildfires in the past decade, were studied, namely: Aznalcóllar and Beas de Granada (Andalusia, South Spain) (Figure 1).The fire at Aznalcóllar took place in 1995 and affected around 2500 ha.Natural vegetation before the fire consisted of sclerophyllous evergreen wood dominated by stone pine (Pinus pinea L.) and cork oak (Quercus suber L.).At present, the area devastated by the fire is covered mainly with rockrose (Cistus ladanifer L.) associated with other shrub species (C.populifolius L., C. salvifolius L., and Phillyrea angustifolia L.) at a medium density.Shady places also exhibit cork oaks (Quercus suber).The areas less markedly affected by the fire, generally scattered, still contain spots of stone pine (Pinus pinea) and, to a lesser extent, black pine (P.pinaster Ait.), in variable density.The climate is Mediterranean with hot summers.Mean annual precipitation and temperature are approximately 650 mm and 12°C, respectively.Topography is comprised of rolling hills and, lithologically, the area is fairly uniform, with abundant slates, quartzite and Palaeozoic schist.
The fire at Beas de Granada took place in 1993 and burnt almost 7000 ha covered mainly with black pine woods and, to a lesser extent, oak woods (Quercus ilex L. subsp.ballota (Desf.)Samp.), dwarf gorse (Ulex sp., Genista sp.) and pinewoods of Aleppo (Pinus halepensis Mill.).The area devastated by the fire exhibits spots covered with thickets of low-average density, consisting of mainly dwarf gorse, rosemary (Rosmarinus officinalis L.) and male rosemary (Cistus clusii Dunal in DC), with occasional bushes of cork oak.Areas covered with withering grassland are also present where spots of regenerated black pine and sizeable patches of Cistus laurifolius L. and Adenocarpus decorticans Boiss.can be found.In the areas less marked by the fire, wooded spots of black pine still remain, although Aleppo pine and cork oaks are also present.The climate is also Mediterranean [with hot summer sobra?] with mean annual precipitation and temperature of approximately 470 mm and 15°C, respectively.Topography is comprised of steeper slopes and the area is lithologically non-uniform; it comprises materials such as compact carbonated stones, limestone and compact dolomites, heavily fractured dolomites, phyllites and mica-schist.

Radiometric calibration
Two Landsat 7 ETM+ images taken on July 12 and August 31, 2000 at Aznalcóllar and Beas de Granada, respectively, were used for this work.Uncalibrated image data were converted into atmospherically corrected reflectance by using the image-based atmospheric correction, the COST model, proposed by Chavez (1996) and calibration parameters obtained from Landsat 7 ETM+ sensor pre-launch calibration data and the ETM+ data header (Irish, 1998).
The radiometrically calibrated images were geo-rectified using ground control point and high-resolution digital ortophoto quadraught (DOQ) of the study area obtained from the Spain Geographically Referenced Information Program.Root mean squared (RMS) errors were held to less than one pixel in all cells.

Inherent dimensionality image determination and endmember characterisation
The linear SMA approach assumes that the spectrum measured by a sensor is a linear combination of the spectra of all components within the pixel (Adams, et al., 1995;Roberts, et al., 1997a).To apply this methodology the following conditions must be satisfied: (1) selected endmembers should be independent of each other, (2) the number of endmembers should be less than or equal to the spectral bands used, and (3) selected spectral bands should not be highly correlated.It is well recognized that remotely sensed data, such as visible bands in Landsat TM/ETM+ data, are highly correlated between the adjacent spectral wavebands (Barnsley, 1999), so that several authors recommend using a maximum of 3 or 4 endmembers (García-Haro, 1996;Radeloff et al., 1999).Several techniques have been used to transform the data from highly correlated bands to an orthogonal subset.Principal component analysis (PCA) and minimum noise fraction (MNF) are the two most common transformations (Green, et al., 1988;Jensen, 1996).
Previous studies have shown that the use of MNF transform can improve the quality of fraction images (van der Meer and Jong, 2000;Small, 2001;Wu and Murray, 2003), and so it was used in this study in two steps (ENVI, 2000): (1) de-correlation and rescaling of the noise in the data based on an estimated noise covariance matrix, producing transformed data in which the noise has unit variance and no band-to-band correlations; and (2) implementation of a standard PCA of the noise-whitened data.The result of the MNF in this study showed that the 3 primary bands in Aznalcollar and the 4 primary bands in Beas contained coherent eigenimages, so that a number of 3 endmembers was selected.
Endmember spectral characterisation is vital to SMA (Tompkins et al., 1997) to take advantage of the latter's ability to provide a physically meaningful fraction.We used two different endmember characterisation methods to construct various models in order to compare their results and select the most suitable one for the purpose intended.The methods involved: (a) Selecting pure pixels directly from the image on the basis of the knowledge of the area (model 1) or (b) selecting pure pixels from the images provided by the Minimum Noise Fraction (MNF, Green et al., 1988) and Pixel Purity Index (PPI, Boardman et al., 1995) (models 2i) (Figure 2): a) The first approach for characterising endmembers has been to select representative homogeneous pixels from satellite images on the basis of a set of reference polygons defined using high-resolution digital ortophotos quadraught (DOQ) of the study areas and field dataPolygons were required to be at least 40 ´40 m in size and dominated by a single land cover type (Dennison and Roberts 2003).The endmembers characterised were (1) soil and (2) vegetation.
Because the lithology in Beas de Granada was non-uniform, we used various groups of pixels to characterise three different soil endmembers in the area that in turn led to the development of various models that were subsequently combined into one upon inversion.In model 1-A, the group of pixels representing the «soil A» component was located in areas of bare soil on fractured dolomites.In model 1-B, the pixel group representing the «soil B» component was located in areas of soil with grassland on compact limestone.Finally, in model 1-C, the pixel group representing the «soil C» component was located in areas of field covered with parched grassland on schist and mica-schist.The image included several areas with quarries that were masked by setting appropriate thresholds (specifically, those pixels with a reflectivity of over 20%, 25% and 30% in bands 1, 2 and 3, respectively, were eliminated).
In addition to the endmember vegetation and soil, we also examined the endmember (3) shade, which was artificially characterised by assigning it a reflectivity value of 0 in all bands.The purpose of including this component was to separate the shading effect of the vegetation and that of the topography.
b) In the second approach we constructed the n-dimensional view of the MNF image by including only the pure pixels that exceeded a threshold value of purity (a threshold PPI value) (McGwire et al., 2000, Small, 2001, van der Meer et al., 2000, van der Meer and Jong, 2000).The clusters of pixels with a high PPI were used to characterise the endmembers.In this way, various models were generated for each study area on the basis of the PPI threshold considered.
Characterizing endmembers from MNF and PPI images was relatively easy in the case of Aznalcóllar.
The MNF images viewed using different PPI threshold values exhibited three well-defined clusters of pure pixels which, when overlapped with the initial Landsat images, were associated with three land cover types (1) soil, (2) vegetation, (3) shade.In order to determine the optimal threshold, we used an iterative model, involving the steps: • Model 2-1: a low threshold for the PPI image was used to bound pure pixels clusters (PPI ³ 1).
• Model 2-3: if the hypothesis that model 2-1 overestimated and model 2-2 underestimated cover relative to field measurements was fulfilled, then a threshold in between those of model 2-1 and model 2-2 (PPI ³ 50) (model 2-3) would be used.
The endmember characterisation from MNF and PPI images proved rather complicated for the case of Beas.The pure pixels in MNF images obtained with a unity PPI threshold revealed the presence of three clusters, one of which corresponded to areas of low reflectivity (water-shade), and two to vegetation areas (one with thick forest vegetation and the other with riverbank vegetation and cultivation).No well-defined pixel cluster for the soil endmember was observed.However, there was a diffuse cloud of pixels corresponding to areas of bare soil that exhibited a disparate spectral behaviour.With a high PPI threshold (PPI=100), the cluster corresponding to thick forest vegetation disappeared and the pixels corresponding to the soil continued to form a diffuse cloud.We thus chose to use PPI=25 as a high threshold and we followed a procedure similar to the one for Aznalcóllar in order to characterise the (1) soil and (2) vegetation endmembers, leaving the characterisation of the different soils for a later stage.
• Model 2-1: a low threshold for the PPI image was used to bound clusters of pure pixels (PPI ³ 1).• Model 2-2: a high PPI threshold (PPI ³ 25) was used.
The above-described process allowed us to characterise the endmember (2) vegetation and (2) shade, but not (1) soil.In order to characterise «soil A», «soil B» and «soil C» a new PPI image was generated, this time from only those pixels with scant or zero vegetation (viz.pixels with NDVI > 0.25 were eliminated).This pure pixel image was superimposed over a 6, 3, 2 colour composition of the LANDSAT image and on the soil map of the area, and a group of pure pixels was selected to represent each of the main «soil» types (but, choosing only those pixels belonging to the PPI image.The three types of endmember (1) soil examined were designated: «fractured dolomites» (soil A), «compact limestone» (soil B) and «schist and mica-schist» (soil C), which led to the intermediate model: 2-1-A, 2-1-B and 2-1-C, their combination gave model 2-1), 2-2-A, 2-2-B and 2-2-C (which gave model 2-2), and 2-3-A, 2-3-B and 2-3-C (which gave model 2-3).

Inversion, combination and post-processing
Once all endmembers to be used in each model had been defined and spectrally characterised, they were inverted to obtain the fractions of endmembers for each pixel in the image.A partially restricted inversion model was used for this purpose that involved the adding equation of fractions equal to 1 (McGwire et al., 2000, Mustard andSunshine 1999).
The presence of a variety of soils in Beas de Granada area led us to generate a different intermediate model for each type of soil.Subsequently, they were combined in such a way that the model best predicting reflectivity was applied to each pixel.The criterion used to combine these intermediate models was to choose, for each pixel, the model with the lowest RMSE (Garcia-Haro et al., 1996).
In order to assess and compare the proposed models, we calculated the average RMSE, after inversion (after inversion plus combination in the case of soil diversity).Since the image portion used included areas with surfaces irrelevant for the intended purpose (viz.cultivated, urban areas, etc.), we calculated the RMSE, taking into account only the area of interest (viz.the one within the perimeter of the fire).
After eliminating negative and aerate-than-unity fractions, as well as the shade fraction, the «soil» and «vegetation» fractions were renormalized so that a combination of the two equalled unity.

Field measurements
Different model estimations were validated by comparison with field measurements.Measurements of post fire vegetation cover were collected during the months of July and August, 2000.Data were collected in 31 plots of land at Aznalcóllar, and in 35 at Beas de Granada.
After post-processing of the developed models, their estimates were validated by comparison with field measurements.Measurements of post fire vegetation cover were collected during the months of July and August 2000.Data were collected in 31 plots of land at Aznalcóllar, and in 35 at Beas de Granada.In each plot, total vegetation cover, shrub cover (including herbaceous cover an tree regeneration) and tree cover were measured using the linear intersect method (Bohanm, 1998) by establishing two normal transects 50 m long (one in the direction of maximum slope) that intersected at the centre.A global positioning system (GPS) was used to record geographic co-ordinates for each transect centre.

Accuracy and mapping
Once vegetation transects were located on the Landsat images, the cover measured in field was assumed to represent the average of the 4 pixels in the 2 ´2 window that it?Included it?
In order to assess the accuracy of the cover estimates provided by the SMA models, we calculated the average absolute difference between the estimated and observed cover values for each model.Correlation between these variables was quantified in terms of Spearman's correlation coefficient; also, significant differences between means were defined using the Wilcoxon test of paired samples.The use of non-parametric statistic techniques on account of the obvious lack of normality in the samples was chosen.Also, residual plots of the models (cover measured in field minus cover estimated by each model versus cover estimated by each model) were visualised in order to detect any relation between estimated cover and residuals.The results obtained in these analyses were used to select the model best estimating vegetation cover in each study area.
Once the most suitable model was chosen, it sufficed to reclassify cover measurements in order to obtain an easier clearer final cartographic output.Water areas were omitted from the final map by using a mask.

Model accuracy and selection
The average RMSE for the four models applied to the fire in Aznalcóllar ranged from 0.7 to 1.1%.In Beas, the differences in RMSE upon inversion and combination of the models were minimal: 0.6 and 0.7%.Thus, the models constructed for both fires provided acceptable average RMSE values (generally below the accepted 1%) (Table 1).When RSME images of each model were visualised, the highest RSME values corresponded to roads and firebreaks, so this could indicate that the models work worse in edge areas between bare soil and vegetation.In Beas, the models presents also high RSME values in pixels that corresponded to riverbank vegetation.
Table 1 shows the coefficients of correlation between the cover measurement estimates provided by the four proposed models and the field measurements.As can be seen, all models exhibited a high positive correlation with the field measurements, with very similar coefficients ranging from 0.89 0.90 at Aznalcóllar and from 0.90 to 0.95 at Beas.The average absolute differences between estimated and field cover values ranged from 6% (models 1, 2-2 and 2-3) to 11% (model 2-1) for Aznalcóllar, and from 5 to 9% (5, 6, 7 and 9% with models 2-3, 1, 2-2 and 2-1, respectively) for Beas (Figures 3a,3b,3c,3d).
Regarding the presence of significant differences between means, Wilcoxon's test revealed that, with both fires, models 1 and 2-3 provided good results for our purpose, i.e., there were no significant differences between the means for the field and modelled data; models 2-1 and 2-2, however, resulted in significant differences (Table 2).A comparison of estimated and field cover values reveals that, for both fires, model 2-1 yielded rather overestimated vegetation values in virtually all land plots (in more than half of the plots, cover was overestimated by over 10%).Model 2-2 provided slightly improved, but still largely underestimated results.Models 1 and 2-3 yielded substantially improved results, the differences between estimated and measured cover values found ranging ±10%.Respect to residual plots, no relation    between residuals and estimated cover was detected (Figure 4).
The next step towards the generation of the final vegetation cover map mainly involved selecting the best characterised model for each study area of the above-described validation results.
The four models for Aznalcóllar exhibited a high correlation with field measurements, so this parameter by itself did not allow us to identify the best model.
In terms of an average absolute difference between the estimated and field cover values, model 2-2 provided appreciably worse results (10%) than the three other models (6%).This, together with the fact that Wilcoxon's test exposed significant differences for this model between the mean, the estimated, and field cover values, and the obvious underestimation of the former, led us to reject this model.Model 2-1 was also rejected.Although the average absolute difference between estimated and field cover values was not greater than with the other models, Wilcoxon's test revealed significant differences between both variables and a clear tendency towards overestimation.
The next step was to choose the better model between the remaining ones, viz.models 1 and 2-3.The comparison of field measurements and estimates for both models provided good results (high correlation, 6% average absolute difference, and non-significant differences in Wilcoxon's test).We therefore adopted model 1 simply because its RMSE was slightly lower (0.7%) than that for model 2-3 (1%) The situation for Beas was similar to that for Aznalcóllar.Thus, models 2-1 and 2-2 were rejected because, despite the high correlation between their cover estimates and the field measurements, the average absolute difference was greater than in the other models, Wilcoxon's test revealed significant differences, and a tendency towards underestimation and overestimation with models 2-2 and 2-1, respectively, was clearly observed.
Likewise, it was difficult to choose between the remaining models (models 1 and 2-1).Thus model 2-3 exhibited an average absolute difference between its estimates and the field measurements, which was slightly lower than inferior 1% (5 and 6%); on the other hand, RMSE model 1 had a slightly lower 0.6% than model 2-3 (0.7%).Such small differences led us to conclude that either model would be acceptable.

Mapping
The selected SMA models were used to model the Landsat images corresponding to the study areas obtaining the post fire vegetation cover maps.Finally, these cover maps were reclassified using four vegetation cover intervals (Figures 5a, 5b).Qualitatively, these maps are a good approximation of post fire vegetation cover.The Aznalcollar fire was dominated for draft.With a high fraction cover of Cistus ladanifer as can be seen on the map.On the other hand, Beas fire has a lesser vegetation cover due to specific composition (Ulex and Genista sp.) and a complex lithology.

Discussion
This study showed that detection of post fire vegetation recovery with SMA is feasible.The results presented here suggest that routine assessment of a restoration process will be possible in the future with existing space-borne multispectral imaging systems.Imaging spectroscopy has provided detailed information on vegetation damage and post fire vegetation on the last decade (e.g.Riaño et al., 2002) and this study demonstrated how such information can be used to support local studies of land cover restoration after wildfires.
The greatest difficulty in analysing spectral mixtures is the accurate spectral characterisation of endmembers.Several endmember selection methods have been proposed for multiple endmember techniques.These methods have concentrated on finding the set of endmembers that best represents spectral variation in materials in an image (Roberts et al., 1997a;1997b;Okin et al., 2001).In this work, we compare the performance of two reported methods based on: (a) the selection of pure pixels from the work image (method 1) and (b) the characterisation of end-members by generation of MNF and PPI images, and their n-dimensional visualisation (method 2).
Selecting pure pixels directly on the image was quick and easy, as a comprehensive knowledge of the study areas was available that allowed us to visually bound sufficiently wide and uniform areas, with a total cover of field vegetation.Selecting pure pixels through the generation of MNF and PPI images proved more complex.Thus, when defining the clusters characterising each end-member through n-visualisation required the development of an iterative model to overcome a problem previously encountered by Small (2002).If endmembers are characterised on the basis of more or less restrictive criteria, then the spectral signature defined for each endmember can differ markedly.As shown here, being too restrictive (viz.visualising only those pixels in the MNF image with a high value in the PPI images) results in underestimated, vegetation cover values as one may possibly be characterising the spectral response of the forest vegetation on the basis of riverbank areas or cultivation.Conversely, if the threshold PPI is too low, then the assumed pure pixels may not correspond to 100% vegetation cover and lead to overestimated vegetation cover values.To choose an appropriate PPI threshold, it may suffice to check on the cover type to which the work image corresponds.This call for a comprehensive knowledge of the study area and, even more so, the process defining a cluster may be highly subjective.The iterative process proposed in this work can be used when the knowledge of the area is inadequate; in any case, it is much more objective, but requires the use of a series of field plots for which, in order? the actual cover is known to be able to ascertain which PPI thresholds will result in overestimated and which in underestimated values.One added difficulty potentially arising in characterising endmembers by n-visualisation of pure pixels (PPI image) in the MNF image is the absence of a cluster for every endmember to be included in the model.Such was the case with Beas de Granada, where the information in the PPI image had to be supplemented with the knowledge about the area in order to be able to characterise the soil endmembers.
These endmembers may model the Landsat image and produce a vegetation cover map of an acceptable accuracy, although both methods for the selection of pure pixels have some limitations and, above all, are to some extent subjective.Using both in parallel and selecting one on the basis of validation with field measurements is no doubt the best choice.
The methodology applied to Beas in order to consider its lithological non-uniformity, allowed us to obtain vegetation cover estimates as accurate as those for Aznalcóllar and using a single soil endmember.
In both study areas, the average absolute difference between the estimated vegetation cover and field vegetation cover values never exceeded 6%; also, the correlation between both variables exceeded 0.89.These results are similar to those obtained by other authors in the estimation of vegetation cover using Spectral Mixture Analysis of Landsat images.Thus, Maas (2000) estimated the cover of cotton fields with an average absolute difference of under 7%, and Mustard and Sunshine (1999) obtained an average difference between estimates and measurements of forest vegetation cover of under 3% and a correlation coefficient of 0.88.
The models have been validated with field plots that included a range of total vegetation cover between 12% a 95%and a range of tree cover (Pinus pinaster, P. halepensis, P. pinea, Quercus suber, Q. ilex) between 0% and 86%.The results obtained in this work confirm the potential of the Spectral Mixture Analysis not only for estimating the post-fire vegetation cover, but also for estimating vegetation cover of Mediterranean bushes and trees (Pinus spp.and Quercus spp.).The models developed worked worse in limit areas between vegetation and bare soils like roads, paths and fire lines and in areas with riverbank vegetation.
The vegetation cover map obtained provides reliable information that can be used as the starting point for other studies dealing with the evolution of vegetation after a wildfire, and for the modelling of erosive processes.However, regarding the latter type of study, it would be of great interest to distinguish differentiating wooded cover from thicket cover.

Figure 1 .
Figure 1.Location of the study area.

Figure 2 .
Figure 2. Methodology proposed: (model 1) Selecting pure pixels directly from the image on the basis of the knowledge of the area or (models 2i) selecting pure pixels from the images provided by the Minimum Noise Fraction and Pixel Purity Index.

Figure 5 .
Figure 5. Vegetation cover maps reclassified using four vegetation cover intervals for Aznalcollar (a) and Beas (b).

Table 1 .
Coefficients of correlation between the cover measurement estimates provided by the four proposed models and the field measurements

Table 2 .
Wilcoxon's test of means for the proposed models and the field measurements