Post-market environmental monitoring of Bt maize in Spain: Non-target effects of varieties derived from the event MON 810 on predatory fauna

The Spanish Government has established post-market environmental monitoring (PMEM) as mandatory for genetically modified (GM) crop varieties cultivated in Spain. In order to comply with this regulation, effects of Bt maize varieties derived from the event MON810 on the predatory fauna were monitored for two years in northeast and central Spain. The study was carried out with a randomized block design in maize fields of 3-4 ha, in which the abundance of plant-dwelling predators and the activity-density of soil-dwelling predators in Bt vs. non-Bt near-isogenic varieties were compared. To this end, the plots were sampled by visual inspection of a certain number of plants and by pitfall traps 6 or 7 times per season in each of two seasons. No significant differences in predator densities on plants were found between Bt and non-Bt varieties. In the pitfall traps, significant differences between the two types of maize were found only in Staphylinidae, for which trap catches in central Spain were higher in non-Bt maize than in Bt maize. Based on the statistical power of the assays, surrogate arthropods for PMEM purposes are proposed: Orius spp. and Araneae for visual sampling and Carabidae, Araneae, and Staphylinidae for pitfall trapping. The other predator groups recorded in the study, Nabis sp. and Coccinellidae in visual sampling and Dermaptera in pitfall trapping, gave very poor power results. To help establish a standardized protocol for PMEM of genetically modified crops, the effect-detecting capacity of each predator group at a power of 0.8 is given.
Additional key words: GMO; non-target arthropods; Orius spp.; Staphylinidae; Carabidae; PMEM; statistical power.

2005; Farinós et al., 2008; Castañera et al., 2010). However, few field studies have demonstrated the suitability of the design used to provide reasonable evidence, since most field trials do not include an analysis of the likelihood of correctly rejecting an incorrect null hypothesis, the so-called statistical power of the test (Cohen, 1988; Perry et al., 2009). The power of a test to detect an effect in plots of GM varieties in comparison with a comparator (for example a near-isogenic variety) is a function of several factors, such as the magnitude of the effect to be detected, the variability (e.g. standard deviation) of raw data, the sample size, and the type I error, α. A review of the power of the historical data of field trials conducted to assess effects of GM crops on NTAs may help to select organisms with high power for ERA and PMEM programmes. Furthermore, power for certain non-target species or species assemblages may be an important criterion for selecting indicator or surrogate species, as suggested by Prasifka et al. (2008), in addition to other criteria that have been suggested for surrogate selection.
The principal aim of the study was to test the null hypothesis that there were no significant effects of Bt maize varieties on maize predators. At the same time, we checked whether the test had sufficient statistical power (> 0.8) to detect differences of 50% or 25% in relation to the mean of a non-Bt conventional variety, which we call the comparator. To do this, we conducted a field trial in which the abundance or activity of the main predatory species on a Bt maize variety and its isogenic variety were compared by means of visual counting and pitfall traps. The trial was conducted for two seasons in two maize-growing regions, Lleida (northeast Spain) and the Tajo Valley (central Spain), where the predatory fauna in maize is particularly well known.

Introduction
Genetically modified (GM) maize plants expressing an insecticidal Cry toxin from Bacillus thuringiensis Berliner, the so-called Bt maize, have been cultivated in Spain on a commercial scale since 1998, reaching an area of about 97,000 hectares (around 27% of the total maize grown in the country) in 2011 (MARM, 2011). Bt maize provides effective control of the two key lepidopteran pests in Spain and other areas of southern Europe, the Mediterranean corn borer (MCB), Sesamia nonagrioides (Lefèbvre) (Lepidoptera: Noctuidae), and the European corn borer (ECB), Ostrinia nubilalis (Hübner) (Lepidoptera: Crambidae). However, concerns about environmental impacts of cultivation of Bt maize have been expressed in Europe. EU directives state that an environmental risk assessment (ERA) must be carried out before a variety is authorized for commercial cultivation, and that post-market environmental monitoring (PMEM) programmes must be implemented once the variety has been planted (EFSA, 2006, 2011).
Anticipating what was included in EU Directive 2001/18/EC, the Spanish government established PMEM as mandatory for any GM crop variety to be registered for cultivation in Spain (Order of the Ministry of Agriculture of 23 March 1998). Potential impacts of Bt maize on the environment to be considered included those on non-target arthropods (NTAs), an important component of agro-ecosystems formed by non-target herbivores and pests' natural enemies, among other functional groups (Asín & Pons, 1998; Albajes et al., 2003; de la Poza et al., 2005; Farinós et al., 2008; Lumbierres et al., 2011).
Reduced populations of predators and parasitoids are among the unintended direct or indirect effects that need to be assessed and monitored by field trials. Most of the trials conducted in the field to assess or monitor unintended effects of Bt maize on NTAs in Europe have shown no effects (Ortego et al., 2009), including a commercial-scale field study performed in Spain over three years with a variety derived from the transformation event CB176 (de la Poza et al.,
in visual sampling and Carabidae, Araneae, and Staphylinidae in pitfall trapping. The other groups recorded in the study, Nabis sp. and Coccinellidae in visual sampling and Dermaptera in pitfall traps, showed considerably lower power. Recommendations are made for the design of trials for PMEM. Key words: GMO; non-target arthropods; Orius spp.; Staphylinidae; Carabidae; Araneae; statistical power.
Spain, NS) and one in the Tajo Valley (central Spain, CS). The study was performed in two seasons. Field sizes were between 3 ha and 4 ha, and standard local cultural practices in each area were used. Fields were sown between 8 April and 15 May in all cases. Two days after sowing, the plots in NS were sprayed with a herbicide mixture of 35% alachlor + 25% atrazine (Primdal, Agrodan, Brabrand, Denmark) at 6-8 L ha⁻¹.
A randomized block design (RBD) (plots of about 0.5 ha) was set up involving two treatments, each with 3 and 4 blocks in NS in years 1 and 2, respectively, and 3 blocks in CS in both years. The treatments included (i) Bt transgenic maize (event MON810, cv. PR33P67 in year 1 and DKC6041 YG in year 2), and (ii) non-Bt isogenic hybrid (cv. PR33P66 in year 1 and DKC6040 in year 2). Seeds in NS were dressed with imidacloprid (Gaucho®, Bayer Crop Science, Germany; 4.9 g a.i. kg⁻¹ of seeds). In CS, seeds were treated with chlorpyriphos (Fostan 5G, Alcotán, Spain; 8 kg ha⁻¹) in year 2.
Composition and abundance of plant-dwelling predators were determined by visual surveys of the plants, and the activity of soil-dwelling predators was recorded using pitfall traps. On each sampling date, between 12 and 15 plants from the central part of the plot were inspected visually, and the number of predators was recorded. The number of pitfall traps installed on each plot was 3 or 4 on all sampling dates. The number of visual sampling dates and trapping weeks was 6 and 7 per season in years 1 and 2, respectively, in NS and always 6 per season in CS. Sampling dates were distributed to cover the period of usual population peaks of predatory groups. Sampling dates were fixed according to pre-established maize growth stages at V5-7, V4, V9-10, V15, R1, R3, and R5. Pitfall traps consisted of a glass jar of 9 cm diameter and 17 cm depth (NS) or 12.5 cm diameter and 12 cm depth (CS) filled with water containing ethylene glycol (25%) or ethanol, plus some detergent to decrease surface tension. The traps were left open for 7 days.
In the visual sampling, five predatory taxa were recorded: the genera Orius (Heteroptera: Anthocoridae) and Nabis (Heteroptera: Nabidae), the families Coccinellidae and Carabidae, and the order Araneae. In pitfall trapping, two families, Carabidae and Staphylinidae, and two orders, Dermaptera and Araneae, were recorded. These predatory groups were selected according to the abundance and potential effects recorded in previous field trials (Lang et al., 1999; Albajes et al., 2003; Jasinski et al., 2003; de la Poza et al., 2005; Farinós et al., 2008).
In the combined analyses of variance, a split-split-plot-like model (Gomez & Gomez, 1984) was initially used to analyse data in which year and region were considered the main plots. Subplots were the treatment and sub-subplots were the sampling dates. All factors, except blocks, were considered fixed and crossed with each other, except again for blocks, which were nested within regions and years. The main plot error term was block (year*region) and the subplot error was treatment*block (year*region). With this type of design, the two-way (treatment*sampling date) interactions were significant (p < 0.05) in less than 12% of the cases analyzed, so it was decided to use values of mean arthropod abundance (in visual sampling) or mean captures (pitfall traps) per season and to perform the statistical analysis with an RBD to increase statistical power in comparison with the split-split-plot-like design. As two-way interactions were very rarely (< 5%) significant (p < 0.05), they were pooled within the residual error. In the RBD, the factor 'block' was the error term for the factors 'year' and 'region', and the residual error was used for the factor 'treatment'. To normalize the original data, they were transformed by log10(x + 1) prior to analysis. The level of significance was p < 0.05 in all cases.
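The core of the treatment test described above can be illustrated with a much reduced sketch: a randomized block design with two treatments at a single site and year, analysed on log10(x + 1) transformed seasonal means. This is not the full model used in the study (which also included year and region, and was run in JMP); the function names and the example counts are hypothetical and serve only to show how the treatment F ratio is built from the block and residual sums of squares.

```python
import math


def log_transform(counts):
    """Apply the log10(x + 1) transformation to a list of counts or means."""
    return [math.log10(x + 1) for x in counts]


def rbd_treatment_f(treat_a, treat_b):
    """F test for 'treatment' in a randomized block design with two treatments.

    treat_a[i] and treat_b[i] are the (transformed) seasonal means of the two
    treatments in block i. Returns (F, df_treatment, df_error).
    """
    b = len(treat_a)
    if b != len(treat_b) or b < 2:
        raise ValueError("need the same number (>= 2) of blocks per treatment")

    grand = (sum(treat_a) + sum(treat_b)) / (2 * b)
    mean_a = sum(treat_a) / b
    mean_b = sum(treat_b) / b
    block_means = [(x + y) / 2 for x, y in zip(treat_a, treat_b)]

    # Partition the total sum of squares into treatment, block and residual.
    ss_total = sum((v - grand) ** 2 for v in treat_a + treat_b)
    ss_treat = b * ((mean_a - grand) ** 2 + (mean_b - grand) ** 2)
    ss_block = 2 * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block

    df_treat, df_error = 1, b - 1
    f_value = (ss_treat / df_treat) / (ss_error / df_error)
    return f_value, df_treat, df_error


# Hypothetical seasonal means (individuals per plant) in three blocks.
bt_plots = log_transform([0.8, 1.2, 1.0])
non_bt_plots = log_transform([0.9, 1.5, 1.1])
f_value, df_t, df_e = rbd_treatment_f(bt_plots, non_bt_plots)
print(f"F({df_t},{df_e}) = {f_value:.2f}")
```

With only two treatments, this F ratio equals the square of the paired t statistic computed on the within-block differences, which is why blocking removes block-to-block variability from the treatment comparison.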
The power of the tests was calculated to detect a 25% or 50% variation in the mean of abundance or activity of each organism in the comparator (non-Bt isogenic variety). Also, the power for these variations was calculated when data were analyzed year by year or region by region in order to compare power values of multi-year and multi-region tests vs. one-year and one-region tests. The capacity of the assay to detect differences in the Bt maize in relation to the mean of the near-isogenic variety with a statistical power of 0.8 was calculated (power = 1 − β, where β is the type II error, set here at 0.2). The type I error considered was always α = 0.05, a value commonly used in ANOVA and power calculations in this kind of study. The JMP statistical package was used for all analyses and power calculations (SAS, 2008).
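As a rough illustration of how power depends on effect size, variability, sample size, and α, the sketch below uses a two-sided normal approximation for the difference between two treatment means. It is not the procedure implemented in JMP, which works from the ANOVA error term, and it operates on untransformed values for simplicity. The function name, the comparator mean, and the standard deviation are hypothetical; n = 26 is taken only because Table 1 reports means based on 26 values.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(comparator_mean, pct_change, sd, n_per_treatment, alpha=0.05):
    """Approximate two-sided power to detect a % change from the comparator mean.

    Assumes two treatment means, each based on n_per_treatment independent
    values with a common standard deviation sd (normal approximation).
    """
    z = NormalDist()
    delta = comparator_mean * pct_change / 100.0   # absolute difference to detect
    se = sd * sqrt(2.0 / n_per_treatment)          # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)              # two-sided critical value
    shift = delta / se
    # Probability of rejecting H0 in either tail when the true difference is delta.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical comparator mean and standard deviation.
for pct in (25, 50):
    power = approx_power(comparator_mean=2.0, pct_change=pct, sd=1.0, n_per_treatment=26)
    print(f"{pct}% difference: approximate power = {power:.2f}")
```

Under this approximation, a zero difference returns α itself, and power rises towards 1 as the difference to be detected grows or the standard deviation shrinks.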

Plant dwelling predators
The composition of predatory fauna recorded at the two locations did not differ greatly. The genus Orius was the most abundant group at both locations (62% and 47% in NS and CS, respectively). Araneae (33%) were the second most abundant group in NS and the third in CS (15%), whereas Coccinellidae were the second in CS (32%) and the third in NS (2%). The other predatory groups recorded accounted together for less than 10% of the total predators recorded on plants.
Table 1 shows mean numbers of plant-dwelling predators and results of statistical analyses. None of the arthropod groups recorded on plants was affected by the Bt trait. However, the year and particularly the region accounted for most of the variability, although in a different direction depending on the arthropod group. Whereas there were significantly more Orius spp. in year 2, the opposite was observed for Coccinellidae. Significantly more predators were recorded in CS except in the case of Araneae, which showed no significant differences between the two regions.

Soil-dwelling predators
Carabidae (46% and 71% in NS and CS, respectively) and Araneae (53% and 26%) were the most abundant groups at the two locations, accounting together for more than 96% of the predators recorded. Staphylinidae accounted for 3% of the total in CS but were practically absent in NS. Dermaptera were consistently found in pitfall catches but always in very low numbers at the two locations (< 2%).
The mean numbers of predators recorded in pitfall traps are shown in Table 2. Year and region were by far the most important sources of variability. Carabidae and Staphylinidae were significantly more abundant in CS, whereas there were no significant differences in Araneae and Dermaptera. The Bt trait significantly affected the number of individuals collected only in the case of Staphylinidae in CS, which were more abundant on non-Bt plots.

Statistical power in the trials
With the results of the ANOVA, the statistical power of the field trials was calculated for each of the predators recorded and the difference that could have been detected with the analyses performed with a power of 0.8 was calculated for each of the arthropod groups sampled (Table 3).
Total predators, Orius and Araneae were the groups with the highest capacity to detect differences between the treatment and the comparator in visual sampling with a power of 0.8 (Table 3). Accordingly, Orius spp. and Araneae were the predatory groups with the highest power to detect 25% and 50% differences, whereas Coccinellidae, Carabidae, and Nabis spp. showed poor power results. In pitfall trap sampling, most of the groups showed high power results, in general higher than those obtained in visual sampling. Differences lower than 20% would have been detected for the main three predator groups and the total number of predators in pitfall traps. In general, pitfall trapping had a higher capacity to detect differences than visual records.
Power of the one-year or one-region tests is shown in Table 4. Power varied inconsistently from one year to the other and between regions. In most organisms (3 out of 5) recorded in visual sampling, power to detect a 25% difference was lower in year 1 than in year 2, as was power to detect a 50% difference. In organisms caught in pitfall traps, power was lower in year 1 for one organism, whereas it was higher or equal for the other two and the total number of predators. When the two regions were compared, power to detect a 25% difference was higher or equal in CS in most organisms recorded in visual sampling, a pattern similar to that observed in pitfall traps. Most power values calculated to detect 50% differences were 1 for both regions.
Comparison of power values in the combined analysis (Table 3) with those from data analyzed by years or regions (Table 4) may also allow us to draw conclusions about how useful replication over several years or at several locations may be for increasing power in this type of field trial (Table 5). To detect 25% differences in visual sampling, power values derived from the combined ANOVA (the two years and the two regions together) were higher than those derived from the single-year ANOVA (years 1 or 2) in three organisms, whereas power was lower in the combined analysis for only one organism and was the same for another organism. When power was compared for detecting a 50% difference, the pattern was similar to that observed for detecting a 25% difference. Power increased or decreased inconsistently in the combined analysis in comparison with the single-year analysis when pitfall trap catches were analyzed.
A similar pattern may be observed in the comparison of the combined vs. single-region analyses (Table 5). For detecting a 25% variation in visual sampling, most organisms showed increased or equal power in the combined ANOVA in comparison with NS, but the results were less consistent in comparison with CS. Even fewer differences were found for detecting 50% differences. In pitfall trap records, power was rarely increased by performing the field trial at more than one location, and power was equal or even lower in the combined analysis than in the single-location analysis in several organisms.

Discussion
Post-market environmental monitoring aims, among other objectives, to detect potential negative effects of cultivating GM crops on non-target organisms and on their ecological functions. To our knowledge, this is the first time that results of post-market environmental monitoring of non-target effects of Bt varieties based on the event MON810 in Spain have been published. Predators are among the natural enemies that play an important role in natural insect pest control in agro-ecosystems and must be included in any PMEM protocol. In this study the most abundant predators found in maize were monitored by visual sampling and pitfall trapping, and the relative abundances of the groups recorded agree with those of other studies conducted in Spain (Albajes et al., 2003, 2009; de la Poza et al., 2005; Farinós et al., 2008).
In general, no significant differences in arthropod densities on plants or activities on the soil were found when varieties derived from event MON810 were compared with their corresponding isolines. Only rove beetles (Coleoptera: Staphylinidae) were affected by Bt maize, the number of captures in pitfall traps in CS being significantly lower in the Bt variety. Similar results were reported by de la Poza et al. (2005) in a three-year study conducted in central Spain, where the number of staphylinid catches was significantly lower in a Bt maize variety derived from event CB176 in one of the years and lower, but not significantly so, in the other two years. A lower activity of rove beetles in Bt maize has also been recorded by Balog et al. (2010), but only for aphidophagous predatory rove beetles. When staphylinids are considered as an assemblage, it is difficult to discuss potential causes of effects of Bt maize because this family is quite heterogeneous in feeding habits, including predatory species but also mycetophagous, parasitoid, and even herbivorous species. Some of the authors have undertaken further studies on the potential mechanisms involved in such effects in the case of predatory rove beetles. When a commercially available rove beetle predator, Atheta coriaria Kraatz, was tested in the laboratory for its susceptibility to Cry1Ab toxin, the one tested in this study, no negative influence on the main biological parameters of the predator was found when the toxin was provided via the food web (García et al., 2010).
Abundance of plant-dwelling generalist predators, such as Orius spp. and spiders, has been reported to increase occasionally in Bt maize (Jasinski et al., 2003; Musser & Shelton, 2003; de la Poza et al., 2005). Higher densities of Orius spp. and Araneae on Bt plots in visual sampling were confirmed, although differences from a non-Bt isogenic variety were not significant. The difference has been attributed to a better quality of the silks in host plants as a consequence of no borer attack in Bt plants (Jasinski et al., 2003; Musser & Shelton, 2003) or to a higher abundance of potential prey, particularly leafhoppers and aphids, on Bt plots (Pons et al., 2005; Albajes et al., 2011). As herbivorous insects were not monitored in this study, this hypothesis cannot be confirmed, but the gap shows that the main herbivorous arthropods in maize should also be monitored if biological control functions are to be measured. A study conducted in Europe to compare spider abundance and richness on Bt vs. non-Bt plants also found no differences between the two kinds of maize varieties (Meissle & Lang, 2005).
Statistical analysis of field trials for PMEM of a GM variety aims to test the null hypothesis that there is no difference between the GM variety and a conventional comparator variety in the abundance or activity of a non-target organism. As many field trials conducted to test the null hypothesis in the case of GM crops are not able to reject it, we need to know the probability that the analysis will reject it when in reality it is false. The statistical power of a test is the probability of rejecting the null hypothesis when a given alternative hypothesis is true, and it allows a proper control of variation and adequate replication in regulatory trials which seek to study whether there are any deleterious environmental effects of new products (Perry et al., 2003). Power is therefore an important quality parameter of the field trials conducted for ERA and PMEM.
It is widely accepted that values of 0.7 (Prasifka et al., 2008) or 0.8 (Perry et al., 2003, 2009; Naranjo, 2005) are sufficient in field trials for ERA or PMEM purposes. In the present study the value of 0.8 was accepted to analyze the quality of the field trials carried out for PMEM. Use of retrospective power analyses for interpreting non-significant results has been criticized by several authors (see Nakagawa & Foster, 2004 and references within). However, values derived from this kind of study may provide an indication of which arthropods might be suitable for ERA and PMEM purposes. Here, power analyses have allowed some recommendations to be made about which arthropods may be used in field trials to provide the highest statistical power.
Only Araneae showed a power close to 0.8 to detect a 25% variation in visual sampling and pitfall traps; this group as a whole was also signalled as having a high power value by Prasifka et al. (2008), particularly in pitfall trap records but less so in visual records, as was also found here. Also, Staphylinidae may be considered as indicators at locations with high pitfall trap captures because of their high power, although their ecological functions may vary according to the species. In regions where Staphylinidae are caught in lower numbers, as reported for NS field trials (Albajes et al., 2011), power is much lower, reaching unacceptable values. Although they did not always reach a power of 0.8 to detect a 25% variation in visual sampling in single-year or single-region analyses, Orius spp. may also be considered as a potential surrogate for PMEM. Also, Prasifka et al. (2008) found that lady beetles had moderate power in visual sampling, although lower than 0.8, for detecting small differences. To detect larger differences (50%), most of the organisms sampled in pitfall traps have sufficient power, but among on-plant predators only the most abundant groups, such as Orius spp. and Araneae, may be expected to have a statistical power of 0.8. Duan et al. (2006), who found a higher power in ants caught with pitfall traps than in ants counted with a soil-extraction technique, despite the higher means recorded in the latter, suggest that pitfall traps measure activity more than density and that when this is strongly aggregated, as in the case of ants, foraging behaviour of individuals causes a lower variability among trap locations.
The capacity of the assay to detect small differences or effects in the measured values is another way to express the statistical power of the assay, but it has an advantage over the power value. Power may saturate at a value of 1; under these conditions the power value cannot be used to compare different responses of assays or organisms, whereas the capacity to detect differences never saturates and allows the comparison of assays and organisms with high detecting capacities. According to their effect-detection capacity, the non-target organisms fell into two separate groups in terms of abundance or activity-density.
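The saturation argument can be illustrated with a small sketch under the same kind of normal approximation used for two-sample comparisons; it is not the JMP calculation used for Table 3. The two groups, their standard deviations, and the comparator mean are hypothetical, as are the function names. Both groups have a power of essentially 1 to detect a 50% difference, yet their detectable differences at a power of 0.8 remain clearly distinguishable.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def power_for_pct(mean, pct, sd, n, alpha=0.05):
    """Approximate two-sided power to detect a pct% change from the comparator mean."""
    shift = (mean * pct / 100.0) / (sd * sqrt(2.0 / n))
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)


def detectable_pct(mean, sd, n, alpha=0.05, power=0.8):
    """Smallest difference (% of comparator mean) detectable with the given power.

    Uses the usual approximation (z_{1-alpha/2} + z_{power}) * SE, which
    neglects the tiny contribution of the opposite rejection tail.
    """
    se = sd * sqrt(2.0 / n)
    delta = (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) * se
    return 100.0 * delta / mean


# Two hypothetical predator groups with the same mean but different variability.
for label, sd in (("group A", 0.2), ("group B", 0.1)):
    p50 = power_for_pct(mean=2.0, pct=50, sd=sd, n=26)
    d80 = detectable_pct(mean=2.0, sd=sd, n=26)
    print(f"{label}: power(50%) = {p50:.3f}, detectable difference = {d80:.1f}%")
```

In this approximation the detectable difference scales linearly with the standard deviation, so halving the variability halves the smallest detectable effect even when both power values are already indistinguishable from 1.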
The first group, with detecting capacities clearly lower than 50%, includes Orius spp., Araneae and total predators among those found on plants, and Carabidae, Araneae, Staphylinidae and total predators among those caught in pitfall traps.A second group includes the rest of the predators recorded in visual sampling and pitfall traps and includes groups with very poor effect-detecting capacities (above 70%).
As the combined analysis of multi-year or multi-region trials has a larger sample size, a higher power could be expected in the combined analysis than in the single-factor analysis if the variance of the data resulting from the increase in the number of years does not counterbalance the increase in sample size (Duan et al., 2006). However, this was not a general consequence of combining the two years and the two regions in one global analysis. Only in visual sampling did the combined analysis increase the power of single-year or single-region analyses, although not consistently. The increase in power derived from the increase in sample size in the combined analysis may be counterbalanced in some predator groups by the increased variability between years and regions, as noted in Tables 1 and 2, where these two factors, and particularly region, significantly influenced predator densities or activity in several predator groups. In pitfall trapping, the combined analysis gave increased power only in a few cases, mainly because of the relatively high values found in the single analyses; only in less abundant predator groups may replication of trials at more than one location or in more than one year lead to increased power.
In summary, field trials carried out to monitor negative impacts on non-target arthropods caused by Bt maize varieties derived from the event MON810 detected no negative effects in almost all taxa recorded. Only Staphylinidae in pitfall traps showed significantly different activities between Bt and non-Bt plots, being more abundant in the latter, though they only reached significant numbers in one of the regions where trials were conducted, central Spain. The power of the trials conducted was above the threshold of 0.8 for most of the taxa recorded in visual sampling and pitfall traps when the objective was to detect 50% differences. When the objective was to detect a difference of 25%, only Orius spp. in visual sampling and Carabidae, Staphylinidae, and Araneae in pitfall traps had power values above 0.8. Moreover, all these predators have a capacity to detect differences below 26%. These groups may therefore be candidates to be used as surrogate species or taxa in monitoring Bt maize effects on non-target arthropods, although other criteria also have to be considered for PMEM, such as susceptibility to the Bt toxin or the probability of the organism being exposed to the Bt toxin in the field, among others. Pitfall traps have in general more power than visual sampling, so they would need a smaller sample size. Field trials over several years and in various geographic locations are required to take into account the influence of different seasons and localities, but our results highlight that adding years and/or locations to single trials does not necessarily ensure increased power. A high variability between years or locations may lead to lower power when multi-year or multi-location trials are carried out. The conclusions drawn from this study should be confirmed with field trials conducted over a larger number of years in order to establish a more precise relationship between power, sample size, and data variability.
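The trade-off between sample size and variability can be made explicit with a normal approximation, given here only as a sketch and not as the calculation performed in the study. Assume two treatment means, each based on n values with a common residual standard deviation σ, a true difference δ, and a two-sided test at level α, neglecting the opposite rejection tail. The subscripts s (single-year or single-region analysis) and c (combined analysis) are introduced for illustration.

```latex
\text{power} \;\approx\; \Phi\!\left(\frac{\delta}{\sigma\sqrt{2/n}} - z_{1-\alpha/2}\right),
\qquad
\text{power}_c > \text{power}_s
\;\iff\;
\frac{\sigma_c}{\sqrt{n_c}} < \frac{\sigma_s}{\sqrt{n_s}}
\;\iff\;
\frac{\sigma_c}{\sigma_s} < \sqrt{\frac{n_c}{n_s}} .
```

Under these assumptions, doubling the number of observations by adding a second year or region raises power only if the residual standard deviation grows by less than a factor of √2 (about 1.41); a larger increase in variability lowers power despite the larger sample.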

Table 5.
Variations in power when field trials are analyzed separately in each of the two years (years 1 and 2) or in each of the two regions (northeast Spain, NS, and central Spain, CS) in comparison with the combined analysis of data of the two years and regions together. Power is increased (I), decreased (D), or at the same maximal value (=) in the combined analysis

Table 1.
Mean (± S.E.) number of individuals per plant recorded by plant visual inspection on plots of Bt and non-Bt maize. Each mean is the average of 26 values, except in Carabidae, in which n = 14. The trial was conducted in two regions, northeast Spain (NS) and central Spain (CS), for two years. There were no significant (p > 0.05) differences between treatments for any predator group
* Carabidae were analyzed only in NS due to the very low numbers in CS. a d.f. = 1,3. b d.f. = 1,19.

Table 2.
Mean (± S.E.) number of individuals caught per pitfall trap in plots of Bt and non-Bt maize. Each mean is the average of 164 values, except in Staphylinidae, in which n = 72. The trial was conducted in two regions, northeast Spain (NS) and central Spain (CS), for two years. Means within a row followed by no letter are not significantly (p > 0.05) different
* Staphylinidae were analyzed only in CS due to the very low numbers in NS. a d.f. = 1,3. b d.f. = 1,19.

Table 3.
Statistical power of the field trial conducted for monitoring effects of Bt maize on non-target arthropods. Power has been calculated to detect 25% or 50% differences in relation to the mean of the isogenic variety that was used as a comparator. Difference in % in relation to the mean of the comparator that could have been detected by the assay with a power of 0.8

Table 4.
Statistical power of the field trial conducted to monitor effects of Bt maize on non-target arthropods when data are analyzed year by year and region by region.Power has been calculated to detect 25% or 50% differences in relation to the mean of the isogenic variety that was used as a comparator