Identification of a walnut (Juglans regia L.) germplasm collection and evaluation of their genetic variability by microsatellite markers

The characterization and evaluation of walnut (Juglans regia) germplasm constitute important aspects of taxonomic analysis and are valuable tools for breeding programs. In this work, a collection of 57 common walnut cultivars, mainly from Spain and the USA, was studied with microsatellite markers. To this end, 32 primer pairs flanking simple sequence repeats previously developed in Juglans nigra were screened to select the loci that showed high polymorphism and were easiest to score. The 19 selected microsatellite markers allowed the discrimination of the studied cultivars, with a total of 97 alleles detected and an average of 5 alleles per locus, confirming that these markers are more suitable tools for walnut identification than other molecular markers studied previously. The genetic similarity estimated from the molecular data clearly separated the Spanish walnuts from the Californian genotypes. Allelic data are presented for use as size standards to assist in correcting laboratory-to-laboratory variation in allele size calling. Some of these data are compared with previously published results, and the discrepancies found are discussed.
Additional key words: cultivar discrimination; genetic relationships; molecular characterization; plant breeding; polymorphism.


Introduction
Juglans regia L. (common walnut, Persian walnut, or English walnut) is widely cultivated throughout the temperate regions of the world (McGranahan and Leslie, 1990). The origin of this species seems to be a large area of the Central Asia Mountains (Nekrassowa, 1927; Berg, 1937; Browicz, 1976). During the last glaciations (Würm glaciations), walnut disappeared in Southern Europe and Southern Turkey, but survived in the warmer areas close to the Black Sea and Caspian Sea. According to available pollen data, walnut was reintroduced into former areas during the second millennium BC (Zohary and Hopf, 1988). However, in Northern Spain, walnut pollen has been found in sediments from the Münsterian period, during the Low Palaeolithic time, when temperatures were expected to be low (Sánchez-Goñi, 1988). Walnut pollen has also been detected in the Carihuela Cave sediments, located in Southern Spain. This pollen deposit has been dated to 28,000 years BC, when the Würm glaciations occurred (Carrión and Sánchez-Gómez, 1992). Walnut pollen from 11,000 to 7,000 years BC has been found in Central Italy, and pollen dated to 5,000 years BC has been found in the Southern Alps and the Balkans. In addition, Van den Brinks and Janssen (1985) reported the presence of pollen dated to 4,500 years BC in the Serra da Estrela (Portugal), which is older than the fossil pollen found in Turkey and in Greece (Bottema, 1980). An additional paleopalynological study indicated that walnut was also present in Northern Africa during the Low and Middle Holocene (Ballouche and Damblon, 1988). Considering these reports, and in particular the existence of walnut in both Central Asia and the Iberian Peninsula during the Würm glaciations, some questions have arisen concerning Vavilov's theory about the origin of cultivated species (Zohary, 1970). According to Frutos (2000), walnuts could have survived the rigours of the cold in the Iberian Peninsula because the North-South oriented mountain range known as the Iberian System allowed the migration of species to warmer areas. In contrast, the species growing on the Northern side of the Pyrenees mountains, which are East-West oriented, did not escape from the cold because they met an insurmountable barrier when moving Southwards in the last Ice Age (Frutos, 2000).
More recently, walnut has been cultivated since Greek and Roman times all around the Mediterranean basin, where this species is found as scattered individuals or as groups of trees bordering agricultural land or rivers. In the 18th century, walnut began to be cultivated in South America by the Colonial Spaniards, who exported J. regia genotypes from Spain. Finally, in the 19th century, walnuts were imported into Northern California from France and into Southern California from China (Tulecke and McGranahan, 1994; Beede and Hasey, 1998). The cultivated distribution now includes North and South America (Chile, Argentina), Australia, New Zealand, South Africa, and Japan. The Persian walnut is cultivated extensively for its high-quality nuts and as raw material for the timber industry (McGranahan and Leslie, 2009).
In general, European walnut production still depends largely on trees grown from seedlings. The accurate identification of walnut genotypes is a basic requirement for the management and use of germplasm (clarification of synonymy, homonymy, and misnaming), for practical breeding purposes, and for the protection of proprietary rights. Traditional characterization methods have been based on the analysis and comparison of morphological observations. However, the influence of the environment on the expression of morphological characters, as well as the long juvenile period of walnut trees, makes it difficult to classify plant material properly by morphological traits alone. To overcome these limitations, molecular markers have been used to differentiate, characterize, and identify walnut accessions. Such DNA-based markers are not affected by the environment and can be detected in all tissues at all stages of development. Previous studies of walnut genetic diversity were carried out with isozymes (Arulsekar et al., 1986; Solar et al., 1994; Ninot and Aletà, 2003; Vyas et al., 2003), restriction fragment length polymorphism (RFLP) (Fjellstrom et al., 1994), randomly amplified polymorphic DNA (RAPD) (Nicese et al., 1998), inter-simple sequence repeat (ISSR) (Potter et al., 2002), and amplified fragment length polymorphism (AFLP) markers (Andreakis et al., 2002; Kafkas et al., 2005; Bayazit et al., 2007). Among the DNA-based markers, microsatellites or SSRs (simple sequence repeats) allow a high level of resolution in genetic studies due to their high polymorphism, co-dominant inheritance, reproducibility, and easy detection by PCR (Gupta et al., 1996). Several studies have used microsatellites to examine genetic relationships in walnut (Woeste et al., 2002; Dangl et al., 2005; Foroni et al., 2005, 2006; Robichaud et al., 2006; Victory et al., 2006; Wang et al., 2008; Pollegioni et al., 2009; Bai et al., 2010; Gunn et al., 2010). Recently, Ciarmiello et al. (2010) published the molecular characterization of Juglans cultivars via the amplification refractory mutation system, a standard technique that allows the discrimination of alleles at a specific locus differing by as little as 1 bp (Stirling, 2003).
The main objective of this study was to assess the genetic variability of 57 Persian walnut accessions using SSR markers. The results confirmed the utility of 19 SSRs, previously developed in J. nigra (Woeste et al., 2002), for the characterization of J. regia varieties. This subset of SSR markers represents a powerful tool for future breeding programs involving the walnut collection included in this study, and is also useful for programs aimed at the conservation of walnut genetic resources. In addition, allelic data are presented for use as size standards, to assist in correcting laboratory-to-laboratory variation in allele size calling that may result from differences in methodology. The differences found between the resulting data and those of previous studies are discussed.

Plant material
Plant material was obtained from the walnut collection at the Instituto Murciano de Investigación y Desarrollo Agrario y Alimentario (IMIDA), in Murcia, Spain. The 57 Juglans regia cultivars included in this study, together with their pedigree, place of origin, and observed genetic heterozygosity (Ho, calculated as the number of heterozygous loci for a given cultivar divided by the total number of loci assayed), are listed in Table 1. Forty-four of these accessions come from Spain, 11 from the USA, 1 from France, and 1 from Chile.

DNA extraction and PCR amplification
Genomic DNA was isolated from young fresh leaves using the DNeasy Plant Mini Kit (Qiagen). The extracted DNA was quantified using a spectrophotometer and diluted to 10 ng µL⁻¹ for PCR amplification. Thirty-two primer pairs flanking microsatellites, previously developed in J. nigra, were assayed in 12 genetically diverse walnut genotypes. Nineteen of these showed high polymorphism and clear, repeatable amplification patterns, according to the detection protocol described below, and were selected for the molecular analysis of each accession (Table 2). The PCR reactions were performed in a 20-µL volume; the reaction mixture contained 1x PCR buffer (Ecogen, Barcelona, Spain), 1.9 mM MgCl2, 0.2 mM dNTPs, 0.2 µM of each primer, 0.25 units of Taq DNA polymerase (Ecogen), and 10 ng of genomic DNA. A touchdown PCR amplification protocol was programmed (Don et al., 1991), consisting of an initial step of 5 min at 94°C, 35 cycles of 45 s at 94°C, 45 s at the annealing temperature for the primer, and 45 s at 72°C, followed by a final step of 10 min at 72°C. The annealing temperature (Ta) was 58 or 60°C for the first cycle and was reduced by 0.2°C per cycle for the next 14 cycles. For the last 20 cycles, the annealing temperature was 55 or 57°C (Ta-3°C), respectively (Table 2). The PCR reactions were carried out in a 96-well block thermal cycler (Eppendorf, Barcelona, Spain). The PCR products were detected using the ABI3730 Genetic Analyzer and the GeneMapper analysis software (Applied Biosystems). For capillary electrophoresis detection, forward SSR primers were labeled at the 5' end with the fluorescent dyes NED (yellow), 6-FAM (blue), VIC (green), and PET (red). The size standard used in the sequencer was GS500LIZ (Applied Biosystems). Each reaction was repeated and analyzed twice for confirmation.
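The touchdown annealing profile described above can be written out cycle by cycle. The following Python sketch is only an illustration under one reading of the protocol (one cycle at Ta, 14 cycles each 0.2°C lower than the previous one, then 20 cycles at Ta-3°C, giving 35 cycles in total); the function name and its parameters are hypothetical and do not correspond to any thermal cycler software.

```python
def touchdown_annealing_schedule(ta_first, step=0.2, n_touchdown=14,
                                 n_final=20, final_drop=3.0):
    """Return the annealing temperature (in degrees C) for each PCR cycle.

    Assumed reading of the protocol: cycle 1 at ta_first, then n_touchdown
    cycles each `step` lower than the previous one, then n_final cycles at
    ta_first - final_drop.
    """
    schedule = [round(ta_first, 1)]
    for k in range(1, n_touchdown + 1):
        schedule.append(round(ta_first - step * k, 1))
    schedule.extend([round(ta_first - final_drop, 1)] * n_final)
    return schedule


# The two starting temperatures used in the study (Table 2).
schedule_58 = touchdown_annealing_schedule(58.0)
schedule_60 = touchdown_annealing_schedule(60.0)

print(len(schedule_58))     # total number of cycles
print(schedule_58[:16])     # touchdown phase and first plateau cycle
```

Under this reading, the last touchdown cycle runs at Ta-2.8°C, so the step to the final plateau at Ta-3°C is the same 0.2°C decrement as the preceding cycles.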

Data analysis
The following variability parameters were estimated from the microsatellite marker data obtained (Table 3): number of alleles per locus (Na); effective number of alleles per locus (Ne; $N_e = 1/\sum p_i^2$, where $p_i$ is the frequency of the $i$th allele); observed heterozygosity (Ho, calculated as the number of heterozygous genotypes divided by the total number of genotypes); expected heterozygosity (He; $H_e = 1 - \sum p_i^2$, where $p_i$ is the frequency of the $i$th allele); fixation index ($F = 1 - H_o/H_e$) (Wright, 1950); and the power of discrimination ($PD = 1 - \sum g_i^2$, where $g_i$ is the frequency of the $i$th genotype) (Kloosterman et al., 1993). These analyses were computed with the GenAlEx v6 program (Peakall and Smouse, 2006). CERVUS software 3.0 (Kalinowski et al., 2007) was used to compute the proportion of null alleles and the significant deviations (p < 0.01 and p < 0.001) from Hardy-Weinberg equilibrium (HWE) at individual loci. The sequential Bonferroni test was used to compute the critical significance level for the HWE test (Rice, 1989). The proportion of shared alleles, as described by Bowcock et al. (1994), was used to calculate a genetic distance between all pairwise combinations of the 57 walnut cultivars studied, using the program Populations 1.2.28 (http://www.cnrs.gif.fr/pge) (Langella, 1999). This statistic is a measure of the dissimilarity between two samples; thus, the distance between two individuals that are identical at all loci tested is equal to zero. A dendrogram based on the shared allele genetic distance was constructed using the unweighted pair group method with arithmetic mean (UPGMA) implemented in Molecular Evolutionary Genetics Analysis (MEGA) Program v.
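The per-locus parameters and the shared-allele distance defined above can be illustrated with a short Python sketch. It is not the GenAlEx, CERVUS, or Populations implementation; the function names are hypothetical, the allele sizes in the example are invented for illustration, and the code assumes diploid genotypes with no missing data. The distance is taken as one minus the proportion of shared alleles averaged over loci, which is one common formulation of the Bowcock et al. (1994) statistic.

```python
from collections import Counter


def locus_statistics(genotypes):
    """Compute Na, Ne, Ho, He, F and PD for one locus.

    genotypes: list of (allele1, allele2) tuples, one per cultivar.
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    sum_p2 = sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    ho = sum(1 for a, b in genotypes if a != b) / n
    he = 1 - sum_p2
    # Genotype frequencies: (a, b) and (b, a) are the same genotype.
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    pd = 1 - sum((c / n) ** 2 for c in geno_counts.values())
    return {
        "Na": len(allele_counts),
        "Ne": 1 / sum_p2,
        "Ho": ho,
        "He": he,
        "F": 1 - ho / he if he > 0 else float("nan"),
        "PD": pd,
    }


def shared_allele_distance(ind_a, ind_b):
    """Distance = 1 - proportion of shared alleles over all loci.

    ind_a, ind_b: lists of genotype tuples in the same locus order.
    """
    shared = sum(
        sum((Counter(ga) & Counter(gb)).values())  # multiset intersection
        for ga, gb in zip(ind_a, ind_b)
    )
    return 1 - shared / (2 * len(ind_a))


# Hypothetical genotypes (allele sizes in bp) at a single locus.
example_locus = [(225, 245), (225, 225), (241, 245), (225, 245)]
stats = locus_statistics(example_locus)

# Two hypothetical individuals genotyped at two loci.
ind_1 = [(225, 245), (150, 150)]
ind_2 = [(225, 241), (150, 152)]
distance = shared_allele_distance(ind_1, ind_2)
print(stats, distance)
```

In this toy example Ho (0.75) exceeds He (about 0.59), so F is negative; a deficiency of heterozygotes would give a positive F instead.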

Polymorphism and heterozygosity of SSR markers
Nineteen polymorphic SSR primer pairs, developed for J. nigra (Table 2), were tested in 57 accessions of J. regia from the IMIDA collection (Table 1). Alleles were differentiated clearly using a capillary electrophoresis sequencer, and no discrepancies were found in the banding patterns of the duplicate analyses of each DNA sample. The primer pairs amplified fragments whose sizes ranged from 105 bp at locus WGA054 to 295 bp at locus WGA202 (Table 2). All primer pairs produced a maximum of two bands per genotype, in accordance with the diploid level of this species. Genotypes showing a single amplified fragment were considered homozygous for that particular locus, since segregation analysis would be needed to detect the presence of putative null alleles (Callen et al., 1993). The number of alleles observed (Na) at each locus ranged from 3 (WGA079, WGA118, WGA225, WGA331, and WGA376) to 10 (WGA032), with an average of 5 alleles per locus (Table 3), much higher than the values of 1.3 and 3.9 detected in J. regia with RAPDs (Nicese et al., 1998) and ISSRs (Potter et al., 2002), respectively. Altogether, 97 alleles were identified in the set of accessions. In all cases, the effective number of alleles was lower than the observed number and varied from 1.256 for WGA079 to 3.850 for WGA202 (Table 3). The observed heterozygosity ranged from 0.228 for WGA079 to 0.737 for WGA005, with a mean of 0.517, and was lower than the expected heterozygosity in 14 loci out of 19 (Table 3). Consequently, the fixation index (F) values, used to estimate the degree of allelic fixation, were positive and close to zero for all the loci studied except for WGA005, WGA079, WGA321, WGA332, and WGA376, with an overall mean of 0.095. The WGA069, WGA072, WGA089, and WGA349 loci had the highest estimated frequencies of null alleles (Table 3). Significant departure from Hardy-Weinberg equilibrium appeared only in the WGA069 (p < 0.001), WGA089 (p < 0.01), and WGA349 (p < 0.01) loci, due to a deficiency of heterozygotes (Table 3). None of these deviations remained significant after applying the Bonferroni correction (data not shown).
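The effect of a sequential Bonferroni correction (Rice, 1989) on a set of per-locus tests can be sketched as follows. This is a generic illustration of the step-down procedure, not the analysis actually run for this study: the function name, the locus labels, the significance level, and the p-values are all hypothetical, since the individual p-values are not reported here.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return the names of tests that remain significant.

    p_values: dict mapping a test name to its p-value.
    The smallest p-value is compared with alpha/m, the next with
    alpha/(m - 1), and so on; testing stops at the first failure.
    """
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    significant = []
    for rank, (name, p) in enumerate(ordered):
        if p <= alpha / (m - rank):
            significant.append(name)
        else:
            break  # all larger p-values are also declared non-significant
    return significant


# Hypothetical p-values for five hypothetical loci.
example_p = {
    "locus_A": 0.004,
    "locus_B": 0.03,
    "locus_C": 0.012,
    "locus_D": 0.20,
    "locus_E": 0.60,
}
still_significant = sequential_bonferroni(example_p, alpha=0.05)
print(still_significant)  # ['locus_A', 'locus_C']
```

In this example locus_B would be significant in an uncorrected test at the 0.05 level but does not pass its sequential threshold of 0.05/3, which mirrors how nominally significant HWE deviations can disappear after correction for multiple tests.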
The UPGMA analysis classified the 57 genotypes into two main clusters that generally agree with their geographic origins and pedigree (Fig. 1). The first cluster, from 'As1' to 'Salvador', includes most of the Spanish cultivars studied, and the second one, from 'Serr' to 'Trinta', consists mainly of Californian cultivars. In the first cluster, five discrete sub-clusters can be defined, and for some of these subgroups a close relationship with their geographical origin can be established. Thus, the first subgroup, from 'As1' to 'Callao', consists mostly of cultivars from Northern Spain (Asturias and Cantabria regions), but also includes two cultivars from Southern and Southeastern Spain ('VZ4' from Granada and 'Callao' from Albacete). This subgroup also includes a foreign cultivar from Chile, 'Pirque'. The second subgroup, from 'Garganteña seedling' to 'Tobilla', comprises cultivars from Southeastern Spain (Albacete and Valencia provinces), but also from Cáceres (1), Asturias (2), and Granada (1). The third subgroup is the smallest and includes only the 'As18' and 'Santa Cruz' cultivars from Northern Spain (Asturias and Cantabria regions). The fourth subgroup, from 'VZ3' to 'Hartley', is the most diverse and includes the foreign cultivars 'Franquette' (France) and 'Hartley' (California) together with the Spanish cultivar 'VZ2' from Granada. Finally, the fifth subgroup, from 'VZ6' to 'Salvador', includes cultivars from three different provinces of Spain: Granada (2) and Málaga (1) in Southern Spain, and Murcia (1) in Southeastern Spain. The second cluster, from 'Serr' to 'Trinta', includes most of the Californian cultivars studied, but also includes two Spanish cultivars, 'Ladredo' and 'Sendra'.

Discussion
In this work, the genetic variability of 57 Persian walnut accessions was assessed using SSR markers. The results confirm that J. nigra SSR markers can be used to assess the level of genetic variability in J. regia, in accordance with previous studies (Dangl et al., 2005; Foroni et al., 2005, 2006; Wang et al., 2008; Pollegioni et al., 2009; Gunn et al., 2010). All the cultivars analyzed had a unique SSR fingerprint, which confirms the high efficiency of these markers. Furthermore, the average number of alleles detected per locus (5) and the mean power of discrimination obtained (0.74) confirm that SSRs are a more suitable tool for walnut identification than other molecular marker systems studied previously (Nicese et al., 1998; Potter et al., 2002). In all cases, the effective number of alleles was lower than the observed number. These differences may be due to the presence of private alleles that exist in only a few genotypes, which could be used for their identification. Observed heterozygosity was lower than expected for 14 loci out of 19, and therefore the fixation index (F) values were positive and close to zero for most of the loci studied, suggesting that the global behavior of the walnut genotypes studied was similar to that of a random mating population. In addition, the proportion of null alleles was estimated; they were found to occur at a moderate frequency, with a mean value of 0.053. The WGA069 and WGA349 loci had the highest estimated frequencies of null alleles, 0.149 and 0.196 respectively, consistent with the high values reported by Dangl et al. (2005) for these loci. Significant departure from Hardy-Weinberg equilibrium appeared only in the WGA069 (p < 0.001), WGA089 (p < 0.01), and WGA349 (p < 0.01) loci, due to a deficiency of heterozygotes. However, none of these deviations remained significant after applying the Bonferroni correction.
The observed genetic heterozygosity of the walnut collection studied was moderate, with an average value of 0.52. The presence of private alleles in some cultivars could be due to mutations in the microsatellite sequence that would give rise to longer or shorter new alleles. 'Santa Cruz' was homozygous at 17 of the 19 loci, suggesting that this genotype could be the product of self-pollination.
Comparison of the allele sizes (in base pairs) obtained here with those reported by Dangl et al. (2005) for six varieties studied in common ('Chandler', 'Franquette', 'Howard', 'Payne', 'Serr', and 'Sunland'), at 14 of the 19 SSRs used in this study (Table 6), revealed minor differences that may have resulted from differences in methodology. For five SSRs (WGA001, WGA004, WGA089, WGA202, and WGA349), the data obtained in the two studies were completely identical. Eight SSRs displayed alleles that were either smaller (e.g. 1-bp differences for WGA225; 2-bp for WGA069, WGA118, and WGA276; 3-bp for WGA332) or larger (e.g. 1-bp differences for WGA009, WGA331, and WGA376) compared with those detected in our laboratory. However, the differences between consecutive alleles at these SSRs were identical for the six varieties, except for 'Chandler' with WGA009 and 'Sunland' with WGA276 (Table 6). With WGA321, the data were completely identical for three cultivars ('Chandler', 'Franquette', and 'Payne'), and only 1-bp discrepancies occurred for the shorter allele of 'Serr' and 'Sunland'. Most of these differences might be interpreted as stutter due to extra base additions that occur with some Taq polymerases (Brownstein et al., 1996). The results for the WGA321 locus also show a discrepancy in the scoring of cultivar 'Howard' as homozygous (-/245, reported by Dangl et al., 2005) versus heterozygous (225/245, obtained in this work) (Table 6). Consistent with the pedigree of 'Howard' (Tulecke and McGranahan, 1994), the WGA321-225 allele observed in our experiment (Table 5) could have been inherited from the accession UC56-224 through cultivar 'Sharkey' (224/241) (Dangl et al., 2005), and the WGA321-245 allele could have been inherited from cultivar 'Pedro' (241/245) (Table 5). One possible explanation for the difference found between the two studies would be a mutation of the 225 bp allele to a null allele (no amplification). In this case, the mutation would affect one or both of the primer annealing sites, and not the microsatellite sequence itself. Thus, the analysis of identical samples, and the comparison of walnut microsatellite data among laboratories working on walnut genetic resources, could help to define a common reference allele set for several SSR loci, as has been done in grape (This et al., 2004). This set of reference alleles could then be used to code the data from several laboratories and would enable a straightforward comparison of data and/or genetic resources. The establishment and maintenance of a uniform database with confirmed microsatellite profiles for true-to-type walnut cultivars would support better and more rational management of walnut collections.
The majority of accessions examined in this work are traditional cultivars of unknown parentage. However, a close relationship between the known pedigree and the genetic similarity was observed with SSRs. Thus, the cultivars 'Payne' and 'Eureka', ancestors of most of the Californian cultivars tested, showed, as expected, a similarity value of 0.50 or more with all the cultivars related to them, and clustered together. A definite grouping of the Spanish cultivars can be established according to their geographical origin. Some cultivars, however, were placed outside of their regions. This could be due to movements of plant material from one region to another. Only one French cultivar, 'Franquette', was included in this analysis. The fact that the Californian cultivar 'Hartley' and the French cultivar 'Franquette' clustered together is consistent with previous results indicating that 'Hartley' is derived from crosses involving French cultivars (Potter et al., 2002; Dangl et al., 2005; Foroni et al., 2006). The level of similarity found at the SSR level between the Californian varieties and the two Spanish varieties 'Ladredo' and 'Sendra' indicates that they may have a common or related ancestry.
The markers employed here will be useful for the characterization and comparison of walnut germplasm collections and for the detection of propagation errors. The evaluation of the molecular diversity of walnut genetic resources is important for the optimal development of programs aimed at conservation. These markers were able to identify uniquely all the walnut cultivars studied. In general, the cultivars sharing common parents tended to group together and with at least one of the parents. Since pedigree and passport data are often unknown or incomplete for many fruit species (Warburton and Bliss, 1996), SSRs can be a useful tool for assessing the degree of similarity of cultivars in this species, in order to select the best parental combinations for the production of new genetic combinations.

Table 1. The 57 Juglans regia cultivars included in this study and maintained at the IMIDA collection
a Pedigree data obtained from Tulecke and McGranahan (1994).
b Ho: observed genetic heterozygosity.

Table 2. Characteristics of the 19 microsatellite markers assayed
a Ta: annealing temperature for the first cycle of the PCR. See Material and methods.
b Ta-3: annealing temperature for the last 20 cycles of the PCR. See Material and methods.
c Forward primers were modified at the 5' end with a fluorescent label: 6-FAM (blue), NED (yellow), VIC (green) or PET (red).

Table 3. Variability parameters calculated for 19 SSR markers in 57 walnut cultivars
Na: number of alleles per locus. Ne: effective number of alleles per locus. Ho: observed genetic heterozygosity. He: expected genetic heterozygosity. F: fixation index. F(Null): frequency of null alleles. HWE: probability test for departure from Hardy-Weinberg equilibrium. ND: not done. NS: not significant. * Significant at p < 0.01, ** Significant at p < 0.001. PD: power of discrimination.

Table 4. Allele size (AS) in base pairs and allele frequencies (AF) at nuclear SSR loci

Table 6. Comparison of the allele sizes (in base pairs) obtained in the current work and by Dangl et al.