Development of a SNP parentage assignment panel in some North-Eastern Spanish meat sheep breeds

Aim of study: To validate two existing single nucleotide polymorphism (SNP) panels for parentage assignment in sheep, and to develop a cost-effective genotyping system for accurate pedigree assignment in some North-Eastern Spanish meat sheep populations.
Area of study: Spain.
Material and methods: Nine sheep breeds were sampled: Rasa Aragonesa (n=38), Navarra (n=39), Ansotana (n=41), Xisqueta (n=41), Churra Tensina (n=38), Maellana (n=39), Roya Bilbilitana (n=24), Ojinegra (n=36) and Cartera (n=39). These animals were genotyped with the Illumina OvineSNP50 BeadChip array. Genotypes were extracted for the sets of 249 SNPs and 163 SNPs for parentage assignment designed in France and North America, respectively. A selected cost-effective genotyping panel of 158 SNPs from the French panel was validated by Kompetitive allele specific PCR (KASP). Additionally, some functional SNPs (n=15) were also genotyped.
Main results: The set of 249 SNPs for parentage assignment showed better diversity, probability of identity, and exclusion probabilities than the set of 163 SNPs. The average minor allele frequencies for the sets of 249, 163 and 158 SNPs were 0.41 ± 0.01, 0.39 ± 0.01 and 0.42 ± 0.01, respectively. The parentage assignment rate was highly dependent on the percentage of putative sires genotyped.
Research highlights: The described method is a cost-effective genotyping system that combines the genotyping of SNPs for parentage assignment with some functional SNPs, and it was successfully used in some Spanish meat sheep breeds.


Introduction
Breeding programs aim to achieve sustainable genetic gains in one or several traits while controlling the loss of genetic variation. Traditional pedigree-based BLUP (best linear unbiased prediction) selection (Henderson, 1984) relies on estimated breeding values (EBVs) calculated from performance records and pedigree information. However, the success of genetic evaluation systems depends directly on the accuracy of pedigrees. Complete pedigree information is a prerequisite to obtain accurate EBVs, correctly rank parents and offspring, and maximize the genetic gain (Israel & Weller, 2000; Raoul et al., 2016). In this sense, the proportion of known sires is very low in Spanish meat sheep populations because the management (extensive or semi-extensive farming) relies very little on artificial insemination (AI) or natural mating with a single ram per group of ewes. Moreover, a number of these populations are considered endangered breeds with a reduced effective population size, and are reared in small flocks. Therefore, implementing a mating scheme based on pedigree information can help control inbreeding, which is greatly affected by population structure (Gutiérrez et al., 2008).
In this situation, the number of ewes belonging to a breeding program nucleus remains limited, because only some of them are inseminated by or mated to a single identified ram. Another source of incorrect pedigree records is ewes failing to keep their litter together, or lamb desertion, which may limit the selection response (Barnett et al., 1999; Visscher et al., 2002).
Therefore, genomic information such as DNA markers can help reconstruct the phylogenetic relationships of populations. Microsatellite markers have been used extensively for parentage control in sheep (Arruga et al., 2001; Glowatzki-Mullis et al., 2007; Saberivand et al., 2011; Visser et al., 2011; Souza et al., 2012; da Silva et al., 2014) and are recommended by the International Society for Animal Genetics (ISAG) because they are highly abundant and informative, relatively inexpensive to use, and generate satisfactory results in tests for paternity exclusion. However, as in genomic selection studies (Meuwissen et al., 2013), single nucleotide polymorphisms (SNPs) are now widely used as DNA markers, and are genotyped on SNP chip arrays that allow high-throughput genotyping (Heaton et al., 2002; Werner et al., 2004; Hayes, 2011). Recently, various SNP panels have been developed specifically for parentage assignment in sheep of different international breeds (Bell et al., 2013; Clarke et al., 2014; Heaton et al., 2014; Tortereau et al., 2017). The SNP panel developed from French breeds was the first panel based on European sheep breeds (Tortereau et al., 2017). These authors pointed out that four Spanish breeds (Churra, Ojalada, Castellana and Rasa Aragonesa) belonging to the Sheep HapMap breeds of the International Sheep Genomics Consortium (Kijas et al., 2012a) had minor allele frequency (MAF) values for the selected SNPs similar to those described in the French breeds, suggesting that this panel should perform well in these Spanish breeds.
In addition, a national breeding program for resistance to classical scrapie was implemented in Spain. In this program, animals are genotyped, and those carrying favorable PRNP alleles for resistance are used as breeding animals (Hunter et al., 1997; Acín et al., 2004).
In the same way, a selection program for prolificacy in the Rasa Aragonesa breed includes the genotyping of breeding animals for alleles associated with prolificacy (Calvo et al., 2020), as well as for alleles related to reproductive seasonality (Calvo et al., 2018). Apart from these SNPs, other SNPs are of interest for validating their effects in these Spanish breeds, such as those related to atypical scrapie susceptibility (Moum et al., 2005), susceptibility to lentivirus infection (Heaton et al., 2012; Sider et al., 2013), or other alleles found in other breeds and related to prolificacy (Bodin et al., 2007; Drouilhet et al., 2013).
The objective of this study was to validate two existing SNP panels for parentage assignment in sheep, including the French panel, and to develop a cost-effective genotyping system based on a reduced set of SNPs in an open platform, for accurate pedigree assignment in some North-Eastern Spanish meat sheep populations. In a second stage, we tested and validated the performance of the cost-effective genotyping system, together with some functional SNPs, in replacement lambs from different farms and breeds.
Genomic DNA was extracted from blood samples of the 335 ewes using the FlavorPrep Genomic DNA mini kit (Flavorgen, Ibian, Zaragoza, Spain). DNA samples were genotyped with the Illumina (San Diego, California, USA) OvineSNP50 BeadChip array designed by the International Sheep Genome Consortium (Kijas et al., 2012b). SNP genotyping services were provided by the "Xenetica Fontao" company (www.xeneticafontao.com).

Selection of the SNP panel for parentage assignment
Firstly, we applied quality control (QC) criteria to the raw genotypes obtained from the OvineSNP50 BeadChip array using PLINK 1.9 (Chang et al., 2015) as follows: i) individuals with a low call rate (< 0.97) were excluded from further analysis; ii) SNPs with an unknown location in the ovine chromosomes were excluded; iii) SNPs were also excluded if they showed a low call rate (< 0.97), a MAF < 0.05, or significant deviations from Hardy-Weinberg equilibrium (HWE) (p-value < 0.001) within breed. The subsequent analysis focused on two sets of SNPs: the set of 249 SNPs published as the French panel for parentage assignment (Tortereau et al., 2017), and the panel of 163 SNPs described in Heaton et al. (2014) and used in North American and globally diverse breeds. The effectiveness of paternity assignment depends not only on the number of SNPs used but also on the level of informativeness that these markers provide. To study the informativeness of the SNPs included in this work, three indexes were calculated for both sets of SNPs and for each population included in this study: the MAF, the exclusion probability (PE), and the probability of identity (PI) (Schütz & Brenig, 2015; Tortereau et al., 2017). PE is the probability of excluding one (PE1) or two (PE2) randomly sampled parent(s) from the parentage of an individual that is truly unrelated to them. PE1 assumes that genotypes are known for the offspring and a putative parent, but that genotypes are not available for a known parent (one parent missing). PE2 assumes that genotypes are known for the offspring, one confirmed parent, and one putative parent (both parents genotyped). PI is the probability that two randomly selected individuals in a population have identical genotypes for all the SNPs genotyped.
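The three indexes can be illustrated with a minimal Python sketch. It is not the software used in this study; it assumes biallelic SNPs in Hardy-Weinberg equilibrium and independent (unlinked) loci, and uses the standard closed forms for that case (per locus, with allele frequencies p and q: PI = p^4 + q^4 + (2pq)^2, PE1 = 2p^2q^2, PE2 = pq(1 - pq)). The function names are hypothetical.

```python
import math

def locus_indices(maf):
    """Per-locus PI, PE1 and PE2 for one biallelic SNP (assumes HWE)."""
    p, q = maf, 1.0 - maf
    pi = p**4 + q**4 + (2 * p * q) ** 2   # two random individuals share a genotype
    pe1 = 2 * (p * q) ** 2                # exclusion power, one parent missing
    pe2 = p * q * (1 - p * q)             # exclusion power, other parent known
    return pi, pe1, pe2

def panel_indices(mafs):
    """Combine per-locus values over a panel, assuming independent loci."""
    log10_pi = 0.0      # PI is tiny, so accumulate it on the log10 scale
    not_excl1 = 1.0     # probability that no locus excludes (one parent missing)
    not_excl2 = 1.0     # probability that no locus excludes (other parent known)
    for maf in mafs:
        pi, pe1, pe2 = locus_indices(maf)
        log10_pi += math.log10(pi)
        not_excl1 *= 1.0 - pe1
        not_excl2 *= 1.0 - pe2
    return {"log10_PI": log10_pi, "PE1": 1.0 - not_excl1, "PE2": 1.0 - not_excl2}

# Illustrative input only: 158 SNPs that all have MAF = 0.42
example = panel_indices([0.42] * 158)
```

Under these simplifying assumptions, a panel of 158 SNPs with a uniform MAF of 0.42 gives a combined PI between 1E-67 and 1E-66, which helps to see why panels of this size yield extremely small PI values.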
A reduced panel of 158 SNPs from the French panel was chosen for cost-effective genotyping in an open platform for parentage assignment. Only SNPs with a MAF > 0.3 and a call rate > 0.97 in all nine breeds were selected.
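The selection rule above (a SNP is kept only if it passes both thresholds in every breed) can be sketched as follows. This is an illustration under assumed inputs, not the pipeline used in the study: the data layout, the function name and the SNP identifiers are hypothetical.

```python
def select_panel(stats, min_maf=0.3, min_call_rate=0.97):
    """Return the SNPs whose MAF and call rate exceed the thresholds in every breed.

    stats: {snp_name: {breed_name: (maf, call_rate)}} (hypothetical layout)
    """
    selected = []
    for snp, per_breed in stats.items():
        passes_everywhere = all(
            maf > min_maf and call_rate > min_call_rate
            for maf, call_rate in per_breed.values()
        )
        if passes_everywhere:
            selected.append(snp)
    return selected

# Made-up values for two breeds and three placeholder SNPs
stats = {
    "snp_1": {"Rasa Aragonesa": (0.45, 0.99), "Navarra": (0.41, 0.98)},
    "snp_2": {"Rasa Aragonesa": (0.44, 0.99), "Navarra": (0.28, 0.99)},  # low MAF in one breed
    "snp_3": {"Rasa Aragonesa": (0.38, 0.95), "Navarra": (0.40, 0.99)},  # low call rate in one breed
}
panel = select_panel(stats)
```

Requiring the thresholds to hold in every breed, rather than on average, avoids keeping markers that are poorly informative in any single population.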

Parentage assignment validation
We carried out paternity assignment in each of the ten farms using the CERVUS software (Kalinowski et al., 2007). CERVUS uses a simulation procedure to determine the distribution of the critical values of the logarithm of the odds (LOD) or Delta score at 80% and 95% confidence levels for the candidate father-offspring pairs. The LOD score was used for paternity assignment. The simulation parameters were as follows: 10,000 simulated offspring; the number of candidate parents and the proportion of sires sampled as provided by the breeders' association (the latter varying between 50% and 100%); at least 90% of loci with allele calls; and an estimated genotyping error rate of 5%. We allowed one SNP genotype mismatch between an offspring and its assigned sire to account for technical genotyping failures.
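The tolerance of one mismatch can be illustrated with a simple exclusion check. The sketch below is not the likelihood-based LOD procedure implemented in CERVUS; it only counts opposing homozygotes between an offspring and each candidate sire. Genotypes are assumed to be coded as 0, 1 or 2 copies of a reference allele, with None for a missing call, and the 90% threshold is applied here to loci typed in both animals, which is a simplification. All names are hypothetical.

```python
def opposing_homozygotes(offspring, sire):
    """Count loci where the two animals are opposite homozygotes (0 vs 2).

    Returns (mismatches, number of loci typed in both animals).
    """
    mismatches = 0
    typed = 0
    for o, s in zip(offspring, sire):
        if o is None or s is None:
            continue  # skip loci with a missing call in either animal
        typed += 1
        if abs(o - s) == 2:
            mismatches += 1
    return mismatches, typed

def compatible_sires(offspring, candidates, max_mismatch=1, min_typed_fraction=0.9):
    """Return the candidate sires that are not excluded for this offspring."""
    n_loci = len(offspring)
    kept = []
    for name, genotype in candidates.items():
        mismatches, typed = opposing_homozygotes(offspring, genotype)
        if typed >= min_typed_fraction * n_loci and mismatches <= max_mismatch:
            kept.append(name)
    return kept

# Made-up genotypes at 10 loci
lamb = [0, 1, 2, 2, 0, 1, 2, 0, 1, 2]
candidates = {
    "sire_A": [0, 0, 1, 2, 1, 1, 2, 0, 2, 1],  # no opposing homozygotes
    "sire_B": [2, 1, 0, 0, 2, 1, 0, 2, 1, 0],  # many opposing homozygotes
    "sire_C": [2, 0, 1, 2, 1, 1, 2, 0, 2, 1],  # one opposing homozygote
}
kept = compatible_sires(lamb, candidates)
```

In this toy example two candidates remain compatible, which shows why a likelihood score such as the LOD is needed to rank the candidates that simple exclusion cannot separate.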

Results and discussion
We selected two sets of SNPs previously described for parentage assignment in sheep. All SNPs from these panels fulfilled the QC criteria. Tortereau et al. (2017) reported that the panel of 249 SNPs used for parentage assignment in the French breeds had similar average MAF values in four Spanish breeds (Churra, Ojalada, Castellana and Rasa Aragonesa) belonging to the Sheep HapMap breeds (Kijas et al., 2012a), suggesting that this set of SNPs should perform well in these breeds. Furthermore, the study describing the North American panel of 163 SNPs for parentage testing (Heaton et al., 2014) found that the Rasa Aragonesa sheep breed had the highest MAF value (0.40) of all breeds.
In the nine North-Eastern Spanish meat sheep populations included in this study, no Mendelian inheritance errors were detected in verified family trios or duos for both panels. SNPs were in Hardy-Weinberg equilibrium. There were no uninformative SNPs (MAF=0) in any breed group. The average MAF values for the sets of 249 and 163 SNPs were 0.41 ± 0.01 and 0.39 ± 0.01, respectively. The set of 249 SNPs had better diversity, PI, PE1 and PE2 values than the set of 163 SNPs (Table 2). PI, PE1 and PE2 values are highly dependent on the number of SNPs (Jamieson & Taylor, 1997), but Tortereau et al. (2017) demonstrated that these values were better on these populations with 150 SNPs randomly selected from the French panel than those of the North American panel.
Table 2. Major statistics for two parentage panels (French, North American and globally diverse breeds), and a subset of 158 SNPs from the French panel on the 9 Spanish populations: MAF, PI (Probability of identity), PE1 and PE2 (exclusion probabilities considering the exclusion of one or the two parents respectively).
For the reasons described above, we selected a reduced panel of 158 SNPs from the French panel for cost-effective genotyping. Only SNPs with a MAF value greater than 0.3 and a call rate > 0.97 in all nine populations were retained. The SNPs were distributed over the 26 autosomes. The major statistics for this panel of 158 SNPs (MAF, PI, PE1 and PE2) are shown in Table 2. The names, MAFs, and other features of the SNPs of each panel are shown in Tables S1-S3 [suppl]. The average MAF for the set of 158 SNPs was 0.42 ± 0.01, higher than in the other two sets of SNPs. The reduced panel (158 SNPs) showed slightly better PI, PE1 and PE2 values than the American one (163 SNPs), although Ansotana and Rasa Aragonesa showed lower PI values with the set of 163 SNPs. At the population level, the lowest average MAF values were obtained in Churra Tensina, and the greatest in the Xisqueta and Navarra breeds, whatever the panel. In general, all breeds showed good PI, PE1, and PE2 values. For the set of 158 SNPs, the probability (PI) that two randomly selected individuals have identical genotypes within breed was very low, reaching its lowest and highest values in the Cartera (2.38E-66) and the Roya Bilbilitana populations (4.94E-64), respectively. However, in the Rasa Aragonesa breed, a lower PI value was found with the set of 163 SNPs (9.57E-66) than with the set of 158 SNPs (1.22E-64).
The set of 158 SNPs was also used to perform parentage assignment validation using KASP technology. Furthermore, the 15 functional SNPs were genotyped in conjunction with those used for parentage assignment, for a total of 173 SNPs. KASP technology was chosen because it is a very cost-effective genotyping platform. In this sense, the total cost per sample for a 192-SNP assay (DNA extraction and genotyping of a maximum of 192 SNPs) was €9 when dealing with more than 1,500 individuals (all-inclusive service from LGC Genomics, Hoddesdon, UK). The price goes down around €2 when genotyping more than 3,000 samples. Five SNPs from the reduced panel failed or had a call rate < 0.95 in KASP genotyping. However, MAF, PI, PE1 and PE2 had similar values (Table S4 [suppl]). Functional SNPs were genotyped successfully. For example, we could efficiently genotype the numerous alleles of the PRNP gene (Table 1) at codons 136 (p.A136V,T), 141 (p.L141F), 154 (p.R154H) and 171 (p.Q171R,H,K), identifying 3, 2, 2, and 4 alleles for each codon, respectively.
This validation was performed in 12 commercial farms from three different breeds. Farmers declared the proportion of putative sires sampled on the farm, because not all the putative males were available, mainly because some sires were dead. Table 3 shows the assignment rate in different farms from the three breeds. As expected, when the list of putative sires was completely (or almost completely) genotyped in a farm, a very high assignment rate was obtained. In two farms, a 100% assignment rate was achieved. In general, the assignment rate is highly dependent on the percentage of putative sires genotyped. We found only one out of 2,018 replacement ewes (farm G) with two possible parents, which formed a father-offspring pair. This problem was previously pointed out by Tortereau et al. (2017), who recommended genotyping at least 180 SNPs given the number of false-positive results obtained when the dam is not genotyped and the true sire is not among the candidate sires, or when candidate sires are highly related. Because this is an open genotyping platform, we could complete the panel with more SNPs to increase the parentage assignment power, or add new validated functional SNPs.
The total cost per sample for this set of 173 SNPs, covering parentage assignment and the genotyping of some functional genes (the price is the same for up to 192 SNPs), is similar to that of microsatellite-based testing. This panel is now routinely used in the Rasa Aragonesa and Ojinegra sheep breeds for parentage assignment and genotyping of functional SNPs by KASP. Marker- or gene-assisted selection (MAS/GAS) is being applied in these breeds for the pre-selection of replacement animals, to increase the frequency of favorable alleles of a major gene, for example PRNP alleles for scrapie resistance or BMP15 alleles for litter size. However, a balance over time between selection for polygenes and for the major gene for a given trait is needed to avoid inbreeding and maintain the genetic variability within the breed.
In conclusion, the described method is successfully used in some Spanish meat sheep breeds, combining the genotyping of SNPs for parentage assignment with some functional SNPs that can be used for the pre-selection of replacement animals. It is a cost-effective genotyping system, which is routinely used in the selection schemes of the Rasa Aragonesa and Ojinegra meat sheep breeds by KASP genotyping technology. In addition, the SNPs for parentage assignment could be genotyped using other genotyping platforms such as, for example, a custom low-density array (these SNPs are included in the Illumina OvineSNP50 BeadChip array) or Sequenom technology, as described by Tortereau et al. (2017).
Table 3. Assignment rate of replacement ewes in different farms from the Rasa Aragonesa, Navarra and Cartera breeds. The number of replacement ewes, sires and declared proportion of sires sampled and genotyped are also indicated.