Genetic relationships between interspecific lines derived from Oryza glaberrima and Oryza sativa crosses using microsatellites and agro-morphological markers

New Rice for Africa (NERICA) varieties are high-yielding rice varieties developed by the Africa Rice Center and cultivated mostly in Sub-Saharan Africa. This study aimed to estimate the parental genomic contributions in 60 lowland NERICA varieties and to establish their molecular profiles. Agro-morphological data were recorded for 17 characteristics, and significant (p<0.05) to highly significant (p<0.0001) differences were obtained for leaf length and width, plant height at maturity, days to heading, maturity, primary and secondary branching of panicles, grain width and grain thickness. A total of 114 polymorphic microsatellite markers covering 2183.13 cM of the rice genome showed that the proportions of alleles introgressed from the donor parent (Oryza glaberrima) into the 52 lowland NERICA lines derived from TOG5681 and IR64 were 11% for BC2, 6.07% for BC3 and 7.55% for BC4. The introgression proportions for the eight remaining lowland NERICA lines, derived from other crosses, ranged from 5.5 to 11.3%. The proportion recorded for the recurrent parent was 83.99%. Across all 60 lowland NERICA lines, the highest proportions of O. glaberrima alleles were found on chromosomes 2, 6 and 12 (TOG5681/IR64) and on chromosome 3 for NERICA-L-29 (TOG5681/IR1529-680-3-2). Multivariate analyses combining agro-morphological and molecular data revealed two major groups of lowland NERICAs, and the released lowland NERICAs were found in cluster 1 of the dendrogram. Genetic and genomic studies, including QTL identification and analysis based on the significant agro-morphological traits identified here, should be used to develop mega-varieties adapted to rice growing conditions in Sub-Saharan Africa.


Introduction
Rice (Oryza sativa L.) is the second most widely grown cereal in the world after wheat. It is one of the main food crops and a staple for the majority of the population in developing countries. In Sub-Saharan Africa (SSA), the potential of lowland agro-ecosystems to produce rice is much higher than that of upland ecologies because they are suited to cropping intensification, with the possibility of growing two or more rice crops per year. The development of more lowland ecologies therefore offers a great opportunity for the
co-dominant, multiallelic, highly polymorphic (even in closely related individuals), highly abundant and uniformly distributed in plant genomes, and widely used for genetic studies, including estimation of the proportions of donor genome and recurrent parent background (Bernardo et al., 2000).
The genotyping of 48 lowland NERICA lines derived from crosses between IR64 (O. sativa) and TOG5681 (O. glaberrima) using 60 microsatellite markers showed that the proportion of introgression from the glaberrima parent (TOG5681) varied with the backcross generation. The estimated average coverage of the donor parent TOG5681 was 7.2% (83.5 cM) at BC2F10, 8.5% (99.3 cM) at BC3F8 and 8.1% (93.8 cM) at BC4F8 (Ndjiondjop et al., 2008). To complement this first study, the objectives of the present study were to (i) estimate the proportion of introgression from the donor parent; (ii) estimate the highest proportion of O. glaberrima introgression; (iii) establish the molecular profiles of the lowland NERICA lines derived from crosses between TOG5681 and IR64; and (iv) assess the genetic relationships among these breeding lines to identify desirable parental combinations and to associate agro-morphological and molecular SSR markers that could be used in an efficient breeding program.

Agro-morphological characterization
The experiments were conducted during the 2008 and 2009 rainy seasons at the AfricaRice experimental station in Ouedeme, located in the southern part of Benin (6°42'46"N, 1°41'07"E, altitude 21 m). An augmented experimental design was laid out in three blocks. NPK 15-15-15 fertilizer was applied as a basal application at a rate of 200 kg/ha during land preparation, and urea was applied at a rate of 50 kg/ha at 14 days after sowing (DAS) and at panicle initiation. Data for 17 descriptors were collected according to the descriptors for wild and cultivated rice (Oryza spp.) from Bioversity International/IRRI/AfricaRice (2007).
them the lowland New Rice for Africa (NERICA). The lowland NERICA lines give real hope for improving the productivity, profitability and sustainability of rice production systems in SSA. They are derived from interspecific crosses between the African rice species (Oryza glaberrima) and the Asian rice species (Oryza sativa indica). The African rice species is resistant to diseases and drought but has a lower yield potential resulting from high grain shattering and susceptibility to lodging (Jones et al., 1997; Futakuchi & Sie, 2009), whereas the Asian rice species has a high yield potential. A total of 60 lowland NERICA lines have been developed and include different levels of backcrossing: 4 BC1, 22 BC2, 19 BC3 and 15 BC4. Several of the released lowland NERICA lines are widely grown in countries such as Benin, Burkina Faso, Cameroon (NL-19), Liberia (NL-19), Sierra Leone, Togo, Mali (NL-20 and NL-42) and Niger. Understanding the genetic variability among the 60 lowland NERICA lines will enable farmers to choose the varieties that are best suited to their ecologies and cultural practices.
By definition, genetic diversity is an inherited variation among and between populations, created, activated and maintained by evolution (Demol et al., 2002). It is a fundamental characteristic without which plant breeders are severely limited. The study of genetic diversity relies on appropriate techniques such as characterization using agro-morphological, physiological, biochemical or molecular markers, which have been used successfully in recent years to help identify promising elite lines. The 60 lowland NERICA lines have been characterized using agro-morphological markers and grouped into three clusters, irrespective of the backcross generation (4 BC1, 22 BC2, 19 BC3 and 15 BC4) (Moukoumbi et al., 2011). Eighty percent of the lowland NERICA lines showed characteristics sought in rainfed lowland growing conditions, such as 20-25 tillers per plant (good); an intermediate plant height (110-130 cm); early to medium duration (100<days<130); dense secondary panicle branching (approximately 50% of spikelets borne directly on primary branches); and highly fertile panicles (>90%). Molecular characterization has been greatly facilitated by the advent of DNA marker technology in the 1980s, which offered a large number of environmentally insensitive genetic markers that could be generated to follow the inheritance of important agronomic traits (Peleman & Van der Voort, 2003). Microsatellites (SSR) are
boric acid and 1 mM EDTA), stained with 0.5 µg/mL bromophenol blue (3X STR) and visualized with an ultraviolet transilluminator, and the images were captured with Alpha Imager HP software. SSR profiles were scored and analyzed.
Descriptive statistics and a mixed-model analysis of variance were computed from the agro-morphological data using XLSTAT (2011) software. Data scoring and statistical analyses were performed as described by Semagn et al. (2006). Only clear polymorphic SSR bands of various molecular weights were scored manually by comparison with the respective parents. The letter "A" was attributed to alleles from the donor parent; the letter "B" to alleles from the recurrent parent (O. sativa); the letter "H" to heterozygotes; the letter "E" to non-parental alleles; and "-" to missing data. The number of polymorphic markers was estimated with Microsoft Excel 2010 for each cross, including the 60 lowland NERICAs. The map distances (2183.13 cM) between the 114 markers were used as the basis for estimating the parental (donor and recurrent) contribution and introgression, the heterozygosity and the non-parental genome per chromosome for each lowland NERICA using Graphical GenoTypes (Van Berloo, 2008). The molecular profile was generated for the 52 lowland NERICAs derived from crosses between TOG5681 and IR64. Multiple correspondence analyses (MCA) following Ward's (1963) method (XLSTAT, 2011) and cluster analysis using the unweighted neighbor-joining method were carried out to investigate the overall variation and patterns of relationships among lowland NERICA(s) using the

Proportions of introgression
Genomic DNA was extracted from 250 mg of young leaves according to the mini-preparation protocol of Risterucci et al. (2000). DNA quantity and quality were assessed using a spectrophotometer at wavelengths of 260 nm and 280 nm. The genomic DNA of the 60 lowland NERICA (NL) lines and all parents was diluted and stored at -20°C.
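The spectrophotometric check at 260 nm and 280 nm can be illustrated with a minimal sketch. It assumes the widely used conventions that an absorbance of 1.0 at 260 nm corresponds to about 50 ng/µL of double-stranded DNA over a 1 cm path length, and that an A260/A280 ratio near 1.8 indicates reasonably pure DNA. The function names, the dilution factor and the example readings are hypothetical and are not taken from the study.

```python
# Illustrative only: standard spectrophotometric conventions for dsDNA.
# Function names and example readings are hypothetical, not from the study.

DSDNA_NG_PER_UL_PER_A260 = 50.0  # conventional factor for double-stranded DNA, 1 cm path


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio used as a rough indicator of DNA purity."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def dsdna_concentration(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration (ng/µL) from A260 and the dilution factor."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("A260 must be non-negative and dilution factor positive")
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading of a sample diluted 10-fold before measurement.
    ratio = purity_ratio(0.27, 0.15)
    conc = dsdna_concentration(0.27, dilution_factor=10)
    print(f"A260/A280 = {ratio:.2f}, concentration = {conc:.1f} ng/µL")
```

A ratio well below 1.8 is usually read as a sign of protein contamination, in which case the extract would typically be cleaned up again before PCR.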
Genomic DNA extracted from the 60 lowland NERICA lines was analyzed with simple sequence repeat (SSR) markers using PCR. The number of primers used ranged from 73 to 250 (Table 1), according to the cross (Orjuela et al., 2010). A total SSR-PCR reaction volume of 25 µL was amplified using the following program: initial denaturation (1 cycle at 94°C for 4 min), followed by 35 amplification cycles of denaturation (94°C for 30 s), primer annealing (55°C for 30 s) and elongation (72°C for 45 s), and a final elongation (72°C for 5 min). SSR-PCR products were separated by electrophoresis on 3% agarose gels in 0.5X TBE buffer (40 mM Trizma base-HCl, 40 mM

High proportion of the introgression of the donor parent (O. glaberrima)
The highest proportion of the introgression of the donor parent (O. glaberrima) in the 52 NL was on
software package Graphical GenoTypes (Van Berloo, 1999).
Table 1 shows that the differences among agro-morphological traits such as plant height at maturity, panicle secondary branching and grain width were significant (p<0.05), while those for leaf length and width, panicle primary branching, days to heading, maturity and grain thickness were highly significant (p<0.0001). Most of the lowland NERICAs were of intermediate plant height (110-130 cm). The mean plant height at maturity was 119.57 cm, with a minimum of 87.30 cm and a maximum of 178.20 cm. Panicle primary and secondary branching was dense, heavy and compact, and rarely open. Leaf width varied between 1.01 and 2.53 mm with a mean of 1.45 mm, while leaf length ranged from a minimum of 32.38 mm to a maximum of 77.22 mm. Days to heading ranged from early to medium (100<days<130).

-39) and Mali (NL-42). The major difference observed between the cluster analysis and the MCA was the nine sub-groups observed in the cluster analysis, which were not evident in the MCA.
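The cluster analysis in this study is based on simple matching coefficients derived from the SSR scores. As a rough illustration of that coefficient, the sketch below computes the share of markers at which two lines carry the same score. Treating the scores as plain categorical codes and skipping markers with missing data ("-") in either line are assumptions made for this example, and the function name and the example score strings are hypothetical.

```python
# Illustrative sketch of a simple matching coefficient between two lines.
# Skipping markers with missing data is an assumption of this example.


def simple_matching(scores_a, scores_b, missing="-"):
    """Return the fraction of jointly scored markers with identical codes."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both lines must be scored at the same markers")
    # Keep only markers scored in both lines.
    pairs = [(a, b) for a, b in zip(scores_a, scores_b)
             if a != missing and b != missing]
    if not pairs:
        raise ValueError("no marker is scored in both lines")
    matches = sum(1 for a, b in pairs if a == b)
    return matches / len(pairs)


if __name__ == "__main__":
    # Hypothetical score strings for two lines at six markers.
    line_1 = "BBABHB"
    line_2 = "BBBB-B"
    similarity = simple_matching(line_1, line_2)
    print(f"similarity = {similarity:.2f}, dissimilarity = {1 - similarity:.2f}")
```

A dissimilarity matrix built from one minus this coefficient for every pair of lines is the kind of input a neighbor-joining or hierarchical clustering routine expects.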

Discussion
Agronomic and morphological traits were examined as recommended by Jacquot & Arnaud (1979) and Glaszmann (1987). The most discriminating quantitative traits were leaf length and width, days to heading, maturity and panicle primary branching. Sie (1991) similarly reported leaf length and width, and grain length and weight, as discriminating traits in a genetic evaluation of traditional rice varieties. In addition, these results provide information on the weed-suppressive and high-yielding characteristics of the 60 lowland NERICAs.
The polymorphic SSR markers were well distributed along the 12 rice chromosomes. The main advantages of the SSR markers used are their co-dominance and high polymorphism, even among very closely related individuals, which make them efficient for assessing parental contributions, as reported by several studies. The estimated O. glaberrima genome among interspecific lines (O. glaberrima and O. sativa) was
chromosomes 2, 6 and 12 and the lowest on chromosomes 1 and 4 (Fig. 1). For eight lowland NERICAs, the highest introgression occurred on chromosomes 2 (NL-22 and NL-42), 4 (NL-21) and 6 (NL-21 and NL-59). There was no donor parent genome introgression on chromosomes 2, 3, 4, 6, 10 and 12 (NL-23, NL-24 and NL-25).
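As a rough illustration of how genome proportions of this kind can be derived from SSR scores, the sketch below weights each marker score by a map length in cM and sums the lengths per allele class. It uses the codes of the scoring scheme: A for the donor parent, B for the recurrent parent, H for heterozygotes, E for non-parental alleles and - for missing data. Assigning one fixed segment length to each marker and excluding missing data from the denominator are simplifying assumptions of this example. This is not the algorithm implemented in Graphical GenoTypes, and the function name and example values are hypothetical.

```python
# Simplified sketch: genome proportions per allele class, weighted by cM.
# NOT the Graphical GenoTypes algorithm; segment lengths are assumed inputs.
from collections import defaultdict


def genome_proportions(scores, segment_cm, missing="-"):
    """Return the percentage of scored map length carried by each allele code."""
    if len(scores) != len(segment_cm):
        raise ValueError("one segment length is needed per marker score")
    totals = defaultdict(float)
    for code, length in zip(scores, segment_cm):
        if code == missing:
            continue  # markers without data do not contribute
        totals[code] += length
    covered = sum(totals.values())
    if covered == 0:
        raise ValueError("no scored marker with positive map length")
    return {code: 100.0 * length / covered for code, length in totals.items()}


if __name__ == "__main__":
    # Hypothetical chromosome with five markers and their segment lengths (cM).
    scores = ["A", "B", "B", "H", "-"]
    lengths = [10.0, 60.0, 20.0, 10.0, 5.0]
    for code, pct in sorted(genome_proportions(scores, lengths).items()):
        print(f"{code}: {pct:.1f}%")
```

Running the same function chromosome by chromosome, and then over all markers of a line, would give per-chromosome and genome-wide estimates under these assumptions.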

Lowland NERICA structuring
Cluster analysis and MCA are useful for evaluating the potential breeding value of the lowland NERICAs. The first two axes of the MCA explained 90.37% of the total variability (Fig. 2) and revealed two distinct major groups among the lowland NERICAs, regardless of the backcross level and the proportion of the introgressed donor parent genome. TOG5681 (O. glaberrima) appeared distant from all lowland NERICA lines and the other parental lines.
The cluster analysis was performed using simple matching coefficients derived from the 114 SSR markers. The dendrogram produced two distinct clusters with nine sub-clusters (Fig. 3). Forty-six percent of the lowland NERICAs were found in cluster 1, which included four sub-clusters from NL-1 to NL-34. Eight lowland NERICA varieties released in SSA, where a strong variability was found, were included in
Figure 2. Score plot on the first two principal components from the multiple correspondence analysis (MCA) of the 52 lowland lines derived from TOG5681 and IR64 crosses genotyped with 114 SSRs. NERICA lines are described in Table 3.
NL-47 and NL-59 derived from crosses with TOG5674 and TOG5675, the proportion of introgression was highest and ranged from 5.5 to 9.5%. These three lowland NERICA varieties might have introgressed the resistance alleles rymv1-4 and rymv1-5 (Albar et al., 2003), which were identified in TOG5674 and TOG5675 (O. glaberrima varieties). The 10 lowland NERICA varieties released in Sub-Saharan Africa are found in the nine sub-clusters, and eight of them belong to cluster 1, where 80% of the lowland NERICA varieties showed characteristics adapted to lowland rice growing conditions. Cluster 2 included lowland NERICA varieties that might be grown in both upland and lowland ecologies. Lowland NERICA-5, derived from TOG5681/IR64, could be screened for abiotic and biotic stresses and might reveal other quantitative trait loci (QTLs) for tolerance or resistance genes hidden in the TOG5681 variety. This information can help other researchers to identify important agronomic traits and encourage research on QTLs for various stresses. Molecular analysis revealed wider genetic variability among the 60 lowland NERICA lines than the agro-morphological analysis reported in previous studies.
7.2, 8.5 and 8.1% at BC2, BC3 and BC4, respectively.
In this study, the proportions of the TOG5681 genome were 11, 6.07 and 7.55% at BC2, BC3 and BC4, respectively, and were statistically different (p<0.05). The proportion of the O. glaberrima genome was, however, lower at BC3 than at BC4, and at BC4 it was roughly double the expected Mendelian value; the expected values are 12.5% (BC2), 6.25% (BC3) and 3.13% (BC4). Hospital (2005) reported that during successive backcrosses the genome of the donor parent must move towards zero on all chromosomes except the one carrying the introgressed allele of interest, but the gap observed between the expected and estimated donor parent contributions could be explained in different ways. The gap might have resulted from environmental effects on plant growth and development. Phenotypic variability can be observed even between genotypes belonging to the same group with the same parents (Cisse et al., 2006) and between population sizes used for making selections. The number of markers and their distribution could also explain the disparities.
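The expected Mendelian values quoted above follow from halving the donor contribution at each backcross: in the absence of selection, the expected donor proportion after n backcrosses is (1/2)^(n+1). The short sketch below reproduces these expectations and sets them against the proportions estimated in this study. The function name is hypothetical, and the comparison is purely arithmetic; it makes no statement about statistical significance.

```python
# Expected donor genome share after n backcrosses without selection: (1/2)^(n+1).
# Observed values are the proportions reported in this study for TOG5681/IR64 lines.


def expected_donor_percent(n_backcrosses):
    """Return the expected donor genome percentage at generation BCn."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 100.0 * 0.5 ** (n_backcrosses + 1)


OBSERVED_PERCENT = {2: 11.0, 3: 6.07, 4: 7.55}

if __name__ == "__main__":
    for n, observed in sorted(OBSERVED_PERCENT.items()):
        expected = expected_donor_percent(n)
        print(f"BC{n}: expected {expected:.3f}%, observed {observed:.2f}%, "
              f"observed/expected = {observed / expected:.2f}")
```

The ratio of observed to expected values makes the pattern discussed here easy to see: the BC2 and BC3 estimates sit close to expectation, whereas the BC4 estimate is well above it.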
Studies using the same parents (Ndjiondjop et al., 2008; Agnoun et al., 2012) reported the highest proportion of introgression on chromosome 6. The present study showed that the highest introgression occurred on chromosomes 2 and 12. Variability in the introgressed donor parent genome has been widely observed. The lack of introgression of the donor parent genome in some lowland NERICA lines could explain some of the phenotypic differences observed during the vegetative and reproductive stages of these lowland NERICAs. Indeed, the genome of the O. glaberrima parent can be only partially introgressed into progenies when the crosses are carried out with an O. sativa variety (Barry et al., 2007). In addition, intensive selection occurred during the selfing of the BC2, BC3 and BC4 generations, according to the number of traits concerned. As mentioned by Heckenberger et al. (2005), selection and genetic drift during inbreeding might explain the differences observed between the current and expected proportions of the donor parent genome in the 60 lowland NERICA varieties. The critical factor is not the proportion of the donor parent genome introgressed but what it represents in terms of genetic information, including the number of genes accumulated. The lack of donor parent genome on chromosomes 2, 3, 4, 5, 6, 10 and 12 in NL-23, NL-24 and NL-25, derived from a cross involving three parents, might be explained by the fact that these varieties have never gone beyond experimental selection in spite of their demonstrated tolerance to salinity and cold. Seed admixtures, spontaneous mutation and anther sterility during pollination resulting from sporo-gametophytic interactions (Zeng et al., 2009) during the development of the 60 lowland NERICA varieties could account for the presence of the non-parental alleles. Regarding the lowland NERICA varieties NL-43,