Canonical correlation analysis for the determination of relationships between plant characters and yield components in red pepper [Capsicum annuum L. var. conoides (Mill.) Irish] genotypes

In this study, canonical correlation analysis (CCA) was used to estimate the relationships between plant characters [X set: fruit length (FL), fruit width (FW), fruit wall thickness (FWT), placenta length, stem thickness, plant height (PH), leaf length, leaf width, flowering time (50%), and time to maturity] and yield components [Y set: total fruit weight per plant (FW/P), average fruit weight, and number of fruits per plant (FN/P)] of 56 red pepper [Capsicum annuum L. var. conoides (Mill.) Irish] populations collected from Samsun province in the Black Sea Region of Turkey. All canonical correlation coefficients (0.708, 0.635, 0.413) between the pairs of canonical variables were found to be significant (P < 0.01). The findings obtained from the CCA indicate that FN/P made the largest contribution to the explanatory capacity of the canonical variables estimated from the yield components of the 56 red pepper populations. FL and PH made the largest contributions to the explanatory capacity of the canonical variables estimated from the plant characters. The results of this study show that PH, FWT and FW should be used with the aim of increasing yield per plant in red pepper genotypes.
Additional key words: canonical correlation coefficient, canonical variable, multivariate analysis, population, variation.


Introduction
Correlation coefficients generally show relationships among independent characters and the degree of linear relation between these characteristics (Guler et al., 2001). In plant breeding research, usually more than one character or variable is measured on the same plant. To determine the degree and direction of the linear relationships among these measurements, researchers usually prefer simple correlation analysis (Cankaya and Kayaalp, 2007). It is important to determine the relationship between two or more characters measured at an early time and at a later time (Akbas and Takma, 2005; Cankaya et al., 2008), since early selection is one of the vegetable breeding methods applied to obtain higher-yielding plants. If there is a relationship between plant characters and yield components, multivariate analyses such as canonical correlation analysis (CCA), a technique for describing the relationship between two variable sets simultaneously and for producing both structural and spatial meanings (Thompson, 1984), can provide pepper breeders with information for indirect selection.
Peppers (Capsicum annuum L.) are a very important commodity in countries such as China, Mexico, Turkey, India and Hungary. Peppers have a special place in Turkish cuisine and are consumed either fresh or processed all year long (Balkaya and Karaagac, 2005). Pepper production in Turkey shows a steady increase. Peppers are commonly grown in the Black Sea Region of Turkey, and Samsun province is a producer in this area. As with all cultivated plants, the main objective of pepper growing is to obtain high-yield and high-quality crops. The strategy of the Capsicum breeder is to assemble a cultivar with superior genetic potential for yield and improved quality (Bosland, 1993). Pepper yield is affected by genotypic and environmental factors. For this reason, determining the effects of genotypic factors is a primary concern in pepper breeding. Determining fruit characters and yield components will provide important benefits in pepper breeding programs. More information is needed to clarify the relationships between plant and fruit characteristics.
The application of CCA in plant breeding began to increase with the availability of related computer packages (Carvalho et al., 1988; Loiselle et al., 1991; Tavares et al., 1999). To our knowledge, however, no applications of CCA have been reported for the estimation of relationships between plant characters and yield components in red peppers. Accordingly, the objectives of the present study were twofold: i) to estimate the interrelationships between plant characters and yield components measured from 56 red pepper [Capsicum annuum L. var. conoides (Mill.) Irish] populations collected from Samsun province, Black Sea Region, Turkey; and ii) to determine, using CCA, which variables can be used as early selection criteria for increasing the yield of peppers.

Data collection
A total of 56 red pepper populations were collected from Samsun province, Black Sea Region, Turkey. The evaluations were carried out at the Black Sea Agricultural Research Institute. The seeds of the red pepper populations were sown into plug trays (5.5 cm width and 5.5 cm depth) on 30 March 2004 and 04 April 2005. Peat and perlite at a 3:1 ratio were used as the growing medium. The seedlings were transplanted to an open field at the 4-5 leaf stage on 10 May 2004 and 18 May 2005. A spacing of 80 cm between rows and 40 cm between plants was used. The fruits were harvested when they reached full maturity. Harvesting started at the end of August and lasted to the end of October each year (the investigated genotypes have different harvest periods). Measurements and observations of the examined characters were made on 12 plants randomly chosen from the middle row of each plot. The following measurements and observations were made: flowering time (50%) (FT); time to maturity (TM); plant height (PH); stem thickness (ST); leaf length (LL); leaf width (LW); fruit length (FL); fruit width (FW); fruit wall thickness (FWT); placenta length (PCL); number of fruits per plant (FN/P); total fruit weight per plant (FW/P); and average fruit weight (AFW). Flowering time was determined as the time from sowing to 50% flowering of plants. Time to maturity was recorded as the time from sowing to fruit maturity. Plant height (cm), stem thickness (mm), leaf length (cm) and leaf width (cm) were measured at first harvest for each genotype. The other characters were measured at or after harvest. The number of fruits per plant and the total fruit weight per plant were determined for each population during the harvest period. Mean fruit weight was found by dividing the total fruit weight by the fruit number. The field experiment was established as a randomized block design with three replications. Because the effects of year on plant characters and yield were not significant according to two-way ANOVA, the two-year data were combined before the analysis used in this study.

Canonical correlation analysis (CCA)
The relationships of several morphological characters with yield and its components across red pepper genotypes were also investigated using canonical correlation analysis. The CCA, developed by Harold Hotelling in 1935, focuses on the correlation between a linear combination of the variables in the plant characters variable set (X set), called canonical variable U, and a linear combination of the variables in the yield components variable set (Y set), called canonical variable V, such that the correlation between the two canonical variables is maximized (Gunderson and Muirhead, 1997). The canonical variables (U and V), which in this study represent the association between the different plant characters of the 56 red pepper genotypes, are formed so that the first pair has the largest correlation of any linear combination of the original variables. Subsequent pairs also have maximized correlations, subject to the constraint that they are uncorrelated with each previous pair (Johnson and Wichern, 2002). Symbolically, given $X_{n \times p}$ and $Y_{n \times q}$, then $U_i = Xa_i$ and $V_i = Yb_i$, with $i = 1, \ldots, \min(p,q)$, where $a_i$ and $b_i$ are standardized canonical coefficients that can be used to determine which variables are redundant in interpreting the canonical variables (Cankaya and Kayaalp, 2007). These coefficients indicate the relative importance of the plant characters in determining the yield components of the red pepper genotypes. However, the coefficients can be unstable because of multicollinearity in the data. For this reason, the canonical loadings are considered to provide a substantive meaning of each variable for the canonical variables (Akbas and Takma, 2005). The result satisfies $\mathrm{Corr}(U_i, V_j) = 0$, $\mathrm{Corr}(U_i, U_j) = 0$ and $\mathrm{Corr}(V_i, V_j) = 0$ for $i \neq j$, and $\mathrm{Corr}(U_i, V_j) = \rho_i$ for $i = j$ (Al-Kandari and Jolliffe, 1997). The canonical correlation coefficient ($\rho_i$) is the measure of the interrelationship between the two variable sets. Let $\rho_1^2, \ldots, \rho_p^2$ ($0 \le \rho_p^2 \le \ldots \le \rho_1^2 \le 1$) be the $\min(p,q)$ ordered eigenvalues ($\lambda_i$) of the matrix $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$, where $\Sigma_{11}$ and $\Sigma_{22}$ are the covariance matrices of the X and Y sets and $\Sigma_{12} = \Sigma_{21}'$ is the covariance matrix between the two sets. Their positive square roots $\rho_1, \ldots, \rho_p$ are the population canonical correlation coefficients between U and V.
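The eigenvalue formulation above can be sketched numerically. The following is a minimal illustration, not the authors' computation: it uses NumPy on synthetic data (the data, the seed and the function name `canonical_correlations` are assumptions made for the example), with only the dimensions n = 56, p = 10 and q = 3 borrowed from the study.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations (descending) between X (n x p) and Y (n x q)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    p, q = X.shape[1], Y.shape[1]
    # Sample covariance matrix of the stacked data, partitioned into blocks.
    S = np.cov(np.hstack([X, Y]), rowvar=False)
    S11, S12 = S[:p, :p], S[:p, p:]
    S21, S22 = S[p:, :p], S[p:, p:]
    # Eigenvalues of S11^{-1} S12 S22^{-1} S21 are the squared canonical correlations.
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S21)
    eigvals = np.linalg.eigvals(M).real
    # Keep the min(p, q) largest eigenvalues; clip tiny numerical overshoots.
    eigvals = np.sort(np.clip(eigvals, 0.0, 1.0))[::-1][: min(p, q)]
    return np.sqrt(eigvals)

# Synthetic data only: the study's dimensions, not its measurements.
rng = np.random.default_rng(0)
n, p, q = 56, 10, 3
X = rng.normal(size=(n, p))
Y = 0.8 * X[:, :q] + rng.normal(size=(n, q))  # induce some X-Y association
rho = canonical_correlations(X, Y)
print(rho)
```

Because the first canonical pair maximizes the correlation over all linear combinations, the first value returned can never be smaller than the largest absolute bivariate correlation between an X variable and a Y variable.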
Interpretations of canonical correlation analysis (CCA)
The statistical significance of the canonical correlation coefficients (CCCs) is assessed by testing a null hypothesis against an alternative hypothesis, using an F test statistic for each $\rho_i^2$. In this statistic, n is the number of cases, p is the number of variables in the X set, q is the number of variables in the Y set, and $r_i^2$ represents the eigenvalues of $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$, that is, the squared canonical correlations. Canonical correlation coefficients do not identify the amount of variance accounted for in one variable set by the other variable set. Therefore, it is important to calculate the redundancy measure for each canonical correlation to determine how much of the variance in one set of variables is accounted for by the other set of variables (Sharma, 1996). The redundancy measure is built from $OV(Y \mid V_i)$, the averaged variance in the Y variables that is accounted for by the canonical variate $V_i$; $LY_{ij}$, the loading of the jth Y variable on the ith canonical variate; and q, the number of traits in the canonical variate concerned.
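For orientation, the redundancy measure is commonly written in the form below in multivariate texts. This is offered as an assumed standard form that matches the symbols defined above, not as a transcription of the authors' display equation; the label $RM_i$ is introduced here only for illustration.

```latex
% Assumed standard form of the redundancy measure (RM_i is a label chosen here).
\begin{align}
  OV(Y \mid V_i) &= \frac{1}{q} \sum_{j=1}^{q} LY_{ij}^{2}, \\
  RM_i &= OV(Y \mid V_i)\,\rho_i^{2}.
\end{align}
```

Under this form, the redundancy is the share of Y-set variance carried by $V_i$, scaled by the squared canonical correlation that links $V_i$ to $U_i$.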

Applications of canonical correlation analysis
While the first ten characters were included in the first variable set ($X_{n \times p}$: plant characters), the latter three characters were included in the second variable set ($Y_{n \times q}$: yield components). All of the computational work to examine the relationships between the two sets of traits was performed with the PROC CANCORR procedure of the SAS 6.0 statistical package (SAS, 1988).

Results and discussion
Descriptive statistics for the examined characters are presented in Table 1. Bivariate correlations displaying the relationships among the traits of red pepper genotypes are given in Table 2. Across all traits, the highest correlation was found between FW/P and FN/P (r = 0.85, P < 0.01), while the lowest correlations (r = -0.01 or 0.01, P > 0.05) were between PCL and FWT, LW and FW, FN/P and FL, FN/P and LL, and FN/P and LW. These findings supported those of Jankulovski et al. (1997). Within groups, the highest correlations were found between FW/P and FN/P (r = 0.85, P < 0.01) for yield components; LW and LL (r = 0.81, P < 0.01) for plant characters; and FN/P and PH (r = 0.46, P < 0.01) for the interrelationships between yield components and plant characters. The lowest correlations were found between FN/P and AFW (r = -0.05, P > 0.05) for yield components; LW and FW (r = 0.01, P > 0.05) and PCL and FWT (r = -0.01, P > 0.05) for plant characters; and FN/P and FL (r = -0.01, P > 0.05) for the interrelationships between yield components and plant characters. In addition, the relationships between yield components and plant characters were similar to the results of previous studies (Rochetta et al., 1976; Cruz et al., 1988; Ghai and Thakur, 1989; Tavares et al., 1999; Krishna et al., 2007). Although plant characters are important indicators of yield components in red pepper populations, it is extremely difficult to explain the relationships among all the traits simultaneously from pairwise correlations. Therefore, instead of interpreting the correlations given in Table 2, three canonical correlation coefficients were estimated to explain the interrelationships between the variable sets, since the number of canonical correlations to be interpreted equals the smaller of the numbers of variables in the X (plant characters) and Y (yield components) sets (Table 3).
Table 3 shows that all canonical correlation coefficients were significant (0.708, 0.635 and 0.413, P < 0.001) with respect to the likelihood ratio test. Based on this result, we interpreted the relationship between the first pair of canonical variables ($U_1$ and $V_1$), which had the largest coefficient.
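A likelihood ratio test of this kind can be sketched with Wilks' lambda. The example below swaps in Bartlett's chi-square approximation rather than the F approximation used in the paper, and the number of cases n = 168 (56 populations × 3 replications) is an assumption, since the number of cases entering the analysis is not stated here. The function name `wilks_bartlett` is hypothetical.

```python
import math

def wilks_bartlett(rhos, n, p, q):
    """Sequential Wilks' lambda tests with Bartlett's chi-square approximation.

    Entry k tests the hypothesis that canonical correlations k+1, ..., m are all zero.
    Returns a list of (wilks_lambda, chi_square, degrees_of_freedom).
    """
    results = []
    for k in range(len(rhos)):
        # Wilks' lambda: product of (1 - rho_i^2) over the correlations still under test.
        lam = math.prod(1.0 - r ** 2 for r in rhos[k:])
        chi2 = -(n - 1 - (p + q + 1) / 2.0) * math.log(lam)
        df = (p - k) * (q - k)
        results.append((lam, chi2, df))
    return results

# Reported canonical correlations; n is an ASSUMPTION (56 populations x 3 replications).
tests = wilks_bartlett([0.708, 0.635, 0.413], n=168, p=10, q=3)
for lam, chi2, df in tests:
    print(f"Wilks lambda = {lam:.3f}, chi-square = {chi2:.1f}, df = {df}")
```

Each chi-square value would be compared with the critical value for its degrees of freedom. The outcome depends strongly on n, so the sketch illustrates the mechanics rather than reproducing Table 3.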
Standardized canonical coefficients (canonical weights) and canonical loadings for the first pair of canonical variables ($U_1$ and $V_1$) are given in Tables 4 and 5, respectively. The magnitudes of the canonical coefficients signify their relative contributions to the corresponding variate. That is, the coefficients indicate the effects of plant characters on the yield components of red pepper genotypes. Therefore, the canonical variates ($U_1$ and $V_1$) representing the optimal linear combinations of dependent and independent variables can be defined by using the standardized canonical coefficients (given in Table 4). Accordingly, if the values of the plant characters (except for PCL, LW and FT) increase, the fruit weight per plant and average fruit weight will decrease, and the fruit number per plant will increase. These results support the idea that the fruit length and stem thickness of plants of the red pepper genotypes are important factors, as they are the primary determinants of fruit number per plant (Depeste, 1987; Silvetti, 1991; Sreelathakumary and Rajamony, 2003).
Variables with larger canonical loadings contributed more to the multivariate relationships between yield components and plant characters (Table 5). The loadings for the yield components suggested that FN/P and AFW were more influential than FW/P in forming $V_1$. The loading for plant height was more influential than those of the other plant characters in forming $U_1$. According to the cross loadings, FN/P and PH contributed the most to canonical variates $V_1$ and $U_1$, respectively. Although there is no relationship between plant height and yield for ideal hybrid cultivars grown in glasshouses because of hybrid vigor (heterosis), the findings indicated that plant height, fruit wall thickness and fruit width should be used with the aim of increasing yield per plant in red pepper populations (Table 6).
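The distinction between canonical loadings (a variable's correlation with its own canonical variate) and cross loadings (its correlation with the opposite set's variate) can be illustrated as follows. This is a sketch on synthetic data using a QR and SVD route, which is a numerically convenient alternative and not necessarily what SAS computes internally. The function name `cca_loadings` and all data are assumptions.

```python
import numpy as np

def cca_loadings(X, Y):
    """Canonical correlations, standardized coefficients, loadings and cross loadings."""
    n = X.shape[0]
    # Standardize so the coefficients are standardized canonical coefficients.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    # Singular values of Qx'Qy are the canonical correlations.
    Qx, Rx = np.linalg.qr(Xs)
    Qy, Ry = np.linalg.qr(Ys)
    W, rho, Zt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    A = np.linalg.solve(Rx, W) * np.sqrt(n - 1)     # X-set coefficients
    B = np.linalg.solve(Ry, Zt.T) * np.sqrt(n - 1)  # Y-set coefficients
    U, V = Xs @ A, Ys @ B                           # canonical variates, unit variance
    load_x = Xs.T @ U / (n - 1)    # Corr(X_j, U_i): loadings
    load_y = Ys.T @ V / (n - 1)    # Corr(Y_j, V_i): loadings
    cross_x = Xs.T @ V / (n - 1)   # Corr(X_j, V_i): cross loadings
    cross_y = Ys.T @ U / (n - 1)   # Corr(Y_j, U_i): cross loadings
    return rho, A, B, load_x, load_y, cross_x, cross_y

# Synthetic data with the study's dimensions (10 plant characters, 3 yield components).
rng = np.random.default_rng(1)
X = rng.normal(size=(56, 10))
Y = 0.7 * X[:, :3] + rng.normal(size=(56, 3))
rho, A, B, load_x, load_y, cross_x, cross_y = cca_loadings(X, Y)
print(np.round(load_y[:, 0], 3), np.round(cross_y[:, 0], 3))
```

In this construction each cross loading equals the corresponding loading multiplied by the canonical correlation of that pair. Cross loadings are therefore always smaller in magnitude than loadings, but they rank the variables in the same order within a variate.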
In the present study, 90.7, 1.4 and 7.8% of the total variation in the yield components set was explained by the canonical variables $V_1$, $V_2$ and $V_3$, respectively, while the redundancy measure of 0.455 for the first canonical variable suggests that about 45.5% of the variation in the yield components set was explained by canonical variable $U_1$. Also, 19.3% of the total variation in the plant characters set was explained by the first canonical variable $U_1$, while the redundancy measure of 0.097 for the first canonical variable suggests that about 9.7% of the variation in the plant characters set was explained by canonical variable $V_1$ (Table 7).
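The reported redundancy values can be checked by hand, assuming the usual definition in which redundancy is the share of a set's variance explained by its own first canonical variate multiplied by the squared first canonical correlation. The variable names below are chosen for illustration.

```python
# Figures reported in the text; the product formula is an assumed standard definition.
rho1 = 0.708           # first canonical correlation
share_yield = 0.907    # variation in the yield set explained by V1
share_plant = 0.193    # variation in the plant set explained by U1

redundancy_yield = share_yield * rho1 ** 2
redundancy_plant = share_plant * rho1 ** 2
print(round(redundancy_yield, 3), round(redundancy_plant, 3))
```

Both products round to the redundancy values quoted above (0.455 and 0.097), which suggests the reported figures follow this definition.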
Determining the relationships between characters affecting optimum output is very important for increasing yield components in red pepper genotypes. Larger fruit dimensions are desirable for both farmers and consumers. To this end, this study has revealed the relationships between the yield components and plant characters of the pepper. Fruit width, fruit wall thickness, plant height and number of fruits per plant were the most influential factors in this relationship. Solanki et al. (1986) and Basavaraj (1997) reported that fruit length, fruit width, number of fruits per plant and total fruit weight have strong positive correlations with yield. Therefore, the results obtained from this work will advance plant breeding practices and research on yield components by guiding Capsicum breeders in selecting the best plant characters in red peppers. In conclusion, this will lead to an increase in desirable yield values by decreasing the number of studied characters, which will in turn increase selection efficiency in red pepper production.

Table 2. The correlation matrix between traits. *, **: correlation is significant at the 0.05 and 0.01 level (2-tailed), respectively. -: correlation is not statistically significant at the 0.05 level (2-tailed). The superscript indicates that no correlation was found between the traits (for example between FL and FN/P). Bold figures present the highest and lowest correlation between the traits.

Table 3. Summary results for the canonical correlation analysis

Table 4. Standardized canonical coefficients for canonical variables

Table 5. Canonical loadings of the original variables with their canonical variables

Table 6. Cross loading of the original variables with opposite canonical variables

Table 7. The explained total variation ratio by canonical variables for the variable sets