Identification of bean varieties according to color features using artificial neural network

A machine vision system and a multilayer perceptron artificial neural network (MLP-ANN) were applied to identify bean varieties based on color features. Ten varieties of beans grown in Iran (Khomein1, KS21108, Khomein2, Sarab1, Khomein3, KS21409, Akhtar2, Sarab2, KS21205, and G11870) were collected. Six color features of the bean and six color features of the spots were extracted and used as inputs for the MLP-ANN classifier. In this study, 1000 data sets were used: 70% for training, 15% for validation and 15% for testing. The results showed that the machine vision system and neural network classified the bean varieties with 100% sensitivity and specificity, except for Sarab1, with sensitivities of 100%, 73.3% and 60% for the training, validation and testing processes, respectively, and KS21108, with specificities of 100%, 79% and 71% for the same processes. Considering the overall sensitivities of 100%, 97.33% and 96% and specificities of 100%, 97.9% and 97.1% for training, validation and testing, respectively, the ANN could be used as an effective tool for the classification of bean varieties.


Introduction
Beans (Phaseolus vulgaris L.) are legumes that are produced and consumed worldwide, mostly as dry and green seeds. Legumes are major sources of proteins and nutrients such as oil, fiber, starch, vitamins and mineral elements (Laurent et al., 2010). The analysis and classification of seeds are essential processes for the final step of crop production (Granitto et al., 2002). In today's competitive market, since consumers tend to use healthy and homogeneous products, producers have to present products that are sorted according to physical characteristics such as appearance, size, color, internal health and variety. Furthermore, identification of grain varieties helps farmers to use suitable grains for planting and marketing, because the grains they sell will then meet the essential standards for marketers (Chen et al., 2010). Sorting and separating products manually is very difficult. Manual inspection is labor intensive, and the resulting decisions can be very subjective, depending on the mood and condition of the person involved. Furthermore, the manual procedure can be very time consuming and inefficient, especially when dealing with high production volumes. For these reasons, new methods such as machine vision systems for grading and sorting have been introduced.
Color is a property of products that can be measured with machine vision and has helped in identifying objects for many years. The process of color classification involves extracting useful information concerning the spectral properties of object surfaces and discovering the best match from a set of known descriptions or class models to implement the recognition task (Sahin, 1997). Color properties have also been widely used for apple (Malus × domestica) quality evaluation, mostly for defect detection (Leemans et al., 1998; Leemans & Destain, 2004), and for tomato (Solanum lycopersicum) quality (Sarkar & Wolfe, 1985). Pydipati et al. (2006) used machine vision technology for evaluating the quality of grains and fruits. They distinguished citrus diseases using the structural characteristics of leaf color. Thorp & Dierig (2011) developed an image processing algorithm to extract information on flower features using the hue, saturation and intensity (HSI) color space. They stated that their methods are simple, practical and quite generic, and could be easily used for many digital image processing applications based on color features. Their results showed that this system has the advantage of high accuracy and quick response in comparison with other measurement methods.
Classification of grain varieties by image analysis has been investigated by several researchers. Shahin & Symons (2001) and Venora et al. (2007) used seed color and size with a flatbed imaging system to determine color grading in Sicilian and Canadian lentil landraces and cultivars, finding that color features were good predictors. Venora et al. (2009) also used the color, shape and size of Italian bean landraces for identification of varieties. Similar experiments were performed by Chen et al. (2010), who identified five corn varieties. Color features have been extensively applied for grain quality evaluation. In a similar vein, Kılıc et al. (2007) developed a system based on size and color for classification of beans. Their results showed that this method was able to classify beans correctly. Laurent et al. (2010) developed a computer vision system to study the hard-to-cook characteristic of beans using color and histogram features. The results showed that the color changes of beans are related to the hard-to-cook aspect. Guevara-Hernández & Gómez-Gil (2011) developed a machine vision system to classify wheat and barley grain kernels. Their results indicated that accuracies higher than 99% can be achieved when morphological, color and texture features are extracted from the grain kernels.
Color features and artificial neural network (ANN) classifiers have been used by many researchers. Arribas et al. (2011) and Al Ohali (2011) used the RGB color space and an ANN classifier for sunflower (Helianthus annuus) leaf classification and for grading of date fruit into three quality categories, respectively. Anami et al. (2011) classified different agricultural and horticultural products with a neural network, finding that color and texture features were able to distinguish normal from affected products. Abdullah et al. (2001) investigated the quality of oil palm (Elaeis guineensis) with color features (HSI color space), finding that the vision system was able to classify samples correctly with a success rate greater than 90%. Huang (2007) investigated the detection of Phalaenopsis seedling diseases based on texture and color features with a neural network; the results showed that diseases can be segmented and classified using this method.
Considering the importance of identifying bean varieties before they are presented to the market, this study aimed to classify ten bean cultivars using a machine vision method that extracts color features, followed by an ANN.

Varieties used and image acquisition
In this study, ten major commercial Iranian bean varieties were considered (Fig. 1). Bean samples were acquired from the National Station of Bean Research in Khomein (Markazi province, Iran). The machine vision system was composed of three parts: an imaging unit, a processing unit and an information display unit. The imaging unit was responsible for recording the images and included a camera and a lighting system. To take pictures, a Samsung camera (SCC-101 PA) with a resolution of 576 × 720 pixels was used. Low-level vision processing tasks need good lighting in the work environment. Hence, good and uniform illumination from an external light source is essential for machine vision applications. In this research, a fluorescent lamp (36 W) was used for lighting, and a fiberglass sheet placed in front of it produced homogeneous lighting. The processing unit performed three actions: transferring the camera signals to the computer, enhancing the images, and extracting the image characteristics. The information was transferred from the camera to the computer by a capture card (ACEDVio, Canopus). Image enhancement was carried out to remove noise and the effects of non-homogeneous lighting from the images. The main purpose of the image processing was to extract features, which was done using standard image processing techniques (González & Woods, 2007).

Image processing algorithm
Image analysis consisted of smoothing, image segmentation, feature extraction and data analysis. All pictures were captured in the RGB color space (Fig. 2a). The image processing algorithm used a Gaussian filter to remove noise from the picture, and image segmentation was designed to separate the region of interest from the background. Blue was selected for the background because of its clear color contrast with the beans. For this reason, the B channel image was subtracted from the R channel image. In the R-B image, the object pixels (pixels of beans) are higher than zero because the red values of object pixels are higher than their blue values (Fig. 2b). The R-B gray image was then converted into a binary [0, 1] image by thresholding, with 1 assigned to the object and 0 assigned to the background (Fig. 2c). Erosion and dilation operations with a disk structuring element were then used to smooth the edges (González & Woods, 2007). To extract the color features, the binary image was used as a mask on the RGB image (Fig. 2d). The masked image was used for two purposes: first, for extracting R_mean, G_mean and B_mean; second, for converting to HSI (Ruiz-Ruiza et al., 2009) and extracting H_mean, S_mean and I_mean. Spot color features can increase the classification accuracy for beans that have high color similarity. The S channel in HSI was separated, and the gray image of S was converted to binary by thresholding, with 1 assigned to the spots and 0 assigned to the plain surface of the beans. As for the beans, a similar masking method was applied to the RGB and HSI images to extract the color features of the spots (Fig. 3).
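The segmentation and feature extraction steps described above can be illustrated with a minimal sketch, written in plain Python for readability. It is not the authors' program: the function names, the zero threshold and the tiny test image are assumptions made for illustration, Gaussian smoothing and the erosion and dilation steps are omitted, and the HSI conversion follows the common textbook formulation, which may differ in detail from the one the authors used. Hue is averaged arithmetically here, which is also an assumption.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H in degrees, S, I)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # black pixel: hue and saturation set to 0 by convention
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        return 0.0, saturation, intensity  # gray pixel: hue undefined, use 0
    theta = math.degrees(math.acos(max(-1.0, min(1.0, numerator / denominator))))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity

def segment_beans(image, threshold=0):
    """Binary mask: 1 where R - B exceeds the threshold (bean), else 0 (background)."""
    return [[1 if (r - b) > threshold else 0 for (r, g, b) in row] for row in image]

def mean_color_features(image, mask):
    """Mean R, G, B (0-255 scale) and mean H, S, I over the pixels where mask == 1."""
    pixels = [px for row, mask_row in zip(image, mask)
              for px, keep in zip(row, mask_row) if keep]
    if not pixels:
        raise ValueError("mask selects no pixels")
    n = len(pixels)
    hsi = [rgb_to_hsi(r / 255.0, g / 255.0, b / 255.0) for (r, g, b) in pixels]
    return {
        "R_mean": sum(p[0] for p in pixels) / n,
        "G_mean": sum(p[1] for p in pixels) / n,
        "B_mean": sum(p[2] for p in pixels) / n,
        "H_mean": sum(v[0] for v in hsi) / n,
        "S_mean": sum(v[1] for v in hsi) / n,
        "I_mean": sum(v[2] for v in hsi) / n,
    }

# Hypothetical 2 x 2 image: two bluish background pixels and two reddish bean pixels.
image = [[(30, 60, 200), (200, 100, 50)],
         [(100, 50, 30), (20, 40, 180)]]
mask = segment_beans(image)
features = mean_color_features(image, mask)
print(mask, features["R_mean"], features["G_mean"], features["B_mean"])
```

The six spot features would be obtained in the same way, using a second mask produced by thresholding the S channel instead of the R-B image.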

Artificial neural network
An ANN is a non-linear modeling technique that provides classification abilities and allows the extension of computer vision technology into the area of color at human-brain levels of performance (Du & Sun, 2006). The learning procedure for developing a neural network can be either supervised or unsupervised. To obtain a unique solution for the network system, the sample size (number of input patterns) should be larger than or equal to the number of independent variables (the sum of the number of weights and other possible variables) (Peres et al., 2011). In this study, a multilayer perceptron (MLP) was employed as the modeling network for classification, and the Levenberg-Marquardt (LM) supervised learning algorithm was used for training the ANN. An MLP contains an input layer, an output layer and one or more hidden layers. The MLP network applied here had four layers: an input layer, two hidden layers and an output layer. The input layer had a maximum of 12 neurons, one for each mean color parameter in the RGB and HSI color spaces for bean and spot. The two hidden layers had a variable number of hidden units, the optimal number of which was determined by trial and error. The number of neurons in the output layer equaled the number of classes, in this case 10 neurons corresponding to the ten varieties. The MLP-ANN used the hyperbolic tangent sigmoid (TANSIG) transfer function in the hidden layers and a linear function in the output layer. In this study, 1000 data sets were used: 700 (70%) for training, 150 (15%) for validation and 150 (15%) for testing. Because the inputs had different ranges, they were normalized to [-1, 1]. The ANNs were trained on the first subset (training set), and their performance was monitored using the second subset (validation set). Finally, the third subset (test set) was used to check the predictive performance of the network, since the data in this subset were not used in network development. In the ANN, an output of [1 0 0 0 0 0 0 0 0 0] indicates the Khomein1 variety, an output of [0 1 0 0 0 0 0 0 0 0] indicates the KS21108 variety, and so on for the other classes. Each bean is assigned to the class whose output neuron has the highest value. The general sensitivity and specificity of the MLP-ANN classifier were calculated based on Peres et al. (2011):

$$\text{Sensitivity}\ (\%) = \frac{N_c}{N_{tc}} \times 100$$

$$\text{Specificity}\ (\%) = \frac{N_b}{N_{tb}} \times 100$$

where $N_c$ is the number of samples of a specific group correctly classified, $N_{tc}$ is the total number of samples belonging to that specific group, $N_b$ is the number of samples of a specific group classified as belonging to that group and $N_{tb}$ is the total number of samples of any group classified as belonging to that specific group.
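As an illustration of these two measures, the following sketch computes the sensibility (sensitivity) and specificity of each class from a confusion matrix, using the definitions of N c, N tc, N b and N tb given above. Under these definitions the specificity is what is often called precision elsewhere. The function name and the confusion matrix are hypothetical: the matrix assumes 15 test samples per variety with 6 Sarab1 samples assigned to KS21108, a split that is not stated in the text, and averaging over classes to obtain the overall values is likewise an assumption.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j]: number of samples of true class i assigned to class j."""
    metrics = {}
    for k, name in enumerate(labels):
        n_correct = confusion[k][k]                    # N_c (equal to N_b)
        n_true = sum(confusion[k])                     # N_tc: samples belonging to class k
        n_assigned = sum(row[k] for row in confusion)  # N_tb: samples assigned to class k
        sensibility = 100.0 * n_correct / n_true if n_true else float("nan")
        specificity = 100.0 * n_correct / n_assigned if n_assigned else float("nan")
        metrics[name] = (sensibility, specificity)
    return metrics

labels = ["Khomein1", "KS21108", "Khomein2", "Sarab1", "Khomein3",
          "KS21409", "Akhtar2", "Sarab2", "KS21205", "G11870"]

# Hypothetical test-set confusion matrix (assumption): 15 samples per variety,
# all classified correctly except 6 Sarab1 samples assigned to KS21108.
confusion = [[15 if i == j else 0 for j in range(10)] for i in range(10)]
sarab1, ks21108 = labels.index("Sarab1"), labels.index("KS21108")
confusion[sarab1][sarab1] = 9
confusion[sarab1][ks21108] = 6

metrics = per_class_metrics(confusion, labels)
# Overall values taken as the mean over the ten classes (assumption).
overall_sensibility = sum(m[0] for m in metrics.values()) / len(labels)
overall_specificity = sum(m[1] for m in metrics.values()) / len(labels)
print(metrics["Sarab1"], metrics["KS21108"], overall_sensibility, overall_specificity)
```

With this hypothetical matrix, Sarab1 has a sensibility of 60% and KS21108 a specificity of about 71.4%, and the class averages are 96% and about 97.1%. These values are in line with the testing figures reported in this paper, although the agreement does not confirm the actual per-variety counts.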

Results and discussion
Twelve color features were extracted from the bean images. Table 1 lists descriptive statistics of the bean color features computed from the images. They are seed color components (whole seed and seed spots) in the RGB model (Red, Green and Blue channels) and the HSI model (Hue, Saturation and Intensity channels), measured as mean grey levels. The twelve color feature values were plotted against bean varieties (Fig. 4). As this figure shows, features with less overlap between varieties are more effective in separating them. There were complex relationships between bean varieties and color features, and it was difficult to develop a simple model, such as a linear model, to predict bean varieties from these color features. Therefore all 12 variables of seed color and spot information were used as inputs of the MLP-ANN model to identify the 10 varieties. The neural networks, with four layers, were trained and selected using two different data subsets (training and validation sets, respectively) and were then tested with the experimental data not used in the two previous steps. Several neural networks were trained, and the network that gave the minimum mean square error (MSE) on the validation subset was chosen (Table 2). The two hidden layers of the selected network had 20 and 10 neurons, respectively.
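The structure of the selected network (12 inputs, hidden layers of 20 and 10 neurons, 10 outputs) can be sketched as follows. This is only a structural illustration with random, untrained weights: the Levenberg-Marquardt training is not shown, and the function names, the weight initialization, the min-max form of the [-1, 1] normalization and the presence of one bias per neuron are assumptions. TANSIG is mathematically equivalent to the hyperbolic tangent, so `math.tanh` is used for the hidden layers.

```python
import math
import random

LAYER_SIZES = [12, 20, 10, 10]  # inputs, hidden layer 1, hidden layer 2, outputs

def normalize(value, lo, hi):
    """Min-max scaling of one feature to [-1, 1] (assumed form of the normalization)."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def init_network(sizes, seed=0):
    """Random, untrained weights and zero biases for each layer (illustrative only)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, inputs):
    """Forward pass: tanh (TANSIG) in the hidden layers, linear output layer."""
    activations = list(inputs)
    last = len(layers) - 1
    for idx, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        activations = sums if idx == last else [math.tanh(s) for s in sums]
    return activations

def predict_class(outputs):
    """Index of the output neuron with the highest value."""
    return max(range(len(outputs)), key=outputs.__getitem__)

def count_parameters(sizes):
    """Number of weights plus biases, assuming one bias per neuron."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

network = init_network(LAYER_SIZES)
# Hypothetical feature vector: 12 placeholder values scaled to [-1, 1].
sample = [normalize(float(x), 0.0, 11.0) for x in range(12)]
outputs = forward(network, sample)
predicted = predict_class(outputs)
n_parameters = count_parameters(LAYER_SIZES)
print(predicted, n_parameters)
```

Under the one-bias-per-neuron assumption this architecture has 580 adjustable parameters, which is fewer than the 700 training samples and therefore consistent with the sample-size condition cited from Peres et al. (2011).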
A. Narisahmadi and N. Behroozi-Khazaei / Span J Agric Res (2013) 11(3): 670-677
Finally, the selected MLP-ANN (12:20:10:10) was used to evaluate the ability of this multivariate technique for bean classification. The results obtained for this network are presented in Table 3. They showed that the selected neural network correctly classified the beans according to variety, with a sensitivity and specificity of 100% for the data in the training, validation and testing sets. The exceptions were the Sarab1 variety, with sensitivities of 100%, 73.3% and 60%, and the KS21108 variety, with specificities of 100%, 79% and 71%, for the training, validation and testing processes, respectively. The MLP-ANN model proposed in the present study, together with the 12 color features used, is therefore a reliable practical tool for classifying bean varieties.
Moreover, the analysis of the results presented in Table 3 shows that Sarab1 was misclassified as KS21108, probably due to the high overlap in the color features of these two varieties (Fig. 4). In addition, ANOVA results showed that there were no significant differences (p ≤ 0.05) between six color features of the Sarab1 variety and those of KS21108. Similar work on seeds based on image analysis features was done by Shahin & Symons (2001), who observed that neural classifiers achieved an overall accuracy of more than 90% for grading of lentil. Kılıc et al. (2007) investigated the quality of beans based on length, width, average, variance, skewness and kurtosis values with an ANN classifier. Overall, 90.6% of the beans were correctly classified, including 99.3% of white beans, 93.3% of yellow-green damaged beans, 69.1% of black damaged beans, 74.5% of low damaged beans and 93.8% of highly damaged beans. In this research we used only 12 color features to identify 10 bean varieties with an ANN classifier, obtaining a general sensitivity of 100%, 97.33% and 96% and a specificity of 100%, 97.9% and 97.1% for training, validation and testing, respectively. The ANN allowed us to use fewer variables for classification than the linear discriminant analysis (LDA) method of Venora et al. (2009). The image processing algorithm and ANN classification process were therefore quite convenient and rapid.
In conclusion, the MLP-ANN method proved to be a practical and effective model for the classification of bean varieties. The MLP-ANN model contained an input layer, an output layer and two hidden layers. The model showed high overall sensitivity and specificity (higher than 96% and 97%, respectively). To obtain these results, 12 color features (R_mean, G_mean, B_mean, H_mean, S_mean and I_mean of the beans and of their spots) were used. Compared with previous methods, the recommended method is a good way to classify bean varieties rapidly and cheaply.

Figure 3. Algorithm of the coded program.

Table 1. Descriptive statistics (mean ± standard deviation) of color features measured in this research

Table 2. Artificial neural network (ANN) results. Bold values belong to the selected network with minimum MSE of the validation subset.

Table 3. The artificial neural network (ANN) analysis: results for training, validation and test groups