Machine learning applied to the prediction of citrus production

Irene Díaz, Silvia M. Mazza, Elías F. Combarro, Laura I. Giménez, José E. Gaiad


In-depth knowledge of the variables affecting production is required to predict global production and make decisions in agriculture. Machine learning is a technique used in agricultural planning and precision agriculture. This work (i) studies the effectiveness of machine learning techniques for predicting orchard production and (ii) identifies the variables affecting this production. Data from 964 lemon, mandarin, and orange orchards in Corrientes, Argentina, were analysed. Graphical and analytical descriptive statistics, correlation coefficients, principal component analysis, and biplot analysis were performed. Production was predicted with M5-Prime, a model regression tree constructor that produces a classification based on piecewise linear functions. For all the species studied, the most informative variable was tree age; in mandarin and orange orchards, age was followed by between-row and within-row distances; irrigation also affected mandarin production. The performance of M5-Prime in predicting production was adequate, as measured by correlation coefficients (~0.8) and relative mean absolute error (~0.1). These results show that M5-Prime is an appropriate method for classifying citrus orchards according to production and that it also allows the most informative variables affecting production per tree to be identified.
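To make the idea of a model tree built from piecewise linear functions more concrete, the following Python sketch may help. It is not the authors' implementation and not full M5-Prime: it makes a single split on one variable using standard deviation reduction, fits a least-squares line in each leaf, and omits pruning and smoothing. The data are synthetic and invented for illustration only, all function names are hypothetical, and the relative mean absolute error is assumed here to be the mean absolute error divided by the mean absolute observed value, which may differ from the definition used in the study.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # constant predictor when x does not vary
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_one_split_model_tree(xs, ys, min_leaf=2):
    """Return (threshold, left_line, right_line) for a depth-1 model tree.

    The split maximises standard deviation reduction (SDR):
    sd(all) - sum over children of (share of cases * sd(child)).
    """
    pairs = sorted(zip(xs, ys))
    all_y = [y for _, y in pairs]
    n = len(pairs)
    total_sd = pstdev(all_y)
    best_gain, best_i = None, None
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left_y, right_y = all_y[:i], all_y[i:]
        gain = (total_sd
                - len(left_y) / n * pstdev(left_y)
                - len(right_y) / n * pstdev(right_y))
        if best_gain is None or gain > best_gain:
            best_gain, best_i = gain, i
    if best_i is None:
        raise ValueError("not enough distinct values to split")
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    left_x = [x for x, _ in pairs[:best_i]]
    right_x = [x for x, _ in pairs[best_i:]]
    return (threshold,
            fit_line(left_x, all_y[:best_i]),
            fit_line(right_x, all_y[best_i:]))


def predict(tree, x):
    """Evaluate the linear model of the leaf that x falls into."""
    threshold, left_line, right_line = tree
    a, b = left_line if x <= threshold else right_line
    return a + b * x


def pearson(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / (var_o * var_p) ** 0.5


def relative_mae(obs, pred):
    """Assumed definition: mean absolute error / mean absolute observed value."""
    mae = mean(abs(o - p) for o, p in zip(obs, pred))
    return mae / mean(abs(o) for o in obs)


# Synthetic illustration only (NOT data from the study):
# tree age in years versus production per tree, with a jump after year 5.
ages = list(range(1, 11))
yields = [1, 2, 3, 4, 5, 106, 107, 108, 109, 110]

tree = fit_one_split_model_tree(ages, yields)
preds = [predict(tree, age) for age in ages]
print("split threshold on age:", tree[0])
print("correlation:", pearson(yields, preds))
print("relative MAE:", relative_mae(yields, preds))
```

In a tree of this kind, the variable chosen for the first split is the one that most reduces the spread of the target, which is one way a model tree can indicate which variables are most informative.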


lemon; mandarin; orange; M5-Prime; age; framework; irrigation


DOI: 10.5424/sjar/2017152-9090