Future scenarios and conservation strategies for a rear-edge marginal population of Pinus nigra Arnold in Italian central Apennines

Aim of the study: To forecast the effects of climate change on the spatial distribution of Black pine of Villetta Barrea in its natural range and to define a possible conservation strategy for the species Area of study: A rear-edge marginal population of Pinus nigra spp. nigra in Abruzzo region, central Italian Apennines Matherials and Methods: For its adaptive and genetic traits this population is considered endemic of the Italian peninsula and represents a rear-edge marginal population of nigra subspecies. The spatial distribution of the tree in the administrative Region (Abruzzo) was used to define the ecological traits while three modelling techniques (GLM, GAM, Random Forest) were used to build a Species distribution model according to two climatic scenarios. Main results: The marginal population’s range was predicted to shift at higher elevations as consequence of climatic adaptation. Many zones, represented by the higher part of the mountains surrounding the study area (currently bare and inhospitable for trees), were identified as suitable in future for the species. However, in the case of a rapid climate change, this marginal population may not be able to move as fast as necessary. An in-situ adaptive management integrated with an assisted migration protocol might be considered to favour natural regeneration and improve the richness and variability of the genetic pool. Research highlights: Most of the genetic richness is held in small populations at the borders of natural distribution of forest species. Monitoring this MAP could be useful to understand the adaptive processes of the species and could support the future management of many other within-core populations.


Introduction
A valuable part of the genetic richness and diversity of forests is held by small populations living at the edges of the natural species range.Those populations, defined as marginal and peripheral populations (MaP populations) will be the first facing the climate change effects and may contain original adaptive traits and genetic variability (Hampe & Petit, 2005).Those will play a key role to study and understand the effects of Global Change on forest ecosystems.Climate change may have different impacts on different populations 2 In the Mediterranean region, the Italian peninsula is a well-known hotspot of genetic diversity, consequence of the presence of many ancient glacial refugia (Petit et al., 2003).In this paper, through a SDM approach, the future impacts of climate change on a MAP of Pinus nigra in central Italy (Abruzzo region) were investigated.The aim was to guide the future management processes and to consider and possibly to implement assisted migration actions of this marginal popolation as well as to add knowledge about the more likely events of forest species at the borders of the natural distributions.The current distribution of the species across the region was considered to cover the ecological niche (ecological niche) while just the spatial distribution and shape of the MaP population was investigated.Three modelling techniques were compared and used to create an ensemble model with two future climatic scenarios for central Italy according to two future trajectories from the IPCC AR5, the rcp4.5 and rpc8.5 (IPCC, 2014).

Target species and study area
The target species of this study is the Black pine of Villetta Barrea (Pinus nigra Arnold ssp.nigra var.italica Hochst) in its native area, the Abruzzo region in central Italy (Fig. 1).This tree belongs to the nigra subspecies (Quézel & Médail, 2003) and is naturally distributed only in this region on approximately 400 hectares around the small town of Villetta Barrea (Lat. 41.7768 N,Long. 13.9374 E).Geneticists had classified the Black pine of Villetta Barrea as intermediate between the two Italian subspecies (nigra and laricio).It is smaller for size and growth rate than Austrian and Calabrian pine (Gellini & Grossoni, 2003) but highly drought-tolerant.The phenotype of trees and the needle's anathomy are diagnostic to asses differences among subspecies and varieties (Bruschi et al., 2005).
A small part of this population (about 100 ha) is registered as a seed stand (Regional Code ABR04) while the other stands are included into the in the National Park of Abruzzo Lazio and Molise.In the last decades, many seeds were harvested from this area for nursery activities and reforestation programmes on calcareous soils, similar to Austrian pine (Pinus nigra Arnold ssp.Nigra).As a consequence, the gene pool of these trees has been spread across the whole Italian country (Gellini & Grossoni, 2003).
The average annual temperature of the zone where the MAP is located is 15.1 °C according to the 1971-Manjarrés, 2015), carbon storage and water balance (Vitale et al., 2012) and could force species to migrate from their present geographical range (Parmesan, 2006).Under climate change effects, forest species may have to adapt to new environmental conditions to avoid local extinction or severe genetic erosion (Provan & Maggs, 2012;Hamann & Aitken, 2013).A good prediction of the most likely effects of movement of climatic belts is fundamental to balance future forest management and seed transfers among different ecological regions (Benito-Garzón & Fernández-Manjarrés 2015).Small and isolated populations with low gene flow and low genetic variability could disappear, with a loss of adaptive richness (Schueler et al., 2014) threatened by the speed in the changing environment (Mátýas et al., 2009).
To apply climate change prediction on ecological systems different disciplines are involved.On one side, climatology is fundamental to outline future scenarios.On the other side, ecology and biology are basic to weight and balance species' response, taking also into account the interaction of biotic versus abiotic factors, especially at the margins of the natural range (Guisan & Zimmermann, 2000) or for planted/introduced species (Isaac-Renton et al., 2014).In such a background, prediction of future impacts on forest ecosystems is an ensemble of climatic scenarios and adaptability of the species that must be considered in a holistic view and tackled under many different aspects (Trivedi et al., 2008) and mainly driven by uncertainties (Araújo et al., 2005;Wang et al., 2012).
To predict such events on forest species many models have been proposed and named as Ecological Niche Models (ENM) and/or Species Distribution Models (SDM) (Guisan & Zimmermann, 2000;Elith & Leathwick, 2009;Warren, 2012;Flower et al., 2013;Mcinerny & Etienne, 2013;Warren, 2013).In those modelling procedures the spatial distribution of forest species is connected to a set of environmental variables that can be used as predictors modelling procedure (Guisan & Thuiller, 2005;Elith & Leathwick, 2009).A vast number of different SDMs (and ENMs) algorithms have been proposed in the literature, often compared with each other in order to assess their power and suitability according to the different nature of data or species distributions (Zaniewski et al., 2002;Liu et al., 2011;Merow et al., 2014).When a SDM (or an ENM) incorporates future climate predictions, future distribution of species can be forecast becoming a very powerful way to study climate change effects on populations (Forester et al., 2013;Brunetti et al., 2014;Isaac-Renton et al., 2014) small parts of species range (Hamann & Aitken, 2013) or provenances (Isaac-Renton et al., 2014).

3
Conservation strategies for a rear-edge marginal population Following the previous knowledge, the modelling of the MAP of Villetta Barrea was performed considering more than the current spatial distribution of the MAP which represents just a small part of the potential ecological niche of the species (Fig. 2).The presence and absence data were extracted from the Regional forest categories map of Abruzzo (Marchetti et al., 2006) with a spatial resolution of 100 metres.All polygons of the "Natural stand of Villetta Barrea pine" and "Afforestation in mountainous areas" category were used to calculate the presence dataset while all the others polygons were considered as (pseudo) absence.More than 10,000 presence points (10,047) in WGS84 UTM 33N reference system were obtained while 444,143 locations were detected as absences.Aware that when the proportions of presences and absences in a model are not equal (or not equally weighted) the prediction can be asymmetric (Barbet-Massin et al., 2012)

Definition of the ecological niche of the Villetta Barrea Black pine (Presence/absence dataset)
Forest management, reforestation activities on poor and vulnerable soils or on abandoned lands, natural Parks regulations and many other events have strongly modified the distribution (and the gene pool) of forest species, especially in mountainous areas.Consequently, many aspects must be carefully considered before modelling species distributions and especially in the case of conifers used for reforestation programmes (Cantiani & Chiavetta, 2015).Indeed, due to human influence on forest ecosystems, a quantity of the current geographical distribution of Black pine has been modified (Bernetti, 1995;Gellini & Grossoni, 2003;Bruschi et al., 2005).As a result, in some cases such as Pinus nigra spp, the present distribution of the species had to be carefully checked to avoid artificial reduction or expansion of the ecological niche.).The interpolated dataset for the current period (1980-2010) was modified adding the predicted variation as anomalies to the raster maps using the "delta method" (Ramirez-Villegas & Jarvis, 2010).To consider soil variability, the soil map of Italy (European Soil Bureau, 1999) was converted in raster map with the same spatial resolution of the climatic predictors and included as predictor in the model.

Tested models
To model the spatial distribution of the target species, three algorithms were selected and compared.Those algorithms were: i) Generalized Linear Model (GLM); ii) Multivariate Adaptive Regression Splines (MARS) and iii) Random Forest (RF).All methods were implemented in biomod2 package (Thuiller et al., 2014) for R (R CoreTeam, 2015) which was adopted to perform the spatial analysis.
GLM is generally known as "Logit Model", it is used for binomial regression (1, 0) and is widely available

Climate data and future scenarios
Many climatic data are freely available in web repositories.WorldClim database (Hijmans et al., 2005) is one of the most famous and used for ecological modelling.It is free of charge and provides raster maps with a maximum spatial resolution of 30 arc-second including temperatures, precipitations and 19 bioclimatic variables (www.worldclim.org).Anyway, is some cases as MaP populations' analysis, WorldClim's maps can be not adequate to consider all climatic variability due to the spatial resolution (approximately 1 km at the equator) and compared to the physiographic characteristics of the study environment (Hijmans et al., 2005;Bedia et al., 2013).For these reasons and due to the topographic layout of our study region (Abruzzo) the 19 bioclimatic maps of worldclim portal were re-calcultaed.Climatic data from the regional meteorological network were interpolated at 100 metres of spatial resolution using a comparative approach between geostatistical methods as suggested by Attorre et al. (2007).
According to IPCC predictions for the Mediterranean area (IPCC, 2014) two future scenarios were Conservation strategies for a rear-edge marginal population to avoid lack of information and biases during the random extraction, predictions of models with TSS higher than 0.7 (Araújo et al., 2005) were used to calculate an ensemble model.An averaged mean of the four algorithms was calculated (Marmion et al., 2009;Zhang et al., 2015).Weights were given according to the accuracy of each model, assessed using a bootstrapping procedure with 30 runs for each dataset (Guisan & Zimmermann, 2000).Three different suitability maps were obtained, one for each scenario (ABR0, ABR4.5, ABR8.5) and the whole procedure is graphically-reported in Fig. 3.
Finally, to calculate the potential suitable area for Black pine in Abruzzo, a binary transformation of the prediction of the ensemble model was carried out using the threshold which maximize TSS, a method known to improve the accuracy of prediction (Jiménez-Valverde & Lobo, 2007).Maps were transformed rescaling pixel values (1 pixel = 1 ha) between 1 (potentially suitable) and 0 (not suitable for the species).The same procedure was performed for elevation values computing minimum, median, mean, mode and maximum values to check a possible rising at higher elevation of the suitable envelope.All the calculations and statistics were made considering the whole spatial distribution of the species in Abruzzo, while the conservation strategy was studied just for the MAP. in statistical packages (Bedia et al., 2011); the optimal regression formula is generally calculated through a stepwise procedure, using AIC criteria.
MARS model (Friedman, 1991) is a powerful nonparametric tool, mainly used for data mining.It is an adaptive procedure and similar to GLM it is based on regressive methods and well suited for high-dimensional problems (i.e, a large number of inputs).It can be considered as a generalization of stepwise linear regression or a modification of the CART method (Hastie et al., 2008).The main feature of MARS is that the algorithm works sub-setting the dataset in different subsections which are modelled separately and connected at the end of computation.
RF regression-model algorithm (Breiman, 2001) belongs to the machine-learning techniques and derives from Classification and Regression Trees.In this case, the regression is built using predictors to classify objects which are sampled randomly through a bootstrap procedure.The number of randomly-sampled predictors is, in general, the square root of the total number for classification and one-third for regression.Tree nodes are created using the randomly-sampled predictors (generally climatic variables or bioclimatic indices as our case) that had the smallest classification error.For each step, RF created a different regression tree, splitting data into groups, the "bagged sample" and the "out-of-bag sample".The first is used to create the tree and the second is used to calculate the classification error (the Out Of Bag error).After a specific number of trees (500, 1000, 2000... N) is created, the computation ends.

Ensemble model calculation and environmental analysis
To assess differences among datasets and models the Kruskal-Wallis Rank Sum test (Kruskal & Wallis, 1952) was used to perform an non parametric ANOVA.The Area Under Receiver Operating Characteristic Curve (AUC) and True Skill Statistics (TSS) (Allouche et al., 2006) were used as indicators.AUC and TSS were calculated with a split-sample approach (Van Houwelingen & Le Cressie, 1990), dividing each dataset in "training sites" and "test sites" with a 70% -30% proportion.AUC and TSS are both indicators of goodness of prediction and while the first varies between 1 and 0 the second ranges between -1 and 1.However, only TSS, which corresponds to the sum of sensitivity and specificity-minus-one, has the additional advantage of being fully independent from the species prevalence and the size of the validation dataset (Allouche et al., 2006).After model comparison,   5 and 6 the values related to potential suitable area and elevation limits for the three modelled scenarios are reported.A cartographic representation of the projections is shown in Fig. 4. The model correctly predicted the current natural distribution in ABR0, which was drawn as black polygons.With ABR8.5 the situation was completely changed and current distribution was not predicted to be suitable any more.Three different zones were selected by the

Results
Predictors were initially tested for collinearity (Montgomery et al., 2012) and the 5 biovariables with no collinearity problems (Table 1) were added to the soil map and used for building the SDM.The importance of each predictive variable is reported in Table 2.The modelling procedure detected the bio5 (Max Temperature of Warmest Month) as the most important predictor for all models.Climatic data were detected as much more relevant than soil data, especially in MARS model where they were not useful at all.
Mean AUC and TSS values of 30 bootstrap runs and of full models for each dataset are reported in Table 3, whereas global means and standard deviations are reported in Table 4. ANOVA performed on TSS values demonstrated the absence of statistical differences between the algorithms.Consequently, the ensemble model was created with all the algorithms.In the current scenario (ABR0) the ensemble model calculated 229,991 hectares of potentially-suitable area, much higher than the present distribution which is 19,185 hectares.In ABR4.5 and ABR8.5 this estimated area decreased very strongly and respectively of -72.1% and -96.5%.According to this prediction, also elevation of suitable envelope was predicted to change.Minimum random extraction of trees and variables, the possible overfitting of some models such as GLM and MARS can be reduced.
The current distribution of the Black pine population was predicted to be modified by the considered scenarios.Strong changes were predicted to happen especially with the warmest projection (ABR8.5).The predicted loss of suitable area for the whole Abruzzo region was very high (-95%) and just few mountainous zones were detected as suitable (Fig. 5).However, as the model suggested, Abruzzo's topographic morphology may play a key role in the conservation of this marginal gene pool.In such case, trees at higher elevations on rocks, growing at around 2,000 metres a.s.l., might increase their reproductive role in a warming climate allowing the species to migrate.In this context, this "new" source of seeds could be a relevant advantage versus competitors such as Beech and/or Oaks which are frequently mixed with Black pine.These hardwood species are not able to migrate at higher elevation without the help of animals.In addition future-suitable lands are actually bare and inhospitable for trees species and, Black pine would probably be more able to colonize these new environments.A similar effect of the elevation ranges was detected in other Italian regions (Attorre et al., 2011;Vacchiano & Motta, 2014).
Our model reported a very high variation of suitable area in Abruzzo predicting higher values for ABR0 than the real situation (+2,000%).Reasons rely on the fact that SDMs works with the occupied ecological niche, compared to the environmental variability.In this view, the spatial distribution of a species cannot match all the locations that have similar condi-model on top of the mountains surrounding the Sangro river.The first one was on the higher parts of the Camosciara area (Monte Capraro, Monte Petroso and Monte Tartaro) whereas other two were on the opposite side of the valley (Monte Greco, Monte Marsicano and Monte della Corte).

Discussion and Conclusion
Despite many recent works used RF as unique (or unique-based) algorithm to predict present and future distribution of forest species (Wang et al., 2012;Melini, 2013;Isaac-Renton et al., 2014), also regression-based models (GLM and MARS) performed well in this study.As expected, RF showed higher TSS and AUC and smaller standard deviations.The use of a group of models for a consensus map, with more runs and multiple datasets with biomod2 ensemble weighting method, can consistently improve the prediction of a single model (Marmion et al., 2009;Forester et al., 2013).An ensemble modelling approach can correct biases in calculations, making the prediction more stable than the classical packages of the various algorithms (Liaw & Wiener, 2002).For instance, while RF prediction is often affected by  tions.Forest ecosystems are dynamic and complex systems with a mixture of species which interact and compete for natural resources (Ciancio & Nocentini, 2011) and those dynamics can hardly be included in a model.In addition, we must also consider that Black pine is a very plastic species which can grow in a very wide spectrum of areas (Vidakovic, 1974;Isajev et al., 2004;Corona & Nocentini, 2009).Different biotic and a-biotic factors are involved (Pearson & Dawson, 2003).In this view, a specific conservation strategy, based on silvicultural management and monitoring system for the species should be considered in order to observe future development and manage the forest genetic resource properly.If on one side the adaptation strategies are partially considered in this modelling approach, future climate developments are likely to be faster than the migration and adaptation ability of the species (Mátýas et al., 2009).An assisted migration protocol could be taken into account and combined with in-situ adaptive management.A silvicultural approach aimed at increasing the genetic exchange among trees and based on natural regeneration could favour the development of adaptive traits (Brang et al., 2014).This could be achieved with silvicultural interventions which differentiate stand structure and open up the canopy, e.g.small group selection felling (Ciancio et al., 2006).At the same time the establishment of seed orchards and dynamic ex-situ conservation could allow the conservation of the available gene pool.
In the end, predicting the impact of Climate Change on forest species is full of uncertainties.Biological, genetic and ecological skills are fundamental to contribute to these studies, to enforce and validate statistical models and to combine different approaches.Species-specific analysis like genetic diversity, dendrochronology and water-stress resistance could be added to study the phenotypic plasticity of the species (Grivet et al., 2013).In addition, the genetic information about species and their local adaptation must be carefully considered and all the efforts made on local studies, such as those on MaP populations, could have a global impact on the development of a common scientific knowledge base about adaptive processes of forest species.
more runs were computed.An equal number of presence and pseudo-absence points have been extracted from the database, weighting them equally during the computation.Five different datasets were created (PArepI, PArepII, ParepIII, ParepIV, ParepV) with 20,094 locations each (10,047 presences + 10,047 pseudo-absences) merging the final predictions to obtain a single ensemble model.In addition, before any modelling activity the presence and absence points were carefully checked with geostatistical methods to remove the spatial autocorrelation.2000 normal climate.Average annual normal of precipitation amount is 1,491 mm with 112 wet days but low summer precipitation (251 mm).The vegetation period is around 150 days and soil types belong to the Calcaric cambisols (European-Soil-Bureau, 1999).

Figure 1 .
Figure 1.Natural range of Pinus nigra spp.from EUFORGEN official portal and geographic position of the MaP population of Villetta Barrea (red circle).
map, showing the natural distribution area of Pinus nigra was compiled by members of the EUFORGEN Networks Citation: Distribution map of Black pine (Pinus nigra) EUFORGEN 2009, www.euforgen.org.First published online on 26 March 2005 -Updated on 5 following the projections rcp4.5 (ABR4.5)and rcp8.5 (ABR8.5

Figure 2 .
Figure 2. A partial representation of the ecological niche covered by the Villetta Barrea Black pine in Abruzzo.Black circles represents all the pixels covered by the species in the region while yellow, blue and red dots refer to the study area.The ecological optimum(Gellini & Grossoni, 2003) is coloured in green.

Figure 3 .
Figure 3. Structure of the modelling procedure for the construction of the Species distribution model. 8

Figure 4 .
Figure 4. Cartographic representation of the Ensemble models for the MAP and the surrounding area of Villetta Barrea.Green colours (dark green and clear green) correspond to "potentiallysuitable area" whereas red colours (yellow, orange and red) were used for the not suitable lands.

Table 1 .
Collinearity test results of the 19 input parameters

Table 2 .
Predictors' importance for each algorithm and mean values (range from 0 to 1)

Table 3 .
Mean AUC and TSS of 30 bootstrap reperirions

Table 4 .
Global mean and standard deviation of AUC and TSS values

Table 5 .
Potential distribution area (ha) in the whole Abruzzo region in the three considered scenarios (number of cells > 0.7)

Table 6 .
Elevation variance in metres (and percentage referring to ABR0) for each scenario (cell value > 0.7)