Evaluating predictive models of species’ distributions: criteria for selecting optimal models

doi:10.1016/S0304-3800(02)00349-6

Ecological Modelling

Volume 162, Issue 3, 15 April 2003, Pages 211-232

https://doi.org/10.1016/S0304-3800(02)00349-6

Abstract

The Genetic Algorithm for Rule-Set Prediction (GARP) is one of several current approaches to modeling species’ distributions using occurrence records and environmental data. Because of stochastic elements in the algorithm and underdetermination of the system (multiple solutions with the same value for the optimization criterion), no unique solution is produced. Furthermore, current implementations of GARP utilize only presence data, rather than both presence and absence (the more general case). Hence, variability among GARP models, which is typical of genetic algorithms, and complications in interpreting results based on asymmetrical (presence-only) input data make model selection critical. Generally, some locality records are randomly selected to build a distributional model, with others set aside to evaluate it. Here, we use intrinsic and extrinsic measures of model performance to determine whether optimal models can be identified based on objective intrinsic criteria, without resorting to an independent test data set. We modeled potential distributions of two rodents (Heteromys anomalus and Microryzomys minutus) and one passerine bird (Carpodacus mexicanus), creating 20 models for each species. For each model, we calculated intrinsic and extrinsic measures of omission and commission error, as well as composite indices of overall error. Although intrinsic and extrinsic composite measures of overall model performance were sometimes loosely related to each other, none was consistently associated with expert-judged model quality. In contrast, intrinsic and extrinsic measures were highly correlated for both omission and commission in the two widespread species (H. anomalus and C. mexicanus). Furthermore, a clear inverse relationship existed between omission and commission there, and the best models were consistently found at low levels of omission and moderate-to-high commission values. In contrast, all models for M. minutus showed low values of both omission and commission. Because models are based only on presence data (and not all areas are adequately sampled), the commission index reflects not only true commission error but also a component that results from undersampled areas that the species actually inhabits. We here propose an operational procedure for determining an optimal region of the omission/commission relationship and thus selecting high-quality GARP models. Our implementation of this technique for H. anomalus gave a much more reasonable estimation of the species’ potential distribution than did the original suite of models. These findings are relevant to evaluation of other distributional-modeling techniques based on presence-only data and should also be considered with other machine-learning applications modified for use with asymmetrical input data.

Introduction

Predictive modeling of species’ distributions now represents an important tool in biogeography, evolution, ecology, conservation, and invasive-species management (Busby, 1986, Nicholls, 1989, Walker, 1990, Walker and Cocks, 1991, Sindel and Michael, 1992, Wilson et al., 1992, Box et al., 1993, Carpenter et al., 1993, Austin and Meyers, 1996, Kadmon and Heller, 1998, Yom-Tov and Kadmon, 1998, Corsi et al., 1999, Peterson et al., 1999, Peterson et al., 2000, Fleishman et al., 2001, Peterson and Vieglais, 2001, Boone and Krohn, 2002, Fertig and Reiners, 2002, Scott et al., 2002). These approaches combine occurrence data with ecological/environmental variables (both biotic and abiotic factors: e.g. temperature, precipitation, elevation, geology, and vegetation) to create a model of the species’ requirements for the examined variables. Primary occurrence data exist in the form of georeferenced coordinates of latitude and longitude for confirmed localities that typically derive from vouchered museum or herbarium specimens (Baker et al., 1998, Funk et al., 1999, Soberon, 1999, Ponder et al., 2001, Stockwell and Peterson, 2002a). Absence data are rarely available, especially in poorly sampled tropical regions where modeling may hold greatest value (Stockwell and Peters, 1999, Anderson et al., 2002a). The environmental variables typically examined in such modeling efforts encompass only relatively few of the possible ecological-niche dimensions (Hutchinson, 1957). Nevertheless, currently available digital environmental coverages (digitized computer maps) provide many variables that commonly influence species’ macrodistributions (Grinnell, 1917a, Grinnell, 1917b; Root, 1988, Brown and Lomolino, 1998).

The resulting model is then projected onto a map of the study region, showing the species’ potential geographic distribution (e.g. Chen and Peterson, 2000, Peterson and Vieglais, 2001). Models are generally based on the species’ fundamental niche (Hutchinson, 1957; including factors controlling distributions put forward in Grinnell, 1917b; see also MacArthur, 1968, Wiens, 1989, Morrison and Hall, 2002). Thus, some areas indicated by the model as regions of potential presence may be occupied by closely related species, or may represent suitable areas to which the species has failed to disperse or in which it has gone extinct. Rather than a drawback, however, this “overprediction” resulting from the niche-based nature of the models actually allows for synthetic evolutionary and ecological applications comparing potential and realized distributions (Peterson et al., 1999, Peterson and Vieglais, 2001; Anderson et al., 2002a, Anderson et al., 2002b).

The Genetic Algorithm for Rule-Set Prediction (GARP: http://biodi.sdsc.edu/; see http://beta.lifemapper.org/desktopgarp/ for software download) is an expert-system, machine-learning approach to predictive modeling (Stockwell and Peters, 1999). Genetic algorithms constitute one class of artificial-intelligence applications and were inspired by models of genetics and evolution (Holland, 1975). They have been applied to various problems not amenable to traditional computational methods because the search space of all possible solutions is too large to search exhaustively in a reasonable amount of time (Stockwell and Noble, 1992). Genetic algorithms present a heuristic solution to this dilemma by scanning broadly across the search space and refining solutions that show high values for the optimization (fitness) criterion. GARP has proven especially successful in predicting species’ potential distributions under a wide variety of situations (Peterson and Cohoon, 1999; Peterson et al., 1999, Peterson et al., 2001, Peterson et al., 2002a, Peterson et al., 2002b, Peterson et al., 2002c; Godown and Peterson, 2000, Sanchez-Cordero and Martinez-Meyer, 2000, Peterson, 2001, Elith and Burgman, 2002; Feria-A. and Peterson, 2002; Stockwell and Peterson, 2002a, Stockwell and Peterson, 2002b; but see Lim et al., 2002). Chen and Peterson (2000), Peterson and Vieglais (2001), and Anderson et al. (2002a) provide general explanations of the GARP modeling process and interpretation of potential distributions; see Stockwell and Noble (1992) and Stockwell and Peters (1999) for technical details.

GARP reduces error in predicted distributions by maximizing both significance and predictive accuracy, a novel goal for such analytical systems (Stockwell and Peters, 1999). The algorithm is largely successful in doing so without overfitting or overly specializing rules, which is especially important when models are based on occurrence data compiled without a fixed study design (Peterson and Cohoon, 1999). Owing to stochastic elements in the algorithm (such as mutation and crossing over; Holland, 1975, Stockwell and Noble, 1992), however, no unique solution is produced; indeed, the underdetermination of the system yields multiple solutions holding the same value for the optimization criterion. Hence, the variability among resulting models (typical of most machine-learning problems) requires careful examination of possible sources of error in order to select the most predictive models.

A common strategy for evaluating model quality has been to divide known localities randomly into two groups: training data used to create the model and an independent test data set used to evaluate model quality (Fielding and Bell, 1997, Fielding, 2002). One-tailed χ²-statistics (or binomial probabilities, if sample sizes are small) are often employed to determine whether test points fall into regions of predicted presence more often than expected by chance, given the proportion of map pixels predicted present by the model (e.g. Peterson et al., 1999, Anderson et al., 2002a). These tests using independent test data thus provide extrinsic measures of model significance (departure from random predictions). However, by excluding part of the data set from the model-building stage, the algorithm cannot take advantage of all known locality records. Clearly, an optimal model would incorporate data from all available records of the species.
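The test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the toy numbers are hypothetical, the statistic is the usual two-category (inside/outside) χ² with 1 degree of freedom, and 2.706 is the one-tailed critical value quoted later in the article.

```python
from typing import Tuple


def one_tailed_chi_square(
    n_test: int,
    n_inside: int,
    proportion_predicted: float,
    critical: float = 2.706,  # one-tailed critical value, 1 d.f., alpha = 0.05
) -> Tuple[float, bool]:
    """Test whether test points fall in predicted presence more often than chance.

    proportion_predicted is the fraction of map pixels predicted present
    (must lie strictly between 0 and 1).
    """
    expected_in = n_test * proportion_predicted
    expected_out = n_test - expected_in
    observed_out = n_test - n_inside
    chi2 = ((n_inside - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    # One-tailed: only departures in the desired direction count as significant.
    in_desired_direction = n_inside > expected_in
    return chi2, in_desired_direction and chi2 > critical


# Hypothetical case: 16 of 20 test points inside a prediction covering half the map.
chi2, significant = one_tailed_chi_square(20, 16, 0.5)
print(round(chi2, 3), significant)  # 7.2 True
```

A model that places fewer test points inside the prediction than expected by chance is never flagged as significant here, however large its χ² value.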

One tactic for managing the variability among models has been to make multiple models and determine how many models predict particular pixels as present (Anderson et al., 2002a, Lim et al., 2002; Peterson et al., unpublished data). Anderson et al. (2002a) tempered among-model variation by making three GARP models per species and creating a composite prediction based on all three models. In further analyses, map pixels predicted present by at least two of the models were then considered “predicted presence”. Similarly, Lim et al. (2002) created five models per species and deemed pixels predicted by three or more of them as predicted presence in subsequent analyses. More recently, Peterson et al. (unpublished data) have made larger numbers of models and summed them (for each model, value of 1 for a pixel of presence; value of 0 for predicted absence). In such an approach, the value of a pixel in the composite (summed) map thus equals the number of models predicting presence in that cell. Summing models may reveal a consistent signal that holds up across many different independent random walks of model generation. The above methods weigh all model replicates equally; in contrast, we herein compare such equal-weight tactics with a best-subsets approach.
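The summing and thresholding of replicate models can be illustrated with a minimal sketch. The grids below are toy data, and the function names are hypothetical rather than part of GARP or of any of the cited studies.

```python
from typing import List

Grid = List[List[int]]  # 1 = predicted presence, 0 = predicted absence


def sum_models(models: List[Grid]) -> Grid:
    """Cell-by-cell sum: each cell holds the number of models predicting presence."""
    rows, cols = len(models[0]), len(models[0][0])
    return [[sum(m[r][c] for m in models) for c in range(cols)] for r in range(rows)]


def threshold(summed: Grid, min_models: int) -> Grid:
    """Recode the summed map as presence (1) where at least min_models agree."""
    return [[1 if value >= min_models else 0 for value in row] for row in summed]


# Three hypothetical 2 x 3 binary predictions (toy data, not from the study).
models = [
    [[1, 0, 1], [0, 0, 1]],
    [[1, 1, 0], [0, 0, 1]],
    [[1, 0, 0], [0, 1, 1]],
]
summed = sum_models(models)
majority = threshold(summed, min_models=2)
print(summed)    # [[3, 1, 1], [0, 1, 3]]
print(majority)  # [[1, 0, 0], [0, 0, 1]]
```

With three models and `min_models=2` this mirrors the two-of-three rule described above; with five models and `min_models=3` it mirrors the three-of-five rule.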

Two types of error are possible in predictive models of species’ distributions: false negatives (omission error or underprediction) and false positives (commission error or overprediction). The relative proportions of these errors are typically expressed in a confusion matrix, or error matrix (Fielding and Bell, 1997). Four elements are present in a confusion matrix (Table 1). Element a represents known distributional areas correctly predicted as present, and d reflects regions where the species has not been found and that are classified by the model as absent. Thus, a and d are considered correct classifications; in contrast, c and b are usually interpreted as errors. Element c denotes omission: pixels of known distribution predicted absent by the model. Conversely, b is a measure of areas of absence (or “pseudo-absence”—see below) incorrectly predicted present (commission). Unfortunately, when known presence points are few in number and true absence points are not available, problems arise with some measures derived from the confusion matrix (Fielding and Bell, 1997).

GARP creates a confusion matrix by intrinsically re-sampling map pixels with replacement. First, 1250 map pixels are chosen randomly with replacement from those pixels holding localities of known occurrence (training points). The quantity a is the number of those pixels that coincide with areas of predicted presence; the number falling outside the prediction equals c. Thus, a+c=1250 for GARP models in which all pixels are predicted as either present or absent (in some models, the rule-set may not make a decision for every pixel; such pixels are then coded as “no data” in the prediction—see below). Likewise, 1250 pixels are re-sampled with replacement from the remaining pixels of the study area (any pixels without confirmed presence data in the training set). These pixels are referred to as background points or pseudo-absence points (Stockwell and Peters, 1999), highlighting the difference between models based on typical biodiversity information (positive occurrence records from zoological museums or herbaria, as here) and those that also include true absence data (e.g. Corsi et al., 1999, Fertig and Reiners, 2002). Background pixels that fall into regions of predicted presence yield b, whereas background pixels of predicted absence produce d; b+d=1250 for models with a presence/absence prediction for all pixels (but less if not all cells are predicted either present or absent).
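The re-sampling scheme just described can be sketched as follows. This is an illustration of the procedure as stated in the text, not GARP's actual code: the function name, the dictionary representation of the map, the fixed seed, and the toy data are all assumptions.

```python
import random
from typing import Dict, List, Optional, Tuple

Pixel = Tuple[int, int]  # (row, column) of a map cell


def resampled_confusion(
    prediction: Dict[Pixel, Optional[int]],
    training_pixels: List[Pixel],
    n_draws: int = 1250,
    seed: int = 0,
) -> Dict[str, int]:
    """Build a, b, c, d by drawing pixels with replacement.

    prediction maps each pixel to 1 (present), 0 (absent) or None (no data).
    """
    rng = random.Random(seed)
    presence_set = set(training_pixels)
    presence = sorted(presence_set)
    # Background (pseudo-absence) pixels: everything without a training record.
    background = [p for p in prediction if p not in presence_set]
    counts = {"a": 0, "b": 0, "c": 0, "d": 0}
    # Presence draws: predicted present -> a, predicted absent -> c (omission).
    for pixel in rng.choices(presence, k=n_draws):
        value = prediction[pixel]
        if value == 1:
            counts["a"] += 1
        elif value == 0:
            counts["c"] += 1
    # Background draws: predicted present -> b, predicted absent -> d.
    for pixel in rng.choices(background, k=n_draws):
        value = prediction[pixel]
        if value == 1:
            counts["b"] += 1
        elif value == 0:
            counts["d"] += 1
    return counts


# Toy 10 x 10 map: the western half is predicted present (hypothetical data).
prediction = {(r, c): (1 if c < 5 else 0) for r in range(10) for c in range(10)}
training_pixels = [(0, 0), (2, 1), (4, 3), (6, 4), (8, 7)]  # last one is omitted
matrix = resampled_confusion(prediction, training_pixels)
print(matrix, matrix["a"] + matrix["c"], matrix["b"] + matrix["d"])
```

Pixels coded as "no data" (`None`) are drawn but not counted, so the row totals can fall below 1250, as noted in the text.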

As mentioned above, distributional-modeling algorithms like GARP are often used with only presence data. For most species, data regarding absence are not available (Stockwell and Peters, 1999, Peterson, 2001). In addition, when a potential distribution based on the species’ fundamental niche is desired, use of absence data could adversely affect the model-building process by inhibiting inclusion of areas that hold suitable environmental conditions where the species is not present due to historical restrictions or biological interactions (Peterson et al., 1999, Anderson et al., 2002b). However, despite the practical necessity and theoretical justification for using only presence data in modeling ecological niches, this asymmetry in input data (errors in pseudo-absences but not in presences) requires that interpretation of the confusion matrix be amended. In such cases, whereas element c represents pure omission error, element b includes the contributions of both true and apparent commission error.

Apparent commission error derives from potentially habitable regions correctly predicted as presence, but that cannot be demonstrated as such because no verification of the species exists there. The lack of verification of the species may have various causes (Karl et al., 2002). In certain cases, some areas lacking documentation of the species stem from historical causes or biotic interactions (Peterson, 2001). For example, disjunct areas of potential habitat with no records of the species often correspond to historical restrictions or the historical effects of speciation (e.g. failure of the species to disperse to a region of suitable habitat; Peterson et al., 1999, Peterson and Vieglais, 2001, Anderson et al., 2002a). Similarly, competition between related species showing parapatric distributions likely restricts many species’ realized distributions (Peterson, 2001, Anderson et al., 2002b). Other biological interactions—such as predation in some parts of the potential range but not in others—may also limit some species’ distributions. In addition to historical and biotic causes, apparent commission error can also derive from inadequate sampling: map pixels of real presence (at least at some time of the year in some subhabitat) lacking documentation of the species because they have not been adequately sampled by biologists (Karl et al., 2002). This latter form of apparent commission error has recently been recognized in presence/absence data sets where inventories were extensive yet incomplete (Boone and Krohn, 1999, Karl et al., 2000, Schaefer and Krohn, 2002, Stauffer et al., 2002). By definition, it reaches maximum manifestation in presence-only modeling applications like current implementations of GARP. As the goal of presence-only potential-distribution modeling is to determine which of the background (pseudo-absence) pixels actually represent suitable areas for a species—whether or not it actually inhabits them—interpreting measures of commission is critical.

One measure of overall model performance is the correct classification rate of Fielding and Bell (1997) (see Table 2). GARP provides an intrinsic correct classification rate derived from the confusion matrix: (a+d)/(a+b+c+d), equal to the “accuracy” of Stockwell and Peterson (2002b), not that of Anderson et al. (2002a). This quantity ranges from 0 to 1 and is designed to measure overall model adequacy, including contributions of both omission and commission in the denominator. Note that correct classification rate = 1 − (sum of error terms)/(sum of all terms), i.e. 1 − (b+c)/(a+b+c+d). However, because element b is overestimated by the preponderance of background (pseudo-absence) pixels, this statistic is necessarily biased with data sets that lack true absence data (common with biodiversity information; Peterson, 2001, Ponder et al., 2001, Stockwell and Peterson, 2002a). Likewise, the overall Kappa (κ)-statistic of Fielding and Bell (1997) includes elements of both omission and commission and thus suffers from the same problem (see also Fielding, 2002).
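As a sketch, both composite measures can be computed directly from the four matrix elements. The function names and the example counts are hypothetical, and κ is written here in its standard Cohen form (observed agreement corrected for chance agreement), which is assumed to match the statistic reviewed by Fielding and Bell (1997).

```python
def correct_classification_rate(a: int, b: int, c: int, d: int) -> float:
    """(a + d) / (a + b + c + d): share of re-sampled pixels classified correctly."""
    return (a + d) / (a + b + c + d)


def kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa: agreement beyond what the marginal totals give by chance."""
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the marginals: predicted-present x actual-present
    # plus predicted-absent x actual-absent.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical matrix: 1250 presence draws (a + c) and 1250 background draws (b + d).
a, b, c, d = 1200, 700, 50, 550
print(round(correct_classification_rate(a, b, c, d), 3))  # 0.7
print(round(kappa(a, b, c, d), 3))                        # 0.4
```

Because b enters both formulas as an error, any background pixels that are in fact suitable for the species lower both values even when the model is right about them.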

The χ²-statistic based on independent test data can be used as an extrinsic measure of overall performance, because it incorporates both omission (of test points) and commission (via expected frequencies; Table 2). However, this statistic is highly sensitive to the proportional extent of predicted presence, making highly significant results possible with unacceptably high omission rates (e.g. models that only include the core ecological distribution of the species). In addition, χ²-significance values are related to sample size (Peterson, 2001). Hence, it is likely that neither correct classification rates, κ-statistics (both potentially intrinsic), nor χ²-significance values (typically extrinsic) represent reliable measures of overall model performance.
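A worked example with hypothetical numbers (not taken from the study) illustrates this sensitivity. Assume 20 test points and a model that predicts presence in only 5% of the study region, with 10 of the test points falling inside the prediction:

```latex
\begin{align*}
E_{\text{in}} &= 20 \times 0.05 = 1, \qquad E_{\text{out}} = 20 - 1 = 19,\\
\chi^2 &= \frac{(10-1)^2}{1} + \frac{(10-19)^2}{19} = 81 + \frac{81}{19} \approx 85.26,\\
\text{extrinsic omission} &= \frac{10}{20} = 0.50 .
\end{align*}
```

Under these assumed values the statistic is far above any conventional critical value even though half of the test points are omitted.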

To assess model performance more adequately, other indices that provide intrinsic estimates of each error component can be derived from the confusion matrix (Table 2; reviewed in Fielding and Bell, 1997). The quantity c/(a+c) represents the intrinsic omission error rate, and b/(b+d) represents what we here term the intrinsic commission index (false negative and false positive rates, respectively, of Fielding and Bell (1997)). The intrinsic omission error reflects the proportion of known localities (training points) that fall outside the predicted region (by re-sampling with replacement to produce the confusion matrix). The intrinsic commission index mirrors the proportion of pixels predicted present by the model (proportion of re-sampled background points falling into regions of predicted presence). Owing to the general scarcity of confirmed presence data, however, this latter index includes contributions of (1) true commission error (overprediction) as well as of (2) apparent commission error (correctly predicted areas not verifiable as such, primarily because of the lack of adequate sampling). The aim of predictive modeling is precisely to determine this latter quantity, as well as the geographic distribution of those pixels. To emphasize the dual nature of b/(b+d), we term it the intrinsic commission index rather than intrinsic commission error. One of our aims is to discriminate between its two components.
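The two intrinsic quantities defined above reduce to simple ratios of matrix elements. The sketch below uses hypothetical function names and example counts.

```python
def intrinsic_omission_rate(a: int, c: int) -> float:
    """c / (a + c): share of re-sampled training pixels predicted absent."""
    return c / (a + c)


def intrinsic_commission_index(b: int, d: int) -> float:
    """b / (b + d): share of re-sampled background pixels predicted present.

    This mixes true commission error with apparent commission error,
    which is why the text calls it an index rather than an error rate.
    """
    return b / (b + d)


# Hypothetical counts from 1250 presence draws and 1250 background draws.
print(intrinsic_omission_rate(a=1200, c=50))     # 0.04
print(intrinsic_commission_index(b=700, d=550))  # 0.56
```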

Extrinsic measures of omission and commission exist parallel to the respective intrinsic ones (Table 2). Where out_test=the number of test points falling outside predicted areas and n_test=the number of test points, out_test/n_test represents extrinsic omission error. Likewise, the proportion of pixels predicted present can serve as an extrinsic commission index. In fact, because the number of training points is usually extremely small in comparison with the number of background pixels in the overall study region, the intrinsic commission index will converge on this extrinsic measure with adequate re-sampling.
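The extrinsic counterparts can be sketched in the same way. The dictionary map representation, the function names, and the toy data are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]  # (row, column) of a map cell


def extrinsic_omission_error(prediction: Dict[Pixel, int], test_points: List[Pixel]) -> float:
    """out_test / n_test: share of test points falling outside predicted presence."""
    out_test = sum(1 for point in test_points if prediction[point] != 1)
    return out_test / len(test_points)


def extrinsic_commission_index(prediction: Dict[Pixel, int]) -> float:
    """Proportion of all map pixels predicted present."""
    return sum(1 for value in prediction.values() if value == 1) / len(prediction)


# Toy 4 x 5 map: the two western columns are predicted present (hypothetical data).
prediction = {(r, c): (1 if c < 2 else 0) for r in range(4) for c in range(5)}
test_points = [(0, 0), (1, 1), (2, 3), (3, 0)]  # (2, 3) lies outside the prediction

print(extrinsic_omission_error(prediction, test_points))  # 0.25
print(extrinsic_commission_index(prediction))             # 0.4
```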

In the present study, we evaluate model performance based on both intrinsic and extrinsic criteria, with the goal of identifying optimal models based on intrinsic measures only. If that were possible, optimal models could then be identified even when generated using all known locality data. We approach this problem by examining measures of omission and commission, as well as composite indices designed to reflect both quantities. Because measures of commission are dependent on the proportional extent of areas potentially inhabitable by the species within the study region, we examine in detail three cases whose modeled ecological niches show geographic manifestations occupying varying proportions of the respective study areas. Current implementations of GARP represent the modification of a general algorithm for the specific case of presence-only (generally museum) data. The present research is also germane to evaluation of other distributional-modeling techniques that use presence-only data. In addition, it may be broadly relevant to machine-learning applications with asymmetrical input data (asymmetrical errors).

Section snippets

Study species

The spiny pocket mouse Heteromys anomalus (Heteromyidae) is a common, medium-sized rodent (50–100 g) that is widespread along the Caribbean coast of South America in northern Colombia and Venezuela, as well as on the nearby islands of Trinidad, Tobago, and Margarita. It has been documented in deciduous forest, evergreen rainforest, cloud forest, and some agricultural areas, typically from sea level to approximately 1600 m (Anderson, 1999, unpublished data; Anderson and Soriano, 1999). We examine

Composite measures of performance

Extrinsic performance measures (χ²) were almost always significant. Seventeen of the 20 models for H. anomalus showed significant deviations from random predictions, in the desired direction (χ² for significant models=4.07–16.95; P<0.05; one-tailed critical value χ² (1 d.f., α=0.05)=2.706; the other three models showed non-significant departures in the desired direction). All models were highly significant for both M. minutus (χ²=177.02–684.74; P≪0.05) and C. mexicanus (χ²=42.29–164.50; P≪0.05). The

Measures of overall performance

Considerable variation was present among GARP models, as predicted by the theoretical background of genetic algorithms (Holland, 1975) and indicated by previous work (e.g. Anderson et al., 2002a). Thus, the algorithm generally performed as expected in this domain. Below, however, we consider issues regarding error quantification in this special case of presence-only data. Furthermore, we explore relationships between various indices and expert-judged model quality.

Neither extrinsic nor

Conclusions and recommendations

In the terminology of genetic algorithms, modification of GARP for use with presence-only occurrence data can result in a highly atypical fitness surface. When visualized in omission/commission space, the repercussions of pseudo-absences sometimes create a fitness ridge, rather than the typical global fitness peak. For GARP distributional models, this ridge is likely present for most species having medium-to-large potential distributions in the study region. Solutions along the ridge show

Acknowledgements

This work has been supported by a Grant in Aid of Research (American Society of Mammalogists) and a Roosevelt Postdoctoral Research Fellowship (American Museum of Natural History) to RPA; Subvención CONICIT (S2-2000002353) to DL; and National Science Foundation grants to ATP. Funding sources supporting Anderson’s systematic research on Heteromys appear in the relevant taxonomic works. Víctor Sánchez-Cordero, Mark E. Stahl, Robert S. Voss, Marcelo Weksler, and two anonymous reviewers read

References (73)

M.P. Austin et al. Current approaches to modelling the environmental niche of eucalyptus: implication for management of forest biodiversity. Forest Ecol. Manage. (1996)

A.M. Jarvis et al. Predicting population sizes and priority conservation areas for 10 endemic Namibian bird species. Biol. Conserv. (1999)

A.O. Nicholls. How to make biological surveys go further with generalized linear models. Biol. Conserv. (1989)

A.T. Peterson et al. Geographic analysis of conservation priority: endemic birds and mammals in Veracruz, Mexico. Biol. Conserv. (2000)

A.T. Peterson et al. Effects of global climate change on geographic distributions of Mexican Cracidae. Ecol. Model. (2001)

J. Soberón. Linking biodiversity information sources. Trends Ecol. Evol. (1999)

D.R.B. Stockwell et al. Induction of sets of rules from animal distribution data: a robust and informative method of data analysis. Math. Comput. Simul. (1992)

D.R.B. Stockwell et al. Effects of sample size on accuracy of species distribution models. Ecol. Model. (2002)

J.A. Allen. Report on mammals from the district of Santa Marta, Colombia, collected by Mr. Herbert H. Smith, with field notes by Mr. Smith. Bull. Am. Mus. Natl. Hist. (1904)

R.P. Anderson. Preliminary review of the systematics and biogeography of the spiny pocket mice (Heteromys) of Colombia. Rev. Acad. Colomb. Cienc. Exactas, Físicas y Naturales (1999)

R.P. Anderson et al. The occurrence and biogeographic significance of the southern spiny pocket mouse Heteromys australis in Venezuela. Z. Sauget. (1999)

R.P. Anderson et al. Geographical distributions of spiny pocket mice in South America: insights from predictive models. Glob. Ecol. Biogeogr. (2002)

R.P. Anderson et al. Using niche-based GIS modeling to test geographic predictions of competitive exclusion and competitive release in South American pocket mice. Oikos (2002)

AOU, 1998. Check-List of North American Birds, 7th ed. American Ornithologists’ Union, Washington, DC, 829...

August, P.V., 1984. Population ecology of small mammals in the llanos of Venezuela. In: Martin, R.E., Chapman, B.R....

R.J. Baker et al. Bioinformatics, museums, and society: integrating biological data for knowledge-based decisions. Occas. Pap. Mus. Tex. Tech Univ. (1998)

O. Bangs. List of the mammals collected in the Santa Marta region of Colombia by W.W. Brown, Jr. J. Proc. N. Engl. Zool. Club (1900)

R.B. Boone et al. Modeling the occurrence of bird species: are the errors predictable? Ecol. Appl. (1999)

Boone, R.B., Krohn, W.B., 2002. Modeling tools and accuracy assessment. In: Scott, J.M., Heglund, P.J., Morrison, M.L.,...

E.O. Box et al. A climatic model for location of plant species in Florida, USA. J. Biogeogr. (1993)

Brown, J.H., Lomolino, M.V., 1998. Biogeography, 2nd ed. Sinauer Associates, Sunderland, MA, 691...

J.R. Busby. A biogeoclimatic analysis of Nothofagus cunninghamii (Hook.) Oerst. in southeastern Australia. Aust. J. Ecol. (1986)

M.D. Carleton et al. Systematic studies of oryzomyine rodents (Muridae, Sigmodontinae): a synopsis of Microryzomys. Bull. Am. Mus. Natl. Hist. (1989)

G. Carpenter et al. DOMAIN: a flexible modelling procedure for mapping potential distributions of plants and animals. Biodivers. Conserv. (1993)

G.-J. Chen et al. A new technique for predicting distribution of terrestrial vertebrates using inferential modeling. Zool. Res. (2000)

F. Corsi et al. A large-scale model of wolf distribution in Italy for conservation planning. Conserv. Biol. (1999)

A. Díaz de Pascual. Aspectos ecológicos de una microcomunidad de roedores de selva nublada, en Venezuela. Bol. Soc. Venez. Cienc. Nat. (1988)

A. Díaz de Pascual. The rodent community of the Venezuelan cloud forest, Mérida. Polish Ecol. Stud. (1994)

Elith, J., Burgman, M., 2002. Predictions and their validation: rare plants in the central highlands, Victoria,...

ESRI, 1998. ArcView GIS, version 3.1. Environmental Systems Research Institute Inc., Redlands,...

Feria-A., T.P., Peterson, A.T., 2002. Prediction of bird community composition based on point-occurrence data and...

Fertig, W., Reiners, W.A., 2002. Predicting presence/absence of plant species for range mapping: a case study from...

Fielding, A.H., 2002. What are the appropriate characteristics of an accuracy measure? In: Scott, J.M., Heglund, P.J.,...

A.H. Fielding et al. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ. Conserv. (1997)

E. Fleishman et al. Modeling and predicting species occurrences using broad-scale environmental variables: an example with butterflies of the Great Basin. Conserv. Biol. (2001)

V.A. Funk et al. Testing the use of specimen collection data and GIS in biodiversity exploration and conservation decision making in Guyana. Biodivers. Conserv. (1999)

Cited by (967)

Habitat suitability assessment for Saunders's Gull (Saundersilarus saundersi) in the Yellow River Delta, China
2024, Ecological Informatics
The Yellow River Delta (YRD) is a key breeding place for Saunders's Gull (Saundersilarus saundersi), one of the vulnerable birds in the world. Thus, a thorough understanding of the key environmental factors influencing its suitable habitat holds great value for its conservation. Previous researches focused on Saunders's Gull's population changes and habitat features of breeding places in the nature reserve. However, its habitat suitability in the whole YRD has not been studied systematically, hindering the formulation of macro-protection policies to a certain extent. On the basis of occurrence records and environmental variables, we constructed an optimized MaxEnt model using the ‘kuenm’ R package to investigate the potential distribution of suitable habitat for the Saunders's Gull in the YRD. Results showed that the ideal hyperparameters for the MaxEnt model were a feature combination of linear and quadratic terms, and the regularization multiplier of 0.6 after optimization, indicated a high level of prediction accuracy. The environmental factors that had a significant effect on the Saunders's Gull's distribution were land use and land cover, percentage of beach saline–alkali land (PLAND_55) and distance to coastline (D_coastline). The highly suitable habitat had an area of 258.56 km², of which 66.63% was located within the YRD National Nature Reserve. The management of constructed wetlands should be combined with the protection of natural wetlands to avoid the disadvantage of ‘isolated island-type’ protection. The study provides insights into the quantitative relationship between waterfowl and their habitat, providing support for the sustainable development of the YRD, with certain practical significance.
Environmental and geographic low suitability overlapping of geoduck clams in the Pacific Northeast predicted by Ecological Niche Modeling
2024, Regional Studies in Marine Science
The known distribution limits of geoduck clams (genus Panopea) remain geographically separated globally. However, in the northeast Pacific, P. generosa is distributed from Alaska to northern Mexico, but approaches, and likely overlaps, the distribution of P. globosa, which is mainly distributed in the Gulf of California. There is incipient evidence that both populations coexist in the central part of the Baja California Peninsula on the Pacific coast. However, knowledge of the biogeographic distribution and environmental conditions for both species is still developing. To address the hypothesis of coexistence in the same geographic space, we developed distribution area models using ecological niche modeling through the maximum entropy algorithm. The algorithm used presence-only records and environmental layers (predictive variables) such as mean surface temperature, mean salinity, mean depth, temperature range, and primary productivity. The results indicate that both species overlap in a narrow portion of the environmental space (Grinnellian ecological niche), which is reflected in a restricted geographic distribution with conditions of low abiotic suitability. Our results contribute to directing new research in the areas of ecology, biogeography, paleontology, and geoduck clam fisheries management.
Predicting the potential impact of environmental factors on the distribution of Triplochiton scleroxylon (Malvaceae): An economically important tree species in Nigeria
2023, Acta Ecologica Sinica
Triplochiton scleroxylon is a lowland rainforest tree species distributed in Central and West Africa, with economic and medicinal importance. Given that its population is threatened by overexploitation for timber, poor seed production, and habitat destruction, it is important to project its distribution along environmental gradients to strengthen the basis for its conservation. This study thus examines the combined effect of overexploitation and climate change on its distribution in Nigeria. We used the MaxEnt algorithm and the spatial analysis module of ArcGIS to model the current and future habitat suitability of T. scleroxylon in Nigeria. We utilized 95 unique occurrence points and eight environmental variables (Bio7, Bio8, Bio9, Bio13, Bio14, Bio18, elevation, and slope) to understand the response of the species to climate during the period 1970–2000, and to project species responses to temperature and precipitation in the years 2050 and 2070, using two Global Climate Models. Results indicated that the prediction of the suitable habitat for T. scleroxylon under the model was very reliable (AUC ≥ 0.9; TSS ≥ 0.6). Annual temperature range (Bio7) was the most influential bioclimatic variable for predicting habitat suitability and distribution for the species in Nigeria. Highly suitable habitat has an optimum temperature of 28.5 °C, precipitation ≤600 mm, an elevation of ≤400 m, and a slope of 90°; it is currently estimated at 23.3% of the total land cover, and is predicted to decline to <15% by the year 2050. SDM methods have great potential for understanding species richness, and also provide data to support conservation strategies at local scales. Hence, this study advocates for the conservation of the remaining habitats of T. scleroxylon through the implementation of strategic plans, especially in South-western Nigeria, where there is a high potential for its regeneration owing to favourable environmental conditions.
Potential upslope and latitudinal range shifts for Andean potato weevils Premnotrypes species, in the tropical Andes of South America
2023, Crop Protection
Premnotrypes is a weevil species complex that occurs in the high Andes of South America. Most of the species are considered to be crop pests associated with potato production, thus constituting a major potential pest in Andean regions. Changes in temperature patterns directly affect the ontogeny and physiology of crop pests, which will in turn affect the pest populations. Here, we used ecological niche modeling (ENM) to assess the potential geographic distribution of seven economically important Premnotrypes species under current conditions and under two future scenarios (SSP 4.5 and SSP 8.5) for 2050 and 2090. Current-time niche models predicted suitable areas across the tropical Andes. For P. vorax, ENMs showed suitable areas across the Andes of Peru, Colombia, and Venezuela. For P. lathithorax, ENMs showed suitable areas only in the Andes of Peru and Bolivia. Finally, for P. solaniperda, P. solanivorax, P. pusillus, P. suturicallus, and P. percei, ENMs indicated suitable areas across central and southeastern Peru. Under future conditions, for all seven species, ENMs anticipated range shifts to higher latitudes and upslope migration, thus reducing suitable areas in the eastern Andes. Our novel results offer a guide for designing monitoring programs for potato pests and developing optimal pest control and mitigation strategies.
Factors affecting invasion process of a megadiverse country by two exotic bird species
2023, Anthropocene
Understanding the factors underlying bird invasions is crucial for their management. Here, the invasions of Mexico by the European Starling (Sturnus vulgaris) and the Eurasian Collared-Dove (Streptopelia decaocto) are analyzed. A 30 × 30 km grid-cell map with the presence/absence of both species was generated using citizen-science data to describe their invasion patterns in time and space from their first records until 2016. Binomial Generalized Linear Models were used to determine the invasion probabilities of both species. Geographic Information was used to determine the climatic variables that best explain their presence (abiotic factors) and the number of phylogenetically closely related species (biotic factors). A bioclimatic model was used to test whether the role that climatic variables play in determining the invasion success of birds at the global scale holds at regional scales. This model related the invasion probabilities of each species to biotic and abiotic factors. The main findings are: (1) Both species have expanded from established populations in the US and from new introductions through the bird trade. (2) European Starlings invaded the country more slowly than Eurasian Collared-Doves. (3) European Starlings invaded areas with dry and temperate climates, while Eurasian Collared-Doves invaded most of the country, being positively affected by temperature and precipitation. (4) Invasion probabilities of both species were not constrained by the richness of phylogenetically closely related species. This study indicates that, for exotic invasive birds that exploit agricultural areas, biotic factors do not provide invasion resistance in megadiverse countries such as Mexico.
Predicting the dispersal and invasion dynamics of ambrosia beetles through demographic reconstruction and process-explicit modeling
2024, Scientific Reports

