Species distributions are limited by a complex array of abiotic and biotic factors. In general, abiotic (climatic) factors are thought to explain species’ broad geographic distributions, while biotic factors regulate species’ abundance patterns at local scales. We used species distribution models to test the hypothesis that a biotic interaction with a tree, the Colombian oak (Quercus humboldtii), limits the broad-scale distribution of the Acorn Woodpecker (Melanerpes formicivorus) in the Northern Andes of South America. North American populations of Acorn Woodpeckers consume acorns from Quercus oaks and are limited by the presence of Quercus oaks. However, Acorn Woodpeckers in the Northern Andes seldom consume Colombian oak acorns (though may regularly drink sap from oak trees) and have been observed at sites without Colombian oaks, the sole species of Quercus found in South America. We found that climate-only models overpredicted Acorn Woodpecker distribution, suggesting that suitable abiotic conditions (e.g. in northern Ecuador) exist beyond the woodpecker’s southern range margin. In contrast, models that incorporate Colombian oak presence outperformed climate-only models and more accurately predicted the location of the Acorn Woodpecker’s southern range margin in southern Colombia. These findings support the hypothesis that a biotic interaction with Colombian oaks sets Acorn Woodpecker’s broad-scale geographic limit in South America, probably because Acorn Woodpeckers rely on Colombian oaks as a food resource (possibly for the oak’s sap rather than for acorns). Although empirical examples of particular plants limiting tropical birds’ distributions are scarce, we predict that similar biotic interactions may play an important role in structuring the geographic distributions of many species of tropical montane birds with specialized foraging behavior.
Citation: Freeman BG, Mason NA (2015) The Geographic Distribution of a Tropical Montane Bird Is Limited by a Tree: Acorn Woodpeckers (Melanerpes formicivorus) and Colombian Oaks (Quercus humboldtii) in the Northern Andes. PLoS ONE 10(6): e0128675. https://doi.org/10.1371/journal.pone.0128675
Academic Editor: Filippos A. Aravanopoulos, Aristotle University of Thessaloniki, GREECE
Received: January 23, 2015; Accepted: April 29, 2015; Published: June 17, 2015
Copyright: © 2015 Freeman, Mason. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The authors have no support or funding to report.
Competing interests: The authors have declared that no competing interests exist.
Understanding the factors that explain species’ distributional limits is a fundamental goal of ecology and biogeography . Abiotic factors such as temperature and precipitation are strong predictors of species richness , and several disparate lines of evidence support the hypothesis that abiotic factors often set species’ broad geographic distributions (the geographic Grinnelian niche) —meta-analyses reveal that species’ range limits are often determined by abiotic conditions [4,5], many species are shifting their distributions to cooler upslope or higher-latitude environments in concordance with modern global warming [6,7], and introduced species’ distributional limits can often be predicted from the climatic conditions they experience in their native range [8,9].
However, climate is not the only factor that influences species’ distributional limits. Biotic factors such as habitat variables, resource availability and species interactions (i.e., competition, mutualism and predation) can all limit distributions [1,10]. Many examples demonstrate that species interactions can influence distributional limits at fine spatial scales (e.g., for competition) [11–15], and the influence of biotic factors such as interspecific competition can also explain non-random abundance patterns at regional scales [16,17]. Thus, it is clear that biotic factors can impact patterns of species’ abundance and distribution on small spatial scales (the local Eltonian niche) . However, the relative paucity of examples where biotic factors explain species’ geographic limits supports the Eltonian noise hypothesis, which posits that biotic interactions seldom influence species’ geographic extents . Investigating the influence of biotic interactions on species’ distributions is an active arena of research [10,19–21], and an increasing number of case studies demonstrate that biotic interactions can be important factors influencing species’ distributions [22–25].
We provide one of the first tests comparing the relative influence of climate and a putatively strong biotic interaction in limiting the geographic distribution of the tropical population of a widespread bird species found in both temperate and tropical biomes in the Americas. We used a species distribution modeling approach to investigate two non mutually exclusive factors that could limit the distribution of the Acorn Woodpecker (Melanerpes formicivorus) at its southern range margin: 1) abiotic factors such as temperature, precipitation and seasonality, or 2) a biotic interaction with a putatively important food resource. As its name implies, acorns produced by Quercus oaks form an important component of the Acorn Woodpeckers’ diet, at least within North America, where woodpeckers store acorns in granary trees . These stored acorns are then eaten by woodpeckers during periods of low resource availability (e.g., winter; ). This woodpecker-oak interaction is sufficiently strong that Acorn Woodpecker distribution along the Pacific coast of North America is effectively limited to locations where multiple species of oaks co-occur . However, Acorn Woodpeckers inhabit a wide latitudinal distribution from western North America south to the Northern Andes in South America, and the influence of oak distributions on Acorn Woodpecker distribution in other regions remains unknown (Fig 1).
Fig 1. Geographic distribution of Acorn Woodpecker (Melanerpes formicivorus).
An expert range map of Acorn Woodpecker from BirdLife International is shown in light blue. Inset shows a male Acorn Woodpecker (photograph by Walt Koenig). Reprinted under a CC BY license, with permission from Walt Koenig, original copyright 2011.
Biogeographic and ecological viewpoints provide conflicting perspectives to the hypothesis that oak presence limits the Acorn Woodpecker’s southern range margin. On one hand, the Acorn Woodpeckers’ southern range margin roughly correlates with the distribution of the sole Quercus species in South America, the Colombian oak (Quercus humboldtii) , but does not correspond with an obvious biogeographical or climatic barrier, suggesting that the woodpecker’s distribution may be limited more by the occurrence of oaks than by climatic conditions alone. Indeed, the Acorn Woodpecker’s distributional limits are unusual among Andean bird species: montane bird species that co-occur with Acorn Woodpeckers in southern Colombia typically range farther south into neighboring Ecuador . On the other hand, Acorn Woodpeckers in Colombia can be found at sites without Colombian oaks, have a varied diet including insects and fruit, and do not appear to rely on acorns as a food resource (but do regularly drink sap from Colombian oak trees) . Thus, documented ecological interactions between Colombian oaks and Acorn Woodpeckers within Colombia are unclear and suggest the possibility that Colombian oaks may not influence the woodpeckers’ distribution.
We used a species distribution modeling approach to test the hypothesis that Acorn Woodpecker distribution at its southern range margin is limited by a biotic interaction with Colombian oaks. To test this hypothesis, we compared the performance of three species distribution models. The first model (“abiotic” model) was a standard climate envelope model built using Acorn Woodpecker occurrence data and the climatic variables associated with woodpecker presence localities. The second model (“Quercus” model) included a single, binary layer that reflects the presence or absence of Quercus humboldtii based on occurrence records. The third model (“abiotic and Quercus” model) included both Colombian oak presence-absence as a biotic layer and climatic variables. The hypothesis that an interaction with Colombian oaks limits Acorn Woodpecker distribution in the Northern Andes makes the general prediction that models including the biotic layer (both the “Quercus” and the “abiotic and Quercus” models) will outperform the model without the biotic layer (the “abiotic” model). We assessed this prediction by comparing the performance of the three species distribution models using several metrics. The hypothesis further predicts that models that include biotic layers will more accurately model the Acorn Woodpeckers’ southern range limit than the abiotic model. We compared model overprediction (false positives, quantified as the commission rate) beyond the Acorn Woodpecker’s southern range limit for the three models to test this specific prediction.
Materials and Methods
We used occurrence data downloaded from the Global Biodiversity Information Facility (GBIF; http://www.gbif.org/) to develop a database of georeferenced Acorn Woodpecker localities in the Northern Andes of Colombia (hereafter ‘Northern Andes’). A large proportion of occurrence data came from citizen science observational data entered into eBird . To minimize georeferencing errors from observational data, we accepted eBird records only if they came from stationary counts (checklists of bird species detected by a stationary observer from a single geographic point), exhaustive area counts with a total area < 1 km2 (checklists of bird species detected by an observer within a region < 1 km2), or traveling counts that covered < 5 km (checklists of bird species detected by an observer traveling < 5 km). We included 319 Acorn Woodpecker localities from South America in our finalized data set (S1 File). To avoid issues associated with geographic sampling bias, we used the spThin package  to subsample our dataset such that all occurrence records were separated by a minimum distance of 1 km. We retained 113 occurrence records after thinning. This number of occurrence records is sufficient to generate robust species distribution models; reliable habitat suitability models have been built with far fewer occurrence points (i.e., < 25) in other empirical studies [33,34].
Colombian oaks are the southernmost species of Quercus oak in the Americas, and the only species found in South America. Endemic to the Western, Central and Eastern Andes of Colombia, Colombian oaks live in montane forests (1,500–3,300 m) where they sometimes grow in nearly monospecific stands termed ‘robledales’ . We used a database of 117 georeferenced Colombian oak records from Gonzalez et al.  in our analysis (S2 File). This database includes records from herbaria and field surveys, and was quality checked by experts .
Species distribution modeling
We randomly partitioned one quarter (28 records) of the Acorn Woodpecker occurrence records to test the performance of species distribution models built with the remaining data (85 records), a methodology known as k-fold partitioning. We downloaded data for 19 abiotic variables at a resolution of 30 arc seconds from the WorldClim database . Because our study focuses on Acorn Woodpeckers in the Northern Andes, we limited our species distribution modeling to an extent between 69°W and 82°W longitude and 7°S and 11°N latitude, which does not include the nearest neighboring population of Acorn Woodpeckers present in the highlands of Costa Rica and Panama. We used 250 randomly generated “pseudo-absence” points within this geographic extent for model training, and constructed species distribution models using Maxent v3.3.3k [37,38] with default settings using the ‘dismo’ package .
Model complexity, or the number of variables included in a model, can affect performance and error rates when predicting species distributions. Maxent includes a built-in level of protection against overfitting models, known as L1 regularization. This procedure uses a parameter (β) that weights the penalty applied to the addition of extra parameters and is automatically adjusted by recent versions of Maxent . Recently, Warren and Seifer  demonstrated that overparameterized Maxent models consistently performed better than underparameterized Maxent models. Thus, we opted to let Maxent control model parameterization and included all 19 bioclim variables as input into Maxent, as done in other empirical studies [41–43].
We also generated a grid of Colombian oak presence or pseudo-absence as a binary character by extending known Colombian oak localities to include a radius of 20 km. We used the same set of training and testing data to compare the performance between species distribution models constructed from solely abiotic variables, a single biotic layer (presence-absence of Quercus), and one that included Colombian oak presence or absence in addition to the same abiotic variables.
We assessed the contribution of each layer, such as bioclim variables or the presence of Quercus oaks, towards generating the final model in Maxent. Maxent estimates the importance of each variable in two ways, first by estimating a heuristic approach to calculate the scaled ‘percent contribution’ of each variable, which reflect increases in model performance. However, this number can be influenced by collinear relationships with other layers and must be interpreted with caution. To address collinearity, Maxent also estimates ‘permutation importance’, which uses a jackknife approach to exclude one variable at a time when running the model to assess how much unique information each layer provides. The relative loss of model performance is then scaled and provides another measure of layer importance for a given model.
Model performance can be assessed with a wide variety of statistics that emphasize different aspects of the predictive abilities of a given model . Many model performance statistics, including all of the parameters considered here, are derived from the ‘confusion matrix’, which describes the number of true positives, false positives, true negatives, and false negatives generated by predictions using a certain model . Certain indices of model performance, such as the area under the Receiver Operating Characteristic curve (AUC), do not require a specified threshold to assess the ability of a model to assign a data point (i.e., occurrence data) to a binary state (i.e., presence or absence). Other parameters, such as Kappa (κ; ), require a threshold to quantify the performance of binary categorization.
In this study, we evaluated species distribution models using six different indices of model performance that describe different aspects of the ‘confusion matrix’, including both threshold-independent and threshold-dependent variables. These parameters include (1) the threshold-independent area under the Receiver Operating Characteristic curve (AUC) ; (2) the area under the curve of the Kappa (κ) at different binary thresholds ; (3) the ‘overall performance’ index sensu Anderson et al. , also known as the correct classification rate  at different binary thresholds; (4) the intrinsic commission (false positive) rate  at different binary thresholds; (5) the intrinsic omission (false negative) rate  at different binary thresholds; (6) and the Pearson’s correlation coefficient between observations in the presence and pseudoabsence data set and the model predictions . Although certain modeling evaluation parameters, such as AUC, have been criticized , the variety of parameters considered here allow us to comprehensively evaluate different aspects of model performance, such as model commission (overprediction) and omission (underprediction). Since our research question is focused on determining why Acorn Woodpecker distribution is limited in South America, we placed special emphasis on examining patterns of commission (i.e., overprediction) among our models.
We visualized species distribution models with continuous suitability output from Maxent and also converted continuous suitability into a binary presence-absence model by using the probability threshold that corresponded to maximum Kappa values for each model. Freeman and Moisen  found that this method of binary classification performed favorably compared to 10 other methods, such as using the traditional arbitrary cutoff of 0.5 or the threshold at which sensitivity is equal to specificity, based on measurements of predicted prevalence, prediction accuracy, and the resulting distribution output.
We constructed continuous and binary predictions of habitat suitability for Acorn Woodpeckers in the northern Andes (Fig 2). Response curves indicate how variation among different abiotic and biotic conditions influences the probability of occurrence for Acorn Woodpeckers (S1 Fig).
Fig 2. Species distribution models generated for Acorn Woodpecker in northern Andes.
Habitat suitability is shown as a continuous variable, in which green colors indicate highly suitable habitat, with (A) abiotic variables alone, (B) Quercus presence alone, and (C) abiotic variables + Quercus presence. We converted continuous measures of habitat suitability into binary presence and absence models by setting the threshold to the value that maximizes the parameter Kappa, which is an index of model performance. Binary conversions were done for (C) abiotic variables alone and (D) abiotic variables in addition to presence of Quercus. Excluding Quercus occurrence points generated species distribution models that over-predict the occurrence of Acorn Woodpeckers in South America.
Using abiotic variables alone, our species distribution model predicted suitable habitat for Acorn Woodpeckers in many regions of Colombia, restricted areas of Venezuela, and high elevation areas in Ecuador and northern Peru (Fig 2A and 2C). The variable that contributed most to the presence-absence predictions of the “abiotic only” model was the mean temperature of warmest quarter, although this variable did not rank highly in permutation importance (Table 1).
In contrast, including Colombian oak presence in addition to abiotic variables largely limited the predicted distribution of Acorn Woodpeckers to the northern Andes in Colombia (Fig 2B and 2D). In this model, the presence of Colombian oak contributed the most to the presence-absence predictions of the model, which was consistent when considering permutation importance (Table 2).
We found that including Colombian oak presence as an additional variable generally improved species distribution model performance over species distribution models built from abiotic variables alone (Table 3 and Fig 3). More specifically, while the abiotic-only model had a strong overall performance score, species distribution models that included Colombian oak occurrence points had slightly higher AUC scores, higher Kappa scores, and Pearson’s correlation coefficients (Table 3). Thus, including Colombian oak presences into species distribution modeling improved the ability of models to accurately predict Acorn Woodpecker distribution limits, including its well-documented absence from Ecuador and Peru.
Fig 3. Performance indices of species distribution models constructed from only abotic climatic variables (red), only Quercus occurrence data (yellow), and abiotic climatic variables plus Quercus occurrence data (blue).
Species distribution models that include Quercus occurrence data consistently perform better than models constructed from abiotic variables alone based on (A) AUC scores; (B) area under the curve of Kappa values; (C) overall performance sensu Anderson et al. ; (D) commission (false positive) indices; and (E) omission (false negative) indices. Note that the Pearson’s correlation coefficient is not included in this figure because it does not lend itself to visualization.
We used species distribution models to test the hypothesis that Acorn Woodpecker distribution at its southern range margin is limited by a biotic interaction with Colombian oaks. We found that species distribution models including presence of Colombian oaks as a binary predictor in addition to climatic data had higher AUC, higher kappa, and lower commission rate in northern Ecuador compared to species distribution models built using solely climatic data (Figs 2 and 3), and that the presence of Colombian oaks contributed the most to presence-absence predictions of the niche model built using both climatic data and Colombian oak occurrences (Table 2). These results suggest the Eltonian noise hypothesis may not apply to Acorn Woodpeckers at their southern range margin. Instead, these results are consistent with the hypothesis that the presence of Colombian oaks limits the Acorn Woodpecker at its southern range margin, presumably because the seeds of oaks are an important food resource for Acorn Woodpeckers, as they are in North America . The niche model including Colombian oak presence correctly predicted the presence of Acorn Woodpeckers from regions in the Colombian Andes where Colombian oaks occur and the absence of Acorn Woodpeckers from nearby regions with similar climates that lack oaks (e.g., northern Ecuador).
Additional factors may also influence the location of the Acorn Woodpecker’s southern range margin. For example, dispersal constraints and interspecific competition are two factors that commonly limit distributions of tropical montane birds [51–53]. However, both processes are unlikely to apply to the Acorn Woodpecker. Acorn Woodpeckers are strong dispersers within their North American range , though dispersal has not been measured for tropical populations of Acorn Woodpeckers. Moreover, there are no obvious biogeographic barriers to range expansion at its current range margins (i.e., montane forest habitat is continuous along Andean slopes stretching from Colombia south into Ecuador), and nearly all co-occurring montane birds occur in both Colombia and Ecuador . Thus, it is unlikely that the woodpecker’s current range margins in South America reflect dispersal constraints.
A second possibility is that interspecific competition with a closely related species may limit tropical montane birds to smaller distributions despite the availability of suitable environmental space. We do not have rigorous data to address this possibility for Acorn Woodpeckers, but suggest that interspecific competition is unlikely to be a dominant factor influencing the woodpecker’s distribution. Interspecific competition is hypothesized to influence species’ distributions when species are “replaced” geographically by ecologically similar taxa [11,55]. This scenario may apply to the only montane bird species that shares a similar distributional pattern to the Acorn Woodpecker, the Golden-fronted Redstart (Myioborus ornatus), which is replaced south of its southern range limit in southern Colombia by its allospecies, the morphologically and ecologically similar Spectacled Redstart (Myioborus melanocephalus; ). However, Acorn Woodpeckers are the only species of Melanerpes woodpecker found in the Andes, and there are no ecologically similar woodpeckers south of the Acorn Woodpeckers’ southern range margin . Thus, it appears unlikely that interspecific competition influences Acorn Woodpecker southern range limit in the northern Andes.
We used commission rates to test the hypothesis that species distribution models using only climatic data overpedict Acorn Woodpecker distribution in the Northern Andes. This analysis hinges on the assumption that our Acorn Woodpecker locality dataset accurately maps Acorn Woodpecker distribution. In general, bird distributions in the Northern Andes are well known thanks both to the efforts of field ornithologists and legions of birdwatchers . This is particularly true in northern Ecuador, which is a popular location for birdwatching . Within birds, Acorn Woodpeckers are a particularly conspicuous species due to their social behavior and loud vocalizations, and are thus easily detectable when present. It is therefore likely that sites outside the known range of Acorn Woodpeckers but predicted as suitable for Acorn Woodpeckers in our models (especially in northern Ecuador) represent model overpredictions rather than sites where Acorn Woodpeckers occur but have yet to be detected. This assumption could be further tested by directed field surveys for Acorn Woodpeckers in the Andes of extreme southern Colombia and northern Ecuador.
Our results support the hypothesis that a biotic interaction with an important food resource limits a tropical bird’s large-scale geographic distribution. Tropical species have been hypothesized to inhabit distributions largely shaped by biotic interactions . However, few studies have marshaled quantitative evidence that biotic interactions limit distributions of tropical species [58–60], and the influence of biotic interactions with important food resources on the distributional limits of tropical birds has seldom been previously considered in a species distribution modeling framework (but see ). It is especially intriguing that Acorn Woodpeckers in the Northern Andes are apparently limited by the presence of Colombian oaks despite their generalist diet—tropical populations of Acorn Woodpeckers flycatch for insects, drink sap, consume acorns, fruit, grains and eat an array of insects [30,61]. Thus, although Acorn Woodpeckers in Colombia can occur at sites where oaks are absent , and tropical populations of Acorn Woodpeckers do not store acorns in large granary trees as do North American populations [30,62], oaks appear to be a sufficiently important food resource that they influence Acorn Woodpecker distribution at a broad geographic scale, perhaps because Acorn Woodpeckers regularly drink sap from Colombian oak trees . Further studies should investigate Acorn Woodpecker diet in the Northern Andes. Regardless, our results suggest that a biotic interaction with oaks limits Acorn Woodpecker broad geographic distribution but may not influence local patterns of abundance and distribution within the northern Andes, a reversal of the commonly noted pattern that biotic interactions influence local patterns of abundance and distribution but not broad geographic patterns .
We conclude with a call for further case studies to test whether biotic interactions influence large-scale distributional limits (e.g., [22–25]). Such studies are especially appropriate in birds, where reciprocal transplant experiments to experimentally assess the impact of biotic interactions on distributional limits are nearly impossible, but voluminous distributional data facilitates construction of species distribution models to explore how biotic interactions impact species’ distributional limits. Acorn Woodpeckers may be extreme in their apparent reliance on a particular species of tree as a food resource. However, we suggest that many tropical birds may inhabit distributions that are limited more by the presence of single plant species that are important food resources than by abiotic climatic factors. For example, a number of tropical bird species are specialized on bamboo seeds (Neotropics; : Asian tropics;: Melanesia; ; these birds likely inhabit distributions limited more by bamboo distribution than by climatic factors . Similar scenarios may apply to other specialist foragers, such as Phaethornis hermits (hummingbirds) that forage primarily on Heliconia flowers , or the Giant Conebill (Oreomanes fraseri) and Point-tailed Palmcreeper (Berlepschia rikeri) that are tightly associated with, respectively, high-elevation Polylepis forests in the Andes and Mauritia palm groves in the Amazon . Future research will determine the extent to which biotic interactions with plants that provide important food sources structure distributions of tropical montane birds.
S1 Fig. Response curves for predictor variables (BioClim variables and Colombian oak presence) in species distribution models.
Red lines correspond to the abiotic-only models, yellow lines correspond to the Quercus-only model, and blue lines correspond to the abiotic + Quercus model.
S1 File. Occurrence data of Acorn Woodpeckers (Melanerpes formicivorus) used in this study.
This database includes records from both museum specimens and eBird records. Specimen records have unique Global Biodiversity Information Facility (GBIF) identification numbers (“GBIF_ID”); eBird records also have unique identification numbers (“eBird_ID”). There are three types of eBird records—Stationary Counts, Traveling Counts and Exhaustive Area Counts.
S2 File. Occurrence data of Colombian Oak (Quercus humboldtii) used in this study.
This database was developed and published by Gonzalez et al. (35). In addition to latitude and longitudes for each record, Gonzalez et al (35) provided information on the collector, herbarium and identification number of each record (when available).
We thank J. D. Palacio-Mejia and C. Gonzalez for sharing their database of Colombian Oak localities. We thank P. Title for advice in constructing and evaluating species distribution models. W. Koenig, E. Miller, and two anonymous reviewers provided feedback that substantially improved this manuscript.
Conceived and designed the experiments: BGF NAM. Performed the experiments: BGF NAM. Analyzed the data: NAM. Wrote the paper: BGF NAM.
- 1. Sexton JP, McIntyre PJ, Angert AL, Rice KJ. Evolution and ecology of species range limits. Annu Rev Ecol Evol Syst. 2009;40: 415–436.
- 2. Hawkins BA, Field R, Cornell H V, Currie DJ, Guégan J-F, Kaufman DM, et al. Energy, water, and broad-scale geographic patterns of species richness. Ecology. 2003;84: 3105–3117. pmid:14573816
- 3. Soberón J. Grinnellian and Eltonian niches and geographic distributions of species. Ecol Lett. 2007;10: 1115–1123. pmid:17850335
- 4. Cahill AE, Aiello-Lammens ME, Caitlin Fisher-Reid M, Hua X, Karanewsky CJ, Ryu HY, et al. Causes of warm-edge range limits: systematic review, proximate factors and implications for climate change. J Biogeogr. 2014;41: 429–442.
- 5. Hargreaves AL, Samis KE, Eckert CG. Are species’ range limits simply niche limits writ large? A review of transplant experiments beyond the range. Am Nat. 2014;183: 157–73. pmid:24464192
- 6. Chen I-C, Hill JK, Ohlemüller R, Roy DB, Thomas CD. Rapid range shifts of species associated with high levels of climate warming. Science. 2011;333: 1024–1026. pmid:21852500
- 7. Freeman BG, Class Freeman AM. Rapid upslope shifts in New Guinean birds illustrate strong distributional responses of tropical montane species to global warming. Proc Natl Acad Sci. 2014;111: 4490–4494. pmid:24550460
- 8. Petitpierre B, Kueffer C, Broennimann O, Randin C, Daehler C, Guisan A. Climatic niche shifts are rare among terrestrial plant invaders. Science. 2012;335: 1344–1348. pmid:22422981
- 9. Monahan WB, Tingley MW. Niche tracking and rapid establishment of distributional equilibrium in the house sparrow show potential responsiveness of species to climate change. PLoS One. 2012;7:e42097. pmid:22860062
- 10. Wisz MS, Pottier J, Kissling WD, Pellissier L, Lenoir J, Damgaard CF, et al. The role of biotic interactions in shaping distributions and realised assemblages of species: implications for species distribution modelling. Biol Rev. 2013;88: 15–30. pmid:22686347
- 11. Terborgh J, Weske JS. Role of competition in distribution of Andean birds. Ecology. 1975;56: 562–576.
- 12. Brown JH. Mechanisms of competitive exclusion between 2 species of chipmunks. Ecology. 1971;52: 305–311.
- 13. Hairston NG. The experimental test of an analysis of field distributions: Competition in terrestrial salamanders. Ecology. 1980;61: 817–826.
- 14. Jankowski JE, Robinson SK, Levey DJ. Squeezed at the top: Interspecific aggression may constrain elevational ranges in tropical birds. Ecology. 2010;91: 1877–1884. pmid:20715605
- 15. Martin PR, Martin TE. Ecological and fitness consequences of species coexistence: A removal experiment with wood warblers. Ecology. 2001;82: 189–206.
- 16. Gotelli NJ, Graves GR, Rahbek C. Macroecological signals of species interactions in the Danish avifauna. Proc Natl Acad Sci. 2010;107: 5030–5035. 0914089107 pmid:20194760
- 17. Mayr E, Diamond JM. The birds of northern Melanesia: speciation, ecology & biogeography. Oxford: Oxford University Press; 2001. 492 p.
- 18. Soberón J, Nakamura M. Niches and distributional areas: concepts, methods, and assumptions. Proc Natl Acad Sci. 2009;106: 19644–19650. pmid:19805041
- 19. Godsoe W, Harmon LJ. How do species interactions affect species distribution models? Ecography. 2012;35: 811–820.
- 20. Kissling WD, Dormann CF, Groeneveld J, Hickler T, Kühn I, McInerny GJ, et al. Towards novel approaches to modelling biotic interactions in multispecies assemblages at large spatial extents. J Biogeogr. 2012;39: 2163–2178.
- 21. Araújo MB, Luoto M. The importance of biotic interactions for modelling species distributions under climate change. Glob Ecol Biogeogr. 2007;16: 743–753.
- 22. Leathwick JR, Austin MP. Competitive interactions between tree species in New Zealand’s old-growth indigenous forests. Ecology. 2001;82: 2560–2573.
- 23. Bullock JM, Edwards RJ, Carey PD, Rose RJ. Geographical separation of two Ulex species at three spatial scales: does competition limit species’ ranges? Ecography. 2000;23: 257–271.
- 24. De Araújo CB, Marcondes-Machado LO, Costa GC. The importance of biotic interactions in species distribution models: a test of the Eltonian noise hypothesis using parrots. J Biogeogr. 2014;41: 513–523.
- 25. Heikkinen RK, Luoto M, Virkkala R, Pearson RG, Körber J-H. Biotic interactions improve prediction of boreal bird distributions at macro-scales. Glob Ecol Biogeogr. 2007;16: 754–763.
- 26. Koenig WD, Mumme R. Population Ecology of the Cooperatively Breeding Acorn Woodpecker. Princeton University Press, Princeton, NJ; 1987. p. 435.
- 27. Koenig WD, Haydock J. Oaks, acorns, and the geographical ecology of acorn woodpeckers. J Biogeogr. 1999;26: 159–165.
Ecological studies at different spatial scales tend to ask different questions and use different methods. Macroecologists are traditionally concerned with species distributions, diversity and abundance at large spatial extents and coarse spatial resolutions. They typically analyse global or continental databases using geographic information systems and spatial statistics. By contrast, community ecologists ask questions related to interactions among species and ecosystem function at the smaller extents and finer resolutions of individual research sites. Community ecology has a long history of using local and often long-term experiments, the likes of which are not feasible at larger, macroecological scales. Findings about environmental change from these two sub-disciplines can also seem quite different. For example, studies of the ecological impacts of climate change at macro scales have revealed common and predictable trends, such as pole-ward shifts in species distributions and advancement of spring events (Parmesan & Yohe 2003). However, experimental studies at individual field sites often show more idiosyncratic responses to environmental change, characterised by nonlinearities and feedbacks that can be difficult to predict (Suttle et al. 2007; Tylianakis et al. 2008). There is a need to reconcile such findings at different scales, which also offers the opportunity to deepen our understanding by using local scales to inform large scales and vice versa (Paine 2010). It is therefore crucial to develop new methodological frameworks that are able to link data across scales and build bridges between sub-disciplines.
One area in which studies at different scales have historically been very disparate is in the use of ecological niche theory. Macroecologists tend to be schooled in the niche concepts of Grinnell (1917), who defined a species’ niche on the basis of the abiotic environmental conditions (e.g. temperature, precipitation) required to maintain a population. Community ecologists, on the other hand, tend to use the niche concept of Elton (1929), who emphasised the functional role of a species in a biotic community. In reality, both abiotic environment and biotic community impact species, and recent work has made progress integrating Grinnellian and Eltonian niche concepts to improve knowledge of how ecological systems operate across spatial scales (Soberón 2007; Peterson et al. 2011; Trainor & Schmitz 2014). Of particular interest is the role that biotic interactions play in determining species geographical ranges. For instance, the Eltonian Noise Hypothesis posits that species ranges at large extents and coarse resolutions are determined principally by abiotic factors, with little impact of local biotic interactions (Soberón 2007; Soberón & Nakamura 2009). Developing ways to test this hypothesis and include biotic interactions in models of species distributions will have great practical value across a range of applications, from predicting range shifts under climate change to anticipating the spread of invasive species and zoonotic diseases (Peterson et al. 2011).
Species distribution models (SDMs, also called ecological niche models) are widely used to estimate species ranges, typically at macroecological scales. These models use statistical associations between known occurrences and abiotic variables to project probabilities of occurrence under changed conditions (Guisan & Thuiller 2005). Several different methods have been used for SDMs (Elith et al. 2006), and mechanistic approaches that incorporate physiology and life history traits show potential for improving predictions (Kearney & Porter 2009; Pearson et al. 2014). However, current methods overwhelmingly ignore biotic interactions (Urban 2015), despite wide acknowledgement that this is a potentially severe limitation of the models (Pearson & Dawson 2003; Schmitz et al. 2003; Guisan & Thuiller 2005; Araújo & Luoto 2007; Kissling et al. 2012; Ockendon et al. 2014; Thuiller et al. 2015).
Every species, regardless of where it occurs, is linked to other species through competitive, symbiotic, consumptive or pathogenic interactions. So in all but the simplest systems, species’ fates are woven together as networks of biotic interactions. Results of early global change experiments signalled prominent roles for interactions in moderating species’ responses (Chapin et al. 1995; Harte & Shaw 1995). This was tested explicitly and affirmed in both laboratory (Davis et al. 1998; Petchey et al. 1999) and field experiments (Dormann et al. 2004; Suttle et al. 2007). Empirical evidence for the importance of biotic interactions in the context of environmental changes now spans a wide diversity of habitat types (Post & Pedersen 2008; Martin & Maron 2012; Barton & Ives 2014; Paul & Johnson 2014), and pronounced interaction effects are evident in all manner of species’ responses to recent environmental changes in the natural world (Edwards & Richardson 2004; Winder & Schindler 2004; Munson et al. 2008; Ling et al. 2009).
It has been questioned whether the effects of biotic interactions extend to the macroecological scales of most SDMs, and given the disconnect in scales between empirical research and macroecological modelling, it is important to ask whether biotic interactions, which occur at scales of individuals, add predictive value at much coarser resolutions and larger extents. Several tests from both modern and historical records suggest that they do (Araújo & Luoto 2007; Heikkinen et al. 2007; Gotelli et al. 2010; Hellman et al. 2012; Pateman et al. 2012; Blois et al. 2013; Mason et al. 2014) and there has been notable progress in developing frameworks to account for interactions in macroecological models (Pellissier et al. 2010; Fordham et al. 2013; Trainor et al. 2014).
Most attempts at integrating biotic interactions with SDMs have focused on explaining patterns of species co-occurrence rather than modelling interactions explicitly (Araújo et al. 2011; Morueta-Holme et al. 2016). One study compared predictions of species occurrence from SDMs that included only abiotic variables to those that also included recorded presences of other species in the community as additional explanatory variables (Giannini et al. 2013). Another study combined a graph theoretic, trophic interaction model with an SDM to account for differences in resource accessibility across locations when predicting the geographical distribution of a consumer species (Trainor & Schmitz 2014). Other studies have used known or expected biotic interactions to constrain the output of SDMs (Fernandes et al. 2013; Pellissier et al. 2013). A complementary line of work has used the output of SDMs to infer the structure of food webs – trophic interactions – at different locations and under different climate change scenarios (Albouy et al. 2014; Morales-Castilla et al. 2015). In these approaches, a single species is examined at a time and only it and its interaction partners are considered in SDMs, ignoring all other biotic interactions in a community. But the outcomes of interspecific interactions produce ‘knock-on’ effects along additional linkages, making it difficult to fully articulate or bound biotic interaction networks and therefore the task of modelling biotic interaction networks even more challenging.
Here we show how Bayesian networks (BNs, also called belief networks, Bayesian belief networks or Bayes nets) can be used to model biotic interactions in a manner that allows their direct and indirect effects to propagate among species in a community – following dynamics in nature – and, when combined with SDMs, can result in better estimates of species ranges. BNs have been applied to a wide range of problems across multiple disciplines (Jensen 2001; Neapolitan 2003; Pearl 2009) and are increasingly being used in environmental modelling and management (Uusitalo 2007; Mantyka-Pringle et al. 2016) and to model secondary extinctions in food webs (Eklöf et al. 2013). In our case, BNs offer a way to model large numbers of interacting species simultaneously, with the flexibility to define interactions from macro-scale presence-absence patterning among species, from community-level understanding of positive and negative interactions, or from a combination of the two approaches.
First, we describe our approach to modelling biotic interactions as BNs and show how to integrate this step in an existing SDM workflow. We then demonstrate our approach by applying it to a species pool of grassland plants in the western USA for which community-level interactions are well defined from 18 years of research in a study system where all species co-occur (Suttle et al. 2007; Sullivan et al. 2016). This record of community-level understanding provides an opportunity to test interactions derived from statistical associations of species occurrences at macro scales for community-level ecological realism. We find that many of the derived interactions are ecologically realistic, and that including biotic interactions in SDMs with this approach improves predictions of species occurrence for this example system. Finally, we use future climate scenarios to show how species’ predicted distributions in 2050 are refined when biotic interactions are included in models.
Incorporating Biotic Interactions in Species Distribution Models
BNs are probabilistic graphical models that represent a set of random variables as nodes and conditional dependencies between random variables as directed edges between nodes (Pearl 1988; Koller & Friedman 2009). Random variables in our BNs are probabilities of species occurrence and biotic interactions are modelled as positive and negative conditional dependencies among random variables. In this way, the probability of species occurrence from an SDM at a particular location can be modified up or down given the expected presence of any interaction partners and their combined effect on the focal species. When computing modified occurrence probabilities using BNs, the effects of conditional dependencies are propagated through the entire interaction network, meaning that new predictions reflect all biotic interactions in a community.
Conditional dependencies can represent a variety of biological processes and relationships among species. For example, competition can be described by negative conditional dependencies that decrease occurrence probabilities in line with how many competitors are present at a given location. Nitrogen fixation among plants can be described by positive conditional dependencies that increase occurrence probabilities if nitrogen-fixing species are also present. Conditional dependencies can also reflect the effects of shared and mutually exclusive habitat suitability among species. For example, positive conditional dependencies can capture environmental conditions that are suitable for multiple species – a feature that is not possible with conventional SDMs designed solely for individual species. Although shared habitat suitability can hardly be considered a biotic interaction, there remains value in using conditional dependencies in this way to leverage otherwise ignored information about one species to improve predictions for other species in the community.
When incorporating biotic interactions in SDMs using BNs, we start with the output of a conventional SDM, which we refer to as priors. Priors are then combined with a BN to give posteriors. In the description that follows, we write the prior of species i at location x as and the posterior as . The linking of SDMs and BNs is clear and elegant when the output of an SDM is the probability of occurrence of a species. However, SDM outputs are rarely true probabilities of occurrence; rather, outputs are typically habitat suitability values that are normalised between zero and one (Gotelli & Stanton-Geddes 2015). We can still use the mechanics of BNs in such cases, but must be careful to think in terms of conditional dependencies as modifying habitat suitability values up or down depending on the combined effect of multiple biotic interactions.
Modelling biotic interactions using Bayesian networks
In a BN, the nodes (random variables) and edges (conditional dependencies) form a graphical structure that is a directed acyclic graph (DAG), which is defined by two conditions: (1) edges are directed; and (2) the graph is acyclic (i.e. it is not possible, starting from a given node, to travel along a set of directed edges and return to the starting node; Thulasiraman & Swamy 1992). Although it is common to see directed edges interpreted as causal relationships, this is not always justified (Thulasiraman & Swamy 1992; Spirtes et al. 2000; Dawid 2008).
In our approach, nodes are species and directed edges are biotic interactions: nodes represent the probability of species occurrence at a particular location and directed edges represent the effect on occurrence of one species on another (Fig. 1). Note that we are dealing with Boolean random variables because the outcome at a particular location is either present (1) or absent (0). By way of illustration, consider three species: A, B and C. The presence of C is positively affected by A but negatively affected by B, which can be represented by A → C B; where C is referred to as the child node of parent nodes A and B, which, as A and B have no conditional dependencies themselves, would also be called root nodes. This DAG indicates that the probability of occurrence of C at a particular location is conditionally dependent on whether species A and B are present at the location. There are possibilities: A and B are both absent; A is present but B is absent; A is absent but B is present; and A and B are both present. The conditional probabilities associated with these four possibilities together comprise the state table for species C. The combination of a DAG and associated state tables (also known as conditional probability tables) for each species is a BN. A worked example of solving this BN is in Box 1.
Box 1. A worked example of solving a Bayesian network with three species
Consider the following Bayesian network: A → C B; where the presence of species C is positively affected by A but negatively affected by B.
At location x = 1, assume that the prior probability of occurrence of C is and its conditional probabilities are 0.4 when A and B are both absent; 0.8 when A is present but B is absent; 0 when A is absent but B is present; and 0.4 when A and B are both present, i.e. the positive influence of A balances the negative influence of B. As species A has no conditional dependencies, we can assume it has a single probability of occurrence at the location, say, ; similarly, we can assume B also has a single probability of occurrence, say, . The probability of occurrence of C including the effect of biotic interactions is then . In this example, the prior probabilities of occurrence , and combine with the BN to give new, posterior probabilities of occurrence , and .
If the effects of A and B are switched, i.e. A C ← B, then the posterior for C becomes , which is lower than before because the negative effect on C is now associated with A, which is more likely to be present than B.
As an example of propagating conditional dependencies, now assume that the presence of A positively affects the presence of B, which, in turn, positively affects the presence of C: A → B → C. Assume that the presence of A doubles the probability of occurrence of B, and similarly for the effect of B on C. As above, let the priors at the given location be , and . With this new BN, the posteriors are , and . Although the principle remains the same for communities with a greater number of species and more complex pattern of interactions, solving BNs by hand quickly becomes unwieldy. Fortunately, BNs can be solved numerically very efficiently (Thulasiraman & Swamy 1992). Example R code for solving BNs is in Appendix S1.
In practice, specifying a BN involves five main considerations. (1) Inclusion: Should a biotic interaction be modelled in the BN? (2) Direction: What is the net effect of the biotic interaction on the two interacting species, i.e. should A → B or A ← B be used? (3) Sign: Does the biotic interaction have a positive or negative effect on the presence of the affected species? (4) Strength: How much does the biotic interaction modify the presence of the affected species? (5) Combination: How do multiple biotic interactions combine to modify the presence of the affected species? These five considerations sort into two main tasks: specifying the graphical structure of the BN (considerations 1, 2 and 3) and specifying rules for filling the state table entries defined by the graphical structure (considerations 4 and 5).
Specifying the graphical structure of a Bayesian network
The graphical structure of a BN is the particular arrangement of nodes and edges used to represent biotic interactions. The arrangement must be a DAG for the BN to be solvable; and although feedback cycles are common in ecology, it is often straightforward to organise conditional dependencies in such a way as to avoid cycles in BNs without sacrificing too much biological realism. There are two general approaches for determining which biotic interactions to include in a BN: a computer-aided optimisation process that considers all possible species pairings based on macroecological occurrence data, or direct selection of interactions using data from community ecology.
The optimisation approach involves trialling multiple combinations of biotic interactions in order to find the combination that maximises an objective function. The area under the receiver operator curve (AUC) statistic is a familiar option for an objective function as it is widely used to measure the ability of an SDM to discriminate between known presences and absences (with presence-only data, absences are random locations drawn from the geographical extent under investigation; Phillips et al. 2006). Once an objective function has been chosen, a variety of algorithms can be used for optimisation, ranging from simple Hill-Climbing searches (Skiena 2008) to more involved genetic algorithms (Holland 1975). Note that exhaustive searches are impractical as the number of distinct DAG topologies can be large even when there are few nodes (Robinson 1973).
Alternatively, interactions can be specified based on community-level understanding, although suitable data sets are rare given the effort, time and cost required to collect comprehensive community data. If such data are available, however, then the first step is to identify a set of candidate biotic interactions for inclusion in the BN. A subset can then be selected based on each candidate interaction's perceived importance for the study region of interest. Among this subset, a direction and sign must be associated with each interaction (strictly, sign is not necessary for specifying the graphical structure, but as the effect of a biotic interaction is closely linked with its type then it makes sense for the decision about sign to be made at this stage). Note that decisions about sign and direction are not always straightforward. With a trophic interaction in which B eats A, for example, on the one hand the presence of B has a negative effect on A but on the other hand the presence of A has a positive effect on B; in such cases, to avoid directed cycles (i.e. A ⇌ B) it is necessary to pick the direction and sign combination that is expected to have the greatest impact on species occurrence. After a DAG has been specified, the size of state tables is fixed and the next task is to fill table entries with values.
Specifying rules for filling state table entries
A comprehensive but time-consuming approach to filling state table entries involves determining biologically motivated values for each state table entry individually. We refer to this approach as the FULL model. This model is unlikely to be practically feasible for all but the smallest systems because a species with N conditional dependencies has entries in its state table. One must therefore devise rules for simplifying the process of filling state table entries. Here, we propose three models, which we refer to as AND, OR and SUM (Box 2).
Box 2. Simple models for filling Bayesian network state table entries
Consider a Bayesian network with four species that has a graphical structure with two positive and one negative biotic interactions: A → D, B → D and C D. The state table for D has entries that must be filled individually when using the FULL model. We can simplify the process of filling state table entries by using an AND, OR or SUM model. With the AND model (eqn 1), species A and B must both be present to increase the presence of D. With the OR model (eqn 2), either A or B must be present to increase the presence of D. With the SUM model (eqn 3), the effect of positive conditional dependencies increases with the number of species that are present, which means there are more distinct state table entries compared to AND and OR models. Schematically, we can illustrate the differences among the three models using the state table (also known as a conditional probability table) for species D, with no change in the expected presence of D represented by ‘∼’, small increases and decreases by ‘↑’ and ‘↓’, respectively, and large increases and decreases by ‘↑↑’ and ‘↓↓’:
The main question when deciding whether to simplify or not is ‘Does species identity matter?’ Or put in more accommodating terms: ‘Can we reasonably assume that the effects of biotic interactions are in some ways substitutable with one another?’ If species identity truly matters then the effect of each state must be determined individually, and only the FULL model is justified. But if there is some flexibility then a simpler model can be used to fill state table entries.
The two simplest models are AND and OR. Analogous to Boolean logic, all species must be present for conditional dependencies to have an effect with the AND model, but only one species must be present with the OR model. Formally, we can fill state table entries for the AND and OR models using the following rules. For species i, let the total number of positive conditional dependencies be and the total number of negative conditional dependencies be . Given a particular state table entry, let the number of species present with positive conditional dependencies be and with negative conditional dependencies be . With the AND model, fill the entry as