"Exploring the inference potential of species co-occurrence data - From sampling scales to drivers of community assembly"Darcy, SeanMost natural microbial ecosystems consist of spatially organised environments that can be heterogeneous even at the micro-scale. At this scale specific assemblages of species establish. Resulting community composition can be influenced by interactions – be it by enhancing species abundance through facilitation or by suppressing abundance through competition. Additionally, species spatial abundance distributions across environments follow the distribution of resources they show specific adaptations to. Reconstructing underlying mechanisms behind species occurrence and abundance has been a long standing goal in ecology. In this context, species-species associations (co-occurrences; derived from pairwise abundance correlations) are thought to hold valuable information on deterministic drivers. To what extent they can be reconstructed from such data is contentious, as a variety of such processes interplay and can be further affected by meta-community dynamics (such as dispersal), likely confounding possible inferences. For the field of microbial ecology, a less discussed issue is the fact that community data is often sequenced from material taken at relatively large volumes that contain many micro-habitats. Thereby potentially distinct communities are pooled into a single measurement likely further reducing inference potential. We developed a theoretical model to investigate how species associations reflect either underlying interactions or similarity in resource preference. We assert how inference potential changes with dispersal, different spatial distributions of resources and how all of this relates to volume for pooled samples. To achieve this, we adapted the well known generalized Lotka-Volterra equations to include an environmental filtering dynamic. 
Communities were simulated in microhabitats (nodes in a 3D network with differing levels of connectivity allowing dispersal) embedded in varying spatial distributions of resources (implemented through random Gaussian fields). To replicate large sample scales, we pooled habitats within different volumes, summing species and resource abundances. Pairwise species associations were determined via abundance correlations across samples. Our results show that associations generally reflected both the strongest species interactions and similarities in resource preference. Positive associations indicated species with similar environmental preferences only when resources did not show strictly inverse distributions. Habitat connectivity and sample volume homogenised community abundances; both increased the number of positive associations and decreased inference potential. Negative associations also increased with volume but were lost with increased dispersal. The strength of these relationships was highly dependent on the distribution of resources. Our model allows the simulation of spatially explicit community data. It includes central drivers of community assembly and can be applied to varying habitat configurations and resource distributions. We show how both interactions and similarity in environmental preference are imprinted on co-occurrence data and therefore advise the inclusion of context data to interpret any specific associations. For example, two species correlated with the same resources often showed overall preference similarity. We generally advise minimising the volume of samples when investigating community data from heterogeneous microbial ecosystems.
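
The pipeline described above (generalized Lotka-Volterra dynamics with environmental filtering, pooling of habitats, and correlation-based associations) can be illustrated with a minimal sketch. It is not the model from the thesis. It assumes, for illustration only, that environmental filtering enters as a growth rate equal to the product of a species' resource preferences and the local resource levels; it omits dispersal and the 3D network entirely; and it draws resources independently and uniformly instead of from random Gaussian fields. All function names, parameter values and the pooling scheme (consecutive habitats summed in fixed-size groups) are hypothetical.

```python
import numpy as np


def simulate_habitats(growth, interactions, x0, dt=0.02, steps=2000):
    """Euler-integrate gLV dynamics independently in every habitat.

    growth: (n_habitats, n_species) habitat-specific growth rates.
    interactions: (n_species, n_species); entry [i, j] is the effect of
    species j on species i. Dispersal between habitats is NOT modelled.
    """
    x = np.tile(x0, (growth.shape[0], 1)).astype(float)
    for _ in range(steps):
        # dx_i/dt = x_i * (r_i + sum_j A_ij * x_j), evaluated per habitat
        x += dt * x * (growth + x @ interactions.T)
        np.clip(x, 0.0, None, out=x)  # abundances cannot be negative
    return x


def pool_samples(abundances, habitats_per_sample):
    """Sum abundances over consecutive groups of habitats (one group = one sample)."""
    n_habitats, n_species = abundances.shape
    n_samples = n_habitats // habitats_per_sample
    trimmed = abundances[: n_samples * habitats_per_sample]
    return trimmed.reshape(n_samples, habitats_per_sample, n_species).sum(axis=1)


def association_matrix(samples):
    """Pairwise Pearson correlations of species abundances across samples."""
    return np.corrcoef(samples, rowvar=False)


rng = np.random.default_rng(0)
n_species, n_resources, n_habitats = 4, 2, 200

# Assumed filtering rule: growth rate = preference-weighted local resources.
preferences = rng.random((n_species, n_resources))
resources = rng.random((n_habitats, n_resources))
growth = resources @ preferences.T

# Weak random interactions with self-limitation on the diagonal.
interactions = rng.normal(0.0, 0.1, (n_species, n_species))
np.fill_diagonal(interactions, -1.0)

abundances = simulate_habitats(growth, interactions, np.full(n_species, 0.1))

for size in (1, 10):
    assoc = association_matrix(pool_samples(abundances, size))
    print(f"habitats per sample = {size}")
    print(np.round(assoc, 2))
```

Comparing the printed matrices for different values of `size` gives a rough feel for how pooling changes the associations, though conclusions about inference potential would require the spatial structure and dispersal that this sketch leaves out.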