This document illustrates the use of the package TPD for calculating several aspects of the functional diversity of biological communities. The functions included in TPD are based on the the trait probability densities of populations and species (TPDs), and its extension to communities (TPDc). These methods allow estimating several components of functional diversity and to partition it across scales. This collection of methods constitutes a unified framework that incorporates the underlying probabilistic nature of trait distributions in a uni- or multivariate trait space.
##The probabilistic nature of functional diversity
The collection of species that compose a community display a variety of trait values that conform a multivariate distribution of traits. As early as 1957, Hutchinson proposed his concept of niche, which is the multidimensional hypervolume in which a species can maintain a population. Hutchinsonian hypervolumes are generally conceived as uniform features, and all the points contained within them are assigned an equal probability of persistence of the species. Conseuquently, these hypervolumes can be characterised by their boundaries.
Nevertheless, this is not a totally satisfying conception, because some parts of the niche of a species present better conditions for subsistence than others. In correspondence, niches of species are usually described, conceived and represented as probability density functions. The same representation is generally applied to the trait distributions of species, reflecting the fact that individuals of a given species have some degree of variability in their trait values, and that some trait values (or combinations of trait values in the case of multiple traits) are more likely than others, just as it is more likely to find a 1.70 m tall person than a 2.10 m tall one.
Despite this, most methods analysing the functional diversity of communities are based on the consideration of a single value for each species and trait (the average values of each species for each trait), most species display non-negligible differences in the traits of their individuals. Probability density functions can be used to reflect the unequal probabilities of different trait values. Throughout this vignette, and in the TPD package, we will refer to these functions as Trait Probability Density (TPD).
Consider for instance the dataset trees
, contained in
the package datasets
of R. It contains 31 observations of
the girth, height and volume of black cherry trees. Let us calculate the
TPD of the height for this population of cherry trees:
The plot shows clearly that heights around 75 ft are much more likely than heights higher than 90 ft.
Because TPD’s are probability density functions, they integrate to 1, which means that the area under the plotted curve equals 1. To calculate this integral, we first expand a bit the range of traits of the database –we consider values between 50 and 100 ft; we want to be sure that our interval encompasses all the distribution–, and divide the trait axis into equal-length intervals. Although it is best to use a high number of intervals, in this example, for the sake of visibility, we have divided it into 50 intervals:
grid <- seq(from = 50, to = 100, length.out = 50)
edge_length <- grid[2] - grid[1]
#Each interval has this size (in trait units):
edge_length
#> [1] 1.020408
Now, let us evaluate the TPD function in each point of the grid.
kde_height <- density(trees$Height, from = 50, to = 100, n = 50)
plot(kde_height, main = "TPD height")
points(kde_height$x, kde_height$y, pch = 16, col = 2 , cex=0.6)
abline(v = grid, col = "grey50", lwd = 0.3)
The sum of the areas of all the rectangles with base
edge_length
= 1.02, and height equal to the value of the
TPD function evaluated in each point of the grid is:
Which is roughly 1! The TPD package uses kernel
density estimations procedures to calculate the TPD of populations (when
trait information of species is available at different locations, eg.
different habitats) or species (when trait information of species is
available at different points, and we assume a single TPD for the
species regardless of location). We refer to these functions as TPDs.
Because of the multidimensional nature of traits, the package allows
estimating TPDs for single or multiple traits, with the function
TPDs
, which makes use of the kernel density estimation
functions of the package ks. TPDs
estimates a TPDs for each species or population included in the
data.
##The TPDs
function
To illustrate the use of TPDs
, we are going to use the
database iris
, which contains measurements (in cm) of the
sepal length and width and petal length and width, for 50 flowers from
each of 3 species of Iris (I. setosa, I.
versicolor, and I. virginica)
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
Suposse that we want to calculate the TPDs of each species
considering Sepal.Length
and Sepal.Width
.
Estimating these TPDs’s is very simple:
traits_iris <- iris[, c("Sepal.Length", "Sepal.Width")]
sp_iris <- iris$Species
TPDs_iris <- TPDs(species = sp_iris, traits_iris)
#> -------Calculating densities for One population_Multiple species-----------
After a few seconds of calculations, the object
TPDs_iris
is created. This object, of class TPDsp contains,
along with other information, the grid of cells in which the function
was evaluated, contained in TPDs_iris$data$evaluation_grid
,
and the probability associated to each of those cells, contained in a
list TPDs_iris$TPDs
with one element for each species:
head(TPDs_iris$data$evaluation_grid)
#> Sepal.Length Sepal.Width
#> 1 3.760000 1.64
#> 2 3.783518 1.64
#> 3 3.807035 1.64
#> 4 3.830553 1.64
#> 5 3.854070 1.64
#> 6 3.877588 1.64
nrow(TPDs_iris$data$evaluation_grid)
#> [1] 40000
names(TPDs_iris$TPDs)
#> [1] "setosa" "versicolor" "virginica"
head(sort(TPDs_iris$TPDs$setosa, decreasing = TRUE))
#> 22455 22255 22655 22054 22254 22055
#> 0.0006073417 0.0006067314 0.0006064457 0.0006051767 0.0006050836 0.0006045352
sum(TPDs_iris$TPDs$setosa)
#> [1] 1
There are a lot of cells in the grid, resulting from dividing each
axis into 200 equal intervals. This number could be modified with the
argument n_divisions
of TPDs
, which defaults
to 200 for 2 dimensions, but increasing it results in higher computation
times (because the number of cells increases exponentially with
dimensions), with rather insignificant improvements in accuracy. Most
importantly, TPDs_iris
contains the TPDs of the three
species (in TPDs$TPDs
), which are vectors containing the
probability associated to each point of the evaluation grid:
summary(TPDs_iris$TPDs)
#> Length Class Mode
#> setosa 40000 -none- numeric
#> versicolor 40000 -none- numeric
#> virginica 40000 -none- numeric
# The sum of all the elements in each TPDs is equal to 1:
sapply(TPDs_iris$TPDs, sum)
#> setosa versicolor virginica
#> 1 1 1
The function plotTPD
can be used to visualize the
resulting TPDs’s. Note that it can only handle 1 or 2-dimensional data,
and that in the case of 2 dimensional plots, the packages
gridExtra and ggplot2 must be
installed.
plotTPD(TPD = TPDs_iris, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
Now that we are able to visualize TPD’s, we are going to show an
important aspect of their construction: alpha. The parameter
alpha
determines the proportion of the probability density
function of each species or population that will be included in the
resulting TPD (Blonder et al. 20114; Swanson et al. 2015). This means
that setting alpha = 1
will include the whole function,
whereas alpha = 0.5
will include only a 50%. It is
important to remarks that in the cases in which alpha
is
set to a value smaller than 1, the TPD
function rescales
the probabilities so that they add up to 1. By default,
alpha
is set to 0.95; thus TPDs
is not too
sensitive to outliers. Let us visualize the effect of alpha in the
calculation of a 1-dimensional TPD. First, lets set
alpha = 1
:
traits_iris_d1 <- iris[, c("Sepal.Length")]
sp_iris <- iris$Species
TPDs_iris_d1_a1 <- TPDs(species = sp_iris, traits = traits_iris_d1, alpha=1)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDs_iris_d1_a1, nRowCol = c(1,3))
Now, alpha = 0.95
, the default value:
TPDs_iris_d1_a095 <- TPDs(species = sp_iris, traits = traits_iris_d1, alpha=0.95)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDs_iris_d1_a095, nRowCol = c(1,3))
And this is how alpha = 0.5
looks like:
TPDs_iris_d1_a05 <- TPDs(species = sp_iris, traits = traits_iris_d1, alpha=0.5)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDs_iris_d1_a05, nRowCol = c(1,3))
Indeed, the same logic applies for higher dimensions:
TPDs_iris_a95 <- TPDs(species = sp_iris, traits = traits_iris, alpha=0.95)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDs_iris_a95, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
TPDs_iris_a75 <- TPDs(species = sp_iris, traits = traits_iris, alpha=0.75)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDs_iris_a75, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
(NOTE: Be careful when making 2 dimensional plots. If you try to plot
too many species or populations at the same time, it can take quite a
long time. We recommend to use the argument whichPlot
in
PlotTPD
to control for this.)
As we mentioned above, sometimes there is trait information at the
popolation level. Imagine that we have measured traits on individuals of
each species in two different environments. The argument
samples
of TPDs
allows us to calculate a
separate TPDs for each population of each species. For example, suppose
that the first 25 observation of each species belong to “Site.1” and the
other 25 belong to “Site.2”.
pops_iris <- rep(c(rep("Site.1", 25), rep("Site.2", 25)), 3)
TPDs_pops_iris <- TPDs (species = sp_iris, traits = traits_iris, samples = pops_iris)
#> -------Calculating densities for Multiple populations_Multiple species-----------
plotTPD(TPD = TPDs_pops_iris, nRowCol = c(3,2))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
Because the assignment of individuals to sites in this example is somewhat random, there are not important differences in the TPDs of populations of the same species. However, in real conditions, differences in the functional traits between conspecifics living under different conditions can be substantial, and therefore it is important to take this aspect into account (Carmona et al. 2015).
The TPDs
function requires a high quantity of
information to work: one measurement per individual and trait cosidered,
and all the traits must be measured in all individuals in order to place
them in the multidimensional space. In addition, several individuals are
required to estimate reliable TPDs’s. In many occasions, we do not have
access to such a high-quality information. The TPDsMean
function is designed for those cases. TPDsMean
performs the
same task as TPDs
, but it estimates the probabilities by
calculating a multivariate normal distribution for each species or
population, rather than using kernel density estimations. Hence, the
only information required is the mean and standard deviation of each
trait considered for each species or population. Let us imagine that we
only have the means and standard deviations of the iris species:
names_iris <- unique(iris$Species)
mt1 <- tapply(iris[, "Sepal.Length"], iris$Species, mean)
mt2 <- tapply(iris[, "Sepal.Width"], iris$Species, mean)
means_iris <- matrix(c(mt1, mt2), ncol=2)
st1 <- tapply(iris[, "Sepal.Length"], iris$Species, sd)
st2 <- tapply(iris[, "Sepal.Width"], iris$Species, sd)
sds_iris <- matrix(c(st1, st2), ncol=2)
TPDsMean_iris<- TPDsMean(species = names_iris, means = means_iris, sds = sds_iris)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDsMean_iris, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
TPDc
: from species to communitiesOne interesting feature of TPD’s is that the concept presented above
for species or pupulations can be easily scaled up to express the
functional structure of communities. This is the goal of the function
TPDc
. TPDc
combines the information contained
in the TPDs of the species in the community with information about their
abundances, resulting in the Trait Probabiity Density of the community
(which we called TPDc). The rationales is very simple: for each
cell of the grid in which the functional space is divided, TPDc is the
result of summing the probability of each of the species present in the
community,weighted by their relative abundances.
We can illustrate this concept with the iris
example.
Let us imagine three different communities containing different
abundances of each species:
abundances_comm_iris <- matrix(c(c(0.5, 0.4, 0), #I. virginica absent
c(0.0, 0.9, 0.1 ), #I. versic. dominates; setosa absent
c(0.0, 0.1, 0.9 )), #I. virg. dominates; setosa absent
ncol = 3, byrow = TRUE, dimnames = list(paste0("Comm.",1:3),
unique(iris$Species)))
To calculate the TPDc’s of a group of communities, we first have to
calculate the TPDs of all the species present in each
community. (This step is important: if you do not have the TPDs of all
the species or populations, you cannot estimate the TPDc of the
community, and TPDc
will end with an error).
TPDs_iris <- TPDs(species = sp_iris, traits_iris)
#> -------Calculating densities for One population_Multiple species-----------
TPDc_iris <- TPDc(TPDs = TPDs_iris, sampUnit = abundances_comm_iris )
The resulting object (TPDc_iris
) is an object of class
TPDcomm, containing several pieces of information inherited from
TPDs_iris
, such as the coordinates of the evaluation grid
(in TPDc_iris$data$evaluation_grid
).
What is new in this object is the information contained in its
component $TPDc
. Specifically, we are interested in the
element TPDc_iris$TPDc$TPDc
, which is a list with an
element for each community (length(TPDc_iris$TPDc$TPDc) =
3), each one containing the probability associated to each cell of the
grid in each community. In fact, the TPDc of a community reflects the
probability of observing a given trait value when extracting a random
individual from the community. Because TPDc’s are probability density
functions, they integrate to 1:
(NOTE: the probabilities contained in a TPDc will always sum 1,
regardless of the values of abundance provided to the TPDc
function, which are automatically transformed to relative
abundances)
Similar to what we did with TPDs’s, plotTPD
can be used
to provide a graphical representation of TPDc’s of 1 or 2 dimensions.
Let us examine the three communities:
plotTPD(TPD = TPDc_iris, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
The plots clearly reflect our expectations when we created the communities: the part of the functional space occupied by I. setosa is empty in Comm.2 and Comm.3, where that species is not present. In addition, Comm.2 and Comm.3, which share the same two species (I. versicolor and I. virginica), occupy the same part of the functional space. However, because of the different abundances of their species, the TPDc’s of these communities differ, reflecting the higher abundance of I. versicolor in Comm. 2 and of I. virginica in Comm.1.
TPDs’s and TPDc’s are powerful tools for studying the functional diversity of biological communities. The TPD package includes several functions that estimate different indices reflecting complimentary aspects of functional diversity. With the aim to help users to select for the most adequate method depending on specific questions, we divide these methods into four different categories attending to two independent criteria.
The first criteria is whether the identity or number of species is or not of primary concern. In some cases (1), we are only interested on the aggregated probability distribution, that is, the TPD of the studied entity (which can be species, populations or communities). For example, if we are interested on how does the ammount of functional volume occupied by communities vary across environmental gradients, we only need to know the TPDc’s of communities. In such case, it would have been correct to sample and measure traits from random individuals in each community, constructing the TPDC’s without having identified any species. By contrast (2), sometimes the interest is not only on the TPD of the community, but on how the different species that compose it are organised in the functional space. This includes the study of functional diversity in terms of the functional dissimilarity of species, or the study of functional redundancy.
These two groups can be further divided according to whether (a) we are interested in studying properties only at the within-unit level (units can be species, populations or communities), or (b) we are interested in analysing the differences between these units, for which we have to consider more than one unit simultaneously.
The next table shows the methods that compose the framework, classified according to the above-stated criteria, along with the associated specific functions included in the TPD package:
(1) Interest is only on the aggregated probability distribution | (2) Interest is not only on the aggregated probability distribution | |
---|---|---|
(a) Within units | Functional Richness, Evenness and Divergence
(REND ) |
Functional redundancy (redundancy ) |
Traits sampling (tSamp ) |
Species dissimilarity (dissim ) |
|
(b) Between units | Overlap-based dissimilarity (dissim ) |
Partition of functional diversity with Rao (dissim +
Rao ) |
Dissimilarity decomposition (dissim ) |
## 1. Species are not of primary concern
### 1.a. Indices for single units
The methods included in this category are intended for cases in which we are purely interested in some feature of the functional structure of individual populations, species or communities. This means that the results yielded for each unit of analysis is totally independent from the rest of units. The methods included here, and the associated functions are:
####Functional Richness, Evenness and Divergence
(REND
)
These three indices of functional diversity were originally described by Mason et al. (2005). Because a deeper study of their biological significance is out of the scope of this document, we refer readers to that reference for further details.
Functional richness (FRic) is the amount of functional volume occupied by the analysed unit (species/population/community). Therefore, a species with great variability in traits between their individuals will display a higher FRic value than a species with less variability.
Functional evenness (FEve) reflects the evenness of the abundance of the different trait values in the analysed unit. Imagine two communities with the same functional richness (the same degree of variability in traits between their individuals). Suppose that some traits are much more abundant than others in the first species, whereas all the traits have a similar abundance in the second species: FEve for the second species should be higher than FEve for the first one.
Functional divergence (FDiv) reflects the distribution of abundances within the functional trait volume occupied by the analysed unit. A species in which the most abundant trait values are away from the center of gravity of the functional space –eg. individuals are either very small, or very big– will display high FDiv, whereas a species in which the most abundant trait values are similar –eg. most individuals having intermediate size– will have low FDiv .
The function REND
calculates these three indices using
the TPDs’s and TPDc’s of populations/species and communities,
respectively. The function works with one or more dimensions, thus
expanding the framework presented in Mason et al. (2005) to multiple
traits without modifying its original rationale, based on probabilities.
Let us start with species with one and two dimensions:
TPDs_iris_d1 <- TPDs(species = sp_iris, traits_iris_d1)
#> -------Calculating densities for One population_Multiple species-----------
RED_d1 <- REND(TPDs = TPDs_iris_d1)
#> Calculating FRichness of species
#> Calculating FEvenness of species
#> Calculating FDivergence of species
TPDs_iris_d2 <- TPDs(species = sp_iris, traits_iris)
#> -------Calculating densities for One population_Multiple species-----------
RED_d2 <- REND(TPDs = TPDs_iris_d2)
#> Calculating FRichness of species
#> Calculating FEvenness of species
#> Calculating FDivergence of species
RED_d1
and RED_d2
contain the FRic, FEve
and FDiv values of the three species considering one and 2 traits
simultaneously. We can plot the respective TPDs’s, along with their
FRich values:
plotTPD(TPD = TPDs_iris_d1, nRowCol = c(1, 3), leg.text = paste0(names(RED_d1$species$FRichnes),
"; FRic=", round(RED_d1$species$FRichnes, 3)))
plotTPD(TPD = TPDs_iris_d2, nRowCol = c(1, 3), leg.text = paste0(names(RED_d2$species$FRichnes),
"; FRic=", round(RED_d2$species$FRichnes, 3)))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
The first thing to underscore is that FRic is given in the original
trait units, which means that in the one-dimensional case
(Sepal.Length
), FRic is expressed in cm and in the two
dimensional case (Sepal.Length
and
Sepal.Width
) it is expressed in cm2. I.
virginica is the species that occupies the biggest proportion of
the functional space, both for one and two traits, and I.
setosa the one occupying the smallest proportion.
The plots also reveal that I. setosa occupies a portion of the functional space that is distinct from the ones occupied by the other two species, which are more similar –with I. versicolor occupying a slightly more central position than I. virginica. We are going to take advantage of this observation to show the calculation of FDiv, and its application to communities. Let us create three communities, with different abundances of each species:
abundances_comm_iris_divergence <- matrix(c(c(0.45, 0.1, 0.45), # Different species are more abundant
c(0.05, 0.9, 0.05 ), # Different species are less abundant
c(0.9, 0.05, 0.05 )), # A species with extreme trait values is very abundant
ncol = 3, byrow = TRUE, dimnames = list(paste0("Comm.",1:3),
unique(iris$Species)))
TPDc_iris_divergence_d1 <- TPDc(TPDs = TPDs_iris_d1, sampUnit = abundances_comm_iris_divergence)
RED_comm_d1 <- REND(TPDc = TPDc_iris_divergence_d1)
#> Calculating FRichness of communities
#> Calculating FEvenness of communities
#> Calculating FDivergence of communities
plotTPD(TPD = TPDc_iris_divergence_d1, nRowCol = c(1, 3),
leg.text = paste0(names(RED_comm_d1$communities$FDivergence),
"; FDiv=", round(RED_comm_d1$communities$FDivergence, 3)))
TPDc_iris_divergence_d2 <- TPDc(TPDs = TPDs_iris_d2, sampUnit = abundances_comm_iris_divergence)
RED_comm_d2 <- REND(TPDc = TPDc_iris_divergence_d2)
#> Calculating FRichness of communities
#> Calculating FEvenness of communities
#> Calculating FDivergence of communities
plotTPD(TPD = TPDc_iris_divergence_d2, nRowCol = c(1, 3),
leg.text = paste0(names(RED_comm_d2$communities$FDivergence),
"; FDiv=", round(RED_comm_d2$communities$FDivergence, 3)))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
Comm.3, with a great proportion of their TPDc being away from the center, displays the highest FDiv value, followed by Comm.1. By contrast, Comm.2, where I. versicolor dominates, has the lowest FDiv value. Interestingly, given that the three communities are composed of the same species, they have the same value of FRic, which is independent of species abundances:
RED_comm_d1$communities$FRichness
#> Comm.1 Comm.2 Comm.3
#> 3.752432 3.752432 3.752432
RED_comm_d2$communities$FRichness
#> Comm.1 Comm.2 Comm.3
#> 6.331625 6.331625 6.331625
To show the use of functional evenness, we are going to create an artificial set composed of two groups of four species each. Species in each group have the same mean trait value, but different variances. For the sake of visibility, we are going to present a one-dimensional example, but the same applies for higher dimensions.
set.seed(1)
sp_means <- rep(seq(0, 8, length.out = 4), 2)
sp_sds <- rep(c(0.5, 1), each=4)
sp_trait <- numeric()
n_indiv <- 50
for (i in 1:length(sp_means)) {
sp_trait <- c(sp_trait, rnorm(n_indiv, sp_means[i], sp_sds[i]))
}
sp_names <- rep(paste0("SP.", 1:length(sp_means)), each=n_indiv)
TPDs_sp <- TPDs(species = sp_names, traits = sp_trait, alpha = 0.99)
#> -------Calculating densities for One population_Multiple species-----------
plotTPD(TPD = TPDs_sp, nRowCol = c(2,4))
Now, lets create four communities, two with the species in the first group, and two with the species in the second group, and examine their FEve values:
comm_abundances <- matrix(c( c(0.25, 0.25, 0.25, 0.25, 0, 0, 0, 0), #Comm1, goup 1, equal abundances
c(0, 0, 0, 0, 0.25, 0.25, 0.25, 0.25), #Comm2, goup 2, equal abundances
c(0.85, 0.05, 0.05, 0.05, 0, 0, 0, 0), #Comm3, goup 1, unequal abundances
c(0, 0, 0, 0, 0.85, 0.05, 0.05, 0.05)), #Comm4, goup 2, unequal abundances
ncol = 8, byrow = TRUE, dimnames = list(paste0("Comm.",1:4),
unique(sp_names)))
TPDc_sp <- TPDc(TPDs = TPDs_sp, sampUnit = comm_abundances)
RED_sp <- REND(TPDc_sp)
#> Calculating FRichness of communities
#> Calculating FEvenness of communities
#> Calculating FDivergence of communities
plotTPD(TPDc_sp, nRowCol = c(2,2), leg.text = paste0(names(RED_sp$communities$FEvenness),
"; FEve=", round(RED_sp$communities$FEvenness, 3)))
The difference between Comm.1 and Comm.2 reflecs the effect of the evenness in the distribution of probabilities within the species that compose the communities. The probabilities associated to the trait values are generally more even in the species of Comm2 than in those of Comm1, hence the higher value of FEve (0.818 vs 0.706). Comm.3 and Comm.4 are composed of the same species than Comm1. and Comm.2, but with uneven relative abundances, thus reflecting the effect of the evenness in the abundance of species.
####Traits sampling (tSamp
)
Some applications based on the traits of species require simulating
trait values that could be found in a given population, species or
community. The function tSamp
performs this task. Lets say
that we want to simulate 1,000 trait values from each Iris
species:
Iris_samp
contains three groups of 1,000 simulated trait
values, one for each species. As usual, visualizing the results can be
useful:
plotTPD(TPDs_iris, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
par(mfrow=c(1,3))
for (i in 1:3){
plot(Iris_samp$species_samples[[i]], pch = 16, cex = 0.6,
xlim = range(TPDs_iris$data$evaluation_grid[[1]]),
ylim = range(TPDs_iris$data$evaluation_grid[[2]]))
}
Indeed, tSamp
also works with TPDc’s, simulating trait
values that could have been drawn from a community. We can recover the
object TPDc_iris
that we created above, draw 1,000 trait
values from each community, and plot them:
Iris_comm_samp <- tSamp(TPDc = TPDc_iris, size = 1000)
plotTPD(TPDc_iris, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
par(mfrow=c(1,3))
for (i in 1:3){
plot(Iris_comm_samp$communities_samples[[i]], pch = 16, cex = 0.6,
xlim = range(TPDc_iris$data$evaluation_grid[[1]]),
ylim = range(TPDc_iris$data$evaluation_grid[[2]]))
}
### 1.b. Indices for multiple units
Sometimes one can be interested in the difference in the functional diversity between different units (once again, these can be populations, species or communities), or β diversity, rather than on diversity at the single unit scale or α diversity. Differences between pairs of units can be used to answer a lot of interesting ecological questions. The methods included in this section are useful for that purpose.
#### Overlap-based functional dissimilarity, and its decomposition
(dissim
)
When two species, populations or communities share some part of the
functional volume that they occupy, their respectives TPD’s must overlap
up to some degree. We recommend reading Mouillot et al. (2005), Leps et
al. (2006), Geange et al. (2012) or de Bello et al. (2013) as useful
references to understand this concept and the related technical,
mathematical and ecological details behind the concept of overlap. We
are going to rely once again in the iris
database to
illustrate this method. Remember that the object
TPDs_iris_d1
contains the TPDs of the three species for the
trait Sepal.Length
. We are going to plot the three TPDs
together:
par(mfrow=c(1,1))
plot(TPDs_iris_d1$data$evaluation_grid, TPDs_iris_d1$TPDs[[1]], type="n",
ylim = c(0,max(sapply(TPDs_iris_d1$TPDs, max))))
for (i in 1:length(TPDs_iris_d1$TPDs)){
lines(TPDs_iris_d1$data$evaluation_grid, TPDs_iris_d1$TPDs[[i]], lwd=2, col=i)
}
legend("topright", bty = "n", lwd = 2, col = 1:length(TPDs_iris_d1$TPDs),
legend = names(TPDs_iris_d1$TPDs))
As we already knew, I. versicolor and I. virginica
are functionally rather similar, as shown by the relatively high overlap
between their TPDs’s. On the other hand, I. virginica and
I. setosa are very different, with only a very small proportion
of their TPDs’s overlapping. We can use dissim
to
accurately estimate these differences:
dissim_iris_d1 <- dissim(TPDs_iris_d1)
#> Calculating dissimilarities between 3 populations. It might take some time
dissim_iris_d1$populations$dissimilarity
#> setosa versicolor virginica
#> setosa 0.0000000 0.7235442 0.9471442
#> versicolor 0.7235442 0.0000000 0.4268629
#> virginica 0.9471442 0.4268629 0.0000000
The dissimilarity matrix quantifies our previous intuitions: I.
virginica and I. setosa have a functional dissimilarity of
0.947 (with 1 being the maximum possible value for dissimilarity),
whereas the dissimilarity between I. versicolor and I.
virginica is 0.427. The function works in a similar way for
communities and higher dimensions. We can recreate the object
TPDc_iris
:
abundances_comm_iris <- matrix(c(c(0.9, 0.1, 0), #I. setosa dominates; virg. absent
c(0.0, 0.9, 0.1 ), #I. Versic. dominates; setosa absent
c(0.0, 0.0, 1 )), #Only I. virg. present
ncol = 3, byrow = TRUE, dimnames = list(paste0("Comm.",1:3),
unique(iris$Species)))
TPDc_iris <- TPDc( TPDs = TPDs_iris, sampUnit = abundances_comm_iris)
plotTPD(TPDc_iris, nRowCol = c(1,3))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
dissim_iris_comm <- dissim(TPDc_iris)
#> Calculating dissimilarities between 3 communities. It might take some time
dissim_iris_comm$communities$dissimilarity
#> Comm.1 Comm.2 Comm.3
#> Comm.1 0.0000000 0.8985651 0.9095471
#> Comm.2 0.8985651 0.0000000 0.3750567
#> Comm.3 0.9095471 0.3750567 0.0000000
Comm.1 is very dissimilar to the other two communities, which in turn
are rather similar between them. Note, however, that Comm.2 and Comm.3
have a dissimilarity of 0.375, in spite of occupying exactly the same
region of the functional space. Therefore, the dissimilarity between
these communities is entirely due to differences in the evennes of the
distribution of traits between the communities, rather than to
differences in the parts of the functional space occupied by each
community. The other two elements of the dissim_iris_comm
object reflect this fact: dissimilarity between two TPD’s can be
decomposed into two complementary components. The first component is due
to dissimilarities in the relative abundances of traits shared by the
communities (as in the example between Comm.2 and Comm.3), whereas the
other is due to the sections of the functional space that are
exclusively occupied by one of the communities, but not by the
other.
dissim_iris_comm$communities$P_shared
#> Comm.1 Comm.2 Comm.3
#> Comm.1 NA 0.9508338 0.6009773
#> Comm.2 0.9508338 NA 1.0000000
#> Comm.3 0.6009773 1.0000000 NA
dissim_iris_comm$communities$P_non_shared
#> Comm.1 Comm.2 Comm.3
#> Comm.1 NA 0.04916617 0.3990227
#> Comm.2 0.04916617 NA 0.0000000
#> Comm.3 0.39902267 0.00000000 NA
As we expected, $P_shared
accounts for all the
dissimilarity between Comm.2 and Comm.3. This is because all the species
present in Comm.3 are present in Comm.2. Therefore the functional volume
occupied by Comm.3 is completely contained (nested) within the
functional volume occupied by Comm.2. Because the functional volume
occupied by Comm.3 is a subset of the functional volume of Comm.2,
$P_shared
is equal to 1. Similarly, the dissimilarity
between Comm.1 and Comm.2 is mostly accounted by $P_shared
,
because the most abundant species in Comm.2 is also present in Comm.1,
leading to a high level of nestedness. By contrast, the only species
present in Comm.3 is not in Comm.1, resulting in a lower (but still
substantial) value of $P_shared
.
## 2. Species are of primary concern
### 2.a. Indices for single units
In this group we include two methods. The first one is functional redundancy. The second one is the calculation of the dissimilarity between the species that compose a community. These dissimilarities can be used afterwards in indices to calculate alpha functional diversity. The calculation of dissimilarities between species has been explained above, so there is no need for further details. Now, we will focus on redundancy.
redundancy
)We consider that two species are redundant in a community when they occupy the same functional volume, which means that they have the same traits. Accordingly, if there are two species that are totally redundant in a community, the loss of one of them will not reduce the functional volume occupied by the community (which is indicated by functional richness. In reality, communities are usually composed of more than two species, and therefore the concept of redundancy involves the simultaneous consideration of the funcional volumes occupied by each species.
The method that we implement in the function
(redundancy
), evaluates the number of species that occupy
each cell of the grid defining the functional space occupied by the
community. Afterwards, it performs a weighted mean for the whole
community, using the probability associated to each cell (which are
contained in TPDc) as the weighting factor. This gives the average
number of species that are present in each cell; if we substract 1 to
this number, we get the average number of redundant species in the
community as a whole, that is, the average number of species that could
be lost without in the community without reducing its functional
volume.
Now that we are familiarized with the iris
dataset
(remember: I. setosa is functionally different from I.
versicolor and I. virginica, which are similar), we are
going to use it to explore the use of redundancy
. Let us
build a set of communities with different abundances of each
species:
TPDs_iris <- TPDs(species = sp_iris, traits_iris)
#> -------Calculating densities for One population_Multiple species-----------
abundances_comm_iris <- matrix(c(c(0.9, 0.05, 0.05), #I. setosa dominates
c(0.0, 0.5, 0.5 ), #I. setosa absent
c(0.33, 0.33, 0.33), #Equal abundances
c(0.1, 0.45, 0.45), #Versicolor and virginica dominate
c(0.5, 0, 0.5)), #versicolor absent
ncol = 3, byrow = TRUE, dimnames = list(paste0("Comm.",1:5),
unique(iris$Species)))
TPDc_iris <- TPDc( TPDs = TPDs_iris, sampUnit = abundances_comm_iris)
Communities dominated by I. setosa should have a lower redundancy (because that species occupies a portion of the functional space that is not occupied by the other two), whereas communities in which I. setosa is absent should present a higher degree of redundancy. Finally, communities in which I. setosa is present, but in low abundances, should display intermediate values of redundancy.
FRed_iris <- redundancy(TPDc = TPDc_iris)
plotTPD(TPD = TPDc_iris, leg.text = paste(names(FRed_iris$redundancy),
round(FRed_iris$redundancy, 3), sep="; FRed="))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
redundancy
works as we expected; Comm.5, composed of two
totally distinct species has no functional redundancy at all. Comm.1,
dominated by a functionally unique species, and with low abundance of
two similar species, has a very low value of redundancy (0.086. On the
opposite extreme, Comm.2, composed by two similar species, has a rather
high value of redundancy. Note that redundancy, as defined here, is
bounded between 0 and the number of species that compose the community
(indicated in FRed_iris$richness
) minus 1.
### 2.b. Indices for multiple units
####Partition of functional diversity with Rao (dissim
+
Rao
)
When the interest of the study lies on the spatial or temporal
structure of functional diversity, it is often useful to partition the
total diversity of all the considered communities (γ diversity) into its
within-communities (α) and
between-communities (β)
components. The Rao index of functional diversity is a very good
alternative to perform this partition (see de Bello et al. 2010 for
technical details regarding the definition of Rao as well as
mathematical guidelines for correctly performing this partition). Here,
it suffices to say that the Rao index requires a characterization of the
functional dissimilarities between all the species/populations in the
considered group of communities. As we have seen before, these
dissimilarities can be easily estimated using the dissim
function. This function returns an object of class
OverlapDiss
, which should be fed to the function
Rao
, along with the TPDc’s of the communities (calculated
with the function TPDc
). We are going to rely once again in
the iris
dataset to illustrate the use of Rao
.
Imagine a group of communities situated along an environmental gradient.
Suppose that I. setosa dominates in the first part of the
gradient, where individuals of I. virginica cannot survive, and
the opposite happens in the other extreme of the gradient. I.
versicolor, which have intermediate traits can live everywhere, but
its abundance maximizes in the intermediate part of the gradient. Let us
create 5 communities along that gradient:
TPDs_iris <- TPDs(species = sp_iris, traits_iris, alpha=0.95)
#> -------Calculating densities for One population_Multiple species-----------
abundances_comm_iris <- matrix(c(c(0.9, 0.1, 0), # setosa dominates
c(0.4, 0.5, 0.1 ),
c(0.15, 0.7, 0.15), #versicolor dominates
c(0.1, 0.5, 0.4),
c(0, 0.1, 0.9)), #virginica dominates
ncol = 3, byrow = TRUE, dimnames = list(paste0("Comm.",1:5),
unique(iris$Species)))
TPDc_iris <- TPDc( TPDs = TPDs_iris, sampUnit = abundances_comm_iris)
plotTPD(TPD = TPDc_iris, nRowCol = c(1,5))
#> Be patient, in the 2-dimensional case, plots can take some time.
#> If you think it takes too long, consider reducing the number of plots using 'whichPlot'
#> Warning: Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
#> Removed 796 rows containing missing values or values outside the scale range
#> (`geom_raster()`).
dissim_iris_sp <- dissim(TPDs_iris)
#> Calculating dissimilarities between 3 populations. It might take some time
Once that TPDc_iris
and dissim_iris_sp
are
defined, we can proceed to use Rao
:
Rao_iris
contains a few elements with interesting
information about our communities. For instance,
Rao_iris$alpha_eqv
contains the alpha diversity of each
community, expressed in equivalent numbers. Note that for a correct
partition of gamma diversity into its beta and alpha components, these
alpha values do not correspond exactly with the ones that would have
been calculated if each community would have been considered as an
isolated unit (see de Bello et al. 2010 for details on this). If we want
to have alpha expressed that way, we need to set the argument ’regional
= FALSE`:
Here (Rao_iris_noReg$alpha_eqv
), we see that Comm.2 is
the most diverse, with more than 2 equivalent species, whereas Comm.5,
with only two very-similar species has the lowest alpha diversity.
The studied communities have a gamma diversity of
Rao_iris$gamma_eqv=
2.108 equivalent species, and the beta
diversity is Rao_iris$beta_prop=
32.9%.
Rao_iris$pairwise$beta_prop
is where we would have to look
if we were interested in the differences between pairs of
communities:
Rao_iris$pairwise$beta_prop
#> Comm.1 Comm.2 Comm.3 Comm.4 Comm.5
#> Comm.1 NA 0.3585872 0.70921893 0.77714415 0.9279824
#> Comm.2 0.3585872 NA 0.10245613 0.15889514 0.4181624
#> Comm.3 0.7092189 0.1024561 NA 0.03561324 0.2655448
#> Comm.4 0.7771441 0.1588951 0.03561324 NA 0.1183461
#> Comm.5 0.9279824 0.4181624 0.26554478 0.11834614 NA
This confirms that communities that are further away in the gradient from each other (e.g. Comm.1 and Comm.5) are more functionally dissimilar than communities that are in more similar environmental conditions (eg. Comm 4 and Comm.5).
This vignette has been built using R version 4.4.2 (2024-10-31) with
the packages TPD
1.1.0, ggplot2
3.5.1,
gridExtra
2.3, and ks
1.14.3 Version: [1]
“2024-11-16 04:05:26 UTC”