Title: | Taxonomic Uncertainty on Multivariate Analyses of Ecological Data |
---|---|
Description: | Permutational method to incorporate taxonomic uncertainty and some functions to assess its effects on parameters of some widely used multivariate methods in ecology, as explained in Cayuela et al. (2011) <doi:10.1111/j.1600-0587.2009.05899.x>. |
Authors: | Luis Cayuela and Marcelino de la Cruz |
Maintainer: | Luis Cayuela |
License: | GPL (>= 2) |
Version: | 1.1-2 |
Built: | 2025-01-25 03:18:10 UTC |
Source: | https://github.com/cran/betaper |
This function assesses the effects of taxonomic uncertainty on the R2 coefficients and the p-values of a permutational multivariate analysis of variance (PERMANOVA) using distance matrices.
adonis_pertables(formula = X ~ ., data, permutations = 5, method = "bray")

## S3 method for class 'adonis_pertables'
plot(x, ...)
formula |
A typical model formula such as 'Y ~ A + B*C', but where 'Y' is a pertables object (i.e. a list of simulated community data matrices obtained with pertables). |
data |
The data frame from which 'A', 'B', and 'C' would be drawn. |
permutations |
Number of replicate permutations used for the hypothesis tests (F tests) for each of the simulated community data matrices obtained with pertables. |
method |
The name of any method used in 'vegdist' to calculate pairwise distances. |
x |
adonis_pertables object to plot. |
... |
Additional graphical parameters passed to plot. |
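The R2 and permutation p-value that this function tracks can be illustrated with a minimal base-R sketch of a distance-based analysis of variance for a single table. This is not the package's code: the data are random toy values, Euclidean distance stands in for the default "bray" method, and only the overall model (one hypothetical predictor x) is tested.

```r
set.seed(3)
n <- 9
Y <- matrix(rnorm(n * 4), nrow = n)   # toy community table (hypothetical values)
x <- rnorm(n)                         # toy predictor (hypothetical)

D <- as.matrix(dist(Y))               # Euclidean distances (assumption; not "bray")
A <- -0.5 * D^2
J <- diag(n) - matrix(1 / n, n, n)    # centring matrix
G <- J %*% A %*% J                    # Gower-centred matrix

X <- cbind(1, x)                      # design matrix with intercept
H <- X %*% solve(crossprod(X)) %*% t(X)

ss_total <- sum(diag(G))
ss_model <- sum(diag(H %*% G %*% H))
r2 <- ss_model / ss_total             # proportion of variation explained

# Permutation test: shuffle the sites of the distance matrix, recompute R2
perm_r2 <- replicate(99, {
  idx <- sample(n)
  sum(diag(H %*% G[idx, idx] %*% H)) / ss_total
})
p_value <- (sum(perm_r2 >= r2) + 1) / (99 + 1)
```

Repeating this computation over every simulated table in a pertables object would give one R2 and one p-value per table, which is the kind of distribution the function summarises.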
adonis_pertables returns an object of class adonis_pertables, basically a list with the following components:
raw |
An object of class |
simulation |
A list with the results of the simulation: |
Objects of class adonis_pertables have print and plot S3 methods for simple access to the results. See the examples.
Luis Cayuela and Marcelino de la Cruz
Cayuela, L., De la Cruz, M. and Ruokolainen, K. (2011). A method to incorporate the effect of taxonomic uncertainty on multivariate analyses of ecological data. Ecography, 34: 94-102. http://dx.doi.org/10.1111/j.1600-0587.2009.05899.x.
pertables, adonis
data(Amazonia)
data(soils)

# Define a new index that includes the terms used in the Amazonia dataset to
# define undetermined taxa at different taxonomic levels
index.Amazon <- c(paste("sp.", rep(1:20), sep=""), "Indet.", "indet.")

# Generate a pertables object (i.e. a list of biological data tables simulated
# from taxonomic uncertainty)
## Not run:
Amazonia100 <- pertables(Amazonia, index=index.Amazon, nsim=100)

# Assess the effects of taxonomic uncertainty on a PERMANOVA (i.e., adonis) test:
Amazonia.adonis <- adonis_pertables(Amazonia100 ~ Ca + K + Mg + Na, data=soils)
Amazonia.adonis
plot(Amazonia.adonis)
## End(Not run)

# Fast example for Rcheck
Amazonia4.p2 <- pertables.p2(Amazonia[1:50,], index=index.Amazon, nsim=4, ncl=2, iseed=4)
set.seed(2)
Amazonia.adonis <- adonis_pertables(Amazonia4.p2 ~ Ca + K + Mg + Na, data=soils)
Amazonia.adonis
plot(Amazonia.adonis)
The Amazonia data frame has tree counts in nine 0.16-hectare inventory plots in Western Amazonia. soils contains data on soil cations at each location.
data(Amazonia)
data(soils)
Amazonia is a data frame with 1188 observations (species) and 12 columns (taxonomic description and sites). The first three columns give the family, genus and species Latin names. Columns 4 to 12 hold tree abundance data for the nine inventory plots.
soils is a data frame with 9 observations (inventory plots) and 4 columns (variables). Soil variables (Ca, K, Mg, Na) are given in cmol/kg.
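The layout described above (three taxonomic columns followed by one abundance column per plot) can be illustrated with a small toy table. The values and the column names plot1 and plot2 are made up for illustration and are not taken from the Amazonia data.

```r
# Toy table in the same layout: taxonomy in columns 1-3, one column per plot
toy <- data.frame(
  family  = c("Myrtaceae", "Myrtaceae", "Lauraceae"),
  genus   = c("Eugenia", "Eugenia", "Ocotea"),
  species = c("nesiotica", "sp1", "Indet."),
  plot1   = c(2, 0, 1),
  plot2   = c(0, 3, 1)
)

taxonomy <- toy[, 1:3]                    # family, genus, species
abund <- t(as.matrix(toy[, -(1:3)]))      # sites in rows, taxa in columns
colnames(abund) <- paste(toy$genus, toy$species)

rowSums(abund)                            # number of stems recorded per plot
```

Transposing gives the sites-by-species orientation that most community-ecology functions expect, while the taxonomic columns stay available for deciding which taxa are undetermined.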
Data from Western Amazonia includes tree inventories at nine lowland sites (approximately 100-150 m above sea level) near Iquitos, Peru. The sites were selected to represent regional variations in geology and were distributed along a soil nutrient gradient ranging from poor loamy soils to richer clayey soils. Each inventory consisted of 20 x 20 m plots (0.16 ha total area) distributed along 1.3-km transects. At each site, K. Ruokolainen and colleagues identified to species or morphospecies all woody, free-standing stems of > 2.5 cm dbh. The full inventories sampled 3980 individuals from 1188 species or morphospecies.
Higgins, M.A. & Ruokolainen, K. 2004. Rapid tropical forest inventory: a comparison of techniques based on inventory data from western Amazonia. Conservation Biology 18(3): 799-811.
Ruokolainen, K., Tuomisto, H., Macia, M.J., Higgins, M.A. & Yli-Halla, M. 2007. Are floristic and edaphic patterns in Amazonian rain forests congruent for trees, pteridophytes and Melastomataceae? Journal of Tropical Ecology 23: 13-25.
data(Amazonia)
data(soils)
This function assesses the effects of taxonomic uncertainty on two widely used parameters of a [Partial] Constrained Correspondence Analysis, i.e. the 'percentage explained variance' (sometimes referred to as R-squared) and the 'pseudo-F'.
cca_pertables(fml, data, scale = FALSE, ...)

## S3 method for class 'cca_pertables'
plot(x, pch = 18, ...)
fml |
Model formula, where the left hand side gives a pertables object (i.e. a list of simulated community data matrices obtained with pertables). |
data |
Data frame containing the variables on the right hand side of the model formula. |
scale |
Scale species to unit variance (like correlations). |
x |
cca_pertables object to plot. |
pch |
Plotting 'character', i.e., symbol to use in the CCA plot. See |
... |
Additional graphical parameters passed to plot. |
This function is a wrapper that submits a pertables object to the cca function of the vegan package. See the documentation of cca for details about the use of formula and Condition.
cca_pertables returns an object of class cca_pertables, basically a list with the following components:
raw |
An object of class |
simulation |
A list with the results of the simulation: |
Objects of class cca_pertables have print and plot S3 methods for simple access to the results. See the examples.
Luis Cayuela and Marcelino de la Cruz
Cayuela, L., De la Cruz, M. and Ruokolainen, K. (2011). A method to incorporate the effect of taxonomic uncertainty on multivariate analyses of ecological data. Ecography, 34: 94-102. http://dx.doi.org/10.1111/j.1600-0587.2009.05899.x.
pertables, cca
data(Amazonia)
data(soils)

# Define a new index that includes the terms used in the Amazonia dataset to
# define undetermined taxa at different taxonomic levels
index.Amazon <- c(paste("sp.", rep(1:20), sep=""), "Indet.", "indet.")

# Generate a pertables object (i.e. a list of biological data tables simulated
# from taxonomic uncertainty)
## Not run:
Amazonia100 <- pertables(Amazonia, index=index.Amazon, nsim=100)

# Assess the effects of taxonomic uncertainty on a CCA analysis of biological
# data explained by all the environmental variables of the soil data:
Amazonia.cca <- cca_pertables(Amazonia100 ~., data=soils)
Amazonia.cca
plot(Amazonia.cca)
## End(Not run)

# Fast example for Rcheck
Amazonia4.p2 <- pertables.p2(Amazonia[1:50,], index=index.Amazon, nsim=4, ncl=2, iseed=4)
set.seed(2)
Amazonia.cca <- cca_pertables(Amazonia4.p2 ~., data=soils)
Amazonia.cca
plot(Amazonia.cca)
HCP has tree abundance data from 16 forest fragments located in the Highlands of Chiapas, southern Mexico. HCP.coords contains the geographical UTM coordinates of the centroids of the 16 forest fragments.
data(HCP)
data(HCP.coords)
HCP is a data frame with 231 observations and 19 variables. The first three columns contain the family, genus and species Latin names. Columns 4 to 19 hold tree abundance data for the 16 forest fragments. HCP.coords is a data frame with two columns and 16 rows.
Cayuela, L., Golicher, D.J., Rey Benayas, J.M., Gonzalez-Espinosa, M. & Ramirez-Marcial, N. 2006. Fragmentation, disturbance and tree diversity conservation in tropical montane forests. Journal of Applied Ecology 43: 1172-1181.
data(HCP)
data(HCP.coords)
This function assesses the effects of taxonomic uncertainty on the correlation coefficient and the p-values of a Mantel test.
mantel_pertables(pertab, env, dist.method = "bray", binary = FALSE,
                 cor.method = "pearson", permutations = 100)

## S3 method for class 'mantel_pertables'
plot(x, xlab = "Environmental distance",
     ylab = "Sorensen's similarity index", pch = 19, ...)
pertab |
A pertables object (i.e. a list of simulated community data matrices obtained with pertables). |
env |
Data frame with the environmental variables. |
dist.method |
Method to compute the dissimilarity matrices from the biological and environmental data tables. One of the methods described in function |
binary |
Value for the argument |
cor.method |
Correlation method, as accepted by |
permutations |
Number of permutations in assessing significance. |
x |
mantel_pertables object to plot. |
xlab |
Label for the x-axis. |
ylab |
Label for the y-axis. |
pch |
Plotting 'character', i.e., symbol to use in the distance decay plot. See |
... |
Additional graphical parameters passed to plot. |
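For readers unfamiliar with the statistic involved, the following base-R sketch runs a single Mantel test on one table: a Pearson correlation between two dissimilarity matrices, with significance assessed by permuting the sites of one of them. It is an illustration under stated assumptions, not the package's implementation: the data are random toy values, bray() is a hypothetical helper written here for Bray-Curtis dissimilarity, and Ca is a made-up environmental variable.

```r
set.seed(42)
n_sites <- 9
comm <- matrix(rpois(n_sites * 6, lambda = 3), nrow = n_sites)  # toy abundances
env  <- data.frame(Ca = runif(n_sites))                         # toy variable

# Hypothetical helper: Bray-Curtis dissimilarity between the rows of a matrix
bray <- function(m) {
  m <- as.matrix(m)
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    d[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
  }
  as.dist(d)
}

d_bio <- bray(comm)
d_env <- bray(env)

# Mantel statistic: correlation between the two sets of pairwise dissimilarities
mantel_r <- cor(as.vector(d_bio), as.vector(d_env), method = "pearson")

# Permutation test: shuffle the sites of the biological matrix
n_perm <- 100
perm_r <- replicate(n_perm, {
  idx <- sample(n_sites)
  cor(as.vector(as.dist(as.matrix(d_bio)[idx, idx])), as.vector(d_env))
})
p_value <- (sum(perm_r >= mantel_r) + 1) / (n_perm + 1)
```

Applying the same test to each simulated table would yield one correlation and one p-value per simulation, which can then be compared with the values obtained from the raw data.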
mantel_pertables returns an object of class mantel_pertables, basically a list with the following components:
mantel |
A list with two components: |
simulation |
A list with the results of the simulation: |
Objects of class mantel_pertables have print and plot S3 methods for simple access to the results. See the examples.
Luis Cayuela and Marcelino de la Cruz
Cayuela, L., De la Cruz, M. and Ruokolainen, K. (2011). A method to incorporate the effect of taxonomic uncertainty on multivariate analyses of ecological data. Ecography, 34: 94-102. http://dx.doi.org/10.1111/j.1600-0587.2009.05899.x.
pertables, mantel
data(Amazonia)
data(soils)

# Define a new index that includes the terms used in the Amazonia dataset to
# define undetermined taxa at different taxonomic levels
index.Amazon <- c(paste("sp.", rep(1:20), sep=""), "Indet.", "indet.")

## Not run:
# Generate a pertables object (i.e. a list of biological data tables simulated
# from taxonomic uncertainty)
Amazonia100 <- pertables(Amazonia, index=index.Amazon, nsim=100)

# Assess the effects of taxonomic uncertainty on a Mantel test of biological
# dissimilarity correlated to soil dissimilarity among sites:
Amazonia.mantel <- mantel_pertables(pertab=Amazonia100, env=soils, dist.method = "bray")
Amazonia.mantel
plot(Amazonia.mantel)
## End(Not run)

# Fast example for Rcheck
Amazonia4.p2 <- pertables.p2(Amazonia[1:50,], index=index.Amazon, nsim=4, ncl=2, iseed=4)
set.seed(2)
Amazonia.mantel <- mantel_pertables(pertab=Amazonia4.p2, env=soils, dist.method = "bray")
Amazonia.mantel
plot(Amazonia.mantel)
This function implements a permutational method to incorporate taxonomic uncertainty into the multivariate analyses typically used for ecological data. The procedure is based on iterative randomizations that randomly re-assign unidentified species in each site to any of the other species found in the remaining sites.
pertables(data, index = NULL, nsim = 100)

pertables.p2(data, index = NULL, nsim = 100, ncl = 2, iseed = NULL)
data |
Community data matrix. The first three columns are factors giving the family, genus and species names. The remaining columns are numeric vectors indicating species abundances at each site. |
index |
List of additional terms used to determine the level at which species have been identified. Default values include 'Indet', 'indet', 'sp', 'sp1' to 'sp100', 'sp 1' to 'sp 100', '', and ' '. |
nsim |
Number of simulations of species' identities, i.e., number of data tables to simulate. |
ncl |
Number of clusters for parallel simulation. |
iseed |
An integer to be supplied to clusterSetRNGStream, or NULL not to set reproducible seeds. |
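As an illustration of the index argument, the sketch below builds the same kind of label vector used in the examples of this manual and flags which entries of a made-up vector of species names would count as undetermined. Exact matching with %in% is an assumption made here for illustration; how the package matches labels internally is not shown in this documentation.

```r
# Labels that mark undetermined taxa, built as in the package examples
index.Amazon <- c(paste0("sp.", 1:20), "Indet.", "indet.")

# Hypothetical species-column entries
species_names <- c("nesiotica", "sp.3", "Indet.", "principium")

# Assumption: a taxon is undetermined when its label matches an index entry exactly
undetermined <- species_names %in% index.Amazon
undetermined
```

A dataset that uses other conventions (for example "sp. nov." or "cf.") would need those labels added to the vector in the same way.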
The procedure is implemented in two sequential steps:
Step 1. Morphospecies identified only to genus are randomly re-assigned, with equal probability, within the group of species and morphospecies that share the same genus, provided they are not found in the same sites. In this re-assignment a species can also receive its own identity. For instance, assume we have three floristic inventories. In site A we have Eugenia sp1 and E. nesiotica. In site B we have E. nesiotica, E. principium and E. salamensis. In site C we have Eugenia sp2 and E. salamensis. Eugenia sp1 can thus be re-identified with equal probability as Eugenia sp2, E. principium or E. salamensis, or it can keep its own identity (Eugenia sp1). In the latter case we assume that E. sp1 is a completely different species, although we do not know its true identity. By contrast, we cannot re-identify E. sp1 as E. nesiotica because they were found in the same site, so we are quite certain that E. sp1 is different from E. nesiotica. The same procedure is applied to species identified only to family and to fully unidentified species. Note that when collating inventories from different researchers, all unidentified species must be renamed, because two researchers can use the same label, e.g. Eugenia sp1, for what are not necessarily the same species. Verifying the biological identity of Eugenia sp1 would require cross-checking the vouchers bearing that name.
Step 2. Step 1 is iterated nsim times. As a result, nsim matrices are obtained, all of which contain the same number of sites but a variable number of species, depending on the resulting re-assignment of morphospecies. The process can be time-consuming if community data matrices are large.
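The Eugenia example in Step 1 can be worked through in a few lines of base R. This is a sketch of the re-assignment rule as described in the text, not the package's code; candidates() is a hypothetical helper and the presence matrix simply encodes sites A, B and C from the example.

```r
# Presence of each Eugenia taxon (rows) in sites A, B and C (columns)
comm <- matrix(
  c(1, 0, 0,    # sp1
    1, 1, 0,    # nesiotica
    0, 1, 0,    # principium
    0, 1, 1,    # salamensis
    0, 0, 1),   # sp2
  ncol = 3, byrow = TRUE,
  dimnames = list(c("sp1", "nesiotica", "principium", "salamensis", "sp2"),
                  c("A", "B", "C"))
)

# Hypothetical helper: congeners that never share a site with the focal
# morphospecies, plus the focal morphospecies itself
candidates <- function(comm, focal) {
  present <- comm > 0
  shares_site <- apply(present, 1, function(p) any(p & present[focal, ]))
  shares_site[focal] <- FALSE          # the taxon may keep its own identity
  rownames(comm)[!shares_site]
}

cand <- candidates(comm, "sp1")
cand                                   # nesiotica is excluded (both occur in site A)

set.seed(1)
new_id <- sample(cand, 1)              # one random re-assignment, equal probability
```

Step 2 then amounts to repeating the random draw nsim times for every undetermined taxon and rebuilding the community table after each round.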
The function pertables.p2 implements a parallelized version that considerably reduces computation time.
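The roles of ncl and iseed can be illustrated with the parallel package that ships with R. The sketch below is not the code of pertables.p2: simulate_once() and run() are hypothetical stand-ins, and the only point is that seeding the workers with clusterSetRNGStream makes repeated parallel runs reproducible.

```r
library(parallel)

# Hypothetical stand-in for one simulation: draw one identity at random
simulate_once <- function(i) {
  sample(c("sp1", "principium", "salamensis", "sp2"), 1)
}

# Hypothetical wrapper: run nsim simulations on ncl workers
run <- function(nsim, ncl, iseed = NULL) {
  cl <- makeCluster(ncl)
  on.exit(stopCluster(cl))
  if (!is.null(iseed)) clusterSetRNGStream(cl, iseed = iseed)
  unlist(parLapply(cl, seq_len(nsim), simulate_once))
}

a <- run(nsim = 4, ncl = 2, iseed = 4)
b <- run(nsim = 4, ncl = 2, iseed = 4)
identical(a, b)   # same seed and same number of workers give the same draws
```

With iseed = NULL the workers are left unseeded, so two runs would generally differ.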
The function returns a list of class pertables with the following components:
taxunc |
Summary of the number of species fully identified (0), identified to genus (1), identified to family (2), or fully undetermined (3). |
pertables |
A list with all the simulated data matrices. |
raw |
The raw data matrix, without the unidentified species. |
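The 0 to 3 coding reported in taxunc can be illustrated with a toy taxonomy table. This sketch assumes that a taxon's level is given by the highest rank whose label appears in the index; the data, the index vector and that matching rule are assumptions for illustration, not the package's internals.

```r
# Toy taxonomy table (hypothetical entries)
taxa <- data.frame(
  family  = c("Myrtaceae", "Myrtaceae", "Lauraceae", "Indet."),
  genus   = c("Eugenia",   "Eugenia",   "Indet.",    "Indet."),
  species = c("nesiotica", "sp1",       "sp2",       "sp3"),
  stringsAsFactors = FALSE
)
index <- c("Indet.", paste0("sp", 1:100))

# 0 = fully identified, 1 = identified to genus,
# 2 = identified to family, 3 = fully undetermined
level <- ifelse(taxa$family %in% index, 3L,
         ifelse(taxa$genus %in% index, 2L,
         ifelse(taxa$species %in% index, 1L, 0L)))

table(factor(level, levels = 0:3))   # counts per identification level
```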
Luis Cayuela and Marcelino de la Cruz
Cayuela, L., De la Cruz, M. and Ruokolainen, K. (2011). A method to incorporate the effect of taxonomic uncertainty on multivariate analyses of ecological data. Ecography, 34: 94-102. http://dx.doi.org/10.1111/j.1600-0587.2009.05899.x.
data(Amazonia)
data(soils)

# Define a new index that includes the terms used in the Amazonia dataset to
# define undetermined taxa at different taxonomic levels
index.Amazon <- c(paste("sp.", rep(1:20), sep=""), "Indet.", "indet.")

# Generate a pertables object (i.e. a list of biological data tables simulated
# from taxonomic uncertainty)
## Not run:
# Compare performance of pertables and pertables.p2
nsim <- 100
ncl <- 2

gc()
t0 <- Sys.time()
Amazonia100 <- pertables(Amazonia, index=index.Amazon, nsim=nsim)
Sys.time() - t0

gc()
t0 <- Sys.time()
Amazonia100.p2 <- pertables.p2(Amazonia, index=index.Amazon, nsim=nsim, ncl=ncl)
Sys.time() - t0
## End(Not run)

# Example for Rcheck
Amazonia4.p2 <- pertables.p2(Amazonia, index=index.Amazon, nsim=4, ncl=2)
This function assesses the effects of taxonomic uncertainty on two widely used parameters of a [Partial] Redundancy Analysis, i.e. the 'percentage explained variance' (sometimes referred to as R-squared) and the 'pseudo-F'.
rda_pertables(fml, data, scale = FALSE, ...)

## S3 method for class 'rda_pertables'
plot(x, pch = 18, ...)
fml |
Model formula, where the left hand side gives a pertables object (i.e. a list of simulated community data matrices obtained with pertables). |
data |
Data frame containing the variables on the right hand side of the model formula. |
scale |
Scale species to unit variance (like correlations). |
x |
rda_pertables object to plot. |
pch |
Plotting 'character', i.e., symbol to use in the RDA plot. See |
... |
Additional graphical parameters passed to plot. |
This function is a wrapper that submits a pertables object to the rda function of the vegan package. See the documentation of cca for details about the use of formula and Condition.
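The two parameters tracked here can be computed by hand for a single table, which may help when reading the output. The sketch below is a plain multivariate regression on centred data in base R, under the assumption of no conditioning variables and scale = FALSE; the data and the variable names Ca and K are toy values, and this is not how the package or vegan is implemented internally.

```r
set.seed(1)
n <- 9
Y <- matrix(rnorm(n * 5), nrow = n)        # toy community table (hypothetical)
X <- cbind(Ca = rnorm(n), K = rnorm(n))    # toy soil variables (hypothetical)

Yc <- scale(Y, center = TRUE, scale = FALSE)
Xc <- scale(X, center = TRUE, scale = FALSE)

# Fitted values of the multivariate regression of Yc on Xc
fit <- Xc %*% solve(crossprod(Xc), crossprod(Xc, Yc))

total       <- sum(Yc^2)                   # total variation
constrained <- sum(fit^2)                  # variation explained by X
residual    <- total - constrained
q <- ncol(Xc)                              # number of constraining variables

r2       <- constrained / total            # 'percentage explained variance' / 100
pseudo_F <- (constrained / q) / (residual / (n - q - 1))
```

Computing these two numbers for each simulated table gives the distributions that can be compared against the values from the raw data.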
rda_pertables returns an object of class 'rda_pertables', basically a list with the following components:
raw |
An object of class |
simulation |
A list with the results of the simulation: |
Objects of class 'rda_pertables' have print and plot S3 methods for simple access to the results. See the examples.
Luis Cayuela and Marcelino de la Cruz
Cayuela, L., De la Cruz, M. and Ruokolainen, K. (2011). A method to incorporate the effect of taxonomic uncertainty on multivariate analyses of ecological data. Ecography, 34: 94-102. http://dx.doi.org/10.1111/j.1600-0587.2009.05899.x.
pertables, cca
data(Amazonia)
data(soils)

# Define a new index that includes the terms used in the Amazonia dataset to
# define undetermined taxa at different taxonomic levels
index.Amazon <- c(paste("sp.", rep(1:20), sep=""), "Indet.", "indet.")

# Generate a pertables object (i.e. a list of biological data tables simulated
# from taxonomic uncertainty)
## Not run:
Amazonia100 <- pertables(Amazonia, index=index.Amazon, nsim=100)

# Assess the effects of taxonomic uncertainty on a RDA analysis of biological
# data explained by all the environmental variables of the soil data:
Amazonia.rda <- rda_pertables(Amazonia100 ~., data=soils)
Amazonia.rda
plot(Amazonia.rda)
## End(Not run)

# Fast example for Rcheck
Amazonia4.p2 <- pertables.p2(Amazonia[1:50,], index=index.Amazon, nsim=4, ncl=2, iseed=4)
set.seed(2)
Amazonia.rda <- rda_pertables(Amazonia4.p2 ~., data=soils)
Amazonia.rda
plot(Amazonia.rda)