Core Function Syntax
The decostand function implements commonly used normalization techniques for community ecology datasets.
decostand(x, method, MARGIN, range.global, logbase = 2, na.rm = FALSE, ...)
wisconsin(x)
decobackstand(x, zap = TRUE)
Arguments:
x: A matrix-like object containing ecological community data.
method: The normalization method to apply.
MARGIN: The margin (1 for rows, 2 for columns) to use when the method's default is unsuitable.
range.global: A matrix supplying the range for method = "range", so that the same ranges can be used across subsets of the data. Its dimensions must match those of x along the chosen margin.
logbase: Logarithm base for method = "log".
na.rm: Whether missing values are ignored in row or column standardizations.
zap: Converts near-zero values to exact zeros to prevent negative values and exaggerated abundance estimates.
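The purpose of range.global is easiest to see by doing the range scaling by hand. The base R sketch below uses two hypothetical matrices, full and sub (a subset of the rows of full), and scales sub once with the ranges of the full data and once with its own ranges. It illustrates the arithmetic only and is not vegan's internal code.

```r
# Hypothetical full dataset (3 sites x 2 species) and a subset of its rows
full <- matrix(c(0, 5, 10,
                 2, 4, 8),
               ncol = 2,
               dimnames = list(paste0("site", 1:3), c("sp1", "sp2")))
sub <- full[1:2, , drop = FALSE]

# Column minima and ranges taken from the full data
col_min <- apply(full, 2, min)
col_ran <- apply(full, 2, max) - col_min

# Range-scale the subset using the full-data ranges
sub_global <- sweep(sweep(sub, 2, col_min, "-"), 2, col_ran, "/")

# For comparison: ranges taken from the subset alone
sub_min <- apply(sub, 2, min)
sub_ran <- apply(sub, 2, max) - sub_min
sub_local <- sweep(sweep(sub, 2, sub_min, "-"), 2, sub_ran, "/")

print(sub_global)
print(sub_local)
```

With the subset's own ranges, every column is stretched to span 0 to 1, so values from different subsets are no longer comparable. With vegan loaded, the corresponding call should be decostand(sub, method = "range", range.global = full), since both matrices share the same columns.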
Normalization Approaches (Method Parameter)
total: Divides entries by their margin sum (default margin 1), so that standardized rows or columns sum to 1.
max: Divides entries by their margin maximum (default margin 2), scaling the peak value to 1.
frequency: Divides by the margin sum and multiplies by the count of non-zero entries, so that the non-zero values average 1 (default margin 2).
normalize: Scales entries so that the margin sum of squares equals 1 (default margin 1). Also known as the chord transformation.
range: Scales values to fall between 0 and 1 (default margin 2). Constant values map to 0.
rank, rrank: rank replaces abundances with their ascending ranks, leaving zeros unchanged; rrank uses relative ranks with a maximum of 1 (default margin 1). Ties receive average ranks.
standardize: Scales data to zero mean and unit variance (Z-score; default margin 2).
pa: Converts the matrix to presence/absence (binary 0/1).
chi.square: Divides by the row sums and the square root of the column sums, then adjusts by the square root of the matrix total. Combined with Euclidean distance, it approximates the chi-square distance used in correspondence analysis (default margin 1).
hellinger: Takes the square root of the total normalization, reducing compositionality bias.
log: Logarithmic transformation log_b(x) + 1 for x > 0, where b is the base; zeros stay zero. This is not equivalent to log(x + 1). Higher bases give less weight to abundances and more to presences.
alr: Additive log-ratio transformation, which mitigates skewness and compositional bias. Requires positive values (use pseudocount if needed). A reference row or column (specified by reference) is removed from the output.
clr: Centered log-ratio transformation, which centers each feature's log value on the mean of all feature logs. Widely used in microbial ecology to handle compositional data.
rclr: Robust centered log-ratio, a variant of clr that tolerates zeros without pseudocounts. Divides by the geometric mean of the observed features; zeros stay zero and are excluded from the mean calculation.
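A few of these definitions can be checked by hand. The sketch below reproduces total, max, hellinger, log and pa using base R only, on a hypothetical matrix named comm. It is meant to show the arithmetic described in the list, not vegan's own implementation.

```r
# Hypothetical community matrix (rows = sites, columns = species)
comm <- matrix(c(2, 4, 6,
                 3, 5, 7,
                 5, 1, 0),
               nrow = 3,
               dimnames = list(paste0("site", 1:3), paste0("sp", 1:3)))

# total (default margin 1): divide each row by its row sum
total_rows <- sweep(comm, 1, rowSums(comm), "/")

# max (default margin 2): divide each column by its column maximum
max_cols <- sweep(comm, 2, apply(comm, 2, max), "/")

# hellinger: square root of the 'total' result
hell <- sqrt(total_rows)

# log with base 2: log_b(x) + 1 for x > 0, zeros stay zero
log2_tr <- comm
log2_tr[comm > 0] <- log(comm[comm > 0], base = 2) + 1

# pa: presence/absence (0/1)
pa <- (comm > 0) * 1

print(round(total_rows, 3))
print(log2_tr)
```

For an abundance of 2 and base 2, the log method gives log2(2) + 1 = 2, whereas log2(2 + 1) is about 1.585, which is why the two transformations are not interchangeable.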
Standard Score (Z-Score) Transformation
The standard score transformation subtracts the feature mean from each value and divides by the feature's standard deviation, so the transformed values have a mean of 0 and a standard deviation of 1.
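Written out for a feature with n observations, and using the sample standard deviation (the n - 1 denominator that R's sd() applies), the transformation of a value x_i is:

```latex
z_i = \frac{x_i - \bar{x}}{s},
\qquad
\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2}
```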
Manual Calculation
# Define a sample vector
measurements <- c(5, 10, 15, 20, 25)
# Compute mean and standard deviation
avg_val <- mean(measurements)
std_dev <- sd(measurements)
# Compute standard scores
scaled_vals <- (measurements - avg_val) / std_dev
# Display scaled values
print(scaled_vals)
Output:
[1] -1.2649111 -0.6324555  0.0000000  0.6324555  1.2649111
Using the decostand Function
library(vegan)
# Construct a sample dataset
site_matrix <- data.frame(
sp1 = c(2, 4, 6),
sp2 = c(3, 5, 7),
sp3 = c(5, 1, 9)
)
# Apply 'total' normalization across rows
normalized_matrix <- decostand(site_matrix, method = "total", MARGIN = 1)
# Display results
print(normalized_matrix)
Output:
        sp1       sp2       sp3
1 0.2000000 0.3000000 0.5000000
2 0.4000000 0.5000000 0.1000000
3 0.2727273 0.3181818 0.4090909
Reversing Standardization
The decobackstand function reverses a standardization, converting transformed data back to the original scale. Back-transformation is not supported for every method, and round-off error means the result may not match the original exactly, so keep the original dataset rather than overwriting it. With zapping of near-zero values enabled (TRUE, the default), original zeros are restored as exact zeros.
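The idea behind the reversal can be sketched in base R for the total method: keep the row sums used in the forward step, multiply them back in, and set values that are numerically indistinguishable from zero to exactly zero. The matrix comm and the tolerance tol are hypothetical choices for illustration; this is not how vegan necessarily implements decobackstand.

```r
# Hypothetical community matrix containing a zero entry
comm <- matrix(c(2, 4, 6,
                 3, 5, 7,
                 5, 1, 0),
               nrow = 3)

# Forward step: 'total' standardization by rows, keeping the row sums
row_tot <- rowSums(comm)
std <- sweep(comm, 1, row_tot, "/")

# Reverse step: multiply each row by its stored total
back <- sweep(std, 1, row_tot, "*")

# Zap values that are effectively zero (tolerance chosen for illustration)
tol <- sqrt(.Machine$double.eps)
back[abs(back) < tol] <- 0

# Compare with a tolerance rather than exact equality
print(all.equal(back, comm))
```

The comparison uses all.equal() because floating-point round-off can leave tiny differences after a round trip. With a vegan version that provides decobackstand, the analogous round trip should be decobackstand(decostand(comm, method = "total")).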