Build a phylogenetic tree from a distance matrix using fast implementations of neighbor-joining algorithms from the decenttree library.

Usage

fast_nj(x, method = c("rapidnj", "nj", "bionj", "rapid_bionj"))

Arguments

x

A distance matrix (dist object or square matrix) with taxa names.

method

Algorithm to use. One of "rapidnj" (default), "nj", "bionj", "rapid_bionj". RapidNJ is recommended for large datasets (>1000 taxa).
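As an illustration of what a suitable `x` looks like, the sketch below builds a labelled `dist` object in base R and runs a few sanity checks on it. The helper `check_distance_input()` is hypothetical and not part of this package; whether `fast_nj()` performs similar checks internally is not stated here.

```r
# Hypothetical helper (not part of the package): basic sanity checks on a
# distance input before tree building.
check_distance_input <- function(x) {
  m <- as.matrix(x)  # accepts both "dist" objects and plain matrices
  stopifnot(
    nrow(m) == ncol(m),                   # square
    !is.null(rownames(m)),                # taxa names present
    identical(rownames(m), colnames(m)),  # same labels on both axes
    isSymmetric(unname(m)),               # d[i, j] == d[j, i]
    all(diag(m) == 0)                     # zero self-distance
  )
  invisible(m)
}

# Three labelled points on a line; dist() carries the row names over as labels.
pts <- matrix(c(0, 0, 3, 4, 6, 8), ncol = 2, byrow = TRUE,
              dimnames = list(c("taxon_a", "taxon_b", "taxon_c"), NULL))
d <- dist(pts)
m <- check_distance_input(d)
```

Either `d` (a `dist` object) or `m` (a square matrix with row and column names) matches the input types described for `x`.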

Value

A phylogenetic tree of class "phylo" (from the ape package).

Details

This function wraps the decenttree C++ library which provides highly optimized implementations of neighbor-joining algorithms:

  • rapidnj: RapidNJ algorithm with branch-and-bound optimization. O(n^2) average case, much faster than standard NJ for large datasets.

  • nj: Standard Neighbor-Joining (Saitou & Nei 1987). Uses SIMD vectorization on ARM for ~3x speedup.

  • bionj: BIONJ variant (Gascuel 1997), which uses variance estimates. Uses SIMD vectorization on ARM for ~3x speedup.

  • rapid_bionj: RapidNJ optimization applied to BIONJ.

For datasets with more than ~5000 taxa, RapidNJ provides speedups of 30-100x compared to ape::nj().
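For orientation, the pair-selection step that all of these variants build on can be sketched in base R. This is a minimal illustration of the textbook neighbor-joining criterion, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and not the decenttree implementation; the function name `nj_q_matrix()` and the five-taxon matrix are made up for the example.

```r
# Textbook NJ selection criterion; illustrative only, not the decenttree code.
nj_q_matrix <- function(d) {
  n <- nrow(d)
  r <- rowSums(d)                       # total distance from each taxon
  q <- (n - 2) * d - outer(r, r, "+")   # Q(i, j) for every pair
  diag(q) <- NA                         # a taxon is never joined with itself
  q
}

# Small made-up distance matrix with five taxa.
taxa <- c("a", "b", "c", "d", "e")
d <- matrix(c(0,  5,  9,  9, 8,
              5,  0, 10, 10, 9,
              9, 10,  0,  8, 7,
              9, 10,  8,  0, 3,
              8,  9,  7,  3, 0),
            nrow = 5, byrow = TRUE, dimnames = list(taxa, taxa))

q <- nj_q_matrix(d)

# The pair with the smallest Q value is joined first.
pair <- which(q == min(q, na.rm = TRUE), arr.ind = TRUE)[1, ]
joined <- rownames(d)[pair]
```

Standard NJ scans the full Q matrix at every join, which is what makes it cubic in the number of taxa overall. RapidNJ's branch-and-bound search aims to skip most of that scan.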

References

Simonsen M, Mailund T, Pedersen CNS (2011). "Inference of Large Phylogenies using Neighbour-Joining." Communications in Computer and Information Science, 127, 334-344.

Saitou N, Nei M (1987). "The neighbor-joining method: a new method for reconstructing phylogenetic trees." Molecular Biology and Evolution, 4(4), 406-425.

Gascuel O (1997). "BIONJ: An Improved Version of the NJ Algorithm Based on a Simple Model of Sequence Data." Molecular Biology and Evolution, 14(7), 685-695.

See also

ape::nj() for the standard ape implementation.

Examples

if (FALSE) { # \dontrun{
# Create a random symmetric distance matrix (seed fixed for reproducibility)
set.seed(42)
n <- 100
d <- matrix(runif(n*n), n, n)
d <- (d + t(d)) / 2
diag(d) <- 0
rownames(d) <- colnames(d) <- paste0("taxon_", 1:n)

# Build tree with RapidNJ
tree <- fast_nj(d)
plot(tree)

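# Compare with ape::nj()
tree2 <- ape::nj(d)
# Topological distance between the two trees (0 means identical topology)
ape::dist.topo(tree, tree2)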
} # }