Clearcut

The reference implementation for Relaxed Neighbor Joining (RNJ)

Evans, J., Sheneman, L., Foster, J.A. (2006) Relaxed Neighbor-Joining: A Fast Distance-Based Phylogenetic Tree
Construction Method. Journal of Molecular Evolution 62:785-792.

Extremely efficient phylogenetic tree reconstruction





Neighbor joining (NJ) is a popular distance-based phylogenetic tree reconstruction method. It has attractive theoretical properties, but suffers from O(N^3) time complexity. Popular implementations of traditional NJ cannot process datasets with more than a few thousand taxa.
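
To make the cubic cost concrete, here is a minimal sketch in C of the pair-selection step of textbook neighbor joining. It is not taken from the Clearcut source; the function name nj_select_pair and the row-major matrix layout are assumptions made for illustration. Each join scans all O(N^2) pairs for the minimum of Q(i,j) = (N-2)*d(i,j) - r_i - r_j, where r_i is the sum of row i, and a full tree needs on the order of N such joins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only (not Clearcut code).
 * d is an n x n symmetric distance matrix stored row-major.
 * On success, writes the pair minimizing the standard NJ criterion
 * Q(i,j) = (n - 2) * d(i,j) - r_i - r_j and returns 0.
 * Returns -1 if scratch memory cannot be allocated. */
int nj_select_pair(const double *d, size_t n, size_t *best_i, size_t *best_j)
{
    assert(d != NULL && best_i != NULL && best_j != NULL && n >= 3);

    /* r[i] = sum of distances from taxon i to every other taxon */
    double *r = calloc(n, sizeof *r);
    if (r == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++)
            r[i] += d[i * n + k];

    /* Exhaustive O(n^2) scan over all unordered pairs */
    double best_q = 0.0;
    int have_best = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double q = (double)(n - 2) * d[i * n + j] - r[i] - r[j];
            if (!have_best || q < best_q) {
                best_q = q;
                *best_i = i;
                *best_j = j;
                have_best = 1;
            }
        }
    }

    free(r);
    return 0;
}
```

A complete NJ implementation would call a routine like this once per join, then collapse the chosen pair into a new internal node and update the matrix; that repeated full scan is where the O(N^3) total comes from.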

Relaxed neighbor joining (RNJ) modifies the traditional neighbor joining algorithm and achieves typical asymptotic runtimes of O(N^2 log N) without a significant reduction in the quality of the inferred trees. As with traditional neighbor joining, RNJ will reconstruct the true tree given a set of additive distances.
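
"Additive" here means the distances fit some tree exactly. One standard way to test this is the four-point condition: for every four taxa, of the three pairwise sums d(i,j)+d(k,l), d(i,k)+d(j,l) and d(i,l)+d(j,k), the two largest must be equal. The sketch below is an illustration only and is not part of Clearcut; the name is_additive, the row-major layout, and the tolerance parameter are assumptions.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not Clearcut code).
 * Returns 1 if the n x n row-major matrix d satisfies the four-point
 * condition for every quartet to within tol, and 0 otherwise.
 * Runs in O(n^4), so it is only meant for small test matrices. */
int is_additive(const double *d, size_t n, double tol)
{
    assert(d != NULL && tol >= 0.0);

    for (size_t i = 0; i < n; i++)
    for (size_t j = i + 1; j < n; j++)
    for (size_t k = j + 1; k < n; k++)
    for (size_t l = k + 1; l < n; l++) {
        double lo  = d[i * n + j] + d[k * n + l];
        double mid = d[i * n + k] + d[j * n + l];
        double hi  = d[i * n + l] + d[j * n + k];
        double t;

        /* Sort the three sums so that lo <= mid <= hi */
        if (lo > mid)  { t = lo;  lo = mid;  mid = t; }
        if (mid > hi)  { t = mid; mid = hi;  hi = t; }
        if (lo > mid)  { t = lo;  lo = mid;  mid = t; }

        /* The two largest sums must coincide */
        if (fabs(hi - mid) > tol)
            return 0;
    }
    return 1;
}
```

Distances estimated from real sequence data are rarely exactly additive, so a check like this is mainly useful for validating an implementation against synthetic matrices generated from a known tree.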

Relaxed neighbor joining was developed by Jason Evans, Luke Sheneman, and James Foster from the Initiative for Bioinformatics and Evolutionary STudies (IBEST) at the University of Idaho. 

Clearcut is a stand-alone reference implementation of relaxed neighbor joining (RNJ).  It was written in the C programming language under Linux and has been successfully ported to Sun Solaris and Apple/Mac OS X.  Clearcut is distributed as open source under the BSD license, and is available for download (below).  Some simple documentation is included in the source distribution.

Clearcut accepts either a distance matrix or a multiple sequence alignment (MSA) as input. If necessary, Clearcut will compute corrected distances based on a configurable distance correction model (Jukes-Cantor or Kimura). Clearcut outputs a phylogenetic tree in Newick format and, optionally, the corrected distance matrix.

Clearcut is maintained by Luke Sheneman.


Current Clearcut Source Distribution

        clearcut-1.0.9.tar.gz  -- 349838 bytes -- Released 02/09/09



The Treezilla Dataset

Clearcut was developed and tested, in part, by using the following published 500-taxon dataset known as Treezilla:
Chase MW, Soltis DE, Olmstead RG, Morgan D, Les DH, Mishler BD, Duvall MR, Price RA,
Hills HG, Qiu YL, Kron KA, Rettig JH, Conti E, Palmer JD, Manhart JR, Sytsma KJ,
Michaels HJ, Kress WJ, Karol KG, Clark WD, Hedrén MH, Gaut BS, Jansen RK, Kim
KJ, Wimpee CF, Smith JF, Furnier GR, Strauss SH, Xiang QY, Plunkett GM, Soltis PS,
Swensen SM, Williams SE, Gadek PA, Quinn CJ, Eguiarte LE, Golenberg E, Learn Jr GH,
Graham SW, Barrett SCH, Dayanandan S, Albert VA (1993) Phylogenetics of seed plants: An
analysis of nucleotide sequences from the plastid gene rbcL. Ann Mo Bot Gard 80:528-580.

Crux

NOTE: Relaxed neighbor joining (RNJ) has also been implemented in Crux, a multi-purpose phylogenetic inference application written by Jason Evans.




Please report any questions or suggestions for future versions of Clearcut here.