Announcing PUZZLE version 4.0:
Maximum Likelihood Analysis for Nucleotide, Amino Acid, and Two-State Data

December 1997

Copyright 1995-97 by Korbinian Strimmer and Arndt von Haeseler
Zoologisches Institut, Universitaet Muenchen, Muenchen, Germany

What's new in PUZZLE 4.0 compared to version 3.1:

- computation of clock-like branch lengths (also for amino acids and for non-binary trees)
- automatic likelihood ratio clock test
- two-state ML model for presence-absence data
- display of most probable assignment of rates to sites
- identification of groups of identical sequences
- possibility to read multiple input trees
- Kishino-Hasegawa test to check whether trees are significantly different
- BLOSUM 62 model of amino acid substitution (Henikoff-Henikoff 1992)
- improvements to user interface
- SH model can be applied to 1st and 2nd codon positions
- automatic check for compatible compiler settings
- fix of gcc compatibility problem

PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. It implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimates of support to each internal branch. PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths for user-specified trees. Branch lengths can be calculated under the clock assumption. In addition, PUZZLE offers a novel method, likelihood mapping, to investigate the support of a hypothesized internal branch without computing an overall tree and to visualize the phylogenetic content of a sequence alignment. PUZZLE also conducts a number of statistical tests on the data set (chi-square test for homogeneity of base composition, likelihood ratio clock test, Kishino-Hasegawa test).

The models of substitution provided by PUZZLE are TN, HKY, F84, SH for nucleotides, Dayhoff, JTT, mtREV24, BLOSUM 62 for amino acids, and F81 for two-state data.
Rate heterogeneity is modelled by a discrete Gamma distribution and by allowing invariable sites. The corresponding parameters can be inferred from the data set.

PUZZLE is available free of charge from

   http://www.zi.biologie.uni-muenchen.de/~strimmer/puzzle.html (PUZZLE home page)
   ftp://ftp.ebi.ac.uk/pub/software (European Bioinformatics Institute)
   ftp://ftp.pasteur.fr/pub/GenSoft (Institut Pasteur)
   http://iubio.bio.indiana.edu/soft/molbio/evolve (IUBio archive www)
   ftp://iubio.bio.indiana.edu/molbio/evolve (IUBio archive ftp)

PUZZLE is written in ANSI C. It will run on most personal computers and workstations if compiled by an appropriate C compiler. Executables are available for MacOS, Windows 95/NT, and OS/2. UNIX and VMS makefiles are also provided.