******************** Building phylogenies ******************** Building A Phylogenetic Tree From Pairwise Distances ==================================================== Directly via ``alignment.quick_tree()`` ======================================= Both the ``ArrayAlignment`` and ``Alignment`` classes support this. .. doctest:: >>> from cogent3 import load_aligned_seqs >>> aln = load_aligned_seqs('data/primate_brca1.fasta', moltype="dna") >>> tree = aln.quick_tree(calc="TN93") >>> tree = tree.balanced() # purely for display >>> print(tree.ascii_art()) #doctest: +SKIP /-Rhesus /edge.1--| | | /-HowlerMon | \edge.0--| | \-Galago -root----| |--Orangutan | | /-Chimpanzee \edge.2--| | /-Human \edge.3--| \-Gorilla The ``quick_tree()`` method also supports non-parametric bootstrapping. The number of resampled alignments is specified using the ``bootstrap`` argument. In the following, trees are estimated from 100 resampled alignments and merged into a single consensus topology using a weighted consensus tree algorithm. .. doctest:: >>> tree = aln.quick_tree(calc="TN93", bootstrap=100) Using the ``DistanceMatrix`` object ----------------------------------- .. doctest:: >>> from cogent3 import load_aligned_seqs >>> aln = load_aligned_seqs('data/primate_brca1.fasta', moltype="dna") >>> dists = aln.distance_matrix(calc="TN93") >>> tree = dists.quick_tree() >>> tree = tree.balanced() # purely for display >>> print(tree.ascii_art()) #doctest: +SKIP /-Rhesus /edge.1--| | | /-HowlerMon | \edge.0--| | \-Galago -root----| |--Orangutan | | /-Chimpanzee \edge.2--| | /-Human \edge.3--| \-Gorilla Explicitly via ``DistanceMatrix`` and ``cogent3.phylo.nj.nj()``` ---------------------------------------------------------------- .. doctest:: >>> from cogent3.phylo import nj >>> from cogent3 import load_aligned_seqs >>> aln = load_aligned_seqs('data/primate_brca1.fasta', moltype="dna") >>> dists = aln.distance_matrix(calc="TN93") >>> tree = nj.nj(dists) >>> tree = tree.balanced() # purely for display >>> print(tree.ascii_art()) #doctest: +SKIP /-Rhesus /edge.1--| | | /-HowlerMon | \edge.0--| | \-Galago -root----| |--Orangutan | | /-Chimpanzee \edge.2--| | /-Human \edge.3--| \-Gorilla Directly from a pairwise distance ``dict`` ------------------------------------------ .. doctest:: >>> from cogent3.phylo import nj >>> dists = {('a', 'b'): 2.7, ('c', 'b'): 2.33, ('c', 'a'): 0.73} >>> tree = nj.nj(dists) >>> print(tree.ascii_art()) #doctest: +SKIP /-a | -root----|--b | \-c By Least-squares ================ We illustrate the phylogeny reconstruction by least-squares using the F81 substitution model. We use the advanced-stepwise addition algorithm to search tree space. Here ``a`` is the number of taxa to exhaustively evaluate all possible phylogenies for. Successive taxa are added to the top ``k`` trees (measured by the least-squares metric) and ``k`` trees are kept at each iteration. .. doctest:: >>> import pickle >>> from cogent3.phylo.least_squares import WLS >>> dists = pickle.load(open('data/dists_for_phylo.pickle', 'rb')) >>> ls = WLS(dists) >>> stat, tree = ls.trex(a=5, k=5, show_progress=False) Other optional arguments that can be passed to the ``trex`` method are: ``return_all``, whether the ``k`` best trees at the final step are returned as a ``ScoredTreeCollection`` object; ``order``, a series of tip names whose order defines the sequence in which tips will be added during tree building (this allows the user to randomise the input order). By ML ===== We illustrate the phylogeny reconstruction using maximum-likelihood using the F81 substitution model. We use the advanced-stepwise addition algorithm to search tree space. .. doctest:: >>> from cogent3 import load_aligned_seqs >>> from cogent3.phylo.maximum_likelihood import ML >>> from cogent3.evolve.models import F81 >>> aln = load_aligned_seqs('data/primate_brca1.fasta') >>> ml = ML(F81(), aln) The ``ML`` object also has the ``trex`` method and this can be used in the same way as for above, i.e. ``ml.trex()``. We don't do that here because this is a very slow method for phylogenetic reconstruction.