{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Applying a time-reversible codon model\n", "\n", "We display the full set of codon models available." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Model Type | Abbreviation | Description |
---|---|---|
codon | CNFGTR | Conditional nucleotide frequency codon substitution model, GTR variant (with params analagous to the nucleotide GTR model). Yap, Lindsay, Easteal and Huttley, 2010, Mol Biol Evol 27: 726-734 |
codon | CNFHKY | Conditional nucleotide frequency codon substitution model, HKY variant (with kappa, the ratio of transitions to transversions) Yap, Lindsay, Easteal and Huttley, 2010, Mol Biol Evol 27: 726-734 |
codon | MG94HKY | Muse and Gaut 1994 codon substitution model, HKY variant (with kappa, the ratio of transitions to transversions) Muse and Gaut, 1994, Mol Biol Evol, 11, 715-24 |
codon | MG94GTR | Muse and Gaut 1994 codon substitution model, GTR variant (with params analagous to the nucleotide GTR model) Muse and Gaut, 1994, Mol Biol Evol, 11, 715-24 |
codon | GY94 | Goldman and Yang 1994 codon substitution model. N Goldman and Z Yang, 1994, Mol Biol Evol, 11(5):725-36. |
codon | Y98 | Yang's 1998 substitution model, a derivative of the GY94. Z Yang, 1998, Mol Biol Evol, 15(5):568-73 |
codon | H04G | Huttley 2004 CpG substitution model. Includes a term for substitutions to or from CpG's. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8 |
codon | H04GK | Huttley 2004 CpG substitution model. Includes a term for transition substitutions to or from CpG's. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8 |
codon | H04GGK | Huttley 2004 CpG substitution model. Includes a general term for substitutions to or from CpG's and an adjustment for CpG transitions. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8 |
codon | GNC | General Nucleotide Codon, a non-reversible codon model. Kaehler, Yap, Huttley, 2017, Gen Biol Evol 9(1): 134–49 |

10 rows x 3 columns
" ], "text/plain": [ "Specify a model using 'Abbreviation' (case sensitive).\n", "===============================================================================================================================================================================================================================\n", "Model Type Abbreviation Description\n", "-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n", " codon CNFGTR Conditional nucleotide frequency codon substitution model, GTR variant (with params analagous to the nucleotide GTR model). Yap, Lindsay, Easteal and Huttley, 2010, Mol Biol Evol 27: 726-734\n", " codon CNFHKY Conditional nucleotide frequency codon substitution model, HKY variant (with kappa, the ratio of transitions to transversions) Yap, Lindsay, Easteal and Huttley, 2010, Mol Biol Evol 27: 726-734\n", " codon MG94HKY Muse and Gaut 1994 codon substitution model, HKY variant (with kappa, the ratio of transitions to transversions) Muse and Gaut, 1994, Mol Biol Evol, 11, 715-24\n", " codon MG94GTR Muse and Gaut 1994 codon substitution model, GTR variant (with params analagous to the nucleotide GTR model) Muse and Gaut, 1994, Mol Biol Evol, 11, 715-24\n", " codon GY94 Goldman and Yang 1994 codon substitution model. N Goldman and Z Yang, 1994, Mol Biol Evol, 11(5):725-36.\n", " codon Y98 Yang's 1998 substitution model, a derivative of the GY94. Z Yang, 1998, Mol Biol Evol, 15(5):568-73\n", " codon H04G Huttley 2004 CpG substitution model. Includes a term for substitutions to or from CpG's. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8\n", " codon H04GK Huttley 2004 CpG substitution model. Includes a term for transition substitutions to or from CpG's. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8\n", " codon H04GGK Huttley 2004 CpG substitution model. 
Includes a general term for substitutions to or from CpG's and an adjustment for CpG transitions. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8\n", " codon GNC General Nucleotide Codon, a non-reversible codon model. Kaehler, Yap, Huttley, 2017, Gen Biol Evol 9(1): 134–49\n", "-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n", "\n", "10 rows x 3 columns" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from cogent3 import available_models\n", "\n", "available_models(\"codon\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the conditional nucleotide form codon model\n", "\n", "The CNFGTR model ([Yap et al](https://www.ncbi.nlm.nih.gov/pubmed/19815689)) is the most robust of the time-reversible codon models available ([Kaehler et al](https://www.ncbi.nlm.nih.gov/pubmed/28175284)). By default, this model does not optimise the codon frequencies but uses the average estimated from the alignment. We configure the model to optimise the root motif probabilities." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "key | \n", "lnL | \n", "nfp | \n", "DLC | \n", "unique_Q | \n", "\n", "\n", "
---|---|---|---|---|
 | -6739.3067 | 77 | True | |

log-likelihood = -6739.3067

number of free parameters = 77

A/C | A/G | A/T | C/G | C/T | omega |
---|---|---|---|---|---|
1.0656 | 3.9391 | 0.7851 | 1.9475 | 4.2265 | 0.7569 |

edge | parent | length |
---|---|---|
Galago | root | 0.5330 |
HowlerMon | root | 0.1365 |
Rhesus | edge.3 | 0.0659 |
Orangutan | edge.2 | 0.0233 |
Gorilla | edge.1 | 0.0075 |
Human | edge.0 | 0.0182 |
Chimpanzee | edge.0 | 0.0085 |
edge.0 | edge.1 | 0.0000 |
edge.1 | edge.2 | 0.0101 |
edge.2 | edge.3 | 0.0352 |
edge.3 | root | 0.0228 |

AAA | AAC | AAG | AAT | ACA | ACC | ACG | ACT | AGA | AGC |
---|---|---|---|---|---|---|---|---|---|
0.0540 | 0.0242 | 0.0307 | 0.0543 | 0.0237 | 0.0063 | 0.0021 | 0.0297 | 0.0238 | 0.0280 |

AGG | AGT | ATA | ATC | ATG | ATT | CAA | CAC | CAG | CAT |
---|---|---|---|---|---|---|---|---|---|
0.0122 | 0.0405 | 0.0226 | 0.0071 | 0.0141 | 0.0203 | 0.0228 | 0.0063 | 0.0220 | 0.0237 |

CCA | CCC | CCG | CCT | CGA | CGC | CGG | CGT | CTA | CTC |
---|---|---|---|---|---|---|---|---|---|
0.0165 | 0.0043 | 0.0021 | 0.0239 | 0.0022 | 0.0012 | 0.0035 | 0.0058 | 0.0123 | 0.0065 |

CTG | CTT | GAA | GAC | GAG | GAT | GCA | GCC | GCG | GCT |
---|---|---|---|---|---|---|---|---|---|
0.0098 | 0.0105 | 0.0703 | 0.0112 | 0.0263 | 0.0310 | 0.0154 | 0.0083 | 0.0036 | 0.0145 |

GGA | GGC | GGG | GGT | GTA | GTC | GTG | GTT | TAC | TAT |
---|---|---|---|---|---|---|---|---|---|
0.0151 | 0.0072 | 0.0051 | 0.0139 | 0.0170 | 0.0077 | 0.0094 | 0.0210 | 0.0036 | 0.0171 |

TCA | TCC | TCG | TCT | TGC | TGG | TGT | TTA | TTC | TTG |
---|---|---|---|---|---|---|---|---|---|
0.0220 | 0.0083 | 0.0039 | 0.0214 | 0.0038 | 0.0033 | 0.0201 | 0.0222 | 0.0051 | 0.0107 |

TTT |
---|
0.0146 |