Applying GNC, a non-stationary codon model

See Kaehler et al. for the formal description of this model. Hypothesis testing with this model is demonstrated elsewhere.

We apply the model to a sample alignment of primate BRCA1 sequences.

[1]:
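from cogent3.app import io, evo

# Build a loader app for aligned DNA sequences in FASTA format
loader = io.load_aligned(format="fasta", moltype="dna")
# Apply the loader to the sample alignment file
aln = loader("../data/primate_brca1.fasta")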

The model is specified using its abbreviation.

[2]:
# Build the model app from the abbreviation and the tree file
model = evo.model("GNC", tree="../data/primate_brca1.tree")
# Fit the model to the alignment
result = model(aln)
result
[2]:
GNC
key    lnL           nfp    DLC     unique_Q
       -6707.1856    83     True
[3]:
result.lf
[3]:

GNC

log-likelihood = -6707.1856

number of free parameters = 83

Global params
A>C A>G A>T C>A C>G C>T G>A G>C G>T T>A
0.8618 3.5392 0.9785 1.6710 2.2023 6.2632 7.8953 1.2215 0.7983 1.2838
T>C omega
3.0618 0.8201
Edge params
edge parent length
Galago root 0.5233
HowlerMon root 0.1331
Rhesus edge.3 0.0639
Orangutan edge.2 0.0234
Gorilla edge.1 0.0075
Human edge.0 0.0182
Chimpanzee edge.0 0.0085
edge.0 edge.1 0.0000
edge.1 edge.2 0.0100
edge.2 edge.3 0.0368
edge.3 root 0.0246
Motif params
AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC
0.0557 0.0228 0.0352 0.0548 0.0234 0.0032 0.0000 0.0320 0.0224 0.0285
AGG AGT ATA ATC ATG ATT CAA CAC CAG CAT
0.0146 0.0379 0.0184 0.0074 0.0120 0.0181 0.0194 0.0053 0.0254 0.0236
CCA CCC CCG CCT CGA CGC CGG CGT CTA CTC
0.0213 0.0065 0.0000 0.0280 0.0000 0.0011 0.0011 0.0021 0.0154 0.0073
CTG CTT GAA GAC GAG GAT GCA GCC GCG GCT
0.0135 0.0107 0.0772 0.0088 0.0298 0.0318 0.0169 0.0107 0.0010 0.0130
GGA GGC GGG GGT GTA GTC GTG GTT TAC TAT
0.0147 0.0099 0.0079 0.0112 0.0148 0.0064 0.0073 0.0207 0.0021 0.0086
TCA TCC TCG TCT TGC TGG TGT TTA TTC TTG
0.0224 0.0074 0.0000 0.0275 0.0011 0.0043 0.0212 0.0198 0.0085 0.0102
TTT
0.0181
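
The reported number of free parameters can be cross-checked against the tables above. The following is a minimal sketch of one way to account for it. It assumes that every listed rate term, omega, every branch length, and all but one of the codon (motif) probabilities are counted as free, the last because the probabilities sum to 1. The variable names are illustrative and are not part of cogent3.

```python
# Counts read off the fitted-model output shown above.
# Assumption: each item below contributes one free parameter.
rate_terms = [
    "A>C", "A>G", "A>T", "C>A", "C>G", "C>T",
    "G>A", "G>C", "G>T", "T>A", "T>C",
]  # 11 of the 12 directed nucleotide changes; T>G is not listed

num_rate_params = len(rate_terms)  # 11
num_omega = 1                      # the single global omega
num_edges = 11                     # rows in the "Edge params" table
num_codons = 61                    # columns in the "Motif params" table
num_free_motif_probs = num_codons - 1  # probabilities sum to 1

total = num_rate_params + num_omega + num_edges + num_free_motif_probs
print(total)  # 83, matching "number of free parameters = 83"
```

Under these assumptions the total matches the value reported in the output.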

We can obtain the tree with branch lengths expressed as ENS (expected number of substitutions).

If this tree is written to Newick format (using the write() method), the branch lengths will be in ENS.

[4]:
# Tree from the fitted model
tree = result.tree
# Draw the tree and place a scale bar in the top right corner
fig = tree.get_figure()
fig.scale_bar = "top right"
fig.show(width=500, height=500)
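
To illustrate what a Newick string carrying these branch lengths might look like, the sketch below assembles one by hand from the "Edge params" table above. It is plain Python and does not use cogent3's own writer; the helper `to_newick` is hypothetical, and labelling the internal nodes in the string is an assumption, so the file produced by the write() method may differ in detail.

```python
# (edge, parent, length) rows copied from the "Edge params" table above.
edges = [
    ("Galago", "root", 0.5233),
    ("HowlerMon", "root", 0.1331),
    ("Rhesus", "edge.3", 0.0639),
    ("Orangutan", "edge.2", 0.0234),
    ("Gorilla", "edge.1", 0.0075),
    ("Human", "edge.0", 0.0182),
    ("Chimpanzee", "edge.0", 0.0085),
    ("edge.0", "edge.1", 0.0000),
    ("edge.1", "edge.2", 0.0100),
    ("edge.2", "edge.3", 0.0368),
    ("edge.3", "root", 0.0246),
]


def to_newick(edges, root="root"):
    """Hypothetical helper: build a Newick string from (edge, parent, length) rows."""
    children = {}
    lengths = {}
    for name, parent, length in edges:
        children.setdefault(parent, []).append(name)
        lengths[name] = length

    def render(node):
        kids = children.get(node, [])
        text = node
        if kids:
            # Internal node: render its children inside parentheses
            text = "(" + ",".join(render(k) for k in kids) + ")" + node
        if node in lengths:
            # Every node except the root has a branch length
            text += f":{lengths[node]:.4f}"
        return text

    return render(root) + ";"


newick = to_newick(edges)
total_length = sum(length for _, _, length in edges)
print(newick)
print(round(total_length, 4))
```

The second printed value is the sum of all branch lengths, which gives the total tree length in the same units as the table.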