Apply a non-stationary nucleotide model to an alignment with a tree

We analyse an alignment with sequences from 7 primates.

[1]:

from cogent3.app import io

reader = io.load_aligned(format="fasta", moltype="dna")
aln = reader("../data/primate_brca1.fasta")
aln.names

[1]:

['Chimpanzee',
 'Galago',
 'Gorilla',
 'HowlerMon',
 'Human',
 'Orangutan',
 'Rhesus']

Specify the tree via a tree instance

[2]:

from cogent3 import load_tree
from cogent3.app import evo

tree = load_tree("../data/primate_brca1.tree")
gn = evo.model("GN", tree=tree)
gn

[2]:

model(type='model', sm='GN', tree='root', name=None, sm_args=None, lf_args=None, time_het=None, param_rules=None, opt_args=None, split_codons=False, show_progress=False, verbose=False)

Specify the tree via a path

[3]:

gn = evo.model("GN", tree="../data/primate_brca1.tree")
gn

[3]:

model(type='model', sm='GN', tree='../data/primate_brca1.tree', name=None, sm_args=None, lf_args=None, time_het=None, param_rules=None, opt_args=None, split_codons=False, show_progress=False, verbose=False)

Apply the model to an alignment

[4]:

fitted = gn(aln)
fitted

[4]:

GN
key	lnL	nfp	DLC	unique_Q
	-6987.8834	25	True

In the table above, the unique_Q column is empty. This can happen because of numerical precision issues.

NOTE: in the display of the lf below, the “length” parameter is not the expected number of substitutions (ENS). It is, instead, just a scalar.
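To illustrate why a scalar length and the ENS can differ under a non-stationary process, here is a minimal sketch in pure Python. It does not use cogent3 and is not meant to represent how cogent3 computes the ENS. The function name expected_substitutions, the two-state rate matrix and the starting distributions are all hypothetical. The sketch integrates the substitution flux, minus the sum over states of pi_i(s) * q_ii, along a branch, where the state distribution pi(s) evolves from its starting value according to d(pi)/ds = pi Q.

```python
def expected_substitutions(Q, pi0, t, steps=1000):
    """Expected number of substitutions over time t (illustrative only).

    Q     : rate matrix as a list of rows, each row summing to zero
    pi0   : state distribution at the start of the branch
    t     : elapsed time (the scalar that multiplies Q)
    steps : number of integration steps
    """
    n = len(pi0)

    def deriv(pi):
        # d(pi)/ds = pi Q
        return [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]

    def flux(pi):
        # instantaneous substitution rate: -sum_i pi_i * q_ii
        return -sum(pi[i] * Q[i][i] for i in range(n))

    h = t / steps
    pi = list(pi0)
    total = 0.0
    for _ in range(steps):
        # fourth-order Runge-Kutta step for the state distribution
        k1 = deriv(pi)
        k2 = deriv([p + 0.5 * h * k for p, k in zip(pi, k1)])
        k3 = deriv([p + 0.5 * h * k for p, k in zip(pi, k2)])
        k4 = deriv([p + h * k for p, k in zip(pi, k3)])
        new_pi = [
            p + h * (a + 2 * b + 2 * c + d) / 6
            for p, a, b, c, d in zip(pi, k1, k2, k3, k4)
        ]
        # trapezoid rule for the integral of the flux
        total += 0.5 * h * (flux(pi) + flux(new_pi))
        pi = new_pi
    return total


# Hypothetical two-state process, not taken from the fitted GN model
Q = [[-1.0, 1.0], [3.0, -3.0]]
stationary = [0.75, 0.25]  # satisfies pi Q = 0 for this Q
non_stationary = [0.25, 0.75]

ens_stationary = expected_substitutions(Q, stationary, t=0.5)
ens_nonstationary = expected_substitutions(Q, non_stationary, t=0.5)
print(ens_stationary, ens_nonstationary)
```

When the branch starts at the stationary distribution, the flux is constant and the ENS is simply the elapsed time multiplied by that flux. When it starts elsewhere, the same scalar yields a different ENS, which is why the two quantities should not be read interchangeably.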

[5]:

fitted.lf

[5]:

GN

log-likelihood = -6987.8834

number of free parameters = 25

Global params
A>C	A>G	A>T	C>A	C>G	C>T	G>A	G>C	G>T	T>A	T>C
0.8700	3.6669	0.9111	1.5925	2.1264	6.0323	8.2178	1.2288	0.6294	1.2498	3.4136

Edge params
edge	parent	length
Galago	root	0.1735
HowlerMon	root	0.0450
Rhesus	edge.3	0.0215
Orangutan	edge.2	0.0078
Gorilla	edge.1	0.0025
Human	edge.0	0.0061
Chimpanzee	edge.0	0.0028
edge.0	edge.1	0.0000
edge.1	edge.2	0.0033
edge.2	edge.3	0.0121
edge.3	root	0.0077

Motif params
A	C	G	T
0.3756	0.1768	0.2078	0.2398
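The reported number of free parameters can be cross-checked against the displayed tables. The sketch below uses values transcribed from the output above and one accounting that is consistent with the reported total: 11 rate terms (of the 12 possible exchanges, T>G is absent from the display, presumably serving as the reference), 11 edge lengths, and 3 free motif probabilities, on the assumption that the four probabilities are constrained to sum to 1. The variable names are illustrative and not part of cogent3.

```python
# Values transcribed from the displayed likelihood function output
rate_params = [
    "A>C", "A>G", "A>T", "C>A", "C>G", "C>T",
    "G>A", "G>C", "G>T", "T>A", "T>C",
]
edge_lengths = {
    "Galago": 0.1735,
    "HowlerMon": 0.0450,
    "Rhesus": 0.0215,
    "Orangutan": 0.0078,
    "Gorilla": 0.0025,
    "Human": 0.0061,
    "Chimpanzee": 0.0028,
    "edge.0": 0.0000,
    "edge.1": 0.0033,
    "edge.2": 0.0121,
    "edge.3": 0.0077,
}
motif_probs = {"A": 0.3756, "C": 0.1768, "G": 0.2078, "T": 0.2398}

n_rates = len(rate_params)      # one free parameter per displayed rate term
n_lengths = len(edge_lengths)   # one length per edge
n_motif = len(motif_probs) - 1  # assumed sum-to-one constraint removes one

nfp = n_rates + n_lengths + n_motif
print(nfp)
```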