Testing a hypothesis – non-stationary or time-reversible

We test whether the time-reversible GTR model is sufficient for a data set by comparing it against GN, the non-stationary general nucleotide model.

[1]:
from cogent3.app import io, evo

loader = io.load_aligned(format="fasta", moltype="dna")
aln = loader("../data/primate_brca1.fasta")
[2]:
tree = "../data/primate_brca1.tree"
sm_args = dict(optimise_motif_probs=True)

null = evo.model("GTR", tree=tree, sm_args=sm_args)
alt = evo.model("GN", tree=tree, sm_args=sm_args)
hyp = evo.hypothesis(null, alt)
result = hyp(aln)
type(result)
[2]:
cogent3.app.result.hypothesis_result

result is a hypothesis_result object. The repr() displays the likelihood ratio test statistic, degrees of freedom and associated p-value.

[3]:
result
[3]:
Statistics
LR      df  pvalue
9.3813  6   0.1532

hypothesis  key    lnL         nfp  DLC   unique_Q
null        'GTR'  -6992.5741  19   True
alt         'GN'   -6987.8834  25   True

In this case, we fail to reject the null because the p-value (0.1532) is > 0.05. We still use this object to demonstrate the properties of a hypothesis_result.

hypothesis_result has attributes and keys

Accessing the test statistics

[4]:
result.LR, result.df, result.pvalue
[4]:
(9.381277657315877, 6, 0.15324334527517805)
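The statistics above can be cross-checked by hand from the values reported in the tables. The following is a minimal sketch using only the Python standard library, not cogent3's own implementation: LR is twice the difference in log-likelihoods, df is the difference in the number of free parameters, and the p-value is the upper tail of a chi-square distribution. The helper chi2_sf_even_df is a hypothetical name introduced here, and its closed form only applies when df is even, as it is in this case. Because the inputs are the rounded lnL values, the result should agree with the reported statistics to about three decimal places.

```python
import math

# Values copied from the hypothesis_result tables above (rounded to 4 d.p.).
lnL_null, nfp_null = -6992.5741, 19  # GTR
lnL_alt, nfp_alt = -6987.8834, 25  # GN


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    # sf(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    terms = (half**i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


LR = 2 * (lnL_alt - lnL_null)  # likelihood ratio statistic
df = nfp_alt - nfp_null  # extra free parameters in the alternate
pvalue = chi2_sf_even_df(LR, df)

print(f"LR={LR:.4f} df={df} pvalue={pvalue:.4f}")
```

For odd df, a general chi-square survival function (for example from a statistics library) would be needed instead of this closed form.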

The null hypothesis

This model is accessed via the null attribute.

[5]:
result.null
[5]:
GTR
key  lnL         nfp  DLC   unique_Q
     -6992.5741  19   True
[6]:
result.null.lf
[6]:

GTR

log-likelihood = -6992.5741

number of free parameters = 19

Global params
A/C A/G A/T C/G C/T
1.2296 5.2478 0.9472 2.3389 5.9666
Edge params
edge parent length
Galago root 0.1727
HowlerMon root 0.0448
Rhesus edge.3 0.0215
Orangutan edge.2 0.0077
Gorilla edge.1 0.0025
Human edge.0 0.0060
Chimpanzee edge.0 0.0028
edge.0 edge.1 0.0000
edge.1 edge.2 0.0034
edge.2 edge.3 0.0119
edge.3 root 0.0076
Motif params
A C G T
0.3792 0.1719 0.2066 0.2423

The alternate hypothesis

[7]:
result.alt.lf
[7]:

GN

log-likelihood = -6987.8834

number of free parameters = 25

Global params
A>C     A>G     A>T     C>A     C>G     C>T     G>A     G>C     G>T     T>A     T>C
0.8700  3.6670  0.9111  1.5925  2.1264  6.0324  8.2178  1.2288  0.6294  1.2499  3.4136
Edge params
edge parent length
Galago root 0.1735
HowlerMon root 0.0450
Rhesus edge.3 0.0215
Orangutan edge.2 0.0078
Gorilla edge.1 0.0025
Human edge.0 0.0061
Chimpanzee edge.0 0.0028
edge.0 edge.1 0.0000
edge.1 edge.2 0.0033
edge.2 edge.3 0.0121
edge.3 root 0.0077
Motif params
A C G T
0.3756 0.1768 0.2078 0.2398
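The number of free parameters reported for each model, and hence the 6 degrees of freedom of the test, can be accounted for by counting the entries in the two likelihood function displays. The sketch below is an illustrative tally, not cogent3 output. It assumes that each listed rate term and each edge length is one free parameter, and that the four motif probabilities contribute three free parameters because they sum to 1 (they are free here because optimise_motif_probs=True was set). The variable names are introduced purely for illustration.

```python
# Counts read from the "Global params", "Edge params" and "Motif params"
# tables above; this breakdown is an illustration, not library output.
n_edges = 11  # one length per row of the "Edge params" table
n_free_motif = 4 - 1  # four nucleotide probabilities constrained to sum to 1

gtr_rate_terms = 5  # A/C, A/G, A/T, C/G, C/T (G/T is not listed)
gn_rate_terms = 11  # eleven directional terms (T>G is not listed)

nfp_gtr = gtr_rate_terms + n_edges + n_free_motif
nfp_gn = gn_rate_terms + n_edges + n_free_motif

print(nfp_gtr, nfp_gn, nfp_gn - nfp_gtr)
```

The two models differ only in their rate terms, so the difference in free parameters comes entirely from the six additional directional rates in GN.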

Saving hypothesis results

We recommend saving these results as JSON, using either the standard json writer or the db writer.

The following would write the result to a tinydb.

from cogent3.app.io import write_db

writer = write_db("path/to/myresults.tinydb", create=True, if_exists="overwrite")
writer(result)