Likelihood analysis of multiple loci¶
Section author: Gavin Huttley
We want to know whether an exchangeability parameter is different between alignments. We will specify a null model, under which each alignment get’s it’s own motif probabilities and all alignments share branch lengths and the exchangeability parameter kappa (the transition / transversion ratio). We’ll split the example alignment into two-pieces.
>>> from cogent3 import load_aligned_seqs, make_tree, make_table
>>> from cogent3.evolve.models import HKY85
>>> from cogent3.recalculation.scope import EACH, ALL
>>> from cogent3.maths.stats import chisqprob
>>> aln = load_aligned_seqs("data/long_testseqs.fasta")
>>> half = len(aln)//2
>>> aln1 = aln[:half]
>>> aln2 = aln[half:]
We provide names for those alignments, then construct the tree, model instances.
>>> loci_names = ["1st-half", "2nd-half"]
>>> loci = [aln1, aln2]
>>> tree = make_tree(tip_names=aln.names)
>>> mod = HKY85()
To make a likelihood function with multiple alignments we provide the list of loci names. We can then specify a parameter (other than length) to be the same across the loci (using the imported ALL
) or different for each locus (using EACH
). We conduct a LR test as before.
>>> lf = mod.make_likelihood_function(tree,loci=loci_names,digits=2,space=3)
>>> lf.set_param_rule("length", is_independent=False)
>>> lf.set_param_rule("kappa", loci=ALL)
>>> lf.set_alignment(loci)
>>> lf.optimise(local=True)
>>> print(lf)
Likelihood function statistics
log-likelihood = -9168.3331
number of free parameters = 2
==============
kappa length
--------------
3.98 0.13
--------------
====================================
locus A C G T
------------------------------------
1st-half 0.38 0.18 0.21 0.22
2nd-half 0.35 0.19 0.22 0.24
------------------------------------
>>> all_lnL = lf.lnL
>>> all_nfp = lf.nfp
>>> lf.set_param_rule('kappa', loci=EACH)
>>> lf.optimise(local=True, show_progress=False)
>>> print(lf)
Likelihood function statistics
log-likelihood = -9167.5373
number of free parameters = 3
======
length
------
0.13
------
================
locus kappa
----------------
1st-half 4.33
2nd-half 3.74
----------------
====================================
locus A C G T
------------------------------------
2nd-half 0.35 0.19 0.22 0.24
1st-half 0.38 0.18 0.21 0.22
------------------------------------
>>> each_lnL = lf.lnL
>>> each_nfp = lf.nfp
>>> LR = 2 * (each_lnL - all_lnL)
>>> df = each_nfp - all_nfp
Just to pretty up the result display, I’ll print(a table consisting of the test statistics created on the fly.)
>>> print(make_table(header=['LR', 'df', 'p'],
... rows=[[LR, df, chisqprob(LR, df)]], digits=2, space=3))
================
LR df p
----------------
1.59 1 0.21
----------------