Likelihood analysis of multiple loci
====================================

.. sectionauthor:: Gavin Huttley

We want to know whether an exchangeability parameter is different between alignments. We will specify a null model, under which each alignment get's it's own motif probabilities and all alignments share branch lengths and the exchangeability parameter kappa (the transition / transversion ratio). We'll split the example alignment into two-pieces.

.. doctest::

    >>> from cogent3 import load_aligned_seqs, make_tree, make_table
    >>> from cogent3.evolve.models import HKY85
    >>> from cogent3.recalculation.scope import EACH, ALL
    >>> from cogent3.maths.stats import chisqprob
    >>> aln = load_aligned_seqs("data/long_testseqs.fasta")
    >>> half = len(aln)//2
    >>> aln1 = aln[:half]
    >>> aln2 = aln[half:]

We provide names for those alignments, then construct the tree, model instances.

.. doctest::

    >>> loci_names = ["1st-half", "2nd-half"]
    >>> loci = [aln1, aln2]
    >>> tree = make_tree(tip_names=aln.names)
    >>> mod = HKY85()

To make a likelihood function with multiple alignments we provide the list of loci names. We can then specify a parameter (other than length) to be the same across the loci (using the imported ``ALL``) or different for each locus (using ``EACH``). We conduct a LR test as before.

.. doctest::

    >>> lf = mod.make_likelihood_function(tree,loci=loci_names,digits=2,space=3)
    >>> lf.set_param_rule("length", is_independent=False)
    >>> lf.set_param_rule("kappa", loci=ALL)
    >>> lf.set_alignment(loci)
    >>> lf.optimise(local=True)
    >>> print(lf)  # doctest: +SKIP
    Likelihood function statistics
    log-likelihood = -9168.3331
    number of free parameters = 2
    ==============
    kappa   length
    --------------
     3.98     0.13
    --------------
    ====================================
       locus      A      C      G      T
    ------------------------------------
    1st-half   0.38   0.18   0.21   0.22
    2nd-half   0.35   0.19   0.22   0.24
    ------------------------------------
    >>> all_lnL = lf.lnL
    >>> all_nfp = lf.nfp
    >>> lf.set_param_rule('kappa', loci=EACH)
    >>> lf.optimise(local=True, show_progress=False)
    >>> print(lf)  # doctest: +SKIP
    Likelihood function statistics
    log-likelihood = -9167.5373
    number of free parameters = 3
    ======
    length
    ------
      0.13
    ------
    ================
       locus   kappa
    ----------------
    1st-half    4.33
    2nd-half    3.74
    ----------------
    ====================================
       locus      A      C      G      T
    ------------------------------------
    2nd-half   0.35   0.19   0.22   0.24
    1st-half   0.38   0.18   0.21   0.22
    ------------------------------------
    >>> each_lnL = lf.lnL
    >>> each_nfp = lf.nfp
    >>> LR = 2 * (each_lnL - all_lnL)
    >>> df = each_nfp - all_nfp

Just to pretty up the result display, I'll print(a table consisting of the test statistics created on the fly.)

    >>> print(make_table(header=['LR', 'df', 'p'],
    ...             rows=[[LR, df, chisqprob(LR, df)]], digits=2, space=3))
    ================
      LR   df      p
    ----------------
    1.59    1   0.21
    ----------------