{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Available genetic codes" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Code ID | \n", "Name | \n", "\n", "\n", "
---|---|
1 | \n", "Standard Nuclear | \n", "
2 | \n", "Vertebrate Mitochondrial | \n", "
3 | \n", "Yeast Mitochondrial | \n", "
4 | \n", "Mold, Protozoan, and Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma Nuclear | \n", "
5 | \n", "Invertebrate Mitochondrial | \n", "
6 | \n", "Ciliate, Dasycladacean and Hexamita Nuclear | \n", "
9 | \n", "Echinoderm and Flatworm Mitochondrial | \n", "
10 | \n", "Euplotid Nuclear | \n", "
11 | \n", "Bacterial Nuclear and Plant Plastid | \n", "
12 | \n", "Alternative Yeast Nuclear | \n", "
13 | \n", "Ascidian Mitochondrial | \n", "
14 | \n", "Alternative Flatworm Mitochondrial | \n", "
15 | \n", "Blepharisma Nuclear | \n", "
16 | \n", "Chlorophycean Mitochondrial | \n", "
20 | \n", "Trematode Mitochondrial | \n", "
22 | \n", "Scenedesmus obliquus Mitochondrial | \n", "
23 | \n", "Thraustochytrium Mitochondrial | \n", "
\n", "17 rows x 2 columns
" ], "text/plain": [ "Specify a genetic code using either 'Name' or Code ID (as an integer or string)\n", "==============================================================================================\n", "Code ID Name\n", "----------------------------------------------------------------------------------------------\n", " 1 Standard Nuclear\n", " 2 Vertebrate Mitochondrial\n", " 3 Yeast Mitochondrial\n", " 4 Mold, Protozoan, and Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma Nuclear\n", " 5 Invertebrate Mitochondrial\n", " 6 Ciliate, Dasycladacean and Hexamita Nuclear\n", " 9 Echinoderm and Flatworm Mitochondrial\n", " 10 Euplotid Nuclear\n", " 11 Bacterial Nuclear and Plant Plastid\n", " 12 Alternative Yeast Nuclear\n", " 13 Ascidian Mitochondrial\n", " 14 Alternative Flatworm Mitochondrial\n", " 15 Blepharisma Nuclear\n", " 16 Chlorophycean Mitochondrial\n", " 20 Trematode Mitochondrial\n", " 22 Scenedesmus obliquus Mitochondrial\n", " 23 Thraustochytrium Mitochondrial\n", "----------------------------------------------------------------------------------------------\n", "\n", "17 rows x 2 columns" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from cogent3 import available_codes\n", "\n", "available_codes()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In cases where a ``cogent3`` object method has a `gc` argument, you can just use the number under \"Code ID\" column.\n", "\n", "For example:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "0 | |
TombBat | TGTGGCACAAGTACTCATGCC |
FlyingFox | ..........A.G........ |
DogFaced | ..........A.......... |
FreeTaile | .........GA.......... |
LittleBro | .........GA.......... |
5 x 21 dna alignment
\n", "" ], "text/plain": [ "5 x 21 dna alignment: FlyingFox[TGTGGCACAAA...], DogFaced[TGTGGCACAAA...], FreeTaile[TGTGGCACAGA...], ..." ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from cogent3 import load_aligned_seqs\n", "\n", "nt_seqs = load_aligned_seqs(\"../data/brca1-bats.fasta\", moltype=\"dna\")\n", "nt_seqs[:21]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We specify the genetic code and set `incomplete_ok=True`, so that codons that are incomplete because they contain a gap are converted to `?`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "0 | |
TombBat | CGTSTHASSVQHENSSLLLT |
FlyingFox | ...NA....L....-...Y. |
DogFaced | ...N...N.L........Y. |
FreeTaile | ...D.....L.......... |
LittleBro | ...D.....L.......... |
5 x 20 protein alignment
\n", "" ], "text/plain": [ "5 x 20 protein alignment: FlyingFox[CGTNAHASSLQ...], DogFaced[CGTNTHANSLQ...], FreeTaile[CGTDTHASSLQ...], ..." ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "aa_seqs = nt_seqs.get_translation(gc=1, incomplete_ok=True)\n", "aa_seqs[:20]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting a genetic code with `get_code()`\n", "\n", "This function can be used directly to get a genetic code. We will get the code with ID 4." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "aa | \n", "IUPAC code | \n", "codons | \n", "\n", "\n", "
---|---|---|
Alanine | \n", "A | \n", "GCT,GCC,GCA,GCG | \n", "
Cysteine | \n", "C | \n", "TGT,TGC | \n", "
Aspartic Acid | \n", "D | \n", "GAT,GAC | \n", "
Glutamic Acid | \n", "E | \n", "GAA,GAG | \n", "
Phenylalanine | \n", "F | \n", "TTT,TTC | \n", "
Glycine | \n", "G | \n", "GGT,GGC,GGA,GGG | \n", "
Histidine | \n", "H | \n", "CAT,CAC | \n", "
Isoleucine | \n", "I | \n", "ATT,ATC,ATA | \n", "
Lysine | \n", "K | \n", "AAA,AAG | \n", "
Leucine | \n", "L | \n", "TTA,TTG,CTT,CTC,CTA,CTG | \n", "
Methionine | \n", "M | \n", "ATG | \n", "
Asparagine | \n", "N | \n", "AAT,AAC | \n", "
Proline | \n", "P | \n", "CCT,CCC,CCA,CCG | \n", "
Glutamine | \n", "Q | \n", "CAA,CAG | \n", "
Arginine | \n", "R | \n", "CGT,CGC,CGA,CGG,AGA,AGG | \n", "
Serine | \n", "S | \n", "TCT,TCC,TCA,TCG,AGT,AGC | \n", "
Threonine | \n", "T | \n", "ACT,ACC,ACA,ACG | \n", "
Valine | \n", "V | \n", "GTT,GTC,GTA,GTG | \n", "
Tryptophan | \n", "W | \n", "TGA,TGG | \n", "
Tyrosine | \n", "Y | \n", "TAT,TAC | \n", "
STOP | \n", "* | \n", "TAA,TAG | \n", "