Team:Yale/Modeling
From 2012.igem.org
Revision as of 23:52, 26 October 2012
Modeling the evolution of a population during MAGE
The distribution of specific mutations in MAGE is a stochastic process that we model as a function of each oligo's predicted efficiency of allelic replacement (which can be estimated in E. coli as discussed in "Programming cells by multiplex genome engineering and accelerated evolution," Wang & Isaacs et al. 2009), assuming that each mutation event is binary and exclusive. Then the number of mutations in a member of the population after c cycles is a weighted sum of n Bernoulli trials, one for each oligo i: each trial is zero if the oligo does not mutate its target and otherwise equals the number r of mutations it induces. Given efficiencies of allelic replacement p, the probability mass function (PMF) becomes:
[Equations: images Eqns1.png and Eqns2.png]
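One way to write a PMF consistent with this description is sketched below. It is offered as an illustration, not as the team's own equations. The symbols X_i, q_i, K and S are introduced here and may differ from the team's notation, and the relation q_i = 1 - (1 - p_i)^c is our assumption for the chance that oligo i has replaced its target at least once in c cycles, given a per-cycle efficiency p_i.

```latex
K = \sum_{i=1}^{n} r_i X_i, \qquad
X_i \sim \mathrm{Bernoulli}(q_i), \qquad
q_i = 1 - (1 - p_i)^{c}

\Pr(K = k) =
  \sum_{\substack{S \subseteq \{1,\dots,n\} \\ \sum_{i \in S} r_i = k}}
  \; \prod_{i \in S} q_i \prod_{j \notin S} (1 - q_j)
```

When every r_i equals 1 and every q_i equals a common q, this reduces to the binomial PMF. The constraint that the r_i in S sum to k is where the subset sum problem enters.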
In doing this, we have derived a more general form of the binomial distribution. Computing this PMF naively involves solving the subset sum problem, so we optimized our algorithm to avoid slowdowns: for the occasional, simpler case in which all oligos carry the same number of mutations, we use a recursive formula (Wadycki, Shah et al. 1973), and in all other cases a branched dynamic programming algorithm (Horowitz and Sahni 1974).
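A minimal Python sketch of this kind of computation follows. It uses a plain dynamic-programming convolution over the oligos, which is a simpler stand-in and not the recursive formula or the Horowitz and Sahni algorithm cited above. The function names, the argument layout, and the `after_cycles` relation (1 - (1 - p)^c) are assumptions made for illustration.

```python
from typing import List, Sequence


def after_cycles(p: float, c: int) -> float:
    """Assumed chance that an oligo with per-cycle efficiency p has
    replaced its target at least once in c cycles (illustrative only)."""
    return 1.0 - (1.0 - p) ** c


def mutation_count_pmf(probs: Sequence[float], weights: Sequence[int]) -> List[float]:
    """PMF of K = sum_i r_i * X_i with independent X_i ~ Bernoulli(q_i).

    probs   -- q_i, success probability for each oligo
    weights -- r_i, non-negative integer number of mutations per oligo
    Returns a list where entry k is Pr(K = k).
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    if any(r < 0 or int(r) != r for r in weights):
        raise ValueError("weights must be non-negative integers")

    pmf = [1.0]  # before any oligo is considered, K = 0 with certainty
    for q, r in zip(probs, weights):
        updated = [0.0] * (len(pmf) + r)
        for k, mass in enumerate(pmf):
            updated[k] += mass * (1.0 - q)  # oligo does not mutate its target
            updated[k + r] += mass * q      # oligo contributes r mutations
        pmf = updated
    return pmf


if __name__ == "__main__":
    # Two hypothetical oligos carrying 1 and 2 mutations, 5 cycles each.
    q = [after_cycles(0.2, 5), after_cycles(0.1, 5)]
    print(mutation_count_pmf(q, [1, 2]))
```

The running time is proportional to n times the total number of mutations carried by the pool, because each oligo is folded into the running distribution once and no subsets are enumerated.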
Survey for off-target binding sites
Not all MAGE-induced mutations will be at the intended sites. To identify likely unintended mutations, we scripted a search of the genome that uses BLAST+ to find subsequences of four or more base pairs that match oligos in the MAGE oligo pool, and that estimates the likely change in Gibbs free energy upon hybridization at each such off-target pairing using the UNAFold software package.
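The seeding idea behind such a search can be sketched in pure Python. This is a simplified stand-in and not the team's script: it finds exact shared k-mers on both strands with a dictionary index, does not call BLAST+, and does not estimate hybridization energy (the step the text assigns to UNAFold). The function names and the default k = 4 are assumptions for illustration.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def seed_matches(genome: str, oligo: str, k: int = 4) -> List[Tuple[int, int, str]]:
    """Find every exact k-mer shared between the genome and the oligo.

    Returns (genome_pos, query_pos, strand) tuples. For strand "-",
    query_pos is the offset within the oligo's reverse complement.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Index every k-mer of the genome by its start position.
    index: Dict[str, List[int]] = {}
    for i in range(len(genome) - k + 1):
        index.setdefault(genome[i:i + k], []).append(i)

    hits: List[Tuple[int, int, str]] = []
    for strand, query in (("+", oligo), ("-", reverse_complement(oligo))):
        for j in range(len(query) - k + 1):
            for i in index.get(query[j:j + k], ()):
                hits.append((i, j, strand))
    return hits


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    print(seed_matches("GGGATTACAGGG", "TTAC"))
```

Each candidate site found this way would still need to be extended, filtered against the intended target, and scored for hybridization energy before it could be called a likely off-target site.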
Both of these scripts will be bundled into a cloud-based tool for genomic engineering (unpublished work).