

iGEM Paris Bettencourt 2012

Semantic containment


  • Create a semantic containment system to prevent gene expression in natural organisms
  • Characterize the system
  • Apply this system to all genes of our construct, starting with the critical genes (e.g. colicin)


  • An amber codon (stop codon) embedded in protein genes to prevent their expression and an amber suppressor system in our genetically engineered bacteria

Achievements :

  • Construction and characterization of 2 biobricks :
    • K914000 : PLac-supD-T : tRNA amber suppressor
    • K914009 : P1003* Ser133->Amber Codon : kanamycin resistance gene with 1 amber mutation

Both parts were well characterized and work well. For the second part, we show that, as expected, a single mutation is quite leaky: it works qualitatively, but one mutation is not enough if we want to release such parts in nature. Another reason supports this conclusion, notably the weakness of being only one mutation away from recovering protein function.

  • Creation of a new category in the part registry : Semantic containment. The aim of this category is to let people improve each part, for instance by adding further amber mutations to an existing part to increase its containment.

Achievements :

  • Construction and characterization of 1 biobrick :
    • K914018 : P1003** Ser133 & Ser203 ->Amber Codon : kanamycin resistance gene with 2 amber mutations
  • Construction of 1 plasmid backbone :
    • K914012 : pSB1A2 with one Amber Codon : ampicillin resistance gene with 1 amber mutation



We want to prevent our genetic constructs from conferring an advantage on other organisms or altering them phenotypically. Minimizing horizontal gene transfer (HGT), whether by conjugation, transduction, or transformation, is thus our main concern. As these processes involve two parties, the genetically modified bacteria and wild type partners, and as we will not modify wild type populations, we cannot assume that HGT is fully avoidable. Semantic containment[1] means that our bacteria will not be able to "speak" with other organisms, since they do not speak the same "language", the language being DNA. Our system reads the stop codon TAG as the amino acid serine: in our bacteria this codon is translated into a serine, whereas in wild type bacteria the protein is truncated and confers no advantage on the cell. The 'TAG' codon was chosen because of its low frequency in the E. coli genome (314 occurrences), and because, with further applications in mind, the Church Lab is working to remove all amber codons from an E. coli strain[2]. In addition, it has been demonstrated that over-expression of a tRNA amber suppressor alone affects neither the growth rate nor the morphology of E. coli[3].
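The principle can be illustrated with a minimal Python sketch. The gene sequence below is hypothetical, the codon table only covers the codons used in the example, and suppression is assumed to be 100% efficient, which a real suppressor tRNA is not.

```python
# Toy codon table: only the codons used in this example (standard genetic code).
CODON_TABLE = {
    "ATG": "M", "AAA": "K", "TCG": "S", "GAA": "E",
    "TAG": "*",  # amber stop codon
    "TAA": "*",  # ochre stop codon
}

def translate(dna, amber_suppressor=False):
    """Translate dna codon by codon, stopping at the first stop codon.

    With amber_suppressor=True, TAG is read as serine, as a supD-like
    suppressor tRNA would do (assumption: perfect suppression).
    """
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and amber_suppressor:
            protein.append("S")
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical contained gene: a serine codon (TCG) was replaced by TAG.
contained_gene = "ATGAAATAGGAAAAATAA"
print(translate(contained_gene))                         # wild type host: MK
print(translate(contained_gene, amber_suppressor=True))  # engineered host: MKSEK
```

In the wild type reading the protein is truncated at the amber codon, while the suppressor-carrying host produces the full-length product.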

Figure 1 : Semantic containment principle. The information carried by the DNA cannot be read by other organisms.


We want to create a new way to contain genes semantically: an amino-acid codon is replaced by a stop codon (the amber codon, 'TAG'), and in our synthetic cell this stop codon is read as the original amino acid. Here we want to show that this system works as expected. First, we had to choose between the two tRNA amber suppressors available in the part registry, which insert either serine or tyrosine. To do so, we calculated how readily the amber codon can revert to serine, to tyrosine, or to a related amino acid that could conserve the function. Second, we created a biobrick with a tRNA amber suppressor (BBa_K914000) and characterized it, in order to have a reliable part. Third, to test this biobrick, we built a biobrick consisting of a kanamycin resistance gene (P1003) in which the codon for serine 133 is replaced by an amber codon (BBa_K914009).

K914000 is the construct PLac-supD-T and is named supD in the rest of the page. K914009 is the P1003 gene (kanamycin resistance gene) in which the serine 133 codon is replaced by an amber codon, 'TAG'. It is named Kan*.

With more time we will try to increase the robustness of this system, which currently offers no protection when the tRNA amber suppressor is transferred as well. We will try to create a new library of plasmid backbones in the part registry, in which every backbone has at least two amber mutations. The idea is that the whole community will be able to improve this library, by adding new contained backbones, by adding amber mutations to an existing backbone, or by adding semantic containment to any other gene.


How to choose between serine and tyrosine?

One mutation away from the 'TAG' codon there is one serine codon and two tyrosine codons (among others). Even if serine seems more interesting from this simple observation, we still want to know which of these two amino acids is less likely to be recovered by mutation of a 'TAG' codon (amber codon), taking the similarity between amino acids into account. We calculated a weakness score. The weakness of an amino acid is defined here as its inability to be restored, either as the same amino acid or as a similar one, by mutation of the amber codon. The score is calculated using the following formula :

Here i is one of the nine codons accessible from 'TAG' after one mutation. Subst(AA,AAi) is the similarity score, taken from a BLOSUM100 matrix, between serine or tyrosine (AA) and the residue encoded by one of these nine neighbouring codons (AAi). The lowest score is the weakest. The BLOSUM100 (BLOcks SUbstitution Matrix) is constructed from local alignments of sequences that are less than 100% identical[4,5], and is consequently well suited to assessing the effect of a single mutation.
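The idea behind the score can be sketched in Python as follows. This is a minimal sketch and not the team's script: it assumes the score is the mutation rate times the average similarity over the nine single-nucleotide neighbours of 'TAG', and it uses a placeholder similarity function (`toy_subst`, a hypothetical name) instead of the real BLOSUM100 values. The stop codon TAA is one of the nine neighbours and is simply treated as a mismatch here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def single_mutants(codon):
    """Return the nine codons that differ from `codon` at exactly one position."""
    return [codon[:pos] + base + codon[pos + 1:]
            for pos in range(3) for base in BASES if base != codon[pos]]

def weakness(aa, subst, mutation_rate=1e-9, start="TAG"):
    """Assumed form of the score: mutation rate times the mean similarity
    between `aa` and the residues encoded by the single mutants of `start`."""
    residues = [CODON_TABLE[c] for c in single_mutants(start)]
    return mutation_rate * sum(subst(aa, r) for r in residues) / len(residues)

def toy_subst(a, b):
    """Placeholder similarity (NOT BLOSUM100): +1 for identity, -1 otherwise."""
    return 1 if a == b else -1

neighbours = sorted(CODON_TABLE[c] for c in single_mutants("TAG"))
print(neighbours)  # one serine, two tyrosines, and six other outcomes
print(weakness("S", toy_subst), weakness("Y", toy_subst))
```

With this placeholder scoring, serine comes out weaker than tyrosine only because 'TAG' has two tyrosine neighbours and one serine neighbour; reproducing the reported values would require the actual BLOSUM100 entries and the original formula.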

Weakness calculation

We wrote a Python script that calculates the weakness score for all amino acids (not only serine and tyrosine) and sorts the list of scores. Data for the other amino acids are shown in the appendix.

The scores for serine and tyrosine, with a mutation rate of 10⁻⁹, are:

  • BLOSUM100 :
    • Serine : ScoreSW = -3.33e-09
    • Tyrosine : ScoreYW = -1.00e-09

The amber codon is therefore less likely to revert to serine, or to a similar amino acid, than to tyrosine.

We also favored serine over tyrosine because serine is more frequent than tyrosine in the E. coli genome: S accounts for 57.88 codons per 1000, whereas Y accounts for 28.59 codons per 1000 (Codon usage). It should therefore be more convenient to replace a serine than a tyrosine, because proteins are more likely to contain many serines than many tyrosines.

Can the tRNA rescue the KanR phenotype? (Qualitative experiment)

Figure 3 : Our hypothesis : supD can rescue the KanR phenotype. A production of the mRNA from the Kan* gene. B The mRNA is translated only if there is a tRNA amber suppressor (supD)

To test this, we transform a plasmid carrying the Kan* gene into an MG1655 strain that contains either pSB1C3::supD or pSB1C3::RFP, and plate the cells on chloramphenicol and kanamycin. Kan* is expected to be non-functional without an amber suppressor.

Is it working well? Is the amber mutation leaky? (Quantitative experiment)

Figure 2 : Schema of the four strains used for the quantitative experiments

We used the following strains, all derived from E. coli MG1655. To quantify the leakiness of Kan* expression, we performed real-time experiments in which we measured the growth (through OD600 measurements) of each of these strains at different concentrations of kanamycin :

(1) pSB1A1::KanR + pSB1C3::supD:
positive control : Constitutively expresses the kanamycin resistance gene.

(2) pSB1A1::Kan* + pSB1C3::supD:
construction : Is supD as efficient as the positive control?

(3) pSB1A1::Kan* + pSB1C3::RFP:
construction : Is Kan* as inefficient as the negative control?

(4) pSB1A2::PLac + pSB1C3::supD:
negative control : No kanamycin resistance gene.

In case of leakiness, the Kan* + RFP strain (3) will be able to grow at higher concentrations of kanamycin than the negative control (4).

Experiments and results

Qualitative characterization of K914000 (supD) and K914009 (Kan*)

Figure 3 : Number of CFU/µg of plasmid after the different concentration

We prepared electro-competent MG1655 cells carrying either pSB1C3::supD or pSB1C3::RFP, and transformed the plasmid pSB1A1::Kan* into both. After transformation, the cells were plated on Cm+Kan.

No cells grow when no plasmid is transformed. When we transform another plasmid that carries an ampicillin resistance gene but no kanamycin resistance gene (plated on Cm+Amp), colonies appear in the strain with RFP. Unfortunately, the corresponding control in the supD strain did not work this time, hence there is no picture of it; this control (4) was also needed for the quantitative experiment, and we did manage to obtain it then. In contrast, pSB1A1::Kan* confers the kanamycin resistance phenotype only in the strain containing the supD gene.

We can conclude here that the supD gene can rescue the KanS phenotype by allowing correct expression of the amber-mutated kan gene P1003.

Quantitative characterization of K914000 (supD) and K914009 (Kan*)

Here we characterize both biobricks quantitatively. First, we confirm the qualitative result for the part K914000, and then we determine how leaky a single amber substitution is.

Experimental setup

Once the constructions were made (Fig. 2), we double-transformed the plasmids into MG1655, a strain that does not contain any amber suppressor. We worked with three replicates of each strain, and each strain was grown at 8 different kanamycin concentrations. Kanamycin is an aminoglycoside antibiotic that interferes with translation. The concentrations range from 4 times the usual concentration (100 µg/mL) to 8 times less. The 96-well plate is then incubated in a plate reader that measures OD600 every 6 minutes; this measure is correlated with the number of cells in the well. Each well contains 200 µL of LB (lysogeny broth, also known as Luria-Bertani), chloramphenicol and ampicillin at their usual concentrations, the diluted cells, and a given amount of kanamycin. An overlay of 50 µL of mineral oil is added on top. The measurement lasted approximately 16 hours and 30 minutes.
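As a quick consistency check, four strains, three replicates and eight kanamycin conditions fill a 96-well plate exactly. The sketch below builds one possible layout; the arrangement (one plate row per kanamycin condition, one column per strain and replicate) is an assumption for illustration, and conditions are identified by index only.

```python
from itertools import product

STRAINS = ["KanR + supD", "Kan* + supD", "Kan* + RFP", "PLac + supD"]
N_REPLICATES = 3
N_CONDITIONS = 8      # kanamycin conditions, identified by index only
ROWS = "ABCDEFGH"     # assumed: one plate row per kanamycin condition

def plate_layout():
    """Map a well name (e.g. 'A1') to (strain, replicate, condition index)."""
    layout = {}
    for cond, row in enumerate(ROWS[:N_CONDITIONS]):
        columns = product(STRAINS, range(1, N_REPLICATES + 1))
        for col, (strain, rep) in enumerate(columns, start=1):
            layout[f"{row}{col}"] = (strain, rep, cond)
    return layout

layout = plate_layout()
print(len(layout))  # number of wells used

# One read every 6 minutes over about 16 h 30 min, counting the read at t = 0.
n_reads = (16 * 60 + 30) // 6 + 1
print(n_reads)
```

Each well then yields a time series of roughly 166 OD600 reads, which is the raw data behind the growth curves.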

Figure 4 : The overnight (O/N) culture is diluted twice, first to normalize all samples, and then to start with a low concentration of cells.


First, we observed that the qualitative result is reproduced here: the supD gene rescues kanamycin resistance at every concentration of the antibiotic. The results also show that using the supD amber suppressor with Kan* carries no disadvantage compared with the wild type kanamycin resistance gene, since there is no difference in growth rate (Figure 4B and 4C). However, the leakiness of the Kan* gene is higher than expected, and thus one mutation is not sufficient for containment. Indeed, at the usual concentration, the Kan* gene still confers antibiotic resistance, even though it is lower than with KanR or with supD (Figure 4A). Here we define the growth rate as the OD600 observed at a given time (here t = 8h20'), in order to overcome the fact that Kan*+RFP does not have a clear exponential phase. The fact that strain (3) grows faster and can reach a higher OD600 may be explained by supD being deleterious for the strains that carry it, for instance because it suppresses the stop codons of other genes.

Figure 4A and 4B: The bar graphs represent the OD600 at time = 8h20' (black line in Figure 4C) for the different strains at a given kanamycin concentration (black line in Figure 4B). In B, we observe the variation of the OD600 across kanamycin concentrations at that same time.
Figure 4C: The OD600 as a function of time, for different concentrations of kanamycin (400 µg/mL, 100 µg/mL, 25 µg/mL).
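Using the OD600 at a fixed time as the growth measure amounts to picking one read from each well's time series. Since 8h20' (500 minutes) does not fall exactly on a 6-minute read, the sketch below takes the nearest read; this choice and the OD values, which are made up for illustration, are assumptions.

```python
def od_at(times_min, od_values, target_min):
    """Return (time, OD600) of the read closest to target_min."""
    idx = min(range(len(times_min)),
              key=lambda i: abs(times_min[i] - target_min))
    return times_min[idx], od_values[idx]

# Hypothetical reads every 6 minutes; OD values are invented for illustration.
times = [6 * k for k in range(166)]
od = [0.05 + 0.002 * k for k in range(166)]

t, value = od_at(times, od, target_min=8 * 60 + 20)  # 8h20' = 500 min
print(t, round(value, 3))
```

Comparing wells at a single time point avoids fitting an exponential phase, which is the reason given above for the Kan*+RFP strain.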

Quantitative characterization of K914018 (Kan**)

Here we perform exactly the same experiment as the previous one, except that the kanamycin resistance gene is contained with two amber mutations instead of one. The positive and negative controls behave correctly and similarly to the previous experiment, and so does the complementation of Kan** by supD. This means that two mutations are not a problem for the cell. Moreover, we no longer observe any leakiness in the system at the usual concentration, or even at low concentrations. The kanamycin resistance gene with two mutations behaves exactly like the negative control (PLac + supD), which means that semantic containment is working well.

Figure 5A and 5B: The bar graphs represent the OD600 at time = 8h37' (black line in Figure 5C) for the different strains at a given kanamycin concentration (black line in Figure 5B). In B, we observe the variation of the OD600 across kanamycin concentrations at that same time.
Figure 5C: The OD600 as a function of time, for different concentrations of kanamycin (400 µg/mL, 100 µg/mL, 25 µg/mL and 3.125 µg/mL).

Comparison between one and two mutations

Here is a graph that gathers both results, to highlight the effect of the second mutation. The results are normalized within each group (the experiment with the one-mutation gene and the experiment with the two-mutation gene) so that they can be compared on the same scale. We took the average of the positive (KanR+supD) and the negative (PLac+supD) controls.

Figure 6: Variation of OD600 at different kanamycin concentrations (µg/mL) at t = 8h37'
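One plausible reading of this within-group normalization is a rescaling in which the negative control maps to 0 and the positive control to 1. The sketch below follows that assumption; the function name and the OD values are hypothetical and the team's actual procedure may differ.

```python
from statistics import mean

def normalize(od_values, neg_ctrl, pos_ctrl):
    """Rescale OD600 so the mean negative control maps to 0 and the
    mean positive control maps to 1 (assumed normalization)."""
    lo, hi = mean(neg_ctrl), mean(pos_ctrl)
    if hi == lo:
        raise ValueError("controls are identical; cannot normalize")
    return [(v - lo) / (hi - lo) for v in od_values]

# Invented OD600 values for one group at one kanamycin concentration.
neg = [0.10, 0.12, 0.08]    # PLac + supD replicates
pos = [0.95, 1.00, 1.05]    # KanR + supD replicates
sample = [0.10, 0.55, 1.00]
print(normalize(sample, neg, pos))
```

After such a rescaling, a fully contained gene sits near 0 and a fully rescued one near 1 in both experiments, which makes the one-mutation and two-mutation results directly comparable.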

Conclusion & Perspectives

This work demonstrates that semantic containment can be achieved by changing an amino-acid codon into an amber codon. We saw that one codon replacement is not enough because of some leakiness, and demonstrated that with two codon replacements the system is no longer leaky. This means that a robust semantic containment system needs three codon replacements, because with two replacements we are only one mutation away from leaky expression. The underlying idea is ultimately to create a library of semantically contained genes and backbones that anybody could improve, either by adding new semantic systems or by increasing the degree of containment through additional mutations in existing parts. This library has already been started with the parts described above and with a plasmid backbone that has only one mutation so far. However, we also showed that the tRNA amber suppressor might be deleterious for the cell. This trade-off between leakiness and toxicity has to be studied in order to optimize the system.

Further experiments should improve the robustness of this system by adding a new security component ensuring that as many as three genes would have to be transferred by HGT for a synthetic gene to be expressed in natural bacteria, and thus for our containment system to fail. In our current design, the system fails if the tRNA amber suppressor gene is transferred together with a semantically contained gene into a natural bacterium. The idea is to add semantic containment to the tRNA itself. Since a tRNA is not translated, we cannot replace amino-acid codons in it. To solve this problem we will construct the following system. The supD tRNA gene will be placed under a T7 promoter, which is orthogonal, meaning that it needs a special RNA polymerase (T7 RNA polymerase) to transcribe the gene (here supD), and we would introduce several amber mutations into the T7 RNA polymerase gene. Both of these genes would then have to be transferred along with the semantically contained gene to obtain something functional in the other organism, which is very unlikely to happen. Further experiments will elucidate the probability of such an event. A scheme of this system is depicted in Figure 5. This system needs to be activated in the lab, for example by transforming a plasmid carrying the wild type T7 RNA polymerase gene together with a resistance gene for another antibiotic, say X. Once the positive feedback loop has started, we would remove antibiotic X and lose the plasmid carrying the wild type T7 polymerase. We could also activate the system by transforming a plasmid that carries the wild type T7 RNA polymerase and has a conditional origin of replication, such as a temperature-sensitive plasmid.

Figure 5 : Once the mRNA of the T7 RNA polymerase is transcribed (A), it needs the tRNA amber suppressor (B) to let the ribosome translate the mRNA into a functional protein. The T7 RNA polymerase is then able to transcribe the supD gene into this tRNA amber suppressor (C). This forms a positive feedback loop, which must be started by supplying either the tRNA in the medium or the wild type T7 RNA polymerase gene.
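The need for an external trigger can be illustrated with a toy Boolean model of the proposed loop. This is only a sketch under strong assumptions: components are either present or absent, updates are synchronous, there is no leaky expression, and the names `step` and `run` are hypothetical.

```python
def step(state, wt_t7_plasmid):
    """One synchronous update of (functional T7 RNAP, supD tRNA).

    - Functional T7 RNAP appears if supD tRNA lets the ribosome read through
      the amber codons of its mRNA, or if a wild type copy is supplied.
    - supD tRNA appears only if functional T7 RNAP transcribes its gene.
    """
    t7, supd = state
    return (supd or wt_t7_plasmid, t7)

def run(initial, plasmid_schedule):
    """Apply one update per entry of plasmid_schedule (True = plasmid present)."""
    state = initial
    for wt_present in plasmid_schedule:
        state = step(state, wt_present)
    return state

# Without any trigger, the loop never starts.
print(run((False, False), [False] * 5))
# Wild type T7 RNAP supplied for two steps, then the plasmid is lost.
print(run((False, False), [True, True, False, False, False]))
```

In this toy model the loop stays off when nothing starts it and remains on after the starter plasmid is removed, which matches the activation strategy described above. It says nothing about the real kinetics or about leaky read-through.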

References & Appendix


1 - Marliere, P. The farther, the safer: a manifesto for securely navigating synthetic species away from the old living world. Systems and Synthetic Biology 3, 77-84 (2009).

2 - Isaacs, F.J. et al. Precise manipulation of chromosomes in vivo enables genome-wide codon replacement. Science 333, 348-353 (2011).

3 - Anderson, J.C., Voigt, C.A. & Arkin, A.P. Environmental signal integration by a modular AND gate. Molecular Systems Biology 3, 133 (2007).

4 - Henikoff, S. & Henikoff, J.G. Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences USA 89, 10915-10919 (1992).

5 - Henikoff, J.G. & Henikoff, S. Blocks database and its applications. Methods in Enzymology 266, 88-105 (1996).


Genotype of strain used

  • MG1655 : F- λ- ilvG- rfb-50 rph-1

Other amino-acids weakness score

  • Weakness score with a BLOSUM100 matrix, mutation rate = 10⁻⁹, initial codon : TAG


    • A : -4.000000002555556e-09
    • C : -6.777777782148149e-09
    • D : -5.2222222257407416e-09
    • E : -3.222222225407407e-09
    • F : -3.0000000040740746e-09
    • G : -6.555555560444445e-09
    • H : -2.888888891555555e-09
    • I : -5.111111115777778e-09
    • K : -2.6666666694074067e-09
    • L : -4.333333337555555e-09
    • M : -4.000000003185186e-09
    • N : -4.3333333354074076e-09
    • P : -5.888888893518519e-09
    • Q : -1.6666666689629633e-09
    • R : -3.888888891888888e-09
    • S : -3.333333334925926e-09
    • T : -4.22222222462963e-09
    • V : -4.888888893111112e-09
    • W : -2.4444444508518513e-09
    • Y : -1.0000000044444444e-09
