# Team:WHU-China/Modeling

### From 2012.igem.org

### Future Perspective

To date, we have built all three of our devices and tested the function of many BioBrick parts. However, our aim goes far beyond making a toy in the laboratory: we are trying to create a product that has clinical applications or can be used in other areas.

First, we will combine the three devices, namely *Fatty Acid Degradation*, *Cellulose Synthesis*, and *Colonization*, into one whole system. Since there are tens of genes and regulatory elements in all and their total length is more than 30 kb, a larger vector such as λ phage, rather than a plasmid, is likely to be adopted. Integrating the three devices into the bacterial chromosome may also be a good choice.

Second, *Escherichia coli* may be a good model for molecular cloning, but it is not well suited to pharmaceutical use because of the risk of infection and disease. We propose *Lactobacillus* as a better chassis, not only because its safety has been well demonstrated in the food industry, but also because yogurt made from the genetically modified *Lactobacillus* would itself have the property of making you slim!

The function of the whole system in *Escherichia coli* or *Lactobacillus* will be tested both *in vitro* and *in vivo* to confirm the effect of our project. The gut microbiota and E.coslim will be inoculated in a glass tube, through which a slurry made from different kinds of food will flow, and we will measure the changes in the inoculated microbial community. Furthermore, a gut microbiota transplantation experiment on mice may also be conducted for further confirmation.

Our product will finally be packaged into two capsules. Capsule A is E.coslim, which can influence the human body's absorption of high-energy nutrients, regulate the gut microbiota and make you slimmer day by day. Capsule B is xylose, which induces the *Death Device* in E.coslim and protects people from malnutrition after they have taken E.coslim for a long enough time.

Last but not least, we have not only created a new microbe that can make people slim, but also provided tools to sense fatty acid, glucose and xylose in other circumstances. The new BioBricks we submit may therefore be applied in a variety of areas, such as the degradation of waste oil, urine sugar tests for diabetes patients, and safety control in genetic engineering. We look forward to the day when E.coslim is truly applied in people's lives, in the clinic and, more widely, in other areas beyond our imagination.

### Model I: Fatty Acid Degradation

The Fatty Acid Degradation Device may be the most complicated part of our project, and it is also of great importance. The antagonistic relationship between the gene fadR and the other genes related to β-oxidation (fadL, fadD, etc.) makes the device responsive to the concentration of fatty acid in the environment. It is therefore necessary to explore its quantitative response to changes in the fatty acid concentration. We build a mathematical model based on ordinary differential equations to describe the device and find a proper set of parameters under which the ratio of the steady-state expression level of fadL to that of fadR changes broadly, from 0.2 to 3.5. The model mathematically demonstrates the effect of the Fatty Acid Degradation Device and also provides meaningful clues for optimizing the device in experiments.

### The Ordinary Differential Equations of the Model

We conduct an evaluation by mathematical modeling and build the ordinary differential equations (ODE) as follows:

①

For simplicity, all genes with a promoter PfadR and equally regulated by FadR are deemed as a whole and represented as FadX, i.e., FadX refers to FadL, FadD, FadE, FadA, FadB, FadI, FadJ. And the Complex, or variable *x _{7}*, refers to the Fatty Acyl-CoA-FadR Complex.

Parameters in the ODEs:

① *E* denotes the constitutive expression rate of FadR, and *D* the degradation rate of FadR, FadX and the Complex, which are assumed to be equal.

② *a* denotes the affinity of FadR to the promoter PfadR, and *V* denotes the background expression rate of related genes.

③ $k_1$ and $k_2$ denote the forward and reverse reaction rate coefficients, respectively. $k_3$ to $k_6$ are parameters related to enzyme-catalyzed reactions based on the Michaelis-Menten equation. Specifically,

②

where *f* denotes the concentration of fatty acid outside the bacteria, $K_L$ the Michaelis constant of FadL, and $k_L$ the maximal activity of FadL.

Details of the ODE are illustrated in Fig 1.

**Fig 1** Illustration of the meaning of the ODE
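As a minimal sketch of how $k_3$ could depend on the external fatty acid concentration, assume that formula (2) has the standard Michaelis-Menten form $k_3 = k_L f / (K_L + f)$. This form is an assumption based on the definitions of $f$, $K_L$ and $k_L$ above, not something the page confirms. The function name and the numeric values are illustrative only.

```python
def k3_from_fatty_acid(f, k_L, K_L):
    """Assumed Michaelis-Menten form: k3 = k_L * f / (K_L + f)."""
    if f < 0 or k_L < 0 or K_L <= 0:
        raise ValueError("expected f >= 0, k_L >= 0 and K_L > 0")
    return k_L * f / (K_L + f)


# Illustrative values only (not taken from the project)
for f in (0.0, 1.0, 10.0, 100.0):
    print(f, k3_from_fatty_acid(f, k_L=10.0, K_L=1.0))
```

Under this assumption $k_3$ increases monotonically with $f$ and saturates at $k_L$, which agrees with the later statement that $k_3$ is positively related to the external fatty acid concentration.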

### Analysis on the Steady State of the ODE

By setting the right-hand sides of the equations to zero, we get algebraic equations in the five variables at the steady state. After elimination we obtain the cubic equation

$$a k_1 D^3 x^3 + (D^3 k_1 - a k_1 D^2 E)\,x^2 + (k_6 D^2 V - k_1 D^2 E + k_1 k_3 D V + k_2 k_6 D V)\,x - k_6 D E V - k_2 k_6 E V = 0 \tag{3}$$

And the value of each variable in its steady state (the balanced point) is

The ODE (1) is highly complicated, so we adopt numerical methods to analyze its properties. First, we randomly generate 100000 sets of parameters (all in the interval [0,10]; this setting is used throughout unless stated otherwise) to examine the distribution of the roots of equation (3). The results show that there is exactly one positive real root in 99890 cases and three in the remaining 110 cases. No case without a positive real root is found. This is expected: for positive parameters the leading coefficient of (3) is positive and its constant term is negative, so at least one positive real root must exist.
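The root count described above can be sketched numerically. The following is a minimal Python sketch, not the team's original MATLAB code: it builds the coefficients of equation (3) for random parameters in [0,10] and counts the positive real roots by looking for sign changes between the critical points of the cubic. The function names, the seed and the sample size of 10000 are illustrative assumptions.

```python
import math
import random


def cubic_coefficients(p):
    """Coefficients (c3, c2, c1, c0) of equation (3) for a parameter dict p."""
    a, D, E, V = p["a"], p["D"], p["E"], p["V"]
    k1, k2, k3, k6 = p["k1"], p["k2"], p["k3"], p["k6"]
    c3 = a * k1 * D**3
    c2 = D**3 * k1 - a * k1 * D**2 * E
    c1 = k6 * D**2 * V - k1 * D**2 * E + k1 * k3 * D * V + k2 * k6 * D * V
    c0 = -k6 * D * E * V - k2 * k6 * E * V
    return c3, c2, c1, c0


def count_positive_roots(c3, c2, c1, c0):
    """Count positive real roots of a cubic with c3 > 0 (simple roots assumed)."""
    def f(x):
        return ((c3 * x + c2) * x + c1) * x + c0

    # Critical points are the roots of the derivative 3*c3*x^2 + 2*c2*x + c1.
    points = [0.0]
    disc = c2 * c2 - 3.0 * c3 * c1
    if disc > 0:
        r = math.sqrt(disc)
        candidates = ((-c2 - r) / (3.0 * c3), (-c2 + r) / (3.0 * c3))
        points += sorted(x for x in candidates if x > 0)
    # Cauchy bound: every root has modulus below this value.
    points.append(1.0 + max(abs(c2), abs(c1), abs(c0)) / abs(c3))
    values = [f(x) for x in points]
    # The cubic is monotonic between consecutive points, so each sign change
    # corresponds to exactly one root.
    return sum(1 for u, v in zip(values, values[1:]) if (u < 0) != (v < 0))


random.seed(0)  # arbitrary seed, for reproducibility only
names = ["a", "D", "E", "V", "k1", "k2", "k3", "k6"]
counts = {}
for _ in range(10000):
    params = {name: random.uniform(0.0, 10.0) for name in names}
    n_roots = count_positive_roots(*cubic_coefficients(params))
    counts[n_roots] = counts.get(n_roots, 0) + 1
print(counts)
```

Only the eight parameters that appear in equation (3) are sampled here. The exact proportions depend on the random sample, so they need not reproduce the 99890/110 split reported above.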

Then note that the balanced point in (4) may not be authentic when $k_4 < k_3$, because $x_3^*$ then becomes negative, which cannot occur. So we randomly generate 100 parameter sets in which $k_4 < k_3$, and after solving the ODE numerically we find that all variables tend to an asymptotic steady state except for $x_3$, which approaches $+\infty$ as $t \to +\infty$ (Fig 2). Interestingly, we find that the authentic balanced point of $x_1$ and $x_2$, calculated by directly solving the ODE (1) numerically, is very *close* to that calculated by formula (4). For example, when $E = 4.5249$, $a = 8.0649$, $V = 2.5906$, $D = 1.6831$, $k_1 = 5.2315$, $k_2 = 8.6560$, $k_3 = 8.7696$, $k_4 = 1.0092$, $k_5 = 6.9635$, $k_6 = 9.3253$, $x_1$ and $x_2$ finally approach 2.6599 and 0.0686, respectively, while $x_1^* = 2.3952$ and $x_2^* = 0.0758$ (Fig 2). The term *close* may not be mathematically strict, but it plays an important role in the later discussion.
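The quoted values of $x_1^*$ and $x_2^*$ can be checked with a short Python sketch. It rests on two assumptions that the page does not state explicitly: the unknown $x$ in equation (3) is taken to be the steady-state FadR level $x_1^*$, and the FadX level is taken to be $x_2^* = V / (D\,(1 + a\,x_1^*))$, a form suggested by the roles of $a$, $V$ and $D$ described above. Both assumptions reproduce the numbers quoted in the example; the function names are illustrative.

```python
def steady_state_fadR(p, iterations=200):
    """A positive root of cubic (3), found by bisection (assumed to be x1*)."""
    a, D, E, V = p["a"], p["D"], p["E"], p["V"]
    k1, k2, k3, k6 = p["k1"], p["k2"], p["k3"], p["k6"]
    c3 = a * k1 * D**3
    c2 = D**3 * k1 - a * k1 * D**2 * E
    c1 = k6 * D**2 * V - k1 * D**2 * E + k1 * k3 * D * V + k2 * k6 * D * V
    c0 = -k6 * D * E * V - k2 * k6 * E * V

    def f(x):
        return ((c3 * x + c2) * x + c1) * x + c0

    # f(0) = c0 < 0 and f is positive at the Cauchy bound, so a root is bracketed.
    lo, hi = 0.0, 1.0 + max(abs(c2), abs(c1), abs(c0)) / c3
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def steady_state_fadX(p, x1):
    """Assumed relation x2* = V / (D * (1 + a * x1*))."""
    return p["V"] / (p["D"] * (1.0 + p["a"] * x1))


# Parameter values quoted in the example above
example = {"E": 4.5249, "a": 8.0649, "V": 2.5906, "D": 1.6831,
           "k1": 5.2315, "k2": 8.6560, "k3": 8.7696, "k6": 9.3253}
x1_star = steady_state_fadR(example)
x2_star = steady_state_fadX(example, x1_star)
print(round(x1_star, 4), round(x2_star, 4))
```

For this parameter set the cubic is monotonic, so bisection finds its only real root. When equation (3) has three positive roots, bisection returns just one of them.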

**Fig 2** Numerical simulation when $k_4 < k_3$.

Besides, $x_4^*$ and $x_5^*$ may also be negative. We again generate 100000 sets of parameters randomly (without the restriction $k_4 < k_3$) to see how frequently $x_4^*$ or $x_5^*$ is negative and what happens in that case. However, it turns out that in no case is $x_4^*$ or $x_5^*$ negative. From all the results above we may therefore draw a *fuzzy* conclusion: under most conditions (99.89%), formula (4) yields a single balanced point of ODE (1), which may however not be authentic. Fuzzy as the conclusion is, it is still useful as an indicator when searching for a proper set of parameters under which the Fatty Acid Degradation Device is highly regulatable.

### Parameter Screening

It is expected that when the concentration of fatty acid in the environment is high, the expression level of gene fadR is relatively low while that of gene fadX is relatively high, and vice versa. Among the 10 parameters in ODE (1), $k_3$ is positively related to the concentration of fatty acid outside the bacteria according to formula (2). So we consider the device *regulatable* when, as $k_3$ goes from 0.5 to 1.5, the ratio of the expression level of fadX to that of fadR at steady state rises but stays below 0.5, and, as $k_3$ goes from 8.5 to 9.5, the ratio also rises and is greater than 2.0 at both values.

We take advantage of the low computational cost of formula (4) to calculate the steady-state expression levels of fadR and fadX, despite its possible errors. 10000 random parameter sets ($k_3$ excluded) are generated and, for each $k_3$ in [0.5, 1.5, 8.5, 9.5], the balanced point of ODE (1) is calculated according to formula (4). We then compare the ratios of the expression level of fadX to that of fadR at the balanced points and save the parameter set if it meets the condition above. We verify all the saved parameter sets by directly solving the ODE (1) numerically to see whether they really meet the condition. 182 out of 10000 sets of parameters are saved and 57 of them remain after the verification. A typical example (Fig 3) illustrates how the expression levels of fadR and fadX and their ratio change with $k_3$. As $k_3$ increases, the ratio rises smoothly from 0.2 to 3.5, while the expression level of fadX rises from 0.6 to 2.0 and that of fadR decreases from 2.6 to 0.6.

**Fig 3** The change of the expression levels of fadR and fadX and their ratio corresponding to $k_3$.
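The screening condition can be written as a small predicate. This is a hedged Python sketch of the criterion as described in the text, not the team's MATLAB code: `ratio_at` is a hypothetical mapping from each tested value of $k_3$ to the steady-state fadX/fadR ratio, however that ratio was obtained (from formula (4) or from a numerical solution of ODE (1)).

```python
K3_LOW = (0.5, 1.5)   # k3 values standing for a low fatty acid level
K3_HIGH = (8.5, 9.5)  # k3 values standing for a high fatty acid level


def is_regulatable(ratio_at):
    """Apply the screening condition to fadX/fadR ratios indexed by k3."""
    low_a, low_b = (ratio_at[k] for k in K3_LOW)
    high_a, high_b = (ratio_at[k] for k in K3_HIGH)
    rises_but_stays_low = low_a < low_b < 0.5
    rises_and_is_high = 2.0 < high_a < high_b
    return rises_but_stays_low and rises_and_is_high


# Hypothetical ratios, for illustration only
print(is_regulatable({0.5: 0.2, 1.5: 0.4, 8.5: 2.5, 9.5: 3.5}))
print(is_regulatable({0.5: 0.2, 1.5: 0.6, 8.5: 2.5, 9.5: 3.5}))
```

In a two-stage screen like the one described, the same predicate would be applied first to the cheap estimates and then again to the numerically verified ratios.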

### Conclusion

To evaluate quantitatively the response of gene expression levels to the concentration of fatty acid in the environment, we build a mathematical model based on ODEs and demonstrate that the antagonistic relationship between fadR and fadX serves as a linear regulator of gene expression. This is important for the function of the Fatty Acid Degradation Device, because the model suggests that the Device can adjust itself to an appropriate state when induced by fatty acid and function properly, rather than changing drastically. The model thus suggests that the Device has great potential for applications in humans.

### Gut Microbiota Regulation

Perhaps the most challenging part of our idea is that E.coslim will make a difference not only by influencing metabolism but also by interacting with our inner ecosystem, the human gut microbiota. Collectively, the microbial associates that reside in and on the human body constitute our microbiota, and the genes they encode are known as our microbiome. Containing at least 100 trillion cells, the human gut microbiota is highly complex and diverse, with taxa from across the tree of life: bacteria, eukaryotes, viruses and archaea. Sometimes referred to as our *forgotten organ*, it plays a major role in human health and disease, including obesity and diabetes. Perhaps even more importantly, it interacts with the immune system, providing signals that promote the maturation of immune cells and the normal development of immune functions.

Though highly complicated, the gut microbiota is typically dominated by bacteria, and specifically by members of the divisions Bacteroidetes and Firmicutes. Interestingly, it has been found that in animal models of obesity the balance between the two phyla is shifted, with a significant reduction of Bacteroidetes and a corresponding increase of Firmicutes. This results in an increased capacity for harvesting energy from food and produces low-level inflammation. More importantly, the imbalanced microbiota may be related to the recurrence of obesity after treatment (whether drugs, diet or exercise) stops.

Let us refocus on our versatile E.coslim. When establishing itself in the gut, E.coslim will interact with the microbiota and thus regulate it. Considering the importance of the microbiota in human health, it is necessary to know what effects E.coslim has on the gut microbiota. On the one hand, it may be a disaster for people if the establishment of E.coslim leads to the extinction of some major bacterial taxa. On the other hand, E.coslim may prove ineffective if it gradually dies out after intake.

We hope that E.coslim can modulate the gut microbiota and thereby have a long-term effect in helping people lose weight.

However, metagenomic sequencing would be required to observe experimentally the change in the microbiota caused by E.coslim. Due to limited time and funds, we as undergraduate students are not able to perform such experiments. Instead, we use a mathematical model to predict the results of the interaction between E.coslim and the gut microbiota. While only a concise description is presented on this page, **detailed supporting information on this mathematical model can be downloaded here**.

Consistent with previous discoveries and without loss of generality, we assume that the gut microbiota consists of two populations of bacteria, the Firmicutes ($x_1$) and the Bacteroidetes ($x_2$). Bacteria within each population have the same properties, which is reflected in the equality of the corresponding mathematical parameters. It is also assumed that there are two kinds of resources, glucose (*S*) and fatty acid (*R*), which are perfectly substitutable for both populations. We simplify this situation as exploitative competition in a chemostat, and the ODEs are:

1. *S(t)* and *R(t)*: concentrations of glucose and fatty acid, respectively.

2. $x_i$: biomass of the competing populations at time *t*.

3. $S^0$ and $R^0$: concentrations of resources *S* and *R* in the feed bottle.

4. *D*: dilution rate.
[The specific death rates of the microorganisms are assumed to be insignificant compared to this dilution rate *D*.]

5. $S_i$ and $R_i$: the rates of conversion of nutrients *S* and *R*, respectively, to biomass of population $x_i$.
[If the conversion of nutrient to biomass is proportional to the amount of nutrient consumed, the consumption rate of resource *S* per unit of competitor $x_i$ is denoted $S_i[S(t),R(t)]/\xi_i$, where $\xi_i$ is the respective growth yield constant.]

6. $G_i$: the rate of conversion of nutrient to biomass of population $x_i$.
[Since perfectly substitutable resources are alternative sources of the same essential nutrient, the rate of conversion of nutrient to biomass of population $x_i$ is made up of a contribution from the consumption of resource *S* as well as *R*:

Here we choose

And let

They denote the maximal growth rate of population $x_i$ on resource *S* (*R*) when none of the other resource is available.
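To make the structure of such a chemostat model concrete, here is a minimal Python sketch. It is not the team's model: the functional forms $S_i = m_{Si} S / (K + S + R)$ and $R_i = m_{Ri} R / (K + S + R)$, the sum $G_i = S_i + R_i$, the shared half-saturation constant `K`, the yield constants `xi` and `eta`, and all default values are assumptions chosen for illustration. The equations are integrated with a hand-written fourth-order Runge-Kutta scheme.

```python
def simulate_chemostat(m_S, m_R, x0, S0=1.0, R0=1.0, D=1.0, K=0.1,
                       xi=1.0, eta=1.0, t_end=50.0, dt=0.01):
    """Integrate an assumed two-resource chemostat model; returns [S, R, x_1, ...]."""
    n = len(x0)

    def rhs(state):
        S, R, xs = state[0], state[1], state[2:]
        denom = K + S + R
        s_rates = [m_S[i] * S / denom for i in range(n)]  # assumed form of S_i
        r_rates = [m_R[i] * R / denom for i in range(n)]  # assumed form of R_i
        dS = (S0 - S) * D - sum(x * s / xi for x, s in zip(xs, s_rates))
        dR = (R0 - R) * D - sum(x * r / eta for x, r in zip(xs, r_rates))
        # Growth G_i = S_i + R_i, loss by dilution at rate D
        dx = [x * (s + r - D) for x, s, r in zip(xs, s_rates, r_rates)]
        return [dS, dR] + dx

    state = [S0, R0] + list(x0)
    for _ in range(int(round(t_end / dt))):
        q1 = rhs(state)
        q2 = rhs([y + 0.5 * dt * q for y, q in zip(state, q1)])
        q3 = rhs([y + 0.5 * dt * q for y, q in zip(state, q2)])
        q4 = rhs([y + dt * q for y, q in zip(state, q3)])
        state = [y + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for y, a, b, c, d in zip(state, q1, q2, q3, q4)]
    return state


# Two competitors with illustrative maximal growth rates on S and R
final = simulate_chemostat(m_S=[2.25, 2.1], m_R=[0.5, 2.1], x0=[0.1, 0.1])
print(final)
```

Because the functional forms and constants are assumed, the output of this sketch should not be expected to match the figures on this page. It only illustrates how the resource equations, the yield constants and the dilution rate fit together.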

### Situation Before the Establishment of *E. coslim* Flora

A model is built to describe the quantitative relationship between Firmicutes and Bacteroidetes in the intestines of obese people.

The parameters $m_{Si}$ ($m_{Ri}$) can be assigned values for Firmicutes and Bacteroidetes so as to simulate their ability to utilize glucose and fatty acid. Suppose their abilities to use the nutrients are as follows:

| | Firmicutes | Bacteroidetes |
|---|---|---|
| Glucose | +++ ($m_{S1}$) | ++ ($m_{S2}$) |
| Fatty acid | + ($m_{R1}$) | ++ ($m_{R2}$) |

Then we can set $m_{S1} = 2.25$, $m_{R1} = 0.5$, $m_{S2} = 2.1$, $m_{R2} = 2.1$. In order to simplify the ODEs, we set $S^0 = R^0 = D = 1$ and $\xi_i/\eta_i = 100$. The simulation result is shown in Fig 1.

**Fig 1**

In this situation, the ratio N(Firmicutes)/N(Bacteroidetes) is rather high, usually reaching a value of about 8.0, and the absolute number (or concentration) of each population is stable.

### Situation After the Establishment of *E. coslim* Flora

We now add E.coslim to the system. E.coslim consumes glucose as well as fatty acid, which makes it a competitor of Firmicutes and Bacteroidetes. While it reproduces in the intestines, the competition among these three types of bacteria gradually changes their numbers. We want to know the result of the establishment of this genetically engineered bacterium (GEB) in the gut.

The key point is to find out how competitive our E.coslim should be. In other words, we have to specify its ability to consume glucose and fatty acid, that is, to study the new parameters ($m_{S3}$, $m_{R3}$). For example, if $m_{S3} > m_{S1}$, then we conclude that E.coslim has a stronger ability to consume glucose than Firmicutes. We try to find an appropriate pair ($m_{S3}$, $m_{R3}$), and next we discuss different settings of ($m_{S3}$, $m_{R3}$) in turn.

**Situation 1. ($m_{S3}$, $m_{R3}$) = (2.5, 2.1)**

The number of "+" signs in the table below qualitatively represents the ability of the corresponding bacteria to consume the corresponding resource: "+" represents "low", "++" moderate, "+++" strong, and so on.

| | Firmicutes | Bacteroidetes | E.coslim |
|---|---|---|---|
| Glucose | +++ ($m_{S1}$) | ++ ($m_{S2}$) | ++++ ($m_{S3}$) |
| Fatty acid | + ($m_{R1}$) | ++ ($m_{R2}$) | ++ ($m_{R3}$) |

The simulation result is shown in Fig 2.

**Fig 2**

**Situation 1'. ($m_{S3}$, $m_{R3}$) = (2.1, 2.5)**

| | Firmicutes | Bacteroidetes | E.coslim |
|---|---|---|---|
| Glucose | +++ ($m_{S1}$) | ++ ($m_{S2}$) | ++ ($m_{S3}$) |
| Fatty acid | + ($m_{R1}$) | ++ ($m_{R2}$) | +++ ($m_{R3}$) |

**Fig 3**

Situations 1 and 1' show that E.coslim is so competitive that the other populations die out. Given the importance of the gut microbiota, this may be detrimental to human health.

**Situation 2. ($m_{S3}$, $m_{R3}$) = (2.1, 0.4)**

| | Firmicutes | Bacteroidetes | E.coslim |
|---|---|---|---|
| Glucose | +++ ($m_{S1}$) | ++ ($m_{S2}$) | ++ ($m_{S3}$) |
| Fatty acid | + ($m_{R1}$) | ++ ($m_{R2}$) | <+ ($m_{R3}$) |

**Fig 4**

In this situation E.coslim is too weak to survive in the gut and is therefore ineffective.

**Situation 3. ($m_{S3}$, $m_{R3}$) = (2.1, 2.11)**

| | Firmicutes | Bacteroidetes | E.coslim |
|---|---|---|---|
| Glucose | +++ ($m_{S1}$) | ++ ($m_{S2}$) | ++ ($m_{S3}$) |
| Fatty acid | + ($m_{R1}$) | ++ ($m_{R2}$) | ++ ($m_{R3}$) |

**Fig 5**

**Situation 4. ($m_{S3}$, $m_{R3}$) = (2.25, 0.5)**

| | Firmicutes | Bacteroidetes | E.coslim |
|---|---|---|---|
| Glucose | +++ ($m_{S1}$) | ++ ($m_{S2}$) | +++ ($m_{S3}$) |
| Fatty acid | + ($m_{R1}$) | ++ ($m_{R2}$) | + ($m_{R3}$) |

**Fig 6**

The simulation result shows that only in this situation does E.coslim establish itself in the gut successfully without leading to the extinction of Firmicutes or Bacteroidetes. Surprisingly, the ratio of Firmicutes to Bacteroidetes also declines, which is consistent with the condition in non-obese people. We may predict that an E.coslim that fits Situation 4 can not only influence human metabolism but also regulate the composition of the gut microbiota, thus preventing the recurrence of obesity.

We also map the change in the consumption rate of each resource before and after the establishment of E.coslim (Situation 4 is used), as shown in Fig 7. The consumption rates of Bacteroidetes on both resources decline slightly, while that of Firmicutes is down-regulated, accompanied by an increase in that of E.coslim.

**Fig 7**

### Conclusion

By carefully assigning values to the parameters in the mathematical model of microbiota regulation, we found that the metabolic capacity of E.coslim should be kept at a proper level rather than made as great as possible. Although a greater ability to degrade fatty acid and synthesize cellulose may seem to further reduce the calories people absorb, it may also pose a potential health risk, since the composition of the microbiota may change drastically. The metabolic capacity indicated by the related parameters also helps us design a well-functioning E.coslim.

### Gene Expression Measurement by *FANCY*

### Abstract

The quantitative measurement of gene expression levels is of great importance in molecular biology research, but existing methods are either semi-quantitative (Western blotting) or require costly instruments and reagents (flow cytometry). We developed a novel, quantitative and inexpensive method, *FANCY*, short for *Fluorescent ANalysis of CYto-imaging* (and also named in honor of Fan Cheng, a wise and pretty girl in our team), which is based on cell fluorescence imaging and a support vector machine (SVM), to measure gene expression in our project. *FANCY* contains three subprograms: *FANCYSelector*, to select single cells manually and calculate their properties; *FANCYTrainer*, to train an SVM; and *FANCYScanner*, to identify single cells with the SVM trained by *FANCYTrainer*. Applying *FANCY* to 5 typical images demonstrates its effectiveness, with an acceptable error rate. All our programs are freely available online, and we suggest that *FANCY* be applied more widely in iGEM and in further research.

### Protocol

**Preparation of fluorescent images**

1. Connect a red fluorescent protein gene (gene *rfp*, or biobrick 1-22O) with the objective gene under the same promoter in a plasmid by molecular cloning, i.e., make a polycistron which encodes both the objective gene product and RFP, thus making the expression levels of the two genes the same. Other fluorescent genes such as *gfp* or *yfp* may also be applicable.

2. Transform the plasmid into E.coli and incubate the culture until it reaches a suitable density.

3. Prepare a slide sample of the transformed E.coli and dry it in the air. Cover the bacteria with an anti-fluorescence-quenching agent to maintain the fluorescence. The bacteria may be diluted to avoid the formation of cell masses on the slide during the preparation.

4. Take photos of the slide by fluorescence microscopy. Select visual fields with as many single cells and as few cell masses as possible, and avoid photographing for too long, since the fluorescence intensity may decrease under the excitation light.

**Support vector machine training (FANCYSelector and FANCYTrainer)**

1. 10 typical images are selected as the training images.

2. Convert the original RGB images to grayscale images *G* and binary images *B*. No image enhancement was applied in our project, but it may be used in other conditions if needed.

3. For each object detected in the binary image *B*, ask the user to classify it as a single cell or a cell mass; the object is then marked TRUE or FALSE, respectively. Then calculate the area *A*, the perimeter *P*, the Euler number *E*, the maximal distance between any two points in the object *L*, and the fluorescence *F*. The fluorescence *F* is calculated as the numerical double integral of the gray level of the image *G* over the region identified by the current object in the image *B*.

4. Record all the data obtained in Step 3 and save them in an ASCII or .mat file (optional).

5. Use the data *A*, *P*, *E* and *L* as the training data and the user's classification result as the group data to train the support vector machine (SVM). The Gaussian radial basis function with a scaling factor of 1 is selected as the kernel function of the SVM.
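The per-object measurements in Step 3 can be illustrated with a small, dependency-free Python sketch. It is not the team's MATLAB implementation: it labels objects by 8-connected flood fill (the connectivity is an assumption) and computes only the area *A*, the fluorescence *F* as the sum of gray levels over the object, and the maximal distance *L* between pixel centers. The perimeter and Euler number are omitted, and the function name and toy images are hypothetical.

```python
import math


def object_features(binary, gray):
    """Return a list of feature dicts, one per 8-connected object in `binary`."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    neighbours = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    features = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0][c0] or seen[r0][c0]:
                continue
            # Flood fill collects all pixels of one object
            stack, pixels = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                pixels.append((r, c))
                for dr, dc in neighbours:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and binary[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            area = len(pixels)
            # Discrete double integral of the gray level over the object
            fluorescence = sum(gray[r][c] for r, c in pixels)
            max_len = max(math.hypot(p[0] - q[0], p[1] - q[1]) for p in pixels for q in pixels)
            features.append({"A": area, "F": fluorescence, "L": max_len})
    return features


# Toy 3x5 images, for illustration only
binary = [[1, 1, 0, 0, 0],
          [1, 1, 0, 0, 1],
          [0, 0, 0, 0, 1]]
gray = [[10, 20, 0, 0, 0],
        [30, 40, 0, 0, 5],
        [0, 0, 0, 0, 7]]
for obj in object_features(binary, gray):
    print(obj)
```

The all-pairs distance is quadratic in the object size, which is acceptable for small objects such as single bacteria but would need a convex-hull shortcut for large ones.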

**Single cell identification and fluorescence calculation (FANCYScanner)**

1. For all the images taken from the slide, convert them to grayscale and binary images, calculate the *A*, *P*, *E*, *L* and *F* of each object in the binary images (as in the previous part) and classify the objects as single cells or cell masses using the trained SVM. All objects classified as single cells are marked with an asterisk (\*) in the binary image *B*, and *B* is saved for possible performance evaluation.

2. All data from objects classified as single cells are recorded in an ASCII file. The mean and the standard deviation of the fluorescence density (fluorescence *F* divided by area *A*) are calculated as the quantitative description of the expression level of the objective gene.
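The final summary step can be sketched in a few lines of Python. This is an illustration, not the team's code: `single_cells` is a hypothetical list of records holding the fluorescence `F` and area `A` of each object classified as a single cell, and the sample standard deviation (n - 1 in the denominator) is assumed.

```python
import statistics


def expression_level(single_cells):
    """Mean and sample standard deviation of fluorescence density F / A."""
    densities = [cell["F"] / cell["A"] for cell in single_cells]
    return statistics.mean(densities), statistics.stdev(densities)


# Hypothetical measurements, for illustration only
cells = [{"F": 200.0, "A": 10.0}, {"F": 250.0, "A": 10.0}, {"F": 300.0, "A": 10.0}]
mean_density, sd_density = expression_level(cells)
print(mean_density, sd_density)
```

Dividing by the area before averaging keeps large and small cells on the same scale, so the summary reflects fluorescence per unit area rather than cell size.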

### Performance Evaluation

We randomly selected 5 other images and applied both *FANCY* and manual identification to them to evaluate the performance of *FANCY*. The false positive and false negative rates of *FANCY* are calculated in comparison to manual identification, which is taken to be absolutely correct. In all, 179 positive and 122 negative objects are identified by *FANCY*, and the evaluation result is shown in Table 1.

**Table 1**

**Fig 1**. Typical false positive and false negative results of *FANCY* identification. **A**. The identification result. **B**. The original fluorescent image (some cells in the original image are not able to be detected after image transformation).

**Fig 2.** Comparison of a cell fluorescent image and its identification result by *FANCY*. **A.** The cell fluorescent image. **B.** The identification result, in which objects identified as single cells are dotted with a *.

The false positive and false negative rates may decrease with more training data and proper image enhancement.
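For reference, one common way to compute such error rates is sketched below in Python. The definitions used here are an assumption: the false positive rate is taken as false positives divided by all manually negative objects, and the false negative rate as false negatives divided by all manually positive objects. The evaluation on this page may have used different denominators. The input lists are hypothetical.

```python
def error_rates(predicted, manual):
    """False positive and false negative rates; True means 'single cell'."""
    if len(predicted) != len(manual):
        raise ValueError("both label lists must have the same length")
    false_pos = sum(1 for p, m in zip(predicted, manual) if p and not m)
    false_neg = sum(1 for p, m in zip(predicted, manual) if not p and m)
    negatives = sum(1 for m in manual if not m)
    positives = sum(1 for m in manual if m)
    fp_rate = false_pos / negatives if negatives else 0.0
    fn_rate = false_neg / positives if positives else 0.0
    return fp_rate, fn_rate


# Hypothetical labels, for illustration only
predicted = [True, True, False, False, True]
manual = [True, False, False, True, True]
fp_rate, fn_rate = error_rates(predicted, manual)
print(fp_rate, fn_rate)
```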

### Source Code

All source code, written in MATLAB, can be freely downloaded here. For detailed usage, please refer to the comments in the code.