Gene Expression Measurement by FANCY

Abstract

Quantitative measurement of gene expression levels is of great importance in molecular biology research, but existing methods are either semi-quantitative (Western blotting) or require costly instruments and reagents (flow cytometry). We developed a novel, quantitative and inexpensive method, FANCY (Fluorescent ANalysis of CYtoimaging), which is based on cell fluorescence imaging and a support vector machine (SVM), to measure gene expression in our project. FANCY contains three sub-programs: FANCYSelector, which lets the user select single cells manually and calculates their properties; FANCYTrainer, which trains an SVM; and FANCYScanner, which identifies single cells using the SVM trained by FANCYTrainer. Applying FANCY to 5 typical images shows that the method is effective and that its error rate is acceptable. All our programs are freely available online, and we suggest that FANCY be applied more widely in iGEM and in further research.

Protocol

　　Preparation of fluorescent images

　　1. By molecular cloning, place a red fluorescent protein gene (rfp, or biobrick 1-22O) and the target gene under the same promoter in a plasmid, i.e., build a polycistron that encodes both the target gene product and RFP, so that the two genes are expressed at the same level. Other fluorescent protein genes such as gfp or yfp may also be applicable.

　　2. Transform the plasmid into E. coli and incubate the culture until it reaches a suitable density.

　　3. Prepare a slide sample of the transformed E. coli and let it air-dry. Cover the bacteria with an anti-fluorescence-quenching reagent to preserve the fluorescence. The bacteria may be diluted beforehand to avoid the formation of cell masses on the slide.

　　4. Photograph the slide by fluorescence microscopy. Select visual fields with as many single cells and as few cell masses as possible, and keep the exposure short, since the fluorescence intensity may decrease under the excitation light.

　　Support vector machine training (FANCYSelector and FANCYTrainer)

　　1. Select 10 typical images as the training images.

　　2. Convert each original RGB image to a gray image G and a binary image B. We applied no image enhancement in our project, but it may be worth applying under other imaging conditions.

　　3. For each object detected in the binary image B, ask the user to classify it as a single cell or a cell mass; the object is then marked TRUE or FALSE, respectively. Then calculate its area A, perimeter P, Euler number E, maximal length L (the greatest distance between any two points in the object) and fluorescence F. The fluorescence F is calculated as the numerical double integral of the gray level of image G over the region occupied by the current object in image B.

　　4. Optionally, save all the data obtained in Step 3 to an ASCII or .mat file.

　　5. Use A, P, E and L as the training data and the user's classification results as the group data (class labels) to train the support vector machine (SVM). The Gaussian radial basis function with a scaling factor of 1 is used as the kernel function of the SVM.
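The feature calculation in Steps 2 and 3 and the kernel in Step 5 can be sketched as follows. This is a minimal illustration in plain Python, not the Matlab code of FANCY itself. The luminance weights, the fixed threshold, the 4-connected labelling and the exact kernel form exp(-|u-v|^2 / (2*sigma^2)) are all assumptions, since the page does not state them, and the function names are hypothetical. The perimeter P and Euler number E are left out for brevity.

```python
import math

def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to gray levels.
    The luminance weights are an assumption, not taken from FANCY."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]

def to_binary(gray, threshold):
    """Binary image B: True where the gray level exceeds a chosen threshold."""
    return [[v > threshold for v in row] for row in gray]

def label_objects(binary):
    """Return one list of (row, col) pixels per 4-connected object in B."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            stack, pixels = [(r, c)], []
            while stack:  # iterative flood fill
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            objects.append(pixels)
    return objects

def features(pixels, gray):
    """Area A, maximal length L and fluorescence F of one object."""
    area = len(pixels)
    # F: discrete double integral of G over the object = sum of its gray levels
    fluorescence = sum(gray[y][x] for y, x in pixels)
    # L: greatest distance between any two pixels (O(n^2), fine for small objects)
    max_length = max(math.dist(p, q) for p in pixels for q in pixels)
    return {"A": area, "L": max_length, "F": fluorescence}

def rbf_kernel(u, v, sigma=1.0):
    """Gaussian radial basis function; sigma plays the role of the scaling factor."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2 * sigma ** 2))
```

For example, `features(label_objects(to_binary(G, 5))[0], G)` returns the A, L and F values of the first object found in a gray image `G`. The resulting feature vectors, together with the TRUE/FALSE labels from Step 3, would then be passed to whichever SVM implementation is used.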

　　Single cell identification and fluorescence calculation (FANCYScanner)

　　1. Convert every image taken from the slide to a gray image and a binary image, calculate A, P, E, L and F for each object in the binary image (as in the previous part), and classify each object as a single cell or a cell mass using the trained SVM. All objects classified as single cells are marked with * in the binary image B, and B is saved for possible performance evaluation.

　　2. Record all data from objects classified as single cells in an ASCII file. Calculate the mean and the standard deviation of the fluorescent density (fluorescence F divided by area A) as the quantitative description of the target gene's expression level.
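The final summary in Step 2 can be sketched as below. This is an illustrative Python version rather than the FANCY Matlab code. The function name `density_summary` is hypothetical, and the spread is taken here to be the sample standard deviation, which is an assumption about what the protocol intends.

```python
import statistics

def density_summary(cells):
    """Summarise fluorescent density over objects classified as single cells.

    cells: list of (fluorescence F, area A) pairs.
    Returns (mean density, sample standard deviation of density).
    """
    # Fluorescent density = F / A; objects with zero area are skipped
    densities = [f / a for f, a in cells if a > 0]
    mean = statistics.mean(densities)
    # stdev needs at least two values; report 0.0 for a single cell
    spread = statistics.stdev(densities) if len(densities) > 1 else 0.0
    return mean, spread
```

Calling `density_summary([(20, 2), (30, 2), (40, 2)])` works on densities of 10, 15 and 20. Dividing F by A normalises for cell size, so large and small single cells contribute comparably to the estimate.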

Performance Evaluation

To evaluate the performance of FANCY, we randomly selected 5 other images and applied both FANCY and manual identification to them. The false positive and false negative rates of FANCY are calculated by comparison with manual identification, which is taken to be absolutely correct. In total, FANCY identified 179 positive and 122 negative objects; the evaluation result is as follows:

Fig 1. Typical false positive and false negative results of FANCY identification. A. The identification result. B. The original fluorescent image (some cells in the original image can no longer be detected after image conversion).

The false positive and false negative rates may decrease with more training data and appropriate image enhancement.
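The comparison against manual identification can be sketched as follows. This is a Python illustration, not part of FANCY, and the function name `error_rates` is hypothetical. The page does not state which denominators were used, so the sketch assumes the conventional definitions: false positive rate = FP / (FP + TN) and false negative rate = FN / (FN + TP), with the manual labels as the reference.

```python
def error_rates(predicted, manual):
    """Compare classifier labels with manual labels (True = single cell).

    Returns (false positive rate, false negative rate), treating the
    manual labels as correct.
    """
    pairs = list(zip(predicted, manual))
    tp = sum(1 for p, m in pairs if p and m)          # both say single cell
    tn = sum(1 for p, m in pairs if not p and not m)  # both say cell mass
    fp = sum(1 for p, m in pairs if p and not m)      # classifier wrongly positive
    fn = sum(1 for p, m in pairs if not p and m)      # classifier wrongly negative
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr
```

The inputs are two equal-length lists of TRUE/FALSE labels, one entry per detected object, in the same object order.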

Source Code

All source code, written in Matlab, can be freely downloaded here. For detailed usage, please refer to the comments in the code.

Team:WHU-China/Project/FANCY

From 2012.igem.org
