Team:Berkeley/Project/Automation

From 2012.igem.org

Revision as of 21:08, 3 October 2012 by Tdj (Talk | contribs)

We can accommodate libraries of over a million members through our MiCodes design scheme. With a data set this large, we needed to develop methods to make MiCodes a high-throughput screening technique. Because this library utilizes visual phenotypes, two aspects of microscopy needed to be automatable: image acquisition and image processing.

If a MiCode library size is very large, it is important to find a time-efficient method of taking many pictures of yeast cells. This summer we dealt with libraries on the order of 10^3 members, and our images were taken with a regular fluorescence microscope. Larger libraries would most likely make use of automated stages to speed up image acquisition.

The segmentation of cells from an image allows for cell-by-cell analysis in downstream steps. To perform such analyses, we wrote cell segmentation software in MATLAB to distinguish each individual cell from its background. We first used edge detection with the Sobel operator and several filtering options to approximate possible cell outlines in the image. Then we performed a series of dilation and erosion steps to clear background pixels. We also added a filtering routine to refine the overall cell segmentation algorithm. This refinement uses several geometric criteria, which we determined by testing approximately 400 sample images. Our cell identification (cellID) file can be downloaded here.
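The edge-detection and morphology steps can be sketched as follows. This is a minimal pure-Python illustration, not the team's MATLAB cellID code: the function names, the 3x3 structuring element, the edge threshold, and the exact order of dilation and erosion are all assumptions chosen for the example.

```python
# Sketch only: a pure-Python stand-in for the segmentation steps described above.
# The 3x3 neighbourhood, threshold value, and step order are assumptions.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Return the Sobel gradient magnitude; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def threshold(img, cutoff):
    """Binarize: 1 where the value exceeds the cutoff, else 0."""
    return [[1 if v > cutoff else 0 for v in row] for row in img]


def _morph(mask, rule):
    """Apply a 3x3 morphological step; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in range(y - 1, y + 2)
                for xx in range(x - 1, x + 2)
            ]
            out[y][x] = 1 if rule(window) else 0
    return out


def dilate(mask):
    return _morph(mask, any)  # set if any neighbour is set


def erode(mask):
    return _morph(mask, all)  # set only if every neighbour is set


def segment(img, edge_threshold):
    """Edge-detect, close the outline, then trim it back."""
    edges = threshold(sobel_magnitude(img), edge_threshold)
    closed = erode(dilate(edges))  # closing fills the interior of the outline
    return erode(closed)           # extra erosion trims the thickened edge


# Toy 9x9 image: one bright 3x3 "cell" on a dark background.
image = [[100 if 3 <= y <= 5 and 3 <= x <= 5 else 0 for x in range(9)]
         for y in range(9)]
cell_mask = segment(image, edge_threshold=50)
```

On this toy image the Sobel step marks a ring around the bright square, the dilation and erosion pair fills the ring, and the final erosion shrinks the result back to the square itself.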

On the left we see the original image that requires cell segmentation. On the right we have the processed image, where each cell has been identified as a separate object.

The cellID program is currently optimized for identifying yeast cells and works best for images in which cell clumping is at a minimum. We avoided cell clumping by experimentally altering the cell density of the samples we imaged. Overall, we have seen a successful identification rate of 96 percent for images we have tested. Images of each individual cell were generated, as shown below, and saved at the end of this process for use with downstream analysis. These saved images have several uses. For example, the cell images can be converted into a binary mask to sort out the organelles of a cell for several fluorescent channels. The number of images generated can also be used to count the number of cells identified from a certain population.
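Counting cells and saving one image per cell can be illustrated with connected-component labeling on the segmentation mask. The sketch below is an assumption about how such a step could look, not the cellID implementation: `label_cells` and `crop_cell` are hypothetical helper names, and 4-connectivity is an arbitrary choice.

```python
from collections import deque


def label_cells(mask):
    """Label 4-connected foreground regions; return (labels, cell_count)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one cell
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def crop_cell(image, labels, cell_id):
    """Return the bounding-box crop of one cell, zeroing pixels outside it."""
    coords = [(y, x) for y, row in enumerate(labels)
              for x, v in enumerate(row) if v == cell_id]
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return [[image[y][x] if labels[y][x] == cell_id else 0
             for x in range(min(xs), max(xs) + 1)]
            for y in range(min(ys), max(ys) + 1)]


# Toy example: a mask with two separate cells and matching intensities.
mask = [[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0]]
image = [[5, 6, 0, 0, 0],
         [7, 8, 0, 0, 0],
         [0, 0, 0, 9, 4],
         [0, 0, 0, 3, 1]]

labels, cell_count = label_cells(mask)
second_cell = crop_cell(image, labels, 2)
```

Here `cell_count` gives the number of cells identified, and each crop keeps only the pixels that belong to its cell, which is the same idea as applying a per-cell binary mask to a fluorescent channel.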

An essential portion of automation is determining morphological features. In the context of organelle detection, we developed pipelines to collect geometric measurements characteristic of each organelle (actin, cell periphery, vacuolar membrane, and nucleus), and defined specific measurements and cutoff values as our feature set for each organelle.

To achieve this, we imaged a small sample of about 100 cells with varying phenotypes and determined the feature sets associated with each of the four organelles. We used CellProfiler 2.0, open-source software designed for quantitative analysis of cellular phenotypes, to build pipelines that demonstrate the feasibility of automated identification of cellular phenotypes. The nucleus and actin MiCodes identification pipelines can be viewed here, in .cp format (for CellProfiler 2.0).

Pipeline procedure:
1. Separate cell image into its constituent fluorescent channels. Convert cells to grayscale.

2. Circle and identify blobs of high-intensity pixels. These become unspecific "objects", which are converted to a pseudocolor image.

To identify "interesting" pixels, we performed a Robust Background Adaptive threshold operation, which is a thresholding technique available in CellProfiler. We found that this method is most effective on high-background images, which are common when analyzing subcellular phenotypes over cytosolic background.
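The idea behind this thresholding step can be sketched in a few lines. CellProfiler's documentation describes the robust background method as trimming the dimmest and brightest pixels and then setting the cutoff at the mean plus a multiple of the standard deviation of what remains. The 5% trim, the multiplier of 2, and the simple non-overlapping tiling used for the "adaptive" part are assumptions for this illustration and are not taken from the team's pipeline.

```python
from statistics import mean, pstdev


def robust_background_threshold(pixels, trim=0.05, n_dev=2.0):
    """Trim both intensity tails, then return mean + n_dev * std of the rest."""
    ordered = sorted(pixels)
    k = int(len(ordered) * trim)          # pixels dropped from each tail
    kept = ordered[k:len(ordered) - k]
    return mean(kept) + n_dev * pstdev(kept)


def adaptive_threshold(img, tile):
    """Threshold each tile x tile block with its own robust cutoff.

    Simplification: real adaptive methods usually smooth cutoffs between tiles.
    """
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            block = [(y, x)
                     for y in range(ty, min(ty + tile, h))
                     for x in range(tx, min(tx + tile, w))]
            cutoff = robust_background_threshold([img[y][x] for y, x in block])
            for y, x in block:
                mask[y][x] = 1 if img[y][x] > cutoff else 0
    return mask


# Toy 4x8 image: a dim left half and a brighter right half (high background),
# each containing a single bright spot.
img = [[10] * 4 + [100] * 4 for _ in range(4)]
img[1][1] = 200   # spot over low background
img[2][6] = 250   # spot over high background
mask = adaptive_threshold(img, tile=4)
```

Because each tile gets its own cutoff, the spot over the brighter background is still picked out while the background around it is rejected, which is why a local method suits images with uneven or high background.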

3. Determine the features associated with each "object." We took measurements of geometric properties of a sample size of about 100 cells, and selected our geometric properties through inspection of these measurements. We came up with the following feature types:

Geometric Property	Description
Area	The area occupied by a blob of pixels.
Perimeter	The total number of pixels around the boundary of a blob.
Form Factor	How circular an object is. Calculated as (4 × π × area) / (perimeter)^2.
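As a quick worked example of the form factor, the sketch below evaluates the formula on idealized shapes. The shapes and their dimensions are illustrative assumptions, not measurements from our images.

```python
import math


def form_factor(area, perimeter):
    """Return 4 * pi * area / perimeter^2 (1.0 for a perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2


# Idealized shapes (illustrative values only).
circle = form_factor(area=math.pi * 10 ** 2, perimeter=2 * math.pi * 10)
square = form_factor(area=10 * 10, perimeter=4 * 10)
thin_rectangle = form_factor(area=1 * 20, perimeter=2 * (1 + 20))
```

A perfect circle gives exactly 1, a square gives π/4 (about 0.79), and a long thin rectangle gives a much smaller value, so the form factor separates round blobs from elongated ones.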

To refine organelle detection, we also intend to analyze the following additional properties:

Geometric Property	Description
Solidity	Area/Convex Area, or how "solid" a blob is.
Eccentricity	Ratio of the distance between foci of an ellipse and its major axis length.
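Both properties can be computed directly from their definitions. The sketch below is an illustration under simplifying assumptions: the blob is represented as a polygon outline rather than a pixel mask, the ellipse axes are given rather than fitted, and all function names are hypothetical.

```python
import math


def eccentricity(major_axis, minor_axis):
    """Distance between the foci divided by the major axis length."""
    a, b = major_axis / 2, minor_axis / 2
    focal_distance = 2 * math.sqrt(a * a - b * b)
    return focal_distance / major_axis


def polygon_area(points):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    total = 0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def solidity(outline):
    """Area of the outline divided by the area of its convex hull."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# Illustrative shapes: an L-shaped outline and a plain square.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
square = [(0, 0), (4, 0), (4, 4), (0, 4)]

l_solidity = solidity(l_shape)     # area 12 over hull area 14
square_solidity = solidity(square)
ellipse_ecc = eccentricity(major_axis=10, minor_axis=6)
```

The convex square has a solidity of 1, while the notch in the L-shape lowers it to 12/14. The eccentricity is 0 for a circle and approaches 1 as the ellipse becomes more elongated.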

4. Use the measurements taken to classify each "object" as a particular organelle.
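A cutoff-based classification of this kind can be sketched as a table of allowed ranges per organelle. Every number below is a hypothetical placeholder chosen for illustration; the actual feature set and cutoff values live in the CellProfiler pipelines and are not reproduced here.

```python
# Hypothetical cutoffs: (min, max) ranges per feature. These are NOT the
# values from the team's pipelines; they only illustrate the mechanism.
CUTOFFS = {
    "nucleus": {"area": (40, 200), "form_factor": (0.8, 1.0)},
    "actin": {"area": (5, 40), "form_factor": (0.0, 0.5)},
}


def classify(features, cutoffs=CUTOFFS):
    """Return the first organelle whose ranges all contain the measurements."""
    for organelle, ranges in cutoffs.items():
        if all(low <= features[name] <= high
               for name, (low, high) in ranges.items()):
            return organelle
    return "unclassified"


# Example measurements for three hypothetical objects.
round_blob = {"area": 100, "form_factor": 0.95}
small_streak = {"area": 20, "form_factor": 0.2}
oversized_blob = {"area": 500, "form_factor": 0.9}

results = [classify(obj) for obj in (round_blob, small_streak, oversized_blob)]
```

Objects that match no range fall through to "unclassified", which keeps ambiguous blobs from being forced into an organelle class.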

Results: Our software successfully identified MiCoded nuclei 95% of the time when processing our sample images. In the future, with a large, already-constructed dataset (a complete MiCode library), we would likely employ machine learning techniques to improve the automated identification.