Team:Johns Hopkins-Software/Cloud

From 2012.igem.org


Revision as of 01:44, 1 October 2012




Autogene harnesses the power of the cloud to perform computationally intensive tasks quickly. Cloud computing is the use of software and hardware services across a network, often the internet. One advantage is that an organization does not have to maintain its own hardware, so it can save on the cost of the technology while maintaining the quality of performance. Another is broader access: essentially anyone with authorized credentials can reach the data or software through the internet, without being limited to any physical location. And of course, the cloud can handle many demanding tasks. By using multiple machines to process work in parallel, a job can be completed in a small fraction of the time.

In the case of the Autogene alignment algorithm, we wrote a client script that communicates with the cloud backend, which runs two tiers of algorithms that split the job into many subjobs running in parallel. We tested this on an alignment of the pUC18 sequence, which is 2,680 letters long, against a library of 17,498 yeast features, each about 400 base pairs long. Run conventionally without the cloud, this alignment took about 39 minutes. On the cloud with 10 processors the time dropped to three minutes, about 13 times faster, and with 30 processors it dropped to about one minute. pUC18 is a relatively small sequence. Many sequences of interest are thousands of letters long, and libraries can contain many more features, so alignments could take weeks to complete. Some alignment tasks would also require more memory than a local machine can provide, so these are jobs that could only be done through a cloud server. With this kind of improvement, we are making the impossible in biology possible.
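The idea of splitting one alignment job into parallel subjobs can be illustrated with a minimal sketch. This is not the Autogene implementation: the function names (`chunk`, `score`, `run_subjob`, `align_parallel`) are hypothetical, the k-mer overlap score is only a placeholder for a real alignment score, the split has a single level rather than the two tiers described above, and a local thread pool stands in for the cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def kmers(seq, k):
    """Return the set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score(query, feature, k=4):
    """Placeholder similarity (assumption): count of distinct shared k-mers."""
    return len(kmers(query, k) & kmers(feature, k))


def run_subjob(query, features):
    """One subjob: score the query against one slice of the feature library."""
    return [(name, score(query, seq)) for name, seq in features]


def align_parallel(query, library, n_workers=10):
    """Split the library into subjobs, run them concurrently, merge the hits.

    The thread pool is a local stand-in; in a cloud setting each subjob
    would be sent to a separate worker machine instead.
    """
    subjobs = chunk(library, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(lambda part: run_subjob(query, part), subjobs)
        hits = [hit for part in partials for hit in part]
    # Best-scoring features first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Calling `align_parallel(query, library, n_workers=10)` with `library` as a list of `(name, sequence)` pairs returns every feature with its placeholder score, highest first. Because each subjob only needs the query and its own slice of the library, the slices can be processed independently, which is what lets adding workers shorten the total run time.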

Retrieved from "http://2012.igem.org/Team:Johns_Hopkins-Software/Cloud"