Genomics Startups Are Flourishing in the Cloud

Among the major success stories of the last decade are the advances made in bioinformatics and genomics. The knowledge developed by the genomics and bioinformatics industries will have a huge impact on personalized medical treatment, our understanding of disease, and how we plan our lives. Much of that success is the result of improved sequencing technology that allows companies to sequence genomes in timeframes that would have been unthinkable just a few years ago.

A thriving industry of bioinformatics and genomics startups is seeking to leverage the data made available by improved sequencing technologies. Sequencing is only the first stage, though. The sequencing process produces huge amounts of data, sometimes hundreds of gigabytes across multiple samples. To provide useful information, that data has to be processed and analyzed.

The Challenge

Genomics is computationally intensive and requires large amounts of storage. While sequencing technology has improved drastically, startups and researchers face a downstream resource bottleneck in processing and storing the output. Most don’t have the capital to build their own computing clusters, and so they turn to the cloud.

In this article we’d like to talk about how ComputeNext’s cloud marketplace can help your genomics startup leverage the cloud for data analysis, storage, and collaboration.

Low-Cost And Flexible Storage

As we already said, genome sequencing produces huge amounts of data. The analysis stage can also create large datasets. That data has to be stored somewhere. The cloud provides almost unlimited quantities of inexpensive data storage.

Improved Collaboration

Sharing genomics datasets used to be a clumsy business that involved copying them to multiple hard drives and mailing the drives to collaborators. Once data is uploaded to a cloud storage platform, access can be granted to interested parties across the globe. When a project depends on insights drawn from data by experts who may be located anywhere in the world, or perhaps just next door, the improved collaboration that cloud storage makes possible can make a significant difference to its success.

The ComputeNext cloud marketplace helps genomics startups discover and provision flexible storage from a large range of vendors.

On-Demand Compute Capacity

Genomics algorithms are computationally intensive. The raw data from sequencing isn’t particularly useful until it’s analyzed, and given the size of the datasets and the complexity of pattern recognition and other algorithms, that takes a lot of computing heft. Traditionally, genomics analysis has been carried out on large compute clusters, but procuring, housing, and maintaining that much physical infrastructure is expensive.

Most genomics research requires compute capacity in intense bursts rather than consistently. Investing significant capital in in-house server clusters is uneconomical when they’ll be idle for a lot of the time.

Cloud platforms allow genomics researchers to deploy variable-capacity virtual clusters on demand and pay only for the resources they use. By using the cloud, researchers can slash their compute costs, processing multi-gigabyte workloads remotely and then downloading results that often amount to only a few kilobytes.
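To make the burst-versus-idle argument concrete, here is a minimal back-of-the-envelope sketch in Python. Every function name and every figure in it (capital cost, upkeep, hourly rate, node count) is a hypothetical placeholder chosen for illustration, not a price from any vendor or from the ComputeNext marketplace.

```python
HOURS_PER_YEAR = 24 * 365


def in_house_cost(capital_cost, annual_upkeep, years):
    """Total cost of owning a cluster, whether it is busy or idle."""
    return capital_cost + annual_upkeep * years


def on_demand_cost(hourly_rate, nodes, busy_hours_per_year, years):
    """Total cost of renting nodes only for the hours they are actually busy."""
    return hourly_rate * nodes * busy_hours_per_year * years


def break_even_utilization(capital_cost, annual_upkeep, hourly_rate, nodes, years):
    """Fraction of the year the cluster must be busy before owning is cheaper.

    A value above 1.0 means renting is cheaper even at full-time use.
    """
    owned = in_house_cost(capital_cost, annual_upkeep, years)
    full_time_rental = on_demand_cost(hourly_rate, nodes, HOURS_PER_YEAR, years)
    return owned / full_time_rental


if __name__ == "__main__":
    # Hypothetical figures: a 20-node cluster kept for 3 years.
    owned = in_house_cost(capital_cost=200_000, annual_upkeep=40_000, years=3)
    # Hypothetical bursty workload: 1,000 busy hours per year.
    rented = on_demand_cost(hourly_rate=1.0, nodes=20, busy_hours_per_year=1_000, years=3)
    threshold = break_even_utilization(200_000, 40_000, 1.0, 20, 3)
    print(f"owned: {owned}, rented: {rented}, break-even utilization: {threshold:.0%}")
```

Under these assumed numbers, a workload that keeps the cluster busy for only a small fraction of the year sits well below the break-even point, which is the situation the paragraphs above describe. Real prices and utilization will differ, so the inputs should be replaced with your own.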

Replicable Toolsets

Genomics and bioinformatics researchers depend on a well-established toolset for analysis, including BLAST, Glimmer, HMMER, PHYLIP, and GeneSpring, among many others. Each company and project has different requirements. By creating server images with the necessary tools preinstalled, researchers can deploy or replicate servers running the software they need very quickly.
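One simple way to confirm that a server launched from such an image is ready to use is to check that the expected executables are on the PATH. The sketch below is only an illustration: the list of executable names is an assumption (for example, `blastn` from BLAST+, `hmmsearch` from HMMER, and `glimmer3` from Glimmer), and the names installed on your own image may differ.

```python
import shutil

# Assumed executable names for illustration; adjust to match your own image.
REQUIRED_TOOLS = ["blastn", "hmmsearch", "glimmer3"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH.

    `which` is injectable so the check can be exercised without the tools installed.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools are installed.")
```

Running a check like this as the first step after a server boots catches an incomplete image before an analysis job is submitted to it.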

The cloud is providing the informatics infrastructure necessary to leverage the data produced by sequencing hardware. The ComputeNext cloud marketplace helps genomics startups find the best platform for their storage and compute workloads.


Image: Flickr/igemhq