It’s been a few months since the Obama administration launched the Big Data Research and Development Initiative, aimed at pooling large collections of data to gain knowledge and insight into everything from national security to the environment to biomedical research.
Several federal agencies are involved in the project, including the National Science Foundation and the National Institutes of Health (NIH), a division of the US Department of Health and Human Services; both are seeking ways to manage and analyze large data collections. As a kickoff to the initiative, NIH made the 1000 Genomes Project data set freely available on the Amazon cloud.
The CDC has been in the game of sharing data for about two years with its BioSense 2.0 system and its Special Bacteriology Reference Lab. The Obama initiative brings the FDA on board as well, with a proposed FDA Virtual Laboratory Environment that will “combine existing resources and capabilities to enable a virtual laboratory data network, advanced analytical and statistical tools and capabilities, crowd sourcing of analytics to predict and promote public health, document management support, tele-presence capability to enable worldwide collaboration, and make any location a virtual laboratory with advanced capabilities in a matter of hours,” according to a White House fact sheet about the Big Data program.
Last week, Andrew Kasarskis of Mount Sinai’s Institute for Genomics and Multiscale Biology spoke on NPR about Big Data’s role in healthcare and medical R&D, including its potential to help develop better treatments.
“We try and leverage very, very large-scale data of very, very deep complexity to probe questions about how biology works and that can be used to help patients,” he told NPR’s Tracey Samuelson.
For example, Kasarskis and his Mount Sinai team discussed with NPR how data could be used to understand the condition of an 18-month-old child with a liver disorder, drawing on the baby’s DNA, the parents’ DNA, and data from liver samples and related genetic factors.
Over the past couple of years, some companies, including GSK, have opened their compound libraries to researchers worldwide so that existing data can be leveraged to identify new treatments and drug targets.
With Big Data surrounding our every move, the possibilities for drug discovery and development seem endless. Has your company used open libraries or data pools? How do you envision Big Data affecting drug discovery and development?