How Google is using big data and machine learning to aid drug discovery
From answering health-related questions in its search results to a fitness data platform for developers, Google is becoming increasingly ingrained in the fabric of our daily health-and-wellbeing habits. But behind the scenes, the Internet giant is also working to expedite the discovery of drugs that could prove vital to finding cures for many human ills.
Working with Stanford University’s Pande Lab, Google Research has published a paper titled “Massively Multitask Networks for Drug Discovery” [PDF], which looks at how data from a myriad of sources can be used to better determine which chemical compounds will serve as “effective drug treatments for a variety of diseases.”
While the paper itself doesn’t reveal any major medical breakthroughs, it does point to how deep learning can be used to crunch huge datasets and accelerate drug discovery. Deep learning is a machine-learning technique in which artificial neural networks are trained on large volumes of existing data and then used to make predictions about new inputs. You might want to check out our guide to five emerging deep learning startups to watch in 2015.
“One encouraging conclusion from this work is that our models are able to utilize data from many different experiments to increase prediction accuracy across many diseases,” explained the multi-authored Google Research blog post. “To our knowledge, this is the first time the effect of adding additional data has been quantified in this domain, and our results suggest that even more data could improve performance even further.”
Google said it worked at a scale “18x larger than previous work,” and tapped a total of 37.8 million data points across 200+ individual biological processes.
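To make the multitask idea concrete, here is a minimal, purely illustrative sketch, not the model from the paper: one hidden layer is shared across every task, while each biological assay gets its own small output head, so data from any one assay helps shape the representation used by all the others. All sizes and names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 32   # e.g. the length of a molecular fingerprint (assumption)
n_hidden = 16     # shared representation size (assumption)
n_tasks = 5       # stand-in for the 200+ assays used in the paper

# Shared layer: its weights are learned jointly from every task's data.
W_shared = rng.normal(0, 0.1, (n_features, n_hidden))
# One small output head per task (per biological assay).
W_tasks = rng.normal(0, 0.1, (n_tasks, n_hidden))

def predict(x):
    """Return a predicted probability of activity in each assay for one compound."""
    h = np.maximum(0, x @ W_shared)   # shared ReLU representation
    logits = W_tasks @ h              # one logit per assay
    return 1 / (1 + np.exp(-logits))  # sigmoid turns each logit into a probability

compound = rng.normal(size=n_features)  # a fake compound fingerprint
probs = predict(compound)
print(probs.shape)  # (5,): one activity prediction per assay
```

In a real system the weights would be trained by gradient descent on measured assay outcomes; the point of the sketch is only the shared-layer/per-task-head structure that lets many experiments inform one another.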
“Because of our large scale, we were able to carefully probe the sensitivity of these models to a variety of changes in model structure and input data,” Google said. “In the paper, we examine not just the performance of the model but why it performs well and what we can expect for similar models in the future.”
This feeds into a bigger trend we’ve seen of late, with many of the big tech companies investing resources in deep learning. Last year, Twitter, Google, and Yahoo all acquired deep learning startups, while Facebook and Baidu made significant hires in this realm. Netflix and Spotify carried out work in this area too.
At VentureBeat’s HealthBeat conference last October, we looked at how the future of health care could lean heavily on robotics, analytics, and artificial intelligence (AI). Feeding into this diagnostic element is treatment discovery, which is increasingly turning to AI, big data, and deep learning too, as we’re seeing with this latest research from Google and Stanford.
By automating and improving predictive techniques, this approach should not only speed up the drug discovery process but also cut its costs. From the Google report:
Discovering new treatments for human diseases is an immensely complicated challenge. Prospective drugs must attack the source of an illness, but must do so while satisfying restrictive metabolic and toxicity constraints. Traditionally, drug discovery is an extended process that takes years to move from start to finish, with high rates of failure along the way.
In short, testing millions of compounds can take a long time, so anything that can increase the chances of striking a successful combination can only be a good thing. This is where machine learning at scale may help.
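The intuition can be shown with a toy simulation (all numbers below are invented for illustration): even a noisy model that ranks compounds only somewhat better than chance lets a fixed testing budget uncover far more true hits than testing compounds in arbitrary order.

```python
import random

random.seed(0)

n_compounds = 100_000
hit_rate = 0.001   # assume roughly 1 in 1,000 compounds is a true hit

compounds = [{"is_hit": random.random() < hit_rate} for _ in range(n_compounds)]
for c in compounds:
    # A noisy "model score" that is merely correlated with true activity.
    c["score"] = (1.0 if c["is_hit"] else 0.0) + random.gauss(0, 0.7)

budget = 1_000  # suppose we can only afford to physically test 1,000 compounds

# Strategy 1: test the first 1,000 compounds in arbitrary order.
random_hits = sum(c["is_hit"] for c in compounds[:budget])

# Strategy 2: test the 1,000 compounds the model ranks highest.
ranked = sorted(compounds, key=lambda c: c["score"], reverse=True)
ranked_hits = sum(c["is_hit"] for c in ranked[:budget])

print(random_hits, ranked_hits)  # model-ranked screening finds more hits
```

The gap between the two strategies is the practical payoff the paper is after: better predictions mean fewer wasted experiments for the same number of discovered candidates.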