Public perception often paints scientists as having their heads in the clouds; these days, that is closer to the truth than ever. Phrases like “cloud technology” and “the internet of things” (or IoT) are thrown around with increasing regularity in biological circles, but what do they actually mean?
Cloud technologies like Dropbox, OneDrive and iCloud now allow us to access information stored remotely, saving us from the constant pop-up reminders that we should have bought a bigger hard drive. It has never been easier to share and store data, which certainly lends itself nicely to the large datasets and documents flowing constantly through every scientific discipline.
The internet of things, another phrase beginning to worm its way in, is the connectivity of our everyday objects to the internet in an attempt to streamline our lives via a messy web of Wi-Fi connections. Everything from your toothbrush to your boiler can now be accessed remotely, giving us pre-warmed houses and stern warnings about over-brushing.
But what does all this mean for biology? The last few decades have seen revolutionary strides in technology such as next-generation DNA sequencing; the integration of these technologies with the internet may have an unprecedented impact on science as we know it.
One of the most captivating examples I’ve encountered, and on several occasions now, is remote automated biomonitoring.
Traditionally, monitoring the biological populations present at a given location involved intensive surveying of the flora and fauna via laborious and time-consuming visits and collections. Next-generation sequencing can transform a drop of water, a handful of soil or a fistful of faeces into an extensive assemblage of DNA sequences*, allowing an effective survey without such intensive fieldwork and fiddly taxonomy. The method has its shortfalls, like the difficulty** of quantifying the number of individuals of a given species from the number of DNA reads that return from the sequencer, but it’s a revolution nonetheless.
Since next-generation sequencing first arose, the technology has only become cheaper and smaller, with portable USB sequencers such as the Oxford Nanopore MinION now available for use in the field. With plenty of remote, automated biodiversity samplers in use across the globe, the question on many lips is whether such a portable sequencer could be incorporated into one of these automated samplers. This is where the internet of things and cloud technology could really come into their own.
If an automated sampler, fitted with a portable sequencer***, could feed data into the cloud and be operated remotely via the internet, biodiversity assessments could be carried out from the comfort of an armchair or beanbag (if you’re that way inclined). This would be a massive revolution in the field. So, what’s to fear?
Whilst the internet of things opens many doors for science, it opens many windows for the malicious burglars prowling the scientific back alleys of the cyber-verse. Another application of cloud technology to science is the storage of specimen locations, including those of endangered species. Just like the overzealous fans of Hollywood’s shiniest stars looking for any way to get a glimpse of their private lives (or parts), opportunistic collectors of all things living are looking for ways to find that rare beetle missing from their drawer or the butterfly they need to finish their set. As scandal after scandal has revealed over the last few years, cloud technology can be vulnerable and the stored data accessible, meaning tech-savvy collectors can find their beetle or butterfly by malicious means.
For example, using Shodan, a search engine for internet-connected devices, it is possible to find and track vulnerable ships via their Satcom devices, allowing hackers not only to follow the ship across the world but also to exploit such vulnerabilities for malevolent purposes. If similar exploits are found in sequencers and other scientific technologies, research facilities could be held to ransom.
Whilst I like to think of the scientific community as a collaborative cohort, there is unfortunately pressure on many groups to compete and on many companies to push an agenda (or more likely a product). These technologies make data sabotage, manipulation and theft all the more possible. Whether it’s data stolen from the cloud or data collection tampered with via the internet of things, there are many malicious avenues to consider. An Iranian group known as Cobalt Dickens, thought to be affiliated with the Iranian government, were found last year to be targeting universities across 14 countries with phishing emails to acquire intellectual property and unpublished research. Scary stuff!
On a more personal level, as we enter the age of personalised genomics, by which I mean the use of DNA sequencing by an individual to find anything from their ancestry to whether they have joined ear lobes****, these vulnerabilities can leave your most private parts exposed: your DNA. Whilst I wholly doubt anyone will be using this to generate a Star Wars-esque clone army based on you (no offence), such exposure has serious implications for privacy surrounding medical conditions and paternity that one may otherwise wish to keep hidden. Already, such DNA tests are revealing false paternities and dormant diseases, all of which may be private information and all of which could be used for blackmail if illegitimately acquired.
This problem is so pertinent that the 100,000 Genomes Project, which sequenced 100,000 genomes from NHS patients, stores its data in a Ministry of Defence base to ward off a plethora of potential foreign attackers. This ‘gene theft’ could also theoretically be used by the more technologically advanced criminal to deposit an innocent person’s DNA at the scene of a crime, thus endangering the blameless and impeding forensic investigations. The horror!
The digital and DNA worlds have met before though. Novel applications of DNA include data storage and steganography (hiding information within a DNA sequence), with one lucky bacterium bestowed a GIF of a galloping horse and others granted such niceties as the entire novel War and Peace. Whether storing data via DNA in this way will catch on is another matter entirely. As all things do though, DNA can have its dark side for data. Tadayoshi Kohno’s group at the University of Washington took the term “computer virus” to a literal level and synthesised a strand of DNA which, when sequenced, compromises the software processing the reads and hands unprecedented access to the “biohacker”. This science-fiction feat opens a whole realm of problematic possibilities for the future of genetics and science as a whole.
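For the curious, here is a rough idea of how a file might be written in DNA. This is only a minimal sketch, assuming the simplest possible scheme of two bits per base (A = 00, C = 01, G = 10, T = 11); the mapping and the function names `encode` and `decode` are my own illustration, not the method used for the horse GIF or the novel.

```python
# Illustrative mapping only: two bits per base (A=00, C=01, G=10, T=11).
# Real DNA storage schemes are more involved, e.g. they add error
# correction and avoid long runs of the same base.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(sequence: str) -> bytes:
    """Reverse encode(): pack every four bases back into one byte."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[i:i + 4]:
            # BASES.index() raises ValueError on anything but A, C, G or T
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


message = b"Hi"
strand = encode(message)
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so even a short novel would need millions of them; the appeal lies in how densely and durably DNA can hold that information, not in convenience.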
These are just a few of the issues we may face as the future comes upon us. So, what stands in the way of these things happening? In short, robust cyber security infrastructure and practices. Whilst these intricate pathways to your passions and pastimes are opening, simple phishing remains the prevalent method for pilfering research and so much more. Although these technological advancements may give rise to novel vulnerabilities, they will also bring a new wave of answers to the questions of our time; we must protect our science so that it may protect us.
* This is an oversimplification – there are many more (often-frustrating) steps involved!
** Or impossibility; it’s an area of great contention!
*** Admittedly, this is an oversimplification; steps such as extraction of the DNA from the environmental samples collected need to be automated, too.
**** A mirror may provide quicker results.