One of the few bright spots in the worldwide pandemic is the breakneck pace of scientific progress towards understanding the disease, developing effective treatments, and, eventually, producing a vaccine. This progress is enabled in part by breakthrough innovations in biology, the most prominent of which may be gene editing. These techniques have revolutionized biological research in much the same way that deep learning transformed natural language processing and image recognition, a recurring theme of this newsletter.
One of these new techniques is single-cell sequencing. This technique measures gene expression (activity) at the level of a single cell, instead of average expression in a tissue sample. It provides much more granular information, much like comparing economic data at the household level to country-wide statistics.
Single-cell sequencing is a very powerful technique. Over the past decade it has evolved from a brittle process available only at a handful of academic labs into a standard protocol that can be run at scale on a compact and affordable instrument. Applications to covid-19 research already include identifying the specific types of lung cells targeted by the virus, and insight into the development of “cytokine storms”, an excessive immune response that causes the body to attack its own cells (more on this here and here). These detailed insights into the disease process are invaluable for identifying the most promising treatment targets.
The output of a single-cell experiment is a list of RNA snippets, each tagged with the identifier of the cell it came from -- a bit like having a list of credit cards and the most recent statement for each of them. It can tell you a lot about the economy, but concrete insights will require sophisticated analysis. So how is single-cell data analyzed?
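Before any sophisticated analysis, a list like this is usually collapsed into a table of counts: how many snippets of each gene were seen in each cell. Here is a minimal Python sketch of that first step. It assumes every snippet has already been matched to a gene, and the cell identifiers and reads are made up for illustration, not taken from a real experiment.

```python
from collections import Counter, defaultdict

# Hypothetical reads: (cell identifier, gene the RNA snippet was matched to).
reads = [
    ("cell_A", "ACE2"),
    ("cell_A", "ACE2"),
    ("cell_A", "IL6"),
    ("cell_B", "IL6"),
    ("cell_B", "TNF"),
]


def count_matrix(reads):
    """Collapse (cell, gene) reads into per-cell gene counts."""
    counts = defaultdict(Counter)
    for cell, gene in reads:
        counts[cell][gene] += 1
    return counts


counts = count_matrix(reads)
for cell in sorted(counts):
    # One row per cell, like one statement per credit card.
    print(cell, dict(counts[cell]))
```

Each row of the resulting table plays the role of one credit card statement: the methods described next start from tables of this kind, only with thousands of cells and genes.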
As the physical devices to perform single-cell experiments have matured, so have the methods to analyze the results. New algorithms enable, for example, the detection of rare, previously unknown cell types and understanding how gene expression patterns in a cell evolve over time (I had the opportunity to contribute to these two topics in prior academic work).
These methods, too, have scaled from academic proofs-of-concept to software packages that are easy to install and run, so researchers using single-cell sequencing to study covid-19 can analyze their results even with limited expertise in computational biology or machine learning. This shift from specialist method to off-the-shelf tool, for single-cell analysis as well as various other novel biological techniques, mirrors what happened in text and image processing, where new off-the-shelf tools allow junior data scientists to apply techniques that were until recently only available to researchers at companies like Google.
New algorithms for the analysis of biological data are rarely adorned with the moniker “AI”, but they represent an important branch of machine learning that has also seen fast progress in the last decade -- and are contributing today to the efforts to cure and prevent covid-19.