This Billion Dollar Startup is Using AI to Design New Drugs
Ep. 10, Unsupervised Learning | Ep. 24, Vital Signs
On our first crossover episode between Redpoint’s Unsupervised Learning and Vital Signs podcasts, we dove deep into the world of using AI in drug development with Insitro CEO Daphne Koller. See the full episode below!
We sat down with Daphne to discuss how Insitro distinguishes itself from other AI drug discovery companies, where ML can drive the most impact in drug development, whether foundation models will transform drug discovery and edtech, and more. You can listen to the full conversation on Spotify and Apple, watch the episode on YouTube, or read the highlights below.
⚡ Highlight 1: The language models of biology
Unlike the typical LLMs that are trained on human language, generative AI models in biology are trained on large data sets of molecular graphs — structures of chemical compounds and their molecular properties.
“So if you think about the large language models that are very much the talk of the day, we're building medium language models where the language is the language of cells and tissues. It's not human language, it's biology language... Now in this language we can ask, what does healthy look like? What does sick look like? And what are a set of interventions that might move us from one to the other? And for interventions, we look to genetics because genetics is one of the very few sources of ground truth that we have about causality in biology.”
Listen to the full episode to learn more about the biological data Insitro generates in addition to the clinical data that allows them to apply AI drug discovery for targeted patient treatments.
⚡ Highlight 2: Daphne’s dream dataset
Insitro primarily works with cell-based and clinical datasets to develop its AI models. While there is an abundance of cell-based datasets that are rapidly improving in quality, collecting valuable clinical data takes significantly more time due to its longitudinal nature.
“Larger cohorts with deep phenotyping that's done longitudinally. The reason why that's hard is because if you want longitudinal data, you either have to wait a really long time or you have to go backwards and hope someone collected it already… What you really want is someone who picked a cohort 70 years ago, a large cohort, and deeply phenotyped them on an ongoing basis so that today we would have all that data collected over time. That doesn't exist anywhere.”
Daphne praised the biomedical dataset curated by the UK Biobank, but still acknowledged its limitations in diversity and size.
⚡ Highlight 3: Impact of GPT on drug discovery
While AI has already made a huge impact on biology and life sciences (see DeepMind’s AlphaFold), it’s still an open question how much LLMs specifically will impact the field.
“There's certainly pieces of the large language models that are potentially hugely enabling. Maybe in less interesting ways, but still really important ways to the work that's done…There's always some stuff that you might learn from reading the literature. So reading, summarizing and distilling that into something that's cogent is really very time consuming. So, is that sexy? I don't know. But it's certainly very, very useful.”
⚡ Bonus: Daphne’s new EdTech startup
In the midst of the pandemic, when Daphne’s two daughters were shoved into “Zoom School,” it became clear that the online teaching and learning experience was not very engaging for teachers or students. That led Daphne and her husband to start Engageli. She shared her thoughts on the future of LLMs in edtech:
“With the large language models, you can start to think about an AI enabled form of engagement… you could create AI enabled tutors that can help answer students’ questions and point them towards course materials. You could help the instructor create meaningful personalized interventions with students in a way that still has that instructor voice, but you don't have to write a half page email personalized to every student from scratch, which is just time prohibitive in large classes.
So I think there's lots of ways in which LLMs could be woven into the education technology stack, and that is certainly the future of education, especially the online education.”
Watch an excerpt from the full episode where Daphne explains how Insitro is using AI to disrupt the pharmaceutical industry!