14-15 November 2023
FHNW University of Applied Sciences and Arts Northwestern Switzerland
Basel, Switzerland
This session will explore current applications of machine learning and AI for experimental design and direction, as well as the use of predictive modeling across a variety of domains.
Joshua Kangas, Ph.D. (Carnegie Mellon University)
Autonomous Scientific Research Capabilities of Large Language Models
Robert MacKnight (Carnegie Mellon University)
Transformer-based large language models are making significant strides in natural language processing, biology, chemistry, and computer programming. Here we show the development and capabilities of Coscientist, an AI system that autonomously designs, plans, and performs complex scientific experiments by incorporating large language models empowered by tools such as internet search, documentation search, code execution, and experimental automation. Coscientist showcases its potential for accelerating scientific research across six diverse tasks, including the successful reaction optimization of palladium-catalyzed cross-couplings, while exhibiting advanced capabilities for (semi-)autonomous experimental design and execution. Our findings demonstrate the versatility, efficacy, and explainability of AI systems like Coscientist in advancing scientific research, while also highlighting the importance of addressing safety implications and ensuring responsible and ethical use of such powerful tools.
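The abstract describes an agent architecture in which a language model plans actions and calls external tools. The sketch below is a minimal, hypothetical illustration of such a tool-use loop, not Coscientist's actual code; the call_llm stub, the tool names, and the TOOL/QUERY message format are all invented for illustration.

```python
# Hypothetical sketch of a tool-augmented LLM loop (not Coscientist's implementation).
# A planner model proposes a tool call, the tool result is appended to the prompt,
# and the loop repeats until the model returns a final answer.

def call_llm(prompt: str) -> str:
    """Stand-in for a large language model call; returns canned responses here."""
    if "RESULT" in prompt:
        return "FINAL: use 1.0 mol% Pd catalyst at 80 C"  # toy answer
    return "TOOL: search | QUERY: conditions for Suzuki coupling"

TOOLS = {
    "search": lambda query: f"RESULT: literature snippets for '{query}'",
    "run_code": lambda code: "RESULT: simulated yield 87%",
}

def agent_loop(task: str, max_steps: int = 5) -> str:
    prompt = f"Task: {task}"
    for _ in range(max_steps):
        reply = call_llm(prompt)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        # Parse a "TOOL: name | QUERY: args" action and execute the chosen tool.
        tool_name = reply.split("TOOL:")[1].split("|")[0].strip()
        args = reply.split("QUERY:")[1].strip()
        prompt += "\n" + TOOLS[tool_name](args)
    return "no answer within step budget"

if __name__ == "__main__":
    print(agent_loop("Optimize a palladium-catalyzed cross-coupling reaction"))
```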
Data-Driven Drug Discovery for Chronic Liver Diseases
Dimitris Polychronopoulos, M.S., Ph.D. (Ochre Bio)
To the best of our knowledge, Ochre Bio has generated more human liver deep phenotyping data than anyone else. Deep phenotyping involves studying how perturbation of genes affects the liver by juxtaposing tissue, blood and clinical phenotypes with functional genomics data to produce a map, or knowledge graph, of the relationships between genes and phenotypes. We are building on this data to enable knowledge-graph-based gene prioritisation. Running such ‘in silico’ screens allows us to narrow down the universe of genes for further study in our cellular, tissue, and organ models.
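As an illustration of the idea of knowledge-graph-based gene prioritisation (this is a toy sketch, not Ochre Bio's pipeline), one simple approach is to represent genes and phenotypes as nodes linked by evidence edges and rank genes by personalized PageRank from the phenotypes of interest. The gene names and edges below are invented.

```python
# Minimal, hypothetical sketch of knowledge-graph-based gene prioritisation.
import networkx as nx

G = nx.Graph()
# Toy evidence edges linking genes to liver phenotypes (illustrative only).
G.add_edges_from([
    ("GENE_A", "steatosis"), ("GENE_A", "fibrosis"),
    ("GENE_B", "fibrosis"), ("GENE_C", "inflammation"),
    ("GENE_C", "fibrosis"), ("GENE_D", "steatosis"),
])

# Seed a random walk from the phenotype of interest and rank genes by score.
scores = nx.pagerank(G, personalization={"fibrosis": 1.0})
ranked = sorted(
    (n for n in G if n.startswith("GENE_")),
    key=lambda n: scores[n], reverse=True,
)
print(ranked)  # genes most connected to fibrosis come first
```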
Learning Gene Cis-Regulation Using Large Language Models
Lindsay Edwards, Ph.D. (Relation Therapeutics)
Scientists (including those at Relation) initially used 1D CNNs, then LSTMs, and now use Transformer-based architectures to predict how changes in DNA sequence affect the biology of the DNA itself (for example, whether key molecules called transcription factors will bind or not). Ultimately, these models can predict how changes in DNA sequence might affect gene expression. In this talk I will outline the science underlying these efforts in detail, before describing Rosalind, Relation's large language model for DNA, and the role of Cambridge-1 in pushing the scientific bounds in this exciting and important area of biology. I will also touch on broader issues related to the difficulties of deploying ML effectively in the drug discovery process.
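To make the modelling task concrete, the toy sketch below (not Rosalind, and far smaller than the Transformer-based models the abstract describes) one-hot encodes a DNA sequence and predicts transcription-factor binding with a small 1D CNN; the model, sequence, and output are purely illustrative.

```python
# Toy sequence-to-binding model: one-hot encoded DNA -> 1D CNN -> binding probability.
import torch
import torch.nn as nn

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

def one_hot(seq: str) -> torch.Tensor:
    """Encode a DNA string as a (4, length) float tensor."""
    x = torch.zeros(4, len(seq))
    for i, base in enumerate(seq):
        x[BASES[base], i] = 1.0
    return x

class BindingPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(4, 16, kernel_size=8)   # motif-like filters
        self.head = nn.Linear(16, 1)                  # binding logit

    def forward(self, x):
        h = torch.relu(self.conv(x))        # (batch, 16, positions)
        h = h.max(dim=-1).values            # global max pool over positions
        return torch.sigmoid(self.head(h))  # probability of binding

model = BindingPredictor()
seq = "ACGT" * 25                            # 100 bp toy sequence
prob = model(one_hot(seq).unsqueeze(0))      # add batch dimension
print(float(prob))                           # untrained, so roughly random
```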