AI-based Ideation Engine for Biopharma

Bringing a novel therapeutic to patients is difficult, expensive and time-consuming. Developing a drug and bringing it to market costs about $3 billion on average and can take 12-14 years. The drug discovery phase, which consumes about a third of the overall cost, requires the synthesis of thousands of molecules and up to 5 years to develop a single pre-clinical lead candidate. Furthermore, only 10% of the compounds that enter Phase I trials actually receive approval. We believe that Artificial Intelligence (AI) has the potential to speed up the discovery phase and lower discovery costs significantly. As an additional benefit, AI can help scientists send higher quality compounds to the clinic, reducing the failure rate. Recent advances in molecular science and machine learning, combined with the availability of powerful cloud compute platforms, are turning this potential into reality.

BIOVIA Generative Therapeutics Design (GTD) improves and accelerates lead candidate design by automating the virtual creation, testing and selection of novel small molecules. The cloud-based solution employs advanced AI/Machine Learning techniques to help scientists decide what molecules to make next—helping to guide the drug discovery process and optimize R&D output.

Active Learning

Active learning is a specialization within Machine Learning in which computation (the ‘virtual’) and experiment (the ‘real’) are combined—allowing scientists to find optimal answers in the most efficient way possible. Using small molecule lead discovery as an example, a drug discovery team starts with an initial model built from a small amount of data, e.g., assay results for a few tens of compounds. They then use this model to suggest new compounds that can improve the scope of their models. As they synthesize and assay a series of new compounds, new training data becomes available to retrain and improve the models. Iteratively updating the model in this way is a well-established approach to optimizing designs using the fewest iterations, hence shortening the overall discovery timeline. As the scope and quality of the models improve, compounds recommended for achieving the desired target product profile (TPP) will become more diverse and more likely to be successful.
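
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in Generative Therapeutics Design: each compound is reduced to a single made-up descriptor value, the assay() function is a hypothetical stand-in for a real laboratory measurement, and the model is a simple nearest-neighbour lookup. All function and parameter names are invented for this example.

```python
import random

def assay(x):
    """Hypothetical stand-in for a wet-lab measurement of compound x."""
    return -(x - 0.7) ** 2

def predict(train, x):
    """Nearest-neighbour surrogate model.

    Returns the measured value of the closest tested compound and the
    distance to it, which serves as a crude uncertainty estimate.
    """
    nearest_x, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y, abs(nearest_x - x)

def active_learning(pool, n_initial=3, n_rounds=5, batch_size=2,
                    explore=1.0, seed=0):
    """Alternate between model-guided selection and (simulated) assays."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    # Initial model: a handful of compounds that already have assay results.
    train = [(x, assay(x)) for x in pool[:n_initial]]
    untested = pool[n_initial:]
    for _ in range(n_rounds):
        def acquisition(x):
            # Favour high predicted values, plus a bonus for compounds
            # far from anything tested so far (widens the model's scope).
            predicted, distance = predict(train, x)
            return predicted + explore * distance
        untested.sort(key=acquisition, reverse=True)
        batch, untested = untested[:batch_size], untested[batch_size:]
        # "Synthesize and assay" the batch; the results become training data.
        train.extend((x, assay(x)) for x in batch)
    return train

pool = [i / 20 for i in range(21)]
tested = active_learning(pool)
best_x, best_y = max(tested, key=lambda pair: pair[1])
print(f"tested {len(tested)} of {len(pool)} compounds; best descriptor = {best_x}")
```

The explore parameter controls the balance between testing compounds the model already rates highly and testing compounds in regions it knows little about. A real project would replace the toy model with trained property models and the assay() call with actual experiments.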

Human-in-the-Loop AI

Generative Therapeutics Design iteratively generates thousands of virtual molecules, exploring a vast chemical design space for optimal novel lead candidates. As lead optimization is a multi-objective optimization challenge, the system assesses and balances important target properties such as drug activity, solubility, hepatotoxicity, drug availability and metabolic stability, and potentially also ease of synthesis, developability and IP considerations such as patentability.
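
One common way to balance several objectives at once is to keep only the candidates that no other candidate beats on every property, known as the Pareto front. The sketch below shows the idea; it is not a description of how Generative Therapeutics Design scores molecules. The molecule names and scores are invented, and every property is assumed to be scaled so that a higher number is better.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    candidates maps a name to a tuple of scores, higher is better.
    """
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    ]

# Hypothetical predicted scores: (activity, solubility, metabolic stability)
predicted = {
    "mol_A": (0.9, 0.2, 0.5),
    "mol_B": (0.6, 0.8, 0.6),
    "mol_C": (0.5, 0.7, 0.4),  # worse than mol_B on all three properties
    "mol_D": (0.9, 0.2, 0.4),  # never better than mol_A, worse on stability
}
print(pareto_front(predicted))  # ['mol_A', 'mol_B']
```

Here mol_A and mol_B represent different trade-offs (potency versus solubility), so both survive, and choosing between them is where a chemist's judgment comes in.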

Bench chemists can provide expert insight into this process, complementing machine predictions and influencing subsequent design iterations. We use the term “augmented intelligence” for this “human-in-the-loop” concept. Human intelligence collaborates with machine intelligence to drive faster and more accurate results.

Lab-in-the-Loop AI

Of course, scientists also need to validate promising structures in the lab. This “Lab-in-the-loop Artificial Intelligence” combines the advantages of unbiased machine learning methods with real-world experimentation and the knowledge and experience of scientific experts.

As part of the design process, the system will be able to take into account reagents available for purchase from a third-party vendor or synthesis company, so organizations can minimize turn-around time and/or costs when working with internal labs or outsourcing to contract research organizations.
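
As a rough illustration of what taking reagent availability into account could look like, the sketch below keeps only the proposed compounds whose reagents all appear in a vendor catalog and ranks them by the slowest delivery time. The catalog, reagent identifiers, lead times and function name are hypothetical and are not taken from the product.

```python
def purchasable(candidates, catalog):
    """Filter and rank candidates by reagent availability.

    candidates maps a compound name to the list of reagents it needs.
    catalog maps a reagent to its delivery lead time in days.
    Returns the names of fully sourceable compounds, fastest first.
    """
    feasible = []
    for name, reagents in candidates.items():
        if all(reagent in catalog for reagent in reagents):
            # A synthesis can only start once the slowest reagent arrives.
            lead_time = max(catalog[reagent] for reagent in reagents)
            feasible.append((lead_time, name))
    return [name for _, name in sorted(feasible)]

catalog = {"R1": 3, "R2": 10, "R3": 5}  # hypothetical lead times in days
candidates = {
    "cmpd_1": ["R1", "R2"],
    "cmpd_2": ["R1", "R3"],
    "cmpd_3": ["R1", "R4"],  # R4 is not in the catalog
}
print(purchasable(candidates, catalog))  # ['cmpd_2', 'cmpd_1']
```

The same pattern extends to ranking by price instead of lead time, or by a combination of the two.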

Ongoing compound testing provides additional training data to improve predictive models. This critical active learning process, combined with real-world testing, expands the scope of the models, allowing subsequent iterations to explore new territory. The process continues until the medicinal chemist finds compounds that meet the TPP.

Modeling and Simulation

Modeling and simulation can complement automated machine learning methods. Computational chemists can model complex systems from first principles and gain insights that would take much longer and cost far more when obtained via bench experimentation. For example, methods such as pharmacophore scoring, molecular docking and free energy perturbation (FEP) can help scientists predict in three dimensions if and how a proposed drug molecule will interact with a protein implicated in a disease. Scientists will be able to automate these methods and run them as part of the generative design process.

A Case Study

Using BIOVIA Generative Therapeutics Design, a large U.S. pharmaceutical company was able to build a series of high-quality machine learning models from an initial set of project compounds. Based on these models, the system proposed a series of compounds for the next round of synthesis and testing. The system quickly ‘learned’ from the project compounds which structural motifs were atypical, yet regarded as valuable for the specific therapeutic target. Medicinal chemists could also specify which parts of the starting compounds needed to be held constant to exploit a narrower chemical space around these compounds. This resulted in a new set of proposed virtual compounds with familiar synthetic routes and an improved TPP.

Ultimately, the medicinal chemists found that approximately 80% of the compounds proposed by the system met the predicted property profile, and one compound met the complete target product profile. The chemists’ feedback was that the majority of the proposed compounds were encouraging as they were structurally similar to compounds already under consideration. Even more interesting, a subset of the proposed compounds was structurally novel: these were compounds the chemists would not have considered using traditional methods. This is where Generative Therapeutics Design shows real value, in proposing compounds outside the domain typically studied by these chemists.

Three Takeaways

Generative Therapeutics Design can be an effective ideation engine for bench chemists in the pharmaceutical, biotech and even the agrochemical sector. The system can give scientists new ideas about what to synthesize next and help them investigate beyond where they typically look. It nurtures their intuition and helps them think about compounds in different ways.
Generative Therapeutics Design can accelerate lead candidate development—improving molecular quality, reducing experimentation costs and shortening discovery timelines. By helping to advance only the most promising candidates to clinical trials, the system can potentially save millions of research dollars in drug development and other programs.
Chemists working together with AI/Machine Learning deliver the best results. With Generative Therapeutics Design, scientists and AI algorithms complement each other. Scientists can work expediently with algorithms, come up with their own designs and fully leverage their intuition to get the best possible results.

One Last Word

Generative design tools are particularly powerful when used as part of a larger business workflow. BIOVIA is adding tools for the collaborative combination of Virtual and Real (V+R) data, including request management in experimental labs, registration of virtual and real compounds and test results, and automated re-learning of Machine Learning models. In this way, customers can embed groundbreaking new science into established workflows and business processes.