Validation Notes

Proteome Screening with MatchMaker achieves performance comparable to experimental techniques

MatchMaker-powered Proteome Screening identifies as many or more known binders of methotrexate, dasatinib, and panobinostat than thermal proteome profiling (TPP)

OPPORTUNITY

In silico methods capable of matching the performance of experimental approaches in drug target deconvolution can accelerate the therapeutic development process.

TECHNOLOGY

MatchMaker™
Ligand Express Proteome Screening

SOLUTION

We compared MatchMaker (MM) Proteome Screening results with those from TPP. MM identified more high-confidence targets than TPP for methotrexate and dasatinib, and as many for panobinostat.

INTRODUCTION

Both in silico and experimental approaches have been proposed to elucidate a drug’s polypharmacology. Screening drugs against many proteins permits the identification of unknown targets, informs the selection of candidate molecules for drug development, enables repurposing of approved drugs for new indications, and improves safety by identifying potentially harmful off-target interactions. Thermal proteome profiling (TPP) is a recent high-throughput experimental technique for detecting drug target engagement, which combines the cellular thermal shift assay (CETSA) with quantitative mass spectrometry[1]. TPP relies on altered protein thermostability upon ligand binding, applies to any class of target or compound, and does not require any modification to the compound. Other chemical proteomics methods for target deconvolution include activity-based protein profiling, photoaffinity labeling, and affinity purification, all of which require compound modifications that may alter activity. Herein we compare MatchMaker-powered Proteome Screening (MMPS) with TPP in their ability to identify the known protein targets of methotrexate, dasatinib, and panobinostat (Figure 1).

METHODOLOGY

We selected methotrexate, dasatinib, and panobinostat for this study because TPP data from human cell lines are readily available for them[1-3], and because sufficient high-quality experimental data are available to serve as a reference for ground truth. As the reference, we use the comprehensive protein-chemical interaction database STITCH[4], which integrates experimentally validated interactions (e.g. binding experiments evaluating Ki, IC50, or KD), manually curated datasets, and automated text mining. We limit our reference list to proteins with high-confidence interaction and binding scores (>0.7), which excludes results supported solely by text mining. To estimate the performance of MMPS when applied to investigational compounds and Novel Chemical Entities (NCEs), we retrained its deep learning model for this benchmark while excluding bioassay and structural data for all compounds similar to the three test drugs. Specifically, we excluded compounds with a Tanimoto Similarity (TS) of 0.5 or greater, using Morgan fingerprints (radius 3), to attenuate over-estimation of model quality due to compound series bias[5]. Recently reported drug-target interaction (DTI) prediction methods do not exclude similar molecules between testing and training sets[6] or from database searches[7]; their reported performance would therefore not be representative of NCEs, which are critical for pharmaceutical drug development. We then compared the lists of proteins identified by MMPS and TPP against the reference list of high-confidence targets for each molecule. A protein was counted as identified by MMPS if it appeared within the top 25 of the 8700 proteins ranked. When multiple TPP screens were available for different conditions (e.g. intact cells versus cell extracts), we analyzed the results for each condition separately, in addition to the cumulative results across all conditions.
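The similarity-based exclusion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the benchmark: fingerprints are assumed to be available as sets of "on" bit indices (in practice they would be computed by a cheminformatics toolkit), the bit values are made up, and the names tanimoto and exclude_similar are hypothetical.

```python
from typing import Dict, Set

Fingerprint = Set[int]  # assumed representation: indices of the "on" bits


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def exclude_similar(
    training: Dict[str, Fingerprint],
    test: Dict[str, Fingerprint],
    cutoff: float = 0.5,
) -> Dict[str, Fingerprint]:
    """Keep training compounds whose similarity to every test compound is below the cutoff."""
    return {
        name: fp
        for name, fp in training.items()
        if all(tanimoto(fp, test_fp) < cutoff for test_fp in test.values())
    }


# Toy fingerprints (made-up bit indices, for illustration only).
test_drugs = {"test_drug": {1, 2, 3, 4}}
training_set = {
    "close_analog": {1, 2, 3, 5},    # TS = 3/5 = 0.6, excluded
    "distant_analog": {1, 2, 5, 6},  # TS = 2/6, kept
    "unrelated": {7, 8, 9},          # TS = 0, kept
}
filtered = exclude_similar(training_set, test_drugs)
print(sorted(filtered))
```

A compound sitting exactly at the cutoff (TS = 0.5) is excluded here, matching the "0.5 or greater" wording of the text.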

Figure 1. 2D chemical structures of dasatinib, methotrexate, and panobinostat.

RESULTS

Methotrexate is a multi-targeted antimetabolite that inhibits the synthesis of pyrimidines and purines through its inhibition of dihydrofolate reductase (DHFR). Of the 14 high-confidence binders of methotrexate, two proteins (DHFR and TYMS) were identified by both TPP[3] and MMPS, and MMPS identified two additional known binders (FOLR2 and DHFR2) (Figure 2A), highlighting the value of MMPS in elucidating a drug’s polypharmacology. Notably, DHFR2 was found even though no crystal structure is available for it in the PDB, demonstrating the flexibility of the MatchMaker engine.
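The overlap counts for methotrexate amount to simple set operations, sketched below. Only the four reference binders named in the text are included; the other ten reference proteins are not named here and are deliberately left out rather than guessed, and the variable names are illustrative.

```python
# Reference binders named in the text (the full reference list has 14 entries;
# the remaining ten are not named in this note and are omitted).
reference_named = {"DHFR", "TYMS", "FOLR2", "DHFR2"}

# Reference binders recovered by each method, as reported above.
tpp_hits = {"DHFR", "TYMS"}
mmps_hits = {"DHFR", "TYMS", "FOLR2", "DHFR2"}

shared = tpp_hits & mmps_hits & reference_named
mmps_only = (mmps_hits - tpp_hits) & reference_named
tpp_only = (tpp_hits - mmps_hits) & reference_named

print("found by both:", sorted(shared))
print("MMPS only:", sorted(mmps_only))
print("TPP only:", sorted(tpp_only))
```

The same three intersections give the regions of the diagrams in Figure 2 for any drug once the hit lists and the reference list are available.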

Figure 2: (A) The number of proteins interacting with methotrexate identified by either thermal proteome profiling (TPP) or MMPS and their overlap with the reference (REF.) interactors. All three identified dihydrofolate reductase (DHFR) and thymidylate synthase (TYMS) as binders. (B) Comparison of results generated for Dasatinib. All three identified Tyrosine-protein kinase Yes (YES) as a binder.

Dasatinib is a well-studied kinase inhibitor targeting BCR-ABL and SRC. It is used in the treatment of chronic myeloid leukemia (CML) and acute lymphoblastic leukemia (ALL). Of the 58 high-confidence reference interactors of dasatinib, four were identified cumulatively across four individual TPP experimental conditions[1] (Figure 2B). MMPS identified 16, only one of which (YES) was also identified by TPP, highlighting the complementary nature of the two approaches and a substantial reduction in false negatives relative to TPP. Extending the MMPS list to the top 100 predicted proteins for dasatinib captures 28 of the 58 reference interactors. Notably, four of the reference proteins (SRMS, FRK, BLK, and FGR) were ranked within the top 60 by MMPS even though no structure is available for them in the PDB.

Panobinostat is a pan-histone deacetylase (HDAC) inhibitor used in the treatment of multiple myeloma. Out of the 11 reference protein interactors of panobinostat, four were identified cumulatively through two separate TPP conditions. MMPS also identified four, with two proteins identified by both approaches (HDAC2 and HDAC10) (Figure 3), again highlighting the complementarity of the two approaches. Remarkably, HDAC10 was identified through MMPS even though there is no structure available for it in the PDB.

Figure 3: (A) The number of proteins interacting with panobinostat identified by either TPP or MMPS which overlap with the reference set. Extending MMPS to the top 100 predictions results in the identification of 6 out of the 11 high confidence reference interactors. (B) The number of proteins interacting with panobinostat identified by either TPP or MMPS which overlap with the reference set. All three methods identified Histone deacetylase 2 (HDAC2) and Histone deacetylase 10 (HDAC10) as binders.

SUMMARY

Protein-drug interactions predicted independently by MMPS and by TPP were compared to high-confidence protein-drug interactions listed in the STITCH database for methotrexate, dasatinib, and panobinostat. While the false positive rate of TPP is low, we find that the method is susceptible to false negatives, consistent with previous observations[8]. MMPS identified several additional true positives compared to TPP, demonstrating both the complementary nature of the two methods and the viability of MMPS as a purely in silico approach. We cannot draw firm conclusions about the false positive rate of either method, because any positives not found in the reference list may well represent true, previously uncharacterized interactions. For both methotrexate and dasatinib, MMPS outperformed TPP in identifying high-confidence protein targets, supporting the possibility that an exclusively in silico technology can yield substantial improvements over experimental methods in target deconvolution, as well as in the assessment of the safety and efficacy of new molecules.

RESOURCES

1. Savitski MM, et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science. 346, 1255784 (2014)
2. Becher I, et al. Thermal profiling reveals phenylalanine hydroxylase as an off-target of panobinostat. Nat Chem Biol. 12(11), 908-910 (2016)
3. Huber KV, et al. Proteome-wide drug and metabolite interaction mapping by thermal-stability profiling. Nat Methods. 12, 1055-1057 (2015)
4. Szklarczyk D, et al. STITCH 5: augmenting protein-chemical interaction networks with tissue and affinity data. Nucleic Acids Res. 44, D380-4 (2016)
5. Mayr A, et al. Large-Scale Comparison of Machine Learning Methods for Drug Target Prediction on ChEMBL. Chem Sci. 9, 5441-5451 (2018)
6. Hie B, et al. Realizing private and practical pharmacological collaboration. Science. 362, 347-350 (2018)
7. Zhou H, et al. FINDSITEcomb2.0: A New Approach for Virtual Ligand Screening of Proteins and Virtual Target Screening of Biomolecules. J Chem Inf Model. acs.jcim.8b00309 (2018)
8. Handler N, et al. Drug Selectivity: An Evolving Concept in Medicinal Chemistry. John Wiley & Sons (2017)

MatchMaker: A Leap Forward in Proteome Screening Beyond Molecular Docking

MatchMaker combines molecular biophysics and deep learning (DL) to predict binding of new drug molecules to all proteins with high speed, accuracy, and generalizability, moving beyond the reliance on molecular docking.

OPPORTUNITY

Drug/target interaction databases and 3D structure databases offer complementary advantages for machine learning (ML)-based drug-target prediction algorithms.

TECHNOLOGY

MatchMaker™
Ligand Express Proteome Screening

SOLUTION

Cyclica has pioneered a novel DL strategy for drug-target interaction (DTI) predictions that generalizes to new molecules, distinguishing binders from non-binders with 98.3% accuracy.

INTRODUCTION

Fast, accurate, and generalizable predictions of drug-target interactions (DTI) have the potential to transform pre-clinical stages of drug discovery. In living systems, a drug’s efficacy, polypharmacology, toxicity, and side effects are mediated through interactions with tens to hundreds of proteins found in the human proteome. The drive to profile the holistic impact of a drug on cellular systems has inspired numerous in silico and experimental approaches, each with its own strengths and weaknesses. While computational approaches offer clear speed and cost advantages over experimental techniques, accuracy and generalizability have historically challenged in silico methodologies.

Advanced ML approaches such as deep learning are increasingly popular in drug discovery, as they can generate accurate models that leverage large quantities of high-dimensional data. However, not all models are truly generalizable, and many fail to live up to their reported accuracies in real-world settings. DTI predictions are particularly prone to overestimation as a consequence of compound series bias, i.e. the presence of multiple similar ligands in a DTI database interacting with the same protein[1].

To address the need for an accurate and generalizable DTI prediction model, we have developed MatchMaker, a deep learning algorithm that synthetically augments the millions of known DTIs found in public databases with biophysical information from 3D structure databases. Through stringent testing, we demonstrate accuracy and generalizability that outperform leading-edge computational technologies, and even experimental methods such as thermal proteome profiling (TPP).

METHODOLOGY

Ranking test set: We randomly selected 100 molecules (the “ranking test set”) from the DTI database STITCH 5.0 in which each molecule has at least one very high confidence target, i.e. with a confidence score of one. Most molecules had only one high confidence target, but there were seven with two, one with four and one with five targets, for a total of 114 known targets.

Model training: We trained a MatchMaker model with our currently best known DTI data set and hyperparameters. The DTI data was expanded with 19 negative drug/target pairs for each positive, generated by pairing the target with randomly chosen molecules with no evidence of interaction. To avoid compound series bias[1], we excluded all molecules from the DTI data that were similar to any of the ones selected for the ranking test set. Molecules were considered similar when their Tanimoto Similarity (TS) based on Morgan 3 fingerprints exceeded 0.5. The DTI set was then further divided randomly into a training and validation set, using a modified approach to cluster-cross-validation[2]. Briefly, the partition was done along the lines of Tanimoto clusters of 0.75 similarity, again to avoid series bias. The MatchMaker model was trained on the training set, and training was monitored using the validation set.
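The negative-sampling step above (19 negatives per positive) might look like the following sketch. It is an illustration under assumptions rather than the training pipeline itself: the set of positives stands in for all known evidence of interaction, molecule and target identifiers are made up, and the function name sample_negatives is hypothetical.

```python
import random
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (molecule, target)


def sample_negatives(
    positives: Set[Pair],
    molecules: List[str],
    per_positive: int = 19,
    seed: int = 0,
) -> List[Pair]:
    """For each positive pair, pair its target with random molecules lacking evidence of interaction."""
    rng = random.Random(seed)
    used: Set[Pair] = set(positives)  # pairs that may not be drawn as negatives
    negatives: List[Pair] = []
    for _, target in sorted(positives):
        candidates = [m for m in molecules if (m, target) not in used]
        if len(candidates) < per_positive:
            raise ValueError(f"not enough candidate molecules for target {target}")
        chosen = [(m, target) for m in rng.sample(candidates, per_positive)]
        used.update(chosen)  # avoid drawing the same negative twice
        negatives.extend(chosen)
    return negatives


# Toy data: 50 made-up molecules and two positive pairs.
molecules = [f"mol{i}" for i in range(50)]
positives = {("mol0", "T1"), ("mol1", "T2")}
negatives = sample_negatives(positives, molecules)
positive_fraction = len(positives) / (len(positives) + len(negatives))
print(len(negatives), positive_fraction)
```

With 19 negatives per positive, positives make up 1 in 20 pairs, which is where the roughly 5% positive rate of the validation set comes from.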

Individual binding vs. non-binding discrimination: The MatchMaker model is trained with positive and negative cases to discriminate between drug/target combinations that bind in reality (as represented in DTI databases) and those that don’t. We measured the performance of the model by predicting binding for all pairs in the validation set and comparing those predictions with the known labels. Since those labels were excluded from model training, the ability of the model to predict them represents its ability to generalize beyond its training data.

Rank of known targets: Since the purpose of MatchMaker is primarily in proteome screening, we also evaluated the ability of the model to rank all considered proteins such that true targets are found near the top of the list. For this purpose, we computed MatchMaker predictions for all 100 molecules of the ranking test set against all 8717 considered human proteins, noting the ranks of all 114 high confidence targets in their respective lists.
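Noting the rank of each known target in a scored protein list can be sketched as below. The scores and protein names are toy values, and target_ranks is a hypothetical helper; the sketch only assumes that a higher score means a more confident binding prediction.

```python
from typing import Dict, Iterable


def target_ranks(scores: Dict[str, float], known_targets: Iterable[str]) -> Dict[str, int]:
    """Return the 1-based rank of each known target when proteins are sorted by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    position = {protein: i + 1 for i, protein in enumerate(ordered)}
    return {target: position[target] for target in known_targets}


# Toy scores for five hypothetical proteins (the screen described above covers 8717).
scores = {"P1": 0.10, "P2": 0.95, "P3": 0.40, "P4": 0.80, "P5": 0.05}
ranks = target_ranks(scores, ["P2", "P3"])
print(ranks)
```

Repeating this for each molecule in the ranking test set and pooling the results gives the list of target ranks on which the ranking metrics are computed.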

RESULTS

Individual binding vs. non-binding discrimination: MatchMaker achieves an accuracy of 98.3% in discriminating interacting from non-interacting pairs in the validation set. Figure 1 shows the Receiver Operating Characteristic (ROC) and the Precision/Recall (PR) curves for the discriminator; the areas under the curve (AUC) are 0.986 and 0.87, respectively. The validation set consisted of 378,158 pairs, of which approximately 5% were labeled positive. We also trained a balanced model, which achieves an accuracy of 92.6% on validation data with a 1:1 ratio of positives to negatives. We focus on the 1:19 model here because it is superior in ranking known targets.
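For readers less familiar with these metrics, the sketch below computes accuracy and ROC AUC on a handful of made-up labels and scores. The 0.5 decision threshold is an assumption (the note does not state the threshold used), the pairwise AUC formulation is chosen for clarity rather than speed, and both function names are hypothetical.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of pairs whose thresholded score matches the label (threshold is an assumption)."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy validation data: 1 = binding pair, 0 = non-binding pair.
labels = [1, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3]
print(accuracy(labels, scores), roc_auc(labels, scores))
```

On a set with about 5% positives, accuracy is dominated by the negatives, which is one reason the ROC and PR areas are reported alongside it.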

Figure 1. ROC (A) and precision recall (B) curves for discrimination of binding vs. non-binding pairs in the validation set of 378,158 pairs. Approximately 5% of the pairs were labeled positive. The area under curve (AUC) is 0.986 for ROC and 0.87 for PR. The dashed lines represent the balanced model, with 1:1 positives to negatives in training and validation data.

Ranking of known targets: Figure 2 shows the cumulative fraction of true positives amongst all 114 actual positives in the ranking test set. One third of all positives are detected within the top 10 (0.12%), and one half within the top 64 (0.73%). Table 1 quantifies predictive power in proteome screening relative to Cyclica’s first-generation Ligand Express® Proteome Screening technology, which used a unique combination of surface matching, to identify potential binding sites, and molecular docking, to rank proteins. This first-generation technology was designed with a focus on generalizability and has historically performed well on blind tests, but its partial reliance on molecular docking led to high computational demand and limited prediction accuracy. MatchMaker-powered Proteome Screening significantly outperforms its predecessor in ranking known targets, particularly in the top 0.1% (top 9 proteins).
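The enrichment factor reported in Table 1 is the number of known targets found in the top fraction x of the ranked list, divided by the number expected under a random ranking (114·x for 114 known targets). A minimal sketch follows; the ranks are toy values, the function name is hypothetical, and rounding the cutoff up to a whole rank is an assumption that happens to reproduce "top 0.1% = top 9 proteins" for 8717 proteins.

```python
import math
from typing import Sequence


def enrichment_factor(ranks: Sequence[int], n_proteins: int, fraction: float) -> float:
    """EF_x: observed true positives in the top fraction x, divided by the count expected at random."""
    cutoff = math.ceil(fraction * n_proteins)  # number of top ranks considered (rounding is an assumption)
    observed = sum(1 for r in ranks if r <= cutoff)
    expected = len(ranks) * fraction           # e.g. 114 * x for 114 known targets
    return observed / expected


# Toy ranks for six hypothetical known targets in a list of 8717 proteins.
toy_ranks = [1, 5, 9, 50, 400, 3000]
ef = enrichment_factor(toy_ranks, n_proteins=8717, fraction=0.001)
print(round(ef, 1))
```

An enrichment factor of 1 corresponds to random ranking; larger values mean known targets are concentrated near the top of the list.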

Figure 2: Fraction of true targets ranking higher than a given rank in the list of all covered proteins (8717 for the present model). The area under the full curve (area under accumulation curve (AUAC)), is a measure sometimes used for ranking accuracy and computes to 0.911. The dashed line represents the balanced model.

Comparative Benchmarking: There are no standardized benchmarks or community challenges available yet to evaluate the accuracy and generalizability of proteome-scale DTI predictions. We instead compared MatchMaker directly to Secure DTI, a leading peer-reviewed technology for DTI predictions published by Hie et al. in Science in October 2018[3]. That technology trains deep learning models on the STITCH database and uses global protein features derived from domain composition. In evaluating their models, however, Secure DTI only excludes training DTIs that are exact ligand matches to testing DTIs (TS=1.0). In other words, isomers and structural analogs of training molecules may be present in their testing set.

Table 1: Comparison of ranking performance between traditional Ligand Express based on molecular docking and the MatchMaker deep learning method. Shown is the enrichment factor EFx for five different x, which is defined as the number of true positives in the top x ranks (as plotted in Figure 2) divided by the number expected for a random ranking (114·x, here).

In contrast, we have applied very stringent filters to exclude all training DTIs with ligands even remotely similar to those found in testing DTIs (TS ≥ 0.5). This ensures that reported accuracies are representative of real-world applications, where new lead scaffolds are characterized in preclinical stages of pharmaceutical R&D. Despite these more realistic and challenging testing criteria, MatchMaker outperforms Secure DTI when retrained with an analogous 1:1 positive-to-negative training data ratio (Table 2).

Table 2: Comparison of individual binding prediction performance between Secure DTI[3] and MatchMaker. Shown for both methods are areas under ROC and PR curves, and in addition for MatchMaker, target ranking AUAC values.

SUMMARY

MatchMaker exhibits unprecedented accuracy in predicting ligand binding for individual drug/target pairs (98.3%), and substantially improves proteome screening accuracy relative to Cyclica’s first-generation technology. MatchMaker can screen millions of molecules against the entire human proteome and has the capacity to further augment accuracy by merging public and private sources of DTI data and 3D structures of protein-ligand complexes. By synthetically augmenting DTI data with biophysical information, MatchMaker provides fast, accurate, and generalizable DTI predictions for proteome-scale applications.

RESOURCES