Aboard the Ligand Express: The Time is Now
Cyclica has come out of its best year yet, with a clear vision for the future and a focused strategy, as communicated in our 2017 address. Our technology is founded upon a rich history of computational chemistry and embraces recent advancements to revolutionize drug discovery. The following is a guide to the pillars that Cyclica has built upon, how our technology surpasses past and contemporary offerings, and what Cyclica’s bright future holds. - Naheed and Andreas
Computational chemistry has a history that dates back almost as far as electronic computing itself. In the 1970s, Martin Karplus, Michael Levitt, Arieh Warshel and others laid the foundation for the computational modeling of macromolecules as a means of understanding and predicting chemical and biological processes. For this, Karplus, Levitt, and Warshel shared the 2013 Nobel Prize in Chemistry (Figure 1).
Despite this academic success, the use of computational modeling in drug discovery has a mixed history. In principle, you can design drugs by evaluating the fit between a potential drug and its target computationally, or in silico, in a process called molecular docking (Figure 2). However, early enthusiasm for computer-aided drug design (CADD) in the 1980s was dampened by the high computational cost inherent in the method, as well as by the scarcity of proteins with known molecular structure. Consequently, CADD was eventually overshadowed by the advent of combinatorial chemistry and in vitro high-throughput screening (HTS) in the 1990s. Twenty years later, drug development pipelines are running low, and HTS, like CADD before it, has not been the panacea that it was hoped to become.
Meanwhile, the power of computers has increased more than 100,000,000-fold, or 8 orders of magnitude. The (now hopelessly outdated) iPhone 4 has more than double the computing power of the Cray-2, the world’s fastest supercomputer back in 1985. The Cray-2 sold for $17 million in its heyday; the iPhone 4 sold for $600. During the same period, the rate at which protein structures are deposited into the public Protein Data Bank (PDB) repository has gone from a dozen per year to over 10,000 per year, with more than 100,000 structures now available. These two developments have created a completely new landscape for CADD, ushering it into the realm of Big Data.
Conventionally, CADD starts out with a well-established target protein and binding site, and then tries to fit millions of different chemicals into that site, selecting those that fit best for further development. This is called virtual screening, and it is the in silico equivalent of the HTS that dominated drug discovery for two decades starting in the 1990s. Virtual screening and HTS are good at identifying molecules with the desired therapeutic effect, but they provide minimal insight into other “off-target” effects these molecules might have. There are over 20,000 different proteins in the human body, and it is now known that any given small molecule will interact with many of them to various degrees. By some estimates, a given drug may have hundreds of off-target interactions, a phenomenon called polypharmacology.
Polypharmacology causes toxicity and other adverse effects, most of which are only discovered much further down the development pipeline, after heavy investment has already been made in the drug (Figure 3). Such adverse effects can be discovered in animal studies, clinical trials, or, worst of all, when the drug is already on the market and widely used.
Over the past few years, driven by the increase in computational power and a resurgence of machine learning algorithms, numerous companies have emerged to tackle this problem by applying machine learning to large databases of known experimental drug binding data. However, these methods rely on existing knowledge of drug binding, and are by nature unable to predict binding sites that have never been seen before. As such, they traditionally provide insights only into “blockbuster” diseases, which account for ~1.5% of the known proteins. They struggle to expand outside of that space, i.e. to the remaining ~98.5% of known proteins, many of which are related to orphan/rare diseases, because of a lack of drug binding data for those proteins. To predict previously unknown binding sites, computational modeling at the molecular level is needed; machine learning alone will not suffice.
Enter Cyclica. Since 2013, we have been fascinated by the possibilities of bringing together computational chemistry and Big Data. We believe that simply relying on cognitive computing and machine learning is insufficient to address the needs of the scientific community, and that traditional virtual screening technologies provide only one piece of the drug discovery puzzle. We decided that the best way we could make a difference was to focus on “proteome-wide screening”. With proteome-wide screening, we have turned CADD on its head and approached it with a unique and innovative strategy: while virtual screening evaluates the binding of hundreds of thousands of molecules to a single target, in proteome-wide screening we are able to screen hundreds of thousands of proteins for binding of an individual drug molecule (Figure 4).
Proteome-wide screening is known to be a difficult problem, mostly because of its forbidding computational complexity, even with today’s computers. We have tackled this with our unique, proprietary Ligand Express™ Proteome Docking surface matching algorithm, which efficiently scans the surfaces of all known protein structures for potential binding sites for a small molecule, and then confirms the matches by conventional docking similar to that used in virtual screening. See our video for an illustration of this process. Another reason proteome-wide screening was considered problematic in the past is that the structures of most proteins were not known, reducing the utility of the approach. This has now changed. According to some estimates, 80% of the proteome now has structural data associated with it, and more data is being generated at an ever-accelerating pace.
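The two-stage idea described above (a fast scan over many protein surfaces, followed by conventional docking on the matches only) can be illustrated as a generic filter-then-refine loop over one molecule and many proteins. The following is a minimal sketch under stated assumptions, not the Ligand Express™ algorithm itself, which is proprietary: the function names, the representation of binding sites as small numeric descriptors, the toy scoring formulas, and the threshold are all hypothetical placeholders.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a shape/chemistry descriptor of a ligand or a surface patch.
Features = Tuple[float, ...]
Scorer = Callable[[Features, Features], float]


def surface_match_score(ligand: Features, site: Features) -> float:
    """Toy cheap prefilter: negative squared distance between descriptors."""
    return -sum((a - b) ** 2 for a, b in zip(ligand, site))


def docking_score(ligand: Features, site: Features) -> float:
    """Toy placeholder for an expensive docking calculation (higher is better)."""
    return -sum(abs(a - b) for a, b in zip(ligand, site))


def proteome_screen(
    ligand: Features,
    proteome: Dict[str, List[Features]],
    prefilter_threshold: float,
    cheap: Scorer = surface_match_score,
    expensive: Scorer = docking_score,
) -> List[Tuple[str, int, float]]:
    """Screen ONE ligand against MANY proteins.

    Stage 1 discards candidate sites whose cheap score is below the threshold;
    stage 2 runs the expensive scorer only on the survivors.
    Returns (protein_id, site_index, score) tuples, best score first.
    """
    hits: List[Tuple[str, int, float]] = []
    for protein_id, sites in proteome.items():
        for site_index, site in enumerate(sites):
            if cheap(ligand, site) < prefilter_threshold:
                continue  # rejected by the fast surface scan
            hits.append((protein_id, site_index, expensive(ligand, site)))
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits


if __name__ == "__main__":
    toy_ligand: Features = (1.0, 0.0)
    toy_proteome = {
        "P1": [(1.0, 0.0), (5.0, 5.0)],  # two candidate sites on protein P1
        "P2": [(1.5, 0.0)],
    }
    for hit in proteome_screen(toy_ligand, toy_proteome, prefilter_threshold=-1.0):
        print(hit)
```

The point of the design is that the expensive step runs only on the small fraction of sites that survive the cheap scan, which is what makes screening one molecule against a very large number of proteins tractable in this kind of scheme.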
With these two hurdles out of the way, we undertook a three-year effort to develop Ligand Express™, the only drug-centric, cloud-based, computational platform that can screen a small molecule compound against all known structurally characterized proteins. This first- and best-in-class technology allows us to provide our clients with a panoramic view of their small molecule, enabling a comprehensive understanding of a drug’s effects, both on- and off-target, across the entire structurally characterized proteome. By taking this unique drug-centric approach, Cyclica expands the prediction potential of protein-drug interactions by a factor of 10 to 100 over conventional machine learning approaches, and provides access to novel associations, networks and pathways that could not be identified by evaluating a single protein. Cyclica’s platform can be applied to “blockbuster” diseases as well as to rare/orphan disorders that are difficult to address with other technologies. Because Proteome Docking addresses side effects as well as therapeutic effects, Cyclica’s market also encompasses the personal care/cosmetic and nutraceutical markets, in addition to pharmaceuticals. Ligand Express™ provides information in a matter of days or weeks that would otherwise take months or years to obtain, if obtainable at all. Armed with this information, our clients can make better, faster, and cheaper decisions about which drugs to allocate their time and money to.
What is in store for the future? Having developed Ligand Express™ and demonstrated its disruptive applications in drug development, we now want to make it easy for scientists to use directly. Our cloud-based SaaS platform, slated to be released in the summer of 2017, will allow users to submit molecules for processing, and to interactively and visually analyze the results right at their own desks. Over time, we will add further tools to the platform to reduce the need to manage a large number of different software packages, moving towards an integrated environment for computation in drug development. We look forward to working with the CADD community to begin a renaissance of CADD that will finally deliver on its early promises.
Naheed Kurji, President and CEO
Andreas Windemuth, Senior Vice President and Chief Scientist
With thanks to our awesome team!