The Biotechnologist’s Toolkit for New Drug Discovery – Rise of the Machines

Earl Prinsloo

“Time is money,” they say, and never is this truer than in the discovery of new drugs against human diseases like cancer. It is estimated that it takes about 15 years for a new drug to be discovered, tested and approved, and a 2012 news report in Forbes claimed that achieving an approved drug could cost, on average, up to US$11 billion. This time needs to be reduced significantly, for reasons of both human health and economics.

Naturally, strictly controlled clinical trials need to be performed to ensure safety before a drug is marketed to the general public. An area where time can be reduced is the laboratory discovery phase. Today we biotechnologists find ourselves in the “high-throughput” era. High-throughput here simply refers to the development of new technologies that have reduced the amount of time required to answer specific questions; in this case the questions are of a biological nature, e.g. does chemical compound A interact with protein B? Now imagine that, to discover chemical compound A’s interaction with protein B, you need to test a library of hundreds of thousands of different chemical compounds; time becomes a huge factor. Apart from running the test, analysis time becomes a consideration too. To achieve one’s goal, manpower needs to be superseded by computing power and automation. Computing speed becomes an important determinant in drug discovery.

Computing Power in Biotechnology

Bioinformatics is defined as the application of computer science and information technology to the modelling and analysis of biological data. With the sequencing of the human genome came a veritable explosion of data. The sequence of G, A, T, and C bases created a new problem: what did all the data mean? In practice, it meant that computational databases of information could be built and used in further experiments to test how the genes encoded in the genome change in conditions of disease, i.e. the presence of genetic mutations could be detected. Information technology and statistics facilitate the comparative analysis of genomes, a process that would be near impossible without advances in computing capacity.

Further to this, computers create a virtual visualisation space that allows for simulations of biological systems. One of the most exciting of these advances is in the field of structural bioinformatics, where modern computers are used to simulate the three-dimensional structure of proteins. Proteins are three-dimensional biological entities composed of chains of amino acids that fold into specific shapes. These shapes confer a specific function or activity on the protein (for example, in the case of an enzyme). Altering this structure, even at a single point, can lead to either loss of function or increased activity; both can have detrimental effects on a mammalian cell, causing it either to die or to mutate. Previously, investigating how these changes affected proteins had to be done in a test tube (in vitro), which was often a long process (and still is). With 3-D structural bioinformatics we can now investigate protein folding and how proteins interact with one another, or with possible drugs, in 3-D space. The algorithms controlling the simulation are often rooted in physics and mathematics, as the position of each atom within each amino acid is calculated and sent to a 3-D model.

Beyond this, we can calculate how each atom in an amino acid, and each amino acid in turn, interacts with its neighbours in the same protein (e.g. through charge interactions). This becomes a computationally heavy exercise when you consider that the average protein consists of about 400 amino acids (and there are far larger proteins, e.g. titin, which consists of about 33 000 amino acids). Although this appears extremely complex, computer simulations are advancing drug discovery in leaps and bounds. As a reminder, all of this is done to reduce the discovery time. The process of virtual screening now allows us to perform hundreds of thousands of calculations within a so-called in silico (“in silicon”, i.e. computer-based) environment. We can “see” how a drug would interact with a specific protein surface (called an active site) and eliminate all the non-binders. Therefore, instead of testing a large library of 300 000 chemical compounds in a laboratory for one month to obtain the 1000 potential “hits”, we can reduce the time by removing all the non-binders by computer simulation and focusing only on the possible “hits” in the laboratory. One month of laboratory time becomes one week.
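The idea of scoring every compound in silico and discarding the non-binders can be sketched in a few lines of Python. This is a toy illustration only, not how any real docking program works: the "score" is a bare Coulomb-like sum of charge interactions between atom pairs, the atoms are random points with made-up charges, and the function names (coulomb_score, virtual_screen, random_atoms) are hypothetical. The library is scaled down to 3000 compounds with 10 kept, the same 1-in-300 ratio as the 300 000 compounds and 1000 hits described above.

```python
import math
import random


def coulomb_score(ligand, site):
    """Sum a Coulomb-like term q1*q2/r over every ligand-atom/site-atom pair.

    Each atom is a tuple (x, y, z, charge). A more negative total means a more
    attractive interaction. This is a toy stand-in for a real docking score.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for sx, sy, sz, sq in site:
            r = math.dist((lx, ly, lz), (sx, sy, sz))
            total += lq * sq / max(r, 0.5)  # clamp r so close atoms do not blow up
    return total


def virtual_screen(library, site, keep):
    """Rank compounds from most to least attractive and keep the top `keep`."""
    ranked = sorted(library, key=lambda name: coulomb_score(library[name], site))
    return ranked[:keep]


def random_atoms(rng, count, spread):
    """Generate made-up atoms: random positions, charges of -1, 0 or +1."""
    return [
        (
            rng.uniform(-spread, spread),
            rng.uniform(-spread, spread),
            rng.uniform(-spread, spread),
            rng.choice([-1.0, 0.0, 1.0]),
        )
        for _ in range(count)
    ]


rng = random.Random(0)  # fixed seed so the made-up data is reproducible
site = random_atoms(rng, 20, 5.0)  # hypothetical active site of 20 atoms
library = {f"compound_{i}": random_atoms(rng, 10, 5.0) for i in range(3000)}

hits = virtual_screen(library, site, keep=10)
print(f"{len(library)} compounds screened, {len(hits)} kept for the laboratory")
```

Even in this toy, every compound costs 10 x 20 = 200 pairwise calculations, which hints at why the real problem, with hundreds of amino acids and hundreds of thousands of compounds, leans so heavily on computing speed.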

Understanding the system

We need to understand our system to potentially cure a disease. Once we have a potential set of drugs against a specific target, we need to test it on an actual biological system to determine the effect. Questions we ask ourselves are:

1. Does the drug interact with the pure protein?
2. Does the drug interact with the protein in a cell?
3. If it does interact in a cell, how does it change the cell?
 

Fortunately, advances in computing, instrumentation and automation come to the rescue. We can physically test how a protein and its corresponding drug interact using biophysical methods like surface plasmon resonance spectroscopy. This technique measures how one molecule interacts with another on a surface: changes in mass at the surface alter the way light is reflected from it, giving a measurable response. The technique has advanced rapidly in the past 15 years, with current-generation instruments being coupled with automated platforms that facilitate precise liquid handling and decrease the potential for human error. An excellent example of this is the Bio-Rad ProteOn XPR36 Interaction Array System, which uses a unique array system to test interactions simultaneously alongside controls. Experimental time spent on investigating interactions can now be reduced from three weeks to approximately eight hours. This high-throughput screening technique also provides information on how strong an interaction is. As powerful as this platform is, we still need to consider whether the interaction occurs in the cellular environment.
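One common way to put a number on “how strong an interaction is” is the equilibrium dissociation constant, K_D, the ratio of the rate at which the complex falls apart to the rate at which it forms. The sketch below assumes the simplest possible case, a 1:1 binding model, and uses made-up rate constants and concentrations; it is not the output or the analysis method of any particular instrument, and the function names are hypothetical.

```python
def dissociation_constant(k_on, k_off):
    """K_D in mol/L from the association rate k_on (1/(M*s)) and dissociation rate k_off (1/s)."""
    return k_off / k_on


def equilibrium_response(conc, r_max, k_d):
    """Steady-state signal for 1:1 binding: R_eq = R_max * C / (K_D + C)."""
    return r_max * conc / (k_d + conc)


# Made-up values for illustration: these give K_D = 1e-8 M (10 nM).
k_d = dissociation_constant(k_on=1.0e5, k_off=1.0e-3)

for conc in (1e-9, 1e-8, 1e-7):  # drug concentrations in mol/L
    signal = equilibrium_response(conc, r_max=100.0, k_d=k_d)
    print(f"{conc:.0e} M -> {signal:.1f} response units")
```

Under this model a smaller K_D means a tighter binder, and the signal reaches half of its maximum when the drug concentration equals K_D.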

The cell is a complex three-dimensional environment protected from the outside by a fatty lipid membrane. We need to investigate whether the drug can enter a cell and, once it is inside, whether it will affect the cell: for example, whether it will kill a cancerous cell and, if so, how. Biotechnologists use a multitude of techniques and instruments to rapidly test the effects of candidate drugs on cells. One of the recent advances in monitoring the effect of a drug on cells is the ACEA Biosciences/Roche xCELLigence Real-Time Cell Analyser, which uses electrical impedance to monitor whether cells are growing, moving or, in the case of drug screening, dying, all in real time, allowing for precise observation of the test environment. If one considers that, in the past, an end-point assay might take a week to show whether a drug was killing or inhibiting a cell, we can start to appreciate how an hourly view of a drug’s effect allows us to build a better picture.
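The value of an hourly view over a single end-point reading can be illustrated with a small Python sketch. The readings below are invented numbers standing in for an impedance-derived measure of how many cells are attached and growing; the 50% threshold and the function name first_response_hour are assumptions made for this example, not part of any instrument’s software.

```python
def first_response_hour(control, treated, threshold=0.5):
    """Return the first hour at which treated/control drops below `threshold`.

    `control` and `treated` are hourly readings taken at the same time points.
    Returns None if the treated cells never fall below the threshold.
    """
    for hour, (c, t) in enumerate(zip(control, treated)):
        if c > 0 and t / c < threshold:
            return hour
    return None


# Invented hourly readings: untreated cells keep growing, treated cells stall and die.
control = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7, 4.5]
treated = [1.0, 1.2, 1.4, 1.5, 1.3, 1.0, 0.7, 0.4]

hour = first_response_hour(control, treated)
print(f"Treated cells fall below half of control at hour {hour}")
```

An end-point assay would see only the final pair of readings; the hourly series also shows when the drug started to act and how quickly.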

The bigger (or smaller) picture

Speaking of pictures, “a picture is worth a thousand words”, and biotechnologists engaged in drug discovery rely on microscopes to see how proteins move and interact in a cell. Consider a specialised molecule known as a signalling protein, which moves from the cell’s outer membrane to the nucleus and interacts with DNA to tell the cell that it needs more of another specific protein. In cancer this highly controlled process may be changed by an amino acid mutation, which could lead to a continuous signal being sent. The cell then produces more of the target protein, and this could alter the cell’s growth, i.e. allow it to grow into a cancerous tumour. If our drugs are designed to inhibit this signalling protein, we may want to see whether they block its movement into the nucleus. If the library of possible inhibitors is composed of 20 000 chemical compounds, it becomes extremely laborious to view all of them.

The cutting-edge microscopic technique of High Content Analysis combines computing power, a robotic platform and a powerful microscope to analyse thousands of samples in record time. We can programme the instrument to look for the movement of a specific marked (often fluorescent) protein in a cell. The platform will photograph and analyse thousands of images, representing thousands of candidate drugs, a task beyond the talents of any mere mortal, and output the result, which it will compare statistically to a control (e.g. no drug). If one considers that it may take an experienced biotechnologist 20 minutes on average to observe and photograph a single microscope slide, whereas with High Content Analysis the instrument can observe and analyse hundreds of images in the same time, we start to see where the balance of power lies.
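The comparison to a control can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used by any High Content Analysis platform: each well is reduced to a single invented number (the fraction of the fluorescent protein found in the nucleus), each compound is scored by how many control standard deviations it lies from the no-drug wells, and the cutoff of -3 and the function names are hypothetical choices.

```python
from statistics import mean, stdev


def z_scores(control_values, compound_values):
    """Express each compound's reading in control standard deviations from the control mean."""
    mu = mean(control_values)
    sigma = stdev(control_values)
    return {name: (value - mu) / sigma for name, value in compound_values.items()}


def flag_inhibitors(control_values, compound_values, cutoff=-3.0):
    """Return compounds whose nuclear fraction is far below the no-drug control."""
    scores = z_scores(control_values, compound_values)
    return sorted(name for name, z in scores.items() if z <= cutoff)


# Invented data: fraction of the marked protein found in the nucleus per well.
control = [0.70, 0.72, 0.68, 0.71, 0.69]  # no-drug wells
compounds = {
    "compound_A": 0.69,  # behaves like the control
    "compound_B": 0.40,  # protein largely kept out of the nucleus
    "compound_C": 0.71,  # behaves like the control
}

print(flag_inhibitors(control, compounds))
```

For scale, if each of the 20 000 compounds mentioned earlier needed its own 20-minute manual slide (an assumption made here for the arithmetic), that would come to 400 000 minutes, or roughly 6 700 hours at the microscope.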

The future?

Taken together, the veritable treasure chest that is the biotechnologist’s toolkit is invaluable in terms of both time and money. The drug discovery pipeline is a long and arduous road, and the reduction in time afforded by advances in computational power and instrumentation can only be to the advantage of the biotechnologist. Although the equipment and examples discussed in this article may appear to reduce the need for human scientists, nothing could be further from the truth: people are still in control. Trained biotechnologists and scientists are required to use the equipment to pose and answer questions and, in some cases, to improve the instrumentation or build better, more efficient test platforms. The rise of the machines in drug discovery biotechnology serves a single combined purpose: improving the validity, accuracy and speed of discovery.
 

* Dr Earl Prinsloo lectures Biotechnology at Rhodes University