The present invention relates to methods, apparatus and computer program products for investigating and characterising treatments or stimuli applied to cells. In particular, the present invention allows a fuller characterisation of a treatment or stimulus by evaluating side effects as well as the effect or effects on which the investigation is focussed.
A variety of methods exist for carrying out assays to investigate the effects of a compound or treatment, for example as part of a drug discovery program or as part of a medical investigation. Such investigations tend to be designed so as to focus on a primary effect of the treatment, asking, for example, what effect the treatment has on a specific condition or mechanism of action, or whether the treatment is efficacious for a specific condition or mechanism of action.
In such investigations, there can be multiple effects caused by the treatment. However, such investigations tend to focus only on the effect that the investigation is intended to elucidate (herein the “on-target effect”). Hence, in some circumstances, while an investigation may indicate that a treatment has no efficacy for a first condition, or is in fact harmful, it is possible that the treatment could have effects other than the on-target effect, that is side effects (herein “off-target effects”) which could be harmful or beneficial. An example of a drug which can have some negative side effects not detected during the drug development or approval stages would be thalidomide, which had harmful effects not related to its on-target effect. Hence some method by which a treatment can be more fully investigated or characterised would be beneficial.
Further, the interaction between a treatment and an organism, for example the human body, can be very subtle and complex. A large variety of factors can be involved in the mechanism and expression of a disease. Hence, a method which can be used to investigate and characterise treatments at a practicable level and which is appropriate for understanding and elucidating the biological processes involved would be beneficial.
Furthermore, owing to the large number of factors that may be involved and the complexity and subtlety of their interaction, a robust method which can be used to systematically acquire a practicable amount of potentially relevant data for analysis and which can provide a more quantitative indication of the various effects of a treatment, rather than a merely qualitative indication of an effect, would be beneficial.
The foregoing discussion of the background to the present invention is not acknowledged to be part of the prior art nor within the common general knowledge of a person of ordinary skill in the art. In particular, the appreciation of the drawbacks of present methods of investigating and characterising treatments is not acknowledged to be part of the prior art and has been presented above merely so as to more clearly present the nature of the present invention.
The present invention provides in one aspect, methods, apparatus and software for drug discovery, investigating, characterising or classifying treatments applied to cells and for investigating, characterising or classifying the effects and side effects of treatments on cells.
In one aspect of the invention, a method is provided for investigating a treatment applied to cells. The treatment has an on-target effect on the plurality of cells. An on-target cellular feature or group of on-target cellular features is identified. The on-target cellular feature or features can be affected by the treatment. The on-target cellular feature or features can be related to the on-target effect. An off-target cellular feature or group of off-target cellular features is identified. The off-target cellular feature or group of off-target cellular features can be different to the on-target cellular feature or features. The off-target cellular feature or group of off-target cellular features can also be affected by the treatment and can be related to a side effect of the treatment. A measure of the side effect can be determined based on the off-target cellular feature or features.
In another aspect of the invention, a method is provided for characterising a treatment applied to a population of cells. The treatment can have an on-target effect on the population of cells. A first group of cellular features, which have been affected by the treatment, is identified from a plurality of cellular features of the population of cells. The first group of cellular features can be related to the on-target effect of the treatment. A second group of cellular features can be identified from the plurality of cellular features which have been affected by the treatment and which are not related to the on-target effect of the treatment. A first signature characteristic of the on-target effect from the first group of cellular features can be created. A second signature not characteristic of the on-target effect can be created from the second group of cellular features. A first measure derived from the first signature and a second measure derived from the second signature can be evaluated to characterise the treatment.
In another aspect of the invention, a method is provided for characterising a treatment applied to a population of cells. A plurality of cellular features can be derived from a captured image of cells that have been exposed to the treatment. An on-target effect signature can be created, which is characteristic of an on-target effect of the treatment, from a first one of the plurality of cellular features. The plurality of features can relate to cellular properties involved in the on-target effect. A side effect signature is created, which is characteristic of a side effect to the on-target effect, from a second one of the plurality of cellular features. The second one of the plurality of cellular features can relate to cellular properties not involved in the on-target effect. An on-target effect metric derived from the on-target effect signature and/or a side effect metric derived from the side effect signature can be evaluated to characterise the treatment.
Other aspects of the invention include computer program products, computer program code, data structures and computing devices which can provide the various method aspects of the invention.
These and other features and advantages of the present invention will be described below in more detail with reference to the associated drawings.
Generally, this invention relates to processes and apparatus for use in investigating and characterising the effects of a treatment or stimulus on cells. The methods and apparatus presented in the following can also be used in order to investigate, characterise, or otherwise quantify, an intended effect under investigation and one or more side effects on cellular behaviour caused by or resulting from the treatment, as will be apparent from the following discussion. The invention also relates to computer programs, machine-readable media on which are provided instructions, data structures, etc. for performing the processes of the invention. Features of cell components, which have been derived from captured images of cells, are analyzed in order to provide some measures, or metrics, indicative of the extent to which the treatment caused various biologically relevant effects. These metrics can then be used to help characterise, classify or otherwise categorise a treatment that has been applied to the cells.
The general method includes the analysis of cellular features derived from images captured by an image capture system. Typically an image will be captured of a cell or plurality of cells, depending on the magnification at which the image is captured and certain markers can be used to highlight in the captured image the component of the cell of interest. The term “marker” or “labeling agent” refers to materials that specifically bind to and label cell components. These markers or labeling agents should be detectable in an image of the relevant cells. Typically, a labeling agent emits a signal whose intensity is related to the concentration of the cell component to which the agent binds. Preferably, the signal intensity is directly proportional to the concentration of the underlying cell component. The location of the signal source (i.e., the position of the marker) should be detectable in an image of the relevant cells.
Preferably, the chosen marker binds indiscriminately with its corresponding cellular component, regardless of location within the cell, although in other embodiments the chosen marker may bind to specific subsets of the component of interest (e.g., it binds only to sequences of DNA or regions of a chromosome). The marker should provide a strong contrast to other features in a given image. To this end, the marker should be luminescent, radioactive, fluorescent, etc. Various stains and compounds may serve this purpose. Examples of such compounds include fluorescently labeled antibodies to the cellular component of interest, fluorescent intercalators, and fluorescent lectins. The antibodies may be fluorescently labeled either directly or indirectly.
As part of the general method, the effect of a stimulus or treatment on cells can be investigated using the algorithms and processes described herein. The term “treatment” or “stimulus” refers to something that may influence the biological condition of a cell. Often the term will be synonymous with “agent” or “manipulation.” Stimuli may be materials, radiation (including all manner of electromagnetic and particle radiation), forces (including mechanical (e.g., gravitational), electrical, magnetic, and nuclear), fields, thermal energy, and the like. General examples of materials that may be used as stimuli include organic and inorganic chemical compounds, biological materials such as nucleic acids, carbohydrates, proteins and peptides, lipids, various infectious agents, mixtures of the foregoing, and the like. Other general examples of stimuli include non-ambient temperature, non-ambient pressure, acoustic energy, electromagnetic radiation of all frequencies, the lack of a particular material (e.g., the lack of oxygen as in ischemia), temporal factors, etc.
Specific examples of biological stimuli include exposure to hormones, growth factors, antibodies, or extracellular matrix components, or exposure to biologics such as infective materials, for example viruses that may be naturally occurring viruses or viruses engineered to express exogenous genes at various levels. Biological stimuli could also include delivery of antisense polynucleotides by means such as gene transfection. Stimuli also could include exposure of cells to conditions that promote cell fusion. Specific physical stimuli could include exposing cells to shear stress under different rates of fluid flow, exposure of cells to different temperatures, exposure of cells to vacuum or positive pressure, or exposure of cells to sonication. Another stimulus includes applying centrifugal force. Still other specific stimuli include changes in gravitational force, including sub-gravitation, and application of a constant or pulsed electrical current. Still other stimuli include photobleaching, which in some embodiments may include prior addition of a substance that would specifically mark areas to be photobleached by subsequent light exposure. In addition, these types of stimuli may be varied as to time of exposure, or cells could be subjected to multiple stimuli in various combinations and orders of addition. Of course, the type of manipulation used depends upon the application.
As part of the processing of captured images, certain features of the cells can be extracted using suitable image processing techniques. The algorithms and processes of the present invention can take this feature data as input in order to carry out their analysis. As used herein, the term “feature” or “cellular feature” refers to a property of a cell or population of cells derived from cell images and includes the basic “parameters” extracted from a cell image. The basic parameters are typically morphological, concentration, and/or statistical values obtained by analyzing a cell image showing the positions and concentrations of one or more markers bound within the cells. Examples of the various features used by the algorithms and processes are given later on herein. It will be appreciated in the following that the algorithms and processes of some aspects of the present invention can work directly from the feature data, and may not need to themselves process the images from which the feature data has been obtained. In other embodiments, the algorithms may process images in order to obtain information.
Generally, a wide number of cell components can be detected and analyzed. Cell components can include proteins, protein modifications, genetically manipulated proteins, exogenous proteins, enzymatic activities, nucleic acids, lipids, carbohydrates, organic and inorganic ion concentrations, sub-cellular structures, organelles, plasma membrane, adhesion complex, ion channels, ion pumps, integral membrane proteins, cell surface receptors, G-protein coupled receptors, tyrosine kinase receptors, nuclear membrane receptors, ECM binding complexes, endocytotic machinery, exocytotic machinery, lysosomes, peroxisomes, vacuoles, mitochondria, Golgi apparatus, cytoskeletal filament network, endoplasmic reticulum, nuclei, nuclear DNA, nuclear membrane, proteasome apparatus, chromatin, nucleolus, cytoplasm, cytoplasmic signaling apparatus, microbe specializations and plant specializations.
With reference to
At step 102 a population, or populations, of cells is exposed to the treatment or stimulus according to any suitable experimental protocol. The cells may be treated using a chemical agent, which can be any type of chemical or chemical compound and may in particular be a potential drug or pharmaceutical, or any other type of therapeutic agent. Typically, a chemical agent may be delivered in a solution and/or with other compounds or treatments, and at varying dose levels. The cells may also be exposed to a biological treatment, such as a virus or protein, or may have their DNA modified, or may be subjected to any other means by which biological effects may be induced in the cells. An example of an experimental protocol will be described later in greater detail.
An experiment into the effect of a treatment can typically be carried out by combining sets of assay plates to achieve some scientific purpose. An assay plate is typically a collection of wells arranged in an array with each well holding at least one cell or a related group or population of cells which have been exposed to a treatment or which provides a control group, population or sample. In other embodiments, multiwell plates are not used and single sample holders can be used. As explained above, a treatment can take many forms and in one embodiment can be a particular drug or any other external stimulus (or a combination of stimuli and/or drugs) to which cells are exposed on an assay plate or have previously been exposed. Experimental protocols for investigating the effect of a treatment will be apparent to a person of skill in the art and can include variations in the dose level, incubation time, cell type, cell line and other parameters which are typically varied as part of an experimental protocol.
After the cells have been treated, the extent of the effect of the treatment for the on-target effect is evaluated in step 104. The evaluation of the extent to which the treatment affects the on-target effect is determined by investigating, in a quantitative way, how the properties of the cells that are involved in or related to the on-target effect have changed.
For example, the on-target effect could be mitotic arrest, in which case the efficacy of a treatment in delaying the progression of mitosis, or arresting cells in mitosis, could be under investigation. After the treatment has been applied to the cells and features have been extracted from captured images, some of the cellular features can be used to classify cells as interphase or mitotic. For example, the amount of fluorescence from an anti-phospho-histone H3 (PH3) antibody coupled to a fluorophore can be used to distinguish between mitotic and interphase cells. If PH3 staining is not available, or desirable, then cells can be classified as mitotic or interphase based on a combination of the size of nuclei and the amount of DNA material in nuclei (as revealed by DNA staining using DAPI or Hoechst stains). Mitotic cell DNA is generally smaller and brighter (i.e. captured images have higher mean and median pixel intensities) than DNA in interphase cells. Although there is no real nucleus during mitosis in mammalian cells, amounts of DNA can still be identified. After each cell, or image object, has been classified as interphase or mitotic (or discarded as being an imaging artefact), the proportion of mitotic cells in the cell population can be calculated and provides a metric for the on-target effect: in this example a mitotic index. The effect of the treatment can then be determined by comparing the mitotic index for the treated cells with the mitotic index for a control group of cells. An increase in the mitotic index compared with the negative controls is an indication that the treatment promotes mitotic arrest.
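The mitotic index calculation described above can be illustrated with a minimal sketch. The `CellObject` type, the threshold value and the intensity values are hypothetical and chosen only for illustration; a real assay would derive the threshold from its own controls.

```python
from dataclasses import dataclass


@dataclass
class CellObject:
    # Mean PH3 marker intensity for one segmented object (arbitrary units).
    ph3_intensity: float


def classify(cell, ph3_threshold):
    """Label an object as mitotic when its PH3 signal exceeds the threshold."""
    return "mitotic" if cell.ph3_intensity > ph3_threshold else "interphase"


def mitotic_index(cells, ph3_threshold):
    """Return the proportion of classified cells that are mitotic."""
    if not cells:
        raise ValueError("no cells to classify")
    mitotic = sum(1 for c in cells if classify(c, ph3_threshold) == "mitotic")
    return mitotic / len(cells)


# Hypothetical intensities for a control well and a treated well.
PH3_THRESHOLD = 50.0  # assumed cut-off, not taken from the text
control = [CellObject(x) for x in (10, 12, 95, 11, 9, 14, 13, 10, 12, 11)]
treated = [CellObject(x) for x in (90, 12, 95, 88, 9, 102, 13, 97, 12, 91)]

mi_control = mitotic_index(control, PH3_THRESHOLD)
mi_treated = mitotic_index(treated, PH3_THRESHOLD)

# A positive difference suggests that the treatment promotes mitotic arrest.
increase = mi_treated - mi_control
print(mi_control, mi_treated)
```

The same structure would apply if the classifier used nuclear size and DNA intensity instead of PH3; only `classify` would change.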
In the above example, mitotic arrest of cells is the on-target effect or property, and a cellular feature, or group of cellular features, which are characteristic of that effect are used to indicate the extent of that effect. In the above example, the detection of PH3 is used. Alternatively, in the above example, the size of the nuclei in the cells and/or other features relating to nuclear size can be used as the cellular feature, or group of cellular features, as, in general, mitotic arrest causes nuclei to be smaller than the nuclei of interphase cells. Therefore the size of the nuclei in the treated cells is a cellular feature which is related to the on-target effect of interest. Other cellular features, involved in mitotic arrest, are also cellular features which are related to the on-target effect. For example the nuclear perimeter, nuclear area, nuclear form factor and other metrics relating to the morphology, shape or texture of a nucleus could also be used as cellular features related to the on-target effect.
There will likely be other cellular features of cell components which are involved in or relate to mitotic arrest and which will also be affected by the treatment and so change. Therefore, from the set of all cellular features, there will be a subset which relate to mitotic arrest (the on-target cellular features). Therefore, using one or a combination of the on-target cellular features, the effect of the treatment on the on-target effect can be evaluated.
It is possible that there will be a number of cellular features which will not be affected by the treatment and these can be considered to be “irrelevant” or neutral cellular features as the treatment has no noticeable or substantial effect on them.
As well as producing the on-target effect, the treatment may have one or more side effects or “off-target” effects on the cells. For example, as well as a treatment causing mitotic arrest, the same treatment may also cause the breakdown of the actin cytoskeleton of a cell, or of the Golgi apparatus in interphase cells. This breakdown may be a more or a less dominant effect of the treatment than mitotic arrest, but nonetheless it can be considered to be a “side effect” or “off-target effect” as it is not the intended or targeted effect (which in this example is mitotic arrest) of the treatment under investigation.
For any treatment, there will likely be a number of cellular features relating to a cell or cell components which are related to the side or off-target effect or effects. For example, cellular features relating to or characteristic of the Golgi apparatus can be used to determine the extent of the off-target effect of the treatment on the proteins involved in the maintenance of the Golgi, and which are not involved in mitotic arrest. Therefore, there will be a number of cellular features which are affected by the treatment, but which are not related to the on-target effect. One, some or all of those cellular features can be considered off-target cellular features which can be used in step 104 to evaluate the extent of the effect of the treatment on off-target effects.
It is envisaged that there may be one or more side or off-target effects and that different groups of off-target cellular features may be used in order to evaluate or assess the effect of the treatment on the multiple side effects. In some instances, the side effect may be toxicity. However, in general, the side or off-target effects of a treatment can be any effect on the cellular proteins which are not related to the intended or on-target effect under investigation.
By evaluating 104 both the on-target and off-target effects of the treatment, a better characterisation of the treatment on the cells can be obtained at step 106. Conventional investigations have tended to focus on the single intended effect of a treatment, and side effects have not been systematically evaluated in order to better characterise the overall effect of the treatment on the cells. For instance, a treatment may have a high efficacy as a mitotic arrest agent but may also be highly toxic and result in significant cell death. Therefore, an investigation which evaluates the effect on mitotic arrest alone would not necessarily highlight this important and potentially harmful side effect. Therefore, the methods of the present invention allow a better characterisation of the overall effect of the treatment by considering the intended effect and also evaluating side effects.
Further, it has been found that different dose levels and experimental protocols can result in different levels at which the intended and side effects occur. Therefore, a treatment, which under conventional investigation methods may be discarded from further evaluation as being either harmful or non-efficacious, can be identified as beneficial under methods of the present invention. Also, appropriate dose levels can be determined at which the desired effects are increased and the harmful effects are reduced, which otherwise would not be identified in the absence of information as to the extent of any side effects. Therefore at step 106, the treatment can be characterised based on the on-target effect and any off-target effects, and, in some embodiments, over multiple experimental conditions. It will be appreciated that the on-target effect is not limited to being a beneficial effect and can be a beneficial or harmful effect on the cells, and similarly the off-target effect is not limited to being a harmful effect and may also be beneficial or harmful, depending on the context of the overall investigations.
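One way to picture the dose-level selection mentioned above is the following sketch. It assumes that an on-target metric and an off-target metric have already been computed for each dose level; the selection rule (largest on-target metric subject to a cap on the off-target metric), the cap and all numbers are hypothetical and are not prescribed by the description.

```python
def select_dose(dose_response, max_off_target):
    """Pick the dose with the largest on-target metric among doses whose
    off-target metric does not exceed max_off_target.

    dose_response is a list of (dose, on_target_metric, off_target_metric).
    Returns None when no dose satisfies the off-target cap.
    """
    acceptable = [row for row in dose_response if row[2] <= max_off_target]
    if not acceptable:
        return None
    return max(acceptable, key=lambda row: row[1])[0]


# Hypothetical metrics: (dose, on-target metric, off-target metric).
dose_response = [
    (0.1, 0.12, 0.01),
    (1.0, 0.35, 0.05),
    (10.0, 0.60, 0.15),
    (100.0, 0.65, 0.70),  # strongest on-target effect, but large side effect
]

best = select_dose(dose_response, max_off_target=0.2)
print(best)
```

With these illustrative numbers the highest dose is rejected despite its slightly larger on-target metric, because its off-target metric exceeds the cap.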
Having discussed the overall methodology of the invention, an example embodiment will now be described in greater detail in the context of an image based collection of cellular features and using the example of mitotic arrest. However, it will be understood that the invention is not limited to investigation of the effect of a treatment on mitotic arrest and side effects thereof, but is applicable to any treatment and to any effect on cellular components, mechanisms or activities and side effects. In particular, the on-target cellular features, relating to the on-target effect, and the off-target cellular features, relating to the off-target effect, will be entirely application dependent. The off-target and on-target cellular features will depend on a number of factors, including: the nature of the intended on-target effect of the treatment and of any anticipated side effects; specific assay configurations, such as cell lines and markers used in the assay; the desired sensitivity; the concentration or dose levels of the treatment; the definition of the on-target and off-target effects; and the sensitivity of the assay at detecting the off-target effects.
Different types of cells can be used in the investigations. For example, for side effects of anti-mitotic cancer treatments, a set of transformed and primary cell lines can be used. Cell lines or mixed cell cultures that can serve as a surrogate for specific types of toxicity can be used, for example primary hepatocytes or hippocampal neurons.
Cellular features relating to various different types of generic cellular phenomena can be related to the on-target and off-target effect, such as changes in growth rate, cell cycle status, cytoskeletal organization, cell shape, alterations in organization and functioning of the endocytic pathway, changes in expression and/or localization of transcription factors, receptors and similar.
It is not necessary to know the off-target cellular features in advance, as the off-target features are essentially the features which are affected by the treatment but which are not related to the intended or on-target effect of the treatment. Therefore the cellular features to be used in order to evaluate the extent of the off-target effect may only become apparent after the investigations have been initiated. The off-target cellular features may be selected based on biological knowledge of already known potential effects, in which case the investigation can determine whether the particular treatment gives rise to any of these effects as a side effect. In another embodiment, computational techniques can be used in order to identify off-target cellular features, if a good training set from previous experiments is available.
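The partition of features into on-target, off-target and neutral groups described above might be sketched as follows. This is an illustration under stated assumptions: "affected by the treatment" is approximated by a z-score against control replicates with an arbitrary threshold of 2, the on-target feature names are supplied from biological knowledge, and the group measure (root mean square of the z-scores) is one possible choice among many. The feature names and values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_score(treated_value, control_values):
    """Standardised shift of a treated feature value relative to controls."""
    sd = stdev(control_values)
    if sd == 0:
        raise ValueError("control values have zero spread")
    return (treated_value - mean(control_values)) / sd


def partition_features(treated, controls, on_target_names, z_threshold=2.0):
    """Split features into on-target, off-target and neutral groups.

    A feature counts as affected when |z| >= z_threshold (assumed rule).
    Affected features named in on_target_names are on-target; all other
    affected features are off-target; unaffected features are neutral.
    """
    groups = {"on_target": {}, "off_target": {}, "neutral": {}}
    for name, value in treated.items():
        z = z_score(value, controls[name])
        if abs(z) < z_threshold:
            groups["neutral"][name] = z
        elif name in on_target_names:
            groups["on_target"][name] = z
        else:
            groups["off_target"][name] = z
    return groups


def group_measure(z_scores):
    """Root mean square of the z-scores in a group (0.0 for an empty group)."""
    if not z_scores:
        return 0.0
    return sqrt(sum(z * z for z in z_scores.values()) / len(z_scores))


# Hypothetical feature values for control replicates and one treated well.
controls = {
    "nuclear_area": [100, 102, 98, 101, 99],
    "golgi_texture": [5.0, 5.2, 4.8, 5.1, 4.9],
    "actin_intensity": [50, 52, 48, 51, 49],
}
treated = {"nuclear_area": 80, "golgi_texture": 3.0, "actin_intensity": 50.5}

groups = partition_features(treated, controls, on_target_names={"nuclear_area"})
on_target_measure = group_measure(groups["on_target"])
off_target_measure = group_measure(groups["off_target"])
```

Note that the off-target group is never listed in advance here; it falls out as whatever is affected but not named as on-target.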
Although illustrated as sequential in
In the illustrated embodiment, at step 254, the cells are treated, chemically fixed, stained and placed in wells. However, this is not necessary and in another embodiment, live cells can be used which express a fluorescent protein or are stained with live dyes, so that no fixing or staining operations are required. In greater detail, wells are provided holding a population of cells. The treatment, in this example a compound, to be investigated is applied to the cells at different concentration levels, by dilution in culture medium. In this example, eight different concentration or dose levels are used, with a different dose level in each well. Fewer or more dose levels can be used as appropriate. The experiment is replicated three times so as to provide three sets of results for each concentration level. Fewer replicates can be used based on cost considerations, but larger numbers of replicates are preferred as providing data with a lower noise level. The drug and cells can be allowed to incubate for a fixed period of time, e.g. in one embodiment 24 hours, to allow the treatment to take effect. In other embodiments, the cells are allowed to incubate for varying periods of time, in order to investigate the time variation of the treatment. The cells can then be chemically fixed, for a single time point assay. The cells for each cell line are subject to a first staining protocol and a second staining protocol, which may involve multiple stains depending on the number and type of cellular features to be marked. Hence, in the described embodiment, 288 wells (eight dose levels, six cell lines, two staining protocols and three replicates) are used, each holding a cellular population or group therein.
At the same time as the treated cells are being prepared, a number of control populations of cells are also prepared in step 256. The cells are subject to the same staining treatments, fixation and incubation periods as the treated cells, but without being subjected to the treatment. In one embodiment, the cells are incubated with DMSO, at the same concentration levels as that used to administer the treatments, in order to provide controls for each cell line and staining or experimental condition. In one embodiment eight control wells are provided on each well plate. This provides at least one control for each cell line/staining protocol combination. Hence the cell sample preparation step 204 results in eight treatment concentrations, in triplicate, with cells stained according to two different protocols, and for six different cell lines and with control populations of cells which have not been exposed to the treatment. It is not necessary to use more than one stain or staining protocol and in other embodiments a single stain only can be used.
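The well count in the described embodiment follows from enumerating every combination of the four experimental factors. The short sketch below does this enumeration; the cell line and staining protocol labels are placeholders, since the description does not name them here.

```python
from itertools import product

# Factor levels from the described embodiment; the labels are placeholders.
DOSE_LEVELS = range(1, 9)                             # eight dose levels
CELL_LINES = [f"cell_line_{i}" for i in range(1, 7)]  # six cell lines
STAINING_PROTOCOLS = ["protocol_1", "protocol_2"]     # two staining protocols
REPLICATES = range(1, 4)                              # three replicates

treated_wells = [
    {"dose_level": d, "cell_line": line, "staining": s, "replicate": r}
    for d, line, s, r in product(
        DOSE_LEVELS, CELL_LINES, STAINING_PROTOCOLS, REPLICATES
    )
]

print(len(treated_wells))  # 8 * 6 * 2 * 3 = 288
```

A list of records like this can also serve as the key for joining image-derived features back to their experimental conditions.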
Returning to
A user interface device 292, which can be a personal computer, a work station, a network computer, a personal digital assistant, or the like, is coupled to the computing device. In the case of cells treated with a fluorescent marker, a collection of such cells is illuminated with light at an excitation frequency from a suitable light source (not shown). A detector part of the image capturing device is tuned to collect light at an emission frequency. The collected light is used to generate an image, which highlights regions of high marker concentration.
Sometimes corrections can be made to the measured intensity. This is because the absolute magnitude of intensity can vary from image to image due to changes in the staining and/or image acquisition procedure and/or apparatus. Specific optical aberrations can be introduced by various image collection components such as lenses, filters, beam splitters, polarizers, etc. Other sources of variability may be introduced by an excitation light source, a broad band light source for optical microscopy, a detector's detection characteristics, etc. Even different areas of the same image may have different characteristics. For example, some optical elements do not provide a “flat field.” As a result, pixels near the center of the image have their intensities exaggerated in comparison to pixels at the edges of the image. A correction algorithm may be applied to compensate for this effect. Such algorithms can be developed for particular optical systems and parameter sets employed using those imaging systems. One simply needs to know the response of the systems under a given set of acquisition parameters.
After the images have been captured, at step 264, the captured images are processed using any suitable image processing and image correction techniques in order to extract the cellular features for the cells from the stored captured images.
A number of image processing steps can be carried out in step 264 and not all the steps described are essential. Certain steps may be omitted and other steps may be added depending on the exact nature of the image capture process and markers used. The image can be corrected to remove any artefacts introduced by the image capture system and to remove any background. Other conventional image correction techniques which will improve the quality of the image can also be used. Typically, nuclear markers and cytoplasmic markers generate radiation at different wavelengths and so separate nuclear images and cytoplasmic images may be captured. Therefore different image correction techniques may be used for the nuclear and cytoplasm images, or for images captured of different markers or stains. Similarly, in the rest of the processes, different techniques may be used for the nuclear and cytoplasmic images, depending on the markers used. Also, different processing techniques can be carried out depending on the type of imaging that is used, e.g. brightfield, confocal or deconvolution.
After image correction, a segmentation process is carried out on the images in order to identify individual objects or entities within the image. Any suitable segmentation process may be used in order to obtain various cellular objects or components, such as nuclear and cellular objects and components. Typically nuclear DNA markers provide a strong signal and there is a high contrast in the image and an edge detection based segmentation process can be used. For segmenting cells, a watershed type method can be used instead. The segmentation process typically identifies edges where there is a sudden change in intensity of the cells in the image and then looks for closed connected edges in order to identify an object. Segmentation will not be described in greater detail as it is well understood in the art and so as not to obscure the present invention.
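As a rough illustration of identifying closed, connected objects in an image, the sketch below uses plain intensity thresholding followed by 4-connected component labelling. This is a deliberately simpler stand-in for the edge detection and watershed approaches mentioned above, not an implementation of them; the threshold and the tiny test image are hypothetical.

```python
from collections import deque


def label_objects(image, threshold):
    """Label 4-connected regions of pixels brighter than threshold.

    image is a list of equal-length rows of intensities. Returns a label
    image (0 = background, 1..n = objects) and the number of objects.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c]:
                continue
            # Start a new object and flood-fill its connected pixels.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def object_areas(labels, count):
    """Pixel area of each labelled object, keyed by label."""
    areas = {k: 0 for k in range(1, count + 1)}
    for row in labels:
        for value in row:
            if value:
                areas[value] += 1
    return areas


# Hypothetical nuclear-channel image with two bright objects.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 8, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
labels, count = label_objects(image, threshold=5)
areas = object_areas(labels, count)
```

Per-object measurements such as area are the kind of morphological parameter that later feature extraction builds on.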
Additional operations may be performed prior to, during, or after the imaging operation 206 of
In a specific embodiment, a correction algorithm may be applied prior to segmentation to correct for changing light conditions, positions of wells, etc. In one example, a noise reduction technique such as median filtering is employed. Then a correction for spatial differences in intensity may be employed. In one example, the spatial correction comprises a separate model for each image (or group of images). These models may be generated by separately summing or averaging all pixel values in the x-direction for each value of y and then separately summing or averaging all pixel values in the y direction for each value of x. In this manner, a parabolic set of correction values is generated for the image or images under consideration. Applying the correction values to the image adjusts for optical system non-linearities, mis-positioning of wells during imaging, etc.
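The row and column averaging described above can be sketched as follows. This is a minimal illustration, not the correction algorithm of the embodiment: it assumes the shading is multiplicative and separable in x and y, that all profile values are positive, and it uses the raw averaged profiles directly rather than fitting a parabola to them. The function name `correct_spatial_intensity` is hypothetical, and the median filtering step is omitted.

```python
from statistics import mean


def correct_spatial_intensity(image):
    """Flatten separable, multiplicative shading in a 2-D image.

    `image` is a list of equal-length rows of positive pixel intensities.
    Assumption: shading(x, y) is proportional to row_profile[y] * col_profile[x].
    """
    # Average all pixel values in the x-direction for each value of y.
    row_profile = [mean(row) for row in image]
    # Average all pixel values in the y-direction for each value of x.
    col_profile = [mean(col) for col in zip(*image)]
    overall = mean(row_profile)

    corrected = []
    for y, row in enumerate(image):
        corrected.append([
            # Divide by the estimated shading and rescale to the overall mean.
            pixel * overall * overall / (row_profile[y] * col_profile[x])
            for x, pixel in enumerate(row)
        ])
    return corrected
```

For an image that contains only separable shading and no signal, every corrected pixel comes out equal to the overall mean intensity. A closer match to the parabolic correction values described above would fit a second-order polynomial to each profile before dividing.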
Generally the images used as the starting point for the methods of this invention are obtained from cells that have been specially treated and/or imaged under conditions that contrast the cell's marked components from other cellular components and the background of the image. Typically, the cells are fixed and then treated with a material that binds to the components of interest and shows up in an image (i.e., the marker).
At every combination of dose, cell line and staining protocol, one or more images can be obtained. As mentioned, these images are used to extract various parameter values of cellular features of relevance to a biological phenomenon of interest. Generally a given image of a cell, as represented by one or more markers, can be analyzed, in isolation or in combination with other images of the same cell (as provided by different markers), to obtain any number of image features. These features are typically statistical or morphological in nature. The statistical features typically pertain to a concentration or intensity distribution or histogram.
Some general feature types suitable for use with this invention include a cell count (or a nucleus count where appropriate), an area, a perimeter, a length, a breadth, a fiber length, a fiber breadth, a shape factor, an elliptical form factor, an inner radius, an outer radius, a mean radius, an equivalent radius, an equivalent sphere volume, an equivalent prolate volume, an equivalent oblate volume, an equivalent sphere surface area, an average intensity, a total intensity, an optical density, a radial dispersion, and a texture difference. These features can be average or standard deviation values, or frequency statistics from the parameters collected across a population of cells. In some embodiments, the features include features from different cell portions or cell lines.
Examples of some specific cellular and nuclear features and parameters that may be extracted from the captured images during step 264 are included in the following table. Other features and parameters can also be used without departing from the scope of the invention.
After the features have been extracted 264 from the image they are stored 210 in database 286, and analysis of the features is carried out in order to assess the effect of the treatment on the cells.
As explained above, some of the cellular features obtained for the cells are simple features, e.g. the area of a nucleus. Other cellular features are statistical in nature, e.g. the standard deviation of the nuclear area for a group of cells, and reflect properties of the group of cells in a well or related wells. It will be appreciated that any simple or complex cellular feature that can be derived from the images is suitable for use in the present invention and that the invention is not to be limited to the specific examples given, nor to the specific sequence of actions, which is merely by way of an illustrative example. The result of step 264 can be thousands or tens of thousands of cellular features derived from each of the treated wells and control wells.
In general, in steps 266 and 268, cells from a well are evaluated and some statistics for that well, e.g. the average of a property, are calculated. Then, the same quantity is obtained for the replicate wells (i.e. the other five wells when the experiment is replicated six times), and statistics are computed on those per-well statistics in order to aggregate them (e.g. to obtain the median of the average value mentioned above). However, averaging is not necessary; instead, cell level information can be used, with all further computations based on that cell level information. Hence, for each drug/cell line/time point/marker set/etc. there would be thousands of data points. Models based on this would be more complicated and would require greater computing power, but may provide better estimates compared to the matrix discussed below.
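The two-level aggregation described above (a per-well average, then a median across replicate wells) can be sketched as follows. The function name `aggregate_replicates` and the input layout (one list of cell-level values per well) are assumptions made for illustration only.

```python
from statistics import mean, median


def aggregate_replicates(wells):
    """Aggregate one cell-level feature across replicate wells.

    `wells` is a list with one entry per replicate well; each entry is a
    list of the feature values measured for the cells in that well.
    Empty wells are skipped (an assumption, not stated in the text).
    """
    # Well-level statistic: the average of the property over the cells.
    well_means = [mean(cells) for cells in wells if cells]
    # Replicate-level statistic: the median of the per-well averages.
    return median(well_means)
```

Using the median at the replicate level makes the aggregate less sensitive to a single outlying well than a second average would be.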
At step 266, at each dose level and for each cell line, the cellular features can be averaged, e.g. to obtain an average nuclear area for the cells from a certain cell line at a certain dose level. Hence an average simple cellular feature can be obtained for each cell line at each dose level. However, it is not necessary to calculate averages over cells. Also, other statistical measures can be used such as the median, specific quantiles, standard deviations and other measures of the statistical properties of a group of objects. Further, the statistical properties need not be calculated over all cells, but can be calculated over a sub-population of cells, for example over the sub-group of interphase cells. In that case, a cell cycle related classification of the cells is carried out prior to summarizing or averaging the cell feature values. For example, in the example where the on-target effect is mitotic arrest, the off-target cellular features are computed only for the sub-population of interphase cells, e.g. the average cell area for all interphase cells and not for all cells.
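Restricting the statistics to a sub-population can be sketched as below. The record layout (a dict per cell with a `"phase"` label and named feature values) and the function name `summarize_subpopulation` are hypothetical; the cell cycle classification itself is assumed to have been carried out already.

```python
from statistics import mean, median, stdev


def summarize_subpopulation(cells, feature, phase="interphase"):
    """Summarize one feature over the cells classified into `phase`.

    `cells` is a list of dicts, each with a "phase" key and feature keys.
    Returns None when no cell belongs to the requested sub-population.
    """
    values = [cell[feature] for cell in cells if cell["phase"] == phase]
    if not values:
        return None
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        # The sample standard deviation needs at least two values.
        "stdev": stdev(values) if len(values) > 1 else 0.0,
    }
```

For a mitotic arrest treatment, calling this with `feature="cell_area"` would give the average cell area over interphase cells only, so that the enlarged mitotic population does not dominate the off-target features.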
At step 268, more complex cellular features, based on a statistical analysis of the properties of the cells in the wells, rather than the properties of a single cell, are calculated over all the wells for each cell line at each dose level. Hence the cellular features obtained characterise the simple cellular features and statistical cellular features for the cellular populations at each dose level for each cell line.
In other embodiments, the simple cellular features and the statistical cellular features can be determined across cell lines so as to be characteristic of the effect of the treatment across different cell lines. In other embodiments, different incubation times can be used for a given concentration and the cellular features can be averaged over the different incubation times in order to provide cellular features characteristic of the effect of the treatment at the same dose level but over different incubation times.
Returning to
The method then proceeds to calculate at step 304 a quantitative measure of the on-target effect relative to the control cells. In this example, the on-target metric is the proportion of mitotic cells in a cellular population. For example the proportion of mitotic cells for a certain dose level may be of order 30%. As will be appreciated, the reliability of determination of the proportion of mitotic cells will depend on the number of cells present in the population of cells being evaluated. For example a determination of 30% from a population of 1500 cells can be considered to have greater reliability than the proportion obtained from a cellular population of, for example, only 100 cells. Further, the calculation of the on-target metric is carried out relative to the control cell population for the cell line. Again the reliability of the determination of the proportion of mitotic cells in the control well will depend on the number of cells in the control well.
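The dependence of reliability on population size can be illustrated with the standard error of a binomial proportion. The formula used here is the usual normal-approximation result from elementary statistics, not something specified in this description, and the function name is hypothetical.

```python
from math import sqrt


def proportion_standard_error(p, n):
    """Normal-approximation standard error of a proportion p observed in n cells."""
    return sqrt(p * (1.0 - p) / n)


# Same observed mitotic proportion, two different population sizes.
se_large = proportion_standard_error(0.30, 1500)
se_small = proportion_standard_error(0.30, 100)
```

With these inputs the standard error for 1500 cells is roughly a quarter of that for 100 cells, which is why the same 30% figure carries more weight in the larger population.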
Therefore, in one embodiment, in order to take this effect into account, chi-squared statistics are used. A method for obtaining approximate confidence intervals for the ratio of two binomial proportions based on two independent binomially distributed random variables is used. A chi-squared test is used to test the null hypothesis, that the treated and control cell populations can be considered to come from the same cell population, against the hypothesis that the treated cells and the control cells can be considered to come from different cell populations, and hence that the treatment has had a significant effect. The method is described in greater detail in “Confidence Intervals for the Ratio of Two Binomial Proportions”, Biometrics Volume 40, Issue 2, pp. 513-517, June 1984 which is incorporated herein by reference for all purposes.
In particular, where n is the total number of objects (cells), X is the number of objects under investigation (i.e. mitotic cells), and the subscripts t and c refer to treated cells and control cells respectively, then:
$$p' = \frac{(n_t + n_c + X_t + X_c) - \sqrt{(n_t + n_c + X_t + X_c)^2 - 4(n_t + n_c)(X_t + X_c)}}{2(n_t + n_c)}$$
under the null hypothesis $H_0: \theta = 1$, where $\theta$ is the ratio of the two proportions, and the chi-squared statistic $I$ is given by:
$$I = \frac{n_t (p_t - p')^2 + n_c (p_c - p')^2}{p'(1 - p')}$$
where $p'$ is calculated as given above, $p_t = X_t / n_t$ is the proportion of mitotic treated cells and $p_c = X_c / n_c$ is the proportion of mitotic control cells. Although chi-squared statistics are used to provide the test, other statistics can be used.
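The two expressions above can be evaluated directly, as in the following sketch. The function names are hypothetical, and the cell counts in the usage note are illustrative values, not data from any experiment. The sketch assumes the pooled proportion is strictly between 0 and 1, since the statistic is undefined otherwise.

```python
from math import sqrt


def pooled_proportion(n_t, x_t, n_c, x_c):
    """p' under the null hypothesis that treated and control proportions are equal."""
    s = n_t + n_c + x_t + x_c
    return (s - sqrt(s * s - 4 * (n_t + n_c) * (x_t + x_c))) / (2 * (n_t + n_c))


def chi_squared_statistic(n_t, x_t, n_c, x_c):
    """Chi-squared statistic I comparing treated and control mitotic proportions."""
    p = pooled_proportion(n_t, x_t, n_c, x_c)
    p_t = x_t / n_t  # proportion of mitotic treated cells
    p_c = x_c / n_c  # proportion of mitotic control cells
    return (n_t * (p_t - p) ** 2 + n_c * (p_c - p) ** 2) / (p * (1.0 - p))
```

Because the term under the square root is a perfect square, p′ reduces to the pooled proportion (Xt+Xc)/(nt+nc). As an illustration, 450 mitotic cells out of 1500 treated cells against 75 out of 1500 control cells gives p′ = 0.175 and a statistic far above 3.84, the conventional 5% critical value for a chi-squared distribution with one degree of freedom (the choice of significance level being an assumption here).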
Hence the end result of step 304 is a quantitative measure of the extent of on-target effect of the treatment on the cell line at a particular dose level relative to the control group for that same cell line. As will be appreciated, in other embodiments, the value can be calculated across the cell lines rather than on a per cell line basis. Also, it is not essential to calculate the mitotic index taking into account the properties of the control group in order to arrive at a suitable on-target metric. However, it is preferred if the on-target metric is calculated using on-target cellular features which vary with respect to the control group of cells.
Returning to
More specifically, in the example under discussion, a particular group of off-target cellular features for characterising the off-target effects of a mitotic arrest drug could include, for all cells that are not mitotic:
(i) the average size of cell nuclei;
(ii) the average elliptical axis ratio for nuclei;
(iii) the average kurtosis intensity of cells;
(iv) the average pixel intensity for Golgi apparatus in cells;
(v) the average cell area;
(vi) the elliptical axis ratio for cells;
(vii) the form factor (area divided by perimeter) for cells;
(viii) the kurtosis of the intensity of tubulin;
(ix) the second moment of a cell;
(x) the average total intensity of tubulin for each cell;
(xi) the proportion of branched (i.e. having projections) cells.
In this example, the above group of cellular features constitutes the group of off-target cellular features which in combination define the off-target signature. A sub-group of these features can be used, or alternatively other groups of off-target cellular features can be used. As will be appreciated, there are a large number of variables in this group of features. Some of these variables may be more important than others, i.e. may be more affected by the treatment than others. The combination of these features can be thought of as defining a vector in a multivariate space (defined by the cellular features) and which is characteristic of the off-target effect, i.e. provides a signature of the off-target effect.
At step 324, a quantitative measure of the extent of the off-target effect is determined by calculating an off-target metric at each dose level and for each cell line. In another embodiment, the off-target metric can be calculated for the combination of all cell lines. The degree to which the treatment causes an off-target effect is reflected in the separation in multivariate space between the off-target signature for treated cells and the off-target signature for the control group of cells.
In one embodiment, each cellular feature can be normalised with respect to the other cells in the group of cells at the particular dose level and for the cell line. Each cellular feature is normalised ($f_N$) by subtracting the average value ($f_{av}$) for the cellular feature over the population of cells from the value ($f$) and dividing by the standard deviation ($\sigma$) for the population of cells as follows: $f_N = (f - f_{av})/\sigma$.
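The normalisation above can be sketched as follows. The text does not say whether the population or sample standard deviation is intended; the sketch assumes the population form (`pstdev`), and returns zeros for a feature with no spread to avoid dividing by zero. The function name is hypothetical.

```python
from statistics import mean, pstdev


def normalise_feature(values):
    """Return (f - f_av) / sigma for each value of one cellular feature."""
    f_av = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant feature carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(f - f_av) / sigma for f in values]
```

After normalisation each feature has zero mean and unit standard deviation, so features measured in different units (areas, intensities, ratios) contribute on a comparable scale to any distance computed from them.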
After each cellular feature has been normalised in this way, and similarly for the control group cellular features, a distance in multivariate space is calculated. For the purposes of simplicity of discussion, if it is assumed that there are only three cellular features (a, b, c) comprising the off-target signature, and where the subscript ‘t’ refers to a feature of a treated cell and the subscript ‘c’ refers to a feature of a control cell, then the distance ($L_1$) in multivariate space between the off-target signature of the treated cells and off-target signature of the control cells can be calculated as $L_1 = |a_t - a_c| + |b_t - b_c| + |c_t - c_c|$, which provides the off-target metric.
Alternatively, the Euclidean distance ($L_2$) can be calculated using $L_2 = \sqrt{(a_t - a_c)^2 + (b_t - b_c)^2 + (c_t - c_c)^2}$ to provide the off-target metric. Other methods of calculating the separation in multivariate space between the treated cell off-target signature and the control cell off-target signature can also be used. Further, in other embodiments of the invention, the on-target metric can be calculated in the same way, using on-target signatures, rather than using the example method described above with reference to
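The L1 and Euclidean separations between a treated signature and a control signature can be sketched as below. The signatures are represented as equal-length sequences of already normalised feature values; the function names are hypothetical.

```python
from math import sqrt


def l1_distance(treated, control):
    """Sum of absolute feature differences between two signatures."""
    return sum(abs(t - c) for t, c in zip(treated, control))


def l2_distance(treated, control):
    """Euclidean distance between two signatures."""
    return sqrt(sum((t - c) ** 2 for t, c in zip(treated, control)))
```

The same functions work for any number of features, not only the three used in the simplified discussion. The L1 form weights every feature difference linearly, whereas the Euclidean form emphasises the features that differ most.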
Returning to
By way of example of evaluation, point 336 corresponds to a particular dose level for a particular treatment on a particular cell line. As can be seen, at this dose level, both the on-target and off-target metrics are significant. It may be that in the absence of the off-target metric, this dose level would be considered acceptable as providing a desired efficacy with regard to the on-target effect. However, by utilising the off-target metric, this dose level may be identified as being undesirable, e.g. toxic, and so the treatment can be more accurately characterised. Point 338 corresponds to a different dose level for the same compound and the same cell line. At this dose level, the compound may be considered to provide sufficient efficacy and to have sufficiently low off-target effect as to be of utility. In this example, the dose level associated with point 338 is lower than the dose level associated with point 336 and therefore is useful in identifying a suitable dosage level for the treatment in order to avoid unwanted side effects. The dose level corresponding to point 340 is lower than the dose level corresponding to point 338 but at this dose level, the side effects are greater, indicated by the higher off-target metric, and so again this helps to identify dosage levels at which undesirable effects can be reduced.
Similarly, point 342 which corresponds to the same drug as points 336, 338 and 340 but applied to a different cell line shows a high level of on-target effect and possibly an acceptably low level of off-target effect. As can be seen for the dosage levels either side of this point, there is a significant reduction in the on-target effect and also an increase in the off-target effects. Hence the graphical representation of the on-target and off-target metrics can be of use in evaluating the on-target and off-target effects and can provide indications as to further areas of interest to be the subject of further investigations and experiments.
Also, evaluation of the on-target and off-target metrics can be used as a screening method in order to help identify good candidate drugs or pharmaceuticals for further investigation. For example the treatment resulting in the points plotted in the left hand side of the plot may be a better candidate drug than the drug corresponding to the points plotted in the bottom right hand side area of the plot.
With regard to characterising compounds, either the on-target or off-target effect metric reaching a threshold or not reaching a threshold can be used as a mechanism in order to characterise a treatment. For example the set of three lines to the right of the 75 mark on the off-target axis may be considered too harmful for further investigation, if the off-target effect is a harmful one, or alternatively may be considered good candidate compounds if the off-target effect is a beneficial effect. Similarly, the group of lines toward the origin, and which relate to a further treatment, may be considered to indicate that the treatment does not have a sufficient effect on the on-target or off-target effect. However, whether an on-target or off-target metric falls above or below a threshold and so can be considered to be indicative of a useful property, or not, will be entirely application dependent as in some applications exhibiting the effect may be considered beneficial and in other applications not exhibiting the effect may be considered beneficial, and vice versa.
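A threshold-based characterisation of this kind can be sketched as follows. The threshold values, the labels and the function name are all hypothetical, and whether a given label counts as desirable remains application dependent, as noted above.

```python
def characterise(on_target, off_target, on_threshold, off_threshold):
    """Label a (dose, cell line) point by which metrics reach their thresholds."""
    on_hit = on_target >= on_threshold
    off_hit = off_target >= off_threshold
    if on_hit and off_hit:
        return "on-target and off-target effect"
    if on_hit:
        return "on-target effect only"
    if off_hit:
        return "off-target effect only"
    return "no significant effect"
```

For example, with an off-target threshold of 75, a point with an on-target metric of 80 and an off-target metric of 10 would be labelled as showing the on-target effect only, while a point near the origin would be labelled as showing no significant effect.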
The distance L1 is calculated for each control well and then the average distance is calculated together with the standard deviation in step 364. Then the off-target metric for treated wells is calculated at step 366, again relative to the origin of multivariate space. Then the number of standard deviations between the control well mean off-target metric and the treated well off-target metric is determined at step 368. If the metric for the treated well is considered to lie a significant number of standard deviations from the mean for control wells, then this can be considered indicative of a significant off-target effect and the treatment characterised accordingly at step 370. The actual number of standard deviations that can be considered significant will vary from application to application. For some screens, 10 to 15 standard deviations have been found to be indicative of significance.
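Steps 364 to 368 can be sketched as below. Each well is represented by its off-target signature (a sequence of normalised feature values), and the L1 distance is taken relative to the origin as described. The use of the sample standard deviation and the function names are assumptions, and at least two control wells with differing metrics are required.

```python
from statistics import mean, stdev


def l1_from_origin(signature):
    """L1 distance of a signature from the origin of multivariate space."""
    return sum(abs(value) for value in signature)


def standard_deviations_from_control(control_signatures, treated_signature):
    """Number of control standard deviations separating a treated well
    from the mean off-target metric of the control wells."""
    control_metrics = [l1_from_origin(s) for s in control_signatures]
    control_mean = mean(control_metrics)
    control_sd = stdev(control_metrics)  # assumption: sample standard deviation
    return (l1_from_origin(treated_signature) - control_mean) / control_sd
```

The returned value can then be compared with whatever number of standard deviations is considered significant for the screen in question, for instance a value in the range of 10 to 15 mentioned above.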
Generally, embodiments of the present invention, and in particular the processes involved in the calculation of the on-target and off-target metrics, their evaluation and characterization of the treatments, employ various processes involving data stored in or transferred through one or more computer systems. Embodiments of the present invention also relate to an apparatus for performing these operations. This apparatus may be specially constructed for the required purposes, or it may be a general-purpose computer selectively activated or reconfigured by a computer program and/or data structure stored in the computer. The processes presented herein are not inherently related to any particular computer or other apparatus. In particular, various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required method steps. A particular structure for a variety of these machines will appear from the description given below.
In addition, embodiments of the present invention relate to computer readable media or computer program products that include program instructions and/or data (including data structures) for performing various computer-implemented operations. Examples of computer-readable media include, but are not limited to, magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media; semiconductor memory devices, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM). The data and program instructions of this invention may also be embodied on a carrier wave or other transport medium. Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
CPU 402 is also coupled to an interface 410 that connects to one or more input/output devices such as video monitors, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, or other well-known input devices such as, of course, other computers. Finally, CPU 402 optionally may be coupled to an external device such as a database or a computer or telecommunications network using an external connection as shown generally at 412. With such a connection, it is contemplated that the CPU might receive information from the network, or might output information to the network in the course of performing the method steps described herein.
Although the above has generally described the present invention according to specific processes and apparatus, the present invention has a much broader range of applicability. In particular, aspects of the present invention are not limited to any particular kind of treatment, cells, cellular process or assay formats and can be applied to virtually any cellular effects where an understanding of the effect of a treatment on a cell is desired. Thus, in some embodiments, the techniques of the present invention could provide information about many different types or groups of cells, substances, cellular processes and mechanisms of action, and genetic processes of all kinds. One of ordinary skill in the art would recognize other variants, modifications and alternatives in light of the foregoing discussion.