The present invention relates to methods and computer programs for the classification of samples which have been subject to a separation of their constituents, for example either by chromatography or electrophoresis, and more particularly to a method for classifying that sample based on the relative amount and constituent profile similarity to a reference sample.
In the manufacturing of biopharmaceuticals such as vaccines, antibodies, recombinant proteins, gene therapy vectors, etc., several chromatographic separation steps are usually needed to remove various contaminants and impurities from the product. During each step of the manufacturing process, there is a need to check both amount and purity compared to reference samples.
However, the separation profiles often display multiple molecule-peak bands which may overlap. The analysis of such complex separation profiles adds significant cost and process time. Furthermore, separation profiles of complex samples can be difficult to analyse accurately which may introduce individual operator bias. Hence, there is a significant interest in fast automated analysis, both to remove personal bias and to reduce the time of manufacturing biopharmaceuticals. Accordingly, there is a need for methods and computer programs for sample comparisons which can be both fast and automated if needed.
One aspect of the invention is to provide a method, which may be implemented by a computer program, for comparison of different samples in a biopharmaceutical process. This is achieved with the introduction of a similarity score based on a two-dimensional analysis of the relative amount and constituent profile similarity to a reference sample. The relative amount is a measurement of the magnitude of different chemical constituents of a sample compared to the magnitude of constituents of a reference sample. The constituent profile similarity calculation is a measurement of similarity of a spatial, or temporal, profile of separated constituents generated by a separation process, in comparison to the profile(s) of one or multiple reference sample(s) which have been subject to the same separation process. The resulting two-dimensional data set forms the basis of classification of samples by providing a score of similarity to each sample which allows an estimate of the similarity of the sample of interest with the reference sample(s). The analysis method can be automated and implemented using computer analysis software together with suitable hardware, after the classification criteria have been set.
One advantage is that such an analysis method allows for a fast, operator-independent classification scheme, in which limits for grouping samples can be easily set for automated analysis. This method allows for decisions to be made, for example whether the manufacturing process is working satisfactorily, or whether separation parameters need to be changed. Additionally, it is common for separations of sample constituents to be incomplete, in other words measured bands of constituents overlap, leading to data which is difficult to analyse. The proposed method allows for such incomplete separation by comparing a measured profile with a reference in the manner described above, rather than looking at certain peaks of measurements only.
Further suitable embodiments of the invention are described in the dependent claims.
As used herein, the terms “comprises,” “comprising,” “containing,” “having” and the like can have the meaning ascribed to them and can mean “includes,” “including,” and the like; “consisting essentially of” or “consists essentially” likewise has the meaning ascribed in patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited is not changed by the presence of more than that which is recited, but excludes prior art embodiments.
In one aspect, the present invention discloses a computer assisted method for automated analysis of samples which have been subject to a separation, which term includes forming into aliquots, fractions or streams of higher concentration of at least some of the samples' constituents. The term separation includes partial separation.
In certain embodiments, the samples may be intermediates or a final product in a biopharmaceutical manufacturing process. In addition, reference samples may either be subjected to a chemical separation and subsequent analysis when the process was created, or saved for a chemical separation analysis at a later stage. It is well known in the art of biopharmaceutical manufacturing how to store reference samples for future analysis. Saved reference sample data may also be used for comparisons.
In some embodiments, the separation is performed by electrophoresis and the separated molecules are detected using colour stains or fluorescent dyes. Lane profiles of the electrophoretic separations are then compared, both in terms of how much sample constituent there is in the lane, and how similar the lane spatial profiles are, in comparison with a reference sample.
In some embodiments, the separation is performed by chromatography and separated molecules are detected using light absorbance measurements. Separation profiles are then compared, both in terms of how much sample was eluted from the chromatography column and how similar the sample profiles (also called chromatograms) are. In this way the reproducibility of different batches in a bioprocess manufacturing process can be quickly assessed and decisions for the continuation of the process can be easily made.
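The two quantities compared in these embodiments can be illustrated with a minimal sketch. It assumes that the relative amount is the ratio of summed (integrated) signals and that the profile similarity is a Pearson correlation between equal-length profiles; the description refers to a correlation calculation but does not fix the formula, so both choices, as well as the function names and the example data, are illustrative assumptions only.

```python
from math import sqrt

def relative_amount(sample, reference):
    """Ratio of the summed signal of a sample to that of the reference.

    Summation stands in for integration, assuming evenly spaced data points.
    """
    return sum(sample) / sum(reference)

def profile_similarity(sample, reference):
    """Pearson correlation between two equal-length profiles (assumed metric)."""
    n = len(sample)
    if n != len(reference):
        raise ValueError("profiles must have the same number of data points")
    mean_s = sum(sample) / n
    mean_r = sum(reference) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(sample, reference))
    var_s = sum((s - mean_s) ** 2 for s in sample)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_s == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a flat profile")
    return cov / sqrt(var_s * var_r)

# Hypothetical profiles: the sample has the same shape as the reference
# but only half the amount.
reference = [0.0, 1.0, 4.0, 1.0, 0.0, 2.0, 0.0]
half_load = [0.5 * v for v in reference]

# One point of the two-dimensional data set: (similarity, relative amount).
point = (profile_similarity(half_load, reference),
         relative_amount(half_load, reference))
```

In this example the sample would sit at a similarity close to 1 and a relative amount of 0.5, i.e. the same constituent profile at a lower loading.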
Different protein samples were analysed using SDS-PAGE electrophoresis. The resulting Coomassie stained gel is shown in
The results of the correlation calculation for each of the ten lanes shown in
As shown in
Thus, depending on which samples have been analysed, different rules for grouping samples may be applied. For example, only group C samples in
A similar technique as described above can be applied to chromatographic separations as shown in
Those skilled in the art would be aware that various analyses could be performed in accordance with the present invention. For example, when considering chromatography examples, additional dimensions in the chromatography analysis may be, for example, one or more of: pH, conductivity and/or pressure. Hence, chromatography run data such as time-related data corresponding to pressure, pH and/or conductivity data can be used for the classification of samples. Such parameters may be used as y-axis data (ordinate data) for a similarity plot, instead of or in addition to a relative amount.
In
Furthermore, in various embodiments, the 2D scatter plot(s) can enable a user to select a region of interest for analysis, remove all data points from an analysis which have reached the maximum limit of the detector and/or use the scatter plot to set limits for grouping samples into different groups. Moreover, various embodiments may also allow a scatter plot to track a protein purification process, for example, by using trend lines or colour gradients.
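Setting limits for grouping samples on such a scatter plot could, for instance, be expressed as simple threshold rules. The sketch below is illustrative only: the threshold values, the group labels and the function name `classify` are hypothetical and are not taken from the description.

```python
def classify(similarity, relative_amount,
             min_similarity=0.95, amount_range=(0.8, 1.2)):
    """Assign a scatter-plot point to a group using rectangular limits.

    All limits and labels are hypothetical placeholders; in practice they
    would be chosen by the user for the process being monitored.
    """
    low, high = amount_range
    if similarity >= min_similarity and low <= relative_amount <= high:
        return "similar"
    if similarity >= min_similarity:
        return "same profile, different amount"
    return "different profile"

# Hypothetical (similarity, relative amount) points for three samples.
points = [(0.99, 1.05), (0.98, 0.40), (0.60, 1.00)]
groups = [classify(sim, amount) for sim, amount in points]
```

Rectangular limits are only one option; any region drawn on the scatter plot could serve as a grouping rule.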
In various embodiments, relating to electrophoresis, the method 100 may include one or more of the following steps:
1a. Selecting a region of interest in an image. For electrophoresis, a user may create a lane box, for example, using a GUI of the type referred to above.
2a. Optionally, saturated regions of the lane profiles may then be excluded from analysis, i.e. regions in which the detector has reached its maximum value. This may be automated or could be user driven.
3a. Correcting for uneven migration. A user may adjust a lane box and lanes to correct for uneven migration across an electrophoresis gel. Optionally, the lane profile scales may be corrected by comparing to marker samples, i.e. samples with known molecular weight, in other lanes, either by the user or automatically.
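The exclusion of saturated regions in step 2a can be sketched as follows. This is a minimal illustration, assuming that a data point is dropped from both profiles whenever either of them has reached the detector maximum at that position; the detector limit of 255 and the function name `drop_saturated` are hypothetical.

```python
def drop_saturated(sample, reference, detector_max):
    """Keep only positions where neither profile has hit the detector maximum."""
    kept = [(s, r) for s, r in zip(sample, reference)
            if s < detector_max and r < detector_max]
    if not kept:
        return [], []
    kept_sample, kept_reference = zip(*kept)
    return list(kept_sample), list(kept_reference)

DETECTOR_MAX = 255  # hypothetical 8-bit detector limit

sample_lane = [3, 40, 255, 255, 60, 5]
reference_lane = [2, 35, 180, 255, 50, 4]

clean_sample, clean_reference = drop_saturated(
    sample_lane, reference_lane, DETECTOR_MAX)
```

Dropping the same positions from both profiles keeps them the same length, so a point-by-point comparison remains possible afterwards.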
In various embodiments, relating to chromatography, the method 100 may include one or more of the following steps:
1b. Selecting a region of interest in a chromatogram, either manually or automatically by way of analysis software. This step is optional; alternatively, a full chromatogram can be analyzed.
2b. Optionally, saturated regions of the chromatograms are excluded from an analysis, i.e. regions in which the detector has reached its maximum value.
3b. Aligning of chromatograms for comparison is undertaken. Chromatogram alignment can be performed either automatically by a software algorithm or by the user. Alignment is typically based on the performed chromatography operations, for example start of a phase, or time at elution of a known reference sample. This step is optional, and in some cases no alignment is needed.
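The alignment in step 3b can be illustrated by shifting each chromatogram's time axis so that a chosen event, such as the start of a phase, falls at time zero. The sketch below assumes the event times are already known for each run; the function names and the example data are hypothetical.

```python
def align_to_event(times, event_time):
    """Return the time axis shifted so that the chosen event occurs at t = 0."""
    return [t - event_time for t in times]

def peak_time(times, signal):
    """Time at which the signal reaches its maximum."""
    peak_index = max(range(len(signal)), key=lambda i: signal[i])
    return times[peak_index]

# Two hypothetical runs in which the elution phase started at different times.
times_a = [0.0, 1.0, 2.0, 3.0, 4.0]
signal_a = [0.0, 0.0, 1.0, 5.0, 1.0]
elution_start_a = 2.0

times_b = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
signal_b = [0.0, 0.0, 0.0, 1.0, 5.0, 1.0]
elution_start_b = 3.0

aligned_a = align_to_event(times_a, elution_start_a)
aligned_b = align_to_event(times_b, elution_start_b)
```

Before alignment the two peaks are one time unit apart; after alignment they occur at the same time relative to the start of elution, so the profiles can be compared point by point over their common window.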
The following steps may then be applied to both electrophoresis analysis and chromatogram analysis:
4. If data-sets have a different number of data points, individual electrophoresis lane profiles or chromatograms may then either be sampled or interpolated to obtain the same number of data points per sample.
5. Analysis of all data points or of a user-defined analysis range may then be performed. For example, in some cases there is only one peak of interest, in which case the analysis range may be adjusted accordingly, either automatically or by the user.
6. The integrated signal of each lane or chromatogram is calculated. Alternatively, the volumes of all detected bands, or peaks, are summed for each lane.
7. In a preferred embodiment, all possible pair-wise comparisons of the N samples are made. This results in N × N arrays, for example one array for relative amounts and one for profile similarity scores. Such arrays may also be used to compare pressure, pH and/or conductivity data-sets for chromatography.
8. A GUI allows a user to select one reference lane or chromatogram, and a 2D scatter plot may then be generated showing relative amounts on the y-axis and profile similarity scores on the x-axis.
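Steps 4 to 8 can be sketched end to end as follows. This is an illustration under stated assumptions rather than the claimed implementation: linear interpolation is used for resampling, simple summation stands in for integration, and a Pearson correlation is assumed as the profile similarity score. All function names and example profiles are hypothetical.

```python
from math import sqrt

def resample(profile, n_points):
    """Linearly interpolate a profile onto n_points evenly spaced positions."""
    if n_points < 2 or len(profile) < 2:
        raise ValueError("need at least two data points")
    last = len(profile) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(profile[lo] * (1 - frac) + profile[hi] * frac)
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed similarity score)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def pairwise(profiles):
    """Build the N x N arrays; entry [i][j] compares sample i against sample j."""
    n = len(profiles)
    amounts = [[sum(profiles[i]) / sum(profiles[j]) for j in range(n)]
               for i in range(n)]
    similarity = [[pearson(profiles[i], profiles[j]) for j in range(n)]
                  for i in range(n)]
    return amounts, similarity

# Hypothetical raw profiles with different numbers of data points (step 4).
raw = [
    [0, 2, 8, 2, 0],
    [0, 1, 4, 1, 0],
    [0, 1, 2, 5, 8, 5, 2, 1, 0],
]
n_points = min(len(p) for p in raw)
resampled = [resample(p, n_points) for p in raw]

# Steps 6 and 7: integrated signals and all pair-wise comparisons.
amounts, similarity = pairwise(resampled)

# Step 8: choose a reference and collect (x, y) points for the scatter plot.
reference_index = 0
scatter = [(similarity[i][reference_index], amounts[i][reference_index])
           for i in range(len(resampled))]
```

Computing the full arrays once means that changing the reference sample only requires reading a different column, which suits an interactive GUI in which the user switches between reference lanes or chromatograms.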
Various embodiments and features of the present invention have thus been described. This written description further uses examples to disclose the invention, including the preferred mode, and is provided to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal languages of the claims. Any patents or patent applications or commercially available products, such as systems or software, mentioned in the text herein are hereby incorporated by reference in their entireties, as if they were individually incorporated, where such is permitted.
Number | Date | Country | Kind |
---|---|---|---|
1914575.4 | Oct 2019 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2020/078119 | 10/7/2020 | WO |