Many areas of industrial production, including pharmaceutical and food production, must deal with structurally heterogeneous products that are nevertheless regarded as a single type of material. Heterogeneity, the property of a material having a range of structures, is a problem in industrial production because it makes the precise nature of the material difficult to monitor.
In a molecule this heterogeneity can take several forms: it can comprise a range of different molecular weights in the case of a homopolymer (e.g. cellulose), varied sequence with the same molecular formula, varied sequence and molecular weight, as well as, in some cases, different degrees or types of substitution or branching.
Typically, the spread of structures can only be monitored overall. This is usually performed by using one or several physical techniques, which must be sensitive to one or more of the variable properties. These techniques measure the molecular weight or size, and report the average of this property for the material under examination.
However, in many industrial processes, the ability to monitor the composition in more detail, for example to provide sequence information or the means of setting acceptance criteria for a particular measure of quality control, would be highly desirable.
A typical example of a heterogeneous product in the pharmaceutical industry is the widely used anticoagulant agent heparin, a linear polysaccharide comprising a mixture of polysaccharide chains with both varied sequences and a spread of molecular weights. Moreover, since it is a natural product extracted (at present) from animal mucosa, it is also subject to variation between individual animals, between regions and even between seasons. Furthermore, it can carry additional structural modifications introduced during the extraction and processing procedures. Heparin consists of 1,4-linked uronate-glucosamine units: the uronate residue is primarily α-L-iduronic acid (α-L-IdoA), but can also be the C-5 epimer β-D-glucuronic acid (β-D-GlcA). The uronic acid can be O-sulfated at position 2, while the α-D-glucosamine (α-D-GlcN) residue can be O-sulfated at positions 6 and 3, the latter being rarer. Furthermore, the glucosamine can have multiple modifications at position 2, being N-sulfated, N-acetylated or a free amine. The most common disaccharide is the tri-sulfated structure of 2-O-sulfated iduronic acid and 6-O-sulfated, N-sulfated glucosamine.
Producers and regulatory authorities share an interest in knowing more about the composition of such materials for several reasons. The first is that it would provide the means by which a better-defined and reproducible product could be produced in the sense of it being more homogeneous. This would also help to provide a reference to which each production run could be compared. Second, more detailed information that can link structure and activity can be provided to the manufacturer. These are of considerable importance in the growing area of biotechnological production, including bio-similar compounds/agents, and generic products in the pharmaceutical industry.
The achievement of these aims is a considerable challenge as it is required to compare products, each of which consists of mixtures of material and whose compositions cannot be defined precisely owing to the difficulties of separation, identification and/or quantification of the components.
Several analytical techniques have been developed in order to analyse heterogeneous molecules, for example NMR analytical techniques.
One is principal component analysis (PCA), which decomposes a matrix of numerical data into a number of model features that, when recombined, closely reproduce the original dataset. This can be used to examine how different heterogeneous samples are related to each other, but it gives no information on the alien features that can be present in one sample when compared to another.
Another is two-dimensional correlation spectroscopy (2D-COS), which is a means of elucidating correlated and uncorrelated changes in perturbed chemical systems; the perturbation may be mechanical or chemical. 2D-COS analysis can be performed on data generated by different forms of spectroscopy: it can be performed on a single dataset, as a perturbed chemical system observed by one form of spectroscopy (homo-correlations), or between spectroscopic data generated by two different forms of spectroscopy for the same system and then correlated together (hetero-correlations).
A development of 2D-COS is two-dimensional correlation spectroscopic filtering (2D-COSf). In 2D-COSf the spectrum of a heterogeneous sample is tested against a library of spectra of verified heterogeneous products (Library 1). Library 1 is used to “filter” the test sample spectrum, removing spectral features consistent with the library of verified heterogeneous products and leaving only alien features, if present. Any feature that remains is considered not to be consistent with Library 1. However, 2D-COSf does not give information on whether the extracted alien features arise from variations due to natural heterogeneity or from unnatural signals. Indeed, since no two samples of a heterogeneous polymer are identical, it is conceivable that spurious signals may be found when a bona fide heterogeneous test sample is analysed using 2D-COSf against a library containing bona fide heterogeneous samples. Thus a pass or fail criterion needs to be set that handles the natural variation within heterogeneous samples.
Therefore the need remains for a method of analysis of heterogeneous samples that is generally applicable and capable of providing an objective test for assessing the conformity of heterogeneous samples to set standards of production.
The present invention provides a method of analysis of heterogeneous products, for example heparin, that can define whether said heterogeneous product is consistent with a library of verified heterogeneous samples by analysing the variation, whether natural or alien, within a set of heterogeneous samples.
In 2D-COSf the spectrum of a heterogeneous product is filtered against a library of spectra of verified heterogeneous products (Library 1) and any feature that is not consistent with Library 1 is considered as alien feature.
The method of the present invention is a new development of 2D-COSf, which makes use of a second set of verified spectra (Library 2) to determine the acceptable variation of the heterogeneous product. Said method is termed “comparative 2D-COS-f” since it compares a filtered test sample with a filtered library of bona fide heterogeneous samples.
In one embodiment the method comprises obtaining one-dimensional complex spectra of a heterogeneous product and applying comparative 2D-COS-f. The one-dimensional complex spectra/chromatographs are, for example, 1H-NMR spectra, mass spectra, infrared spectra, Raman spectra, chromatographs produced by liquid/gas chromatography, near-infrared spectra and UV spectra. If a heterogeneous product tested against Library 1 has features that are greater than features found testing a spectrum from Library 2 against Library 1 the features are considered not to be consistent with those of Library 1.
In a second embodiment the method comprises obtaining one-dimensional complex spectra of a heterogeneous product and applying comparative 2D-COS-f with iterative random sampling (2D-COS-firs). The one-dimensional complex spectra/chromatographs are, for example, 1H-NMR spectra, mass spectra, infrared spectra, Raman spectra, chromatographs produced by liquid/gas chromatography, near-infrared spectra and UV spectra. In each iteration a randomly selected proportion of Library 1 is used together with a randomly selected spectrum from Library 2, while the test sample remains constant. This embodiment provides a measure of the variation within the spectra of the verified products, to which the test sample can be compared. Iterating the random sampling process provides a more accurate and stable extraction of alien/unnatural features.
In another embodiment of the invention, the output of either comparative 2D-COS-f or 2D-COS-firs can be used in further statistical tests, for example principal component analysis, partial least squares or support vector machines, to identify common/known and alien features within the test sample.
Optionally the content of Library 1 and the content of Library 2 can be tested using principal component analysis in order to verify that they are consistent with each other.
In the present specifications, spectrum/spectra is/are defined as any complex one-dimensional datum/dataset.
Through the different embodiments of the invention it is possible to decide whether a test sample is consistent with a library of production norms of heterogeneous products, to determine the acceptance criteria to be considered as normal production for heterogeneous products and to detect species alien to the production norms of heterogeneous products.
The present invention provides a method of analysis of heterogeneous products capable of defining whether a heterogeneous product, for example a natural or bio-manufactured product, is consistent with a library of verified heterogeneous samples by analysing the variation, whether natural or alien, within a set of heterogeneous samples.
Preferably the heterogeneous product is heparin of high, low (MW from 3000 to 7000 Da, preferably from 4000 to 6000 Da) or ultra-low (MW from 1200 to 3000 Da, preferably from 1600 to 2400 Da) molecular weight. In other preferred embodiments the heterogeneous product typically consists of polymer chains that contain consistent levels of subunits (or levels within some range) but in which the sequence of these subunits is variable.
In a first embodiment a one-dimensional complex spectrum of the product to be tested (Test sample) is obtained and it is tested against a library of verified heterogeneous products (Library 1) by use of comparative 2D-COS-f as described hereafter. The one-dimensional complex spectra/chromatographs are, for example, 1H-NMR spectra, mass spectra, infrared spectra, Raman spectra, chromatographs produced by liquid/gas chromatography, near-infrared spectra and UV spectra. Preferably the one-dimensional complex spectrum is a 1H-NMR spectrum.
Before testing the heterogeneous product against Library 1 by use of comparative 2D-COS-f, a second library of verified products (Library 2) is tested against Library 1 by use of comparative 2D-COS-f in order to determine the acceptable variation within Library 1. Both Libraries 1 and 2 comprise bona fide samples of the heterogeneous product.
A suitable Library 1 contains more than 2 spectra, preferably more than 50 spectra.
Library 1 contains a number of spectra greater than the number of spectra of Library 2.
The contents of Library 1, which define the features of the heterogeneous product, and the contents of Library 2, which measure the acceptable variation of the heterogeneous product, comply with the requisite regulations. The consistency of the members of Library 2 with Library 1 can also be confirmed using an explorative statistical technique such as principal component analysis.
Library 1, x(library1), is mean-centred by subtracting the mean spectrum of Library 1 from each of the spectra in Library 1 (i), giving the mean-centred data set $x_{ij} = x(\mathrm{library1})_{ij} - \bar{x}(\mathrm{library1})_{i}$. The covariance matrix of the mean-centred Library 1, COVLIB, is then determined (ii); COVLIB is equal to the outer product matrix of x, scaled to the number of spectra n in the dataset, $\mathrm{COVLIB} = \frac{1}{n-1}\, x x^{T}$. Steps (i) and (ii) are then repeated on Library 1 plus one of the spectra from Library 2, and the covariance matrix COVLIBTEST is obtained (iii). COVLIB is subtracted from COVLIBTEST (iv), giving the difference covariance matrix ΔCOVLIBTEST-LIB = COVLIBTEST − COVLIB.
ΔCOVLIBTEST-LIB is a measure of the acceptable variation within the heterogeneous samples.
The same procedure (iii-iv) is repeated one by one for all the spectra within Library 2 (v): the resulting difference covariance spectra form the acceptance criteria that determine whether or not the Test sample conforms to the library.
Steps (i) and (ii) are then repeated on Library 1 plus the spectrum of the heterogeneous product to be tested (Test sample) obtaining the covariance matrix COVTEST (vi). COVLIB is then subtracted from COVTEST (vii) forming the difference covariance matrix ΔCOVTEST-LIB (COVTEST−COVLIB=ΔCOVTEST-LIB) revealing the features of the Test sample that are not consistent with Library 1.
The following pass-fail criterion is applied in the analysis of the Test sample: if the amplitude of any of the features within ΔCOVTEST-LIB is greater than any of the features within ΔCOVLIBTEST-LIB, then the Test sample fails the test (see scheme 1).
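Steps (i) to (vii) and the pass-fail comparison can be sketched as follows. This is a minimal illustration rather than the claimed implementation. It assumes that the spectra are stored as rows of a NumPy array (so the scaled outer product over spectral points is written `x.T @ x`), and it reads the pass-fail criterion as a comparison of the largest absolute amplitude in ΔCOVTEST-LIB with the largest absolute amplitude found over all Library 2 spectra. The function names are hypothetical.

```python
import numpy as np

def covariance(spectra):
    """Covariance over spectral points; spectra are rows (n_spectra x n_points)."""
    x = spectra - spectra.mean(axis=0)           # step (i): mean-centre
    return x.T @ x / (spectra.shape[0] - 1)      # step (ii): scale by n - 1

def difference_covariance(library1, spectrum):
    """Steps (iii)-(iv) or (vi)-(vii): COV(Library 1 + spectrum) - COVLIB."""
    augmented = np.vstack([library1, spectrum])
    return covariance(augmented) - covariance(library1)

def passes(library1, library2, test_spectrum):
    """Assumed reading of the criterion: compare maximum absolute amplitudes."""
    # Step (v): acceptable variation from every Library 2 spectrum in turn.
    threshold = max(
        np.abs(difference_covariance(library1, s)).max() for s in library2
    )
    test_amplitude = np.abs(difference_covariance(library1, test_spectrum)).max()
    return bool(test_amplitude <= threshold)
```

Note that the mean and the scaling factor are recomputed for the augmented set, because steps (i) and (ii) are repeated on Library 1 plus the added spectrum.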
In a second embodiment two-dimensional correlation spectroscopy filtering with iterative random sampling (2D-COS-firs) is applied.
2D-COS-firs provides a more accurate measure of the variation within the verified spectra (Library 2 against Library 1) and a more accurate extraction of alien/unnatural features from the spectrum of the test sample. It does so by using a randomly selected proportion of Library 1 with one randomly selected spectrum from Library 2 and iterating the procedure while the test sample remains constant.
In this second embodiment ΔCOVLIBTEST-LIB is determined for a randomly selected proportion of Library 1 and one randomly selected spectrum of Library 2 with the steps (i-iv) described for the first embodiment; these steps are repeated j times until the response is stable, where j is greater than 10, preferably j is from 10 to 8000, more preferably from 1000 to 2000. The mean is determined for the j filtered spectra and a measure of the variation is determined at each point along the spectra, for example the 95% confidence interval at each point (the mean value at a point ± 1.96 × the error of the mean at that point). This forms the acceptance criteria that determine whether or not the test spectrum conforms to the library (see scheme 2).
The covariance matrix COVTEST of the same randomly selected portion of Library 1 plus the test spectrum and the ΔCOVTEST-LIB are determined as in the first embodiment (steps vi and vii); these steps are repeated j times, until the response is stable, where j is greater than 10, preferably j is from 10 to 8000, more preferably from 1000 to 2000, determining the mean spectrum of the j repeats.
If the alien variation within the test sample (i.e. the amplitude of the spectrum determined by testing the test sample spectrum against Library 1) is greater than the natural variation of the library measured at each point of the spectra (i.e. the 95% confidence interval at each point), then the test sample is considered not to be consistent with the definition of the heterogeneous samples, Library 1.
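The iterative random sampling of the second embodiment might be sketched as below. Several choices here are assumptions made only for illustration: spectra are stored as rows, the one-dimensional "filtered spectrum" is taken to be the diagonal of the difference covariance matrix, the test sample fails when its mean filtered spectrum exceeds the upper 95% confidence bound at any point, and the function names and the `fraction` parameter are hypothetical.

```python
import numpy as np

def filtered_spectrum(subset, spectrum):
    """Diagonal of COV(subset + spectrum) - COV(subset) (assumed 1-D summary)."""
    def cov_diag(s):
        x = s - s.mean(axis=0)                        # mean-centre
        return (x * x).sum(axis=0) / (s.shape[0] - 1)
    return cov_diag(np.vstack([subset, spectrum])) - cov_diag(subset)

def cos_firs(library1, library2, test_spectrum,
             iterations=1500, fraction=0.8, seed=0):
    rng = np.random.default_rng(seed)
    n = library1.shape[0]
    m = max(2, int(round(fraction * n)))              # size of the random subset
    lib_runs = np.empty((iterations, library1.shape[1]))
    test_runs = np.empty_like(lib_runs)
    for k in range(iterations):
        # Same random portion of Library 1 for the reference and the test sample.
        subset = library1[rng.choice(n, size=m, replace=False)]
        reference = library2[rng.integers(library2.shape[0])]
        lib_runs[k] = filtered_spectrum(subset, reference)
        test_runs[k] = filtered_spectrum(subset, test_spectrum)
    mean = lib_runs.mean(axis=0)
    sem = lib_runs.std(axis=0, ddof=1) / np.sqrt(iterations)
    lower, upper = mean - 1.96 * sem, mean + 1.96 * sem   # 95% CI at each point
    test_mean = test_runs.mean(axis=0)
    conforms = bool(np.all(test_mean <= upper))
    return test_mean, lower, upper, conforms
```

In practice the number of iterations would be raised until the mean and the confidence interval stop changing, in line with the requirement that the response be stable.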
To perform the principal component analysis the spectra are mean-centred and the covariance matrix of the mean-centred set of spectra x is determined ($c = x x^{T}$, where c is equal to the cross product matrix of x). The eigendecomposition (diagonalization) of the covariance matrix is then performed, which forms a new orthonormal coordinate system; the result is $c = T \Lambda T^{T}$, where Λ is a diagonal matrix of eigenvalues and T contains the eigenvectors (loadings). The data set x is then projected onto the new coordinate system by the transformation $S = xT$, where T contains the eigenvectors and S contains the component scores.
In another embodiment of the invention, the output of either comparative 2D-COS-f or 2D-COS-firs can be used in further statistical tests, for example principal component analysis, partial least squares or support vector machines, to identify common/known and alien features within the test sample.
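The decomposition described above can be sketched with NumPy as follows. The sketch assumes that the spectra are stored as rows, so that the cross product matrix over spectral points is written `x.T @ x` and the scores are `x @ T`; the function name and the `n_components` parameter are hypothetical.

```python
import numpy as np

def pca(spectra, n_components=2):
    """PCA by eigendecomposition of the cross product matrix (spectra as rows)."""
    x = spectra - spectra.mean(axis=0)               # mean-centre the spectra
    c = x.T @ x                                      # cross product matrix
    eigenvalues, eigenvectors = np.linalg.eigh(c)    # eigh: c is symmetric
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    loadings = eigenvectors[:, order]                # T: orthonormal eigenvectors
    scores = x @ loadings                            # S = xT
    return scores, loadings, eigenvalues[order]
```

`numpy.linalg.eigh` is used because the cross product matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.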
1H NMR spectrum of porcine intestinal mucosal heparin is obtained. Porcine intestinal mucosal heparin is a heterogeneous carbohydrate, therefore its 1H NMR spectrum contains many overlapping bands (
Principal component analysis can be used to find oddities within a dataset.
Instead of trying to decompose the entire dataset, test sample and bona fide pharmaceutical porcine intestinal mucosal heparin samples, into components, the 57 heparin spectra, which are considered to be an example of the definition of pharmaceutical porcine intestinal mucosal heparin, can be used to “filter” the test sample removing spectral features consistent with bona fide pharmaceutical porcine intestinal mucosal heparin leaving only alien features, when present.
According to the first embodiment of the invention, a test sample is tested (filtered) against Library 1 of bona fide porcine intestinal mucosal heparin, which defines the heterogeneous sample. A bona fide heparin, contained in a second library (Library 2) and not contained within Library 1, is also tested (filtered) against Library 1. This second test illustrates the acceptable variation of the heterogeneous product in question. The test sample filtering by Library 1 is then compared with the bona fide porcine intestinal mucosal heparin sample of Library 2, filtered by the Library 1 as well. If the amplitude of the filtered spectrum of the test sample is greater than the amplitude of the filtered spectrum of the Library 2 filtered bona fide heparin, it is considered to contain features alien or non-consistent to porcine intestinal mucosal heparin. In this example the porcine intestinal mucosal heparin contaminated with 10% (w/w) bovine mucosal heparin failed the test.
According to the second embodiment of the invention, random sampling is used to provide a stricter pass or fail criterion. This analysis requires three data sets: a library of bona fide porcine intestinal mucosal heparin which is considered to be the definition of porcine intestinal mucosal heparin (Library 1, containing 57 spectra in this example), a further library of bona fide porcine intestinal mucosal heparin which serves as a test library to determine the natural variation within porcine intestinal mucosal heparin (Library 2, containing 12 spectra in this example) and finally the test sample. The pass or fail criterion is found by filtering a randomly selected sample from Library 2 by a random selection of Library 1 (the number of samples contained within Library 1, minus one); this is repeated 1500 times and the resultant spectra are averaged to form a spectrum which encompasses the average natural variation within heparin. Here we determined the 95% confidence interval (x±SEx×1.96) and used it as the pass or fail criterion (
The effect of varying the size of Library 1 and the number of iterations used for 2D-COS-firs is illustrated in
In this example principal component analysis is applied after all the sample spectra have been filtered by Library 1, the definition of porcine intestinal mucosal heparin, removing all signals from the spectra that are consistent with features contained within Library 1. As can be seen in
By filtering the generic LMWH test sample against the Lovenox-containing Library 1, all the features within the generic LMWH that are not consistent with Lovenox are revealed.
| Number | Date | Country | Kind |
|---|---|---|---|
| 12168422.9 | May 2012 | EP | regional |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2013/001387 | 5/10/2013 | WO | 00 |