Malignant tissue in the human prostate is known to have a unique type of signature expressed in hydrogen atom magnetic resonance spectroscopic imaging (1H-MRSI) in which a spectrum is obtained for every voxel in a three-dimensional volume encompassing the prostate in an endorectal scan. A voxel is a volume element for which at least one value is available representing a physical property at a corresponding location in the subject of the scan. Currently, interpretation of MRSI data requires a trained physicist who performs a visual inspection of the data. This is somewhat time-consuming, requires expertise, and is subject to inter-observer variability. The expert physicist must currently parse through over one-hundred spectra per scan for one patient to identify the voxels with suspicious spectra indicative of the malignant tissue. Therefore this valuable diagnostic technique is not as widely used as would be beneficial to public health.
Therefore, there is a need for automated classification of 1H-MRSI voxels to draw a radiologist's attention to the most important portions of a scan, whether expert in MRSI signatures or not.
According to one embodiment, a method includes receiving spectra for voxels from a 1H-MRSI scan of a human prostate, segregating the voxels by prostate anatomical zone, providing the spectrum of each voxel in the zone as input to a neural network trained to give expert classification in the zone for a training set, and automatically classifying each voxel based on output from the neural network.
According to another embodiment, a method includes receiving spectra for voxels from a 1H-MRSI scan of a human prostate, segregating the voxels by prostate anatomical zone, determining the amplitude of principal components derived from all spectra in the zone, providing those amplitudes as input to a functional form fit to the expert classification in the zone for the training set, and automatically classifying each voxel based on output from the functional form.
In other embodiments, an apparatus, or logic encoded in one or more tangible media, or instructions encoded on one or more computer-readable media is configured to perform one or more steps of the above methods.
Still other aspects, features, and advantages of the invention are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations, including the best mode contemplated for carrying out the invention. The invention is also capable of other and different embodiments, and its several details can be modified in various obvious respects, all without departing from the spirit and scope of the invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
The embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which:
A method, apparatus, and software are disclosed for classification of MRSI voxels as positive or negative for malignant prostate tissue. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It is apparent, however, to one skilled in the art that the embodiments of the invention may be practiced without these specific details or with an equivalent arrangement. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the embodiments of the invention.
An example embodiment is described in this section. Alternative or more detailed embodiments are described in later sections. In various embodiments, magnetic resonance spectroscopy data undergoes a purely objective principal component analysis or a neural network is developed, or both. In some embodiments, a human anatomy database is incorporated which has been generated by a specialized genitourinary radiologist. In some embodiments, factors specific to an examination, such as lesion volume and degree of metabolic abnormality, data quality, endorectal coil sensitivity profile, periurethral location and zonal location are incorporated. In some embodiments, accuracy is assessed by comparison to assessments by an expert medical spectroscopic physicist. In some embodiments, accuracy is assessed by comparison to whole-mount, step-section pathology.
The illustrated embodiments automate the process of spectral interpretation. Such embodiments could be used to reduce the interpretation time by indicating suspicious regions of a prostate for the physicist/spectroscopist or pathologist to inspect. In the many facilities where a trained physicist is not available, an embodiment could be used to assist the radiologist in interpreting the MRSI data. In some embodiments, the automated process outperforms the spectroscopist and indicates tumor positive voxels with fewer false positives.
Nuclear magnetic resonance (NMR) studies magnetic nuclei by aligning them with an applied constant magnetic field (B0) and perturbing this alignment using an alternating magnetic field (B1), orthogonal to the constant magnetic field. The resulting response to the perturbing magnetic field is the phenomenon that is exploited in magnetic resonance spectroscopy (MRS) and magnetic resonance imaging (MRI).
The elementary particles, neutrons and protons, composing an atomic nucleus, have the intrinsic quantum mechanical property of spin. The overall spin of the nucleus is determined by the spin quantum number I. If the number of both the protons and neutrons in a given isotope are even, then I=0. In other cases, however, the overall spin is non-zero. A non-zero spin is associated with a non-zero magnetic moment, μ, as given by Equation 1a.
μ=γ I (1a)
where the proportionality constant, γ, is the gyromagnetic ratio. It is this magnetic moment that is exploited in NMR. For example, nuclei that have a spin of one-half, like Hydrogen nuclei (1H), a single proton, have two possible spin states (also referred to as up and down, respectively). The energies of these states are the same. Hence the populations of the two states (i.e. number of atoms in the two states) will be approximately equal at thermal equilibrium. If a nucleus is placed in a magnetic field, however, the interaction between the nuclear magnetic moment and the external magnetic field means the two states no longer have the same energy. The energy difference between the two states is given by Equation 1b.
ΔE=γ B0 (1b)
where is Planck's reduced constant. Resonant absorption will occur when electromagnetic radiation of the correct frequency to match this energy difference is applied. The energy of photons of electromagnetic radiation is given by Equation 2.
E=h f (2)
where f is the frequency of the electromagnetic radiation and h=2π . Thus, absorption will occur when the frequency is given by Equation 3.
f=γ B0/(2π) (3)
The NMR frequency f is shifted by the ‘shielding’ effect of the surrounding electrons. In general, this electronic shielding reduces the magnetic field at the nucleus (which is what determines the NMR frequency). As a result, the energy gap is reduced, and the frequency required to achieve resonance is also reduced. This shift of the NMR frequency due to the chemical environment is called the chemical shift, and it explains why NMR is a direct probe of chemical structure. The chemical shift in absolute terms is defined by the frequency of the resonance expressed with reference to a standard compound which is defined to be at 0. The scale is made more manageable by expressing it in parts per million (ppm).
Applying a short electromagnetic pulse in the radio frequency (RF) range to a set of nuclear spins simultaneously excites all the NMR transitions. In terms of the net magnetization vector, this corresponds to tilting the magnetization vector away from its equilibrium position (aligned along the external magnetic field, B0). The out-of-equilibrium magnetization vector precesses about the external magnetic field at the NMR frequency of the spins. This oscillating magnetization induces a current in a nearby pickup coil acting as a radio frequency (RF) receiver, creating an electrical signal oscillating at the NMR frequency. A portion of this time domain signal (intensity vs. time) is known as the free induction decay (FID) and contains the sum of the NMR responses from all the excited spins. In order to obtain the frequency-domain NMR spectrum (intensity vs. frequency) for magnetic resonance spectroscopy (MRS) and MRS imaging (MRSI), this time-domain signal is Fourier transformed, as is well known in the art.
In addition to the spectra obtained from a MRSI scan, a higher spatial resolution magnetic resonance imagery (MRI) image can also be generated without spectral information from the same scan. MRI spatial resolution provides imaging volume elements (voxels) that are much smaller than a typical MRS voxel. For example, an MRSI voxel is approximately three orders of magnitude larger than a high resolution MRI voxel (e.g., an MRSI voxel is on the order of cubic centimeters , cm3, and a MRI voxel is on the order of cubic millimeters, mm3, where 1 mm=10−3 meters and 1 cm=10−2 meters).
The much lower resolution MRSI voxels 120 encompass the prostate gland in the depicted section. Each MRSI voxel 120 is associated with a MRS spectrum of intensities at 512 resolved frequencies. Each MRSI voxel 120 can be associated with one or more of the prostate anatomical zones. For example, MRSI voxels 120a, 120b, 120c, 120d, 120e, 120f, 120g, 120j are associated with the transition zone (TZ) 116; MRSI voxels 120h, 120L, 120m and 120n are associated with the periurethral zone (U) 118; and MRSI voxels 120k and 120o are associated with the peripheral zone (PZ) 114. In some embodiments, at least some MRSI voxels are each associated with a percentage of each of two or more zones.
A data base of MRI and MRSI data and histology sections were collected to train and test models that automatically classify voxels as tumor positive or tumor negative. Different embodiments of the models were developed using different portions of the data set. At its greatest extent, the data base was collected from 18 men with prostate cancer. This group had an age range from 49 to 82 years with a median age of 62 years. This group had a mean biopsy Gleason grade of 7. To be included in the data set, the cancer patient had to have at least one MRSI voxel that was rated by a physicist/spectroscopist as suspicious of including a tumor, at least one lesion indicating a tumor on a whole-mount histopathological map, and no prior hormonal or radiation treatment.
The 3D 1H-MRSI examinations were performed on a 1.5-Tesla whole-body unit (Signa Horizon from GE Medical Systems of General Electric Healthcare of Waukesha, Wis.) with an endorectal coil (from Medrad of Pittsburgh, Pa.) and PROSE acquisition package (GE Medical Systems) in a location prescribed by T2-weighted fast spin-echo images {4400/(effective) 102; echo train length, 12; section thickness, 3 millimeters (mm); intersection gap, 0 mm; field of view, 14 centimeters (cm); matrix, 256×192}. The spectroscopic acquisition parameters were as follows: PRESS voxel selection, 1000/130 milliseconds (m, 1 ms=10−3 seconds) [TR/TE]; one average; spectral width, 1250 Hertz (Hz, 1 Hz=1 cycle per second); number of points, 512; field of view, 11 cm×5.5 cm×5.5 cm; and 16×8×8 phase encoding steps. Each MRSI voxel represented a patient volume of 0.69 cm×0.69 cm×0.34 cm. Spectral data were automatically processed by Functool software (GE Medical Systems). Some or all of the data in GE format within the range 4.3 ppm to 0.4 ppm (256 points) were used; and real (magnitude) spectra were exported for further analysis.
In each patient, all voxels within the PRESS excitation volume were labeled as healthy or suspicious by an experienced spectroscopist according to established decision rules based on the resonances of total choline (CHO) at 3.2 ppm, creatine/phosphocreatine (CR) at 3.0 ppm, polyamines (PA) at 3.1 ppm and citrate (CIT) at 2.6 ppm.
Although a particular set of modules and data structures are shown in
The spectra conditioning module 304 is configured to perform any preprocessing on MRSI spectra that is considered desirable, such as time series padding, frequency bin averaging or correcting amplitudes for windowing performed for the Fourier transforms. Any conditioning of MRSI spectra known in the art may be performed by module 304, such as the Functool software identified above. Similarly, the imagery conditioning module 314 is configured to perform any conditioning of MRI images known in the art.
The spectra alignment module 306 is configured to align the frequency bins for all spectra in the input scan so that peaks can be properly characterized by the principal components or properly input to the neural network or both.
The segmentation module 316, segments the voxels from the high spatial resolution MRI images derived from the scan into one or more zones. Other data derivable from the imagery and used in OSC filtering, if any, are also determined in the module 316. The segmentation is based at least in part on definitions of the zones and OSC parameters of interest as determined by an expert. In some embodiments, manual input for segmentation is included in segmentation module 316. In some embodiments, this information is derived beforehand, as described in more detail below with reference to
The zonal separation module 308 is configured to select the spectra for voxels in a current one of the one or more zones, based on the zone mapping in data structure 318. This allows the spectra to be analyzed with principal components and neural networks tailored to that particular zone. In some embodiments, all spectra to be analyzed are in one zone, and the other zones, if any, merely indicate voxels in which the data is not suitable for classifying suspicion of malignant tissue; and therefore not subject to either principal component analysis or neural network processing. In some embodiments, the zonal separation module 308 simply labels a voxel with membership in one or more zones.
In the illustrated embodiment, the zonal separation module 308 is further configured to determine one or more values for corresponding one or more exam-specific factors based on the spectra in one or more zones and to store those values in the exam-specific factors data structure 328.
The OSC module 321 is configured to perform orthogonal signal correction (OSC) filtering, which effectively removes information unrelated to the separation of classes. For example, in some embodiments, the OSC module is further configured to indicate which principal components do not need amplitudes determined in order to classify a voxel as suspicious or not or tumor positive or not. In some embodiments (not shown), the OSC module 321 is further configured to consider one or more values in the exam-specific factors data structure 328.
The principal components module 320 is configured to determine the amplitudes in a current spectrum of the principal components predefined and stored in data structure 322. The data structure 322 is depicted as layered to indicate different principal components for different zones. The principal components are derived beforehand based on a training set, as described in more detail below with reference to
The output of principal components module 323 is a set of voxels classified as suspicious (or otherwise tumor positive) for representing malignant tissue; and the output is provided on output port 330a.
The neural network module 324 is configured to accept values for a predefined set of neural network input nodes based on amplitude values of a spectrum from module 308 for each voxel in the zone, and zero or more exam-specific factors from data structure 328, such as zone associated with the voxel. The neural network module 324 then classifies each voxel using predefined neural network weights among predefined layers of neural network nodes. The neural network nodes, layers and weights are derived beforehand, as described in more detail below with reference to
The output of neural network module 324 is a set of voxels classified as suspicious (or otherwise tumor positive) for representing malignant tissue; and the output is provided on output port 330b.
In some embodiments, either principal components module 323 or neural network module 324, and corresponding data structures 322 and 326, respectively, is omitted, and system 300 performs a single classification.
In step 402, a training set of NMR scans of prostates is received. Any method may be used to receive this data. For example, in various embodiments, the data is included as a default value in software instructions, is received as manual input from a network administrator on the local or a remote node, is retrieved from a local file or database, or is sent from a different node on a network, either in response to a query or unsolicited, or the data is received using some combination of these methods. In various embodiments, MRI and MRSI voxels from 10 or more of the 18 patients in the data base described above are used to produce the training set. In various embodiments, 70 percent of the thousands of MRSI voxels from a portion of the data base are used in a training set, 15 percent of the MRSI voxels are used in a validation set during formation of the models, and 15 percent of the MRSI voxels are used in a test set that is not used during formation of the models.
Step 402 includes any conditioning of images and spectra, e.g. by modules 304 or 314 or both. For example, in some embodiments, conditioning includes processing spectral data with commercially available software, well known in the art, such as free software 3DiCSI v1.9.11 (available in directory 3dicsi.html from public Internet domain mrs.cpmc.columbia in class edu). Using this software, the MRSI data were spatial zero filled to a 16×8×16 matrix and zero filled in the spectral dimension to 1024 points. The time-spectral dimension was apodized with a 4-Hz Gaussian function. The spectra were aligned and referenced to the water peak at 4.7 ppm or some other peak or not aligned at all in various embodiments. Magnitude spectra in a desired frequency shift range (e.g., 3.6 ppm to 0.6 ppm in some embodiments and 4.3 ppm to 0.4 ppm in some embodiments) were exported to achieve better and reproducible results in the subsequent modeling, as well as to fully automate and simplify preprocessing. In other example embodiments, spectral data were automatically processed by Functool software (GE Medical Systems). The data in GE format within the desired frequency shift range were used and magnitude spectra were exported for further analysis.
In step 404, reference data is received, in any manner as described above. The reference data indicates voxels of the training set associated with disease, e.g., a malignant tissue of the prostate gland. In some embodiments, the reference data is based on conclusions of an expert radiologist. In some embodiments, the reference data is based on post operative histology for the same tissues that had been imaged pre-operatively in the training set of scans. For example, all patients whose pre-operative scans are used to generate the training set subsequently undergo radical prostatectomy with whole-mount step section pathology. This “gold standard” information is made available to form or improve the system 300. Information on tumor location and size from the pathology analysis is incorporated into the training set to improve its discriminatory power. A very large training data set is available at Memorial Sloan Kettering Cancer Center (MSKCC) because of the large volume of patients who undergo endorectal MRI/MRSI of the prostate.
In step 406, the voxels in each scan of the training set are divided into zones of anatomical or analytical significance. In some embodiments, step 406 may be repeated several times until it is understood what are appropriate zones and OSC filtering values, based on results obtained during step 438, described below. In some embodiments, step 406 is performed, at least initially, based on a priori knowledge of reasonable zone definitions, e.g., based on the scientific literature. Human input and intervention is expected, especially initially, during step 406. The decisions on how to define zones for automated segmentation are captured as segmentation rules and parameters in zone definitions and OSC data structure 317. In an illustrated embodiment, the zone definitions include rules for segmenting anatomical portions of the prostate using any method known in the art. Identifiers for one or more of the OSC filtering properties, such as MRSI lesion volume and degree of metabolic abnormality, data quality, endorectal coil sensitivity profile, periurethral location and zonal location are also included in the data structure 317. All scans in the training set, as well as the voxels selected from the data base for the validation set or test set, are segmented during step 406.
For example, according to some embodiments, a zone excludes voxels that indicate a urethra within the prostate. According to some embodiments, the zone excludes voxels with certain artifacts, such as those recognized to include contamination by lipids. According to some embodiments, the zone excludes voxels with low data quality, such as low signal to noise ratio.
The steps 408 through 436, described below, are repeated for each zone for which voxels are to be classified. In some embodiments, voxels in one or more zones, e.g., a zone outside the prostate gland, are not to be classified.
In step 408, the frequency axes of all the NMR spectra in one zone are aligned. This alignment is described in more detail below. In some embodiments, alignment is based on the location of the suppressed water peak. In other embodiments, the suppressed water peak is considered too variable because of the suppression techniques, and the axes are aligned using some other peak, such as the CIT peak, or other feature of the spectra.
In step 410, non-diagnostic spectra are eliminated. For example, spectra with artifacts or low signal to noise ratio (SNR) are eliminated. In some embodiments, the non-diagnostic spectra are already eliminated by virtue of the zone segmentation, and step 410 is omitted
In step 412, values for the exam-specific factors are determined for the current scan.
In step 414, it is determined whether there is another scan of the training set with voxels in the current zone. If so, control passes back to step 408 to align the frequency axes of the spectra in the current zone and the next scan. If not then control passes to step 416.
In step 416, the principal components are determined for all diagnostic spectra in all scans in the current zone. The determination of principal components of arbitrary data series is well known in the art, and any known method may be used. As a result, the definitions of principal components are stored in data structure 322. In some embodiments the principal components are Gaussian peaks centered on the known resonances for choline (CHO), creatine/phosphocreatine (CR), polyamines (PA) and citrate (CIT) among others.
The data structure 322 is depicted as layered to indicate different principal components for different zones. Typically, the definition includes for each principal component, also known as an eigenfunction, a relative value at each frequency value. The principal components have the property that they are orthogonal to each other. Each principal component has associated a value, also known as an eigenvalue, that is proportional to the percent of the total variance accounted for by magnitude changes of that principal component. Thus the principal components can be ranked by eigenvalues, importance increasing with eigenvalue. In some embodiments, the ranks, or eigenvalues, are included in the data structure 322.
The principal components need not be simple peaks, but can be more complicated. In some embodiments, the principal components are determined using empirical orthogonal functions of arbitrary shape determined by the training set itself.
In step 420, depicted in
In step 424, the uncorrelated principal components are eliminated from a set for which amplitudes are to be included as input to the model, such as a polynomial or other functional fit or the neural network, for the current zone. The other principal components are considered the relevant principal components for which amplitudes are to be included as input to a functional form or a neural network for the current zone.
Partial least squares (PLS) are used to convert principal component amplitudes to a classification output using a functional form. For example, in some embodiments, multivariate analysis was performed using SIMCA-P software v.11.5 from Umetrics of Sweden.
In various embodiments, three different approaches to variable centering and auto scaling were compared (centered and scaled to Unit Variance (UV); centered but not scaled (Ctr); no centering or scaling (None)). The OSC algorithm was used to remove unwanted variation in the spectra that was irrelevant for the classification. Five orthogonal components were removed that removed over 70% of variation that did not contribute to discrimination.
In an example embodiment, using 10 patients and 2740 voxels, the principal components were determined to be the frequency shift bins between 3.4 ppm and 2.41 ppm. The overall predictive power (Q2) calculated by cross-validation was greater than 80.4% (a success rate dominated by the large number of healthy, tumor negative voxels). Cross-validation was performed by dividing the data set (for the ten patients) into seven parts, and for each run one seventh were left out as a test set and the other six sevenths constituted a training set used to generate the model. The model is used to classify the voxels in the test set. The process is repeated six more time, using a different one seventh as a test set. The model classifications are compared to the expert classifications; and the sum of the squared errors (called the Predicted Residual Sum of Squares, PRESS) calculated for the whole data set. The PRESS values is divided by an initial sum of squares and subtracted from 1 to give the value Q2. The lower the PRESS value, and thus the higher the Q2 value, the better the model is.
The functional form for the tumor positive output (P1) is represented by the loading plot, which depicts the weights for each frequency in the frequency range.
In step 426, a neural network is constructed with an input layer that includes a node (also called a neuron in this art) for an amplitude for each frequency bin in a spectrum for one voxel, and an input node for a value of each exam-specific factor, if any, found correlated to disease classified voxels in a scan. The output layer includes two nodes, one for degree of suspicion that the voxel represents diseased tissue, e.g., malignant prostate tissue, and a second node for degree to which the voxel appears to represent tissue that is clear of the disease. The model also contains one or more hidden layers, each with an intermediate number of nodes. For example, in one embodiment described below, a neural network is constructed with 133 input layer nodes, 45 nodes in one hidden layer and two output layer nodes. As is well known in the art, the values in one layer of a neural network are based on the values in the previous layer and a set of weights connecting each node in one layer to each node in the previous layer. The weighted sum of the inputs is typically multiplied by a sigmoid function that levels off at large negative values and large positive values for the sum. In a preferred embodiment, there is one input node for each of the 256 frequency shifts in the range from 4.3 ppm to 0.4 ppm, six nodes in a hidden layer, and two nodes in the output layer. In other embodiments, other neural network structures in terms of number of layers and number of nodes per layer, are employed.
There is a weight associated with each connection between every node in the input layer and each node in the hidden layer, as indicted by the weights 912 in
The neural network model is not complete until the weights are determined for all connections between nodes in adjacent layers. The weights are determined based on the training set for which the output layer values are known, e.g., (1,0) for voxels marked suspicious (or otherwise tumor positive) in the reference data and (0,1) for voxels marked non-suspicious (or otherwise tumor negative) in the reference data. This process of setting the weights is called training the neural network, and several methods for training are well known in the art, including back propagation. In some embodiments, multilayer perceptron networks (MLP) were implemented using MATLAB's Neural Network Toolbox (Mathworks; Natick, Mass.), including both determining the weights for the connected nodes of the neural network and operating the neural network thus formed. In some embodiments, the ANN was implemented using Statistica Neural Networks from Statsoft of Tulsa, Okla.
In step 428, the spectrum for the next voxel in the training set for the current zone is selected. In step 430, the model to be used (e.g., either the neural network or the functional form for principal component amplitudes, or both) is incrementally trained based on the known inputs and the desired output for that voxel. The desired output is the classification provided in the reference data for this voxel, either a label by the spectroscopist or based on the histology by the pathologist. The known inputs are the amplitudes of the relevant principal components and the values of the exam-specific factors, if any, retained as input for the least squares fit of a functional form. The known inputs are the amplitudes of the spectral frequency bins and the values of the exam-specific factors, if any, retained as input for the neural network. Note that the exam-specific values are constant for all voxels in one scan in the current zone, but may change as voxels are selected from a different scan or zone. The training is incremented based on this voxel, but the model is not completely trained until all training set voxels in the zone are processed. In step 432, it is determined whether there is another spectrum, i.e., whether there is another MRSI voxel in the zone for any of the scans of the training set. If so, control passes back to step 428 and 430 to continue training the model. If not, then the training is finished, and control passes to step 434.
In step 434 the model weights for the zone are stored in data structure 437. The data structure 437 is depicted as layered to indicate different weights for different zones and different models. For principal components, the functional form and coefficients for the zone are stored in principal components data structure 322 represented by data structure 437. For neural networks, the network structure (layers and nodes) and weights for the zone are stored in neural network weights data structure 326 represented by data structure 437.
In step 436, it is determined whether there is another zone for which principal components or neural network weight, or both, are to be determined. If so, control passes back to step 408. Otherwise the classification model is complete.
If the model is complete, then in step 438 the model performance is evaluated with one or more scans that have reference data with correct classification but which have been excluded from the training set used to develop the model. Step 438 includes statistical analysis, as described below, in some embodiments. In some embodiments, a change to the model or data or definitions of zones, alone or in some combination is determined during step 438; and at least a portion of process 400 is repeated. For example, steps 406 and following steps are repeated if zone definitions are changed; while steps 426 and following are repeated if it is determined to change the number of nodes or layers in the neural network.
Thus using the steps in method 400, a system 300 is developed to classify voxels of MSRI data. Reasonably successful classification is obtained using a single zone, as described below, as well as with four prostate anatomical zones (O, U, PZ and TZ).
Some embodiments employ principal component models, alone or in combination with an artificial neural network.
In step 611, MRSI voxels are segregated by prostate anatomical zone. In some embodiments, voxels are determined to be inside the prostate or outside the prostate, i.e., in one of two zones, e.g., by comparison to manually input boundaries or automatically determined boundaries based on high resolution MRI images. In some embodiments, voxels are determined to be in one of multiple zones inside the prostate or outside the prostate, e.g., by comparison to manually input boundaries (as depicted in
In some embodiments, step 611 comprises multiple steps.
In step 653 the MRSI image data is conditioned, e.g. to determine the voxel locations relative to a MRI image, including aligning or calibrating data from one or more coils in the MRI device. Any image conditioning algorithm may be used. In step 655, MRI image data is conditioned, including aligning or calibrating data from one or more coils in the MRI device.
In step 657 the conditioned MRI image data is segmented based on the expertly segmented MRI images. For example, in some embodiments, the segment boundaries from one or more historical scans are warped to fit the current scan. In some embodiments, spectral classes from historical data are used to determine voxels inside prostate from voxels outside prostate based on data in the zone definitions data structure.
In step 659 the high resolution MRI voxels that are in each prostate anatomical zone are determined, e.g., by comparison of voxel locations to one or more boundary locations. In step 661, the low resolution MRSI voxels are associated with the MRI voxels in each prostate anatomical zone. For example, it is determined in step 551 that MRSI voxel 10 in
Returning to
In step 617 the spectrum for the voxel is determined. Step 617 includes any conditioning of the spectra desired for comparison to other spectra, such as padding, windowing, re-coloring, calibrating and other conditioning described above with reference to step 402 and modules 304 and 314.
Step 617 also includes spectral alignment. In some embodiments the spectra are aligned to a water peak. In some embodiments, the spectra are not aligned or are aligned to a different peak because the water peak is suppressed, which can lead to migration of the maximum value off the true water resonance. Also alignment is rendered less effective in magnitude spectra because of increased peak widths, so, in some embodiments, spectral alignment is omitted.
In step 619 it is determined whether the spectrum passes the OSC filter, e.g., OSC module 132, which effectively removes information unrelated to the separation of classes, such as principal components that do not need amplitudes determined, threshold values for one or more values in the exam-specific factors data structure 328, or voxels with spectra having signal to noise ratio below a threshold value, such as 10.
If a voxel does not pass the OSC filter, then control passes back to step 613 to determine the next voxel to make the current voxel. If a voxel does pass the OSC filter, then, in step 621, the amplitudes in the current voxel are determined for the principal components of the training set of voxel spectra. For example, the amplitude of the most important peaks in the spectrum of the current voxel are determined in step 621.
In step 623, the amplitudes of at least the most important principal components are input to the functional form, e.g., inputs 580 are input to the functional form represented by block 582, such as loading plot 596. In some embodiments, the amplitudes are input to a neural network instead of a functional form during step 623.
In step 625, the classification for the voxel is determined based on the output from the functional form. For example, a voxel for which an output is near 1.0 is classified as tumor positive; while, a voxel for which an output is near 0.0 is classified as tumor negative.
In step 627, it is determined if there is another voxel in the set to be classified. If so, control passes back to step 613 to select the next voxel as the current voxel. Otherwise, in step 629, data is presented that indicates the voxels classified as suspicious of a tumor or otherwise tumor-positive. For example, an MRI image is presented with a MRSI voxel outline for each voxel classified as tumor positive, as in image 220 depicted in
In some embodiments, the classification accuracy for the model was computed as the ratio of the number of spectra predicted correctly to the total number of spectra in the test set. Such values were provided by the SIMCA-P software in the variable YPredPS. YPredPS is the Y value predicted by the model based upon the X block variables (resonance intensities at given ppm). A YPredPS value close to 1 indicates that the object is likely to belong to the class. (e.g., tumor positive) A YPredPS value close to 0 indicates that the object is unlikely to belong to the class (e.g., tumor positive)
In the computed models after OSC filtering, the first PLS component explained greater than 82.1% of the variation in the spectra between healthy and tumor voxels. The overall predictive power of the training set calculated by cross-validation was greater than 80.4%. Using the models generated by the training set, the spectra in the test set were correctly predicted greater than 81% of the time, dominated by the tumor-negative voxels. The results were dependent on the choice of training datasets and peak position variation. Frequency shifts with Variable Importance in the Projection (VIP) values larger than 1 are the most relevant for explaining differences between classes of spectra. The most important variable in differentiating tumor and healthy voxels was the CHO amplitude at 3.2 ppm. CR, PA and CIT amplitudes also had VIP values greater than one and thus were important for differentiating cancer voxels The best results for a test set of voxels and principal components were obtained by not applying scaled or centered spectra during least squares fitting of he classification values. The modeling results presented here show that the multivariate principal component amplitudes, as PLS method with OSC, works well with the tested data sets and could help to automatically distinguish the tumor-suspicious voxels. An important advantage of this method is the much shorter time of analysis compared to visual inspection and the possibility of broad implementation in cancer centers not employing experienced spectroscopists.
Some embodiments employ artificial neural network ANN models, alone or in combination with principal components. These embodiments show higher percentages of correct classification of tumor positive voxels in the MRSI data. For the artificial neural networks the prostate voxels data base was randomly divided, 70% in a training set, 15% in a validation set, and 15% in a test set.
In one embodiment of the ANN, the input layer 910 included 133 nodes 904 corresponding to all frequency shift bins from 3.4 ppm to 2.4 ppm—the same frequency range used for the principal components model. In this embodiment, the hidden layer 920 included 25 nodes. The two output layer 930 classification nodes were trained based on spectroscopist assessments, and correspond to suspicious for tumor (a type of tumor positive) and non-suspicious (a type of tumor-negative).
The ANN models are used as depicted in
In step 1113, the next voxel to be classified is selected from the voxels to be processed. For example, voxel 1 from
In step 1117 the spectrum for the voxel is determined. Step 1117 includes any conditioning of the spectra desired for comparison to other spectra, such as padding, windowing, re-coloring, calibrating and other conditioning described above with reference to step 402 and modules 304 and 314. In some embodiments, spectra are aligned as described above for step 617; however, in many example embodiments of ANNs, step 1111 excludes spectral alignment.
In step 1119 it is determined whether the spectrum passes the OSC filter, e.g., OSC module 132, which effectively removes information unrelated to the separation of classes, such as frequency shifts that do not need amplitudes determined, threshold values for one or more values in the exam-specific factors data structure 328, or voxels with spectra having signal to noise ratio below a threshold value, such as 10. If a voxel does not pass the OSC filter, then control passes back to step 1113 to determine the next voxel to make the current voxel. If a voxel does pass the OSC filter, then control passes to step 1121. In some embodiments, a SNR threshold is used as described above for step 619; however, in many example embodiments of ANNs, no OSC is performed and step 1119 is omitted.
In step 1121, the amplitudes of multiple spectral frequencies in the current voxel are determined as input to the ANN. In step 623, the amplitudes of at least the most important frequency shifts are input to the ANN, e.g., inputs 902 are input to the input layer 910 nodes 904. In step 1125, the classification for the voxel is determined based on the output from the ANN. For example, a voxel for which a first output layer 930 node holds data with a value near 1.0 and a second output node holds data with a value near 0.0 is classified as tumor positive. Similarly, a voxel for which the first output layer 930 node holds data with a value near 0.0 and the second output layer node holds data with a value near 1.0 is classified as tumor negative.
In step 1127, it is determined if there is another voxel in the set to be classified. If so, control passes back to step 1113 to select the next voxel as the current voxel. Otherwise, in step 1129, data is presented that indicates the voxels classified as suspicious of a tumor or otherwise tumor-positive. For example, an MRI image is presented with a MRSI voxel outline for each voxel classified as tumor positive, such as image 220 of
In one ANN embodiment, described above (called ANN1, hereinafter), the ANN was trained and resulting classifications evaluated based on the tumor suspicious voxels identified by the spectroscopist. ANN1 Includes 133 nodes in the input layer, 25 nodes in the hidden layer and two nodes in the output layer. The inputs to the input layer are 133 frequency shift amplitudes from the voxel spectra from 3.4 to 2.4 ppm. The results are presented in Table 1 by combining the classifications of the training set, the validation set and the test set. While a significant number of the ANN results were not consistent with the suspicious voxels, because of the small number of suspicious voxels, the overall agreement is quite high (over 97%).
In another two embodiments, the data set includes voxels from ten patients (age range from 36 to 68 years, median age 54 years). The data set was randomly divided into a training set (70%), validation set (15%) and test set (15%). The data set includes 2903 voxels. In these ANN embodiments (called ANN2 and ANN3 hereinafter), the ANNs were trained and resulting classifications evaluated based on the tumor positive voxels identified by the pathologist. ANN2 Includes 256 nodes in the input layer, 8 nodes in the hidden layer and two nodes in the output layer. The inputs to the input layer are 256 frequency shift amplitudes from the voxel spectra from 4.3 ppm to 0.4 ppm. ANN3 Includes 260 nodes in the input layer, 15 nodes in the hidden layer and two nodes in the output layer. The input to the input layer are 256 frequency shift amplitudes from the voxel spectra from 4.3 ppm to 0.4 ppm, as in ANN2, plus four nodes representing the percentage of the four prostate anatomical zones (O, U, PZ, TZ) in the voxel.
After training and validating, the results on the test sets are presented in Table 2 and table 3 for ANN2 and ANN3, respectively, by combining the classifications of the training set, the validation set and the test set.
ANN2 showed an overall correct classification rate at 99%; and ANN3 showed a slightly higher overall correct classification rate at 99.45%. Of greater interest is the correctness of classification of tumor voxels for which ANN3 with the segmentation information showed a correct classification rate of 91.9%—much higher than the correct classification rate of 76.7% by ANN2 relying on spectra alone. Nine false positive voxels were identified by both ANN2 and ANN3.
Tables 2 and 3 demonstrate fewer missed tumor voxels by ANN models compared to visual analysis by an experienced spectroscopist. As expected, both ANN models perform better than the visual analysis, because only true positive voxels confirmed by histopathology were used to train ANN, while this information is not available to the spectroscopist. This suggests a protocol for use of such ANN models in which voxels identified by ANN as tumor, but labeled as healthy by a spectroscopist, should be localized on histopathological maps to check whether they were missed by the spectroscopist.
In other embodiments, a different number of nodes are used in the ANN. These embodiments of ANN are based on the complete data set of 18 patients. In the following example ANN embodiments, the data set includes 5308 voxels within the PRESS excitation volume, of which 149 voxels are marked suspicious by a physicist/spectroscopist. Pathologist classification based on histology sections reveal that 101 of these 149 voxels are true positives that actually include a lesion, while 48 voxels were false positives by the spectroscopist. These ANN are trained to classify voxels that correspond to lesions (true positives) as tumor positive, rather than voxels marked suspicious by the spectroscopist.
Six sets of additional ANN models were trained on the same data set and tested on the 5308 voxels in the training set. Three of these model sets include 256 nodes in the input layer 910, but use different numbers of nodes in the hidden layer 920, either 4 or 5 or 6. The 256 input values for the input nodes are the amplitudes of the 256 frequency shift bins from 4.3 ppm to 0.4 ppm. The model sets are designated Set256-4, Set256-5 and Set256-6, respectively. The remaining three sets of these models include 260 nodes in the input layer 910, and use the same three numbers of nodes in the hidden layer 920, either 4 or 5 or 6. The 260 input values for the input nodes are the amplitudes of the 256 frequency shift bins from 4.3 ppm to 0.4 ppm and the percentages of the voxel in the four prostate anatomical zones. The model sets are designated Set260-4, Set260-5 and Set260-6, respectively. Thus each set represents a different arrangement (architecture) of ANN nodes. For each set of models, six models were trained—each using a different randomly selected 70% of the voxels in the full data set.
The classification summaries for these sets of ANN models are plotted in the graphs of
In a particular embodiment of the ANN model, the number of nodes in the hidden layer is 6 with 256 nodes in the input layer for input of spectra only, without anatomical zones, as in Set256-6. This particular embodiment performs better than the mean in
The weights for this particular embodiment are given in Table 4a and Table 4b for the connections between each of the 256 input nodes to each of the six hidden nodes; and in Table 5 for the connections between each of the 6 hidden layer nodes to each of the two output nodes. Table 6. Indicates the bias, a factor added to the weighted sums for each hidden layer node and output layer node, as is well known in the art.
It is anticipated that in some embodiments, the sensitivity of cancer detection by ANN models are improved by fusing MRI images with histo-pathological maps and using the precise locations of the tumor from the maps to inform the ANN about the locations of missed tumor voxels. It is also expected that higher accuracy can be attained in other embodiments by increasing the number of cases and to test and re-train the ANN models as more data becomes available.
The processes and modules described herein may be implemented via software, hardware (e.g., general processor, Digital Signal Processing (DSP) chip, an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Arrays (FPGAs), etc.), firmware or a combination thereof. Such example hardware for performing the described functions is detailed below.
A bus 1210 includes one or more parallel conductors of information so that information is transferred quickly among devices coupled to the bus 1210. One or more processors 1202 for processing information are coupled with the bus 1210.
A processor 1202 performs a set of operations on information. The set of operations include bringing information in from the bus 1210 and placing information on the bus 1210. The set of operations also typically include comparing two or more units of information, shifting positions of units of information, and combining two or more units of information, such as by addition or multiplication or logical operations like OR, exclusive OR (XOR), and AND. Each operation of the set of operations that can be performed by the processor is represented to the processor by information called instructions, such as an operation code of one or more digits. A sequence of operations to be executed by the processor 1202, such as a sequence of operation codes, constitute processor instructions, also called computer system instructions or, simply, computer instructions. Processors may be implemented as mechanical, electrical, magnetic, optical, chemical or quantum components, among others, alone or in combination.
Computer system 1200 also includes a memory 1204 coupled to bus 1210. The memory 1204, such as a random access memory (RAM) or other dynamic storage device, stores information including processor instructions. Dynamic memory allows information stored therein to be changed by the computer system 1200. RAM allows a unit of information stored at a location called a memory address to be stored and retrieved independently of information at neighboring addresses. The memory 1204 is also used by the processor 1202 to store temporary values during execution of processor instructions. The computer system 1200 also includes a read only memory (ROM) 1206 or other static storage device coupled to the bus 1210 for storing static information, including instructions, that is not changed by the computer system 1200. Some memory is composed of volatile storage that loses the information stored thereon when power is lost. Also coupled to bus 1210 is a non-volatile (persistent) storage device 1208, such as a magnetic disk, optical disk or flash card, for storing information, including instructions, that persists even when the computer system 1200 is turned off or otherwise loses power.
Information, including instructions, is provided to the bus 1210 for use by the processor from an external input device 1212, such as a keyboard containing alphanumeric keys operated by a human user, or a sensor. A sensor detects conditions in its vicinity and transforms those detections into physical expression compatible with the measurable phenomenon used to represent information in computer system 1200. Other external devices coupled to bus 1210, used primarily for interacting with humans, include a display device 1214, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), or plasma screen or printer for presenting text or images, and a pointing device 1216, such as a mouse or a trackball or cursor direction keys, or motion sensor, for controlling a position of a small cursor image presented on the display 1214 and issuing commands associated with graphical elements presented on the display 1214. In some embodiments, for example, in embodiments in which the computer system 1200 performs all functions automatically without human input, one or more of external input device 1212, display device 1214 and pointing device 1216 is omitted.
In the illustrated embodiment, special purpose hardware, such as an application specific integrated circuit (ASIC) 1220, is coupled to bus 1210. The special purpose hardware is configured to perform operations not performed by processor 1202 quickly enough for special purposes. Examples of application specific ICs include graphics accelerator cards for generating images for display 1214, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.
Computer system 1200 also includes one or more instances of a communications interface 1270 coupled to bus 1210. Communication interface 1270 provides a one-way or two-way communication coupling to a variety of external devices that operate with their own processors, such as printers, scanners and external disks. In general the coupling is with a network link 1278 that is connected to a local network 1280 to which a variety of external devices with their own processors are connected. For example, communication interface 1270 may be a parallel port or a serial port or a universal serial bus (USB) port on a personal computer. In some embodiments, communications interface 1270 is an integrated services digital network (ISDN) card or a digital subscriber line (DSL) card or a telephone modem that provides an information communication connection to a corresponding type of telephone line. In some embodiments, a communication interface 1270 is a cable modem that converts signals on bus 1210 into signals for a communication connection over a coaxial cable or into optical signals for a communication connection over a fiber optic cable. As another example, communications interface 1270 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN, such as Ethernet. Wireless links may also be implemented. For wireless links, the communications interface 1270 sends or receives or both sends and receives electrical, acoustic or electromagnetic signals, including infrared and optical signals, that carry information streams, such as digital data. For example, in wireless handheld devices, such as mobile telephones like cell phones, the communications interface 1270 includes a radio band electromagnetic transmitter and receiver called a radio transceiver.
The term computer-readable medium is used herein to refer to any medium that participates in providing information to processor 1202, including instructions for execution. Such a medium may take many forms, including, but not limited to, non-volatile media, volatile media and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 1208. Volatile media include, for example, dynamic memory 1204. Transmission media include, for example, coaxial cables, copper wire, fiber optic cables, and carrier waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves. Signals include man-made transient variations in amplitude, frequency, phase, polarization or other physical properties transmitted through the transmission media.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, a magnetic tape, or any other magnetic medium, a compact disk ROM (CD-ROM), a digital video disk (DVD) or any other optical medium, punch cards, paper tape, or any other physical medium with patterns of holes, a RAM, a programmable ROM (PROM), an erasable PROM (EPROM), a FLASH-EPROM, or any other memory chip or cartridge, a transmission medium such as a cable or carrier wave, or any other medium from which a computer can read. Information read by a computer from computer-readable media are variations in physical expression of a measurable phenomenon on the computer readable medium. Computer-readable storage medium is a subset of computer-readable medium which excludes transmission media that carry transient man-made signals.
Logic encoded in one or more tangible media includes one or both of processor instructions on a computer-readable storage media and special purpose hardware, such as ASIC 1220.
Network link 1278 typically provides information communication using transmission media through one or more networks to other devices that use or process the information. For example, network link 1278 may provide a connection through local network 1280 to a host computer 1282 or to equipment 1284 operated by an Internet Service Provider (ISP). ISP equipment 1284 in turn provides data communication services through the public, world-wide packet-switching communication network of networks now commonly referred to as the Internet 1290. A computer called a server host 1292 connected to the Internet hosts a process that provides a service in response to information received over the Internet. For example, server host 1292 hosts a process that provides information representing video data for presentation at display 1214.
At least some embodiments of the invention are related to the use of computer system 1200 for implementing some or all of the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 1200 in response to processor 1202 executing one or more sequences of one or more processor instructions contained in memory 1204. Such instructions, also called computer instructions, software and program code, may be read into memory 1204 from another computer-readable medium such as storage device 1208 or network link 1278. Execution of the sequences of instructions contained in memory 1204 causes processor 1202 to perform one or more of the method steps described herein. In alternative embodiments, hardware, such as ASIC 1220, may be used in place of or in combination with software to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware and software, unless otherwise explicitly stated herein.
The signals transmitted over network link 1278 and other networks through communications interface 1270, carry information to and from computer system 1200. Computer system 1200 can send and receive information, including program code, through the networks 1280, 1290 among others, through network link 1278 and communications interface 1270. In an example using the Internet 1290, a server host 1292 transmits program code for a particular application, requested by a message sent from computer 1200, through Internet 1290, ISP equipment 1284, local network 1280 and communications interface 1270. The received code may be executed by processor 1202 as it is received, or may be stored in memory 1204 or in storage device 1208 or other non-volatile storage for later execution, or both. In this manner, computer system 1200 may obtain application program code in the form of signals on a carrier wave.
Various forms of computer readable media may be involved in carrying one or more sequence of instructions or data or both to processor 1202 for execution. For example, instructions and data may initially be carried on a magnetic disk of a remote computer such as host 1282. The remote computer loads the instructions and data into its dynamic memory and sends the instructions and data over a telephone line using a modem. A modem local to the computer system 1200 receives the instructions and data on a telephone line and uses an infra-red transmitter to convert the instructions and data to a signal on an infra-red carrier wave serving as the network link 1278. An infrared detector serving as communications interface 1270 receives the instructions and data carried in the infrared signal and places information representing the instructions and data onto bus 1210. Bus 1210 carries the information to memory 1204 from which processor 1202 retrieves and executes the instructions using some of the data sent with the instructions. The instructions and data received in memory 1204 may optionally be stored on storage device 1208, either before or after execution by the processor 1202.
While the invention has been described in connection with a number of embodiments and implementations, the invention is not so limited but covers various obvious modifications and equivalent arrangements, which fall within the purview of the appended claims. Although features of the invention are expressed in certain combinations among the claims, it is contemplated that these features can be arranged in any combination and order.
This application claims benefit of Provisional Appln. 61/171,451, filed Apr. 21, 2009, the entire contents of which are hereby incorporated by reference as if fully set forth herein, under 35 U.S.C. 119(e).
This invention was made in part with Government support under Contract No. R01-CA076423 awarded by the National Institutes of Health (NIH) National Cancer Institute (NCI) The Government has certain rights in the invention.
Number | Date | Country | |
---|---|---|---|
61171451 | Apr 2009 | US |