Method, apparatus and system for automated spine labeling

Abstract
A method, an apparatus, and a system label one or more parts of a spine in an image, in particular a computed tomography (CT) image, of a human or animal body, and in order to achieve a reliable spine labeling and a high throughput of images, match a model of a spine segment with segments of the spine in the image by starting matching the model of a spine segment with an initial segment of the spine in the image, wherein the initial segment of the spine in the image is located at an initial position along the spine in the image, and continue to match the model of a spine segment with one or more further segments of the spine in the image, wherein the further segments of the spine in the image are located at positions farther along the spine in the image.
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention


The present invention relates to a method and a corresponding apparatus and system for automated labeling of a spine in an image, in particular a computed tomography (CT) image, of a human or animal body.


2. Description of the Related Art


The acquisition of CT images with and without contrast agent of abdomen, thorax and/or neck is a routine procedure for the diagnosis of a multitude of diseases or injuries. The spinal column represents a natural reference structure of the upper part of the body for describing the locations of organs and pathologies. To be used as a reference system in daily clinical routine, the vertebrae and/or intervertebral disks in the image have to be labeled. A manual labeling can be time consuming, especially if only arbitrary parts of the spine are visible in the data. Therefore, automatic approaches are of interest which deliver labeling results after image acquisition without any user interaction.


For the labeling task, a sparse localization of spine components, e.g. vertebrae and/or disks, is sufficient. Within this context, the term “sparse” refers to the requirement according to which correct anatomical labels should be visible in all views showing a certain vertebra or intervertebral disk, and optionally also in a 3D rendering. This does not necessarily require a full segmentation of all spinal structures. The localization of centers of disks and vertebrae and a coarse approximation of their extent delivers adequate results.


Although this task seems to be trivial, the realization of a fully automatic labeling system for 3D CT data supporting radiologists is challenging: The labeling should be available within a reasonable time in order to guarantee a fast diagnosis after image acquisition. Nonetheless, the labeling algorithm has to reliably handle varying resolution and image quality, showing spinal columns with variations in size, shape, bone densities and varying number of vertebrae. Presence of contrast agent or pathologies like scoliosis, collapsed disks, broken vertebrae, degenerative changes or fused vertebrae based on surgical procedures make high demands on the flexibility of the chosen methods.


SUMMARY OF THE INVENTION

Preferred embodiments of the invention provide a method, apparatus and system for automated labeling of a spine in an image of a human or animal body with high reliability and high image throughput.


The advantages and benefits of the preferred embodiments are achieved by the method, apparatus and system as defined below.


The method according to a preferred embodiment of the invention comprises the following steps: a) matching a model of a spine segment with segments of the spine in the image by starting matching said model of a spine segment with an initial segment of the spine in the image, wherein said initial segment of the spine in the image being located at an initial position along the spine in the image, and by continuing matching said model of a spine segment with one or more further segments of the spine in the image, wherein said further segments of the spine in the image being located at further positions along the spine in the image, wherein said model of a spine segment relates to anatomical properties of one or more parts of a spine, and b) labeling one or more parts of the spine in the image in response to step a).


The apparatus according to a preferred embodiment of the invention comprises an image processing unit for executing and/or controlling the following steps: a) matching a model of a spine segment with segments of the spine in the image by starting matching said model of a spine segment with an initial segment of the spine in the image, wherein said initial segment of the spine in the image being located at an initial position along the spine in the image, and by continuing matching said model of a spine segment with one or more further segments of the spine in the image, wherein said further segments of the spine in the image being located at further positions along the spine in the image, wherein said model of a spine segment relates to anatomical properties of one or more parts of a spine, and b) labeling one or more parts of the spine in the image in response to step a).


The system according to a preferred embodiment of the invention comprises an image acquisition unit, in particular a computed tomography (CT) unit, for acquiring at least one image of at least a part of a human or animal body and an apparatus for automated labeling of a spine in an image.


The preferred embodiments of the invention include a fully automatic algorithm for labeling arbitrary parts of the vertebral column shown in CT data. The algorithm finds an initial position with its anatomical label by detection of reference regions (e.g. sacrum) and subsequently labels all remaining visible disks and vertebrae automatically. Preferably, a high-performance method for sparse structure localization by Markov Random Fields (MRF) is applied, wherein sparse 3-disk MRF models are built and, starting from the initial position, propagated to all parts of the spine. Preferably, a boosted decision tree based feature detection method inside regions of interest is used for optimization of the MRF model matching. Moreover, prior knowledge on spine anatomy and appearance is considered.


Due to the preferred embodiments of the invention, high precision results—even for CT scans of only few vertebrae—are obtained in less time so that both high reliability and high throughput of images to be labeled are achieved. E.g., for volume images constructed from 512×512 axial slices an average labeling precision of 99.0% in about 2 minutes is achieved.


In the context of the invention, the term “part of a spine” preferably relates to a vertebra or intervertebral disk of a spine. The terms “spine segment” and “segment of a spine” preferably relate to a portion of a spine comprising one or more parts of the spine, in particular one or more vertebrae and/or intervertebral disks. Accordingly, an “initial segment of the spine” or a “further segment of the spine” comprises one or more parts of the spine located at an initial or a further position, respectively, on or along the spine.


The term “matching” or “to match” in the sense of the invention relates to a comparison of said model of a spine segment with segments of the spine in the image and/or an examination whether the model of a spine segment corresponds and/or correlates with segments of the spine in the image.


Further, the term “in response to” relating to matching the model of a spine segment with segments of the spine in the images means that one or more parts of the spine are labeled dependent on and/or subject to the result of the mentioned comparison and/or examination step. In particular, if a model of a spine segment corresponds and/or correlates with a segment of the spine in the image, one or more parts, i.e. vertebrae and/or disks, of said segment of the spine in the image are labeled according to the corresponding parts of the model of a spine segment.


In a preferred embodiment of the invention, said further positions correspond to positions propagating from said initial position. This means that upon completion of matching said model of a spine segment with said initial segment of the spine in the image, said model of a spine segment is matched with at least one first further segment of the spine being located at a first further position, wherein said first further position being next to said initial position and/or said first further segment of the spine being adjacent to said initial segment of the spine. Further, upon completion of matching said model of a spine segment with said first further segment of the spine in the image, said model of a spine segment is matched with at least one second further segment of the spine being located at a second further position, wherein said second further position being next to said first further position and/or said second further segment of the spine being adjacent to said first further segment of the spine. This matching process may be repeated for third, fourth, fifth etc. further segments of the spine in the image being located at respective further positions. By matching the spine model with propagating positions and respective segments of the spine, only one promising initial position has to be established, making the inventive approach very fast and reliable.


In another preferred embodiment of the invention, one or more parts of the spine in the image correspond to one or more vertebrae and/or intervertebral discs of the spine in the image. By this, highly relevant and recognizable parts of the spine can be labeled very quickly.


According to a further preferred embodiment, said model of a spine segment relates to anatomical properties of two to five vertebrae and/or intervertebral discs of a spine. In particular, said model of a spine segment relates to anatomical properties of three intervertebral discs of a spine and/or to anatomical properties of two vertebrae of a spine. As a result, only relatively small segments of the spine in the image comprising a small number of vertebrae and/or intervertebral discs are sufficient in order to ensure high reliability and throughput.


Preferably, said three intervertebral discs of the spine are associated with said two vertebrae of the spine. In this context, the term “associated” means that said model of a spine segment considers anatomical properties of two consecutive vertebrae, a disk between these two consecutive vertebrae and two disks adjacent to—i.e. at the “bottom” and “top” of—these two consecutive vertebrae. This type of spine segment model results both in a considerable increase of image throughput in daily routine and a high reliability in spine labeling. Because only small segments of the spine are required without adverse effects on the reliability, the flexibility and versatility of the invention is further enhanced.


According to another preferred embodiment of the invention, said initial position of a segment of the spine in the image is established by considering anatomical knowledge about a spine. Preferably, said initial position of an initial segment of the spine in the image is established by detecting at least one anatomical landmark of the spine in the image. In particular, said at least one anatomical landmark relates to one of: a vertebra (T1) at a first rib, a vertebra (T12) of a last rib and/or a sacral foramen (S1). In this way, a reliable and promising initial position leading to good matching results is established very quickly so that image throughput and labeling reliability are further increased.


Preferably, before establishing an initial position of an initial segment, the spinal canal of the spine is detected. Subsequently, one or more initial segment candidates, in particular intervertebral disks, being located next to the spinal canal are determined. Moreover, so-called transition detectors relating to transition disks, e.g. C7/T1, T12/L1 or L5/S1, are determined.


Within the context of the invention, the term “detector” or “feature detector” refers to methods that aim at computing abstractions of image information and making local decisions at every image point whether there is an image feature, e.g. in particular an interesting part of the image, of a given type at that point or not. Accordingly, the term “transition detector” refers to a method for finding a transition between at least two image features.


Moreover, it is preferred that said initial position of a segment of the spine in the image is established by disk appearance profiles or disk profiles, in particular by profiles of the intervertebral disk candidates and/or the transition detectors. Preferably, a disk label is assigned to the most prominent disk candidate.


In a further preferred embodiment of the invention, said initial position of a segment of the spine in the image is established by deducing an initialization disk by regular expression matching, i.e. by matching the disk profile to a full-spine profile. The term “regular expression matching” in the context of the invention relates to a regular expression which provides a concise and flexible way to match, i.e. to specify and recognize, patterns, e.g. strings or character patterns of a text or patterns or profiles of an image or a part thereof.


Preferably, a disk profile corresponds to a string (“TTT.LLL.”) of region classes to which a set of disk candidates ({θm}) is mapped by classifying each disk candidate (θm) to a region class (“C”, “T”, “L”) or region transition uncertainty (“.”).


Preferably, a set of disk candidates ({θm}) is detected by disk detectors (ΦC, ΦT, ΦL) which are trained to detect disks in the cervical (“C”), thorax (“T”) and lumbar areas (“L”) of the spine, respectively.


Preferably, said initial position of a segment of the spine in the image is established by deducing an initialization disk by regular expression matching, wherein a disk profile (“TTT.LLL.”) is matched to a full-spine profile (“CCCCCCTTTTTTTTTTTTLLLLL”).


Preferably, multiple initialization disk candidates can result from the region transition uncertainty (“.”) in the disk profile (“TTT.LLL.”). Preferably, multiple initialization disk candidates are resolved by multiple labeling runs.
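The regular expression matching of a disk profile to the full-spine profile can be illustrated by the following minimal sketch. It uses the two profile strings given above and treats the uncertainty symbol “.” as a wildcard matching any region class; the function names and the list of disk labels are illustrative assumptions and are not prescribed by the embodiments.

```python
import re

# Full-spine profile from the description: 6 cervical, 12 thorax, 5 lumbar disks.
FULL_SPINE_PROFILE = "CCCCCCTTTTTTTTTTTTLLLLL"


def disk_labels():
    """Return the 23 disk labels C2/C3 ... L5/S1 in head-to-sacrum order."""
    vertebrae = ([f"C{i}" for i in range(2, 8)]
                 + [f"T{i}" for i in range(1, 13)]
                 + [f"L{i}" for i in range(1, 6)]
                 + ["S1"])
    return [f"{upper}/{lower}" for upper, lower in zip(vertebrae, vertebrae[1:])]


def initialization_candidates(disk_profile, full_profile=FULL_SPINE_PROFILE):
    """Return every offset at which the disk profile fits the full-spine profile.

    The "." in the disk profile acts as the regex wildcard, so a region
    transition uncertainty matches any region class.
    """
    if not set(disk_profile) <= set("CTL."):
        raise ValueError("disk profile may only contain 'C', 'T', 'L' and '.'")
    # A lookahead is used so that overlapping alignments are all reported.
    pattern = re.compile(f"(?={disk_profile})")
    return [match.start() for match in pattern.finditer(full_profile)]


labels = disk_labels()
for offset in initialization_candidates("TTT.LLL."):
    print(offset, labels[offset], "...", labels[offset + len("TTT.LLL.") - 1])
```

For the profile “TTT.LLL.” more than one alignment is found, which corresponds to the multiple initialization disk candidates mentioned above.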


Preferably, said initial position of a segment of the spine in the image is established by deducing an initialization disk by localizing one of three most distinguishing transition disks (C7/T1, T12/L1, L5/S1) of the spine.


Preferably, transition disks (C7/T1, T12/L1, L5/S1) are detected by transition detectors (ΦCT, ΦTL, ΦLS) which are trained to detect cervical/thorax (CT), thorax/lumbar (TL) and lumbar/sacrum (LS) transitions.


Preferably, said initial position of a segment of the spine in the image is established by considering Markov Random Field (MRF) matching qualities. A MRF in the context of the invention is a graphical model of a joint probability distribution. It consists of an undirected graph in which the nodes represent random variables. A MRF is a convenient and consistent way to model context-dependent entities such as image pixels and correlated features. This is achieved by characterizing mutual influences among such entities using conditional MRF distributions.


The preferred steps, alone or in combination, set forth above also contribute to a further enhancement of reliability and image throughput.


Further advantages, features and examples of the present invention will be apparent from the following description of the figures:





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows an example of an apparatus and a system according to a preferred embodiment of the invention;



FIG. 2 shows an example of a multi-view rendering of vertebra labels in a radiology software;



FIG. 3 shows examples of CT images with annotated landmarks;



FIG. 4 shows (left) a sparse 3-disk model ℳi for a fixed intervertebral “Middle Disk” di and (right) a 2D sagittal projection of a steerable sampling pattern around a disk di and along an edge defined by di and di+1, wherein pattern layers define regions R1 . . . Rr;



FIG. 5 shows an overview on the spine labeling framework in a sagittal projection;



FIG. 6 shows an example of a correctly labeled full-spine dataset; and



FIG. 7 shows further examples of correctly labeled datasets of parts of a spine.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS


FIG. 1 shows an example of an apparatus 10 and a system according to a preferred embodiment of the invention. A medical image data set 11 comprising a plurality of images, in particular slice images, of a human or animal body is acquired by a medical imaging apparatus 12, in particular a computed tomography (CT) apparatus.


The apparatus 10 comprises a control unit 13, e.g. a workstation or a personal computer (PC), to which the image data set 11 is fed. Preferably, the image data set 11 can be transferred from the medical imaging apparatus 12 to the control unit 13 via a data network 18 to which the control unit 13 is, at least temporarily, connected. For example, the data network 18 can be a local area network (LAN) or wireless LAN (WLAN) in a hospital environment or the internet.


Preferably, the control unit 13 is configured to generate a volume reconstruction and/or a slice view 15 of the image data set 11 on a display 14, e.g. a TFT screen of the workstation or PC, respectively. Moreover, the control unit 13 is designed to label one or more parts of a spine in the image data set 11 according to preferred embodiments of the invention.


In the example shown in FIG. 1, a vertebra 19 in the axial slice view 15 is labeled, i.e. marked or denoted, with a label “L3” indicating that the displayed vertebra 19 corresponds to the third lumbar vertebra of the spine.



FIG. 2 shows an example of a multi-view rendering of vertebra labels in a radiology software. The left part of FIG. 2 again shows the axial slice view 15 rendered on the display 14 shown in FIG. 1. The middle part of FIG. 2 shows a sagittal slice view 16 of a spinal segment, wherein the respective vertebra is labeled with “L3”. In the right part of FIG. 2 a three-dimensional representation of the image data set is shown, wherein all of the vertebrae contained in the image are labeled with respective labels “L1” to “L5”.


The spinal column represents a natural reference frame of the upper part of the human body. To localize nearby organs and pathologies, sparse spine labeling is sufficient, wherein correct vertebra/disk labels are visible in arbitrary 2D and 3D views prior to their segmentation. According to an aspect of the invention, an automatic, segmentation-free approach to sparsely label spinal columns in 3D CT datasets is proposed and a corresponding framework was designed with two main goals in mind. First, to relax requirements on the input data for labeling of both full and partial spine scans. Though presence of sacrum, T12, or T1 vertebrae in the data is predominantly used, it is not strictly necessary. Second, to be used in daily clinical routines, the method, apparatus and system according to preferred embodiments of the invention need to be high throughput, capable of processing thousands of slices in a few minutes. To accomplish these goals, structural knowledge from training data is preferably encoded in probabilistic boosting trees and used to detect relevant landmarks in the incoming scans. Desired disk landmarks and labels are then localized preferably by Markov Random Field-based matching of sparse appearance models which encode the anatomical knowledge.


The spine labeling approach outlined above will be presented in more detail in the following.


1. Building Spine-Related Models


1.1. Data Requirements


In contrast to approaches according to the state of the art, a preferred embodiment of the invention does not expect any specific part of the spine to be present in a CT scan. Rather, the only requirement is that the data contain a part of the spine comprising at least three intervertebral disks. Additionally, the following information is required to be available in the DICOM tags: (1) a CT-to-Hounsfield intensity transformation and (2) the patient position.
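The two DICOM prerequisites may be illustrated by the following minimal sketch. It assumes that the intensity transformation is given by the standard DICOM attributes RescaleSlope and RescaleIntercept and the patient position by the attribute PatientPosition (e.g. “HFS” for head-first supine); the function names and the plain dictionary used as a header are hypothetical and not part of the described embodiments.

```python
def to_hounsfield(stored_values, rescale_slope, rescale_intercept):
    """Map stored CT values to Hounsfield units: HU = slope * value + intercept."""
    return [rescale_slope * value + rescale_intercept for value in stored_values]


def meets_data_requirements(header):
    """Check that a header dict provides the intensity transform and patient position."""
    required = ("RescaleSlope", "RescaleIntercept", "PatientPosition")
    return all(header.get(key) is not None for key in required)


# Hypothetical header values, for illustration only.
header = {"RescaleSlope": 1.0, "RescaleIntercept": -1024.0, "PatientPosition": "HFS"}
if meets_data_requirements(header):
    hounsfield = to_hounsfield([0, 1024, 2024],
                               header["RescaleSlope"],
                               header["RescaleIntercept"])
```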


1.2. Training Data Annotation


Having these requirements met, volumes Ik in Hounsfield scale and in right-handed head-first supine (face-up) orientation of the patient are reconstructed. The data is annotated in the following way: For each disk landmark di, i=1 . . . 23, present in image Ik ∈ S, k=1 . . . 48, the center position of the disk dki is annotated with an anatomical label Λi ∈ {C2/C3, C3/C4, . . . , L4/L5, L5/S1}. A cylinder Kkdi approximating the disk is positioned at the disk center dki. Furthermore, next to the disk landmarks di, canal landmarks ci with anatomical labels Λi are placed within the spinal canal at position cki, lying on the perpendicular defined by the spinal canal and the disk center. The canal landmark set is extended by landmarks defined by the middle point lying on a linear interpolation of cki and cki+1. Cylinders Kkci around cki approximate the extent of the spinal canal. Further landmarks bj, j=1 . . . 12, with anatomical labels θj ∈ {T1, . . . , T12} are placed in the middle of rib bodies bkj, and landmarks s1, s2 in the center of the two uppermost sacral foramina sk1, sk2. Cylinders Kkbj, Kks1, Kks2 are placed around bkj, sk1, sk2 to approximate their extent.



FIG. 3 shows examples of CT images in a sagittal, coronal and axial plane, respectively, which have been annotated accordingly with landmarks, i.e., disk centers di, spinal canal centers ci, ribs bj and sacral foramina centers s1, s2. For a better clarity, the interpolated spinal canal landmarks are not visualized in this representation.


1.3. Learning Appearance of Target Regions


In order to detect and label the intervertebral disks, three detectors dedicated to cervical, thorax, and lumbar disks are trained. To prune the input volume and to remove disk outliers, however, detection starts with the most reliable reference structure, i.e. the spinal canal. To deduce an initialization label, three further detectors are trained to detect the T1 rib, the T12 rib, and the uppermost sacral foramina.


1.3.1. Spinal Canal Detector: ΦS


Preferably, the spinal canal is chosen as a central part of spine-related problems. Positive samples for the spinal canal detector ΦS are generated within the cylinders Kkci around the annotated points cki. Negative samples are generated randomly, constrained to have a minimal distance of 10 mm to positive regions.
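The generation of negative samples with a minimal distance to positive regions may be sketched as follows. For simplicity, the sketch represents the positive regions by a list of points rather than by the annotated cylinders, and uses rejection sampling inside an axis-aligned bounding box; these simplifications, the function name and all parameters other than the 10 mm distance are assumptions made for illustration.

```python
import math
import random


def sample_negatives(positives_mm, bounds_mm, count,
                     min_dist_mm=10.0, seed=0, max_tries=100000):
    """Draw random points that keep at least min_dist_mm from every positive point.

    positives_mm: list of (x, y, z) positions in mm.
    bounds_mm:    three (low, high) pairs delimiting the sampling volume in mm.
    """
    rng = random.Random(seed)
    negatives = []
    tries = 0
    while len(negatives) < count and tries < max_tries:
        tries += 1
        candidate = tuple(rng.uniform(low, high) for low, high in bounds_mm)
        # Reject candidates that fall into the safety margin of a positive sample.
        if all(math.dist(candidate, p) >= min_dist_mm for p in positives_mm):
            negatives.append(candidate)
    return negatives


negatives = sample_negatives([(50.0, 50.0, 50.0)], [(0.0, 100.0)] * 3, count=20)
```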


1.3.2. Disk Detectors: ΦC, ΦT, ΦL


In order to place labels inside the intervertebral disks, three disk detectors ΦC, ΦT, ΦL are learned to detect the disks in the cervical, thorax, and lumbar areas, respectively.


Positive disk samples are generated from the cylindrical disk approximation Kkdi of the respective region: ΦC is learned from {C2/C3, . . . , C7/T1} disks, ΦT from {T1/T2, . . . , T12/L1}, and ΦL from {L1/L2, . . . , L5/S1}.


Negative samples are taken from the counterpart disks as well as from random distribution with assured minimal distances of 10 mm to the positive samples.


While the disk detectors best respond in areas they have been trained for, false positive responses may occur frequently especially in the cervical/thorax and thorax/lumbar transitions. The disks can be therefore best localized as clusters in a union ΦC∪ΦT∪ΦL of all three disk detector responses.


The association of a mixed disk cluster with a particular region, however, needs to be learned from the relative contributions of the responses ΦC, ΦT and ΦL. The posterior probabilities of the three detectors are combined to classify disk clusters into one of the respective regions “C”, “T”, “L”, or “.” to reflect region transitions and further uncertainties.


1.3.3. Transition Detectors: ΦCT, ΦTL, ΦLS


The following three detectors are trained to detect the three transitions where the labeling can easily be initialized from: cervical/thorax, thorax/lumbar, and lumbar/sacrum. The feature detector


ΦCT is trained to detect voxels in the T1 rib. Positive rib samples are generated within cylindrical regions Kkb1 around the rib points bk1;


ΦTL is trained to detect voxels in the T12 rib. Similar to T1, positive T12 rib samples are generated within cylinders Kkb12 around the rib points bk12;


ΦLS is trained to detect the sacral foramina points sk1, sk2. Positive samples are constrained to the sacral foramina cylinder approximations Kks1, Kks2.


As negative samples for all of the transition detectors, all remaining annotated parts, manually selected parts (i.e. transverse processes of vertebrae) of the body and randomly sampled points with a safety margin of 10 mm to the positive samples are taken.


1.3.4. Employing Probabilistic Boosting Trees


To perform the feature detection during both training and testing, probabilistic boosting trees (PBT) are employed. PBTs are special kinds of decision trees which hold an ensemble of weak classifiers at each tree node and compose them into one strong classifier. The weak classifiers preferably used in preferred embodiments of the invention include so-called Haar-like features, image derivatives (intensity, gradient magnitude, structure tensor, and principal curvature) and their histograms.


Preferably, cascading, classifier sorting and a multi-resolution scheme are used in order to optimize time performance. Cascading considers only true samples running along the tree, while classifier sorting uses cheap classifiers at first and more expensive ones at deeper levels of the tree. The multi-resolution scheme significantly reduces the number of voxels to be processed. Each detector consists of 3 boosted decision tree classifiers Φnp (n=0 . . . 2) for 3 levels of resolution. Input volumes Ik are resampled into a pyramid of 3 isotropic grids Ink with voxel sizes of 2^n mm. During classification the feature detectors are applied in a coarse-to-fine manner, i.e. Φ2p → Φ1p → Φ0p, terminating early as soon as any test fails.
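The coarse-to-fine application of the three per-resolution classifiers may be sketched as follows. The classifiers are represented by hypothetical callables returning a probability, and the acceptance threshold of 0.5 is an assumption; the sketch only illustrates the order of evaluation and the early termination, not the boosted decision trees themselves.

```python
def pyramid_voxel_sizes_mm(levels=3):
    """Voxel sizes of the isotropic grids: 2**n mm for n = 0 .. levels - 1."""
    return [2 ** n for n in range(levels)]


def classify_coarse_to_fine(point_mm, classifiers, threshold=0.5):
    """Apply the per-level classifiers from the coarsest to the finest level.

    classifiers[n] is a callable for resolution level n that returns a
    probability for the given point. Evaluation stops at the first failure.
    """
    for level in reversed(range(len(classifiers))):
        if classifiers[level](point_mm) < threshold:
            return False  # early termination: finer levels are not evaluated
    return True


# Toy classifiers that record the order in which they are called.
calls = []


def make_classifier(level, probability):
    def classifier(point_mm):
        calls.append(level)
        return probability
    return classifier


toy = [make_classifier(0, 0.9), make_classifier(1, 0.2), make_classifier(2, 0.8)]
accepted = classify_coarse_to_fine((0.0, 0.0, 0.0), toy)
```

In this toy run the level-2 classifier passes and the level-1 classifier fails, so the level-0 classifier, which operates on the largest number of voxels, is never evaluated.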


1.4. The 3-Disk Model


The feature clouds from the disk detectors are usually not suitable as the final result. First, there are false positives (outliers) both outside and inside of the disk column. Second, there are problematic data, e.g. with broken vertebrae or collapsed disks, where disks are detected weakly or even remain undetected.


To correct the disk detection afterwards and to compensate for weak or missing disk feature clouds, sparse MRF appearance models record a priori information about the appearance of local image regions, of the edges between them and about the geometrical setup of these regions.


The geometrical setup and the anatomical appearance of a compound of 3 consecutive disks is modeled and adapted according to morphometry while propagating along the spine. This has three advantages over a model of the full spinal column. First, the matching is done locally and is therefore fast and robust. Second, anatomical variation can be easily integrated into the whole framework. Third, the framework is applicable on datasets which contain only parts of the spine.


The left part of FIG. 4 shows vertebrae 20 to 22 and intervertebral disks 24 to 26 of a spine segment. In the model ℳi of a spine segment set forth below, preferably two consecutive vertebrae 21 and 22, a middle disk 25 located between the consecutive vertebrae 21 and 22 as well as an upper and lower disk 24 and 26 adjacent to the upper side of upper vertebra 21 or lower side of lower vertebra 22, respectively, are considered.


1.4.1. Setup


For a fixed disk label Λi, computation of the 3-disk model ℳi involves 6 nodes and preferably 11 or 5 edges around disk landmark di, over all training images that contain the entire 3-tuple of disks di−1, di, di+1. This is illustrated in the left part of FIG. 4, in which the 6 nodes di−1, di, di+1, ci−1, ci and ci+1 are connected by 11 edges. For computation of the 3-disk model around disk landmark di preferably all of the 11 edges are considered or, alternatively, only a 5-edge subset thereof is involved, as exemplarily indicated in FIG. 4 by thick lines.


For each training volume Ik ∈ Str a local coordinate frame is spanned with reference point at the i-th disk landmark di and three orthogonal vectors:

ui1=ci−1−ci+1, ui2=ui1×(0,−1,0), ui3=ui1×ui2
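The construction of the local coordinate frame from the two neighboring canal landmarks may be sketched as follows. The helper names are illustrative assumptions. Note that the three vectors obtained in this way are mutually orthogonal but not normalized, and that the construction degenerates if ui1 is parallel to (0,−1,0).

```python
def subtract(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def local_frame(c_prev, c_next):
    """Return (u1, u2, u3) for disk i from canal landmarks c_(i-1) and c_(i+1)."""
    u1 = subtract(c_prev, c_next)
    u2 = cross(u1, (0.0, -1.0, 0.0))
    u3 = cross(u1, u2)
    return u1, u2, u3


# Hypothetical canal landmark positions in mm, for illustration only.
u1, u2, u3 = local_frame((0.0, 0.0, 10.0), (0.0, 2.0, -10.0))
```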


Subsequently, a morphometry feature vector 𝒢ki is computed to capture the geometrical configuration of the 6 nodes, and appearance feature vectors ℰki and 𝒩ki are computed to sparsely model the appearance of the 11 or 5 edges and of the 6 nodes, respectively.


In the context of the invention, the term “feature vector” in general relates to a multi-dimensional vector of numerical features that represent an object. Accordingly, a “morphometry feature vector” relates to a multi-dimensional vector of numerical features relating to a quantitative analysis of form, i.e. size and shape, of an object, and an “appearance feature vector” relates to a multi-dimensional vector of numerical features relating to a quantitative analysis of the texture, in particular the surface texture, of an object.


The model ℳi is finally computed as an average feature vector across the training data.


1.4.2. The Morphometry Feature Vector


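For each edge e, its length ∥e∥ and its 3 angles to the axes of the local coordinate frame, i.e. ∠(e, ui1), ∠(e, ui2) and ∠(e, ui3), are computed, yielding a 44-dimensional or 20-dimensional feature vector (for 11 or 5 edges, respectively):

$$\mathcal{G}_i^k = \left\{\, \|e\|,\ \angle(e, u_i^1),\ \angle(e, u_i^2),\ \angle(e, u_i^3) \;\middle|\; e \in \{ e_{i-1}^1, \ldots, e_{i-1}^5,\ e_i^1, \ldots, e_i^5,\ e_{i+1}^1 \} \,\right\}$$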
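The computation of the morphometry feature vector may be sketched as follows. Edges are given as pairs of 3D end points and the local coordinate frame as three vectors; expressing the angles in radians, the ordering of the features and the function names are assumptions made for illustration. Edges of zero length are not handled.

```python
import math


def angle_between(a, b):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))


def morphometry_vector(edges, frame):
    """Concatenate length and the 3 frame angles of every edge.

    edges: list of (start, end) pairs of 3D points.
    frame: the three vectors (u1, u2, u3) of the local coordinate frame.
    Returns 4 values per edge, i.e. 44 values for 11 edges or 20 for 5 edges.
    """
    features = []
    for start, end in edges:
        e = tuple(q - p for p, q in zip(start, end))
        features.append(math.sqrt(sum(x * x for x in e)))
        features.extend(angle_between(e, u) for u in frame)
    return features


frame = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
single = morphometry_vector([((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))], frame)
```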

1.4.3. Appearance Feature Vectors


The appearance of edges and nodes is modeled by intensity differences between r sampling patterns R1, R2, . . . , Rr.


While the edges are sampled linearly, the node sampling patterns are steerable features: layers orthogonal to ui1 are displaced and scaled according to the captured morphometry. This is illustrated in the right part of FIG. 4, which shows a 2D sagittal projection of steerable sampling regions R around the disk di−1 and samples along an edge defined by di and di+1, wherein pattern layers define the regions R1 . . . RrN or R1 . . . Rrℰ, respectively. ui1, ui2 and ui3 define the local coordinate frame at disk di.


For each node/edge, a feature vector D is created by computing intensity sum differences between all possible combinations of region pairs RA and RB:

$$D = \Big[ \sum_{p \in R_A} I(p) - \sum_{q \in R_B} I(q) \Big] \qquad (A = 1, \ldots, r;\ B = A+1, \ldots, r)$$

This yields $\binom{r}{2}$-dimensional appearance vectors for each edge and each of the 6 nodes. The edge appearance can be excluded by setting $r_{\mathcal{E}} = 0$. Depending on the actual configuration, the edge appearance vector $\mathcal{E}_i^k$ thus becomes $11 \times \binom{r_{\mathcal{E}}}{2}$-dimensional, $5 \times \binom{r_{\mathcal{E}}}{2}$-dimensional or 0-dimensional, respectively, and the node appearance vector $\mathcal{N}_i^k$ becomes $6 \times \binom{r_N}{2}$-dimensional.
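The appearance feature vector of a single node or edge, and the resulting dimensions, may be sketched as follows. Each region is represented simply by the list of intensities sampled in it; this representation and the function names are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def appearance_vector(region_intensities):
    """Intensity-sum differences for all region pairs (A, B) with A < B.

    region_intensities[a] holds the intensities I(p) sampled in one region.
    For r regions the result has comb(r, 2) entries.
    """
    sums = [sum(region) for region in region_intensities]
    return [sums[a] - sums[b] for a, b in combinations(range(len(sums)), 2)]


def appearance_dimension(num_elements, num_regions):
    """Dimension of the concatenated vectors of num_elements nodes or edges."""
    return num_elements * comb(num_regions, 2)


# Three toy regions with intensity sums 3, 3 and 20.
d = appearance_vector([[1, 2], [3], [10, 10]])
```

With, for example, 4 regions per sampling pattern, 11 edges give a 66-dimensional and 5 edges a 30-dimensional edge appearance vector, while 0 regions give a 0-dimensional one.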






1.4.4. The Model


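The final edge-node feature vector of model ℳi is computed by averaging the feature vectors of all training data Str:

$$\mathcal{M}_i = \left( \overline{\mathcal{G}}_i,\ \overline{\mathcal{E}}_i,\ \overline{\mathcal{N}}_i \right) = \frac{1}{|S_{tr}|} \sum_{I_k \in S_{tr}} \Big( \underbrace{\mathcal{G}_i^k,\ \mathcal{E}_i^k}_{\text{edges}},\ \underbrace{\mathcal{N}_i^k}_{\text{nodes}} \Big) \qquad (1)$$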








Models ℳi are built for every disk label from C3/C4 to L4/L5, i.e., 2 ≤ i ≤ 22.
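The averaging of equation (1) may be sketched as follows, assuming that the morphometry, edge appearance and node appearance features of each training volume have already been concatenated into one flat list of equal length; the function name is an illustrative assumption.

```python
def average_model(feature_vectors):
    """Component-wise mean of the feature vectors of all training volumes."""
    if not feature_vectors:
        raise ValueError("at least one training feature vector is required")
    length = len(feature_vectors[0])
    if any(len(vector) != length for vector in feature_vectors):
        raise ValueError("all feature vectors must have the same length")
    count = len(feature_vectors)
    return [sum(column) / count for column in zip(*feature_vectors)]


model = average_model([[1.0, 2.0, 3.0], [3.0, 6.0, 3.0]])
```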


In addition, the models ℳ20, ℳ21, ℳ22 are associated with mean distances s21, s22, s23 of their bottommost disk centers d21, d22, d23 to the sacral foramina.


2. Labeling Framework


In this section, the spine auto-labeling framework and its components are described with reference to FIG. 5, which shows a sagittal projection of a segment of a spine in an image at different phases or steps (corresponding to parts a to f of FIG. 5) of the method according to a preferred embodiment of the invention. In the given example, the segment of the spine in the image comprises 13 vertebrae and 14 intervertebral disks.


After input of CT image data (FIG. 5a), feature detection is performed in order to prune the search space for the subsequent model matching. Here, a detection of the spinal canal (see FIG. 5b and sec. 2.1 below) is followed by a detection of intervertebral disk candidates next to it and by transition detectors (see FIG. 5c and sec. 2.2 below).


A subsequent initialization disk identification is based on the disk candidates, their profile, and on the transition detectors. As a result, a disk label is assigned to the most prominent disk candidate (see FIG. 5d and sec. 2.3 below).


In a following step, model matching and propagation is performed, wherein a 3-disk model, which is determined by the initialization label, is matched to a subset of disk and canal features (see FIG. 5e and sec. 2.4 below). The matching is propagated up and/or downwards until stopping criteria are met (see sec. 2.5 below). In a final step, the CT image of the spine is labeled with respective labels according to the results of the previous model matching and propagation step (see FIG. 5f).


The above-mentioned steps of the spine auto-labeling framework will be described in more detail in the following.


2.1. Spinal Canal


In a preferred embodiment of the invention, the algorithm considers the spinal canal which is a significant feature. Accordingly, the spinal canal feature detector ΦS is applied inside the whole volume. Positively classified voxels yield a point cloud. To avoid false positives, a B-spline is fitted to the tallest connected component of this cloud. In the following the B-spline will also be referred to as canal spline ç (see FIG. 5b).


2.2. Disk Candidates and Profile


To accelerate the detection of disk candidates the disk detectors ΦC, ΦT, ΦL are restricted to a region extruded by the largest possible disk along the canal spline ç.


Positively classified voxels yield cervical, thoracic, and lumbar feature clouds, C, T, and L. In order to eliminate false positives and to yield only true disk points, places of highest concentrations of disk points along the canal spline are determined.


The canal spline is sampled at fixed arc-length intervals of 1 mm, yielding a set of points and tangents {çs, ç′s}. Each spline sample çs is associated with disk feature subsets Cs, Ts, and Ls that lie less than 0.5 mm from a plane defined by (çs, ç′s), and these features are counted: Σs=|Cs|+|Ts|+|Ls|. Canal samples çm where Σm attains a local maximum (with respect to a window of the smallest vertebra height) lie in disk planes, and the centroids of the corresponding feature clusters θm=avg(Cm∪Tm∪Lm) are picked as disk candidates. The corresponding counts |Cm|+|Tm|+|Lm| are used to classify each disk candidate θm in accordance with section 1.3.2 above, i.e. to a region class “C”, “T”, “L”, or region transition uncertainty “.”. The set of all disk candidates {θm} is mapped to a string of region classes and is referred to as the disk profile. For example, a disk profile “TTT.LLL.” corresponds to a thoracic-lumbar transition with two uncertain disk candidates θ4 and θ8. In FIG. 5c, disk detectors ΦT, ΦL and a transition detector ΦTL are shown.
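The selection of disk planes from the per-sample counts can be sketched as a windowed local-maximum search. The Python example below is a simplified illustration, not the implementation of the embodiment: it assumes the counts Σs are given as a list (one entry per 1 mm spline sample), expresses the window in samples, and breaks ties on a plateau in favor of the first sample. All names and values are hypothetical.

```python
def local_maxima(counts, window):
    """Indices whose count is positive and maximal within +/- window samples."""
    peaks = []
    for s, value in enumerate(counts):
        lo = max(0, s - window)
        hi = min(len(counts), s + window + 1)
        # On a plateau, only the first sample holding the maximum is kept.
        if (value > 0 and value == max(counts[lo:hi])
                and counts.index(value, lo, hi) == s):
            peaks.append(s)
    return peaks

def centroid(points):
    """Mean position of a cluster of 3D feature points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

# Hypothetical feature counts along the canal spline.
counts = [0, 2, 9, 3, 0, 0, 1, 8, 8, 2, 0, 4]
disk_samples = local_maxima(counts, window=2)
```

In a full pipeline, the centroid of the features associated with each selected sample would become the disk candidate θm.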


2.3. Initialization Disk and its Label


From the set of disk candidates {θm}, an initial one, θi, is picked for which a disk label Λi can be assigned reliably. Preferably, one of the three most distinguishable transition disks C7/T1, T12/L1, and L5/S1 is localized (see sec. 2.3.1 below). If none of these transition disks can be localized, the initialization disk label is deduced by a regular-expression match of the disk profile to a full-spine profile (see sec. 2.3.2 below).


2.3.1. Applying Transition Detectors


The three transition detectors introduced in section 1.3.3 above are applied near the spinal canal if the disk profile suggests it, terminating as soon as the initialization disk can be deduced as follows:

    • C7/T1 rib features (ΦCT) are computed next to a potential C-to-T transition. An overlap of the feature points with disk candidates is evaluated in the sagittal projection. The disk candidate with a maximum feature overlap is assigned the disk label C7/T1.
    • T12/L1 rib features (ΦTL) are computed next to a potential T-to-L transition. Similar to the former case, the disk candidate with the maximum feature overlap is assigned the disk label T12/L1.
    • Sacrum features (ΦLS) are computed in the vicinity of the bottom tip of the spline provided there is a chain of L-class disk candidates at the bottom of the profile. The distance between the centroid of ΦLS and the bottommost disk candidate is compared to all three disk-to-foramina distances s21, s22, s23 introduced in section 1.4.4. In most cases, s23 determines the best match, which corresponds to the transition label L5/S1. For troublesome cases where the bottommost disks were not detected, a match to s22 or even s21 may decide on the bottommost label, i.e. L4/L5 or L3/L4.
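The comparison against the three trained disk-to-foramina distances can be pictured as a nearest-value lookup. The sketch below is only an illustration under stated assumptions: the distance values are invented placeholders (the text gives no numbers for s21, s22, s23), and a plain absolute difference is assumed as the matching criterion.

```python
def bottommost_label(observed_distance, mean_distances):
    """Return the label whose trained disk-to-foramina distance is closest."""
    return min(mean_distances,
               key=lambda label: abs(mean_distances[label] - observed_distance))

# Hypothetical mean distances in mm standing in for s21, s22, s23.
MEAN_DISTANCES = {"L3/L4": 95.0, "L4/L5": 60.0, "L5/S1": 25.0}

# An observed distance close to the smallest mean suggests L5/S1.
label = bottommost_label(28.0, MEAN_DISTANCES)
```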


2.3.2. Resolving Ambiguities


If all above transition detectors fail due to insufficient features or lacking features-to-profile correspondence, a regular expression search is applied to match the disk profile to a full-spine profile “CCCCCCTTTTTTTTTTTTLLLLL”.


While such a match can happen to be unique, the algorithm has to be prepared to handle multiple candidates resulting from a missing transition (e.g. if only a part of the thorax is seen in the CT scan) or from uncertainties in the disk profile. The example disk profile “TTT.LLL.” would yield two matches, “TTTTLLLL” and “TTTLLLLL”, with two candidates for label T12/L1, i.e. θ4 and θ3, respectively. Similarly, an “LLL” profile would yield three candidate configurations in the lumbar part, “TTTTTT” would yield 7 candidate configurations in the thorax, and so forth.
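The counting of candidate alignments can be reproduced with a short Python sketch based on the standard `re` module. It relies on the fact that the uncertainty symbol “.” of the disk profile coincides with the regular-expression wildcard, and it uses a lookahead so that overlapping alignments are all reported. The function name is hypothetical.

```python
import re

# Full-spine profile from the text: 6 cervical, 12 thoracic, 5 lumbar classes.
FULL_SPINE_PROFILE = "C" * 6 + "T" * 12 + "L" * 5

def match_profile(profile, full=FULL_SPINE_PROFILE):
    """Return (start, matched substring) for every, possibly overlapping, match."""
    # The lookahead matches an empty string at each position where the
    # profile fits, so overlapping alignments are not skipped.
    pattern = re.compile("(?=(" + profile + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(full)]

matches = match_profile("TTT.LLL.")
```

Each returned start offset fixes the label of every disk candidate in the profile, so each match corresponds to one initialization to be tried.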


Multiple candidates are resolved by multiple labeling runs initialized from each disk-label pair. The labeling results are assessed by MRF matching qualities as shown below.


2.4. Model Matching Using MRFs


Having found the initialization (θi, Λi) at disk i, the 6-tuple

(θi−1, ζi−1, θi, ζi, θi+1, ζi+1)

of previously identified adjacent disk candidates and their associated canal points is refined with model $\mathcal{M}_i$ and/or compensated for possibly missing disk features.


A cloud of the 100 closest disk or canal features, respectively, is associated with each of the 6-tuple points (see FIG. 4). The 6×100 points become the refinement candidates. The task is to find an optimal match of the model $\mathcal{M}_i$ to one of the $100^6$ possible configurations τi as each of the 6 model nodes attempts to find an optimal position among its 100 associated candidates.


The fitness

$$Q(\tau_i \mid \mathcal{M}_i) \qquad (2)$$

of model $\mathcal{M}_i$ to a particular 6-tuple configuration $\tau_i$ is assessed by the Euclidean distance between the feature descriptor $(\bar{\mathcal{G}}_i, \bar{\mathcal{E}}_i, \bar{\mathcal{N}}_i)$ of the model (eq. (1)) and a descriptor $(\mathcal{G}_{\tau_i}, \mathcal{E}_{\tau_i}, \mathcal{N}_{\tau_i})$ computed analogously from the configuration $\tau_i$.


As the exact optimal match of the model, i.e. the maximum quality configuration

$$\tau_i^* = \arg\max_{\tau_i} Q(\tau_i \mid \mathcal{M}_i) \qquad (3)$$

is NP-hard to find, an efficient approximation approach is applied which involves computation of an $11 \times 100^2$ edge quality matrix E and of a 6×100 node quality matrix C which are fed to a so-called Max-Sum solver.
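To make the objective of eq. (3) concrete, the following Python sketch evaluates it by exhaustive search on a deliberately tiny toy problem. It is not the Max-Sum solver mentioned above and would not scale to 100 candidates per node. The descriptor function (gaps between consecutive one-dimensional positions) and the form of Q as a negative Euclidean distance are assumptions made for illustration only.

```python
from itertools import product
from math import dist

def describe(config):
    """Hypothetical descriptor: gaps between consecutive 1-D node positions."""
    return [b - a for a, b in zip(config, config[1:])]

def quality(config, model_descriptor):
    """Assumed form of Q: negative Euclidean distance to the model descriptor."""
    return -dist(describe(config), model_descriptor)

def best_configuration(candidates, model_descriptor):
    """Exhaustive arg max over all candidate combinations (tiny inputs only)."""
    return max(product(*candidates),
               key=lambda cfg: quality(cfg, model_descriptor))

# Three nodes with a few hypothetical candidate positions each.
candidates = [[0, 1], [9, 11, 14], [20, 25]]
model_descriptor = [10, 10]
best = best_configuration(candidates, model_descriptor)
```

With 6 nodes and 100 candidates each, the same loop would have to visit $100^6$ configurations, which is why an approximate solver is used instead.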


The reference node di of the optimally matched model is fixed as the final position d*i associated with the label Λi. If the uppermost or bottommost model $\mathcal{M}_2$ or $\mathcal{M}_{22}$, respectively, has been matched, the reference node d1 or d23, respectively, of the model is additionally taken as the final position associated with label Λ1 or Λ23, respectively.


Compensating Missing Disks


Insufficient disk features (e.g., due to vertebra collapses, pathologies) would lead to an improper MRF-based match of the model $\mathcal{M}_i$. To account for that, dummy edge and node qualities can be pre-computed from the training data, yielding extended quality matrices, an $11 \times (100+1)^2$ matrix E′ and a 6×(100+1) matrix C′. If the model happens to find the optimum using dummy qualities, an extra point is inserted into the set of the disk candidates {θm}.


2.5. Propagation


After the initial MRF model $\mathcal{M}_i$ has been matched, the algorithm propagates downwards and/or upwards along the canal spline in order to refine the remaining label positions {j} in the input volume.


During the downward propagation, (θi+1, Λi+1) becomes the initialization position/label pair and the 6-tuple

(d*i, ζi, θi+1, ζi+1, θi+2, ζi+2)

is refined by the model $\mathcal{M}_{i+1}$. The downward propagation terminates when either of the following criteria is met: the bottom of the volume data is reached or a marginal disk (L5/S1) is labeled. Upward propagation is analogous to the downward propagation.
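The label bookkeeping of the propagation, without the model matching itself, can be sketched as follows. The sketch assumes that disk candidates are indexed from top to bottom and that the 23 disk labels run from C2/C3 to L5/S1, which is consistent with the full-spine profile and with 2 ≤ i ≤ 22 for the models; the function name and indexing are hypothetical.

```python
# Assumed label sequence: 6 cervical, 12 thoracic and 5 lumbar disk labels.
DISK_LABELS = (
    ["C%d/C%d" % (n, n + 1) for n in range(2, 7)] + ["C7/T1"]
    + ["T%d/T%d" % (n, n + 1) for n in range(1, 12)] + ["T12/L1"]
    + ["L%d/L%d" % (n, n + 1) for n in range(1, 5)] + ["L5/S1"]
)

def propagate_labels(num_candidates, init_candidate, init_label):
    """Assign labels to candidates ordered top to bottom, starting at the initialization."""
    start = DISK_LABELS.index(init_label)
    labels = {init_candidate: init_label}
    # Downward: stop at the bottom of the volume or once L5/S1 has been labeled.
    c, l = init_candidate + 1, start + 1
    while c < num_candidates and l < len(DISK_LABELS):
        labels[c] = DISK_LABELS[l]
        c, l = c + 1, l + 1
    # Upward: analogous, stopping at the top of the volume or at the topmost label.
    c, l = init_candidate - 1, start - 1
    while c >= 0 and l >= 0:
        labels[c] = DISK_LABELS[l]
        c, l = c - 1, l - 1
    return [labels[k] for k in sorted(labels)]

# 14 disk candidates with T12/L1 assumed to be the twelfth from the top.
result = propagate_labels(14, 11, "T12/L1")
```

In the actual method, each step of these loops would additionally refine the disk position by matching the corresponding three-disk model.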


The total labeling after upward/downward propagation can be assessed by the geometry components of the optimal matches (see eqn. (3)) of models $\{\mathcal{M}_j\}$ to all detected disks {j} in the input dataset, excluding models optimized due to a dummy case:

$$\mathcal{Q} = -\frac{1}{\left|\{\mathcal{G}_{\tau_j^*}\}\right|} \sum_j \left\| \mathcal{G}_{\tau_j^*} - \bar{\mathcal{G}}_j \right\| \qquad (4)$$








2.6. Finalization


Assessing Multiple Initializations: Labeling results from possibly multiple initializations (section 2.3.2) are compared by their total labeling qualities (see eqn. (4)). The labeling with the maximal total labeling quality $\mathcal{Q}$ becomes the final one.
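The comparison of labeling runs can be sketched in Python as a negative mean distance between matched and model geometry, following eqn. (4). The data layout (dictionaries keyed by disk index j, with dummy matches already removed by the caller) and all numeric values are hypothetical.

```python
from math import dist

def total_quality(matched_geometry, model_geometry):
    """Eq. (4): negative mean Euclidean distance between matched and model geometry."""
    if not matched_geometry:
        raise ValueError("no non-dummy matches to assess")
    distances = [dist(g, model_geometry[j]) for j, g in matched_geometry.items()]
    return -sum(distances) / len(distances)

# Hypothetical mean geometry vectors of three models.
model_geometry = {11: [30.0, 31.0], 12: [32.0, 33.0], 13: [35.0, 37.0]}

# The same measured geometry interpreted under two candidate initializations.
runs = {
    "a": {11: [30.0, 31.0], 12: [32.0, 36.0]},
    "b": {12: [30.0, 31.0], 13: [32.0, 36.0]},
}
best_run = max(runs, key=lambda name: total_quality(runs[name], model_geometry))
```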


Vertebra Labels by Interpolation: The model matching framework delivers positions of intervertebral disks. Vertebral body positions and labels are obtained by linear interpolation between adjacent disks.
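A minimal sketch of this interpolation step is given below. It assumes that each vertebral body is placed at the midpoint between its two adjacent disk centers and that the vertebra name can be read from the shared part of the two disk labels; both the midpoint choice and the function name are assumptions for illustration.

```python
def vertebrae_from_disks(disks):
    """disks: list of (label, (x, y, z)) ordered along the spine."""
    vertebrae = []
    for (upper_label, upper_pos), (lower_label, lower_pos) in zip(disks, disks[1:]):
        # The vertebra between disks "A/B" and "B/C" is B.
        name = upper_label.split("/")[1]
        if name != lower_label.split("/")[0]:
            raise ValueError("disks are not adjacent: %s, %s"
                             % (upper_label, lower_label))
        midpoint = tuple((u + l) / 2 for u, l in zip(upper_pos, lower_pos))
        vertebrae.append((name, midpoint))
    return vertebrae

# Hypothetical disk centers in mm.
disks = [
    ("T12/L1", (0.0, 0.0, 100.0)),
    ("L1/L2", (0.0, 2.0, 70.0)),
    ("L2/L3", (0.0, 4.0, 38.0)),
]
vertebrae = vertebrae_from_disks(disks)
```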


2.7. Results


By applying the method set forth above to the example given in FIG. 5, the intervertebral disk T12/L1 in the spine image is determined as an initialization disk, i.e. an initial segment, of the spine and is labeled accordingly with “T12/L1” (see FIG. 5d).


Subsequently, as shown in FIG. 5e, model matching starts with a model of a spine segment considering properties of three intervertebral disks and two vertebrae around the initialization disk T12/L1, corresponding to the middle disk 25 shown in the left part of FIG. 4.


After repetitive propagation of model matching to further segments of the spine, intervertebral disks from L2/L3 to T1/T2 were detected and accordingly labeled in the image as shown in FIG. 5f.



FIG. 6 shows an example of a full-spine image dataset which has been correctly labeled by a method according to a preferred embodiment of the present invention. As apparent from the figure, the vertebrae represented in the image are annotated with respective spine labels Cn, Tn and Ln from cervical vertebra C3 to lumbar vertebra L5.



FIG. 7 shows further examples of labeled image datasets of parts of a spine. The segment of the spine represented in the left image features collapsed vertebrae and herniated disks; despite these unfavorable anatomical conditions, the respective vertebrae are correctly labeled from cervical vertebra C7 to lumbar vertebra L2 by preferred embodiments of the invention. The same applies to the middle image featuring an extremely scoliotic spine segment, where the vertebrae are correctly labeled from thoracic vertebra T12 to lumbar vertebra L5. The right image shows a correctly labeled cervical image dataset, labeled from C3 to T3.


3. Conclusion


In summary, by the method, apparatus and system disclosed herein, both full and partial CT scans of a spine are labeled reliably and in a clinically reasonable time. With a recall of 95.5%, the algorithm set forth above automatically labels a broad spectrum of input volumes including full spinal columns, partial scans at different regions (cervical, thorax, lumbar), data with pathologies (e.g., scoliosis, osteoporosis, disk collapses), as well as data acquired by scanners from different vendors (like GE, Philips or Siemens) at a variety of spatial resolutions. An exemplary 512×512×5966 dataset labeled in 5.7 minutes shows that the method scales very well.


To cope with all this variance in the input data, a framework was introduced based on the following ideas: First, fast feature detection of target structures, mainly intervertebral disks and spinal canal, is refined by three-disk models. Second, a correct labeling is assured by learned structures to identify the initial disk at one of C7/T1, T12/L1, and L5/S1.


Preferably, the framework set forth above can be extended by disk orientation estimation. This can reliably be derived from the canal spline tangent. In fact, the canal features and spline fitting of our framework are robust so that it is also possible to investigate the Frenet frame (i.e., curvature and torsion) of the canal spline to quantify spine abnormalities.


Moreover, the algorithm set forth above assumes a standard atlas of the spinal column with 24 vertebrae, though anomalies with one vertebra more or fewer exist. Such cases can be resolved by looking at the number of disk candidates relative to the reference structures (ribs, sacral foramina) in the disk initialization step. Furthermore, it is possible to train the fully automatic spine labeling framework on MR data and to extend the training data by more examples which cover a higher degree of anomalies and deviations in morphometry, e.g., spine scans from children.

Claims
  • 1. A method for labeling one or more portions of a spine in an image of a human or animal body, the method comprising the steps of: a) matching a model of a spine segment with segments of the spine in the image by: starting matching the model of the spine segment with an initial segment of the spine in the image, wherein the initial segment of the spine in the image is located at an initial position along the spine in the image; and continuing matching the model of the spine segment with one or more further segments of the spine in the image, wherein the one or more further segments of the spine in the image are located at farther positions along the spine in the image, and the model of the spine segment relates to anatomical properties of one or more portions of the spine; and b) labeling the one or more portions of the spine in the image in response to step a); wherein an initial position of an initialization disk of the spine in the image is established by a disk profile corresponding to a string of region classes to which a set of disk candidates is mapped by classifying each disk candidate of the set of disk candidates to a region class or a region transition uncertainty; and the disk profile is matched to a full spine profile and multiple initialization disk candidates, which result from the region class or the region transition uncertainty in the disk profile, are resolved by repeating the labeling step.
  • 2. The method according to claim 1, wherein the farther positions along the spine correspond to positions propagating from the initial position along the spine.
  • 3. The method according to claim 1, wherein the one or more portions of the spine in the image correspond to one or more vertebrae and/or intervertebral discs of the spine in the image.
  • 4. The method according to claim 1, wherein the model of the spine segment relates to anatomical properties of two to five vertebrae and/or intervertebral discs of the spine.
  • 5. The method according to claim 4, wherein the model of the spine segment relates to anatomical properties of three intervertebral discs of the spine.
  • 6. The method according to claim 4, wherein the model of the spine segment relates to anatomical properties of two vertebrae of the spine.
  • 7. The method according to claim 5, wherein the three intervertebral discs of the spine are associated with two vertebrae of the spine.
  • 8. The method according to claim 6, wherein three intervertebral discs of the spine are associated with the two vertebrae of the spine.
  • 9. The method according to claim 1, wherein the initial position of the initial segment of the spine in the image is established by considering anatomical knowledge about spines.
  • 10. The method according to claim 9, wherein the initial position of the initial segment of the spine in the image is established by detecting at least one anatomical landmark of the spine in the image.
  • 11. The method according to claim 10, wherein the at least one anatomical landmark relates to one of a vertebra at a first rib, a vertebra at a last rib, and/or a sacral foramina.
  • 12. The method according to claim 1, wherein the initial position of the initial segment of the spine in the image is established by considering Markov Random Field matching qualities.
  • 13. An apparatus for labeling one or more portions of a spine in an image of a human or animal body, the apparatus comprising: an image processing unit configured or programmed to: a) match a model of a spine segment with segments of the spine in the image by: starting matching the model of the spine segment with an initial segment of the spine in the image, wherein the initial segment of the spine in the image is located at an initial position along the spine in the image; and continuing matching the model of the spine segment with one or more further segments of the spine in the image, wherein the further segments of the spine in the image are located at positions farther along the spine in the image, and the model of the spine segment relates to anatomical properties of one or more portions of the spine; and b) label the one or more portions of the spine in the image in response to step a), wherein the image processing unit is configured or programmed to establish an initial position of an initialization disk of the spine in the image by a disk profile corresponding to a string of region classes to which a set of disk candidates is mapped by classifying each disk candidate of the set of disk candidates to a region class or a region transition uncertainty; and the disk profile is matched to a full spine profile and multiple initialization disk candidates, which result from the region class or the region transition uncertainty in the disk profile, are resolved by repeating the labeling of the one or more portions of the spine in the image.
  • 14. A system for labeling one or more portions of a spine in an image of a human or animal body, the system comprising: an image acquisition unit configured or programmed to acquire at least one image of at least a portion of a human or animal body; and the apparatus according to claim 13.
  • 15. The system according to claim 14, wherein the image acquisition unit is a computed tomography unit.
Priority Claims (1)
Number Date Country Kind
12177656 Jul 2012 EP regional
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a 371 National Stage Application of PCT/EP2013/065457, filed Jul. 23, 2013. This application claims the benefit of U.S. Provisional Application No. 61/678,108, filed Aug. 1, 2012, which is incorporated by reference herein in its entirety. In addition, this application claims the benefit of European Application No. 12177656.1, filed Jul. 24, 2012, which is also incorporated by reference herein in its entirety.

PCT Information
Filing Document Filing Date Country Kind
PCT/EP2013/065457 7/23/2013 WO 00
Publishing Document Publishing Date Country Kind
WO2014/016268 1/30/2014 WO A
US Referenced Citations (1)
Number Name Date Kind
20110064291 Kelm et al. Mar 2011 A1
Non-Patent Literature Citations (4)
Entry
Roberts et al., “Automatic Segmentation of Lumbar Vertebrae on Digitised Radiographs Using Linked Active Appearance Models,” Proceedings of Medical Image Understanding and Analysis, vol. 1, Jan. 1, 2006, pp. 120-124.
Official Communication issued in International Patent Application No. PCT/EP2013/065457, mailed on Oct. 8, 2013.
Roberts, “Automatic Detection and Classification of Vertebral Fracture Using Statistical Models of Appearance,” http://personalpages.manchester.ac.uk/staff/martin.roberts/mrthesis.pdf, downloaded Jan. 1, 2008, pp. 1-205.
Major, “Markov Random Field Based Structure Localisation of Vertebrae for 3D-Segmentation of the Spine in CT Volume Data,” http://www.cg.tuwien.ac.at/researc/publications/2010/major-2010-mrf/major-2010-mrf-paper.pdf, downloaded May 11, 2010, 102 pages.
Related Publications (1)
Number Date Country
20150173701 A1 Jun 2015 US
Provisional Applications (1)
Number Date Country
61678108 Aug 2012 US