This application claims the benefit of priority from European Patent Application No. 22182378.4, filed on Jun. 30, 2022, the contents of which are incorporated by reference.
The present framework relates to identifying a type of organ in a volumetric medical image.
Segmentation is one of the core problems in medical imaging. It has been used for identifying organ boundaries, displaying visualizations or calculating volumes. One prominent use case is identifying the organ at the location of a finding. However, full segmentation of all organs for this purpose is computationally intensive, since 3D images can contain billions of voxels. Also, identifying the organ at a location of interest does not require full organ segmentation in most cases.
Landmarking is another methodology to identify organ location information. However, landmarks do not provide inside/outside information for a selected location due to their coarse level of representation. Recent studies on fine granular organ segmentation have reduced the computational time burden to a few seconds. Unfortunately, these segmentation approaches still require additional hardware to run, which is not practical in many settings.
It is possible to precompute the segmentation masks before the radiology reading session. This process adds complexity to the database design, authentication and communication protocols. See Zhang, Fan, Yu Wang, and Hua Yang, "Efficient Context-Aware Network for Abdominal Multi-organ Segmentation," arXiv preprint arXiv:2109.10601 (2021), and Yan, Zhennan, et al., "Bodypart recognition using multi-stage deep learning," International Conference on Information Processing in Medical Imaging, Springer, Cham, 2015, which are herein incorporated by reference.
A framework for identifying a type of organ in a volumetric medical image is provided. The framework may include receiving a volumetric medical image, the volumetric medical image comprising at least one organ or portion thereof, and further receiving a single point of interest within the volumetric medical image. Voxels are sampled from the volumetric medical image, wherein at least one voxel is skipped between two sampled voxels. The type of organ is identified at the single point of interest by applying a trained classifier to the sampled voxels.
A more complete appreciation of the present disclosure and many of the attendant aspects thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings.
The present framework enables improved identification of organs. According to a first aspect, a computer-implemented method for identifying a type of organ in a volumetric medical image is provided. The method comprises:
Advantageously, at least one voxel is skipped between two sampled voxels. This has the effect that the amount of data which needs to be processed by the trained classifier to identify the type of organ is reduced, thus reducing computation time and computation resources. Even though the amount of data processed by the trained classifier is reduced, the inventors found that the type of organ can still be identified reliably using the present approach. One reason for this is that the sampled voxels (also referred to as a “descriptor” herein) correspond to a larger field of view, thus also considering neighborhood information, compared to the case where every voxel of a smaller sub volume is sampled.
An organ is to be understood as a collection of tissue joined in a structural unit to serve a common function. The organ may be a human organ. The organ may be any one of the following, for example: intestines, skeleton, kidneys, gall bladder, liver, muscles, arteries, heart, larynx, pharynx, brain, lymph nodes, lungs, spleen, bone marrow, stomach, veins, pancreas, and bladder.
The volumetric medical image may be captured by and received from a medical imaging unit, the medical imaging unit may include, for example, but not limited to, a magnetic resonance imaging device, a computed tomography device, an X-ray imaging device, an ultrasound imaging device, etc. The volumetric medical image may be three-dimensional (3D) and/or related to a volume. The volumetric medical image may be made up of a number of slices, i.e., 2D (two-dimensional) medical images. The 2D medical images may be captured by and received from the medical imaging unit mentioned above. The 2D medical images may then be assembled to form the volumetric medical image.
Presently, a voxel represents a value in three-dimensional space, whereas a pixel represents a value in two-dimensional space. The pixels or voxels may or may not have their position, i.e., their coordinates, explicitly encoded with their values. Instead, the position of a pixel or voxel may be inferred from its position relative to other pixels or voxels, i.e., from its position in the data structure that makes up a single 2D or 3D (volumetric) image. The voxels may be arranged on a 3D grid, the pixels on a 2D grid. The 2D medical image may, for example, be in the form of an array of pixels. The volumetric medical image may comprise an array of voxels. The pixels of a number of 2D medical images making up a volumetric medical image are also presently referred to as voxels. The pixels or voxels may be representative of intensity, absorption or other parameters as a function of a three-dimensional position, and may, for example, be obtained by a suitable processing of measurement signals obtained by one or more of the above-mentioned medical imaging units.
The single point of interest within the volumetric medical image may correspond to one (exactly one) pixel or voxel or the position of said pixel or voxel in a three-dimensional space. The method steps a) to d) may be repeated for different single points of interest, for example to identify different organs within the volumetric medical image.
Sampling of the voxels may be done by reading from a data file, a database, a (e.g., temporary) memory, or an array comprising the voxels. Sampling of the voxels can be done sequentially or in parallel (for example, when multiple voxels are read at the same time). At least one voxel is skipped between two sampled voxels. That is to say, when looking at all the voxels of the volumetric medical image in their three-dimensional relationship, at least one voxel between two sampled voxels is not sampled. For example, the volumetric medical image may comprise a first, second and third voxel arranged in the same row or column. In this case, only the first and third voxels are sampled; the second voxel is not. The first voxel may be sampled before the third voxel, or the two voxels may be sampled in parallel. The sampled voxels may be saved in memory.
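The skipping of voxels can be illustrated with strided array indexing; the following sketch assumes an illustrative 16×16×16 NumPy array as the volumetric image (the shape and stride are not values from the description):

```python
import numpy as np

# Illustrative volumetric image: a 16x16x16 array of intensity values.
volume = np.arange(16 ** 3, dtype=np.float32).reshape(16, 16, 16)

# Sample with a stride of 2 along each axis, so that one voxel is
# skipped between any two sampled voxels in a row or column.
sampled = volume[::2, ::2, ::2]

print(volume.size)   # 4096 voxels in total
print(sampled.size)  # 512 voxels sampled, i.e., 12.5%
```

Here the middle voxel of any three consecutive voxels in a row or column is skipped; the sampled voxels could then be saved in memory as the descriptor.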
A trained classifier is applied to the sampled voxels. In particular, the sampled voxels are read from memory and then processed by a trained classifier. The trained classifier is, for example, a trained neural network.
In one implementation, a robot, (e.g., CT or MR) scanner or other device or machine is controlled depending on the identified type of organ (or organ specific abnormality as discussed below). The robot may be configured for operating on a patient's body, for example. In particular, a robot (e.g., an operating instrument thereof such as a scalpel) or scanner movement may be controlled depending on the identified organ.
According to an implementation, the voxels are sampled in a sparse and/or random manner. "Sparse" is to be understood as meaning that, having regard to the total number of voxels making up the volumetric medical image, only a few voxels are sampled. In particular, "sparse" is to say that less than 50% or less than 20% or even less than 10% of the total number of voxels of the volumetric medical image are sampled.
“Random” is to say that the sampled voxels do not follow a regular pattern (except that at least one voxel is skipped between two sampled voxels). Still, the (random) pattern used may be the same for different single points of interest. In some implementations, a random number generator or pseudorandom number generator may be used to select the voxels sampled.
In an implementation, the voxels are sampled with a sampling rate per unit length, area or volume which decreases with a distance of the respective voxel from the single point of interest. Thereby, a field of view is obtained, which still focuses on the single point of interest, but at the same time also takes into account information a distance away from the point of interest.
Step c) can be done using a sampling model. The sampling model contains the information about the location of the voxels in the volumetric medical image which are to be sampled, thus providing the descriptor. The sampling model can be or make use of an algorithm, for example.
According to an implementation, the sampling rate decreases at a nonlinear rate, in particular at the rate of an exponential, logarithmic or power function. The inventors found that using a sampling rate as described reduces computation time significantly, while, at the same time, identifying the type of organ reliably. According to a further implementation, the sampled voxels are less than 1%, preferably less than 0.1%, and more preferably less than 0.01% of the total number of voxels in the volumetric medical image.
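A sparse, random sampling pattern whose rate decays exponentially with the distance from the single point of interest might be sketched as follows; the image shape, the decay constant and the seed are illustrative assumptions, not values from the description:

```python
import numpy as np

rng = np.random.default_rng(0)  # pseudorandom number generator, as in the text

shape = (64, 64, 64)
poi = np.array([32, 32, 32])  # single point of interest (voxel coordinates)

# Distance of every voxel from the single point of interest.
grid = np.indices(shape).reshape(3, -1).T
dist = np.linalg.norm(grid - poi, axis=1)

# The sampling probability decays exponentially with distance
# (illustrative decay constant of 10 voxels), so voxels close to the
# point of interest are sampled at a higher rate per unit volume.
prob = np.exp(-dist / 10.0)
keep = rng.random(len(prob)) < prob
descriptor_coords = grid[keep]

print(len(descriptor_coords) / len(grid))  # fraction of voxels sampled
```

The kept coordinates form the descriptor locations: sparse overall, dense near the point of interest, and random rather than following a regular pattern.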
In an implementation, the trained classifier is a neural network, in particular a multilayer perceptron, a convolutional neural network, a Siamese network or a triplet network. The difference between classification (using a trained classifier) and segmentation (the latter also usually being carried out using a neural network) is that in classification a single label (for example the type of organ) is output for a single point of interest (for example for a specific voxel), whereas in segmentation a label is determined for each voxel of the input data. Therefore, training and application for classification and segmentation is significantly different.
A simple neural network classifier could be used to classify organ labels using the descriptor at the location of interest. This neural network classifier is not a translation invariant model since the descriptor will change with the changing location.
In one example, the neural network could be a single layer logistic regression. The computation would be the softmax of a linear combination of descriptor values, formulated as:

I_organ = argmax_i softmax(W_i · x)

where I_organ is the organ label, W_i is the weight vectors of different organs and x is the descriptor.
In other examples, a multi-layer neural network may be used to model non-linear relationships. For example, this may be done by adding at least one hidden layer on the input descriptor and adding a softmax classifier layer as in the previous example.
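A minimal sketch of such a softmax classifier applied to a descriptor, assuming an illustrative descriptor length of 100 values and four candidate organ labels (the weights here are random stand-ins for a trained model):

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Illustrative setup: a descriptor of 100 sampled intensities and
# four candidate organ labels; W holds one weight vector per organ.
rng = np.random.default_rng(0)
x = rng.random(100)                # descriptor at the point of interest
W = rng.standard_normal((4, 100))  # weight vectors W_i of the organs

organs = ["lung", "liver", "kidney", "spleen"]
p = softmax(W @ x)                 # softmax of the linear combinations
print(organs[int(np.argmax(p))])   # most probable organ label
```

A hidden layer with a nonlinearity could be inserted before the softmax to model non-linear relationships, as described above.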
In a further implementation, the received volumetric medical image or a part thereof comprising the single point of interest is displayed on a graphical user interface, wherein a semantic description of the identified type of organ is generated and displayed at or adjacent to the single point of interest. In this manner, the user can quickly understand which type of organ is located at the single point of interest.
According to an implementation, the single point of interest is selected by a user. For example, the single point of interest can be selected using a graphical user interface and an input device, such as a pointer device, to interact with the graphical user interface to select the single point of interest. In another implementation, the single point of interest may be input using a keyboard, a data file or the like.
According to a further implementation, the single point of interest is selected by pausing a cursor operated by the user on the volumetric medical image or a part thereof displayed on the graphical user interface.
Since the present classification (step d)) is fast, for example taking less than 10 ms (which may be shorter than the screen refresh interval of the graphical user interface), the type of organ, e.g., its semantic description, associated with the single point of interest can be displayed whenever the cursor is paused for a short period of time. "Pausing" here means that the cursor is not moved by the operator. This allows for a quick and efficient analysis of a volumetric medical image by a user, for example a doctor.
According to a further implementation, when a user takes a measurement with respect to the volumetric medical image or part thereof, the identified type of organ is saved in the database along with the taken measurement.
Oftentimes, a user will take a measurement with respect to the volumetric medical image or a part thereof. For example, a doctor may measure the size of the organ or the size of a lesion or tumor within the organ. This measurement can be performed using a graphical user interface, for example. By having the measurement associated with a specific organ, future reference thereto is simplified and more efficient.
According to a further implementation, the method further includes receiving a number of untrained classifiers for identifying organ specific abnormalities, selecting one or more of the untrained classifiers from the number of classifiers depending on the identified organ, and training the selected one or more untrained classifiers using the volumetric medical image.
There exist classifiers (i.e., neural networks, for example) configured for identifying the type of organ in a volumetric medical image. There also exist classifiers (i.e., neural networks, for example) for identifying organ-specific abnormalities. These organ-specific abnormalities include, for example, nodules, lesions and tumors. Since these organ-specific abnormalities differ substantially with respect to their shape, texture, density etc., it is preferable to train such classifiers for each type of organ respectively. Therefore, the training data used for such organ-specific classifiers needs to be sorted so that, for example, a lung nodule classifier is trained with volumetric medical images showing lungs, a liver lesion classifier is trained with volumetric medical images showing livers, etc. The present method can be used effectively to supply the respective classifiers with suitable volumetric medical images.
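The selection of an organ-specific classifier based on the identified organ can be sketched as a simple lookup; the classifier names and the `route` helper below are hypothetical and purely for illustration:

```python
# Hypothetical registry mapping an identified organ type to the
# untrained organ-specific classifier that should receive the image
# as training data (names are illustrative, not from the source).
untrained_classifiers = {
    "lung": "lung nodule classifier",
    "liver": "liver lesion classifier",
    "kidney": "kidney tumour classifier",
}

def route(identified_organ, image):
    """Pair a volumetric image with the matching untrained classifier."""
    classifier = untrained_classifiers.get(identified_organ)
    if classifier is None:
        return None  # no organ-specific classifier registered
    return (classifier, image)

print(route("lung", "image_305"))
```

In this way, a lung nodule classifier only ever receives volumetric medical images showing lungs, a liver lesion classifier only images showing livers, and so on.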
According to a further implementation, the method further comprises performing and/or repeating steps a) to d) for each of N−1 single points of interest within the volumetric medical image, wherein N ≤ the total number of voxels of the volumetric medical image.
Sometimes it is required to determine the type of organ for many single points of interest in a volumetric medical image. To this end, the present method can be performed or repeated many times for different single points of interest. "N" is a positive integer. The difference here between "performing" and "repeating" is that performing can be done in parallel or otherwise, whereas repeating implies a sequential approach. Typically, the N points of interest will correspond to nodes on a rough grid applied to the volumetric medical image. Put differently, the N single points of interest correspond to a grid with a grid spacing exceeding the spacing of the voxels contained in the volumetric medical image. Preferably, the grid has a regular spacing. In the extreme, however, where N equals the total number of voxels, a similar result as compared to a segmentation method may be obtained. However, with the present approach, all single points of interest are evaluated independently of each other.
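Such a rough, regularly spaced grid of single points of interest may be sketched as follows; the image shape and the grid spacing are illustrative assumptions:

```python
import numpy as np

shape = (64, 64, 64)  # voxel grid of the volumetric image
spacing = 8           # grid spacing exceeding the voxel spacing

# Nodes of a rough, regularly spaced grid; each node is one single
# point of interest, classified independently of all the others.
axes = [np.arange(0, s, spacing) for s in shape]
points = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)

print(len(points))  # 8 * 8 * 8 = 512 points of interest
```

Because each point is evaluated independently, the classification of the 512 points could also be distributed over several processing units in parallel.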
Preferably, steps in the present method are performed in parallel on one or more processing units. Thereby, the processing time can be further reduced.
According to a second aspect, a computer-implemented method for training the above mentioned classifier (step d) above) is provided. The method comprises:
The received type of the at least one organ may be stored in a database prior to step a). The type of the at least one organ may be determined by a human studying the received volumetric medical images. The classifier may be modified by changing the weights in the neural network of the untrained or partially trained classifier.
According to a third aspect, a device for identifying a type of organ in a volumetric medical image is provided. The device comprises:
The respective unit, for example, the processing or the first or second receiving unit, may be implemented in hardware and/or software. If said unit is implemented in hardware, it may be embodied as a device, for example as a computer or as a processor or as a part of a system, for example a computer system. If said unit is implemented in software, it may be embodied as a computer program, as a function, as a routine, as a program code or as an executable object.
According to a fourth aspect, a system for identifying a type of organ in a volumetric medical image is provided. The system includes one or more servers, and a medical imaging unit coupled to the one or more servers. The one or more servers include instructions, which when executed cause the one or more servers to perform the method steps described above.
According to a fifth aspect, a computer program product is provided. The computer program product includes machine-readable instructions that, when executed by one or more processing units, cause the one or more processing units to perform the method steps described above. A computer program product, such as a computer program means, may be embodied as a memory card, USB stick, CD-ROM, DVD or as a file which may be downloaded from a server in a network. For example, such a file may be provided by transferring the file comprising the computer program product via a wireless communication network.
According to a sixth aspect, one or more non-transitory computer-readable media are provided. Instructions executable by a machine to perform operations are saved on the one or more non-transitory computer-readable media. The instructions are loadable into and/or executable by a machine to make the system execute the method steps or operations as described above.
The features, advantages and implementations described with respect to the first aspect equally apply to the second and following aspects, and vice versa.
“A” is to be understood as non-limiting to a single element. Rather, one or more elements may be provided, if not explicitly stated otherwise. Further, “a”, “b” etc. in steps a), step b) etc. is not defining a specific order. Rather, the steps may be interchanged as deemed fit by the skilled person.
Further possible implementations or alternative solutions of the invention also encompass combinations—that are not explicitly mentioned herein—of features described above or below with regard to the implementations. The person skilled in the art may also add individual or isolated aspects and features to the most basic form of the invention.
Hereinafter, implementations for carrying out the present invention are described in detail. The various implementations are described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purpose of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more implementations. It may be evident that such implementations may be practiced without these specific details.
The client devices 107A-N are user devices, used by users, for example, medical personnel such as a radiologist, pathologist, physician, etc. In an implementation, the user device 107A-N may be used by the user to receive volumetric or 2D medical images associated with the patient. The data can be accessed by the user via a graphical user interface of an end user web application on the user device 107A-N. In another implementation, a request may be sent to the server 101 to access the medical images associated with the patient via the network 105.
An imaging unit 108 may be connected to the server 101 through the network 105. The unit 108 may be a medical imaging unit 108 capable of acquiring a plurality of volumetric medical images. The medical imaging unit 108 may be, for example, a scanner unit such as a magnetic resonance imaging unit, computed tomography imaging unit, an X-ray fluoroscopy imaging unit, an ultrasound imaging unit, etc.
The processing unit 201, as used herein, means any type of computational circuit, such as, but not limited to, a microprocessor, microcontroller, complex instruction set computing microprocessor, reduced instruction set computing microprocessor, very long instruction word microprocessor, explicitly parallel instruction computing microprocessor, graphics processor, digital signal processor, or any other type of processing circuit. The processing unit 201 may also include embedded controllers, such as generic or programmable logic devices or arrays, application specific integrated circuits, single-chip computers, and the like.
The memory 202 may be volatile memory and non-volatile memory. The memory 202 may be coupled for communication with said processing unit 201. The processing unit 201 may execute instructions and/or code stored in the memory 202. A variety of non-transitory computer-readable storage media may be stored in and accessed from said memory 202. The memory 202 may include any suitable elements for storing data and machine-readable instructions, such as read only memory, random access memory, erasable programmable read only memory, electrically erasable programmable read only memory, a hard drive, a removable media drive for handling compact disks, digital video disks, diskettes, magnetic tape cartridges, memory cards, and the like. In the present implementation, the memory 202 comprises a module 103 stored in the form of machine-readable instructions on any of said above-mentioned storage media and may be in communication with and executed by the processing unit 201. When executed by the processing unit 201, the module 103 causes the processing unit 201 to identify a type of organ in a volumetric medical image. Method steps executed by the processing unit 201 to achieve the abovementioned functionality are elaborated upon in detail in the following figures.
The storage unit 203 may be a non-transitory storage medium which stores the medical database 102. The input unit 204 may include input means such as keypad, touch-sensitive display, camera (such as a camera receiving gesture-based inputs), a port etc. capable of providing input signal such as a mouse input signal or a camera input signal. The bus 205 acts as interconnect between the processor 201, the memory 202, the storage unit 203, the input unit 204, the output unit 206 and the network interface 104. The volumetric medical images may be read into the medical database 102 via the network interface 104 or the input unit 204, for example.
Those of ordinary skill in the art will appreciate that said hardware depicted in
A data processing system 101 in accordance with an implementation of the present disclosure may comprise an operating system employing a graphical user interface (GUI). Said operating system permits multiple display windows to be presented in the graphical user interface simultaneously with each display window providing an interface to a different application or to a different instance of the same application. A cursor in said graphical user interface may be manipulated by a user through a pointing device. The position of the cursor may be changed and/or an event such as clicking a mouse button, generated to actuate a desired response.
One of various commercial operating systems, such as a version of Microsoft Windows™, a product of Microsoft Corporation located in Redmond, Washington, may be employed if suitably modified. Said operating system is modified or created in accordance with the present disclosure as described. Disclosed implementations provide systems and methods for processing medical images.
In step 301, a volumetric medical image 305 (see
The volumetric medical image 305 as shown in
Instead of the three-dimensional array, the method explained herein may also use a number of slices (two-dimensional arrays of pixels) which, taken together, describe a (three-dimensional) volume. In fact, any other data structure may be used comprising values, such as intensities, and describing a three-dimensional space. Any such value is termed a “voxel” herein. The value may be combined with information describing its three-dimensional relationship with respect to other values, or the three-dimensional relationship can be inferred from the data structure, or any other source.
The volumetric medical image 305 comprises at least one organ 309 or a portion thereof. In the example of
In step 302, a single point of interest 310 within the volumetric medical image is received. As will be explained in more detail later, the single point of interest 310 may be selected by a user, for example, through a graphical user interface using a pointer device, such as a mouse. The single point of interest 310 may be received, for example, through the network interface 104 or the input unit 204 (see
The single point of interest 310 is a point in the volumetric medical image for which it is desired to identify the type of organ 309 corresponding to said point. Said single point of interest 310 may be described through coordinates in x, y, z and may correspond to a specific voxel in the cuboid. If, for example, a mouse is used to select the single point of interest 310, the coordinates of the mouse cursor, once the selection of the single point of interest 310 is done, for example by clicking or pausing the cursor over the image, said coordinates x, y, z are transmitted, for example, via the input unit 204 to the data processing system 101.
In step 303, voxels 306, 308 are sampled from the volumetric medical image 305. To this end, for example, the module 103 (see
To define which voxels are sampled and which are not, the module 103 may comprise a corresponding sampling model which may be an algorithm or another type of code. Alternatively, the sampling model can be defined by the user via the input unit 204 or supplied via the network interface 104.
In implementations of step 303, the voxels 306, 308 are sampled in a sparse manner. That is to say that the number of voxels sampled from the volumetric medical image 305 is less than the total number of voxels contained in the volumetric medical image. In particular, the number of sampled voxels may be less than 50%, less than 20% or less than 10% of the total number of voxels to be considered sparse. In one implementation, the sampled voxels 306, 308 are less than 1%, preferably less than 0.1% and more preferably less than 0.01% of the total number of voxels in the volumetric medical image 305.
Additionally, the voxels may be sampled in a random manner (except for at least one voxel being skipped between two sampled voxels). For example, a random number generator or pseudorandom number generator may be used to identify voxels 306, 307, 308 in the volumetric medical image 305 which are to be sampled and others which are not sampled. For example, such a random number generator or pseudorandom number generator may be part of a sampling model or may be used to provide a sampling model as described above.
In particular, the inventors found that it is beneficial if the voxels 306, 308 are sampled in step 303 with a sampling rate per unit length, area or volume which decreases with a distance 311 from the single point of interest 310. It was found that results improve even more, when the sampling rate decreases at a nonlinear rate, in particular at the rate of an exponential, logarithmic or power function.
In this regard, it is referred to
In the experiment made by the inventors, D4 was selected as 8 mm, D5 as 20 mm and D6 as 80 mm. The nodes 507, 508 considered were only those within the volume of each cube (or cuboid) minus the volume of the largest cube (or cuboid) nested inside it. For example, for cube 503, only those nodes 507, 508 were considered that lie inside the volume of the cube 503 but not in the volume of the cube 502.
The nodes 507, 508 define the sampling model 400 and thus define the voxels 306, 308 (see
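Under the assumption of regularly spaced nodes within each cube, a sampling model with the stated side lengths of 8 mm, 20 mm and 80 mm might be built as follows; the node step per cube is not given in the description and is chosen here purely for illustration, growing with the cube size so that the sampling rate decreases with distance:

```python
import numpy as np

def shell_nodes(outer, inner, step):
    """Grid nodes inside a cube of side `outer` (centred on the point
    of interest) but outside the nested cube of side `inner`."""
    half = outer / 2.0
    ax = np.arange(-half, half + 1e-9, step)
    pts = np.stack(np.meshgrid(ax, ax, ax, indexing="ij"), -1).reshape(-1, 3)
    if inner > 0:
        outside = np.abs(pts).max(axis=1) > inner / 2.0
        pts = pts[outside]
    return pts

# Nested cubes as in the experiment: sides of 8 mm, 20 mm and 80 mm.
model = np.concatenate([
    shell_nodes(8.0, 0.0, 2.0),     # innermost cube, densest nodes
    shell_nodes(20.0, 8.0, 5.0),    # middle shell, coarser nodes
    shell_nodes(80.0, 20.0, 20.0),  # outer shell, coarsest nodes
])
print(len(model))  # total number of descriptor sampling positions
```

Reading the voxel intensities at these node positions (relative to the single point of interest) would yield the descriptor passed to the classifier.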
When turning to
Initially, for comparison purposes, a segmentation may be used to find the labels (the type of organ) for each of the voxels of the volumetric medical image 305 of which
On the other hand, using the trained classifier with the descriptor (derived using the sampling model 400), finding the label (i.e., the type of organ) at the single point of interest 310 only took 10 ms as determined in experiments by the inventors. The trained classifier used in this experiment was a ResNet (Residual Neural Network) classifier. ResNet is a very deep feed-forward neural network that can have hundreds of layers.
When the inventors applied the ResNet classifier in the classical way, i.e., by sampling the total number of voxels in the volumetric medical image 305, and solving for the label (the type of organ) at the single point of interest 310, data processing took around 1 s.
In the example, it is found that the organ 309 is lung. Thus, the module 103 or another piece of software and/or hardware generates a semantic description, for example “lung”, which is then displayed at or adjacent (as in the example of
In
In particular, the method for federated learning may be implemented as shown in
Initially, untrained classifiers 1001 to 1003 are received, for example by the module 103 (step 1101). The untrained classifiers 1001 to 1003 are configured to, when trained, identify organ-specific abnormalities. For example, the lung nodule classifier 1001 is, when trained, configured to detect nodules in the lung. Similarly, the lesion classifier 1002 is, when trained, configured to detect lesions in the liver, and so on.
In step 1102, the module 103 selects from the untrained classifiers 1001 to 1003 those that correspond to the specific organ identified in step 304 (see
In step 1103, the untrained classifiers 1001 to 1003 are trained using the volumetric medical image 305 showing the corresponding organ.
For training, the organ-specific abnormality must be known for each volumetric medical image 305. Such information is, preferably, also stored in a database 102 and used during the training step 1103.
Advantageously, in the methods of
At the application stage, a trained organ-specific classifier 1001 to 1003 is selected, for example by the module 103 in a process step following step 304, based on the identified organ, and then the trained organ-specific classifier 1001 to 1003 is applied to the volumetric medical image 305 or sub image. Thus, the above-described method is not only suitable to identify the type of organ, but also to identify a specific abnormality within an organ.
In step 1201, a volumetric medical image 305 is received along with an identifier (label) identifying the type of the at least one organ 309 shown in the volumetric medical image 305. The identifier may correspond to the semantic label 702 referred to in
In step 1202, a single point of interest 310 within the volumetric medical image 305 is received. For training purposes, the single point of interest may be predetermined or randomly generated, for example.
In step 1203, voxels are sampled from the volumetric medical image 305, wherein at least one voxel 307 is skipped between two sampled voxels 306, 308.
In step 1204, the type of organ 309 at the single point of interest 310 is identified by applying an untrained classifier to the sampled voxels 306, 308.
The features, advantages and explanations given in respect of the method of
In step 1205, the identified type of organ 309 is compared with the received type of organ.
In step 1206, the classifier is modified depending on the comparison. For example, weights are adapted in the classifier to provide a trained classifier. Steps 1201 to 1206 may be repeated for different volumetric medical images 305 until the classifier has been fully trained. The steps 1201 to 1206 may be carried out using the device 101 or system 100 described in
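The training steps 1201 to 1206 can be sketched as a toy gradient-descent loop over a single-layer softmax classifier; the descriptors here are synthetic stand-ins, and all sizes and the learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sample(label):
    # Synthetic stand-in for a descriptor (steps 1202 and 1203): random
    # intensities with a stronger response on one label-specific entry.
    x = rng.random(50)
    x[label] += 1.0
    return x

n_organs = 3
W = np.zeros((n_organs, 50))  # weights of the untrained classifier
lr = 0.1

for _ in range(300):
    label = rng.integers(n_organs)  # received organ type (step 1201)
    x = sample(label)
    p = softmax(W @ x)              # identify the organ (step 1204)
    grad = p.copy()
    grad[label] -= 1.0              # compare with received type (step 1205)
    W -= lr * np.outer(grad, x)     # modify the classifier (step 1206)

# Accuracy of the (now trained) classifier on fresh synthetic descriptors.
hits = sum(int(np.argmax(W @ sample(l))) == l
           for l in rng.integers(n_organs, size=100))
print(hits / 100)
```

The gradient `p - y` is the standard cross-entropy gradient for a softmax output; repeating the loop over many images corresponds to repeating steps 1201 to 1206 until the classifier has been fully trained.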
The foregoing examples have been provided merely for the purpose of explanation and are in no way to be construed as limiting of the present invention disclosed herein. While the invention has been described with reference to various implementations, it is understood that the words, which have been used herein, are words of description and illustration, rather than words of limitation. Further, although the invention has been described herein with reference to particular means, materials, and implementations, the invention is not intended to be limited to the particulars disclosed herein, rather, the invention extends to all functionally equivalent structures, methods and uses, such as are within the scope of the appended claims. Those skilled in the art, having the benefit of the teachings of this specification, may effect numerous modifications thereto and changes may be made without departing from the scope and spirit of the invention in its aspects.
Number | Date | Country | Kind |
---|---|---|---
22182378.4 | Jun 2022 | EP | regional |