The invention concerns a method of mapping a medical data acquisition protocol to an acquisition protocol lexicon, and a protocol mapping computer that implements such a method.
The collection of settings and parameters defining a medical imaging examination is called the “exam protocol” or “acquisition protocol”. An acquisition protocol defines the actions to be performed on a patient, such as the scan modality, whether or not a contrast agent is to be used, whether a surgical instrument will be used, the number of views to be acquired, the relevant population such as pediatric, trimester, etc. Each acquisition protocol may be given a unique identifier or protocol ID. The different procedures that are performed in an institution such as a hospital or a radiological practice are usually defined internally in that institution. To set up an image acquisition procedure, it may be sufficient for the clinician or medical technical assistant to enter the protocol ID into a workstation or scanning apparatus. The acquisition protocol and/or the protocol ID, as well as other information related to the institution and the patient, can be saved along with the image data. To facilitate the exchange, comparison and interpretation of imaging results between medical personnel and institutions, the additional data is often generated and stored using the standard DICOM (Digital Imaging and Communications in Medicine) format. This standard was specifically developed to handle data related to all stages of medical imaging (image acquisition, storage, transmission, exchange, etc.), and is widely used by institutions such as hospitals, surgical practices, medical imaging service providers, etc.
The same imaging procedure may be given different names by different institutions. For instance, one institution may define an abdomen/pelvis CT exam without contrast agent as “ABD/PEL WO” while another institution may use “CT Abdomen Pelvis without Contrast” to define the same exam or imaging procedure. However, the exam quality and radiation dose depend to a great extent on the acquisition protocol that was used to set up the imaging procedure, so it is very important to be able to understand, reproduce, and compare acquisition protocols used by different institutions. This would make it necessary for all institutions to adopt a unifying protocol. An example of such a unifying protocol is given in the radiological lexicon named RadLex® (often referred to as the “RadLex® Playbook” or simply the “Playbook”), which has been compiled with the aim of providing a unified description for all possible kinds of imaging acquisition procedure, and associating each procedure with a unique identifier, its RPID (RadLex® protocol identifier). While this unifying lexicon is not an official standard, many institutions recognize the need to convert past (and future) acquisition protocols to a common lexicon such as that provided by RadLex®. However, not all operators of the various kinds of medical imaging acquisition devices are sufficiently familiar with the protocols of such a unifying lexicon. Furthermore, it is not always possible for an operator to simply “translate” the protocol of that institution into a protocol of the unifying lexicon.
In one approach to solving this problem, a software program or tool applies a set of “hand-crafted” predicates or rules to extract the relevant information from an acquisition protocol, re-formats the information in keeping with the unifying protocol of a lexicon such as the RadLex® Playbook, and maps the reformatted protocol to the lexicon in order to find the RPID that matches the acquisition protocol. However, a limitation of this approach is that it is necessary to compile a comprehensive rule set in the first place, and then to manually maintain and update this rule set. Furthermore, this approach requires a comprehensive medical ontology database as well as a search engine in order to correctly map a freely composed acquisition protocol to a corresponding lexicon protocol. A further drawback is that each time another institution or another exam protocol is added, the rule set needs to be manually updated to augment it with the new information, and the updated rule set must be provided to all users of the tool.
It is an object of the invention to provide an improved way of assisting institutions in their endeavor to apply the acquisition protocols defined in a unifying lexicon.
According to the invention, the method of mapping an acquisition protocol to an acquisition protocol lexicon includes the steps of extracting multiple tags from the acquisition protocol, performing text pre-processing in a computer on the extracted tags, converting the pre-processed text in the computer into an input feature set for a classifier, and applying the classifier to associate the input feature set with one or more entries of the acquisition protocol lexicon. The one or more entries of the acquisition protocol lexicon, with which the classifier associates the input feature set, are presented to a user as an output from the computer so as to inform a viewer (user) of those entries in the acquisition protocol lexicon that correspond to the input feature set.
In the context of the invention, “mapping an acquisition protocol to an acquisition protocol lexicon” means identifying one or more lexicon entries that are the most likely equivalents of the input acquisition protocol. It may be assumed that the acquisition protocol is a medical imaging acquisition protocol for an intended imaging procedure, and that the acquisition protocol is informal, i.e. it is put together or composed by the user (the operator of the imaging device, usually a clinician or medical technical assistant, for example) without strict adherence to any “global” formulation constraints, since such constraints do not exist at present. The user's input may at best adhere to local formulation guidelines of that institution, but such formulation guidelines will generally differ widely among institutions, as explained above. Therefore, the acquisition protocol put together by a user may be considered to be “informal” or “freely composed”, in the sense that users at different institutions may arrive at significantly different acquisition protocols for the same intended procedure.
The inventive method can be used to associate any acquisition protocol with one or more entries in the lexicon. An advantage of the mapping method according to the invention is that it is an approach based on a machine learning pipeline. Therefore, there is no need to manually create and maintain a rule set, since any rules are learned directly from the input data fed to the classifier. This means that advantageous savings can be made in time and costs. Furthermore, when a new institution or a new protocol is added, the inventive method can easily adapt by automatically accumulating the new information and learning from it.
According to the invention, the protocol mapping computer is configured (designed or programmed) to map an acquisition protocol to an acquisition protocol lexicon and has a tag extraction processor configured to extract a number of tags from an acquisition protocol, a pre-processing processor configured to perform text pre-processing on the extracted tags, a feature extraction processor configured to convert the pre-processed text into an input feature set, and a classifier configured to associate an input feature set with one or more entries of the acquisition protocol lexicon. The one or more entries of the acquisition protocol lexicon, with which the classifier associates the input feature set, are presented to a user as an output from the computer so as to inform a viewer (user) of those entries in the acquisition protocol lexicon that correspond to the input feature set.
An advantage of the protocol mapping computer according to the invention is that relatively little effort need be expended in order to achieve a reliable and accurate tool which can provide a user with a list of relevant protocol descriptions that best match the intended imaging procedure. This assists the user in making an accurate selection from the list of lexicon entries returned by the classifier. In this way, a user at any institution can apply the guidelines of that institution to assemble an “informal” acquisition protocol, and can quickly receive a list of entries from the more “formal” lexicon, which best match that acquisition protocol. The user can then choose the most suitable entry from the list, and use this to program the device for the planned imaging procedure.
In the following, it may be assumed that a suitable unifying lexicon is the RadLex® Playbook, which is already widely used as a standard for defining imaging procedures such as ultrasound, X-ray, CT, MRI, fluoroscopy, etc. A specific imaging procedure is defined in the RadLex® Playbook by a specific identifier, called its “RPID”. For example, a procedure for performing an ultrasound of the liver has the identifier RPID5928 in the RadLex® Playbook, with the associated description “US Abdomen Limited Liver”.
The terms “protocol mapping computer”, “classification pipeline” and “machine learning pipeline” may be regarded as synonyms in the context of the invention, and these terms may therefore be used interchangeably in the following.
The local protocol for acquiring image data at a certain institution may differ significantly from an equivalent protocol of a unifying lexicon. As explained above, it is necessary to identify that imaging procedure using its specific RPID if the results of the imaging procedure are to be viewed at a different institution, for example. The inventive method provides a reliable and quick way of obtaining the most likely RPID for an informally composed local protocol.
The tag extraction processor is preferably configured to identify tags that correspond to parameters defined in the DICOM standard. Preferably, the extracted tags have at least a “body region” tag, a “local protocol name” tag, an “institution” tag and a “modality” tag. The “modality” tag defines the imaging modality, for example CT (computed tomography), FL (fluoroscopy), US (ultrasound), etc. The “institution” tag is a unique identifier, so that each institution can be defined by a unique number or customer number. For example, allocation of the institution tag can be done by a counter that is incremented for each new institution that becomes a customer of the inventive mapping service. The “body region” tag defines the part of the body to be imaged, for example “chest”, “head” etc. The “local protocol name” tag is the text used by an institution to define a certain imaging procedure.
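The tag extraction and the counter-based allocation of institution identifiers described above can be sketched as follows. This is a minimal illustration only, assuming that the header has already been read into a dictionary keyed by the DICOM attribute keywords Modality, InstitutionName, BodyPartExamined and ProtocolName. The choice of these four attributes as the source of the four tags, and the names ExtractedTags, InstitutionRegistry and extract_tags, are assumptions made for the example and are not taken from the description.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class ExtractedTags:
    modality: str        # imaging modality, e.g. "CT" or "US"
    institution: int     # numeric identifier allocated by the registry
    body_region: str     # free text, e.g. "ABDOMEN"
    protocol_name: str   # free-text local protocol name


class InstitutionRegistry:
    """Allocates the next counter value to each institution seen for the first time."""

    def __init__(self):
        self._ids = {}
        self._counter = count(1)

    def id_for(self, name: str) -> int:
        if name not in self._ids:
            self._ids[name] = next(self._counter)
        return self._ids[name]


def extract_tags(header: dict, registry: InstitutionRegistry) -> ExtractedTags:
    """Pick the four tags out of a header given as a keyword -> value dict (assumed layout)."""
    return ExtractedTags(
        modality=header.get("Modality", ""),
        institution=registry.id_for(header.get("InstitutionName", "")),
        body_region=header.get("BodyPartExamined", ""),
        protocol_name=header.get("ProtocolName", ""),
    )


registry = InstitutionRegistry()
tags = extract_tags(
    {"Modality": "CT", "InstitutionName": "Hospital A",
     "BodyPartExamined": "ABDOMEN", "ProtocolName": "ABD/PEL WO"},
    registry,
)
```

A registry of this kind returns the same number whenever a known institution reappears, so the institution entry stays stable across uploads.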
The extracted body region tag and protocol name tag are subject to lexical thinning in the text pre-processing step, for example to remove non-alphanumeric “special” characters, to discard any one-character or two-character terms, to convert all letters to lower-case, etc. After this step of lexical thinning, the feature extraction processor converts the remaining text into an input feature set. This will include an entry for the modality (e.g. “CT”), an entry for the institution (e.g. “4”), and lexically thinned entries for the body region and protocol name. In a preferred embodiment of the invention, in addition to the modality and institution entries, the input feature vector comprises a sparse signature compiled using a bag-of-words technique. In the bag-of-words technique, an algorithm reviews all words in a training set of local protocol names and body regions to create a dictionary or “bag of words”. Using this dictionary, it is then possible to describe the “body region” word(s) or “protocol name” word(s) by the number of times each word of the dictionary or “bag of words” appears in them. The contribution of each word or term in the dictionary can be weighted according to term frequency (TF), inverse document frequency (IDF) or a combination of both. A sparse signature for the body region and/or protocol name can then be created with this information.
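The lexical thinning and the bag-of-words signature described above can be sketched as follows. This is one possible reading only: the function names thin, build_idf and sparse_signature are hypothetical, the smoothed IDF formula is an assumed choice among several common variants, and the three training names are illustrative rather than real training data.

```python
import math
import re
from collections import Counter


def thin(text: str) -> list[str]:
    """Lower-case, replace non-alphanumerics by spaces, drop one- and two-character terms."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return [tok for tok in cleaned.split() if len(tok) > 2]


def build_idf(documents: list[list[str]]) -> dict[str, float]:
    """Build the dictionary and a smoothed IDF weight per term (assumed formula)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def sparse_signature(tokens: list[str], idf: dict[str, float]) -> dict[str, float]:
    """TF-IDF weights for dictionary terms only; terms with zero count are omitted."""
    counts = Counter(tok for tok in tokens if tok in idf)
    return {term: tf * idf[term] for term, tf in counts.items()}


# Illustrative training set of local protocol names.
training_names = [
    "CT Abdomen Pelvis without Contrast",
    "CT Chest with Contrast",
    "ABD/PEL WO",
]
idf = build_idf([thin(name) for name in training_names])
signature = sparse_signature(thin("Abdomen/Pelvis W/O contrast"), idf)
```

Because two-character terms are discarded, a modality abbreviation such as “CT” does not survive thinning, which is consistent with the modality being carried as a separate entry of the feature set. The signature is stored as a term-to-weight mapping, so only the few dictionary terms that actually occur take up space.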
This feature vector or feature set is then fed to the classifier or “predictive model”, which applies a suitable classification algorithm to associate or map that feature set to one or more entries of the lexicon. In a particularly preferred embodiment of the invention, the classifier applies a random forest algorithm to associate the input feature vector with one or more entries of the acquisition protocol lexicon. Alternatively, the classifier might use a support vector machine (SVM) or a neural network to associate an input feature vector with one or more entries of the acquisition protocol lexicon. Preferably, the inventive method includes an initial step of training the classifier using any appropriate machine learning algorithm that is able to learn the parameters of the classifier's predictive model using a suitable dataset or input.
The inventive protocol mapping computer is suited for implementation in the cloud, i.e. it can be realized in a cloud computing platform. Certain processors of the protocol mapping computer, such as the tag extraction processor, can be implemented in a web-based application. This can interface with a user via an internet browser, for example, so that the user can enter information and view results in a browser window. Other processors of the protocol mapping computer, such as the pre-processing processor, the feature extraction processor and the classifier, can be implemented in a web-based service. The web-based service can be realized to communicate with multiple web applications and/or multiple instances of the same web application. To this end, the tag extraction processor of a web application is preferably realized to convert the data to be uploaded to the web-based service into a suitable format such as JSON (JavaScript Object Notation). Similarly, the web application is preferably realized to convert the classifier results from such a format in order to present the mapping results to the user, for example as a table of entries and their probabilities. A web-based application for the inventive protocol mapping computer can be adapted to receive acquisition protocols originating from a single institution such as a hospital or a radiology practice. In a preferred embodiment of the invention, a web-based application for the inventive protocol mapping computer is adapted to receive acquisition protocols from a plurality of institutions and/or for a plurality of modalities. Equally, a web-based application for the inventive protocol mapping computer can be adapted to receive acquisition protocols relating to a specific modality, for example only protocols relating to computed tomography.
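The exchange between web application and web-based service can be sketched as follows. The JSON field names (modality, institution, body_region, protocol_name, entries, id, probability, description) and the functions build_request and format_results are assumptions made for this example; the description does not specify a payload layout. The probabilities and the entry ENTRY_B are invented for illustration, while RPID5928 and its description are the example given earlier in the text.

```python
import json


def build_request(modality: str, institution: int, body_region: str, protocol_name: str) -> str:
    """Serialize the extracted tags for upload to the mapping service (assumed layout)."""
    return json.dumps({
        "modality": modality,
        "institution": institution,
        "body_region": body_region,
        "protocol_name": protocol_name,
    })


def format_results(response_json: str) -> str:
    """Convert the service response into a plain-text table, most probable entry first."""
    entries = sorted(json.loads(response_json)["entries"],
                     key=lambda e: e["probability"], reverse=True)
    header = f"{'Entry':<12}{'Prob.':>6}  Description"
    rows = [f"{e['id']:<12}{e['probability']:>6.2f}  {e['description']}" for e in entries]
    return "\n".join([header] + rows)


request = build_request("US", 4, "abdomen", "liver limited")

# Hypothetical service response with invented probabilities.
response = json.dumps({"entries": [
    {"id": "ENTRY_B", "probability": 0.12, "description": "hypothetical entry B"},
    {"id": "RPID5928", "probability": 0.81, "description": "US Abdomen Limited Liver"},
]})
table = format_results(response)
```

Keeping serialization and presentation in the web application leaves the service free to serve several applications with the same payload format.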
In a further preferred embodiment of the invention, it is possible to use “modality” and “institution” a priori to build a modality-specific and/or institution-specific mapping pipeline, or a posteriori to refine the prediction results (for example by filtering out classes that do not contain the modality of interest). An institution-specific model could be initialized with a generic model and then refined by integrating user feedback in an online learning procedure.
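The a posteriori refinement mentioned above, i.e. filtering out classes that do not contain the modality of interest, can be sketched as follows. The function filter_by_modality, the lookup table entry_modality and the entry names are hypothetical. Renormalizing the remaining probabilities so that they sum to one is an added assumption; the description only mentions the filtering itself.

```python
def filter_by_modality(probabilities: dict, entry_modality: dict, modality: str) -> dict:
    """Drop lexicon entries of other modalities and renormalize the rest (assumption)."""
    kept = {entry: p for entry, p in probabilities.items()
            if entry_modality.get(entry) == modality}
    total = sum(kept.values())
    if total == 0:
        return {}  # no candidate entry matches the modality of interest
    return {entry: p / total for entry, p in kept.items()}


# Hypothetical classifier output and modality lookup for three lexicon entries.
probabilities = {"ENTRY_CT_1": 0.5, "ENTRY_US_1": 0.3, "ENTRY_CT_2": 0.2}
entry_modality = {"ENTRY_CT_1": "CT", "ENTRY_US_1": "US", "ENTRY_CT_2": "CT"}

refined = filter_by_modality(probabilities, entry_modality, "CT")
```

Filtering after classification leaves a single generic model in place, whereas the a priori alternative would train a separate pipeline per modality or institution.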
The steps of the inventive method can be implemented as programming instructions (program code) encoded on a computer-readable data storage medium, which carry out the method when loaded into a memory of a programmable device. For example, any method steps relating to user dialog can be implemented as computer program code running on a server hosting the web application, while the remaining method steps can be implemented as computer program code running on a server hosting the protocol mapping web service.
In the figures, like numbers refer to like objects throughout. Objects in the diagrams are not necessarily drawn to scale.
The advantage of the inventive method, as explained above, is that there is no need to create and maintain a rule set. Instead, rules are learned directly from input data that is used to train the classifier 13.
The accuracy of the method can be refined continually by learning from user feedback. For example, the user may be given the opportunity to inform the web application 1B whether or not a result was correct. This feedback can be used as shown in
Although the present invention has been disclosed in the form of preferred embodiments and variations thereon, it will be understood that numerous additional modifications and variations could be made thereto without departing from the scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
16200055 | Nov 2016 | EP | regional |