The present disclosure overcomes the bottleneck of generating an annotated dataset for machine learning models for segmenting lesions in medical images by employing computational geometric techniques to automate the generation of ground truth, thus reducing the expert's task to answering yes/no questions or selecting the best option from a set of prospective options, thereby reducing time and effort. This ground truth annotated data is then fed back to the geometric filter and the model for their further refinement and fine-tuning.
In an aspect, the present disclosure provides a method for semi-automatically generating annotated dataset for machine learning implementation for segmenting lesions in medical images. The method comprises receiving a plurality of medical images, with a set of medical images of the plurality of medical images being categorized into positive images having one or more lesions. The method further comprises utilizing a trained segmentation model to generate a first segmentation mask for each of the one or more lesions in each of the medical images in the set of medical images. The method further comprises implementing a geometric filter to validate if the generated first segmentation mask corresponds to the one or more lesions in the corresponding medical image based on geometric properties of the respective one or more lesions. The method further comprises updating the trained segmentation model using medical images with the validated first segmentation mask to obtain an updated segmentation model. The method further comprises implementing the updated segmentation model to generate a second segmentation mask for each of the one or more lesions in the medical images in the set of medical images. The method further comprises providing a user interface to allow an operator to validate last generated segmentation mask for each of the medical images in the set of medical images. The method further comprises selecting the medical images from the plurality of medical images with the validated generated segmentation mask, to generate a ground truth annotated dataset.
In one or more embodiments, the method further comprises providing a user interface to allow an operator to classify each of the received plurality of medical images into one of: single lesion image with only one lesion, multiple lesions image with at least two lesions, normal image with no lesions and unclear image, and wherein the medical images being classified as at least one of the single lesion image and the multiple lesions image are categorized as the positive images.
In one or more embodiments, in case of the medical image being classified as the multiple lesions image, the method further comprises: providing a user interface to an operator to outline a bounding polygon on each of at least two lesions in the multiple lesions image; configuring the trained segmentation model to generate a sub-segmentation mask for each of the at least two lesions in the multiple lesions image; and generating the segmentation mask for the multiple lesions image by union of the corresponding generated sub-segmentation masks.
In one or more embodiments, the step of implementing the geometric filter comprises one or more of, two or more of:
In one or more embodiments, the method further comprises: configuring the geometric filter to discard the medical image if the corresponding determined area fraction is less than a predefined threshold; and implementing a regression function, trained based on ground truth values for relative area fraction, convexity, elliptical aspect ratio and elliptical proximity for lesions in medical images, to validate if the generated segmentation mask corresponds to the one or more lesions in the corresponding medical image of the plurality of medical images based on the determined relative area fraction, the determined convexity, the determined elliptical aspect ratio, and the determined elliptical proximity for lesions in the corresponding medical image.
In one or more embodiments, the method further comprises varying one or more weighted parameters of the implemented regression function to manipulate validation of the generated segmentation mask.
In one or more embodiments, the method further comprises augmenting a number of medical images in the plurality of medical images by using one or more techniques of: rotate, width-shift, horizontal-shift, horizontal-flip, vertical-flip, zoom, brightness and shear.
In one or more embodiments, providing the user interface comprises configuring the user interface to allow the operator to confirm one of the first segmentation mask and the last generated segmentation mask by using a single action.
In another aspect, the present disclosure provides a system for semi-automatically generating annotated dataset for machine learning implementation for segmenting lesions in medical images. The system comprises a memory arrangement configured to receive a plurality of medical images corresponding to regions with possible one or more lesions, with a set of medical images of the plurality of medical images being categorized into positive images having one or more lesions. The system further comprises a processing arrangement configured to: utilize a trained segmentation model to generate a first segmentation mask for each of the one or more lesions in each of the medical images in the set of medical images; implement a geometric filter to validate if the generated first segmentation mask corresponds to the one or more lesions in the corresponding medical image based on geometric properties of the respective one or more lesions; update the trained segmentation model using medical images with the validated first segmentation mask to obtain an updated segmentation model; implement the updated segmentation model to generate a second segmentation mask for each of the one or more lesions in the medical images in the set of medical images; provide a user interface to allow an operator to validate last generated segmentation mask for each of the medical images in the set of medical images; and select the medical images from the plurality of medical images with the validated generated segmentation mask, to generate a ground truth annotated dataset.
The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
For a more complete understanding of example embodiments of the present disclosure, reference is now made to the following descriptions taken in connection with the accompanying drawings in which:
The Description is organized as follows.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure is not limited to these specific details.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
Furthermore, in the following detailed description of the present disclosure, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be understood that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present disclosure.
Embodiments described herein may be discussed in the general context of computer-executable instructions residing on some form of computer-readable storage medium, such as program modules, executed by one or more computers or other devices. By way of example, and not limitation, computer-readable storage media may comprise non-transitory computer-readable storage media and communication media; non-transitory computer-readable media include all computer-readable media except for a transitory, propagating signal. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or distributed as desired in various embodiments.
Some portions of the detailed description that follows are presented and discussed in terms of a process or method. Although steps and sequencing thereof are disclosed in figures herein describing the operations of this method, such steps and sequencing are exemplary. Embodiments are well suited to performing various other steps or variations of the steps recited in the flowchart of the figure herein, and in a sequence other than that depicted and described herein. Some portions of the detailed descriptions that follow are presented in terms of procedures, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. In the present application, a procedure, logic block, process, or the like, is conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those utilizing physical manipulations of physical quantities. Usually, although not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as transactions, bits, values, elements, symbols, characters, samples, pixels, or the like.
In some implementations, any suitable computer usable or computer readable medium (or media) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer-usable, or computer-readable, storage medium (including a storage device associated with a computing device) may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fibre, a portable compact disc read-only memory (CD-ROM), an optical storage device, a digital versatile disk (DVD), a static random access memory (SRAM), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, a media such as those supporting the internet or an intranet, or a magnetic storage device. Note that the computer-usable or computer-readable medium could even be a suitable medium upon which the program is stored, scanned, compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. In the context of the present disclosure, a computer-usable or computer-readable, storage medium may be any tangible medium that can contain or store a program for use by or in connection with the instruction execution system, apparatus, or device.
In some implementations, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. In some implementations, such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. In some implementations, the computer readable program code may be transmitted using any appropriate medium, including but not limited to the internet, wireline, optical fibre cable, RF, etc. In some implementations, a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
In some implementations, computer program code for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java®, Smalltalk, C++ or the like. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. However, the computer program code for carrying out operations of the present disclosure may also be written in conventional procedural programming languages, such as the “C” programming language, PASCAL, or similar programming languages, as well as in scripting languages such as JavaScript, PERL, or Python. In present implementations, the language or toolchain used for training may be one of Python, TensorFlow™, Bazel, C, and C++. Further, a decoder in the user device (as will be discussed) may use C, C++ or any processor specific ISA. Furthermore, assembly code inside C/C++ may be utilized for specific operations. Also, the ASR (automatic speech recognition) and G2P decoder along with the entire user system can be run in embedded Linux (any distribution), Android, iOS, Windows, or the like, without any limitations. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the internet using an Internet Service Provider).
In some implementations, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGAs) or other hardware accelerators, micro-controller units (MCUs), or programmable logic arrays (PLAs) may execute the computer readable program instructions/code by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
In some implementations, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus (systems), methods and computer program products according to various implementations of the present disclosure. Each block in the flowchart and/or block diagrams, and combinations of blocks in the flowchart and/or block diagrams, may represent a module, segment, or portion of code, which comprises one or more executable computer program instructions for implementing the specified logical function(s)/act(s). These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the computer program instructions, which may execute via the processor of the computer or other programmable data processing apparatus, create the ability to implement one or more of the functions/acts specified in the flowchart and/or block diagram block or blocks or combinations thereof. It should be noted that, in some implementations, the functions noted in the block(s) may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
In some implementations, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks or combinations thereof.
In some implementations, the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed (not necessarily in a particular order) on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts (not necessarily in a particular order) specified in the flowchart and/or block diagram block or blocks or combinations thereof.
Referring to
As discussed, the present disclosure pertains to medical diagnosis, and in particular to aiding the process of detecting lesions in medical images. A lesion may be an area of abnormal growth in an organ which has suffered damage through disease, such as polyps, tumours, melanin, and other abnormal growths. In particular, the present disclosure caters to lesions or abnormalities with well-defined geometry, such as, but not limited to, polyps. One form of lesion is a polyp, an abnormal biological mass that projects from a mucous membrane. Generally, a polyp may be in the form of a mass of tissue that bulges or projects outward or upward from the normal surface level, thereby being macroscopically visible as a hemispheroidal, spheroidal, or regular mound-like structure growing from a relatively broad base or a slender stalk. Polyps may be found in a number of tissues, including but not limited to colon, stomach, nose, ear, sinus(es), urinary bladder, and uterus. Examples include colon, rectal and nasal polyps. If a polyp is attached to the surface by a long and narrow handle, then it is called a pedunculated polyp; and if there is no handle, then it is called a sessile polyp.
Several machine learning approaches have been proposed for identifying and localizing polyps in endoscopy images. However, these models' ability to perform is constrained by the quality and the variety within the data. While the model can be improved by training it using segmented images from varying sources, this task is time consuming, requiring several minutes even for a single image. This difficulty is corroborated by the fact that crowdsourcing platforms are very expensive for such tasks compared to performing simpler annotations. Thus, given both the cognitive load of this task, as well as the time constraints imposed on the medical experts, it becomes impractical to generate usable training data for each such data source.
A semi-automated approach may be capable of generating high-quality segmentation maps required for training the necessary models while also reducing the burden on the experts. The key idea is to make use of the typical shapes of polyps in conjunction with existing models that are trained on publicly available data to first generate a candidate set of annotations. The final annotated images are then chosen by the experts from among this candidate set using a visual interface that involves answering a simple multiple-choice questionnaire or the like.
In
The received medical images may be filtered and cleaned, such that the positive images with polyps are separated from the negative images without polyps. For this purpose, a tool is provided (as seen in
In some embodiments, in case of the medical image being classified as the multiple lesions image, the method 100 further comprises labelling each of the multiple lesions in the corresponding medical image.
Referring back to
As discussed, in case of the medical image being classified as the multiple lesions image, a bounding polygon is outlined on each of the at least two lesions in the multiple lesions image. In such case, the method 100 comprises configuring the trained segmentation model (i.e. trained model as described above) to generate a sub-segmentation mask for each of the at least two lesions in the multiple lesions image. It may be appreciated that each of the lesions, being identified using the respective bounding polygon, may be independently processed by the trained model for such purpose. The method 100 further comprises generating the segmentation mask for the multiple lesions image by union of the corresponding generated sub-segmentation masks. That is, the final segmentation mask (first segmentation mask) is obtained by combining the independently generated sub-segmentation masks for each of the at least two lesions in the multiple lesions image.
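By way of illustration only, the union of sub-segmentation masks described above may be sketched as follows. The representation of a mask as rows of 0/1 values and the function name `union_masks` are assumptions made for this sketch and are not prescribed by the present disclosure.

```python
from typing import List

# Assumed representation: a binary mask is a list of rows holding 0 or 1.
Mask = List[List[int]]


def union_masks(sub_masks: List[Mask]) -> Mask:
    """Combine per-lesion sub-masks into one mask by pixel-wise union."""
    if not sub_masks:
        raise ValueError("at least one sub-segmentation mask is required")
    height, width = len(sub_masks[0]), len(sub_masks[0][0])
    combined = [[0] * width for _ in range(height)]
    for mask in sub_masks:
        if len(mask) != height or any(len(row) != width for row in mask):
            raise ValueError("all sub-masks must share the image dimensions")
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    combined[r][c] = 1  # a pixel set in any sub-mask is set in the union
    return combined
```

For example, two sub-masks that each mark a different lesion yield a single mask marking both lesions.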
Referring again to
The geometric filter may validate the medical images (as received) whose corresponding segmentation mask conforms to typical geometric properties of lesions (polyps). In the step of implementing the geometric filter, the method 100 comprises determining an area fraction. Further, the geometric filter may include determining one or more of: a relative area fraction, a convexity, an elliptical aspect ratio and an elliptical proximity. While a polyp does not contain holes, a component produced by the segmentation model can potentially have holes, in which case the current algorithm closes (or fills) the holes before computing the measures as described herein and further in the following paragraphs.
The area fraction is based on a ratio of an area covered by each of the one or more lesions to a total area covered by all of the one or more lesions for each of the plurality of medical images. This measure is computed as the fraction of the area covered by the component with respect to the total area covered by all the components. Noise in the segmentation typically results in very small components, and this measure is used to filter out such spurious components (as discussed later in more detail).
The relative area fraction is based on a ratio of an area covered by each of the one or more lesions to the area covered by the largest one of the one or more lesions for each of the plurality of medical images. This measure is computed as the fraction of the area covered by the component with respect to the area of the largest component. This measure is primarily used to handle cases wherein an image has multiple polyps. In such cases, the area fraction measure computed above will be significantly smaller than in the single polyp case, and will not be an effective measure to characterize polyps.
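The two area-based measures described above may be illustrated with the following minimal sketch. It assumes that the pixel area of each connected component of the mask has already been computed; the function name `area_fractions` is hypothetical.

```python
from typing import List, Tuple


def area_fractions(component_areas: List[float]) -> List[Tuple[float, float]]:
    """Return (area fraction, relative area fraction) for each component.

    area fraction          = component area / total area of all components
    relative area fraction = component area / area of the largest component
    """
    if not component_areas or min(component_areas) <= 0:
        raise ValueError("component areas must be a non-empty list of positive values")
    total = sum(component_areas)
    largest = max(component_areas)
    return [(area / total, area / largest) for area in component_areas]
```

With two polyps of equal size, each component has an area fraction of 0.5 but a relative area fraction of 1.0, which illustrates why the relative measure is the more useful one for multiple lesions images.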
The convexity is based on a ratio of an area covered by each of the one or more lesions to an area of a convex hull of the corresponding one of the one or more lesions. This measure is computed as the ratio between the area of the component and the area of the convex hull of the component. As polyps are typically convex, the convexity allows potentially erroneous component segments to be identified.
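As a non-limiting sketch, the convexity measure may be computed for a component whose outline is available as an ordered list of boundary vertices. The polygonal representation, and the use of the shoelace formula and Andrew's monotone chain algorithm, are choices made for this illustration only; the present disclosure does not prescribe a particular hull algorithm.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Shoelace formula; vertices are listed in order around the boundary."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity(outline: List[Point]) -> float:
    """Area of the component divided by the area of its convex hull."""
    hull_area = polygon_area(convex_hull(outline))
    if hull_area == 0:
        raise ValueError("degenerate outline with zero hull area")
    return polygon_area(outline) / hull_area
```

A convex outline such as a square gives a convexity of 1, whereas an L-shaped outline gives a value below 1.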
The elliptical aspect ratio is based on a ratio of major axis to minor axis of an ellipse corresponding to each of the one or more lesions for each of the plurality of medical images. This measure is computed as the ratio between the major axis and the minor axis of an elliptical structure of the component, which may be a proxy for the length and width of a minimum bounding rectangle that encompasses the given component. It is to be noted that this minimum bounding rectangle need not be axis-aligned. This measure is used since the polyp is expected to be “meatier”.
The elliptical proximity is based on a ratio of the perimeter of an elliptical structure with the same area as an ellipse corresponding to each of the one or more lesions, with the major axis and minor axis thereof being equal to the length and width respectively of a minimum bounding rectangle thereto in the corresponding medical image, to a perimeter of the ellipse corresponding to each of the one or more lesions in the corresponding medical image, for each of the plurality of medical images. This measure is computed as the ratio of these two perimeters.
The closer the shape of a component is to being an ellipse, the closer the value of the elliptical proximity is to 1.
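The two ellipse-based measures may be sketched as follows under one possible reading of the definitions above, which is an assumption of this sketch: the reference ellipse has the same area as the component and the same axis proportions as the minimum bounding rectangle, and its perimeter is divided by the perimeter of the component. The area, perimeter and rectangle dimensions are assumed to be supplied by an earlier step, the function names are hypothetical, and Ramanujan's approximation is used for the ellipse perimeter as a convenience.

```python
import math


def elliptical_aspect_ratio(rect_length: float, rect_width: float) -> float:
    """Major-to-minor axis ratio, using the bounding rectangle sides as a proxy."""
    major, minor = max(rect_length, rect_width), min(rect_length, rect_width)
    if minor <= 0:
        raise ValueError("rectangle sides must be positive")
    return major / minor


def ellipse_perimeter(a: float, b: float) -> float:
    """Ramanujan's approximation for an ellipse with semi-axes a and b."""
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))


def elliptical_proximity(area: float, perimeter: float,
                         rect_length: float, rect_width: float) -> float:
    """Perimeter of the equal-area reference ellipse over the component perimeter."""
    ratio = elliptical_aspect_ratio(rect_length, rect_width)
    # Solve pi * a * b = area with a / b = ratio (assumed construction).
    b = math.sqrt(area / (math.pi * ratio))
    a = ratio * b
    return ellipse_perimeter(a, b) / perimeter
```

Under this reading, a circular component of radius 1 (area π, perimeter 2π, bounding rectangle 2 by 2) has an elliptical proximity of 1, while a square component has a value below 1.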
Further, the geometric filter may discard the medical image if the corresponding determined area fraction is less than a predefined threshold. That is, the geometric filter first uses the area fraction of the components to discard very small components (having area fraction <0.01). The method 100 further comprises implementing a regression function, trained based on ground truth values for relative area fraction, convexity, elliptical aspect ratio and elliptical proximity for lesions in medical images, to validate if the generated segmentation mask corresponds to the one or more lesions in the corresponding medical image of the plurality of medical images based on the determined relative area fraction, the determined convexity, the determined elliptical aspect ratio, and the determined elliptical proximity for lesions in the corresponding medical image. That is, using the ground truth masks from the public data, the regression function is trained based on the other four measures, namely relative area fraction, convexity, elliptical aspect ratio, and elliptical proximity. Some of the parameters of that regression function may be set to zero, effectively excluding one or more of the subcomputations from the final evaluation value of the geometric filter.
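The two-stage behaviour of the geometric filter described above may be sketched as follows. The 0.01 area fraction threshold is taken from the description; everything else is an assumption of this sketch, including the use of a plain linear logistic function as a stand-in for the trained regression function, the feature names, and the weight and bias values, which are placeholders rather than trained parameters.

```python
import math
from typing import Dict, Optional

AREA_FRACTION_THRESHOLD = 0.01  # threshold stated in the description

# Hypothetical feature names for the four remaining measures.
FEATURES = ("relative_area_fraction", "convexity",
            "elliptical_aspect_ratio", "elliptical_proximity")


def geometric_probability(measures: Dict[str, float],
                          weights: Dict[str, float],
                          bias: float = 0.0) -> Optional[float]:
    """Return the probability that a component is a lesion, or None if discarded."""
    # Stage 1: discard very small (likely spurious) components.
    if measures["area_fraction"] < AREA_FRACTION_THRESHOLD:
        return None
    # Stage 2: logistic function over the four measures; a weight of zero
    # (or a missing weight) excludes that measure from the evaluation.
    z = bias + sum(weights.get(name, 0.0) * measures[name] for name in FEATURES)
    return 1.0 / (1.0 + math.exp(-z))
```

Setting, for instance, the weight of the convexity to zero makes the returned probability independent of that measure, which corresponds to excluding a subcomputation from the final evaluation value.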
In particular, the present disclosure implements a logistic regression option provided by the “XGBoost” library for this purpose. It may use an ensemble of decision trees to learn and predict the classes, to classify if the properties of the shape correspond to a polyp or not. Training this model requires two classes of masks, those with valid polyps, and those that are not polyps. The valid polyp masks are obtained using the set of components from the public data set, while the set of masks that do not correspond to polyps consists of non-empty masks from the ‘Normal images’ as classified (described in the preceding paragraphs with reference to
Referring back to
Referring to
In particular, for handling multiple lesions images, the present disclosure utilizes annotated images in which each of the at least two lesions in the multiple lesions image is marked with a bounding box (or polygon), with none of the bounding boxes (or polygons) intersecting (as described above). The probability of the mask restricted to each bounding box is computed. To do this, for each bounding box, the regions outside the bounding box are concealed to generate a sub-mask. The geometric probability of the sub-mask being that of a (single) lesion is then evaluated. A mask is deemed to be valid if there is at least one sub-mask that passes the geometric filter. Then, a simple union of the sub-masks that pass the geometric filter is performed. The union is straightforward as the bounding boxes (or polygons) do not intersect. Subsequently, the last probability of each of the sub-masks is fetched, and the geometric probability of the new sub-mask is compared with that of the existing sub-mask. Finally, a union of the best sub-masks is performed for each region and saved as the final mask for the image.
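The handling of multiple lesions images described above may be sketched as follows. The axis-aligned box format, the `score` callable standing in for the geometric filter, the pass threshold of 0.5 and the function names are all assumptions of this sketch.

```python
from typing import Callable, Dict, List, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def restrict_to_box(mask: Mask, box: Box) -> Mask:
    """Conceal everything outside the bounding box to obtain a sub-mask."""
    top, left, bottom, right = box
    return [[v if top <= r < bottom and left <= c < right else 0
             for c, v in enumerate(row)]
            for r, row in enumerate(mask)]


def update_best_sub_masks(new_mask: Mask, boxes: List[Box],
                          score: Callable[[Mask], float],
                          best: Dict[int, Tuple[float, Mask]],
                          pass_threshold: float = 0.5) -> Mask:
    """Keep, per box, the sub-mask with the highest geometric probability so far,
    and return the union of the best sub-masks as the final mask.

    `best` maps a box index to (probability, sub-mask) from earlier iterations
    and is updated in place.
    """
    for index, box in enumerate(boxes):
        sub_mask = restrict_to_box(new_mask, box)
        probability = score(sub_mask)
        if probability < pass_threshold:
            continue  # this sub-mask does not pass the geometric filter
        if index not in best or probability > best[index][0]:
            best[index] = (probability, sub_mask)
    height, width = len(new_mask), len(new_mask[0])
    final = [[0] * width for _ in range(height)]
    for _, sub_mask in best.values():
        for r in range(height):
            for c in range(width):
                if sub_mask[r][c]:
                    final[r][c] = 1
    return final
```

Because the boxes do not intersect, the union never has to resolve conflicting pixels between regions.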
Referring again to
Referring again to
As discussed, the present disclosure aims to significantly reduce the burden on human experts in the process of generating high quality segmented data that can be used for training new deep neural networks (DNNs). Given a data set from a new source, the process to generate new training data is described in reference to
On the other hand, as discussed in description of
If YES, at block 814, the current model is updated using the updated training set. In particular, the masks generated from the single lesion image are passed through a geometric filter to identify a subset of images for which the DNN model generates “good” masks. These images and masks are then added as part of the training data, and a new model is trained. The above process of identifying good masks is then repeated using the new model. This process is repeated until there are sufficient new images in the training set. If NO, at block 816, the obtained images are used to form a validation dataset. At block 818, the validation dataset is validated by an operator, which in turn may be used to select the medical images from the plurality of medical images to be included in the ground truth annotated dataset.
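The iterative refinement described above may be sketched as a loop in which masks that pass the geometric filter are added to the training data and the model is retrained. The callables `predict`, `passes_filter` and `retrain`, the stopping rule, and the `max_rounds` safeguard are hypothetical placeholders assumed for this sketch and do not represent the actual blocks of the process.

```python
from typing import Any, Callable, Dict, List, Tuple


def refine_model(model: Any,
                 images: Dict[str, Any],
                 training_set: List[Tuple[Any, Any]],
                 predict: Callable[[Any, Any], Any],
                 passes_filter: Callable[[Any], bool],
                 retrain: Callable[[Any, List[Tuple[Any, Any]]], Any],
                 min_new_images: int,
                 max_rounds: int = 10) -> Tuple[Any, Dict[str, Any]]:
    """Repeat: generate masks, keep the 'good' ones, retrain on the enlarged set."""
    accepted: Dict[str, Any] = {}
    for _ in range(max_rounds):
        added = 0
        for image_id, image in images.items():
            if image_id in accepted:
                continue  # a good mask was already found for this image
            mask = predict(model, image)
            if passes_filter(mask):
                accepted[image_id] = mask
                added += 1
        # Assumed stopping rule: enough new images, or no progress in this round.
        if len(accepted) >= min_new_images or added == 0:
            break
        extended = list(training_set) + [(images[i], m) for i, m in accepted.items()]
        model = retrain(model, extended)
    return model, accepted
```

The images without an accepted mask after the loop would then be handled by the validation step performed by the operator.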
In one or more embodiments, the method 100 further comprises varying one or more weighted parameters of the implemented regression function to manipulate validation of the generated segmentation mask. It may be noted that the segmentation model obtained by the described method 100 (process 800) depends on the geometric filter. A conservative filter might end up filtering out many medium quality masks, while a more liberal filter could allow spurious masks. Since the filter itself is a heuristic, the above data generation process may be performed using both a conservative filter and a liberal (over-fitted) filter by appropriately varying the parameters of the regression function used. Thus, for each medical image that has one or more lesions, the method 100 (process 800) may generate three segmentation masks: one generated using the initial model trained using the public data sets, and two generated using the iterative process, one using the conservative filter and one using the liberal filter. Each of the three generated segmentation masks is provided as an option to the operator in the user interface 700, to allow the expert operator to select a valid one to be included in the ground truth annotated dataset.
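The final selection step may be sketched as follows, assuming that the three candidate masks for each image are stored under hypothetical option labels and that the operator's choice for each image is recorded as one of those labels, or as None when no option is acceptable.

```python
from typing import Any, Dict, Optional

# Hypothetical labels for the three options shown to the operator.
OPTION_LABELS = ("initial_model", "conservative_filter", "liberal_filter")


def build_ground_truth(candidates: Dict[str, Dict[str, Any]],
                       choices: Dict[str, Optional[str]]) -> Dict[str, Any]:
    """Keep, for each image, only the mask that the operator validated."""
    dataset: Dict[str, Any] = {}
    for image_id, options in candidates.items():
        choice = choices.get(image_id)
        if choice is None:
            continue  # the operator rejected all options for this image
        if choice not in OPTION_LABELS or choice not in options:
            raise KeyError(f"unknown option {choice!r} for image {image_id!r}")
        dataset[image_id] = options[choice]
    return dataset
```

Images for which the operator selects none of the options are simply left out of the ground truth annotated dataset in this sketch.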
The present disclosure further provides a system for semi-automatically generating annotated dataset for machine learning implementation for segmenting lesions in medical images. The various embodiments and variants disclosed above apply mutatis mutandis to the present system.
The present disclosure further provides a machine learning model trained using the generated ground truth annotated dataset (as discussed in the preceding paragraphs) for diagnosis of lesions from medical images. The present disclosure further provides a clinical decision support system implementing the machine learning model to support a medical practitioner in diagnosis of lesions from medical images. A machine learning model may include a model developed based on software algorithms that can improve performance of the model by one or more training procedures implemented by a computer without the computer having to be explicitly programmed based on the data. For example, neural network techniques or support vector machine techniques can be used for implementing the machine learning model. However, other embodiments could utilize one or more of the other available machine learning algorithms.
A bus 1110 includes one or more parallel conductors of information so that information is transferred quickly among devices coupled to the bus 1110. One or more processors 1102 for processing information are coupled with the bus 1110.
A processor 1102 performs a set of operations on information as specified by computer program code related to generating ground truth annotated dataset. The computer program code is a set of instructions or statements providing instructions for the operation of the processor and/or the computer system to perform specified functions. The code, for example, may be written in a computer programming language that is compiled into a native instruction set of the processor. The code may also be written directly using the native instruction set (e.g., machine language). The set of operations includes bringing information in from the bus 1110 and placing information on the bus 1110. The set of operations also typically includes comparing two or more units of information, shifting positions of units of information, and combining two or more units of information, such as by addition or multiplication or logical operations like OR, exclusive OR (XOR), and AND. Each operation of the set of operations that can be performed by the processor is represented to the processor by information called instructions, such as an operation code of one or more digits. A sequence of operations to be executed by the processor 1102, such as a sequence of operation codes, constitutes processor instructions, also called computer system instructions or, simply, computer instructions. Processors may be implemented as mechanical, electrical, magnetic, optical, chemical or quantum components, among others, alone or in combination.
Computing hardware 1100 also includes a memory 1104 coupled to bus 1110. The memory 1104, such as a random-access memory (RAM) or other dynamic storage device, stores information including processor instructions for generating ground truth annotated dataset. Dynamic memory allows information stored therein to be changed by the computing hardware 1100. RAM allows a unit of information stored at a location called a memory address to be stored and retrieved independently of information at neighboring addresses. The memory 1104 is also used by the processor 1102 to store temporary values during execution of processor instructions. The computing hardware 1100 also includes a read only memory (ROM) 1106 or other static storage device coupled to the bus 1110 for storing static information, including instructions, that is not changed by the computing hardware 1100. Some memory is composed of volatile storage that loses the information stored thereon when power is lost. Also coupled to bus 1110 is a non-volatile (persistent) storage device 1108, such as a magnetic disk, optical disk or flash card, for storing information, including instructions, that persists even when the computing hardware 1100 is turned off or otherwise loses power.
Information, including instructions for generating ground truth annotated dataset, is provided to the bus 1110 for use by the processor from an external input device 1112, such as a keyboard containing alphanumeric keys operated by a human user, or a sensor. A sensor detects conditions in its vicinity and transforms those detections into physical expression compatible with the measurable phenomenon used to represent information in computing hardware 1100. Other external devices coupled to bus 1110, used primarily for interacting with humans, include a display device 1114, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), or plasma screen or printer for presenting text or images, and a pointing device 1116, such as a mouse or a trackball or cursor direction keys, or motion sensor, for controlling a position of a small cursor image presented on the display 1114 and issuing commands associated with graphical elements presented on the display 1114. In some embodiments, for example, in embodiments in which the computing hardware 1100 performs all functions automatically without human input, one or more of external input device 1112, display device 1114 and pointing device 1116 is omitted.
In the illustrated embodiment, special purpose hardware, such as an application specific integrated circuit (ASIC) 1120, is coupled to bus 1110. The special purpose hardware is configured to perform operations not performed by processor 1102 quickly enough for special purposes. Examples of application specific ICs include graphics accelerator cards for generating images for display 1114, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.
Computing hardware 1100 also includes one or more instances of a communications interface 1170 coupled to bus 1110. Communication interface 1170 provides a one-way or two-way communication coupling to a variety of external devices that operate with their own processors, such as printers, scanners and external disks. In general, the coupling is with a network link 1178 that is connected to a local network 1180 to which a variety of external devices with their own processors are connected. For example, communication interface 1170 may be a parallel port or a serial port or a universal serial bus (USB) port on a personal computer. In some embodiments, communications interface 1170 is an integrated services digital network (ISDN) card or a digital subscriber line (DSL) card or a telephone modem that provides an information communication connection to a corresponding type of telephone line. In some embodiments, a communication interface 1170 is a cable modem that converts signals on bus 1110 into signals for a communication connection over a coaxial cable or into optical signals for a communication connection over a fiber optic cable. As another example, communications interface 1170 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN, such as Ethernet. Wireless links may also be implemented. For wireless links, the communications interface 1170 sends or receives or both sends and receives electrical, acoustic or electromagnetic signals, including infrared and optical signals, that carry information streams, such as digital data. For example, in wireless handheld devices, such as mobile telephones like cell phones, the communications interface 1170 includes a radio band electromagnetic transmitter and receiver called a radio transceiver. In certain embodiments, the communications interface 1170 enables connection to the local network 1180.
A computer-readable medium may be any medium that participates in providing information to processor 1102, including instructions for execution. Non-volatile media include, for example, optical or magnetic disks, such as storage device 1108. Volatile media include, for example, dynamic memory 1104. Transmission media include, for example, coaxial cables, copper wire, fiber optic cables, and carrier waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves. Signals include man-made transient variations in amplitude, frequency, phase, polarization or other physical properties transmitted through the transmission media. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, CDRW, DVD, any other optical medium, punch cards, paper tape, optical mark sheets, any other physical medium with patterns of holes or other optically recognizable indicia, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read.
The problem of disease detection or segmentation has been previously solved for modalities like optical endoscopy, wireless capsule endoscopy and CT and MR virtual endoscopy using convolutional neural networks along with several pre-processing and post-processing interventions. However, the method of data annotation using a novel semi-automated human-in-the-loop approach in conjunction with computational geometric techniques, as described in the embodiments of the present disclosure, provides significant advantages over the known techniques. Experiments demonstrated that the present approach not only generated high-quality annotations but also significantly reduced the time demanded of the experts. In one experiment, the present approach managed to annotate over 60% of the images in the quantitative experiment using public data sets, and 69% in the local data. While a higher proportion is preferable, in terms of absolute numbers (e.g., 674 in the case of the local data set), this is sufficiently high for training new models.
The foregoing descriptions of specific embodiments of the present disclosure have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the present disclosure to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiment was chosen and described in order to best explain the principles of the present disclosure and its practical application, to thereby enable others skilled in the art to best utilize the present disclosure and various embodiments with various modifications as are suited to the particular use contemplated.
Various processes described herein may be implemented by appropriately programmed general purpose computers, special purpose computers, and computing devices. Typically a processor (e.g., one or more microprocessors, one or more microcontrollers, one or more digital signal processors) will receive instructions (e.g., from a memory or like device), and execute those instructions, thereby performing one or more processes defined by those instructions. Instructions may be embodied in one or more computer programs, one or more scripts, or in other forms. The processing may be performed on one or more microprocessors, central processing units (CPUs), computing devices, microcontrollers, digital signal processors, or like devices or any combination thereof. Programs that implement the processing, and the data operated on, may be stored and transmitted using a variety of media. In some cases, hard-wired circuitry or custom hardware may be used in place of, or in combination with, some or all of the software instructions that can implement the processes. Algorithms other than those described may be used.
Programs and data may be stored in various media appropriate to the purpose, or a combination of heterogeneous media that may be read and/or written by a computer, a processor or a like device. The media may include non-volatile media, volatile media, optical or magnetic media, dynamic random access memory (DRAM), static RAM, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EEPROM, any other memory chip or cartridge or other memory technologies. Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a system bus coupled to the processor.
Databases may be implemented using database management systems or ad hoc memory organization schemes. Alternative database structures to those described may be readily employed. Databases may be stored locally or remotely from a device which accesses data in such a database.
In some cases, the processing may be performed in a network environment including a computer that is in communication (e.g., via a communications network) with one or more devices. The computer may communicate with the devices directly or indirectly, via any wired or wireless medium (e.g., the Internet, LAN, WAN or Ethernet, Token Ring, a telephone line, a cable line, a radio channel, an optical communications line, commercial on-line service providers, bulletin board systems, a satellite communications link, or a combination of any of the above). Each of the devices may itself comprise a computer or other computing device, such as those based on the Intel® Pentium® or Centrino™ processor, that is adapted to communicate with the computer. Any number and type of devices may be in communication with the computer.
A server computer or centralized authority may or may not be necessary or desirable. In various cases, the network may or may not include a central authority device. Various processing functions may be performed on a central authority server, one of several distributed servers, or other distributed devices.
Number | Date | Country | Kind |
---|---|---|---|
202141028259 | Jun 2021 | IN | national |
PCT/IN2022/050547 | Jun 2022 | WO | international |
This application is a continuation-in-part of International application Ser. No. PCT/IN2022/050547, filed Jun. 15, 2022, System and Method for Generating Ground Truth Annotated Dataset for Analysing Medical Images, and claims priority from India Patent App. 202141028259, filed Jun. 23, 2021, System and Method for Generating Ground Truth Annotated Dataset for Analysing Medical Images. The entire disclosure of the parent application(s) is/are incorporated herein by reference.
The present disclosure generally relates to the field of medical diagnosis by computer-aided analysis of medical images, and more particularly to a system and a method for semi-automatically generating ground truth annotated dataset for machine learning implementation for segmenting lesions in medical images.
In the early stages of medical conditions, like various types of cancers, many patients experience few or no symptoms. Because the lack of symptoms makes such cancers difficult to detect, medical personnel are often only able to diagnose such cancerous disease at more advanced stages, by which point it may be very challenging to treat. That said, in the case of some cancers, for instance colorectal cancer, which is the third most frequently diagnosed cancer and the second leading cause of cancer death worldwide, the risk of developing the cancer can be significantly reduced through early diagnosis of polyps during a colonoscopy by analysing correspondingly generated medical images. However, in the field of medical imaging, the segmentation of abnormal anatomical structures or lesions (such as gastrointestinal polyps, aneurisms or lung nodules, which in many cases are possible precursors to cancers) is a challenging problem because of their highly variable shape, texture, density and size and their attachment to surrounding normal structures. Currently, endoscopy screenings are the standard for detection and localization of such abnormal tissue regions and precancerous lesions.
The success of the diagnosis depends highly on the operator's experience and skills, both for dexterous manoeuvring of the camera and for ensuring full exploration of the mucosa. Such screenings are manual procedures performed by physicians and are therefore affected by human factors like fatigue, lack of sensitivity to visual characteristics of lesions or insufficient attentiveness during examination. Therefore, there is a significant miss rate for these suspicious lesions. For example, some studies suggest a miss rate of 8% to 37%. This can potentially be reduced by detecting and localizing polyps accurately. Computer-aided diagnosis of diseases has become important to such fields as cardiology, radiology, and other areas of medicine. These have the benefit of images or time-dependent signals. With the advent of deep learning in recent years, the ability of computers to perform vision tasks such as identification, localisation and generation has seen success. More specifically, Convolutional Neural Networks (CNNs) have proven particularly effective at the task of image segmentation and therefore can be leveraged to solve lesion segmentation tasks in the medical space, with applications in endoscopic, MRI, CT and X-ray images among several others. Ideally, a real-time automatic pixel-wise lesion-segmentation system could serve as an effective second observer that could draw the doctor's attention, in real time, to concerning lesions, effectively creating a ‘second set of eyes’ on all aspects of the video data with fidelity. However, development of such a system requires training a machine learning model on a large volume of ground truth annotated data, which is usually cited as a bottleneck because of its time- and labour-intensive nature.
Moreover, existing deep learning models trained on publicly available data sets do not retain their performance when used on data locally sourced from different hospitals, and adapting them requires incorporating annotated images from the new data source into the training scheme. Therefore, there exists a need to overcome the bottleneck of generating an annotated dataset for a machine learning model for segmenting lesions in medical images. The present disclosure has been made in view of such considerations, and it is an object of the present disclosure to provide systems and methods for generating ground truth annotated dataset for machine learning implementation for segmenting lesions in medical images to overcome the limitations of the prior art.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/IN2022/050547 | Jun 2022 | WO |
Child | 18395017 | | US |