The present disclosure relates to an information processing apparatus, an information processing method, and a recording medium.
In the related art, with respect to an endoscope, a technology has been known that searches for a region to be observed and a lumen direction for insertion, so as to reduce the time taken to resume the original operation and improve convenience even when an observation imaging object in a subject or an insertion direction is lost (for example, see Japanese Patent No. 6577031). In this technology, coordinates of an observation position that indicates an innermost position of a lumen are identified from a group of chronological images. When the coordinate position is not identified in an image of a current frame, a corresponding point between an image of a past frame previous to the current frame and the image of the current frame is detected, coordinate transformation is performed to obtain the coordinate position in the image of the current frame from the detection result, and information on a lumen direction is displayed together with the image of the current frame.
In some embodiments, an image processing device includes: one or more processors comprising hardware, wherein the one or more processors are configured to: input a captured image captured by an endoscope in a body cavity of a subject to a Convolutional Neural Network (CNN) using a trained model, the trained model having training data in which each of training images is associated with category information including a lumen direction of a region outside the each of training images in which the lumen is present; estimate, based on the CNN, category information including a lumen direction of a region outside the captured image in which the lumen is likely to be present; and output the category information estimated.
In some embodiments, provided is an image processing method implemented by an image processing device including a processor. The image processing method includes: inputting a captured image captured by an endoscope in a body cavity of a subject to a Convolutional Neural Network (CNN) using a trained model, the trained model having training data in which each of training images is associated with category information that includes a lumen direction of a region outside the each of training images in which the lumen is present; estimating, based on the CNN, category information including a lumen direction of a region outside the captured image in which the lumen is likely to be present; and outputting the category information estimated.
In some embodiments, provided is a non-transitory computer readable recording medium having recorded therein an executable program. The program causes a computer to perform: inputting a captured image captured by an endoscope in a body cavity of a subject to a Convolutional Neural Network (CNN) using a trained model, the trained model having training data in which each of training images is associated with category information that includes a lumen direction of a region outside the each of training images in which the lumen is present; estimating, based on the CNN, category information including a lumen direction of a region outside the captured image in which the lumen is likely to be present; and outputting the category information estimated.
The above and other features, advantages and technical and industrial significance of this disclosure will be better understood by reading the following detailed description of presently preferred embodiments of the disclosure, when considered in connection with the accompanying drawings.
A medical system according to the present disclosure will be described in detail below together with the drawings. The present disclosure is not limited by the embodiments below. Further, in each of the drawings to be referred to in the description below, shapes, sizes, and positional relationships are only schematically illustrated so as to make it possible to understand details of the present disclosure. In other words, the present disclosure is not limited to only the shapes, the sizes, and the positional relationships that are illustrated in each of the drawings. Furthermore, in the description of the drawings, explanation will be given by denoting the same components by the same reference symbols.
Overall configuration of medical system
As illustrated in
The endoscope 2 successively generates image data (RAW data) by capturing images inside the subject, and sequentially outputs the image data to the control device 4. As illustrated in
The insertion portion 21 is configured such that at least a part thereof has flexibility, and is inserted into a subject. As illustrated in
The operating unit 22 is connected to a proximal end portion of the insertion portion 21. The operating unit 22 receives various kinds of operation on the endoscope 2. As illustrated in
The bending knob 221 is configured so as to be rotatable in accordance with user operation that is performed by a user, such as an operator. Further, the bending knob 221 rotates to actuate a bending mechanism (not illustrated), such as a wire made of metal or resin, that is arranged inside the insertion portion 21. With this configuration, the bending portion 25 is curved.
The insertion opening 222 is an insertion opening that communicates with a treatment tool channel (not illustrated) that is a pipe extending from the distal end of the insertion portion 21, and that is used for inserting a treatment tool or the like into the treatment tool channel from outside of the endoscope 2.
The plurality of operating members 223 include buttons for receiving various kinds of operation that is performed by a user, such as an operator, and output an operation signal corresponding to each kind of operation to the control device 4 via the universal cord 23. Examples of various kinds of operation include release operation for instructing the endoscope 2 to capture a still image and operation of changing an observation mode of the endoscope 2 to a normal light observation mode or a special light observation mode.
The universal cord 23 is a cord which extends from the operating unit 22 in a different direction from an extending direction of the insertion portion 21 and in which a light guide 231 (see
The display device 3 is configured with a display monitor, such as a liquid crystal or organic Electro Luminescence (EL) monitor, and displays a display image based on image data that has been subjected to image processing by the control device 4 and various kinds of information on the endoscope 2, under the control of the control device 4.
The control device 4 is realized by using a processor that is a processing apparatus including hardware, such as a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), or a Central Processing Unit (CPU), and a memory that is a temporary storage area used by the processor. The control device 4 comprehensively controls operation of each of the units of the endoscope 2 in accordance with a program that is recorded in the memory.
A functional configuration of a main part of the medical system 1 as described above will be described.
A configuration of the endoscope 2 will be described below.
As illustrated in
The illumination optical system 201 is configured with one or more lenses or the like, and emits illumination light that is supplied from the light guide 231 toward an imaging object.
The imaging optical system 202 is configured with a plurality of lenses and an actuator, such as a stepping motor or a voice coil motor, that moves a predetermined lens among the plurality of lenses in an optical axis direction. The imaging optical system 202 condenses light, such as reflected light that is reflected from an imaging object, returning light that comes from the imaging object, or fluorescence that is emitted by the imaging object, and forms an object image on a light receiving surface of the image sensor 203. Further, the imaging optical system 202 is able to change a focal distance (imaging magnification or magnification) and a focal position by moving the predetermined lens along an optical axis direction O1 under the control of the imaging control unit 207. In one embodiment, the imaging optical system 202 is able to change the imaging magnification to one time, 80 times, or 520 times. The imaging optical system 202 need not, of course, change the imaging magnification in a stepwise manner; the imaging magnification may instead be changed in a continuous manner.
The image sensor 203 is configured with an image sensor, such as a Charge Coupled Device (CCD) or a Complementary Metal Oxide Semiconductor (CMOS), in which one of color filters that form a Bayer arrangement (RGGB) is arranged on each of pixels that are arranged in a two-dimensional matrix. The image sensor 203 receives the object image that is formed by the imaging optical system 202, and generates a captured image (analog signal) by performing photoelectric conversion, under the control of the imaging control unit 207. Meanwhile, in the present embodiment, the image sensor 203 may be configured by integrating an image sensor and a Time Of Flight (TOF) sensor that acquires imaging object distance information (hereinafter, described as depth map information) by a TOF method. The depth map information is information in which an imaging object distance from a position of the image sensor 203 (position of the distal end portion 24) to a corresponding position on an observation object that corresponds to a pixel position of a captured image is detected for each of pixel positions. Meanwhile, the configuration that generates the depth map information is not limited to the TOF sensor as described above, but it may be possible to adopt an image sensor that includes a phase difference sensor. In the following, the depth map information and the captured image are collectively described as image data. The image sensor 203 outputs the image data to the A/D converter 204.
The A/D converter 204 is configured with an A/D conversion circuit or the like. The A/D converter 204 performs an A/D conversion process on analog image data that is input from the image sensor 203 and outputs the image data to the P/S converter 205, under the control of the imaging control unit 207.
The P/S converter 205 is configured with a P/S conversion circuit or the like, performs parallel-to-serial conversion on digital image data that is input from the A/D converter 204 and outputs the image data to the control device 4 via the first signal line 232, under the control of the imaging control unit 207.
Meanwhile, it may be possible to arrange, instead of the P/S converter 205, an E/O converter that converts image data to an optical signal and outputs the image data by the optical signal to the control device 4. Further, for example, it may be possible to transmit image data to the control device 4 by radio communication using Wireless Fidelity (Wi-Fi) (registered trademark) or the like.
The imaging recording unit 206 is configured with a non-volatile memory or a volatile memory, and records therein various kinds of information on the endoscope 2 (for example, pixel information of the image sensor 203). Further, the imaging recording unit 206 records therein various kinds of setting data and a control parameter that are transferred from the control device 4 via the second signal line 233.
The imaging control unit 207 is realized by using a Timing Generator (TG), a processor that is a processing apparatus including hardware, such as a CPU, and a memory that is a temporary storage area used by the processor. The imaging control unit 207 controls operation of each of the image sensor 203, the A/D converter 204, and the P/S converter 205 based on the setting data that is received from the control device 4 via the second signal line 233.
A configuration of the control device 4 will be described below.
As illustrated in
The condenser lens 40 collects light that is emitted by each of the first light source unit 41 and the second light source unit 42, and outputs the light to the light guide 231.
The first light source unit 41 emits white light (normal light) that is visible light, and supplies the white light, as illumination light, to the light guide 231, under the control of the light source controller 43. The first light source unit 41 is configured with a collimator lens, a white Light Emitting Diode (LED) lamp, a driving driver, and the like. Meanwhile, as the first light source unit 41, it may be possible to cause a red LED lamp, a green LED lamp, and a blue LED lamp to simultaneously emit light and supply white light that is visible light. Further, the first light source unit 41 may be configured with a halogen lamp, a xenon lamp, or the like.
The second light source unit 42 emits special light with a predetermined wavelength band and supplies the special light, as the illumination light, to the light guide 231, under the control of the light source controller 43. Here, the special light is light that is used for Narrow band Imaging (NBI) using narrow band light including 390 to 445 nanometers (nm) and 530 to 550 nm. It is of course possible to adopt, as the special light, light of amber color (600 nm and 630 nm) that is used for Red dichromatic Imaging (RDI), apart from the narrow band light.
The light source controller 43 is realized by using a processor that is a processing apparatus including hardware, such as a CPU, and a memory that is a temporary storage area used by the processor. The light source controller 43 controls a light emission timing, a light emission time, and the like of each of the first light source unit 41 and the second light source unit 42 based on control data that is input from the control unit 49.
The S/P converter 44 performs serial-to-parallel conversion on image data that is received from the endoscope 2 via the first signal line 232 and outputs the image data to the image processing unit 45 under the control of the control unit 49. Meanwhile, when the endoscope 2 outputs the image data by an optical signal, it may be possible to arrange an O/E converter that converts an optical signal to an electrical signal, instead of the S/P converter 44. Further, when the endoscope 2 transmits the image data by radio communication, it may be possible to arrange a communication module that can receive a radio signal, instead of the S/P converter 44.
The image processing unit 45 is realized by using a processor including hardware, such as a GPU or an FPGA, and a memory that is a temporary storage area used by the processor. The image processing unit 45 performs predetermined image processing on image data that is parallel data input from the S/P converter 44 and outputs the image data to the display device 3 under the control of the control unit 49. Examples of the predetermined image processing include demosaic processing, white balance processing, gain adjustment processing, γ correction processing, and format conversion processing.
The input unit 46 is configured with a mouse, a foot switch, a keyboard, a button, a switch, a touch panel, and the like, receives user operation that is performed by a user, such as an operator, and outputs an operation signal corresponding to the user operation to the control unit 49.
The recording unit 47 is configured with a volatile memory, a non-volatile memory, a Solid State Drive (SSD), a Hard Disk Drive (HDD), a recording medium, such as a memory card, or the like. Further, the recording unit 47 records therein data including various kinds of parameters that are needed for operation of the control device 4 and the endoscope 2. Furthermore, the recording unit 47 includes a program recording unit 471 for recording various kinds of programs for operating the endoscope 2 and the control device 4, an image data recording unit 472 for recording an image file in which an image corresponding to image data is recorded, and a trained model recording unit 473 for recording a trained model. Details of the trained model will be described later.
The communication unit 48 transmits various kinds of information to an external server via a network N100, receives various kinds of information from the server, and outputs the various kinds of information to the control unit 49, under the control of the control unit 49. The communication unit 48 is configured with a communication module or the like.
The control unit 49 corresponds to a second processor according to the present disclosure. The control unit 49 is realized by using a second processor that is a processing apparatus including hardware, such as an FPGA or a CPU, and a memory that is a temporary storage area used by the processor. Further, the control unit 49 comprehensively controls each of the units included in the endoscope 2 and the control device 4. The control unit 49 includes an acquiring unit 491, an estimation unit 492, a determination unit 493, and an output control unit 494. Meanwhile, in the first embodiment, the control unit 49 functions as an image processing device.
The acquiring unit 491 acquires a captured image that corresponds to the image data that is captured by the endoscope 2 via the S/P converter 44.
The estimation unit 492 estimates category information in the captured image based on the captured image that is acquired by the acquiring unit 491 and the trained model that is recorded in the trained model recording unit 473.
The determination unit 493 determines whether or not a lumen is present in the captured image based on the category information that is estimated by the estimation unit 492. The determination unit 493 determines whether or not the estimation unit 492 is able to estimate a lumen direction based on the category information that is estimated by the estimation unit 492. Here, the lumen may refer to the position at the back of the lumen. For example, when an endoscope is inserted into the colon to capture images, it is the innermost position of the lumen in relation to the direction of insertion of the endoscope within a range of observation by endoscopy. The lumen may also refer to a substantially cylindrical shadowed area located at the innermost point on the image. The lumen may be a deep portion located distally relative to a proximal intestinal wall.
The output control unit 494 causes the image processing unit 45 to superimpose lumen information that corresponds to the category information that is estimated by the estimation unit 492 onto a captured image that corresponds to captured image data that is generated by performing image processing by the image processing unit 45, and outputs the captured image with the lumen information to the display device 3.
An overview of generation of the trained model that is recorded in the trained model recording unit 473 and estimation by a CNN using the trained model will be described below. The trained model is a set of parameters of the CNN; the trained model is generated in advance by training using training data and is used as the parameters of the CNN, so that the CNN is able to perform estimation on an image that is different from the training data.
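As a hedged illustration of this relationship between the CNN, its parameters, and the training data, the following sketch shows a small classifier written in PyTorch; the category layout (one "lumen present" class plus eight outside-image direction classes), the layer sizes, and all names are assumptions made for illustration and are not the disclosed architecture.

```python
import torch
import torch.nn as nn

NUM_DIRECTIONS = 8  # assumed: directions around the image divided every 45 degrees
NUM_CATEGORIES = 1 + NUM_DIRECTIONS  # "lumen present in the image" + 8 outside directions


class LumenDirectionCNN(nn.Module):
    """Hypothetical stand-in for the CNN whose parameters form the trained model."""

    def __init__(self, num_categories: int = NUM_CATEGORIES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_categories)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        return self.classifier(h)  # raw scores; softmax gives degrees of reliability


def train_step(model, optimizer, images, labels):
    """One training iteration: images are training frames, labels are category indices."""
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```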
A process performed by the control device 4 will be described below.
At Step S101 illustrated in
Subsequently, at Step S102, the estimation unit 492 estimates category information in the captured image from the captured image that is acquired by the acquiring unit 491, by the CNN using the trained model that is recorded in the trained model recording unit 473. Specifically, the estimation unit 492 inputs the captured image to the CNN using the trained model that is recorded in the trained model recording unit 473, and causes the CNN to output the category information that includes presence or absence of a lumen in the captured image and, in the case where a lumen is absent, a lumen direction indicating a position of a lumen that is estimated to be present outside the captured image, selected from among directions that are viewed from the center of the image and divided by a predetermined angle, together with a degree of reliability of the lumen direction.
As illustrated in
In this manner, the estimation unit 492 estimates the category information in the captured image from the captured image that is acquired by the acquiring unit 491 by the CNN using the trained model that is recorded in the trained model recording unit 473. Meanwhile, the estimation unit 492 outputs the category of each of the directions and the degree of reliability of each of the directions, but embodiments are not limited to this example, and it may be possible to output only the lumen direction with the highest degree of reliability.
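A minimal sketch of this estimation step is shown below, assuming a classifier of the kind sketched earlier whose softmax scores are read out as per-direction degrees of reliability; the direction names and the helper function are hypothetical and only illustrate the shape of the output.

```python
import torch

# Assumed direction categories; the disclosure only states that directions viewed
# from the image center are divided by a predetermined angle.
DIRECTIONS = ["up", "up-right", "right", "down-right", "down", "down-left", "left", "up-left"]


@torch.no_grad()
def estimate_category_info(model: torch.nn.Module, frame: torch.Tensor) -> dict:
    """frame: (3, H, W) tensor of one captured image; returns estimated category information."""
    scores = model(frame.unsqueeze(0)).softmax(dim=1).squeeze(0)
    lumen_present_reliability = scores[0].item()
    direction_reliability = dict(zip(DIRECTIONS, scores[1:].tolist()))
    best_direction = max(direction_reliability, key=direction_reliability.get)
    return {
        "lumen_present": lumen_present_reliability,      # reliability that a lumen is in the image
        "direction_reliability": direction_reliability,  # reliability of each outside direction
        "best_direction": best_direction,                # direction with the highest reliability
    }
```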
Referring back to
At Step S103, the determination unit 493 determines whether or not a lumen is present in the captured image based on the category information that is estimated by the estimation unit 492. Specifically, the determination unit 493 determines whether or not a lumen is present in the captured image based on presence or absence of a lumen included in the category information that is estimated by the estimation unit 492. When the determination unit 493 determines that a lumen is present in the captured image (Step S103: Yes), the control device 4 goes to Step S107 to be described later. In contrast, when the determination unit 493 determines that a lumen is absent in the captured image (Step S103: No), the control device 4 goes to Step S104 to be described below. At Step S103, the determination unit may determine whether a deep portion (deep area or deep region) exists in the captured image. The deep portion may be substantially circular or oval. The deep portion may be relatively distant from the endoscope, so that illumination light from the endoscope hardly reaches it; thus, the deep portion may appear as a dark area or dark region. The area in which dark pixels are gathered is extracted as a lumen deep portion.
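A minimal sketch of extracting such a deep portion as a gathering of dark pixels is shown below, assuming OpenCV is available; the darkness threshold and the choice of the largest connected component are assumptions for illustration, not the disclosed method.

```python
import cv2
import numpy as np


def extract_deep_portion(frame_bgr: np.ndarray, dark_threshold: int = 40):
    """Return (mask, centroid) of the largest dark region, or (mask, None) if no dark region."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    dark = (gray < dark_threshold).astype(np.uint8)  # pixels the illumination hardly reaches
    num, labels, stats, centroids = cv2.connectedComponentsWithStats(dark)
    if num <= 1:  # label 0 is the background; no dark component found
        return dark, None
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return (labels == largest).astype(np.uint8), tuple(centroids[largest])
```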
At Step S104, the determination unit 493 determines whether or not the estimation unit 492 is able to estimate a lumen direction based on the category information that is estimated by the estimation unit 492. Specifically, the determination unit 493 determines whether or not the degree of reliability of each of the lumen directions included in the category information that is estimated by the estimation unit 492 is equal to or larger than a threshold, and determines whether or not the estimation unit 492 is able to estimate the lumen direction by determining whether or not the lumen direction for which the degree of reliability is equal to or larger than the threshold is present. For example, the determination unit 493 determines that the estimation unit 492 is able to estimate the lumen direction when the degree of reliability of at least one of the lumen directions is equal to or larger than the threshold, and determines that the estimation unit 492 is not able to estimate the lumen direction when the degrees of reliability of all of the lumen directions are not equal to or larger than the threshold. When the determination unit 493 determines that the estimation unit 492 is able to estimate a lumen direction (Step S104: Yes), the control device 4 goes to Step S105 to be described later. In contrast, when the determination unit 493 determines that the estimation unit 492 is not able to estimate a lumen direction (Step S104: No), the control device 4 goes to Step S106 to be described later.
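A minimal sketch of this determination is shown below; the threshold value and the function name are assumptions for illustration.

```python
RELIABILITY_THRESHOLD = 0.5  # assumed value; the disclosure only requires some threshold


def can_estimate_lumen_direction(direction_reliability: dict,
                                 threshold: float = RELIABILITY_THRESHOLD) -> bool:
    """True if the reliability of at least one lumen direction is equal to or larger than the threshold."""
    return any(r >= threshold for r in direction_reliability.values())
```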
At Step S105, the output control unit 494 causes the image processing unit 45 to superimpose lumen information that corresponds to the category information that is estimated by the estimation unit 492 onto the captured image that corresponds to the captured image data that is generated by performing image processing by the image processing unit 45, and outputs the captured image data with the lumen information to the display device 3.
At Step S106, the output control unit 494 causes the image processing unit 45 to superimpose a warning indicating that the estimation unit 492 fails to estimate the lumen direction onto a captured image that corresponds to the captured image data that is generated by performing image processing by the image processing unit 45, and outputs the captured image with the warning to the display device 3. For example, the output control unit 494 causes the image processing unit 45 to superimpose a message indicating that the lumen direction is not estimated or a message indicating that an image capturing direction of the insertion portion 21 of the endoscope 2 is to be changed onto the captured image, and displays the captured image with the message on the display device 3. Meanwhile, the output control unit 494 may output the warning by an alarm or a voice, instead of the message, to indicate that the estimation unit 492 fails to estimate the lumen direction. After Step S106, the control device 4 goes to Step S108 to be described later.
At Step S107, the output control unit 494 outputs the captured image that corresponds to the captured image data that is generated by performing image processing by the image processing unit 45 to the display device 3. After Step S107, the control device 4 goes to Step S108 to be described later.
At Step S108, the determination unit 493 determines whether or not the operator terminates observation of the subject. Specifically, the determination unit 493 determines whether or not a termination signal for terminating the observation of the subject is input from the operating unit 22 of the endoscope 2 by operation that is performed on the operating unit 22 of the endoscope 2 by the operator. When the determination unit 493 determines that the operator terminates the observation of the subject (Step S108: Yes), the control device 4 terminates the process. In contrast, when the determination unit 493 determines that the operator does not terminate the observation of the subject (Step S108: No), the control device 4 returns to Step S101 as described above.
According to the first embodiment as described above, even when a lumen is not present in a captured image, it is possible to present a lumen direction.
Furthermore, according to the first embodiment, the output control unit 494 superimposes a warning indicating that the estimation unit 492 fails to estimate a lumen direction onto the captured image and outputs the captured image with the warning to the display device 3; therefore, when it is difficult to estimate a lumen, it is possible to actively encourage the operator to move the insertion portion 21 of the endoscope 2.
A first modification of the first embodiment will be described below.
As illustrated in
According to the first modification of the first embodiment as described above, even when the captured image P1 does not include a lumen, an operator is able to intuitively recognize a lumen direction.
A second modification of the first embodiment will be described below.
As illustrated in
According to the second modification of the first embodiment as described above, even when the captured image P1 does not include a lumen, an operator is able to intuitively recognize a lumen direction.
A third modification of the first embodiment will be described below.
As illustrated in
According to the third modification of the first embodiment as described above, even when the captured image P1 does not include a lumen, an operator is able to intuitively recognize a lumen direction and recognize that a lumen moves in accordance with operation on the endoscope 2.
A fourth modification of the first embodiment will be described below. In the fourth modification of the first embodiment, a part of the process executed by the control device 4 is different. Therefore, the process performed by the control device 4 will be described below.
At Step S201 illustrated in
Subsequently, the estimation unit 492 estimates the category information in the captured image based on the plurality of chronologically successive captured images that are acquired by the acquiring unit 491 and the trained model that is recorded in the trained model recording unit 473 (Step S202). Specifically, the estimation unit 492 inputs the plurality of chronologically successive captured images to the trained model, and causes the trained model to output the category information that includes presence or absence of a lumen in the captured image, the lumen direction, and the degree of reliability. In this case, the trained model may be generated by training using, as the CNN, a Long Short-Term Memory (LSTM) or the like. In this case, as the training data, a plurality of chronologically successive captured images that are captured by the endoscope 2 in the subject may be used. After Step S202, the control device 4 goes to Step S203.
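A hedged sketch of such a sequence-based estimator is shown below, combining a per-frame CNN encoder with an LSTM in PyTorch; the layer sizes, the number of categories, and all names are assumptions for illustration and are not the disclosed architecture.

```python
import torch
import torch.nn as nn


class SequenceLumenEstimator(nn.Module):
    """Hypothetical per-frame CNN encoder followed by an LSTM over successive frames."""

    def __init__(self, num_categories: int = 9, feature_dim: int = 32):
        super().__init__()
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, feature_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(feature_dim, feature_dim, batch_first=True)
        self.classifier = nn.Linear(feature_dim, num_categories)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) chronologically successive captured images
        b, t = frames.shape[:2]
        feats = self.frame_encoder(frames.flatten(0, 1)).view(b, t, -1)
        _, (h_n, _) = self.lstm(feats)
        return self.classifier(h_n[-1])  # category scores for the latest frame
```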
According to the fourth modification of the first embodiment as described above, the estimation unit 492 inputs the plurality of chronologically successive captured images to the trained model, and causes the trained model to output the category information that includes presence or absence of a lumen in the captured image, the lumen direction, and the degree of reliability; therefore, it is possible to use a captured image without disturbance, such as a bubble, a residue in the subject, or blurring, and it is possible to use captured images that are captured in a plurality of different directions, so that it is possible to perform estimation with high accuracy.
A fifth modification of the first embodiment will be described below. In the fifth modification of the first embodiment, a part of the process executed by the control device 4 is different. Specifically, in the fifth modification of the first embodiment, a display mode is changed in accordance with the degree of reliability of the lumen direction included in the lumen information. Therefore, the process performed by the control device 4 will be described below.
At Step S303, the determination unit 493 determines whether or not a lumen is present in the captured image based on the lumen information that is estimated by the estimation unit 492. Specifically, the determination unit 493 determines whether or not a lumen is present in the captured image based on presence or absence of a lumen included in the category information that is estimated by the estimation unit 492. When the determination unit 493 determines that a lumen is present in the captured image (Step S303: Yes), the control device 4 goes to Step S305. In contrast, when the determination unit 493 determines that a lumen is absent in the captured image (Step S303: No), the control device 4 goes to Step S304 to be described later.
At Step S304, the output control unit 494 causes the image processing unit 45 to superimpose the lumen information that corresponds to the category information that is estimated by the estimation unit 492 onto a captured image that corresponds to the captured image data that is generated by performing image processing by the image processing unit 45, and outputs the captured image with the lumen information on the display device 3.
According to the fifth modification of the first embodiment as described above, even when the captured image P1 does not include a lumen, an operator is able to intuitively recognize a lumen direction and recognize a probability of estimation of the lumen direction.
A second embodiment will be described below. In the second embodiment, a region of a subject that is captured by the endoscope 2 is estimated based on the captured image, a trained model corresponding to the region is selected from among a plurality of trained models based on the estimation result of the region, and estimation is performed. In the following, a functional configuration of a control device according to the second embodiment will be first described, and thereafter, a process performed by the control device according to the second embodiment will be described.
The recording unit 47A includes a trained model recording unit 473A, instead of the trained model recording unit 473 of the recording unit 47 according to the first embodiment as described above. Further, the recording unit 47A includes a region trained model recording unit 474.
The trained model recording unit 473A records therein a plurality of trained models that are able to estimate lumen information for each organ or each region in the subject. Specifically, the trained model recording unit 473A records therein a plurality of trained models corresponding to the rectosigmoid, the sigmoid colon, the descending colon, the transverse colon, the ascending colon, the upper rectum, and the lower rectum. Each of the plurality of trained models as described above has been trained, for each organ or each region, using the CNN described in the first embodiment and training data in which a plurality of pieces of image data, each of which is captured for the organ or the region and includes at least one of a lumen, an intestinal wall, and a shade of a lumen of the subject, and a lumen direction as a correct value are associated with one another.
The region trained model recording unit 474 records therein a region trained model for estimating a region of the subject that is captured by the endoscope 2. Specifically, the region trained model has been trained, using the CNN described above in the first embodiment, on training data in which a plurality of pieces of image data that are captured for each organ or each region and a name of the organ or the region as a correct value are associated with one another. The region trained model receives input of the captured image data and outputs the name of the organ or the region.
The control unit 49A further includes a region estimation unit 495 and a selector 496 in addition to the functional configuration of the control unit 49 according to the first embodiment as described above.
The region estimation unit 495 estimates, from the captured image that is acquired by the acquiring unit 491, a region of the subject that is captured by the endoscope 2 by the CNN using the region trained model that is recorded in the region trained model recording unit 474.
The selector 496 selects a trained model corresponding to the region of the subject from among a plurality of trained models that are recorded in the trained model recording unit 473A based on the region of the subject that is estimated by the region estimation unit 495.
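A minimal sketch of this selection is shown below; the idea is simply to key a table of per-region trained models by the estimated region name, and the region names and function signature are assumptions for illustration.

```python
from typing import Mapping

import torch


def select_trained_model(region_name: str,
                         models_by_region: Mapping[str, torch.nn.Module]) -> torch.nn.Module:
    """Return the trained model recorded for the estimated region, e.g. 'sigmoid colon'."""
    try:
        return models_by_region[region_name]
    except KeyError as err:
        raise ValueError(f"no trained model recorded for region: {region_name}") from err
```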
A process performed by the control device 4A will be described below.
At Step S402, the region estimation unit 495 estimates a region of the subject that is captured by the endoscope 2 from the captured image that is acquired by the acquiring unit 491 by the CNN using the region trained model that is recorded in the region trained model recording unit 474. Specifically, the region estimation unit 495 inputs the captured image to the CNN using the region trained model, and causes a name of the region of the subject to be output.
Subsequently, the selector 496 selects a trained model that corresponds to the region of the subject from among a plurality of trained models that are recorded in the trained model recording unit 473A, based on the region of the subject that is estimated by the region estimation unit 495 (Step S403).
Thereafter, the estimation unit 492 estimates the lumen information in the captured image from the captured image that is acquired by the acquiring unit 491 by the CNN using the trained model that is selected by the selector 496 (Step S404). Specifically, the estimation unit 492 inputs the captured image to the CNN using the trained model that is selected by the selector 496, and causes the category information that includes presence or absence of a lumen in the captured image, a lumen direction, and a degree of reliability to be output. After Step S404, the control device 4A goes to Step S405.
According to the second embodiment as described above, the estimation unit 492 estimates the category information in the captured image from the captured image that is acquired by the acquiring unit 491 by the CNN using the trained model that is selected by the selector 496, so that it is possible to estimate the category information by using the trained model that is suitable for the region that is currently captured by the endoscope 2.
Meanwhile, in the second embodiment, the selector 496 selects the trained model that corresponds to the region of the subject from among the plurality of trained models that are recorded in the trained model recording unit 473A based on the organ or the region that is estimated by the region estimation unit 495, but embodiments are not limited to this example, and it may be possible to select a trained model that corresponds to an instruction signal that designates an organ or a region and that is input from the operating unit 22 or the input unit 46, from among the plurality of trained models that are recorded in the trained model recording unit 473A, for example.
A third embodiment will be described below. In the third embodiment, coordinates of an observation position in the captured image are identified, and lumen information is displayed by using an identification result. In the following, a functional configuration of a control device according to the third embodiment will be first described, and thereafter, a process performed by the control device according to the third embodiment will be described.
The coordinate transformation matrix calculation unit 497 calculates an affine matrix by performing a coordinate transformation matrix calculation process by using the captured image P1, determines trackability of lumen tracking, and outputs results to the lumen coordinates calculation unit 498.
The lumen coordinates calculation unit 498 calculates lumen coordinates of a lumen in the captured image P1 based on the category information that is input from the estimation unit 492 and the affine matrix and the trackability of the lumen tracking that are input from the coordinate transformation matrix calculation unit 497, outputs the lumen coordinates to the lumen direction calculation unit 499, and outputs a possibility of estimation of a lumen to the determination unit 493.
The lumen direction calculation unit 499 calculates a lumen direction based on the lumen coordinates that are input from the lumen coordinates calculation unit 498, and outputs the lumen direction to the output control unit 494.
An overview of a data flow of a main part in the control unit 49B will be described below.
As illustrated in
Subsequently, the estimation unit 492 estimates, from the captured image P1, the category information by using the CNN, and when presence or absence of a lumen that is the estimated category information indicates that a lumen is absent, the estimation unit 492 outputs the category information that includes the lumen direction 1 to the lumen direction N, the degree of reliability of each of the directions, and a possibility of estimation of a lumen to the lumen coordinates calculation unit 498 and the determination unit 493.
Further, the coordinate transformation matrix calculation unit 497 performs the coordinate transformation matrix calculation process by using the captured image P1 and outputs the affine matrix and the trackability of the lumen tracking to the lumen coordinates calculation unit 498, at the same time as the estimation process that is performed by the estimation unit 492. Meanwhile, details of the coordinate transformation matrix calculation process performed by the coordinate transformation matrix calculation unit 497 will be described later.
Subsequently, the lumen coordinates calculation unit 498 calculates the lumen coordinates of the lumen in the captured image P1 based on the category information that is input from the estimation unit 492 and the affine matrix and the trackability of the lumen tracking that are input from the coordinate transformation matrix calculation unit 497, outputs the lumen coordinates to the lumen direction calculation unit 499, and outputs the possibility of estimation of a lumen to the determination unit 493. Meanwhile, the lumen coordinates calculation process that is performed by the lumen coordinates calculation unit 498 will be described later.
Thereafter, the lumen direction calculation unit 499 calculates and outputs the lumen direction based on the lumen coordinates that are input from the lumen coordinates calculation unit 498.
A functional configuration and the coordinate transformation matrix calculation process of the coordinate transformation matrix calculation unit 497 will be described below.
As illustrated in
The memory unit 4971 temporarily records therein the input captured image until next frame processing. Further, the memory unit 4971 outputs a captured image of a previous frame to the template matching unit 4972. The memory unit 4971 is configured with, for example, a frame memory or the like.
The template matching unit 4972 performs template matching between the captured image of the previous frame recorded in the memory unit 4971 and a current captured image.
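A hedged sketch of such template matching is shown below, assuming OpenCV is available: patches cut from a grid on the previous frame are located in the current frame, which yields a set of corresponding points (a 3×3 grid would give the nine corresponding points mentioned later for Step S602). The grid layout and patch size are assumptions for illustration.

```python
import cv2
import numpy as np


def match_corresponding_points(prev_frame: np.ndarray, cur_frame: np.ndarray,
                               grid: int = 3, patch: int = 32):
    """Return arrays of (x, y) points in the previous frame and their matches in the current frame."""
    h, w = prev_frame.shape[:2]
    src_pts, dst_pts = [], []
    for gy in range(grid):
        for gx in range(grid):
            cx = int((gx + 0.5) * w / grid)  # center of this grid cell in the previous frame
            cy = int((gy + 0.5) * h / grid)
            x0, y0 = cx - patch // 2, cy - patch // 2
            template = prev_frame[y0:y0 + patch, x0:x0 + patch]
            result = cv2.matchTemplate(cur_frame, template, cv2.TM_CCOEFF_NORMED)
            _, _, _, max_loc = cv2.minMaxLoc(result)
            src_pts.append((cx, cy))
            dst_pts.append((max_loc[0] + patch // 2, max_loc[1] + patch // 2))
    return np.array(src_pts, dtype=np.float32), np.array(dst_pts, dtype=np.float32)
```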
Referring back to
The affine matrix calculation unit 4973 performs an affine matrix calculation process based on the set of corresponding points input from the template matching unit 4972, and outputs an affine matrix and the number of inliers to the trackability determination unit 4974. Specifically, the affine matrix calculation unit 4973 performs a process as described below.
The affine matrix calculation unit 4973 randomly extracts n corresponding points that are minimum corresponding points needed for estimation of a parameter from the set of corresponding points input from the template matching unit 4972 (Step A).
Subsequently, the affine matrix calculation unit 4973 estimates a parameter by using a set of n corresponding points (Step B).
Thereafter, the affine matrix calculation unit 4973 assigns all of points other than the points that are extracted at Step A as described above to the parameter that is estimated at Step B, and compares an error between data that is obtained by the assignment and original data (Step C).
Further, the affine matrix calculation unit 4973 determines whether the error that is calculated at Step C is equal to or smaller than a threshold t, and if the error is equal to or smaller than the threshold t, the corresponding point is counted as an inlier (Step D).
Finally, the affine matrix calculation unit 4973 repeats Step A to Step D as described above, and extracts the affine matrix (a model parameter or a homography matrix) for which the number of inliers is the largest, together with the largest number of inliers (Step E).
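A minimal sketch of Step A to Step E is shown below as a RANSAC-style fit of an affine matrix to the set of corresponding points, using NumPy; the minimum sample size n = 3, the error metric, the iteration count, and the threshold t are assumptions for illustration.

```python
import numpy as np


def fit_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine matrix mapping src points (n, 2) to dst points (n, 2)."""
    a = np.hstack([src, np.ones((len(src), 1))])      # (n, 3)
    m, _, _, _ = np.linalg.lstsq(a, dst, rcond=None)  # (3, 2)
    return m.T                                        # (2, 3)


def ransac_affine(src: np.ndarray, dst: np.ndarray,
                  n: int = 3, iterations: int = 100, t: float = 2.0):
    """Return (affine matrix with the most inliers, largest number of inliers)."""
    rng = np.random.default_rng(0)
    best_matrix, best_inliers = None, 0
    for _ in range(iterations):
        idx = rng.choice(len(src), size=n, replace=False)       # Step A: random minimum sample
        matrix = fit_affine(src[idx], dst[idx])                 # Step B: estimate the parameter
        mask = np.ones(len(src), dtype=bool)
        mask[idx] = False
        projected = src[mask] @ matrix[:, :2].T + matrix[:, 2]  # Step C: apply to remaining points
        errors = np.linalg.norm(projected - dst[mask], axis=1)
        inliers = int(np.sum(errors <= t))                      # Step D: count inliers
        if inliers > best_inliers:                              # Step E: keep the best model
            best_matrix, best_inliers = matrix, inliers
    return best_matrix, best_inliers
```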
The trackability determination unit 4974 determines whether or not the number of inliers that is input from the affine matrix calculation unit 4973 is equal to or larger than a threshold. Specifically, the trackability determination unit 4974 determines whether or not the number of inliers is equal to or larger than 5 (the number of inliers≥5). Thereafter, the trackability determination unit 4974 outputs the trackability based on the determination result of the number of inliers and the affine matrix that are input from the affine matrix calculation unit 4973 to the lumen coordinates calculation unit 498.
A process performed by the control device 4B will be described below.
At Step S503, the coordinate transformation matrix calculation unit 497 performs the coordinate transformation matrix calculation process by using the captured image P1 that is acquired by the acquiring unit 491.
At Step S601 illustrated in
Subsequently, the template matching unit 4972 performs template matching between the captured image of the previous frame recorded in the memory unit 4971 and the current captured image (Step S602). Specifically, the template matching unit 4972 performs the template matching as described above, generates a set of nine corresponding points between the captured image Pn-1 of the previous frame and the captured image Pn of the current frame, and outputs the set of nine corresponding points to the affine matrix calculation unit 4973.
Thereafter, the affine matrix calculation unit 4973 performs the affine matrix calculation process based on the set of corresponding points input from the template matching unit 4972 and calculates the affine matrix and the number of inliers (Step S603). Specifically, the affine matrix calculation unit 4973 executes Step A to Step E as described above, and calculates the affine matrix (a model parameter or a homography matrix) for which the number of inliers is the largest and the largest number of inliers.
Subsequently, the trackability determination unit 4974 determines whether or not the number of inliers that is input from the affine matrix calculation unit 4973 is equal to or larger than a threshold (Step S604). Specifically, the trackability determination unit 4974 determines whether or not the number of inliers is equal to or larger than 5 (the number of inliers≥5).
Thereafter, the trackability determination unit 4974 outputs the trackability based on the determination result of the number of inliers and the affine matrix that are input from the affine matrix calculation unit 4973 to the lumen coordinates calculation unit 498 (Step S605). After Step S605, the control device 4B returns to the main routine in
Referring back to
At Step S504, the lumen coordinates calculation unit 498 performs a lumen coordinates calculation process of calculating the lumen coordinates of the lumen in the captured image P1 based on the category information that is input from the estimation unit 492 and the affine matrix and the trackability of the lumen tracking that are input from the coordinate transformation matrix calculation unit 497, outputting the lumen coordinates to the lumen direction calculation unit 499, and outputting the possibility of estimation of a lumen to the determination unit 493.
At Step S701 illustrated in
At Step S702, the lumen coordinates calculation unit 498 calculates coordinates of the category of the lumen.
Referring back to
At Step S703, the lumen coordinates calculation unit 498 records therein the lumen coordinates calculated at Step S702 until next frame processing on the captured image. After Step S703, the control device 4B returns to the main routine in
At Step S704, the lumen coordinates calculation unit 498 determines whether or not a lumen is trackable in the captured image of the current frame based on the trackability that is input from the coordinate transformation matrix calculation unit 497. When the lumen coordinates calculation unit 498 determines that the lumen is trackable in the captured image of the current frame from the captured image of the previous frame (Step S704: Yes), the control device 4B goes to Step S705 to be described later. In contrast, when the lumen coordinates calculation unit 498 determines that the lumen is not trackable in the captured image of the current frame from the captured image of the previous frame (Step S704: No), the control device 4B goes to Step S708 to be described later.
At Step S705, the lumen coordinates calculation unit 498 acquires the affine matrix and the lumen coordinates of the previous frame that are input from the coordinate transformation matrix calculation unit 497.
Subsequently, the lumen coordinates calculation unit 498 calculates tracking coordinates for tracking the lumen in the captured image of the current frame based on the affine matrix and the lumen coordinates of the previous frame that are input from the coordinate transformation matrix calculation unit 497 (Step S706).
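A minimal sketch of this tracking-coordinate calculation (Step S706) is shown below: the previous frame's lumen coordinates are simply carried into the current frame by the affine matrix; the function name and coordinate convention are assumptions for illustration.

```python
import numpy as np


def track_lumen_coordinates(prev_coords, affine_matrix: np.ndarray):
    """Carry the previous frame's lumen coordinates (x, y) into the current frame via a 2x3 affine matrix."""
    x, y = prev_coords
    tracked = affine_matrix @ np.array([x, y, 1.0])  # (2,) coordinates in the current frame
    return float(tracked[0]), float(tracked[1])
```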
Thereafter, the lumen coordinates calculation unit 498 records the lumen coordinates calculated at Step S706 until next frame processing on the captured image (Step S707). After Step S707, the control device 4B returns to the main routine in
At Step S708, the lumen coordinates calculation unit 498 outputs the coordinates of the previous frame that are recorded at Step S703 or Step S707 as described above.
Subsequently, the lumen coordinates calculation unit 498 records the lumen coordinates that are output at Step S708 as described above (Step S709). After Step S709, the control device 4B returns to the main routine in
Referring back to
At Step S505, the lumen direction calculation unit 499 calculates the lumen direction based on the lumen coordinates that are input from the lumen coordinates calculation unit 498, and outputs the lumen direction to the output control unit 494.
Subsequently, the determination unit 493 determines whether or not a lumen is present in the captured image based on the category information that is estimated by the estimation unit 492 (Step S506). Specifically, the determination unit 493 determines whether or not a lumen is present in the captured image based on presence or absence of a lumen that is included in the category information that is estimated by the estimation unit 492. When the determination unit 493 determines that a lumen is present in the captured image (Step S506: Yes), the control device 4B goes to Step S510. In contrast, when the determination unit 493 determines that a lumen is absent in the captured image (Step S506: No), the control device 4B goes to Step S507 to be described later.
At Step S507, the determination unit 493 determines whether or not the estimation unit 492 is able to estimate the lumen direction based on the category information that is estimated by the estimation unit 492. When the determination unit 493 determines that the estimation unit 492 is able to estimate the lumen direction (Step S507: Yes), the control device 4B goes to Step S508 to be described later. In contrast, when the determination unit 493 determines that the estimation unit 492 is not able to estimate the lumen direction (Step S507: No), the control device 4B goes to Step S509.
At Step S508, the output control unit 494 causes the image processing unit 45 to superimpose the lumen information that corresponds to the category information corresponding to the lumen direction that is calculated by the lumen direction calculation unit 499 onto the captured image that corresponds to the captured image data that is generated by performing image processing by the image processing unit 45, and output the captured image with the lumen information to the display device 3. After Step S508, the control device 4B goes to Step S511.
According to the third embodiment as described above, it is possible to output the lumen direction even when a lumen is absent in the captured image, and, when it is possible to track a corresponding point between chronologically successive captured images, it is possible to output the lumen direction with high accuracy.
A fourth embodiment will be described below. In the fourth embodiment, a lumen in a captured image is detected based on the captured image. In the following, a functional configuration of a control device according to the fourth embodiment will be first described, and thereafter, a process performed by the control device according to the fourth embodiment will be described.
The recording unit 47C further includes a lumen detection trained model recording unit 475, in addition to the functional configuration of the recording unit 47 of the first embodiment as described above.
The lumen detection trained model recording unit 475 records therein a lumen detection trained model. The lumen detection trained model is for estimating presence or absence of a lumen in the captured image that is captured by the endoscope 2. Specifically, the lumen detection trained model is obtained by causing a CNN to perform training on training data in which each of training images that correspond to pieces of captured data that are captured by the endoscope 2 is associated with presence or absence of a lumen in each of the training images. The CNN using the lumen detection trained model receives input of the captured image data and outputs presence or absence of a lumen in the captured image.
The control unit 49C further includes a lumen detection unit 500, in addition to the functional configuration of the control unit 49B according to the third embodiment as described above.
The lumen detection unit 500 outputs presence or absence of a lumen in the captured image, from the captured image, by the CNN using the lumen detection trained model that is recorded in the lumen detection trained model recording unit 475, and outputs detected coordinates at which the lumen is detected when the lumen is present.
An overview of a data flow of a main part of the control unit 49C will be described below.
As illustrated in
A process performed by the control device 4C will be described below.
At Step S802, the lumen detection unit 500 outputs presence or absence of a lumen in the captured image P1, from the captured image P1, by the CNN using the lumen detection trained model that is recorded in the lumen detection trained model recording unit 475, and outputs detected coordinates at which the lumen is detected when the lumen is present. After Step S802, the control device 4C goes to Step S803.
At Step S805, the lumen coordinates calculation unit 498 performs the lumen coordinates calculation process of calculating the lumen coordinates of the lumen in the captured image P1 based on presence or absence of a lumen and the detected coordinates that are input from the lumen detection unit 500, the category information that is input from the estimation unit 492, and the affine matrix and the trackability of the lumen tracking that are input from the coordinate transformation matrix calculation unit 497, outputting the lumen coordinates to the lumen direction calculation unit 499, and outputting the possibility of estimation of a lumen to the determination unit 493.
At Step S901 illustrated in
At Step S902, the lumen coordinates calculation unit 498 outputs lumen coordinates based on presence or absence of the lumen and the detected coordinates of the lumen that are detected by the lumen detection unit 500. The lumen coordinates calculation unit 498 records the lumen coordinates calculated at Step S902 until next frame processing on the captured image (Step S903). After Step S903, the control device 4C returns and goes to Step S806 in
According to the fourth embodiment as described above, even when a lumen is absent in the captured image, it is possible to output the lumen direction, and when a lumen is present in the captured image, it is possible to output a position of the lumen with high accuracy.
A fifth embodiment will be described below. In the fifth embodiment, lumen information is output in accordance with proficiency of endoscope operation of an operator. In the following, a functional configuration of a control device according to the fifth embodiment will be first described, and thereafter, a process performed by the control device according to the fifth embodiment will be described.
The recording unit 47D further includes a proficiency trained model recording unit 476, in addition to the functional configuration of the recording unit 47 according to the first embodiment as described above.
The proficiency trained model recording unit 476 records therein a proficiency trained model that has been trained on training data in which each of training images that correspond to pieces of captured data that are captured by the endoscope 2 is associated with lumen presence-absence information that differs depending on proficiency of an operator with respect to an endoscope. The proficiency trained model is generated by using the CNN that is described above in the first embodiment.
The control unit 49D further includes a proficiency estimation unit 501, in addition to the functional configuration of the control unit 49 according to the first embodiment as described above.
The proficiency estimation unit 501 inputs the captured image to the CNN using the proficiency trained model that is recorded in the proficiency trained model recording unit 476, and causes the proficiency of the operator to be output, thereby performing estimation.
A process performed by the control device 4D will be described below.
At Step S1005, the proficiency estimation unit 501 inputs the captured image to the CNN using the proficiency trained model that is recorded in the proficiency trained model recording unit 476, causes the proficiency of the operator to be output, and performs estimation.
Subsequently, the output control unit 494 superimposes the lumen information that corresponds to the category information in accordance with the proficiency of the operator onto the captured image P1, and outputs the captured image with the lumen information to the display device 3 (Step S1006). After Step S1006, the control device 4D goes to Step S107.
According to the fifth embodiment as described above, the output control unit 494 superimposes the lumen information that corresponds to the category information in accordance with the proficiency of the operator onto the captured image P1, and outputs the captured image with the lumen information to the display device 3; therefore, it is possible to assist observation in accordance with the proficiency of the operator.
Meanwhile, in the fifth embodiment, the output control unit 494 superimposes the lumen information that corresponds to the category information in accordance with the proficiency of the operator onto the captured image P1, and outputs the captured image with the lumen information to the display device 3, but embodiments are not limited to this example. For example, it may be possible to use a common trained model, and when the determination unit 493 determines presence or absence of a lumen, it may be possible to adjust a threshold for a degree of reliability in accordance with the proficiency. For example, the determination unit 493 may adjust the threshold for the degree of reliability in accordance with an estimation result that is estimated by the proficiency estimation unit 501 by inputting the captured image to the CNN using the proficiency trained model that is recorded in the proficiency trained model recording unit 476, causing the proficiency of the operator to be output, and performing estimation. For example, the determination unit 493 may perform determination by setting a larger threshold such that a lumen is determined as being present with an increase in the proficiency.
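A minimal sketch of this threshold adjustment is shown below; the proficiency levels and threshold values are assumptions for illustration.

```python
# Assumed mapping of operator proficiency to the reliability threshold; a larger
# threshold is set as proficiency increases, as in the example above.
PROFICIENCY_THRESHOLDS = {"novice": 0.3, "intermediate": 0.5, "expert": 0.7}


def determine_lumen_present(reliability: float, proficiency: str) -> bool:
    """Determine presence of a lumen using a threshold adjusted by the estimated proficiency."""
    return reliability >= PROFICIENCY_THRESHOLDS.get(proficiency, 0.5)
```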
Various inventions may be made by appropriately combining a plurality of components disclosed in the endoscopes according to the first embodiment to the fifth embodiment of the present disclosure as described above. For example, some of the components may be removed from all of the components described in the endoscope system according to one embodiment of the present disclosure as described above. In addition, components described in the endoscope system according to embodiments of the present disclosure as described above may be combined appropriately.
Furthermore, in the endoscope system according to the first embodiment to the fifth embodiment of the present disclosure, the control devices 4 and 4A to 4D have functions to estimate a lumen direction and output the lumen direction, and include the trained model recording unit 473, the acquiring unit 491, the estimation unit 492, the determination unit 493, and the output control unit 494; however, for example, the trained model recording unit 473, the acquiring unit 491, the estimation unit 492, the determination unit 493, and the output control unit 494 may be arranged in a different support device or a different image processing device other than the control devices 4 and 4A to 4D. It is of course possible that the endoscope system according to the first embodiment to the fifth embodiment of the present disclosure may be implemented by a server, connected via a network, that includes the trained model recording unit 473, the acquiring unit 491, the estimation unit 492, the determination unit 493, and the output control unit 494. It is of course possible that a server is assigned to each of the functions and each of the processes is performed in a distributed manner.
Moreover, in the endoscope system according to the first embodiment to the fifth embodiment of the present disclosure, the “unit” as described above may be replaced with a “means”, a “circuit”, or the like. For example, the control unit may be replaced with a control means or a control circuit.
Furthermore, a program that is executed by the endoscope system according to the first embodiment to the fifth embodiment of the present disclosure is provided by being recorded in a computer readable recording medium, such as a compact disc-read only memory (CD-ROM), a flexible disk (FD), a CD-recordable (CD-R), a Digital Versatile Disk (DVD), a universal serial bus (USB) medium, or a flash memory, as installable or executable file data.
Moreover, a program that is executed by the endoscope system according to the first embodiment to the fifth embodiment of the present disclosure may be stored in a computer connected to a network, such as the Internet, and may be provided by download via the network.
Meanwhile, in the description of the flowcharts in the present specification, context of the processes among the steps is disclosed by using expressions such as “first”, “thereafter”, and “subsequently”, but the sequences of the processes needed to carry out the present disclosure are not uniquely defined by these expressions. In other words, the sequences of the processes in the flowcharts described in the present specification may be modified as long as there is no contradiction.
According to the present disclosure, it is possible to present a lumen direction even when a lumen is absent in an image.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the disclosure in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.
This application is based on and claims priority under 35 U.S.C. § 119 to U.S. Provisional Application No. 63/610,604, filed Dec. 15, 2023, the entire contents of which are incorporated herein by reference.