This application claims priority to German Application 10 2022 102 898.4, which was filed on Feb. 8, 2022. The content of this earlier filed application is incorporated by reference herein in its entirety.
Examples relate to a surgical microscope system and to a system, a method, and a computer program for a microscope of a surgical microscope system.
Autofocus is a functionality that is commonly used in optical cameras today. For example, surgical microscopes often provide an autofocus functionality. However, in surgical microscopes, the usual autofocus function typically uses a predefined image area as a focus reference. In many cases, the area of interest to the surgeon is not within this autofocus area. This can lead the camera to focus on areas other than the one the surgeon intended.
In smartphones and mirrorless interchangeable lens cameras, an enhanced autofocus functionality is provided that searches for human faces and attempts to focus on the faces. Even when the faces move within the image, the autofocus may keep searching for the faces and adjust the focus accordingly.
In surgical microscopes and other medical optical imaging devices, the area of interest is not as universal as a human face. Instead, it could be anything, such as an anatomical or morphological structure, e.g., a vessel or a surgical cavity, respectively. Thus, it is not possible to predefine a spatial pattern, such as a face, and search for it within the image to set the autofocus target.
There may be a desire for an improved autofocus functionality for a surgical microscope.
This desire is addressed by the subject-matter of the independent claims.
The proposed concept is based on the finding that, during surgery, the surgeon concentrates on a small region (i.e., the actual surgical site), with other portions around that area being of less interest. This region is defined or delineated by one or a few anatomical features, with the surgeon either operating on these anatomical features or between these anatomical features. In the proposed concept, the surgeon may point to an area in the image and, in doing so, trigger the proposed functionality, so that the image is automatically focused on this area. The surgeon may provide instructions that identify the anatomical feature in this area that needs to be followed, e.g., the system may be instructed to follow a user-specified vessel in a given area, or the instructions may be unspecific so that the proposed functionality may make some assumptions about which of the features are implied by the area indicated, e.g., to select a surgical cavity if the area indicated by the surgeon contains a surgical cavity. The proposed concept may thus allow the surgeon to select the autofocus area in a smart and meaningful way, such that the autofocus follows a certain image feature, instead of a spatially fixed area of the image.
Various examples of the present disclosure relate to a system for a microscope of a surgical microscope system. The system comprises one or more processors and one or more storage devices. The system is configured to obtain imaging sensor data from at least one optical imaging sensor of the microscope. The system is configured to determine information on an area of interest of a user of the surgical microscope system based on an input of the user. The system is configured to determine an anatomical feature of interest within the area of interest. The system is configured to detect a position of the anatomical feature of interest within the imaging sensor data. The system is configured to trigger an autofocus functionality of the microscope to focus on the position of the anatomical feature of interest. By determining an area of interest based on a user input, a detection algorithm may be provided with sufficient information for determining the likely feature of interest, which may be tracked and used as target for the autofocus functionality, such that the likely feature of interest remains in focus throughout the surgical procedure.
In general, the system may be configured to track the position of the anatomical feature of interest over a plurality of frames of the imaging sensor data. By tracking the feature over multiple frames, the autofocus functionality may be triggered to adjust the focus based on the current position of the anatomical feature of interest.
For example, the system may be configured to trigger the autofocus functionality if the position of the anatomical feature of interest shifts relative to a field of view of the imaging sensor data for at least a pre-defined time interval. In other words, if the position of the feature of interest is shifted not only temporarily (e.g., as the field of view is changed by the surgeon, or as the surgeon has moved the feature of interest), the autofocus functionality may be employed to adjust the focus to the current position of the anatomical feature of interest.
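As a minimal sketch of this time-based trigger condition, the following Python snippet re-triggers the autofocus only once the tracked position has remained displaced for a minimum duration. The class name, the pixel threshold, and the one-second interval are illustrative assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class SustainedShiftTrigger:
    """Hypothetical helper: fires only if a shift persists long enough."""

    min_shift_px: float = 20.0    # assumed displacement threshold
    min_duration_s: float = 1.0   # assumed "pre-defined time interval"
    _focused_at: Optional[Point] = field(default=None, init=False, repr=False)
    _shift_started: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, position: Point, timestamp_s: float) -> bool:
        """Return True if the autofocus should be triggered for this frame."""
        if self._focused_at is None:
            self._focused_at = position
            return True  # initial focus on the feature
        dx = position[0] - self._focused_at[0]
        dy = position[1] - self._focused_at[1]
        if (dx * dx + dy * dy) ** 0.5 < self.min_shift_px:
            self._shift_started = None  # shift was only temporary
            return False
        if self._shift_started is None:
            self._shift_started = timestamp_s
        if timestamp_s - self._shift_started >= self.min_duration_s:
            self._focused_at = position  # refocus at the new position
            self._shift_started = None
            return True
        return False
```

In this sketch, a brief displacement that returns to the focused position resets the timer, so only a persistent shift leads to a new autofocus run.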
While surgical microscope systems are generally equipped with powerful processors for image processing, the efficiency of the proposed concept may be improved by limiting the detection of the position of the anatomical feature to only a subset of the frames, e.g., to every n-th frame. For example, the system may be configured to detect the position of the anatomical feature of interest in at most every second frame of the imaging sensor data.
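The frame-skipping idea may be sketched as follows, assuming a hypothetical `detect` callable that returns the feature position for a single frame. The function name and its parameters are illustrative only; frames without a fresh detection simply reuse the last known position.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Point = Tuple[float, float]


def track_with_frame_skipping(
    frames: Iterable[object],
    detect: Callable[[object], Optional[Point]],
    n: int = 2,
) -> Iterator[Optional[Point]]:
    """Yield one position per frame, running `detect` only on every n-th frame."""
    if n < 2:
        raise ValueError("n must be at least 2 (at most every second frame)")
    last_position: Optional[Point] = None
    for index, frame in enumerate(frames):
        if index % n == 0:
            last_position = detect(frame)  # the expensive detection step
        yield last_position
```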
In various examples, the anatomical feature of interest is detected using a software algorithm, which may be based on a machine-learning model. The effort required by said software algorithm may be reduced, and the accuracy may be improved, by limiting the imaging sensor data to the area of interest. For example, the system may be configured to locate the area of interest within the imaging sensor data. The system may be configured to determine the anatomical feature of interest within a portion of the imaging sensor data representing the area of interest.
Within the area of interest, different anatomical features may be identified by a (machine-learning based) algorithm, and the anatomical feature of interest may be selected among the identified anatomical features. For example, the system may be configured to perform image segmentation on at least the portion of the imaging sensor data representing the area of interest to determine at least one feature present within the portion of the imaging sensor data representing the area of interest. Additionally, or alternatively, the system may be configured to perform object detection on at least the portion of the imaging sensor data representing the area of interest to identify at least one feature present within the portion of the imaging sensor data representing the area of interest. The system may be configured to determine the anatomical feature of interest based on the at least one feature present within the portion of the imaging sensor data representing the area of interest. For example, an existing image segmentation machine-learning model or object detection machine-learning model may be employed for determining the anatomical features present in the area of interest, among which the anatomical feature of interest may be selected.
There are various features that are potentially of interest to a surgeon, e.g., during brain surgery. For example, the system may be configured to perform the object detection to identify at least one of a blood vessel, branching points of a blood vessel, a bleeding, and a tumor within at least the portion of the imaging sensor data representing the area of interest.
Additionally, non-anatomical features may also be detected and used to define a non-anatomical feature of interest, which may be used instead of the anatomical feature of interest in some cases. For example, the system may be configured to perform the object detection to identify at least one of a clip and a stitching as non-anatomical feature of interest. The system may be configured to detect a position of the non-anatomical feature of interest within the imaging sensor data. The system may be configured to perform the autofocus functionality of the microscope further based on the position of the non-anatomical feature of interest. In some cases, non-anatomical features, such as a clip or a stitching, are placed right at the portion of the surgical site the surgeon intends to operate on, making them useful targets with respect to the autofocus functionality.
In some examples, the anatomical feature of interest may be part of a larger anatomical feature. For example, the anatomical feature may be a branch in a blood vessel, which is part of the larger blood vessel. This larger anatomical feature may thus be subdivided into smaller anatomical features, with neighboring anatomical features delimiting the extent of a given anatomical feature within the larger anatomical feature. For example, the system may be configured to determine an extent of the anatomical feature of interest based on an extent of one or more features located adjacent to the feature the anatomical feature of interest is based on.
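One possible way to delimit such a sub-feature is sketched below, under the assumption that the larger blood vessel is available as an ordered centerline and that neighboring branching points are given as indices along that centerline. This representation and the function name are hypothetical and serve illustration only.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def segment_between_branches(
    centerline_length: int,
    branch_indices: Sequence[int],
    picked_index: int,
) -> Tuple[int, int]:
    """Return (start, end) centerline indices of the segment around picked_index.

    The segment is delimited by the nearest branching point on either side,
    or by the ends of the centerline if no branching point exists on a side.
    """
    if not 0 <= picked_index < centerline_length:
        raise ValueError("picked_index lies outside the centerline")
    branches = sorted(branch_indices)
    i = bisect_right(branches, picked_index)
    start = branches[i - 1] if i > 0 else 0
    end = branches[i] if i < len(branches) else centerline_length - 1
    return start, end
```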
In some examples, the area of interest might not be definable by a single anatomical feature of interest but may relate to an area between features or to a group of features. For example, the system may be configured to, if the area of interest indicated by the user relates to two or more features or to an area between two or more features, determine the anatomical feature of interest based on the two or more features, and to determine the position of the anatomical feature of interest based on the positions of the two or more features.
In this case, the features delimiting the area of interest may even be outside the area of interest. Accordingly, the system may be configured to, if the area of interest indicated by the user relates to an area between two or more features, select the two or more features from features located inside the area of interest and features located outside the area of interest.
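A minimal sketch of this selection is given below. It assumes that each detected feature has been reduced to a centroid and that the delimiting features are simply those nearest to the center of the indicated area, whether they lie inside or outside of it. The nearest-feature rule, the averaging of centroids, and all names are assumptions made for illustration.

```python
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def position_between_nearest_features(
    feature_centroids: Dict[str, Point],
    area_center: Point,
    count: int = 2,
) -> Tuple[List[str], Point]:
    """Pick the `count` features nearest to the area center and return
    their names together with the mean of their centroids."""
    if len(feature_centroids) < count:
        raise ValueError("not enough features to delimit the area")
    nearest = sorted(
        feature_centroids,
        key=lambda name: hypot(
            feature_centroids[name][0] - area_center[0],
            feature_centroids[name][1] - area_center[1],
        ),
    )[:count]
    mean_x = sum(feature_centroids[name][0] for name in nearest) / count
    mean_y = sum(feature_centroids[name][1] for name in nearest) / count
    return nearest, (mean_x, mean_y)
```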
There are various options for the user to specify the area of interest. For example, the system may be configured to detect a pointer operated by the user within the imaging sensor data to determine the area of interest. Alternatively, or additionally, the system may be configured to determine the area of interest based on a user input signal obtained via a user interface of the surgical microscope system. A pointer or user interface enables a very precise specification of the area of interest.
Alternatively, or additionally, the system is configured to determine the area of interest based on a voice description of an anatomical feature obtained via a voice command system of the surgical microscope system. This way, the area of interest may be defined without the surgeon having to move the instruments away from the surgical site.
The previously listed options allow the surgeon to explicitly define the area of interest. In some examples, however, the area of interest may be defined implicitly, e.g., by deducing the area of interest from the actions of the surgeon. For example, the system may be configured to determine a portion of a surgical site being operated on by the user within the imaging sensor data to determine the area of interest. Alternatively, or additionally, the system may be configured to determine the area of interest using a gaze tracking mechanism. Alternatively, or additionally, the system may be configured to determine the area of interest from a predetermined image area (preferably in the center of the field of view) after the user finishes aligning the field of view. In these cases, the user does not have to explicitly define the area of interest, thus decreasing the effort required by the surgeon.
In some examples, some amount of image processing may be performed to facilitate the definition of the area of interest. For example, the system may be configured to perform image segmentation and/or object detection to determine a plurality of features within the imaging sensor data. The system may be configured to determine a visual representation of the plurality of features and to provide a display signal comprising the visual representation to a display device of the surgical microscope system. The system may be configured to obtain the input of the user in response to the visual representation of the plurality of features. By providing the visual representation of the plurality of features, the user can select among the detected features to define the area of interest.
Surgical microscope systems are often adapted to detect fluorescence emissions that are emitted by a fluorophore that is injected into a blood vessel or tissue. These fluorescence emissions may be used to clearly identify the anatomical feature of interest. Also, additional imaging information such as information gained from multispectral imaging may be used to clearly identify the anatomical feature of interest. For example, the imaging sensor data may comprise a first component with color imaging data and a second component with at least one of hyperspectral imaging data, multispectral imaging data, and fluorescence imaging data. The system may be configured to determine the anatomical feature of interest and/or to detect the position of the anatomical feature of interest at least based on the second component.
The determined anatomical (or non-anatomical) feature of interest may be highlighted and presented to the user, e.g., so the user can be assured that the “right” feature is used for the autofocus functionality. For example, the system may be configured to generate a digital view based on the imaging sensor data. The system may be configured to highlight the area of interest and/or the anatomical (or non-anatomical) feature of interest within the digital view. The system may be configured to provide a display signal comprising the digital view to a display device of the surgical microscope system.
Various examples relate to a surgical microscope system comprising a microscope with an optical imaging sensor and the system presented above.
Various examples relate to a corresponding method for a microscope of a surgical microscope system. The method comprises obtaining imaging sensor data from at least one optical imaging sensor of the microscope. The method comprises determining information on an area of interest of a user of the surgical microscope system based on an input of the user. The method comprises determining an anatomical feature of interest within the area of interest. The method comprises detecting a position of the anatomical feature of interest within the imaging sensor data. The method comprises triggering an autofocus functionality of the microscope to focus on the position of the anatomical feature of interest.
Various examples relate to a corresponding computer program with a program code for performing the above method when the computer program is executed on a processor.
Some examples of apparatuses and/or methods will be described in the following by way of example only, and with reference to the accompanying figures, in which
Various examples will now be described more fully with reference to the accompanying drawings in which some examples are illustrated. In the figures, the thicknesses of lines, layers and/or regions may be exaggerated for clarity.
Beside the optical components that are part of the microscope 120, the surgical microscope system 100 further comprises the system 110, which is a computer system. The system comprises one or more processors 114 and one or more storage devices 116. Optionally, the system may further comprise one or more interfaces 112. As shown in
In general, a microscope system, such as the surgical microscope system 100, is a system that comprises a microscope 120 and additional components, which are operated together with the microscope, such as the system 110 (which is a computer system being adapted to control the surgical microscope system, and, for example, process imaging sensor data of the microscope), and additional sensors, displays etc.
There are a variety of different types of microscopes. If the microscope is used in the medical or biological fields, the object being viewed through the microscope may be a sample of organic tissue, e.g., arranged within a petri dish or present in a part of a body of a patient. In the present case, the microscope 120 is a microscope of a surgical microscope system, i.e., a microscope that is to be used during a surgical procedure, such as an oncological surgical procedure or during tumor surgery. Accordingly, the object being viewed through the microscope, and shown in the composite image, may be a sample of organic tissue of a patient, and may in particular be the surgical site that the surgeon operates on during the surgical procedure. In the following, the object to be imaged, i.e., the surgical site, is assumed to be a surgical site of a brain during the course of neurosurgery. However, the proposed concept is also suitable for other types of surgery, such as cardiac surgery or ophthalmology.
The proposed concept is based on two main components—obtaining the user input to define the area of interest and using the user-specified area of interest to determine the anatomical feature of interest, which is then tracked so the autofocus can be applied on the anatomical feature of interest even if the field of view changes or if the feature of interest is moved.
The proposed concept starts by determining the area of interest based on the user input. There are various options for performing this task. For example, the user may use a pointer, such as a finger, a surgical instrument, or a dedicated pointer device (e.g., with an active visual marker such as a light, or a passive visual marker such as a two-dimensional code) and point at, or encircle, one or more anatomical features to indicate the area of interest (with the area of interest surrounding the point or circle being indicated by the user). The pointing and/or encircling may be detected using a machine-learning model being trained to determine pointing and/or encircling actions in imaging sensor data or using a deterministic algorithm that is configured to determine the pointing or encircling action based on the imaging sensor data (e.g., based on the active or passive visual marker or by performing object detection on the tip of the finger). The system may be configured to detect the pointer operated by the user within the imaging sensor data to determine the area of interest. For example, the system may be configured to, if the user points at a (single) position in the field of view, determine the area of interest around the (single) point. For example, if the user points at a feature (e.g., an anatomical or non-anatomical feature), the area of interest may surround said feature. The system may be configured to, if the user encircles an area within the field of view, determine the area of interest to coincide with the area being encircled by the user. For example, if the user encircles a feature or set of mutually adjacent features, the area of interest may surround said feature or set of features. The system may be configured to, if the user points at multiple positions, determine whether the points intersect with the same anatomical feature, and then set the area of interest around said (portion of the) anatomical feature.
For example, if the user points multiple times along an elongated blood vessel (e.g., as shown in
A similar approach may be taken if the user defines the area of interest via a user interface, such as a touch screen of the surgical microscope system. For example, the system may be configured to determine the area of interest based on a user input signal obtained via the user interface of the surgical microscope system. Again, the user may use a single or multiple points or a circle to define the area of interest.
To support the selection process, the imaging sensor data may be analyzed to determine and distinguish features in the imaging sensor data, such as anatomical features (e.g., blood vessels, tumors, branches, portions of tissue etc.) or non-anatomical features (e.g., clips or stitches). For this, one or both of the following machine-learning-based techniques may be used—image segmentation and object detection. The system may be configured to perform image segmentation and/or object detection to determine a plurality of features within the imaging sensor data. In object detection, the location of one or more pre-defined objects (i.e., objects that the respective machine-learning model is trained on) in the imaging sensor data is output by a machine-learning model, along with a classification of the object (if the machine-learning model is trained to detect multiple different types of objects). In general, the location of the one or more pre-defined objects is provided as a bounding box, i.e., a set of positions forming a rectangular shape that surrounds the respective object being detected. In image segmentation, the location of features (i.e., portions of the imaging sensor data that have similar attributes, e.g., that belong to the same object) are output by a machine-learning model. In general, the location of the features is provided as a pixel mask, i.e., the location of pixels that belong to a feature are output on a per-feature basis.
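The relationship between the two output formats can be illustrated with a short sketch that derives a bounding box from a per-feature pixel mask. The mask is assumed to be given as a list of rows containing 0 or 1; this data layout and the function name are illustrative assumptions rather than part of the disclosure.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def bounding_box_from_mask(mask: List[List[int]]) -> Optional[Box]:
    """Return the tightest rectangle surrounding all set pixels of the mask,
    or None if the mask contains no set pixel."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    width = len(mask[0])
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    return (min(cols), min(rows), max(cols), max(rows))
```

The pixel mask carries the exact shape of the feature, whereas the bounding box only retains its rectangular extent.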
For both object detection and image segmentation, a machine-learning model is used that is trained to perform the respective task. For example, to train the machine-learning model being trained to perform object detection, a plurality of samples of imaging sensor data may be provided as training input samples, and a corresponding listing of bounding box coordinates may be provided as desired output of the training, with a supervised learning-based training algorithm being used to perform the training using the plurality of training input samples and corresponding desired output. For example, to train the machine-learning model being trained to perform image segmentation, the plurality of samples of imaging sensor data may be provided as training input samples, and corresponding pixel masks may be provided as desired output of the training, with a supervised learning-based training algorithm being used to perform the training using the plurality of training input samples and corresponding desired output. In some examples, the same machine-learning model may be used to perform both object detection and image segmentation. In this case, the two above-mentioned types of desired output may be used in parallel during the training, with the machine-learning model being trained to output both the bounding boxes and the pixel masks. The machine-learning model might not only be used to support the selection of the area of interest, but also to determine the anatomical feature of interest, as will be shown at a later stage.
The system may be configured to determine a visual representation of the plurality of features. For example, the system may be configured to determine the visual representation with overlays highlighting and/or delineating the plurality of features, e.g., with an outline around the features (e.g., based on the pixel masks or bounding boxes), and/or with a color overlay being overlaid over the respective features (e.g., based on the pixel masks or bounding boxes). For example, if object detection is being performed, a written description of the feature may be included next to the feature. In addition, a grid may be displayed, which may become useful for a voice-based user interface. For example, the visual representation may be included in a digital view of the surgical site being viewed through the microscope, e.g., overlaid over the digital view. The system may be configured to provide a display signal comprising (the digital view with) the visual representation to a display device 130 (e.g., the ocular displays 130a or the auxiliary display 130b) of the surgical microscope system. The user may use the visual representation to select the area of interest, e.g., by pointing at one or multiple of the features or by encircling the features. Accordingly, the system may be configured to obtain the input of the user in response to the visual representation of the plurality of features.
This visual representation may be particularly useful if a voice-based user interface is used to select the area of interest. For example, the system may be configured to determine the area of interest based on a voice description of an anatomical feature obtained via a voice command system of the surgical microscope system. The system may be configured to process the voice description using a voice processing algorithm to associate the anatomical feature being described with one of the plurality of features identified using object detection and/or image segmentation. For example, the system may be configured to, if the voice description comprises a reference to a cell of the grid, select an anatomical feature fitting the description that is shown within the cell of the grid, and determine the area of interest based on the selected anatomical feature. Alternatively, or additionally, a natural language description may be used (e.g., “the largest branch on the lower right side”, or “the portion of the blood vessel between the two branches shown on the right side of the view”), and the system may be configured to select the anatomical feature based on the natural language description and determine the area of interest based on the selected anatomical feature.
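The grid-based variant may be sketched as follows, assuming that the voice processing has already yielded a grid cell reference and a feature label. The cell naming scheme (a column letter followed by a 1-based row number, e.g., "B2"), the feature dictionaries, and the function names are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def grid_cell_rect(cell: str, width: int, height: int,
                   columns: int, rows: int) -> Rect:
    """Map a cell reference such as 'B2' to its pixel rectangle."""
    col = ord(cell[0].upper()) - ord("A")
    row = int(cell[1:]) - 1
    if not (0 <= col < columns and 0 <= row < rows):
        raise ValueError(f"cell {cell!r} lies outside the grid")
    cell_w, cell_h = width / columns, height / rows
    return (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)


def select_feature_in_cell(features: List[Dict], cell: str, label: str,
                           width: int, height: int,
                           columns: int = 4, rows: int = 3) -> Optional[Dict]:
    """Return the first feature with the spoken label whose centroid
    lies within the referenced grid cell, or None if there is none."""
    x0, y0, x1, y1 = grid_cell_rect(cell, width, height, columns, rows)
    for feature in features:
        cx, cy = feature["centroid"]
        if feature["label"] == label and x0 <= cx < x1 and y0 <= cy < y1:
            return feature
    return None
```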
Above, the area of interest is selected explicitly by the user. Alternatively, the selection of the area of interest may be performed implicitly, with the system interpreting the actions of the user (e.g., the surgeon) to determine the area of interest. For example, the system may be configured to determine a portion of a surgical site being operated on by the user within the imaging sensor data to determine the area of interest. For example, the system may be configured to perform object detection to detect the location of one or more surgical instruments within the imaging sensor data, and to determine the portion of the surgical site being operated on by the user based on the location of the one or more surgical instruments, and to determine the area of interest around the location of the one or more surgical instruments (e.g., around the surgical site-facing side of the one or more surgical instruments). Additionally, or alternatively, the system may be configured to determine the area of interest using a gaze tracking mechanism, e.g., by tracking the gaze of the user through the oculars of the microscope and determining the area of interest to be an area the gaze is focused on for a pre-defined time interval. Alternatively, or additionally, the system may be configured to determine the area of interest from a predetermined image area (within the field of view, e.g., a central area within the field of view covering at least 10% and at most 50% of the field of view) after the user finishes aligning the field of view (i.e., after the user changes the position of the microscope or adjusts the magnification of the microscope).
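For the predetermined central area, a minimal sketch is given below. It assumes a rectangular area that keeps the aspect ratio of the field of view, so that each side is scaled by the square root of the covered fraction. The function name and the default fraction of 25% are illustrative assumptions.

```python
import math
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def central_area(width: float, height: float, fraction: float = 0.25) -> Rect:
    """Return a centered rectangle covering `fraction` of the field of view."""
    if not 0.10 <= fraction <= 0.50:
        raise ValueError("fraction is expected to lie between 10% and 50%")
    scale = math.sqrt(fraction)  # side scale that yields the requested area share
    w, h = width * scale, height * scale
    x0, y0 = (width - w) / 2, (height - h) / 2
    return (x0, y0, x0 + w, y0 + h)
```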
Once the area of interest is determined, the anatomical feature of interest within the area of interest is selected. In particular, the system may be configured to locate the area of interest within the imaging sensor data (and thus the current field of view), and to determine the anatomical feature of interest within a portion of the imaging sensor data representing the area of interest, e.g., using image processing techniques, such as the aforementioned object detection or image segmentation. In other words, image processing techniques are used to automatically, e.g., without additional user input in addition to the selection of the area of interest, select the anatomical feature of interest.
As outlined above, at least one of the two techniques “object detection” and “image segmentation” may be used to analyze the imaging sensor data and determine features (e.g., anatomical, or non-anatomical features) within the imaging sensor data, with the selected anatomical feature of interest being among the determined features. For this, either the (optional) previous determination of features across the imaging sensor data may be used, or the object detection or image segmentation may be performed on a subset of the imaging sensor data corresponding to the area of interest (and potentially some surrounding areas, as will become evident in the following). Accordingly, the system may be configured to perform image segmentation on at least the portion of the imaging sensor data representing the area of interest to determine at least one feature present within the portion of the imaging sensor data representing the area of interest, and to determine the anatomical feature of interest based on the at least one feature (e.g., pixel mask) present within (e.g., intersecting with) the portion of the imaging sensor data representing the area of interest. Alternatively, or additionally, the system may be configured to perform object detection on at least the portion of the imaging sensor data representing the area of interest to identify at least one feature (e.g., bounding box) present within (e.g., intersecting with) the portion of the imaging sensor data representing the area of interest, and to determine the anatomical feature of interest based on the identified at least one feature present within the portion of the imaging sensor data representing the area of interest. In other words, if the output of the object detection or image segmentation machine-learning model indicates that one or more features intersect with the area of interest in the imaging sensor data, said feature (or features) may be used as feature(s) of interest.
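The intersection test described above may be sketched as follows for the bounding-box case, assuming that both the detections and the area of interest are axis-aligned rectangles in pixel coordinates. The tuple layout and the function names are assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
Detection = Tuple[str, Box]              # (class label, bounding box)


def boxes_intersect(a: Box, b: Box) -> bool:
    """True if the two rectangles overlap in a region of non-zero size."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def features_in_area(detections: List[Detection], area: Box) -> List[Detection]:
    """Keep only the detections whose bounding box intersects the area of interest."""
    return [d for d in detections if boxes_intersect(d[1], area)]
```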
If multiple points are indicated by the user, the feature or combination of features intersecting with all or most of the points may be selected as anatomical feature of interest.
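A simple voting scheme for this case is sketched below, assuming that each candidate feature is available as a set of pixel coordinates and that "most" is interpreted as more than half of the indicated points. Both the data layout and this majority rule are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

Pixel = Tuple[int, int]


def select_feature_by_points(
    masks: Dict[str, Set[Pixel]],
    points: Sequence[Pixel],
) -> Optional[str]:
    """Return the name of the feature hit by most of the indicated points,
    or None if no feature is hit by more than half of them."""
    if not points:
        return None
    hits = {name: sum(p in pixels for p in points) for name, pixels in masks.items()}
    best = max(hits, key=hits.get, default=None)
    if best is None or hits[best] * 2 <= len(points):
        return None
    return best
```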
In some examples, the features being used to select the anatomical feature of interest may be restricted to specific groups of features. For example, the system may be configured to perform the object detection to identify at least one of a blood vessel, branching points of a blood vessel, a bleeding, a tumor, a surgical cavity, tissue having a pre-defined color, tissue having a discoloration, an elevated area, and a depressed area within at least the portion of the imaging sensor data representing the area of interest. Accordingly, the machine-learning model being trained to perform object detection may be trained to detect at least one of a blood vessel, branching points of a blood vessel, a bleeding, a tumor, a surgical cavity, tissue having a pre-defined color, tissue having a discoloration, an elevated area, and a depressed area in imaging sensor data. Analogously, the machine-learning model being trained to perform image segmentation may be trained to perform image segmentation for at least one of a blood vessel, branching points of a blood vessel, a bleeding, a tumor, a surgical cavity, tissue having a pre-defined color, tissue having a discoloration, an elevated area, and a depressed area in imaging sensor data. Accordingly, the anatomical feature of interest may be selected based on the output of one or more machine-learning models being trained to output information, such as bounding boxes or pixel masks, representing specific features, such as the aforementioned blood vessel, branching point, bleeding, tumor, surgical cavity, tissue having a pre-defined color, tissue having a discoloration, elevated area, or depressed area.
In the following, some examples are given on how the proposed methodology is applied to imaging sensor data collected during neurosurgery.
In some examples, the system may also consider the neighboring structures, such as branching of the same vessels, which are not directly within the user-defined area, i.e., a certain segment of the vessel may be understood by the algorithm as the vessel part between two characteristic branches or vessel curvatures. For example, as shown in
In some examples, the area between two characteristic features may be deemed to be of interest. In absence of characteristic features, the area of interest, and thus the anatomical feature of interest, may be specified as an area spatially oriented relative to other features even outside the area of interest. For example, the system may be configured to, if the area of interest indicated by the user relates to two or more features or to an area between two or more features, to determine the anatomical feature of interest based on the two or more features, and to determine the position of the anatomical feature of interest based on the positions of the two or more features. Moreover, the system may be configured to, if the area of interest indicated by the user relates to an area between two or more features, select the two or more features from features located inside the area of interest and features located outside the area of interest. For example, the area of interest may be understood by the algorithm as the tissue between two vertically oriented vessels near but outside the user-indicated area of interest. A secondary optical imaging sensor with a wider field of view may also be used for this purpose, i.e., for selecting features outside the area of interest. In some examples, the afore-mentioned image analysis/processing techniques may be used to extract features which are either of low contrast, or invisible, or require quantification. For example, the system may be configured to perform image processing to measure parameters related to vessels/capillaries such as density, branching, and curvature, and use these parameters to segment and identify the anatomical feature of interest. For example, a tissue area with high density of capillaries may be an anatomical feature of interest.
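As a hedged illustration of determining the position of the anatomical feature of interest from the positions of two or more features, the following sketch takes the mean of the bounding-box centers of the reference features. Using the mean of the centers is an assumption made for this example; the description leaves open how the position is derived. The function names are hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    """Return the center of an axis-aligned bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def position_between(boxes: Sequence[Box]) -> Point:
    """Return the mean of the box centers as the position between the features."""
    if len(boxes) < 2:
        raise ValueError("at least two reference features are required")
    centers = [box_center(b) for b in boxes]
    mean_x = sum(c[0] for c in centers) / len(centers)
    mean_y = sum(c[1] for c in centers) / len(centers)
    return (mean_x, mean_y)
```

The reference features passed in may lie inside or outside the user-indicated area of interest, e.g., two vertically oriented vessels on either side of the tissue to be kept in focus.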
In some examples, the system may also consider foreign objects such as clips or stitches. For example, the system may be configured to perform the object detection to identify at least one of a clip and a stitching as non-anatomical feature of interest. Accordingly, the machine-learning model being trained to perform object detection and/or image segmentation may be trained based on imaging sensor data showing non-anatomical features (as training input samples) and the corresponding bounding box coordinates and/or pixel masks as desired output. The system may be configured to detect a position of the non-anatomical feature of interest within the imaging sensor data, and to perform the autofocus functionality of the microscope further based on the position of the non-anatomical feature of interest. The non-anatomical feature of interest may be used in addition to or instead of the anatomical feature of interest, e.g., the system may support both using the anatomical feature and the non-anatomical feature with respect to the autofocus capability. For example, if the user points at an anatomical feature of interest or the area of interest comprises a (more prominent) anatomical feature, the autofocus functionality may be performed based on the anatomical feature, and if the user points at a non-anatomical feature of interest or the area of interest comprises a (more prominent) non-anatomical feature, the autofocus functionality may be performed based on the non-anatomical feature. In other words, if the user selection (clearly) relates to the non-anatomical feature, the autofocus functionality may be performed based on the non-anatomical feature, else the autofocus functionality may be performed based on the anatomical feature.
Once the anatomical (or non-anatomical) feature of interest is determined, the autofocus functionality is targeted at the feature of interest. In particular, the system is configured to detect the position of the anatomical feature of interest within the imaging sensor data, and to trigger the autofocus functionality of the microscope to focus on the position of the anatomical or non-anatomical feature of interest. In other words, the system may be configured to set the focus point used by the autofocus system to the position of the (anatomical or non-anatomical) feature of interest.
In various examples, the elaborate determination of the feature of interest is performed to enable tracking the feature of interest across frames and reengaging the autofocus functionality once the field of view changes (e.g., because the user has moved the microscope or changed the magnification of the microscope) or once the position of the feature of interest changes. Accordingly, the system may be configured to track the position of the anatomical or non-anatomical feature of interest over a plurality of frames of the imaging sensor data, e.g., by performing the object detection or image segmentation on the plurality of frames and determining a correspondence between features of subsequent frames (e.g., based on the shape of the features indicated by the pixel masks output by the image segmentation). However, since this process is computationally expensive and the difference between subsequent frames is usually small, the tracking may be performed at a lower frame rate than the frame rate of the imaging sensor data. The system may thus be configured to detect the position of the anatomical feature of interest in at most every second frame (e.g., every n-th frame, with n∈{2, 3, 4, 6, 10, 12, 15, 24, 30, 45, 60}) of the imaging sensor data. The system may be configured to trigger the autofocus functionality if the position of the anatomical or non-anatomical feature of interest shifts relative to a field of view of the imaging sensor data (e.g., as the field of view shifts or as the anatomical feature is moved) for at least a pre-defined time interval, e.g., for at least 1, or at least 2, or at least 5, or at least 10 seconds. In other words, the autofocus functionality may be reengaged based on the position of the anatomical or non-anatomical feature of interest if the position of the anatomical or non-anatomical feature of interest changes relative to the field of view not only temporarily.
The system may be configured to detect the feature of interest shifting relative to the field of view, e.g., by comparing the position of the feature of interest in successive frames.
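The reduced-rate tracking and the time-based trigger can be sketched as follows. This is a minimal sketch under stated assumptions: the position is a pixel coordinate supplied by a detector that is not shown, and the class name `RefocusTrigger` as well as the parameters `every_nth`, `hold_seconds` and `tolerance_px` are hypothetical. The tolerance in particular is an assumption, as the description does not specify how large a shift must be.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


class RefocusTrigger:
    """Decide when to re-trigger autofocus after a persistent position shift."""

    def __init__(self, every_nth: int = 6, hold_seconds: float = 2.0,
                 tolerance_px: float = 10.0) -> None:
        self.every_nth = every_nth          # detect only in every n-th frame
        self.hold_seconds = hold_seconds    # shift must persist this long
        self.tolerance_px = tolerance_px    # assumed minimum shift, in pixels
        self.focused_position: Optional[Point] = None
        self.shift_started_at: Optional[float] = None

    def should_detect(self, frame_index: int) -> bool:
        """Return True if the detector should run on this frame."""
        return frame_index % self.every_nth == 0

    def update(self, timestamp_s: float, position: Point) -> bool:
        """Return True if autofocus should be triggered for this detection."""
        if self.focused_position is None:
            self.focused_position = position
            return True  # initial focus on the feature of interest
        if math.dist(position, self.focused_position) <= self.tolerance_px:
            self.shift_started_at = None  # the shift was only temporary
            return False
        if self.shift_started_at is None:
            self.shift_started_at = timestamp_s
        if timestamp_s - self.shift_started_at >= self.hold_seconds:
            self.focused_position = position
            self.shift_started_at = None
            return True
        return False
```

A caller would invoke `should_detect` for each incoming frame, run the object detection or image segmentation only when it returns True, and pass the detected position together with the frame timestamp to `update`.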
In some examples, the feature of interest may change during the surgical procedure. For example, if the feature of interest is a tumor, the tumor may be removed; if the feature of interest is a blood vessel, the blood vessel may be deformed; and if the feature of interest is a crossing of two blood vessels, one of the blood vessels may be removed. In such cases, the anatomical feature of interest may be updated and/or redefined. In other words, the system may be configured to repeat determining the anatomical feature of interest if the previously determined feature of interest is removed or deformed. In this case, the area surrounding the previous anatomical feature of interest may be used as area of interest, and the new anatomical feature of interest may be determined within said area of interest. For example, if the anatomical feature of interest is a vessel, and at some point, the vessel is deformed, the new deformed shape may be used in the definition of the anatomical feature of interest. Alternatively, or additionally, a different type of dynamic definition of the anatomical feature of interest may be used. For example, the anatomical feature of interest may start as the point where an artery is crossing with a vein, and during the operation the vein may be removed from the field of view. In this case, the anatomical feature of interest may be redefined dynamically as the artery part after a specific branching. In some examples, a smart-dynamic adaptation of the anatomical feature of interest may be performed by considering the area in which the surgeon is working. For example, as the surgeon is digging in the tissue towards a deeper tumor, the algorithm may follow the “point of action”.
The anatomical or non-anatomical feature of interest being selected, and thus the point being used with respect to the autofocus functionality, may be communicated to the user of the surgical microscope system (e.g., to the surgeon). As outlined above, a digital view of the surgical site may be generated based on the imaging sensor data, showing the surgical site together with additional information, e.g., settings of the surgical microscope system etc. Accordingly, the system may be configured to generate the digital view based on the imaging sensor data. In addition, the selected anatomical or non-anatomical feature of interest may be highlighted, e.g., by showing a color overlay over the anatomical or non-anatomical feature of interest, or by showing a visual indicator representing the focus point. The system may be configured to highlight the area of interest (e.g., as an outline) and/or the feature of interest (e.g., as another outline, color overlay or visual indicator representing the focus point) within the digital view, e.g., by adding an overlay with the outline(s), color overlay or visual indicator. The system may be configured to provide the display signal comprising the digital view to the display device 130 of the surgical microscope system.
In various examples, a visual representation of the plurality of features and/or a digital view of the surgical site are generated and provided to a display 130 of the surgical microscope system as part of a display signal. The visual representation or digital view may be viewed by the user, e.g., the surgeon, of the surgical microscope system. For this purpose, the display signal may be provided to the display, e.g., the auxiliary display 130b or the ocular displays 130a, of the microscope system. Accordingly, the system may be configured to generate the display signal for the display device 130 of the microscope system, the display signal being based on the digital view or visual representation. For example, the display signal may be a signal for driving (e.g., controlling) the display device 130. For example, the display signal may comprise video data and/or control instructions for driving the display. For example, the display signal may be provided via one of the one or more interfaces 112 of the system. Accordingly, the system 110 may comprise a video interface 112 that is suitable for providing the display signal to the display 130 of the microscope system 100.
In the proposed microscope system, at least one optical imaging sensor is used to provide the imaging sensor data. Accordingly, the optical imaging sensor 122 is configured to generate the imaging sensor data. For example, the at least one optical imaging sensor 122 of the microscope 120 may comprise or be an APS (Active Pixel Sensor)-based or a CCD (Charge-Coupled Device)-based imaging sensor 122. For example, in APS-based imaging sensors, light is recorded at each pixel using a photodetector and an active amplifier of the pixel. APS-based imaging sensors are often based on CMOS (Complementary Metal-Oxide-Semiconductor) or S-CMOS (Scientific CMOS) technology. In CCD-based imaging sensors, incoming photons are converted into electron charges at a semiconductor-oxide interface, which are subsequently moved between capacitive bins in the imaging sensors by a circuitry of the imaging sensors to perform the imaging. The processing system 110 may be configured to obtain (i.e., receive or read out) the imaging sensor data from the optical imaging sensor. The imaging sensor data may be obtained by receiving the imaging sensor data from the optical imaging sensor (e.g., via the interface 112), by reading the imaging sensor data out from a memory of the optical imaging sensor (e.g., via the interface 112), or by reading the imaging sensor data from a storage device 116 of the system 110, e.g., after the imaging sensor data has been written to the storage device 116 by the optical imaging sensor or by another system or processor.
The one or more interfaces 112 of the system 110 may correspond to one or more inputs and/or outputs for receiving and/or transmitting information, which may be in digital (bit) values according to a specified code, within a module, between modules or between modules of different entities. For example, the one or more interfaces 112 may comprise interface circuitry configured to receive and/or transmit information. The one or more processors 114 of the system 110 may be implemented using one or more processing units, one or more processing devices, any means for processing, such as a processor, a computer or a programmable hardware component being operable with accordingly adapted software. In other words, the described function of the one or more processors 114 may as well be implemented in software, which is then executed on one or more programmable hardware components. Such hardware components may comprise a general-purpose processor, a Digital Signal Processor (DSP), a micro-controller, etc. The one or more storage devices 116 of the system 110 may comprise at least one element of the group of a computer readable storage medium, such as a magnetic or optical storage medium, e.g., a hard disk drive, a flash memory, Floppy-Disk, Random Access Memory (RAM), Programmable Read Only Memory (PROM), Erasable Programmable Read Only Memory (EPROM), an Electronically Erasable Programmable Read Only Memory (EEPROM), or a network storage.
More details and aspects of the surgical microscope system are mentioned in connection with the proposed concept, or one or more examples described above or below (e.g.,
For example, the method may be implemented by the surgical microscope system introduced in connection with one of the
More details and aspects of the method are mentioned in connection with the proposed concept, or one or more examples described above or below (e.g.,
Various examples of the present disclosure relate to a concept for an adaptive autofocus.
In general, the surgeon needs to manually focus on the region of interest every time they move the microscope. The proposed concept provides an autofocus on the region of interest, which is manually defined (only once) by the surgeon. Thus, there is no need to spend time adjusting the focus each time the surgeon moves the microscope. Consequently, the region of interest may always be in focus.
The proposed concept is based on four components: In a first component, the surgeon identifies an area of interest that needs to remain in focus. In a second component, an algorithm converts the local information, i.e., region of interest (ROI), also area of interest, as discussed in connection with
In the following, some implementation examples are given for the four fundamental components listed above. For example, with respect to the first component, the surgeon may use a pointing device, e.g., computer mouse, or touchscreen, to define the ROI. Alternatively, or additionally, the surgeon may use an object within the field of view, e.g., use the tip of a surgical tool, or the fingertip. In some examples, the microscope system may employ an eye-tracking device, so that the surgeon can look at an image area and thus indicate the ROI. For example, the defined ROI may be defined using a standard (i.e., geometric) shape (e.g., circle or square), or may be defined using a free-form line outlining the desired structure.
In some examples, the user may point or click once or multiple times on the desired feature, e.g., clicking along a vessel. Additionally, or alternatively, the surgeon may use a voice command, e.g., “follow the branching of the big artery at the upper-left side”. In some examples, the microscope may segment the image based on different criteria and the surgeon may select one of the segments, e.g., segmentation of vessels, and vessel features (e.g., branching, deformations, color), surgical cavities, colors/discolorations, and elevated areas. For example, the segmentation may (also) use information beyond the information shown in the color image, such as fluorescence, multispectral/hyperspectral images, and preoperative data (e.g., a tumor is known to be deep in the tissue). For example, the segmentation of the image may be done or assisted by the surgical activity of the surgeon, e.g., the area that the surgeon is cutting, touching, or manipulating.
After the ROI is selected by the surgeon, the system may identify the feature of interest (FOI), i.e., perform the second component of the proposed concept. For example, the system may identify morphological and color features, such as arteries, veins, and unique shape identifiers on them, e.g., branching. The system may identify low-contrast or non-visible information within the ROI. For example, if the surgeon clicks multiple times on a vein, the system may understand that the selection relates to the vein and not the nearby arteries, or if the surgeon clicks on the fully oxygenated blood on the surface of a cavity (micro-bleeding), the algorithm may consider the blood saturation and the shape of this element (micro-bleeding) and thus identify the surgical cavity. In some examples, the system may also consider the neighboring structures, such as branching of the same vessels, which are not directly within the user-defined area, i.e., a certain segment of the vessel may be understood by the algorithm as the vessel part between two characteristic branches or vessel curvatures. In some examples, the system may consider foreign objects such as clips or stitches.
In absence of characteristic features, it may be possible to define the ROI, and consequently the FOI, as an area spatially oriented relative to other features even outside the ROI. For example, the ROI may be understood by the algorithm as the tissue between two vertically oriented vessels near but outside the user-indicated ROI. A secondary camera with a wider Field of View (FoV) may also be used for this purpose. In some examples, image analysis/processing techniques may be used to extract features which are either of low contrast, or invisible, or require quantification. For example, image processing may measure parameters related to vessels/capillaries such as density, branching, and curvature, and use these parameters to segment and identify the FOI. A tissue area with high density of capillaries may be a FOI. In general, a FOI may be described by any one or more features. The result of the FOI identification process may be visualized to ensure the validity of the recognition. The surgeon may confirm the result, or the surgeon might only take action if the result is wrong. For example, the system may indicate the type of features the algorithm uses for the tracking with specific symbols (e.g., an arrow indicating the size of a discolored patch, an arc indicating the curvature of a vessel, or the presence of a fluorescence signal). Thereby the surgeon may be better equipped to control the tracking validity.
In the third component of the system, the algorithm searches for the FOI. For example, the algorithm may search for the FOI using a dynamic FOI description. For example, the FOI may be searched in each frame, but regularly (e.g., every n-th frame), the description of the FOI may be redefined (i.e., updated). For example, when the FOI is a vessel, and at some point, the vessel is deformed, the new deformed shape may be used in the definition. Alternatively, or additionally, a different type of dynamic FOI definition may be used. For example, in addition to the dynamic FOI definition specified above, the different type of FOI definition may provide the ability to change the way the FOI is defined. For example, a FOI may start as the point where an artery is crossing with a vein, and during the operation the vein may be removed from the FoV. In this case, the FOI may be redefined dynamically as the artery part after a specific branching. In some examples, a smart-dynamic FOI adaptation may be performed by considering the area in which the surgeon is working. For example, as the surgeon is digging in the tissue towards a deeper tumor, the algorithm may follow the “point of action”. Similar to the definition of the FOI, the result of the FOI-searching process may be visualized (e.g., by outlining the current ROI being used for autofocusing). This can be done continuously, periodically (e.g., blinking for 1 second every 15 seconds), when there is a change in the parameter (e.g., when the length of an anatomical structure changes), or when the recognized feature is changed (e.g., the algorithm stops tracking the crossing point of vessels, and now tracks a branching). In some examples, the proposed system may indicate the certainty of the algorithm in the identification of the FOI. This may allow the surgeon to redefine the ROI/FOI before a crucial part of the operation and thus avoid the greater inconvenience of an autofocus failure at a crucial surgical step.
With respect to the fourth component, i.e., the communication of the ROI to the autofocus subsystem, the following implementation approaches may be used. For example, the focus ROI may be communicated with a time delay to avoid errors due to momentary misinterpretation of the FOI by the algorithm. For example, if the FOI is covered by another object (e.g., tissue, tool, gauze), this should not cause the algorithm to behave erratically. For example, the communication frequency may be influenced by the certainty/confidence of the FOI-recognition. For example, when the algorithm is certain (i.e., the FOI is found with a high level of certainty), the focus ROI may be updated instantly, but when the certainty is low, the algorithm may either wait until the confidence rises or issue a warning. In some examples, the ROI may contain different weights for ROI zones with different confidence. For example, tissue structures that are recognized well may have 90% weight, and at the same time other tissue areas that have a vague recognition (e.g., as the fluorescence signal is weak) may have 30% weight. This means that the autofocus may give priority to focusing on the zones with high confidence.
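The confidence-dependent communication and the zone weighting can be illustrated with the following minimal sketch. It assumes that each ROI zone provides a sharpness value for a candidate focus setting and that the recognition confidence is a number between 0 and 1. The function names, the threshold of 0.8 and the maximum waiting time of 3 seconds are assumptions made for illustration and are not taken from the description.

```python
from typing import Sequence, Tuple


def weighted_focus_score(zones: Sequence[Tuple[float, float]]) -> float:
    """Return the weight-averaged sharpness over (sharpness, weight) zone pairs."""
    total_weight = sum(weight for _, weight in zones)
    if total_weight == 0:
        raise ValueError("at least one zone must have a non-zero weight")
    return sum(sharpness * weight for sharpness, weight in zones) / total_weight


def focus_update_action(confidence: float, waited_s: float,
                        threshold: float = 0.8,   # assumed confidence threshold
                        max_wait_s: float = 3.0   # assumed maximum waiting time
                        ) -> str:
    """Return 'update', 'wait' or 'warn' for the current FOI recognition."""
    if confidence >= threshold:
        return "update"  # FOI found with high certainty: update the focus ROI
    if waited_s < max_wait_s:
        return "wait"    # low certainty: wait for the confidence to rise
    return "warn"        # still uncertain after waiting: issue a warning
```

With weights of 0.9 and 0.3 as in the example above, a focus setting that is sharp in the well-recognized zone scores higher than one that is sharp only in the vaguely recognized zone, so an autofocus search maximizing this score would prioritize the high-confidence zone.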
More details and aspects of the concept for adaptive autofocus are mentioned in connection with the proposed concept or one or more examples described above or below (e.g.,
Some embodiments relate to a microscope comprising a system as described in connection with one or more of the
The computer system 420 may be a local computer device (e.g. personal computer, laptop, tablet computer or mobile phone) with one or more processors and one or more storage devices or may be a distributed computer system (e.g. a cloud computing system with one or more processors and one or more storage devices distributed at various locations, for example, at a local client and/or one or more remote server farms and/or data centers). The computer system 420 may comprise any circuit or combination of circuits. In one embodiment, the computer system 420 may include one or more processors which can be of any type. As used herein, processor may mean any type of computational circuit, such as but not limited to a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a graphics processor, a digital signal processor (DSP), multiple core processor, a field programmable gate array (FPGA), for example, of a microscope or a microscope component (e.g. camera) or any other type of processor or processing circuit. Other types of circuits that may be included in the computer system 420 may be a custom circuit, an application-specific integrated circuit (ASIC), or the like, such as, for example, one or more circuits (such as a communication circuit) for use in wireless devices like mobile telephones, tablet computers, laptop computers, two-way radios, and similar electronic systems. The computer system 420 may include one or more storage devices, which may include one or more memory elements suitable to the particular application, such as a main memory in the form of random access memory (RAM), one or more hard drives, and/or one or more drives that handle removable media such as compact disks (CD), flash memory cards, digital video disk (DVD), and the like.
The computer system 420 may also include a display device, one or more speakers, and a keyboard and/or controller, which can include a mouse, trackball, touch screen, voice-recognition device, or any other device that permits a system user to input information into and receive information from the computer system 420.
Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a processor, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a non-transitory storage medium such as a digital storage medium, for example a floppy disc, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the present invention is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the present invention is, therefore, a storage medium (or a data carrier, or a computer-readable medium) comprising, stored thereon, the computer program for performing one of the methods described herein when it is performed by a processor. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory. A further embodiment of the present invention is an apparatus as described herein comprising a processor and the storage medium.
A further embodiment of the invention is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example, via the internet.
A further embodiment comprises a processing means, for example, a computer or a programmable logic device, configured to, or adapted to, perform one of the methods described herein.
A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.
As used herein the term “and/or” includes any and all combinations of one or more of the associated listed items and may be abbreviated as “/”.
Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
Embodiments may be based on using a machine-learning model or machine-learning algorithm. Machine learning may refer to algorithms and statistical models that computer systems may use to perform a specific task without using explicit instructions, instead relying on models and inference. For example, in machine-learning, instead of a rule-based transformation of data, a transformation of data may be used, that is inferred from an analysis of historical and/or training data. For example, the content of images may be analyzed using a machine-learning model or using a machine-learning algorithm. In order for the machine-learning model to analyze the content of an image, the machine-learning model may be trained using training images as input and training content information as output. By training the machine-learning model with a large number of training images and/or training sequences (e.g. words or sentences) and associated training content information (e.g. labels or annotations), the machine-learning model “learns” to recognize the content of the images, so the content of images that are not included in the training data can be recognized using the machine-learning model. The same principle may be used for other kinds of sensor data as well: By training a machine-learning model using training sensor data and a desired output, the machine-learning model “learns” a transformation between the sensor data and the output, which can be used to provide an output based on non-training sensor data provided to the machine-learning model. The provided data (e.g. sensor data, meta data and/or image data) may be preprocessed to obtain a feature vector, which is used as input to the machine-learning model.
Machine-learning models may be trained using training input data. The examples specified above use a training method called “supervised learning”. In supervised learning, the machine-learning model is trained using a plurality of training samples, wherein each sample may comprise a plurality of input data values, and a plurality of desired output values, i.e. each training sample is associated with a desired output value. By specifying both training samples and desired output values, the machine-learning model “learns” which output value to provide based on an input sample that is similar to the samples provided during the training. Apart from supervised learning, semi-supervised learning may be used. In semi-supervised learning, some of the training samples lack a corresponding desired output value. Supervised learning may be based on a supervised learning algorithm (e.g. a classification algorithm, a regression algorithm or a similarity learning algorithm). Classification algorithms may be used when the outputs are restricted to a limited set of values (categorical variables), i.e. the input is classified to one of the limited set of values. Regression algorithms may be used when the outputs may have any numerical value (within a range). Similarity learning algorithms may be similar to both classification and regression algorithms but are based on learning from examples using a similarity function that measures how similar or related two objects are. Apart from supervised or semi-supervised learning, unsupervised learning may be used to train the machine-learning model. In unsupervised learning, (only) input data might be supplied and an unsupervised learning algorithm may be used to find structure in the input data (e.g. by grouping or clustering the input data, finding commonalities in the data).
Clustering is the assignment of input data comprising a plurality of input values into subsets (clusters) so that input values within the same cluster are similar according to one or more (pre-defined) similarity criteria, while being dissimilar to input values that are included in other clusters.
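For illustration only, the clustering described above may be sketched with a one-dimensional k-means procedure, in which the similarity criterion is the absolute distance to a cluster center. The choice of k-means, the function names and the input values are assumptions made for the example; other clustering algorithms and similarity criteria may be used.

```python
def assign_to_clusters(values, centroids):
    """Assign every value to the cluster whose centroid is closest."""
    clusters = [[] for _ in centroids]
    for value in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
        clusters[nearest].append(value)
    return clusters


def k_means_1d(values, initial_centroids, iterations=10):
    """Alternate between assigning values and recomputing cluster centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        clusters = assign_to_clusters(values, centroids)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, assign_to_clusters(values, centroids)


# Hypothetical input values forming two well-separated groups.
centroids, clusters = k_means_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.5], [0.0, 5.0])
```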
Reinforcement learning is a third group of machine-learning algorithms. In other words, reinforcement learning may be used to train the machine-learning model. In reinforcement learning, one or more software actors (called “software agents”) are trained to take actions in an environment. Based on the taken actions, a reward is calculated. Reinforcement learning is based on training the one or more software agents to choose the actions such that the cumulative reward is increased, leading to software agents that become better at the task they are given (as evidenced by increasing rewards).
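As a minimal, non-limiting sketch of this principle, a single software agent may be trained with an epsilon-greedy strategy in a simplified environment in which each action yields a fixed reward (a so-called multi-armed bandit). The environment, the reward values, the parameter names and the fixed random seed are assumptions made for the example and do not reflect a specific implementation.

```python
import random


def train_bandit_agent(rewards, steps=500, epsilon=0.1, seed=0):
    """Train an agent that learns which action yields the highest reward."""
    rng = random.Random(seed)  # fixed seed so that the sketch is repeatable
    estimates = [0.0] * len(rewards)  # learned value of each action
    counts = [0] * len(rewards)
    cumulative_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: take the action currently believed to be best.
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]
        counts[action] += 1
        # Incremental mean of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        cumulative_reward += reward
    return estimates, cumulative_reward


# Hypothetical environment with three actions; the second one is the best.
estimates, cumulative_reward = train_bandit_agent([0.2, 0.8, 0.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```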
Furthermore, some techniques may be applied to some of the machine-learning algorithms. For example, feature learning may be used. In other words, the machine-learning model may at least partially be trained using feature learning, and/or the machine-learning algorithm may comprise a feature learning component. Feature learning algorithms, which may be called representation learning algorithms, may preserve the information in their input but also transform it in a way that makes it useful, often as a pre-processing step before performing classification or predictions. Feature learning may be based on principal components analysis or cluster analysis, for example.
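By way of a hedged example, feature learning based on principal components analysis may be sketched as follows: the direction of largest variance in the input data is estimated by power iteration on the covariance matrix, and each input is replaced by its projection onto that direction. The function name, the use of power iteration and the sample points are assumptions made for the example; the sketch further assumes that the input points are not all identical.

```python
def first_principal_component(points, iterations=100):
    """Return the dominant direction of variance and the projected features."""
    n = len(points)
    dim = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dim)]
    centered = [[p[d] - means[d] for d in range(dim)] for p in points]
    covariance = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
        for i in range(dim)
    ]
    vector = [1.0] * dim
    for _ in range(iterations):
        # Power iteration: repeatedly apply the covariance matrix and normalize.
        vector = [
            sum(covariance[i][j] * vector[j] for j in range(dim)) for i in range(dim)
        ]
        norm = sum(v * v for v in vector) ** 0.5
        vector = [v / norm for v in vector]
    features = [sum(c[d] * vector[d] for d in range(dim)) for c in centered]
    return vector, features


# Hypothetical two-dimensional inputs that lie on the line y = 2x.
component, features = first_principal_component(
    [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
)
```

In this example, the two-dimensional inputs are reduced to a single learned feature per input, which may then be passed to a classification or prediction step.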
In some examples, anomaly detection (i.e. outlier detection) may be used, which is aimed at providing an identification of input values that raise suspicions by differing significantly from the majority of input or training data. In other words, the machine-learning model may at least partially be trained using anomaly detection, and/or the machine-learning algorithm may comprise an anomaly detection component.
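A simple, non-limiting way to sketch such an anomaly detection is a z-score test, in which input values are flagged if they deviate from the mean of the data by more than a chosen number of standard deviations. The z-score criterion, the threshold of 2.0 and the input values are assumptions made for the example.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so none stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical input values; 25.0 differs significantly from the majority.
outliers = find_outliers([10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0])
```

For small data sets, the largest attainable z-score is limited, so the threshold may need to be chosen in view of the number of input values.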
In some examples, the machine-learning algorithm may use a decision tree as a predictive model. In other words, the machine-learning model may be based on a decision tree. In a decision tree, observations about an item (e.g. a set of input values) may be represented by the branches of the decision tree, and an output value corresponding to the item may be represented by the leaves of the decision tree. Decision trees may support both discrete values and continuous values as output values. If discrete values are used, the decision tree may be denoted a classification tree; if continuous values are used, the decision tree may be denoted a regression tree.
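For illustration only, a classification tree may be sketched as a nested data structure in which each inner node tests one observation about the item and each leaf holds an output value. The feature names "mean_intensity" and "elongation", the thresholds and the output labels are hypothetical and serve only to show how the branches and leaves are traversed.

```python
# Hypothetical, hand-built classification tree; in practice the tree would be
# derived from training data by a machine-learning algorithm.
TREE = {
    "feature": "mean_intensity",
    "threshold": 0.5,
    "below": {"leaf": "background"},
    "above": {
        "feature": "elongation",
        "threshold": 3.0,
        "below": {"leaf": "cavity"},
        "above": {"leaf": "vessel"},
    },
}


def predict(node, item):
    """Follow the branches that match the item until a leaf is reached."""
    while "leaf" not in node:
        branch = "above" if item[node["feature"]] > node["threshold"] else "below"
        node = node[branch]
    return node["leaf"]


result = predict(TREE, {"mean_intensity": 0.8, "elongation": 5.0})
```

Replacing the discrete labels in the leaves with continuous values would turn the same structure into a regression tree.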
Association rules are a further technique that may be used in machine-learning algorithms. In other words, the machine-learning model may be based on one or more association rules. Association rules are created by identifying relationships between variables in large amounts of data. The machine-learning algorithm may identify and/or utilize one or more relational rules that represent the knowledge that is derived from the data. The rules may e.g. be used to store, manipulate or apply the knowledge.
Machine-learning algorithms are usually based on a machine-learning model. In other words, the term “machine-learning algorithm” may denote a set of instructions that may be used to create, train or use a machine-learning model. The term “machine-learning model” may denote a data structure and/or set of rules that represents the learned knowledge (e.g. based on the training performed by the machine-learning algorithm). In embodiments, the usage of a machine-learning algorithm may imply the usage of an underlying machine-learning model (or of a plurality of underlying machine-learning models). The usage of a machine-learning model may imply that the machine-learning model and/or the data structure/set of rules that is the machine-learning model is trained by a machine-learning algorithm.
For example, the machine-learning model may be an artificial neural network (ANN). ANNs are systems that are inspired by biological neural networks, such as can be found in a retina or a brain. ANNs comprise a plurality of interconnected nodes and a plurality of connections, so-called edges, between the nodes. There are usually three types of nodes: input nodes that receive input values, hidden nodes that are (only) connected to other nodes, and output nodes that provide output values. Each node may represent an artificial neuron. Each edge may transmit information from one node to another. The output of a node may be defined as a (non-linear) function of its inputs (e.g. of the sum of its inputs). The inputs of a node may be used in the function based on a “weight” of the edge or of the node that provides the input. The weight of nodes and/or of edges may be adjusted in the learning process. In other words, the training of an artificial neural network may comprise adjusting the weights of the nodes and/or edges of the artificial neural network, i.e. to achieve a desired output for a given input.
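As a minimal sketch under stated assumptions, the behavior of a single artificial neuron may be illustrated as follows: the output of the node is a non-linear (sigmoid) function of the weighted sum of its inputs, and training adjusts the weights so that a desired output is achieved for a given input. The restriction to one node, the sigmoid function, the gradient-descent update for a cross-entropy loss, the learning rate and the training data (a logical OR) are all assumptions made for the example.

```python
import math


def sigmoid(z):
    """Non-linear activation function of the node."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, bias, inputs):
    """Output of the node: non-linear function of the weighted sum of inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, targets, epochs=2000, learning_rate=0.5):
    """Adjust the weights by gradient descent to approach the desired outputs."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in zip(samples, targets):
            # For a sigmoid output with cross-entropy loss, the gradient with
            # respect to the weighted sum is (output - target).
            error = neuron_output(weights, bias, inputs) - target
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical training data: the desired output is the logical OR of the inputs.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 1.0]
weights, bias = train_neuron(samples, targets)
predictions = [round(neuron_output(weights, bias, s)) for s in samples]
```

An artificial neural network with hidden nodes may be obtained by feeding the outputs of several such nodes into further nodes, with the weights of all edges being adjusted during training.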
Alternatively, the machine-learning model may be a support vector machine, a random forest model or a gradient boosting model. Support vector machines (i.e. support vector networks) are supervised learning models with associated learning algorithms that may be used to analyze data (e.g. in classification or regression analysis). Support vector machines may be trained by providing an input with a plurality of training input values that belong to one of two categories. The support vector machine may be trained to assign a new input value to one of the two categories. Alternatively, the machine-learning model may be a Bayesian network, which is a probabilistic directed acyclic graphical model. A Bayesian network may represent a set of random variables and their conditional dependencies using a directed acyclic graph. Alternatively, the machine-learning model may be based on a genetic algorithm, which is a search algorithm and heuristic technique that mimics the process of natural selection.
The list of reference signs is:
Number | Date | Country | Kind |
---|---|---|---|
10 2022 102 898.4 | Feb 2022 | DE | national |