The present disclosure relates to systems for providing improved training and guidance to equipment users, and more particularly systems and methods for providing real-time augmented reality (AR) feedback-based guidance in the use of equipment systems, wherein the feedback-based AR guidance is based at least in part on a condition of the user.
In many medical situations, diagnosis or treatment of medical conditions, which may include life-saving care, must be provided by persons without extensive medical training. This may occur because trained personnel are either not present or are unable to respond. For example, temporary treatment of broken bones occurring in remote wilderness areas must often be provided by a companion of the injured patient, or in some cases as self-treatment by the patient alone. The need for improved medical treatment in remote or extreme situations has led to Wilderness First Aid training courses for hikers and backpackers. Battlefield injuries such as gunshot or blast injuries often require immediate treatment, e.g., within minutes or even seconds, by untrained personnel under extreme conditions to stabilize the patient until transport is available. Injuries to maritime personnel may occur on smaller vessels lacking a full-time physician or nurse, and illness or injuries may require treatment by persons with little or no training. Similarly, injuries or illnesses occurring to persons in space (e.g., the International Space Station) may also require treatment by persons with limited or incomplete medical training. Also, medical devices and equipment may require maintenance, calibration, and/or operation. At least some of those procedures currently require the presence of trained personnel, which may increase costs for bringing trained personnel to the location where the devices and equipment are employed, along with reducing the uptime of the device or equipment while waiting for the trained personnel to arrive.
In many instances, such as maritime vessels and injuries in space, adequate medical equipment may be available, but the efficacy of the use of the equipment may be limited by the training level of the caregiver(s). Improved treatment or diagnostic outcomes may be available if improved training is available to caregivers having limited medical training. As used herein, caregivers having little or no medical training for the use of a particular medical device or medical technology are referred to as “novice users” of the technology. Novice users may include persons having a rudimentary or working knowledge of a medical device or technology, but less than a proficient or credentialed technician for such technology. Although the present disclosure generally refers to “novice users,” any user with any level of expertise may use the methods and systems disclosed herein and garner the benefits of doing so.
Further, a perception of a user's skill level, whether made by the user or by others, may not in fact be true. A user may be ignorant of how much of a procedure he or she does not understand (e.g., the user may be in a state of “unconscious incompetence”). An unskilled user may have been “socially promoted” or “kicked upstairs,” thus leading people unfamiliar with the user's true low level of skill to assume he or she has a higher skill level.
In numerous other scenarios unrelated to medicine, it may be desirable for a user having limited or incomplete training in the use of an equipment system to perform a procedure using that equipment system. Such scenarios may include, but are by no means limited to, operating a land, sea, air, or space vehicle or subsystem thereof; and operating a weapon, weapons system, power tool, construction equipment, manufacturing facility, assembly line, or subsystem thereof; among others.
In addition to a user's training level, and regardless of whether a process makes use of medical equipment or non-medical equipment, the performance of a complex process may be rendered more challenging if the user is in a state of physical, mental, or emotional impairment. For example, a trainee doctor or a trainee soldier may be sleep-deprived when called on to perform a task. For another example, the vast amount and rapid change of stimuli in a modern medical scenario, combat scenario, or other stressful scenario may afflict a user with cognitive overload. The space environment subjects astronauts to radiation exposure. Any person may experience stress for reasons that may be related to the task at hand or may have no such relation, e.g., health, family, marriage, romantic, or financial problems may afflict a user with stress. A user may be intoxicated by alcohol or a drug, with even prescribed or otherwise licit medications taken according to medical instructions capable of impairing a person's ability to drive or operate heavy machinery. Far more examples of physical, mental, or emotional impairment exist than can be listed here.
Many future manned spaceflight missions (e.g., by NASA, the European Space Agency, or non-governmental entities) will require medical diagnosis and treatment capabilities that address the anticipated health risks and also perform well in austere, remote operational environments. Spaceflight-ready medical equipment or devices will need to be capable of an increased degree of autonomous operation, allowing the acquisition of clinically relevant and diagnosable data by every astronaut, not just select physician crew members credentialed in spaceflight medicine. Such manned spaceflight missions will also make use of numerous complex equipment systems, such as propulsion systems, navigation systems, communications systems, life support systems, maintenance systems, scientific equipment systems, and the like. If, hypothetically, a manned mission returning from Mars must depart the Martian surface or low Martian orbit at a particular time, else a launch window will close and the crew of the mission would lack the consumables to remain on or near Mars until the next launch window, and if the only rated pilots are incapacitated by kidney stones, radiation poisoning, or other hazards of long-duration spaceflight, then the ability of crew members not rated in piloting to return the spacecraft to Earth may be a matter of life or death.
Though less dramatic, numerous terrestrial scenarios may also benefit by allowing novice or underskilled users, and not just proficient or credentialed users, to perform a given task. For example, in a combat scenario, it would be desirable for a member of a crew-served weapon team to perform tasks normally performed by a second crew member, if the second crew member is severely wounded or killed in combat. Even one's morning or evening commute could be improved if novice or underskilled drivers of other vehicles, especially of larger vehicles such as buses and trucks, had their training expedited and/or their skills improved in some way.
Augmented reality systems have been developed that provide step-by-step instructions to a user in performing a task. Such prior art systems may provide a virtual manual or virtual checklist for a particular task (e.g., performing a repair or maintenance procedure). In some systems, the checklist may be visible to the user via an augmented reality (AR) user interface such as a headset worn by the user. Providing the user with step-by-step instructions or guidance may reduce the need for training for a wide variety of tasks, for example, by breaking a complex task into a series of simpler steps. In some instances, context-sensitive animations may be provided through an AR user interface in the real-world workspace. Existing systems, however, may be unable to guide users in delicate or highly specific tasks that are technique-sensitive, such as many medical procedures or other equipment requiring a high degree of training for proficiency.
Thus, there is a need for AR systems capable of guiding a novice user of equipment in real time through a wide range of unfamiliar tasks in remote and/or complex environments such as space or remote wilderness (e.g., arctic) conditions, combat conditions, etc. These may include daily checklist items (e.g., habitat systems procedures and general equipment maintenance), assembly, and testing of complex electronics setups, and diagnostic and interventional medical procedures. AR guidance systems desirably would allow novice users to be capable of autonomously using medical and other equipment or devices with a high degree of procedural competence, even where the outcome is technique-sensitive.
The present invention provides systems and methods for guiding medical equipment users, including novice users. In some embodiments, systems of the present disclosure provide real-time guidance to a medical equipment user. In some embodiments, systems disclosed herein provide three-dimensional (3D) augmented reality (AR) guidance to a medical device user. In some embodiments, systems of the present disclosure provide machine learning guidance to a medical device user. Guidance systems disclosed herein may provide improved diagnostic, maintenance, calibration, operation, or treatment results for novice users of medical devices. Use of systems of the present invention may assist novice users to achieve results comparable to those obtained by proficient or credentialed medical caregivers for a particular medical device or technology.
Although systems of the present invention may be described for particular medical devices and medical device systems, persons of skill in the art having the benefit of the present disclosure will appreciate that these systems may be used in connection with other medical devices not specifically noted herein. Further, it will also be appreciated that systems according to the present invention not involving medical applications are also within the scope of the present invention. For example, systems of the present invention may be used in many industrial or commercial settings to train users to operate many different kinds of equipment, including heavy machinery as well as many types of precision instruments, tools, or devices. Accordingly, the particular embodiments disclosed above are illustrative only, as the invention may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. Examples, where provided, are all intended to be non-limiting. Furthermore, exemplary details of construction or design herein shown are not intended to limit or preclude other designs achieving the same function. The particular embodiments disclosed above may be altered or modified and all such variations are considered within the scope and spirit of the invention, which are limited only by the scope of the claims.
In one embodiment, the present invention comprises a medical guidance system (100) for providing real-time, three-dimensional (3D) augmented reality (AR) feedback guidance in the use of a medical equipment system (200), the medical guidance system comprising: a medical equipment interface to a medical equipment system (200), wherein said medical equipment interface is capable of receiving data from the medical equipment system during a medical procedure performed by a user; an augmented reality user interface (ARUI) (300) for presenting data pertaining to both real and virtual objects to the user during at least a portion of the performance of the medical procedure; a three-dimensional guidance system (3DGS) (400) that is capable of sensing real-time user positioning data relating to one or more of the movement, position, and orientation of at least a portion of the medical equipment system (200) during said medical procedure performed by the user; a library (500) containing 1) stored reference positioning data relating to one or more of the movement, position, and orientation of at least a portion of the medical equipment system (200) during a reference medical procedure and 2) stored reference outcome data relating to an outcome of said reference medical procedure; and a machine learning module (MLM) (600) for providing at least one of 1) position-based 3D AR feedback to the user based on the sensed user positioning data and the reference positioning data, and 2) outcome-based 3D AR feedback to the user based on data received from the medical equipment system during the medical procedure performed by the user and reference outcome data.
In one embodiment, the present invention comprises a medical guidance system (100) for providing real-time, three-dimensional (3D) augmented reality (AR) feedback guidance in the use of a medical equipment system (200), the medical guidance system comprising: a computer (700) comprising a medical equipment interface to a medical equipment system (200), wherein said medical equipment interface receives data from the medical equipment system during a medical procedure performed by a user to achieve a medical procedure outcome; an AR interface to an AR head mounted display (HMD) for presenting information pertaining to both real and virtual objects to the user during the performance of the medical procedure; a guidance system interface (GSI) to a three-dimensional guidance system (3DGS) (400) that senses real-time user positioning data relating to one or more of the movement, position, and orientation of at least a portion of the medical equipment system (200) within a volume of a user's environment during a medical procedure performed by the user; a library (500) containing 1) stored reference positioning data relating to one or more of the movement, position, and orientation of at least a portion of the medical equipment system (200) during a reference medical procedure and 2) stored reference outcome data relating to an outcome of a reference performance of the reference medical procedure; and a machine learning module (MLM) (600) for providing at least one of 1) position-based 3D AR feedback to the user based on the sensed user positioning data and 2) outcome-based 3D AR feedback to the user based on the medical procedure outcome, the MLM (600) comprising a position-based feedback module comprising a first module for receiving and analyzing real-time user positioning data; a second module for comparing the user positioning data to the stored reference positioning data, and a third module for generating real-time position-based 3D AR feedback based on the output of the second module, and providing said real-time position-based 3D AR feedback to the user via the ARUI (300); and an outcome-based feedback module comprising a fourth module for receiving real-time data from the medical equipment system (200) via said medical equipment interface as the user performs the medical procedure; a fifth module for comparing the real-time data received from the medical equipment system (200) as the user performs the medical procedure to the stored reference outcome data, and a sixth module for generating real-time outcome-based 3D AR feedback based on the output of the fifth module, and providing said real-time outcome-based 3D AR feedback to the user via the ARUI (300).
In one embodiment, the present invention comprises a method for providing real-time, three-dimensional (3D) augmented reality (AR) feedback guidance to a user of a medical equipment system, the method comprising: receiving data from a medical equipment system during a medical procedure performed by a user of the medical equipment to achieve a medical procedure outcome; sensing real-time user positioning data relating to one or more of the movement, position, and orientation of at least a portion of the medical equipment system within a volume of the user's environment during the medical procedure performed by the user; retrieving from a library at least one of 1) stored reference positioning data relating to one or more of the movement, position, and orientation of at least a portion of the medical equipment system during a reference medical procedure, and 2) stored reference outcome data relating to a reference performance of the medical procedure; comparing at least one of 1) the sensed real-time user positioning data to the retrieved reference positioning data, and 2) the data received from the medical equipment system during a medical procedure performed by the user to the retrieved reference outcome data; generating at least one of 1) real-time position-based 3D AR feedback based on the comparison of the sensed real-time user positioning data to the retrieved reference positioning data, and 2) real-time output-based 3D AR feedback based on the comparison of the data received from the medical equipment system during a medical procedure performed by the user to the retrieved reference outcome data; and providing at least one of the real-time position-based 3D AR feedback and the real-time output-based 3D AR feedback to the user via an augmented reality user interface (ARUI).
In one embodiment, the present invention comprises a method for developing a machine learning model of a neural network for classifying images for a medical procedure using an ultrasound system, the method comprising: A) performing a first medical procedure using an ultrasound system; B) automatically capturing a plurality of ultrasound images during the performance of the first medical procedure, wherein each of the plurality of ultrasound images is captured at a defined sampling rate according to defined image capture criteria; C) providing a plurality of feature modules, wherein each feature module defines a feature which may be present in an image captured during the medical procedure; D) automatically analyzing each image using the plurality of feature modules; E) automatically determining, for each image, whether or not each of the plurality of features is present in the image, based on the analysis of each image using the feature modules; F) automatically labeling each image as belonging to one class of a plurality of image classes associated with the medical procedure; G) automatically splitting the plurality of images into a training set of images and a validation set of images; H) providing a deep machine learning (DML) platform having a neural network to be trained loaded thereon, the DML platform having a plurality of adjustable parameters for controlling the outcome of a training process; I) feeding the training set of images into the DML platform; J) performing the training process for the neural network to generate a machine learning model of the neural network; K) obtaining training process metrics of the ability of the generated machine learning model to classify images during the training process, wherein the training process metrics comprise at least one of a loss metric, an accuracy metric, and an error metric for the training process; L) determining whether each of the at least one training process metrics is within an acceptable threshold for each training process metric; M) if one or more of the training process metrics are not within an acceptable threshold, adjusting one or more of the plurality of adjustable DML parameters and repeating steps J, K, and L; N) if each of the training process metrics is within an acceptable threshold for each metric, performing a validation process using the validation set of images; O) obtaining validation process metrics of the ability of the generated machine learning model to classify images during the validation process, wherein the validation process metrics comprise at least one of a loss metric, an accuracy metric, and an error metric for the validation process; P) determining whether each of the validation process metrics is within an acceptable threshold for each validation process metric; Q) if one or more of the validation process metrics are not within an acceptable threshold, adjusting one or more of the plurality of adjustable DML parameters and repeating steps J-P; and R) if each of the validation process metrics is within an acceptable threshold for each metric, storing the machine learning model for the neural network.
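As a non-limiting illustration, the control flow of steps G and J through R may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `split_images`, `within_thresholds`, `develop_model`, `train_fn`, `validate_fn`, and `adjust_fn`, the use of only loss and accuracy metrics, the default threshold values, and the `max_rounds` safety limit are all hypothetical and are not taken from the disclosure. The DML platform itself is represented only by caller-supplied functions.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Metrics = Dict[str, float]  # e.g. {"loss": 0.2, "accuracy": 0.93}
Params = Dict[str, float]   # adjustable DML parameters (hypothetical shape)


def split_images(images: Sequence, validation_fraction: float = 0.2,
                 seed: int = 0) -> Tuple[List, List]:
    """Step G: shuffle, then split into (training set, validation set)."""
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def within_thresholds(metrics: Metrics, max_loss: float,
                      min_accuracy: float) -> bool:
    """Steps L and P: compare each metric with its acceptable threshold."""
    return metrics["loss"] <= max_loss and metrics["accuracy"] >= min_accuracy


def develop_model(
    train_fn: Callable[[Params], Tuple[object, Metrics]],
    validate_fn: Callable[[object], Metrics],
    adjust_fn: Callable[[Params], Params],
    params: Params,
    max_loss: float = 0.3,        # assumed threshold, for illustration only
    min_accuracy: float = 0.9,    # assumed threshold, for illustration only
    max_rounds: int = 20,         # assumed safety limit on repetitions
) -> Optional[Tuple[object, Params]]:
    """Repeat training and validation until both sets of metrics pass."""
    for _ in range(max_rounds):
        model, train_metrics = train_fn(params)                 # steps J, K
        if not within_thresholds(train_metrics, max_loss, min_accuracy):
            params = adjust_fn(params)                          # step M
            continue
        val_metrics = validate_fn(model)                        # steps N, O
        if within_thresholds(val_metrics, max_loss, min_accuracy):
            return model, params                                # step R
        params = adjust_fn(params)                              # step Q
    return None  # no acceptable model found within max_rounds
```

In this sketch the caller would store the returned model (step R); a return value of `None` indicates that the assumed round limit was reached before the metrics became acceptable.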
A machine learning module developed by a particular institution and/or for a specific user may be customized for that institution or user, such as to conform to the institution's best practices or the user's individual preferences.
Although “machine learning” is used herein for convenience, more generally, the methods and systems disclosed herein may be implemented using artificial intelligence techniques, including machine learning and deep learning techniques. Generally, “machine learning” utilizes analytical models that use neural networks, math equations (e.g., statistics), science, etc., to find patterns or other information without explicitly being programmed to do so. “Deep learning” utilizes a significant number of neural networks that have various processors arranged in multiple layers to perform various computing tasks, such as speech recognition, image recognition, etc.
Exemplary embodiments are illustrated in referenced figures of the drawings. The embodiments disclosed herein are considered illustrative rather than restrictive. No limitation on the scope of the technology and on the claims that follow is to be imputed to the examples shown in the drawings and discussed here.
As used herein, the term “augmented reality” refers to display systems or devices capable of allowing a user to sense (e.g., visualize) objects in reality (e.g., a patient on an examination table and a portion of a medical device used to examine the patient), as well as objects that are not present in reality but which relate in some way to objects in reality, but which are displayed or otherwise provided in a sensory manner (e.g., visually or via sound) in the AR device. Augmented reality as used herein is a live view of a physical, real-world environment that is augmented to a user by computer-generated perceptual information that may include visual, auditory, haptic (or tactile), somatosensory, or olfactory components. The augmented perceptual information is overlaid onto the physical environment in spatial registration so as to be perceived as immersed in the real world. Thus, for example, augmented visual information is displayed relative to one or more physical objects in the real world, and augmented sounds are perceived as coming from a particular source or area of the real world. This could include, as nonlimiting examples, visual distance markers between particular real objects in the AR display, or grid lines allowing the user to gauge depth and contour in the visual space, and sounds, odors, and tactile inputs highlighting or relating to real objects.
Well-known examples of AR devices are heads-up displays on military aircraft and some automobiles, which allow the pilot or driver to perceive elements in reality (the landscape and/or aerial environment) as well as information related to the environment (e.g., virtual horizon and plane attitude/angle, markers for the position of other aircraft or targets, etc.) that is not present in reality but which is overlaid on the real environment. The term “augmented reality” (AR) is intended to distinguish systems herein from “virtual reality” (VR) systems that display only items that are not actually present in the user's field of view. Examples of virtual reality systems include VR goggles for gaming that present information to the viewer while blocking entirely the viewer's perception of the immediate surroundings, as well as the display on a television screen of the well-known “line of scrimmage” and “first down” markers in football games. While the football field actually exists, it is not in front of the viewer; both the field and the markers are only presented to the viewer on the television screen.
In one aspect of the present disclosure, a 3D AR system according to the present disclosure may be provided to a novice medical device user for real-time, three-dimensional guidance in the use of an ultrasound system. Ultrasound is a well-known medical diagnostic and treatment technology currently used on the International Space Station (ISS) and planned for use in future deep-space missions. A variety of ultrasound systems may be used in embodiments herein. In one nonlimiting example, the ultrasound system may be the Flexible Ultrasound System (FUS), an ultrasound platform being developed by NASA and research partners for use in space operations.
In one embodiment, computer 700 interfaces with a medical equipment system 200, which in one embodiment may be an ultrasound system. In other embodiments, different medical equipment, devices, or systems may be used instead of or in addition to ultrasound systems. In the embodiment depicted in
In one embodiment, the 3D AR guidance system 100 also includes an augmented reality user interface (ARUI) 300. The ARUI 300 may comprise a visor having a viewing element (e.g., a viewscreen, viewing shield or viewing glasses) that is partially transparent to allow a medical equipment user to visualize a workspace (e.g., an examination room, table or portion thereof). In one embodiment, the ARUI 300 includes a screen upon which virtual objects or information can be displayed to aid a medical equipment user in real-time (i.e., with minimal delay between the action of a novice user and the AR feedback to the action, preferably less than 2 seconds, more preferably less than 1 second, most preferably 100 milliseconds or less). As used herein, three-dimensional (3D) AR feedback refers to augmented reality sensory information (e.g., visual or auditory information) provided to the user based at least in part on the actions of the user, and which is in spatial registration with real world objects perceptible (e.g., observable) to the user. The ARUI 300 provides the user with the capability of seeing all or portions of both real space and virtual information overlaid on or in registration with real objects visible through the viewing element. The ARUI 300 overlays or displays (or otherwise presents, e.g., as sounds or tactile signals) the virtual information to the medical equipment user in real time. In one embodiment, the system also includes an ARUI interface (not shown) to facilitate communication between the headset and the computer 700. The interface may be located in computer 700 or ARUI 300, and may comprise software, firmware, hardware, or combinations thereof.
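As a non-limiting illustration, the real-time delay preferences stated above (less than 2 seconds, less than 1 second, and 100 milliseconds or less) may be expressed as a simple check on a measured action-to-feedback delay. The function name `latency_tier` and the tier labels are hypothetical and are used here only for illustration; how the delay is measured is outside the scope of this sketch.

```python
def latency_tier(delay_seconds: float) -> str:
    """Classify an action-to-feedback delay against the stated preferences."""
    if delay_seconds < 0:
        raise ValueError("delay must be non-negative")
    if delay_seconds <= 0.1:       # 100 milliseconds or less
        return "most preferred"
    if delay_seconds < 1.0:        # less than 1 second
        return "more preferred"
    if delay_seconds < 2.0:        # less than 2 seconds
        return "preferred"
    return "outside preferred range"
```

Such a check could, for example, be logged alongside each feedback event so that delays falling outside the preferred range are visible during system testing.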
A number of commercially available AR headsets may be used in embodiments of the present invention. The ARUI 300 may include one of these commercially available headsets. In the embodiment depicted in
The embodiment of
In one embodiment, the 3DGS 400 senses real-time user positioning data while a novice user performs a medical procedure. User positioning data relates to or describes one or more of the movement, position, and orientation of at least a portion of the medical equipment system 200 while the user (e.g., a novice) performs a medical procedure. User positioning data may, for example, include data defining the movement of an ultrasound probe during an ultrasound procedure performed by the user. User positioning data may be distinguished from user outcome data, which may be generated by medical equipment system 200 while the user performs a medical procedure, and which includes data or information indicating or pertaining to the outcome of a medical procedure performed by the user. User outcome data may include, as a nonlimiting example, a series of ultrasound images captured while the user performs an ultrasound procedure, or an auditory or graphical record of a patient's cardiac activity, respiratory activity, brain activity, etc.
In one embodiment, the 3DGS 400 is a magnetic GPS system such as VolNav, developed by GE, or another magnetic GPS system. While magnetic GPS provides a robust, commercially available means of obtaining precision positional data in real-time, in some environments (e.g., the International Space Station) magnetic GPS may be unable to tolerate the small magnetic fields prevalent in such environments. Accordingly, in some embodiments, alternative or additional 3D guidance systems for determining the position of the patient, tracking the user's actions, or tracking one or more portions of the medical equipment system 200 (e.g., an ultrasound probe) may be used instead of a magnetic GPS system. These may include, without limitation, digital (optical) camera systems such as the DMA6SA and Optitrack systems, infrared cameras, and accelerometers and/or gyroscopes.
In the case of RGB (color) optical cameras and IR (infrared) depth camera systems, the position and rotation of the patient, the user's actions, and one or more portions of the medical equipment system may be tracked using non-invasive external passive visual markers or external active markers (i.e., a marker emitting or receiving a sensing signal) coupled to one or more of the patient, the user's hands, or portions of the medical equipment. The position and rotation of passive markers in the real world may be measured by the depth cameras in relation to a volume within the user's environment (e.g., an operating room volume), which may be captured by both the depth cameras and color cameras. In other embodiments, one or more sensors configured to receive electromagnetic wavelength bands other than color and infrared, or larger than and possibly encompassing one or more of color and infrared, may be used.
In the case of accelerometers and gyroscopes, the combination of accelerometers and gyroscopes comprises inertial measurement units (IMUs), which can measure the motion of subjects in relation to a determined point of origin or reference plane, thereby allowing the position and rotation of subjects to be derived. In the case of a combination of color cameras, depth cameras, and IMUs, the aggregation of measured position and rotation data (collectively known as pose data) becomes more accurate.
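As a non-limiting illustration of these two ideas, the following Python sketch (reduced to one dimension for brevity) shows how a position may be derived from acceleration samples relative to a point of origin, and how estimates from more than one sensor type may be aggregated into a single, more accurate estimate. The disclosure does not specify an integration or fusion technique; simple Euler integration and inverse-variance weighting are assumptions chosen here for illustration, and the function names `integrate_acceleration` and `fuse_estimates` are hypothetical.

```python
from typing import Sequence, Tuple


def integrate_acceleration(accels: Sequence[float], dt: float,
                           x0: float = 0.0, v0: float = 0.0) -> float:
    """Derive a 1D position from acceleration samples taken every dt seconds.

    Uses simple Euler integration from an origin x0 with initial velocity v0.
    Real IMU processing must also handle bias, drift, and gravity.
    """
    x, v = x0, v0
    for a in accels:
        v += a * dt   # acceleration -> velocity
        x += v * dt   # velocity -> position
    return x


def fuse_estimates(estimates: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Combine (value, variance) pairs by inverse-variance weighting.

    Returns (fused value, fused variance). The fused variance is never larger
    than the smallest input variance, which is one sense in which aggregated
    pose data becomes more accurate.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * val for w, (val, _) in zip(weights, estimates)) / total
    return value, 1.0 / total
```

For example, a camera-derived position and an IMU-derived position for the same instant, each with an assumed variance, could be passed to `fuse_estimates` to obtain a single position estimate.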
In an alternative embodiment, the 3DGS 400 is not part of the guidance system 100, and guidance system 100 instead includes a 3DGS interface, which may be provided as software, firmware, hardware or a combination thereof in computer 700. In this alternative embodiment, the 3DGS interface communicates with the 3DGS 400 and one or more other system components (e.g., computer 700), and 3DGS 400 interfaces with the system 100 (e.g., via computer 700) in a “plug-and-play” manner.
In one embodiment of the invention, the 3DGS 400 tracks the user's movement of an ultrasound probe (provided as part of medical equipment system 200) relative to the body of the patient in a defined examination area or room. The path and position or orientation of the probe may be compared to a desired reference path and position/orientation (e.g., that of a proficient user such as a physician or ultrasound technician during the examination of a particular or idealized patient for visualizing a specific body structure). This may include, for example, an examination path of a proficient user for longitudinal or cross-sectional visualization of a carotid artery of a patient using the ultrasound probe.
Differences between the path and/or position/orientation of the probe during an examination performed by a novice user in real-time, and an idealized reference path or position/orientation (e.g., as taken during the same examination performed by a proficient user), may be used to provide real-time 3D AR feedback to the novice user via the ARUI 300. This feedback enables the novice user to correct mistakes or incorrect usage of the medical equipment and achieve an outcome similar to that of the proficient user. The real-time 3D AR feedback may include visual information (e.g., a visual display of a desired path for the novice user to take with the probe, a change in the position or orientation of the probe, etc.), tactile information (e.g., vibrations or pulses when the novice user is in the correct or incorrect position), or sound (e.g., beeping when the novice user is in the correct or incorrect position).
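As a non-limiting illustration, the comparison of a sensed probe pose with a reference pose may be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `Pose` representation (a position plus a vector along the probe axis), the function names, and the tolerance values of 1 centimeter and 10 degrees are hypothetical and are not taken from the disclosure. The returned values are the kind of quantities that could drive a visual, tactile, or audible cue.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Pose:
    position: Vec3   # meters, in the frame of the tracked volume
    direction: Vec3  # vector along the probe axis (need not be unit length)


def position_error(user: Pose, ref: Pose) -> float:
    """Straight-line distance between the user and reference positions."""
    return math.dist(user.position, ref.position)


def angle_error_deg(user: Pose, ref: Pose) -> float:
    """Angle in degrees between the user and reference probe axes."""
    dot = sum(u * r for u, r in zip(user.direction, ref.direction))
    norm = math.hypot(*user.direction) * math.hypot(*ref.direction)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def position_feedback(user: Pose, ref: Pose,
                      max_offset_m: float = 0.01,          # assumed tolerance
                      max_angle_deg: float = 10.0) -> Dict:  # assumed tolerance
    """Summarize how far the user's pose is from the reference pose."""
    offset = position_error(user, ref)
    angle = angle_error_deg(user, ref)
    move_by = tuple(r - u for u, r in zip(user.position, ref.position))
    return {
        "on_target": offset <= max_offset_m and angle <= max_angle_deg,
        "offset_m": offset,
        "angle_deg": angle,
        "move_by": move_by,  # translation that would reach the reference
    }
```

In such a sketch, `on_target` might trigger a confirming cue, while `move_by` and `angle_deg` might be rendered as a corrective arrow or rotation hint in the AR display.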
Referring again to
A machine learning module (MLM) 600 is provided to generate feedback to a novice user of the system 100, which may be displayed in the ARUI 300. MLM 600 is capable of comparing data of a novice user's performance of a procedure or task to that of a reference performance (e.g., by a proficient user). MLM 600 may receive real-time data relating to one or both of 1) the movement, position or orientation (“positioning data”) of a portion of the medical equipment 200 during the novice user's performance of a desired medical task (e.g., the motion, position and orientation of an ultrasound probe as manipulated by a novice user to examine a patient's carotid artery), and 2) data received from the medical equipment 200 relating to an outcome of the medical procedure (“outcome data”).
As previously noted, the positioning data (e.g., relating to the real-time motion, position or orientation of an ultrasound probe during use by a novice user) is obtained by the 3DGS 400, which senses the position and/or orientation of a portion of the medical device at a desired sampling rate (e.g., from 100 Hz, or 100 times per second, down to 0.1 Hz, or once every 10 seconds). The positioning data is then processed by one or more of the 3DGS 400, computer 700, or MLM 600 to determine the motion and position/orientation of a portion of the medical equipment system 200 as manipulated by the novice user during the medical procedure.
The MLM 600 includes a plurality of modules, which may comprise software, firmware or hardware, for generating and providing one or both of position-based and outcome-based feedback to the user.
By “position-based feedback” is meant data relating to a location of the user, a portion of the user's body, and/or a tool manipulated by the user. The location may be an absolute location, such as may be determined by GPS or the like, a relative location, e.g., a location relative to one or more reference points in proximity to the user, a location relative to a target of the procedure or a portion thereof, or two or more of the foregoing. This data is then provided to one or more components of the system and, either directly or indirectly, through the augmented reality display to the user. The user may be able to apply the position-based feedback to change the location of himself, the portion of his body, and/or the tool to more efficiently or effectively perform the procedure.
By “outcome-based feedback” is meant data relating to the result of an action on the target of the procedure or a portion thereof by the user, a portion of the user's body, and/or a tool manipulated by the user. For example, in an ultrasound medical procedure, the action may be the passage of an ultrasound wand over a portion of a patient's body, and data relating to the result of the action may be an ultrasound image of the portion of the patient's body. This data is then provided to one or more components of the system and, either directly or indirectly, through the augmented reality display to the user. The user may be able to apply the outcome-based feedback to perform the same or a similar action more efficiently or effectively during his performance of the procedure.
Related to this, “reference outcome data” refers to data relating to the result of an action on the target of the procedure or a portion thereof by the user, a portion of the user's body, and/or a tool manipulated by the user, wherein the user is proficient. For example, in an ultrasound medical procedure, the reference outcome data may be a set of ultrasound images collected by a proficient user of an ultrasound system.
In one embodiment, MLM 600 includes a first module for receiving and processing real-time user positioning data, and a second module for comparing the real-time user positioning data (obtained by the 3DGS 400) to corresponding stored reference positioning data in patient library 500 of the motion and position/orientation obtained during a reference performance of the same medical procedure or task. Based on the comparison of the movements of the novice user and the reference performance, the MLM 600 may then determine discrepancies or variances between the performance of the novice user and the reference performance. A third module in the MLM generates real-time position-based 3D AR feedback based on the comparison performed by the second module and provides the real-time position-based 3D AR feedback to the user via the ARUI 300. The real-time, 3D AR position-based feedback may include, for example, virtual prompts to the novice user to correct or improve the novice user's physical performance (i.e., manipulation of the relevant portion of the medical equipment system 200) of the medical procedure or task. The feedback may include virtual still images, virtual video images, sounds, or tactile information. For example, the MLM 600 may cause the ARUI 300 to display a virtual image or video instructing the novice user to change the orientation of a probe to match a desired reference (e.g., proficient) orientation, or may display a correct motion path to be taken by the novice user in repeating a prior reference motion, with color-coding to indicate portions of the novice user's prior path that were erroneous or sub-optimal. In some embodiments, the MLM 600 may cause the ARUI 300 to display only portions of the novice user's motion that must be corrected.
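The comparison performed by the second module may be illustrated by the following minimal Python sketch. It is offered only as an illustration under stated assumptions: the novice and reference samples are assumed to be already time-aligned, positions are assumed to be 3-vectors in millimeters, orientations are assumed to be vectors along the probe axis, and the function name position_feedback and both tolerance values are hypothetical and are not taken from the disclosed system.

```python
import numpy as np

def position_feedback(novice_pos, novice_dir, ref_pos, ref_dir,
                      pos_tol_mm=5.0, angle_tol_deg=10.0):
    """Compare one novice probe sample against the matching reference sample.

    Hypothetical interface: positions are 3-vectors (mm), directions are
    vectors along the probe axis, tolerances are illustrative values.
    """
    novice_pos = np.asarray(novice_pos, dtype=float)
    ref_pos = np.asarray(ref_pos, dtype=float)
    novice_dir = np.asarray(novice_dir, dtype=float)
    ref_dir = np.asarray(ref_dir, dtype=float)

    # Translation the novice would need to reach the reference position.
    offset = ref_pos - novice_pos
    distance = float(np.linalg.norm(offset))

    # Angle between the novice and reference probe axes.
    cos_angle = np.dot(novice_dir, ref_dir) / (
        np.linalg.norm(novice_dir) * np.linalg.norm(ref_dir))
    angle = float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

    return {
        "move_by_mm": offset,
        "distance_mm": distance,
        "angle_deg": angle,
        "position_ok": distance <= pos_tol_mm,
        "orientation_ok": angle <= angle_tol_deg,
    }

# Example: probe is 5 mm from the reference point and rotated by 90 degrees.
sample = position_feedback([0, 0, 0], [0, 0, 1], [3, 4, 0], [0, 1, 0])
```

In such a sketch, the returned offset and angle could drive the virtual prompts described above, for example by rendering the offset as an arrow in the ARUI 300.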
In one embodiment, the MLM 600 also includes a fourth module that receives real-time data from the medical equipment system 200 itself (e.g., via an interface with computer 700) during a medical procedure performed by the novice user, and a fifth module that compares that data to stored reference outcome data from library 500. For example, the MLM 600 may receive image data from an ultrasound machine during use by a novice user at a specified sampling rate (e.g., from 100 Hz to 0.1 Hz), or specific images captured manually by the novice user, and may compare the novice user image data to stored reference image data in library 500 obtained during a reference performance of the medical procedure (e.g., by a proficient user such as an ultrasound technician).
The MLM 600 further includes a sixth module that generates real-time outcome-based feedback based on the comparison performed in the fifth module, and provides real-time, 3D AR outcome-based feedback to the user via the ARUI 300. The real-time outcome-based feedback may include virtual prompts to the user different from, or in addition to, the virtual prompts provided from the positioning data. Accordingly, the outcome data provided by MLM 600 may enable the novice user to further refine his or her use of the medical device, even when the positioning comparison discussed above indicates that the motion, position and/or orientation of the portion of the medical device manipulated by the novice user is correct. For example, the MLM 600 may use the outcome data from the medical device 200 and library 500 to cause the ARUI 300 to provide a virtual prompt instructing the novice user to press an ultrasound probe deeper or shallower into the tissue to focus the ultrasound image on a desired target such as a carotid artery. The virtual prompt may comprise, for example, an auditory instruction or a visual prompt indicating the direction in which the novice user should move the ultrasound probe. The MLM 600 may also indicate to the novice user whether an acceptable and/or optimal outcome in the use of the device has been achieved.
It will be appreciated from the foregoing that MLM 600 can generate and cause ARUI 300 to provide virtual guidance based on two different types of feedback, including 1) position-based feedback based on the positioning data from the 3DGS 400 and 2) outcome-based feedback based on outcome data from the medical equipment system 200. In some embodiments, the dual-feedback MLM 600 provides tiered guidance to a novice user: the position-based feedback is used for high-level prompts to guide the novice user in performing the overall motion for a medical procedure, while the outcome-based feedback from the medical device 200 may provide more specific guidance for fine or small movements in performing the procedure. Thus, MLM 600 may in some instances provide both “coarse” and “fine” feedback to the novice user to help achieve a procedural outcome similar to that of a reference outcome (e.g., obtained from a proficient user). Additional details of the architecture and operation of the MLM are provided in connection with subsequent figures.
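One way the “coarse” and “fine” tiers might be combined is sketched below in Python. This is an illustrative assumption rather than the disclosed logic: the function name select_feedback, the prompt wording, and the probability threshold of 0.9 are all hypothetical, and the sketch simply gives position-based prompts priority until the positioning comparison is satisfied.

```python
def select_feedback(position_ok, outcome_class, outcome_prob,
                    target_class, prob_threshold=0.9):
    """Choose a feedback tier for one time step.

    Hypothetical rule: coarse (position-based) prompts take priority; fine
    (outcome-based) prompts are issued only once positioning is acceptable.
    """
    if not position_ok:
        return "coarse", "Follow the displayed path to reposition the probe."
    if outcome_class != target_class or outcome_prob < prob_threshold:
        return "fine", "Make small adjustments until the target view is in focus."
    return "done", "Acceptable view acquired."

# Positioning is acceptable, but the image is not yet a confident match.
tier, prompt = select_feedback(True, "radial", 0.62, "radial")
```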
Referring again to
Software components 402-410 are the software infrastructure modules used to integrate the FUS Research Application (FUSRA) 430 with the HoloLens Head Mounted Display (HMD) augmented reality (AR) application module 412. Although a wide range of architectures are possible, the integration for the experimental system of
The HoloLens HMD AR application module 412 software components are numbered 412-428. The main user interfaces provided by the HoloLens HMD AR application 412 are a Holograms module 414 and a Procedure Manager module 416. The Holograms module 414 blends ultrasound images, real world objects and 3D models, images and graphical clues for display in the HMD HoloLens ARUI. The Procedure Manager module 416 provides status and state for the electronic medical procedure being performed.
The FUS Research Application (FUSRA) module 430 components are numbered 430-440. The FUSRA module 430 will have capability to control the FUS ultrasound scan settings when messages (commands) are received by the computer from the FUS to change scan settings. Specific probe and specific scan settings are needed for specific ultrasound procedures. One specific example is the gain scan setting for the ultrasound, which is controlled by the Processing Control Dialog module 434 using the Message Queue 408 and C++ SDK Processing Chain 446 to control scan settings using C++ FUS shared memory (
The FUSRA module 430 will have the capability to provide FUS ultrasound images in near-real time (high frame rate per second) so the HoloLens Head Mounted Display (HMD) Augmented Reality (AR) application module 412 can display the image stream. The FUSRA module 430 provides JPEG images as MJPEG through a web server 438 that has been optimized to display an image stream to clients (e.g., HoloLens HMD AR application module 412). The Frame Output File 436 (and SDI, MJPEG Image from FUS GPU,
The FUSRA module 430 is also capable of providing motion tracking 3D coordinates and spatial awareness whenever the 3D Guidance System (3DGS) 400 (
The FUS software development kit (SDK) in the FUSRA module 430 contains rudimentary image processing software to provide JPEG images to the FUSRA. The FUSRA module 430 contains additional image processing for monitoring and improving image quality, which is part of the C++ FUS SDK Framework 450 providing images to the Image Web Server 438 in
The FUSRA module 430 uses the machine learning module (MLM) 600 (
The HoloLens HMD AR application module 412 provides a hands-free head mounted display ARUI platform for receiving and viewing real-time feedback during an ultrasound procedure. It also allows the novice user to focus on the patient without having to focus away from the patient for guidance.
The HoloLens HMD AR application module uses the HoloLens HMD platform from Microsoft and the Unity 3D game engine 442 from Unity. The HoloLens HMD AR application module 412 displays guidance during execution of the ultrasound medical procedure with AR visual clues and guidance, in addition to the ultrasound image that is also visible through the HoloLens HMD display. The HoloLens HMD AR application module 412 also has the capability to control the FUS scan settings as part of the procedure setup.
The architecture is designed to be extended to utilize electronic procedures or eProc. Once an electronic procedure is created (using an electronic procedure authoring tool), the procedure can be executed with the Procedure Manager module 416.
The HoloLens HMD AR application module 412 includes the capability to align 3D models and images in the holographic scene with real world objects like the ultrasound unit, its probe and the patient. This alignment allows virtual models and images to align with real world objects for rendering in the HoloLens head mounted display.
The HoloLens HMD AR application module 412 uses voice-based navigation by the novice user to maintain hands free operation of the ultrasound equipment, except during initialization when standard keyboard or other interfaces may be used for control. Voice command modules in
The HoloLens HMD AR application module 412 also is capable of controlling the FUS settings as part of the procedure setup. This function is controlled by the 3DG 400 (
The HoloLens HMD AR application module 412 provides an Image Stream module 404 for display of ultrasound images that can be overlaid with guidance clues prompting the user to correctly position the ultrasound probe. The HoloLens HMD AR application 412 is also capable of displaying 3D models and images in the HoloLens HMD along with real world objects like the ultrasound, its probe and the patient. The HoloLens HMD display allows virtual models and images to render over real world objects within the novice user's view. This is provided by the Image Streamer 404 supplying images to the Holograms module 414 through the User Interface Layers module 422, User Interface Models module 426, and Scene Manager Module 424. This image stream is the same kind of image as a regular display device but tailored for HMD.
An embodiment of a particular system for real-time, 3D AR feedback guidance for novice users of an ultrasound system, showing communication between the system modules, is provided in
The ultrasound system 210 may be used by novice user 50 to perform a variety of diagnostic procedures for detecting one or more medical conditions, which may include without limitation carotid assessments, deep vein thrombosis, cardiogenic shock, sudden cardiac arrest, and venous or arterial cannulation. In addition to the foregoing cardiovascular uses, the ultrasound system 210 may be used to perform procedures in many other body systems, including body systems that may undergo changes during zero gravity space operations. Procedures that may be performed include ocular examinations, musculoskeletal examinations, renal evaluation, and cardiac (i.e., heart) examinations.
In some embodiments, imaging data from the ultrasound system 210 is displayed on an augmented reality user interface (ARUI) 300. A wide variety of available ARUI units 300, many comprising a Head-Mounted Display (HMD), may be used in systems of the present invention. These may include the Microsoft HoloLens, the Vuzix Wrap 920AR and Star 1200, Sony HMZ-T1, Google Glass, Oculus Rift DK1 and DK2, Samsung GearVR, and many others. In some embodiments, the system can support multiple ARUIs 300, enabling multiple or simultaneous users for some procedures or tasks, and in other embodiments allowing third parties to view the actions of the user in real time (e.g., suitable for allowing a proficient user to train multiple novice users).
Information on a variety of procedures that may be performed by novice user 50 may be provided by Library 500, which in some embodiments may be stored on a cloud-based server as shown in
As shown in
In some embodiments, the 3DGS 400, either alone or in combination with library 500 and/or machine learning module (MLM) 600, may cause ARUI 300 to display static markers or arrows to complement the instructions provided by the electronic medical procedure 530. The 3DGS 400 can communicate data relating to the movements of probe 215, while a user is performing a medical procedure, to the MLM 600.
The machine learning module (MLM) 600 compares the performance of the novice user 50 to that of a reference performance (e.g., by a proficient user) of the same procedure as the novice user. As discussed regarding
The MLM 600 generates position-based feedback by comparing the actual movements of a novice user 50 (e.g., using positioning data received from the 3DGS 400 tracking the movement of the ultrasound probe 215) to reference data for the same task. In one embodiment, the reference data is data obtained by a proficient user performing the same task as the novice user. The reference data may be either stored in MLM 600 or retrieved from library 500 via a computer (not shown). Data for a particular patient's anatomy may also be stored in library 500 and used by the MLM 600.
Based on the comparison of the novice user's movements to those of the proficient user, the MLM 600 may determine in real time whether the novice user 50 is acceptably performing the task or procedure (i.e., within a desired margin of error to that of a proficient user). The MLM 600 may communicate with ARUI 300 to display real time position-based feedback guidance in the form of data and/or instructions to confirm or correct the user's performance of the task based on the novice user movement data from the 3DGS 400 and the reference data. By generating feedback in real-time as the novice user performs the medical procedure, MLM 600 thereby enables the novice user to correct errors or repeat movements as necessary to achieve an outcome for the medical procedure that is within a desired margin to that of the reference performance.
In addition to the position-based feedback generated from position data received from 3DGS 400, MLM 600 in the embodiment of
Although
In one embodiment, one or both of real-time motion-based feedback and outcome-based feedback may be used to generate a visual simulation (e.g., as a narrated or unnarrated video displayed virtually to the novice user in the ARUI 300 (e.g., a HoloLens headset)). In this way, the novice user may quickly (i.e., within seconds of performing a medical procedure) receive feedback indicating deficiencies in technique or results, enabling the user to improve quickly and achieve outcomes similar to those of a reference performance (e.g., a proficient performance) of the medical or other equipment.
In one embodiment, the novice user's performance may be tracked over time to determine areas in which the novice user repeatedly fails to implement previously provided feedback. In such cases, training exercises may be generated for the novice user focusing on the specific motions or portions of the medical procedure that the novice user has failed to correct, to assist the novice user to achieve improved results. For example, if the novice user fails to properly adjust the angle of an ultrasound probe at a specific point in a medical procedure, the MLM 600 and/or computer 700 may generate a video for display to the user that is limited to the portion of the procedure that the user is performing incorrectly. This allows less time to be wasted having the user repeat portions of the procedure that the user is correctly performing and enables the user to train specifically on areas of incorrect technique.
In another embodiment, the outcome-based feedback may be used to detect product malfunctions. For example, if the images being generated by a novice user at one or more points during a procedure fail to correspond to those of a reference (e.g., a proficient user), or in some embodiments by the novice user during prior procedures, the absence of any other basis for the incorrect outcome may indicate that the ultrasound machine is malfunctioning in some way.
In one embodiment, the MLM 600 may provide further or additional instructions to the user in real-time by comparing the user's response to a previous real-time feedback guidance instruction to refine or further correct the novice user's performance of the procedure. By providing repeated guidance instruction as the novice user refines his/her technique, MLM 600 may further augment previously-provided instructions as the user repeats a medical procedure or portion thereof and improves in performance. Where successful results for the use of a medical device are highly technique sensitive, the ability to “fine tune” the user's response to prior instructions may help maintain the user on the path to a successful outcome. For example, where a user “overcorrects” in response to a prior instruction, the MLM 600, in conjunction with the 3DGS 400, assists the user to further refine the movement to achieve a successful result.
To provide usable real time 3D AR feedback-based guidance to a medical device user, the MLM 600 may include a standardized nomenclature module (not shown) to provide consistent real-time feedback instructions to the user. In an alternative embodiment, multiple nomenclature options may be provided to users, and different users may receive instructions that vary based on the level of skill and background of the user. For example, users with an engineering background may elect to receive real time feedback guidance from the machine learning module 600 and ARUI 300 in terminology more familiar to engineers, even where the user is performing a medical task. Users with a scientific background may elect to receive real time feedback guidance in terminology more familiar for their specific backgrounds. In some embodiments, or for some types of equipment, however, a single, standardized nomenclature module may be provided, and the machine learning module 600 may provide real time feedback guidance using a single, consistent terminology.
The MLM 600 may also provide landmarks and virtual markings that are informative to enable the user to complete the task, and the landmarks provided in some embodiments may be standardized for all users, while in other embodiments different markers may be used depending upon the background of the user.
In the embodiment of
A variety of neural networks may be used in MLM 600 to provide outcome-based-feedback in a medical device system according to
In one embodiment of
As an initial matter, ultrasound images from ultrasound system 210 must be converted to a standard format usable by the neural network (e.g., ResNet). For example, ultrasound images captured by one type of ultrasound machine (FUS) are in the RGB24 image format and may range from 512×512 pixels to 1024×768 pixels, depending on how the ultrasound machine is configured for an ultrasound scan. During any particular scan, the size of all captured images will remain constant, but image sizes may vary for different types of scans. Neural networks, however, generally require that images be in a standardized format (e.g., the CHW format used by ResNet) and of a single, constant size determined by the ML model. Thus, ultrasound images may need to be converted into the standardized format. For example, images may be converted for use in ResNet by extracting the CHW components from the original RGB24 format to produce a bitmap in the CHW layout, as detailed at https://docs.microsoft.com/en-us/cognitive-toolkit/archive/cntk-evaluate-image-transforms. It will be appreciated that different format conversion processes may be performed by persons of skill in the art to produce images usable by a particular neural network in a particular implementation.
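As a nonlimiting illustration of such a conversion, the following Python sketch turns a packed RGB24 frame into a fixed-size array in CHW (channel, height, width) layout using NumPy. The function name rgb24_to_chw, the output size of 224 pixels, and the use of nearest-neighbor resampling are assumptions made for the sketch and are not requirements of the disclosed system or of any particular DML platform.

```python
import numpy as np

def rgb24_to_chw(frame_bytes, width, height, out_size=224):
    """Convert a packed RGB24 frame to a float32 CHW array of fixed size.

    Assumptions: the frame is row-major with 3 bytes per pixel (R, G, B);
    out_size=224 is an illustrative model input size; nearest-neighbor
    resampling is used only to keep the sketch dependency-free.
    """
    hwc = np.frombuffer(frame_bytes, dtype=np.uint8).reshape(height, width, 3)

    # Pick the source row/column for each output row/column.
    rows = np.arange(out_size) * height // out_size
    cols = np.arange(out_size) * width // out_size
    resized = hwc[rows][:, cols]            # shape (out_size, out_size, 3)

    # Reorder from height-width-channel to channel-height-width.
    return np.ascontiguousarray(resized.transpose(2, 0, 1), dtype=np.float32)

# Example: a 1024x768 frame that is pure red in every pixel.
frame = np.zeros((768, 1024, 3), dtype=np.uint8)
frame[..., 0] = 255
chw = rgb24_to_chw(frame.tobytes(), width=1024, height=768)
```

Because the same function accepts any input width and height, frames from differently configured scans would all reach the network at one constant size.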
Ultrasound medical procedures require the ultrasound user to capture specific views of various desired anatomical structures from specific perspectives. These view/perspective combinations may be represented as classes in a neural network. For example, in a carotid artery assessment procedure, the ultrasound user may be required to first capture the radial cross section of the carotid artery, and then capture the lateral cross section of the carotid artery. These two different views can be represented as two classes in the neural network. To add additional depth, a third class can be used to represent any view that does not belong to those two classes.
Classification is a common machine learning problem, and a variety of approaches have been developed. Applicants have discovered that a number of specific steps are advisable to enable MLM 600 to have good performance in classifying ultrasound images to generate 3D AR feedback guidance that is useful for guiding novice users. These include care in selecting both the training set and the validation data set for the neural network, and specific techniques for optimizing the neural network's learning parameters.
As noted, ResNet is an example of a neural network that may be used in MLM 600 to classify ultrasound images. Additional information on ResNet may be found at https://arxiv.org/abs/1512.03385. Neural networks such as ResNet are typically implemented in a programming language such as NDL, Python, or BrainScript, and then trained using a deep machine learning (DML) platform or program such as CNTK, Caffe, or TensorFlow, among other alternatives. The platform operates by performing a “training process” using a “training set” of image data, followed by a “validation process” using a “validation set” of image data. Image analysis in general (e.g., whether part of the training and validation processes, or to analyze images of a novice user) is referred to as “evaluation” or “inferencing.”
In the training process, the DML platform generates a machine learning (ML) model using the training set of image data. The ML model generated in the training process is then evaluated in the validation process by using it to classify images from the validation set of image data that were not part of the training set. Regardless of which DML platform (e.g., CNTK, Caffe, TensorFlow, or other system) is used, the training and validation performance of ResNet should be similar for a given type of equipment (medical or non-medical). In particular, for the Flexible Ultrasound System (FUS) previously described, the image analysis performance of ResNet is largely independent of the DML platform.
In one embodiment, for small patient populations (e.g., astronauts, polar explorers, small maritime vessels), for each ultrasound procedure, a patient-specific machine learning model may be generated during training using a training data set of images that are acquired during a reference examination (e.g., by a proficient user) for each individual patient. Accordingly, during subsequent use by a novice user, for each particular ultrasound procedure the images of a specific patient will be classified using a patient-specific machine learning model for that specific patient. In other embodiments, a single “master” machine learning model is used to classify all patient ultrasound images. In patient-specific approaches, less data is required to train the neural network to accurately classify patient-specific ultrasound images, and it is easier to maintain and evolve such patient-specific machine learning models.
Regardless of which DML platform is used, the machine learning (ML) model developed by the platform has several common features. First, the ML model specifies classes of images that input images (i.e., by a novice user) will be classified against. Second, the ML model specifies the input dimensions that determine the required size of input images. Third, the ML model specifies the weights and biases that determine the accuracy of how input images will be classified.
The ML model developed by the DML platform is the structure of the actual neural network that will be used in evaluating images captured by a novice user 50. The optimized weights and biases of the ML model are iteratively computed and adjusted during the training process. In the training process, the weights and biases of the neural network are determined through iterative processes known as Feed-Forward (FF) and Back-Propagation (BP) that involve the input of training data into an input layer of the neural network and comparing the corresponding output at the network's output layer with the input data labels until the accuracy of the neural network in classifying images is at an acceptable threshold accuracy level.
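The Feed-Forward and Back-Propagation iteration described above may be illustrated, in greatly reduced form, by the following Python sketch. It trains a single-layer softmax classifier with NumPy rather than a deep network such as ResNet, and the function name train_softmax, the learning rate, the target accuracy of 95%, and the toy data are all assumptions chosen only to keep the example small and runnable.

```python
import numpy as np

def train_softmax(x, labels, n_classes, lr=0.5, target_acc=0.95, max_epochs=500):
    """Train a one-layer softmax classifier until a target accuracy is reached.

    Illustrative stand-in for a deep network: the loop alternates a
    feed-forward pass with a back-propagation (gradient) update.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.01, (x.shape[1], n_classes))   # weights
    b = np.zeros(n_classes)                               # biases
    onehot = np.eye(n_classes)[labels]
    acc = 0.0
    for _ in range(max_epochs):
        # Feed-forward: compute class probabilities for every training sample.
        logits = x @ w + b
        logits = logits - logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)

        # Compare the network output with the data labels.
        acc = float((probs.argmax(axis=1) == labels).mean())
        if acc >= target_acc:
            break

        # Back-propagation: gradient of the mean cross-entropy loss.
        grad = (probs - onehot) / len(x)
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b, acc

# Toy data: two well-separated clusters standing in for two image classes.
features = np.array([[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]])
targets = np.array([0, 0, 1, 1])
weights, biases, accuracy = train_softmax(features, targets, n_classes=2)
```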
The quality of the training and validation data sets determines the accuracy of the ML model, which in turn determines the accuracy of the neural network (e.g., ResNet) during image classification by a novice user. A high-quality data set is one that enables the neural network to be trained within a reasonable time frame to accurately classify a wide variety of new images (i.e., those that do not appear in the training or validation data sets). Measures of accuracy and error for neural networks are usually expressed as classification error (additional details available at https://www.gepsoft.com/gepsoft/APS3KB/Chapter09/Section2/SS01.htm), cross entropy error (https://en.wikipedia.org/wiki/Cross_entropy), and mean average precision (https://docs.microsoft.com/en-us/cognitive-toolkit/object-detection-using-fast-r-cnn-brainscript#map-mean-average-precision).
In one embodiment, the output of the neural network is the probability, for each image class, that an image belongs to the class. From this output, the MLM 600 may provide outcome-based feedback to the novice user of one or both of 1) the best predicted class for the image (i.e., the image class that the neural network determines has the highest probability that the image belongs to the class), and 2) the numerical probability (e.g., 0% to 100%) of the input image belonging to the best predicted class. The best predicted class may be provided to the novice user in a variety of ways, e.g., as a virtual text label, while the numerical probability may also be displayed in various ways, e.g., as a number, a number on a color bar scale, as a grayscale color varying between white and black, etc.
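The first two of these measures may be computed as in the following Python sketch, which is provided only as an illustration. It assumes the network output is available as a NumPy array of per-class probabilities with one row per image; the function names and the example values are hypothetical, and mean average precision is omitted because it applies to detection tasks with bounding boxes.

```python
import math
import numpy as np

def classification_error(probs, labels):
    """Fraction of images whose highest-probability class is not the label."""
    return float(np.mean(np.argmax(probs, axis=1) != labels))

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the correct class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))

# Illustrative probabilities for four images over three classes.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])

err = classification_error(probs, labels)   # one of four images is misclassified
xent = cross_entropy(probs, labels)
```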
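The selection of the best predicted class and its probability may be sketched in Python as follows. The sketch assumes, hypothetically, that the network emits one raw score per class that is normalized with a softmax; the class names follow the carotid artery example, while the function names best_prediction and gray_level and the 8-bit grayscale mapping are illustrative choices only.

```python
import numpy as np

# Class names from the carotid artery example; the ordering is an assumption.
CLASSES = ("radial", "lateral", "unknown")

def best_prediction(scores, classes=CLASSES):
    """Return (best class label, probability) from raw per-class scores."""
    z = np.asarray(scores, dtype=float)
    z = z - z.max()                       # stabilize the exponentials
    probs = np.exp(z) / np.exp(z).sum()   # softmax over the classes
    idx = int(probs.argmax())
    return classes[idx], float(probs[idx])

def gray_level(prob):
    """Map a probability to an 8-bit gray value (0 = black, 255 = white)."""
    return int(round(255 * prob))

label, prob = best_prediction([2.0, 0.5, 0.1])
shade = gray_level(prob)
```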
To train a neural network such as ResNet to classify ultrasound images for specific ultrasound procedures performed with ultrasound system 210, many high-quality images are required. In many prior art neural network approaches to image classification, these data sets are manually developed in a highly labor-intensive process. In one aspect, the present disclosure provides systems and methods for automating one or more portions of the generation of training and validation data sets.
Using software to automate the process of preparing accurately labeled image data sets not only produces data sets having minimal or no duplicate images, but also enables the neural network to be continuously trained to accurately classify large varieties of new images. In particular, automation using software allows the continual generation or evolution of existing image data sets, thereby allowing the continual training of ResNet as the size of the image data set grows over time. In general, the more high-quality data there is to train a neural network, the higher the accuracy of the neural network's ability to classify images will be. This approach contrasts sharply with the manual approaches to building and preparing image data sets for artificial intelligence.
As one nonlimiting example, an ultrasound carotid artery assessment procedure requires at least 10,000 images per patient for training a patient-specific neural network used to provide outcome-based feedback to a novice user in a 3D AR medical guidance system of the present disclosure. Different numbers of images may be used for different imaging procedures, with the number of images depending upon the needs of the particular procedure.
The overall data set is usually split into two subsets, with 70-90%, more preferably 80-85%, of the images being included as part of a training set and 10-30%, more preferably 15-20%, of the images included in the validation data set, with each image being used in only one of the two subsets (i.e., for any image in the training set, no duplicate of it should exist in the validation set). In addition, any excessive number of redundant images in the training set should be removed to prevent the neural network from being overfitted to a majority of identical images. Removal of such redundant images will improve the ability of the neural network to accurately classify images in the validation set. In one embodiment, an image evaluation module evaluates each image in the training set to determine if it is a duplicate or near-duplicate of any other image in the database. The image evaluation module computes each image's structural similarity index (SSI) against all other images in the set. If the SSI between two images is greater than a similarity threshold, which in one nonlimiting example may be about 60%, then the two images are regarded as near duplicates and the image evaluation module removes all but one of the duplicate or near-duplicate images. Further, images that are found to exist both in the training set and the validation set are likewise removed (i.e., the image evaluation module computes SSI values for each image in the training set against each image in the validation set and removes duplicate or near-duplicate images from one of the training and validation sets). The reduction of duplicate images allows the neural network to more accurately classify images in the validation set, since the chance of overfitting the neural network during training to a majority of identical images is reduced or eliminated.
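The duplicate removal and splitting steps may be illustrated by the following Python sketch. It rests on several assumptions that are not part of the disclosure: the similarity measure is a simplified, single-window form of the structural similarity index computed over the whole image (practical implementations typically use a sliding window), images are same-sized grayscale NumPy arrays, duplicates are removed before the split so that no near-duplicate can land in both subsets, and the function names are hypothetical. The 60% threshold and 80% training fraction follow the nonlimiting values given above.

```python
import numpy as np

def global_ssim(a, b, data_range=255.0):
    """Simplified single-window SSIM over two same-shape grayscale images."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return num / den

def drop_near_duplicates(images, threshold=0.60):
    """Keep an image only if it is not too similar to any image already kept."""
    kept = []
    for img in images:
        if all(global_ssim(img, k) <= threshold for k in kept):
            kept.append(img)
    return kept

def split_dataset(images, train_fraction=0.8, seed=0):
    """Shuffle and split so that each image appears in exactly one subset."""
    order = np.random.default_rng(seed).permutation(len(images))
    cut = int(round(train_fraction * len(images)))
    train = [images[i] for i in order[:cut]]
    val = [images[i] for i in order[cut:]]
    return train, val

# Ten unrelated random images plus one exact copy of the first image.
rng = np.random.default_rng(1)
images = [rng.integers(0, 256, (32, 32)) for _ in range(10)]
images.append(images[0].copy())

unique = drop_near_duplicates(images)   # the copy is discarded
train, val = split_dataset(unique)
```

Note that the pairwise comparison grows quadratically with the number of images, so a data set of the size mentioned above would likely call for a more economical strategy than this sketch uses.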
Next the reference user manually labels (615) each image as one of the available classes. For the carotid artery assessment, the images are labeled as radial, lateral or unknown. For each labeled image, the reference user may optionally, in some embodiments, manually identify (620) the exact area within the image where the target anatomical structure is located, typically with a box bounding the image. Two examples of the use of bounding boxes to isolate particular structures are provided in
Once the entire data set is properly labeled, it is manually split (625) into the training data set and the validation data set, which may then be used to train the neural network (e.g., ResNet). Neural networks comprise a series of coupled nodes organized into at least an input and an output layer. Many neural networks have one or more additional layers (commonly referred to as “hidden layers”) that may include one or more convolutional layers as previously discussed regarding MLM 600.
The method 600 also comprises loading (630) the neural network definition (such as a definition of ResNet), usually expressed as a program in a domain-specific computer language such as NDL, Python or BrainScript, into a DML platform or program such as CNTK, Caffe or Tensorflow. The DML platforms offer tunable or adjustable parameters that are used to control the outcome of the training process. Some of the parameters are common to all DML platforms, such as types of loss or error, accuracy metrics, types of optimization or back-propagation (e.g., Stochastic Gradient Descent and Particle Swarm Optimization). Some adjustable parameters are specific to one or more of the foregoing, such as parameters specific to Stochastic Gradient Descent such as the number of epochs to train, training size (e.g., minibatch), learning rate constraints, and others known to persons of skill in the art. In one example involving CNTK as the DML platform, the adjustable parameters include learning rate constraints, number of epochs to train, epoch size, minibatch size, and momentum constraints.
The neural network definition (i.e., a BrainScript program of ResNet) itself also has parameters that may be adjusted independently of any parameter adjustments or optimization of parameters in the DML platform. These parameters are defined in the neural network definition, such as the connections between deep layers, the types of layers (e.g., convolutional, max pooling, ReLU), and their structure/organization (e.g., dimensions and strides). If there is minimal error or high accuracy during training and/or validating, then adjustment of these parameters may have a lesser effect on the overall image analysis performance compared to adjusting parameters not specific to the neural network definition (e.g., DML platform parameters), or simply having a high quality training data set. In the case of a system developed for carotid artery assessment, no adjustments to the neural network parameters were needed to achieve less than 10%-15% error, in the presence of a high quality training data set.
Referring again to
The method then includes feeding the validation data set to the ML model (665), and the validation process is performed (670) using the validation data set. After the completion of the validation process, validation process metrics for loss, accuracy and/or error are obtained (675) for the validation process. A determination is made (680) whether the validation metrics are within an acceptable threshold for each metric, which may be the same as or different from those used for the training process. If the validation process metrics are outside of the acceptable thresholds, the adjustable parameters are adjusted to different values (655) and the training process is restarted (640). If the metrics are acceptable, then the ML model may be used to classify new data (685).
The process may be allowed to continue through one or more additional cycles. If validation process metrics are still unacceptable, then the data set is insufficient to properly train the neural network, and the data set needs to be regenerated.
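The train, validate, and adjust cycle described above may be pictured with the following control-flow sketch. It is an assumption-laden illustration rather than the disclosed implementation: `train`, `validate`, and `adjust` are hypothetical callables standing in for the DML platform, metrics are represented as dictionaries of error-like values where lower is better, and `max_cycles` is an arbitrary cap on the number of retries.

```python
def within_thresholds(metrics, limits):
    """True when every tracked metric is at or below its acceptable limit."""
    return all(metrics[name] <= limit for name, limit in limits.items())


def fit_with_retries(train, validate, adjust, params,
                     train_limits, val_limits, max_cycles=5):
    """Repeat training and validation, adjusting parameters after each failure.

    Returns (model, params, cycles_used). A model of None signals that the
    data set should be regenerated because no cycle produced acceptable metrics.
    """
    for cycle in range(1, max_cycles + 1):
        model, train_metrics = train(params)
        if within_thresholds(train_metrics, train_limits):
            val_metrics = validate(model)
            if within_thresholds(val_metrics, val_limits):
                return model, params, cycle
        # Either training or validation was unacceptable: try new parameters.
        params = adjust(params)
    return None, params, max_cycles
```

Separate limit dictionaries are used for training and validation because the acceptable thresholds for the two processes may be the same or different.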
Referring again to
In one aspect, the present invention involves using computer software to automate or significantly speed up one or more of the foregoing steps. Although capturing ultrasound images during use of the ultrasound system by a reference or proficient user (610) necessarily requires the involvement of a proficient user, in one embodiment the present disclosure includes systems and methods for automating all or portions of steps 610-625 of
In one embodiment, MLDM 705 is incorporated into computer system 700 (
Image capture module 710 may also comprise an interface such as a graphical user interface (GUI) 712 for display on a screen of computer 700 or ultrasound system 210. The GUI 712 may permit an operator (e.g., the reference user or a system developer) to automatically capture images while the reference user performs the medical procedure specific to MLDM 705 (e.g., a carotid artery assessment). More specifically, the GUI 712 enables a user to program the image capture module 710 to capture images automatically (e.g., at a specified capture rate such as 10 Hz, or when 3DGS 400 detects that probe 215 is at a particular anatomical position) or on command (e.g., by a capture signal activated by the operator using a sequence of keystrokes on computer 700 or a button on ultrasound probe 215). The GUI 712 allows the user to define the condition(s) under which images are captured by image capture module 710 while the reference user performs the procedure of MLDM 705.
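One way to picture the user-defined capture conditions is the small scheduler below. It is a sketch under stated assumptions: the `CaptureCriteria` and `CaptureScheduler` names are hypothetical, timestamps are supplied by the caller in seconds, and only the periodic and on-command triggers are modeled (a position-based trigger could be passed in the same way as the command flag).

```python
from dataclasses import dataclass


@dataclass
class CaptureCriteria:
    rate_hz: float = 0.0             # 0 disables periodic capture
    capture_on_command: bool = True  # honor keystroke or probe-button signals


class CaptureScheduler:
    """Decides, frame by frame, whether an image should be captured."""

    def __init__(self, criteria):
        self.criteria = criteria
        self._last_capture = None

    def should_capture(self, now, command_pressed=False):
        # An operator command always wins when on-command capture is enabled.
        if command_pressed and self.criteria.capture_on_command:
            self._last_capture = now
            return True
        # Otherwise capture once per period at the configured rate.
        if self.criteria.rate_hz > 0:
            period = 1.0 / self.criteria.rate_hz
            if self._last_capture is None or now - self._last_capture >= period:
                self._last_capture = now
                return True
        return False
```

In this sketch a command-triggered capture also resets the periodic timer, so that an automatic capture does not follow immediately after a manual one; that choice is an assumption and could be reversed.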
Once images have been captured (e.g., automatically or on command) by image capture module 710, MLDM 705 includes one or more feature modules (715, 720, 725, 745, etc.) to identify features associated with the various classes of images that are available for the procedure of MLDM 705. The features may be aspects of particular structures that define which class a given image should belong to. Each feature module defines the image criteria to determine whether a feature is present in the image. Depending on the number of features and the number of classes (which may each contain multiple features), MLDMs for different imaging procedures may have widely different numbers of feature modules. Referring again to
For example, in a carotid artery assessment procedure, the available classes may include a class of “radial cross section of the carotid artery,” a class of “lateral cross section of the carotid artery,” and a class of “unknown” (or “neither radial cross section nor lateral cross section”). For an image to be classified as belonging to the “radial cross section of the carotid artery” class, various features associated with the presence of the radial cross section of a carotid artery must be present in the image. The feature modules, e.g., 715, 720, etc., are used by the MLDM 705 to analyze captured images to determine whether a given image should be placed in the class of “radial cross section of the carotid artery” or in another class. Because the feature modules are each objectively defined, images are less likely to be mislabeled because of the reference user's subjective bias.
Finally, each MLDM 705 may include a classification module 750 to classify each of the captured images with a class among those available for MLDM 705. Classification module 750 determines the class for each image based on which features are present and not present in the image, and labels each image as belonging to the determined class. Because the feature modules are each objectively defined, the classification module 750 is less likely to mislabel images than manual labeling based on the subjective judgment exercised by the reference user.
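The relationship between feature modules and the classification module may be illustrated with the sketch below. Everything concrete in it is hypothetical: the feature names, the `roundness` and `aspect_ratio` measurements, their cut-off values, and the rule table are invented for illustration and are not the features used for carotid artery assessment in the disclosure. Each image is assumed to arrive as a dictionary of precomputed measurements.

```python
def detect_features(image, feature_modules):
    """Run every feature module and return the names of the features found."""
    return {name for name, is_present in feature_modules.items() if is_present(image)}


def classify(features, class_rules, default="unknown"):
    """Return the first class whose required features are all present
    and whose excluded features are all absent."""
    for label, (required, excluded) in class_rules.items():
        if required <= features and not (excluded & features):
            return label
    return default


# Hypothetical feature modules: each is an objective test on image measurements.
FEATURE_MODULES = {
    "round_lumen": lambda img: img["roundness"] > 0.8,
    "elongated_lumen": lambda img: img["aspect_ratio"] > 3.0,
}

# Hypothetical class rules: (required features, excluded features) per class.
CLASS_RULES = {
    "radial": ({"round_lumen"}, {"elongated_lumen"}),
    "lateral": ({"elongated_lumen"}, {"round_lumen"}),
}


def label_image(image):
    """Classify one image and return it together with its label."""
    features = detect_features(image, FEATURE_MODULES)
    return {"image": image, "label": classify(features, CLASS_RULES)}
```

Because every rule is expressed as a set test on named features, the same image always receives the same label, which is the consistency benefit described above.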
Computer 700 (
The automated capture and labeling of reference data by MLDM 705 may be better understood by an example of a carotid artery assessment using an ultrasound system. The radial and lateral cross-sections of the carotid artery have distinct visual features that can be used to identify their presence in ultrasound images at specific ultrasound depths. These visual features or criteria may be defined and stored as feature modules 715, 720, 725, etc. in MLDM 705 (or a central feature library in alternative embodiments) for a carotid artery assessment procedure. Captured images are then analyzed using the feature modules to determine whether or not each of the carotid artery assessment features is present. The presence or absence of the features is then used to classify each image into one of the available classes for the carotid artery assessment procedure.
The feature modules 715, 720, 725, etc. provide consistent analysis of image patterns of the target anatomical structures in the images captured during a reference carotid artery assessment procedure (e.g., by a proficient user). Feature modules for each image class may be defined by a reference user, a system developer, or jointly by both, for any number of ultrasound procedures such as the carotid artery assessment procedure.
Once the features for each carotid artery assessment procedure image class have been defined and stored as feature modules 715, 720, 725, etc., standard image processing algorithms (e.g., color analysis algorithms, thresholding algorithms, convolution with kernels, contour detection and segmentation, clustering, and distance measurements) are used in conjunction with the defined features to identify and measure whether the features are present in the captured reference images. In this way, the feature modules allow the MLDM 705 to automate (fully or partially) the labeling of large data sets in a consistent and quantifiable manner.
The visual feature image processing algorithms, in one embodiment, are performed on all of the images that are captured during the reference performance of the particular medical procedure associated with the feature module, using software, firmware and/or hardware. The ability of the labeling module to label images may be verified by review of the automated labeling of candidate images by a reference user (e.g., a proficient sonographer, technician, or physician). The foregoing processes and modules allow developers and technicians to quickly and accurately label and isolate target structures in large image data sets of 10,000 or more images.
MLDMs as shown in
Although the functions and operation of MLDM 705 have been illustrated for a carotid artery assessment ultrasound procedure, it will be appreciated that additional modules (not shown) may be provided for different ultrasound procedures (e.g., a cardiac assessment procedure of the heart), and that such modules would include additional class and feature modules therein. In addition, for non-imaging types of medical equipment, e.g., an EKG machine, labeling modules may also be provided to classify the output of the EKG machine into one or more classes (e.g., heart rate anomalies, QT interval anomalies, R-wave anomalies, etc.) having different structures and analytical processes but a similar purpose of classifying the equipment output into one or more classes.
Applicants have discovered that the automated capture and labeling of reference image data sets may be improved by automatically adjusting certain parameters within the feature modules 715, 720, 725, etc. As previously noted, the feature modules use standard image processing algorithms to determine whether the defined features are present in each image. These image processing algorithms in the feature modules (e.g., color analysis algorithms, thresholding algorithms, convolution with kernels, contour detection and segmentation, clustering and distance measurements) include a number of parameters that are usually maintained as constants, but which may be adjusted. Applicants have discovered that by automatically optimizing these adjustable parameters within the image processing algorithms using Particle Swarm Optimization, it is possible to minimize the number of images mislabeled by the image processing algorithms in the feature modules. Automatic adjustment of the image processing algorithms in the feature modules is discussed more fully in connection with
The method includes automatically capturing a plurality of ultrasound images (805) during a reference ultrasound procedure (e.g., performed by a proficient user), wherein each of the plurality of images is captured according to defined image capture criteria. In one embodiment, capture may be performed by an image capture module implemented in a computer (e.g., computer 700,
Referring again to
The method further comprises automatically classifying and labeling (815) each image as belonging to one of a plurality of available classes for the ultrasound medical procedure. As noted above, each image may be assigned to a class based on the features present or absent from the image. After an image is classified, the method further comprises labeling the image with its class. Labeling may be performed by storing in memory the image's class, or otherwise associating the result of the classification process with the image in a computer memory. In one embodiment, image classification may be performed by a classification module such as classification module 750 of
In some embodiments, the method may also involve automatically isolating (e.g., using boxes, circles, highlighting or other designation) within each image where each feature (i.e., those determined to be present in the feature analysis step) is located within the image (820). This step is optional and may not be performed in some embodiments. In one embodiment, automatic feature isolation (or bounding) may be performed by an isolation module that determines the boundary of each feature based on the characteristics that define the feature. The isolation module may apply appropriate boundary indicators (e.g., boxes, circles, ellipses, etc.) as defined in the isolation module, which in some embodiments may allow a user to select the type of boundary indicator to be applied.
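As a rough illustration of automatic isolation, the sketch below places a rectangular boundary around the pixels that satisfy a simple thresholding criterion. It assumes, purely for illustration, that the image is a list of rows of grayscale values and that the feature of interest appears darker than a chosen threshold; the `bounding_box` name and that criterion are not taken from the disclosure, and a real isolation module would use the characteristics that define each feature.

```python
def bounding_box(image, threshold):
    """Return (top, left, bottom, right) enclosing all pixels darker than
    `threshold`, using inclusive row and column indices, or None if no
    pixel qualifies."""
    rows = [r for r, row in enumerate(image) if any(p < threshold for p in row)]
    cols = [c for row in image for c, p in enumerate(row) if p < threshold]
    if not rows:
        return None
    return rows[0], min(cols), rows[-1], max(cols)
```

A circle or ellipse indicator could be derived from the same extent, for example by centering it on the box and scaling it to the box width and height.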
After the images have been classified and labeled, the method includes automatically splitting the set of labeled images into a training set and a validation set (825). The training set preferably is larger than the validation set (i.e., comprises more than 50% of the total images in the data set), and may range from 70-90%, more preferably 80-85%, of the total images. Conversely, the validation set may comprise from 10-30%, more preferably from 15-20%, of the total images.
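The automatic split may be sketched as a shuffled partition, shown below. This is a minimal example under assumptions: the `split_dataset` name, the default 80% training fraction (within the preferred 80-85% range above), and the fixed random seed used for reproducibility are illustrative choices.

```python
import random


def split_dataset(labeled_images, train_fraction=0.8, seed=0):
    """Shuffle the labeled images and split them into (training, validation).

    Every image lands in exactly one subset, and the training subset is the
    larger of the two.
    """
    if not 0.5 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0.5 and 1.0")
    shuffled = list(labeled_images)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before the cut avoids placing consecutive, visually similar frames from one part of the procedure entirely in one subset, although it does not by itself remove near duplicates across the two subsets.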
The remaining steps in the method 802 (e.g., steps 830-885) are automated steps that are similar to corresponding steps 630-685 and which, for brevity, are described in abbreviated form. The method further comprises providing a Deep Machine Learning (DML) platform (e.g., CNTK, Caffe, or Tensorflow) having a neural network to be trained loaded onto it (830). More specifically, a neural network (e.g., ResNet) is provided as a program in a computer language such as NDL or Python in the DML platform.
The training set is fed into the DML platform (835) and the training process is performed (840). The training process comprises iteratively computing weights and biases for the nodes of the neural network using feed-forward and back-propagation, as previously described, until the accuracy of the network in classifying images reaches an acceptable threshold level of accuracy.
The training process metrics of loss, accuracy, and/or error are obtained (845) at the conclusion of the training process, and a determination is made (850) whether the training process metrics are within an acceptable threshold for each metric. If the training process metrics are unacceptable, the adjustable parameters of the DML platform (and optionally those of the neural network) are adjusted to different values (855) and the training process is restarted (840). In one example involving CNTK as the DML platform, the tunable or adjustable parameters include learning rate constraints, number of epochs to train, epoch size, minibatch size, and momentum constraints.
The training process may be repeated one or more times if error metrics are not acceptable, with new adjustable parameters being provided each time the training process is performed. In one embodiment, if the error metrics obtained for the training process are unacceptable, adjustments to the adjustable parameters (855) of the DML platform are made automatically, using an optimization technique such as Particle Swarm Optimization. Additional details on particle swarm theory are provided by Eberhart, R. C. & Kennedy, J., “A New Optimizer Using Particle Swarm Theory,” Proceedings of the Sixth International Symposium on Micro Machine and Human Science, 39-43 (1995). In another embodiment, adjustments to the adjustable parameters (855) in the event of unacceptable error metrics are made manually by a designer.
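For readers unfamiliar with the technique, a compact Particle Swarm Optimization routine is sketched below. It is not the optimizer used in the disclosed system: it is a common inertia-weight variant of the method, the coefficient values are typical textbook defaults, and the `objective` function is a placeholder for whatever error metric the adjustable parameters are being tuned against (e.g., an error obtained by running a training cycle with the candidate parameters).

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iterations=100,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize `objective` over the box `bounds` = [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]          # per-particle best
    best_scores = [objective(p) for p in positions]
    g = min(range(n_particles), key=best_scores.__getitem__)
    global_best, global_score = best_positions[g][:], best_scores[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Blend momentum, pull toward own best, pull toward swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                # Move, then clamp the particle to the allowed range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < best_scores[i]:
                best_positions[i], best_scores[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score
```

In a tuning setting each coordinate would correspond to one adjustable parameter (for example a learning rate and a momentum value), and each call to `objective` would be expensive, so the particle and iteration counts would be chosen with that cost in mind.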
In one embodiment, each time automatic adjustments are made (855) to the adjustable parameters of the DML platform, automatic adjustments are also made to the adjustable parameters of the image processing algorithms used in the feature modules. As discussed in connection with
If the training process 840 fails to yield acceptable metrics (850) after a specific number of iterations (which may be manually determined, or automatically determined by, e.g., Particle Swarm Optimization), then the data set is insufficient to properly train the neural network and the data set is regenerated. If the metrics are within an acceptable threshold for each metric, then a DML model has been successfully generated (860). In one embodiment, acceptable error metrics may range from less than 5% to less than 10% average cross-entropy error for all epochs, and from less than 50% to less than 10% average classification error for all epochs. It will be recognized that different development projects may involve different acceptable thresholds, and that different DML platforms may use different types of error metrics.
If a successful DML model is generated (860), the method then includes feeding the validation data set to the DML model (865), and the validation process is performed (870) using the validation data set. After the completion of the validation process, validation process metrics for loss, accuracy and/or error are obtained (875) for the validation process.
A determination is made (880) whether the validation metrics are within an acceptable threshold for each metric, which may be the same as or different from those used for the training process. If the validation process metrics are outside of the acceptable threshold, the adjustable parameters are adjusted to different values (855) and the training process is restarted (840). If the metrics are acceptable, then the DML model may be used to classify new data (885). In one embodiment, the step of adjusting the adjustable parameters of the DML platform after the validation process comprises automatically adjusting at least one of the adjustable parameters of the DML platform and automatically adjusting at least one of the adjustable parameters of the image processing algorithms, for example by an algorithm using Particle Swarm Optimization.
The process may be allowed to continue through one or more additional cycles. If validation process metrics are still unacceptable, then the data set is insufficient to properly train the neural network, and the data set needs to be regenerated.
As discussed in connection with
Examples of isolating boxes are shown in
Now before I get into exactly what I want to demonstrate today let me tell you about kind of why we're here. What we're wanting to introduce is a piece of software we call procedural guidance or, for short, pro G. Now it's what we believe cutting-edge and the first of its kind as we're introducing procedural guidance in an augmented reality form.
What we do and what we try to perfect is protecting the human condition. We've done that multiple times with NASA via grants awarded over the last 20 years. We actually have products working on the International Space Station and products now also going on to the lunar rover and onto the lunar module. From there we've deduced that technology and we've distilled that value learned over time to what now I'm going to show you.
Here is the augmented reality display of pro G. (
Ok guys now we have the Trimble unit up and running, that is, the XR 10, which houses our Pro G software. What I first want to show you is demos that will highlight the usability, the functionality, and the value distilled from the actual hardware integrated with our software.
First, I'm going to start by pressing a virtual button. (
Back to the home screen, slate examples. This is what's nice. If you're out there on the rig site or if you're in a facility, you've got procedures that you're looking through the augmented reality HoloLens 2 that we've got built into the intrinsically safe Trimble unit. What you normally do is you have to scroll through different documents. (
Over on the next one it's vertical up and down. Over on the horizontal one I can move it wherever I want. Also what's very cool is I can also move these things just by grabbing them and putting the procedures wherever I want (
Back to my home screen. We go to eye tracking target selection. This is one of my favorites. This essentially is tracking your retina and as you can see, I'm looking at the blue ball. It's spinning around. I'm now looking at the green ball spinning around. I'm looking at the red ball spinning around. I'm moving out, the orange ball spinning around now, (
There's two different ways that I can manipulate these. What I want to do is make them explode. I can do that either verbally or I can do that with my hands. The point that I'm trying to put here is that you can verbally manipulate things while doing other tasks, making it hands-free. From an HSE perspective, you're protecting that person. If he's in a confined space where he needs to be able to hold on and make sure that he's secure, he can do that while continuing to push his procedure forward.
Now what I'm going to do is I'm going to explode all the blue balls just by saying the word “explode.” (
For oil and gas folks, we've gone ahead and built in a construction procedure. Essentially what we're going to do is put together a 13 and 5/8 10k stack via the pro G software system. This is very high-level, but we just want to show you exactly what capabilities we have.
I'm going to go ahead and start the procedure. It's essentially taking the novice to competent, and it'll take the competent to expert in real time by following these procedures. We'll give it a second for the software to boot up. Now we see the first step of the procedure up there. (
Here I can see all of the components that make up the stack. Now I can actually grab that stack and I can move it closer to myself and put it in a more manipulatable way where I can start now mocking it up. I can grab it with two hands and rotate it and go that way. I'm going to make it just a little bit smaller, and then we're going to start from here. (
Step one; identify the well head casing on the ground. What you're going to notice is that the green and yellow flashing actual piece of equipment that is going to be the step that it's associated to. (
Let's go to step number two, select a gasket to attach to the casing. (
We'll go to step number three. (
Let's go on to the next procedure. (
I've done step four, now I'm going to go ahead and move forward. (
Normally what happens here is you're going to go ahead and get started building everything together. You're going to get the guys coming out with a torque wrench. They're going to be torqueing everything up to a certain pressure. We actually have the capability to show exactly where that pressure torque is when it's in the perfect range and display it to the user who's doing that in an AR environment. We'll be able to report all of those numbers in that data to ensure the integrity of the bolt, that it's not over torqued and it's not under torqued.
Now that I've got the drilling spool attached, I'm going to go ahead and move to the next procedure. That is number six. (
Moving on to the next procedure, number seven, we want to go ahead and do the double ram. (
We'll close out that procedure and move forward to select the gasket to attach to the double ram. There's the gasket selected. It is now moved.
Now we're going to go ahead and close that out. We're going ahead to put the annular on top of that. (
What we're going to demonstrate now is actual Pro G in use. This is what we call the UIA. It's an actual piece of kit that's used on the International Space Station. It's what the astronauts use to hook up to get water supply, air supply, and also a waste line.
What you can see is I have this floating essential procedural guidance piece of kit. (
Now I'm going to link procedural guidance or pro G with the actual UIA. I'm going to choose the procedure that I want to do, the UIA panel procedure object recognition.
As you can see, it vectors a user and has a reticle floating around exactly what switch I need to flip now. (
As can be seen in
Another thing that the program and software can support is actually taking pictures and videos (and/or other recorded data) and getting different files and uploading that into the actual procedure itself. In other words, recorded data may be provided to a developer or maintainer of the base procedure instruction data for updating new versions, providing modified versions or forks for different use-cases, or the like.
The actual user sees a floating green flashing green circle that vectors the user as to which switch to flip and where to go. I'm going to go ahead and flip on the #1. Now I'm going to go over and look. Pro G automatically takes me to the next step. Again, it vectors me to another floating reticle to turn on the emu #2 power switch. (
Again, Pro G takes me automatically to the next step. It's forward leading the user in the procedure. Make sure EV power LEDs are green. (
We go on to the next step. This is very important: we want to make sure that we turn on oxygen and get oxygen to the astronaut's supply suit. It tells me to go ahead and open that up. (
It's telling me I'm on step 1.8; open the EV-2 water supply valve. Again, it's highlighted. (
Now I'm on step 1.10. This is to the telemetry variables, so I need to open up the EV-2 water waste that is open. (
Now that that's complete, I go to the next step. I've completed the procedure and now we're done.
That's it. I wanted to give you a high-level highlight of exactly the value that we can deliver. Again, remember, we have an intrinsically safer housing of the HoloLens. Augmented reality procedural guidance. We're protecting the human condition. We are reducing human error in all procedures.
Well beyond the state-of-the-art, among Tienovix' patent pending technology is augmented reality systems that are capable of guiding a remote user in performing various tasks. These tasks may include industrial procedures, medical procedures, manufacturing procedures, quality control procedures, etc. The user may be prompted by texts, graphics, and/or virtual objects within the user's field of vision while the user is performing a procedure.
For example, an augmented reality (A/R) display device may be used to guide a user to perform an industrial procedure, such as assembling a wellhead stack in an oil field. Displayed in the A/R display device may be one or more objects, such as text windows, graphics, virtual objects, etc., which all appear in the user's field of view. These objects guide the user in performing one or more tasks, such as selecting one of several parts in strategic order for assembly. The user may be able to select particular graphics, text windows, and/or virtual objects in an interactive fashion to navigate through various tasks. In some cases, particular manufacturing parts within the user's field of vision may be identified and virtually highlighted for ease of recognition. The user may then engage the identified manufacturing part for performing a step in the assembly task.
This process may be performed in a remote location, such as an off-site oil well platform or a medical center, while being monitored and supervised by a remote skilled practitioner, artificial intelligence (A.I.) system, or an expert system. Retina sensing features associated with the A/R display device may track the user's eye movements in order to determine exactly what the user is viewing. Further, the user's hand movements may also be tracked. Based on multiple inputs, such as the user's vision, the user's hand movements, machine-recognition of contents in the user's field of view, etc., an expert system may provide instructions to the user to perform, move, collect and assemble various parts. In this manner, accuracy of the user's performance, as well as safety features, are enhanced. That is, using Tienovix' systems, a novice user may instantly become competent with respect to a particular task, and an experienced user may instantly become an expert. Moreover, Tienovix' system may be deployed to perform a quality control check for any process, enhancing the accuracy of critical tasks.
Tienovix' innovative mixed-reality technology may be used for various commercial and medical tasks, such as industrial, manufacturing, testing, training, medical procedure, data acquisition, and many other applications. This cutting-edge technology can be found in the Tienovix Technology Series™. The Tienovix Technology Series™ includes Tienovix' Pro-G™ System and its Vulcan™ System, which includes the Vulcan-Training System™, the Vulcan-Diagnostic System™ and the Vulcan Automated-Diagnostic System™.
As a simplified illustration of a part of one of Tienovix' systems, as shown in
Further, as shown in
In one embodiment, the controller 1110 may comprise A) a library containing 1) stored reference positioning data relating to one or more of a movement, position, and orientation of at least a portion of an equipment system 1120 (e.g. a tool 1122) during a reference procedure and 2) stored reference outcome data relating to an outcome of a reference procedure; and B) a machine learning module (MLM) for providing at least one of 1) position-based 3D AR feedback to a user 1130 based on sensed user data and the reference positioning data, and 2) outcome-based 3D AR feedback to the user 1130 based on data received from the equipment system 1120 (e.g., from the tool 1122) during the procedure performed by the user and reference outcome data; wherein at least one presentation element of the position-based 3D AR feedback and the outcome-based 3D AR feedback is based at least in part on the user condition data.
In another embodiment, as shown in
The controller 1110 may also comprise a machine learning module (MLM) 1230. In one embodiment, the MLM 1230 may be as described herein.
The controller 1110 may further comprise a library 1240. In one embodiment, the library 1240 may be as described herein.
The controller 1110 may additionally comprise a simulation module 1250. The simulation module 1250 may be configured to generate data based on one or more models, each of one or more elements of the system 1100 depicted in
The controller 1110 may comprise an artificial intelligence (AI) module 1260. The AI module 1260 may process data received from one or more of the input processing module 1220, the MLM module 1230, the library 1240, and the simulation module 1250, in view of the nature of the equipment system 1120, the tool 1122, the target 1170, and the user 1130 (each of which is described in more detail below), to generate data relating to a procedure being performed by the user 1130 using the equipment system 1120 to effect a change or perform another action on the target 1170. The term “artificial intelligence” is not limited to any particular embodiment of software, hardware, or firmware, and instead encompasses neural networks, expert systems, and other data structures and algorithms known to the person of ordinary skill in the art having the benefit of the present disclosure.
The controller 1110 may also comprise a procedure instruction data generation module 1270. The procedure instruction data generation module 1270 may process data received from the AI module 1260 in order to generate procedure instruction data. Such data may not yet be in condition for presentation to the user 1130 of the system 1100. Accordingly, the procedure instruction data generation module 1270 may output its results to one or more of a graphics module 1272, an audio module 1274, and/or other presentation (e.g., tactile, haptic, olfactory, gustatory, etc.) module 1276. The modules 1272-1276 may process the procedure instruction data in order to generate one or more human-apprehensible elements suitable for presentation to the user 1130 during the performance of a procedure using the equipment system 1120. For example, the graphics module 1272 may generate one or more text, icon, interactive or visual cue elements; the audio module 1274 may generate one or more voice narration or auditory cue elements; and the other presentation module 1276 may generate one or more tactile, haptic, olfactory, gustatory, or other elements.
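By way of non-limiting illustration only, the hand-off from the procedure instruction data generation module 1270 to the presentation modules 1272-1276 may be sketched as follows. The `InstructionData` type, its field names, and the three module functions are hypothetical and are introduced only for this sketch; an actual embodiment may structure the data quite differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InstructionData:
    """Hypothetical stand-in for procedure instruction data from module 1270."""
    text: str
    audio_cue: Optional[str] = None
    haptic_pattern: Optional[str] = None


def graphics_module(data: InstructionData) -> Optional[Dict[str, str]]:
    """Analogue of module 1272: always produces a text element."""
    return {"kind": "text", "content": data.text}


def audio_module(data: InstructionData) -> Optional[Dict[str, str]]:
    """Analogue of module 1274: produces an auditory cue only if one is present."""
    if data.audio_cue is None:
        return None
    return {"kind": "audio", "content": data.audio_cue}


def other_module(data: InstructionData) -> Optional[Dict[str, str]]:
    """Analogue of module 1276: produces a haptic element only if one is present."""
    if data.haptic_pattern is None:
        return None
    return {"kind": "haptic", "content": data.haptic_pattern}


def generate_elements(data: InstructionData) -> List[Dict[str, str]]:
    """Collect the human-apprehensible elements produced by each module."""
    elements = []
    for module in (graphics_module, audio_module, other_module):
        element = module(data)
        if element is not None:
            elements.append(element)
    return elements
```

Each module in this sketch either returns one element or declines, so that presentation modes not needed for a given instruction are simply omitted from the output.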
The output processing module 1280 of the controller 1110 then receives the generated elements of the procedure instruction data and transfers them to an augmented reality user interface (ARUI), such as the augmented reality display 1140 depicted in
Returning to
A “procedure,” as used herein, refers to any process in which, by use of an optional equipment system 1120 or by body members of the user 1130, an action may be performed on a target 1170.
In embodiments, the procedure may be a training or operations procedure, in which embodiments the equipment system 1120 may be a car, truck, construction vehicle, combat vehicle, boat, ship, aircraft, spacecraft, space extravehicular activity (EVA) suit, weapon, power tool, manufacturing facility, assembly line, or component of any of the foregoing, and the target may be a roadway, a vehicle track, a construction site, a combat training ground, a waterway, an airspace, a volume of outer space, a vehicle, a structure, a firearms target, an ordnance target, a workpiece, or the like. Exemplary procedures include, but are not limited to, training or operations in vehicle transportation; construction; manufacturing; maintenance; quality control; combat actions on land, at sea, or in air; combat support actions on land, at sea, or in air, e.g., air-to-air refueling, takeoff and landing of aircraft from aircraft carriers; space operations, such as EVAs (colloquially, “spacewalks”), docking, etc.; and more that will readily occur to the person of ordinary skill in the art having the benefit of the present disclosure.
“Procedure instruction data,” as used herein, refers to any combination of elements that may be presented by an augmented reality display 1140 to the user 1130, wherein the elements provide instructions for one or more actions to be performed as part of the procedure performed by the user 1130 on the target 1170, such as through action of his or her body members and/or his or her manipulations of the tool 1122 of the equipment system 1120. In one embodiment, the procedure instruction data comprises at least one of text, an icon, an image, an interactive element (e.g., text or an icon that may receive augmented reality input (e.g., a pinch, squeeze, flick, and/or other motion of one or both hands and/or one or more fingers; a turn or other gesture of the head; a voice command, etc.) from the user 1130), a visual cue, a number of instructions displayed simultaneously, an auditory cue (e.g., a pleasant sound when the user 1130 brings the tool 1122 to a desired position and/or orientation; an unpleasant sound when the user 1130 attempts to perform an action with the tool 1122 when the tool 1122 is in an undesired position and/or orientation), or a narration.
In one embodiment, the system 1100 further comprises a user input module 1154 configured to receive a user input from the user 1130 regarding a user's belief that he or she has completed an instruction presented to him or her through augmented reality display 1140. The user input module 1154 may comprise a physical button, switch, or slider; a touchscreen; a microphone; among others; or two or more thereof. In this embodiment, the controller 1110 may provide the procedure instruction data based at least in part on the user input.
The system 1100 also comprises an augmented reality display 1140. The augmented reality display 1140 presents the procedure instruction data, generated by the controller 1110, to the user 1130 during at least a portion of the procedure. The augmented reality display 1140 may be any known augmented reality hardware, such as a HoloLens 2 (Microsoft Corporation, Redmond, Wash.), among other augmented reality hardware currently known or yet to be developed or commercialized. Although the augmented reality display 1140 is conceptually depicted in proximity to the eyes of the user 1130, and the exemplary augmented reality hardware discussed above presents graphical data to the eyes of the user 1130 and may also present auditory data to the ears of the user 1130, the augmented reality display 1140 may provide any of graphical data, auditory data, olfactory data, tactile data, haptic data, gustatory data, among others, or two or more thereof.
The system 1100 may also comprise a memory 1180. The memory 1180 may comprise one or more database(s) 1182, e.g., as shown in the depicted embodiment, first database 1182a through Nth database 1182n. The database(s) 1182 may store data relating to one or more of the equipment system 1120, the target 1170, the augmented reality display 1140, procedure instruction data generated by or to be generated by the controller 1110, etc. The database(s) 1182 may be selected from relational databases, lookup tables, or other database structures known to the person of ordinary skill in the art.
The memory 1180 may additionally comprise a memory interface 1184. The memory interface 1184 may be configured to read data from the database(s) 1182 and/or write data to the database(s) 1182, and/or provide data to or receive data from the controller 1110, the equipment system 1120, and/or other components of the system 1100.
The system 1100 further comprises a recording device 1150. The recording device 1150 may be configured to provide a recording of at least one of the user 1130 or the target 1170 on which the procedure is performed. The recording device 1150 may be configured to capture signals from more than just the user 1130 and/or the target 1170. In one embodiment, the recording device 1150 may be configured to capture signals from a field 1152 comprising the user 1130 and/or the target 1170, along with other items in the vicinity of the user 1130 and the target 1170.
The recording device 1150 may be configured to capture at least one of visible light, ultraviolet (UV) light, infrared (IR) light, or audio. In other words, the recording device 1150 may comprise one or more of a camera, a video camera, a UV camera, an IR camera, a microphone, or the like.
In one embodiment, not shown, the recording device 1150 may be positioned in proximity to the user and configured to capture signals from essentially the user's field of view or field of hearing. In one embodiment, the recording device 1150 may be a camera mounted on or near the augmented reality display 1140, such as on a hardhat worn by the user 1130 during the procedure and to which the augmented reality display 1140 may also be mounted.
The system 1100 may further comprise a communication interface 1190. The communication interface 1190 may be configured to transmit data generated by the system 1100 to a remote location and/or receive data generated at a remote location for use by the system 1100. For example, the communication interface 1190 may be configured to receive a recording provided by the recording device 1150. The communication interface 1190 may also be configured to transmit the recording to the remote location for expert review or analysis, for review or analysis by an expert system and/or an artificial intelligence (AI) system, for training purposes of the user 1130 and/or trainees at the remote location, for updating or modifying base procedure instruction data to be used by a controller 1110 of the present or a different system 1100, etc. Any review or analysis of the procedure or a step thereof may be performed in real-time or after the procedure or the step thereof is completed.
The communication interface 1190 may be one or more of a Wi-Fi interface, a Bluetooth interface, a radio communication interface, or a telephone communication interface, among others that will be apparent to the person of ordinary skill in the art.
In one embodiment of the method 1300, the augmented reality display may comprise a Microsoft® HoloLens 2® and a casing to which the HoloLens 2 is mounted, wherein the casing is reversibly affixed to the hardhat.
In one embodiment of the method 1300, the instructions further comprise one or more of a reticle overlaid on an object upon which the procedure is performed or a digital twin of the object upon which the procedure is performed.
In one embodiment, the method 1300 additionally comprises receiving (at 1355), by a communication interface, the recording from the recording device. In a further embodiment, the recording captures at least one of visible light, ultraviolet light, infrared light, or audio. The communication interface may then transmit the recording to a remote location and, optionally, receive information from the remote location based on an analysis or review of the recording made at the remote location.
After initiating (at 4635) the procedure, or after beginning (at 4665) a step of the procedure, the method 4600 may comprise recording, by a recording device, the user and/or the target of the procedure.
The method 4600 may further comprise receiving (at 4640) feedback from the user. For example, the feedback may be a verbal statement or a physical or augmented-reality action (e.g., pressing a real or a virtual button). If the current step of the procedure is complete, as determined at 4645, then the method 4600 may comprise advancing (at 4655) to the next step. If the current step is not complete (as determined at 4645), the method 4600 may comprise providing (at 4650) feedback for the user to complete the step.
After the method 4600 advances (at 4655) to the next step, the method 4600 may comprise determining (at 4660) if the procedure is complete. If the procedure is not complete, the next step advanced to at 4655 is begun (at 4665) and becomes the current step. On the other hand, if the procedure is determined (at 4660) to be complete, the method 4600 may include performing (at 4670) one or more post-procedure actions, such as logging a result of the procedure, sending a message to a third party that the user completed the procedure, etc.
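By way of non-limiting illustration only, the step-advancement logic of the method 4600 may be sketched as follows. The function `run_procedure`, its callback parameters, and the `max_checks_per_step` guard are hypothetical; the guard in particular is an assumption added so that the sketch cannot loop indefinitely, and is not a feature recited for the method 4600.

```python
from typing import Callable, List


def run_procedure(steps: List[str],
                  is_complete: Callable[[str], bool],
                  give_feedback: Callable[[str], None],
                  max_checks_per_step: int = 100) -> List[str]:
    """Walk through the steps of a procedure and return a log of events."""
    log: List[str] = []
    for step in steps:
        # First step follows initiation (4635); later steps are begun at 4665.
        log.append(f"begin {step}")
        checks = 0
        # Receive user feedback (4640) and decide whether the step is done (4645).
        while not is_complete(step):
            checks += 1
            if checks >= max_checks_per_step:  # assumption: safety guard only
                raise RuntimeError(f"step {step!r} was never completed")
            give_feedback(step)                # guidance to finish the step (4650)
            log.append(f"feedback {step}")
        log.append(f"advance past {step}")     # advance to the next step (4655)
    # Running out of steps corresponds to the procedure being complete (4660).
    log.append("post-procedure actions")       # e.g., logging a result (4670)
    return log
```

In this sketch the completion check at 4660 is expressed implicitly by the exhaustion of the list of steps, rather than as a separate decision.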
The technology described above may be implemented into one or more systems of the Tienovix Technology Series™.
Throughout, we have referred to “3D” augmented reality elements. The person of ordinary skill in the art having the benefit of the present disclosure will be able, as a matter of routine experimentation, to adapt all such concepts to 2D augmented reality elements, 2D AR feedback, and other two-dimensional AR concepts. For example, the person of ordinary skill in the art would readily be able to adapt 3D AR elements displayable to a user equipped with an extended reality (XR) interface, such as a HoloLens or comparable system, into 2D AR elements displayable to a user with a smartphone or tablet equipped with a rear camera. Other apparatus for displaying 2D AR elements to a user may be used as a matter of routine experimentation, provided, of course, that the person of ordinary skill in the art has the benefit of the present disclosure. Without that benefit, implementing any of the concepts disclosed herein would require undue experimentation.
All of the systems and methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the systems and methods of this invention have been described in terms of particular embodiments, it will be apparent to those of skill in the art that variations may be applied to the systems and methods and in the steps or in the sequence of steps of the methods described herein without departing from the concept, spirit, and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope, and concept of the invention as defined by the appended claims.
In various embodiments, the present invention relates to the subject matter of the following numbered paragraphs.
101. A method for providing real-time, three-dimensional (3D) augmented reality (AR) feedback guidance to a user of a medical equipment system, the method comprising:
102. The method of claim 101, wherein the medical procedure performed by a user of the medical equipment comprises a first medical procedure, and the stored reference positioning data and stored reference outcome data relate to a reference performance of the first medical procedure prior to the user's performance of the first medical procedure.
103. The method of claim 101, wherein the medical procedure performed by a user of the medical equipment comprises a first ultrasound procedure, and the stored reference positioning data and stored reference outcome data comprise ultrasound images obtained during a reference performance of the first ultrasound procedure prior to the user's performance of the first ultrasound procedure.
104. The method of claim 103, wherein sensing real-time user positioning data comprises sensing real-time movement by the user of an ultrasound probe relative to the body of a patient.
105. The method of claim 101, wherein generating real-time outcome-based 3D AR feedback is based on a comparison, using a neural network, of real-time images generated by the user in an ultrasound procedure to retrieved images generated during a reference performance of the same ultrasound procedure prior to the user.
106. The method of claim 105, wherein the comparison is performed by a convolutional neural network.
107. The method of claim 101, wherein sensing real-time user positioning data comprises sensing one or more of the movement, position, and orientation of at least a portion of the medical equipment system by the user with a sensor comprising at least one of a magnetic GPS system, a digital camera tracking system, an infrared camera system, an accelerometer, and a gyroscope.
108. The method of claim 101, wherein sensing real-time user positioning data comprises sensing at least one of:
109. The method of claim 101, wherein providing at least one of the real-time position-based 3D AR feedback and the real-time output-based 3D AR feedback to the user comprises providing a feedback selected from:
110. The method of claim 101, wherein providing at least one of the real-time position-based 3D AR feedback and the real-time output-based 3D AR feedback comprises providing both of the real-time position-based 3D AR feedback and the real-time output-based 3D AR feedback to the user.
111. The method of claim 101, wherein providing at least one of the real-time position-based 3D AR feedback and the real-time output-based 3D AR feedback comprises providing said at least one feedback to a head mounted display (HMD) worn by the user.
201. A method for developing a machine learning model of a neural network for classifying images for a medical procedure using an ultrasound system, the method comprising:
A) performing a first medical procedure using an ultrasound system;
B) automatically capturing a plurality of ultrasound images during the performance of the first medical procedure, wherein each of the plurality of ultrasound images is captured at a defined sampling rate according to defined image capture criteria;
C) providing a plurality of feature modules, wherein each feature module defines a feature which may be present in an image captured during the medical procedure;
D) automatically analyzing each image using the plurality of feature modules;
E) automatically determining, for each image, whether or not each of the plurality of features is present in the image, based on the analysis of each image using the feature modules;
F) automatically labeling each image as belonging to one class of a plurality of image classes associated with the medical procedure;
G) automatically splitting the plurality of images into a training set of images and a validation set of images;
H) providing a deep machine learning (DML) platform having a neural network to be trained loaded thereon, the DML platform having a plurality of adjustable parameters for controlling the outcome of a training process;
I) feeding the training set of images into the DML platform;
J) performing the training process for the neural network to generate a machine learning model of the neural network;
K) obtaining training process metrics of the ability of the generated machine learning model to classify images during the training process, wherein the training process metrics comprise at least one of a loss metric, an accuracy metric, and an error metric for the training process;
L) determining whether each of the at least one training process metrics is within an acceptable threshold for each training process metric;
M) if one or more of the training process metrics are not within an acceptable threshold, adjusting one or more of the plurality of adjustable DML parameters and repeating steps J, K, and L;
N) if each of the training process metrics is within an acceptable threshold for each metric, performing a validation process using the validation set of images;
O) obtaining validation process metrics of the ability of the generated machine learning model to classify images during the validation process, wherein the validation process metrics comprise at least one of a loss metric, an accuracy metric, and an error metric for the validation process;
P) determining whether each of the validation process metrics is within an acceptable threshold for each validation process metric;
Q) if one or more of the validation process metrics are not within an acceptable threshold, adjusting one or more of the plurality of adjustable DML parameters and repeating steps J-P; and
R) if each of the validation process metrics is within an acceptable threshold for each metric, storing the machine learning model for the neural network.
202. The method of claim 201, further comprising:
S) receiving, after storing the machine learning model for the neural network, a plurality of images from a user performing the first medical procedure using an ultrasound system;
T) using the stored machine learning model to classify each of the plurality of images received from the ultrasound system during the second medical procedure.
203. The method of claim 201, further comprising:
S) using the stored machine learning model for the neural network to classify a plurality of ultrasound images for a user performing the first medical procedure.
204. The method of claim 201, wherein performing the training process comprises iteratively computing weights and biases for each of the nodes of the neural network using feed-forward and back-propagation until the accuracy of the network in classifying images reaches an acceptable threshold level of accuracy.
205. The method of claim 201, wherein performing the validation process comprises using the machine learning model generated by the training process to classify the images of the validation set of image data.
206. The method of claim 201, further comprising stopping the method if steps J, K, and L have been repeated more than a threshold number of repetitions.
207. The method of claim 206, further comprising stopping the method if steps N-Q have been repeated more than a threshold number of repetitions.
208. The method of claim 201, wherein providing a deep machine learning (DML) platform comprises providing a DML platform having at least one adjustable parameter selected from learning rate constraints, number of epochs to train, epoch size, minibatch size, and momentum constraints.
209. The method of claim 208, wherein adjusting one or more of the plurality of adjustable DML parameters comprises automatically adjusting said one or more parameters using a particle swarm optimization algorithm.
210. The method of claim 201, wherein automatically splitting the plurality of images comprises automatically splitting the plurality of images into a training set comprising from 70% to 90% of the plurality of images, and a validation set comprising from 10% to 30% of the plurality of images.
211. The method of claim 201, wherein automatically labeling each image further comprises isolating one or more of the features present in the image using a boundary indicator selected from a bounding box, a bounding circle, a bounding ellipse, and an irregular bounding region.
212. The method of claim 201, wherein obtaining training process metrics comprises obtaining at least one of average cross-entropy error for all epochs and average classification error for all epochs.
213. The method of claim 201, wherein determining whether each of the training process metrics is within an acceptable threshold comprises determining whether average cross-entropy error for all epochs is less than a threshold selected from 5% to 10%, and average classification error for all epochs is less than a threshold selected from 10% to 15%.
214. The method of claim 201, wherein step A) is performed by a proficient user.
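By way of non-limiting illustration only, the control flow of steps G and J through R of paragraph 201, together with the repetition limits of paragraphs 206 and 207, may be sketched as follows. The callables `train_fn`, `validate_fn`, and `adjust_fn` are hypothetical placeholders for a DML platform, which is not modeled here. The sketch also assumes that every metric being checked is an error-type metric for which smaller values are better, and that a single `max_repeats` limit applies to both loops.

```python
import random
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

Metrics = Dict[str, float]


def split_images(images: Sequence, train_fraction: float = 0.8,
                 seed: int = 0) -> Tuple[List, List]:
    """Step G: shuffle the images and split them into training and validation sets."""
    if not 0.7 <= train_fraction <= 0.9:
        raise ValueError("train_fraction is expected to lie in [0.7, 0.9]")
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def within_thresholds(metrics: Metrics, limits: Metrics) -> bool:
    """Steps L and P: every error-type metric must fall below its limit."""
    return all(metrics[name] < limit for name, limit in limits.items())


def develop_model(train_fn: Callable[[Any], Tuple[Any, Metrics]],
                  validate_fn: Callable[[Any], Metrics],
                  adjust_fn: Callable[[Any], Any],
                  params: Any,
                  train_limits: Metrics,
                  val_limits: Metrics,
                  max_repeats: int = 20) -> Optional[Any]:
    """Return a model that passes both checks, or None if a repeat limit is hit."""
    train_repeats = 0
    val_repeats = 0
    while True:
        model, train_metrics = train_fn(params)                  # steps J and K
        if not within_thresholds(train_metrics, train_limits):   # step L
            train_repeats += 1
            if train_repeats > max_repeats:                      # paragraph 206
                return None
            params = adjust_fn(params)                           # step M
            continue
        val_metrics = validate_fn(model)                         # steps N and O
        if within_thresholds(val_metrics, val_limits):           # step P
            return model                                         # step R
        val_repeats += 1
        if val_repeats > max_repeats:                            # paragraph 207
            return None
        params = adjust_fn(params)                               # step Q
```

A usage example with toy placeholders, in which the "model" is simply the current learning rate and the error is assumed to equal that rate:

```python
model = develop_model(
    train_fn=lambda p: (p["lr"], {"classification_error": p["lr"]}),
    validate_fn=lambda m: {"classification_error": m},
    adjust_fn=lambda p: {"lr": p["lr"] / 2},
    params={"lr": 0.4},
    train_limits={"classification_error": 0.15},
    val_limits={"classification_error": 0.15},
)
```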
The particular embodiments disclosed above are illustrative only, as the invention may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. Examples are all intended to be non-limiting. Furthermore, exemplary details of construction or design herein shown are not intended to limit or preclude other designs achieving the same function. It is therefore evident that the particular embodiments disclosed above may be altered or modified and all such variations are considered within the scope and spirit of the invention, which are limited only by the scope of the claims.
Embodiments of the present invention disclosed and claimed herein may be made and executed without undue experimentation with the benefit of the present disclosure. While the invention has been described in terms of particular embodiments, it will be apparent to those of skill in the art that variations may be applied to systems and apparatus described herein without departing from the concept, spirit and scope of the invention.
Number | Date | Country
---|---|---
62967178 | Jan 2020 | US
62971075 | Feb 2020 | US
62450051 | Jan 2017 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 16732353 | Jan 2020 | US
Child | 17063629 | | US
Parent | 15878314 | Jan 2018 | US
Child | 16732353 | | US

Relation | Number | Date | Country
---|---|---|---
Parent | 17063629 | Oct 2020 | US
Child | 17162622 | | US