The invention relates to augmented reality systems, and is particularly applicable to use in medical procedures.
Augmented reality is a technique that superimposes a computer image over a viewer's direct view of the real world. The position of the viewer's head, objects in the real world environment, and components of the display system are tracked, and their positions are used to transform the image so that it appears to be an integral part of the real world environment. The technique has important applications in the medical field. For example, a three-dimensional image of a bone reconstructed from CT data can be displayed to a surgeon, superimposed on the patient at the exact location of the real bone, regardless of the position of either the surgeon or the patient.
Augmented reality is typically implemented in one of two ways, via video overlay or optical overlay. In video overlay, video images of the real world are enhanced with properly aligned virtual images generated by a computer. In optical overlay, images are optically combined with the real scene using a beamsplitter, or half-silvered mirror. Virtual images displayed on a computer monitor are reflected to the viewer with the proper perspective in order to align the virtual world with the real world. Tracking systems are used to achieve proper alignment, by providing information to the system on the location of objects such as surgical tools, ultrasound probes and a patient's anatomy with respect to the user's eyes. Tracking systems typically include a controller, sensors and emitters or reflectors.
In optical overlay the partially reflective mirror is fixed relative to the display. A calibration process defines the location of the projected display area relative to a tracker mounted on the display. The system uses the tracked position of the viewpoint, positions of the tools, and position of the display to calculate how the display must draw the images so that their reflections line up properly with the user's view of the tools.
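The alignment calculation described above can be illustrated with a small geometric sketch. It is not taken from the specification: it assumes an ideal flat mirror, a flat display, a single known viewpoint, and all positions already expressed in one common coordinate system. The function names (`reflect_point`, `intersect_line_plane`, `display_point_for_tool`) are hypothetical. The idea is that the reflection of a display point appears at its mirror image, so reflecting the eye and the tool across the mirror and intersecting that line with the display plane gives the display point whose reflection lines up with the tool.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def _scale(a, s):
    return tuple(x * s for x in a)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect_point(p, mirror_point, mirror_normal):
    """Mirror image of p across the plane through mirror_point.

    mirror_normal is assumed to be a unit vector.
    """
    signed_distance = _dot(_sub(p, mirror_point), mirror_normal)
    return _sub(p, _scale(mirror_normal, 2.0 * signed_distance))

def intersect_line_plane(a, b, plane_point, plane_normal):
    """Point where the line through a and b crosses the plane."""
    direction = _sub(b, a)
    denom = _dot(direction, plane_normal)
    if abs(denom) < 1e-12:
        raise ValueError("line is parallel to the plane")
    t = _dot(_sub(plane_point, a), plane_normal) / denom
    return _add(a, _scale(direction, t))

def display_point_for_tool(eye, tool, mirror_point, mirror_normal,
                           display_point, display_normal):
    """Display location whose reflection overlays the tool as seen from eye."""
    eye_reflected = reflect_point(eye, mirror_point, mirror_normal)
    tool_reflected = reflect_point(tool, mirror_point, mirror_normal)
    return intersect_line_plane(eye_reflected, tool_reflected,
                                display_point, display_normal)
```

For instance, with the mirror in the plane z = 0, the display in the plane z = 5, the eye at (0, 0, 10) and a tool tip at (2, 0, -10), the sketch returns the display point (1.5, 0, 5); its mirror image at (1.5, 0, -5) lies on the line of sight from the eye to the tool tip.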
It is possible to make a head mounted display (HMD) that uses optical overlay, by miniaturizing the mirror and computer display. Tracking the user's viewpoint is unnecessary in this case because the device is mounted to the head, and the device's calibration process takes this into account. The mirrors are attached to the display device and their spatial relationship is defined in calibration. The tools and display device are tracked by a tracking system. Due to the closeness of the display to the eye, very small errors/motions in the position (or calculated position) of the display on the head translate to large errors in the user workspace, and difficulty in calibration. High display resolutions are also much more difficult to realize for an HMD. HMDs are also cumbersome to the user. These are significant disincentives to using HMDs.
Video overlay HMDs have two video cameras, one mounted near each of the user's eyes. The user views small displays that show the images captured by the video cameras combined with any virtual images. The cameras can also serve as a tracking system sensor, so the relative position of the viewpoint and the projected display area are known from calibration, and only tool tracking is necessary. Calibration problems and a cumbersome nature also plague HMD video overlay systems.
A device commonly referred to as a “sonic flashlight” (SF) is an augmented reality device that merges a captured image with a direct view of an object independent of the viewer location. The SF does not use tracking, and it does not rely on knowing the user viewpoint. It accomplishes this by physically aligning the image projection with the data it should be collecting. This accomplishment actually limits the practical use of the system, in that the user has to peer through the mirror to the area where the image would be projected. Mounting the mirror to allow this may result in a package that is not ergonomically feasible for the procedure for which it is being used. Also, in order to display 3D images, SF would need to use a 3D display, which results in much higher technologic requirements, which are not currently practical. Furthermore, if an SF were to be used to display anything other than the real time tomographic image (e.g. unimaged tool trajectories), then tracking would have to be used to monitor the tool and display positions.
Also known in the art is an integrated videography (IV) having an autostereoscopic display that can be viewed from any angle. Images can be displayed in 3D, eliminating the need for viewpoint tracking because the data is not shown as a 2D perspective view. The device has been incorporated into the augmented reality concept for a surgical guidance system. A tracking system, which is physically separated from the display, is used to monitor the tools. Calibration and accuracy can be problematic in such configurations. This technique involves the use of highly customized and expensive hardware, and is also very computationally expensive.
The design of augmented reality systems used for surgical procedures requires sensitive calibration and tracking accuracy. Devices tend to be very cumbersome for medical use and expensive, limiting their usefulness or affordability. Accordingly, there is a need for an augmented reality system that can be easily calibrated, is accurate enough for surgical procedures and is easily used in a surgical setting.
The present invention provides an augmented reality device to combine a real world view with information, such as images, of one or more objects. For example, a real world view of a patient's anatomy may be combined with an image of a bone within that area of the anatomy. The object information, which is created for example by ultrasound or a CAT scan, is presented on a display. An optical combiner combines the object information with a real world view of the object and conveys the combined image to a user. A tracking system tracks the location of one or more objects, such as surgical tools, ultrasound probe or body part to assure proper alignment of the real world view with object information. At least a part of the tracking system is at a fixed location with respect to the display. A non-head mounted eyepiece is provided at which the user can view the combined object and real world views. The eyepiece fixes the user location with respect to the display location and the optical combiner location so that the user's position need not be tracked directly.
The invention is best understood from the following detailed description when read with the accompanying drawings.
FIGS. 3A-B depict augmented reality devices using an infrared camera according to an illustrative embodiment of the invention.
FIGS. 5A-C depict a stereoscopic image overlay device according to illustrative embodiments of the invention.
FIGS. 7A-C depict use of mechanical arms according to illustrative embodiments of the invention.
Advantageously, embodiments of the invention may provide an augmented reality device that is less sensitive to calibration and tracking accuracy errors, less cumbersome for medical use, less expensive and easier to incorporate tracking into the display package than conventional image overlay devices. An eyepiece is fixed to the device relative to the display so that the location of the projected display and the user's viewpoint are known to the system after calibration, and only the tools, such as surgical instruments, need to be tracked. The tool (and other object) positions are known through use of a tracking system. Unlike video-based augmented reality systems, which are commonly implemented in HMD systems, the actual view of the patient, rather than an augmented video view, is provided.
The present invention, unlike the SF, has substantially unrestricted viewing positions relative to tools (provided the tracking system used does not require line-of-sight to the tools), 3D visualization, and superior ergonomics.
The disclosed augmented reality device in its basic form includes a display to present information that describes one or more objects in an environment simultaneously. The objects may be, for example, a part of a patient's anatomy, a medical tool such as an ultrasound probe, or a surgical tool. The information describing the objects can be images, graphical representations or other forms of information that will be described in more detail below. Graphical representations can, for example, be of the shape, position and/or the trajectory of one or more objects.
An optical combiner combines the displayed information with a real world view of the objects, and conveys this augmented image to a user. A tracking system is used to align the information with the real world view. At least a portion of the tracking system is at a fixed location with respect to the display.
If the camera (sensor) portion of the tracking system is attached to a box housing the display, i.e. if they are in a single unit or display unit, the box would not need to be tracked, and the device would be more ergonomically desirable. Preferably the main reference portion of the tracking system (herein referred to as the “base reference object”) is attached to the single unit. The base reference object may be described further as follows: tracking systems typically report the positions of one or more objects, or markers, relative to a base reference coordinate system. This base coordinate system is defined relative to a base reference object. The base reference object in an optical tracking system, for example, is one camera or a collection of cameras; the markers are visualized by the camera(s), and the tracking system computes the location of the markers relative to the camera(s). The base reference object in an electromagnetic tracking system can be a magnetic field generator that induces specific currents in each of the markers, allowing for position determination.
It can be advantageous to fix the distance between the tracking system's base reference object and the display, for example by providing them in a single display unit. This configuration is advantageous for two reasons. First, it is ergonomically advantageous because the system can be configured to place the tracking system's effective range directly in the range of the display. The user need not consider where to place an external reference base. For example, if optical tracking is used and the cameras are not mounted to the display unit, then the user must place the camera system so that both the display and the tools to be tracked can be seen by the camera system. If the camera system is mounted to the display device and aimed at the workspace, then only the tools must be visible, because the physical connection dictates a set location of the reference base relative to the display unit.
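The role of the base reference coordinate system can be sketched as a chain of rigid transforms. The following is an illustrative assumption rather than the disclosed implementation: poses are 4x4 homogeneous matrices, `display_from_base` stands for the fixed relationship found once by calibration when the base reference object is attached to the display unit, and `base_from_marker` stands for the pose the tracking system reports for a marker. All names are hypothetical.

```python
import math

def translation(tx, ty, tz):
    """4x4 homogeneous matrix for a pure translation."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    """4x4 homogeneous matrix for a rotation about the z axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul4(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply_transform(m, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

def tool_tip_in_display_frame(display_from_base, base_from_marker, tip_in_marker):
    """Express a tool tip, known in marker coordinates, in display coordinates.

    display_from_base: fixed by calibration (assumed constant).
    base_from_marker: reported by the tracking system for each frame.
    """
    display_from_marker = matmul4(display_from_base, base_from_marker)
    return apply_transform(display_from_marker, tip_in_marker)
```

Because `display_from_base` is constant in this arrangement, only `base_from_marker` changes from frame to frame; the display itself never has to be located by the tracking system.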
Second, there is an accuracy advantage in physically attaching the base reference to the display unit. Any error in tracking that would exist in external tracking of the display unit is eliminated. The location of the display is fixed, and determined through calibration, rather than determined by the tracking system, which has inherent errors. It is noted that reference to “attaching” or “fixing” includes adjustably attaching or fixing.
Finally, the basic augmented reality device includes a non-head mounted eyepiece at which the user can view the augmented image and which fixes the user location with respect to the display location and the optical combiner location.
Although the embodiments described above include infrared images, other nonvisible images, or images from subsets of the visible spectrum can be used and converted to visible light in the same manner as described above.
The term “eyepiece” is used herein in a broad sense and includes a device that would fix a user's viewpoint with respect to the display and optical combiner. An eyepiece may contain vision aiding tools and positioning devices. A vision aiding tool may provide magnification or vision correction, for example. A positioning device may merely be a component against which a user would position their forehead or chin to fix their distance from the display. Such a design may be advantageous because it could accommodate users wearing eyeglasses. Although the singular “eyepiece” is used here, an eyepiece may contain more than one viewing component.
The eyepiece may be rigidly fixed with respect to the display location, or it may be adjustably fixed. If adjustably fixed, it can allow for manual adjustments or electronic adjustments. In a particular embodiment of the invention, a sensor, such as a linear encoder, is used to provide information to the system regarding the adjusted eyepiece position, so the displayed information can be adjusted to compensate for the adjusted eyepiece location. The eyepiece may include a first eyepiece viewing component and a second eyepiece viewing component, one associated with each of a user's eyes. The system can be configured so that each eyepiece viewing component locates a different viewpoint or perspective with respect to the display location and the optical combiner location. This can be used to achieve an effect of depth perception.
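How an encoder reading and two viewing components might feed the rendering can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed design: the eyepiece is assumed to slide along the viewing (z) axis, the image plane is taken as z = 0, the eyes sit at z > 0, and the interpupillary distance is an illustrative parameter. The function names are hypothetical.

```python
def eye_positions(nominal_center, encoder_offset_mm, ipd_mm):
    """Left and right viewpoints for an eyepiece adjusted along the z axis.

    nominal_center: calibrated midpoint between the two viewing components.
    encoder_offset_mm: displacement reported by the (assumed) linear encoder.
    ipd_mm: assumed spacing between the two viewing components.
    """
    cx, cy, cz = nominal_center
    cz = cz + encoder_offset_mm
    half = ipd_mm / 2.0
    return (cx - half, cy, cz), (cx + half, cy, cz)

def project_to_image_plane(eye, point):
    """Intersect the ray from eye to point with the image plane z = 0."""
    ex, ey, ez = eye
    px, py, pz = point
    if ez == pz:
        raise ValueError("ray is parallel to the image plane")
    t = ez / (ez - pz)
    return (ex + t * (px - ex), ey + t * (py - ey))

def stereo_image_points(nominal_center, encoder_offset_mm, ipd_mm, point):
    """Image-plane coordinates of one 3D point for the left and right eye."""
    left_eye, right_eye = eye_positions(nominal_center, encoder_offset_mm, ipd_mm)
    return (project_to_image_plane(left_eye, point),
            project_to_image_plane(right_eye, point))
```

With the eyepiece midpoint at (0, 0, 400), a 60 mm spacing and a point 100 mm behind the image plane at (0, 0, -100), the two eyes see the point at x = -6 and x = +6 on the image plane. The horizontal offset between the two drawn positions is what produces the depth effect, and it changes when the encoder reports a new eyepiece position.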
Preferably the display, the optical combiner, at least a portion of the tracking system and the eyepiece are housed in a single unit (referred to sometimes herein as a “box”, although each component need not be within an enclosed space). This provides fixed distances and positioning of the user with respect to the display and optical combiner, thereby eliminating a need to track the user's position and orientation. This can also simplify calibration and provide a less cumbersome device.
Numerous types of information describing the objects may be displayed. For example, a rendering of a 3D surface of an object may be superimposed on the object. Further examples include surgical plans, object trajectories, such as that of a medical tool.
Real-time input to the device may be represented in various ways. For example, if the device is following a surgical tool with a targeted location, the color of the tool or its trajectory can be shown to change, thereby indicating the distance to the targeted location. Displayed information may also be a graphical representation of real-time data. The displayed information may either be real-time information, such as may be obtained by an ultrasound probe, or stored information such as from an x-ray or CAT scan.
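The distance-dependent color change mentioned above could be realized in many ways. One minimal sketch, assuming a linear red-to-green ramp and illustrative thresholds (`near_mm` and `far_mm` are hypothetical parameters, not values from the specification), is:

```python
import math

def distance_color(tool_tip, target, near_mm=2.0, far_mm=50.0):
    """Map the tip-to-target distance to an (R, G, B) color.

    At or inside near_mm the color is pure green; at or beyond far_mm it
    is pure red; in between the two channels are blended linearly.
    """
    if far_mm <= near_mm:
        raise ValueError("far_mm must be greater than near_mm")
    distance = math.dist(tool_tip, target)
    fraction = (distance - near_mm) / (far_mm - near_mm)
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to [0, 1]
    red = round(255 * fraction)
    green = round(255 * (1.0 - fraction))
    return (red, green, 0)
```

The returned color would then be applied to the rendered tool or its trajectory on each update of the tracking data.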
In an exemplary embodiment of the invention, the optical combiner is a partially reflective mirror. A partially reflective mirror is any surface that is partially transmissive and partially reflective. The transmission rates are dependent, at least in part on lighting conditions. Readily available 40/60 glass can be used, for example, meaning the glass provides 40% transmission and 60% reflectivity. An operating room environment typically has very bright lights, in which case a higher portion of reflectivity is desirable, such as 10/90. The optical combiner need not be glass, but can be a synthetic material, provided it can transmit and reflect the desired amount of light. The optical combiner may include treatment to absorb, transmit and/or reflect different wavelengths of light differently.
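The effect of the transmission/reflectivity split can be illustrated with simple arithmetic. The sketch below assumes an idealized combiner with no absorption and treats luminance values as plain numbers in arbitrary but consistent units; the function name and the example luminances are illustrative assumptions only.

```python
def combined_view(scene_luminance, display_luminance, transmission, reflectivity):
    """Luminance reaching the eye from the scene and from the display.

    Returns (seen_scene, seen_overlay, overlay_to_scene_ratio).
    transmission and reflectivity are fractions, e.g. 0.40 and 0.60.
    """
    if transmission <= 0 or reflectivity < 0 or transmission + reflectivity > 1.0 + 1e-9:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    seen_scene = transmission * scene_luminance
    seen_overlay = reflectivity * display_luminance
    return seen_scene, seen_overlay, seen_overlay / seen_scene
```

With a hypothetical bright scene of 1000 units and a display of 300 units, 40/60 glass gives an overlay-to-scene ratio of about 0.45, whereas 10/90 glass gives about 2.7, which is consistent with the preference for higher reflectivity under bright operating room lights.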
The information presented by the display may be an image created, for example, by an ultrasound, CAT scan, MRI, PET, cine-CT or x-ray device. The imaging device may be included as an element of the invention. Other types of information include, but are not limited to, surgical plans, information on the proximity of a medical tool to a targeted point, and various other information. The information may be stored and used at a later time, or may be a real-time image. In an exemplary embodiment of the invention, the image is a 3D model rendering created from a series of 2D images. Information obtained from tracking the real-world object is used to align the 3D image with the real world view.
The device may be hand held or mounted on a stationary or moveable support. In a preferred embodiment of the invention, the device is mounted on a support, such as a mechanical or electromechanical arm that is adjustable in at least one linear direction, i.e., the X, Y or Z direction. More preferably, the support provides both linear and angular adjustability. In an exemplary embodiment of the invention, the support mechanism is a boom-type structure. The support may be attached to any stationary object. This may include, for example, a wall, floor, ceiling or operating table. A movable support can have sensors for tracking. Illustrative support systems are shown in FIGS. 7A-C.
The key in the embodiments depicted in
Numerous types of tracking systems may be used. Any system that can effectively locate a tracked item and is compatible with the system or procedure for which it is used can serve as a tracking device. Examples of tracking devices include optical, mechanical, magnetic, electromagnetic, acoustic or a combination thereof. Systems may be active, passive or inertial, or a combination thereof. For example, a tracking system may include a marker that either reflects or emits signals.
Numerous display types are within the scope of the invention. In an exemplary embodiment an autostereoscopic liquid crystal display is used, such as a Sharp LL-151D or DTL 2018XLC. To properly orient images and views on a display it may be necessary to reverse, flip, rotate, translate and/or scale the images and views. This can be accomplished through optics and/or software manipulation.
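As one example of the software manipulation mentioned above, an image viewed by reflection in a single mirror appears reversed, so it can be pre-reversed before display. The sketch below assumes an image stored as a list of rows of pixel values; the function names are hypothetical, and which flip is needed depends on how the mirror is oriented in a given device.

```python
def flip_horizontal(image):
    """Reverse each row (left-right mirror of the image)."""
    return [list(reversed(row)) for row in image]

def flip_vertical(image):
    """Reverse the order of the rows (top-bottom mirror of the image)."""
    return [list(row) for row in reversed(image)]

def rotate_180(image):
    """Rotate by 180 degrees, equivalent to applying both flips."""
    return flip_vertical(flip_horizontal(image))
```

Applying the same flip twice restores the original image, which is why a single software flip cancels the reversal introduced by a single reflection.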
A preferred embodiment of the invention utilizes an autostereoscopic display, and uses the eyepieces to locate the user at the required user viewer position. FIGS. 5A-C depict stereoscopic systems according to illustrative embodiments of the invention.
Tracking is performed in a manner similar to that of a mono-image display system. Ultrasound probe 522 has a tracking marker 508 on it. Arrow 520 represents tracking information going from tracking marker 508 to tracking sensors and tracking base reference object 524. Arrow 526 represents the information being gathered from the sensors and base reference 524 being sent to a processor 530. Arrow 540 represents the information from the ultrasound unit 522 being sent to processor 530. Processor 530 combines information from marker 508 and ultrasound probe 522. Arrow 534 represents the properly aligned data being sent from processor 530 to display portions 504A, 504B.
As shown in FIGS. 5A-C, stereoscopic systems can have many different configurations. A single display can be partitioned to accommodate two different images. Two displays can be used, each having a different image. A single display can also have interlaced images, such as alternating columns of pixels wherein odd columns would correspond to a first image that would be conveyed to a user's first eye, and even columns would correspond to a second image that would be conveyed to the user's second eye. Such a configuration would require special polarization or optics to ensure that the proper images reach each eye.
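The column-interlaced configuration can be sketched as follows, assuming both images are lists of rows of equal size and that columns are counted from one, so that the first, third, fifth, ... columns come from the first image. The function name is hypothetical.

```python
def interlace_columns(first_image, second_image):
    """Build one frame whose odd columns (1st, 3rd, ...) come from
    first_image and whose even columns (2nd, 4th, ...) come from
    second_image."""
    if len(first_image) != len(second_image):
        raise ValueError("images must have the same number of rows")
    frame = []
    for row_a, row_b in zip(first_image, second_image):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same number of columns")
        # Index 0 is the 1st (odd) column, index 1 the 2nd (even) column, etc.
        frame.append([a if index % 2 == 0 else b
                      for index, (a, b) in enumerate(zip(row_a, row_b))])
    return frame
```

Each source image contributes only half of its columns to the interlaced frame, so the horizontal resolution delivered to each eye is halved in this arrangement.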
In a further embodiment of the invention, an augmented image can be created using a first and second set of displayed information and a real world view. The first set of displayed information is seen through a first eye piece viewing component on a first display. The second set of displayed information is seen on a second display through the second eye piece viewing component. The two sets of information are displayed in succession.
For some applications it is preferable to have the display in wireless communication with the processing unit. It may also be desirable to have the tracking system in wireless communication with the processing unit, or to have both the display and the tracking system communicate wirelessly.
In a further illustrative embodiment of the invention, the image overlay can highlight or outline objects in a field. This can be accomplished with appropriate mirrors and filters. For example, certain wavelengths of invisible light could be transmitted/reflected (such as “near-infrared”, which is about 800 nm) and certain wavelengths could be restricted (such as ultraviolet and far-infrared). In embodiments similar to the infrared examples, a camera can be positioned to have the same view as the eyepiece; the image from that camera is then processed and shown on the display. In the infrared example, a filter is used to image only the infrared light in the scene; the infrared image is then processed and changed to a visible light image via the display, thereby augmenting the true scene with additional infrared information.
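The processing step that turns a filtered infrared frame into a visible overlay is not specified in detail. One minimal sketch, assuming an 8-bit grayscale infrared frame stored as a list of rows and an illustrative threshold, renders strong infrared responses in green and leaves everything else black. In an optical overlay, black display pixels reflect no light to the viewer, so those regions of the real scene are seen unaltered. The function name and threshold value are hypothetical.

```python
def infrared_to_visible(ir_frame, threshold=100):
    """Convert an 8-bit infrared frame to an RGB overlay frame.

    Pixels at or above threshold become green with brightness equal to
    the infrared intensity; pixels below threshold become black.
    """
    overlay = []
    for row in ir_frame:
        out_row = []
        for value in row:
            if not 0 <= value <= 255:
                raise ValueError("expected 8-bit intensities")
            out_row.append((0, value, 0) if value >= threshold else (0, 0, 0))
        overlay.append(out_row)
    return overlay
```

The resulting frame would still need whatever alignment the device applies to any other displayed information before it is shown.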
In yet another embodiment of the invention a plurality of cameras is used to process the visible/invisible light images, and is also used as part of the tracking system. The cameras can sense a tracking signal such as an infrared LED emitting from the trackers. Therefore, the cameras are simultaneously used for stereo visualization of a vascular infrared image and for tracking of infrared LEDs. A video based tracking system could be implemented in this manner if the system is using visible light.
Information from U.S. Pat. No. 6,753,828 is incorporated by reference as the disclosed information relates to use in the present invention.
The invention, as described above, may be embodied in a variety of ways, for example, as a system, method, device, etc.
While the invention has been described by illustrative embodiments, additional advantages and modifications will occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to specific details shown and described herein. Modifications, for example, to the type of tracking system, method or device used to create object images and precise layout of device components may be made without departing from the spirit and scope of the invention. Accordingly, it is intended that the invention not be limited to the specific illustrative embodiments, but be interpreted within the full spirit and scope of the detailed description and the appended claims and their equivalents.
This application is based on, and claims priority to, provisional application having Ser. No. 60/651,020, and a filing date of Feb. 8, 2005, entitled Image Overlay Device and Method.
Number | Date | Country
---|---|---
60651020 | Feb 2005 | US