The present technology is generally related to mediated-reality surgical visualization and associated systems and methods. In particular, several embodiments are directed to head-mounted displays configured to provide mediated-reality output to a wearer for use in surgical applications.
The history of surgical loupes dates back to 1876. Surgical loupes are commonly used in neurosurgery, plastic surgery, cardiac surgery, orthopedic surgery, and microvascular surgery. Despite revolutionary change in virtually every other point of interaction between surgeon and patient, the state of the art of surgical visual aids has remained largely unchanged since their inception. Traditional surgical loupes, for example, are mounted in the lenses of glasses and are custom made for the individual surgeon, taking into account the surgeon's corrected vision, interpupillary distance, and a desired focal distance. The most important function of traditional surgical loupes is their ability to magnify the operative field and empower the surgeon to perform maneuvers at a higher level of precision than would otherwise be possible.
Traditional surgical loupes suffer from a number of drawbacks. They are customized for each individual surgeon, based on the surgeon's corrective vision requirements and interpupillary distance, and so cannot be shared among surgeons. Traditional surgical loupes are also restricted to a single level of magnification, forcing the surgeon to adapt all of her actions to that level of magnification, or to frequently look “outside” the loupes at odd angles to perform actions where magnification is unhelpful or even detrimental. Traditional loupes provide a sharp image only within a very shallow depth of field, while also offering a relatively narrow field of view. Blind spots are another problem, due to the bulky construction of traditional surgical loupes.
The present technology is directed to systems and methods for providing mediated-reality surgical visualization. In one embodiment, for example, a head-mounted display assembly can include a stereoscopic display device configured to display a three-dimensional image to a user wearing the assembly. An imaging device can be coupled to the head-mounted display assembly and configured to capture images to be displayed to the user. Additional image data from other imagers can be incorporated or synthesized into the display. As used herein, the term “mediated-reality” refers to the ability to add to, subtract from, or otherwise manipulate the perception of reality through the use of a wearable display. Mediated-reality displays include at least “virtual reality” and “augmented reality” type displays.
Specific details of several embodiments of the present technology are described below with reference to the accompanying figures.
For ease of reference, throughout this disclosure identical reference numbers are used to identify similar or analogous components or features, but the use of the same reference number does not imply that the parts should be construed to be identical. Indeed, in many examples described herein, the identically numbered parts are distinct in structure and/or function.
In the illustrated embodiment, the frame 103 is formed generally similar to standard eyewear, with orbitals joined by a bridge and temple arms extending rearwardly to engage a wearer's ears. In other embodiments, the frame 103 can assume other forms; for example, a strap can replace the temple arms or, in some embodiments, a partial helmet can be used to mount the assembly 100 to a wearer's head. The frame 103 includes a right-eye portion 104a and a left-eye portion 104b. When worn by a user, the right-eye portion 104a is configured to generally be positioned over a user's right eye, while the left-eye portion 104b is configured to generally be positioned over a user's left eye. The assembly 100 can generally be opaque, such that a user wearing the assembly 100 will be unable to see through the frame 103. In other embodiments, however, the assembly 100 can be transparent or semitransparent, so that a user can see through the frame 103 while wearing the assembly 100. The assembly 100 can be configured to be worn over a user's standard eyeglasses. The assembly 100 can include tempered glass or other sufficiently sturdy material to meet OSHA regulations for eye protection in the surgical operating room.
The imaging device 101 includes a first imager 113a and a second imager 113b. The first and second imagers 113a-b can be, for example, digital video cameras such as CCD or CMOS image sensor and associated optics. In some embodiments, each of the imagers 113a-b can include an array of cameras having different optics (e.g., differing magnification factors). The particular camera of the array can be selected for active viewing based on the user's desired viewing parameters. In some embodiments, intermediate zoom levels between those provided by the separate cameras themselves can be computed. For example, if a zoom level of 4.0 is desired, an image captured from a 4.6 magnification camera can be down-sampled to provide a new, smaller image with this level of magnification. However, now this image may not fill the entire field of view of the camera. An image from a lower magnification camera (e.g., a 3.3 magnification image) has a wider field of view, and may be up-sampled to fill in the outer portions of the desired 4.0 magnification image. In another embodiment, features from a first camera (such as a 3.3 magnification camera) may be matched with features from the second camera (e.g., a 4.6 magnification camera). To perform the matching, features such as SIFT or SURF may be used. With features from different images matched, the different images captured with different levels of magnification can be combined more effectively and in a fashion that introduces less distortion and error. In another embodiment, each camera may be equipped with a lenslet array between the image sensor and the main lens. This lenslet array allows capture of “light fields,” from which images with different focus planes and different viewpoints (parallax) can be computed. Using light field parallax adjustment techniques, differences in image point of view between the various cameras can be compensated away, so that as the zoom level changes, the point of view does not. In another embodiment, so-called “origami lenses,” or annular folded optics, can be used to provide high magnification with low weight and volume.
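By way of a non-limiting illustration, the sketch below shows one way an intermediate zoom level (e.g., 4.0) could be synthesized from a 4.6-magnification image and a 3.3-magnification image as described above. The function name and the assumption that the two images share an optical center and pixel dimensions are illustrative only; in practice, the feature matching or light-field parallax compensation discussed above would be used to align the images before compositing.

```python
import cv2
import numpy as np

def synthesize_zoom(img_hi, mag_hi, img_lo, mag_lo, mag_target):
    """Approximate a view at mag_target (mag_lo < mag_target < mag_hi) by blending
    a down-sampled high-magnification image into an up-sampled low-magnification
    image. Both inputs are assumed pre-aligned and of identical pixel size."""
    h, w = img_hi.shape[:2]

    # Down-sample the high-magnification image so its content appears at mag_target.
    scale_hi = mag_target / mag_hi            # e.g. 4.0 / 4.6 < 1
    hi_small = cv2.resize(img_hi, None, fx=scale_hi, fy=scale_hi,
                          interpolation=cv2.INTER_AREA)

    # Up-sample the low-magnification image so its content also appears at mag_target.
    scale_lo = mag_target / mag_lo            # e.g. 4.0 / 3.3 > 1
    lo_big = cv2.resize(img_lo, None, fx=scale_lo, fy=scale_lo,
                        interpolation=cv2.INTER_CUBIC)

    # Center-crop the up-sampled low-magnification image back to the output frame size.
    y0 = (lo_big.shape[0] - h) // 2
    x0 = (lo_big.shape[1] - w) // 2
    out = lo_big[y0:y0 + h, x0:x0 + w].copy()

    # Paste the sharper, down-sampled high-magnification content into the center;
    # the up-sampled low-magnification image fills the outer field of view.
    yh = (h - hi_small.shape[0]) // 2
    xh = (w - hi_small.shape[1]) // 2
    out[yh:yh + hi_small.shape[0], xh:xh + hi_small.shape[1]] = hi_small
    return out
```

In a practical implementation the seam between the two sources would typically be feathered or blended, and the matched features (e.g., SIFT or SURF, as noted above) could refine the alignment before compositing.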
In some embodiments, the first and second imagers 113a-b can include one or more plenoptic cameras (also referred to as light field cameras). For example, instead of multiple lenses with different degrees of magnification, a plenoptic camera alone may be used for each imager. The first and second imagers 113a-b can each include a single plenoptic camera: a lens, a lenslet array, and an image sensor. By sampling the light field appropriately, images with varying degrees of magnification can be extracted. In some embodiments, a single plenoptic camera can be utilized to simulate two separate imagers from within the plenoptic camera. The use of plenoptic cameras is described in more detail below.
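As a non-limiting sketch of how a raw image captured behind a lenslet array might be rearranged into sub-aperture views and refocused, consider the following; the block size n, the array shapes, and the function names are assumptions made for the example rather than details of the imagers 113a-b.

```python
import numpy as np

def to_subaperture_views(raw, n):
    """Rearrange a raw lenslet image of shape (H*n, W*n) into an (n, n, H, W)
    stack of sub-aperture views, one per pixel position under each lenslet."""
    H = raw.shape[0] // n
    W = raw.shape[1] // n
    lf = raw[:H * n, :W * n].reshape(H, n, W, n)
    # Axes: (lenslet row, pixel-under-lenslet row, lenslet col, pixel-under-lenslet col)
    return lf.transpose(1, 3, 0, 2)               # -> (n, n, H, W)

def refocus(views, shift):
    """Shift-and-add refocusing: translate each sub-aperture view in proportion
    to its offset from the central view, then average the stack."""
    n = views.shape[0]
    c = n // 2
    acc = np.zeros(views.shape[2:], dtype=np.float64)
    for u in range(n):
        for v in range(n):
            dy = int(round(shift * (u - c)))
            dx = int(round(shift * (v - c)))
            acc += np.roll(views[u, v], (dy, dx), axis=(0, 1))
    return acc / (n * n)
```

A stereo pair can likewise be approximated by selecting sub-aperture views from opposite sides of the aperture, which is one way a single plenoptic camera could simulate two separate imagers.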
The first imager 113a is disposed over the right-eye portion 104a of the frame 103, while the second imager 113b is disposed over the left-eye portion 104b of the frame 103. The first and second imagers 113a-b are oriented forwardly such that when the assembly 100 is worn by a user, the first and second imagers 113a-b can capture video in the natural field of view of the user. For example, given a user's head position when wearing the assembly 100, she would naturally have a certain field of view when her eyes are looking straight ahead. The first and second imagers 113a-b can be oriented so as to capture this field of view or a similar field of view when the user dons the assembly 100. In other embodiments, the first and second imagers 113a-b can be oriented to capture a modified field of view. For example, when a user wearing the assembly 100 rests in a neutral position, the imagers 113a-b may be configured to capture a downwardly oriented field of view.
The first and second imagers 113a-b can be electrically coupled to first and second control electronics 115a-b, respectively. The control electronics 115a-b can include, for example, a microprocessor chip or other suitable electronics for receiving data output from and providing control input to the first and second imagers 113a-b. The control electronics 115a-b can also be configured to provide wired or wireless communication over a network with other components, as described in more detail below.
A fiducial marker 117 can be disposed over the forward surface 105 of the frame 103. The fiducial marker 117 can be used for motion tracking of the assembly 100. In some embodiments, for example, the fiducial marker 117 can be one or more infrared light sources that are detected by an infrared-light camera system. In other embodiments, the fiducial marker 117 can be a magnetic or electromagnetic probe, a reflective element, or any other component that can be used to track the position of the assembly 100 in space. The fiducial marker 117 can include or be coupled to an internal compass and/or accelerometer for tracking movement and orientation of the assembly 100.
On the rearward surface 107 of the frame 103, a display device 109 is disposed and faces rearwardly. The display device 109 includes a first display 119a and a second display 119b.
The first and second displays 119a-b can be electrically coupled to the first and second control electronics 115a-b, respectively. The control electronics 115a-b can be configured to provide input to and to control operation of the displays 119a-b. The control electronics 115a-b can be configured to provide a display input to the displays 119a-b, for example, processed image data that has been obtained from the imagers 113a-b. For example, in one embodiment, image data from the first imager 113a is communicated to the first display 119a via the first control electronics 115a, and similarly, image data from the second imager 113b is communicated to the second display 119b via the second control electronics 115b. Depending on the position and configuration of the imagers 113a-b and the displays 119a-b, the user can be presented with a stereoscopic image that mimics what the user would see without wearing the assembly 100. In some embodiments, the image data obtained from the imagers 113a-b can be processed, for example, digitally zoomed, so that the user is presented with a zoomed view via the displays 119a-b.
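The pass-through path just described can be summarized in a short sketch: each imager's frame is optionally digitally zoomed (center-cropped and resized) before being presented on the corresponding display. The imager and display interfaces shown are hypothetical placeholders and not part of the control electronics 115a-b.

```python
import cv2

def digital_zoom(frame, zoom):
    """Center-crop the frame by a factor of 1/zoom (zoom >= 1) and resize the
    crop back to the original resolution."""
    h, w = frame.shape[:2]
    ch, cw = int(h / zoom), int(w / zoom)
    y0, x0 = (h - ch) // 2, (w - cw) // 2
    crop = frame[y0:y0 + ch, x0:x0 + cw]
    return cv2.resize(crop, (w, h), interpolation=cv2.INTER_LINEAR)

def render_stereo_frame(right_imager, left_imager, right_display, left_display, zoom=1.0):
    # Hypothetical imager/display interfaces: read() returns a frame, show() presents it.
    right_display.show(digital_zoom(right_imager.read(), zoom))
    left_display.show(digital_zoom(left_imager.read(), zoom))
```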
First and second eye trackers 121a-b are disposed over the rearward surface 107 of the frame 103, adjacent to the first and second displays 119a-b. The first eye tracker 121a can be positioned within the right-eye portion 104a of the frame 103, and can be oriented and configured to track the movement of a user's right eye while a user wears the assembly 100. Similarly, the second eye tracker 121b can be positioned within the left-eye portion 104b of the frame 103, and can be oriented and configured to track the movement of a user's left eye while a user wears the assembly 100. The first and second eye trackers 121a-b can be configured to determine movement of a user's eyes and can communicate electronically with the control electronics 115a-b. In some embodiments, the user's eye movement can be used to provide input control to the control electronics 115a-b. For example, a visual menu can be overlaid over a portion of the image displayed to the user via the displays 119a-b. A user can indicate selection of an item from the menu by focusing her eyes on that item. Eye trackers 121a-b can determine the item that the user is focusing on, and can provide this indication of item selection to the control electronics 115a-b. For example, this feature allows a user to control the level of zoom applied to particular images. In some embodiments, a microphone or physical button(s) can be present on the assembly 100, and can receive user input either via spoken commands or physical contact with buttons. In other embodiments other forms of input can be used, such as gesture recognition via the imagers 113a-b, assistant control, etc.
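One possible dwell-based selection scheme consistent with the eye-tracking menu described above is sketched below; the gaze-point interface, dwell time, and timeout are illustrative assumptions rather than required behavior of the eye trackers 121a-b.

```python
import time

def select_menu_item(eye_tracker, menu_items, dwell_seconds=1.0, timeout_seconds=10.0):
    """menu_items: list of (label, (x0, y0, x1, y1)) regions in display pixels.
    Returns the label the user fixates on, or None if no selection is made in time."""
    current, dwell_start = None, None
    t0 = time.monotonic()
    while time.monotonic() - t0 < timeout_seconds:
        gx, gy = eye_tracker.gaze_point()           # assumed eye-tracker API
        hit = next((label for label, (x0, y0, x1, y1) in menu_items
                    if x0 <= gx <= x1 and y0 <= gy <= y1), None)
        if hit != current:                          # gaze moved to a new item (or off the menu)
            current, dwell_start = hit, time.monotonic()
        elif hit is not None and time.monotonic() - dwell_start >= dwell_seconds:
            return hit                              # sustained fixation counts as a selection
    return None
```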
The technology described herein may be applied to endoscope systems. For example, rather than mounting the multiple cameras (with different field of view/magnification combinations) on the user's forehead, the multiple cameras may be mounted on the tip of the endoscopic instrument. Alternatively, a single main lens plus a lenslet array may be mounted on the tip of the endoscopic instrument. Then light field rendering techniques such as refocusing, rendering stereo images from two different perspectives, or zooming may be applied. In such cases, the collected images may be displayed through the wearable head-mounted display assembly 100.
A computing component 207 includes a plurality of modules for interacting with the other components via communication link 201. The computing component 207 includes, for example, a display module 209, a motion tracking module 211, a registration module 213, and an image capture module 215. In some embodiments, the computing component 207 can include a processor such as a CPU which can perform operations in accordance with computer-executable instructions stored on a computer-readable medium. In some embodiments, the display module, motion tracking module, registration module, and image capture module may each be implemented in separate computing devices each having a processor configured to perform operations. In some embodiments, two or more of these modules can be contained in a single computing device. The computing component 207 is also in communication with a database 217.
The display module 209 can be configured to provide display output information to the assembly 100 for presentation to the user via the display device 109. As noted above, this can include stereoscopic display, in which different images are provided to each eye via the first and second displays 119a-b.
The motion tracking module 211 can be configured to determine the position and orientation of the assembly 100 as well as any additional imagers 205, with respect to the surgical site. As noted above, the tracker 203 can track the position of the assembly 100 and additional imagers 205 optically or via other techniques. This position and orientation data can be used to provide appropriate display output via display module 209.
The registration module 213 can be configured to register all image data in the surgical frame. For example, position and orientation data for the assembly 100 and additional imagers 205 can be received from the motion tracking module 211. Additional image data, for example, pre-operative images, can be received from the database 217 or from another source. The additional image data (e.g., X-ray, MRI, CT, fluoroscopy, anatomical diagrams, etc.) will typically not have been recorded from the perspective of either the assembly 100 or of any of the additional imagers 205. As a result, the supplemental image data must be processed and manipulated to be presented to the user via display device 109 of the assembly 100 with the appropriate perspective. The registration module 213 can register the supplemental image data in the surgical frame of reference by comparing anatomical or artificial fiducial markers as detected in the pre-operative images and those same anatomical or artificial fiducial markers as detected by the surgical navigation system, the assembly 100, or other additional imagers 205.
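As one non-limiting example of how such a registration could be computed, a rigid transform between the pre-operative frame and the surgical frame can be estimated from corresponding fiducial locations by a standard least-squares (Kabsch) fit; the sketch below is illustrative and does not represent a required implementation of the registration module 213.

```python
import numpy as np

def rigid_register(points_preop, points_surgical):
    """Both inputs are (N, 3) arrays of corresponding fiducial coordinates.
    Returns R (3x3) and t (3,) such that R @ p_preop + t approximates p_surgical."""
    a = np.asarray(points_preop, dtype=np.float64)
    b = np.asarray(points_surgical, dtype=np.float64)
    ca, cb = a.mean(axis=0), b.mean(axis=0)         # centroids of each point set
    H = (a - ca).T @ (b - cb)                       # cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cb - R @ ca
    return R, t
```

The resulting transform can then be applied to the supplemental image data (e.g., X-ray, MRI, or CT volumes) so it can be rendered from the perspective of the assembly 100 or the additional imagers 205.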
The image capture module 215 can be configured to capture image data from the imaging device 101 of the assembly 100 and also from any additional imagers 205. The images captured can include continuous streaming video and/or still images. In some embodiments, the imaging device 101 and/or one or more of the additional imagers 205 can be plenoptic cameras, in which case the image capture module 215 can be configured to receive the light field data and to process the data to render particular images. Such image processing for plenoptic cameras is described in more detail below.
While the surgeon 301 is operating, images captured via the imaging device 101 of the assembly 100 are processed and displayed stereoscopically to the surgeon via the integrated display device 109.
The assembly 100 may respond to voice commands or even track the surgeon's eyes—thus enabling the surgeon 301 to switch between feeds and tweak the level of magnification being employed. A heads-up display with the patient's vital signs (EKG, EEG, SSEPs, MEPs), imaging (CT, MRI, etc.), and any other information the surgeon desires may scroll at the surgeon's request, eliminating the need to interrupt the flow of the operation to assess external monitors or query the anesthesia team. Wireless networking may infuse the assembly 100 with the ability to communicate with processors (e.g., the computing component 207) that can augment the visual work environment for the surgeon with everything from simple tools like autofocus to fluorescence video angiography and tumor “paint.” The assembly 100 can replace the need for expensive surgical microscopes and even the remote robotic workstations of the near future—presenting an economical alternative to the current system of “bespoke” glass loupes used in conjunction with microscopes and endoscopes.
The head-mounted display assembly 100 can aggregate multiple streams of visual information and send them not only to the surgeon for visualization, but also to remote processing power (e.g., the computing component 207).
In some embodiments, the data recorded from the imaging device 101 and other imagers can be used to later generate different viewpoints and visualizations of the surgical site. For example, for later playback of the recorded data, an image having a different magnification, different integration of additional image data, and/or a different point of view can be generated. This can be particularly useful for review of the procedure or for training purposes.
The use of plenoptic cameras can also allow the system to reduce perceived latency as the assembly moves and captures a new field of view. Plenoptic cameras can capture and transmit information to form a spatial buffer around each virtual camera. During movement, the local virtual cameras can be moved into the spatial buffer regions without waiting for remote sensing to receive commands, physically move to the desired location, and send new image data. As a result, the physical scene objects captured by the moved virtual cameras will have some latency, but the viewpoint latency can be significantly reduced.
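A simplified sketch of that buffering behavior is shown below: small viewpoint changes are rendered locally from the cached light field, and only moves beyond the buffer wait on fresh remote data. The buffer radius, callback names, and data layout are assumptions made for the example.

```python
import numpy as np

BUFFER_RADIUS_MM = 20.0   # assumed spatial extent of the cached light-field buffer

def render_viewpoint(light_field, buffer_center, head_position, render_fn, request_fn):
    """Render a virtual-camera view for head_position. render_fn(light_field, offset)
    synthesizes a view offset from the buffer center; request_fn(position) fetches a
    new light field from the remote imagers (higher latency)."""
    offset = np.asarray(head_position, dtype=float) - np.asarray(buffer_center, dtype=float)
    if np.linalg.norm(offset) <= BUFFER_RADIUS_MM:
        # Inside the spatial buffer: hide viewpoint latency by re-rendering locally.
        return render_fn(light_field, offset)
    # Outside the buffer: new scene content must come from a fresh remote capture.
    return render_fn(request_fn(head_position), np.zeros(3))
```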
Enlarged volumes can be fixed to a position in space, rather than to a particular angular area of the view. For example, a tumor or other portion of the surgical site can be enlarged, and as the user moves her head while wearing the head-mounted display assembly 100, the image can be manipulated such that the area of enlargement remains fixed to the physical location of the tumor. In some embodiments, the regions “behind” the enlarged area can be rendered transparently so that the user can still perceive the area that is being obscured by the enlargement of the area of interest.
In some embodiments, the enlarged volume does not need to be rendered at its physical location, but rather can be positioned independently from the captured volume. For example, the enlarged view can be rendered closer to the surgeon and at a different angle. In some embodiments, the position of external tools can be tracked for input. For example, the tip of a scalpel or other surgical tool can be tracked (e.g., using the tracker 203), and the enlarged volume can be located at the tip of the scalpel or other surgical tool. In some embodiments, the surgical tool can include haptic feedback or physical controls for the system or other surgical systems. In situations in which surgical tools are controlled electronically or electromechanically (e.g., during telesurgery where the tools are controlled with a surgical robot), the controls for those tools can be modified depending on the visualization mode. For example, when the tool is disposed inside the physical volume to be visually transformed (e.g., enlarged), the controls for the tool can be modified to compensate for the visual scaling, rotation, etc. This allows for the controls to remain the same inside the visually transformed view and the surrounding view. This modification of the tool control can aid surgeons during remote operation to better control the tools even as visualization of the tools and the surgical site are modified.
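The control compensation mentioned above can be illustrated with a short sketch in which a commanded tool translation is scaled down whenever the tracked tool tip lies inside the magnified volume, so that the tool's apparent on-screen motion matches the surgeon's hand motion both inside and outside the enlarged region. The spherical region, parameter names, and uniform scaling are illustrative assumptions.

```python
import numpy as np

def compensate_command(delta_hand, tool_tip, region_center, region_radius, magnification):
    """delta_hand: commanded translation (3,) in the surgeon's hand frame.
    Returns the translation actually sent to the electromechanically controlled tool."""
    inside = np.linalg.norm(np.asarray(tool_tip, dtype=float)
                            - np.asarray(region_center, dtype=float)) <= region_radius
    # Inside the visually enlarged volume, divide motion by the magnification so the
    # rendered tool motion keeps the same hand-to-screen ratio as outside the volume.
    scale = 1.0 / magnification if inside else 1.0
    return np.asarray(delta_hand, dtype=float) * scale
```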
Information from additional cameras in the environment located close to points of interest can be fused with images from the imagers coupled to the head-mounted display, thereby improving the ability to enlarge regions of interest. Depth information can be obtained from a depth sensor and used to bring the entirety of the scene into focus by co-locating the focal plane with the physical geometry of the scene. As with other mediated-reality content, additional data can be rendered and visualized within the environment. The use of light fields can allow for viewing around occlusions and can remove specular reflections. In some embodiments, processing of light fields can also be used to increase the contrast between tissue types.
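As a sketch of the depth-guided focusing idea, a per-pixel depth map can be used to assemble an all-in-focus image from a small stack of refocused light-field slices, with each pixel taken from the slice whose focal plane is nearest that pixel's depth. The depth-to-shift mapping and the refocus helper (e.g., the shift-and-add sketch earlier) are assumptions made for the example.

```python
import numpy as np

def all_in_focus(views, depth_map, depths, shifts, refocus_fn):
    """views: sub-aperture stack (n, n, H, W); depth_map: (H, W) scene depths;
    depths/shifts: matching sequences pairing a scene depth with the refocus shift
    that brings that depth into focus; refocus_fn: e.g. the refocus() sketch above."""
    # Render one refocused slice per candidate focal plane: shape (K, H, W).
    stack = np.stack([refocus_fn(views, s) for s in shifts])
    # For each pixel, pick the slice whose focal depth is closest to the measured depth.
    idx = np.abs(depth_map[None, :, :] - np.asarray(depths)[:, None, None]).argmin(axis=0)
    return np.take_along_axis(stack, idx[None, :, :], axis=0)[0]
```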
Although several embodiments described herein are directed to mediated-reality visualization systems for surgical applications, other uses of such systems are possible. For example, a mediated-reality visualization system including a head-mounted display assembly with an integrated display device and an integrated image capture device can be used in construction, manufacturing, the service industry, gaming, entertainment, and a variety of other contexts.
1. A mediated-reality surgical visualization system, comprising: an opaque, head-mounted display assembly comprising:
a computing device in communication with the stereoscopic display device and the image capture device, the computing device configured to:
2. The mediated-reality surgical visualization system of example 1 wherein the head-mounted display assembly comprises a frame having a right-eye portion and a left-eye portion, and wherein the first display is disposed within the right-eye portion, and wherein the second display is disposed within the left-eye portion.
3. The mediated-reality surgical visualization system of any one of examples 1-2 wherein the head-mounted display assembly comprises a frame having a right-eye portion and a left-eye portion, and wherein the first imager is disposed over the right-eye portion, and wherein the second imager is disposed over the left-eye portion.
4. The mediated-reality surgical visualization system of any one of examples 1-3 wherein the first and second imagers comprise plenoptic cameras.
5. The mediated-reality surgical visualization system of any one of examples 1-4 wherein the first and second imagers comprise separate regions of a single plenoptic camera.
6. The mediated-reality surgical visualization system of any one of examples 1-5, further comprising a third imager.
7. The mediated-reality surgical visualization system of example 6 wherein the third imager comprises a camera separate from the head-mounted display and configured to be disposed about the surgical field.
8. The mediated-reality surgical visualization system of any one of examples 1-7, further comprising a motion-tracking component.
9. The mediated-reality surgical visualization system of example 8, wherein the motion-tracking component comprises a fiducial marker coupled to the head-mounted display and a motion tracker configured to monitor and record movement of the fiducial marker.
10. The mediated-reality surgical visualization system of any one of examples 1-9 wherein the computing device is further configured to:
receive third image data;
process the third image data; and
present a processed third image from the third image data at the first display and/or the second display.
11. The mediated-reality surgical visualization system of example 10 wherein the third image data comprises at least one of: fluorescence image data, magnetic resonance imaging data, computed tomography image data, X-ray image data, anatomical diagram data, and vital-signs data.
12. The mediated-reality surgical visualization system of any one of examples 10-11 wherein the processed third image is integrated with the stereoscopic image.
13. The mediated-reality surgical visualization system of any one of examples 10-12 wherein the processed third image is presented as a picture-in-picture over a portion of the stereoscopic image.
14. The mediated-reality surgical visualization system of any one of examples 1-13 wherein the computing device is further configured to: present the stereoscopic image to a second head-mounted display assembly.
15. A mediated-reality visualization system, comprising:
a head-mounted display assembly comprising:
a computing device in communication with the display device and the image capture device, the computing device configured to:
present an image from the image data via the display device.
16. The mediated-reality visualization system of example 15 wherein the image capture device comprises an image capture device having a first imager and a second imager.
17. The mediated-reality visualization system of any one of examples 15-16 wherein the display device comprises a stereoscopic display device having a first display and a second display.
18. The mediated-reality visualization system of any one of examples 15-17 wherein the computing device is configured to present the image in real time.
19. The mediated-reality visualization system of any one of examples 15-18 wherein the frame is worn on the user's head and the image capture device faces away from the user.
20. The mediated-reality visualization system of any one of examples 15-19 wherein the image capture device comprises at least one plenoptic camera.
21. The mediated-reality visualization system of example 20 wherein the computing device is further configured to:
process image data received from the plenoptic camera;
render at least one virtual camera from the image data; and
present an image corresponding to the virtual camera via the display device.
22. The mediated-reality visualization system of example 21 wherein the computing device is configured to render the at least one virtual camera at a location corresponding to a position of a user's eye when the frame is worn by the user.
23. The mediated-reality visualization system of any one of examples 21-22 wherein rendering the at least one virtual camera comprises rendering an enlarged view of a portion of a captured light field.
24. The mediated-reality visualization system of any one of examples 21-23 wherein the display device comprises first and second displays.
25. The mediated-reality visualization system of any one of examples 15-24 wherein the display device comprises a stereoscopic display device having a first display and a second display,
wherein the image capture device comprises at least one plenoptic camera, and
wherein the computing device is further configured to:
26. The mediated-reality visualization system of any one of examples 15-25 wherein the head-mounted display assembly is opaque.
27. The mediated-reality visualization system of any one of examples 15-25 wherein the head-mounted display assembly is transparent or semi-transparent.
28. A method for providing mediated-reality surgical visualization, the method comprising:
providing a head-mounted display comprising a frame configured to be mounted to a user's head, first and second imagers coupled to the frame, and first and second displays coupled to the frame;
receiving first image data from the first imager;
receiving second image data from the second imager;
processing the first image data and the second image data;
displaying the first processed image data at the first display; and
displaying the second processed image data at the second display.
29. The method of example 28 wherein the first and second processed image data are displayed at the first and second displays in real time.
30. The method of any one of examples 28-29, further comprising:
receiving third image data;
processing the third image data; and
displaying the processed third image data at the first display and/or second display.
31. The method of example 30 wherein the third image data comprises at least one of: fluorescence image data, magnetic resonance imaging data, computed tomography image data, X-ray image data, anatomical diagram data, and vital-signs data.
32. The method of any one of examples 30-31 wherein the third image data is received from a third imager spaced apart from the head-mounted display.
33. The method of any one of examples 28-32, further comprising tracking movement of the head-mounted display.
34. The method of example 33 wherein tracking movement of the head-mounted display comprises tracking movement of a fiducial marker coupled to the head-mounted display.
35. The method of any one of examples 28-34, further comprising:
providing a second display device remote from the head-mounted display, the second display device comprising third and fourth displays;
displaying the first processed image data at the third display; and
displaying the second processed image data at the fourth display.
36. The method of any one of examples 28-35 wherein the first and second imagers comprise at least one plenoptic camera.
37. The method of example 36, further comprising:
processing image data received from the plenoptic camera;
rendering at least one virtual camera from the image data; and
presenting an image corresponding to the virtual camera via the first display.
38. The method of example 37 wherein rendering the at least one virtual camera comprises rendering the at least one virtual camera at a location corresponding to a position of the user's eye when the display is mounted to a user's head.
39. The method of any one of examples 37-38 wherein rendering the at least one virtual camera comprises rendering an enlarged view of a portion of a captured light field.
The above detailed descriptions of embodiments of the technology are not intended to be exhaustive or to limit the technology to the precise form disclosed above. Although specific embodiments of, and examples for, the technology are described above for illustrative purposes, various equivalent modifications are possible within the scope of the technology, as those skilled in the relevant art will recognize. For example, while steps are presented in a given order, alternative embodiments may perform steps in a different order. The various embodiments described herein may also be combined to provide further embodiments.
From the foregoing, it will be appreciated that specific embodiments of the technology have been described herein for purposes of illustration. Well-known structures and functions, however, have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the technology. Where the context permits, singular or plural terms may also include the plural or singular term, respectively.
Moreover, unless the word “or” is expressly limited to mean only a single item exclusive from the other items in reference to a list of two or more items, then the use of “or” in such a list is to be interpreted as including (a) any single item in the list, (b) all of the items in the list, or (c) any combination of the items in the list. Additionally, the term “comprising” is used throughout to mean including at least the recited feature(s) such that any greater number of the same feature and/or additional types of other features are not precluded. It will also be appreciated that specific embodiments have been described herein for purposes of illustration, but that various modifications may be made without deviating from the technology. Further, while advantages associated with certain embodiments of the technology have been described in the context of those embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the technology. Accordingly, the disclosure and associated technology can encompass other embodiments not expressly shown or described herein.
This application claims the benefit of U.S. Provisional Patent Application No. 62/000,900, filed May 20, 2014, which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US15/31637 | 5/19/2015 | WO | 00

Number | Date | Country
---|---|---
62/000,900 | May 2014 | US