The present technology generally relates to a camera array, and more specifically, to a camera array for generating a virtual perspective of a scene for a mediated-reality viewer.
In a mediated reality system, an image processing system adds, subtracts, and/or modifies visual information representing an environment. For surgical applications, a mediated reality system may enable a surgeon to view a surgical site from a desired perspective together with contextual information that assists the surgeon in more efficiently and precisely performing surgical tasks. Such mediated reality systems rely on multiple camera angles to reconstruct an image of the environment. However, even small relative movements and/or misalignments between the multiple cameras can cause unwanted distortions in the reconstructed image.
Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale. Instead, emphasis is placed on clearly illustrating the principles of the present disclosure.
Aspects of the present disclosure are directed generally to mediated-reality imaging systems, such as for use in surgical procedures. In several of the embodiments described below, for example, an imaging system includes a camera array having a plurality of hexagonal cells arranged in a honeycomb pattern in which (i) a pair of inner cells include respective edges adjacent to one another and (ii) a pair of outer cells are separated from each other by the inner cells. A plurality of cameras are mounted within the hexagonal cells. Each of the cells can include at least one camera of a first type and at least one camera of a second type different than the first type. For example, the camera of the first type may have a longer focal length than the camera of the second type. In some embodiments, the cameras within each of the hexagonal cells are arranged in a triangular grid and approximately equidistant from neighboring ones of the cameras. In some embodiments, the camera of the second type is positioned farther from or equidistant from a center point of the camera array relative to the cameras of the first type in the same cell.
Specific details of several embodiments of the present technology are described herein with reference to the accompanying Figures.
The accompanying Figures depict embodiments of the present technology and are not intended to be limiting of its scope. The sizes of various depicted elements are not necessarily drawn to scale, and these various elements may be arbitrarily enlarged to improve legibility. Component details may be abstracted in the Figures to exclude details such as position of components and certain precise connections between such components when such details are unnecessary for a complete understanding of how to make and use the present technology. Many of the details, dimensions, angles, and other features shown in the Figures are merely illustrative of particular embodiments of the disclosure. Accordingly, other embodiments can have other details, dimensions, angles, and features without departing from the spirit or scope of the present technology.
The camera array 120 includes a plurality of cameras 122 (e.g., a first camera 122-1, a second camera 122-2, . . . , an nth camera 122-N) that are each configured to capture respective images of a scene 130. The cameras 122 may be physically arranged in a particular configuration as described in further detail below such that their physical locations and orientations relative to each other are fixed. For example, the cameras 122 may be structurally secured by a mounting structure to mount the cameras 122 at predefined fixed locations and orientations. The cameras 122 of the camera array 120 may be positioned such that neighboring cameras may share overlapping views of the scene 130. In some embodiments, the cameras 122 in the camera array 120 are synchronized to capture images of the scene 130 substantially simultaneously (e.g., within a threshold temporal error). In some embodiments, all or a subset of the cameras 122 can be light-field (e.g., plenoptic) cameras that are configured to capture information about the light field emanating from the scene 130 (e.g., information about the intensity of light rays in the scene 130 and also information about a direction the light rays are traveling through space). In some embodiments, the camera array 120 can further comprise (i) one or more projectors 124 configured to project a structured light pattern onto/into the scene 130, and/or (ii) one or more depth sensors 126 configured to perform depth estimation of a surface of the scene 130.
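By way of a non-limiting illustration of the fixed camera poses and synchronized capture described above, the following Python sketch models each of the cameras 122 as a fixed position and orientation and checks whether a set of capture timestamps falls within a threshold temporal error. The class and function names (CameraPose, frames_synchronized) and the 1 ms threshold are hypothetical assumptions for illustration only and are not part of the described system.

    from dataclasses import dataclass
    import numpy as np

    @dataclass(frozen=True)
    class CameraPose:
        """Fixed location and orientation of one camera relative to the array."""
        position: np.ndarray        # 3-vector (e.g., millimeters)
        rotation: np.ndarray        # 3x3 rotation matrix, camera-to-array
        focal_length_mm: float      # lens focal length

    def frames_synchronized(timestamps_s, max_error_s=1e-3):
        """Return True if all capture timestamps lie within the allowed temporal error."""
        t = np.asarray(timestamps_s, dtype=float)
        return float(t.max() - t.min()) <= max_error_s

    # Example: four cameras triggered within 0.2 ms of one another.
    print(frames_synchronized([0.0000, 0.0001, 0.00015, 0.0002]))  # True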
The image processing device 110 is configured to receive images (e.g., light-field images, light field image data, etc.) captured by the camera array 120 and to process the images to synthesize an output image corresponding to a virtual camera perspective. In the illustrated embodiment, the output image corresponds to an approximation of an image of the scene 130 that would be captured by a camera placed at an arbitrary position and orientation corresponding to the virtual camera perspective. The image processing device 110 can synthesize the output image from a subset (e.g., two or more) of the cameras 122 in the camera array 120, but does not necessarily utilize images from all of the cameras 122. For example, for a given virtual camera perspective, the image processing device 110 may select a stereoscopic pair of images from two of the cameras 122 that are positioned and oriented to most closely match the virtual camera perspective.
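The selection of a stereoscopic pair that most closely matches the virtual camera perspective can be illustrated with the following hypothetical Python sketch, which ranks the cameras 122 by a combination of positional offset and angular difference of the optical axis relative to the virtual perspective. The scoring weight is an arbitrary assumption, not a parameter of the described system.

    import numpy as np

    def select_stereo_pair(cam_positions, cam_axes, virt_position, virt_axis, angle_weight=0.1):
        """Return indices of the two cameras whose position and viewing direction
        most closely match the virtual camera perspective (lower score = closer)."""
        cam_positions = np.asarray(cam_positions, dtype=float)
        axes = np.asarray(cam_axes, dtype=float)
        axes = axes / np.linalg.norm(axes, axis=1, keepdims=True)
        v_axis = np.asarray(virt_axis, dtype=float)
        v_axis = v_axis / np.linalg.norm(v_axis)

        offset = np.linalg.norm(cam_positions - np.asarray(virt_position, dtype=float), axis=1)
        angle = np.degrees(np.arccos(np.clip(axes @ v_axis, -1.0, 1.0)))
        score = offset + angle_weight * angle
        return tuple(np.argsort(score)[:2])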
In some embodiments, the image processing device 110 is further configured to perform a depth estimation for each surface point of the scene 130. For example, in some embodiments the image processing device 110 can detect the structured light projected onto the scene 130 by the projector 124 to estimate depth information of the scene 130. Alternatively or additionally, the image processing device 110 can perform the depth estimation based on depth information received from the dedicated depth sensors 126. In yet other embodiments, the image processing device 110 may estimate depth only from multi-view image data from the camera array 120 without necessarily utilizing information collected by any of the projectors 124 or the depth sensors 126. The depth information may be combined with the images from the cameras 122 to synthesize the output image as a three-dimensional rendering of the scene as viewed from the virtual camera perspective.
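One way to combine per-pixel depth information with the captured images when rendering the scene from the virtual camera perspective is to back-project pixels into three dimensions and re-project them into the virtual view. The following Python sketch is a simplified, hypothetical illustration of that geometry (pinhole intrinsics K_src and K_virt and a 4x4 source-to-virtual transform) and is not the image processing device's actual rendering pipeline.

    import numpy as np

    def reproject_to_virtual_view(depth, K_src, T_src_to_virt, K_virt, out_shape):
        """Back-project each source pixel to 3-D using its depth, transform the
        points into the virtual camera frame, and project them into the virtual
        image. Returns the in-bounds pixel coordinates in the virtual view."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N

        rays = np.linalg.inv(K_src) @ pix                  # rays at unit depth
        pts = rays * depth.reshape(1, -1)                  # 3-D points, source camera frame
        pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])
        pts_virt = (T_src_to_virt @ pts_h)[:3]             # points in the virtual camera frame

        proj = K_virt @ pts_virt
        uv = np.round(proj[:2] / proj[2]).astype(int)
        ok = (proj[2] > 0) & (uv[0] >= 0) & (uv[0] < out_shape[1]) \
             & (uv[1] >= 0) & (uv[1] < out_shape[0])
        return uv[:, ok]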
In some embodiments, functions attributed to the image processing device 110 may be practically implemented by two or more physical devices. For example, in some embodiments a synchronization controller controls images displayed by the projector 124 and sends synchronization signals to the cameras 122 to ensure synchronization between the cameras 122 and the projector 124 to enable fast, multi-frame, multi-camera structured light scans. Additionally, such a synchronization controller may operate as a parameter server that stores hardware specific configurations such as parameters of the structured light scan, camera settings, and camera calibration data specific to the camera configuration of the camera array 120. The synchronization controller may be implemented in a separate physical device from a display controller that controls the display device 140, or the devices may be integrated together.
The virtual camera perspective may be controlled by an input controller 150 that provides a control input corresponding to the location and orientation of the virtual camera perspective. The output image corresponding to the virtual camera perspective is outputted to the display device 140 and displayed by the display device 140. The image processing device 110 may beneficially process received inputs from the input controller 150 and process the captured images from the camera array 120 to generate output images corresponding to the virtual perspective in substantially real-time as perceived by a viewer of the display device 140 (e.g., at least as fast as the frame rate of the camera array 120).
The image processing device 110 may comprise a processor and a non-transitory computer-readable storage medium that stores instructions that, when executed by the processor, carry out the functions attributed to the image processing device 110 as described herein. Although not required, aspects and embodiments of the present technology can be described in the general context of computer-executable instructions, such as routines executed by a general-purpose computer, e.g., a server or personal computer. Those skilled in the relevant art will appreciate that the present technology can be practiced with other computer system configurations, including Internet appliances, hand-held devices, wearable computers, cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers and the like. The present technology can be embodied in a special purpose computer or data processor that is specifically programmed, configured or constructed to perform one or more of the computer-executable instructions explained in detail below. Indeed, the term “computer” (and like terms), as used generally herein, refers to any of the above devices, as well as any data processor or any device capable of communicating with a network, including consumer electronic goods such as game devices, cameras, or other electronic devices having a processor and other components, e.g., network communication circuitry.
The invention can also be practiced in distributed computing environments, where tasks or modules are performed by remote processing devices, which are linked through a communications network, such as a Local Area Network (“LAN”), Wide Area Network (“WAN”), or the Internet. In a distributed computing environment, program modules or sub-routines may be located in both local and remote memory storage devices. Aspects of the invention described below may be stored or distributed on computer-readable media, including magnetically and optically readable and removable computer discs, or stored in chips (e.g., EEPROM or flash memory chips). Alternatively, aspects of the invention may be distributed electronically over the Internet or over other networks (including wireless networks). Those skilled in the relevant art will recognize that portions of the present technology may reside on a server computer, while corresponding portions reside on a client computer. Data structures and transmission of data particular to aspects of the present technology are also encompassed within the scope of the invention.
The display device 140 may comprise, for example, a head-mounted display device or other display device for displaying the output images received from the image processing device 110. In some embodiments, the input controller 150 and the display device 140 are integrated into a head-mounted display device and the input controller 150 comprises a motion sensor that detects position and orientation of the head-mounted display device. The virtual perspective can then be derived to correspond to the position and orientation of the head-mounted display device 140 such that the virtual perspective corresponds to a perspective that would be seen by a viewer wearing the head-mounted display device 140. Thus, in such embodiments, the head-mounted display device 140 can provide a real-time rendering of the scene as it would be seen by an observer without the head-mounted display 140. Alternatively, the input controller 150 may comprise a user-controlled control device (e.g., a mouse, pointing device, handheld controller, gesture recognition controller, etc.) that enables a viewer to manually control the virtual perspective displayed by the display device 140.
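The correspondence between the tracked pose of the head-mounted display device 140 and the virtual perspective can be sketched as follows. The function name and quaternion convention are illustrative assumptions only and do not reflect a particular motion sensor or tracking API.

    import numpy as np

    def virtual_pose_from_headset(position_m, quat_wxyz):
        """Build a 4x4 camera-to-world matrix for the virtual perspective from a
        tracked headset position (meters) and unit orientation quaternion (w, x, y, z)."""
        w, x, y, z = quat_wxyz
        rot = np.array([
            [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
            [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
            [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
        ])
        pose = np.eye(4)
        pose[:3, :3] = rot
        pose[:3, 3] = position_m
        return pose

    # Identity orientation, viewer positioned 0.4 m above the scene origin.
    print(virtual_pose_from_headset([0.0, 0.0, 0.4], [1.0, 0.0, 0.0, 0.0]))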
In the illustrated embodiment, the display device 140 is embodied as a virtual reality headset. The workstation 204 may include a computer to control various functions of the camera array 120 and the display device 140. In some embodiments, the workstation 204 includes a secondary display 206 that can display a user interface for performing various configuration functions, a mirrored image of the display on the display device 140, and/or other useful visual images/indications. The image processing device 110 and the input controller 150 may each be integrated in the workstation 204, the display device 140, or a combination thereof.
In one aspect of the present technology, the hexagonal shape of the cells 360 enables the camera array 120 to be expanded to include additional cells 360 in a modular fashion. For example, while the camera array 120 includes four of the cells 360 in the illustrated embodiment, in other embodiments the camera array 120 can include more than four of the cells 360 (e.g., eight or more of the cells 360) by positioning additional cells 360 adjacent to the outer edges of the cells 360 in a honeycomb pattern. The cells 360 can be coupled together via adhesives, suitable connectors or fasteners (e.g., screws, bolts, etc.), magnets, etc. In some embodiments, the cells 360 can be mounted to a common frame (e.g., a metal frame) or other suitably rigid structure. By utilizing a repeatable pattern, the camera array 120 can be manufactured to have an arbitrary/desired size and number of the cameras 122. In another aspect of the present technology, the repeatable pattern can ensure that the spacing of the cameras 122 is predictable, which can simplify the processing performed by the image processing device 110 when synthesizing the output image.
In some embodiments, the sidewalls of the cells 360 are constructed of a rigid material such as metal or a hard plastic. The cell structure provides strong structural support for holding the cameras 122 in their respective positions without significant movement due to flexing or vibrations of the structure of the camera array 120. In other embodiments, the cells 360 need not include any sidewalls. For example, in some embodiments the cells 360 can each comprise a printed circuit board (PCB). The PCBs can have the illustrated hexagonal shape or other suitable shapes (e.g., rectilinear, polygonal, circular, etc.), and can be coupled together directly or mounted to a common frame or other rigid structure.
In the illustrated embodiment, each of the cells 360 includes/holds a set of three of the cameras 122 (identified individually as first cameras 122-A and a second camera 122-B) arranged in a triangular pattern/grid with each of the cameras 122 oriented to focus on a single point. More specifically, the cameras 122 can each be mounted at an angle relative to the generally planar bottom of the cells 360 in the camera array such that each of the cameras 122 is oriented toward the same point.
In some embodiments, each of the cameras 122 in a set within one of the cells 360 is approximately equidistant from each of the neighboring cameras 122 in the same set and approximately equidistant from the neighboring cameras 122 in adjacent ones of the cells 360. This camera spacing results in a triangular grid in which each set of three neighboring cameras 122 (regardless of placement within the cells 360) is arranged in a triangle of approximately equal dimensions. In one aspect of the present technology, this spacing can simplify the processing performed by the image processing device 110 when synthesizing the output image.
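A triangular grid with approximately equal spacing between neighboring cameras can be generated as in the following hypothetical Python sketch, in which alternate rows are offset by half the camera spacing and separated vertically by spacing * sqrt(3)/2. The 60 mm spacing is an arbitrary illustrative value, not a dimension of the described array.

    import numpy as np

    def triangular_grid(rows, cols, spacing):
        """Generate 2-D positions on an equilateral triangular grid so that every
        point is approximately equidistant from its nearest neighbors."""
        points = []
        for r in range(rows):
            for c in range(cols):
                x = c * spacing + (spacing / 2.0 if r % 2 else 0.0)
                y = r * spacing * np.sqrt(3.0) / 2.0
                points.append((x, y))
        return np.array(points)

    grid = triangular_grid(3, 4, spacing=60.0)   # e.g., 60 mm between neighbors
    print(np.linalg.norm(grid[1] - grid[0]))     # 60.0 (neighbor in the same row)
    print(np.linalg.norm(grid[4] - grid[0]))     # 60.0 (neighbor in the adjacent row)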
In some embodiments, the cameras 122 in each of the cells 360 include cameras of at least two different types. For example, in the illustrated embodiment each of the cells 360 includes (i) two of the first cameras 122-A, which are of a first type (e.g., type A), and (ii) one of the second cameras 122-B, which is of a second type (e.g., type B). In some embodiments, the type A cameras 122-A and the type B cameras 122-B have different focal lengths. For example, the type B cameras 122-B may have a shorter focal length than the type A cameras 122-A. In a particular embodiment, the type A cameras 122-A can have 50 mm lenses while the type B cameras 122-B can have 35 mm lenses. In some embodiments, some or all of the cameras 122 can be light-field cameras configured to capture light-field images. In some embodiments, some or all of the cameras 122 can have different focal depths in addition to, or as an alternative to, different focal lengths. For example, the type B cameras 122-B can have a greater focal depth than the type A cameras 122-A. In some embodiments, some or all of the cameras can have different color characteristics. For example, a first portion of the type A cameras 122-A could be color cameras while a second portion of the type A cameras 122-A could be monochrome cameras. In some embodiments, image data from both the color ones of the cameras 122 and the monochrome ones of the cameras 122 can be used by the image processing device 110 when synthesizing the output image.
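The relationship between focal length and field of view noted above can be illustrated numerically using the thin-lens relation fov = 2*atan(sensor_width / (2*focal_length)). In the following sketch, the 36 mm sensor width is an assumption made purely for illustration and is not a parameter of the cameras 122.

    import math

    def horizontal_fov_deg(focal_length_mm, sensor_width_mm=36.0):
        """Angular field of view under a thin-lens approximation."""
        return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

    # Assuming a 36 mm-wide sensor purely for illustration:
    print(round(horizontal_fov_deg(50.0), 1))  # ~39.6 degrees (longer focal length, type A)
    print(round(horizontal_fov_deg(35.0), 1))  # ~54.4 degrees (shorter focal length, type B)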
In the illustrated embodiment, the type B cameras 122-B are generally positioned in their respective cells 360 in a camera position farthest from a center point of the array 120 (e.g., a point between the adjacent edges of the first and second cells 360-A, 360-B). Accordingly, the type B cameras 122-B can have a larger field of view and can provide greater overlapping coverage of the scene 130.
In another aspect of the present technology, positioning the type B cameras 122-B along the exterior of the array 120 achieves a wide baseline between the type B cameras 122-B, which can enable accurate stereoscopic geometry reconstruction. For example, in the illustrated embodiment the type B cameras 122-B in the first, third, and fourth cells 360-A, 360-C, 360-D are positioned farther from the center point of the array 120 than the type A cameras 122-A. In the second cell 360-B, the type B camera 122-B is positioned at the same distance from the center point of the array 120 as one of the type A cameras 122-A, and can occupy either the left-most or the right-most camera position in that cell.
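The benefit of a wide baseline for stereoscopic geometry reconstruction can be seen from the standard disparity relation Z = f*B/d: to first order, a disparity error of delta_d pixels produces a depth error of approximately Z^2/(f*B)*delta_d, which shrinks as the baseline B grows. The numbers in the following sketch are illustrative assumptions only and are not parameters of the described array.

    def depth_uncertainty(depth_m, baseline_m, focal_length_px, disparity_error_px=0.5):
        """First-order stereo depth uncertainty: delta_Z ~= Z**2 / (f * B) * delta_d."""
        return depth_m ** 2 / (focal_length_px * baseline_m) * disparity_error_px

    # Illustrative numbers only: a camera with f = 2500 px viewing a point 0.5 m away.
    print(depth_uncertainty(0.5, baseline_m=0.10, focal_length_px=2500))  # narrow baseline
    print(depth_uncertainty(0.5, baseline_m=0.30, focal_length_px=2500))  # wider baseline, smaller error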
In the illustrated embodiment, the camera array 120 includes the projector 124 that can project structured light onto the scene 130.
The fans 402 and/or other cooling features can operate to maintain the camera array 120 at a generally uniform temperature during operation. In one aspect of the present technology, this can inhibit thermal expansion/contraction of the various components of the camera array 120, which could otherwise lead to undesirable flexing/bending of the camera array 120. In some embodiments, the camera array 120 can include other thermal management features for stabilizing the temperature of the camera array 120 during operation such as, for example, heat sinks, heat pipes, etc.
In some embodiments, the camera array 120 can include a transparent face plate (not shown).
The following examples are illustrative of several embodiments of the present technology:
1. A camera array, comprising:
a support structure having a perimeter; and
a plurality of cameras mounted to the support structure in an array,
2. The camera array of example 1 wherein individual ones of the cameras are approximately equidistantly spaced from neighboring ones of the cameras in the array.
3. The camera array of example 2 wherein the array is a triangular grid.
4. The camera array of any one of examples 1-3 wherein each of the cameras is oriented to focus on the same point.
5. The camera array of any one of examples 1-4 wherein the cameras are light-field cameras.
6. The camera array of any one of examples 1-5 wherein the support structure includes a plurality of cells, and wherein at least two of the first cameras and at least one of the second cameras are mounted within each of the cells.
7. The camera array of example 6 wherein the cells are hexagonally shaped and arranged in a honeycomb pattern.
8. The camera array of any one of examples 1-7 wherein the first cameras have a narrower field of view than the second cameras.
9. The camera array of example 8 wherein the first cameras have 50 mm lenses and wherein the second cameras have 35 mm lenses.
10. The camera array of any one of examples 1-9, further comprising:
11. The camera array of any one of examples 1-10, further comprising a swing arm coupled to the support structure and configured to position the plurality of cameras in a desired position and orientation.
12. The camera array of any one of examples 1-11 wherein the support structure is generally planar, and wherein the cameras are mounted to the support structure such that a focal axis of each of the cameras is angled relative to the support structure so that each of the cameras is oriented to focus on the same point.
13. The camera array of any one of examples 1-12 wherein the support structure is non-planar, and wherein the cameras are mounted to the support structure such that a focal axis of each of the cameras is perpendicular to the support structure so that each of the cameras is oriented to focus on the same point.
14. A mediated-reality system, comprising:
15. The mediated-reality system of example 14 wherein individual ones of the second cameras are arranged closer to, or equidistant from, a perimeter of the support structure relative to individual ones of the first cameras.
16. The mediated-reality system of example 14 or example 15 wherein the cameras are arrayed in a triangular grid across the support structure such that individual ones of the cameras are approximately equidistantly spaced from adjacent ones of the cameras.
17. The mediated-reality system of any one of examples 14-16 wherein the display device is configured to be mounted on the head of a user.
18. The mediated-reality system of any one of examples 14-17 wherein the image processing device is configured to synthesize the virtual image based on fewer than all of the plurality of images of the scene.
19. The mediated-reality system of any one of examples 14-18 wherein the image processing device is configured to synthesize the virtual image based on a stereoscopic pair of the plurality of images.
20. An imaging device, comprising:
21. A camera array comprising:
22. The camera array of example 21 wherein the at least one camera of the second type within each of the plurality of hexagonal cells is at a position further from or equidistant from a center point of the camera array relative to cameras of the first type.
23. The camera array of example 21 or example 22 wherein the at least one camera of the first type has a longer focal length than the at least one camera of the second type.
24. The camera array of any one of examples 21-23 wherein the outer cells each include four edges along an exterior perimeter of the camera array and wherein the inner cells each include three edges along the exterior perimeter of the camera array.
25. The camera array of any one of examples 21-24 wherein the camera of the first type has a narrower field of view than the camera of the second type.
26. The camera array of any one of examples 21-25 wherein the camera of the first type comprises a 50 mm camera and wherein the camera of the second type comprises a 35 mm camera.
27. The camera array of any one of examples 21-26, further comprising a projector to project structured light onto a scene that is within a field of view of the plurality of cameras.
28. The camera array of any one of examples 21-27, further comprising a depth sensor for sensing depth of a surface of a scene within a field of view of the plurality of cameras.
29. The camera array of any one of examples 21-28, further comprising a cooling system to provide cooling to the plurality of cameras.
30. The camera array of any one of examples 21-29, further comprising a swing arm to position the plurality of cameras in a desired position and orientation.
31. A mediated-reality system, comprising:
32. The mediated-reality system of example 31 wherein the at least one camera of the second type within each of the plurality of hexagonal cells is at a position further from or equidistant from a center point of the camera array relative to cameras of the first type.
33. The mediated-reality system of example 31 or example 32 wherein the at least one camera of the first type has a longer focal length than the at least one camera of the second type.
34. The mediated-reality system of any one of examples 31-33 wherein the outer cells each include four edges along an exterior perimeter of the camera array and wherein the inner cells each include three edges along the exterior perimeter of the camera array.
35. The mediated-reality system of any one of examples 31-34, wherein the camera of the first type has a narrower field of view than the camera of the second type.
36. The mediated-reality system of any one of examples 31-35 wherein the camera of the first type comprises a 50 mm camera and wherein the camera of the second type comprises a 35 mm camera.
37. The mediated-reality system of any one of examples 31-36, further comprising a projector to project structured light onto a scene that is within a field of view of the plurality of cameras.
38. A camera array comprising:
39. The camera array of example 38, wherein the at least one camera of the second type within each of the plurality of hexagonal cells is at a position further from or equidistant from a center point of the camera array relative to cameras of the first type.
40. The camera array of example 38 or example 39, wherein the at least one camera of the first type has a longer focal length than the at least one camera of the second type.
41. A camera array comprising:
42. The camera array of example 41 wherein the at least one camera of the second type is positioned farther from or equidistant from a center point of the camera array relative to the at least one camera of the first type.
43. The camera array of example 41 or example 42 wherein the at least one camera of the first type has a longer focal length than the at least one camera of the second type.
44. The camera array of any one of examples 41-43 wherein the plurality of hexagonal cells includes (a) a pair of inner cells each having at least one edge adjacent to one another and (b) a pair of outer cells separated from one another by the inner cells, wherein the outer cells each include four edges along an exterior perimeter of the camera array, and wherein the inner cells each include three edges along the exterior perimeter of the camera array.
45. The camera array of any one of examples 41-44 wherein the at least one camera of the first type has a narrower field of view than the at least one camera of the second type.
46. The camera array of any one of examples 41-45 wherein the at least one camera of the first type comprises a 50 mm camera and wherein the at least one camera of the second type comprises a 35 mm camera.
47. The camera array of any one of examples 41-46, further comprising a projector configured to project structured light onto a scene within a field of view of the cameras.
48. The camera array of any one of examples 41-47, further comprising a depth sensor configured to sense a depth of a surface of a scene within a field of view of the cameras.
49. The camera array of any one of examples 41-48, further comprising a cooling system configured to cool the cameras.
50. The camera array of any one of examples 41-49, further comprising a swing arm coupled to at least one of the plurality of hexagonal cells and configured to position the plurality of cameras in a desired position and orientation.
51. A mediated-reality system, comprising:
52. The mediated-reality system of example 51 wherein the at least one camera of the second type includes one camera, wherein the at least one camera of the first type includes two cameras, and wherein the cameras in the set are arranged in a triangle approximately equidistant from one another.
53. The mediated-reality system of example 51 or example 52 wherein the display device is configured to be mounted on the head of a user.
54. The mediated-reality system of any one of examples 51-53 wherein the image processing device is configured to synthesize the virtual image based on fewer than all of the plurality of images of the scene.
55. The mediated-reality system of any one of examples 51-54 wherein the image processing device is configured to synthesize the virtual image based on a stereoscopic pair of the plurality of images.
56. The mediated-reality system of any one of examples 51-55 wherein the plurality of hexagonal cells includes four cells.
57. The mediated-reality system of any one of examples 51-56, further comprising a projector configured to project structured light onto the scene, wherein the image processing device is further configured to determine depth information for the virtual image based on image data of the structured light captured in the plurality of images of the scene.
58. An imaging device, comprising:
59. The imaging device of example 58 wherein each of the cameras neighbors at least two other cameras, and wherein each of the cameras is approximately equidistantly spaced apart from the at least two other cameras.
60. The imaging device of example 58 or example 59 wherein the camera of the first type has a longer focal length than the camera of the second type, and wherein the camera of the second type is positioned farther from a center of the array than the camera of the first type.
61. An imaging device, comprising:
62. The imaging device of example 61 wherein the support structure has a perimeter, and wherein each of the cameras of the second type is arranged closer to, or equidistant from, the perimeter relative to each of the cameras of the first type.
63. The imaging device of example 61 or example 62 wherein each of the cameras is oriented to focus on the same point.
The above detailed description of embodiments of the technology is not intended to be exhaustive or to limit the technology to the precise form disclosed above. Although specific embodiments of, and examples for, the technology are described above for illustrative purposes, various equivalent modifications are possible within the scope of the technology as those skilled in the relevant art will recognize. For example, although steps are presented in a given order, alternative embodiments may perform steps in a different order. The various embodiments described herein may also be combined to provide further embodiments.
From the foregoing, it will be appreciated that specific embodiments of the technology have been described herein for purposes of illustration, and that well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the technology. Where the context permits, singular or plural terms may also include the plural or singular term, respectively.
Moreover, unless the word “or” is expressly limited to mean only a single item exclusive from the other items in reference to a list of two or more items, then the use of “or” in such a list is to be interpreted as including (a) any single item in the list, (b) all of the items in the list, or (c) any combination of the items in the list. Additionally, the term “comprising” is used throughout to mean including at least the recited feature(s) such that any greater number of the same feature and/or additional types of other features are not precluded. It will also be appreciated that specific embodiments have been described herein for purposes of illustration, but that various modifications may be made without deviating from the technology. Further, while advantages associated with some embodiments of the technology have been described in the context of those embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the technology. Accordingly, the disclosure and associated technology can encompass other embodiments not expressly shown or described herein.
The present application claims the benefit of U.S. Provisional Patent Application No. 62/737,791, filed Sep. 27, 2018, and titled “CAMERA ARRAY FOR A MEDIATED-REALITY SYSTEM,” the disclosure of which is incorporated herein by reference in its entirety.