The present technology generally relates to computational imaging systems including multiple cameras and, more specifically, to methods and systems for determining calibration quality metrics for multicamera imaging systems.
Multicamera imaging systems are increasingly used to digitize our understanding of the world, such as for measurement, tracking, and/or three-dimensional (3D) reconstruction of a scene. These camera systems must be carefully calibrated using precision targets to achieve high accuracy and repeatability. Typically, such targets consist of an array of feature points with known locations in the scene that can be precisely identified and consistently enumerated across different camera frames and views. Measuring these known 3D world points and their corresponding two-dimensional (2D) projections in images captured by the cameras allows the intrinsic parameters (e.g., focal length) and extrinsic parameters (e.g., position and orientation in 3D world space) of the cameras to be computed.
The calibration of multicamera imaging systems will typically degrade over time due to environmental factors. The gradual degradation of system performance is often hard to detect during normal operation. As a result, it is typically left to the discretion of the user to periodically check the calibration quality of the system using the calibration target and/or to simply recalibrate the system.
Known calibration techniques can generally be classified into two categories: (i) calibration based on known targets in the scene and (ii) calibration based on correlating feature points across different camera views. When calibrating based on known targets in the scene, the target provides known feature points with 3D world positions. The corresponding 2D projected points in the camera images are compared to the calculated 2D locations based on the calibration. A reprojection error is calculated as the difference, in pixels, between these measurements. Therefore, the calibration quality can be measured with a calibration target and quantified with reprojection error. However, such techniques require that known targets be positioned and visible within the scene.
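For context, a minimal sketch of how reprojection error might be computed for a known target is shown below (Python with OpenCV); the input names such as object_points and detected_px, and the packaging as a helper function, are illustrative assumptions rather than part of the original disclosure.

```python
# Hedged sketch: reprojection error for a known calibration target.
# Assumes `object_points` (Nx3 target corners in world units), `detected_px`
# (Nx2 corners found in the image), and a previously computed calibration
# (camera_matrix, dist_coeffs, rvec, tvec) are available.
import cv2
import numpy as np

def reprojection_error(object_points, detected_px, camera_matrix, dist_coeffs, rvec, tvec):
    # Project the known 3D target points into the image using the calibration.
    projected, _ = cv2.projectPoints(object_points, rvec, tvec, camera_matrix, dist_coeffs)
    projected = projected.reshape(-1, 2)
    # Per-point pixel distance between where the calibration predicts each
    # feature and where it was actually detected.
    residuals = np.linalg.norm(projected - detected_px.reshape(-1, 2), axis=1)
    return residuals.mean(), residuals.max()
```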
When correlating feature points across different camera views, the correlated features can be, for example, reflective marker centroids from binary images (e.g., in the case of an optical tracking system) or scale-invariant feature transform (SIFT) features from grayscale or color images (e.g., for general camera systems). With these correlated features, the system calibration can be improved using bundle adjustment, an optimization of the calibration parameters that minimizes reprojection error. However, unlike calibration with a known target, bundle adjustment typically suffers from scale ambiguity. Even with gauge-fixing constraints applied, the complex multivariate nature of bundle adjustment produces many local minima in the optimization. Accordingly, solutions can be found that minimize reprojection error but do not improve system accuracy. That is, agreement between the cameras improves, but the intrinsic and/or extrinsic parameters of the cameras can diverge from their true values such that the measurement accuracy of the system is reduced compared to known-target calibration techniques. Furthermore, the process of calculating image features, correctly matching them across camera views, and performing bundle adjustment is computationally expensive and is subject to errors from the noise intrinsic to the physical process of capturing images. For high-resolution multicamera imaging systems, such as those used for light field capture, the computational complexity increases substantially, as does the prevalence of non-physical local-minimum solutions to bundle adjustment.
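As a concrete illustration of the feature-correlation step only (not of the validation method disclosed herein), the following hedged sketch matches SIFT features between two camera views; the resulting correspondences are the kind of input a bundle adjustment would then refine. The image variables img_a and img_b are assumed inputs.

```python
# Hedged sketch: correlating SIFT feature points between two camera views.
# Assumes two grayscale images `img_a` and `img_b` are already loaded.
import cv2

sift = cv2.SIFT_create()
kp_a, desc_a = sift.detectAndCompute(img_a, None)
kp_b, desc_b = sift.detectAndCompute(img_b, None)

# Match descriptors and keep only unambiguous matches (Lowe's ratio test).
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(desc_a, desc_b, k=2)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Pixel coordinates of the correlated features in each view.
pts_a = [kp_a[m.queryIdx].pt for m in good]
pts_b = [kp_b[m.trainIdx].pt for m in good]
```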
Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale. Instead, emphasis is placed on clearly illustrating the principles of the present disclosure.
Aspects of the present disclosure are directed generally to methods of assessing the calibration quality of a computational imaging system including multiple cameras. In several of the embodiments described below, for example, a method can include quantifying calibration error by directly comparing computed virtual camera images and raw camera images from the same camera pose. More specifically, the method can include capturing raw images of a scene and then selecting one or more of the cameras in the system for validation/verification. The method can further include computing, for each of the cameras selected for validation, a virtual image of the scene corresponding to the pose (e.g., position and orientation) of the camera. Then, the raw image captured with each of the cameras selected for validation is compared with the computed virtual image to calibrate and/or classify error in the imaging system.
When there is no calibration error, sensor noise, or computational error, the computed and raw images will be identical and a chosen image comparison function will compute an error of zero. However, if there are calibration errors, sensor noise, computational errors, or the like, the comparison function will compute a non-zero error. In some embodiments, the computed error can be classified based on the image comparison as being attributable to one or more underlying causes. In one aspect of the present technology, this classification methodology can be especially useful in attributing error to different subsystems (e.g., different camera types) when the computational imaging system includes multiple heterogenous subsystems that generate different kinds of data.
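A minimal sketch of this validation loop is given below, assuming hypothetical helpers render_virtual_image and compare_images that stand in for the system's view-synthesis step and chosen image-comparison function; it illustrates the workflow rather than a definitive implementation.

```python
# Minimal sketch of the validation loop described above. `render_virtual_image`
# and `compare_images` are hypothetical stand-ins, not functions of the system.

def validate_cameras(cameras, raw_images, render_virtual_image, compare_images):
    errors = {}
    for cam in cameras:                      # cameras selected for validation
        raw = raw_images[cam.id]             # image actually captured by this camera
        # Synthesize an image at this camera's calibrated pose/intrinsics using
        # the *other* cameras' images and the estimated scene geometry.
        virtual = render_virtual_image(pose=cam.pose, intrinsics=cam.intrinsics,
                                       exclude=cam.id)
        # ~0 only when calibration, noise, and computation are all near-ideal.
        errors[cam.id] = compare_images(raw, virtual)
    return errors
```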
Specific details of several embodiments of the present technology are described herein with reference to
The accompanying figures depict embodiments of the present technology and are not intended to be limiting of its scope. The sizes of various depicted elements are not necessarily drawn to scale, and these various elements can be arbitrarily enlarged to improve legibility. Component details can be abstracted in the figures to exclude details such as position of components and certain precise connections between such components when such details are unnecessary for a complete understanding of how to make and use the present technology. Many of the details, dimensions, angles, and other features shown in the Figures are merely illustrative of particular embodiments of the disclosure. Accordingly, other embodiments can have other details, dimensions, angles, and features without departing from the spirit or scope of the present technology.
The headings provided herein are for convenience only and should not be construed as limiting the subject matter disclosed.
In the illustrated embodiment, the camera array 110 includes a plurality of cameras 112 (identified individually as cameras 112a-112n) that are each configured to capture images of a scene 108 from a different perspective. In some embodiments, the cameras 112 are positioned at fixed locations and orientations (e.g., poses) relative to one another. For example, the cameras 112 can be structurally secured by/to a mounting structure (e.g., a frame) at predefined fixed locations and orientations. In some embodiments, the cameras 112 can be positioned such that neighboring cameras share overlapping views of the scene 108. Therefore, all or a subset of the cameras 112 can have different extrinsic parameters, such as position and orientation. In some embodiments, the cameras 112 in the camera array 110 are synchronized to capture images of the scene 108 substantially simultaneously (e.g., within a threshold temporal error). In some embodiments, all or a subset of the cameras 112 can be light-field/plenoptic/RGB cameras that are configured to capture information about the light field emanating from the scene 108 (e.g., information about the intensity of light rays in the scene 108 and also information about a direction the light rays are traveling through space). Therefore, in some embodiments the images captured by the cameras 112 can encode depth information representing a surface geometry of the scene 108.
In some embodiments, the cameras 112 can include multiple cameras of different types. For example, different subsets of the cameras 112 can have different intrinsic parameters such as focal length, sensor type, optical components, and the like. In some embodiments, a subset of the cameras 112 can be configured to track an object through/in the scene 108. The cameras 112 can have charge-coupled device (CCD) and/or complementary metal-oxide semiconductor (CMOS) image sensors and associated optics. Such optics can include a variety of configurations including lensed or bare individual image sensors in combination with larger macro lenses, micro-lens arrays, prisms, and/or negative lenses.
In the illustrated embodiment, the camera array 110 further comprises (i) one or more projectors 114 configured to project a structured light pattern onto/into the scene 108, and (ii) one or more depth sensors 116 configured to estimate a depth of a surface in the scene 108. In some embodiments, the depth sensor 116 can estimate depth based on the structured light pattern emitted from the projector 114. In other embodiments, the camera array 110 can omit the projector 114 and/or the depth sensor 116.
In the illustrated embodiment, the processing device 102 includes an image processing device 103 (e.g., an image processor, an image processing module, an image processing unit) and a validation processing device 105 (e.g., a validation processor, a validation processing module, a validation processing unit). The image processing device 103 is configured to (i) receive images (e.g., light-field images, light field image data) captured by the camera array 110 and (ii) process the images to synthesize an output image corresponding to a selected virtual camera perspective. In the illustrated embodiment, the output image corresponds to an approximation of an image of the scene 108 that would be captured by a camera placed at an arbitrary position and orientation corresponding to the virtual camera perspective. In some embodiments, the image processing device 103 is further configured to receive depth information from the depth sensor 116 and/or calibration data from the validation processing device 105 (and/or another component of the system 100) and to synthesize the output image based on the images, the depth information, and the calibration data. More specifically, the depth information and calibration data can be used/combined with the images from the cameras 112 to synthesize the output image as a 3D (or stereoscopic 2D) rendering of the scene 108 as viewed from the virtual camera perspective. In some embodiments, the image processing device 103 can synthesize the output image using any of the methods disclosed in U.S. patent application Ser. No. 16/457,780, titled “SYNTHESIZING AN IMAGE FROM A VIRTUAL PERSPECTIVE USING PIXELS FROM A PHYSICAL IMAGER ARRAY WEIGHTED BASED ON DEPTH ERROR SENSITIVITY,” filed Jun. 28, 2019, now U.S. Pat. No. 10,650,573, which is incorporated herein by reference in its entirety.
The image processing device 103 can synthesize the output image from images captured by a subset (e.g., two or more) of the cameras 112 in the camera array 110, and does not necessarily utilize images from all of the cameras 112. For example, for a given virtual camera perspective, the processing device 102 can select a stereoscopic pair of images from two of the cameras 112 that are positioned and oriented to most closely match the virtual camera perspective. In some embodiments, the image processing device 103 (and/or the depth sensor 116) is configured to estimate a depth for each surface point of the scene 108 relative to a common origin and to generate a point cloud and/or 3D mesh that represents the surface geometry of the scene 108. For example, in some embodiments the depth sensor 116 can detect the structured light projected onto the scene 108 by the projector 114 to estimate depth information of the scene 108. Alternatively or additionally, the image processing device 103 can perform the depth estimation based on depth information received from the depth sensor 116. In some embodiments, the image processing device 103 can estimate depth from multiview image data from the cameras 112 using techniques such as light field correspondence, stereo block matching, photometric symmetry, correspondence, defocus, block matching, texture-assisted block matching, structured light, and the like, with or without utilizing information collected by the projector 114 or the depth sensor 116. In other embodiments, depth may be acquired by a specialized set of the cameras 112 performing the aforementioned methods in another wavelength, or by tracking objects of known geometry through triangulation or perspective-n-point algorithms. In yet other embodiments, the image processing device 103 can receive the depth information from dedicated depth detection hardware, such as one or more depth cameras and/or a LiDAR detector, to estimate the surface geometry of the scene 108.
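As one hedged example of the multiview depth-estimation techniques mentioned above, the following sketch applies OpenCV stereo block matching to a rectified image pair; the inputs left, right, fx, and baseline_m are assumptions, and the parameter values are illustrative.

```python
# Hedged sketch: depth from a rectified stereo pair via block matching.
# Assumes rectified 8-bit grayscale images `left` and `right`, focal length
# `fx` in pixels, and baseline `baseline_m` in meters from the calibration.
import cv2
import numpy as np

stereo = cv2.StereoBM_create(numDisparities=96, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

depth_m = np.zeros_like(disparity)
valid = disparity > 0
depth_m[valid] = fx * baseline_m / disparity[valid]  # depth = f * B / d
```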
In some embodiments, the processing device 102 (e.g., the validation processing device 105) performs a calibration process to detect the positions and orientations of each of the cameras 112 in 3D space with respect to a shared origin and/or an amount of overlap in their respective fields of view. For example, in some embodiments the processing device 102 can calibrate/initiate the system 100 by (i) processing captured images from each of the cameras 112 including a fiducial marker placed in the scene 108 and (ii) performing an optimization over the camera parameters and distortion coefficients to minimize reprojection error for key points (e.g., points corresponding to the fiducial markers). In some embodiments, the processing device 102 can perform a calibration process by correlating feature points across different camera views and performing a bundle adjustment. The correlated features can be, for example, reflective marker centroids from binary images, scale-invariant feature transform (SIFT) features from grayscale or color images, and so on. In some embodiments, the processing device 102 can extract feature points from a ChArUco target and process the feature points with the OpenCV camera calibration routine. In other embodiments, such a calibration can be performed with a Halcon circle target or another custom target with well-defined feature points at known locations. Where the camera array 110 is heterogenous—including different types of the cameras 112—the target may have features visible only to distinct subsets of the cameras 112, which may be grouped by their function and spectral sensitivity. In such embodiments, the calibration of extrinsic parameters between the different subsets of the cameras 112 can be determined by the known locations of the feature points on the target.
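A hedged sketch of a ChArUco-based calibration with OpenCV's aruco module is shown below; the exact function names vary between OpenCV versions (this follows the classic opencv-contrib 4.x interface), and the board dimensions, dictionary, and the frames input are illustrative assumptions.

```python
# Hedged sketch of ChArUco calibration for one camera using OpenCV's aruco
# module (opencv-contrib 4.x style API). `frames` is assumed to be a list of
# grayscale images of the target captured by that camera.
import cv2

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_5X5_100)
board = cv2.aruco.CharucoBoard_create(7, 5, 0.04, 0.03, dictionary)  # sizes in meters

all_corners, all_ids = [], []
for frame in frames:
    marker_corners, marker_ids, _ = cv2.aruco.detectMarkers(frame, dictionary)
    if marker_ids is None:
        continue
    # Interpolate to the chessboard corners defined by the ChArUco board.
    n, corners, ids = cv2.aruco.interpolateCornersCharuco(
        marker_corners, marker_ids, frame, board)
    if corners is not None and n > 3:
        all_corners.append(corners)
        all_ids.append(ids)

image_size = frames[0].shape[::-1]  # (width, height) for a grayscale frame
rms, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.aruco.calibrateCameraCharuco(
    all_corners, all_ids, board, image_size, None, None)
```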
As described in detail below with reference to
In some embodiments, the processing device 102 (e.g., the image processing device 103) can process images captured by the cameras 112 to perform object tracking of an object within the vicinity of the scene 108. Object tracking can be performed using image processing techniques or may utilize signals from dedicated tracking hardware that may be incorporated into the camera array 110 and/or the object being tracked. In a surgical application, for example, a tracked object may comprise a surgical instrument or a hand or arm of a physician or assistant. In some embodiments, the processing device 102 may recognize the tracked object as being separate from the surgical site of the scene 108 and can apply a visual effect to distinguish the tracked object such as, for example, highlighting the object, labeling the object, or applying a transparency to the object.
In some embodiments, functions attributed to the processing device 102, the image processing device 103, and/or the validation processing device 105 can be practically implemented by two or more physical devices. For example, in some embodiments a synchronization controller (not shown) controls images displayed by the projector 114 and sends synchronization signals to the cameras 112 to ensure synchronization between the cameras 112 and the projector 114 to enable fast, multi-frame, multi-camera structured light scans. Additionally, such a synchronization controller can operate as a parameter server that stores hardware specific configurations such as parameters of the structured light scan, camera settings, and camera calibration data specific to the camera configuration of the camera array 110. The synchronization controller can be implemented in a separate physical device from a display controller that controls the display device 104, or the devices can be integrated together.
The processing device 102 can comprise a processor and a non-transitory computer-readable storage medium that stores instructions that when executed by the processor, carry out the functions attributed to the processing device 102 as described herein. Although not required, aspects and embodiments of the present technology can be described in the general context of computer-executable instructions, such as routines executed by a general-purpose computer, e.g., a server or personal computer. Those skilled in the relevant art will appreciate that the present technology can be practiced with other computer system configurations, including Internet appliances, hand-held devices, wearable computers, cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers and the like. The present technology can be embodied in a special purpose computer or data processor that is specifically programmed, configured or constructed to perform one or more of the computer-executable instructions explained in detail below. Indeed, the term “computer” (and like terms), as used generally herein, refers to any of the above devices, as well as any data processor or any device capable of communicating with a network, including consumer electronic goods such as game devices, cameras, or other electronic devices having a processor and other components, e.g., network communication circuitry.
The invention can also be practiced in distributed computing environments, where tasks or modules are performed by remote processing devices, which are linked through a communications network, such as a Local Area Network (“LAN”), Wide Area Network (“WAN”), or the Internet. In a distributed computing environment, program modules or sub-routines can be located in both local and remote memory storage devices. Aspects of the invention described below can be stored or distributed on computer-readable media, including magnetic and optically readable and removable computer discs, or stored in chips (e.g., EEPROM or flash memory chips). Alternatively, aspects of the invention can be distributed electronically over the Internet or over other networks (including wireless networks). Those skilled in the relevant art will recognize that portions of the present technology can reside on a server computer, while corresponding portions reside on a client computer. Data structures and transmission of data particular to aspects of the present technology are also encompassed within the scope of the invention.
The virtual camera perspective can be controlled by an input controller 106 that provides a control input corresponding to the location and orientation of the virtual camera perspective. The output images corresponding to the virtual camera perspective are outputted to the display device 104. The display device 104 is configured to receive the output images (e.g., the synthesized three-dimensional rendering of the scene 108) and to display the output images for viewing by one or more viewers. The processing device 102 can beneficially process received inputs from the input controller 106 and process the captured images from the camera array 110 to generate output images corresponding to the virtual perspective in substantially real-time as perceived by a viewer of the display device 104 (e.g., at least as fast as the frame rate of the camera array 110).
The display device 104 can comprise, for example, a head-mounted display device, a monitor, a computer display, and/or another display device. In some embodiments, the input controller 106 and the display device 104 are integrated into a head-mounted display device and the input controller 106 comprises a motion sensor that detects position and orientation of the head-mounted display device. The virtual camera perspective can then be derived to correspond to the position and orientation of the head-mounted display device 104 such that the virtual perspective corresponds to a perspective that would be seen by a viewer wearing the head-mounted display device 104. Thus, in such embodiments the head-mounted display device 104 can provide a real-time rendering of the scene 108 as it would be seen by an observer without the head-mounted display device 104. Alternatively, the input controller 106 can comprise a user-controlled control device (e.g., a mouse, pointing device, handheld controller, gesture recognition controller) that enables a viewer to manually control the virtual perspective displayed by the display device 104.
Referring to
In some embodiments, the validation processing device 105 can classify the computed error based on the image comparison as being attributable to one or more underlying causes. In one aspect of the present technology, this classification methodology can be especially useful in attributing error to different ones of the cameras 112 when the camera array 110 includes different types of cameras 112 or subsets of the cameras 112 that generate different kinds of data. Accordingly, when the system 100 is heterogenous, the present technology provides a metric for quantifying full system calibration, or the entire tolerance stack across several integrated technologies, which directly impacts the effectiveness of a user operating the system 100. Additionally, the disclosed methods of calibration assessment can be used to assess the registration accuracy of imaging or volumetric data collected from other modalities—not just the cameras 112—that are integrated into the system 100.
In contrast to the present technology, conventional methods for determining calibration error include, for example, (i) processing source images to determine feature points in a scene, (ii) filtering and consistently correlating the feature points across different camera views, and (iii) comparing the correlated feature points. However, such methods are computationally expensive and can have scale ambiguities that decrease system accuracy. Moreover, existing methods based on feature point comparison may not be applicable to heterogeneous systems if cameras do not have overlapping spectral sensitivities.
In the illustrated embodiment, to generate the synthesized/computed output image, for a given virtual pixel Pv of the output image (e.g., where Pv can refer to a location (e.g., an x-y location) of the pixel within the 2D output image), a corresponding world point W is calculated using the pose of the virtual camera 112v and the geometry of the scene 108, such as the measured depth of the scene 108. Therefore, the world point W represents a point in the scene 108 corresponding to the virtual pixel Pv based on the predicted pose of the virtual camera 112v and the predicted geometry of the scene 108. More specifically, to determine the world point W, a ray Rv is defined from an origin of the virtual camera 112v (e.g., an origin of the virtual camera 112v as modeled by a pinhole model) through the virtual pixel Pv such that it intersects the world point W in the scene 108.
To determine a value for the virtual pixel Pv, rays R1 and R3 are defined from the same world point W to the first and third cameras 1121 and 1123, respectively. The rays R1 and R3 identify corresponding candidate pixels P1 and P3 of the first and third cameras 1121 and 1123, respectively, having values that can be interpolated or otherwise computed to calculate a value of the virtual pixel Pv. For example, in some embodiments the value of the virtual pixel Pv can be calculated as an average of the candidate pixels P1 and P3, i.e., Pv=(P1+P3)/2.
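A minimal sketch of this per-pixel computation is shown below, assuming a simple pinhole model in which each camera has an intrinsic matrix K and a world-to-camera rotation/translation pair (R, t), and using nearest-neighbor sampling; the helper names and data layout are illustrative.

```python
# Hedged sketch: back-project a virtual pixel to a world point using the
# measured depth, project that point into the source cameras, and average
# the sampled candidate pixels. Pinhole model with x_cam = R @ W + t.
import numpy as np

def backproject(K, R, t, u, v, depth):
    # Camera-space point at the measured depth along the ray through (u, v).
    x_cam = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    return R.T @ (x_cam - t)               # world point W

def project(K, R, t, W):
    p = K @ (R @ W + t)
    return p[:2] / p[2]                    # pixel coordinates (u, v)

def virtual_pixel_value(virtual_cam, source_cams, source_images, u, v, depth):
    W = backproject(virtual_cam.K, virtual_cam.R, virtual_cam.t, u, v, depth)
    samples = []
    for cam, img in zip(source_cams, source_images):
        px, py = project(cam.K, cam.R, cam.t, W)
        ix, iy = int(round(px)), int(round(py))
        if 0 <= iy < img.shape[0] and 0 <= ix < img.shape[1]:
            samples.append(float(img[iy, ix]))   # candidate pixel, e.g. P1 or P3
    return np.mean(samples) if samples else None  # Pv = average of candidates
```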
The computed value of the virtual pixel Pv can be compared to a value of a corresponding pixel P2 of the second camera 1122 that is directly measured from image data of the scene 108 captured by the second camera 1122. In some embodiments, the comparison generates an error value/metric representative of the calibration of the system 100 (e.g., of the second camera 1122). For example, as the system 100 approaches perfect calibration, the comparison will generate an error value approaching zero as the computed value of the virtual pixel Pv approaches the measured value of the actual pixel P2.
As one example,
Typically, however, the system 100 will include sources of error that can cause the raw image to diverge from the computed virtual image outside of an acceptable tolerance. For example, the raw image captured with the second camera 1122 will typically include noise arising from the physical capture process. In some embodiments, the raw image can be filtered (e.g., compared to a simple threshold) to remove the noise. In other embodiments, the noise characteristics of the individual cameras 112 can be measured and applied to the rendered virtual image for a more accurate comparison.
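One way such measured noise characteristics could be applied is sketched below, under the simplifying and purely illustrative assumptions of zero-mean Gaussian sensor noise estimated from repeated captures of a static scene and 8-bit images.

```python
# Hedged sketch: estimate a camera's noise level from repeated frames of a
# static scene and apply comparable noise to the rendered virtual image
# before comparison (Gaussian noise model is an assumption).
import numpy as np

def measure_noise_sigma(static_frames):
    # Per-pixel temporal standard deviation across repeated captures,
    # summarized as a single scalar for this camera.
    stack = np.stack([f.astype(np.float32) for f in static_frames])
    return float(stack.std(axis=0).mean())

def apply_noise(virtual_image, sigma, rng=None):
    rng = rng or np.random.default_rng()
    noisy = virtual_image.astype(np.float32) + rng.normal(0.0, sigma, virtual_image.shape)
    return np.clip(noisy, 0, 255).astype(virtual_image.dtype)  # assumes 8-bit images
```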
The original calibration of the cameras 112 and the depth measurement of the scene 108 can also introduce error into the system 100. For example,
In the illustrated embodiment, due to the depth error δdepth in the measured depth of the scene 108, rather than the world point W, the world point measured by the first camera 1121 is W1δdepth, the world point measured by the second camera 1122 is W2δdepth, and the world point measured by the third camera 1123 is W3δdepth. Moreover, due to calibration error δcalib in the calibration of the cameras 112, the calculated poses of the cameras 112 measuring these world points differ from the actual poses such that the first camera 1121 measures the world point W1δdepth at corresponding pixel P1δcalib rather than at the (correct) pixel P1, and the third camera 1123 measures the world point W3δdepth at corresponding pixel P3δcalib rather than at the (correct) pixel P3. Accordingly, the value of the virtual pixel Pv can be calculated as an average of the pixels P1δcalib and P3δcalib.
The computed value of the virtual pixel Pv can be compared to a value of a corresponding pixel P2 of the second camera 1122 that is directly measured from image data of the scene 108 captured by the second camera 1122. In some embodiments, the comparison generates an error value/metric representative of the calibration of the system 100.
As one example,
While
As one example,
In some embodiments, the tracking cameras 812 can determine a depth and pose of the object 840 within the scene, which can then be combined with/correlated to the image data from the cameras 112 to generate an output image including a rendering of the object 840. That is, the system 100 can render the object 840 into the output image of the scene 108 that is ultimately presented to a viewer. More specifically, the tracking cameras 812 can track one or more feature points on the object 840. When the system 100 includes different types of cameras as shown in
In the illustrated embodiment, the fourth and fifth cameras 1124 and 1125 are chosen for verification (e.g., for calibration assessment), and image data from the first through third cameras 1121-1123 is used to render synthesized output images from the perspectives of a virtual camera 112v4 and a virtual camera 112v5 having the same extrinsic and intrinsic parameters (e.g., pose, orientation, focal length) as the fourth and fifth cameras 1124 and 1125, respectively. In some embodiments, the cameras 112 chosen for verification can be positioned near one another. For example, the fourth and fifth cameras 1124 and 1125 can be mounted physically close together on a portion of the camera array 110. In some embodiments, such a validation selection scheme based on physical locations of the cameras 112 can identify if a structure (e.g., frame) of the camera array 110 has warped or deflected.
Due to the transform error δtransform, the tracking cameras 812 each measure/detect a feature point WFδtransform of the object 840 having a position in the scene 108 that is different than the position of an actual feature point WF of the object 840 in the scene 108 and/or as measured by the cameras 112. That is, the transform error δtransform shifts the locations of the measured feature points on the tracked object 840 relative to their real-world positions. This shift away from the real feature point WF results in a shift in data returned by the system 100 when rendering the output image including the object 840. For example, in the illustrated embodiment a world point W on the surface of the object 840 is chosen for verification, as described in detail above with reference to
In the illustrated embodiment, due to the transform error δtransform, the actual world points measured by the first through fifth cameras 1121-1125—instead of the erroneous world point W—are world points W1-W5, respectively, which correspond to pixels P1-P5, respectively. Therefore, the transform error δtransform causes a shift or a difference in a localized region of the output image corresponding to the object 840.
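A hedged sketch of how such a transform error translates into a pixel shift in a rendered view is given below; T_true and T_est are illustrative 4x4 rigid transforms from the tracking subsystem's frame into the camera array's world frame, and the pinhole projection follows the same convention as the earlier sketch.

```python
# Hedged sketch: pixel shift caused by an error in the tracker-to-camera-array
# transform. `T_true`/`T_est` are 4x4 rigid transforms (illustrative names).
import numpy as np

def transform_point(T, p):
    return (T @ np.append(p, 1.0))[:3]

def pixel_shift_from_transform_error(K, R, t, T_true, T_est, feature_point_tracker):
    # World-frame positions of the same tracked feature under each transform.
    W_true = transform_point(T_true, feature_point_tracker)
    W_est = transform_point(T_est, feature_point_tracker)

    def project(W):
        p = K @ (R @ W + t)        # pinhole projection into one imaging camera
        return p[:2] / p[2]

    return np.linalg.norm(project(W_est) - project(W_true))  # shift in pixels
```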
As one example,
Referring to
With no loss of generality, such registration of volumetric data to a real-time rendered output image of a scene can be equated to the calibration of heterogenous camera types as described in detail with reference to
Referring to
At block 1051, the method 1050 includes calibrating the system 100 including the cameras 112. For example, the calibration process can determine a pose (e.g., a position and orientation) for each of the cameras 112 in 3D space with respect to a shared origin. As described in detail with reference to
At block 1052, the method 1050 can optionally include registering or inputting additional data into the system 100, such as volumetric data collected from modalities other than the cameras 112 (e.g., CT data, MRI data). Such volumetric data can ultimately be aligned with/overlaid over the output image rendered from images captured by the cameras 112.
At block 1053, the method includes selecting a subset (e.g., one or more) of the cameras 112 for verification/validation. As shown in
At block 1054, the method 1050 includes capturing raw images from the cameras 112—including from the subset of the cameras 112 selected for validation.
At block 1055, the method 1050 includes computing a virtual image from the perspective (e.g., as determined by the calibration process) of each of the cameras 112 in the subset selected for validation. As described in detail with reference to
At block 1056, the method 1050 includes comparing the raw image captured by each of the cameras 112 in the subset selected for validation with the virtual image computed for that camera. The raw and virtual images can be compared using image similarity metrics such as Euclidean distance, optical flow, cross correlation, histogram comparison, and/or other suitable methods.
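A few of these similarity metrics are sketched below for single-channel 8-bit raw and virtual images of equal size; the implementations are illustrative rather than prescriptive.

```python
# Hedged sketch of example image-comparison metrics between a raw image and
# the computed virtual image (both assumed single-channel uint8, same size).
import cv2
import numpy as np

def euclidean_distance(raw, virtual):
    return float(np.linalg.norm(raw.astype(np.float32) - virtual.astype(np.float32)))

def normalized_cross_correlation(raw, virtual):
    a = raw.astype(np.float32) - raw.mean()
    b = virtual.astype(np.float32) - virtual.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def histogram_similarity(raw, virtual, bins=256):
    h1 = cv2.calcHist([raw], [0], None, [bins], [0, 256])
    h2 = cv2.calcHist([virtual], [0], None, [bins], [0, 256])
    return float(cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL))
```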
At block 1057, the method 1050 can include computing a quantitative calibration quality metric based on the comparison. The calibration quality metric can be a specific error attributed to each of the cameras 112 in the subset selected for validation. In other embodiments, the computed calibration quality metric represents a measurement of the full error of the system 100.
Alternatively or additionally, at block 1058, the method 1050 can include classifying the result of the image comparison using cross-correlation and/or another suitable technique. At block 1059, the method 1050 can further include estimating a source of error in the system 100 based on the classification. That is, the system 100 can attribute the error to an underlying cause based at least in part on the image comparison. For example, as shown in
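One possible classification cue, sketched below under illustrative assumptions, is to estimate a global shift between the raw and virtual images with phase correlation: a consistent global shift is suggestive of a camera pose/extrinsic calibration error, whereas a difference confined to a small region is more consistent with a localized cause such as the tracked-object transform error discussed above. The thresholds and labels are hypothetical.

```python
# Hedged sketch: a simple classification heuristic based on a detected
# relative shift between the raw and virtual images. Thresholds are
# illustrative, not part of the disclosed method.
import cv2
import numpy as np

def classify_error(raw, virtual, shift_px_threshold=1.0):
    a = raw.astype(np.float32)
    b = virtual.astype(np.float32)
    (dx, dy), _response = cv2.phaseCorrelate(a, b)
    if np.hypot(dx, dy) > shift_px_threshold:
        return "global shift: likely camera pose / extrinsic calibration error"
    # Otherwise inspect how concentrated the residual is.
    residual = cv2.absdiff(a, b)
    fraction_high = float((residual > residual.mean() + 3 * residual.std()).mean())
    if fraction_high < 0.01:
        return "localized difference: possible tracked-object or registration error"
    return "diffuse difference: possible noise or depth error"
```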
At block 1060, the method 1050 can optionally include generating a suggestion to a user of the system 100 for improving or correcting system calibration. For example, if one of the cameras 112 is determined to be out of alignment relative to the calibrated state, the system 100 can generate a notification/indication to the user (e.g., via the display device 104) indicating that the particular camera should be realigned/recalibrated.
The method 1050 can then return to block 1051 and proceed again after a new recalibration of the system 100. Alternatively or additionally, the method 1050 can return to block 1053 and iteratively process different subsets of the cameras 112 until all the cameras 112 are validated.
The following examples are illustrative of several embodiments of the present technology:
1. A method of validating a computational imaging system including a plurality of cameras, the method comprising:
2. The method of example 1 wherein the method further comprises computing a quantitative calibration quality metric based on the comparison of the second image to the virtual image.
3. The method of example 1 or example 2 wherein the method further comprises classifying the comparison of the second image to the virtual image to estimate a source of error in the imaging system.
4. The method of example 3 wherein classifying the comparison includes applying an edge filter to the second image and the virtual image.
5. The method of any one of examples 1-4 wherein capturing the first images of the scene includes capturing light field images.
6. The method of any one of examples 1-5 wherein the cameras include at least two different types of cameras.
7. The method of any one of examples 1-6 wherein the method further comprises analyzing a frequency content of the virtual image and the second image to classify an error in the virtual image.
8. The method of any one of examples 1-7 wherein comparing the second image with the virtual image includes detecting a relative shift between the second image and the virtual image.
9. The method of any one of examples 1-8 wherein generating the virtual image includes, for each of a plurality of pixels of the virtual image—
10. The method of any one of examples 1-9 wherein the method further comprises:
11. A system for imaging a scene, comprising:
12. The system of example 11 wherein the cameras are light field cameras.
13. The system of example 11 or example 12 wherein the computer-executable instructions further include instructions for computing a quantitative calibration quality metric based on the comparison of the second image to the virtual image.
14. The system of any one of examples 11-13 wherein the computer-executable instructions further include instructions for classifying the comparison of the second image to the virtual image to estimate a source of error in the system.
15. The system of any one of examples 11-14 wherein the cameras are rigidly mounted to a common frame.
16. A method of verifying a calibration of a first camera in a computational imaging system, the method comprising:
17. The method of example 16 wherein verifying the calibration includes determining a difference between the first image and the virtual second image.
18. The method of example 16 or example 17 wherein the first camera has a position and an orientation, and wherein generating the virtual second image includes generating the virtual second image for a virtual camera having the position and the orientation of the first camera.
19. The method of any one of examples 16-18 wherein the first camera and the second cameras are mounted to a common frame.
20. The method of any one of examples 16-19 wherein the method further comprises determining a source of a calibration error based on the comparison of the first image to the virtual second image.
The above detailed description of embodiments of the technology is not intended to be exhaustive or to limit the technology to the precise form disclosed above. Although specific embodiments of, and examples for, the technology are described above for illustrative purposes, various equivalent modifications are possible within the scope of the technology, as those skilled in the relevant art will recognize. For example, although steps are presented in a given order, alternative embodiments can perform steps in a different order. The various embodiments described herein can also be combined to provide further embodiments.
From the foregoing, it will be appreciated that specific embodiments of the technology have been described herein for purposes of illustration, but well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the technology. Where the context permits, singular or plural terms can also include the plural or singular term, respectively.
Moreover, unless the word “or” is expressly limited to mean only a single item exclusive from the other items in reference to a list of two or more items, then the use of “or” in such a list is to be interpreted as including (a) any single item in the list, (b) all of the items in the list, or (c) any combination of the items in the list. Additionally, the term “comprising” is used throughout to mean including at least the recited feature(s) such that any greater number of the same feature and/or additional types of other features are not precluded. It will also be appreciated that specific embodiments have been described herein for purposes of illustration, but that various modifications can be made without deviating from the technology. Further, while advantages associated with some embodiments of the technology have been described in the context of those embodiments, other embodiments can also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the technology. Accordingly, the disclosure and associated technology can encompass other embodiments not expressly shown or described herein.
This application claims the benefit of and priority to U.S. Provisional Patent Application No. 62/976,248, filed Feb. 13, 2020, titled “METHODS AND SYSTEMS FOR DETERMINING CALIBRATION QUALITY METRICS FOR A MULTICAMERA IMAGING SYSTEM,” which is incorporated herein by reference in its entirety.