This disclosure relates generally to immersive realities, and, more particularly, to methods and apparatus to perform multiple-camera calibration.
Immersive realities, including virtual reality (VR), augmented reality (AR), and mixed reality (MR), are shaping multiple industries; everything from marketing and retail to training and education is being fundamentally changed by the technology. To create these immersive realities, camera parameter estimations, high-quality image feeds, and high-quality video feeds are needed, which is currently addressed through a sensor-based approach or through an additive bundle adjustment approach.
A sensor-based approach to multiple-camera calibration, such as an inertial navigation system (INS), is one manner of performing automated calibration of multiple cameras. Sensor-based systems are equipped with sensors such as an inertial measurement unit (IMU) or a global positioning system (GPS), which raises the cost of each device considerably. Additionally, the INS system and the GPS system typically do not provide a target level of accuracy for performing multiple-camera calibration.
Another approach to camera calibration is an additive bundle adjustment. Additive bundle adjustment calculates a bundle adjustment between two cameras, and then adds camera after camera to the adjusted bundle.
The figures are not to scale. In general, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts. Connection references (e.g., attached, coupled, connected, and joined) are to be construed broadly and may include intermediate members between a collection of elements unless otherwise indicated. As such, connection references do not necessarily infer that two elements are directly connected and in fixed relation to each other. Stating that any part is in “contact” with another part means that there is no intermediate part between the two parts. Although the figures show layers and regions with clean lines and boundaries, some or all of these lines and/or boundaries may be idealized. In reality, the boundaries and/or lines may be unobservable, blended, and/or irregular.
Descriptors “first,” “second,” “third,” etc., are used herein when identifying multiple elements or components which may be referred to separately. Unless otherwise specified or understood based on their context of use, such descriptors are not intended to impute any meaning of priority, physical order or arrangement in a list, or ordering in time but are merely used as labels for referring to multiple elements or components separately for ease of understanding the disclosed examples. In some examples, the descriptor “first” may be used to refer to an element in the detailed description, while the same element may be referred to in a claim with a different descriptor such as “second” or “third.” In such instances, it should be understood that such descriptors are used merely for ease of referencing multiple elements or components.
Immersive realities give users the perception of being physically present in a non-physical world. The perception is created by surrounding the user of the immersive reality system with images, sound, or other stimuli that provide a total environment. Creation of these immersive realities requires high-quality image and video feeds, which in turn require calibration of the cameras that are observing an event and creating the video feed. Calibrating multiple cameras allows high-quality immersive realities to be created for multiple industries.
The methods and apparatus to perform multiple-camera calibration without an initial guess, as described herein, may be used in many different ways in many different use cases. For example, the teachings of this disclosure may be used in planar environments and cold start environments. A planar environment is a situation in which most of a scene being observed by multiple cameras (e.g., a playing field) is composed of at least one plane. In planar environments, some multiple-view algorithms cannot feasibly be used in such vision systems.
A cold start environment is a situation in which sensors are not stationary or do not include auxiliary devices, such as INS, for localization. As such, cold start systems do not have an initial guess that can be used for multiple-camera calibration.
In contrast to camera calibration by bundle adjustment, which tends to be unstable and inaccurate, especially in the wide-baseline case, the systems and methods described herein are stable and accurate. In bundle adjustment, the solving stage is called many times, making the entire process very time consuming and dependent on both the order of the cameras and an accurate initial guess. Overall, bundle adjustments are non-robust and unstable. The systems and methods disclosed herein are robust, stable, and efficient.
The example scene 101 of
The example cameras 102A, 102B, 102C, 102D, 102E, 102F may be implemented using any camera, such as an internet protocol (IP) camera. The example cameras 102A, 102B, 102C, 102D, 102E, 102F are located around the example scene 101, and each has a different viewing angle from which it views the scene 101. A view from each example camera 102A, 102B, 102C, 102D, 102E, 102F overlaps with a view from at least one other example camera 102A, 102B, 102C, 102D, 102E, 102F to facilitate creation of an immersive reality.
As described herein, an example of multiple-camera calibration without an initial guess involves first grabbing a synchronized shot from all the example cameras 102A, 102B, 102C, 102D, 102E, 102F in the example scene 101, where the focal lengths of the example cameras 102A, 102B, 102C, 102D, 102E, 102F are approximately known. This environment contains multiple example cameras 102A, 102B, 102C, 102D, 102E, 102F having overlapping views of the example scene 101.
The example scene 101 also includes a reference point 103, which can be, for example, a stationary object, a specific marked corner of a sports field, or a marked line in a field. In the example of
The example network 104 may be implemented utilizing any public and/or private network that facilitates communication between the example cameras 102A, 102B, 102C, 102D, 102E, 102F and the image processor 106. For example, all or part of the network 104 may be implemented utilizing the Internet. The example network 104 also facilitates indicating to the example image processor 106 the example reference point 103.
The example image processor 106 may be implemented utilizing any suitable processing hardware. For example, the image processor 106 may be implemented using a server, a computer, or any suitable combination thereof. In the example of
The camera orientation calibrator 108, which is shown as implemented as part of the image processor 106 in the example of
In one example, the example camera orientation calibrator 108 solves for camera locations by first finding the global rotations in a coordinate system. For example, the camera orientation calibrator 108 identifies the location (e.g., orientation) of a first example camera such that the location is (0,0,0) and sets the distance to a second example camera such that the distance is 1 in arbitrary units. The example camera orientation calibrator 108 is then able to calculate the distance between the example cameras.
After the global rotations are known, the relative translations are determined by decoupling the global rotations from the relative translations to obtain tij ~ (ci − cj). It is necessary to find the camera centers ci. Therefore, the system defines and solves an optimization problem to, for example, minimize the following energy: ΣijϵE ρ(|t̂ij × (ci − cj)|²), where t̂ij is the relative translation with unit length, and ρ is a loss function such as a soft L1 loss.
In the example of
The virtual image generator 112, which may be implemented as part of the image processor 106, receives input from the position and orientation calculator 110 and generates an image for the immersive reality using the data received from the network 104, the camera orientation calibrator 108, and from the position and orientation calculator 110. For example, the virtual image generator 112 may combine feeds from multiple cameras using the generated information. This generates an immersive reality that is sent to be displayed on the AR/VR/MR set 114.
The AR/VR/MR set 114, which may be a headset, or one or more screens, uses the data from the virtual image generator and displays the scene 101 to a user.
The example camera orientation calibrator 108 receives information from the example network 104. The network 104 passes data from the example cameras 102A, 102B, 102C, 102D, 102E, 102F to the example camera orientation calibrator 108. Using the data from the example cameras 102A, 102B, 102C, 102D, 102E, 102F, the example camera data receiver 202 determines the focal length of each example camera 102A, 102B, 102C, 102D, 102E, 102F using the example focal length receiver 210.
The example plane surface calculator 212 uses data from the cameras 102A, 102B, 102C, 102D, 102E, 102F to automatically calculate the planar surface input. The example camera data receiver 202 receives the plane surface from the example plane surface calculator 212.
Using the data from the cameras 102A, 102B, 102C, 102D, 102E, 102F, the example feature matcher receiver 214 of the example camera data receiver 202 matches overlapping features of each example camera 102A, 102B, 102C, 102D, 102E, 102F.
Using the data from the example network 104, the example camera data receiver 202 obtains the manual label point, such as the example reference point 103 (e.g., stationary object 103) using the example manual label point receiver 216. The data from the example camera data receiver 202 is sent to the example calibration data sender 206, which consolidates the data and sends it to the example position and orientation calculator 110.
The example camera data receiver 202 receives an input from the example network 104. The example camera data receiver 202 receives multiple inputs. Examples of inputs include the focal length in the example focal length receiver 210, the manual label point in the example manual label point receiver 216, the plane surface in the example plane surface calculator 212, and the feature matcher in the feature matcher receiver 214. The example camera orientation calibrator 108 then sends the output of the example camera data receiver 202 to the example calibration data sender 206. The example camera orientation calibrator 108 sends its output to the example position and orientation calculator 110.
While an example manner of implementing the camera orientation calibrator 108 of
As shown in the example of
The example relative rotation, relative translation, surface normal decomposer 304 calculates the relative rotation, relative translation, and surface normal by constructing and decomposing the homographies. The example relative rotation, relative translation, surface normal decomposer 304 sends the output to the example outlier remover 306, which removes outliers from the data. Then, the example outlier remover 306 sends the remaining output to the example minimization solver 308, which solves a minimization problem. Then the example position and orientation calculator 110 sends the output of the example minimization solver 308 to the example virtual image generator 112, which generates a virtual image of the entire example scene 101. The output of the virtual image generator 112 is then sent to and displayed on the example AR/VR/MR set 114.
The example position and orientation calculator 110 receives input data from the example camera orientation calibrator 108. The example position and orientation calculator 110 first uses the data to solve for the homography between camera pairs using a homography calculator 302. The example homography calculator 302 receives data from the example camera orientation calibrator 108 using the example data receiver 310. The homography calculator 302 sends the data to the example feature obtainer 312, which gets data about all the features in the scene. Then, the example feature obtainer 312 sends the features to the example feature comparer 314, which analyzes the features in the scene and identifies common features. The example feature comparer 314 sends the common features to the example edge creator 316.
The edge creator 316 determines edges between pairs of cameras with overlapping image feeds, using the reference points to identify proper overlap points.
A graph can be defined where the vertices are the cameras as shown in
For each edge (e.g., a pair of cameras with multiple (e.g., 4, 6, 8, etc.) matches), the homography is calculated between the matched points on the image plane. Because all the cameras are looking at the same plane and the transformation is between image plane coordinates, the homography can be written as: H = γ(R + t·nᵀ), where γ is an arbitrary non-zero number, R is the relative rotation from the first camera to the second one, t is the relative camera displacement from the first camera to the second camera in the rotated space, scaled proportionally, and n is the plane normal in the frame system of the first camera.
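For illustration, the per-edge homography estimation may be sketched in Python using OpenCV as shown below. The sketch assumes that matched pixel coordinates for the pair are available as NumPy arrays; the function name, the RANSAC threshold, and the return convention are illustrative assumptions rather than requirements of this disclosure.

```python
import numpy as np
import cv2

def edge_homography(pts_i, pts_j, ransac_thresh=3.0):
    """Estimate the homography H mapping image points of camera i to camera j.

    pts_i, pts_j: (N, 2) arrays of matched pixel coordinates, N >= 8 for stability.
    Returns H (3x3) and a boolean inlier mask, or (None, None) if estimation fails.
    """
    pts_i = np.asarray(pts_i, dtype=np.float64).reshape(-1, 1, 2)
    pts_j = np.asarray(pts_j, dtype=np.float64).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(pts_i, pts_j, cv2.RANSAC, ransac_thresh)
    if H is None:
        return None, None
    return H, mask.ravel().astype(bool)
```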
The example homography calculator 302 uses the created edges and feeds them to the relative rotation, relative translation, surface normal decomposer 304. Knowing the focal lengths of the cameras, the homographies can be constructed and can be decomposed into relative rotations and translations. This can be done as described in Faugeras' singular value decomposition (SVD)-based method and implemented in, for example, OpenCV. This approach results in a number (e.g., 4) of physically possible solutions for the relative rotation, translation and surface normal per homography. By defining the cameras to be above the plane (e.g., surface normals point toward camera centers), the number of possible solutions is reduced (e.g., to two).
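A corresponding decomposition step may be sketched as follows using cv2.decomposeHomographyMat, which implements an SVD-based (Faugeras-style) decomposition. Normalizing the homography with both intrinsic matrices and testing the sign of the normal's third component are assumptions about the chosen frame conventions and may need to be adapted.

```python
import numpy as np
import cv2

def decompose_edge(H, K_i, K_j):
    """Decompose a planar homography into candidate (R, t, n) solutions.

    H maps pixels of camera i to pixels of camera j; K_i, K_j are intrinsic
    matrices built from the (approximately) known focal lengths.  The homography
    is first normalized so cv2.decomposeHomographyMat can be called with an
    identity camera matrix.
    """
    H_norm = np.linalg.inv(K_j) @ H @ K_i
    _, Rs, ts, normals = cv2.decomposeHomographyMat(H_norm, np.eye(3))
    candidates = []
    for R, t, n in zip(Rs, ts, normals):
        n = n.ravel()
        # The disclosure defines cameras above the plane with surface normals
        # pointing toward the camera centers.  Whether that corresponds to
        # n[2] > 0 or n[2] < 0 depends on the decomposition's normal convention,
        # so flip this test if needed for a given setup.
        if n[2] > 0:
            candidates.append((R, t.ravel(), n))
    return candidates  # typically two physically possible candidates remain
```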
The example relative rotation, relative translation, surface normal decomposer 304 is used to find the relative rotation, relative translation, and surface normal of each camera. Using the edges calculated in the example edge creator 316, the example relative rotation, relative translation, surface normal decomposer 304 solves for the relative rotation, relative translation, and surface normal of each example camera. Using the solutions, the example relative rotation, relative translation, surface normal decomposer 304 decomposes the values and sends the data to the example outlier remover 306.
The example outlier remover 306 receives the solutions that were decomposed and compares the received solutions. By selecting triplets of connected vertices, the example relative rotation, relative translation, surface normal decomposer 304 requires that pairs agree on the same surface normal. In some cases, the normal is measured in the frame of the second camera, so the corresponding relative rotation is applied to the normal. The solutions (e.g., four solutions) given by the homography decomposition are thereby reduced to a single solution of a relative rotation and translation between two agreeing cameras.
If, by the end of the process of eliminating solutions, there exist relative geometries with more than one solution, those relative geometries did not agree with any other relative geometry on the plane normal. Such results are considered outliers and are removed by the example outlier remover 306, which sends the remaining solutions to the example minimization solver 308.
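A simplified sketch of this normal-agreement test is given below. For brevity it compares only edges that share their first camera (so the compared normals are already expressed in the same frame), whereas the disclosure describes selecting triplets of connected vertices and applying the relative rotation to normals measured in the second camera's frame; the dictionary layout and the angle threshold are illustrative.

```python
import numpy as np

def select_consistent_candidates(edge_candidates, angle_thresh_deg=5.0):
    """Reduce each edge's homography-decomposition candidates to one solution.

    edge_candidates: dict mapping (i, j) -> list of (R, t, n) candidates, with the
    plane normal n expressed in camera i's frame.  For every two edges that share
    their first camera i, the candidate whose normal agrees best with a neighbor
    is kept; edges whose best agreement exceeds the threshold are outliers.
    """
    cos_thresh = np.cos(np.deg2rad(angle_thresh_deg))
    selected, outliers = {}, set()
    edges = list(edge_candidates)
    for e1 in edges:
        best_cos, best_cand = -1.0, None
        for e2 in edges:
            if e1 == e2 or e1[0] != e2[0]:
                continue  # only compare edges that share the same first camera
            for cand1 in edge_candidates[e1]:
                for cand2 in edge_candidates[e2]:
                    c = float(np.dot(cand1[2], cand2[2]))
                    if c > best_cos:
                        best_cos, best_cand = c, cand1
        if best_cand is not None and best_cos >= cos_thresh:
            selected[e1] = best_cand
        else:
            outliers.add(e1)
    return selected, outliers
```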
The example minimization solver 308 solves for the camera orientation. To solve camera orientations, the example minimization solver 308 solves for the absolute rotations in a coordinate system aligned with the first camera (or any other camera). Quaternions are used to represent rotations due to their increased numerical stability compared to other rotation representations such as SO(3) matrices. However, the quaternion representation has a global sign ambiguity. To overcome the sign ambiguity, a rough initial guess is used for the rotation, obtained by a Hamiltonian path. A Hamiltonian path is a path that visits every vertex once with no repeats. However, the Hamiltonian path does not need to start and end at the same vertex. Starting from the first camera (with rotation set to identity (e.g., (0,0,0))), the edges of the graph are traversed until the path has visited all vertices of the graph. During the process, the global rotation of each example camera visited is set by applying the relevant relative rotation. The result is a rough estimation of the global rotation of each camera, represented as a quaternion q̂i. Since most of the data is not included in the Hamiltonian path, this initial guess is considered rough and noisy.
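Such a rough initialization may be sketched as follows. In this sketch, a depth-first traversal of the camera graph stands in for the Hamiltonian path, since the underlying idea of chaining relative rotations outward from the first camera is the same; the rotation convention noted in the comments and the assumption of a connected camera graph are implementation choices.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def rough_global_rotations(n_cams, rel_rot):
    """Rough initial guess for global rotations by chaining relative rotations.

    rel_rot: dict mapping (i, j) -> 3x3 relative rotation from camera j to camera i.
    Convention assumed here: R_ij maps camera j's frame to camera i's frame, so the
    global rotations satisfy R_i = R_ij @ R_j.  Assumes the camera graph satisfies
    the connectivity condition.  Returns quaternions in (w, x, y, z) order with
    camera 0 fixed to the identity.
    """
    adj = {}
    for (i, j) in rel_rot:
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)
    rot = {0: np.eye(3)}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in adj.get(i, []):
            if j in rot:
                continue
            R_ij = rel_rot[(i, j)] if (i, j) in rel_rot else rel_rot[(j, i)].T
            rot[j] = R_ij.T @ rot[i]  # R_j = R_ij^T R_i
            stack.append(j)
    quats = []
    for i in range(n_cams):
        q_xyzw = R.from_matrix(rot[i]).as_quat()  # scipy returns (x, y, z, w)
        quats.append(np.roll(q_xyzw, 1))          # store as (w, x, y, z)
    return np.array(quats)
```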
Phase one to solve camera orientations involves converting the relative rotations to a quaternion representation. The sign ambiguity is set using the initial guess. The relative rotation from camera j to camera i is marked as qij, and the relative rotation estimated from the initial guess is q̂ij = q̂i·q̂j⁻¹. The sign that is closer in the sense of the quaternion norm is chosen as arg min(|q̂ij − qij|, |q̂ij + qij|), resulting in a consistent relative rotation system in a quaternion representation qij.
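A sketch of this sign-consistency step is given below, assuming quaternions stored in (w, x, y, z) order and relative rotations supplied as 3x3 matrices; the helper names are illustrative.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def quat_mul(q, p):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def consistent_relative_quaternions(rel_rot, q_init):
    """Fix the sign of each relative quaternion using the rough initial guess.

    rel_rot: dict (i, j) -> 3x3 relative rotation from camera j to camera i.
    q_init:  (n, 4) rough global quaternions in (w, x, y, z) order.
    The predicted relative quaternion q_hat_ij = q_hat_i * q_hat_j^{-1} is compared
    against +/- q_ij and the sign that is closer in quaternion norm is kept.
    """
    q_rel = {}
    for (i, j), R_ij in rel_rot.items():
        q_ij = np.roll(R.from_matrix(R_ij).as_quat(), 1)          # to (w, x, y, z)
        q_j_inv = q_init[j] * np.array([1.0, -1.0, -1.0, -1.0])    # unit-quaternion inverse
        q_hat = quat_mul(q_init[i], q_j_inv)
        if np.linalg.norm(q_hat - q_ij) > np.linalg.norm(q_hat + q_ij):
            q_ij = -q_ij
        q_rel[(i, j)] = q_ij
    return q_rel
```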
Phase two to solve camera orientations involves solving a minimization problem, using the consistent relative quaternions, to find the three degrees of freedom of all the camera rotations represented by quaternions and obtain a more accurate and less noisy initial guess: min ΣijϵE |qi − qij·qj|². In general, quaternions that represent rotations are of unit length. To solve the minimization problem, the quaternion norm equation (e.g., min ΣijϵE |qi − qij·qj|²) is solved such that |qi| = 1 for all i. However, the quaternion norm equation is difficult to solve without an initial guess. Therefore, an initial guess is generated by solving a simplified quaternion norm equation (e.g., min ΣijϵE |qi − qij·qj|²) such that |qi| = constant value, through an SVD decomposition technique.
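One possible realization of this SVD-based initial guess is sketched below. The simplified problem is written as min |Ax|² with the overall norm of the stacked quaternion vector held constant, so the minimizer is the right singular vector associated with the smallest singular value; building A from quaternion left-multiplication matrices is an assumption about how the relaxation is implemented.

```python
import numpy as np

def quat_left_matrix(q):
    """Left-multiplication matrix L(q), q = (w, x, y, z), so that L(q) @ p = q * p."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def initial_global_quaternions(n_cams, q_rel):
    """Initial guess for the simplified norm problem min sum |q_i - q_ij * q_j|^2.

    q_rel: dict (i, j) -> sign-consistent relative quaternion (w, x, y, z).
    Each edge contributes the block row [... I ... -L(q_ij) ...]; with the overall
    norm of the stacked vector fixed, the minimizer of |A x|^2 is the right singular
    vector of A with the smallest singular value.  Assumes at least as many edge
    constraints as cameras so that all singular vectors are computed.
    """
    rows = []
    for (i, j), q_ij in q_rel.items():
        block = np.zeros((4, 4 * n_cams))
        block[:, 4 * i:4 * i + 4] = np.eye(4)
        block[:, 4 * j:4 * j + 4] = -quat_left_matrix(q_ij)
        rows.append(block)
    A = np.vstack(rows)
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    q = vt[-1].reshape(n_cams, 4)                   # defined up to scale and sign
    q /= np.linalg.norm(q, axis=1, keepdims=True)   # re-normalize each quaternion
    return q
```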
Phase three to solve camera orientations is a variant used to solve the minimization problem to find the three degrees of freedom of all the camera rotations represented by quaternions: min ΣijϵE ρ(|qi − qij·qj|²). Again, the norm constraint is added, so min ΣijϵE ρ(|qi − qij·qj|²) is solved such that |qi| = 1 for all i. This non-convex optimization uses a quaternion parametrization and a loss function ρ, such as a soft L1 loss. The loss function is crucial in decreasing the effect of outliers. The solution is defined up to a global rotation, so the first (or any other) rotation is fixed to be identity to achieve a unique global minimum. The result is a global rotation per camera, qi, measured from the first camera.
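A possible realization of this robust refinement, using scipy.optimize.least_squares with a soft L1 loss, is sketched below. Re-normalizing the quaternions inside the residual (to enforce the unit-norm constraint softly) and fixing the first camera to the identity are implementation choices consistent with, but not mandated by, the description above.

```python
import numpy as np
from scipy.optimize import least_squares

def quat_mul(q, p):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def refine_global_quaternions(q_init, q_rel):
    """Robust refinement: min sum rho(|q_i - q_ij * q_j|^2) with a soft L1 loss.

    q_init: (n, 4) initial global quaternions (w, x, y, z); assumed to already be
            expressed with camera 0 near the identity (e.g., premultiplied by q_0^-1).
    q_rel:  dict (i, j) -> sign-consistent relative quaternion (w, x, y, z).
    Camera 0 is held fixed to remove the global-rotation gauge freedom.
    """
    edges = list(q_rel.items())
    n = len(q_init)

    def unpack(x):
        q = np.vstack([np.array([1.0, 0.0, 0.0, 0.0]), x.reshape(n - 1, 4)])
        return q / np.linalg.norm(q, axis=1, keepdims=True)  # soft unit constraint

    def residuals(x):
        q = unpack(x)
        return np.concatenate([q[i] - quat_mul(q_ij, q[j]) for (i, j), q_ij in edges])

    x0 = np.asarray(q_init[1:], dtype=float).ravel()
    sol = least_squares(residuals, x0, loss='soft_l1')
    return unpack(sol.x)  # global rotation per camera, measured from camera 0
```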
To solve for camera locations, the absolute translations are found in a coordinate system where the location of the first (or any other) camera is (0,0,0) and the distance to the second camera (or any other) is 1 in arbitrary units.
After the global rotations are known, the relative translations are attained by decoupling the global rotations from the relative translations: tij ~ (ci − cj). It is necessary to find the camera centers ci. Therefore, an optimization problem is solved; for example, the following energy is devised: ΣijϵE ρ(|t̂ij × (ci − cj)|²), where t̂ij is the relative translation with unit length, and ρ is a loss function such as a soft L1 loss.
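For example, the camera-center energy may be minimized as sketched below, again with scipy.optimize.least_squares and a soft L1 loss. Pinning the first camera to the origin and softly enforcing a unit distance for the second camera implement the gauge fixing described above; the random initialization and the scale weight are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def solve_camera_centers(n_cams, t_hat, scale_weight=10.0):
    """Minimize sum rho(|t_hat_ij x (c_i - c_j)|^2) for the camera centers c_i.

    t_hat: dict (i, j) -> unit-length direction of (c_i - c_j) in the global frame,
           i.e. the relative translation after the global rotation has been decoupled.
    Camera 0 is pinned to the origin and an extra residual softly enforces |c_1| = 1
    (arbitrary units) so the trivial all-zero solution is excluded.
    """
    edges = list(t_hat.items())

    def residuals(x):
        c = np.vstack([np.zeros(3), x.reshape(n_cams - 1, 3)])  # c_0 = (0, 0, 0)
        res = [np.cross(t, c[i] - c[j]) for (i, j), t in edges]
        res.append([scale_weight * (np.linalg.norm(c[1]) - 1.0)])  # |c_1| = 1
        return np.concatenate(res)

    rng = np.random.default_rng(0)
    x0 = rng.normal(size=3 * (n_cams - 1))  # random start; a better init may be used
    sol = least_squares(residuals, x0, loss='soft_l1')
    return np.vstack([np.zeros(3), sol.x.reshape(n_cams - 1, 3)])
```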
While an example manner of implementing the position and orientation calculator 110 of
Flowcharts representative of example hardware logic, machine readable instructions, hardware implemented state machines, and/or any combination thereof for implementing the image processor 106 are shown in
The machine readable instructions described herein may be stored in one or more of a compressed format, an encrypted format, a fragmented format, a compiled format, an executable format, a packaged format, etc. Machine readable instructions as described herein may be stored as data or a data structure (e.g., portions of instructions, code, representations of code, etc.) that may be utilized to create, manufacture, and/or produce machine executable instructions. For example, the machine readable instructions may be fragmented and stored on one or more storage devices and/or computing devices (e.g., servers) located at the same or different locations of a network or collection of networks (e.g., in the cloud, in edge devices, etc.). The machine readable instructions may require one or more of installation, modification, adaptation, updating, combining, supplementing, configuring, decryption, decompression, unpacking, distribution, reassignment, compilation, etc. in order to make them directly readable, interpretable, and/or executable by a computing device and/or other machine. For example, the machine readable instructions may be stored in multiple parts, which are individually compressed, encrypted, and stored on separate computing devices, wherein the parts when decrypted, decompressed, and combined form a set of executable instructions that implement one or more functions that may together form a program such as that described herein.
In another example, the machine readable instructions may be stored in a state in which they may be read by processor circuitry, but require addition of a library (e.g., a dynamic link library (DLL)), a software development kit (SDK), an application programming interface (API), etc. in order to execute the instructions on a particular computing device or other device. In another example, the machine readable instructions may need to be configured (e.g., settings stored, data input, network addresses recorded, etc.) before the machine readable instructions and/or the corresponding program(s) can be executed in whole or in part. Thus, machine readable media, as used herein, may include machine readable instructions and/or program(s) regardless of the particular format or state of the machine readable instructions and/or program(s) when stored or otherwise at rest or in transit.
The machine readable instructions described herein can be represented by any past, present, or future instruction language, scripting language, programming language, etc. For example, the machine readable instructions may be represented using any of the following languages: C, C++, Java, C#, Perl, Python, JavaScript, HyperText Markup Language (HTML), Structured Query Language (SQL), Swift, etc.
As mentioned above, the example processes of
“Including” and “comprising” (and all forms and tenses thereof) are used herein to be open ended terms. Thus, whenever a claim employs any form of “include” or “comprise” (e.g., comprises, includes, comprising, including, having, etc.) as a preamble or within a claim recitation of any kind, it is to be understood that additional elements, terms, etc. may be present without falling outside the scope of the corresponding claim or recitation. As used herein, when the phrase “at least” is used as the transition term in, for example, a preamble of a claim, it is open-ended in the same manner as the term “comprising” and “including” are open ended. The term “and/or” when used, for example, in a form such as A, B, and/or C refers to any combination or subset of A, B, C such as (1) A alone, (2) B alone, (3) C alone, (4) A with B, (5) A with C, (6) B with C, and (7) A with B and with C. As used herein in the context of describing structures, components, items, objects and/or things, the phrase “at least one of A and B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. Similarly, as used herein in the context of describing structures, components, items, objects and/or things, the phrase “at least one of A or B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. As used herein in the context of describing the performance or execution of processes, instructions, actions, activities and/or steps, the phrase “at least one of A and B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. Similarly, as used herein in the context of describing the performance or execution of processes, instructions, actions, activities and/or steps, the phrase “at least one of A or B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B.
As used herein, singular references (e.g., “a”, “an”, “first”, “second”, etc.) do not exclude a plurality. The term “a” or “an” entity, as used herein, refers to one or more of that entity. The terms “a” (or “an”), “one or more”, and “at least one” can be used interchangeably herein. Furthermore, although individually listed, a plurality of means, elements or method actions may be implemented by, e.g., a single unit or processor. Additionally, although individual features may be included in different examples or claims, these may possibly be combined, and the inclusion in different examples or claims does not imply that a combination of features is not feasible and/or advantageous.
There are many applications that require precise calibration of multiple cameras, for example autonomous driving, robot navigation and interaction with the surroundings, and full 3D reconstruction for creating free dimensional videos. One of the main tasks required for rendering a scene from a virtual camera is to obtain the exact extrinsic calibration parameters of each camera. Once the cameras' focal lengths are known, the positions and orientations can be obtained automatically, saving manpower and achieving much better accuracy. Furthermore, this method can be applied while in use, for example while a game is being broadcast, instead of previous methods that compute calibration between games. This further gives the ability to broadcast multiple events simultaneously.
At block 402, the camera orientation calibrator 108 receives inputs from multiple cameras with known focal lengths (in pixels) pointed at a plane surface. An example of multiple-camera calibration without an initial guess involves first grabbing a synchronized shot from all the cameras in the scene, where the focal lengths are approximately known.
Theoretically, each camera image must contain at least 4 features (to construct a homography with another camera), but for numerical stability, at least 8 features per camera image are demanded. Features from different cameras agree (matching features) if they are looking at the same three-dimensional point. In some examples, a pair of cameras agree if there are at least 8 features that agree. In other examples, a pair of cameras agree by proxy if there exists a path of agreeing cameras between the two cameras. The matches must satisfy the connectivity condition: all (or a majority of) pairs of cameras must agree by proxy. The feature extraction and matching can be obtained by manual labeling, an automated method of feature detection and matching, or any other method.
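For illustration, the feature matching and the connectivity check may be sketched as follows using OpenCV SIFT features with Lowe's ratio test and a connected-components test on the agreement graph; the particular detector, the ratio threshold, and the minimum match count are implementation choices.

```python
import numpy as np
import cv2
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def match_pair(img_a, img_b, ratio=0.75, min_matches=8):
    """SIFT matching with Lowe's ratio test; returns matched pixel coordinates or None."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des_a, des_b, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < min_matches:          # demand at least 8 matches per pair
        return None
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in good])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in good])
    return pts_a, pts_b

def satisfies_connectivity(n_cams, agreeing_pairs):
    """Connectivity condition: every camera agrees with every other, directly or by
    proxy, i.e. the agreement graph forms a single connected component."""
    rows = [i for i, _ in agreeing_pairs]
    cols = [j for _, j in agreeing_pairs]
    adj = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n_cams, n_cams))
    n_components, _ = connected_components(adj, directed=False)
    return n_components == 1
```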
At block 404, the camera orientation calibrator 108 checks for feature matches between the camera images.
At block 406, the position and orientation calculator 110 calculates homographies between camera pairs.
At block 408, the position and orientation calculator 110 decomposes each homography into a relative rotation, relative translation, and surface normal.
At block 410, the position and orientation calculator 110 removes outliers by requiring the same surface normal.
At block 412, the position and orientation calculator 110 solves a minimization problem to find absolute rotations knowing relative ones.
At block 414, the position and orientation calculator 110 solves a minimization problem to find absolute translations knowing relative ones. The output of the position and orientation calculator 110 goes to the virtual image generator 112, which generates a virtual image that can be sent to the AR/VR/MR set 114.
At block 504, the example homography calculator 302 obtains the relative rotation for each camera from the example camera orientation calibrator 108.
At block 506, the example position and orientation calculator 110 converts the relative rotation to a quaternion for each camera and removes outliers with the example outlier remover 306.
At block 508, the example minimization solver 308 solves the minimization problem for each camera.
At block 510, the example position and orientation calculator 110 outputs the global rotation for each camera.
At block 512, the position and orientation calculator 110 outputs 3 degrees of freedom for each camera rotation, represented by quaternions. Camera orientations are used to map cameras and calibrate multiple cameras without an initial guess. The example machine-readable instructions 500 end.
At block 604, the relative rotation, relative translation, surface normal decomposer 304 finds the absolute translation in a coordinate system from an example first camera to an example second camera in arbitrary units.
At block 606, the relative rotation, relative translation, surface normal decomposer 304 uses the global rotation and absolute translation to attain the relative translation, and uses the outlier remover 306 to remove outliers.
At block 608, the example position and orientation calculator 110 solves an optimization problem to find camera centers. Finding camera centers is used for camera locations, which are mapped to calibrate multiple cameras without an initial guess. The example instructions end.
For each edge (e.g., a pair of cameras with at least 8 matches), the homography between the matched points on the image plane is solved for. Because all the cameras are looking at the same plane and the transformation is between image plane coordinates, the homography can be written as: H = γ(R + t·nᵀ), where γ is an arbitrary non-zero number, R is the relative rotation from the first camera to the second one, t is the relative camera displacement from the first camera to the second camera in the rotated space, scaled proportionally, and n is the plane normal in the frame system of the first camera.
The processor platform 800 of the illustrated example includes a processor 812. The processor 812 of the illustrated example is hardware. For example, the processor 812 can be implemented by one or more integrated circuits, logic circuits, microprocessors, GPUs, DSPs, or controllers from any desired family or manufacturer. The hardware processor may be a semiconductor based (e.g., silicon based) device. In this example, the processor 812 implements the example camera orientation calibrator 108, the example position and orientation calculator 110, and the example virtual image generator 112.
The processor 812 of the illustrated example includes a local memory 813 (e.g., a cache). The processor 812 of the illustrated example is in communication with a main memory including a volatile memory 814 and a non-volatile memory 816 via a bus 818. The volatile memory 814 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS® Dynamic Random Access Memory (RDRAM®) and/or any other type of random access memory device. The non-volatile memory 816 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 814, 816 is controlled by a memory controller.
The processor platform 800 of the illustrated example also includes an interface circuit 820. The interface circuit 820 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), a Bluetooth® interface, a near field communication (NFC) interface, and/or a PCI express interface.
In the illustrated example, one or more input devices 822 are connected to the interface circuit 820. The input device(s) 822 permit(s) a user to enter data and/or commands into the processor 812. The input device(s) 822 can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, an isopoint device, and/or a voice recognition system.
One or more output devices 824 are also connected to the interface circuit 820 of the illustrated example. The output devices 824 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display (LCD), a cathode ray tube display (CRT), an in-place switching (IPS) display, a touchscreen, etc.), a tactile output device, a printer and/or speaker. The interface circuit 820 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip and/or a graphics driver processor.
The interface circuit 820 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem, a residential gateway, a wireless access point, and/or a network interface to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 826. The communication can be via, for example, an Ethernet connection, a digital subscriber line (DSL) connection, a telephone line connection, a coaxial cable system, a satellite system, a line-of-site wireless system, a cellular telephone system, etc.
The processor platform 800 of the illustrated example also includes one or more mass storage devices 828 for storing software and/or data. Examples of such mass storage devices 828 include floppy disk drives, hard drive disks, compact disk drives, Blu-ray disk drives, redundant array of independent disks (RAID) systems, and digital versatile disk (DVD) drives.
The machine executable instructions 832 of
A block diagram illustrating an example software distribution platform 905 to distribute software such as the example computer readable instructions 832 of
From the foregoing, it will be appreciated that example methods, apparatus and articles of manufacture have been disclosed that calibrate multiple-cameras without an initial guess. The disclosed methods, apparatus and articles of manufacture improve the efficiency of using a computing device by providing a solution to using a planar environment and/or a cold start.
The methods and apparatus to solve multiple-camera calibration without an initial guess, as described herein, may be used in many different ways and may be incorporated into many different use cases. For example, the teachings of this disclosure have many applications in planar environments and cold start environments. A planar environment is a situation in which most of the scene being observed by multiple cameras is composed from at least one plane. In cases of planar environments, some multiple view algorithms are not feasibly used in such vision systems. A cold start environment is a situation in which sensors are not stationary or do not include auxiliary devices for localization such as inertial navigation system (INS). As such, the system does not have an initial guess for multiple-camera calibration.
The methods and apparatus to solve multiple-camera calibration without an initial guess, as described herein, cost less and do not suffer from drift caused by poor indoor GPS reception, as can occur with a sensor-based approach to multiple-camera calibration such as INS.
The methods and apparatus to solve multiple-camera calibration without an initial guess, as described herein, allow for the addition of cameras to the system and are more robust and stable than an additive bundle adjustment method.
The disclosed methods, apparatus, and articles of manufacture automatically obtain the positions and orientations, saving manpower and obtaining much better accuracy. Furthermore, this method can be applied while in use, for example while a game is being broadcast, instead of previous methods that compute calibration between games. This further gives the ability to broadcast multiple events simultaneously and without an initial guess. The disclosed methods, apparatus, and articles of manufacture are accordingly directed to one or more improvement(s) in the functioning of a computer.
Example methods, apparatus, systems, and articles of manufacture to perform multiple camera calibration are disclosed herein. Further examples and combinations thereof include the following:
Example 1 includes a method to perform camera calibration, the method comprising calculating a homography for a camera pair, the homography created using common image plane coordinate matches of identified features and identified focal lengths of cameras of the camera pair, decomposing the homography to identify relative translations, relative rotations, and surface normals, solving a minimization equation for camera centers of the cameras, and calibrating the cameras using cameras positions and orientations.
Example 2 includes the method of example 1, wherein the method further includes removing other cameras that do not have the same surface normal as the other cameras.
Example 3 includes the method of example 1, wherein the method further includes setting a first camera as point (0,0,0) in a coordinate plane.
Example 4 includes the method of example 1, wherein the method further includes generating an image combining feeds of the cameras to create an immersive reality.
Example 5 includes the method of example 1, wherein the method further includes utilizing a quaternion to represent the relative rotation and an absolute rotation in a coordinate system.
Example 6 includes the method of example 5, wherein the method further includes utilizing an initial guess for the rotation obtained by a Hamiltonian path to overcome a sign ambiguity of the quaternion.
Example 7 includes the method of example 1, wherein the method further includes solving an optimization problem to find the camera centers using relative translations, the relative translations calculated by decoupling global rotations.
Example 8 includes an apparatus comprising a homography calculator to calculate a homography for a camera pair, the homography created using common image plane coordinate matches of identified features and identified focal lengths of cameras of the camera pair, a relative rotation, relative translation, surface normal decomposer to decompose the homography to identify relative translations, relative rotations, and surface normals, a minimization solver to solve a minimization equation for camera centers of the cameras, and a position and orientation calculator to calibrate the cameras using cameras positions and orientations.
Example 9 includes the apparatus of example 8, wherein the position and orientation calculator is to remove other cameras that do not have the same surface normal as the other cameras.
Example 10 includes the apparatus of example 8, further including a camera orientation calibrator to set a first camera as point (0,0,0) in a coordinate plane.
Example 11 includes the apparatus of example 8, further including a virtual image generator to generate an image combining feeds of the cameras to create an immersive reality.
Example 12 includes the apparatus of example 8, wherein the minimization solver is to utilize a quaternion to represent the relative rotation and an absolute rotation in a coordinate system.
Example 13 includes the apparatus of example 12, wherein the minimization solver is to utilize an initial guess for the rotation obtained by a Hamiltonian path to overcome a sign ambiguity of the quaternion.
Example 14 includes the apparatus of example 8, wherein the minimization solver is to solve an optimization problem to find the camera centers using relative translations, the relative translations calculated by decoupling global rotations.
Example 15 includes a non-transitory computer readable medium comprising instructions that, when executed cause a machine to at least calculate a homography for a camera pair, the homography created using common image plane coordinate matches of identified features and identified focal lengths of cameras of the camera pair, decompose the homography to identify relative translations, relative rotations, and surface normals, solve a minimization equation for camera centers of the cameras, and calibrate the cameras using cameras positions and orientations.
Example 16 includes the non-transitory computer readable medium of example 15, wherein the instructions, when executed, cause the machine to remove other cameras that do not have the same surface normal as the other cameras.
Example 17 includes the non-transitory computer readable medium of example 15, wherein the instructions, when executed, cause the machine to set a first camera as point (0,0,0) in a coordinate plane.
Example 18 includes the non-transitory computer readable medium of example 15, wherein the instructions, when executed, cause the machine to generate an image combining feeds of the cameras to create an immersive reality.
Example 19 includes the non-transitory computer readable medium of example 15, wherein the instructions, when executed, cause the machine to utilize a quaternion to represent the relative rotation and an absolute rotation in a coordinate system.
Example 20 includes the non-transitory computer readable medium of example 19, wherein the instructions, when executed, cause the machine to utilize an initial guess for the rotation obtained by a Hamiltonian path to overcome a sign ambiguity of the quaternion.
Example 21 includes the non-transitory computer readable medium of example 15, wherein the instructions, when executed, cause the machine to solve an optimization problem to find the camera centers using relative translations, the relative translations calculated by decoupling global rotations.
Although certain example methods, apparatus and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.
The following claims are hereby incorporated into this Detailed Description by this reference, with each claim standing on its own as a separate embodiment of the present disclosure.