The present application relates to the technical field of image processing, and in particular to a method, apparatus, device, storage medium and product for generating a binocular stereoscopic panoramic image.
With the development of technologies such as Virtual Reality (VR), users' expectations for image quality keep increasing. Electronic devices such as VR glasses can show binocular stereoscopic panoramic images or videos to the user by displaying the left-eye and right-eye images on the left and right eye screens respectively. The differences between the two displayed images allow the user to perceive a sense of three-dimensionality.
In a traditional method, the electronic device simultaneously photographs the same object through multiple lenses, stitches the images captured by the multiple lenses into a left-eye panoramic image and a right-eye panoramic image, and combines them to obtain a binocular stereoscopic panoramic image. However, this method generally requires professional-grade multi-lens panoramic photographing equipment, which is complicated to operate and expensive.
At present, ordinary non-stereoscopic panoramic image/video photographing equipment is very common, simple to operate, and low-cost. There is therefore an urgent need for a simple and fast method to generate a stereoscopic panoramic image/video directly from an ordinary non-stereoscopic panoramic image/video.
In response to the above technical problem, among others, some embodiments of the present disclosure provide a method, apparatus, device, storage medium and product for generating a stereoscopic panoramic image/video directly from an ordinary non-stereoscopic panoramic image/video.
In one aspect, one embodiment of the present disclosure provides a method for generating a binocular stereoscopic panoramic image. The method may comprise inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In another aspect, another embodiment of the present disclosure provides a device for generating a binocular stereoscopic panoramic image. The device may include obtaining circuitry to input a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping circuitry to map the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating circuitry to generate a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In another aspect, another embodiment of the present disclosure provides an electronic device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements steps of: inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In another aspect, another embodiment of the present disclosure provides a computer-readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements steps of: inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In some embodiments, since the electronic device can acquire the depth image of the panoramic image, the panoramic image can be mapped into the left-eye panoramic image and the right-eye panoramic image according to the depth image and the preset pupil distance to obtain the binocular stereoscopic panoramic image, which makes it possible for the electronic device to complete the mapping conversion between the panoramic image and the binocular stereoscopic panoramic image. Accordingly, there is no need to acquire the binocular stereoscopic panoramic image with specialized multi-lens panoramic photographing equipment, thereby reducing the cost of the electronic device and simplifying operation.
It should be understood that the above general description and the detailed description that follows are exemplary and explanatory only and do not limit the present application.
In order to explain the technical features of embodiments of the present disclosure more clearly, the drawings used in the present disclosure are briefly introduced as follows. Obviously, the drawings in the following description are only some exemplary embodiments of the present disclosure. A person of ordinary skill in the art may obtain other drawings and features based on these disclosed drawings without inventive effort.
In order to make the purpose, technical solutions and advantages of the present application more clearly understood, the following is a further detailed description of the present application in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only for the purpose of explaining the present application and are not intended to limit the present application.
A method for generating a binocular stereoscopic panoramic image provided in one embodiment of the present application can be applied to an electronic device, and the electronic device can process a panoramic image to obtain a binocular stereoscopic panoramic image corresponding to the panoramic image. The aforementioned electronic device may be, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. The above electronic device may also be an imaging device such as a camera, a camcorder, and the like; the above camera may be, but is not limited to, an ordinary camera, a pocket camera, a shake-proof camera, a Virtual Reality (VR) panoramic camera, a sports camera, or a consumer-grade or professional-grade panoramic camera, and the like.
In one embodiment, as shown in
S101: inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image; the depth image includes depth information corresponding to each pixel point in the panoramic image.
The panoramic image may be an image acquired by the electronic device via photographing, or may be an image stored in the electronic device, without limitation herein. The panoramic image acquired via photographing may be an image captured by the electronic device by means of a panoramic camera, or may be an image frame in a video captured by the electronic device, without limitation herein. When the above panoramic image is an image stored in the electronic device, it may be stored in an image format, or it may be a video frame in a stored video. For example, the above electronic device may be VR glasses, and the above panoramic image may be a panoramic image input to the VR glasses for playback.
The camera of the aforementioned electronic device may be a dual-fisheye panoramic camera, wherein the electronic device may capture the panoramic image in such a way that every direction is covered by the field of view of at least one of the lenses in the dual-fisheye panoramic camera. The electronic device may stitch the images captured by the different lenses to obtain a panoramic image.
The above depth estimation model may be a neural network model, and the above depth estimation model may be used to extract depth information of each pixel point in the panoramic image, and generate a depth image corresponding to the panoramic image based on the depth information corresponding to each pixel point. Wherein the above depth information is a distance between an object represented by a pixel point in the image and a center of the camera when the panoramic image is taken.
The electronic device may input the panoramic image into the depth estimation model described above, or may pre-process the panoramic image before inputting it into the depth estimation model, without limitation herein. For example, the pre-processing operation of the electronic device on the panoramic image may include down-sampling the panoramic image, changing a projection method of the panoramic image, changing brightness or contrast of the panoramic image, and/or converting the panoramic image into a single-channel grayscale map. The above depth estimation model may output a depth image corresponding to the panoramic image, and a dimension of the above depth image may be equal to that of the panoramic image or may be smaller than that of the panoramic image, without limitation herein.
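By way of a non-limiting illustration only, the pre-processing and depth inference described above may be sketched in Python as follows; the down-sampled resolution, the use of OpenCV and PyTorch, and the model interface are assumptions for illustration, not a prescribed implementation:

```python
import cv2
import numpy as np
import torch


def estimate_depth(pano_bgr: np.ndarray, model: torch.nn.Module) -> np.ndarray:
    """Illustrative pre-processing and depth inference for a panoramic image."""
    # Optional pre-processing: down-sample the panorama before inference.
    small = cv2.resize(pano_bgr, (1024, 512), interpolation=cv2.INTER_AREA)
    # Normalize to [0, 1] and reorder to NCHW for the network.
    x = torch.from_numpy(small.astype(np.float32) / 255.0).permute(2, 0, 1).unsqueeze(0)
    with torch.no_grad():
        depth = model(x)  # assumed to return a (1, 1, H, W) depth map
    d = depth.squeeze().cpu().numpy()
    # The depth image may be smaller than the panorama; up-sample it back if needed.
    return cv2.resize(d, (pano_bgr.shape[1], pano_bgr.shape[0]), interpolation=cv2.INTER_LINEAR)
```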
The above depth image and the above panoramic image may use the same panoramic projection, and the above panoramic projection may be a spherical projection or an equidistant cylindrical (equirectangular) projection, without limitation herein.
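For reference, under the equidistant cylindrical (equirectangular) projection, pixel indices and spherical coordinates may be related as in the following non-limiting sketch; the exact index-to-angle convention is an assumption:

```python
import numpy as np


def pixel_to_sphere(col: np.ndarray, row: np.ndarray, w: int, h: int):
    """Map equirectangular pixel indices to (longitude, latitude) in radians."""
    phi = col / w * 2.0 * np.pi        # longitude in [0, 2*pi)
    theta = (row / h - 0.5) * np.pi    # latitude in [-pi/2, pi/2)
    return phi, theta
```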
S102, mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance as well as the depth image.
On the basis of obtaining the depth image, the electronic device may map the panoramic image into a left-eye panoramic image and a right-eye panoramic image, such that the parallax generated by the left-eye panoramic image and the right-eye panoramic image corresponds to the depth image.
When the user views the left-eye panoramic image with the left eye and the right-eye panoramic image with the right eye at the same time, there is a difference in the position of the same object as seen by the left eye and the right eye, which is known as parallax. The larger the parallax, the closer the user perceives the object to be; the smaller the parallax, the farther the user perceives the object to be.
The electronic device maps the panoramic image into a corresponding left-eye panoramic image and a right-eye panoramic image such that the distance perceived by the user through the parallax generated by the above-described binocular stereoscopic panoramic image corresponds to the above-described depth image. For example, suppose the above panoramic image includes an object A, and the depth information corresponding to the object A in the depth image obtained by the depth estimation model is H. After the electronic device maps the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on the above depth image, the user can perceive the distance of the object A from the user through the left-eye panoramic image and the right-eye panoramic image, and the perceived distance corresponds to the depth information H.
Specifically, the electronic device may use an omni-directional stereo (ODS) projection method to map the panoramic image into a left-eye panoramic image and a right-eye panoramic image.
S103, generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
Having obtained the above-described left-eye panoramic image and right-eye panoramic image, the electronic device can combine them into a binocular stereoscopic panoramic image.
In the above-described method for generating a binocular stereoscopic panoramic image, the electronic device inputs the panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image; then, according to the depth image and a preset pupil distance, the panoramic image is mapped into a left-eye panoramic image and a right-eye panoramic image; and a binocular stereoscopic panoramic image is generated based on the left-eye panoramic image and the right-eye panoramic image; wherein the above-described depth image includes depth information corresponding to each pixel point in the panoramic image. Since the electronic device can acquire the depth image of the panoramic image, the panoramic image can be mapped into the left-eye panoramic image and the right-eye panoramic image according to the depth image and the preset pupil distance to obtain the binocular stereoscopic panoramic image, which makes it possible for the electronic device to complete the mapping conversion between the panoramic image and the binocular stereoscopic panoramic image. Accordingly, there is no need to acquire the binocular stereoscopic panoramic image with specialized multi-lens panoramic photographing equipment, thereby reducing the cost of the electronic device and simplifying operation.
S201, according to the preset pupil distance as well as the depth image, obtaining a left-eye mapping relationship and a right-eye mapping relationship; the left-eye mapping relationship includes a correspondence between a first coordinate of a pixel point in the panoramic image and a second coordinate of the pixel point in the panoramic screen of the left eye; and the right-eye mapping relationship includes a correspondence between the first coordinate and a third coordinate of the pixel point in the panoramic screen of the right eye.
Wherein, the above pupil distance may be used to characterize the distance between the pupil of the user's left eye and the pupil of the user's right eye. In one embodiment, a preset value of the pupil distance may be stored in the electronic device, and the preset value is employed to map the above panoramic image.
In another embodiment, the electronic device may adopt different pupil distances for different users. The electronic device may preset the correspondence between different user accounts and pupil distances, and the pupil distance in the above correspondence may be input by the user, selected by the user from a plurality of preset values, or extracted by the electronic device from an image of the user; no limitation is made herein with respect to the acquisition method of the above pupil distance. For example, when the user is using the electronic device, the image may be acquired through the electronic device or through a terminal, such as a cellular phone, connected to the electronic device, and the above image acquisition may take place during user registration or during login, without limitation herein.
In another embodiment, different types of electronic devices may correspond to different pupil distances. For example, the aforementioned electronic device may be VR glasses, a smart helmet, etc., and different pupil distances may be used for different electronic devices to meet the mapping needs of the binocular stereoscopic panoramic image of each electronic device.
On the basis of obtaining the pupil distance as well as the depth image, the electronic device can obtain the left-eye mapping relationship and the right-eye mapping relationship corresponding to the panoramic image based on an ODS mapping method.
The above mapping relationships are coordinate correspondences. For the pixel point in the panoramic image, it may be mapped to the left-eye panoramic screen and the right-eye panoramic screen respectively. The coordinate of the above pixel point in the panoramic image may be a first coordinate, the coordinate in the left-eye panoramic screen may be a second coordinate, and the coordinate in the right-eye panoramic screen may be a third coordinate, as shown in
S202, according to the left-eye mapping relationship and the right-eye mapping relationship, mapping and projecting the panoramic image separately to generate a left-eye panoramic image and a right-eye panoramic image.
After obtaining the above-described left-eye mapping relationship and the right-eye mapping relationship, the electronic device can determine to which position the pixel point in the panoramic image is mapped, and thus determine the coordinates of the respective pixel point in the left-eye panoramic image and the right-eye panoramic image. After matching each of the above-described second coordinates to a corresponding pixel value, the left-eye panoramic image is obtained. After matching each of the above-described third coordinates to a corresponding pixel value, the right-eye panoramic image is obtained.
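As a non-limiting sketch of this mapping-and-projection step, assuming an equirectangular layout and the per-eye longitude offset of half the pupil-distance-to-depth ratio detailed in the embodiments below, each eye's panorama may be generated by an inverse lookup; evaluating the depth at the destination pixel rather than the source pixel is a simplifying assumption:

```python
import numpy as np


def render_eye(pano: np.ndarray, depth: np.ndarray, p: float, sign: float) -> np.ndarray:
    """Warp an equirectangular panorama for one eye (sign=+1.0 left, -1.0 right)."""
    h, w = pano.shape[:2]
    # Longitude of every destination column, in radians [0, 2*pi).
    lon = (np.arange(w, dtype=np.float32) / w) * 2.0 * np.pi
    # Invert the per-eye offset phi +/- p / (2 * D) to find the source longitude;
    # guard against zero depth values.
    src_lon = lon[None, :] - sign * p / (2.0 * np.maximum(depth.astype(np.float32), 1e-6))
    # Nearest-neighbour column lookup, wrapping around the 360-degree seam.
    src_col = np.round(src_lon / (2.0 * np.pi) * w).astype(np.int64) % w
    rows = np.arange(h)[:, None]  # the latitude (row) is unchanged
    return pano[rows, src_col]
```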
In the above-described method for generating a binocular stereoscopic panoramic image, the electronic device obtains a left-eye mapping relationship and a right-eye mapping relationship on the basis of a pupil distance and a depth image, and can thus accurately map the panoramic image into a binocular stereoscopic panoramic image, so that the binocular stereoscopic panoramic image presents a stereoscopic effect that corresponds to the depth information of the panoramic image.
S301, obtaining the second coordinate and the third coordinate based on the depth information, the preset pupil distance, and the first coordinate.
The above first coordinate, second coordinate, and third coordinate may be spherical coordinates or three-dimensional Cartesian coordinates, without limitation herein. The electronic device may perform coordinate mapping according to a predetermined formula to calculate the second and third coordinates corresponding to each first coordinate.
In one embodiment, the individual pixel points in the above panoramic image and the above depth image may be represented using spherical coordinates; that is, the coordinate of each pixel point may comprise a longitude coordinate and a latitude coordinate.
The above predetermined formula may include a longitude coordinate calculation formula and a latitude coordinate calculation formula. Wherein, the longitude coordinates in the above second coordinate and third coordinate may be related to the depth information, the preset pupil distance, and the longitude coordinate in the first coordinate. For the same first coordinate, the longitude coordinate of the corresponding second coordinate differs from the longitude coordinate of the corresponding third coordinate. The difference between the longitude coordinate of the second coordinate and the longitude coordinate of the third coordinate may be obtained from the ratio of the pupil distance to the depth information corresponding to that coordinate. Since the parallax between the left-eye panoramic image and the right-eye panoramic image that conveys the distance information is mainly related to the longitude coordinates, the electronic device can directly determine the latitude coordinate in the first coordinate as the latitude coordinate in the second coordinate and the latitude coordinate in the third coordinate, respectively.
For the second coordinate, the electronic device may calculate the longitude coordinate in the second coordinate according to the formula

Lϕ(ϕ, θ) = ϕ + p / (2D(ϕ, θ))

and determine the latitude coordinate in the first coordinate as the latitude coordinate in the second coordinate. Wherein, ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Lϕ(ϕ, θ) is the longitude coordinate in the second coordinate corresponding to the first coordinate; and p is the preset pupil distance.
For the third coordinate, the electronic device may calculate the longitude coordinate in the third coordinate according to the formula

Rϕ(ϕ, θ) = ϕ - p / (2D(ϕ, θ))

and determine the latitude coordinate in the first coordinate as the latitude coordinate in the third coordinate. Wherein, ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Rϕ(ϕ, θ) is the longitude coordinate in the third coordinate corresponding to the first coordinate; and p is the preset pupil distance.
That is, the above predetermined formula for calculating the latitude coordinates may be:

Lθ(ϕ, θ) = θ and Rθ(ϕ, θ) = θ

wherein Lθ(ϕ, θ) and Rθ(ϕ, θ) are the latitude coordinates in the second coordinate and in the third coordinate corresponding to the first coordinate, respectively.
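By way of a non-limiting numerical sketch of the above formulas (the linear offset form and the equirectangular sampling grid are assumptions), the left-eye and right-eye mapping relationships may be tabulated for every pixel as follows:

```python
import numpy as np


def ods_mappings(depth: np.ndarray, p: float):
    """Tabulate the second and third coordinates for each first coordinate.

    depth: (H, W) panoramic depth map D(phi, theta), in the same units as p.
    """
    h, w = depth.shape
    phi = (np.arange(w, dtype=np.float32) / w) * 2.0 * np.pi    # first-coordinate longitudes
    theta = (np.arange(h, dtype=np.float32) / h - 0.5) * np.pi  # first-coordinate latitudes
    offset = p / (2.0 * np.maximum(depth, 1e-6))                # half of the p / D difference
    l_phi = (phi[None, :] + offset) % (2.0 * np.pi)             # longitude in the second coordinate
    r_phi = (phi[None, :] - offset) % (2.0 * np.pi)             # longitude in the third coordinate
    return (l_phi, theta), (r_phi, theta)                       # latitudes pass through unchanged
```

For instance, with an assumed pupil distance p = 0.064 m and depth D(ϕ, θ) = 2 m, the longitude difference is Lϕ(ϕ, θ) - Rϕ(ϕ, θ) = p/D = 0.032 rad, i.e. about 1.8 degrees.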
S302, determining the correspondence between the first coordinate and the second coordinate as a left-eye mapping relationship and the correspondence between the first coordinate and the third coordinate as a right-eye mapping relationship.
On the basis of obtaining the second coordinate and the third coordinate, the electronic device may determine the correspondence between the first coordinate and the second coordinate of each pixel point as a left-eye mapping relationship, and determine the correspondence between the first coordinate and the third coordinate as a right-eye mapping relationship.
In the above-described method for generating a binocular stereoscopic panoramic image, the electronic device uses spherical coordinates to accomplish the coordinate mapping, which can be applied to a panoramic image in any projection mode, thereby improving the applicability of mapping a panoramic image to a binocular stereoscopic panoramic image.
S401, obtaining a training sample; the training sample includes a panoramic sample image, and a sample depth image corresponding to the panoramic sample image.
The electronic device may acquire a binocular stereoscopic panoramic sample image, and then extract depth information from the binocular stereoscopic panoramic sample image to obtain a sample depth image corresponding to the binocular stereoscopic panoramic sample image; furthermore, the electronic device may carry out monocularization of the abovementioned binocular stereoscopic panoramic sample image to obtain a panoramic sample image corresponding to the binocular stereoscopic panoramic sample image. The above panoramic sample image and its corresponding sample depth image constitute a training sample.
In another embodiment, a binocular stereoscopic panoramic camera and a monocular panoramic camera may be used to simultaneously photograph the same scene to obtain a binocular stereoscopic panoramic sample image and a panoramic sample image, respectively; the aforementioned training sample is then obtained after generating a sample depth image based on the binocular stereoscopic panoramic sample image.
S402: using the panoramic sample image as a reference input of the initial depth estimation model and the sample depth image as a reference output of the initial depth estimation model, training the initial depth estimation model according to a predetermined loss function to obtain the depth estimation model.
On the basis of obtaining the training samples, the electronic device may take the panoramic sample image as a reference input of the initial depth estimation model, take the sample depth image as a reference output of the initial depth estimation model, and train the initial depth estimation model according to a predetermined loss function to obtain the depth estimation model.
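A non-limiting training sketch follows; the choice of an L1 loss, the Adam optimizer, and the learning rate are illustrative assumptions standing in for the predetermined loss function and training schedule:

```python
import torch
import torch.nn as nn


def train_depth_model(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-4) -> nn.Module:
    """Supervised training: panoramic sample image in, sample depth image out."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.L1Loss()  # stands in for the predetermined loss function
    model.train()
    for _ in range(epochs):
        for pano, depth_gt in loader:  # reference input and reference output
            pred = model(pano)
            loss = loss_fn(pred, depth_gt)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```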
The above-described method for generating a binocular stereoscopic panoramic image can obtain a depth estimation model through sample training, so that a depth image of the panoramic image can be obtained according to the depth estimation model, which provides a data basis for mapping from the panoramic image to the binocular stereoscopic panoramic image.
In one embodiment, a method of generating a binocular stereoscopic panoramic image is provided, as shown in
S501, inputting the panoramic image into the predetermined depth estimation model to obtain a depth image corresponding to the panoramic image;
S502, according to the formula Lϕ(ϕ, θ) = ϕ + p / (2D(ϕ, θ)), calculating the longitude coordinate of the pixel point in the panoramic image in the second coordinate in the left-eye panoramic screen;
S503, according to the formula Rϕ(ϕ, θ) = ϕ - p / (2D(ϕ, θ)), calculating the longitude coordinate of the pixel point in the panoramic image in the third coordinate in the right-eye panoramic screen;
S504, determining the latitude coordinate of the pixel point in the first coordinate in the monocular panoramic image as the latitude coordinate in the second coordinate and the latitude coordinate in the third coordinate, respectively;
S505, determining the correspondence between the first coordinate and the second coordinate as a left-eye mapping relationship; and the correspondence between the first coordinate and the third coordinate as a right-eye mapping relationship;
S506, according to the left-eye mapping relationship and the right-eye mapping relationship, mapping and projecting the panoramic image separately to generate a left-eye panoramic image and a right-eye panoramic image; and
S507, generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
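Purely as an illustrative end-to-end sketch of steps S501 to S507, reusing the hypothetical helpers sketched earlier (`estimate_depth` and `render_eye`); the default pupil distance and the top-bottom stereo layout are assumptions:

```python
import numpy as np


def generate_stereo_pano(pano: np.ndarray, model, p: float = 0.064) -> np.ndarray:
    """Depth estimation, per-eye mapping, and composition in sequence."""
    depth = estimate_depth(pano, model)            # S501: depth image from the model
    left = render_eye(pano, depth, p, sign=+1.0)   # S502, S504-S506: left-eye panorama
    right = render_eye(pano, depth, p, sign=-1.0)  # S503, S504-S506: right-eye panorama
    return np.concatenate([left, right], axis=0)   # S507: top-bottom stereo frame
```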
The above method of generating a binocular stereoscopic panoramic image, the technical principles and embodiment effects of which can be seen in the above embodiments, will not be repeated herein.
It should be understood that although the individual steps in the flowcharts involved in the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily executed sequentially in the order indicated by the arrows. Unless expressly stated herein, there is no strict order limitation on the execution of these steps, and these steps may be executed in other orders. Moreover, at least a portion of the steps in the flowcharts involved in the embodiments described above may include multiple sub-steps or multiple phases, which are not necessarily executed to completion at the same moment but may be executed at different moments; the order in which these sub-steps or phases are executed is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or phases of other steps.
In one embodiment, a method for generating a binocular stereoscopic panoramic video is provided, and the electronic device may adopt the above-described method for generating binocular stereoscopic panoramic images to generate binocular stereoscopic panoramic images based on each of the individual panoramic images in the panoramic video, respectively; and then, based on each of the individual binocular stereoscopic panoramic images, a binocular stereoscopic panoramic video is generated.
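A non-limiting sketch of the video case, reusing the hypothetical `generate_stereo_pano` helper above; the OpenCV video I/O calls and the mp4v codec are illustrative assumptions:

```python
import cv2


def generate_stereo_video(src_path: str, dst_path: str, model, p: float = 0.064) -> None:
    """Convert each panoramic frame, then reassemble the stereoscopic video."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    # A top-bottom stereo frame doubles the height of the source frame.
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, 2 * h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(generate_stereo_pano(frame, model, p))
    cap.release()
    out.release()
```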
The above method of generating a binocular stereoscopic panoramic video, the embodiment principle and technical effect of which are described in the above embodiment of the method of generating a binocular stereoscopic panoramic image, will not be repeated herein.
Based on the same inventive concept, some embodiments of the present application also provide a device for generating a binocular stereoscopic panoramic image for realizing the method for generating binocular stereoscopic panoramic images described above. The solution provided by the device is similar to the solution documented in the above-described method; therefore, for the specific limitations in the one or more embodiments of the device for generating binocular stereoscopic panoramic images provided below, reference can be made to the limitations on the method for generating binocular stereoscopic panoramic images described above, which will not be repeated herein.
In one embodiment, as shown in
In one embodiment, based on the above embodiment, as shown in
In one embodiment, based on the above embodiment, as shown in
In one embodiment, on the basis of the above embodiment, the above obtaining subunit 2011 is specifically for: calculating the longitude coordinate in the second coordinate according to the formula Lϕ(ϕ, θ) = ϕ + p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the second coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Lϕ(ϕ, θ) is the longitude coordinate in the second coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, on the basis of the above embodiment, the above obtaining subunit 2011 is specifically for: calculating the longitude coordinate in the third coordinate according to the formula Rϕ(ϕ, θ) = ϕ - p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the third coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Rϕ(ϕ, θ) is the longitude coordinate in the third coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, on the basis of the above embodiment, as shown in
The above-described device for generating a binocular stereoscopic panoramic image, the technical principles and embodiment effects of which can be found in the above-described method embodiments, will not be repeated herein.
The various modules in the above-described binocular stereoscopic panoramic image generating device may be realized in whole or in part by means of software, hardware, or combinations thereof. Each of the above-described modules may be embedded in or independent of a processor in a computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to each of the above-described modules.
In one embodiment, an electronic device is provided, the internal structure diagram of which may be shown in
It will be understood by those skilled in the art that the structure illustrated in
In one embodiment, there is provided an electronic device comprising a memory and a processor, the memory storing a computer program, the processor, when executing the computer program, realizing the following steps: inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In one embodiment, the processor also realizes the following steps when executing the computer program: obtaining a left-eye mapping relationship and a right-eye mapping relationship based on the preset pupil distance and the depth image, wherein the left-eye mapping relationship includes a correspondence between a first coordinate of a pixel point in the panoramic image and a second coordinate of the pixel point in the left-eye panoramic screen, and the right-eye mapping relationship includes a correspondence between the first coordinate and a third coordinate of the pixel point in the right-eye panoramic screen; and, according to the left-eye mapping relationship and the right-eye mapping relationship, mapping and projecting the panoramic image respectively to generate a left-eye panoramic image and a right-eye panoramic image.
In one embodiment, the processor further realizes the following steps when executing the computer program: obtaining a second coordinate and a third coordinate based on the depth information, the preset pupil distance, and the first coordinate; determining a correspondence between the first coordinate and the second coordinate as a left-eye mapping relationship; and, determining a correspondence between the first coordinate and the third coordinate as a right-eye mapping relationship.
In one embodiment, the processor, when executing the computer program, further realizes the steps of: calculating the longitude coordinate in the second coordinate according to the formula Lϕ(ϕ, θ) = ϕ + p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the second coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Lϕ(ϕ, θ) is the longitude coordinate in the second coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, the processor, when executing the computer program, further realizes the steps of: calculating the longitude coordinate in the third coordinate according to the formula Rϕ(ϕ, θ) = ϕ - p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the third coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Rϕ(ϕ, θ) is the longitude coordinate in the third coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, the processor also realizes the following steps when executing the computer program: obtaining a training sample; the training sample includes a panoramic sample image, and a sample depth image corresponding to the panoramic sample image; using the panoramic sample image as a reference input of the initial depth estimation model, using the sample depth image as a reference output of the initial depth estimation model, and training the initial depth estimation model according to a preset loss function to obtain the depth estimation model.
In one embodiment, the processor also realizes the following steps when executing the computer program: performing the above-described method for generating binocular stereoscopic panoramic images to generate binocular stereoscopic panoramic images based on the respective panoramic images in a panoramic video; and then generating a binocular stereoscopic panoramic video based on the respective binocular stereoscopic panoramic images.
This embodiment provides an electronic device, the embodiment principle and technical effect of which are similar to the above method embodiments and will not be repeated herein.
In one embodiment, a computer-readable storage medium is provided on which a computer program is stored, the computer program, when executed by a processor, realizing the following steps: inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In one embodiment, the computer program, when executed by the processor, also implements the following steps: obtaining a left-eye mapping relationship and a right-eye mapping relationship based on the preset pupil distance and the depth image, wherein the left-eye mapping relationship includes a correspondence between a first coordinate of a pixel point in the panoramic image and a second coordinate of the pixel point in the left-eye panoramic screen, and the right-eye mapping relationship includes a correspondence between the first coordinate and a third coordinate of the pixel point in the right-eye panoramic screen; and, according to the left-eye mapping relationship and the right-eye mapping relationship, mapping and projecting the panoramic image respectively to generate a left-eye panoramic image and a right-eye panoramic image.
In one embodiment, the computer program, when executed by the processor, further realizes the following steps: obtaining a second coordinate and a third coordinate based on the depth information, the preset pupil distance, and the first coordinate; determining a correspondence between the first coordinate and the second coordinate as a left-eye mapping relationship; and determining a correspondence between the first coordinate and the third coordinate as a right-eye mapping relationship.
In one embodiment, the computer program, when executed by the processor, further realizes the steps of: calculating the longitude coordinate in the second coordinate according to the formula Lϕ(ϕ, θ) = ϕ + p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the second coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Lϕ(ϕ, θ) is the longitude coordinate in the second coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, the computer program, when executed by the processor, further realizes the steps of: calculating the longitude coordinate in the third coordinate according to the formula Rϕ(ϕ, θ) = ϕ - p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the third coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Rϕ(ϕ, θ) is the longitude coordinate in the third coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, the computer program, when executed by the processor, further realizes the following steps: obtaining a training sample, the training sample including a panoramic sample image and a sample depth image corresponding to the panoramic sample image; using the panoramic sample image as a reference input of the initial depth estimation model and the sample depth image as a reference output of the initial depth estimation model; and training the initial depth estimation model according to a predetermined loss function to obtain the depth estimation model.
In one embodiment, the computer program, when executed by the processor, also realizes the following steps: performing the above-described method for generating binocular stereoscopic panoramic images to generate binocular stereoscopic panoramic images based on the respective panoramic images in a panoramic video; and then generating a binocular stereoscopic panoramic video based on the respective binocular stereoscopic panoramic images.
This embodiment provides a computer-readable storage medium, the embodiment principle and technical effect of which are similar to the above method embodiments and will not be repeated herein.
In one embodiment, there is provided a computer program product comprising a computer program that, when executed by a processor, implements the following steps: inputting a panoramic image into a predetermined depth estimation model to obtain a depth image corresponding to the panoramic image, the depth image including depth information corresponding to each pixel point in the panoramic image; mapping the panoramic image into a left-eye panoramic image and a right-eye panoramic image based on a preset pupil distance and the depth image; and generating a binocular stereoscopic panoramic image based on the left-eye panoramic image and the right-eye panoramic image.
In one embodiment, the computer program, when executed by the processor, also implements the following steps: obtaining a left-eye mapping relationship and a right-eye mapping relationship based on the preset pupil distance and the depth image, wherein the left-eye mapping relationship includes a correspondence between a first coordinate of a pixel point in the panoramic image and a second coordinate of the pixel point in the left-eye panoramic screen, and the right-eye mapping relationship includes a correspondence between the first coordinate and a third coordinate of the pixel point in the right-eye panoramic screen; and, according to the left-eye mapping relationship and the right-eye mapping relationship, mapping and projecting the panoramic image respectively to generate a left-eye panoramic image and a right-eye panoramic image.
In one embodiment, the computer program, when executed by the processor, further realizes the following steps: obtaining a second coordinate and a third coordinate based on the depth information, the preset pupil distance, and the first coordinate; determining a correspondence between the first coordinate and the second coordinate as a left-eye mapping relationship; and determining a correspondence between the first coordinate and the third coordinate as a right-eye mapping relationship.
In one embodiment, the computer program, when executed by the processor, further realizes the steps of: calculating the longitude coordinate in the second coordinate according to the formula Lϕ(ϕ, θ) = ϕ + p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the second coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Lϕ(ϕ, θ) is the longitude coordinate in the second coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, the computer program, when executed by the processor, further realizes the steps of: calculating the longitude coordinate in the third coordinate according to the formula Rϕ(ϕ, θ) = ϕ - p / (2D(ϕ, θ)); and determining the latitude coordinate in the first coordinate as the latitude coordinate in the third coordinate; wherein ϕ is the longitude coordinate in the first coordinate; θ is the latitude coordinate in the first coordinate; D(ϕ, θ) is the depth information corresponding to the first coordinate in the depth image; Rϕ(ϕ, θ) is the longitude coordinate in the third coordinate corresponding to the first coordinate; and p is a preset pupil distance.
In one embodiment, the computer program is executed by the processor with the following steps: obtaining a training sample; the training sample includes a panoramic sample image, and a sample depth image corresponding to the panoramic sample image; using the panoramic sample image as a reference input of the initial depth estimation model, using the sample depth image as a reference output of the initial depth estimation model, and training the initial depth estimation model according to the predetermined loss function to obtain the depth estimation model.
In one embodiment, the computer program, when executed by the processor, also realizes the following steps: performing the above-described method for generating binocular stereoscopic panoramic images to generate binocular stereoscopic panoramic images based on the respective panoramic images in the panoramic video; and then generating a binocular stereoscopic panoramic video based on the respective binocular stereoscopic panoramic images.
This embodiment provides a computer program product, the embodiment principle and technical effect of which are similar to the above method embodiments and will not be repeated herein.
A person of ordinary skill in the art may understand that all or part of the processes in the methods of the above embodiments may be accomplished by instructing the relevant hardware through a computer program, and the computer program may be stored in a non-volatile computer-readable storage medium; when executed, the computer program may comprise the processes of the embodiments of the respective methods described above. Any reference to a memory, database, or other medium used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and so on. Volatile memory may include Random Access Memory (RAM) or external cache memory, and the like. By way of illustration and not limitation, the RAM may take various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in the present application may include at least one of a relational database and a non-relational database. The non-relational database may include a blockchain-based distributed database and the like, without limitation. The processor involved in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic device, a data processing logic device based on quantum computing, circuitry, and the like, without limitation.
The various technical features of the above embodiments can be combined in any way, and all possible combinations of the various technical features of the above embodiments have not been described for the sake of conciseness of description; however, as long as there is no contradiction in the combinations of these technical features, they should all be considered to be within the scope of the specification as recorded herein.
The above-described embodiments express only several embodiments of the present application, which are described in a specific and detailed manner, but are not to be construed as limiting the patent scope of the present application. It should be pointed out that a person of ordinary skill in the art can make several variations and improvements without departing from the conception of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this application shall be subject to the attached claims.
Foreign application priority data: Chinese Patent Application No. 202210242661.X, filed March 2022 (CN, national).
The present application is a continuation of International Application No. PCT/CN2023/079064, filed on Mar. 1, 2023, which claims the priority of Chinese Patent Application No. CN 202210242661.X, filed on Mar. 11, 2022, the content of each of which is incorporated herein by reference in its entirety.
Related U.S. application data: the present application (Ser. No. 18/830,614) is a continuation of parent application PCT/CN2023/079064, filed March 2023 (WO).