The present disclosure relates in general to an electronic system, and it relates in particular to an electronic system and a method for generating panoramic light fields.
Three-dimensional (3D) visualization devices, such as Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR) devices, generate a 3D sensation based on the stereoscopic vision principle, and render a panoramic scene (i.e., a scene that incorporates a wide viewing angle) at a single depth. Since the distance between the display panel and the eyes of the viewer is fixed, the accommodation of the eyes does not change with the vergence. This leads to the vergence accommodation conflict (VAC), which can cause discomfort such as visual fatigue and eye strain, or even dizziness for viewers who are not used to the 3D visualization effect.
A light field display device is a display device that uses light field technology to allow the observer to see a light field with depth perception. A light field with depth perception can avoid the impact of the vergence accommodation conflict. Therefore, there is a need for an electronic system and a method capable of generating a panoramic light field, so that the light field display device can display a more comfortable scene to the observer.
An electronic system for generating panoramic light fields is provided by an embodiment of the present disclosure. The electronic system includes a camera, a depth estimation circuit, and a light field generation circuit. The camera is configured to capture a panoramic image of a scene. The depth estimation circuit is configured to estimate panoramic depth information of the scene based on the panoramic image captured by the camera. The light field generation circuit is configured to generate a panoramic light field based on the estimated panoramic depth information.
In some embodiments, the camera is mounted on a rotational mechanism used for revolving the camera. The camera is configured to capture a sequence of image sets of the scene from a plurality of points on a first trajectory of the camera. The camera is further configured to stitch the images in each image set to obtain a stitched image. The camera is further configured to transform the stitched image into a rectangular panoramic image using an equirectangular projection method.
In some embodiments, the rotational mechanism is further configured to raise or lower the camera. The camera is further configured to capture another sequence of image sets of the scene from a plurality of points on a second trajectory of the camera.
In further embodiments, the first trajectory and the second trajectory are circles. The first trajectory and the second trajectory may have different radii.
In some embodiments, the depth estimation circuit is further configured to estimate the depth information using a convolutional neural network-based model. In further embodiments, the convolutional neural network uses a distance-based kernel.
In some embodiments, the light field generation circuit is further configured to generate the panoramic light field using a convolutional neural network-based model.
In an embodiment, each light ray in the panoramic light field is addressed by two sets of cylindrical coordinates. In another embodiment, each light ray in the panoramic light field is addressed by two sets of spherical coordinates.
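By way of a non-limiting illustration, the two-cylinder addressing described above may be modeled as follows. The sketch below is written in Python; the names CylindricalCoord, PanoramicRay, and the radius arguments are hypothetical and merely show one possible data layout, since the present disclosure does not prescribe one. Each ray is recorded by its intersections with two concentric cylinders, from which the ray's direction can be recovered:

    from dataclasses import dataclass
    import math

    @dataclass
    class CylindricalCoord:
        theta: float  # azimuth angle around the cylinder axis, in radians
        h: float      # height along the cylinder axis

    @dataclass
    class PanoramicRay:
        inner: CylindricalCoord  # intersection with the inner cylinder
        outer: CylindricalCoord  # intersection with the outer cylinder

    def ray_direction(ray: PanoramicRay, r_inner: float, r_outer: float):
        """Recover a unit direction vector from the two cylinder intersections."""
        p1 = (r_inner * math.cos(ray.inner.theta),
              r_inner * math.sin(ray.inner.theta), ray.inner.h)
        p2 = (r_outer * math.cos(ray.outer.theta),
              r_outer * math.sin(ray.outer.theta), ray.outer.h)
        d = tuple(b - a for a, b in zip(p1, p2))
        norm = math.sqrt(sum(c * c for c in d))
        return tuple(c / norm for c in d)

A spherical-coordinate variant would replace the (theta, h) pairs with (azimuth, elevation) pairs on two concentric spheres; the recovery of the ray direction is analogous.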
An electronic system for generating panoramic light fields is provided by another embodiment of the present disclosure. The electronic system includes a camera and a processor. The camera is configured to capture a panoramic image of a scene. The processor is configured to estimate panoramic depth information of the scene based on the panoramic image captured by the camera, and to generate a panoramic light field based on the estimated panoramic depth information.
A method for generating panoramic light fields is provided by an embodiment of the present disclosure. The method is for use in an electronic system that includes a camera. The method includes the step of capturing a panoramic image of a scene by the camera. The method further includes the step of estimating panoramic depth information of the scene based on the panoramic image. The method further includes the step of generating a panoramic light field based on the panoramic depth information.
The embodiments of the present disclosure provide a panoramic light field for the light field display device to directly project light rays from various directions into the viewer's eyes, which is more in line with the way humans observe the world, so that the effect of VAC can be mitigated.
The present disclosure can be better understood by reading the subsequent detailed description and examples with references made to the accompanying drawings. Additionally, it should be appreciated that in the flow diagram of the present disclosure, the order of execution of each block can be changed, and/or some of the blocks can be changed, eliminated, or combined.
The following description provides embodiments of the invention, which are intended to describe the basic spirit of the invention but are not intended to limit it. For the actual inventive content, reference must be made to the scope of the claims.
In each of the following embodiments, the same reference numbers represent identical or similar elements or components.
It must be understood that the terms “including” and “comprising” are used in the specification to indicate the existence of specific technical features, numerical values, method steps, process operations, elements and/or components, but do not exclude additional technical features, numerical values, method steps, process operations, elements, components, or any combination of the above.
Ordinal terms used in the claims, such as “first,” “second,” “third,” etc., are only for convenience of explanation, and do not imply any precedence relation between one another.
The term “panoramic” used herein is intended to cover any wide-angle view, such as a 360-degree view, a 300-degree view, a 280-degree view, or the like; the present disclosure is not limited thereto.
The camera 11 may include a plurality of lenses, each of which is used for capturing images with a specific angle of view.
The camera 11 may further include an image processing unit 113. The image processing unit 113 can be a specialized microprocessor dedicated to performing specific image processing tasks, such as stitching a set of images (also referred to as an “image set” herein) captured by the lens 111 and the lens 112.
The depth estimation circuit 14A is specifically designed hardware that can be implemented by one or more electronic circuits, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like, but the present disclosure is not limited thereto. The depth estimation circuit 14A is electrically connected to the camera 11, so as to obtain the panoramic image captured by the camera 11. The functionalities of the depth estimation circuit 14A will be described later.
The light field generation circuit 15A is specifically designed hardware that can be implemented by one or more electronic circuits, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like, but the present disclosure is not limited thereto. The light field generation circuit 15A is electrically connected to the depth estimation circuit 14A, so as to obtain the depth information of the scene estimated by the depth estimation circuit 14A. The functionalities of the light field generation circuit 15A will be described later.
In an embodiment, the depth estimation circuit 14A and the light field generation circuit 15A can be integrated into electronic circuitry 145, which can be implemented by an integrated circuit, an FPGA, or a system-on-chip (SoC), but the present disclosure is not limited thereto.
The camera 11 in the electronic system 10B is substantially the same as the camera 11 in the electronic system 10A, which has been described previously, and thus the description is not repeated here.
The processor 12 can be a central processing unit (CPU) or a graphics processing unit (GPU) capable of executing instructions and performing high-speed computation. The storage device 13 may include a non-volatile memory device, such as a hard disk drive, a solid-state disk, a flash memory, or a read-only memory, but the present disclosure is not limited thereto. The storage device 13 stores the depth estimation module 14B and the light field generation module 15B, each of which is a software module that includes a set of instructions executable by the processor 12. The processor 12 may be connected to the storage device 13 through a system bus or a network, such as a local area network (LAN), a wide area network (WAN), a virtual private network (VPN), the Internet, an intranet, an extranet, a public switched telephone network, an infrared network, a wireless network, or any combination thereof. The processor 12 is configured to load the depth estimation module 14B and the light field generation module 15B from the storage device 13 to execute the corresponding steps or operations of the disclosed method for generating panoramic light fields, which will be introduced later.
In some embodiments, the processor 12 and the storage device 13 may be equipped in a computer device 123. The computer device 123 can be a personal computer (e.g., a laptop computer or notebook computer) or a server computer running an operating system (e.g., Windows, Mac OS, Linux, UNIX, etc.). The computer device 123 may communicate with the camera 11 through wired transmission interfaces and/or wireless transmission interfaces (not shown).
The method 200 begins at step 201. In step 201, a panoramic image of a scene (or a scenario) is captured by the camera 11. Then, the method 200 proceeds to step 202.
In step 202, panoramic depth information of the scene is estimated based on the panoramic image captured by the camera 11. Then, the method 200 proceeds to step 203.
In step 203, a panoramic light field is generated based on the panoramic depth information estimated in step 202.
In some embodiments, the depth information can be a depth map containing information relating to the distance of the surfaces of objects in the scene from a viewpoint, namely the location of the camera 11.
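For a concrete (and non-limiting) interpretation of such a panoramic depth map, assume the equirectangular layout described above: each pixel column corresponds to a longitude and each row to a latitude, so one depth value per pixel suffices to back-project the whole panorama to 3D points. The following Python sketch (the function name and conventions are illustrative assumptions) makes this relationship explicit:

    import numpy as np

    def depth_map_to_points(depth: np.ndarray, origin=np.zeros(3)) -> np.ndarray:
        """Back-project an equirectangular depth map (H x W) to 3D points.

        Each pixel column maps linearly to longitude in [-pi, pi) and each
        row to latitude in [pi/2, -pi/2], per the equirectangular convention.
        """
        h, w = depth.shape
        lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi  # azimuth per column
        lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi  # elevation per row
        lon, lat = np.meshgrid(lon, lat)
        # Unit ray direction for every pixel, scaled by the estimated depth.
        dirs = np.stack([np.cos(lat) * np.cos(lon),
                         np.cos(lat) * np.sin(lon),
                         np.sin(lat)], axis=-1)
        return origin + dirs * depth[..., None]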
One panoramic image alone is not sufficient to provide the information about the light field of the scene, because the construction of the light field requires sampling the radiance of light rays emitted from the scene along different directions. In fact, sampling a large number of panoramic images from different angles of view is required to gather the information about the light field. The techniques for sampling the panoramic images of the scene are introduced herein.
It should be appreciated that the implementations provided above are only examples, and the present disclosure is not limited thereto.
In step 501, the camera 11 is revolved (e.g., manually controlled or electrically controlled), and a sequence of image sets of the scene is captured from a plurality of points on the trajectory of the camera 11.
In step 502, the images in each image set are stitched (i.e., combined into a single image) to obtain a stitched image (e.g., by the image processing unit 113).
In step 503, the stitched image is transformed into the rectangular panoramic image using the equirectangular projection method (e.g., by the image processing unit 113).
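To make the equirectangular projection of step 503 concrete, the following Python sketch shows the forward mapping it relies on: a unit viewing direction is converted to pixel coordinates by mapping longitude linearly to the horizontal axis and latitude to the vertical axis, which is why the full sphere fills a rectangular (typically 2:1) image. The function name and image conventions are illustrative, not mandated by the present disclosure:

    import numpy as np

    def direction_to_equirect_pixel(d, width: int, height: int):
        """Map a unit direction vector d = (x, y, z) to (column, row) in an
        equirectangular image of the given size."""
        lon = np.arctan2(d[1], d[0])               # azimuth in [-pi, pi)
        lat = np.arcsin(np.clip(d[2], -1.0, 1.0))  # elevation in [-pi/2, pi/2]
        col = (lon + np.pi) / (2.0 * np.pi) * width
        row = (np.pi / 2.0 - lat) / np.pi * height
        return col, row

Transforming the stitched image then amounts to evaluating this mapping (or its inverse) for every output pixel and resampling the stitched image accordingly.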
In some embodiments, the rotation part 32 of the rotational mechanism 30 can further raise or lower the camera 11, so that the camera 11 can capture another sequence of image sets of the scene from a plurality of points on a second trajectory.
In some embodiments, the estimation of the panoramic depth information uses a convolutional neural network (CNN)-based model.
The feature extraction layers 71 may include a plurality of convolution layers for extracting the feature representation (or feature maps) 702 of each of the input panoramic images 701. The number of convolution layers is not limited by the present disclosure. In an embodiment, each of the panoramic images 701 is divided into two polar regions (i.e., the north pole region and the south pole region) and a central region before being input to the feature extraction layers 71. The distortion effect of equirectangular projection varies from region to region. Specifically, the distortion level of a polar region is generally higher than the distortion level of the central region. In order to compensate for such distortion effects of equirectangular projection, each of the convolution layers of the feature extraction layers 71 uses a distance-based kernel, which means the size of the convolution kernel for each region is determined by the distortion level of that region. The size of the kernel used by the feature extraction layers 71 for a polar region is configured to be larger than the size of the kernel used for the central region, so as to enhance the training for the polar regions of the panoramic image.
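A minimal Python (PyTorch) sketch of such a distance-based convolution is given below. The 1/4-3/4 row split, the kernel sizes (7 for the polar regions, 3 for the central region), and the sharing of weights between the two polar regions are illustrative assumptions; the disclosure only requires that the polar regions receive larger kernels than the central region:

    import torch
    import torch.nn as nn

    class DistanceBasedConv(nn.Module):
        """Region-dependent convolution: larger kernels near the poles,
        where equirectangular distortion is strongest."""
        def __init__(self, in_ch: int, out_ch: int):
            super().__init__()
            self.polar = nn.Conv2d(in_ch, out_ch, kernel_size=7, padding=3)
            self.central = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (N, C, H, W)
            h = x.shape[2]
            north = x[:, :, : h // 4]                # north pole region
            middle = x[:, :, h // 4 : 3 * h // 4]    # central region
            south = x[:, :, 3 * h // 4 :]            # south pole region
            return torch.cat([self.polar(north),
                              self.central(middle),
                              self.polar(south)], dim=2)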
The cost aggregation layers 72 use the extracted feature maps 702 to compute the cost volume for calculating the depth information 703. The cost volume denotes the matching costs for associating a pixel in one panoramic image with its corresponding pixels in another panoramic image. In an embodiment, the cost aggregation layers 72 apply a semi-global aggregation (SGA) layer and a local guided aggregation (LGA) layer to refine the edges of objects and compensate for the accuracy degradation caused by the down-sampling of the cost volume. The depth information 703 is obtained by minimizing the cost volume.
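As a rough illustration of the cost volume itself (not of the SGA/LGA layers), the Python (PyTorch) sketch below stacks one matching-cost slice per depth hypothesis. Here warp_fn, which resamples the source features as if the scene sat at a given hypothesized depth, is an assumed helper whose form depends on the camera geometry; taking the per-pixel minimum over the hypothesis axis of the aggregated volume then yields the depth:

    import torch

    def build_cost_volume(feat_ref, feat_src, num_hypotheses, warp_fn):
        """Stack per-hypothesis L1 matching costs into an (N, D, H, W) volume.

        feat_ref, feat_src: (N, C, H, W) feature maps of two panoramas.
        warp_fn(feat_src, d): source features warped under depth hypothesis d.
        """
        costs = []
        for d in range(num_hypotheses):
            warped = warp_fn(feat_src, d)
            costs.append((feat_ref - warped).abs().mean(dim=1))  # (N, H, W)
        return torch.stack(costs, dim=1)  # (N, D, H, W)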
In some embodiments, the generation of the light field in step 203 may use a plurality of 3D convolutional neural network layers (e.g., using another CNN model) for resampling. First, a coarse light field is warped from the panoramic images, based on the panoramic depth information (e.g., a panoramic depth map), to generate a warped new light field. Then, the warped new light field is refined by the light field resampling layers to generate the light field output by the light field generation circuit 15A or the light field generation module 15B.
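The warping step can be pictured with the following Python sketch, which reuses depth_map_to_points from the earlier sketch to re-render one panoramic view from a translated viewpoint. This nearest-neighbor forward splat is a simplification assumed for illustration; the disclosure does not specify the warping operator, and the refinement layers are precisely what repairs the holes and occlusion artifacts such a warp leaves behind:

    import numpy as np

    def warp_panorama(image, depth, offset):
        """Forward-warp an equirectangular panorama to a viewpoint shifted
        by `offset` (a 3-vector), using its depth map. Occlusions are
        resolved by write order and disocclusions are left black."""
        pts = depth_map_to_points(depth) - offset  # points seen from the new viewpoint
        h, w = depth.shape
        dirs = pts / np.linalg.norm(pts, axis=-1, keepdims=True)
        lon = np.arctan2(dirs[..., 1], dirs[..., 0])
        lat = np.arcsin(np.clip(dirs[..., 2], -1.0, 1.0))
        cols = ((lon + np.pi) / (2.0 * np.pi) * w).astype(int) % w
        rows = np.clip(((np.pi / 2.0 - lat) / np.pi * h).astype(int), 0, h - 1)
        out = np.zeros_like(image)
        out[rows, cols] = image  # nearest-neighbor forward splat
        return out

Repeating this warp for every viewpoint of the target ray parameterization produces the coarse (warped) light field that the resampling layers then refine.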
In some embodiments, the weights used in the CNN models for depth estimation and light field generation can be trained simultaneously by imposing spatiotemporal consistency, which computes the error between the predicted panoramic images at certain angular coordinates and the ground truth. The loss value of the spatiotemporal consistency can be minimized using a gradient descent method such as the Stochastic Gradient Descent (SGD) method or the adaptive moment estimation (Adam) algorithm, but the present disclosure is not limited thereto.
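A skeletal Python (PyTorch) training step consistent with this paragraph is sketched below; depth_net, lf_net, and their call signatures are hypothetical stand-ins for the two CNN models, and the loss is reduced to a single predicted-versus-ground-truth photometric error for brevity:

    import torch
    import torch.nn.functional as F

    def training_step(depth_net, lf_net, panoramas, gt_view, view_coords, optimizer):
        """One joint update: one error signal trains both models at once."""
        depth = depth_net(panoramas)                       # panoramic depth estimation
        predicted = lf_net(panoramas, depth, view_coords)  # view at queried angular coordinates
        loss = F.l1_loss(predicted, gt_view)               # consistency with ground truth
        optimizer.zero_grad()
        loss.backward()  # gradients flow through both networks simultaneously
        optimizer.step()
        return loss.item()

Passing both parameter sets to a single optimizer, e.g. torch.optim.Adam(list(depth_net.parameters()) + list(lf_net.parameters())), is what makes the two models train simultaneously.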
It should be appreciated that the implementations provided above are only examples, and the present disclosure is not limited thereto.
The embodiments of the present disclosure provide a panoramic light field for the light field display device to directly project light rays from various directions into the viewer's eyes, which is more in line with the way humans observe the world, so that the effect of VAC can be mitigated.
The above paragraphs are described with multiple aspects. Obviously, the teachings of the specification may be performed in multiple ways. Any specific structure or function disclosed in the examples is only a representative situation. According to the teachings of the specification, it should be noted by those skilled in the art that any aspect disclosed may be performed individually, or that two or more aspects may be combined and performed.
While the invention has been described by way of example and in terms of the preferred embodiments, it should be understood that the invention is not limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
This application claims the benefit of U.S. Provisional Application No. 63/286,036, filed Dec. 4, 2021, the entirety of which is incorporated by reference herein.