This application is a non-provisional of U.S. Provisional Patent Application No. 62/347,077 filed Jun. 7, 2016, the contents of which are hereby incorporated by reference.
Streaming 360-degree video content may provide immersive environments for virtual reality (VR) and augmented reality (AR) applications.
In an aspect, an imaging system is provided. The imaging system includes a plurality of cameras configured to capture video image data based on respective fields of view of an environment. Each camera of the plurality of cameras is communicatively coupled to neighbor cameras of the plurality of cameras via a communication interface. Each camera of the plurality of cameras includes at least one processor and a memory. The at least one processor executes instructions stored in memory so as to carry out operations. The operations include capturing video image data of the respective field of view and determining an overlay region. The overlay region includes an overlapping portion of video image data captured by the respective camera and at least one of the neighbor cameras. The operations also include cropping and warping the captured video image data of the respective field of view based on the overlay region to form respective processed video image data.
In an aspect, a method is provided. The method includes receiving processed video image data associated with respective cameras of a plurality of cameras of an imaging system. Each camera of the plurality of cameras is configured to capture video images of respective fields of view of an environment. The processed video image data includes cropped and warped video image data based on an overlay region. The overlay region includes an overlapping portion of video image data captured by at least two neighbor cameras of the plurality of cameras. The method also includes providing streamed video to a client device, via a plurality of communication links. The streamed video is based on the processed video image data.
In an aspect, a system is provided. The system includes various means for carrying out the operations of the other respective aspects described herein.
These as well as other embodiments, aspects, advantages, and alternatives will become apparent to those of ordinary skill in the art by reading the following detailed description, with reference where appropriate to the accompanying drawings. Further, it should be understood that this summary and other descriptions and figures provided herein are intended to illustrate embodiments by way of example only and, as such, that numerous variations are possible. For instance, structural elements and process steps can be rearranged, combined, distributed, eliminated, or otherwise changed, while remaining within the scope of the embodiments as claimed.
Example methods, devices, and systems are described herein. It should be understood that the words “example” and “exemplary” are used herein to mean “serving as an example, instance, or illustration.” Any embodiment or feature described herein as being an “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or features. Other embodiments can be utilized, and other changes can be made, without departing from the scope of the subject matter presented herein.
Thus, the example embodiments described herein are not meant to be limiting. Aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are contemplated herein.
Further, unless context suggests otherwise, the features illustrated in each of the figures may be used in combination with one another. Thus, the figures should be generally viewed as component aspects of one or more overall embodiments, with the understanding that not all illustrated features are necessary for each embodiment.
I. Overview
Virtual reality (VR) 360° video has a long pipeline from production to consumption.
In a real implementation, the pipeline may be broken into several components, for example as illustrated in the accompanying figures.
Another conventional virtual reality 360° video streaming pipeline 400 is illustrated in the accompanying figures.
We propose a distributed infrastructure for VR 360° video capturing and live streaming.
Since the video from each integrated camera module has been carefully warped and cropped, no further alignment is needed to stitch the videos together. One can choose to stitch all videos together in the cloud at relatively low computational cost, or to stitch them during rendering on the final display device (e.g., a VR headset, smartphone, etc.).
II. Example Systems
A. Integrated Camera Unit
In one embodiment, the camera, processor, storage, and transmission modules can be integrated into a single module, as shown in the accompanying figures.
In another embodiment, the camera may be physically separated from the other components, as shown in the accompanying figures.
Each camera unit may also include a microphone, so that all camera units together are able to record and stream sound in different directions.
B. Geometry of VR 360° Camera System
A VR 360° camera system may include various numbers of cameras. These cameras can be geometrically arranged in various ways to cover the desired field of view (FOV).
Fewer cameras may be used if a FOV smaller than a full sphere is desired, or if each individual camera has a larger FOV. More cameras may be used if more overlap between cameras is desired (e.g., for easier stitching or more redundancy), or if each individual camera has a smaller FOV.
In an example embodiment, a pair of cameras may be arranged along each plane (e.g., at each viewpoint) to provide a stereoscopic view from every viewpoint.
Also, although this disclosure provides examples involving 360° video, the same methods and systems may be applied to provide videos with fields of view of less than 360°.
C. Interconnection Between Camera Units
In the proposed VR 360° camera system, each camera unit processes video frames such that output frames from different camera units may be directly stitched to form a spherical view, or may be stitched to form a spherical view with a small amount of further processing. In such a scenario, each camera needs to receive information from its neighboring camera units (e.g., neighbor cameras) to perform the image processing (e.g., warping and/or cropping).
In one embodiment, these camera units are directly connected to one another via a wired or wireless communication interface (e.g., BLUETOOTH, BLUETOOTH LOW ENERGY, WiFi, or another type of communication protocol), as shown in the accompanying figures.
In one embodiment, the geometric position of every camera may be pre-calibrated, which may avoid communication between cameras during runtime. This simplifies the system at the cost of stitching quality, because high-quality stitching depends on both the geometric arrangement of the cameras and the spatial positions of the imaged objects.
Batteries may or may not be included in the integrated units. For example, the imaging system may be externally powered. Additionally or alternatively, batteries may provide some or all power for the imaging system.
In an example embodiment, network devices may be incorporated in the communication link and hardware architecture between the VR camera system and the cloud server so as to speed up or help facilitate the uploading process.
III. Example Methods
A. Camera Synchronization
Each camera unit in the proposed VR 360° camera system captures a portion of a spherical view. To stitch images from each camera together into a 360° image frame (and a 360° video), image capture in each camera unit needs to be synchronized. Camera system clocks may be synchronized based on communication between the cameras. Additionally or alternatively, a synchronizing flash may be fired while the cameras are capturing video. In such a scenario, the cameras may be synchronized by finding the frames from the respective video data that capture the synchronizing flash. Additionally or alternatively, the cameras may be synchronized by analyzing the final video clips (e.g., by stopping the video capture of the cameras at the same time).
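By way of illustration, the following is a minimal sketch of flash-based synchronization, assuming each camera's recorded frames are available as grayscale arrays; the flash is located as the largest frame-to-frame jump in mean brightness. The function names and the offset convention are illustrative assumptions, not part of the described system.

```python
import numpy as np

def flash_frame_index(frames):
    """Return the index of the frame most likely to contain the sync flash.

    `frames` is a sequence of 2-D uint8 arrays (grayscale).  The flash is
    assumed to appear as the largest frame-to-frame jump in mean brightness.
    """
    brightness = np.array([frame.mean() for frame in frames])
    jumps = np.diff(brightness)
    return int(np.argmax(jumps)) + 1  # frame just after the largest jump

def frame_offsets(clips):
    """Map each clip name to a frame offset that places its flash frame at t = 0."""
    return {name: -flash_frame_index(frames) for name, frames in clips.items()}
```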
If all cameras are synchronized, then for any given time, t, in the target 360° video, one may locate one frame on each video clip that is closest to t, and stitch them together. Linear interpolation may be used here in the temporal dimension for better smoothness.
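The nearest-frame lookup and temporal linear interpolation described above might be sketched as follows; the frame-rate and offset conventions, as well as the function names, are assumptions made only for illustration.

```python
import numpy as np

def frame_at(frames, fps, offset_s, t):
    """Return an interpolated frame for time t (seconds).

    `frames` is a sequence of uint8 image arrays from one camera, `fps` its
    frame rate, and `offset_s` its start-time offset after synchronization.
    """
    pos = (t - offset_s) * fps                        # fractional frame position
    i = int(np.clip(np.floor(pos), 0, len(frames) - 2))
    alpha = float(np.clip(pos - i, 0.0, 1.0))         # blend weight toward frame i+1
    a = frames[i].astype(np.float32)
    b = frames[i + 1].astype(np.float32)
    return ((1.0 - alpha) * a + alpha * b).astype(np.uint8)
```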
B. Warping and Cropping for Video Stitching
1. Background
Video stitching difficulty may stem from two sources: 1) lens distortion; and 2) disparity between camera fields of view. Lens distortion may be largely corrected by camera calibration, which may be performed before runtime or at runtime. Camera disparity is scene dependent, and may be addressed using the overlap between cameras when video is being captured.
Frames from one camera unit overlap with those from its neighbor camera units.
Information about the overlapping regions may be transmitted in various ways. For example, a camera unit may transmit to its neighbor camera units the maximum possible overlay regions (shown as red rectangles in the accompanying figures).
2. Video Warping and Cropping for Each Camera Unit
As shown in the accompanying figures, one example depth-based approach proceeds as follows (a minimal computational sketch follows this list):
1. Between F1 and F2, compute the disparity of the rightmost pixel of F1, P1, and then infer the depth of this pixel.
2. Between F1 and F3, compute the disparity of the leftmost pixel of F1, P2, and then infer the depth of this pixel.
3. Estimate the depth of the other pixels in F1 using a linear interpolation of the depths of P1 and P2.
4. With the depth of every pixel estimated, remap F1 to the targeted 72° field of view from the point of view O.
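As a minimal sketch of steps 1-3 above, the edge depths may be recovered from the measured disparities with the standard pinhole relation (depth = focal length × baseline / disparity) and interpolated across the frame. The focal length, baseline, and disparity values below are placeholders; the remapping of step 4 depends on the specific camera model and is not shown.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole-camera relation: depth = f * B / d."""
    return focal_px * baseline_m / disparity_px

def per_column_depth(width, depth_left, depth_right):
    """Step 3: linearly interpolate depth between the leftmost (P2) and
    rightmost (P1) columns of frame F1."""
    return np.linspace(depth_left, depth_right, width)

# Placeholder edge disparities measured against neighbor frames F2 and F3.
depth_p1 = depth_from_disparity(disparity_px=12.0, focal_px=1100.0, baseline_m=0.06)
depth_p2 = depth_from_disparity(disparity_px=18.0, focal_px=1100.0, baseline_m=0.06)
column_depths = per_column_depth(width=1920, depth_left=depth_p2, depth_right=depth_p1)
```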
One need not compute depth explicitly to produce the targeted images. Another example solution, shown in the accompanying figures, proceeds as follows:
1. For each frame from Viewpoint A (2), take the possibly overlapping regions, for example, Regions W and U from Viewpoints C and A, and Regions P and Q from Viewpoints A and B.
2. For each pair of overlapping regions, find the best cut that minimizes discontinuities between W and U, and between P and Q, as shown in (4) and (5). A number of graph cut algorithms have been proposed in the literature (see the seam-finding sketch after this list).
3. As shown in (6), crop off Region W and Region Q from Frame 1, and then horizontally warp the remaining region into a rectangular image.
4. In practice, each frame may have four overlapping regions with its neighbor cameras. As a result, Frame 1 may be cropped in four directions (W, V, Q, and Y), as shown in (7). The remaining region is then warped horizontally and vertically into the final rectangular image.
5. Cropped and warped frames from all cameras cover the full spherical view with reduced stitching artifacts.
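By way of illustration, the following sketch finds a low-cost vertical cut through a pair of overlapping regions (e.g., Region W and Region U). A simple dynamic-programming seam is used here in place of a full graph cut purely for brevity; the inputs are assumed to be two color strips of identical shape.

```python
import numpy as np

def best_vertical_seam(strip_a, strip_b):
    """Return, for each row, the column at which the cut should pass.

    `strip_a` and `strip_b` are H x W x 3 float arrays covering the same
    overlap region as seen by the two neighbor cameras.
    """
    cost = np.sum((strip_a - strip_b) ** 2, axis=-1)   # per-pixel mismatch
    h, w = cost.shape
    acc = cost.copy()
    for y in range(1, h):                              # accumulate costs downward
        left = np.roll(acc[y - 1], 1)
        left[0] = np.inf
        right = np.roll(acc[y - 1], -1)
        right[-1] = np.inf
        acc[y] += np.minimum(np.minimum(left, acc[y - 1]), right)
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(h - 2, -1, -1):                     # backtrack the cheapest path
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
    return seam
```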
In one embodiment, for temporal smoothness, extra smoothness constraints (e.g., edge alignment, movement compensation, color/shape matching, tone mapping, exposure correction, etc.) may be imposed when finding the best cut between overlapping regions.
This processing may be performed for every frame, or at periodic or aperiodic intervals of frames. The graph cut (e.g., the cropping and warping) may be interpolated between image-processing periods based on an estimated movement rate and/or other changes in the respective images.
C. Data Storage
Video data from each camera unit may be saved immediately to local storage, or may be cropped and warped first (as described in the previous section) before being saved to local storage.
Video data may also be encoded (e.g., H.264, VP8, VP9, etc.) before being saved to a single file in local storage. Video data may also be encoded into a series of small chunk files in local storage for later streaming (e.g., HLS, DASH).
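As one possible sketch, a camera unit could invoke the ffmpeg command-line tool to encode its processed video into H.264 HLS chunks; the paths and segment duration below are placeholders, and this assumes ffmpeg is installed on the unit.

```python
import subprocess

def encode_to_hls_chunks(input_path, out_dir, segment_seconds=2):
    """Encode a processed video file into an HLS playlist plus chunk files."""
    subprocess.run([
        "ffmpeg", "-i", input_path,
        "-c:v", "libx264",                  # H.264 encoding
        "-hls_time", str(segment_seconds),  # target chunk duration in seconds
        "-hls_list_size", "0",              # keep every chunk in the playlist
        "-f", "hls",
        f"{out_dir}/stream.m3u8",
    ], check=True)
```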
D. Video Data Upload and Live Stream
Regardless of whether it is saved to local storage, the processed video data may be uploaded to a cloud server in real time, as shown in the accompanying figures.
In this case, the cloud may be able to broadcast to a number of users via various streaming protocols (e.g., HLS, DASH, etc.).
Note that the VR 360° camera system consists of a number of camera units. Each unit uploads one stream of data to the cloud, as shown in the accompanying figures.
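A minimal sketch of the per-unit upload loop follows, assuming each camera unit pushes newly written chunk files to a cloud endpoint over HTTP; the endpoint URL, file naming, and polling interval are hypothetical placeholders.

```python
import os
import time
import requests

def upload_new_chunks(chunk_dir, camera_id, endpoint="https://cloud.example.com/upload"):
    """Continuously upload chunk files as the local encoder produces them."""
    uploaded = set()
    while True:
        for name in sorted(os.listdir(chunk_dir)):
            if name.endswith(".ts") and name not in uploaded:
                with open(os.path.join(chunk_dir, name), "rb") as chunk:
                    requests.put(f"{endpoint}/{camera_id}/{name}", data=chunk)
                uploaded.add(name)
        time.sleep(0.5)  # poll for chunks written since the last pass
```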
1. Stream and Stitch on Client Application
An application on a client device first retrieves metadata from the stream server, and then connects to all of the required lists of video chunks. The application streams and synchronizes the required video chunks as needed, stitches them together, and renders them to the screen, thereby providing a VR 360° video for end users. One possible client application is described in U.S. Provisional Patent Application No. 62/320,451, filed Apr. 8, 2016.
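The client-side bootstrapping might look like the following sketch, in which the metadata is assumed (purely for illustration) to be a JSON object mapping camera identifiers to per-camera playlist URLs.

```python
import requests

def fetch_camera_playlists(metadata_url):
    """Fetch stream metadata, then the chunk playlist for each camera."""
    meta = requests.get(metadata_url).json()   # e.g., {"cam0": ".../cam0.m3u8", ...}
    playlists = {}
    for camera_id, playlist_url in meta.items():
        playlists[camera_id] = requests.get(playlist_url).text
    return playlists  # the app then streams, synchronizes, stitches, and renders
```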
2. Stitch on Cloud and then Stream to Client Application
One may also stitch the video data from all camera units in the cloud. Since all data have already been aligned before uploading, the computation required for stitching is relatively low and may be performed in real time. After stitching, the stitched video data appears as a regular video stream, which may be streamed to client devices via standard streaming protocols (e.g., HLS, DASH, etc.).
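Because the uploaded frames are already cropped and warped on the camera units, cloud-side stitching can reduce, in the simplest case, to tiling the frames; the sketch below assumes a single horizontal ring of cameras whose frames share the same height.

```python
import numpy as np

def stitch_ring(frames_in_order):
    """Concatenate pre-aligned frames from one camera ring into a single strip.

    `frames_in_order` is a list of H x W x 3 uint8 arrays of equal height;
    top- and bottom-facing cameras would be placed separately.
    """
    return np.concatenate(frames_in_order, axis=1)
```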
3. Combine Cloud Stitching and Client Stitching
One may also stitch a low-resolution 360° video in the cloud, and stitch the high-resolution 360° video in the client application. To save computation in the cloud, each camera unit in the VR 360° camera system may upload two series of chunks to the cloud, one series of high-resolution chunks and one series of low-resolution chunks, as shown in the accompanying figures.
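As a sketch of the client's chunk selection under this combined approach, high-resolution chunks might be fetched only for the cameras whose viewing directions fall inside the current viewport, with the low-resolution cloud-stitched stream covering everything else. The camera yaw spacing below is an illustrative assumption.

```python
def cameras_in_viewport(view_yaw_deg, viewport_fov_deg, camera_yaws_deg):
    """Return the indices of cameras whose center direction lies in the viewport."""
    half = viewport_fov_deg / 2.0
    selected = []
    for i, cam_yaw in enumerate(camera_yaws_deg):
        delta = (cam_yaw - view_yaw_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
        if abs(delta) <= half:
            selected.append(i)
    return selected

# Example: five cameras spaced 72 degrees apart, 90-degree viewport facing yaw 100.
high_res_cameras = cameras_in_viewport(100.0, 90.0, [0.0, 72.0, 144.0, 216.0, 288.0])
```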
The particular arrangements shown in the Figures should not be viewed as limiting. It should be understood that other embodiments may include more or less of each element shown in a given Figure. Further, some of the illustrated elements may be combined or omitted. Yet further, an illustrative embodiment may include elements that are not illustrated in the Figures.
A step or block that represents a processing of information can correspond to circuitry that can be configured to perform the specific logical functions of a herein-described method or technique. Alternatively or additionally, a step or block that represents a processing of information can correspond to a module, a segment, or a portion of program code (including related data). The program code can include one or more instructions executable by a processor for implementing specific logical functions or actions in the method or technique. The program code and/or related data can be stored on any type of computer readable medium such as a storage device including a disk, hard drive, or other storage medium.
The computer readable medium can also include non-transitory computer readable media such as computer-readable media that store data for short periods of time like register memory, processor cache, and random access memory (RAM). The computer readable media can also include non-transitory computer readable media that store program code and/or data for longer periods of time. Thus, the computer readable media may include secondary or persistent long term storage, like read only memory (ROM), optical or magnetic disks, compact-disc read only memory (CD-ROM), for example. The computer readable media can also be any other volatile or non-volatile storage systems. A computer readable medium can be considered a computer readable storage medium, for example, or a tangible storage device.
While various examples and embodiments have been disclosed, other examples and embodiments will be apparent to those skilled in the art. The various disclosed examples and embodiments are for purposes of illustration and are not intended to be limiting, with the true scope being indicated by the following claims.