The present disclosure relates generally to the streaming of spherical videos (so-called Virtual Reality (VR) 360° videos) to an end device through a delivery network.
This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present disclosure that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.

VR 360° videos offer an immersive experience wherein a user can look around using a VR head-mounted display (HMD) or can navigate freely within a scene on a flat display by controlling the viewport with a controlling apparatus (such as a mouse or a remote control).
Such freedom of spatial navigation requires that the whole 360° scene be delivered to a player (embedded within the HMD or TV set) configured to extract the video portion to be visualized depending on the position of the viewport within the scene. A high throughput is therefore necessary to deliver the video. Indeed, it is commonly admitted that a field of vision spanning 360° in the horizontal direction and 180° in the vertical direction can be entirely covered by a user with a minimum set of twelve viewports. To offer an unrestricted VR 360° video service in 4K resolution, a video stream equivalent to twelve 4K videos thus has to be provided.
Therefore, one main issue lies in the efficient transmission of VR 360° videos over bandwidth-constrained networks with an acceptable quality of immersive experience (i.e. avoiding screen freezes, blockiness, black screens, etc.). Currently, for delivering a VR 360° video service in streaming, a compromise has to be reached between the immersive experience, the resolution of the video and the available throughput of the content delivery network.
The majority of known solutions for streaming VR 360° videos provide the full 360° scene to the end device, even though less than 10% of the whole scene is presented to the user at any given time. Since delivery networks have limited throughput, the video resolution is decreased to meet bandwidth constraints.
Other known solutions mitigate the degradation of the video quality by reducing the resolution of the portion of the 360° scene arranged outside of the current viewport of the end device. Nevertheless, when the viewport of the end device is moved, upon a user's action, to a lower-resolution area, the displayed video suffers from a sudden degradation.
The present disclosure has been devised with the foregoing in mind.
The disclosure concerns a method for tiling with a set of tiles a sphere representing a spherical multimedia content, said method comprising:
In an embodiment, each transformation associated with a corresponding tile of the set of tiles can be defined by a rotation matrix.
In an embodiment, said rotation matrix can be a matrix product of two rotation matrices defined by the following equation:
Rotij=Rot(y, φij)×Rot(x, θj)
wherein Rot(y, φij) is a rotation matrix of angle φij around the y axis and Rot(x, θj) is a rotation matrix of angle θj around the x axis.
In an embodiment, for said corresponding tile,
In an embodiment, the tile horizontal angular amplitude and the tile vertical angular amplitude can depend on service parameters.
In an embodiment, the number of parallel lines can depend on the tile vertical angular amplitude and a vertical overlapping ratio.
In an embodiment, the number of tiles on a parallel line can depend on the tile horizontal angular amplitude and a horizontal overlapping ratio.
In an embodiment, the angular amplitude between two parallel lines can be constant.
In an embodiment, the tiles of said set of tiles can have the same shape.
The present disclosure also concerns a network equipment configured for tiling with a set of tiles a sphere representing a spherical multimedia content, said network equipment comprising at least one memory and at least one processing circuitry configured to perform:
In an embodiment, each transformation associated with a corresponding tile of the set of tiles can be defined by a rotation matrix.
In an embodiment, said rotation matrix can be a matrix product of two rotation matrices defined by the following equation:
Rotij=Rot(y, φij)×Rot(x, θj)
wherein Rot(y, φij) is a rotation matrix of angle φij around the y axis and Rot(x, θj) is a rotation matrix of angle θj around the x axis.
In an embodiment, for said corresponding tile,
In an embodiment, the tile horizontal angular amplitude and the tile vertical angular amplitude can depend on service parameters.
In an embodiment, the number of parallel lines can depend on the tile vertical angular amplitude and a vertical overlapping ratio.
The present disclosure also concerns a method to be implemented at a terminal configured to be in communication with a network equipment to receive a spherical multimedia content represented by a sphere, wherein the method comprises receiving:
The present disclosure further concerns a terminal configured to be in communication with a network equipment to receive a spherical multimedia content represented by a sphere, wherein said terminal comprises at least one memory and at least one processing circuitry configured to receive:
Besides, the present disclosure is further directed to a non-transitory program storage device, readable by a computer, tangibly embodying a program of instructions executable by the computer to perform a method for tiling with a set of tiles a sphere representing a spherical multimedia content, which comprises:
The present disclosure also concerns a computer program product which is stored on a non-transitory computer readable medium and comprises program code instructions executable by a processor for implementing a method for tiling with a set of tiles a sphere representing a spherical multimedia content, which comprises:
The method according to the disclosure may be implemented in software on a programmable apparatus. It may be implemented solely in hardware or in software, or in a combination thereof.
Some processes implemented by elements of the present disclosure may be computer implemented. Accordingly, such elements may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as “circuit”, “module” or “system”. Furthermore, such elements may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since elements of the present disclosure can be implemented in software, the present disclosure can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device or a solid state memory device and the like.
The disclosure thus provides a computer-readable program comprising computer-executable instructions to enable a computer to perform the method for tiling with a set of tiles a sphere representing a spherical multimedia content according to the disclosure.
Certain aspects commensurate in scope with the disclosed embodiments are set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of certain forms the disclosure might take and that these aspects are not intended to limit the scope of the disclosure. Indeed, the disclosure may encompass a variety of aspects that may not be set forth below.
The disclosure will be better understood and illustrated by means of the following embodiment and execution examples, in no way limitative, with reference to the appended figures on which:
Wherever possible, the same reference numerals will be used throughout the figures to refer to the same or like parts.
The following description illustrates the principles of the present disclosure. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the disclosure and are included within its scope.
All examples and conditional language recited herein are intended for educational purposes to aid the reader in understanding the principles of the disclosure and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the disclosure, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the principles of the disclosure. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read only memory (ROM) for storing software, random access memory (RAM), and nonvolatile storage.
In the claims hereof, any element expressed as a means and/or module for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
In addition, it is to be understood that the figures and descriptions of the present disclosure have been simplified to illustrate elements that are relevant for a clear understanding of the present disclosure, while eliminating, for purposes of clarity, many other elements found in typical digital multimedia content delivery methods, devices and systems. However, because such elements are well known in the art, a detailed discussion of such elements is not provided herein. The disclosure herein is directed to all such variations and modifications known to those skilled in the art.
The present disclosure is described with regard to a streaming environment delivering a spherical multimedia content (such as a VR 360° video) to a client terminal through a delivery network.
As shown in
The client terminal 100—connected to the gateway 200 through a first network N1 (such as a home network or an enterprise network)—may wish to request a VR 360° video stored on a remote network equipment 300 through a second network N2 (such as the Internet). The first network N1 is connected to the second network N2 via the gateway 200.
The network equipment 300 is configured to stream segments to the client terminal 100, upon client request, using a streaming protocol (such as the HTTP adaptive streaming protocol, so-called HAS).
As shown in the example of
As an example, the client terminal 100 is a portable media device, a mobile phone, a tablet or a laptop, a head mounted device, a set-top box or the like. Naturally, the client terminal 100 might not comprise a complete video player, but only some sub-elements such as the ones for demultiplexing and decoding the media content and might rely upon an external means to display the decoded content to the end user.
As shown in the example of
According to the present principles, the network equipment 300 (e.g. via its processor(s) 304 and/or content generator 306) is configured to implement a method 400 (shown in
As shown in the example of
The tile horizontal angular amplitude φtile and the tile vertical angular amplitude θtile can be determined by taking into account one or several of the following service parameters of the targeted VR 360° video service:
The reference tile 600R depicted in
The central point (so called centroid) Cij of a tile 600 of the set of tiles can be defined with the spherical coordinates (1, θj, φij), in the system R(O,x,y,z).
To determine the centroids of the tiles 600, the network equipment 300 can, in a step 402, obtain an altitude θj for each parallel line Lj of the sphere 500 which comprises one or several centroids Cij of tiles 600 of the set of tiles. The number of parallel lines Lj depends on the tile vertical angular amplitude θtile and the vertical overlapping ratio Rvert. The angle between two consecutive parallel lines Lj can be defined by the following equation:
Δθ=θtile×(1−Rvert)
A list of possible θj values for the centroids Cij of the tiles 600 can then be derived, given by:
θj=Δθ×j=θtile×(1−Rvert)×j
wherein j belongs to [0, . . . , Nparallels per hemisphere−1] with the maximum number of parallel lines Lj per hemisphere given by Nparallels per hemisphere=90°/Δθ.
It is worth noting that the maximum number of parallel lines Lj per hemisphere can be lowered. The closer the viewport gets to the poles, the more the navigation becomes a simple rotation around a single point. Consequently, a band of tiles can be less efficient at the poles and can be replaced by a star-shaped layout as described hereinafter. In an illustrative but non-limitative example, the pole case can reduce by one the number of parallel lines per hemisphere, so that the number of parallel lines Lj becomes:
Nparallels per hemisphere=(90°/Δθ)−1
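For illustration purposes only, and not as part of the claimed subject matter, the following Python sketch (function and parameter names are illustrative) computes the list of parallel-line altitudes θj according to the above equations, including the optional reduction of one parallel line per hemisphere for the pole case:

```python
def parallel_altitudes(theta_tile_deg, r_vert, reduce_for_poles=True):
    """Altitudes (in degrees) of the parallel lines Lj carrying tile
    centroids on one hemisphere, per the equations above."""
    delta_theta = theta_tile_deg * (1.0 - r_vert)         # Δθ = θtile × (1 − Rvert)
    n_parallels = int(90.0 / delta_theta)                 # Nparallels per hemisphere
    if reduce_for_poles:
        n_parallels -= 1                                  # pole area covered by the star-shaped layout
    return [delta_theta * j for j in range(n_parallels)]  # θj = Δθ × j

# Example: θtile = 68° and Rvert = 3/4 give Δθ = 17° and altitudes [0°, 17°, 34°, 51°]
print(parallel_altitudes(68.0, 0.75))
```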
Once the parallel lines Lj are defined, the network equipment 300 can, in a step 403, further determine the horizontal angular position of the centroids Cij on the corresponding parallel lines Lj, such that, for instance, they spatially meet the horizontal overlapping ratio Rhor. The number of tiles 600 arranged on a parallel line Lj decreases when moving toward the poles P, as it is proportional to the circumference of the parallel line Lj. By considering a circumference CE at the equator E, the circumference Cj for a parallel line Lj at an altitude θj is given by the following formula:
Cj=CE×cos(θj)
The number of tiles 600 per parallel line Lj can depend on the tile horizontal angular amplitude φtile and the horizontal overlapping ratio Rhor. In particular, for the parallel line L0 arranged at θ0=0°, the angular deviation between two consecutive centroids Ci0 belonging to L0 is given by:
Δφ0=φtile×(1−Rhor)
so that the number of tiles on the parallel line L0 can be derived as follows:
Ntiles on parallel L0=360°/Δφ0
This leads to a list of possible φi0 values for the tile centroids Ci0 on the parallel line L0, given by:
φi0=Δφ0×i
with i belonging to [0, . . . , Ntiles on parallel L0−1].
The number of tiles on a parallel line Lj can then be obtained from the following formula:
Ntiles on parallel Lj=Ntiles on parallel L0×cos(θj)=360°×cos(θj)/Δφ0
In addition, the angular deviation between two consecutive centroids arranged on a parallel line Lj is derived from the following equation:
Δφj=360°/Ntiles on parallel Lj=Δφ0/cos(θj)
For a centroid Cij arranged on a parallel line Lj, the angular position of the centroid Cij in the system R(O,x,y,z) can be obtained as follows:
φij=Δφj×i=Δφ0×i/cos(θj)
with i belonging to [0, . . . , Ntiles on parallel Lj−1].
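As a further illustrative, non-limitative sketch (Python, illustrative names; the rounding of the tile count to an integer is an assumption, as the disclosure does not specify it), the centroid longitudes φij on a given parallel line Lj can be derived as follows:

```python
import math

def centroid_longitudes(phi_tile_deg, r_hor, theta_j_deg):
    """Longitudes φij (in degrees) of the tile centroids on parallel line Lj."""
    delta_phi_0 = phi_tile_deg * (1.0 - r_hor)                # Δφ0 = φtile × (1 − Rhor)
    n_tiles_l0 = round(360.0 / delta_phi_0)                   # Ntiles on parallel L0
    # Ntiles on parallel Lj = Ntiles on parallel L0 × cos(θj), rounded to an integer
    n_tiles_lj = max(1, round(n_tiles_l0 * math.cos(math.radians(theta_j_deg))))
    delta_phi_j = 360.0 / n_tiles_lj                          # Δφj = 360°/Ntiles on parallel Lj
    return [delta_phi_j * i for i in range(n_tiles_lj)]       # φij = Δφj × i

# Example: φtile = 120° and Rhor = 3/4 give Δφ0 = 30°, i.e. twelve tiles on the equator L0
print(len(centroid_longitudes(120.0, 0.75, 0.0)))  # -> 12
```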
φij represents a rotation angle around axis y applied to the segment OC, and θj a rotation angle around axis x applied to OC. The segment OCij can thus be obtained by a rotation matrix applied to OC, defined (step 404) as follows:
OCij=Rotij(OC)
with Rotij the rotation matrix.
In an embodiment of the present principles, the rotation matrix Rotij can be a matrix product of two rotation matrices defined by the following equation:
Rotij=Rot(y, φij)×Rot(x, θj)
wherein Rot(y, φij) is a rotation matrix of angle φij around the y axis and Rot(x, θj) is a rotation matrix of angle θj around the x axis.
In an embodiment of the present principles, since every tile of the set of tiles has the same shape, the tile mesh associated with the tile of centroid Cij (the mesh of a tile being centered on said tile) can be obtained by applying, in a step 405, the rotation matrix Rotij to a reference tile mesh associated with the reference tile 600R of centroid C. The reference tile 600R can serve as a model for all the tiles. The rotation matrix Rotij is then applied to all vertices of the reference mesh to obtain the vertices of the tile mesh associated with the tile centered on Cij.
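For illustration, a minimal Python/NumPy sketch of step 405 follows; the sign conventions of the rotations are an assumption, as they depend on the orientation chosen for the system R(O,x,y,z):

```python
import numpy as np

def rot_x(theta_deg):
    """Rotation matrix of angle θ around the x axis."""
    t = np.radians(theta_deg)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(t), -np.sin(t)],
                     [0.0, np.sin(t),  np.cos(t)]])

def rot_y(phi_deg):
    """Rotation matrix of angle φ around the y axis."""
    p = np.radians(phi_deg)
    return np.array([[ np.cos(p), 0.0, np.sin(p)],
                     [ 0.0,       1.0, 0.0      ],
                     [-np.sin(p), 0.0, np.cos(p)]])

def tile_mesh(reference_vertices, phi_ij_deg, theta_j_deg):
    """Apply Rotij = Rot(y, φij) × Rot(x, θj) to every vertex of the
    reference tile mesh (an (N, 3) array of row vectors) to obtain the
    mesh of the tile centered on Cij."""
    rot_ij = rot_y(phi_ij_deg) @ rot_x(theta_j_deg)
    return reference_vertices @ rot_ij.T
```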
In a step 406, the network equipment 300 can determine the pixel content of the tile, e.g. by using a known ray-tracing technique computing the ray intersections between the rotated tile shape and a 360° video frame of the VR 360° video projected on the sphere 500.
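The disclosure does not mandate a particular source projection; assuming, for illustration only, that the 360° video frame is in equirectangular format with y as the polar axis, the pixel content along the tile vertex directions could be sampled as in the following sketch (naming and conventions are assumptions):

```python
import numpy as np

def sample_equirectangular(directions, frame):
    """Sample an equirectangular 360° frame (H x W x 3 array) along unit
    direction vectors on the sphere (nearest-neighbour lookup)."""
    x, y, z = directions[:, 0], directions[:, 1], directions[:, 2]
    h, w = frame.shape[:2]
    u = (np.arctan2(x, z) / (2.0 * np.pi) + 0.5) * (w - 1)          # longitude -> column
    v = (0.5 - np.arcsin(np.clip(y, -1.0, 1.0)) / np.pi) * (h - 1)  # latitude -> row
    return frame[v.round().astype(int), u.round().astype(int)]
```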
It should be noted that a few tiles can be distributed in a star-shaped way at each pole P to complete the tiling of the sphere 500. For instance, in an illustrative but non-limitative example, the star-shaped distribution can comprise six tiles (each covering a tile horizontal angular amplitude φtile and a tile vertical angular amplitude θtile) at each pole P, regularly arranged (e.g. the angular deviation between the centers Cij of two consecutive tiles is equal to 60°). The axial tilt between the normal axis of a tile at center Cij and the y axis of the orthogonal system R(O,x,y,z) can be equal to 5°.
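Under the assumed conventions of the previous sketches (reusing the rot_x/rot_y helpers above), the rotations placing the six pole tiles could be generated as follows; this is one possible interpretation of the 5° axial tilt, not a normative layout:

```python
def pole_tile_rotations(north=True, tilt_deg=5.0, n_tiles=6):
    """Rotations for the star-shaped pole layout: n_tiles tiles spaced
    360°/n_tiles apart around the pole (60° for six tiles), each tile
    normal tilted tilt_deg away from the y axis."""
    lat = (90.0 - tilt_deg) * (1.0 if north else -1.0)  # 85° latitude for a 5° tilt
    return [rot_y(k * 360.0 / n_tiles) @ rot_x(lat) for k in range(n_tiles)]
```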
As shown in
Besides, according to the present principles, the streaming controller 103 of the client terminal 100—receiving the VR 360° video from the network equipment 300—can be further configured to continually select the segments associated with the tiles covering, for instance, the current viewport associated with the terminal 100. In the example of adaptive streaming, the switch from a current tile to a next tile—both comprising the current viewport—may occur only at the end of a video segment and at the beginning of the next one.
To this end, the client terminal 100 can receive, from the network equipment 300, the values of the tile horizontal and vertical angular amplitudes (φtile, θtile), in order to be able to regenerate the reference tile mesh. The network equipment 300 can also send all the vertices of the reference tile 600R to the terminal 100, along with the list of rotation matrices Rotij to be applied to the reference tile mesh to obtain the tiles covering the sphere 500. In a variant, the network equipment can share with the terminal 100 only the polar coordinates of the centroids Cij, when the terminal 100 is configured to dynamically re-compute the rotation matrices by using appropriate mathematical libraries.
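As a sketch of the terminal-side regeneration (the latitude-longitude patch shape of the reference tile is an assumption; the disclosure only requires that the terminal can rebuild the reference tile mesh from the signalled amplitudes):

```python
import numpy as np

def reference_tile_mesh(phi_tile_deg, theta_tile_deg, n=8):
    """Regenerate, on the terminal side, an n x n reference tile mesh on the
    unit sphere, centered on the reference centroid C (assumed on the z axis),
    from the signalled amplitudes (φtile, θtile). Grid resolution n is illustrative."""
    phis = np.radians(np.linspace(-phi_tile_deg / 2.0, phi_tile_deg / 2.0, n))
    thetas = np.radians(np.linspace(-theta_tile_deg / 2.0, theta_tile_deg / 2.0, n))
    pp, tt = np.meshgrid(phis, thetas)
    # Spherical-to-Cartesian with y as polar axis and longitude measured around y
    return np.stack([np.cos(tt) * np.sin(pp),   # x
                     np.sin(tt),                # y
                     np.cos(tt) * np.cos(pp)],  # z
                    axis=-1).reshape(-1, 3)
```

The rotation matrices Rotij can then either be received from the network equipment 300 or recomputed locally from the centroid coordinates (θj, φij) with helpers such as rot_x/rot_y above.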
In an illustrative but non-limitative example of the present principles shown in
In the example, the 4K video tiles are delivered to the terminal 100 with a horizontal FOV equal to 120° and a vertical FOV equal to about 68° (to respect the 16:9 ratio of the VR 360° video). By considering a horizontal overlapping ratio Rhor equal to ¾ along the equator E between two consecutive tiles (leading to a horizontal angular overlap of 90°), the shift between two consecutive tiles is equal to 30°, so that twelve tiles are defined on the equator E (parallel line L0). The same operation can be applied vertically when moving from the south pole to the north pole P. The vertical angular overlap is equal to 51° when considering a vertical overlapping ratio Rvert equal to ¾, meaning that the vertical shift from a crown of tiles to the upper one is equal to 17°, so that eleven tiles can be arranged on a given meridian of the sphere. Besides, a few tiles are organized in a star-shaped way at each pole P to complete the tiling of the sphere representing the VR 360° video. In the end, about seventy tiles are required to cover the whole sphere.
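A short numeric check of this example, following the equations above (pure arithmetic):

```python
delta_phi_0 = 120.0 * (1.0 - 0.75)   # Δφ0 = 30° shift along the equator
print(360.0 / delta_phi_0)           # -> 12 tiles on the equator L0
delta_theta = 68.0 * (1.0 - 0.75)    # Δθ = 17° vertical shift between crowns
print(round(180.0 / delta_theta))    # -> 11 tiles along a meridian (pole to pole)
```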
Thanks to the above-described method, by delivering only a portion of the scene, the ratio of video quality over data bitrate can be controlled and a high-quality video on the client side can be obtained, even under network bandwidth constraints. In addition, by generating a tile larger than the viewport and adapted to the display ratio, a minimal user navigation within the video without disruption can be provided. Furthermore, by building tiles of the same shape for all viewports, a quality reduction at the poles can be avoided.
References disclosed in the description, the claims and the drawings may be provided independently or in any appropriate combination. Features may, where appropriate, be implemented in hardware, software, or a combination of the two.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one implementation of the method and device described. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments.
Reference numerals appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.
Although certain embodiments only of the disclosure have been described herein, it will be understood by any person skilled in the art that other modifications, variations, and possibilities of the disclosure are possible. Such modifications, variations and possibilities are therefore to be considered as falling within the spirit and scope of the disclosure and hence forming part of the disclosure as herein described and/or exemplified.
The flowchart and/or block diagrams in the Figures illustrate the configuration, operation and functionality of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, or blocks may be executed in an alternative order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of the blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. While not explicitly described, the present embodiments may be employed in any combination or sub-combination.
Number | Date | Country | Kind
---|---|---|---
17306264.7 | Sep 2017 | EP | regional

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/EP2018/074961 | 9/14/2018 | WO | 00