The present disclosure relates generally to the streaming of spherical videos (so called 360° videos) to an end device through a delivery network.
This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present disclosure that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
Spherical video content renders a scene with a 360° horizontal (and 180° vertical) angle, allowing the user to navigate (i.e. pan) within the spherical scene, whose capture point moves along the camera motion decided by an operator/scenarist. Spherical content is obtained with a multi-head camera, the scene being composed by stitching the camera views, projecting them onto a sphere, mapping the sphere content onto a plane (for instance through an equirectangular projection) and compressing it with conventional video encoders.
Spherical videos offer an immersive experience wherein a user can look around using an adapted end-device (such as a head-mounted display (HMD)) or can navigate freely within a scene on a flat display by controlling the viewport with a controlling apparatus (such as a mouse, a remote control or a touch screen).
Such freedom in spatial navigation requires that the whole spherical scene be delivered to a player (embedded within the HMD or TV set) configured to extract the video portion to be visualized depending on the position of the viewport within the scene. Therefore, a high bandwidth is necessary to deliver the whole spherical video (to offer an unrestricted spherical video service in 4K resolution, a video stream equivalent to twelve 4K videos has to be provided).
The majority of known solutions for streaming spherical videos provide the full spherical scene to the end device, although only less than 10% of the whole scene is presented to the user. Since delivery networks have limited bandwidth, the video quality is decreased to meet bandwidth constraints.
Other known solutions mitigate the degradation of the video quality by reducing the resolution of the portion of the 360° scene arranged outside of the current viewport of the end device (i.e. the complete spherical scene is sent from a server with a non-uniform coding). In particular, 30 different viewports can be required to cover the whole spherical scene, so that 30 different versions of the same immersive video are generated and stored at the server side. Nevertheless, when the viewport of the end device is moved, upon a user's action, outside of the highest resolution areas, the displayed video suffers from a sudden degradation.
The present disclosure has been devised with the foregoing in mind.
The disclosure concerns a method for tiling with a set of tiles a sphere representing a scene of a spherical immersive content, said method comprising:
In an embodiment, the tiles of the set of tiles can be distributed amongst three different areas of the sphere.
In an embodiment, the three areas comprise an equator area surrounding the equator of the sphere and two pole areas arranged at the poles of the sphere.
In an embodiment, the method can comprise:
In an embodiment, each of said first rotation matrices can be a first matrix product of two rotation matrices defined by the following equation:
Rotij=Rot(y,φij)×Rot(x,θj)
wherein:
In an embodiment, the equator area can comprise a number of parallel lines depending on the vertical angular amplitude of the tiles of the first type.
In an embodiment, the number L of parallel lines of the equator area can be given by:
L=round(90°/θtile)+1
wherein θtile is the vertical angular amplitude of the tiles of the first type.
In an embodiment, the method can comprise:
In an embodiment, each of said second rotation matrices can be a second matrix product of three rotation matrices defined by the following equation:
Rot′ij=Rot(x,ψi)×Rot(y,φij)×Rot(x,θj)
wherein:
In an embodiment, a pole area of the pole areas can comprise a number of parallel lines depending on the vertical angular amplitude of the tiles of the second type.
In an embodiment, the number L of parallel lines can be given by:
L=round(P°/Ωtile)+1
wherein P° is the angular amplitude delimiting a pole area and Ωtile is the vertical angular amplitude of the tiles of the second type.
In an embodiment, the tiles of the first type can have a rectangular shape and the tiles of the second type can have a square shape.
The present disclosure also concerns a network equipment configured for tiling with a set of tiles a sphere representing a scene of a spherical immersive content, said network equipment comprising at least one memory and at least one processing circuitry configured to spatially split the scene of the spherical multimedia content with at least a first type of tiles and a second type of tiles.
In an embodiment, the tiles of the set of tiles can be distributed amongst three areas on the scene.
In an embodiment, the three areas can comprise an equator area surrounding the equator of the sphere and two pole areas arranged at the poles of the sphere.
The present disclosure is further directed to a method to be implemented at a terminal configured to be in communication with a network equipment to receive a spherical immersive content with a scene represented by a sphere,
wherein the method comprises receiving information on a tiling of the scene with a set of tiles from the network equipment, the tiling spatially splitting the scene of the spherical multimedia content with at least a first type of tiles and a second type of tiles.
In addition, the present disclosure also concerns a terminal configured to be in communication with a network equipment to receive a spherical immersive content with a scene represented by a sphere,
wherein said terminal comprises at least one memory and at least one processing circuitry configured for receiving information on a tiling of the scene with a set of tiles from the network equipment, the tiling spatially splitting the scene of the spherical multimedia content with at least a first type of tiles and a second type of tiles.
Besides, the present disclosure is further directed to a non-transitory program storage device, readable by a computer, tangibly embodying a program of instructions executable by the computer to perform a method for tiling with a set of tiles a sphere representing a scene of a spherical immersive content,
said method comprising:
The present disclosure also concerns a computer program product which is stored on a non-transitory computer readable medium and comprises program code instructions executable by a processor for implementing a method for tiling with a set of tiles a sphere representing a scene of a spherical immersive content, said method comprising:
The method according to the disclosure may be implemented in software on a programmable apparatus. It may be implemented solely in hardware or in software, or in a combination thereof.
Some processes implemented by elements of the present disclosure may be computer implemented. Accordingly, such elements may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as “circuit”, “module” or “system”. Furthermore, such elements may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since elements of the present disclosure can be implemented in software, the present disclosure can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device or a solid state memory device and the like.
The disclosure thus provides a computer-readable program comprising computer-executable instructions to enable a computer to perform the method for tiling with a set of tiles a sphere representing a spherical multimedia content according to the disclosure.
Certain aspects commensurate in scope with the disclosed embodiments are set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of certain forms the disclosure might take and that these aspects are not intended to limit the scope of the disclosure. Indeed, the disclosure may encompass a variety of aspects that may not be set forth below.
The disclosure will be better understood and illustrated by means of the following embodiment and execution examples, in no way limitative, with reference to the appended figures on which:
Wherever possible, the same reference numerals will be used throughout the figures to refer to the same or like parts.
The following description illustrates the principles of the present disclosure. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the disclosure and are included within its scope.
All examples and conditional language recited herein are intended for educational purposes to aid the reader in understanding the principles of the disclosure and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the disclosure, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the principles of the disclosure. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read only memory (ROM) for storing software, random access memory (RAM), and nonvolatile storage.
In the claims hereof, any element expressed as a means and/or module for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
In addition, it is to be understood that the figures and descriptions of the present disclosure have been simplified to illustrate elements that are relevant for a clear understanding of the present disclosure, while eliminating, for purposes of clarity, many other elements found in typical digital multimedia content delivery methods, devices and systems. However, because such elements are well known in the art, a detailed discussion of such elements is not provided herein. The disclosure herein is directed to all such variations and modifications known to those skilled in the art.
The present disclosure is depicted with regard to a streaming environment to deliver a spherical multimedia content (such as a spherical video) to a client terminal through a delivery network.
As shown in the illustrative but non-limiting example of
The client terminal 100—connected to the gateway 200 through a first network N1 (such as a home network or an enterprise network)—may wish to request a spherical video stored on a remote network equipment 300 through a second network N2 (such as the Internet network). The first network N1 is connected to the second network N2 thanks to the gateway 200.
The network equipment 300 is configured to stream segments to the client terminal 100, according to the client request, using a streaming protocol (such as the HTTP adaptive streaming protocol, so called HAS).
As shown in the example of
As an example, the client terminal 100 is a portable media device, a mobile phone, a tablet or a laptop, a head mounted device, a set-top box or the like. Naturally, the client terminal 100 might not comprise a complete video player, but only some sub-elements such as the ones for demultiplexing and decoding the media content and might rely upon an external means to display the decoded content to the end user.
As shown in the example of
According to the present principles, the network equipment 300 (e.g. via its processor(s) 304 and/or content generator 306) can be configured to implement a method 400 (shown in
In particular, in an embodiment of the present principles, the scene 500 of the spherical video is spatially split with a first type of tiles (e.g. of rectangular shape) on an equator area surrounding the equator L0 and with a second type of tiles (e.g. of square shape) on two pole areas arranged at the poles of the sphere 500. The rectangular tiles are distributed over the equator area and the square tiles are arranged in the two distinct pole areas. The first type and the second type of tiles are different from each other. Naturally, other shapes of tiles might be considered, without departing from the scope of the present principles.
As shown in the example of
While it might be different, in the considered embodiment, the horizontal angular amplitude φtile of a rectangular tile 600 is distinct from the horizontal angular amplitude Ωtile of a square tile 700. In particular, in an illustrative and non-limiting example of the present principles, the horizontal and vertical angular amplitudes Ωtile of a square tile 700 can be defined by:
Ωtile=√(φtile×θtile)
which might be set to round(√(φtile×θtile))+1 degrees (wherein round is a function returning the nearest integer).
The tile horizontal angular amplitude φtile and the tile vertical angular amplitude θtile can be determined by taking into account one or several service parameters of the targeted spherical video service (such as the network bandwidth available for delivery along a transmission path between the client terminal 100 and the network equipment 300, a quality of the requested spherical video, a user field of view associated with the viewport of the client terminal 100, etc.).
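By way of illustration only, the following minimal Python sketch shows how the square-tile amplitude Ωtile can be derived from the rectangular-tile amplitudes as described above; the example values of φtile and θtile are assumptions, not values taken from the present disclosure.

```python
import math

# Assumed example amplitudes (in degrees) for a rectangular tile; the actual
# values would be derived from the service parameters listed above.
phi_tile = 60.0    # horizontal angular amplitude of a rectangular tile
theta_tile = 36.0  # vertical angular amplitude of a rectangular tile

# Square-tile amplitude as defined above: the geometric mean of the
# rectangular-tile amplitudes, optionally rounded to the nearest integer plus 1.
omega_tile = math.sqrt(phi_tile * theta_tile)
omega_tile_rounded = round(math.sqrt(phi_tile * theta_tile)) + 1

print(omega_tile, omega_tile_rounded)  # approximately 46.5 and 47 with these assumed values
```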
Tiles Determination for the Equator Area
A reference rectangular tile 600R depicted in
In addition, the central point (so called centroid) Cij of a rectangular tile 600 belonging to the equator area 800 (shown in
To determine the centroids Cij (shown in
The number L of parallel lines Lj of the equator area 800 depends on the tile vertical angular amplitude θtile and can be given by:
L=round(90°/θtile)+1.
It should be noted that, since a large part of the navigation within a scene is done around the equator area (paved with rectangular tiles to support, for instance, a better 16:9 viewport matching), the vertical equator area amplitude E° can be maximized (e.g. in an illustrative but non-limiting example, larger than 90°). In particular, the vertical angular amplitude E° of the equator area 800 (e.g. having an annular shape as shown in
E°=(round(90°/θtile)+1)×θtile
The following list of altitudes θj for the parallel lines Lj, i.e. a list of possible altitude values θj for the centroids Cij of the rectangular tiles 600, can then be obtained:
θj=θtile×j
wherein j belongs to [−L/2, . . . , 0, . . . , L/2] with j=0 at the equator L0, or
θj=k×(θtile/2+θtile×j)
wherein k belongs to [1, −1] and j belongs to [0, . . . , (L/2−1)].
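A minimal Python sketch of steps 401 and 402 is given below; the split into odd and even numbers of lines is an assumption on how the two altitude formulas above are meant to be combined, and the 30° vertical amplitude is only an example value.

```python
import math

def equator_line_altitudes(theta_tile: float):
    """Number of parallel lines of the equator area and candidate centroid
    altitudes (in degrees), following the formulas given above."""
    L = round(90.0 / theta_tile) + 1
    if L % 2 == 1:
        # Odd number of lines: one line lies on the equator itself (j = 0).
        altitudes = [theta_tile * j for j in range(-(L // 2), L // 2 + 1)]
    else:
        # Even number of lines: lines placed symmetrically on either side of
        # the equator, half a tile amplitude away from it.
        altitudes = sorted(k * (theta_tile / 2 + theta_tile * j)
                           for j in range(L // 2) for k in (1, -1))
    return L, altitudes

# Example with an assumed 30-degree vertical tile amplitude.
print(equator_line_altitudes(30.0))  # (4, [-45.0, -15.0, 15.0, 45.0])
```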
The number of rectangular tiles per parallel line Lj depends on the circumference of the considered parallel line Lj and on the horizontal angular amplitude of the tile φtile.
Once the parallel lines Lj are defined, the network equipment 300 can, in a step 403, determine the horizontal angular position of the centroids Cij on the corresponding parallel lines Lj of the equator area 800. The number of rectangular tiles 600 arranged on a parallel line Lj decreases when moving toward the poles P, as it is proportional to the circumference of the parallel line Lj. By considering a circumference C0 at the equator L0, the circumference Cj at the bottom (i.e. the closest to the equator L0) of the rectangular tiles 600 for parallel lines Lj in the northern hemisphere of the spherical scene is given by the following formula:
Cj=C0×cos(θj−θtile/2)
The circumference Cj at the top (i.e. the closest to the equator L0) of the tiles for parallel lines in the southern hemisphere of the spherical scene is given by:
Cj=C0×cos(θj+θtile/2)
It is worth noting that, in the northern hemisphere, the circumference at the bottom of a tile is longer than the circumference at the center of the tile and that, in the southern hemisphere, the circumference at the top of a tile is longer than the circumference at the center of the tile.
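The sketch below illustrates the circumference formulas above in Python; since the exact expression of the number Tj of tiles per line is not reproduced in this text, the tile count shown here (covering 360° of the scaled line with a minimum overlap) is only a hypothetical reading of step 403.

```python
import math

def line_circumference_ratio(theta_j: float, theta_tile: float, north: bool) -> float:
    """Ratio Cj / C0 taken at the tile edge closest to the equator, as in
    the formulas above (theta_j and theta_tile in degrees)."""
    edge = theta_j - theta_tile / 2 if north else theta_j + theta_tile / 2
    return math.cos(math.radians(edge))

def tiles_per_line(theta_j: float, theta_tile: float, phi_tile: float,
                   north: bool = True) -> int:
    """Hypothetical tile count Tj: cover 360 degrees of the scaled parallel
    line with tiles of horizontal amplitude phi_tile, rounding up so that
    the tiles overlap slightly rather than leave a gap."""
    ratio = line_circumference_ratio(theta_j, theta_tile, north)
    return math.ceil(360.0 * ratio / phi_tile)

# Example: line centred at +45 degrees, 30-degree-high and 60-degree-wide tiles.
print(tiles_per_line(45.0, 30.0, 60.0, north=True))  # 6 with these assumed values
```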
The number Tj of rectangular tiles (presenting, for instance, a minimum overlapping) for a parallel line Lj is then defined as follows:
Thus, for a parallel line Lj, the rectangular tiles 600 have their centroids Cij arranged at the following longitudes φij:
φij represents a rotation angle around axis y with respect to the segment OC and θj a rotation angle around axis x with respect to OC. The segment OCij (i.e. the centroid Cij) shown in
OCij=Rotij(OC)
with Rotij the rotation matrix.
In an embodiment of the present principles, the rotation matrix Rotij can be a matrix product of two rotation matrices defined by the following equation:
Rotij=Rot(y,φij)×Rot(x,θj)
wherein:
In an embodiment of the present principles, to obtain the tile mesh associated with the rectangular tile of centroid Cij (the mesh center of a rectangular tile is arranged at the center of said tile), the rotation matrix Rotij can be applied, in a step 405, to a reference rectangular tile mesh associated with the reference rectangular tile 600R of centroid C. The reference rectangular tile 600R can serve as a model for all the rectangular tiles 600 of the equator area 800. The rotation matrix Rotij is then applied to all vertices of the reference mesh to obtain the vertices of the tile mesh associated with the rectangular tile centered on Cij.
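A minimal Python sketch of steps 404 and 405 is given below; the axis conventions (y taken as the polar axis), the position of the reference centroid C at (0, 0, 1) and the reduction of the reference mesh to a single vertex are assumptions made for illustration only.

```python
import numpy as np

def rot_x(angle_deg: float) -> np.ndarray:
    a = np.radians(angle_deg)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a),  np.cos(a)]])

def rot_y(angle_deg: float) -> np.ndarray:
    a = np.radians(angle_deg)
    return np.array([[ np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

def rot_ij(phi_ij: float, theta_j: float) -> np.ndarray:
    """Rotij = Rot(y, phi_ij) x Rot(x, theta_j), as in the equation above."""
    return rot_y(phi_ij) @ rot_x(theta_j)

def rotate_mesh(vertices: np.ndarray, rotation: np.ndarray) -> np.ndarray:
    """Step 405: apply the rotation to every vertex of a reference tile mesh
    (an (N, 3) array of points on the unit sphere)."""
    return vertices @ rotation.T

# Assumed, minimal reference mesh: only the reference centroid C is used here;
# a real mesh would contain all vertices of the reference rectangular tile 600R.
reference_mesh = np.array([[0.0, 0.0, 1.0]])
print(rotate_mesh(reference_mesh, rot_ij(phi_ij=30.0, theta_j=15.0)))
```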
Tiles Determination for the Two Pole Areas
In addition, for each pole area 900 depicted in
The reference square tile 700R shown in
As for the rectangular tiles 600, the centroid Cij of a square tile 700 can be first defined with the spherical coordinates (1, θj, φij) in the system R(O,x,y,z).
Besides, the horizontal angular amplitude P° delimiting a pole area 900 (the horizontal angular amplitude being equal to the vertical angular amplitude) can be defined by the difference between the angle corresponding to half of the sphere (i.e. the scene vertical angular amplitude) and the vertical angular amplitude E° of the equator area 800:
P°=180°−((round(90°/θtile)+1)×θtile)
To determine the centroids Cij of the square tiles 700, the network equipment 300 can, in a step 406, obtain an altitude θj for each parallel line Lj of the sphere 500 which comprises one or several centroids Cij of square tiles 700. The angle between two consecutive parallel lines Lj corresponds to Ωtile.
The number of parallel lines Lj of a pole area 900 depends on the tile vertical angular amplitude Ωtile and can be given by:
L=round(P°/Ωtile)+1
This leads to the following list of altitudes θj for the parallel lines Lj, i.e. a list of possible altitude values θj for the centroids Cij of the square tiles 700:
θj=Ωtile×j
wherein j belongs to [−L/2, . . . , 0, . . . , L/2] with j=0 at the equator, or
θj=k×(Ωtile/2+Ωtile×j)
wherein k belongs to [1, −1] and j belongs to [0, . . . , (L/2−1)].
At the pole areas 900, the number T of square tiles per line is equal to the number of lines, so that the number of tiles (presenting a minimum overlapping) per parallel line Lj is given by:
T=L=round(P°/Ωtile)+1
It should be noted that the two pole areas are identical and are each paved with a square tiled area.
This leads (step 407) to the following list of longitudes φij for the square tiles 700 in the system R(O,x,y,z):
φij=Ωtile×i
wherein i belongs to [−T/2, . . . , 0, . . . , T/2] with i=0 at the equator, or
φij=k×(Ωtile/2+Ωtile×i)
wherein k belongs to [1, −1] and i belongs to [0, . . . , (T/2−1)].
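The following Python sketch combines steps 406 and 407 for one pole area; as for the equator area, the odd/even split is an assumption on how the two formulas above are combined, and the numerical values are examples only.

```python
import math

def pole_grid(omega_tile: float, p_deg: float):
    """Candidate altitudes and longitudes (in degrees) for the centroids of
    the square tiles paving one pole area, expressed in the reference frame
    before the rotation toward the pole. The pole area is a T x T grid, so
    the same list of values serves for both altitudes and longitudes."""
    T = round(p_deg / omega_tile) + 1  # number of lines = tiles per line
    if T % 2 == 1:
        positions = [omega_tile * j for j in range(-(T // 2), T // 2 + 1)]
    else:
        positions = sorted(k * (omega_tile / 2 + omega_tile * j)
                           for j in range(T // 2) for k in (1, -1))
    return T, positions

# Example: with an assumed theta_tile of 30 degrees, E = 120 degrees and
# P = 180 - 120 = 60 degrees; an assumed square-tile amplitude of 30 degrees
# then gives a 3 x 3 grid of square tiles per pole area.
p_deg = 180.0 - (round(90.0 / 30.0) + 1) * 30.0
print(pole_grid(30.0, p_deg))  # (3, [-30.0, 0.0, 30.0])
```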
φij represents a rotation angle around axis y with respect to the segment OC and θj a rotation angle around axis x with respect to OC.
According to the principles, the square tiles 700 as defined (shown in
Thus, the segment OCij can be obtained by a rotation matrix applied to OC defined (step 408) as follows:
OCij=Rot′ij(OC)=Rot(x,ψi)×Rot(y,φij)×Rot(x,θj)
wherein:
In an embodiment of the present principles, to obtain the tile mesh associated with the square tile of centroid Cij (the mesh center of a square tile is arranged at the center of said tile), the rotation matrix Rot′ij can be applied, in a step 409, to a reference square tile mesh associated with the reference square tile 700R of centroid C. The reference square tile 700R can serve as a model for all the square tiles 700 of the pole areas 900, the rotation matrix Rot′ij being applied to all vertices of the reference mesh to obtain the vertices of the tile mesh associated with the square tile centered on Cij.
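By way of illustration, the sketch below applies the composite rotation Rot′ij of step 408 to a reference square tile mesh (step 409); the meaning given to ψi (an angle of ±90° bringing the equator-referenced grid onto one of the two pole areas) and the axis conventions are assumptions, and the rot_x/rot_y helpers are repeated from the equator-area sketch for self-containment.

```python
import numpy as np

def rot_x(a_deg: float) -> np.ndarray:
    a = np.radians(a_deg)
    return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])

def rot_y(a_deg: float) -> np.ndarray:
    a = np.radians(a_deg)
    return np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])

def rot_prime_ij(psi_i: float, phi_ij: float, theta_j: float) -> np.ndarray:
    """Rot'ij = Rot(x, psi_i) x Rot(y, phi_ij) x Rot(x, theta_j), as above.
    psi_i is assumed here to be an angle of +90 or -90 degrees, moving the
    equator-referenced square grid onto one of the two pole areas."""
    return rot_x(psi_i) @ rot_y(phi_ij) @ rot_x(theta_j)

# Assumed usage: rotate a placeholder reference square tile mesh (reduced here
# to its centroid C) toward one pole area.
reference_square_mesh = np.array([[0.0, 0.0, 1.0]])
pole_tile_mesh = reference_square_mesh @ rot_prime_ij(90.0, 0.0, 30.0).T
print(pole_tile_mesh)
```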
In a step 410, the network equipment 300 can determine the pixel content of the tiles, e.g. by using a known ray-tracing technique computing ray intersection between the rotated tile shape and a 360° video frame of the spherical video projected on the sphere 500.
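As a simplified alternative to the ray-tracing technique mentioned in step 410, the Python sketch below maps a unit direction on the sphere (e.g. a vertex of a rotated tile mesh) to pixel coordinates of an equirectangular 360° frame; the axis conventions are assumptions and this is not the ray-intersection computation of the present disclosure.

```python
import numpy as np

def direction_to_equirect_pixel(direction: np.ndarray, width: int, height: int):
    """Map a unit 3D direction to (u, v) pixel coordinates in an
    equirectangular frame, with y assumed to be the polar axis."""
    x, y, z = direction
    longitude = np.arctan2(x, z)                   # in [-pi, pi]
    latitude = np.arcsin(np.clip(y, -1.0, 1.0))    # in [-pi/2, pi/2]
    u = (longitude / (2.0 * np.pi) + 0.5) * (width - 1)
    v = (0.5 - latitude / np.pi) * (height - 1)
    return int(round(float(u))), int(round(float(v)))

# Example: pixel sampled for the direction (0, 0, 1) in a 3840x1920 frame.
print(direction_to_equirect_pixel(np.array([0.0, 0.0, 1.0]), 3840, 1920))
```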
As shown in
Besides, according to the present principles, the streaming controller 103 of the client terminal 100—receiving the spherical video from the network equipment 300—can be further configured to continually select the segments associated with the tiles covering, for instance, the current viewport associated with the terminal 100. In the example of adaptive streaming, the switch from a current tile to a next tile—both comprising the current viewport—may occur only at the end of a video segment and at the beginning of the next one.
To this end, the client terminal 100 can receive, from the network equipment 300, the values of the horizontal and vertical angular amplitudes (φtile, θtile, Ωtile) of the square and rectangular tiles, in order to be able to regenerate the corresponding tile reference meshes. The network equipment 300 can also send all the vertices of the reference square and rectangular tiles 700R, 600R to the terminal 100 and the list of rotation matrices Rotij to be applied to the tile reference meshes to obtain the tiles covering the sphere 500. In a variant, the network equipment can only share with the terminal 100 the spherical coordinates of the centroids Cij, when the terminal 100 is configured to dynamically re-compute the rotation matrices by using appropriate mathematical libraries.
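The disclosure only states that the client continually selects the segments of the tiles covering the current viewport; the Python sketch below is therefore a hypothetical, simplified selection criterion (angular distance between tile centroids and the viewport center), not the selection logic of the present principles.

```python
import numpy as np

def select_tiles(tile_centroids: np.ndarray, viewport_center: np.ndarray,
                 max_angle_deg: float) -> np.ndarray:
    """Return the indices of the tiles whose centroid lies within
    max_angle_deg of the viewport center (all vectors in 3D)."""
    c = tile_centroids / np.linalg.norm(tile_centroids, axis=1, keepdims=True)
    v = viewport_center / np.linalg.norm(viewport_center)
    angles = np.degrees(np.arccos(np.clip(c @ v, -1.0, 1.0)))
    return np.flatnonzero(angles <= max_angle_deg)

# Example with three assumed tile centroids and a viewport looking along (0, 0, 1).
centroids = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
print(select_tiles(centroids, np.array([0.0, 0.0, 1.0]), max_angle_deg=60.0))  # [0 2]
```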
In an illustrative but non-limiting example of the present principles, to take into account the inevitable latency due to the recovery of the video from the server, a larger scene than the viewport VP can be delivered to the video player of the client terminal. For instance, to ensure the availability of the viewport in HD format (1920×1080 pixels), sixteen 1K video tiles (i.e. 16×(960×540) = 3840×2160 pixels) are delivered to the client terminal 100, allowing overprovisioning, as shown in
It should be noted that the tiling pattern impacts the coding efficiency. That is, larger tiles provide a better coding efficiency but less flexibility for viewport selection and smaller tiles provide a better match to a given viewport but consequently reduce coding efficiency.
At the beginning of a navigation, the center of the scene of the spherical video is visualized through the viewport, and 16 tiles need to be delivered to the client terminal. At this moment, the user can freely change his point of view up/down or left/right within the portion of the scene covered by the 16 tiles with no video disruption.
When the user continuously moves his point of view to the right (respectively to the left), the 4 leftmost tiles (respectively the 4 rightmost tiles) have to be replaced by 4 new tiles further to the right (respectively to the left) to properly overprovision the future viewport. The same rules apply vertically.
In a further aspect of the present principles, to bring a good user experience, the Field Of View (FOV) of the viewport needs to be wide enough not to give the feeling of seeing only a narrow part of a scene and to provide an acceptable level of immersion to the end user. By contrast, the FOV should not be too large, in order to preserve an acceptable resolution (the larger the FOV, the lower the number of pixels per degree). In an illustrative but non-limiting example, the horizontal FOV of the viewport in HD format can be equal to 60° with a vertical FOV of 36° (to respect, for instance, a 16:9 ratio of the spherical video), so that the horizontal overprovisioning FOV (associated with a pattern of sixteen 1K tiles) is about 120° in UHD format, with a vertical FOV corresponding to 72°.
Thanks to the above described method, by delivering only a portion of the scene, the ratio of video quality over data bitrate can be controlled and a high-quality video can be obtained on the client side, even with network bandwidth constraints. In addition, by tiling the spherical scene of an immersive video with two different types of tiles distributed among an equator area and two pole areas, the freedom given to the user for moving in any direction is improved. Tiles having a rectangular shape (i.e. with the same aspect ratio as the viewport) are well adapted to the equator area, where the navigation is similar to a horizontal movement of the viewport on a plane (more precisely on a cylinder). By contrast, tiles having a square shape are more suited to the pole areas, where a horizontal panning of the viewport becomes a rotation around the pole (no priority given to any axis).
References disclosed in the description, the claims and the drawings may be provided independently or in any appropriate combination. Features may, where appropriate, be implemented in hardware, software, or a combination of the two.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one implementation of the method and device described. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments.
Reference numerals appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.
Although only certain embodiments of the disclosure have been described herein, it will be understood by any person skilled in the art that other modifications, variations, and possibilities of the disclosure are possible. Such modifications, variations and possibilities are therefore to be considered as falling within the spirit and scope of the disclosure and hence forming part of the disclosure as herein described and/or exemplified.
The flowchart and/or block diagrams in the Figures illustrate the configuration, operation and functionality of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, or blocks may be executed in an alternative order, depending upon the functionality involved. In particular, in
Foreign application priority data: 18305077.2, filed January 2018, EP (regional).
International filing: PCT/EP2019/051502, filed Jan. 22, 2019 (WO).