The present disclosure relates to methods and systems for decoding and displaying a media content item. Particularly, but not exclusively, the present disclosure relates to providing a single inventory to allow playback in both landscape and portrait modes on a user device.
It is now common for a user to reorientate a user device when viewing media content. For example, when using a mobile device, such as a smartphone, to watch a movie, viewing the movie in landscape mode presents the movie in a format closer to its original format. On the other hand, portrait mode may be better suited to viewing certain parts of the movie in greater detail. Moreover, portrait mode can provide a flexible and feature-rich interface for content discovery, e.g., when navigating a media guidance application or a social media application. In some cases, video content has been created and optimized differently for different use cases. Many social media applications have focused on sharing video targeted for portrait-mode playback, while a great amount of video is still created and desired for landscape viewing.
Typically, video encoding takes place after the editing for portrait or landscape viewing modes. Thus, a separate inventory is needed for each of the portrait and landscape viewing modes. Furthermore, the number of inventories can grow as different aspect ratios are required, e.g., for differently configured mobile device screens.
Systems and methods are provided herein for encoding and displaying a media content item. Such systems and methods may reduce the operational demands on a service provider, e.g., by providing a simpler inventory when streaming content to a user device, and may provide a better viewing experience to a user, since the operational demand placed on a user device may be reduced through more straightforward strategies for decoding the media content item based on one or more operational parameters of the user device, such as its orientation.
According to some examples, methods and systems are provided for generating an encoded media content item having a partitioning structure comprising: multiple partitioned areas configured to, when decoded, generate display of the media content item in a first format, and a partition boundary defining one of the partitioned areas configured to, when decoded, generate display of the media content item in a second format, e.g., without decoding any other of the multiple partitioned areas.
In some examples, the partition boundary defines a region of interest of a frame of the media content item. In some examples, the position of the partition boundary is time-varying, e.g., relative to the other of the multiple partitioned areas. In some examples, the aspect ratio of the partition boundary is time-varying, e.g., relative to the other of the multiple partitioned areas. For example, the position and/or aspect ratio of the partition boundary may change from frame to frame based on the composition of successive frames of the media content item.
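Purely as an illustrative sketch (the class, function, and keyframe names below are assumptions for exposition, not part of any codec syntax), a time-varying partition boundary may be modeled as a per-frame lookup against keyframed boundary positions:

```python
from dataclasses import dataclass

@dataclass
class PartitionBoundary:
    """Axis-aligned boundary of the separately decodable area of a frame."""
    x: int       # left edge of the boundary, in pixels
    y: int       # top edge of the boundary, in pixels
    width: int   # boundary width, e.g., a portrait crop width
    height: int  # boundary height

def boundary_at(frame_index, keyframes):
    """Return the boundary in effect at `frame_index` by holding the most
    recent keyframed boundary, so the ROI can move from frame to frame as
    the composition of successive frames changes."""
    active = None
    for kf_index, boundary in sorted(keyframes.items()):
        if kf_index <= frame_index:
            active = boundary
    return active

# A boundary that pans right partway through a scene.
keyframes = {
    0:  PartitionBoundary(x=480, y=0, width=608, height=1080),
    30: PartitionBoundary(x=640, y=0, width=608, height=1080),
}
```

Here the boundary keyframed at frame 30 takes over as playback advances, modeling a boundary that pans with the scene.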
In some examples, the first format is a format better suited for horizontal viewing, e.g., a landscape format, and the second format is a format better suited for vertical viewing, e.g., a portrait format. For example, a landscape format is a format wherein a width of a frame is greater than a height of the frame, and a portrait format is a format wherein a width of a frame is less than a height of the frame.
In some examples, the partitioning structure is non-regular. In some examples, the partitioning structure comprises a tile structure. In some examples, the partitioning structure comprises a slice structure. In some examples, the partition boundary is cropped based on a parameter of a user device, e.g., after the media content item is decoded at a user device.
In some examples, the method further comprises: rotating the media content item; generating the partitioning structure; encoding the media content item; decoding the area of the media content item defined by the partition boundary; and rotating the decoded area of the media content item defined by the partition boundary.
In some examples, the method further comprises: determining whether a user device requires display in the first format or the second format; and in response to determining that the user device requires display in the first format, decoding and displaying the multiple partitioned areas; or in response to determining that the user device requires display in the second format, decoding and displaying the partitioned area defined by the partition boundary.
In some examples, determining whether the user device requires display in the first format or the second format comprises: determining an orientation of the user device; and determining a period for which the orientation is above a threshold orientation.
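As an illustrative sketch of this determination (the threshold values, class name, and format labels are assumptions for exposition), the orientation check and the hold-period check might be combined as follows:

```python
ORIENTATION_THRESHOLD_DEG = 45.0  # tilt beyond which the device counts as rotated
HOLD_PERIOD_S = 2.0               # tilt must persist this long before switching

class FormatSelector:
    """Tracks device tilt over time and reports which display format to use."""

    def __init__(self):
        self._rotated_since = None  # timestamp when tilt first exceeded threshold

    def update(self, tilt_deg, now_s):
        """Return "first" (e.g., landscape) or "second" (e.g., portrait)."""
        if tilt_deg > ORIENTATION_THRESHOLD_DEG:
            if self._rotated_since is None:
                self._rotated_since = now_s
            if now_s - self._rotated_since >= HOLD_PERIOD_S:
                return "second"
        else:
            self._rotated_since = None  # a brief jolt resets the timer
        return "first"
```

Requiring the tilt to persist for the hold period prevents the displayed format from switching inadvertently, e.g., when the user is momentarily jolted.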
The above and other objects and advantages of the disclosure will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
One way to avoid creating an additional encode is to add cropping information into a bitstream. However, this requires the player to first decode the entire video. Thus, it is highly desirable for a bitstream to be encoded so that it is possible to decode only the required bits to enable playback of the media content item in portrait mode or landscape mode, e.g., as necessitated by the orientation of the user device 102. This will eliminate the need for either creating a separate encode for each of the portrait mode and the landscape mode, or decoding the entire video encoded for landscape and then cropping to portrait mode.
The systems and methods disclosed herein enable a single inventory to allow playback in a first format and a second format. For example, the first format may be a landscape (e.g., horizontal) format and the second format may be a portrait (e.g., vertical) format. In some examples, each of the first and second formats may support display of a media content item in a different aspect ratio, such as 16:9 and 9:16, or 1:1 and 2:3, e.g., to suit the configuration of the user device 102. The benefit is twofold. First, the disclosed systems and methods can remove the need for a dual inventory of encoded streams, which reduces storage requirements. For example, a partitioning structure, built into the encoding of the media content item, can signal, e.g., automatically, the intended presentation of the streamed media content item. Second, in portrait viewing, the video decoding is applied, e.g., only, to the portions of the media content item required for portrait presentation, which significantly reduces the computational demands.
In the example shown in
In some examples, system 100 may comprise an application that provides guidance through an interface, e.g., a graphical user interface, that allows users to efficiently navigate media content selections, navigate an interactive media content item, and easily identify media content that they may desire, such as content provided on a database of one or more live streams. Such guidance is referred to herein as an interactive content guidance application or, sometimes, a content guidance application, a media guidance application, or a guidance application. In some examples, the application may be configured to provide a recommendation for a content item, e.g., based on a user profile and/or an endorsement profile of the content item. For example, the application may provide a user with a recommendation for a content item based on one or more endorsements present, e.g., visibly and/or audibly present, in the content item. In some examples, the application provides users with access to a group watching session and/or group communication functionality. For example, the application may provide a user with an option to join a group watching session and participate in group communication with one or more other users participating in the group watching session.
Interactive media guidance applications may take various forms, depending on the content for which they provide guidance. One typical type of media guidance application is an interactive television program guide. Interactive television program guides (sometimes referred to as electronic program guides) are well-known guidance applications that, among other things, allow users to navigate among and locate many types of content or media assets. Interactive media guidance applications may generate graphical user interface screens that enable a user to navigate among, locate and select content. As referred to herein, the terms “media content items”, “media asset”, “content items” and “content” should each be understood to mean an electronically consumable user asset, such as television programming, as well as pay-per-view programs, on-demand programs (as in VOD systems), Internet content (e.g., streaming content, downloadable content, Webcasts, etc.), video clips, audio, content information, pictures, rotating images, documents, playlists, websites, articles, books, electronic books, blogs, chat sessions, social media, applications, games, and/or any other media or multimedia and/or combination of the same. Guidance applications also allow users to navigate among and locate content. As referred to herein, the term “multimedia” should be understood to mean content that utilizes at least two different content forms described above, for example, text, audio, images, video, or interactivity content forms. Content may be recorded, played, displayed or accessed by user equipment devices, but can also be part of a live performance.
The media guidance application and/or any instructions for performing any of the examples discussed herein may be encoded on computer-readable media. Computer-readable media includes any media capable of storing data. The computer-readable media may be transitory, including, but not limited to, propagating electrical or electromagnetic signals, or may be non-transitory, including, but not limited to, volatile and non-volatile computer memory or storage devices such as a hard disk, floppy disk, USB drive, DVD, CD, media card, register memory, processor cache, random access memory (RAM), etc.
With the ever-improving capabilities of the Internet, mobile computing, and high-speed wireless networks, users are accessing media on user equipment devices on which they traditionally did not. As referred to herein, the phrases “user equipment device,” “user equipment,” “user device,” “computing device,” “electronic device,” “electronic equipment,” “media equipment device,” or “media device” should be understood to mean any device for accessing the content described above, such as a television, a Smart TV, a set-top box, an integrated receiver decoder (IRD) for handling satellite television, a digital storage device, a digital media receiver (DMR), a digital media adapter (DMA), a streaming media device, a DVD player, a DVD recorder, a connected DVD, a local media server, a BLU-RAY player, a BLU-RAY recorder, a personal computer (PC), a laptop computer, a tablet computer, a WebTV box, a personal computer television (PC/TV), a PC media server, a PC media center, a hand-held computer, a stationary telephone, a personal digital assistant (PDA), a mobile telephone, a portable video player, a portable music player, a portable gaming machine, a smartphone, or any other television equipment, computing equipment, or wireless device, and/or combination of the same. In some examples, the user equipment device may have a front-facing screen and a rear-facing screen, multiple front screens, or multiple angled screens. In some examples, the user equipment device may have a front-facing camera and/or a rear-facing camera. On these user equipment devices, users may be able to navigate among and locate the same content available through a television. Consequently, media guidance may be available on these devices, as well. 
The guidance provided may be for content available only through a television, for content available only through one or more of other types of user equipment devices, or for content available through both a television and one or more of the other types of user equipment devices. The media guidance applications may be provided as online applications (i.e., provided on a website), or as stand-alone applications or clients on user equipment devices. Various devices and platforms that may implement media guidance applications are described in more detail below.
One of the functions of the media guidance application is to provide media guidance data to users. As referred to herein, the phrase “media guidance data” or “guidance data” should be understood to mean any data related to content or data used in operating the guidance application. For example, the guidance data may include program information, subtitle data, guidance application settings, user preferences, user profile information, media listings, media-related information (e.g., broadcast times, broadcast channels, titles, descriptions, ratings information (e.g., parental control ratings, critics' ratings, etc.), genre or category information, actor information, logo data for broadcasters' or providers' logos, etc.), media format (e.g., standard definition, high definition, 3D, etc.), on-demand information, blogs, websites, and any other type of guidance data that is helpful for a user to navigate among and locate desired content selections.
Server 204 includes control circuitry 211 and input/output (hereinafter “I/O”) path 212, and control circuitry 211 includes storage 214 and processing circuitry 216. Computing device 202, which may be a personal computer, a laptop computer, a tablet computer, a smartphone, a smart television, a smart speaker, or any other type of computing device, includes control circuitry 218, I/O path 220, speaker 222, display 224, and user input interface 226. Control circuitry 218 includes storage 228 and processing circuitry 230. Control circuitry 211 and/or 218 may be based on any suitable processing circuitry such as processing circuitry 216 and/or 230. As referred to herein, processing circuitry should be understood to mean circuitry based on one or more microprocessors, microcontrollers, digital signal processors, programmable logic devices, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), etc., and may include a multi-core processor (e.g., dual-core, quad-core, hexa-core, or any suitable number of cores). In some examples, processing circuitry may be distributed across multiple separate processors, for example, multiple of the same type of processors (e.g., two Intel Core i9 processors) or multiple different processors (e.g., an Intel Core i7 processor and an Intel Core i9 processor).
Each of storage 214, 228, and/or storages of other components of system 200 (e.g., storages of content database 206, and/or the like) may be an electronic storage device. As referred to herein, the phrase “electronic storage device” or “storage device” should be understood to mean any device for storing electronic data, computer software, or firmware, such as random-access memory, read-only memory, hard drives, optical drives, digital video disc (DVD) recorders, compact disc (CD) recorders, BLU-RAY disc (BD) recorders, BLU-RAY 3D disc recorders, digital video recorders (DVRs, sometimes called personal video recorders, or PVRs), solid state devices, quantum storage devices, gaming consoles, gaming media, or any other suitable fixed or removable storage devices, and/or any combination of the same. Each of storage 214, 228, and/or storages of other components of system 200 may be used to store various types of content, metadata, and/or other types of data. Non-volatile memory may also be used (e.g., to launch a boot-up routine and other instructions). Cloud-based storage may be used to supplement storages 214, 228 or instead of storages 214, 228. In some examples, control circuitry 211 and/or 218 executes instructions for an application stored in memory (e.g., storage 214 and/or 228). Specifically, control circuitry 211 and/or 218 may be instructed by the application to perform the functions discussed herein. In some implementations, any action performed by control circuitry 211 and/or 218 may be based on instructions received from the application. For example, the application may be implemented as software or a set of executable instructions that may be stored in storage 214 and/or 228 and executed by control circuitry 211 and/or 218. In some examples, the application may be a client/server application where only a client application resides on computing device 202, and a server application resides on server 204.
The application may be implemented using any suitable architecture. For example, it may be a stand-alone application wholly implemented on computing device 202. In such an approach, instructions for the application are stored locally (e.g., in storage 228), and data for use by the application is downloaded on a periodic basis (e.g., from an out-of-band feed, from an Internet resource, or using another suitable approach). Control circuitry 218 may retrieve instructions for the application from storage 228 and process the instructions to perform the functionality described herein. Based on the processed instructions, control circuitry 218 may determine what action to perform when input is received from user input interface 226.
In client/server-based examples, control circuitry 218 may include communication circuitry suitable for communicating with an application server (e.g., server 204) or other networks or servers. The instructions for carrying out the functionality described herein may be stored on the application server. Communication circuitry may include a cable modem, an Ethernet card, or a wireless modem for communication with other equipment, or any other suitable communication circuitry. Such communication may involve the Internet or any other suitable communication networks or paths (e.g., communication network 208). In another example of a client/server-based application, control circuitry 218 runs a web browser that interprets web pages provided by a remote server (e.g., server 204). For example, the remote server may store the instructions for the application in a storage device. The remote server may process the stored instructions using circuitry (e.g., control circuitry 211) and/or generate displays. Computing device 202 may receive the displays generated by the remote server and may display the content of the displays locally via display 224. This way, the processing of the instructions is performed remotely (e.g., by server 204) while the resulting displays, such as the display windows described elsewhere herein, are provided locally on computing device 202. Computing device 202 may receive inputs from the user via input interface 226 and transmit those inputs to the remote server for processing and generating the corresponding displays.
A user device may send instructions, e.g., to view an interactive media content item and/or select one or more programming options of the interactive media content item, to control circuitry 211 and/or 218 using user input interface 226. User input interface 226 may be any suitable user interface, such as a remote control, trackball, keypad, keyboard, touchscreen, touchpad, stylus input, joystick, voice recognition interface, gaming controller, or other user input interfaces. User input interface 226 may be integrated with or combined with display 224, which may be a monitor, a television, a liquid crystal display (LCD), an electronic ink display, or any other equipment suitable for displaying visual images.
Server 204 and computing device 202 may transmit and receive content and data via I/O path 212 and 220, respectively. For instance, I/O path 212, and/or I/O path 220 may include a communication port(s) configured to transmit and/or receive (for instance to and/or from content database 206), via communication network 208, content item identifiers, content metadata, natural language queries, and/or other data. Control circuitry 211 and/or 218 may be used to send and receive commands, requests, and other suitable data using I/O paths 212 and/or 220.
In particular,
At 302, control circuitry, e.g., control circuitry 210, determines a type of partitioning structure supported by a codec. For example, control circuitry may determine whether a codec supports a tile-based partitioning structure, a slice-based partitioning structure, or any other appropriate type of partitioning structure. In some examples, control circuitry may access a database containing information relating to one or more types of codecs to determine which type of partitioning they support. For example, the database may contain information relating to codec usage (e.g., codec popularity) and associated support for various types of partitioning structures, e.g., that H.264/AVC does not support tile-based partitioning but does support slice-based partitioning, and that H.265/HEVC does support tile-based partitioning. In this manner, control circuitry may determine which type of partition to implement, and may determine to implement one type of partitioning over another, such that the media content item is encoded, in the first instance, for use by one codec over another codec. When a codec supports tile-based partitioning, process 300 moves to 304. When a codec does not support tile-based partitioning, process 300 moves to 318.
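A minimal sketch of such a codec-capability lookup, assuming a simple in-memory table in place of the database described above (the entries mirror the H.264/AVC and H.265/HEVC examples in the text and are not an exhaustive survey):

```python
# Hypothetical capability table standing in for the codec database; the
# entries mirror the examples in the text, not an exhaustive survey.
CODEC_PARTITIONING = {
    "H.264/AVC":  {"tiles": False, "slices": True},
    "H.265/HEVC": {"tiles": True,  "slices": True},
}

def choose_partitioning(codec):
    """Prefer tile-based partitioning when the codec supports it (304);
    otherwise fall back to slice-based partitioning with rotation (318)."""
    return "tile" if CODEC_PARTITIONING[codec]["tiles"] else "slice"
```

Under this sketch, an H.265/HEVC target follows the tile-based branch of process 300, while an H.264/AVC target follows the slice-based branch.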
At 304, control circuitry, e.g., control circuitry 210, implements a tile-based partitioning structure for partitioning the media content item. For example, control circuitry may determine that each frame of the media content item is to be divided into multiple tiles.
At 306, control circuitry, e.g., control circuitry 210, determines an ROI of frame 500. For example, control circuitry may access a user profile, at 308, to determine one or more topics of interest of a user. In this case, the user profile may be set to accept a default setting of a service provider, who may denote which region of frame 500 comprises the ROI. For example, the ROI may be the main focal point of frame 500, which in this case is a mountain peak, but in other cases, the ROI may be a face of an individual in a frame or an item, e.g., a product, in a frame of an advertisement. In some examples, the user profile may comprise a setting to not accept, e.g., as default, a setting of a service provider, and the user may, instead, choose one or more topics that they wish to be shown in the ROI. In this manner, the systems and methods herein provide for a personalized encoding process, e.g., based on a user's profile or one or more trending topics.
At 310, control circuitry, e.g., control circuitry 210, determines whether the ROI moves relative to frame 500 as the media content item is played. For example, the media content item may comprise a scene that pans across the mountain range shown in
Referring to
At 312, control circuitry, e.g., control circuitry 210, assigns a fixed partition boundary to the ROI, e.g., in response to determining that the ROI does not move or change shape as playback advances.
At 314, control circuitry, e.g., control circuitry 210, assigns a moving partition boundary to the ROI, e.g., in response to determining that the ROI does move or change shape as playback advances, in a similar manner to that shown in
At 316, control circuitry, e.g., control circuitry 210, encodes each frame of the media content item with a certain partitioning structure. Referring to
Referring back to 302, when control circuitry determines that a codec does not support a tile-based partitioning structure, process 300 moves to 318. It is to be understood that, where technically possible, the features of the disclosed systems and methods relating to tile-based methodology are to be used, mutatis mutandis, in relation to slice-based methodology. However, a complication arises when tiles are not supported. For example, typically, a slice-based partitioning structure consists of a group of contiguous content units in raster scan order. However, applying the above-described partition boundary, which defines the ROI, would result in a partitioning that breaks the statistical correlation of spatially neighboring sliced content units. To account for this and enable codecs supporting slice-based partitioning to be used to equal advantage, at 318, control circuitry, e.g., control circuitry 210, rotates the media content item, e.g., rotates a frame of the media content item prior to implementing a partitioning structure and encoding the media content item.
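To illustrate why rotation helps, the sketch below (pure Python, with a toy frame standing in for pixel data) rotates a frame 90 degrees so that a vertical band of columns becomes a horizontal band of rows, which a raster-scan slice can then cover contiguously:

```python
def rotate_90_cw(frame):
    """Rotate a frame (a list of pixel rows) 90 degrees clockwise, so that
    a vertical band of columns becomes a horizontal band of rows that a
    raster-scan slice can cover contiguously."""
    return [list(row) for row in zip(*frame[::-1])]

# A toy 2x4 "frame"; after rotation, each original column is a row.
frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
]
rotated = rotate_90_cw(frame)  # a 4x2 frame
```

After decoding, a complementary rotation (see 344 below in process 300) restores the original orientation for display.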
Referring to
At 320, control circuitry, e.g., control circuitry 210, implements a slice-based partitioning structure for partitioning the media content item. For example, control circuitry may determine that each frame of the media content item is to be divided into multiple slices.
Following the encoding of the media content item, in whichever codec/partitioning structure combination, process 300 may move back to 302, to repeat the process for other codec/partitioning structure combinations, e.g., to ensure that an encoded media content item can be delivered for use with any appropriate user device 102. For example, process 300, as shown in
At 322, control circuitry, e.g., control circuitry 210, receives a request from user device 102 for an encoded media content item. For example, the request may contain information relating to which codec(s) are supported by user device 102.
At 324, control circuitry, e.g., control circuitry 210, selects an encoded media content item corresponding to the request from user device 102. For example, user device 102 may request a media content item in a certain quality for decoding by a certain codec, e.g., H.264/AVC.
At 326, control circuitry, e.g., control circuitry 210, transmits the requested encoded media content item to user device 102. Importantly, the encoded media content item is transmitted in a single inventory with supplemental enhancement information (SEI) pertaining to how to decode the media content item based on its partitioning structure. For example, the SEI may contain instructions on how to decode the tiles/slices, e.g., which tile(s)/slice(s) to decode based on whether user device 102 requires display in a first format or a second format. In some examples, the first and second formats may be related to an operational parameter of the user device 102, such as its orientation. For example, the first format may be a landscape display format and the second format may be a portrait display format.
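As a purely illustrative stand-in for such signaling (the field names and tile identifiers are assumptions for exposition and do not reflect actual SEI message syntax from any standard), the decode instructions might look like:

```python
# Illustrative decode instructions accompanying the single inventory.
sei = {
    "partitioning": "tile",
    "formats": {
        # First format (e.g., landscape): decode every tile of the frame.
        "first":  {"decode": ["T1", "T2", "T3"], "rotate_after_decode": False},
        # Second format (e.g., portrait): decode only the boundary tile.
        "second": {"decode": ["T2"], "rotate_after_decode": False},
    },
}

def units_to_decode(sei, required_format):
    """Return the tile/slice identifiers to decode for the required format."""
    return sei["formats"][required_format]["decode"]
```

The single inventory thus carries one encode, and the player consults this metadata to decode either all partitioned areas or only the area defined by the partition boundary.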
At 328, control circuitry, e.g., control circuitry 218, determines whether user device 102 requires display in the first format or the second format. In the example shown in
At 330, control circuitry, e.g., control circuitry 218, determines whether the orientation of user device 102 is greater than an orientation threshold. Referring to
At 334, control circuitry, e.g., control circuitry 218, determines to display the media content item in the first format. In the example shown in
At 336, control circuitry, e.g., control circuitry 218, accesses the SEI embedded in the bitstream. The SEI contains instructions on which tiles/slices to decode for displaying the media content item in the first format, e.g., in a landscape format. When a tile-based partitioning structure has been implemented, decoding of tiles T1, T2 and T3 would result in entire frame 500 being decoded for display. When a slice-based partitioning structure has been implemented, decoding of slices S1, S2 and S3 would result in entire frame 600 being decoded for display.
Returning back to
At 332, control circuitry, e.g., control circuitry 218, determines whether the orientation of user device 102 has been held at an orientation for greater than a threshold period. For example, the threshold period may be set to an amount of time, e.g., 2 seconds, to ensure that the format of the display of the media content item does not switch inadvertently, such as when a user is jolted or slips. Thus, when user device 802 is held at an angle above the threshold angle, but not for longer than the threshold period, process 300 moves to 334 as described above. However, when user device 802 is held at an angle above the threshold angle for longer than the threshold period, process 300 moves to 338.
At 338, control circuitry, e.g., control circuitry 218, determines to display the media content item in the second format. In the example shown in
At 340, control circuitry, e.g., control circuitry 218, accesses the SEI embedded in the bitstream. The SEI contains instructions on which tiles/slices to decode for displaying the media content item in the second format, e.g., in a portrait format. In particular, under these conditions, control circuitry is instructed to decode, e.g., only decode, the tile/slice defined by the partition boundary. When a tile-based partitioning structure has been implemented, decoding of tile T2 would result in the ROI of frame 500 being decoded for display, e.g., as shown in
At 342, control circuitry, e.g., control circuitry 218, determines whether the decoded media content item needs to be rotated, e.g., to account for encoding for slice-based partitioning structures. In some examples, control circuitry may access the SEI to look for a rotation instruction. In other examples, control circuitry may perform one or more appropriate image processing techniques to determine whether the decoded frame is likely to be in a rotated orientation. When the decoded media content item needs to be rotated, process 300 moves to 344. When the decoded media content item does not need to be rotated, process 300 moves to 346.
At 344, control circuitry, e.g., control circuitry 218, performs a rotation operation on the decoded media content item. For example, control circuitry may rotate the entire frame of the media content item, or just the ROI defined by the partition boundary, e.g., depending on the outcome of 334 and 338. In the example shown in
At 346, control circuitry, e.g., control circuitry 218, determines whether the decoded media content item requires cropping, e.g., such that the decoded media content item better fits the screen of the user device 102 in either landscape or portrait display mode. For example, cropping may be required when the exact version of the encoded media content item was unavailable at 322. For example, control circuitry of the user device 102 may have requested an encoded version of the media content item having a first format with an aspect ratio of 16:9 and a second format with an aspect ratio of 9:16. However, server 104 may have supplied a slightly different version of the media content item having a first format with the aspect ratio of 16:9, but a second format with an aspect ratio of 1:1. As such, when cropping is required, process 300 moves to 348, and when cropping is not required, process 300 moves to 350.
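A minimal sketch of such a crop computation, assuming a centered crop and integer pixel dimensions (the function name and arguments are illustrative assumptions):

```python
def center_crop(width, height, target_w, target_h):
    """Return (x, y, w, h) of a centered crop of a decoded frame of size
    width x height that matches the aspect ratio target_w:target_h."""
    if width * target_h > height * target_w:
        # Decoded frame is too wide for the target ratio: trim the sides.
        w, h = height * target_w // target_h, height
    else:
        # Decoded frame is too tall for the target ratio: trim top and bottom.
        w, h = width, width * target_h // target_w
    return ((width - w) // 2, (height - h) // 2, w, h)

# A 1:1 decode (1080x1080) cropped to the requested 9:16 portrait format.
crop = center_crop(1080, 1080, 9, 16)
```

In this example, the 1:1 decode supplied by the server is narrowed symmetrically to the 9:16 ratio requested by the user device, while a decode that already matches the requested ratio is returned uncropped.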
At 348, control circuitry, e.g., control circuitry 218, crops the media content item to the required format. For example,
At 350, control circuitry, e.g., control circuitry 218, causes the media content item to be displayed in the first or second format, e.g., based on the orientation of the user device.
The actions or descriptions of
The processes described above are intended to be illustrative and not limiting. One skilled in the art would appreciate that the steps of the processes discussed herein may be omitted, modified, combined, and/or rearranged, and any additional steps may be performed without departing from the scope of the invention. More generally, the above disclosure is meant to be illustrative and not limiting. Only the claims that follow are meant to set bounds as to what the present invention includes. Furthermore, it should be noted that the features and limitations described in any one example may be applied to any other example herein, and flowcharts or examples relating to one example may be combined with any other example in a suitable manner, done in different orders, or done in parallel. In addition, the systems and methods described herein may be performed in real time. It should also be noted that the systems and/or methods described above may be applied to, or used in accordance with, other systems and/or methods.