Systems and methods for providing audio content during trick-play playback

Description

FIELD OF THE INVENTION

The present invention generally relates to adaptive streaming and more specifically to streaming systems that may provide audio content during a trick-play mode of playback.

BACKGROUND

The term streaming media describes the playback of media on a playback device, where the media is stored on a server and continuously sent to the playback device over a network during playback. Typically, the playback device stores a sufficient quantity of media in a buffer at any given time during playback to prevent disruption of playback due to the playback device completing playback of all the buffered media prior to receipt of the next portion of media. Adaptive bit rate streaming or adaptive streaming involves detecting the present streaming conditions (e.g. the user's network bandwidth and CPU capacity) in real time and adjusting the quality of the streamed media accordingly. Typically, the source media is encoded at multiple bit rates and the playback device or client switches between streaming the different encodings depending on available resources.

Adaptive streaming solutions typically utilize either Hypertext Transfer Protocol (HTTP), published by the Internet Engineering Task Force and the World Wide Web Consortium as RFC 2616, or Real Time Streaming Protocol (RTSP), published by the Internet Engineering Task Force as RFC 2326, to stream media between a server and a playback device. HTTP is a stateless protocol that enables a playback device to request a byte range within a file. HTTP is described as stateless, because the server is not required to record information concerning the state of the playback device requesting information or the byte ranges requested by the playback device in order to respond to requests received from the playback device. RTSP is a network control protocol used to control streaming media servers. Playback devices issue control commands, such as “play” and “pause”, to the server streaming the media to control the playback of media files. When RTSP is utilized, the media server records the state of each client device and determines the media to stream based upon the instructions received from the client devices and the client's state.

In adaptive streaming systems, the source media is typically stored on a media server as a top level index file pointing to a number of alternate streams that contain the actual video and audio data. Each stream is typically stored in one or more container files. Different adaptive streaming solutions typically utilize different index and media containers. The Synchronized Multimedia Integration Language (SMIL) developed by the World Wide Web Consortium is utilized to create indexes in several adaptive streaming solutions including IIS Smooth Streaming developed by Microsoft Corporation of Redmond, Wash., and Flash Dynamic Streaming developed by Adobe Systems Incorporated of San Jose, Calif. HTTP Adaptive Bitrate Streaming developed by Apple Computer Incorporated of Cupertino, Calif. implements index files using an extended M3U playlist file (.M3U8), which is a text file containing a list of URIs that typically identify a media container file. The most commonly used media container formats are the MP4 container format specified in MPEG-4 Part 14 (i.e. ISO/IEC 14496-14) and the MPEG transport stream (TS) container specified in MPEG-2 Part 1 (i.e. ISO/IEC Standard 13818-1). The MP4 container format is utilized in IIS Smooth Streaming and Flash Dynamic Streaming. The TS container is used in HTTP Adaptive Bitrate Streaming.

The Matroska container is a media container developed as an open standard project by the Matroska non-profit organization of Aussonne, France. The Matroska container is based upon Extensible Binary Meta Language (EBML), which is a binary derivative of the Extensible Markup Language (XML). Decoding of the Matroska container is supported by many consumer electronics (CE) devices. The DivX Plus file format developed by DivX, LLC of San Diego, Calif. utilizes an extension of the Matroska container format (i.e. is based upon the Matroska container format, but includes elements that are not specified within the Matroska format).

To provide a consistent means for the delivery of media content over the Internet, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) have put forth the Dynamic Adaptive Streaming over HTTP (DASH) standard. The DASH standard specifies formats for the media content and the description of the content for delivery of MPEG content using HTTP. In accordance with DASH, each component of media content for a presentation is stored in one or more streams. Each of the streams is divided into segments. A Media Presentation Description (MPD) is a data structure that includes information about the segments in each of the streams and other information needed to present the media content during playback. A playback device uses the MPD to obtain the components of the media content using adaptive bit rate streaming for playback.

As the latency with which content can be adaptively streamed has improved, streaming of live events, such as sporting events and concerts, has become popular. In this type of content, the audio portion of the content may give indications of portions of the presentation that may be of interest to a user. As such, users may want to use audio cues when searching the media content to find parts of the content that are of interest. However, most conventional playback systems do not provide playback of audio content during the use of trick-play features such as rewind, fast forward and seek to find content of interest. Systems and methods for providing playback of audio content during playback in a trick-play mode in accordance with some embodiments of the invention are disclosed.

SUMMARY OF THE INVENTION

In accordance with some embodiments of the invention, a playback device is configured to perform a process for providing trick-play playback with audio content in the following manner. The playback device stores segments of an audio content portion and frames of a video content portion of media content in a buffer in the playback device. Synchronization information is stored in a memory in the playback device. The synchronization information associates a presentation time of each of one or more of the segments of the audio content portion with the presentation time of one or more of the frames of the video content. The playback device receives a command for playback of the media content in a trick-play mode. The next frame to present is determined by the playback device based upon the trick-play mode. Each segment of audio content associated with the next frame is determined from the synchronization information stored in memory. The playback device presents each of the segments of audio content associated with the next frame during playback in the trick-play mode.

In accordance with some embodiments the determined next frame is presented on a display of the playback device. In accordance with some of these embodiments, each segment of the audio content associated with the next frame is presented concurrently with the presentation of the next frame. In accordance with some other of these embodiments, the playback device adds each segment of the audio content associated with the next frame to a queue in response to the determination of the associated segments of the audio content and each segment of audio content associated with the next frame is presented based upon the queue and is independent of the presentation of the next frame from the video content on the display of the playback device.

In accordance with some embodiments, the playback device generates a display of a scrubber for the video content indicating a presentation time of the next frame from the video content and overlays the display of the scrubber for the video content over the presentation of the image on the display. In accordance with many embodiments, the playback device generates a display of a scrubber for the audio content indicating a presentation time of each segment of the audio content associated with the next frame and overlays the display of the scrubber for the audio content over the presentation of the image on the display. In accordance with a number of embodiments, the scrubber for the audio content is separate from a scrubber for the video content in the display.

In accordance with many embodiments, data for the segments of the audio content portion, data for the plurality of frames of the video content, and the synchronization information are received in the playback device from a content provider system over a network using adaptive bitrate streaming. In accordance with a number of these embodiments, the receiving of the data is performed in the following manner. The playback device receives a top level index file from the content provider system over a network. The top level index file identifies alternative streams of video content, wherein at least a portion of the plurality of alternative streams are encoded at different maximum bitrates, and at least one stream of audio content. The playback device requests portions of the video content from the content provider system using the alternative streams based upon the network bandwidth between the playback device and the content provider system and receives the requested portions of the video content in the playback device in response to the requests. The playback device generates the frames of the video content from the portions of video content received and stores the plurality of frames in a buffer. The playback device also requests portions of the audio content from the at least one stream of audio content from the content provider system, receives the requested portions of the audio content, generates the segments of the audio content from the portions of audio content received and stores the segments of audio content in a buffer. The playback device obtains the synchronization information from the content provider system based upon information in the top level index file and stores the synchronization information in the memory of the playback device.
In accordance with some of these embodiments, the synchronization information is obtained by reading a pointer to a file including the synchronization information from the top level index file, requesting the file from the content provider system using the playback device, and receiving the requested file in the playback device.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a network diagram of an adaptive bitrate streaming system in accordance with an embodiment of the invention.

FIG. 2 illustrates a block diagram of components of a processing system such as a processing system in a playback device.

FIG. 3 illustrates a processing system such as a processing system in an encoding system and/or content provider system in accordance with an embodiment of the invention.

FIG. 4 illustrates a flow diagram of a process performed by an encoding system to encode streams of media content including at least one stream of audio data divided into segments and at least one stream of video content including frames and to generate synchronization data that associates one or more segments of the audio content with a frame of the video content in accordance with an embodiment of the invention.

FIG. 5 illustrates a flow diagram of a process performed by a playback device to obtain audio and video content of the media content as well as synchronization information for the media content in accordance with an embodiment of the invention.

FIG. 6 illustrates a flow diagram for a process performed by a playback device to obtain audio and video content for the media content as well as synchronization information for the media content using adaptive bitrate streaming in accordance with an embodiment of the invention.

FIG. 7 illustrates a flow diagram for a process performed by a playback device to provide audio content during playback of media content in a trick-play mode in accordance with an embodiment of the invention.

FIG. 8 illustrates a screen shot of a display during playback providing audio content during a trick-play mode in accordance with an embodiment of the invention.

DETAILED DESCRIPTION

Turning now to the drawings, systems and methods for providing playback of audio content during playback of video content in a trick-play mode in accordance with some embodiments of the invention are illustrated. For purposes of this discussion, a trick-play mode is a mode in which video content is presented in a manner other than the sequential presentation of video content in accordance with an intended presentation time. Examples of trick-play modes include, but are not limited to, rewind, fast forward, and seek. In accordance with some embodiments, a trick-play mode is initiated by the playback device receiving a command from a user indicating a particular trick-play mode. The playback device then begins trick mode play by determining which image of the video content to display next and the frame of video content associated with the image. In several embodiments, the playback device then determines one or more segments of audio content associated with the frame. The image is then presented on the display of the playback device and the segment of audio content is presented via an audio system of the playback device.
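By way of illustration only, the determination of the next frame and of the audio segments associated with that frame may be sketched as follows. The sketch assumes that presentation times are expressed as integer milliseconds, that the frame list is sorted and non-empty, that the current position is itself a frame presentation time, and that the synchronization information is held as a mapping from a frame presentation time to a list of audio segment identifiers. The names next_frame_time, audio_for_frame, and the mode strings are hypothetical and do not appear in the embodiments described herein.

```python
from bisect import bisect_left, bisect_right


def next_frame_time(current_ms, mode, frame_times_ms, step=1, seek_target_ms=None):
    """Return the presentation time of the next frame to show.

    frame_times_ms is a sorted, non-empty list of frame presentation times.
    The mode names and the step parameter are illustrative assumptions.
    """
    i = bisect_left(frame_times_ms, current_ms)
    last = len(frame_times_ms) - 1
    if mode == "fast_forward":
        i = min(i + step, last)          # clamp at the final frame
    elif mode == "rewind":
        i = max(i - step, 0)             # clamp at the first frame
    elif mode == "seek":
        # Latest frame at or before the seek target (assumed policy).
        i = max(bisect_right(frame_times_ms, seek_target_ms) - 1, 0)
    else:
        raise ValueError("unknown trick-play mode: %r" % (mode,))
    return frame_times_ms[i]


def audio_for_frame(frame_ms, sync_info):
    """Look up the audio segment ids associated with a frame, if any."""
    return sync_info.get(frame_ms, [])


frames = [0, 2000, 4000, 6000]
sync = {4000: ["a2"], 6000: ["a3", "a4"]}
frame = next_frame_time(2000, "fast_forward", frames)
segments = audio_for_frame(frame, sync)   # segments for the frame at 4000 ms
```

A frame without an entry in the mapping simply yields no audio, which corresponds to embodiments in which only particular frames have associated audio segments.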

In accordance with some embodiments, the segment of the audio content is only presented as long as the associated frame(s) is being presented. In accordance with some other embodiments, the entirety of the segment of audio data is played regardless of the current frame being presented. In accordance with many of these embodiments, a queue of segments of audio content to play is maintained and the segments are played from the queue in a sequential order regardless of the frames being displayed. In accordance with a number of these embodiments, the current frame being presented is determined when a segment of audio content ends and the audio content associated with the current frame is presented, which may cause segments of audio content in the queue that are associated with frames that were presented while the previous segment of audio content was being presented to be skipped. In accordance with some embodiments, the presentation of one or more audio segments associated with a frame(s) is repeated if the frame(s) is still being presented when the presentation of the one or more audio segments associated with the frame ends.
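The queue-based alternatives described above may be illustrated by a small simulation. This is a sketch under stated assumptions only: frames are given as (frame id, display seconds) pairs in the order shown during trick-play, every audio segment has a positive duration, and the function names play_full_queue and play_skip_to_current are hypothetical.

```python
def play_full_queue(frames, sync_info):
    """Play every queued segment in order, regardless of the frame on display."""
    return [seg for fid, _ in frames for seg in sync_info.get(fid, [])]


def play_skip_to_current(frames, sync_info, seg_duration):
    """When a segment ends, play the audio of the frame currently on display.

    Segments tied to frames that came and went while the previous segment
    was playing are skipped; a frame still on display has its audio repeated.
    Assumes every segment duration is greater than zero.
    """
    timeline, t = [], 0.0
    for fid, shown in frames:            # (start, end, frame id) per frame
        timeline.append((t, t + shown, fid))
        t += shown
    total, clock, played = t, 0.0, []
    while clock < total:
        start, end, fid = next(x for x in timeline if x[0] <= clock < x[1])
        segs = sync_info.get(fid, [])
        if not segs:
            clock = end                  # silent frame: wait for the next frame
            continue
        for seg in segs:
            played.append(seg)
            clock += seg_duration[seg]   # each segment plays in its entirety
    return played


frames = [("f1", 1.0), ("f2", 1.0), ("f3", 1.0), ("f4", 1.0)]
sync = {"f1": ["a1"], "f2": ["a2"], "f3": ["a3"], "f4": ["a4"]}
durations = {"a1": 2.5, "a2": 1.0, "a3": 1.0, "a4": 1.0}
order = play_skip_to_current(frames, sync, durations)
```

In this example segment a1 outlasts frames f1 and f2, so the segment a2 associated with frame f2 is skipped and playback resumes with the audio of frame f3.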

In accordance with some embodiments, synchronization information indicating the association of segments of the audio data with frames of video content is generated by an encoding device. In accordance with some embodiments, the synchronization information may be determined from a subtitle track or closed caption track which associates video presentation times to verbal commentary and/or dialog. In accordance with several of these embodiments, the synchronization information is stored in files storing the audio and/or video content. In accordance with a number of other embodiments, the synchronization information is stored in an index file associated with the media content. The synchronization information can be retrieved at the start of playback and stored for use when a trick-play mode is requested.
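One possible way of deriving synchronization information from a subtitle or closed caption track is sketched below. The embodiments above do not specify how the derivation is performed, so the rule used here is an assumption made for illustration: each cue is attached to the latest frame at or before the start of the cue, and every audio segment whose time span overlaps the cue is associated with that frame. The tuple layouts and the name sync_from_cues are likewise hypothetical.

```python
from bisect import bisect_right


def sync_from_cues(cues, frame_times_ms, audio_segments):
    """Build {frame time: [audio segment ids]} from subtitle cues.

    cues:            list of (start_ms, end_ms, text)
    frame_times_ms:  sorted list of frame presentation times
    audio_segments:  list of (segment_id, start_ms, end_ms)
    """
    sync = {}
    for start, end, _text in cues:
        i = bisect_right(frame_times_ms, start) - 1
        if i < 0:
            continue                      # cue begins before the first frame
        frame = frame_times_ms[i]
        overlapping = [sid for sid, s, e in audio_segments if s < end and e > start]
        if overlapping:
            entry = sync.setdefault(frame, [])
            entry.extend(sid for sid in overlapping if sid not in entry)
    return sync


cues = [(1200, 2500, "Goal!")]
frame_times = [0, 1000, 2000, 3000]
audio = [("a0", 0, 1000), ("a1", 1000, 2000), ("a2", 2000, 3000)]
sync = sync_from_cues(cues, frame_times, audio)
```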

In accordance with many embodiments, a playback device that provides audio playback during a trick-play mode uses adaptive bit rate streaming to obtain the media content. In accordance with some of these embodiments, the media content is stored in streams in accordance with the DASH standards. However, one skilled in the art will recognize that other formats such as, but not limited to, a Matroska (MKV) container file format may be used to store streams of the media content without departing from this invention.

The performance of an adaptive bitrate streaming system in accordance with some embodiments of the invention can be significantly enhanced by encoding each portion of the source video in each of the alternative streams in such a way that the portion of video is encoded in each stream as a single (or at least one) closed group of pictures (GOP) starting with an Instantaneous Decoder Refresh (IDR) frame, which is an intra-frame. The playback device can switch between the alternative streams used during normal playback at the completion of the playback of a Cluster and, irrespective of the stream from which a Cluster is obtained, the first frame in the Cluster will be an IDR frame that can be decoded without reference to any encoded media other than the encoded media contained within the Cluster element.

In a number of embodiments, the playback device obtains information concerning each of the available streams from a top level index file or manifest file and selects one or more streams to utilize in the playback of the media. The playback device can then obtain header information from the container files containing the one or more bit streams or streams, and file headers and/or the manifest file can provide information concerning the decoding of the streams. The playback device can also request index information that indexes the encoded media stored within the relevant container files. The index information including, but not limited to, the metadata associating segments of the audio content with frames of the video content can be stored within the container files, separately from the container files in the top level index or in separate index files. The index information can enable the playback device to request byte ranges corresponding to GOPs within the container file via HTTP from the server. The playback device may obtain the metadata information in the index information when the top level index file is received, request the files storing the index information after the top level index file is received or request the portions of the container file storing the index information in accordance with various embodiments of the invention. The playback device uses the index information to request portions of the media content from the alternative streams in accordance with some embodiments. Playback is continued with the playback device requesting portions of the encoded content from a stream having media content that is encoded at a bitrate that can be supported by the network conditions.
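The use of index information to request a byte range corresponding to a GOP may be sketched as follows. The sketch builds, but does not send, an HTTP request carrying a Range header using the Python standard library. The layout of the index entry (an offset and a size in bytes), the example URL, and the name build_range_request are assumptions made for illustration.

```python
import urllib.request


def build_range_request(container_url, index_entry):
    """Build an HTTP request for the bytes of one GOP in a container file.

    index_entry is assumed to hold the GOP's byte offset and size.
    HTTP byte ranges are inclusive, hence the "- 1" on the last byte.
    """
    first = index_entry["offset"]
    last = first + index_entry["size"] - 1
    return urllib.request.Request(
        container_url,
        headers={"Range": "bytes=%d-%d" % (first, last)},
    )


request = build_range_request(
    "http://example.com/video_1.mkv",      # hypothetical container file URL
    {"offset": 1000, "size": 500},
)
range_header = request.get_header("Range")
```

Because the server need only honor the byte range in each request, it is not required to record any state for the playback device, consistent with the stateless use of HTTP described above.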

The encoding of source video for use in adaptive bitrate streaming systems that provide audio playback during trick-play modes and the playback of the media content in a trick-play mode using adaptive bitrate streaming in accordance with some embodiments of the invention is discussed further below.

Adaptive Streaming System Architecture

Turning now to FIG. 1, an adaptive streaming system including playback devices that provide audio playback during playback in a trick-play mode in accordance with an embodiment of the invention is illustrated. The adaptive streaming system 10 includes a source encoder 12 configured to encode source media as a number of alternative streams. In the illustrated embodiment, the source encoder is a server. In other embodiments, the source encoder can be any processing device including a processor and sufficient resources to perform the transcoding of source media (including but not limited to video, audio, and/or subtitles). Typically, the source encoding server 12 generates a top level index to a plurality of container files containing the streams and/or metadata information, at least a plurality of which are alternative streams. Alternative streams are streams that encode the same media content in different ways. In many instances, alternative streams encode media content (such as, but not limited to, video content and/or audio content) at different maximum bitrates. In a number of embodiments, the alternative streams of video content are encoded with different resolutions and/or at different frame rates. The top level index file and the container files are uploaded to an HTTP server 14. A variety of playback devices can then use HTTP or another appropriate stateless protocol to request portions of the top level index file, other index files, and/or the container files via a network 16 such as the Internet.

In the illustrated embodiment, playback devices include personal computers 18, CE players, and mobile phones 20. In other embodiments, playback devices can include consumer electronics devices such as DVD players, Blu-ray players, televisions, set top boxes, video game consoles, tablets, and other devices that are capable of connecting to a server via HTTP and playing back encoded media. Although a specific architecture is shown in FIG. 1, any of a variety of architectures including systems that perform conventional streaming and not adaptive bitrate streaming can be utilized that enable playback devices to request portions of the top level index file and the container files in accordance with embodiments of the invention.

Playback Device

Some processes for providing methods and configuring systems in accordance with embodiments of this invention are executed by a playback device. The relevant components in a playback device that can perform the processes in accordance with an embodiment of the invention are shown in FIG. 2. One skilled in the art will recognize that a playback device may include other components that are omitted for brevity without departing from described embodiments of this invention. The playback device 200 includes a processor 205, a non-volatile memory 210, and a volatile memory 215. The processor 205 is a processor, microprocessor, controller, or a combination of processors, microprocessors, and/or controllers that performs instructions stored in the volatile memory 215 or non-volatile memory 210 to manipulate data stored in the memory. The non-volatile memory 210 can store the processor instructions utilized to configure the playback device 200 to perform processes including processes in accordance with embodiments of the invention and/or data for the processes being utilized. In accordance with some embodiments, these instructions are included in a playback application that performs the playback of media content on a playback device. In accordance with various embodiments, the playback device software and/or firmware can be stored in any of a variety of non-transitory computer readable media appropriate to a specific application.

Servers

Some processes for providing methods and systems in accordance with embodiments of this invention are executed by the HTTP server; source encoding server; and/or local and network time servers. The relevant components in a server that performs one or more of these processes in accordance with embodiments of the invention are shown in FIG. 3. One skilled in the art will recognize that a server may include other components that are omitted for brevity without departing from the described embodiments of this invention. The server 300 includes a processor 305, a non-volatile memory 310, and a volatile memory 315. The processor 305 is a processor, microprocessor, controller, or a combination of processors, microprocessors, and/or controllers that performs instructions stored in the volatile memory 315 or non-volatile memory 310 to manipulate data stored in the memory. The non-volatile memory 310 can store the processor instructions utilized to configure the server 300 to perform processes including processes in accordance with embodiments of the invention and/or data for the processes being utilized. In accordance with some embodiments, instructions to perform encoding of media content are part of an encoding application. In accordance with various embodiments, the server software and/or firmware can be stored in any of a variety of non-transitory computer readable media appropriate to a specific application. Although a specific server is illustrated in FIG. 3, any of a variety of servers configured to perform any number of processes can be utilized in accordance with embodiments of the invention.

Encoding of Streams of Audible and Video Content and Synchronization Information in Alternative Streams

In accordance with some embodiments, a playback device that provides audible content during playback in a trick-play mode receives the media content, including video content and audio content, and the synchronization information associating segments of the audio content with frames in the video content from a content provider system using adaptive bitrate streaming. To provide the data required to deliver media content via adaptive bitrate streaming, the audio content and the video content can be encoded in streams and synchronization information for the audio content and the video content can be generated and stored in a manner such that the information may be provided to the playback device. In accordance with some embodiments, the audio content is encoded in one stream at a specific maximum bitrate; and the video content is encoded into multiple streams that are encoded at varying maximum bitrates, resolutions, aspect ratios, and the like for use by different playback devices experiencing differing network traffic conditions. In accordance with some other embodiments, the audio content is encoded in multiple streams at varying maximum bitrates; and the video content is encoded into multiple streams that are encoded at varying maximum bitrates, resolutions, aspect ratios, and the like for use by different playback devices experiencing differing network traffic conditions. In a number of embodiments, the video content may also be encoded into trick-play streams that only include specific portions of the video content for use in providing trick-play modes during playback.

In accordance with some embodiments, the synchronization information is included in index information stored in a top level index file. In accordance with many embodiments, the synchronization information is stored in an index file pointed to by the top level index file. In a number of embodiments, the synchronization information may be stored as metadata in files storing portions of the streams of audio content and/or video content. In accordance with some embodiments, the synchronization information may be determined from a subtitle track or closed caption track which associates video presentation times to verbal commentary and/or dialog. A process performed by an encoder server system for encoding media content including audio content and video content as well as synchronization information in accordance with an embodiment of this invention is shown in FIG. 4.

Process 400 begins by receiving media content to be encoded (405). The media content includes audio content and video content. In accordance with some embodiments of this invention, the audio content is divided into segments and the video content includes frames where each frame provides information for one or more of the images in the video content. The process 400 then encodes the audio content and video content into streams (410). In accordance with many embodiments, the audio content is encoded in one stream at a specific maximum bitrate; and the video content is encoded into multiple streams that are encoded at varying maximum bitrates, resolutions, aspect ratios, and the like for use by different playback devices experiencing differing network traffic conditions. In accordance with some other embodiments, the audio content is encoded in multiple streams at varying maximum bitrates; and the video content is encoded into multiple streams that are encoded at varying maximum bitrates, resolutions, aspect ratios, and the like for use by different playback devices experiencing differing network traffic conditions. In a number of embodiments, the video content may also be encoded into trick-play streams that only include specific portions of the video content for use in providing trick-play modes during playback.

Process 400 obtains synchronization information that associates the presentation time of at least some of the portions of the audible content with the presentation time of specific frames of the video content (415). In accordance with some embodiments, the synchronization information may be received along with the media content. In accordance with many embodiments, the synchronization information may be generated by the encoder server system. In accordance with a number of embodiments, the synchronization information may be received as an input from an operator of the encoder synchronization server. The synchronization information can then be encoded as index information for the media content (420).

The process 400 generates the container files and index files for the media content (425). In accordance with some embodiments, each stream (of both audio and video content) is divided into segments and each segment is placed in a separate container file. In accordance with many embodiments, the audio data is segmented such that each identified portion of the audio content is a single segment and stored in a separate container file. In accordance with many embodiments, the index information is placed in the top level index file. In accordance with a number of embodiments, the index information is placed in one or more index files that are pointed to by the top level index file. In still some other embodiments, index information including the synchronization information is stored in container files as metadata. In some particular embodiments, the synchronization information for a particular portion of the audio data is stored as metadata in the container file storing the segment including the particular portion of audio data.
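Two of the storage alternatives described above, namely placing the synchronization information directly in the top level index file or placing it in a separate index file pointed to by the top level index file, may be sketched as follows. The JSON layout, the key names "sync" and "sync_ref", and the name build_top_level_index are hypothetical; actual index formats (for example an MPD or an M3U8 playlist) differ.

```python
import json


def build_top_level_index(video_streams, audio_streams, sync_info, sync_url=None):
    """Return (top_level_index, sync_file_text).

    With sync_url=None the synchronization information is embedded in the
    top level index and no separate file is produced. Otherwise the index
    carries only a pointer and the information is serialized separately.
    """
    # JSON object keys must be strings, so frame times are stringified.
    serializable = {str(frame): segs for frame, segs in sync_info.items()}
    index = {"video": video_streams, "audio": audio_streams}
    if sync_url is None:
        index["sync"] = serializable
        return index, None
    index["sync_ref"] = sync_url
    return index, json.dumps(serializable, sort_keys=True)


video = [{"url": "video_500k.mkv", "max_bitrate": 500_000},
         {"url": "video_1500k.mkv", "max_bitrate": 1_500_000}]
audio = [{"url": "audio_128k.mkv", "max_bitrate": 128_000}]
sync = {4000: ["a2"], 6000: ["a3", "a4"]}

embedded_index, _ = build_top_level_index(video, audio, sync)
pointer_index, sync_file = build_top_level_index(video, audio, sync, "sync.json")
```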

Although one process for encoding alternative streams of media content including audio content, video content, and synchronization information in accordance with one embodiment of the invention is described above, one skilled in the art will recognize that other processes for encoding the streams may be performed in accordance with some embodiments of the invention.

Playback of Media Content Including Providing Audio Data in a Trick-Play Mode

In accordance with some embodiments of the invention, a playback device provides audio content during playback in a trick-play mode. To do so, the playback device stores audio content that is divided into portions, video content that includes images that are each associated with a frame, and synchronization data that associates a presentation time of a portion of the audio content with a presentation time of a frame for images in the video content. A process for obtaining the audio content, video content and synchronization information in accordance with an embodiment of this invention is shown in FIG. 5.

Process 500 receives the media content including audio content and video content (505). In accordance with some embodiments of the invention, the media content is received via adaptive bit rate streaming. In accordance with some other embodiments, the media content may be read from a memory. In accordance with still other embodiments, the media content is read from a non-transitory medium storing the media content. The video content includes images. Each of the images is associated with a frame that provides information for forming the image. The audio content is divided into segments that have a playback duration that is approximately equal to or less than the playback duration of images associated with a frame in the video content. The audio content and the video content are stored in a buffer for playback as the media content is decoded by the playback device (510).

Synchronization information is received (515) in process 500. The synchronization information associates the presentation time(s) of one or more portions of the audio content with the presentation time of a frame for images in the video content. In accordance with some embodiments, the synchronization information includes synchronization information for each portion of the audio content. In accordance with some other embodiments, the synchronization information includes synchronization information associating at least one portion of audio data with each frame of the video data. In accordance with still other embodiments, the synchronization information may only associate particular portions of the audio data with particular frames in the video data. The received synchronization information is stored in a memory for use during playback and more particularly for use during playback of audio content in a trick-play mode in accordance with some embodiments of this invention (520).
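The three alternatives for the extent of the synchronization information may be distinguished by a simple check, sketched below. The sketch assumes that the stored synchronization information is a mapping from a frame identifier to a list of audio segment identifiers; the name classify_coverage and the returned keys are hypothetical.

```python
def classify_coverage(sync_info, frame_ids, segment_ids):
    """Report which coverage alternative a set of sync information satisfies."""
    mapped_segments = {seg for segs in sync_info.values() for seg in segs}
    return {
        # every portion of the audio content is tied to some frame
        "every_segment_mapped": set(segment_ids) <= mapped_segments,
        # every frame has at least one associated portion of audio
        "every_frame_has_audio": all(sync_info.get(f) for f in frame_ids),
    }


frames = ["f1", "f2", "f3"]
segments = ["a1", "a2"]
partial = {"f2": ["a1"]}                  # only particular frames and segments
report = classify_coverage(partial, frames, segments)
```

When both values are false, as for the partial mapping in the example, the synchronization information associates only particular portions of the audio data with particular frames.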

Although various processes for obtaining media content for use in providing playback of audio content in a trick-play mode are discussed above with reference to FIG. 5, one skilled in the art will recognize that other processes for obtaining media content may be performed in accordance with various embodiments of this invention.

In accordance with some particular embodiments of the invention, a playback device may obtain media content for use in providing playback of audio content in a trick-play mode using adaptive bit rate streaming. A process for obtaining audio content, video content, and synchronization information for use in providing playback of audio content in a trick-play mode in accordance with an embodiment of the invention is shown in FIG. 6.

In process 600, the playback device receives an index file from a content provider system (605). The playback device uses the index file to request portions of the audio content and video content from the content provider system (630). In accordance with some embodiments of the invention, the playback device monitors the network bandwidth for communications over the network between the playback device and the content provider system, and selects streams of the audio and video content that are encoded at the highest maximum bitrates that can be handled in accordance with the measured bandwidth. Systems and methods for selecting a stream, commencing playback, and obtaining media content using adaptive bit rate streaming are disclosed in more detail in U.S. patent application Ser. No. 13/251,061 entitled “Systems and Methods for Determining Available Bandwidth and Performing Initial Stream Selection When Commencing Streaming Using Hypertext Transfer Protocol” and U.S. patent application Ser. No. 13/339,992 entitled “Systems and Methods for Performing Multiphase Adaptive Bitrate Streaming,” the disclosures of which are hereby incorporated by reference in their entirety. The requested portions of audio and video content are received by the playback device (635). The audio and video contents are then generated from the received portions (640) by the playback device and provided to a buffer in the playback device to store for presentation (645) by a client application. One skilled in the art will note that the requesting (630), receiving (635), generating (640), and providing (645) of the audio and video content may be performed iteratively until all of the audio and video contents of the media content are received by the playback device in accordance with adaptive bitrate streaming processes.
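The stream selection described above can be illustrated with a short sketch. The function name `select_stream` and the fallback to the lowest-bitrate stream when no stream fits the measured bandwidth are assumptions made for the example; the incorporated applications describe the actual selection methods.

```python
from typing import Sequence


def select_stream(max_bitrates_bps: Sequence[int],
                  measured_bandwidth_bps: int) -> int:
    """Pick the highest maximum bitrate that the measured bandwidth can handle.

    Assumption for this sketch: if no stream fits, fall back to the
    lowest-bitrate stream rather than failing.
    """
    if not max_bitrates_bps:
        raise ValueError("at least one alternative stream is required")
    fitting = [b for b in max_bitrates_bps if b <= measured_bandwidth_bps]
    return max(fitting) if fitting else min(max_bitrates_bps)


# Three hypothetical alternative streams listed in the index file.
streams = [500_000, 1_500_000, 4_000_000]
chosen = select_stream(streams, measured_bandwidth_bps=2_000_000)
```

Re-running the selection each time the bandwidth measurement changes gives the iterative behavior of steps 630 through 645.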

The playback device also obtains the synchronization information that associates the presentation of segments of the audio content with the presentation of frames of the video content (650). In accordance with some embodiments, the synchronization information may be read from the top level index file when the top level index file is received during an initial start-up of an adaptive bitrate streaming process. In accordance with some other embodiments, the playback device reads a pointer to an index file from the top level index file, and requests and receives the index file from the content provider system during an initial start-up of an adaptive bitrate streaming process. In accordance with still other embodiments, the synchronization information is received as metadata during the streaming of the audio and/or video contents. In accordance with some embodiments, the synchronization information may be determined from a subtitle track or closed caption track which associates video presentation times with verbal commentary and/or dialog. The synchronization information is then provided to the client playback application (655), which may store the synchronization information as a data structure in memory for use during playback.
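As a rough illustration of the subtitle-based embodiment, the sketch below derives frame-to-audio associations from caption cue intervals. The rule used here (a frame and an audio segment are associated when both start inside the same cue) and the name `sync_from_cues` are assumptions for the example; the description does not specify how the associations are computed.

```python
from typing import Dict, List, Sequence, Tuple


def sync_from_cues(cues: Sequence[Tuple[int, int]],
                   frame_times_ms: Sequence[int],
                   segment_times_ms: Sequence[int]) -> Dict[int, List[int]]:
    """Associate frames with audio segments that share a caption cue.

    cues holds (start_ms, end_ms) intervals during which dialog or
    commentary is present, as taken from a subtitle or caption track.
    """
    sync: Dict[int, List[int]] = {}
    for start, end in cues:
        # Audio segments that begin while this cue is displayed.
        segments = [s for s in segment_times_ms if start <= s < end]
        for frame in frame_times_ms:
            if start <= frame < end:
                sync.setdefault(frame, []).extend(segments)
    return sync


cue_sync = sync_from_cues(
    cues=[(0, 2000), (5000, 6000)],
    frame_times_ms=[0, 4000, 5000],
    segment_times_ms=[0, 1000, 5000, 5500],
)
```

In this example the frame at 4000 ms falls outside every cue, so it receives no associated audio.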

Although a process for obtaining media content for use in providing playback of audio content in a trick-play mode using adaptive bitrate streaming is discussed above with reference to FIG. 6, one skilled in the art will recognize that other processes for obtaining media content may be performed in accordance with various embodiments of this invention.

Provision of Audible Playback During a Trick-Play Mode

Sometimes the audible content in a media presentation can indicate a portion of media content that may be of interest to a user. For example, the media content of a sporting event may include audible content of a play-by-play description of the plays. From this description, a user can find a portion of content that is of interest, such as a score or an important play, while using a trick-play mode to find an event of interest in the content. As such, some playback systems may want to provide audible content during playback in a trick-play mode to enhance the user experience. A process performed by a playback device to provide audible playback during a trick-play mode in accordance with an embodiment of this invention is shown in FIG. 7.

In process 700, the playback device receives a request to present the media content in a trick-play mode (705). In accordance with some embodiments, trick-play modes can include, but are not limited to, fast forward, rewind, forward seek, and backward seek. In accordance with many embodiments, the trick-play mode command is received as input via an Input/Output (I/O) device. Examples of I/O devices in accordance with various embodiments of this invention include, but are not limited to, a keyboard, a mouse, a touch screen, and the like. The playback device determines the next frame of video content to present based upon the current trick-play mode initiated by the command (710). In accordance with some embodiments of the invention, the next frame to present is based upon the trick-play mode being employed and the selected speed of the trick-play mode. In accordance with some other embodiments, the next frame may be determined based upon a predetermined list of images to present such as, but not limited to, predetermined images presented in a seek mode.
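The determination of the next frame from the trick-play mode and the selected speed (710) might be sketched as below. The enum `TrickPlayMode`, the function `next_frame_index`, and the choice to step through a list of trick-play frames by `speed` positions are illustrative assumptions, not details taken from the description.

```python
from enum import Enum


class TrickPlayMode(Enum):
    """Direction of travel for two of the trick-play modes named above."""
    FAST_FORWARD = 1
    REWIND = -1


def next_frame_index(current: int, mode: TrickPlayMode,
                     speed: int, frame_count: int) -> int:
    """Step through the trick-play frames in the direction of the mode.

    The result is clamped so that it always refers to an existing frame.
    """
    candidate = current + mode.value * speed
    return max(0, min(frame_count - 1, candidate))


forward = next_frame_index(10, TrickPlayMode.FAST_FORWARD, speed=4, frame_count=100)
backward = next_frame_index(1, TrickPlayMode.REWIND, speed=2, frame_count=100)
```

A seek mode that uses a predetermined list of images could replace the arithmetic step with a lookup into that list.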

The playback device determines the segment(s) of audio content associated with the next frame using the synchronization information stored by the playback device (712). The segment(s) of audio content associated with the next frame to present are compared to the segment(s) of audio content associated with the current frame of video content being presented (715). If the segment(s) of audio content associated with the next frame are the same as the segment(s) of audio content associated with the current frame being presented, the current audio segment(s) being presented are used for playback (730). In accordance with some embodiments, the use of the current audio content may include presenting the current segments of audio content a second time. In other embodiments, the use of the current audio segment may include allowing the entire segment and/or segments subsequent to the current segment of audio data to be presented. The next frame is then obtained from the buffer for presentation (735). If the segment(s) of audio content associated with the next frame are different from the segment(s) of audio content associated with the current frame being presented, the playback device then obtains the segment(s) of audio content associated with the next frame from the buffer(s) storing the audio content (725). One skilled in the art will appreciate that the duration of the presentation of the audio segment(s) may be different from the presentation time of the frame. Thus, a queue of audio segments to play may be maintained and the obtained audio segments are added to the queue (727).
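The comparison and queueing steps (712 through 730) can be sketched as follows. The function name `update_audio_queue`, the dictionary form of the synchronization information, and the use of segment identifiers in place of decoded audio are assumptions made for the example.

```python
from collections import deque
from typing import Deque, Dict, List


def update_audio_queue(sync: Dict[int, List[int]], current_frame: int,
                       next_frame: int, queue: Deque[int]) -> bool:
    """Queue the next frame's audio segments only if they differ.

    Returns True when new segments were queued (725, 727) and False when
    the current audio segments continue to be used (730).
    """
    current_segments = sync.get(current_frame, [])   # known from prior pass
    next_segments = sync.get(next_frame, [])         # lookup (712)
    if next_segments == current_segments:            # comparison (715)
        return False
    queue.extend(next_segments)
    return True


# Hypothetical sync data: frames 0 and 1 share audio segments 0 and 1.
frame_sync = {0: [0, 1], 1: [0, 1], 2: [2]}
audio_queue: Deque[int] = deque()
same = update_audio_queue(frame_sync, 0, 1, audio_queue)
changed = update_audio_queue(frame_sync, 1, 2, audio_queue)
```

Because the queue is drained by the audio system at its own pace, segments whose duration exceeds the display time of a frame are not cut short in this sketch.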

The next frame is presented using the display device and the audio segments are presented using the audio system (735). In accordance with many embodiments, the audio segments are presented in the order provided by the queue. In accordance with some of these embodiments, the queue may provide indications of when an audio segment is associated with multiple frames, requiring either re-presenting the audio segment during playback or providing subsequent segments until the audio segment associated with the frame changes.

In accordance with some embodiments, a display of “scrubbers” is generated to indicate the current audio segment and image being presented (740). The scrubbers indicate the presentation time of the currently presented audio segment with reference to an overall presentation time and the presentation time of the current image being presented with reference to the overall presentation time. An example of a display during playback with audio content having scrubbers indicating the presentation times of the video and audio contents in accordance with an embodiment of the invention is shown in FIG. 8. Display 800 is currently showing a frame of video content having a presentation time indicated by the dot on a video scrubber 805. The presentation time of the audio content being presented via the audio system is shown on audio scrubber 810. As can be seen in FIG. 8, there may not be a one-to-one correspondence between the presentation time of the images in the trick-play mode and the presentation of the audio segments. The video scrubber 805 and audio scrubber 810 therefore show the difference in presentation time between the audio and video contents being presented, allowing a user to find the presentation time of desired content.
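The position of each scrubber dot can be computed as a fraction of the overall presentation time. The sketch below assumes a horizontal scrubber measured in pixels; the function name `scrubber_position` and the pixel width are hypothetical and are not taken from FIG. 8.

```python
def scrubber_position(presentation_time_ms: int, total_duration_ms: int,
                      width_px: int) -> int:
    """Map a presentation time to a horizontal offset on a scrubber bar."""
    if total_duration_ms <= 0:
        raise ValueError("total duration must be positive")
    # Clamp so the dot never leaves the bar.
    fraction = min(max(presentation_time_ms / total_duration_ms, 0.0), 1.0)
    return round(fraction * width_px)


# Video and audio dots are computed independently, so they may differ.
video_dot = scrubber_position(60_000, 120_000, width_px=400)
audio_dot = scrubber_position(30_000, 120_000, width_px=400)
```

Computing the two positions separately is what lets the display show the gap between the video and audio presentation times.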

Returning to FIG. 7, the current image and the audio segment associated with the image are presented (745). In accordance with some embodiments, the audio content associated with a frame of an image is played while the image is being presented, and the audio segment being presented changes when the frame associated with the image being presented changes. In accordance with some other embodiments, a queue of the audio content to present is maintained and the audio content is played back according to the queue, independent of the images being presented in the trick-play mode. In accordance with embodiments providing the “scrubbers”, the display of generated “scrubbers” is overlaid onto the image being presented (750).

The playback device then determines whether the trick-play mode is completed (755). In accordance with some embodiments, the indication of the completion of a trick-play mode may be the receiving of an input command indicating the resumption of normal playback. In accordance with other embodiments, the indication of the completion of trick play may be reaching a predetermined portion of the media content. If the trick-play mode is completed, process 700 ends and conventional playback is resumed. If the trick-play mode is not completed, process 700 repeats from the determination of the next frame to present, using the previously determined next frame as the current frame.
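The repetition and completion check (755) can be outlined as a simple loop. This sketch assumes a forward trick-play mode with a fixed positive step between presented frames; the function `trick_play_loop` and the callable `resume_requested` are hypothetical names introduced for the example.

```python
from typing import Callable, List


def trick_play_loop(start_ms: int, step_ms: int, stop_ms: int,
                    resume_requested: Callable[[], bool]) -> List[int]:
    """Return the presentation times of the frames shown in trick-play mode.

    The loop ends when normal playback is requested or when a predetermined
    point in the media content (stop_ms) is reached.
    """
    presented: List[int] = []
    current = start_ms
    while True:
        next_frame = current + step_ms       # determine next frame (710)
        presented.append(next_frame)         # present the frame (745)
        if resume_requested() or next_frame >= stop_ms:   # completion (755)
            return presented
        current = next_frame                 # next frame becomes current


shown = trick_play_loop(0, 2000, 6000, resume_requested=lambda: False)
```

Audio handling is omitted here to keep the control flow visible.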

Although various processes for providing audio playback during a trick-play mode in accordance with an embodiment of the invention are described above with reference to FIG. 7, other processes may be performed by a playback device to provide audio playback during a trick-play mode in accordance with embodiments of the invention.

Although the present invention has been described in certain specific aspects, many additional modifications and variations would be apparent to those skilled in the art. It is therefore to be understood that the present invention may be practiced otherwise than specifically described, including various changes in the implementation such as utilizing encoders and decoders that support features beyond those specified within a particular standard with which they comply, without departing from the scope and spirit of the present invention. Thus, embodiments of the present invention should be considered in all respects as illustrative and not restrictive.

Claims

1. A method for providing playback of audio content in a trick-play mode during playback of media content, the method comprising: storing synchronization information in a memory in a playback device wherein the synchronization information associates a presentation time of each of one or more of a plurality of segments of a portion of audio content with a presentation time of a set of one or more frames, where the set of frames comprises a predetermined subset of frames of a portion of video content to present in the trick-play mode; receiving a command for playback of the media content in the trick-play mode in the playback device; determining a first frame from the predetermined subset of frames of the portion of video content to present in the trick-play mode using the playback device; determining each segment from the plurality of segments of the portion of audio content associated with the first frame from the synchronization information stored in memory; presenting each of the segments of the plurality of segments of audio content associated with the first frame during playback in the trick-play mode using the playback device; determining a next frame from the predetermined subset of frames of the portion of video content to present in the trick-play mode using the playback device; determining each segment from the plurality of segments of the portion of audio content associated with the next frame from the synchronization information stored in memory; and presenting each of the segments of the plurality of segments of audio content associated with the next frame during playback in the trick-play mode using the playback device.
2. The method of claim 1 further comprising: presenting the next frame on a display of the playback device.
3. The method of claim 2 wherein the presenting of each segment of the plurality of segments of the audio content associated with the next frame is performed concurrently with the presentation of the next frame.
4. The method of claim 2 further comprising: adding each segment of the plurality of segments of the audio content associated with the next frame to a queue in response to a determination of each of the plurality of segments associated with the frame using the playback device; and wherein each segment of the plurality of segments associated with the next frame is presented based upon the queue and is independent of the presentation of the next frame from the video content on the display of the playback device.
5. The method of claim 2 further comprising: generating a display of a scrubber for the video content indicating a presentation time of the next frame from the video content using the playback device; and overlaying the display of the scrubber for the video content over the presentation of an image on the display.
6. The method of claim 2 further comprising: generating a display of a scrubber for the audio content indicating a presentation time of each segment of the audio content associated with the next frame using the playback device; and overlaying the display of the scrubber for the audio content over the presentation of an image on the display.
7. The method of claim 6 wherein the scrubber for the audio content is separate from a scrubber for the video content in the display.
8. The method of claim 1 further comprising: receiving data for the plurality of segments of the audio content portion, data for the plurality of frames of the video content, and the synchronization information in the playback device from a content provider system over a network using adaptive bitrate streaming.
9. The method of claim 8 wherein the receiving of the data for the plurality of segments of the audio content portion, the data for the plurality of frames of the video content, and the synchronization information in the playback device from a content provider system over a network using adaptive bitrate streaming comprises: receiving a top level index file from the content provider system in the playback device over a network wherein the top level index file identifies a plurality of alternative streams of video content wherein at least a portion of the plurality of alternative streams are encoded at different maximum bitrates, and at least one stream of audio content; and requesting portions of the video content from the plurality of alternative streams from the content provider system using the playback device based upon network bandwidth between the playback device and the content provider system; receiving the requested portions of the video content in the playback device in response to the requests; generating the plurality of frames of the video content in the playback device from the portions of video content received by the playback device; storing the plurality of frames in a buffer of the playback device; requesting portions of the audio content from the at least one stream of audio content from the content provider system using the playback device; receiving the requested portions of the audio content in the playback device; generating the plurality of segments of the audio content from the portions of audio content received using the playback device; storing the plurality of segments of audio content in a buffer of the playback device; obtaining the synchronization information from the content provider system using the playback device based upon information in the top level index file; and storing the synchronization information in the memory of the playback device.
10. The method of claim 9, wherein the obtaining of the synchronization information comprises: reading a pointer to a file including the synchronization information from the top level index file using the playback device; requesting the file from content provider system using the playback device; and receiving the requested file in the playback device.
11. A playback device comprising: a memory; a network interface; and a processor that reads instructions stored in the memory that direct the processor to: store synchronization information in a memory wherein the synchronization information associates a presentation time of each of one or more of a plurality of segments of a portion of audio content with a presentation time of a set of one or more frames, where the set of frames comprises a predetermined subset of frames of a portion of video content to present in a trick-play mode; receive a command for playback of media content in the trick-play mode; determine a first frame from the predetermined subset of frames of the portion of video content to present in the trick-play mode; determine each segment from the plurality of segments of the portion of audio content associated with the first frame from the synchronization information stored in memory; present each of the segments of the plurality of segments of audio content associated with the first frame during playback in the trick-play mode using the playback device; determine a next frame from the predetermined subset of frames of the portion of video content to present in the trick-play mode; determine each segment from the plurality of segments of the portion of audio content associated with the next frame from the synchronization information stored in memory; and present each of the segments of the plurality of segments of audio content associated with the next frame during playback in the trick-play mode using the playback device.
12. The playback device of claim 11 wherein the instructions further direct the processor to: present the next frame on a display of the playback device.
13. The playback device of claim 12 wherein the presenting of each of the segments of the audio content associated with the next frame is performed concurrently with the presentation of the next frame.
14. The playback device of claim 12 wherein the instructions further direct the processor to: add each of the segments of the audio content associated with the next frame to a queue in response to a determination of each of the segments is associated with the next frame; and wherein the presenting each of the segments is based upon the queue and is independent of the presenting of the next frame from the video content on the display of the playback device.
15. The playback device of claim 12 wherein the instructions further direct the processor to: generate a display of a scrubber for the video content indicating a presentation time of the next frame from the video content; and overlay the display of the scrubber for the video content over the presentation of the next frame on the display.
16. The playback device of claim 12 wherein the instructions further direct the processor to: generate a display of a scrubber for the audio content indicating a presentation time of each segment associated with next frame of the audio content as each segment is being presented; and overlay the display of the scrubber for the audio content over the presentation of an image on the display.
17. The playback device of claim 16 wherein the scrubber for the audio content is separate from a scrubber for the video content in the display.
18. The playback device of claim 11 wherein the instructions further direct the processor to: receive data for the plurality of segments of the audio content portion, data for the plurality of frames of the video content, and the synchronization information from a content provider system over a network using adaptive bitrate streaming.
19. The playback device of claim 18 wherein the instructions to receive the data for the plurality of segments of the audio content portion, the data for the plurality of frames of the video content, and the synchronization information from a content provider system over a network using adaptive bitrate streaming further direct the processor to: receive a top level index file from the content provider system over a network wherein the top level index file identifies a plurality of alternative streams of video content wherein at least a portion of the plurality of alternative streams are encoded at different maximum bitrates, and at least one stream of audio content; request portions of the video content from the plurality of alternative streams from the content provider system based upon network bandwidth between the playback device and the content provider system; receive the requested portions of the video content; generate the plurality of frames of the video content from the portions of video content received; store the plurality of frames in a buffer; request portions of the audio content from the at least one stream of audio content from the content provider system; receive the requested portions of the audio content; generate the plurality of segments of the audio content from the portions of audio content received; store the plurality of segments of audio content in a buffer of the playback device; obtain the synchronization information from the content provider system based upon information in the top level index file; and store the synchronization information in the memory of the playback device.
20. A non-transitory machine readable medium containing processor instructions, where execution of the instructions by a processor causes the processor to perform a process comprising: storing synchronization information in a memory wherein the synchronization information associates a presentation time of each of one or more of a plurality of segments of a portion of audio content with a presentation time of a set of one or more frames, where the set of frames comprises a predetermined subset of frames of a portion of video content to present in a trick-play mode; receiving a command for playback of media content in the trick-play mode; determining a first frame from the predetermined subset of frames of the portion of video content to present in the trick-play mode; determining each segment from the plurality of segments of the portion of audio content associated with the first frame from the synchronization information stored in memory; presenting each of the segments of the plurality of segments of audio content associated with the first frame during playback in the trick-play mode; determining a next frame from the predetermined subset of the portion of video content to present in the trick-play mode; determining each segment from the plurality of segments of the portion of audio content associated with the next frame from the synchronization information stored in memory; and presenting each of the segments of the plurality of segments of audio content associated with the next frame during playback in the trick-play mode.

CROSS-REFERENCE TO RELATED APPLICATIONS

The current application is a continuation of U.S. patent application Ser. No. 16/879,440, filed May 20, 2020, entitled “Systems and Methods for Providing Audio Content During Trick-Play Playback” to Frantz et al., which is a continuation of U.S. patent application Ser. No. 16/298,747, filed Mar. 11, 2019, entitled “Systems and Methods for Providing Audio Content During Trick-Play Playback” to Frantz et al., which is a continuation of U.S. patent application Ser. No. 15/163,370, filed May 24, 2016, entitled “Systems and Methods for Providing Audio Content During Trick-Play Playback” to Frantz et al., the disclosures of which are incorporated herein by reference in their entirety.

2011103364	Aug 2011	WO
2012094171	Jul 2012	WO
2016001051	Jan 2016	WO
2017205028	Nov 2017	WO

Non-Patent Literature Citations (96)

Entry
Extended European Search Report for European Application No. 17803261.1, Search completed Sep. 24, 2019, dated Oct. 2, 2019, 7 Pgs.
Information Technology—MPEG Systems Technologies—Part 7: Common Encryption in ISO Base Media File Format Files (ISO/IEC 23001-7), Apr. 2015, 24 pgs.
International Preliminary Report on Patentability for International Application PCT/US2013/042105, Report dated Dec. 16, 2014, dated Dec. 24, 2014, 8 Pgs.
International Preliminary Report on Patentability for International Application PCT/US2017/031097, Report dated Nov. 27, 2018, dated Dec. 6, 2018, 7 Pgs.
International Search Report and Written Opinion for International Application No. PCT/US13/42105, International Filing Date May 21, 2013, Search Completed Nov. 14, 2013, dated Dec. 3, 2013, 9 pgs.
International Search Report and Written Opinion for International Application No. PCT/US2017/031097, Search completed Jun. 29, 2017, dated Jul. 21, 2017, 8 Pgs.
ISO/IEC 14496-12 Information technology—Coding of audio-visual objects—Part 12: ISO base media file format, Feb. 2004 (“MPEG-4 Part 12 Standard”), 62 pgs.
ISO/IEC 14496-12:2008(E) Information Technology—Coding of Audio-Visual Objects Part 12: ISO Base Media File Format, Oct. 2008, 120 pgs.
ISO/IEC FCD 23001-6 MPEG systems technologies Part 6: Dynamic adaptive streaming over HTTP (DASH), Jan. 28, 2011, 86 pgs.
Microsoft Corporation, Advanced Systems Format (ASF) Specification, Revision 01.20.03, Dec. 2004, 121 pgs.
MPEG-DASH presentation at Streaming Media West 2011, Nov. 2011, 14 pgs.
Pomelo, LLC Tech Memo, Analysis of Netflix's Security Framework for ‘Watch Instantly’ Service, Mar.-Apr. 2009, 18 pgs.
Server-Side Stream Repackaging (Streaming Video Technologies Panorama, Part 2), Jul. 2011, 15 pgs.
Text of ISO/IEC 23001-6: Dynamic adaptive streaming over HTTP (DASH), Oct. 2010, 71 pgs.
Universal Mobile Telecommunications System (UMTS), ETSI TS 126 233 V9.1.0 (Jun. 2011) 3GPP TS 26.233 version 9.1.0 Release 9, 18 pgs.
Universal Mobile Telecommunications System (UMTS); ETSI TS 126 244 V9.4.0 (May 2011) 3GPP TS 26.244 version 9.4.0 Release 9, 58 pgs.
“Apple HTTP Live Streaming specification”, Aug. 2017, 60 pgs.
“Data Encryption Decryption using AES Algorithm, Key and Salt with Java Cryptography Extension”, Available at https://www.digizol.com/2009/10/java-encrypt-decrypt-jce-salt.html, Oct. 2009, 6 pgs.
“Delivering Live and On-Demand Smooth Streaming”, Microsoft Silverlight, 2009, 28 pgs.
“HTTP Based Adaptive Streaming over HSPA”, Apr. 2011, 73 pgs.
“HTTP Live Streaming”, Mar. 2011, 24 pgs.
“HTTP Live Streaming”, Sep. 2011, 33 pgs.
“Information Technology—Coding of Audio Visual Objects—Part 2: Visual”, International Standard, ISO/IEC 14496-2, Third Edition, Jun. 1, 2004, pp. 1-724. (presented in three parts).
“Java Cryptography Architecture API Specification & Reference”, Available at https://docs.oracle.com/javase/1.5.0/docs/guide/security/CryptoSpec.html, Jul. 25, 2004, 68 pgs.
“Java Cryptography Extension, javax.crypto.Cipher class”, Available at https://docs.oracle.com/javase/1.5.0/docs/api/javax/crypto/Cipher.html, 2004, 24 pgs.
“JCE Encryption—Data Encryption Standard (DES) Tutorial”, Available at https://mkyong.com/java/jce-encryption-data-encryption-standard-des-tutorial/, Feb. 25, 2009, 2 pgs.
“Live and On-Demand Video with Silverlight and IIS Smooth Streaming”, Microsoft Silverlight, Windows Server Internet Information Services 7.0, Feb. 2010, 15 pgs.
“Microsoft Smooth Streaming specification”, Jul. 22, 2013, 56 pgs.
“MPEG-2, Part 1, ISO/IEC 13818-1”, Information technology—Generic Coding of Moving Pictures and Associated Audio: Systems, 161 pgs., Nov. 13, 1994.
“MPEG-4, Part 14, ISO/IEC 14496-14”, Information technology—Coding of audio-visual objects, 18 pgs., Nov. 15, 2003.
“OpenDML AVI File Format Extensions Version 1.02”, OpenDMLAVI MJPEG File Format Subcommittee. Last revision: Feb. 28, 1996. Reformatting: Sep. 1997, 42 pgs.
“Single-Encode Streaming for Multiple Screen Delivery”, Telestream Wowza Media Systems, 2009, 6 pgs.
“The MPEG-DASH Standard for Multimedia Streaming Over the Internet”, IEEE MultiMedia, vol. 18, No. 4, 2011, 7 pgs.
“Windows Media Player 9”, Microsoft, Mar. 23, 2017, 3 pgs.
Abomhara et al., “Enhancing Selective Encryption for H.264/AVC Using Advanced Encryption Standard”, International Journal of Computer Theory and Engineering, Apr. 2010, vol. 2, No. 2, pp. 223-229.
Alattar et al., “Improved selective encryption techniques for secure transmission of MPEG video bit-streams”, In Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348), vol. 4, IEEE, 1999, pp. 256-260.
Antoniou et al., “Adaptive Methods for the Transmission of Video Streams in Wireless Networks”, 2015, 50 pgs.
Apostolopoulos et al., “Secure Media Streaming and Secure Transcoding”, Multimedia Security Technologies for Digital Rights Management, 2006, 33 pgs.
Asai et al., “Essential Factors for Full-Interactive VOD Server: Video File System, Disk Scheduling, Network”, Proceedings of Globecom '95, Nov. 14-16, 1995, 6 pgs.
Beker et al., “Cipher Systems, The Protection of Communications”, 1982, 40 pgs.
Bocharov et al., “Portable Encoding of Audio-Video Objects, The Protected Interoperable File Format (PIFF)”, Microsoft Corporation, First Edition Sep. 8, 2009, 30 pgs.
Bulterman et al., “Synchronized Multimedia Integration Language (SMIL 3.0)”, W3C Recommendation, Dec. 1, 2008, https://www.w3.org/TR/2008/REC-SMIL3-20081201/, 321 pgs. (presented in five parts).
Cahill et al., “Locally Adaptive Deblocking Filter for Low Bit Rate Video”, Proceedings 2000 International Conference on Image Processing, Sep. 10-13, 2000, Vancouver, BC, Canada, 4 pgs.
Candelore, U.S. Appl. No. 60/372,901, filed Apr. 17, 2002, 5 pgs.
Chaddha et al., “A Frame-work for Live Multicast of Video Streams over the Internet”, Proceedings of 3rd IEEE International Conference on Image Processing, Sep. 19, 1996, Lausanne, Switzerland, 4 pgs.
Cheng, “Partial Encryption for Image and Video Communication”, Thesis, Fall 1998, 95 pgs.
Cheng et al., “Partial encryption of compressed images and videos”, IEEE Transactions on Signal Processing, vol. 48, No. 8, Aug. 2000, 33 pgs.
Cheung et al., “On the Use of Destination Set Grouping to Improve Fairness in Multicast Video Distribution”, Proceedings of IEEE INFOCOM'96, Conference on Computer Communications, vol. 2, IEEE, 1996, 23 pgs.
Collet, “Delivering Protected Content, An Approach for Next Generation Mobile Technologies”, Thesis, 2010, 84 pgs.
Diamantis et al., “Real Time Video Distribution using Publication through a Database”, Proceedings SIBGRAPI'98. International Symposium on Computer Graphics, Image Processing, and Vision (Cat. No. 98EX237), Oct. 1998, 8 pgs.
Dworkin, “Recommendation for Block Cipher Modes of Operation: Methods and Techniques”, NIST Special Publication 800-38A, 2001, 66 pgs.
Fang et al., “Real-time deblocking filter for MPEG-4 systems”, Asia-Pacific Conference on Circuits and Systems, Oct. 28-31, 2002, Bali, Indonesia, pp. 541-544.
Fecheyr-Lippens, “A Review of HTTP Live Streaming”, Jan. 2010, 38 pgs.
Fielding et al., “Hypertext Transfer Protocol—HTTP/1.1”, Network Working Group, RFC 2616, Jun. 1999, 114 pgs.
Fukuda et al., “Reduction of Blocking Artifacts by Adaptive DCT Coefficient Estimation in Block-Based Video Coding”, Proceedings 2000 International Conference on Image Processing, Sep. 10-13, 2000, Vancouver, BC, Canada, pp. 969-972.
Huang, U.S. Pat. No. 7,729,426, U.S. Appl. No. 11/230,794, filed Sep. 20, 2005, 143 pgs.
Huang et al., “Adaptive MLP post-processing for block-based coded images”, IEEE Proceedings—Vision, Image and Signal Processing, vol. 147, No. 5, Oct. 2000, pp. 463-473.
Huang et al., “Architecture Design for Deblocking Filter in H.264/JVT/AVC”, 2003 International Conference on Multimedia and Expo., Jul. 6-9, 2003, Baltimore, MD, 4 pgs.
Jain et al., U.S. Appl. No. 61/522,623, filed Aug. 11, 2011, 44 pgs.
Jung et al., “Design and Implementation of an Enhanced Personal Video Recorder for DTV”, IEEE Transactions on Consumer Electronics, vol. 47, No. 4, Nov. 2001, 6 pgs.
Kalva, Hari “Delivering MPEG-4 Based Audio-Visual Services”, 2001, 113 pgs.
Kang et al., “Access Emulation and Buffering Techniques for Streaming of Non-Stream Format Video Files”, IEEE Transactions on Consumer Electronics, vol. 43, No. 3, Aug. 2001, 7 pgs.
Kim et al., “A Deblocking Filter with Two Separate Modes in Block-Based Video Coding”, IEEE transactions on circuits and systems for video technology, vol. 9, No. 1, 1999, pp. 156-160.
Kim et al., “Tree-Based Group Key Agreement”, Feb. 2004, 37 pgs.
Laukens, “Adaptive Streaming—A Brief Tutorial”, EBU Technical Review, 2011, 6 pgs.
Legault et al., “Professional Video Under 32-bit Windows Operating Systems”, SMPTE Journal, vol. 105, No. 12, Dec. 1996, 10 pgs.
Li et al., “Layered Video Multicast with Retransmission (LVMR): Evaluation of Hierarchical Rate Control”, Proceedings of IEEE INFOCOM'98, the Conference on Computer Communications. Seventeenth Annual Joint Conference of the IEEE Computer and Communications Societies. Gateway to the 21st Century, Cat. No. 98, vol. 3, 1998, 26 pgs.
List et al., “Adaptive deblocking filter”, IEEE transactions on circuits and systems for video technology, vol. 13, No. 7, Jul. 2003, pp. 614-619.
Massoudi et al., “Overview on Selective Encryption of Image and Video: Challenges and Perspectives”, EURASIP Journal on Information Security, Nov. 2008, 18 pgs.
McCanne et al., “Receiver-driven Layered Multicast”, Conference proceedings on Applications, technologies, architectures, and protocols for computer communications, Aug. 1996, 14 pgs.
Meier, “Reduction of Blocking Artifacts in Image and Video Coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, No. 3, Apr. 1999, pp. 490-500.
Nelson, “Smooth Streaming Deployment Guide”, Microsoft Expression Encoder, Aug. 2010, 66 pgs.
Newton et al., “Preserving Privacy by De-identifying Facial Images”, Carnegie Mellon University School of Computer Science, Technical Report, CMU-CS-03-119, Mar. 2003, 26 pgs.
O'Brien, U.S. Appl. No. 60/399,846, filed Jul. 30, 2002, 27 pgs.
O'Rourke, “Improved Image Decompression for Reduced Transform Coding Artifacts”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 5, No. 6, Dec. 1995, pp. 490-499.
Park et al., “A postprocessing method for reducing quantization effects in low bit-rate moving picture coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, No. 1, Feb. 1999, pp. 161-171.
Richardson, “H.264 and MPEG-4 Video Compression”, Wiley, 2003, 306 pgs. (presented in 2 parts).
Schulzrinne et al., “Real Time Streaming Protocol (RTSP)”, Internet Engineering Task Force, RFC 2326, Apr. 1998, 80 pgs.
Shinyusha, “Buu Player”, Windows 100%, Jan. 1, 2010, vol. 13, No. 1, p. 14.
Sima et al., “An Efficient Architecture for Adaptive Deblocking Filter of H.264 AVC Video Coding”, IEEE Transactions on Consumer Electronics, vol. 50, No. 1, Feb. 2004, pp. 292-296.
Spanos et al., “Performance Study of a Selective Encryption Scheme for the Security of Networked, Real-Time Video”, Proceedings of the Fourth International Conference on Computer Communications and Networks, IC3N'95, Sep. 20-23, 1995, Las Vegas, NV, pp. 2-10.
Srinivasan et al., “Windows Media Video 9: overview and applications”, Signal Processing: Image Communication, 2004, 25 pgs.
Stockhammer, “Dynamic Adaptive Streaming over HTTP—Standards and Design Principles”, Proceedings of the second annual ACM conference on Multimedia, Feb. 2011, pp. 133-145.
Timmerer et al., “HTTP Streaming of MPEG Media”, Proceedings of Streaming Day, 2010, 4 pgs.
Tiphaigne et al., “A Video Package for Torch”, Jun. 2004, 46 pgs.
Trappe et al., “Key Management and Distribution for Secure Multimedia Multicast”, IEEE Transactions on Multimedia, vol. 5, No. 4, Dec. 2003, pp. 544-557.
Van Deursen et al., “On Media Delivery Protocols in the Web”, 2010 IEEE International Conference on Multimedia and Expo, Jul. 19-23, 2010, 6 pgs.
Ventura, Guillermo Albaida “Streaming of Multimedia Learning Objects”, AG Integrated Communication System, Mar. 2003, 101 pgs.
Waggoner, “Compression for Great Digital Video”, 2002, 184 pgs.
Watanabe et al., “MPEG-2 decoder enables DTV trick plays”, Researcher System LSI Development Lab, Fujitsu Laboratories Ltd., Kawasaki, Japan, Jun. 2001, 2 pgs.
Wiegand, “Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG”, Jan. 2002, 70 pgs.
Willig et al., U.S. Appl. No. 61/409,285, filed Nov. 2, 2010, 43 pgs.
Yang et al., “Projection-Based Spatially Adaptive Reconstruction of Block-Transform Compressed Images”, IEEE Transactions on Image Processing, vol. 4, No. 7, Jul. 1995, pp. 896-908.
Yang et al., “Regularized Reconstruction to Reduce Blocking Artifacts of Block Discrete Cosine Transform Compressed Images”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, No. 6, Dec. 1993, pp. 421-432.
Yu et al., “Video deblocking with fine-grained scalable complexity for embedded mobile computing”, Proceedings 7th International Conference on Signal Processing, Aug. 31-Sep. 4, 2004, pp. 1173-1178.
Zakhor, “Iterative Procedures for Reduction of Blocking Effects in Transform Image Coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 2, No. 1, Mar. 1992, pp. 91-95.

Related Publications (1)

	Number	Date	Country
	20220046296 A1	Feb 2022	US

Continuations (3)

	Number	Date	Country
Parent	16879440	May 2020	US
Child	17352811		US
Parent	16298747	Mar 2019	US
Child	16879440		US
Parent	15163370	May 2016	US
Child	16298747		US

Systems and methods for providing audio content during trick-play playback

Abstract