The present invention relates generally to multimedia distribution and more specifically to interactive multimedia distribution systems.
Audio and/or video information can be provided in a variety of forms to consumer electronics devices, which can then display the information. A consumer electronics device that requires media in a fixed form such as a compact disk (CD) or digital video disk (DVD) is limited to playing the CDs or DVDs available to the user. In order to increase the amount of audio and/or video information accessible to a user at any given time, manufacturers of consumer electronics have sought to transfer audio and/or video information contained on fixed media to a storage device within the consumer electronics device. Systems that use internal storage provide added convenience, but typically limit the user to displaying the audio and/or video information contained on the storage device. Another approach to making more audio and/or video information available to users has been to provide the consumer electronics device with network connectivity. When a consumer electronics device is connected to a network, the audio and/or video information can be stored remotely and provided as desired to the consumer electronics device via the network. In many instances, consumer electronics devices are provided with the ability to extract audio and/or video information from fixed media, store audio and/or video information and obtain audio and/or video information via a network.
Embodiments of the present invention distribute multimedia over a network. In one aspect, embodiments of the present invention are capable of transcoding video encoded in a first format for distribution in accordance with a predetermined multi-channel protocol. In another aspect, embodiments of the present invention include a mechanism for automatic system updates. One embodiment of the invention includes a server connected to a client via a network and at least one storage device containing audio, video and/or overlay information formatted in accordance with a first format. In addition, the client includes a storage device that stores information indicative of the audio, video and/or overlay formats that the client is capable of decoding, and the server is configured to transmit audio, video, overlay and control information via separate audio, video, overlay and control channels.
In a further embodiment, the server is configured to query the client to obtain the information indicative of the audio, video and/or overlay formats that the client is capable of decoding.
In another embodiment, the server is configured to transcode at least one of the stored audio, video and overlay information into a second format and the information indicative of the audio, video and/or overlay formats indicates that the client is capable of decoding audio, video or overlay encoded in the second format.
In a still further embodiment, the server is configured to obtain a list of available updates and the server is configured to determine updates that can be applied to the client based upon the information indicative of the audio, video and/or overlay formats that the client is capable of decoding.
Still another embodiment also includes a third device including a storage device that stores information concerning the capabilities of the third device. In addition, the server is configured to query the third device to obtain the stored information concerning the capabilities of the third device.
In a yet further embodiment, the server is further configured to determine the updates that can be applied to the client with reference to the information obtained from the third device concerning the capabilities of the third device.
In yet another embodiment, the server includes a storage device that stores information concerning the capabilities of the server.
In a further embodiment again, the server is configured to determine the updates that can be applied to the client with reference to the information concerning the capabilities of the server.
Another embodiment again of the invention includes a processor, a network interface configured to communicate with the processor and to receive packets of audio, video, overlay and control information on separate channels and a storage device containing information concerning the audio, video and overlay information formats that can be decoded by the processor.
In an additional further embodiment, the processor is configured to respond to a query received via the network interface by transmitting the stored information concerning the audio, video and overlay information formats that can be decoded by the processor via the network interface.
In another additional embodiment, the stored information is stored as an XML file.
A still further embodiment again includes a processor and a network interface in communication with the processor. In addition, the processor is configured to receive audio, video and overlay information encoded in a first format and transcode at least one of the audio, video and overlay information into a second format and the processor and network interface device are configured to transmit audio, video, overlay and control information.
In still another embodiment again, the processor and the network interface device are configured to transmit a query requesting information.
In a yet further embodiment again, the processor and the network interface device are configured to receive information indicative of the capabilities of an external device.
In yet another embodiment again, the processor is configured to parse the information to obtain a list of capabilities.
An additional further embodiment again includes a processor and a network interface in communication with the processor. In addition, the processor and network interface device are configured to obtain a list of available updates, the processor and network interface device are configured to query external devices concerning their capabilities, the processor is configured to determine updates to be provided to external devices based upon the list of available updates and the capabilities of the external devices and the processor and network interface device are configured to transmit audio, video, overlay and control information.
Another additional embodiment again further includes a storage device that contains information concerning the capabilities of the server. In addition, the processor is further configured to determine updates to be provided to external devices based upon the stored information concerning the capabilities of the server.
In another further embodiment, the capabilities of the external device include the communications protocol supported by each device, at least one communications protocol is supported by each available update and the processor is configured to determine the updates to apply to external devices by ensuring that each updated device will support the same communications protocol.
An embodiment of the method of the invention includes retrieving audio, video and overlay information, transcoding at least one of the audio, video and overlay information, transmitting audio, video, overlay and control information and time stamps associated with one or more of the audio, video, overlay and control information, receiving the audio, video, overlay and control information and the time stamps associated with one or more of the audio, video, overlay and control information, queuing the received information in separate audio, video and overlay queues, processing the queued information based on the time stamps associated with the information, transmitting a report indicating at least one time stamp of the processed information, receiving the report and recording information concerning the at least one time stamp contained within the received report.
A further embodiment of the method of the invention includes determining an appropriate format in which to transcode the audio, video or overlay information.
Another embodiment of the method of the invention includes determining the available updates and the version of the communication protocol supported in each update, determining the capabilities of each device including the version of the communication protocol supported by each device, determining the latest version of the communication protocol that can be supported by all devices provided the necessary updates are performed and performing the necessary updates.
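By way of illustration only, the update selection described above can be sketched as follows. The sketch assumes that each device and each update advertises a set of integer protocol versions and that an update applies to every device of a matching type; the `Device` and `Update` records and the `plan_updates` function are hypothetical names introduced for this example and are not part of the described system.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Update:
    name: str
    device_type: str           # hypothetical: kind of device the update applies to
    protocols: FrozenSet[int]  # protocol versions supported once the update is applied


@dataclass(frozen=True)
class Device:
    name: str
    device_type: str
    protocols: FrozenSet[int]  # protocol versions supported as currently installed


def plan_updates(
    devices: List[Device], updates: List[Update]
) -> Optional[Tuple[int, Dict[str, Optional[Update]]]]:
    """Find the latest protocol version every device can support.

    Returns (version, plan), where plan maps each device name to the update
    it needs (None if no update is needed), or None if no common version exists.
    """
    candidates = set()
    for device in devices:
        candidates |= device.protocols
    for update in updates:
        candidates |= update.protocols

    # Try versions from newest to oldest and stop at the first one that works.
    for version in sorted(candidates, reverse=True):
        plan: Dict[str, Optional[Update]] = {}
        for device in devices:
            if version in device.protocols:
                plan[device.name] = None  # already supports this version
                continue
            match = next(
                (u for u in updates
                 if u.device_type == device.device_type and version in u.protocols),
                None,
            )
            if match is None:
                break  # this device cannot reach the version; try an older one
            plan[device.name] = match
        else:
            return version, plan
    return None
```

In this sketch a single device that cannot be updated holds every other device at the older version, which reflects the requirement that each updated device support the same communications protocol.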
a is a flow chart showing a process in accordance with an embodiment of the present invention for transcoding data for transmission via multiple communications channels;
Turning now to the drawings, embodiments of the present invention include at least one server connected to at least one client via a network and enable the distribution of audio and/or video information. In one aspect of many embodiments, the server can transmit a variety of information to a client. Each type of information is typically transmitted on a separate channel. In one embodiment, the information transmitted on the channels is obtained by transcoding stored information. In other instances, a stream of information is transcoded in real time. In another aspect of many embodiments, the server selects information to send to the client in response to user instructions forwarded to the server by the client on a control channel. In many embodiments, the servers can create the impression to the user that they are navigating through an interactive graphical user interface by providing an appropriate sequence of audio, video and/or overlay information to a client for display in response to a user's instructions. In order to achieve interactivity, the server typically maintains information concerning the state of the user interface being displayed by the client. In addition, the server can control the configuration of a client to reduce latency when transitioning from one user interface state to another in response to a user input. In numerous embodiments, the system is capable of distributing software updates.
An embodiment of a distribution system in accordance with the present invention is illustrated in
Although some clients possess extremely sophisticated computational abilities, many other clients have limited computational and storage capabilities. Therefore, clients in accordance with the present invention typically execute a very simple routine that does not vary directly in response to most user instructions. The bulk of the processing is shifted to the servers, which handle user input and implement the system's interactive functionality. The servers can control the information displayed by the clients in a very precise manner, which enables the servers to respond to users' requests by ensuring that the required information is displayed by the client almost immediately. Typically, the clients do not possess the capability to interpret the majority of user requests. The clients simply forward user requests to the server and display information provided to them by the server in the manner directed by the server. The operation of the server, network and clients is discussed below.
The servers 12, network 14 and clients are configured to enable the servers to transmit information to clients via the network. In one embodiment, the server and the clients communicate over a fixed network using the TCP/IP protocol. In other embodiments, other network communication protocols can be used and fixed connections can be replaced with wireless connections. The term network is used throughout to refer to any connectivity between a server and a client including a direct connection, a home network, a local area network, a wide area network, a private network and networks of networks such as the Internet.
The communication channels established between a server and a client in accordance with an embodiment of the present invention are conceptually illustrated in
The video channel 17b is used to communicate packetized video information from the server to the client. As will be discussed in greater detail below, the video channel is configured in accordance with the nature of video contained within the packets of video information. The packets of video information typically contain encoded frames of video. The frames may be part of a feature presentation or part of a menu or user interface. The term “feature presentation” is used throughout to describe a continuous video sequence such as a feature length film that typically plays linearly and does not require user interaction. The term “feature presentation” is meant in a broad sense and is not limited to feature length films, encompassing all types of prerecorded video and broadcast video streams.
The audio channel 17a is used to communicate packetized audio information. As with the video channel, the server specifies the characteristics of the audio channel. The audio data transmitted by the audio channel does not necessarily accompany video or overlay information. Many embodiments of the present invention offer the capability of distributing sound recordings (e.g., music). The audio information can also accompany video information transmitted on the video channel. In many instances the audio information is the sound track accompanying a “feature presentation”. However, the audio information can also be a sound effect forming part of a menu or user interface.
The overlay channel 17c is a channel that can be used by the server to transmit overlay information to the client. Overlays are graphics or text that can either be superimposed on frames of video or are themselves an entire picture that can be displayed without background video. Examples of overlays include subtitles accompanying a “feature presentation” or a highlighted menu option that is part of a menu or user interface. Overlay information can be encoded graphically or as text. In one embodiment, overlays are encoded in accordance with the JPEG File Interchange Format specified by the Joint Photographic Experts Group. In another embodiment, overlays are encoded as bit maps. The nature of the overlay information and of the overlay channel itself is usually specified by the server.
The control channel 19 is a channel that can be used by both the server and the client to transmit control information. Embodiments of systems in accordance with the present invention typically function more effectively when the control channel is configured to reliably communicate information between the server and the client. As will be discussed in greater detail below, the client can use the control channel to forward user instructions and timing information to the server. In turn, the server can use the control channel to establish the audio, video and overlay channels with the client and to provide instructions to the client concerning the manner in which it should display received audio, video and overlay information. In other embodiments, the audio, video and overlay channels are initialized by packets sent over each of the audio, video and overlay channels. The ability of the server and client to communicate over the control channel enables the overall system to interact with users. For example, a client in accordance with an embodiment of the present invention can use the control channel to forward user commands to the server. The server can then respond to the user commands by sending information to the client via the audio, video, overlay and/or control channels. Appropriate selection of the audio, video, overlay and/or control information can achieve such effects as an interactive menu or fast forwarding, pausing or rewinding of a feature presentation. The manner in which interactive features can be implemented in accordance with aspects of embodiments of the present invention is discussed further below.
In many embodiments of the present invention, communication over the network 14 is conducted in accordance with the TCP/IP protocol. In embodiments where the TCP/IP protocol is used, separate channels can be established by assigning a separate port address to each of the channels. In this way, packets of information can be sent across the network and a port address can be used to determine with which channel the packet is associated. In other embodiments, the UDP protocol is used in conjunction with the IP protocol to communicate information over the network. Other protocols can also be used to communicate information over a network in accordance with embodiments of the present invention and any variety of techniques can be used to create separate channels for the communication of audio, video, overlay and/or command information. In other embodiments, a cellular communication protocol can be used to establish the necessary channels between the client and the server. Alternatively, the channels can be formed over a connection that conforms to the IEEE 1394 standard. In other embodiments, other network protocols can be used to communicate audio, video and/or overlay and/or command information. Indeed, different networks can be used to communicate different types of information and/or different sequences of the same type of information. Although many embodiments of the invention include separate channels, several embodiments combine audio, video, overlay and/or control information on a single channel.
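As a minimal sketch of the port-based channel separation described above, the following example associates each received packet with a channel by looking up its destination port. The port numbers and the `demultiplex` function are assumptions chosen for illustration; the description does not specify particular port values.

```python
# Hypothetical port assignments; any distinct port numbers would serve.
CHANNEL_PORTS = {"control": 5000, "audio": 5001, "video": 5002, "overlay": 5003}
PORT_TO_CHANNEL = {port: name for name, port in CHANNEL_PORTS.items()}


def demultiplex(packets):
    """Group (port, payload) pairs into per-channel lists, preserving order.

    Packets addressed to a port with no assigned channel are ignored.
    """
    queues = {name: [] for name in CHANNEL_PORTS}
    for port, payload in packets:
        channel = PORT_TO_CHANNEL.get(port)
        if channel is None:
            continue
        queues[channel].append(payload)
    return queues
```

For example, `demultiplex([(5002, b"frame"), (5001, b"pcm")])` places the first payload in the video list and the second in the audio list.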
The audio, video and overlay information sent by the server to the client via the audio, video and overlay channels determines the information that can be presented to a user by the client. As indicated above, this information can take a variety of forms. For example, the audio, video and overlay information can be associated with a sound recording or a feature presentation. In addition, the audio, video and/or overlay information can be associated with a user interface. In many instances, the audio, video and/or overlay information may not relate to the same content. Examples include overlays containing information about other available programming that are displayed over a feature presentation or symbol overlays that inform the user that a feature presentation is fast forwarding, pausing or being manipulated in some other fashion.
In many embodiments, the server obtains information for transmission by extracting the information from a file containing appropriately encoded audio, video and overlay information. In other embodiments the encoded audio, video and overlay information is received by the server as a stream of data. In several embodiments, the server receives audio, video and/or overlay information encoded in a first format and transcodes the audio, video and/or overlay information into a format appropriate for transmission. The first format may not be suitable for transmission for the reason that the client intended to receive the information is not capable of decoding information encoded in the first format. In many embodiments, quality of service requirements can cause a server to transcode information encoded in a first format to a format requiring a different data transmission rate. In embodiments that utilize quality of service determinations, the clients can provide information to the servers that enable the servers to make quality of service determinations. Another reason the first format may not be appropriate is that the server cannot directly extract audio, video and/or overlay information for transmission on separate channels, when the audio, video and/or overlay information is encoded in the first file format. Transcoding is discussed further below.
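The transcoding decision described above can be illustrated with the following sketch, offered under stated assumptions rather than as the actual selection logic. It assumes the server knows the set of formats the client can decode and an approximate data rate for each of those formats; the format names, the bitrate table and the `choose_format` function are hypothetical.

```python
def choose_format(source_format, client_formats, bitrate_kbps, max_kbps=None):
    """Pick the format in which to transmit.

    Returns source_format when the client can decode it within the rate limit
    (no transcoding needed), otherwise the highest-rate decodable format that
    fits the limit, or None when nothing is suitable.

    Assumes bitrate_kbps has an entry for every format in client_formats.
    """
    def acceptable(fmt):
        if fmt not in client_formats:
            return False
        return max_kbps is None or bitrate_kbps[fmt] <= max_kbps

    if acceptable(source_format):
        return source_format
    candidates = [fmt for fmt in client_formats if acceptable(fmt)]
    if not candidates:
        return None
    return max(candidates, key=lambda fmt: bitrate_kbps[fmt])
```

Passing a `max_kbps` value models a quality of service constraint: a source format that the client can decode may still be transcoded when its data rate exceeds what the network can currently sustain.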
Having generally discussed the characteristics typical of embodiments of the system of the present invention, a closer examination of individual components of these systems is warranted. A server in accordance with an embodiment of the present invention is shown in
The storage device 24 can contain one or more data files. The data files may include one or more audio tracks, one or more pictures, one or more feature presentations and audio, video and/or overlays associated with one or more user interfaces. In one embodiment of the present invention, a stored data file can include more than one video track, more than one audio track, more than one overlay track and multimedia associated with a graphical user interface. In many embodiments of the present invention, the storage device 24 can include multimedia files similar to the multimedia files described in U.S. application Ser. No. 11/016,184 entitled “Multimedia Distribution System” to Van Zoest et al. filed on Dec. 17, 2004, the disclosure of which is incorporated herein by reference in its entirety.
In embodiments where the server is capable of transcoding audio, video and/or overlay stored on the storage device 24 or from another source for transmission, the transcoding can be performed by configuring the processor 21 using appropriate software. In other embodiments, the transcoding is performed using application specific circuitry within the server or the combination of a microprocessor and application specific circuitry. In one embodiment, a microprocessor decodes audio, video and/or overlay information and application specific circuitry encodes the decoded audio, video and/or overlay information for transmission. As indicated above, the transmitted audio, video and/or overlay information can be stored remotely. When the audio, video and/or overlay information is stored remotely, the server can receive the information and transcode the information in real time into a format appropriate for transmission on separate audio, video, overlay and/or control channels.
In embodiments of the present invention that communicate in accordance with the TCP/IP protocol, the network interface device 26 and/or the processor 21 implement a TCP/IP protocol stack. The TCP/IP protocol stack handles the transmission of information to and from the server on each of the appropriate channels. In other embodiments the network interface device can be implemented to support other protocols.
As an aside, one of ordinary skill in the art would appreciate that the server shown in
A client in accordance with an embodiment of the present invention is illustrated in
The graphics accelerator 44 can be used to perform repetitive processing associated with generating video frames. The graphics accelerator can also act as a hub connecting the microprocessor to video RAM 46, an I/O controller 48 and a video converter 50. The video RAM 46 can be utilized by the graphics accelerator to store information associated with the generation of video frames. The video frames can be provided to a video converter 50, which can convert the digital information into an appropriate video format for rendering by a rendering device, such as a television or video display/monitor. The format could be an analog format or a digital format. The I/O controller also interfaces with the graphics accelerator and enables the microprocessor and graphics accelerator to address devices including a network interface device 52, an input interface device 54, memory 56 and an audio output device 58 via a bus 60. The architecture shown in
The network interface device 52 can be used to send and receive information via a network. In embodiments where information is communicated via the TCP/IP protocol the network interface device and/or other devices such as the microprocessor implement a TCP/IP protocol stack. In other embodiments, other communication protocols can be used and the network interface device is implemented accordingly.
The input interface device 54 can enable a user to provide instructions to the client 40. In the illustrated embodiment, the input interface device 54 is implemented to enable a user to provide instructions to the client 40 using an infrared (IR) remote control via an IR receiver 62. In other embodiments, other input devices such as a mouse, track ball, bar code scanner, tablet, keyboard and/or voice commands can be used to convey user input to the client 40 and the input interface device 54 is configured accordingly.
The memory 56 typically includes a number of memory devices that can provide either temporary or permanent storage of information. In one embodiment, the memory is implemented as a combination of EEPROM and SRAM. In other embodiments, a single memory component or any variety of volatile and/or non-volatile memory components can be used to implement the memory.
The audio output device 58 can be used to convert digital audio information into a signal capable of producing sound on a rendering device, such as a speaker or sound system. In one embodiment, the audio output device 58 outputs stereo audio in an analog format. In other embodiments, the audio output device can output audio information in any of a variety of analog and/or digital audio formats. In one embodiment, the MP3 audio format specified by the Moving Picture Experts Group (MPEG) is used. In other embodiments, other formats such as the AC3 format specified by the Advanced Television Systems Committee, the AAC format specified by MPEG or the WMA format specified by Microsoft Corporation of Redmond, Wash. can be used.
As will readily be appreciated by one of ordinary skill in the art, any number of configurations can be used to implement a client in accordance with embodiments of the present invention. Clients in accordance with embodiments of the present invention need not include graphics capability or audio capability. In addition, clients in accordance with aspects of many embodiments of the present invention need not accept any user input. For example, user input can be provided directly to the server or to a second client that forwards the user instructions to the necessary server or servers. Alternatively, the client may simply be unable to process or forward user instructions. Embodiments of clients in accordance with the present invention can include any variety of processing components or a single processing component. Indeed any networked consumer electronics or computing device capable of communicating with a server in the manner described herein can be used to implement a client in accordance with aspects of numerous embodiments of the present invention.
In many embodiments of systems in accordance with the present invention, different clients can possess different capabilities. In many embodiments, clients can be configured to store information identifying their capabilities. In several embodiments, the clients include a file containing information in the Extensible Markup Language (XML) specified by the World Wide Web Consortium. The XML file can contain information describing the device capabilities. In many embodiments the XML file describes the playback capabilities of the client. In embodiments where a client can perform transcoding, the server can provide media directly to the client and make decisions with respect to transcoding based upon processor loading or a previously set user configuration. In many embodiments, servers also store files that describe the capabilities of the server.
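For illustration, a capabilities file of the kind described above might be parsed as sketched below. The element and attribute names (`capabilities`, `protocol`, `format`) are assumptions invented for this example, since the description does not define a schema; only the use of XML is taken from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical capabilities document; the schema is assumed for illustration.
SAMPLE_CAPABILITIES = """<capabilities>
  <protocol version="2"/>
  <audio><format>mp3</format><format>aac</format></audio>
  <video><format>mpeg4</format></video>
  <overlay><format>jpeg</format><format>bitmap</format></overlay>
</capabilities>"""


def parse_capabilities(xml_text):
    """Return a dict of decodable formats per media type plus the protocol version."""
    root = ET.fromstring(xml_text)
    caps = {
        kind: [fmt.text.strip() for fmt in root.findall(f"./{kind}/format")]
        for kind in ("audio", "video", "overlay")
    }
    protocol = root.find("protocol")
    caps["protocol_version"] = (
        int(protocol.get("version")) if protocol is not None else None
    )
    return caps
```

A server that queries a client for such a file could use the resulting lists to decide whether stored information must be transcoded before transmission.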
As discussed above, servers in accordance with embodiments of the present invention are capable of providing audio, video and/or overlay information to clients. A client typically initiates the transmission of information by one or more servers. Each transmission can be referred to as a control session and a client can initiate a control session by forming a connection with the control port of a server. The client then requests the initiation of a control session and if the control session is granted, the server establishes channels for audio, video and/or overlay data by sending channel assignment information to the client. Once the audio, video and/or overlay channels are established, the server can commence the transmission of audio, video and/or overlay information to the client. As was also discussed, interactivity can be achieved by the client forwarding user instructions to the server and the server responding by providing appropriate audio, video, overlay and/or control information to the client. In many embodiments, the establishment of audio, video and/or overlay channels need not occur simultaneously and individual channels can be disconnected and reconnected (often to a different server as required). For example, in one embodiment a video channel is connected to enable the display of visual information associated with a user interface. Once a feature is selected the video channel is disconnected and reconnected to another server and an audio channel is established with that same server. Another example in accordance with embodiments of the present invention relates to fast forwarding through a feature that has accompanying subtitles. The overlay channel that is providing the subtitles can be disconnected in response to the fast forward instruction from the user and reconnected to another server that provides an overlay with a fast forward icon.
Alternatively, the same server could provide both the overlays and the fast forward icon and the overlay channel would simply be reallocated. The establishment of a control session, transmission of audio, video and/or overlay information and implementation of interactive features are now considered in more detail.
Once a control channel has been established, the client attempts to initiate (84) a control session with the server via the control channel. The attempt can be made by sending a packet requesting a control session that also contains information concerning the client's available port assignments. The client then waits (86) for the server's response to the request. In one embodiment, the server responds even if a session is denied. In other embodiments the request is assumed to be denied after a predetermined period of time has expired. If the session is denied (88), then the attempt to establish a session has failed. If the attempt is successful, the client typically receives (90) information from the server specifying the frequency with which the client should provide the server with information concerning the internal timer of the client. In other embodiments, the characteristics of the audio, video and/or overlay channels are specified in an XML file located on the client that is provided to the server. The importance of parameters of the data channels and the frequency with which a client reports its internal time value is discussed in greater detail below.
The client also receives (92) port assignments from the server. The port assignments typically include information concerning the parameters of the audio, video or overlays provided on each channel (e.g., audio sample rate or video resolution) and the amount of audio, video or overlay information to buffer. The initialization of the channels also includes an initial time stamp for the information that will be sent on the channel. This time stamp can be used to set the client's internal timer. The client's timer typically is paused until the specified amount of data has been queued and the client commences rendering the queued data.
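The session establishment steps described above can be sketched from the client's perspective as follows. The message layout, the field names and the choice to set the timer from the earliest initial time stamp are all assumptions made for this example; the description specifies what information is exchanged but not how it is encoded.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ChannelAssignment:
    port: int
    buffer_packets: int      # amount of information to queue before rendering
    initial_timestamp: int   # time stamp of the first information on the channel
    params: dict = field(default_factory=dict)  # e.g. sample rate or resolution


@dataclass
class ClientSession:
    report_interval_ms: int  # how often to report the internal timer to the server
    channels: Dict[str, ChannelAssignment]
    timer: int = 0
    timer_running: bool = False  # paused until enough data has been queued


def build_session_request(available_ports):
    """Build the request packet, including the client's available ports."""
    return {"type": "session_request", "available_ports": sorted(available_ports)}


def open_session(response) -> Optional[ClientSession]:
    """Create session state from the server's response.

    A response of None models a timeout, which is treated as a denial.
    """
    if response is None or not response.get("granted"):
        return None
    channels = {
        name: ChannelAssignment(**spec)
        for name, spec in response.get("channels", {}).items()
    }
    # Assumption: the timer starts at the earliest initial time stamp.
    start = min((c.initial_timestamp for c in channels.values()), default=0)
    return ClientSession(
        report_interval_ms=response["report_interval_ms"],
        channels=channels,
        timer=start,
    )
```

The session is created with its timer paused, consistent with the timer remaining paused until the specified amount of data has been queued.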
The initialization can include information concerning how the information arriving on a channel should be handled. In one embodiment, a client can be initialized to render incoming data when the client's timer is greater than or equal to a time stamp associated with the data. In several embodiments, a client can be initialized to render incoming data when the client's timer exactly matches a time stamp associated with the data. In these embodiments, pausing the client's timer can also pause the rendering of data from the channel. Many embodiments enable a client to be initialized to render incoming data as soon as possible after it is received by the client. In many embodiments, the client can be instructed to synchronize audio to video packets. Synchronization of audio to video can enable a client to generate sound effects accompanying transitions or actions in a user interface.
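The three handling modes described above can be expressed as a simple rendering test, sketched below. The `RenderPolicy` names and the `should_render` function are hypothetical, and the treatment of a paused timer (timer-driven modes render nothing while paused) is an assumption consistent with, but not dictated by, the description.

```python
from enum import Enum


class RenderPolicy(Enum):
    AT_OR_AFTER = "at_or_after"  # render once the timer reaches the time stamp
    EXACT = "exact"              # render only when the timer equals the time stamp
    ASAP = "asap"                # render as soon as possible after receipt


def should_render(policy, timer, timestamp, timer_running=True):
    """Decide whether data with the given time stamp should be rendered now."""
    if policy is RenderPolicy.ASAP:
        return True
    if not timer_running:
        # Assumption: pausing the timer pauses timer-driven rendering.
        return False
    if policy is RenderPolicy.AT_OR_AFTER:
        return timer >= timestamp
    return timer == timestamp
```

An exact-match channel drops back to waiting whenever the timer is paused, whereas an as-soon-as-possible channel, such as one carrying a menu sound effect, is unaffected by the timer.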
In addition to reducing the processing required of the client, providing the ability for a server to manage a client's queues enables the server to configure the client's queues in anticipation of audio, video and/or overlay information that the server is about to send to the client. If the audio, video and/or overlay information being sent by the server is part of a menu for instance, then the server can configure the client's queues so that the client is in a constant ready start state. The term “constant ready start state” describes a state where the client does not queue any information or queues very little information so that information received from the server is processed almost immediately and rendered. Alternatively, when the server is about to send audio, video and/or overlay information associated with a feature presentation then the server can configure the client to queue sufficient information to increase the likelihood that the audio, video and/or overlay will play smoothly. So-called smooth play refers to the display of frames at appropriately spaced time intervals with synchronized audio and overlays. Smooth play typically requires that the information required for rendering be available to the client when it is required. Increasing the length of the client's queues can accommodate variations in network delays that might otherwise cause packets to arrive after they are required by the client. If audio, video and/or overlay information is not available for rendering, then the user can experience a freeze in the image, an interruption to an audio track or an overlay that is not synchronized with the accompanying video or audio.
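The effect of queue length on smooth play can be illustrated with a small simulation. The model below is an assumption made for the example: packets are sent at a fixed interval, each experiences an individual network delay, playback starts once `queue_depth` packets have arrived, and a packet is counted as late if it arrives after the time at which it is needed. The function name and parameters are hypothetical.

```python
def count_late_packets(delays, interval, queue_depth):
    """Count packets that arrive after they are needed for rendering.

    delays: per-packet network delay, in the same time units as `interval`.
    interval: spacing between packets when sent and when rendered.
    queue_depth: number of packets (at least 1) queued before playback starts.
    """
    arrivals = [i * interval + delay for i, delay in enumerate(delays)]
    # Playback starts when the queue_depth-th packet has arrived.
    start = sorted(arrivals)[queue_depth - 1]
    return sum(
        1 for i, arrival in enumerate(arrivals) if arrival > start + i * interval
    )
```

With delays of `[0, 10, 90, 0, 0]` and an interval of 40, a queue depth of 1 leaves two packets late, whereas a queue depth of 3 absorbs the delay variation at the cost of a later start.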
In many embodiments, the server can constantly monitor and vary the amount of information queued by the client in order to achieve predetermined quality of service parameters. In a number of embodiments, the server can preserve quality of service by transcoding the data to a lower data rate in response to network congestion. In several embodiments, time stamp reports are used by the server to monitor system latency and manage the client's queues accordingly. In other embodiments, other information obtained from the client or another source can be used to monitor the quality of service provided by the system.
Following the port assignments, the client starts receiving (94) data on the audio, video and/or overlay channels from the server. The client handles the packets and performs the necessary reporting of time stamps to the server. The client can also receive (96) control instructions from the server. If a control instruction is received, the client responds (98) by handling the instruction.
The client can also receive (100) a user instruction. When the client receives a user instruction, the client typically forwards (102) the user instruction to the server. The client continues to display the multimedia information provided by the server until the control session is terminated.
In many embodiments, the client is only capable of responding to a very limited set of user instructions. For example, a client may be able to respond to volume control and power on/off instructions. If an instruction is received that relates to the rendered audio, video and/or overlays, then the client will typically respond by forwarding the instruction to the server.
In one embodiment, the client forwards all user instructions that are directed toward interrupting or altering the way in which audio, video and/or overlay information is provided to the rendering device(s). In further embodiments, the client forwards all user instructions related to the navigation of a menu or user interface to the server. In additional embodiments, the client forwards all user instructions that relate to the future speed and/or direction with which audio, video and/or overlays should be rendered by the rendering device. Examples of such instructions include pause, slow advance, slow rewind, fast forward and fast rewind. In further embodiments again, the client forwards all user instructions requesting that the audio, video and/or overlays rendered by the rendering device(s) progress in a non-linear fashion. Examples of such instructions include instructions to skip between chapters or scenes in a feature presentation or to skip between tracks or randomly play tracks of a sound recording.
In another embodiment, the client only handles user instructions that are independent of the audio, video and/or overlay being rendered by the rendering device(s) at the time the user instruction is received. An instruction is typically considered to be dependent upon the audio, video and/or overlay being rendered if the instruction in any way influences the content, speed or direction of audio, video and/or overlays rendered in the future. Examples of independent instructions include power on/off, volume control, mute, brightness control and contrast control.
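A client that handles only independent instructions could route user instructions along the following lines. The instruction names and the return values are hypothetical labels used for illustration; only the split between independent and dependent instructions is taken from the description above.

```python
# Instructions that do not influence the content, speed or direction of what
# is rendered in the future (names are illustrative assumptions).
INDEPENDENT_INSTRUCTIONS = {"power", "volume", "mute", "brightness", "contrast"}


def route_instruction(instruction):
    """Return where a user instruction is handled in this sketch."""
    if instruction in INDEPENDENT_INSTRUCTIONS:
        return "handle_locally"
    # Everything else (pause, fast forward, chapter skip, menu navigation, ...)
    # is forwarded to the server via the control channel.
    return "forward_to_server"
```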
Turning now to
If the session is accepted by the server, the server establishes (130) connections for each of the data channels. In one embodiment, the data channels include an audio channel, a video channel and an overlay channel and the server designates a port assignment for each channel. In other embodiments, the data channels can include an audio and control channel, a video and control channel or a video, an overlay and a control channel or any other combination of such channels.
In embodiments where a variety of channel configurations are supported, the establishment of the data channels can include initialization of the data channels by sending information to the client specifying the format of the data. This information can include time stamp information, information concerning the amount of data to queue and the time at which data should be processed. The initial time stamp can be determined at random. The time stamp associated with data sent on the channel can be determined in accordance with the formula:
data timestamp=initial timestamp+Abs(Data start time−Data position)/Rate
where:
data timestamp is the timestamp associated with the data;
initial timestamp is the initial timestamp chosen by the server;
data start time is a predetermined time indicative of starting time that is associated with the start of a stored sequence of data;
data position is a predetermined time associated with a particular piece or collection of data that is indicative of the time at which the data would be rendered if the sequence of data were rendered linearly from its start at a predetermined rate; and
rate is a value indicative of the speed at which the server desires the data to be rendered relative to the predetermined rate.
In instances where the sequence is played faster or slower, the rate value scales the timestamp to accommodate for an increased or reduced number of frames.
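The formula can be evaluated directly, as in the following sketch. The function name, the argument names and the numeric values are assumptions for illustration, and the guard against a zero rate is an addition of the example rather than part of the formula.

```python
def data_timestamp(initial_timestamp, data_start_time, data_position, rate):
    """Evaluate: initial timestamp + Abs(data start time - data position) / rate."""
    if rate == 0:
        # Assumption of this sketch: a zero rate has no defined time stamp.
        raise ValueError("rate must be non-zero")
    return initial_timestamp + abs(data_start_time - data_position) / rate


# Data located 2000 time units into a sequence, with an initial time stamp of 5000:
normal = data_timestamp(5000, 0, 2000, 1)    # normal speed
double = data_timestamp(5000, 0, 2000, 2)    # rendered at twice the predetermined rate
half = data_timestamp(5000, 0, 2000, 0.5)    # rendered at half the predetermined rate
```

At twice the predetermined rate the same data position maps to a time stamp half as far from the initial time stamp, and at half the rate it maps to one twice as far.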
Following the establishment of the data channels, the server can commence (132) sending media to the client. In one embodiment, the server extracts the media information from a file similar to the files described in U.S. patent application Ser. No. 11/016,184 to Alexander van Zoest. In several embodiments, the server initially extracts audio, video and/or overlay information to create a user interface. Embodiments of user interfaces in accordance with the present invention can be audio interfaces, purely graphical interfaces or interfaces that combine both audio and graphical components. In instances where the server uses the data channels to transmit a feature presentation, the server can select a video and audio track from a number of video and audio tracks contained within a file stored on the server. In addition, the server can select an overlay track to provide subtitles or another form of overlay such as an information bar or an icon indicating actions such as the feature presentation being paused, fast forwarded, rewound or skipped between chapters. In other embodiments, the server may provide only an audio, video or overlay track. In such embodiments, other tracks can be provided by other servers or there may not be any other data tracks.
If information is received (134) from the client, then the server responds (136) to the information. The information will typically contain a user instruction or a time stamp report. Most forwarded user instructions relate to audio, video and/or overlay information that the user wishes to access. The server's response may vary depending upon whether the information displayed at the time the user instruction was received was part of a user interface or part of a feature presentation. The handling of forwarded user instructions by an embodiment of a server in accordance with the present invention is discussed further below. However, it is worth noting that the server is able to obtain information from the time stamp reports concerning the audio, video and/or overlays at the time a user instruction was received.
The above discussion provides a description of information exchange between an embodiment of a server and a client in accordance with the present invention. As indicated above, servers in accordance with embodiments of the invention can transcode audio, video and/or overlay information for transmission to a particular client. A process in accordance with an embodiment of the present invention for transcoding audio, video and/or overlay information is shown in
If a determination is made that transcoding of the audio, video and/or overlay information is required, then the server transcodes (138d) the audio, video and/or overlay information and provides the transcoded audio, video and/or overlay information for transmission with any of the originally formatted audio, video and/or overlay information that does not require transcoding. In the event that a determination is made that no transcoding is necessary, then the originally formatted audio, video and/or overlay information is provided for transmission (138e).
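The transcoding determination can be sketched as a comparison between the stored format of each track and the formats the client has reported it can decode. The function name, the dictionary layout and the format labels below are hypothetical and are used only to illustrate the decision.

```python
def plan_transmission(track_formats, client_decodable):
    """Split tracks into those needing transcoding and those sent as stored.

    track_formats: mapping of track kind (e.g. "video") to its stored format.
    client_decodable: mapping of track kind to the set of formats the client decodes.
    """
    to_transcode, passthrough = [], []
    for kind, stored_format in track_formats.items():
        if stored_format in client_decodable.get(kind, set()):
            passthrough.append(kind)   # originally formatted data is transmitted
        else:
            to_transcode.append(kind)  # data is transcoded before transmission
    return to_transcode, passthrough
```

Tracks in the first list would be transcoded and then transmitted together with the originally formatted tracks in the second list.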
A flow chart illustrating the manner in which the client handles packets received from a server in accordance with an embodiment of the present invention is illustrated in
Communicating the audio, video and/or overlay information via separate channels enables the client to access a particular type of information as soon as it arrives. In embodiments where all of the data types are multiplexed on a single channel, the client could be forced to process the data in order of arrival rather than on the basis of the data most needed by the client. Conceivably, such a client could be starved of one type of data, have a packet of that type of data stored in its buffer but be forced to process other types of data because they arrived first. However, the client could be configured to locate and handle desired information.
In many embodiments, the server can include digital rights management (DRM) information with the information transmitted on each of the audio, video, overlay and/or control channels. In one embodiment, information about the nature of the DRM information is communicated to the client by the server. The client can acknowledge that it has the ability to perform the necessary decryption to play the DRM protected information or can respond that it does not possess this ability.
As discussed previously, many embodiments of clients in accordance with the present invention do not directly respond to user instructions. Instead, the client forwards the instruction to the server and the server responds to the instruction by selecting audio, video and/or overlay information to be displayed by the client. For many embodiments, the fact that the client's capabilities do not extend far beyond the handling of incoming packets is key to the simplicity with which a client can be implemented. The handling of user instructions by embodiments of servers and clients in accordance with the present invention is now considered in more detail.
Embodiments of the system of the present invention are often configured to reduce latency when responding to user instructions, because reducing latency can enhance a user's experience when interacting with the system 10. Latency is the delay between the time a user instruction is received and the display of audio, video and/or overlay information on a rendering device. There are a number of ways that embodiments of servers in accordance with the present invention can attempt to reduce latency. One technique is to manage the client's queues so that information sent in response to a user instruction is immediately processed. Were a server to respond to a user instruction by simply transmitting information to a client, delays could occur due to the client playing previously queued information before playing the newly transmitted information. The server can reduce system latency by sending an instruction to the client to flush its queues prior to the server sending the audio, video and/or overlay information in response to the user instruction. Once the queues are flushed, the newly received information can be immediately displayed by the client.
In many embodiments, the new audio, video and/or overlay information sent by a server in response to a user instruction has a different format to the previous multimedia transmission. The format changes can include changes in the encoding format of the data such as the resolution, width and height of video or sampling rate of audio, changes in the amount of data that the client should queue, changes in the manner in which the client should process data based on the data's time stamp or activation of DRM. In instances where a format change is required to respond to a user instruction, the server can reinitialize the media channels with the client prior to sending media information in the new formats.
Turning first to
When the user instruction cannot be handled by the client, then the user instruction is forwarded (168) to the server via the control channel. The client then enters a loop checking (170) for control messages from the server, and in the absence of a control message, processing (172) audio, video and/or overlay information for rendering and sending (173) time stamp reports via the control channel to the server at intervals specified by the server. As will be discussed further below, the time stamp reports can be used by the server to determine the audio, video and/or overlay information that was being rendered at the time a user provided an instruction.
If a control instruction is received from the server, then the client determines (174) the type of control instruction. The control instruction may command the client to resynchronize its queues. Resynchronization (176) can involve flushing queues and/or assigning a new timer value to the client. Flushing queues enables a client to immediately render new data sent by the server. In many instances, the client is resynchronized without flushing its queues. Resynchronization without flushing a queue can be useful in instances where display of information in the queue is desired, such as when the system desires a feature to play out and then return to a user interface, such as a menu. An example of such a situation is when a server intends a client to automatically go back to a user interface without cutting off a feature presentation. In many embodiments, the server can send a resynchronization request but not provide additional information to the client until an acknowledgement is received that the media queued by the client (or the media having a time stamp less than an indicated time stamp) has played out. In several embodiments, resynchronization without flushing a queue can be used to ensure that a user interface is not updated by a client until a sound effect has been rendered.
Following receipt of the resynchronization instruction, the client can send a resynchronization acknowledgment to the server via the control channel. The client can then continue to process audio, video and/or overlay information that it receives from the server while checking for further control instructions (170 and 172) and sending (173) time stamp reports to the server via the control channel.
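Resynchronization with and without a flush can be sketched as follows. The class `ClientQueues`, the `resynchronize` method and the dictionary returned as an acknowledgment are hypothetical; the embodiments do not specify a message format.

```python
from collections import deque


class ClientQueues:
    """Hypothetical holder of a client's media queue and internal timer."""

    def __init__(self):
        self.queue = deque()
        self.timer = 0

    def resynchronize(self, new_timer=None, flush=False):
        """Apply a resynchronization instruction and return an acknowledgment."""
        if flush:
            # Discard queued media so newly received data is rendered immediately.
            self.queue.clear()
        if new_timer is not None:
            # Assign the timer value supplied by the server.
            self.timer = new_timer
        # Illustrative acknowledgment sent back over the control channel.
        return {"type": "resync_ack", "queued": len(self.queue)}
```

Calling `resynchronize(new_timer=..., flush=False)` leaves queued media in place so that it can play out, which corresponds to the case where a feature or sound effect is allowed to finish before the user interface is updated.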
The client may determine (178) that the control instruction requires reinitialization of the data channels. Once the client has adapted (182) to the new channel parameters provided by the server, the client continues to process and output audio, video and/or overlay information for display by a rendering device while checking for further control instructions (170 and 172) and sending (173) time stamp reports to the server via the control channel.
The client may determine (184) that the control instruction requires the termination of the control session. In that case, the client terminates (186) the control session by disconnecting each of the audio, video, overlay and/or control channels that have been established. The client can also handle (188) other types of control instructions necessary to implement the functionality of the system. Following the handling of a control instruction, the client typically continues to process audio, video and/or overlay information for display by a rendering device while checking for further control instructions (170 and 172) and sending (173) time stamp reports to the server via the control channel.
Turning now to
During a feature presentation, valid user instructions typically require the manipulation of the speed and/or direction in which the feature is being presented, the transition to a menu and/or the addition of an overlay. When a menu is being rendered, the server typically possesses information concerning the valid actions that can be performed during the display of a particular menu. This information can take the form of a state machine. If the server has a record of the menu state at the time the user issues an instruction, then a valid instruction will typically involve a transition to another menu state or the display of a feature presentation.
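A menu state machine of the kind mentioned above could be represented as a table of valid transitions. The state names, instruction names and function below are invented for illustration and do not describe any particular menu.

```python
# Hypothetical transition table: (current state, user instruction) -> next state.
MENU_TRANSITIONS = {
    ("main_menu", "down"): "settings_menu",
    ("settings_menu", "up"): "main_menu",
    ("main_menu", "select"): "feature_presentation",
}


def handle_menu_instruction(state, instruction):
    """Return (next_state, valid); an invalid instruction leaves the state unchanged."""
    next_state = MENU_TRANSITIONS.get((state, instruction))
    if next_state is None:
        return state, False
    return next_state, True
```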
When the user instruction requires the immediate display of audio, video and/or overlay information by the client, then the server can send (206) a control instruction directing the client to flush any queued media information, if determined (204) to be appropriate. Once the resynchronization message has been sent and acknowledged (207), the server can send the required audio, video and/or overlay information. As discussed above, flushing the queues can reduce the latency with which the system responds to user instructions and avoid awkward jumps in feature presentations as information queued by the client prior to the instruction is rendered. Other types of resynchronization of the server and the client can also be performed.
When a feature presentation is being rendered, the server can use time stamp reports provided by the client to determine the audio, video and/or overlay information that was being rendered at the time the user instruction was received. The server can then respond to a user instruction involving the speed and direction in which the feature is presented by flushing the queue and sending audio, video and/or overlay information that, when processed by the client and rendered, presents the feature in accordance with the user's instructions concerning speed and direction from the point in the rendered feature presentation corresponding to the point at which the user instruction was issued. By flushing the queues, the server is often forced to resend information that was being queued by the client prior to the user issuing an instruction. However, the queued information would have been rendered by the client in a way that would not have conformed with the user's instructions, detracting from the user's experience of the system.
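One way a server might use time stamp reports to estimate what was being rendered when an instruction arrived is to look up the most recent report received at or before the instruction. The sketch below assumes the server records each report as a pair of its own receive time and the reported client timer value; this bookkeeping and the function name are assumptions of the example.

```python
from bisect import bisect_right


def timer_at_instruction(reports, instruction_time):
    """Return the client timer from the latest report at or before instruction_time.

    reports: list of (server_receive_time, client_timer_value), sorted by receive time.
    Returns None if no report was received before the instruction.
    """
    receive_times = [received for received, _ in reports]
    index = bisect_right(receive_times, instruction_time)
    if index == 0:
        return None
    return reports[index - 1][1]
```

The accuracy of such an estimate depends on the reporting interval specified by the server, since the lookup cannot be more precise than the spacing between reports.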
When the server determines (208) that the user instruction requires the transmission of multimedia information of a different type from the multimedia information sent previously, then the server can send (210) a control instruction to the client directing the client to reinitialize the audio, video and/or overlay channels. The server then commences transmitting (216) audio, video and/or overlay information in accordance with the new channel parameters.
The above description is not meant to be exhaustive of the control instructions that can be sent by a server in response to a user instruction or under any other circumstance for that matter. If the server determines (218) that another type of command should be sent to the client, then the server can send (220) such a command. Indeed, the server may determine that no command is required to be sent to the client and simply send multimedia information in accordance with the user instruction. In addition, a server that is using transcoding to provide the audio, video and/or overlay information in accordance with an embodiment of the invention can also be configured to respond to user commands in a manner that ensures the video provided to the transcoder is appropriate to the instructions provided by a client.
The above description has generally focused upon instances where audio, video and/or overlay information are provided by a single server. Many embodiments of the present invention use multiple servers to provide information to clients. In one embodiment, multiple servers simultaneously provide information to a client with each of the servers providing different types of information. In another embodiment, a first server provides audio, video and/or overlay information to a client and then a transition is made and a second server provides audio, video and/or overlay information to the client.
An embodiment of a system in accordance with the present invention where multiple servers are capable of simultaneously providing data to a client is illustrated in
When information is being sent to a client from multiple servers, coordinating the information delivered to the client can become problematic. In many embodiments, a single server is chosen to act as a control hub. The control hub server is responsible for forwarding appropriate control messages to all of the servers communicating with a client and for forwarding control messages from other servers to the client. Typically, the control hub is chosen to be the server with which a client initially seeks to establish a control session. In many instances, the user will request information that is not present on a first server and the first server will seek to establish connections with other servers that can provide the desired information. In some instances, this may simply be a single channel of information. In other instances, all of the desired information may be resident on another server. For example, a first server may store information for a user interface and the user interface enables a user to access a feature presentation that is stored on another server. In instances where a first server provides all of the required information for a period of time and then a second server provides all of the required information for a period of time, the first server can function as a control hub or hand control off to the second server.
Embodiments of systems in accordance with the present invention can also include one or more servers communicating with one or more clients. In these embodiments, a single server can act as a control hub and maintain control connections with each of the servers and clients that are present in a particular control session. Alternatively, control messages can be broadcast to all of the servers and clients involved in the control session. In one embodiment, a server or client will be part of a control session if the server or client provides information to or is responsive to instructions from the client that first initiated the control session with one of the servers. In other embodiments, a server or client can be part of a control session if it communicates information within a particular network such as a home network or portion of a network such as a virtual private network. In many embodiments, the server that acts as the control hub determines the clients and servers that form part of the control session.
As discussed above, various clients in systems that are embodiments of the invention can possess different capabilities. In many instances the capabilities of the client can be determined by the underlying hardware within the client and the software that is used to configure the hardware. While the hardware is usually fixed, the operation of a client can be modified by changing the software. In many embodiments of the present invention, the servers and clients are configured so that the server can provide updated software to a client.
In several embodiments, a simple update can be performed in which information is provided to a client by a server and the information is used by the client to modify its software or firmware. Simple updates are typically performed in circumstances where the modifications to the client do not affect the manner in which the server and client communicate.
In instances where a software update involves modification of the protocol by which the server and client communicate, several embodiments of the present invention perform an advanced update. An advanced update is a software or firmware update that involves determining the state of the network prior to performing the update. If the current capabilities of all of the servers and clients in the system are known, along with the compatibilities of all available updates, it is possible to make a decision about which devices to update and which update to use for each device.
As described above, the capabilities of a device in accordance with an embodiment of the present invention can be expressed as an XML file. Prior to a device receiving an update, the device can provide its XML to the device providing the update. The XML can then be parsed to generate a list of capabilities. The lists of capabilities can then be used to determine the update to apply to the device. When an advanced update is performed, the capabilities of all of the servers and clients connected to the network can be gathered and the lists for the clients and servers used to determine an update path for each device that will ensure system stability. To ensure that a correct view of the network is gathered, an advanced update will typically require user participation to ensure that all devices are connected to the network and are active.
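Parsing a capabilities file into a list of capabilities can be sketched with a standard XML parser. The element and attribute names in `EXAMPLE_XML` are invented for the example, since the schema of the XML file is not reproduced here; only the general step of parsing the XML into capability lists is taken from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical capabilities document; the real schema is not specified here.
EXAMPLE_XML = """
<device name="client-a">
  <capability type="video" format="mpeg4"/>
  <capability type="audio" format="mp3"/>
  <capability type="audio" format="aac"/>
</device>
"""


def parse_capabilities(xml_text):
    """Return a mapping of capability type to the set of supported formats."""
    root = ET.fromstring(xml_text)
    capabilities = {}
    for element in root.findall("capability"):
        capabilities.setdefault(element.get("type"), set()).add(element.get("format"))
    return capabilities
```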
In many embodiments, individual updates for each device are distinguished using version numbers. In many instances, different updates may be compatible with different communication protocols. A device should not be updated to support an updated communication protocol unless all other devices connected to the network support that (updated) communication protocol. If any device does not support the updated communication protocol, then updates that involve a migration to the updated communication protocol should not be applied to any other device on the network.
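The compatibility rule above amounts to choosing a communication protocol that every device on the network can support. In the sketch below, each device is described by the set of protocol versions it would support either as it stands or after an available update; that representation, the integer version numbers and the function name are assumptions made for illustration.

```python
def highest_common_protocol(reachable_protocols):
    """Return the highest protocol version supported by every device, or None.

    reachable_protocols: mapping of device name to the set of protocol versions
    the device supports now or could support after an available update.
    """
    if not reachable_protocols:
        return None
    common = set.intersection(*reachable_protocols.values())
    return max(common) if common else None
```

If one device cannot reach the newest protocol version, the result falls back to the highest version shared by all devices, so no device is migrated to a protocol that another device on the network cannot use.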
An embodiment of a process in accordance with the present invention for performing an advanced update is shown in
In one embodiment, the process for obtaining information about a client can be the same during updates as the process used to determine a client's capabilities when transmitting media to the client. In many instances, servers in accordance with embodiments of the present invention push updates to clients by sending information to the client during discovery that indicates an update is being pushed. In one embodiment, the information could be conveyed using a flag set in an SSDP packet sent by a server. A client receiving the SSDP packet can query a server to obtain a URL. The client can then use the URL to connect to an HTTP port and download the applicable update. In many embodiments, an update server can identify itself by using a separate UPnP device UUID.
While the above description contains many specific embodiments of the invention, these should not be construed as limitations on the scope of the invention, but rather as an example of one embodiment thereof. Accordingly, the scope of the invention should be determined not by the embodiments illustrated, but by the appended claims and their equivalents.
This application is a continuation-in-part of U.S. application Ser. No. 11/198,142, filed Aug. 4, 2005, entitled INTERACTIVE MULTICHANNEL DATA DISTRIBUTION SYSTEM. This application also claims the benefit of U.S. Provisional Patent Application No. 60/642,065, filed Jan. 5, 2005, and U.S. Provisional Patent Application No. 60/642,265, filed Jan. 5, 2005, the contents of which are hereby expressly incorporated by reference in their entirety. This application is also related to co-pending U.S. Patent Application entitled SYSTEM AND METHOD FOR A REMOTE USER INTERFACE [Attorney Docket No. 56419/D579], filed Dec. 30, 2005 and U.S. patent application entitled MEDIA TRANSFER PROTOCOL [Attorney Docket No. 56420/D579] filed Dec. 30, 2005, the contents of which are hereby expressly incorporated by reference in their entirety.
Number | Date | Country
---|---|---
60642065 | Jan 2005 | US
60642265 | Jan 2005 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 11198142 | Aug 2005 | US
Child | 11322604 | Dec 2005 | US