Embodiments of the present invention are related to the field of communication networks.
Computer networks are frequently used to deliver media (e.g., audio and/or video) to users. Frequently, the users want to skip through irrelevant or unimportant portions of the media and quickly access what they feel is important. Other times, the users want to review a portion of the media which is interesting or important to them.
As an example, using the keypad of a telephone, the users can cue forward to a desired portion of a message or review a portion of the message that was previously accessed. The media is typically sent from a locally situated server which is communicatively coupled with a user's electronic device. In one prior art implementation, time scale modification (TSM) software on the user's electronic device is used to cue to a desired portion of the message. TSM software allows lengthening or compressing the time scale of an audio signal without altering its frequency character. In other words, the audio can be played back at a faster or slower rate than real time and still be clearly understood. Another example is a personal video recorder (PVR) which allows users to digitally record television content and play it back later. The PVR systems include the typical fast forward/reverse controls found on digital versatile disk (DVD) players and video cassette recorders (VCRs). Typically, the software application for controlling the media is disposed upon the user's electronic device. As a result, response to user input is perceived, by the user, as being immediate or nearly so.
However, while these systems perform well when implemented locally, providing a similar level of performance when the media control software is implemented upon a remote computer (e.g., in a network) has been problematic. For example, due to the delay inherent in networked communications, immediate response to a user's input is difficult to achieve. As a result, the user's perceived control over the playback of the media is reduced. Thus, if a user wants to review a previously played portion of an audio message, the user “presses” the rewind button on their controller and the media control software on the remote computer responds to the user's input to rewind the audio message. However, due to the delay inherent in the network, there is a lag between the time the user generates the command and when the remote computer actually acts upon the command. As a result, the user may overshoot the desired portion of the media and then must either input additional cueing commands to reach the desired portion, or wait until the desired portion is reached.
For example, a user is watching a movie via a streaming data connection with a remotely located server and wants to rewind the movie to see a favorite scene again. The user presses the rewind button and expects the movie to rewind immediately or shortly thereafter. However, due to the latency in the communications between the user's computer and the remotely located server, the service delay in the remotely located server fulfilling the user's command, and the latency in delivering the media to the user's computer, there is a substantial pause (e.g., 5 seconds) in fulfilling the user's command. As a result, the user keeps pressing the rewind button in anticipation of seeing the movie begin to rewind. When the user reaches the desired portion of the movie, they release the rewind button. Again, the delay between when the user releases the button and when the results of that action are delivered to the user results in the user rewinding past, or “overshooting,” the desired portion of the movie by 5 seconds. The user can either play the movie at normal speed until the desired portion of the movie is reached, or press the fast forward button of their media controller in the hope of resuming play at the desired portion. However, the user may overshoot the desired portion again as the problem of communication latency still exists. Thus, the user can either try to compensate for the above mentioned delays or settle for a reduced amount of control over the media.
Thus, systems for interactive control of media via a network suffer from a perceived lack of immediacy to users of the system. As a result, precise control over the media is difficult to realize.
In one embodiment of the present invention, a request is generated from a first device to a remotely located second device for a modified media stream. A period of delay is then determined between when the request is generated and when the modified media stream is desired. The modified media stream is then created on the first device during the period of delay such that the modified media stream is available during the period of delay.
The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention. Unless specifically noted, the drawings referred to in this description should be understood as not being drawn to scale.
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the present invention will be described in conjunction with the following embodiments, it will be understood that they are not intended to limit the present invention to these embodiments alone. On the contrary, the present invention is intended to cover alternatives, modifications, and equivalents which may be included within the spirit and scope of the present invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, embodiments of the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.
Portions of the present invention may also comprise computer-readable and computer-executable instructions (e.g., software application 121) that reside in remotely located computer 120. Remotely located computer 120 is for sending media (e.g., audio content, video content, or audio/video content) to computer 110 via communication network 150. In embodiments of the present invention, communication network 150 may comprise an Ethernet network, an infrared (IR) communication network, a Bluetooth wireless communication network, a cellular telephone network, a radio network, a broadband connection, a satellite link, the Internet, or the like.
The following discussion sets forth in detail the operation of embodiments of the present invention. As shown in
In embodiments of the present invention, user media control 210 is utilized to generate requests for a modification of the media received from remotely located computer 120. In embodiments of the present invention, the request is accessed by accessor 211 and is then conveyed to media modifier 252 via media request 270. In embodiments of the present invention, the request is also sent by accessor 211 to request history 220, adjustment determiner 222, and media modifier 252 of computer 120. In embodiments of the present invention, this media request 270 may be formatted as a real-time streaming protocol (RTSP) request. Media modifier 252 modifies original media stream 240 in accordance with media request 270 and sends the modified media stream 241 to buffer 223.
For example, when media from remotely located computer 120 comprises audio content, user media control 210 can be used to generate a request to increase/decrease the playback rate of the audio content, or to enable/disable noise reduction filtering, etc. A RTSP rate update message (e.g., media request 270) is sent to remotely located computer 120 comprising a playback command and a scalar which designates a rate for streaming the data from remotely located computer 120 to computer 110. While the present invention recites a RTSP message specifically, embodiments of the present invention are well suited to use other formats for media request 270. Media modifier 252 will then modify original media stream 240 to increase/decrease the rate of the audio content, to enable/disable noise reduction filtering, etc., in accordance with the request from computer 110.
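As a rough illustration of the rate update message described above, the following sketch formats an RTSP PLAY request that carries a playback scalar. It assumes the scalar maps onto the RTSP `Scale` header, which the description does not confirm; the function name `build_rate_request`, the URL, and the session identifier are hypothetical.

```python
def build_rate_request(url: str, session: str, cseq: int, scale: float) -> str:
    """Format an RTSP PLAY request carrying a playback-rate scalar.

    Assumption: the "scalar" in the description is carried in the RTSP
    Scale header. The URL and session values are supplied by the caller
    and are purely illustrative.
    """
    lines = [
        f"PLAY {url} RTSP/1.0",   # playback command
        f"CSeq: {cseq}",          # request sequence number
        f"Session: {session}",    # session identifier from an earlier SETUP
        f"Scale: {scale:g}",      # requested playback rate, e.g. 0.5
    ]
    # RTSP header lines end with CRLF; an empty line terminates the header.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_rate_request("rtsp://example.com/media", "12345678", 3, 0.5))
```

A request built this way would play the role of media request 270; other message formats could be substituted without changing the rest of the scheme.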
When media from remotely located computer 120 comprises video content, user media control 210 can be used to generate a request to increase/decrease the playback rate of the video content, to control a camera which is recording a live event (e.g., controlling the orientation or zoom level of the camera), to track a pre-designated person or object in a meeting, etc. Media modifier 252 will then modify original media stream 240 to increase/decrease the playback rate of the video content, to stream video content wherein the zoom level of the camera or the coordinates upon which the view is centered are changed, or to track a pre-designated person or object in a meeting in accordance with the request from computer 110. Alternatively, media modifier 252 may be a pre-fetcher which retrieves additional data in response to user commands. For example, if user media control 210 is controlling the scrolling of a camera and the user command is to scroll the view to the left, media request 270 may be a pre-fetch command for additional data in anticipation that the user will continue scrolling. While the present embodiment recites these modifications to a media stream specifically, embodiments of the present invention are not limited to these modifications alone.
This embodiment is also important for browsing of large images, where only a portion of the image is stored locally because of, for example, limited local storage, limited bandwidth for downloading the image, or privacy or commercial reasons. An example of a commercial application is where the end-user is granted access to portions of the image that they pay for, so as the end-user browses the large image they agree to pay for each new portion before that portion is delivered or before it is delivered with maximum quality. Another related commercial issue is that the end-user may not need to pay for the image data itself, but may need to pay for the network resources for downloading the image data, once again leading to a preference for only downloading user-selected portions of data (as opposed to downloading all of the image data). Example images include satellite images, maps, medical images, electronic versions of printed material such as newspapers or books, as well as other images which contain a large amount of data.
In conventional systems in which the modification of the original media stream is performed locally, the user perceives little or no latency between when a request for a modified media stream is generated and when the modified media stream is displayed. However, in a system implemented via a computer network wherein the server is remotely located, a greater amount of latency is exhibited between when the user generates a request for a modified media stream and when the modified media stream is displayed. This is due, in part, to the delay in conveying the request from computer 110 to remotely located computer 120 as well as the delay in conveying the modified media stream from remotely located computer 120 to computer 110. An additional delay may be realized in the amount of time it actually takes remotely located computer 120 to service the request for a modification to the media stream. Finally, a delay may be incurred as data packets currently in buffer 223 will be presented to the user without the requested modification.
As a result, software application 111 has no knowledge of what part of original media stream 240 is being serviced at the time media request 270 is generated. This is shown in
As shown in
In order to correctly modify the incoming media stream, embodiments of the present invention determine an adjustment which, when applied to the incoming media stream, will create the modified media stream requested by computer 110. Additionally, embodiments of the present invention determine a time period for applying the adjustment to the incoming media stream which approximates the delay period between when the request for the modified media stream is generated and when the modified media stream is delivered or likely to be delivered to computer 110. In so doing, embodiments of the present invention locally modify the media stream to approximate the requested modified media stream until the modified media stream can be delivered via the network.
In embodiments of the present invention, when the request for the modified media stream is sent to remotely located computer 120, information about the request is also sent to request history 220 and to adjustment determiner 222. In embodiments of the present invention, request history 220 comprises a buffer which stores information describing the requested modification of the media stream and a timestamp of when the request for that modification was made. For example, if a user is receiving audio content at a rate of 1×, the request for a modified media stream may comprise a request to decrease the playback rate of the media stream to a rate of 0.5×. Request history 220 records a request for a playback rate of 0.5× and appends a timestamp to the request. By storing a history of the requests in request history 220, embodiments of the present invention can determine the playback rate characteristics of data packets in buffer 223 for a given time period.
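The role of request history 220 can be sketched as a small timestamped log. This is only an illustrative model: the class and method names (`RateRequest`, `RequestHistory`, `record`, `rate_at`) are hypothetical, and the sketch assumes requests are recorded in chronological order and that the stream starts at a rate of 1×.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RateRequest:
    timestamp: float  # seconds, when the request was generated
    rate: float       # requested playback rate, e.g. 0.5 for 0.5x


class RequestHistory:
    """Timestamped log of playback-rate requests (hypothetical model)."""

    def __init__(self, initial_rate: float = 1.0) -> None:
        self.initial_rate = initial_rate
        self.entries: List[RateRequest] = []

    def record(self, timestamp: float, rate: float) -> None:
        # Assumes calls arrive in non-decreasing timestamp order.
        self.entries.append(RateRequest(timestamp, rate))

    def rate_at(self, t: float) -> float:
        """Return the most recently requested rate at or before time t."""
        rate = self.initial_rate
        for entry in self.entries:
            if entry.timestamp <= t:
                rate = entry.rate
        return rate


if __name__ == "__main__":
    history = RequestHistory()
    history.record(timestamp=10.0, rate=0.5)
    print(history.rate_at(5.0), history.rate_at(12.0))
```

Looking up the rate in force at a given time is what lets the client infer the playback rate characteristics of packets already sitting in buffer 223.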
In embodiments of the present invention, media history 221 is used to determine the period of delay between when a request for a modified media stream is generated and when the modified media stream is delivered or expected to be delivered. In embodiments of the present invention, media history 221 modifies the timestamp of the rate request from request history 220 to account for the communications delay incurred via network 150 (e.g., media request 270 and RTSP response 271), the time it takes for remotely located computer 120 to service the request, and the local buffer delay. In so doing, embodiments of the present invention can determine the period of delay between when a request for a modified media stream is generated and when the modified media stream is delivered or expected to be received. In embodiments of the present invention, media history 221 further comprises the information describing the requested modification of the media stream from request history 220. Using the above example, media history 221 would include the playback rate of data packets currently in buffer 223 (1×) as well as the requested playback rate (0.5×). This facilitates comparing the playback rate requested in media request 270 with the playback rate of data packets currently in buffer 223. In so doing, embodiments of the present invention can determine a local adjustment that is applied to data packets currently in buffer 223 in order to create modified media stream locally (e.g., on computer 110). This local adjustment is applied to arriving data packets as well until the first data packet of modified media stream 241 arrives at local media modifier 224.
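The timestamp adjustment performed by media history 221 can be pictured as simple addition of the delay components named above. The sketch below is an assumption about how those components combine (the description lists them but gives no formula), and the function names are hypothetical.

```python
def expected_arrival_time(request_time: float,
                          network_delay: float,
                          service_delay: float,
                          buffer_delay: float) -> float:
    """Estimate when the remotely modified stream first reaches playback.

    Assumption: the delays are additive. network_delay covers the request
    and the returning media, service_delay is the server's time to act on
    the request, and buffer_delay is the time packets wait in the local
    buffer. All values are in seconds.
    """
    return request_time + network_delay + service_delay + buffer_delay


def delay_period(request_time: float, arrival_time: float) -> float:
    """Length of the window during which a local adjustment is needed."""
    return max(0.0, arrival_time - request_time)


if __name__ == "__main__":
    arrival = expected_arrival_time(10.0, 0.25, 0.5, 1.0)
    print(arrival, delay_period(10.0, arrival))
```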
There are a variety of methods for determining when a response to a request for a modification to a media stream is delivered or expected to be delivered in accordance with embodiments of the present invention. In the embodiment of
Additionally, in RTSP response 271 remotely located computer 120 can provide service delay information which notifies computer 110 of the period of delay between the receipt of media request 270 and when that request will be acted upon. In embodiments of the present invention, remotely located computer 120 may also send timestamp information in RTSP response 271 which tells computer 110 the timestamp of the first data packet of modified media stream 241 (e.g., the timestamp of data packet 309 of FIG. 3), or may designate a place in the media stream (e.g., data packet 309 of
Embodiments of the present invention also determine a media adjustment, also referred to as the “Δ value,” and the duration of the media adjustment that is applied locally (e.g., by computer 110) to data packets prior to the arrival of the requested modified media stream 241. In embodiments of the present invention, adjustment determiner 222 determines the duration of the locally applied media adjustment by accessing media history 221. As described above, when media request 270 is generated, adjustment determiner 222 accesses the media request. Adjustment determiner 222 also accesses media history 221 to determine when modified media stream 241 is delivered or expected to be received. Adjustment determiner 222 compares these values to determine the period of delay between when the request is generated and when the modified media stream (e.g., modified media stream 241) is delivered or expected to be received. Adjustment determiner 222 also determines the Δ value of the locally applied media adjustment by accessing media history 221. As described above, media history 221 comprises a record of the characteristics of the requested modified media stream 241 as well as the characteristics of the data packets already in, for example, buffer 223. Using the above example, adjustment determiner 222 compares the playback rate of data packets currently in buffer 223 (1×) with the requested playback rate (0.5×) to determine the Δ value that will be applied to data packets in buffer 223 and in transit via communication network 150. By comparing these values, adjustment determiner 222 can determine the Δ value of the locally applied adjustment to the media stream.
In embodiments of the present invention, information about the Δ value and duration of the locally applied media adjustment is then sent to local media modifier 224. Local media modifier 224 receives the data packets from buffer 223 and adjusts or modifies these data packets such that they are in accordance with requested parameters of modified media stream 241. In other words, adjustment determiner 222 provides the Δ value which is then applied to the data packets received from buffer 223 by local media modifier 224. In so doing, the present invention locally creates modified media stream 241 from the prior media stream. Additionally, the duration of the locally applied media modification is sent to local media modifier 224 from adjustment determiner 222. Thus, the present invention applies a local adjustment or modification to data packets from buffer 223 such that the requested modified media stream is created locally. Additionally, the local adjustment is applied for a time period long enough to allow data packets modified in accordance with media request 270 to be conveyed from media modifier 252 to local media modifier 224. In embodiments of the present invention, when it is determined that modified media stream 241 has been received at, for example, local media modifier 224, the Δ value is no longer applied to the data packets. As a result, modified media stream 241 is now presented to the user.
The following discussion will refer to
Request history 220 appends a timestamp to media request 270 and forwards this information (e.g., the request to slow the data rate from 1× to 0.5× and the appended timestamp) to media history 221. Media history 221 modifies the timestamp information appended to the request to account for the communications delay incurred via network 150 (e.g., the time between when media request 270 is generated and when RTSP response 271 is received), the timestamp of data packet 309, and the local buffer delay. Adjustment determiner 222 accesses media history 221 to determine the Δ value that will be applied to the original media stream until data packet 309 is conveyed to local media modifier 224.
In the present embodiment, adjustment determiner 222 compares the current data rate (e.g., 1×) with the requested data rate (e.g., 0.5×) and determines that the data rate of data packets 302-308 should be slowed by a factor of 2 (e.g., 1×/0.5×). Adjustment determiner 222 also determines how long this Δ value will be applied by local media modifier 224. In the present embodiment, by accessing media history 221, adjustment determiner 222 identifies data packet 309 (e.g., by the timestamp appended by remotely located computer 120) as the first data packet of modified media stream 241. Adjustment determiner 222 then passes this information to local media modifier 224. In response, local media modifier 224 applies the Δ value to data packets 302-308, thereby creating modified media stream 241 locally on computer 110. In other words, local media modifier 224 adjusts the playback rate of data packets 302-308 to correspond to the difference between the user's current setting (e.g., media request 270) and the setting that it believes media modifier 252 gave at the buffer time of the data packet currently being played. When data packet 309 arrives at local media modifier 224, the Δ value is no longer applied to incoming data packets as the rate adjustment has already been applied at remotely located computer 120.
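The walk-through above can be condensed into a short sketch. It computes the Δ value as the ratio of the current rate to the requested rate (the 1×/0.5× example) and marks which packets receive the local adjustment. The function names are hypothetical, and packets are identified by the integer labels used in the example (302 through 309) purely for illustration.

```python
from typing import Dict, Iterable


def delta_value(current_rate: float, requested_rate: float) -> float:
    """Factor by which locally buffered packets must be slowed.

    With a current rate of 1x and a requested rate of 0.5x this is 2.
    """
    if requested_rate <= 0:
        raise ValueError("requested_rate must be positive")
    return current_rate / requested_rate


def plan_local_adjustment(packet_ids: Iterable[int],
                          first_modified_id: int,
                          delta: float) -> Dict[int, float]:
    """Map each packet to the factor the local modifier should apply.

    Packets before first_modified_id were sent at the old rate and get
    the delta. Packets from first_modified_id onward were already
    modified remotely, so they get a neutral factor of 1.0.
    """
    return {
        pid: (delta if pid < first_modified_id else 1.0)
        for pid in packet_ids
    }


if __name__ == "__main__":
    delta = delta_value(current_rate=1.0, requested_rate=0.5)
    plan = plan_local_adjustment(range(302, 311), first_modified_id=309, delta=delta)
    for pid, factor in plan.items():
        print(pid, factor)
```

Keeping the cut-over point explicit (`first_modified_id`) avoids applying the adjustment twice to packets that the remote modifier has already processed.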
In another embodiment, a mechanism may be included in system 200 which facilitates detecting and/or verifying when the first data packet of the modified media stream has been conveyed to computer 110. For example, information in the header of the data packets may indicate to local media modifier 224 that the data packet currently being accessed by local media modifier 224 has already been modified by media modifier 252 and, therefore, does not require further processing. In another embodiment, an out of band signal (e.g., sideband pathway 272 of
In embodiments of the present invention, local media modifier 224 may align the last full pitch period of the last data packet (e.g., data packet 308) which is modified locally with the first full pitch period of the first data packet (e.g., data packet 309) of modified media stream 241. This can reduce discontinuities which occur when transitioning from the data packets processed by local media modifier 224 to the data packets of modified media stream 241 which have been modified by media modifier 252. For example, in a streaming audio application, the pitch period of data packet 308 may not match the pitch period of data packet 309 because, for example, they were processed using different processing algorithms. As a result, a user may be able to detect a jump or gap in the audio signal. However, by aligning the pitch periods of the data packets, this effect can be minimized.
In embodiments of the present invention, different processing algorithms may be used by media modifier 252 and local media modifier 224. For example, local media modifier 224 may utilize a lower quality (e.g., less computationally expensive) algorithm that approximates the processing performed by media modifier 252. This may be desirable in situations in which computer 110 is a less capable system and not able to adequately support a complex or more computationally expensive processing algorithm. Additionally, media modifier 252 may be utilizing a proprietary high-quality algorithm which is only available remotely while local media modifier 224 utilizes a simpler processing algorithm which is more widely available.
Using a lower quality processing algorithm in local media modifier 224 may result in noticeably diminished quality to the user of computer 110 while modifications to the media stream are being performed by local media modifier 224. However, this may be minimally perceptible to the user as it takes the user a short amount of time to adjust to a change, thus masking the initial low quality. In either case, the user will be able to perceive quick system response to requests for modifying the media stream, even if the initial media quality is lower for a short amount of time.
Embodiments of the present invention are advantageous because they facilitate conveying the media via a network while still giving the sense of immediacy to user commands for controlling the media. As a result, the user has a greater sense of control over the presentation of the media while controlling the media interactively. Previously, this level of control was typically found on computers which performed all of the media modification locally. However, because some types of media modification are computationally intensive, a user's computer, or other electronic device, may not be well suited to performing this function. Therefore, it is advantageous to perform this function remotely (e.g., at remotely located computer 120) and send the modified media to the user's computer. An additional benefit of embodiments of the present invention is that bandwidth utilization may be reduced by removing extraneous data before sending it to the user's computer over a network. Additionally, because the local adjustment to the original media stream is only performed until the modified media stream can be conveyed from remotely located computer 120, a less capable (e.g., less computationally intensive) media modifier can be used on the user's computer.
In the above example, while media request 270 may request a modification of original media stream 240 to provide a 0.5× data rate, this may not be possible. For example, in time scale modification, changing the data rate is typically dependent upon adding/removing a full pitch period from the original media stream. As a result, while a data rate of 0.5× may be requested, the closest achievable data rate may actually be 0.6×. However, sending this information to computer 110 may require more bandwidth than is desired or available. In one embodiment of the present invention, this information is conveyed to media history 221 via the data path of RTP packet information 242.
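To see why a requested rate may not be exactly achievable, consider a simplified model in which a segment of audio can only be lengthened or shortened by whole pitch periods. The sketch below searches for the whole-period change that comes closest to the requested rate. The model, the function name, and the sample counts are assumptions made for illustration and are not taken from the description.

```python
from typing import Tuple


def closest_achievable_rate(n_samples: int,
                            pitch_period: int,
                            requested_rate: float,
                            max_periods: int = 16) -> Tuple[int, float]:
    """Find the whole-pitch-period change closest to a requested rate.

    Simplified model (an assumption): a segment of n_samples is stretched
    or shortened by k whole pitch periods, giving an output length of
    n_samples + k * pitch_period and a playback rate of
    n_samples / output_length. Returns (k, achieved_rate).
    """
    best_k, best_rate = 0, 1.0
    for k in range(-max_periods, max_periods + 1):
        out_len = n_samples + k * pitch_period
        if out_len <= 0:
            continue  # cannot remove more audio than the segment holds
        rate = n_samples / out_len
        if abs(rate - requested_rate) < abs(best_rate - requested_rate):
            best_k, best_rate = k, rate
    return best_k, best_rate


if __name__ == "__main__":
    k, rate = closest_achievable_rate(n_samples=250, pitch_period=100,
                                      requested_rate=0.5)
    print(k, round(rate, 4))
```

In this toy case a requested 0.5× cannot be hit exactly, which is the kind of mismatch between requested and achieved rates that the client would need to learn about.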
As shown in
In embodiments of the present invention, sideband pathway 272 is a real-time transport protocol (RTP) pathway which allows media modifier 252 to send information about modified media stream 241 to computer 110. For example, when media modifier 252 is performing TSM processing of an audio stream, it can send rate information via sideband pathway 272 to media history 221 and local media modifier 224. Thus, information about actual playback rate variations can be sent via sideband pathway 272 rather than relying upon the variably sized data packets described above with reference to
Additionally, media modifier 252 can send information about the media modification which relieves local media modifier 224 from performing some of its calculations. For example, media modifier 252 can send information which conveys the scaling rate used for each data packet of modified media stream 241 as well as alternative overlaps that could be used by local media modifier 224 to adjust the rate of the media stream locally. There is no additional computational cost to remotely located computer 120 in doing this because media modifier 252 computes alternative offsets when it computes peak-location information from original media stream 240. Advantageously, local media modifier 224 does not have to be as capable (e.g., computationally intensive) a component. This is advantageous for devices such as handheld computers and PDAs which may not have sufficient computation power to perform complex media modification calculations in a timely manner.
In another embodiment, media modifier 252 can send information via sideband pathway 272 which allows computer 110 to locally restore information that was dropped by remotely located computer 120. For example, when increasing the data rate, remotely located computer 120 may drop some data which computer 110 would want to restore when media request 270 is for a decrease in the data rate. Sometimes when speeding an audio signal, the appearance of pitch is destroyed (e.g., the voicing of a fricative near a vowel). Because media modifier 252 has access to original media stream 240, it can send information about data that was dropped from original media stream 240 which allows local media modifier 224 to restore that data.
In embodiments of the present invention, the timestamp information sent from computer 110 is based upon the network delay in conveying messages via communication network 150 as well as service delays incurred by remotely located computer 120 in servicing media requests. In embodiments of the present invention, media modifier 252 can pause, or postpone performing the requested media modification until the designated timestamp is reached. This embodiment gives computer 110 a predictable time history of target data rates, even without feedback from media modifier 252. This is advantageous in situations when, for example, remotely located computer 120 is running a legacy version of media modifier 252 which does not support sending information to media history 221 or local media modifier 224 (e.g., via RTSP response 271 or sideband pathway 272).
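The deferred-modification idea can be sketched as follows: the client picks an effective timestamp far enough ahead to cover the expected delays, and the remote modifier holds the new rate until that timestamp is reached. The names `target_timestamp` and `DeferredModifier` are hypothetical, and the sketch assumes both sides share a common time base, which the description does not spell out.

```python
from typing import List, Tuple


def target_timestamp(now: float, network_delay: float, service_delay: float) -> float:
    """Client side: choose when the requested change should take effect."""
    return now + network_delay + service_delay


class DeferredModifier:
    """Server side: postpone each rate change until its timestamp is reached."""

    def __init__(self, rate: float = 1.0) -> None:
        self.rate = rate
        self.pending: List[Tuple[float, float]] = []  # (effective_time, rate)

    def schedule(self, effective_time: float, rate: float) -> None:
        self.pending.append((effective_time, rate))
        self.pending.sort()  # keep earliest change first

    def rate_for(self, timestamp: float) -> float:
        """Return the rate to use for media at the given timestamp."""
        # Activate every scheduled change whose time has been reached.
        while self.pending and self.pending[0][0] <= timestamp:
            _, self.rate = self.pending.pop(0)
        return self.rate


if __name__ == "__main__":
    ts = target_timestamp(now=100.0, network_delay=0.5, service_delay=0.25)
    modifier = DeferredModifier()
    modifier.schedule(ts, 0.5)
    print(modifier.rate_for(100.5), modifier.rate_for(ts))
```

Because the client chose the effective timestamp itself, it knows when the remotely modified stream begins and when to stop its local adjustment, without needing feedback from the server.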
With reference to step 820 of
With reference to step 830 of
In step 920 of
In step 930 of
Some portions of the detailed descriptions which follow are presented in terms of procedures, logic blocks, processing and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. In the present application, a procedure, logic block, process, or the like, is conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, although not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “generating,” “determining,” “creating,” “identifying,” “designating,” “comparing,” “applying,” “accessing,” “establishing,” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
With reference to
In the present embodiment, computer system 1000 includes an address/data bus 1001 for conveying digital information between the various components, a central processor unit (CPU) 1002 for processing the digital information and instructions, a volatile main memory 1003 comprised of volatile random access memory (RAM) for storing the digital information and instructions, and a non-volatile read only memory (ROM) 1004 for storing information and instructions of a more permanent nature. In addition, computer system 1000 may also include a data storage device 1005 (e.g., a magnetic, optical, floppy, or tape drive or the like) for storing vast amounts of data. It should be noted that the software program for performing a method for interactive control of media over a network of the present invention can be stored either in volatile memory 1003, data storage device 1005, or in an external storage device (not shown).
Devices which are optionally coupled to computer system 1000 include a display device 1006 for displaying information to a computer user, an alpha-numeric input device 1007 (e.g., a keyboard), and a cursor control device 1008 (e.g., mouse, trackball, light pen, etc.) for inputting data, selections, updates, etc. Computer system 1000 can also include a mechanism for emitting an audible signal (not shown).
Returning still to
Furthermore, computer system 1000 can include an input/output (I/O) signal unit (e.g., interface) 1009 for interfacing with a peripheral device 1010 (e.g., a computer network, modem, mass storage device, etc.). Accordingly, computer system 1000 may be coupled in a network, such as a client/server environment, whereby a number of clients (e.g., personal computers, workstations, portable computers, minicomputers, terminals, etc.) are used to run processes for performing desired tasks. In particular, computer system 1000 can be coupled in a system for interactive control of media over a network.
The preferred embodiment of the present invention, a method and system for interactive control of media over a network, is thus described. While the present invention has been described in particular embodiments, it should be appreciated that the present invention should not be construed as limited by such embodiments, but rather construed according to the following claims.