Interactive applications, such as games, can be computationally intensive. For interactive multimedia applications in particular, a major component of this computational load is the need to generate video and audio in response to user inputs. Further, the load is multiplied by the number of users, since the same video and audio may need to be generated separately for each user of a given application.
When such applications are hosted on servers, for example cloud-based servers, one result can be a need for large numbers of servers, which are costly to acquire, update, and maintain.
There is a need for a better solution for hosting computationally intensive interactive applications, such as games.
Embodiments of the present invention convert multimedia computer program outputs to a series of streaming video clips that can then be distributed worldwide through a video streaming infrastructure consisting of Internet Data Centers (IDCs) and a Content Delivery Network (CDN).
Further, in some embodiments, the video clips are tagged with metadata to facilitate playback. Metadata can include, for example, an identifier and trigger information. The identifier can be a unique identifier for each video clip. The trigger information can specify the identifier of the next clip to be played, possibly as a function of the current user input or other conditions.
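By way of illustration only, the following sketch shows one possible shape for such metadata; the field names (clip_id, length_ms, triggers) are hypothetical and not part of the described embodiments.

```python
# A minimal sketch of per-clip metadata, assuming hypothetical field names.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class ClipMetadata:
    clip_id: str                      # unique identifier for the clip
    length_ms: int                    # clip duration in milliseconds
    # Maps a trigger condition (e.g., a button press) to the clip_id to play next.
    triggers: Dict[str, str] = field(default_factory=dict)

# Example: after clip-001, button A leads to clip-002 and button B to clip-003.
intro = ClipMetadata("clip-001", 8000,
                     {"BUTTON_A": "clip-002", "BUTTON_B": "clip-003"})
```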
In general, embodiments of the present invention include a video clip production process and an interactive playback process.
In the production process, a user (or, in some variations, a simulated "robot" user) interacts with a conventional interactive computer program. In response to the user interaction, the computer program produces raw video and sound data. The user input or other event that triggered the production of the particular video and sound data is stored. The particular video and sound data associated with the trigger condition are then converted to streaming video clips. The clips are tagged with metadata, including, for example, an ID, the trigger condition or playback event, and a length. In some embodiments, the clips are then sent via a CDN to selected IDCs to support one or more interactive applications.
In the playback process, in certain embodiments, for example an embodiment that supports the playing of an interactive game, a first video clip is played. At the conclusion of the playing of the first video clip (or, in some embodiments, at any time during the playing of the first video clip), the metadata is consulted to identify the trigger condition or conditions that will trigger the playing of a next video clip. Upon detection of the trigger condition (for example a user pushing a certain button), the next video clip is played. Playback continues in this manner until a last video clip is played based on a last trigger condition.
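To make this trigger-driven flow concrete, the following is a minimal playback-loop sketch building on the hypothetical ClipMetadata structure above; play_clip() and the console prompt are illustrative stand-ins for the player's streaming and input layers, not part of the described system.

```python
# Illustrative playback loop: play a clip, wait for a trigger named in its
# metadata, then follow the link to the next clip. A clip with no outgoing
# triggers is the last clip, and playback ends.
from typing import Dict

def play_clip(clip_id: str) -> None:
    print(f"streaming {clip_id} ...")          # stand-in for video streaming

def play_interactive(clips: Dict[str, "ClipMetadata"], start_id: str) -> None:
    current = clips[start_id]
    while True:
        play_clip(current.clip_id)
        if not current.triggers:               # last clip: nothing follows
            break
        event = None
        while event not in current.triggers:   # e.g., wait for a button press
            event = input(f"trigger ({', '.join(current.triggers)}): ")
        current = clips[current.triggers[event]]
```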
In some embodiments, playback occurs in a server, such as a cloud-based streaming server, and the content is streamed to the user from the server. In other embodiments, at playback the content is streamed to the user via the CDN and IDC.
Embodiments of the present invention provide production and playback of multi-media information as streaming video clips for interactive real-time media applications.
User device 103 includes central processing unit (CPU) 120, memory 122, and storage 121. User device 103 also includes an input and output (I/O) subsystem (not separately shown in the drawing), including, e.g., a display or touch-enabled display, keyboard, d-pad, trackball, touchpad, joystick, microphone, and/or other user interface devices and associated controller circuitry and/or software. User device 103 may include any type of electronic device capable of providing media content. Some examples include desktop computers and portable electronic devices such as mobile phones, smartphones, multi-media players, e-readers, tablets, notebook or laptop PCs, smart televisions, smart watches, head mounted displays, and other communication devices.
Server computer 101 includes central processing unit (CPU) 110, storage 111, and memory 112 (and may include an I/O subsystem not separately shown). Server computer 101 may be any computing device capable of hosting computer program product 131 for communicating with one or more client computers such as, for example, user device 103, over a network such as, for example, network 102 (e.g., the Internet). Server computer 101 communicates with one or more client computers via the Internet and may employ protocols such as the Internet protocol suite (TCP/IP), Hypertext Transfer Protocol (HTTP) or HTTPS, instant-messaging protocols, or other protocols.
Memory 112 and 122 may include any known computer memory device. Storage 111 and 121 may include any known computer storage device.
Although not illustrated, memory 112 and 122 and/or storage 111 and 121 may also include any data storage equipment accessible by server computer 101 and user device 103, respectively, such as, without limitation, removable or portable memory (e.g., flash memory or external hard disk drives) or data storage hosted by a third party (e.g., cloud storage).
User device(s) 103 and server computer(s) 101 access and communicate via network 102. Network 102 may include wired or wireless connections, including Wide Area Networks (WANs), cellular networks, or any other type of computer network used for communication between devices.
In the illustrated embodiment, computer program product 131 represents computer program products, or portions thereof, configured for execution on server 101 and on user device 103, respectively. A portion of computer program product 131 that is loaded into memory 112 configures server 101 to record and play back interactive streaming video clips as further described herein. The streaming video clips are played back to, for example, user device 103, which supports receiving streaming video, such as via a browser with HTML5 capabilities.
Media files 201 are initially stored in file storage 202. Media files 201 are then distributed via CDN 200 to IDCs 210-260. After a file is distributed, each respective IDC has a local copy of the distributed media file. The respective local copies are then stored as media file copies 211-261. Each IDC 210-260 then serves streaming media, such as video, to users in the geographic vicinity of the respective IDC, in response to user requests. Media file copies 211-261 may be periodically updated.
In some embodiments of the present invention, video streaming infrastructure 2000 is used to distribute the video clips produced by the inventive process disclosed herein. That is, for example, the inventive video clips are stored as media files 201 in file storage 202, and then distributed via CDN 200 to IDCs 210-260, where they are available for playback to users as streaming video.
In other embodiments, the inventive video clips are distributed directly from, for example, a server or servers, such as cloud-based servers, without making use of video streaming infrastructure 2000.
In the illustrated embodiment, in addition to producing and storing interactive video clips tagged with metadata, system 3000 performs additional related functions. For example, in this embodiment system 3000 is also capable of playing back prestored video clips and is additionally capable of streaming video to a user in response to user interactions without first storing the video as a video clip. In alternative embodiments, one or more of these functions can be provided by a separate system or systems.
In some embodiments, program output 320 comprises raw video and sound outputs. In some embodiments, program output 320 comprises a video rendering result.
In some embodiments, program input 330 comprises control messages based on indications of user input interactions, such as a user pushing a button, selecting an item on a list, or typing a command. Such user input interactions can originate from input peripherals 350, which can be peripherals associated with a user device, such as user device 103. Specific user device-associated peripherals can include a joystick, a mouse, a touch-sensitive screen, etc. In some embodiments, input peripherals 350 can be collocated with a remote user device 103 and communicate with other elements of the system via a network. Although labeled as "peripherals," those skilled in the art will understand that input devices such as peripherals 350 may, in particular embodiments, include input elements that are built into, i.e., part of, user device 103 (e.g., a touchscreen, a button, etc.) rather than being separate from, and plugged into, user device 103.
In some embodiments, input peripherals 350 are “robot” entities that produce sequences of inputs that simulate the actions of a real user. Such robot entities can be used to “exercise” the system and cause it to produce many (or even all) possible instances of program output 320. The purpose of “exercising” system 3000 in this manner may be to, for example, cause it to produce and store at least one copy of each video clip associated with program output 320.
Application Interaction Container 340 provides a runtime environment to run computer program 310. In embodiments of the present invention, Application Interaction Container 340 detects and intercepts user inputs generated by input peripherals 350 and delivers the intercepted user inputs to computer program 310 in the form of program input 330.
Application Interaction Container 340 also intercepts raw video and sound generated as program output 320 and, utilizing the services of computer program video processing platform 360, converts the raw video and sound to a streaming video format, and then stores the converted video and sound as one or more video segments or clips 370 in database 390. Each clip represents the audio and video program output produced in response to particular trigger conditions (or playback events), where the set of possible trigger conditions comprises, for example, particular items of program input 330. In some embodiments, the raw video and sound are converted into a multi-media container format. In some embodiments, the raw video and sound are converted into the format known as MPEG2-Transport Stream (MPEG2-TS).
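By way of example only, such a conversion could be performed with the widely available ffmpeg command-line tool; the description does not specify an encoder, and the file names below are hypothetical.

```python
# Hypothetical conversion of captured program output to an MPEG2-TS clip
# by invoking the ffmpeg command-line tool (one possible encoder choice).
import subprocess

def encapsulate_to_ts(raw_input: str, out_path: str) -> None:
    subprocess.run(
        ["ffmpeg", "-y",
         "-i", raw_input,       # recorded raw/intermediate program output
         "-c:v", "libx264",     # encode video as H.264
         "-c:a", "aac",         # encode audio as AAC
         "-f", "mpegts",        # MPEG2 Transport Stream container
         out_path],
        check=True)

# Example (hypothetical file names):
# encapsulate_to_ts("program_output.avi", "clip_001.ts")
```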
As the video clips 370 are generated, they are also tagged with a set of attributes 380 (also referred to herein as “metadata”), comprising, for example, a clip ID, a playback event, and a length. The attributes in metadata 380 are stored in association with corresponding video clips 370 in database 390. The stored clips 370 can then be used for future playback. The stored, tagged video clips 370 can be re-used by the same user or a different user. Potentially, a given clip 370 can be reused by thousands of users interacting with computer program 310 on a shared server or set of servers.
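As an illustration only, the following sketch stores a tagged clip using SQLite as a stand-in for database 390; the schema and names are hypothetical, since the description does not specify a database technology.

```python
# Illustrative storage of a tagged clip, with SQLite standing in for database 390.
import json
import sqlite3

conn = sqlite3.connect("clips.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS clips (
        clip_id   TEXT PRIMARY KEY,  -- unique clip identifier
        length_ms INTEGER,           -- clip length
        triggers  TEXT,              -- JSON map: playback event -> next clip_id
        path      TEXT               -- location of the encapsulated clip file
    )
""")
conn.execute(
    "INSERT OR REPLACE INTO clips VALUES (?, ?, ?, ?)",
    ("clip-001", 8000, json.dumps({"BUTTON_A": "clip-002"}), "clip_001.ts"),
)
conn.commit()
```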
For example, the next time a given playback event arises (based, for example, on the detection of a particular user input, either from the same user or a different user), the stored video clip 370 tagged with that event can be played, thus avoiding the need to regenerate the corresponding raw video and sound. For some applications, this can result in a substantial savings of computer processing power. See description of playback process below for further details.
As noted above, in the illustrated embodiment, system 3000 can also play back prestored video clips. For example, based on a user interaction via input peripherals 350 resulting in program input 330, computer program 310 may determine that a certain prestored clip 370 with metadata 380 corresponding to the user interaction is available and is the appropriate response to the user interaction. The matching clip 370 can then be retrieved from storage and streamed, for example according to a multi-media container format, such as MPEG2-TS, to user device 103.
As further noted above, in the illustrated embodiment, system 3000 can also stream video to a user in response to user interactions even if the video is not currently stored as a streaming video clip 370. For example, based on a user interaction via input peripherals 350 resulting in program input 330, computer program 310 may determine that a certain video output is the appropriate response to the user interaction, but that no corresponding clip 370 is available. The required video can then be generated by computer program 310 as raw video output 320. Application Interaction Container 340 then intercepts the program output 320 and, utilizing the services of computer program video processing platform 360, converts the raw video to a streaming format, according to, for example, a multi-media container format, such as MPEG2-TS, and sends the streaming video to user device 103. Advantageously, the streaming video can simultaneously be recorded, encapsulated as a video clip 370, and stored for future use along with appropriate metadata 380.
At step 410, a computer program launches in a server, such as server 101. The server can be, for example, a cloud-based server. The server can be, for example, a game server. The computer program can be, for example, an interactive multimedia application program, such as, for example, a game application.
At step 420, the process monitors for user input.
At decision box 430, if no user input is detected, the process returns to step 420 and continues to monitor for user input. If user input is detected, control passes to decision box 440.
At decision box 440, if a prestored video clip with matching metadata (i.e., metadata corresponding to the user input) exists, control passes to step 450, where the prestored video clip is streamed to the user. Control then returns to step 420 and the process continues monitoring for user input.
If, at decision box 440, no prestored clip with matching metadata is found, control passes to step 460. At step 460, a video segment from the program output responsive to the user input is streamed to the user. Simultaneously, the video segment is recorded in preparation for the creation of a corresponding video clip. At step 470, the recorded video is encapsulated into a video clip in a streaming format. For example, the streaming format can be a multi-media container format such as MPEG2-TS.
At step 480, metadata associated with the video clip (e.g., clip ID, playback event or trigger, length) is generated.
At step 490, the video clip and its associated metadata are stored for future use. For example, the video clip can be used in the future by a playback process when a trigger corresponding to the stored metadata for the clip is encountered. By using the stored video clip, the playback process can then avoid the need for the computer program to regenerate the video segment corresponding to the stored video clip.
Video segments can continue to be recorded, encapsulated into clips in a streaming format, and stored with associated metadata until, for example, the game ends.
Note that, in the case where process 4000 is running on a server, for example a cloud-based server, it may actually be handling multiple users, possibly many users, simultaneously. It is therefore entirely possible that a given video segment has already been recorded, encapsulated, and stored as a video clip 370, with corresponding metadata 380, in the course of a previous user's interaction with process 4000. In that case, it is not necessary to record the corresponding segment again. Rather, the video clip can be retrieved from the set of previously stored clips, based on the metadata, which can include a unique ID.
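The following is a hedged sketch of this reuse logic, with an in-memory dictionary standing in for database 390 and the rendering, streaming, and encapsulation steps simulated; all names are hypothetical.

```python
# Sketch of the reuse decision: serve a stored clip when one matches the
# trigger; otherwise "render" a new one and store it for future requests.
from typing import Dict

clip_store: Dict[str, str] = {}  # playback event -> clip path (stand-in for database 390)

def handle_user_input(trigger: str) -> str:
    path = clip_store.get(trigger)     # decision box 440: matching clip stored?
    if path is None:
        path = f"{trigger}.ts"         # steps 460-470: render, stream, encapsulate
        clip_store[trigger] = path     # steps 480-490: tag with metadata and store
    return path                        # step 450 when the clip already existed

# First request renders and stores the clip; the second request reuses it.
assert handle_user_input("BUTTON_A") == "BUTTON_A.ts"
assert handle_user_input("BUTTON_A") == "BUTTON_A.ts"
```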
Each interactive multimedia application, or portion of an application, may have associated with it a playback video clip set of a form similar to video clip set 5000, also referred to as a "metadata playlist." For example, each level of a multilevel game can have its own metadata playlist. As described above, the metadata associated with each video clip 370 is learned as the application executes in response to real or "robot" user input. The metadata playlist 5000 is therefore learned at the same time, since the metadata playlist is simply the collection of video clips 370, linked according to metadata 380, for the particular application or portion of an application.
Optionally, a caching mechanism can be employed to help smooth playback of the video clips.
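As one possible caching strategy (the description leaves the mechanism open), the clips reachable from the current clip's triggers could be prefetched while the current clip plays. A minimal sketch, reusing the hypothetical ClipMetadata structure above:

```python
# While the current clip plays, warm a cache with every clip that one of its
# triggers could select, so the next clip starts without a fetch delay.
from typing import Dict

cache: Dict[str, bytes] = {}          # clip_id -> already-fetched clip data

def fetch_clip(clip_id: str) -> bytes:
    return b"..."                     # stand-in for a read from storage or the CDN

def prefetch_next(current: "ClipMetadata") -> None:
    for next_id in current.triggers.values():
        if next_id not in cache:
            cache[next_id] = fetch_clip(next_id)
```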
In some embodiments of the present invention, the video delivered from a server to a user device is a mix of pre-calculated video (stored and re-played video clips) and real-time generated video streams (for video that has not yet been stored as video clips with metadata).
In the above description, reference is made to streaming multi-media container formats, such as MPEG2-TS. It should be understood that embodiments of the present invention are not limited to MPEG2-TS, but rather can employ any of a wide variety of streaming container formats, including but not limited to 3GP, ASF, AVI, DVR-MS, Flash Video (FLV, F4V), IFF, Matroska (MKV), MJ2, QuickTime File Format, MPEG program stream, MP4, Ogg, and RM (RealMedia container). Embodiments that operate without a standardized container format are also contemplated.
Linking to an Interactive Video Script from a Linear Video
A metadata playlist such as metadata playlist 5000 can also be referred to as representing an "interactive video script." That is, metadata playlist 5000 determines, as a function of user input, the direction that a video "script" will take.
It may sometimes be desirable to utilize a combination of a linear playback video and an interactive video script. For example, it may be desirable to link to an interactive video script from a linear video or to switch between (possibly to switch back and forth between) a linear playback video and an interactive video script. Also, in some cases it may be desirable to link from a linear video to a particular location in an interactive video script, for example in order to play a particular portion of the interactive video script.
As one example, consider an advertisement for a set of games offered through a cloud gaming service. Playing of the advertisement may be initiated by a user clicking on a link to a video ad for the service on a web site. The video ad then begins playing as a conventional linear playback video. However, at one or more points in the linear video, the user is invited to experience playing an actual game. The user then performs a triggering action that initiates play of the game via an interactive video script. Depending on, for example, the triggering action, play can be initiated from the beginning of the interactive video script, or from some other location within the interactive video script.
In the above scenario, an important challenge is to identify the specific game (or portion of a game, or location within a game) that is to be initiated based on the user input triggering action. One solution to this problem is to provide a current play timestamp that is sent with the user interaction trigger. The timestamp identifies the portion of the linear video that was being viewed at the time of the user interaction. The desired game (or portion of a game, or location within a game) is then identified as the one that was being featured at that time in the linear video playback.
At any time during the linear video playback, the user can trigger the playing of an interactive video script. Optionally, the user can trigger the playing of the interactive video script starting with any location within the script. Any of a variety of user actions might be used as a trigger mechanism, including for example selecting a menu item, clicking a link, pressing a physical button, or speaking a voice command. As one example, the user might trigger the playing of an interactive video script by touching a screen location 608 (labeled “T” for trigger in the figure). Touching screen location 608 during linear video playback would then trigger the sending of a script request. In some embodiments, the script request comprises a timestamp.
The purpose of the timestamp is to identify the particular video script being requested based on the amount of time that has elapsed since the linear video began playing. For example, in the advertising scenario described above, a request at T1 seconds might identify a script corresponding to multimedia interactive game G1, because the linear ad is presenting the features of that game at time T1. In other embodiments, the particular video script is identified by another mechanism. In some embodiments, the particular video script is identified by clicking on a menu item. In some embodiments, the particular video script is identified by pressing a physical button, or via a voice command. In some embodiments, the timestamp or other mechanism is used to identify not only a particular video script, but also a particular starting location within the video script.
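By way of illustration, the timestamp-to-script lookup can be a simple interval search; the schedule below, including its interval bounds and script identifiers, is invented for the example.

```python
# Map an elapsed-time timestamp to the script featured at that point in the ad.
import bisect

# (start_second, script_id) pairs, sorted by start time; values are invented.
schedule = [(0, "script-G1"), (30, "script-G2"), (75, "script-G3")]

def script_for_timestamp(t: float) -> str:
    starts = [s for s, _ in schedule]
    i = bisect.bisect_right(starts, t) - 1   # last segment starting at or before t
    return schedule[max(i, 0)][1]

# A trigger sent 42 seconds into the linear ad selects game G2's script.
assert script_for_timestamp(42.0) == "script-G2"
```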
Once the particular script is identified from the timestamp, play of the game corresponding to the selected script can commence. Play will be carried out according to a process such as that described above for metadata playlists.
At step 710, a user clicks a link on a web page corresponding to a linear video ad. At step 720, the linear video ad plays via a video streaming infrastructure. At decision box 730, if the linear video ad has reached the end, control transfers to box 790 and the process ends. If the end has not been reached and playback continues, at decision box 740 the system monitors for triggers. If no trigger is detected, linear playback continues.
If a trigger is detected, at step 750 the playback time of the video at the time of the trigger action is calculated from a timestamp sent with the trigger. At step 760 a particular interactive video script is selected based on the calculated playback time. As discussed above, other mechanisms may alternatively be employed to select a particular interactive video script. Also, as discussed above, the timestamp or other mechanism may optionally be used to identify a particular starting location within the particular interactive video script.
At step 770 the selected interactive video script is played. Playback of the interactive video script can proceed, for example, generally in the manner of the metadata playlist embodiments discussed above.
Although a few exemplary embodiments have been described above, one skilled in the art will understand that many modifications and variations are possible without departing from the spirit and scope of the present invention. Accordingly, all such modifications and variations are intended to be included within the scope of the claimed invention.
This application is a continuation-in-part of U.S. application Ser. No. 14/932,252, filed on Nov. 4, 2015, which is hereby incorporated by reference in its entirety.
Related U.S. application data:
Parent application: U.S. Ser. No. 14/932,252, filed November 2015 (US)
Child application: U.S. Ser. No. 15/095,987 (US)