The present invention generally relates to media, and more particularly relates to a media timeline processing infrastructure.
Users of computers, such as desktop PCs, set-top boxes, personal digital assistants (PDAs), and so on, have access to an ever increasing amount of media from an ever increasing variety of sources. For example, a user may interact with a desktop PC that executes a plurality of applications to provide media for output, such as home videos, songs, slideshow presentations, and so on. The user may also utilize a set-top box to receive traditional television programming that is broadcast to the set-top box over a broadcast network. Additionally, the set-top box may be configured as a personal video recorder (PVR) such that the user may store the broadcast content in memory on the set-top box for later playback. Further, the user may interact with a wireless phone that executes a plurality of applications such that the user may read and send email, play video games, view spreadsheets, and so forth.
Because of the wide variety of media sources and the wide variety of computers that may be utilized to provide and interact with media, traditional applications and computers were often configured to specifically address each particular type of media. For example, applications that were executed on a video-game console to output video-games were typically configured to provide an output of the applications to a television, and were not configured to provide the output that could be utilized by other computers and other devices. Therefore, presentation of content that was provided by the different media sources, such as computers and/or applications, may involve multiple applications and devices which may be both time and device intensive. Additionally, multiple applications that were executed on the same computer may be configured to specifically address the particular type of media provided by each respective application. For instance, a first audio playback application may be configured to output media configured as songs. A second audio playback application, however, may be configured to record and playback the recordings in an audio format that is not compatible with the first audio playback application, such as an audio-dictation format. Thus, even applications that are configured for execution on the same computer and the same type of media, e.g. audio, may provide media that is incompatible, one to another.
A timeline provides a way for a user to define a presentation of media. For example, a media player can play a list of songs, which is commonly referred to as a “playlist”. Traditional timelines, however, were limited by the wide variety of media sources and the wide variety of computer configurations that may be utilized to provide and interact with media. When desiring the output of different types of media, for instance, each application needed to “understand” each type of media, such as how to render the particular type of media. This may result in an inefficient use of both hardware and software resources of the computer.
Accordingly, there is a continuing need to provide improved techniques for processing media timelines.
A media timeline processing infrastructure is described. In an implementation, a method is described in which an application is executed to derive a plurality of segments from a media timeline. The media timeline references a plurality of media and each of the segments references media to be rendered during a duration of the segment. The application is executed to queue the plurality of segments for rendering by an infrastructure.
In another implementation, one or more computer readable media include computer executable instructions that, when executed, provide an infrastructure having an application programming interface that is configured to accept a plurality of segments from an application for sequential rendering. Each of the segments references at least one media item for rendering by the infrastructure and is a segment taken by the application from a media timeline.
The same numbers are used throughout the disclosure and figures to reference like components and features.
Overview
A media timeline processing infrastructure is described. A media timeline provides a technique for a user to define a presentation based on media, such as already existing media (e.g., stored media such as video, songs, documents, and so on) and/or media that is output in “real-time” from a media source, such as streaming audio and/or video. The media timeline may be utilized to express groupings and/or combinations of media and provide compositional metadata utilized by the media timeline processing infrastructure that executes, e.g. renders, the media referenced by the media timeline to provide a final presentation.
Different multimedia applications may have different media timeline object models for dealing with collections of media. For example, a media player may use a playlist in order to play media in sequence. On the other hand, an editing application may use a media timeline configured as a storyboard to edit a presentation of the media. Yet another application may utilize an event based timeline, where media playback jumps between items based on certain events. Accordingly, a wide variety of media timeline object models may be encountered which are different, one to another, such that each application may have its own custom media timeline solution.
In an implementation, a media timeline processing infrastructure is described which provides “base level” support for applications such that the applications may render the media timelines which are particular to the application. For example, the media timeline processing infrastructure may be configured to allow an application to queue a media segment which does not change over a period of time and have the infrastructure itself “figure out” how to render the segment. In another example, the media timeline processing infrastructure is configured to allow an application to cancel or update segments “on the fly” during rendering of the segment, with the infrastructure handling all the nuances of updating the rendering of the segment as needed. Thus, an application in contact with the media timeline processing infrastructure need only concentrate on the specifics of the particular media timeline object model for that application by translating the media timeline into a sequence of segments which are understood by the media timeline processing infrastructure.
In the following discussion, an exemplary environment is first described which is operable to employ the media timeline processing infrastructure. Exemplary procedures are then described which are operable in the exemplary environment, as well as in other environments.
Exemplary Environment
The computer 102 may obtain a variety of media from a variety of media sources. For example, the computer 102 may locally store a plurality of media 104(1), . . . , 104(k), . . . , 104(K). The plurality of media 104(1)-104(K) may include an assortment of audio and video content having various formats, such as WMV, WMA, MPEG 1, MPEG 2, MP3, and so on. Further, the media 104(1)-104(K) may be obtained from a variety of sources, such as from an input device, from execution of an application, and so on.
The computer 102, for instance, may include a plurality of applications 106(1), . . . , 106(n), . . . , 106(N). One or more of the plurality of applications 106(1)-106(N) may be executed to provide media, such as documents, spreadsheets, video, audio, and so on. Additionally, one or more of the plurality of applications 106(1)-106(N) may be configured to provide media interaction, such as encoding, editing, and/or playback of the media 104(1)-104(K).
The computer 102 may also include a plurality of input devices 108(1), . . . , 108(m), . . . , 108(M). One or more of the plurality of input devices 108(1)-108(M) may be configured to provide media for input to the computer 102. Input device 108(1), for instance, is illustrated as a microphone that is configured to provide an input of audio data, such as a voice of the user, a song at a concert, and so on. The plurality of input devices 108(1)-108(M) may also be configured for interaction by a user to provide inputs that control execution of the plurality of applications 106(1)-106(N). For example, input device 108(1) may be utilized to input voice commands from the user, such as to initiate execution of a particular one of the plurality of applications 106(1)-106(N), control execution of the plurality of applications 106(1)-106(N), and so forth. In another example, input device 108(m) is illustrated as a keyboard that is configured to provide inputs to control the computer 102, such as to adjust the settings of the computer 102.
Further, the computer 102 may include a plurality of output devices 110(1), . . . , 110(j), . . . , 110(J). The output devices 110(1)-110(J) may be configured to render media 104(1)-104(K) for output to the user. For instance, output device 110(1) is illustrated as a speaker for rendering audio data. Output device 110(j) is illustrated as a display device, such as a television, that is configured to render audio and/or video data. Thus, one or more of the plurality of media 104(1)-104(K) may be provided by the input devices 108(1)-108(M) and stored locally by the computer 102. Although the plurality of input and output devices 108(1)-108(M), 110(1)-110(J) are illustrated separately, one or more of the input and output devices 108(1)-108(M), 110(1)-110(J) may be combined into a single device, such as a television having buttons for input, a display device, and a speaker.
The computer 102 may also be configured to communicate over a network 112 to obtain media that is available remotely over the network 112. The network 112 is illustrated as the Internet, and may include a variety of other networks, such as an intranet, a wired or wireless telephone network, a broadcast network, and other wide area networks. A remote computer 114 is communicatively coupled to the network 112 such that the remote computer 114 may provide media to the computer 102. For example, the remote computer 114 may include one or more applications and a video camera 116 that provides media, such as home movies. The remote computer 114 may also include an output device to output media, such as the display device 118 as illustrated. The media obtained by the computer 102 from the remote computer 114 over the network 112 may be stored locally with the media 104(1)-104(K). In other words, media 104(1)-104(K) may include locally stored copies of media obtained from the remote computer 114 over the network 112.
Thus, the computer 102 may obtain and store a plurality of media 104(1)-104(K) that may be provided both locally (e.g., through execution of the plurality of applications 106(1)-106(N) and/or use of the plurality of input devices 108(1)-108(M)), and remotely from the remote computer 114 (e.g., through execution of applications and/or use of input devices). Although the plurality of media 104(1)-104(K) has been described as stored on the computer 102, the media 104(1)-104(K) may also be provided in “real-time”. For example, audio data may be streamed from the input device 108(1), which is illustrated as a microphone, without storing the audio data.
The computer 102 is illustrated as including a media timeline 120. As previously described, the media timeline 120 provides a technique for a user to define a presentation of stored and/or real-time media from the plurality of media sources. For example, the media timeline 120 may describe a collection of media that was obtained from the input devices 108(1)-108(M), the applications 106(1)-106(N), and/or the remote computer 114. The user, for instance, may utilize one or more of the input devices 108(1)-108(M) to interact with the application 106(n) to define groupings and/or combinations of the media 104(1)-104(K). The user may also define an order and effects for presentation of the media 104(1)-104(K). A sequencer source 122 may then be executed on the computer 102 to render the media timeline 120. The media timeline 120, when rendered, provides the expressed groupings and/or combinations of the media 104(1)-104(K) for rendering by one or more of the plurality of output devices 110(1)-110(J). Further discussion of execution of the sequencer source 122 may be found in relation to the following figures.
The application 202, which may be the same as or different from applications 106(1)-106(N) of
In an embodiment, the media foundation 204 exposes one or more application program interfaces that can be called by the application 202 to render the media 206(g). For example, the media foundation 204 may be thought of as existing at an “infrastructure” level of software that is executed on the computer 102 of
The media foundation 204 may utilize several components, including the sequencer source 122, a media source 210, a media processor 212, a media session 214, the media engine 208, a source resolver 216, one or more transforms 218, one or more media sinks 220, 222, and so on. One advantage of various illustrated and described embodiments is that the system 200 is a pluggable model in the sense that a variety of different kinds of components can be utilized in connection with the systems described herein. Also included as a part of system 200 is a destination 224, which is discussed in more detail below. In at least one embodiment, however, the destination 224 is an object that defines where a presentation is to be presented (e.g. a window, disk file, and the like) and what happens to the presentation. That is, the destination may correspond to one or more of the media sinks 220, 222 into which data flows.
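By way of illustration only, the following sketch shows how an application might create two of these components, the media session and the sequencer source, at start-up. The sketch is written against a COM-style API along the lines of Microsoft Media Foundation (MFStartup, MFCreateMediaSession, MFCreateSequencerSource); those names are assumed from that API rather than taken from this description, and error handling is abbreviated.

// Sketch: initialize the infrastructure and create a media session and a
// sequencer source. Link with mfplat.lib, mf.lib and mfuuid.lib.
#include <mfapi.h>
#include <mfidl.h>

HRESULT InitializeInfrastructure(IMFMediaSession **ppSession,
                                 IMFSequencerSource **ppSequencer)
{
    // Start up the media infrastructure.
    HRESULT hr = MFStartup(MF_VERSION);
    if (FAILED(hr)) return hr;

    // The media session schedules and renders presentations (topologies).
    hr = MFCreateMediaSession(NULL, ppSession);
    if (FAILED(hr)) return hr;

    // The sequencer source accepts queued segments (topologies) from the
    // application and exposes them to the session as a single media source.
    return MFCreateSequencerSource(NULL, ppSequencer);
}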
The media timeline 120 is illustrated as a part of the application 202. The media timeline 120 may be configured in a variety of ways to express how a plurality of media is to be rendered. For example, the media timeline may employ an object model which provides a way for a user of the application 202 to define a presentation based on media that is rendered by the media foundation 204. The media timeline 120, for instance, may range from a sequential list of media files to more complex forms. For example, the media timeline 120 may employ file structures, such as SMIL and AAF, to express media playback experiences that include transitions between media, effects, and so on. The application 202, for instance, may be configured as a media player that can play a list of songs, which is commonly referred to as a playlist. As another example, in an editing system a user may overlay one video over another, clip a media, add an effect to the media and so forth. Such groupings or combinations of media may be expressed using the media timeline 120. Further discussion of the media timeline 120 is found beginning in relation to
The media source 210 is utilized to abstract a provider of media. The media source 210, for instance, may be configured to read a particular type of media from a particular source. For example, one type of media source might capture video from the outside world (e.g., a camera), and another might capture audio (e.g., a microphone). Alternately or additionally, the media source 210 may read a compressed data stream from disk and separate the data stream into its compressed video and compressed audio components. Yet another media source 210 might obtain data from the network 112 of
The media source 210 provides one or more media presentation 226 objects (media presentation). The media presentation 226 abstracts a description of a related set of media streams. For example, the media presentation 226 may provide a paired audio and video stream for a movie. Additionally, the media presentation 226 may describe the configuration of the media source 210 at a given point in time. The media presentation 226, for instance, may contain information about the media source 210 including descriptions of the available streams of the media source 210 and their media types, e.g. audio, video, MPEG, and so on.
The media source 210 may also provide a media stream 228 object (media stream) which may represent a single stream from the media source 210 which can be accessed by the application 202, i.e. exposed to the application 202. The media stream 228 thus allows the application 202 to retrieve samples of the media 206(g). In an implementation, the media stream 228 is configured to provide a single media type, while the sequencer source 122 may be utilized to provide multiple media types, further discussion of which may be found in relation to
The sequencer source 122 is configured to receive segments from the application 202 and then queue the segments on the media session 214 to cause the segments to be rendered. Thus, the sequencer source 122 may be utilized to hide, from other components of the media foundation 204, the intricacies of rendering the media timeline 120 to provide the media described by the media timeline 120.
The segments received by the sequencer source 122 from the application 202, for instance, may be utilized to create a topology 230. The topology 230 defines how data flows through various components for a given presentation. A “full” topology includes each of the components, e.g. software modules, used to manipulate the data such that the data flows with the correct format conversions between different components. The sequencer source 122 interacts with the media session 214, which handles “switching” between consecutive topologies for rendering by the media processor 212. For example, the sequencer source 122 may “queue” the topology 230 on the media session 214 for rendering. Further discussion of the interaction of the sequencer source 122, application 202 and the media session 214 may be found in relation to
When a topology is created, the user might choose to create it partially. This partial topology, however, is not sufficient by itself to provide a final presentation. Therefore, a component called the topology loader 232 may take the partial topology and convert it into a full topology by adding the appropriate data conversion transforms between the components in the partial topology.
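A minimal sketch of that resolution step, assuming a Media Foundation-style topology loader (MFCreateTopoLoader, IMFTopoLoader::Load), follows; in practice the media session may perform this resolution itself when a partial topology is set on it.

#include <mfapi.h>
#include <mfidl.h>

// Sketch: resolve a partial topology into a full topology by letting the
// topology loader insert the data conversion transforms that are needed.
HRESULT ResolveTopology(IMFTopology *pPartialTopology, IMFTopology **ppFullTopology)
{
    IMFTopoLoader *pLoader = NULL;
    HRESULT hr = MFCreateTopoLoader(&pLoader);
    if (SUCCEEDED(hr))
    {
        // The third parameter may supply the topology currently being
        // rendered so that components can be reused; NULL here.
        hr = pLoader->Load(pPartialTopology, ppFullTopology, NULL);
        pLoader->Release();
    }
    return hr;
}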
In the topology 230, for example, data generally originates at the media source 210, flows through one or more transforms 218, and proceeds into one or more media sinks 220, 222. Transforms 218 can include any suitable data handling components that are typically used in presentations. Such components can include those that uncompress compressed data and/or operate on data in some way, such as by imparting an effect to the data, as will be appreciated by the skilled artisan. For example, for video data, transforms can include those that affect brightness, color conversion, and resizing. For audio data, transforms can include those that affect reverberation and re-sampling. Additionally, decoding and encoding can be done by transforms.
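A minimal sketch of assembling a partial topology of this kind follows, again assuming Media Foundation-style interfaces; the media source, its presentation and stream descriptors, and the renderer activation object are presumed to have been created elsewhere.

#include <mfapi.h>
#include <mfidl.h>

// Sketch: build a partial topology that connects one stream of a media
// source to an output (sink) node. The topology loader can later insert
// any decoders or other transforms needed to complete the data flow.
HRESULT BuildPartialTopology(IMFMediaSource *pSource,
                             IMFPresentationDescriptor *pPD,
                             IMFStreamDescriptor *pSD,
                             IMFActivate *pRendererActivate,
                             IMFTopology **ppTopology)
{
    IMFTopology *pTopology = NULL;
    IMFTopologyNode *pSourceNode = NULL;
    IMFTopologyNode *pOutputNode = NULL;

    HRESULT hr = MFCreateTopology(&pTopology);

    // Source node: identifies the media source, presentation, and stream.
    if (SUCCEEDED(hr)) hr = MFCreateTopologyNode(MF_TOPOLOGY_SOURCESTREAM_NODE, &pSourceNode);
    if (SUCCEEDED(hr)) hr = pSourceNode->SetUnknown(MF_TOPONODE_SOURCE, pSource);
    if (SUCCEEDED(hr)) hr = pSourceNode->SetUnknown(MF_TOPONODE_PRESENTATION_DESCRIPTOR, pPD);
    if (SUCCEEDED(hr)) hr = pSourceNode->SetUnknown(MF_TOPONODE_STREAM_DESCRIPTOR, pSD);
    if (SUCCEEDED(hr)) hr = pTopology->AddNode(pSourceNode);

    // Output node: identifies the media sink (e.g. an audio or video renderer).
    if (SUCCEEDED(hr)) hr = MFCreateTopologyNode(MF_TOPOLOGY_OUTPUT_NODE, &pOutputNode);
    if (SUCCEEDED(hr)) hr = pOutputNode->SetObject(pRendererActivate);
    if (SUCCEEDED(hr)) hr = pTopology->AddNode(pOutputNode);

    // Connect the source node's output to the output node's input.
    if (SUCCEEDED(hr)) hr = pSourceNode->ConnectOutput(0, pOutputNode, 0);

    if (SUCCEEDED(hr))
    {
        *ppTopology = pTopology;
        pTopology = NULL;
    }
    if (pSourceNode) pSourceNode->Release();
    if (pOutputNode) pOutputNode->Release();
    if (pTopology) pTopology->Release();
    return hr;
}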
Media sinks 220, 222 are typically associated with a particular type of media content. Thus, audio content might have an associated audio sink such as an audio renderer. Likewise, video content might have an associated video sink such as a video renderer. Additional media sinks can send data to such things as computer-readable media, e.g. a disk file and the like, stream the data over the network, such as broadcasting a radio program, and so on.
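For illustration, renderer sinks of this kind might be obtained as activation objects and used as the output-node objects in a topology such as the one sketched above; the helper functions named here are assumed from the Media Foundation API.

#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <evr.h>

// Sketch: create activation objects for an audio renderer sink and a video
// renderer sink. The video renderer is given the window to draw into.
HRESULT CreateRendererActivates(HWND hwndVideo,
                                IMFActivate **ppAudioActivate,
                                IMFActivate **ppVideoActivate)
{
    HRESULT hr = MFCreateAudioRendererActivate(ppAudioActivate);
    if (SUCCEEDED(hr))
        hr = MFCreateVideoRendererActivate(hwndVideo, ppVideoActivate);
    return hr;
}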
The media session 214 is a component which may schedule multiple presentations. Therefore, the media processor 212 may be used to drive a given presentation, and the media session 214 utilized to schedule multiple presentations. The media session 214, for instance, may change topologies that are rendered by the media processor 212 as previously described. For example, the media session 214 may change from a first topology that is rendered on the media processor 212 to a second topology such that there is no gap between the renderings of samples from the consecutive presentations that are described by the respective topologies. Thus, the media session 214 may provide a seamless user experience as the playback of the media moves from one presentation to another.
The source resolver 216 component may be utilized to create a media source 210 from URLs and/or byte stream objects. The source resolver 216 may provide both synchronous and asynchronous ways of creating the media source 210 without requiring prior knowledge about the form of data produced by the specified resource.
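A sketch of resolving a URL into a media source, assuming a Media Foundation-style source resolver, follows; the synchronous form is shown for brevity.

#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>

// Sketch: synchronously resolve a URL (or file path) into a media source.
HRESULT CreateSourceFromURL(LPCWSTR pwszURL, IMFMediaSource **ppSource)
{
    IMFSourceResolver *pResolver = NULL;
    IUnknown *pUnkSource = NULL;
    MF_OBJECT_TYPE objectType = MF_OBJECT_INVALID;

    HRESULT hr = MFCreateSourceResolver(&pResolver);
    if (SUCCEEDED(hr))
        hr = pResolver->CreateObjectFromURL(pwszURL,
                                            MF_RESOLUTION_MEDIASOURCE,
                                            NULL,          // optional properties
                                            &objectType,
                                            &pUnkSource);
    if (SUCCEEDED(hr))
        hr = pUnkSource->QueryInterface(IID_PPV_ARGS(ppSource));

    if (pUnkSource) pUnkSource->Release();
    if (pResolver) pResolver->Release();
    return hr;
}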
In at least one embodiment, the media foundation 204 is utilized to abstract away the specific details of the existence of and interactions between various components of the media foundation 204. That is, in some embodiments, the components that are seen to reside inside the media foundation 204 are not visible, in a programmatic sense, to the application 202. This permits the media foundation 204 to execute so-called “black box” sessions. For example, the media engine 208 can interact with the media session 214 by providing the media session certain data, such as information associated with the media (e.g. a URL) and the destination 224, and can forward the application's 202 commands (e.g. open, start, stop and the like) to the media session 214. The media session 214 then takes the provided information and creates an appropriate presentation using the appropriate destination. Thus, the media foundation 204 may expose a plurality of software components that provide media functionality over an application programming interface for use by the application 202.
The sequencer source 122 may also be utilized to write media sources for specific timeline object models. For example, if a movie player has a proprietary file format which is used to represent its timeline, the movie player may use the sequencer source 122 to create a “stand alone” media source which will render its presentation to the media foundation 204. Therefore, an application which uses media foundation 204 may then play the movie player's file directly as it plays any other media file.
Additionally, the media foundation 204 allows 3rd parties to register a particular file type based on its extension, scheme, header, and so on. For instance, the 3rd party may register an object called a “byte stream plug-in” which understands the file format. Therefore, when a file of this particular format is encountered, the media foundation 204 creates the registered byte stream plug-in and asks it to create a media source which can source media samples from the file. Continuing with the previous example, the movie player may register a byte stream plug-in for its particular file type. When this byte stream plug-in is invoked, it may parse the media timeline and “figure out” the topologies which form the presentation. The plug-in may then queue the topologies on the sequencer source and rely on the sequencer source to play back the topologies back-to-back. To the application 202, it looks like any other media source for a file was given to the media foundation 204 and is played back just as a normal audio or video file.
The arrows of the system depict how data, control and status flow between the components of the system 300. For example, the application 202 is illustrated as being in contact with the media session 214. Arrow 302 represents communication of control information from the application 202 to the media session 214 through an application programming interface. A variety of control information may be communicated by the application 202 to the media session 214, such as to “set” a topology on the media session 214, call “start” to initiate rendering of a set topology, call “stop” to terminate rendering of the set topology, and so on. Arrow 304 represents the flow of status information from the media session 214 to the application 202, such as acknowledging that a topology has been set, “start” or “stop” calls have been implemented, current status of rendering of a topology by the media session 214, and so forth.
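By way of illustration, the control calls represented by arrow 302 might be issued as in the following sketch, which assumes a Media Foundation-style session interface (SetTopology, Start, Stop); the status information of arrow 304 then arrives as events from the same session object, as discussed in relation to the procedures below.

#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <propvarutil.h>

// Sketch: set a topology on the media session and start rendering it; Stop
// terminates rendering of the set topology.
HRESULT PlayTopology(IMFMediaSession *pSession, IMFTopology *pTopology)
{
    HRESULT hr = pSession->SetTopology(0, pTopology);
    if (SUCCEEDED(hr))
    {
        PROPVARIANT varStart;
        PropVariantInit(&varStart);    // VT_EMPTY: start from the current position
        hr = pSession->Start(&GUID_NULL, &varStart);
        PropVariantClear(&varStart);
    }
    return hr;
}

HRESULT StopPlayback(IMFMediaSession *pSession)
{
    return pSession->Stop();
}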
The application 202 is also illustrated as being in contact with the sequencer source 122. Arrow 306 represents communication of partial topologies from the application 202 to the sequencer source 122 and arrow 308 represents communication of status information from the sequencer source 122 to the application 202. As previously described, for instance, the application 202 may segment the media timeline 120 and queue the segments to the sequencer source 122 for rendering. The sequencer source 122 may then fire out events to notify the media processor and the media session that new presentations are available for rendering. These presentations are then picked up by the session, resolved, and queued up to be given to the processor once the rendering of the current presentation is completed, further discussion of which may be found in relation to
The sequencer source 122 may also be viewed as a media source by the media session 214. For example, the sequencer source 122 may set a topology on the media session 214 which specifies that the source of the media is the sequencer source 122. The sequencer source 122 may then aggregate media from a plurality of media sources (e.g., media sources 210(1), 210(2)) and provide the media from the media sources to the media processor 212. In an implementation, the sequencer source 122 may aggregate media of different types and have that media appear as a single media source. For example, the samples may flow directly from the media sources 210(1), 210(2) to the media processor 212, and from the media processor to the media session to be given to bit pumps, which is illustrated by arrows 310-314. The sequencer source 122 may timestamp samples received from the media sources 210(1), 210(2) and provide these samples to the media processor 212 for concurrent rendering. The sequencer source 122 may also control the operation of the media sources 210(1), 210(2), which is illustrated in
The media session 214 may also be executed to control operation of the sequencer source 122, which is illustrated by arrow 320 as a flow of control information from the media session 214 to the sequencer source 122. For example, the media session 214 may receive a “start” call to begin rendering a topology. The topology may specify the sequencer source 122 as a media source in the topology. Therefore, the media processor 212, when rendering the topology, may call “start” on the sequencer source 122 to provide the samples represented in the topology. In this instance, the sequencer source 122 also calls “start” on the media sources 210(1), 210(2) and thereafter provides aggregated and time stamped samples back to the media session 214. Thus, in this instance the media session 214 is not “aware” that the sequencer source 122 is providing samples from a plurality of other media sources. Further discussion of media timeline 120 rendering may be found in relation to
Media Timelines
In an implementation, the media timeline 400 is not executable by itself to make decisions about a user interface (UI), playback or editing. Instead, the metadata 414-424 on the media timeline 400 is interpreted by the application 202. For example, the media timeline 400 may include one or more proprietary techniques to define presentation of the media referenced by the timeline. The application 202 may be configured to utilize these proprietary techniques to determine a “playback order” of the media, further discussion of which may be found in relation to
The nodes 402-412, as positioned on the media timeline 400, describe a basic layout of the media timeline 400. This layout may be utilized for displaying a timeline structure. For instance, various types of nodes 402-412 may be provided such that a desired layout is achieved. The node type indicates how the children of that node are interpreted, such as a root node 402 and leaf nodes 408-412. The root node 402 in this instance specifies a starting point for rendering the media timeline 400 and includes metadata 414 that describes how rendering is to be initiated.
In the illustrated implementation of
Nodes 404, 406 may also describe additional nodes of the media timeline 400. For example, node 404 may be utilized to describe the order of execution for nodes 406, 408. In other words, node 404 acts as a “junction-type” node to provide ordering and further description of its “children”. There are a variety of junction-type nodes that may be utilized in the media timeline 400, such as a sequence node and a parallel node.
Although the child nodes of the sequence node 502 are configured as leaf nodes in this implementation, child nodes of the sequence node 502 may represent any other type of node. For example, child nodes may be utilized to provide a complex tree structure as shown in
The children of the parallel node 602 may be rendered simultaneously. For example, leaf node 606 and leaf node 608 are children of parallel node 602. Each of the leaf nodes 606, 608 includes respective metadata 610, 612 having respective pointers 614, 616 to respective media 618, 620. Each of the leaf nodes 606, 608 includes a respective time 622, 624 included in the respective metadata 610, 612 that specifies when the respective leaf nodes 606, 608 are to be rendered. The times 622, 624 on the leaf nodes 606, 608 are relative to the parallel node 602, i.e. the parent node. Each of the child nodes can represent any other type of node and combinations of nodes, providing for a complex tree structure with combined functionality. For example, a “junction” type node may also reference media, and so forth. Although metadata including time data has been described, a variety of metadata may be included on nodes of the media timeline, an example of which is described in the following implementation.
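Purely for illustration, an application might model the node-and-metadata structure described above with a data structure along the following lines; the type and field names are hypothetical and are not part of any particular timeline object model.

#include <memory>
#include <string>
#include <vector>

// Sketch: an application-side model of a media timeline. Junction-type
// nodes (sequence, parallel) order their children; leaf nodes point to media.
enum class NodeType { Root, Sequence, Parallel, Leaf };

struct TimelineNode
{
    NodeType type = NodeType::Leaf;

    // Metadata: a leaf node carries a pointer (here a URL or file path) to
    // the media it references, plus timing relative to its parent node.
    std::wstring mediaUrl;      // empty for junction-type nodes
    double startTime = 0.0;     // seconds, relative to the parent node
    double stopTime  = 0.0;     // zero may mean "play to the end of the media"

    // Children, in the order in which the parent interprets them.
    std::vector<std::unique_ptr<TimelineNode>> children;
};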
Although a few examples of media timelines were described in relation to
Exemplary Procedures
The following discussion describes processing techniques that may be implemented utilizing the previously described systems and devices. Aspects of each of the procedures may be implemented in hardware, firmware, or software, or a combination thereof. The procedures are shown as a set of blocks that specify operations performed by one or more devices and are not necessarily limited to the orders shown for performing the operations by the respective blocks. In portions of the following discussion, reference will be made to the environment, systems and timelines of
The application creates a partial topology for each segment of a media timeline (block 706). For example, in this implementation the media timeline is configured as a playlist, which may be represented by the media timeline 500 of
The application then creates a partial topology for one or more leaf nodes of the sequence node of the media timeline (block 706). In this embodiment, for instance, the media timeline 120 is a playlist which references media that is to be played in sequence, one media item after another. Therefore, each leaf node in the media timeline 120 represents a partial topology for playback of the media timeline. In another example, if the timeline specifies a cross fade between two leaf nodes, there will be topologies where both leaf nodes are used during the cross fade. In a further example, an effect can be specified for a small duration of the leaf node. For instance, if the leaf node represents media which is 10 seconds long and the timeline specifies a fadeout effect on the last five seconds of the leaf node, then this will result in two topologies: the first one does not include the effect and the second one does.
The application queues the topologies on the sequencer source (block 708) and the last topology is marked as the “end” (block 710). For example, a flag may be set on the last topology such that the sequencer source ends playback after that “flagged” topology is rendered.
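A sketch of this queueing step (blocks 708-710), assuming a Media Foundation-style sequencer source interface (AppendTopology and a SequencerTopologyFlags_Last flag), follows.

#include <mfapi.h>
#include <mfidl.h>
#include <vector>

// Sketch: queue a list of partial topologies on the sequencer source and
// flag the last one so that playback ends after it has been rendered.
HRESULT QueueTopologies(IMFSequencerSource *pSequencer,
                        const std::vector<IMFTopology*> &topologies)
{
    HRESULT hr = S_OK;
    for (size_t i = 0; SUCCEEDED(hr) && i < topologies.size(); ++i)
    {
        DWORD flags = (i + 1 == topologies.size()) ? SequencerTopologyFlags_Last : 0;
        MFSequencerElementId segmentId = 0;   // may be kept to cancel or
                                              // update the segment later
        hr = pSequencer->AppendTopology(topologies[i], flags, &segmentId);
    }
    return hr;
}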
A presentation descriptor is then created from the sequencer source (block 712). The presentation descriptor describes the media stream objects (hereinafter “media streams”) which are to be rendered. As previously described, media streams are objects which produce/receive media samples. A media source object may produce one or more media streams. Accordingly, the presentation descriptor may describe the nature of these streams, such as location of the streams, formats, and so on.
The application then obtains the topology from the sequencer source which corresponds to the presentation descriptor (block 714). For example, the application may communicate the presentation descriptor to the sequencer source and receive a topology corresponding to the presentation descriptor. In another example, the sequencer source may “set” the topology on the media session. In addition, the obtained topology may be configured in a variety of ways. For example, the obtained topology may be a partial topology that is resolved into a full topology by the topology loader 232 of
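A sketch covering these two blocks (712-714), assuming the sequencer source also exposes the media source and topology provider interfaces found in Media Foundation, follows.

#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>

// Sketch: create a presentation descriptor from the sequencer source
// (viewed as a media source) and ask it for the corresponding topology.
HRESULT GetFirstTopology(IMFSequencerSource *pSequencer, IMFTopology **ppTopology)
{
    IMFMediaSource *pSource = NULL;
    IMFMediaSourceTopologyProvider *pProvider = NULL;
    IMFPresentationDescriptor *pPD = NULL;

    HRESULT hr = pSequencer->QueryInterface(IID_PPV_ARGS(&pSource));
    if (SUCCEEDED(hr))
        hr = pSource->CreatePresentationDescriptor(&pPD);
    if (SUCCEEDED(hr))
        hr = pSequencer->QueryInterface(IID_PPV_ARGS(&pProvider));
    if (SUCCEEDED(hr))
        hr = pProvider->GetMediaSourceTopology(pPD, ppTopology);

    if (pPD) pPD->Release();
    if (pProvider) pProvider->Release();
    if (pSource) pSource->Release();
    return hr;
}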
The topology is then set on the media session (block 716). For example, the media session 214 may include a queue for topologies such that the topologies may be rendered in sequence, one after the other, without encountering a “gap” between the rendering of the topologies. Therefore, the application may call the media session to “set” a first one of the queued topologies to be rendered and call “start” on the media session to begin the rendering (block 718).
During the rendering, the application may “listen” for media session events (block 720). For example, the application 202 may receive status events from the media session 214 as illustrated by arrow 304 of
When a “new topology” event is received (“yes” from decision block 722), a presentation descriptor is obtained for the new topology (block 724). The topology from the sequencer source is obtained which corresponds to the presentation descriptor (block 714) and a portion (blocks 714, 716, 720-724) of the procedure 700 is repeated for the new topology. In this way, the application 202, sequencer source 122 and media session 214 may provide a sequential playback of a playlist. In some instances, however, parallel rendering is described which involves multiple media sources and complex topologies. Similar functionality may be employed in such an instance, further discussion of which may be found in relation to the following figures.
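A sketch of such an event loop (blocks 720-724), assuming Media Foundation-style event types in which a new-presentation event carries the presentation descriptor of the next queued segment, follows.

#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <propvarutil.h>

// Sketch: pull events from the media session. When a new presentation is
// announced, fetch its presentation descriptor, obtain the corresponding
// topology from the sequencer source, and set that topology on the session.
HRESULT RunEventLoop(IMFMediaSession *pSession,
                     IMFMediaSourceTopologyProvider *pProvider)
{
    HRESULT hr = S_OK;
    bool done = false;
    while (SUCCEEDED(hr) && !done)
    {
        IMFMediaEvent *pEvent = NULL;
        MediaEventType type = MEUnknown;

        hr = pSession->GetEvent(0, &pEvent);      // 0 = block until an event arrives
        if (SUCCEEDED(hr)) hr = pEvent->GetType(&type);

        if (SUCCEEDED(hr) && type == MENewPresentation)
        {
            // The event value holds the presentation descriptor for the
            // next queued topology.
            PROPVARIANT var;
            PropVariantInit(&var);
            hr = pEvent->GetValue(&var);

            IMFPresentationDescriptor *pPD = NULL;
            IMFTopology *pTopology = NULL;
            if (SUCCEEDED(hr))
                hr = (var.vt == VT_UNKNOWN)
                         ? var.punkVal->QueryInterface(IID_PPV_ARGS(&pPD))
                         : E_UNEXPECTED;
            if (SUCCEEDED(hr))
                hr = pProvider->GetMediaSourceTopology(pPD, &pTopology);
            if (SUCCEEDED(hr))
                hr = pSession->SetTopology(0, pTopology);

            if (pTopology) pTopology->Release();
            if (pPD) pPD->Release();
            PropVariantClear(&var);
        }
        else if (SUCCEEDED(hr) && type == MESessionEnded)
        {
            done = true;   // the topology flagged as the "end" has finished
        }

        if (pEvent) pEvent->Release();
    }
    return hr;
}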
Leaf node 904 also includes a pointer 922 that references the A1.asf 802 file described in relation to
The application 202, to play (i.e., render) the media timeline 900 of
The media timeline 900 of
The application then derives a plurality of segments from the media timeline (block 1104). For example, the application may determine which components are utilized during the rendering of the media timeline for a particular duration. The application may then determine segments of the duration which reference media items which do not change during the duration of the segment, i.e., media items are not added or removed during the segment.
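Purely for illustration, a segment of this kind might be recorded by the application in a structure along the following lines; the names are hypothetical, and each segment simply notes its duration and the media items that do not change during that duration.

#include <string>
#include <vector>

// Sketch: an application-defined description of the segments derived from a
// media timeline. Within a segment the set of referenced media items does
// not change; a new segment begins wherever an item is added or removed.
struct SegmentDescription
{
    double startTime = 0.0;                // presentation time, in seconds
    double duration  = 0.0;                // length of the segment
    std::vector<std::wstring> mediaUrls;   // media rendered during the segment
};

using SegmentList = std::vector<SegmentDescription>;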
Once the media timeline has been segmented, the application constructs a data structure describing the plurality of segments (block 1106). For example, the application may segment the media timeline 900 of
The application then passes the data structure to the sequencer source via an application programming interface (API) (block 1108). As previously described in relation to
The topology is then set on the media session (block 1112). For example, the media session 214 may include a queue for topologies such that the topologies may be rendered in sequence, one after the other, without encountering a “gap” between the rendering of the topologies. Therefore, the application may call the media session to “set” a first one of the queued topologies to be rendered and call “start” on the media session to begin the rendering (block 1114).
During the rendering, the application may “listen” for media session events (block 1116). For example, the application 202 may receive status events from the media session 214 as illustrated by arrow 304 of
A variety of media timelines may be rendered by the media timeline processing infrastructure. For example, a media timeline may be “event based” such that an author may specify the start of media based on an event. For instance, at time “12 am” start playing audio file “A1.asf”. These object models may queue media on the sequencer source during playback, and can cancel or update topologies which have already been queued as previously described.
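Canceling or updating a previously queued segment might be expressed as in the following sketch, which assumes a sequencer source interface with DeleteTopology and UpdateTopology methods (as in Media Foundation) and the element identifier that was returned when the segment was appended.

#include <mfapi.h>
#include <mfidl.h>

// Sketch: cancel a queued segment, or replace its topology "on the fly",
// using the element id returned by AppendTopology for that segment.
HRESULT CancelSegment(IMFSequencerSource *pSequencer, MFSequencerElementId id)
{
    return pSequencer->DeleteTopology(id);
}

HRESULT UpdateSegment(IMFSequencerSource *pSequencer,
                      MFSequencerElementId id,
                      IMFTopology *pNewTopology)
{
    return pSequencer->UpdateTopology(id, pNewTopology);
}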
Exemplary Operating Environment
The various components and functionality described herein are implemented with a number of individual computers.
Generally, various different general purpose or special purpose computing system configurations can be used. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, network-ready devices, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The functionality of the computers is embodied in many cases by computer-executable instructions, such as software components, that are executed by the computers. Generally, software components include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Tasks might also be performed by remote processing devices that are linked through a communications network. In a distributed computing environment, software components may be located in both local and remote computer storage media.
The instructions and/or software components are stored at different times in the various computer-readable media that are either part of the computer or that can be read by the computer. Programs are typically distributed, for example, on floppy disks, CD-ROMs, DVD, or some form of communication media such as a modulated signal. From there, they are installed or loaded into the secondary memory of a computer. At execution, they are loaded at least partially into the computer's primary electronic memory.
For purposes of illustration, programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computer, and are executed by the data processor(s) of the computer.
With reference to
Computer 1202 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computer 1202 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. “Computer storage media” includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 1202. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 1206 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 1210 and random access memory (RAM) 1212. A basic input/output system 1214 (BIOS), containing the basic routines that help to transfer information between elements within computer 1202, such as during start-up, is typically stored in ROM 1210. RAM 1212 typically contains data and/or software components that are immediately accessible to and/or presently being operated on by processing unit 1204. By way of example, and not limitation,
The computer 1202 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media discussed above and illustrated in
The computer may operate in a networked environment using logical connections to one or more remote computers, such as a remote device 1250. The remote device 1250 may be a personal computer, a network-ready device, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 1202. The logical connections depicted in
When used in a LAN networking environment, the computer 1202 is connected to the LAN 1252 through a network interface or adapter 1256. When used in a WAN networking environment, the computer 1202 typically includes a modem 1258 or other means for establishing communications over the Internet 1254. The modem 1258, which may be internal or external, may be connected to the system bus 1208 via the I/O interface 1242, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 1202, or portions thereof, may be stored in the remote device 1250. By way of example, and not limitation,
As previously described, the application programs 1218, 1218′ may also provide media timelines for rendering by the media foundation 204 of
Exemplary Media Timeline Implementations
The media timelines previously discussed may employ a variety of methods of storing and restoring timeline data, such as one or more Windows® Media Player Playlist files, eXecutable Temporal Language (XTL) files, and so on.
A media timeline, for instance, may be described as the following Windows® Media Player Playlist file identified by an ASX file extension.
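The playlist file itself is not reproduced here; an illustrative ASX playlist of the kind described, with hypothetical file names, might appear as follows.

<ASX VERSION="3.0">
  <ENTRY><REF HREF="media1.wma" /></ENTRY>
  <ENTRY><REF HREF="media2.wma" /></ENTRY>
  <ENTRY><REF HREF="media3.wma" /></ENTRY>
</ASX>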
This ASX file specifies three files for output, back to back. No start and stop times have been specified for the files. The ASX file may be represented by the media timeline 1300 shown in
Another example of a media timeline is shown in the following XTL file.
This XTL file describes two tracks, e.g., streams, of media for output. One of the tracks is an audio track and the other is a video track.
The XTL file may be represented by the media timeline 1400 that is shown in
Sequence node 1406 also has two child leaf nodes 1430, 1432. Leaf node 1430 includes metadata that specifies a start time 1434 of “20”, a stop time 1436 of “40”, and a media start 1438 of “0”. Leaf node 1432 includes metadata that specifies a start time 1440 of “40”, a stop time 1442 of “60”, and a media start 1444 of “0”.
Although the invention has been described in language specific to structural features and/or methodological acts, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed invention.