1. Field of the Invention
The present invention relates to a system and method of performing format negotiation to accomplish the transmission of multi-media data between a first node and a second node.
2. Introduction
Small handheld computing devices continue to improve in their complexity and data processing capabilities. Examples of such devices include PalmOne's Treo 600 computing device that includes capabilities for multi-media viewing and recording of pictures and video which can also include audio. These operate on various versions of the Palm Operating system. Other available devices include, for example, Hewlett Packard's mobile devices based on the Windows PC operating system, such as the HP ipaq hx4700, the HP rx3000 or the Dell Axim X50. Such devices will each have improved hardware capabilities which may include, for example, a VGA (480×640) display and central processors, video co-processors, means for recording multi-media data and more. Some devices have been called Portable Media Centers or Mobile Media Companions since they have hardware capable of providing higher quality video and audio capabilities.
In order for video to be transmitted from a video source, such as a DVD or streaming multi-media, to the hardware display or recordation destination, the source must be compatible with the hardware and software capabilities of the destination display/record device. This may be handled by way of multi-media standard formats, such as MPEG-2 or MPEG-4, in which the multi-media source is encoded according to the standard and the destination device/software is compatible with that standard. An example media software application is Windows Media Player 10 Mobile (MP 10 Mobile). MP 10 Mobile is a software application running on an operating system, such as the Windows CE operating system, that will manage digital rights management for content, display special features such as album cover art that are programmed into the source data, and receive appropriately formatted content and record the content or display the video on the display as well as produce the sound from the multi-media source.
There are limitations regarding how source content may be played using such an approach. For example, recorded TV shows may be played using MP 10 Mobile only if they were recorded with a Windows Media Center Edition PC, and then they need to be converted and transferred to the mobile computing device for viewing. MPEG and WMV video must be converted to be played on the mobile device as well. Other differences in the source and the playback device may relate to audio. A multi-media source may be recorded in stereo or in the 5.1 or 7.1 format, and the computing device may have a single speaker.
Many mobile computing devices are also periodically synchronized with a desktop PC. When this synchronization occurs, multi-media content may be transmitted to the mobile device for viewing. With MP 10 Mobile, upon synchronization, the Windows Media Player 10 on the PC automatically recognizes the mobile device capabilities and adjusts its conversion settings so that the device receives video formatted for its capabilities. In this regard, the software application Windows Media Player 10 running on the desktop computer's operating system must be programmed to recognize the mobile device and its capabilities and to perform the appropriate adjustments to match the mobile device capabilities.
When the software that manages the display of video runs separately from the operating system of the computing device, as in the MP 10 Mobile software, challenges will exist for third party software developers that develop software containing multimedia components. Third party software for video games, or software that may include video clips, must be developed to anticipate the capabilities of the destination computing device. For example, third party software may include MPEG encoded video. Given the variety of hardware devices and software applications such as MP 10 Mobile that are in the marketplace, the need for software developers to ensure that their software will be compatible and operable on a variety of devices increases the difficulty, complexity and cost of developing software. Such complexities can further inhibit or slow down the sales and success of both the third party software program and the computing devices on which it is designed to run.
What is needed in the art is a simplified method for ensuring that source capabilities and destination capabilities are compatible and that appropriate configuration is accomplished to establish a media stream between the source and destination.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth herein.
The present invention addresses the needs in the prior art for an improved system and method of managing the differences in source multimedia content and destination display/recording hardware and software capabilities. The present invention comprises a system, method and computer-readable media that perform format negotiation of parameters or format constraints associated with a source node and format constraints associated with a destination node. The format negotiation occurs within a graph consisting of two or more nodes.
The method aspect of the invention relates to establishing a multimedia format for communicating data between a source node and the destination node. The invention relates to any transformation of media data whether the data is in a raw form such as RGB or YUV video frame or encoded media data such as MPEG1 or MPEG4, for example. The source node may be, for example, a third party software program on a computing device which may manage the reception of images or video through a camera on the device or attached to the device. An example of the destination node is the computing device display or means for recording data and software that controls the presentation of data on the display. The method comprises receiving source node format data from a source node, receiving destination node format data and negotiating any unresolved format constraints between the source node format data and the destination node format data. In addition to the source node and destination node, a graph of nodes may consist of any number of nodes between the source and destination nodes. In this context, the method comprises propagating format constraints for each node to directly or indirectly connected nodes in order to accomplish format negotiation. The communication of multimedia data (which may be in the form of raw data, encoded data or any other form of multi-media data) may be in the context of playback and/or recording of data.
In order to describe the manner in which the above-recited and other advantages and features of the invention can be obtained, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
Various embodiments of the invention are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without parting from the spirit and scope of the invention.
The present invention provides for a system, method and computer-readable media that perform format negotiation between a source node for multi-media information and a destination node for playback or recording of multi-media information. In a more general statement, the invention provides for format negotiation between any two nodes that publish their possible multimedia formats within a system. A graph of nodes comprises a first source node, one or more intermediate nodes or filter nodes and a second destination node. In general, the intermediate nodes are nodes within the graph between the source and destination nodes that have responsibilities such as transforming or converting data from one format to another format. There may also be cascading format negotiations along a communication path from node to node or a propagation of a format from node to node as will be described more fully below. The graph is a representation of the interconnected nodes through which media data will pass through the system. The nodes consume, modify or produce media. Any node may have zero or more inputs and zero or more outputs and will have at least one connection to modules or other nodes outside the current node. An example multi-media node is one that transmits audio and/or video data. Inasmuch as one embodiment of the invention relates to a hardware device or system, the basic components associated with a computing device are discussed first.
With reference to
Although the exemplary environment described herein employs the hard disk, the removable magnetic disk and the removable optical disk, it should be appreciated by those skilled in the art that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memory (RAM), read only memory (ROM), and the like, may also be used in the exemplary operating environment.
As can be appreciated, the above description of hardware components is only provided as illustrative. For example, the basic components may differ between a desktop computer and a handheld or portable computing device. Those of skill in the art would understand how to modify or adjust the basic hardware components based on the particular hardware device (or group of networked computing devices) upon which the present invention is practiced.
As shown in
One aspect of the invention is that the destination node 202 simply publishes 208 its format constraints to achieve a communication link to another node or nodes for transmission or reception 212 of multimedia data. Each node within the graph will have certain constraints on what input media it accepts or can operate on and also publishes the format associated with the node output. A published format may be a logical expression represented by a vector, an object, a set of constraints or a set of values as format constraints. In one example, a format parameter may be a component of a media library and represents a multimedia format. Format parameters may be associated with several objects in a multimedia subsystem, such as media endpoints and codec objects (extractors, decoders, encoders and composers), to describe the data formats that a node can read or write. The format constraints may be packaged together into a vector object. The published format preferably comprises a format type. The following format constraints associated with identifying a format type within a header file are provided by way of example. In the framework, they may be declared in a multi-media format definition header file and be grouped into distinct types, such as MPEG4, raw audio, raw video, MIDI audio and so forth.
Another example criterion for the display/record node is that the display/recording of YUV is hardware accelerated which includes a constraint that the width of the video must be a multiple of 16. Wildcard parameters may also be published or the node may not publish any value for a particular parameter which indicates that any value is acceptable.
A source node 206 also publishes 210 its format constraints. An example source node 206 is a video decoder that outputs either YUV or RGB formats, or a video camcorder producing raw audio/video data. Such a decoder may be, for example, an MPEG decoder. Other format constraints associated with the display/record node 202 and source node 206 may also be available. For example, the video decoder may publish the following set of format constraints: (width=160 AND height=120 AND colorspace=RGB) OR (width=160 AND height=120 AND colorspace=YUV). This parameter set indicates that the width must be 160, the height must be 120 and the colorspace is flexible and may be either RGB or YUV. When a movie or video content is played or recorded, the source node 206 and destination node 202 must have some compatibility to achieve transmission or reception 214 of the multi-media data. According to an aspect of the present invention, the operating system (preferably) manages format negotiation to achieve this compatibility. The manner in which the nodes publish their format constraints enables the system or some other entity, including the nodes themselves, to negotiate within the constraints. The display/record node 202 publishes, for example: (colorspace=RGB, height={240, 120, 60}) OR (colorspace=YUV AND width is-multiple-of 16) to the operating system 204, thus indicating that the width and height may be any value, the colorspace must be either RGB or YUV and, if it is YUV, the width must be a multiple of 16. Boolean algebra guarantees that any given Boolean term can always be reduced or transformed to an equivalent term in normal form, either the disjunctive normal form (DNF) or the conjunctive normal form (CNF).
For example, if A, B, C, D, E are constraints associated with a node, i.e., A:=“width”<100; B:=“height”>20 and so forth, then the DNF form is: (A AND B) OR (C AND D AND E) (other combinations of course may be used as well) and the CNF form is: (A OR B) AND (C OR D OR E) (or in other combinations). The two forms shown above are not equivalent but serve to illustrate the nature of the normal forms. The preferred embodiment of the invention utilizes the DNF form for the media-format. A DNF form example would be: (“width”=100 AND “height”=200) OR (“width”=200 AND “height”=400). The example describes the media-format with two alternatives.
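The DNF media-format above can be pictured as a small data structure. The following is a minimal sketch only, assuming a format vector is held as a list of alternatives (OR) and each alternative as a dictionary of required parameter values (AND); the name `satisfies` and this representation are hypothetical and are not taken from the framework.

```python
# Hypothetical representation: a format vector is a list of alternatives (OR);
# each alternative maps a parameter name to its required value (AND).
media_format = [
    {"width": 100, "height": 200},
    {"width": 200, "height": 400},
]


def satisfies(concrete, vector):
    """Return True if the concrete format matches at least one alternative."""
    return any(
        all(concrete.get(key) == value for key, value in alternative.items())
        for alternative in vector
    )


print(satisfies({"width": 100, "height": 200}, media_format))
print(satisfies({"width": 100, "height": 400}, media_format))
```

A concrete format is accepted when every term of some alternative holds, which mirrors the OR-of-ANDs shape of the DNF.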
If a node with an output is connected to a node with an input in the process of generating a graph of nodes, there are both output constraints and input constraints. In order to form a connection, it is necessary to negotiate the two sets of constraints (if they can be negotiated at all). For example, C1: (“width” in {100,200}) AND (“height” in {50,60}) and C2: (“width”<150) AND (“height” !=60) can be negotiated by simply “AND”ing the boolean terms. In this example, C1 AND C2 => (“width”=100) AND (“height”=50), which produces a single fully resolved alternative of width=100 and height=50 to allow the communication of media data between the two nodes.
Format negotiation can fail in two ways: either the constraints are not restrictive enough, so that even after negotiation there are still values to choose from, or there is no way to satisfy both constraints simultaneously. If both constraints cannot simultaneously be satisfied, it is not possible to connect the two nodes. The following is an example of two published constraints that are impossible to negotiate: C1: “width”=100, C2: “width”=50. The following shows an example of two constraints with variables left after reduction: C1: “width”<100, C2: “width”<50. In this case, any width less than 50 will do, but some higher authority still needs to actually choose a concrete value for width satisfying this constraint.
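The C1 and C2 example can be checked mechanically. The sketch below is illustrative only: it assumes the candidate values for each parameter are enumerable, and the function `negotiate` and the `domains` dictionary are hypothetical names introduced here.

```python
from itertools import product


def negotiate(domains, *constraints):
    """Return every assignment drawn from the domains that satisfies all constraints."""
    keys = sorted(domains)
    results = []
    for values in product(*(domains[key] for key in keys)):
        candidate = dict(zip(keys, values))
        # "AND"ing the constraints: a candidate survives only if all of them hold.
        if all(constraint(candidate) for constraint in constraints):
            results.append(candidate)
    return results


def c1(fmt):
    return fmt["width"] in {100, 200} and fmt["height"] in {50, 60}


def c2(fmt):
    return fmt["width"] < 150 and fmt["height"] != 60


domains = {"width": [100, 200], "height": [50, 60]}
resolved = negotiate(domains, c1, c2)
print(resolved)
```

Exactly one candidate survives, which corresponds to the single fully resolved alternative of width=100 and height=50.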
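The two failure cases can be told apart by counting the values that survive negotiation. This is a sketch under the assumption that widths are integers drawn from a finite candidate range; `classify` and the outcome labels are hypothetical.

```python
def classify(candidates, *constraints):
    """Label a negotiation outcome and return the surviving values."""
    matches = [v for v in candidates if all(c(v) for c in constraints)]
    if not matches:
        return "unsatisfiable", matches      # the nodes cannot be connected
    if len(matches) > 1:
        return "underconstrained", matches   # a higher authority must choose
    return "resolved", matches


widths = range(1, 201)  # assumption: integer widths from 1 to 200

impossible = classify(widths, lambda w: w == 100, lambda w: w == 50)
ambiguous = classify(widths, lambda w: w < 100, lambda w: w < 50)
print(impossible[0], ambiguous[0], len(ambiguous[1]))
```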
The examples above typically apply to format negotiation between two nodes. However, as will be discussed below, if the information of the entire graph or at least a significant portion of it is known ahead of time, more sophisticated format negotiation can be performed.
The source node publishes its information and the operating system, framework or other module such as a remote server or proxy server receives all the published data and performs format negotiation to achieve a connection between the source and the destination nodes. The constraints are preferably a well-known name or value such as “colorspace”, a relationship such as equals or less than, and a constant value. In the preferred embodiment of the invention, no relationship between the names is allowed in the format constraint publication.
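A single constraint of this kind (a name, a relationship and a constant) might be modeled as follows. This is a sketch only; the `Constraint` type and the `RELATIONS` table are hypothetical and do not describe the framework's actual interface.

```python
import operator
from typing import Any, NamedTuple

# Hypothetical table of supported relationships.
RELATIONS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


class Constraint(NamedTuple):
    name: str        # well-known name, e.g. "colorspace"
    relation: str    # relationship, e.g. "=" or "<"
    constant: Any    # constant value; never another parameter name

    def holds(self, fmt):
        """Return True if the concrete format satisfies this constraint."""
        if self.name not in fmt:
            return False
        return RELATIONS[self.relation](fmt[self.name], self.constant)


rgb_only = Constraint("colorspace", "=", "RGB")
narrow = Constraint("width", "<", 100)
print(rgb_only.holds({"colorspace": "RGB"}), narrow.holds({"width": 160}))
```

Because the right-hand side is always a constant, each constraint can be evaluated on its own, without reference to any other parameter.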
A strength of the invention is that a format constraint published by a node may contain any number of parameters as defined by the node and not the framework. The framework itself does not attempt to describe each parameter that may be associated with a constraint. In this manner, because the framework has a purpose of resolving format constraints instead of defining a particular set of constraints, the invention enables an expandable capability for communication of media data between any number of nodes connected in a graph.
Given the above example, the operating system 204 analyzes the published format constraints for the source node 206 and destination node 202 and produces the resulting negotiated format as: (width=160 AND height=120 AND colorspace=YUV). In another example, assume that a video having a 120×90 size is to be played. With that width and height, the resulting negotiated format would be: (width=120 AND height=90 AND colorspace=RGB) because 120 is not a multiple of 16, and so the YUV video overlay cannot be used. Therefore, the system is forced to use the RGB colorspace.
As mentioned above, it is preferable in the present invention that the operating system or basic framework, and not the nodes or a separate software application such as MP 10 Mobile, decides which format to use. One benefit of this approach is that third party software developers only need to publish the format constraints and source node data when their software will run on a framework that performs format negotiation. This greatly simplifies the coding process for multi-media applications and reduces development time and costs. With the published format constraints, the framework negotiates a compromise format for the playback of the multi-media information. As mentioned above, the framework that actually performs the format negotiation is preferably the operating system but may be any separate module (hardware or software), and the negotiation may be performed remotely or on a proxy server.
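The two negotiated results can be reproduced with a short sketch. It assumes the display accepts any width and height, as the prose reading of the display constraints states, and it assumes the hardware-accelerated YUV path is preferred whenever it is compatible; the preference order and all function names are hypothetical.

```python
def decoder_alternatives(width, height):
    """Source node: fixed size, colorspace may be RGB or YUV."""
    return [
        {"width": width, "height": height, "colorspace": colorspace}
        for colorspace in ("RGB", "YUV")
    ]


def display_accepts(fmt):
    """Destination node: RGB at any size, or YUV when width is a multiple of 16."""
    if fmt["colorspace"] == "RGB":
        return True
    return fmt["colorspace"] == "YUV" and fmt["width"] % 16 == 0


# Assumption: prefer the hardware-accelerated YUV path when it is available.
PREFERENCE = ("YUV", "RGB")


def negotiate(width, height):
    """Return the preferred format acceptable to both nodes, or None."""
    compatible = [f for f in decoder_alternatives(width, height) if display_accepts(f)]
    compatible.sort(key=lambda f: PREFERENCE.index(f["colorspace"]))
    return compatible[0] if compatible else None


print(negotiate(160, 120))
print(negotiate(120, 90))
```

For the 160×120 video both colorspaces are compatible and the preference selects YUV; for the 120×90 video the YUV alternative is rejected, leaving RGB.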
There are many variations on how the format constraints may be published by the nodes. In addition to the examples set forth above regarding the form of the constraints, the format constraints may be more complicated logical expressions as well. In this regard, the format constraints are preferably represented in a disjunctive normal form such as:
In other words, a logical expression may be a format constraint that is a set of terms conjoined by AND and OR. In this context, a format constraint may comprise an AND'd set of terms, wherein a format vector comprises an OR'd set of format constraints or format alternatives. A parameter in a format constraint may be related to a value or a range of values but are preferably independent of other parameters. However, format alternatives may be used to get an equivalent effect in many cases. For example, if “x” and “y” may take the value of 1 or 2, but may not be equal, this can be expressed as follows: ((x=1) AND (y=2)) OR ((x=2) AND (y=1)).
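The rewriting of a cross-parameter relation into format alternatives can be done by enumeration when the value sets are small. The following sketch assumes finite value sets; `expand_relation` is a hypothetical helper and not part of the framework.

```python
from itertools import product


def expand_relation(domains, relation):
    """Enumerate the alternatives (OR'd) that satisfy a relation between parameters."""
    keys = list(domains)
    alternatives = []
    for values in product(*domains.values()):
        candidate = dict(zip(keys, values))
        if relation(candidate):
            alternatives.append(candidate)
    return alternatives


# "x" and "y" may each be 1 or 2, but may not be equal.
alternatives = expand_relation(
    {"x": [1, 2], "y": [1, 2]},
    lambda fmt: fmt["x"] != fmt["y"],
)
print(alternatives)
```

Each returned dictionary is one AND'd format constraint, and the list as a whole is the OR'd format vector, so no published term needs to refer to another parameter.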
With this flexibility, nodes may publish not only logical expressions but requirements, preferences, alternative values, wireless-related values, quality of service, power consumption, pricing plans and any other variations. The algorithms associated with the framework can therefore maximize the resulting set of format constraints for displaying or recording the data based on the ability to resolve the constraints in the parameter sets and to maximize the ultimate transmission of data from the source node to the destination node. As mentioned above, in previous systems, the software applications were responsible for doing the negotiation and capability analysis. In the present invention, the “nodes”, whether they are a software application, display device, recording device and so forth, only need to publish their format constraints and the framework manages all the negotiation.
With regards to format preferences, a developer may create a format preference object with variables such as a key, a value and a weight. The key specifies an attribute of a multimedia object (such as bit rates for MPEG and successor formats, number of frames processed per second, number of frames per buffer for audio or video data, width in native pixels of a video frame or graphics file, height of a video frame in native pixels of a video frame or graphics file, audio decoder parameters etc.). The value specifies the preferred value for that attribute and the weight specifies how much that value is preferred. The weight may only be used if two competing preferences are specified. In that case, the one of the greater weight is used.
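A preference with a key, a value and a weight, and the rule that the greater weight wins between competing preferences, might be sketched as follows. The class `FormatPreference`, the function `winning_preferences` and the key strings are hypothetical illustrations, not the framework's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatPreference:
    key: str        # attribute of the multimedia object, e.g. "frame_rate"
    value: object   # preferred value for that attribute
    weight: float   # how strongly the value is preferred


def winning_preferences(preferences):
    """For each key, keep the preference with the greatest weight."""
    best = {}
    for pref in preferences:
        current = best.get(pref.key)
        # The weight only matters when two preferences compete for one key.
        if current is None or pref.weight > current.weight:
            best[pref.key] = pref
    return best


prefs = [
    FormatPreference("frame_rate", 30, weight=0.8),
    FormatPreference("frame_rate", 15, weight=0.2),
    FormatPreference("width", 320, weight=0.5),
]
chosen = winning_preferences(prefs)
print(chosen["frame_rate"].value, chosen["width"].value)
```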
An example of using a format preference object is provided next. A format preference object is created within a media endpoint object to specify a format preference. Suppose a developer has a media node that can take any byte order but works more efficiently if the byte order is little endian. This node's endpoints would have a format that either did not specify a format term for the P_FORMATKEY_BYTE_ORDER attribute or had one with a wild value. To specify that the little-endian order is preferred, the node should pass a format preference object that uses the P_FORMATKEY_BYTE_ORDER key with a value of little endian. After creating the appropriate format constraints to publish from a node or an object, the developer does not interact with the system any further with regard to the formats. The format, format vector and endpoint correctly handle all format comparisons via the framework at the appropriate times.
The framework on the computing device preferably performs the negotiation, but other devices may perform it instead, such as a separate server or a proxy server that selects alternate content or transforms content format to optimize it for the target device. This aspect of the invention is shown in
In one aspect of the invention, the format negotiation may involve testing to see if the framework can change a parameter associated with a node based on a certain criteria. For example, if significant improvement may be gained if a particular parameter were changed for the source node, the negotiation may test whether the source node would accept that alteration even if the alteration is outside of the original published set of format constraints. The framework may also manage timing and buffering of data.
Software developers may be able to prepare software applications using a multimedia library comprising a multimedia API that enables them to access call services associated with the operating system. The library provides public SDK-level APIs that multimedia clients use to access multimedia features.
The method may further comprise, based on the source node format data and the destination node format data, selecting a communication node for communicating the multimedia data between the source node and the destination node. The black box 204 may be, for example, a media encoder or decoder (codec) component. Where intermediate nodes are in the communication path between a first node and a second node, cascading format negotiation may occur between sets of nodes throughout the communication path.
In one scenario where a communication node or communication nodes are used to connect a source node and a destination node, the method may comprise generating a graph of connections between source nodes and destination nodes. For example, a source device and destination device may publish their device requirements, and these format constraints may be used by the framework to determine which communication nodes and which codecs to use for connecting the source multimedia data to the destination device for playback.
Several characteristics of communication nodes (or media nodes) are discussed next. A communication node may have the responsibility for obtaining and transmitting buffers. A communication node has one or more endpoints that are portals through which a media node sends buffers to or receives buffers from another media node. Endpoints are either inputs or outputs. By connecting media node endpoints together, one can construct a graph of media nodes. The endpoints can also publish the list of media format constraints that the particular media node works with for format negotiation.
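Connecting endpoints into a graph can be sketched as below. This is a simplified illustration that assumes each endpoint publishes a plain set of format names and that a connection succeeds when the two sets overlap; the `Endpoint` and `Graph` classes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    node: str             # name of the owning media node
    direction: str        # "input" or "output"
    formats: frozenset    # formats the endpoint publishes


@dataclass
class Graph:
    edges: list = field(default_factory=list)

    def connect(self, output, input_):
        """Connect an output endpoint to an input endpoint if they share a format."""
        if output.direction != "output" or input_.direction != "input":
            raise ValueError("connect an output endpoint to an input endpoint")
        shared = output.formats & input_.formats
        if not shared:
            raise ValueError("no common format between endpoints")
        self.edges.append((output.node, input_.node, shared))
        return shared


decoder_out = Endpoint("decoder", "output", frozenset({"RGB", "YUV"}))
display_in = Endpoint("display", "input", frozenset({"YUV"}))

graph = Graph()
shared = graph.connect(decoder_out, display_in)
print(sorted(shared), len(graph.edges))
```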
An example graph is shown in
If a graph is generated of the negotiated connection, the method may comprise negotiating the unresolved format constraints according to the generated graph. In one aspect of the invention, the step of negotiating the unresolved format constraints comprises selecting one value from the range of acceptable values for the communication of data. If the source node has a range of acceptable values for a parameter and the destination node also has a range of acceptable values for the same parameter (such as width), then format negotiation may comprise selecting a first value for the source node from its range of acceptable values that matches a second value from the range of acceptable values for the destination node. For example, if a first node publishes a parameter “A<15” and the second node publishes “A>10”, then there is an overlapping range of compatible values from 11-14, and format negotiation will select the appropriate value from within the compatible range. Furthermore, optimization and maximization of a value such as quality of service, predefined preferences, bandwidth or pricing may further be employed to select the negotiated value. For example, if a predefined preference based on some criteria establishes that A is preferably within the range of 8-11, then the format negotiation would select a value of A based on the compatible range 11-14 and the known preference.
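The range example can be worked through in a few lines. The sketch assumes A takes integer values and that the preference is applied by choosing the compatible value closest to the preferred range; both the selection rule and the function names are hypothetical.

```python
def compatible_values(candidates, *constraints):
    """Return the candidate values that satisfy every published constraint."""
    return [v for v in candidates if all(c(v) for c in constraints)]


def select(values, preferred_low, preferred_high):
    """Pick the compatible value closest to the preferred range."""
    def distance(value):
        if value < preferred_low:
            return preferred_low - value
        if value > preferred_high:
            return value - preferred_high
        return 0
    return min(values, key=distance)


# First node publishes A<15, second node publishes A>10 (integer A assumed).
overlap = compatible_values(range(0, 32), lambda a: a < 15, lambda a: a > 10)
chosen = select(overlap, preferred_low=8, preferred_high=11)
print(overlap, chosen)
```

Under these assumptions the only value that lies in both the compatible range 11-14 and the preferred range 8-11 is 11, so it is the value selected.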
In another aspect of the invention, negotiating unresolved format constraints further comprises identifying alternate contents or alternately transforming content formats to optimize content for the destination node. As with the other aspects of the invention, these steps may be practiced on the framework, a remote server or proxy server used to identify the alternate content or alternate transform content formats.
In yet another aspect of the invention, shown in
Selection of which filter nodes to use may comprise several steps, such as (1) identifying the source data format (such as MPEG) and finding the correct nodes for further processing, and (2) receiving the specification of what nodes support what formats and analyzing other format constraints associated with each node. In this regard, once multiple nodes have been identified, then multiple format negotiations between any two nodes will occur to complete the connection between the source node and the destination node.
An example will further illustrate multiple format negotiations and also data transformation based on the format negotiation. Assume the source node 502 will output audio data and publishes the appropriate format constraints. Format negotiation may or may not take place between the source node 502 and the first filter node 504. The node 504 would publish that it will be outputting MPEG encoded audio data and the system would match node 504 with a decoder node 506 that can handle (decode) the particular MPEG data. Assume that destination node 508 is a single speaker. The node 506 then would publish to the destination node 508 that it will be outputting raw PCM data in, for example, two channels for stereo sound. Output parameters such as the sample rate of the audio are example format constraints. Alterations to the data may occur as part of the negotiation. If the destination node 508 is a single speaker, and the raw PCM output from decoder node 506 is two speaker channels, then, based on the format negotiation process which identifies that the speaker node 508 has a single channel, the stereo channels may be downmixed into a single channel before transmission to the destination node 508. In this manner, the format negotiation enables the appropriate data transformation to match the data with the destination.
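The downmix step triggered by the channel-count mismatch can be sketched as follows. Averaging the left and right samples is one common way to downmix and is an assumption here, as are the function names and the integer PCM sample representation.

```python
def needs_downmix(source_channels, destination_channels):
    """Negotiation outcome: a transformation is needed when the source has more channels."""
    return source_channels > destination_channels


def downmix_to_mono(stereo_frames):
    """Average each (left, right) pair of integer PCM samples into one sample."""
    return [(left + right) // 2 for left, right in stereo_frames]


stereo = [(100, 200), (-50, 50), (0, 0)]
if needs_downmix(source_channels=2, destination_channels=1):
    mono = downmix_to_mono(stereo)
else:
    mono = stereo
print(mono)
```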
The steps (602)-(606) are repeated for each new node that requests connection to the graph. This process continues until the source node, all intervening media or connection nodes, and a destination node are connected in a graph with compatible format constraints. After each new node is connected to the graph, the method further comprises resolving all format constraints for each node in the graph. This approach enables in a single atomic transaction the full resolution of all the format constraints across the graph for each node or each node media endpoint.
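The idea of re-resolving the whole graph each time a node is added, and treating the addition as all-or-nothing, can be sketched for a simple chain. The sketch assumes a single parameter (for example a sample rate) that passes unchanged through every node, so that resolution reduces to a set intersection; `resolve` and `add_node` are hypothetical names and do not correspond to the numbered steps.

```python
def resolve(chain):
    """Intersect the allowed values published by every node in the chain."""
    allowed = None
    for node_values in chain:
        allowed = set(node_values) if allowed is None else allowed & set(node_values)
    return allowed or set()


def add_node(chain, node_values):
    """Return (new_chain, True) if the graph still resolves, else (chain, False)."""
    candidate = chain + [node_values]
    if not resolve(candidate):
        return chain, False   # reject the node; the existing graph is untouched
    return candidate, True


chain = []
chain, ok_source = add_node(chain, {44100, 48000})
chain, ok_filter = add_node(chain, {48000, 96000})
chain, ok_bad = add_node(chain, {22050})
print(ok_source, ok_filter, ok_bad, resolve(chain))
```

The third node is rejected because no value would satisfy all three nodes, and the two-node chain keeps its resolved value.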
The task to be performed by any given node in a graph is immaterial to the present invention. The propagation and resolution of format constraints within a graph is indifferent to whether a node is a codec, a filter, a recording device, or an abstraction of hardware on the computing device such as a video/audio input or output.
Embodiments within the scope of the present invention may also include computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or combination thereof) to a computer, the computer properly views the connection as a computer-readable medium. Thus, any such connection is properly termed a computer-readable medium. Combinations of the above should also be included within the scope of the computer-readable media.
Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, objects, components, and data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
Those of skill in the art will appreciate that other embodiments of the invention may be practiced in network computing environments with many types of computer system configurations, including personal computers, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Although the above description may contain specific details, they should not be construed as limiting the claims in any way. Other configurations of the described embodiments of the invention are part of the scope of this invention. For example, the format negotiation may exist on a single computing device and between nodes within that device, or the negotiation may occur over a communications network between nodes of the network. The format constraints may take the form of equations having various parameters and variables. Accordingly, the appended claims and their legal equivalents should only define the invention, rather than any specific examples given.
The present invention claims priority to U.S. Provisional Patent Application No. 60/543,108 filed on Feb. 9, 2004, and U.S. Provisional Patent Application No. 60/543,356, filed on Feb. 9, 2004, the contents of each of these cases are incorporated herein by reference. The present application relates to the following applications: (1) Ser. No. ______ Attorney Docket No. 4002.Palm.PSI, entitled “A Method And Graphics Subsystem For A Computing Device”; (2) Ser. No. ______ Attorney Docket No. 4003.Palm.PSI, entitled “A Method And System For A Security Model For A Computing Device”; and (3) Ser. No. ______ Attorney Docket 4004.Palm.PSI, entitled “A System And Method Of Managing Connections With An Available Network” each of which are filed on the same day as the present application; the contents of each Application are incorporated herein by reference.
Number | Date | Country
---|---|---
60543108 | Feb 2004 | US
60543356 | Feb 2004 | US