The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings.
Methods and apparatuses for rapidly tagging and recalling metadata from moving or still images are described herein. Embodiments of the present application utilize temporally and/or spatially dynamic object tagging in moving images in conjunction with the use of a pointing device to allow quick access to said information. Embodiments of the present application further provide on-demand advertising where said dynamic metadata and key objects are partially sponsored by paying entities and corporations.
In the following description, numerous details are set forth to provide a more thorough explanation of embodiments of the present invention. It will be apparent, however, to one skilled in the art, that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification do not necessarily all refer to the same embodiment.
According to certain embodiments, an on-demand exchange of information is provided that allows viewers/consumers to interact in real-time with TV programs (or other media content) in order to gather relevant information about selected objects or images in a video scene. For example, a movie scene may present a group of high-society women enjoying coffee on a balcony, when suddenly Brad Pitt brings his red Lamborghini to a screeching halt in front of the appalled women. Consider now being able to point immediately to Brad Pitt's watch and have a cursor on the screen change shape and inform you in a call-out box—“Rolex—$300”, which upon further clicking instantly brings you to a website with the option to buy this watch, or other potentially useful information such as the company website, local vendors, watch types, the history of clocks, etc. Alternatively, pointing to Brad Pitt's head may call up the metadata—“Brad Pitt” with subsequent biographical data being available in the lower half of the screen. Other viewers may be more inclined to point to one of the women's dresses to be informed that this is a “Pierre Cardin blue dress—$299” and a subsequent click may show a list of similar dresses, prices, and locations (both local and online) where they may be bought. Optional features may include the pausing of the show during these information-gathering actions.
This model of embedding and retrieving data fulfills the two key attributes that define a good advertising model: 1) it is an “on-demand” service that fulfills the consumer's desire to be informed when and where he or she wants, while being “invisible” and non-invasive when the consumer wants to just enjoy the show, and 2) it is relevant, personalized, and targeted to the specific and immediate interests of the consumer, making it an enriching experience as well as a more efficient means of relevant information exchange.
It is evident with this pointing-based information exchange model that some degree of product placement in the media content may be required. This phenomenon is already becoming widespread. However, it is not an absolute requirement for this model because pointing to a specific object, such as a car, may bring up more generic descriptions of the object that may still lead to sponsored information about similar cars from different vendors as well as more generic information about the object.
There are several technological factors that have converged to make these concepts viable. First, digital content can now easily carry with it the simple metadata that would be required. With the standard processing power of content players, this metadata can now easily be made to dynamically associate with various objects on the viewer's screen. Second, direct, accurate, and fast pointing, which is a critical element of the implementation and viability of this model, is starting to become widely available. For example, for PC users watching TV at their desk, the computer mouse lends itself very well to quick pointing. For mobile devices such as cell phones and PDAs, touch screens are becoming ever more common and are natural tools for pointing. Finally, for the digital living room, absolute-pointer remote controls, such as vision-based devices, have become available that make pointing as easy, natural, and fast as pointing your finger. This is especially true when the content is displayed on a large, high-resolution digital TV screen.
In one embodiment, data having full descriptions and hyperlinks are tagged to specific objects in moving images, and the invisible hyperlinks move dynamically to continually track the associated object. In one embodiment, a pointing device can be used to point to objects in the scene, whether moving or stationary, and, through an appropriate action such as clicking or activating a button, to recall substantially immediately part or all of the metadata content that pertains to the object.
In one embodiment, the pointing device is a multi-dimensional free-space pointer where the pointing is direct and absolute in nature, similar to those described in co-pending U.S. patent application Ser. No. 11/187,435, filed Jul. 21, 2005, co-pending U.S. patent application Ser. No. 11/187,405, filed Jul. 21, 2005, co-pending U.S. patent application Ser. No. 11/187,387, filed Jul. 21, 2005, and co-pending U.S. Provisional Patent Application No. 60/831,735, filed Jul. 17, 2006. The disclosure of the above-identified applications is incorporated by reference herein in its entirety.
In one embodiment this metadata is strictly informative and yields results akin to a visual search query such as “What is this that I am pointing at?” In one embodiment, the data is wholly or partially sponsored and paid for to instantiate an on-demand advertising model. In one embodiment, the payment is proportional to the frequency of the searches. In one embodiment, the “point & search” patterns of users are logged for later use in, for example, modifying and tailoring the metadata content.
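The logging and frequency-proportional payment described above can be sketched as follows. This is a minimal illustration, not the described system's actual design: the class name `SearchLog`, the flat per-search rate, and the use of integer cents are all assumptions made for the example.

```python
from collections import Counter


class SearchLog:
    """Hypothetical log of "point & search" actions, keyed by tagged object."""

    def __init__(self):
        self.counts = Counter()

    def record(self, object_id):
        # One entry per search on a tagged object.
        self.counts[object_id] += 1

    def charges(self, rate_cents_per_search):
        # Payment proportional to search frequency: count times a flat rate.
        # The flat rate is an assumption; the text only says "proportional".
        return {obj: n * rate_cents_per_search for obj, n in self.counts.items()}


log = SearchLog()
for obj in ["watch", "dress", "watch"]:
    log.record(obj)
sponsor_charges = log.charges(rate_cents_per_search=5)
```

The same counts could also feed the later tailoring of metadata content, for example by ranking objects by how often they are searched.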
In one embodiment, a cursor may appear on the screen that changes color and/or shape when a valid tag or hyperlink exists. This feature is similar to that of static hyperlinks that may be embedded in certain web-page images. One difference is that now the tags are dynamically moving with the object, and may grow, shrink, and/or evolve with object size and/or shape, or may disappear and reappear with the object.
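One way to picture the cursor behavior is a per-frame hit test against tag regions that move and resize with the object. The sketch below assumes rectangular tag regions and a frame-indexed dictionary; the names `TagBox`, `tag_under_pointer`, and `cursor_shape` are hypothetical and the region shape is not specified by the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TagBox:
    """Assumed rectangular tag region, in screen pixels."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h


def tag_under_pointer(frames: Dict[int, Dict[str, TagBox]],
                      frame_idx: int, px: float, py: float) -> Optional[str]:
    # An object missing from a frame's dictionary is treated as off-screen,
    # so its tag disappears and reappears with the object.
    for object_id, box in frames.get(frame_idx, {}).items():
        if box.contains(px, py):
            return object_id
    return None


def cursor_shape(hit: Optional[str]) -> str:
    # The cursor changes shape only while it is over a valid tag.
    return "hand" if hit is not None else "arrow"


frames = {
    0: {"watch": TagBox(100, 100, 20, 20)},
    1: {"watch": TagBox(110, 100, 30, 30)},  # tag moved and grew
    2: {},                                   # object not visible
}
```

Because the lookup is keyed by frame, the same pointer position can hit a tag in one frame and miss it in the next.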
In one embodiment, the object that is pointed to may be selected by pressing a button on a remote control or pointing device. This action may subsequently log the “click” for later retrieval, or in the preferred embodiment it substantially immediately brings up on-screen information about the selected object.
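The two selection behaviors, immediate display versus logging for later retrieval, might be modeled as below. This is only a sketch under assumptions: the `Selector` class, its `immediate` flag, and the metadata dictionary are hypothetical names, and media time in seconds is an assumed way of stamping each click.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Selector:
    metadata: Dict[str, str]  # object id -> short description (hypothetical)
    immediate: bool = True    # True: show information now; False: log only
    history: List[Tuple[float, str]] = field(default_factory=list)

    def select(self, media_time: float, object_id: str) -> Optional[str]:
        # Every click is logged so it can be retrieved later.
        self.history.append((media_time, object_id))
        if self.immediate:
            # Return the text that would be shown on screen, if any exists.
            return self.metadata.get(object_id)
        return None


descriptions = {"watch": "Rolex"}
live = Selector(descriptions)
deferred = Selector(descriptions, immediate=False)
shown = live.select(12.5, "watch")
not_shown = deferred.select(12.5, "watch")
```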
Once an object is selected, some or all of the metadata associated with the object may become immediately visible in, for example, a pop-up graphical representation or menu. Alternatively the object selection may simply be recorded for later viewing. At this point the user may choose to receive more information about the object by, for example, clicking once more inside the call-out bubble. In one embodiment all “clicks” are logged in a click-history that the user can pull up at his convenience at a later time, as illustrated in
Returning now to the metadata content, it is desirable to the Service Provider that the tagging data be easy to generate, although this is irrelevant to the end user, i.e. the consumer of the service. In one embodiment, the tagging information consists of simple data files that can be specifically generated for different media content. In one embodiment this data consists of arrays of numbers arranged according to the rules laid out in
In one embodiment, the location data is generated by using a software program that allows the Service Provider to run the media content one or more times while pointing to the objects of interest. If, for example, the Service Provider simultaneously holds down specific keys on a keyboard that correspond to that object, the object's position is recorded (overwritten) in the corresponding object column. While the object is not visible on the screen, no key will be pressed and hence the default value of −1 will remain in the object column, signifying that the object is not present.
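The recording procedure can be sketched as a table with one row per frame and one column per object, initialized to the default of −1. The exact array layout is not reproduced here, so the row-per-frame structure, the (x, y) pair per cell, and the names `new_tag_table` and `record_pass` are assumptions made for illustration.

```python
ABSENT = -1  # default value from the text: object not present in this frame


def new_tag_table(num_frames, object_ids):
    # Assumed layout: one row per frame, one (x, y) cell per object.
    return [{obj: (ABSENT, ABSENT) for obj in object_ids}
            for _ in range(num_frames)]


def record_pass(table, samples, key_to_object):
    """Apply one viewing pass to the table.

    samples: iterable of (frame_idx, pointer_x, pointer_y, held_keys),
    where held_keys is the set of keys held down during that frame.
    """
    for frame_idx, px, py, held_keys in samples:
        for key in held_keys:
            obj = key_to_object.get(key)
            if obj is not None:
                # Overwrite, so a later pass can correct an earlier one.
                table[frame_idx][obj] = (px, py)
    return table


keys = {"w": "watch", "d": "dress"}
table = new_tag_table(4, ["watch", "dress"])
record_pass(table, [(0, 100, 120, {"w"}),
                    (1, 104, 121, {"w"}),
                    (2, 300, 50, set())], keys)  # no key held: stays -1
record_pass(table, [(1, 106, 122, {"w"})], keys)  # second pass overwrites
```

Running the content more than once, as the text allows, simply means calling `record_pass` again, with one object tracked per pass if the operator has a single pointer.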
Having discussed embodiments for how different objects moving around in video content may be easily tagged with time and location stamps and stored in “tagging” files, it is useful to discuss the actual descriptive metadata itself.
Thus, methods and apparatuses for rapidly tagging and recalling (via direct pointing) metadata from moving or still images have been described herein. Some portions of the preceding detailed descriptions have been presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Embodiments of the present invention also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), erasable programmable ROMs (EPROMs), electrically erasable programmable ROMs (EEPROMs), magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method operations. The required structure for a variety of these systems will appear from the description below. In addition, embodiments of the present invention are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of embodiments of the invention as described herein.
A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory (“ROM”); random access memory (“RAM”); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.
In the foregoing specification, embodiments of the invention have been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the invention as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.
This application claims the benefit of co-pending U.S. Provisional Application No. 60/840,881, filed Aug. 28, 2006, which is incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
60840881 | Aug 2006 | US