Image analysis system and method using image recognition and text search

Information

  • Patent Grant
  • 9384408
  • Patent Number
    9,384,408
  • Date Filed
    Wednesday, January 12, 2011
  • Date Issued
    Tuesday, July 5, 2016
Abstract
Provided herein are systems and methods for obtaining contextual information of an image published on a digital medium. The methods and systems disclosed herein generally identify and analyze the image to obtain image descriptors corresponding to the image. The methods also identify and analyze text published proximate to the image to obtain textual descriptors, which function to describe, identify, index, or name the image or content within the image. The textual descriptors are then matched to the image descriptors to provide contextual information of the published image.
Description
SUMMARY

Provided herein are systems and methods for obtaining contextual information of an image published on a digital medium. The methods and systems disclosed herein generally identify and analyze the image to obtain image descriptors corresponding to the image. The methods also identify and analyze text published proximate to the image to obtain textual descriptors, which function to describe, identify, index, or name the image or content within the image. The textual descriptors are then matched to the image descriptors to provide contextual information of the published image.





BRIEF DESCRIPTION OF THE FIGURES

The accompanying drawings, which are incorporated herein, form part of the specification. Together with this written description, the drawings further serve to explain the principles of, and to enable a person skilled in the relevant art(s), to make and use the claimed systems and methods.



FIG. 1 is a flowchart illustrating one embodiment presented herein.



FIG. 2 is a high-level schematic diagram illustrating the method of FIG. 1.



FIG. 3 is a flowchart illustrating an exemplary embodiment of the method illustrated in FIG. 1.



FIG. 4 is a schematic drawing of a computer system used to implement the methods presented herein.





DEFINITIONS

Prior to describing the present invention in detail, it is useful to provide definitions for key terms and concepts used herein.


Contextual Advertising: a form of targeted advertising for advertisements and/or content appearing or displayed on digital media, such as websites or mobile browsers.


Contextual Information: data related to the contents and/or context of an image or content within the image; for example, but not limited to, a description, identification, index, or name of an image, or object, or scene, or person, or abstraction within the image.


Image: a visual representation of an object, or scene, or person, or abstraction.


In-image advertising: a form of contextual advertising where specific images on a digital medium are matched with related advertisements, and the related advertisements are then provided within or around the specific image.


Proximate: is intended to broadly mean “relatively adjacent, close, or near,” as would be understood by one of skill in the art. The term “proximate” should not be narrowly construed to require an absolute position or abutment. For example, “text proximate to an image,” means “text that is relatively near an image,” but not necessarily abutting an image or image frame. In another example, “text proximate to an image,” means “text on the same screen page or web page as an image.”


Publisher: party that owns, provides, and/or controls a digital content platform or medium; or third-party charged with providing, maintaining, and/or controlling a digital content platform or medium. Digital content platforms include websites, browser-based web applications, software applications, mobile device applications, TV widgets, and equivalents thereof.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.


It is to be understood that the systems and methods provided below are not limited to particular embodiments described and, as such, may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims. It should also be appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments are specifically embraced by the present invention and are disclosed herein just as if each and every combination was individually and explicitly disclosed, to the extent that such combinations embrace operable processes and/or devices/systems/kits.


Further, as will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete steps, components, and/or features, which may be readily separated from or combined with the steps, components, and/or features of any of the other embodiments without departing from the scope or spirit of the present invention. Any recited method may be carried out in the order of events recited or in any other order that is logically possible.


DETAILED DESCRIPTION

The present invention generally relates to digital media applications. More specifically, the present invention relates to systems and methods for obtaining contextual information of an image published on a digital medium.


In digital media applications, a common goal is to identify the context of published images. To such aim, image recognition software and algorithms have been developed to analyze an image, and identify and differentiate objects/content within the image. Such algorithms may, for example, be able to differentiate between a human and a piece of luggage. However, such algorithms have limited specificity. Questions arise as to whether such image recognition algorithms can differentiate between, for example, a man and a woman, or more specifically determine the identity of the person, with 100% confidence. Similarly, questions arise as to whether such image recognition algorithms can identify product brands or names with 100% confidence.


Image recognition specificity is important for applications such as, for example, image-indexed searches, contextual advertising, digital content monitoring, etc. For example, for applications such as contextual advertising, the context of a published image is valuable to an advertiser. If an advertiser is able to identify the context of an image with a high degree of specificity and confidence, the advertiser can accurately match advertisements with published images. For example, if a digital publisher posts an image of a field hockey player, current image recognition algorithms may be able to analyze the image and determine that the image contains a person holding a stick. However, such information lacks the contextual specificity needed to determine whether the scene is of a baseball player, an ice hockey player, or in fact a field hockey player. The presented systems and methods, however, use surrounding text to provide contextual information of a published image to identify the content and/or context of the image.


The methods provided herein generally identify an image published on a digital medium, and analyze the image with, for example, an image recognition engine to obtain image descriptors (or tags) corresponding to the image. The methods also, in series or parallel, identify text published proximate to the image on the digital medium, and analyze the text to obtain textual descriptors (or tags). The textual descriptors may vary, and in some instances function to describe, identify, index, or name the image or content within the image. The textual descriptors are then matched to the image descriptors to provide contextual information of the published image. The methods presented may be implemented on a computer-based system.


The following detailed description of the figures refers to the accompanying drawings that illustrate exemplary embodiments. Other embodiments are possible. Modifications may be made to the embodiments described herein without departing from the spirit and scope of the present invention. Therefore, the following detailed description is not meant to be limiting.



FIG. 1 is a flowchart illustrating a method 100 of obtaining contextual information of a published digital image. FIG. 2 is a high-level schematic diagram illustrating the method of FIG. 1. In essence, method 100 dissects a digital publication to analyze and obtain information from the image and the text. The information derived from the image and the text is then correlated to ultimately obtain contextual information of the image with a high degree of specificity and confidence. In some instances, for example, the methods presented herein provide for image recognition with a higher degree of specificity and confidence than simply analyzing the image alone.


More specifically, in step 101, an image published on a digital medium is identified. Digital mediums can include, but are not limited to: web pages, software applications, mobile applications, TV widgets, and equivalents thereto. The identification of the digital image may be performed by, for example: having the publisher provide the image; having a browser or application-enabled program (e.g., a JavaScript widget) “scrape” the digital publication (e.g., a “walking the DOM” function); having a browser or application plug-in identify images on the digital medium; having a web-crawler gather information from one or more websites; or other equivalent protocols.
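As an illustration of the scraping approach to step 101, the following minimal sketch walks the markup of a page and collects its images. It uses Python's standard html.parser module rather than a browser DOM, and the ImageFinder class, the find_images helper, and the sample page are hypothetical names introduced only for this example.

```python
from html.parser import HTMLParser


class ImageFinder(HTMLParser):
    """Collects the src and alt attributes of every <img> element seen."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(
                {"src": attr_map.get("src"), "alt": attr_map.get("alt", "")}
            )


def find_images(html):
    """Return one dict per image found in the given markup."""
    finder = ImageFinder()
    finder.feed(html)
    finder.close()
    return finder.images


# Hypothetical page fragment used only to exercise the sketch.
SAMPLE_PAGE = """
<article>
  <img src="/photos/premiere.jpg" alt="red carpet">
  <p>Jennifer Aniston attended a movie premiere.</p>
</article>
"""

images = find_images(SAMPLE_PAGE)
print(images)
```

A publisher feed, a browser plug-in, or a web-crawler could supply the same list of images; only the source of the markup would change.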


In step 103, the image is analyzed with an image recognition engine to obtain image descriptors or image tags. Any convenient image recognition algorithm (i.e., image recognition analysis program) may be employed. Image recognition algorithms include, but are not limited to, those described in, for example, Wang et al., “Content-based image indexing and searching using Daubechies' wavelets,” Int J Digit Libr (1997) 1:311-328, which is herein incorporated by reference in its entirety (with the exception of term definitions that contradict with the terms explicitly defined above). As described below, image descriptors or image tags may include: common nouns; descriptive adjectives; positional information, such as, a real position coordinate, a boundary area coordinate, or coordinates that outline content within the image; or equivalent descriptors. Image descriptors or image tags may be standardized and/or application-specific.


In step 105, the text published proximate to the image is identified. The identification of the text may be performed by, for example: having the publisher provide the text; having a browser or application-enabled program (e.g., a JavaScript widget) “scrape” the digital publication (e.g., a “walking the DOM” function); having a browser or application plug-in identify the text on the digital medium; having a web-crawler gather information from one or more websites; or other equivalent protocols. In step 107, the identified text of step 105 is analyzed to obtain textual descriptors or textual tags of the image. Textual descriptors or textual tags may include: proper and/or common nouns; descriptive adjectives; positional information, such as, text position, position relative to image, or key word location; or equivalent descriptors. Textual descriptors or textual tags may be standardized and/or application-specific. In one embodiment, the text analysis employs a language-independent proximity pattern matching algorithm. Steps 105 and 107 may be performed before, after, in series, or in parallel to steps 101 and 103. In step 109, the textual descriptors (or tags) and image descriptors (or tags) are matched to provide contextual tags.
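To make step 107 concrete, the sketch below pulls candidate textual tags out of a passage with a simple heuristic: a run of capitalized words, optionally followed by one known product noun. This is not the language-independent proximity pattern matching algorithm mentioned in the text, whose details are not given here. The PRODUCT_NOUNS vocabulary, the extract_textual_tags helper, and the sample story are assumptions made for illustration.

```python
import re

# Hypothetical vocabulary of product nouns that may follow a brand name.
PRODUCT_NOUNS = ("dress", "clutch", "purse")

# A run of capitalized words, optionally followed by one known product noun.
_TAG_PATTERN = re.compile(
    r"[A-Z][a-z]+(?: [A-Z][a-z]+)*(?: (?:%s)\b)?"
    % "|".join(map(re.escape, PRODUCT_NOUNS))
)


def extract_textual_tags(text):
    """Return candidate textual tags in the order they appear in the text."""
    return _TAG_PATTERN.findall(text)


# Sample passage modeled on the example story used with FIG. 3.
STORY = (
    "Jennifer Aniston attended a movie premiere wearing a black "
    "Valentino dress and a Ferragamo clutch."
)

textual_tags = extract_textual_tags(STORY)
print(textual_tags)
```

The heuristic is English-specific and would also pick up ordinary capitalized words at the start of a sentence, so a production text analysis engine would need a more robust approach.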



FIG. 3 is a flowchart illustrating an exemplary embodiment of the method 100 illustrated in FIG. 1. As shown, step 103 calls for processing the published image through an image recognition engine. The image recognition engine may then provide image tags and positional information. For example, for the image provided in FIG. 3, the image recognition engine may provide tags: (1) woman [x,y:x,y]; (2) black dress [x,y:x,y]; and (3) red purse [x,y:x,y]. The [x,y:x,y] coordinates identify where on the image the object (i.e., woman, black dress, or red purse) is located. In practice, if these tags are the only information that the image recognition engine can provide, such information is limited due to its lack of specificity. However, in step 107, the text proximate to the image is processed through a text recognition engine. The text is searched for textual tags; such as, for example, subject clues identifying, describing, and/or indexing the image. For example, the text proximate to the image may be a story about how Jennifer Aniston attended a movie premiere wearing a black Valentino dress and a Ferragamo clutch. The text recognition engine identifies the textual tags “Jennifer Aniston,” “Valentino dress,” and “Ferragamo clutch.” These textual tags are matched to the image tags, in step 109, to create contextual tags. The contextual tags provide additional specificity for identifying the content and/or context of the published image. In one application, for example, an advertiser can use the contextual tags to provide specific targeted advertisements (i.e., contextual advertising). A system of in-image advertising may also be employed based on the contextual tags. For example, if a web user were to mouse-over the black dress, an advertisement for a black Valentino dress may be provided as a “pop-up ad.”
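The mouse-over behavior described for FIG. 3 can be sketched as a lookup from a pointer position to a contextual tag. The example assumes that the [x,y:x,y] notation denotes the top-left and bottom-right corners of a bounding box. The ContextualTag class, the tag_at helper, and all coordinate values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ContextualTag:
    image_tag: str                   # generic label from the image recognition engine
    textual_tag: str                 # specific label found in the proximate text
    box: Tuple[int, int, int, int]   # assumed layout: (x1, y1, x2, y2)

    def contains(self, x: int, y: int) -> bool:
        x1, y1, x2, y2 = self.box
        return x1 <= x <= x2 and y1 <= y <= y2

    def area(self) -> int:
        x1, y1, x2, y2 = self.box
        return (x2 - x1) * (y2 - y1)


def tag_at(tags: Sequence[ContextualTag], x: int, y: int) -> Optional[ContextualTag]:
    """Return the smallest tagged region under the pointer, or None."""
    hits = [tag for tag in tags if tag.contains(x, y)]
    return min(hits, key=ContextualTag.area) if hits else None


# Made-up coordinates for the three objects in the example image.
TAGS = [
    ContextualTag("woman", "Jennifer Aniston", (100, 20, 300, 580)),
    ContextualTag("black dress", "Valentino dress", (130, 150, 270, 450)),
    ContextualTag("red purse", "Ferragamo clutch", (280, 300, 340, 380)),
]

hovered = tag_at(TAGS, 200, 300)
print(hovered.textual_tag if hovered else "no tagged object here")
```

Choosing the smallest enclosing box is a design choice for this sketch: the dress lies inside the region tagged as the woman, so the more specific region is preferred when both contain the pointer.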


In another embodiment, there is provided a method for determining a confidence level for an analyzed image. First, an image is submitted to an image recognition engine. Image tags (Set A) are the identified objects in the image. Next, associated text is submitted to an indexing engine. Text tags (Set B) are the subjects identified within the text. With reference to the above-provided example, Set A may include image tags: (1) woman; (2) black dress; and (3) red purse. Set B may include text tags: (1) Jennifer Aniston; (2) Valentino dress; and (3) Ferragamo clutch. Set A and Set B are processed through a matching engine, which may use object-type matching in order to identify which image tags match with corresponding text tags, resulting in Set M with matched tags. The “change in percentage” is defined as Δ%=100%/(AN+1), wherein AN is the total number of tags in Set A. Confidence level is defined as C=MN×Δ%, wherein AN is greater than 1, and wherein MN is the total number of object-type matches. Confidence level (C) can then be used as an indicator as to whether the content and/or context of the image has been positively identified. Confidence level (C) can also be used to make decisions, make the image interactive, and/or generally associate information related to the context of the image. Confidence level (C) can also be used to enable search indexing.
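The confidence calculation can be worked through in a few lines. The sketch below assumes that each tag arrives with an object-type label, which is one possible way to realize object-type matching. The type labels, the match_by_object_type helper, and the confidence_level helper are hypothetical names.

```python
def match_by_object_type(image_tags, text_tags):
    """Pair each image tag with the first text tag of the same object type.

    Both arguments are sequences of (label, object_type) tuples.
    The result corresponds to Set M in the description.
    """
    text_by_type = {}
    for label, object_type in text_tags:
        text_by_type.setdefault(object_type, label)
    return [
        (label, text_by_type[object_type])
        for label, object_type in image_tags
        if object_type in text_by_type
    ]


def confidence_level(image_tags, matches):
    """Compute C = MN x delta%, where delta% = 100% / (AN + 1)."""
    a_n = len(image_tags)
    if a_n <= 1:
        # The description states the formula for AN greater than 1.
        raise ValueError("the formula is defined for more than one image tag")
    delta_percent = 100.0 / (a_n + 1)
    return len(matches) * delta_percent


# Set A and Set B from the example, with assumed object-type labels.
SET_A = [("woman", "person"), ("black dress", "apparel"), ("red purse", "bag")]
SET_B = [
    ("Jennifer Aniston", "person"),
    ("Valentino dress", "apparel"),
    ("Ferragamo clutch", "bag"),
]

set_m = match_by_object_type(SET_A, SET_B)
confidence = confidence_level(SET_A, set_m)
print(set_m, confidence)
```

With three image tags the change in percentage is 100%/(3+1) = 25%, so three object-type matches give a confidence level of 75%. Under this formula the confidence level cannot reach 100%, even when every image tag is matched.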


In another embodiment, there is provided a method of obtaining contextual information for an image published on a digital medium, the method comprising: obtaining an image tag and a textual tag from an image published on a digital medium and text published proximate to the image; and matching the textual tags with the image tags to obtain contextual information for the image. The method may further comprise identifying the image published on a digital medium. The image may be analyzed with an image recognition engine. The method may further comprise identifying the text published proximate to the image. The textual tag may function to describe, identify, index, or name the image or content within the image.


In yet another embodiment, there is provided a method of obtaining contextual information of an image published on a digital medium, comprising: (a) identifying an image published on a digital medium; (b) analyzing the image with an image recognition engine to obtain an image tag corresponding to the image; and (c) identifying text published proximate to the image on the digital medium. The method further includes: (d) analyzing the text from step (c) to obtain a textual tag, descriptor, or subject; and (e) matching the textual tag, descriptor or subject with the image tag. The textual tag, descriptor, or subject may function to describe, identify, index, or name the image or content within the image. The matched image tag and textual tag, descriptor, or subject serve as contextual descriptors of the image, and may be provided and/or displayed to a user. The textual tag, descriptor, or subject may be a proper noun, and the image tag may be a common noun. In such embodiment, the method may further comprise: (f) maintaining a database of proper nouns and corresponding common nouns; and (g) conducting a match query against the database to identify matching image tags and textual tags, descriptors, or subjects.
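Steps (f) and (g) can be illustrated with a small in-memory database. The sketch uses Python's standard sqlite3 module. The noun_map table, its rows, and the rule that an image tag matches when it contains the common noun as a word are all assumptions made for this example.

```python
import sqlite3


def build_noun_database(pairs):
    """Create an in-memory table mapping proper nouns to common nouns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE noun_map (proper_noun TEXT, common_noun TEXT)")
    conn.executemany("INSERT INTO noun_map VALUES (?, ?)", pairs)
    conn.commit()
    return conn


def match_query(conn, proper_noun, image_tags):
    """Return the image tags that contain a common noun stored for the proper noun."""
    rows = conn.execute(
        "SELECT common_noun FROM noun_map WHERE proper_noun = ?",
        (proper_noun,),
    ).fetchall()
    common_nouns = {row[0] for row in rows}
    return [tag for tag in image_tags if common_nouns & set(tag.split())]


# Hypothetical rows; a real database would be maintained separately.
conn = build_noun_database(
    [
        ("Jennifer Aniston", "woman"),
        ("Valentino", "dress"),
        ("Ferragamo", "purse"),
        ("Ferragamo", "clutch"),
    ]
)

IMAGE_TAGS = ["woman", "black dress", "red purse"]
print(match_query(conn, "Valentino", IMAGE_TAGS))
```

Storing more than one common noun per proper noun lets a brand match several object types, which is why the query collects a set rather than a single value.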


Further, the image recognition engine may create at least one tag for the image, wherein the tag includes the image descriptor and positional information corresponding to the image descriptor. The positional information may include coordinates selected from the group consisting of: a real position coordinate, a boundary area coordinate, a coordinate that outlines content within the image, and any combination thereof. Further, step (d) may employ a language-independent proximity pattern matching algorithm.


In another embodiment, there is provided a system for obtaining contextual information of an image published on a digital medium, comprising means for identifying an image published on a digital medium, which may include sub-systems and sub-protocols for performing step 101, and equivalents thereto. The system further comprises means for analyzing the image to obtain an image tag corresponding to the image, which may include sub-systems and sub-protocols for performing step 103, and equivalents thereto. The system further comprises means for identifying text published proximate to the image on the digital medium, which may include sub-systems and sub-protocols for performing step 105, and equivalents thereto. The system further includes means for analyzing the text to obtain a textual tag, descriptor, or subject that functions to describe, identify, index, or name the image or content within the image, which may include sub-systems and sub-protocols for performing step 107, and equivalents thereto. The system further includes means for linking the textual tag, descriptor, or subject with the image tag, which may include sub-systems and sub-protocols for performing step 109, and equivalents thereto. The system may further include: means for determining a confidence level of an analyzed image.


In still another embodiment, there is provided a system for obtaining contextual information of an image published on a digital medium, the system comprising: an identification module configured to identify an image published on a digital medium and text published proximate to the image; and a processor that receives and analyzes the image and text to obtain a contextual descriptor by matching at least one image tag with at least one textual tag corresponding to the image. The textual descriptor may function to describe, identify, index, or name the image or content within the image. The processor may be further configured to determine a confidence level for the matched image tag(s) and textual tag(s). The identification module may be a processing module for receiving an image from a publisher, a browser, an application-enabled program (e.g., a JavaScript widget), a browser or application plug-in, a web-crawler, or other equivalent system.


The methods and systems provided herein find use in a variety of applications. One application of interest is contextual advertising. In one embodiment, for example, there is provided a method of contextual advertising, comprising: (a) identifying an image published on a digital medium; (b) analyzing the image with an image recognition engine to obtain an image tag corresponding to the image; and (c) identifying text published proximate to the image on the digital medium. The method further includes (d) analyzing the text from step (c) to obtain a textual tag, descriptor, or subject that functions to describe, identify, index, or name the image or content within the image; (e) matching the textual tag, descriptor, or subject with the image tag, wherein the matched textual tag, descriptor, or subject and the image tag serve as a contextual tag for the image; and (f) providing an advertising creative based on the contextual tag. The contextual tags may include positional information corresponding to the image. The positional information may include: a real position coordinate, a boundary area coordinate, or coordinates that outline content within the image. The method may also employ a language-independent proximity pattern matching algorithm. Such method may also be used for in-image advertising.


In still another embodiment, the presented methods may be incorporated into a digital media advertising protocol as described in U.S. patent application Ser. No. 12/902,066, filed on Oct. 11, 2010, titled “System and Method for Selecting Creatives Based on Visual Similarity and Campaign Metrics,” which is herein incorporated by reference in its entirety (with the exception of term definitions that contradict with the terms explicitly defined above). For example, the present invention may be incorporated into a computer-implemented method of selecting an advertisement creative. The method includes the steps of: (a) receiving a request for a creative; (b) selecting a target creative from a database of creatives; and (c) providing the target creative in response to the request received in step (a). The request may be received from a publisher (e.g., a web publisher, or an application publisher), a web user, a web browser, a mobile application, or a merchant.


In practice, the target creative may be selected from the group consisting of a source creative and a plurality of replacement creatives. The plurality of replacement creatives includes creatives that are visually and contextually similar to the source creative. The replacement creatives may be visually and contextually similar to the source creative with a recognition confidence level of 0% or above, or of 50% or greater, or of 75% or greater, or of 85% or greater. The selection of the target creative may be based on factors selected from the group consisting of a source creative campaign metric, a recognition confidence level of one or more of the replacement creatives, a replacement creative campaign metric, a contextual similarity confidence level, a contextual tag, and any combination thereof. For example, the selection of the target creative may be based on which replacement creative is most visually and/or contextually similar to the source creative. The method may further include the step of cataloging the database of creatives based on visual and/or contextual similarity.
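One way to read the selection rule is sketched below: among the replacement creatives that meet a minimum recognition confidence level, the most similar one is chosen, and the source creative is used when none qualifies. The Creative class, the select_target helper, the fallback to the source creative, and the sample values are assumptions. Campaign metrics, which the description also lists as selection factors, are left out for brevity.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Creative:
    name: str
    confidence: float  # recognition confidence versus the source creative, in percent


def select_target(
    source: Creative,
    replacements: Sequence[Creative],
    min_confidence: float = 50.0,
) -> Creative:
    """Pick the most similar replacement at or above the threshold, else the source."""
    eligible = [c for c in replacements if c.confidence >= min_confidence]
    if not eligible:
        return source
    return max(eligible, key=lambda c: c.confidence)


# Hypothetical creatives used only to exercise the sketch.
SOURCE = Creative("source-dress-ad", 100.0)
REPLACEMENTS = [
    Creative("replacement-a", 40.0),
    Creative("replacement-b", 85.0),
    Creative("replacement-c", 60.0),
]

target = select_target(SOURCE, REPLACEMENTS, min_confidence=75.0)
print(target.name)
```

Raising min_confidence toward the 85% tier narrows the pool of replacements, while a threshold of 0% admits every replacement creative.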


In yet another embodiment, there is provided a computer-readable storage medium, comprising instructions executable by at least one processing device that, when executed, cause the processing device to: (a) identify an image published on a digital medium; (b) analyze the image with an image recognition engine to obtain an image tag corresponding to the image; and (c) identify text published proximate to the image on the digital medium. The instructions further cause the processing device to (d) analyze the text to obtain a textual tag, descriptor, or subject that functions to describe, identify, index, or name the image or content within the image; and (e) match the textual tag, descriptor, or subject with the image tag.


The presented systems and methods, or any part(s) or function(s) thereof, may be implemented using hardware, software, or a combination thereof, and may be implemented in one or more computer systems or other processing systems. For example, the presented methods may be implemented with the use of one or more dedicated ad servers. Where the presented methods refer to manipulations that are commonly associated with mental operations, such as, for example, identifying, analyzing, obtaining, receiving or selecting, no such capability of a human operator is necessary. In other words, any and all of the operations described herein may be machine operations. Useful machines for performing the operation of the methods include general purpose digital computers or similar devices.


In fact, in one embodiment, the invention is directed toward one or more computer systems capable of carrying out the functionality described herein. An example of a computer system 400 is shown in FIG. 4. Computer system 400 includes one or more processors, such as processor 404. The processor 404 is connected to a communication infrastructure 406 (e.g., a communications bus, cross-over bar, or network). Computer system 400 can include a display interface 402 that forwards graphics, text, and other data from the communication infrastructure 406 (or from a frame buffer not shown) for display on a local or remote display unit 430.


Computer system 400 also includes a main memory 408, such as random access memory (RAM), and may also include a secondary memory 410. The secondary memory 410 may include, for example, a hard disk drive 412 and/or a removable storage drive 414, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, flash memory device, etc. The removable storage drive 414 reads from and/or writes to a removable storage unit 418 in a well known manner. Removable storage unit 418 represents a floppy disk, magnetic tape, optical disk, flash memory device, etc., which is read by and written to by removable storage drive 414. As will be appreciated, the removable storage unit 418 includes a computer usable storage medium having stored therein computer software and/or data.


In alternative embodiments, secondary memory 410 may include other similar devices for allowing computer programs or other instructions to be loaded into computer system 400. Such devices may include, for example, a removable storage unit 422 and an interface 420. Examples of such may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an erasable programmable read only memory (EPROM), or programmable read only memory (PROM)) and associated socket, and other removable storage units 422 and interfaces 420, which allow software and data to be transferred from the removable storage unit 422 to computer system 400.


Computer system 400 may also include a communications interface 424. Communications interface 424 allows software and data to be transferred between computer system 400 and external devices. Examples of communications interface 424 may include a modem, a network interface (such as an Ethernet card), a communications port, a Personal Computer Memory Card International Association (PCMCIA) slot and card, etc. Software and data transferred via communications interface 424 are in the form of signals 428 which may be electronic, electromagnetic, optical or other signals capable of being received by communications interface 424. These signals 428 are provided to communications interface 424 via a communications path (e.g., channel) 426. This channel 426 carries signals 428 and may be implemented using wire or cable, fiber optics, a telephone line, a cellular link, a radio frequency (RF) link, a wireless communication link, and other communications channels.


In this document, the terms “computer-readable storage medium,” “computer program medium,” and “computer usable medium” are used to generally refer to media such as removable storage drive 414, removable storage units 418, 422, data transmitted via communications interface 424, and/or a hard disk installed in hard disk drive 412. These computer program products provide software to computer system 400. Embodiments of the present invention are directed to such computer program products.


Computer programs (also referred to as computer control logic) are stored in main memory 408 and/or secondary memory 410. Computer programs may also be received via communications interface 424. Such computer programs, when executed, enable the computer system 400 to perform the features of the present invention, as discussed herein. In particular, the computer programs, when executed, enable the processor 404 to perform the features of the presented methods. Accordingly, such computer programs represent controllers of the computer system 400. Where appropriate, the processor 404, associated components, and equivalent systems and sub-systems thus serve as “means for” performing selected operations and functions.


In an embodiment where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 400 using removable storage drive 414, interface 420, hard drive 412, or communications interface 424. The control logic (software), when executed by the processor 404, causes the processor 404 to perform the functions and methods described herein.


In another embodiment, the methods are implemented primarily in hardware using, for example, hardware components such as application specific integrated circuits (ASICs). Implementation of the hardware state machine so as to perform the functions and methods described herein will be apparent to persons skilled in the relevant art(s). In yet another embodiment, the methods are implemented using a combination of both hardware and software.


Embodiments of the invention may also be implemented as instructions stored on a machine-readable medium, which may be read and executed by one or more processors. A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device). For example, a machine-readable medium may include read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.), and others. Further, firmware, software, routines, instructions may be described herein as performing certain actions. However, it should be appreciated that such descriptions are merely for convenience and that such actions in fact result from computing devices, processors, controllers, or other devices executing firmware, software, routines, instructions, etc.


CONCLUSION

The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Other modifications and variations may be possible in light of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to thereby enable others skilled in the art to best utilize the invention in various embodiments and various modifications as are suited to the particular use contemplated. It is intended that the appended claims be construed to include other alternative embodiments of the invention; including equivalent structures, components, methods, and means.


It is to be appreciated that the Detailed Description section, and not the Summary and Abstract sections, is intended to be used to interpret the claims. The Summary and Abstract sections may set forth one or more, but not all, exemplary embodiments of the present invention as contemplated by the inventor(s), and thus are not intended to limit the present invention and the appended claims in any way.

Claims
  • 1. A method of obtaining contextual information for an image published on a digital medium, the method comprising: obtaining, by an image recognition engine, a set of image tags from an image published on a digital medium, wherein each of the image tags is an object identified in the image; obtaining, by a text recognition engine, a set of textual tags from text published proximate to the image, wherein each of the textual tags is a subject identified in the text; and matching, by a matching engine, the set of textual tags with the set of image tags based on object-type matching to obtain contextual information for objects identified in the image.
  • 2. The method of claim 1, wherein the method further comprises identifying the image published on a digital medium.
  • 3. The method of claim 1, wherein the image is analyzed with an image recognition engine.
  • 4. The method of claim 1, wherein the method further comprises identifying the text published proximate to the image.
  • 5. The method of claim 1, wherein the textual tag functions to describe, identify, index, or name the image or content within the image.
  • 6. A method of obtaining contextual information for an image published on a digital medium, the method comprising: (a) identifying, by an image recognition engine, an image published on a digital medium; (b) analyzing the image with an image recognition engine to obtain a set of image tags corresponding to the image, wherein each of the image tags is an object identified in the image; (c) identifying, by a text recognition engine, text published proximate to the image on the digital medium; (d) analyzing the text from step (c) to obtain a set of textual tags, wherein each of the textual tags is a subject identified in the text; and (e) matching, by a matching engine, the set of textual tags with the set of image tags to obtain contextual information of the image, wherein the matched textual tags provide additional specificity for identifying objects in the image.
  • 7. The method of claim 6, wherein the textual tag is a proper noun.
  • 8. The method of claim 7, wherein the image tag is a common noun.
  • 9. The method of claim 8, further comprising: (f) maintaining a database of proper nouns and corresponding common nouns.
  • 10. The method of claim 9, further comprising: (g) conducting a match query against the database to identify matching image tags and textual tags.
  • 11. The method of claim 6, wherein the image recognition engine creates at least one tag for the image, and wherein the tag includes the image tag and positional information corresponding to the image tag.
  • 12. The method of claim 11, wherein the positional information includes coordinates selected from the group consisting of: a real position coordinate, a boundary area coordinate, a coordinate that outlines content within the image, and any combination thereof.
  • 13. The method of claim 6, wherein step (d) further comprises employing a language-independent proximity pattern matching algorithm.
  • 14. The method of claim 6, wherein the textual tag functions to describe, identify, index, or name the image or content within the image.
  • 15. A system for obtaining contextual information of an image published on a digital medium, the system comprising: an identification module configured to identify an image published on a digital medium and text published proximate to the image; and a processor that receives and analyzes the image and text to obtain contextual tags by matching a set of image tags and a set of textual tags corresponding to the image, wherein each of the image tags is an object identified in the image, and wherein the each of textual tags is a subject identified in the text.
  • 16. The system of claim 15, wherein the textual tag functions to describe, identify, index, or name the image or content within the image.
  • 17. A method of contextual advertising, the method comprising: (a) identifying, by an image recognition engine, an image published on a digital medium; (b) analyzing the image with an image recognition engine to obtain a set of image tags corresponding to the image, wherein each of the image tags is an object identified in the image; (c) identifying, by a text recognition engine, text published proximate to the image on the digital medium; (d) analyzing the text from step (c) to obtain a set of textual tags, wherein each of the textual tags is a subject identified in the text; (e) matching, by a matching engine, the set of textual tags with the set of image tags, wherein the linked textual tags and image tags serve as contextual tags for the image; and (f) providing an advertising creative based on the contextual tags.
  • 18. The method of claim 17, wherein the tag includes positional information corresponding to the image tag.
  • 19. The method of claim 18, wherein the positional information includes coordinates selected from the group consisting of: a real position coordinate, a boundary area coordinate, a coordinate that outlines content within the image, and any combination thereof.
  • 20. The method of claim 17, wherein step (d) further comprises employing a language-independent proximity pattern matching algorithm.
  • 21. The method of claim 17, wherein the textual tag functions to describe, identify, index, or name the image or content within the image.
  • 22. A non-transitory computer-readable storage medium, comprising: instructions executable by at least one processing device that, when executed, cause the processing device to (a) identify, by an image recognition engine, an image published on a digital medium; (b) analyze the image with an image recognition engine to obtain a set of image tags corresponding to the image, wherein each of the image tags is an object identified in the image; (c) identify, by a text recognition engine, text published proximate to the image on the digital medium; (d) analyze the text to obtain a set of textual tags that function to describe, identify, index, or name the image or content within the image, wherein each of the textual tags is a subject identified in the text; and (e) match, by a matching engine, the set of textual tags with the set of image tags.
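By way of non-limiting illustration only, the matching recited in claims 6 through 12 may be sketched as follows. The sketch assumes that textual tags are proper nouns, that image tags are common nouns carrying a boundary area, and that a lookup table relates the two. The names `NOUN_DATABASE`, `ImageTag`, and `match_tags`, the table entries, and the `(x, y, width, height)` box layout are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical database of proper nouns and corresponding common nouns
# (claim 9). The entries are invented for illustration.
NOUN_DATABASE: Dict[str, str] = {
    "Ford Mustang": "car",
    "Eiffel Tower": "tower",
    "Labrador": "dog",
}


@dataclass(frozen=True)
class ImageTag:
    """An object identified in the image, with positional information."""
    label: str                      # common noun, e.g. "car"
    box: Tuple[int, int, int, int]  # assumed (x, y, width, height) boundary area


def match_tags(
    image_tags: List[ImageTag],
    textual_tags: List[str],
    database: Dict[str, str] = NOUN_DATABASE,
) -> List[Tuple[str, ImageTag]]:
    """Pair each textual tag with the image tags of the same object type."""
    matches = []
    for text_tag in textual_tags:
        # Match query against the database (claim 10).
        object_type = database.get(text_tag)
        if object_type is None:
            continue  # no known object type for this proper noun
        for image_tag in image_tags:
            if image_tag.label.lower() == object_type:
                matches.append((text_tag, image_tag))
    return matches


if __name__ == "__main__":
    tags = [ImageTag("car", (10, 20, 200, 100)), ImageTag("tree", (300, 0, 80, 240))]
    for text_tag, image_tag in match_tags(tags, ["Ford Mustang", "Paris"]):
        print(text_tag, "->", image_tag.label, image_tag.box)
```

In this sketch a textual tag with no database entry is simply skipped, and a matched pair attaches the more specific proper noun to the located object. An actual matching engine may use any other suitable matching technique.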
Related Publications (1)
Number Date Country
20120177297 A1 Jul 2012 US