This invention pertains to multimedia authoring systems, software, and product distribution media. In particular, this invention automatically generates a multimedia presentation based on a user's stored media files, thereby automatically generating for a user a customized multimedia story.
It is widely acknowledged that viewing images in the form of a multimedia presentation, referred to herein as a “story”, or a hardcopy thematic album is much more compelling than browsing through a number of random hard-copy prints, or looking at a random series of static images presented sequentially using a slide projector, computer, or television. The selective addition of other elements to the presentation such as a sound track appropriate to the content of the images, the insertion of interesting transitions between the images, the addition of a video clip or the creation of various video-style special effects including fades and dissolves, image-collaging, backgrounds and borders, and colorization makes the presentation much more interesting to the viewer and can greatly enhance the emotional content of the images being presented. The proliferation in the home of new television-based viewing platforms able to accommodate multimedia, including DVD and Video CD players, also increases the demand for this type of presentation.
For the ordinary photographic consumer, the creation of a multimedia presentation or album of still images is not presently very convenient. The selection and layout of digital images can be a significant and time-consuming process. Even if the images are available in digital form, a consumer must have facility with multimedia authoring software tools such as Macromedia Director™ or Adobe Premiere™ in order to create such a presentation. These software tools, while very flexible, are aimed more at the professional presentation creator, have multiple options, and require a great deal of time and experience to develop the skill needed to use them to advantage. More recently, template-based multimedia presentation applications such as Photojam™, offered by Shockwave.com, or PC-based “movie making” applications such as Apple's iMovie™ have become available. While these applications can simplify the creation of multimedia presentations for a consumer, they do not help to automate many of the story making options. Current applications often require the user to select a presentation theme and to select the assets, such as pictures and music, that are used to automatically generate an image product. In addition, these applications offer no way to automatically generate an image product for special occasions, holidays, anniversaries, or other selected calendar events.
Thus, there remains a need for an automated authoring system where an inexperienced user can receive an automatically-generated multimedia story and obtain copies of the presentation over a variety of channels and in a variety of formats suitable for multiple types of presentation devices.
In answer to these and other needs, and in accordance with one preferred embodiment of the present invention, there is provided a method for automatically generating a customized story (or image product) of a set of digital media files provided by a user on a digital storage device, comprising the steps of analyzing the digital media files for semantic information, including metadata, and organizing the digital images in association with a selected presentation format and on a medium that can be viewed by the user, the format automatically chosen in accordance with the semantic and metadata information, or preselected by the user or by the computer system.
Another preferred embodiment of the present invention is a method, software, and a programmed computer system for automatic story-creation from a collection of assets (still images, video, music, public content) utilizing prescribed template rules applied to the collection. The template rules rely on metadata associated with the assets, personal profile and/or user preference data acquired from the user. Metadata can be in the form of EXIF data, index values from image understanding and classification algorithms, GPS data, and/or personal profile/preferences. These rules or a subset of them, when automatically applied to a collection within the system, will produce a story for rendering via a multimedia output engine. The story can be delivered to the user on a variety of storage media such as CDs, DVDs, magnetic discs, and portable flash memory media. The story can be transmitted via cellular networks, by satellite providers, or over local and wide area networks. The story can be received and viewed by the user on a variety of handheld display devices such as PDAs and cell phones. The story can be received at a home and displayed on a computer, television, or over theater style projection systems.
Another preferred embodiment of the invention comprises a method for automatically creating an image product comprising the steps of obtaining a plurality of digital media files associated with an event such as a birthday, holiday, anniversary, or other occasion; classifying the event based on analyzing the digital media files; automatically determining a format of an output image product based upon the analysis; and then selecting which ones of the digital media files will be included in accordance with requirements of said output image product.
Another preferred embodiment of the invention comprises a method for automatically analyzing a plurality of digital media files with particular attention to their associated metadata, which might include derived metadata. Based on this analysis, one preferred method involves automatically determining the occurrence and number of occurrences of substantially similar metadata elements among the digital media files. These are then automatically grouped based on the number of times a particular metadata element occurs. That information is then used for classifying the digital media files. An image product is generated using the digital media files having the most frequently occurring metadata elements incorporated therein.
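By way of illustration only, the frequency-based grouping described above can be sketched in Python as follows. The file names, metadata tags, and the function name `files_with_most_common_element` are hypothetical and chosen solely for this example; the embodiment does not prescribe any particular data layout or programming language.

```python
from collections import Counter

# Hypothetical media files, each mapped to a set of metadata elements
# (recorded or derived). Names and tags are illustrative assumptions.
media_files = {
    "img_001.jpg": {"beach", "person:Ann"},
    "img_002.jpg": {"beach", "person:Jane"},
    "img_003.jpg": {"indoor", "person:Ann"},
    "clip_001.mov": {"beach"},
}

def count_elements(files):
    """Count in how many files each metadata element occurs."""
    counts = Counter()
    for elements in files.values():
        counts.update(elements)
    return counts

def files_with_most_common_element(files):
    """Return the most frequent element and the files that carry it."""
    counts = count_elements(files)
    element, _ = counts.most_common(1)[0]
    selected = sorted(name for name, elems in files.items() if element in elems)
    return element, selected

element, selected = files_with_most_common_element(media_files)
```

In this sketch the element "beach" occurs in three of the four files, so those three files would be the candidates for the image product.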
Another preferred embodiment of the invention comprises a program storage device storing a computer program for execution on a computer system. The program is capable of automatically generating an image product utilizing a number of digital media files that are resident in the computer system. The program is designed to first detect an image product trigger which might be a calendar date, a user request for an image product, or an upload to the computer system of a plurality of digital media files such as images, sound files, video, etc. The program locates a plurality of digital media files associated with an event if it's a calendar event, for example, or, if the trigger is an upload of media files, the program will determine if the media files satisfy an output product format type. The program automatically classifies the plurality of digital media files based on analyzing metadata associated therewith and automatically selects those files, based on the classifying step, that satisfy an output product format type. The selected media files are ranked based on one or more of a variety of metrics, such as an image value index, and some or all of the ranked files are included in an appropriate image product format that is related to the event.
These, and other, aspects and objects of the present invention will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. It should be understood, however, that the following description, while indicating preferred embodiments of the present invention and numerous specific details thereof, is given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the present invention without departing from the spirit thereof, and the invention includes all such modifications.
With respect to
The bottom portion of
With reference to the front-end user interface, the user introduces selected assets into the system by activating the asset uploader 101. This component then communicates with the server-side asset import 104 component. The asset import component stores copies of the assets in the asset store 112 and informs the system manager 107 that it has completed the upload. In one preferred embodiment, communication between the asset import and system manager occurs via the database 113; however, each of the back-end components can be implemented to communicate directly with the system manager 107. For ease of illustration,
The semantic indexers 110 include metadata extraction mechanisms for extracting metadata included in the digital asset, as explained above, and recording it in the database. Other examples of such metadata would be the capture date and time, among many other examples as described herein. The indexers can also include complex algorithms that analyze a stored asset to generate more complex metadata. For example, these algorithms can include scene classifiers which identify or classify a scene into one or more scene types (e.g., beach or indoor) or one or more activities (e.g., running); face detection which is used to find as many faces as possible in image collections; and people recognition. People recognition is the identification of a person using facial features and/or other contextual information such as clothing identification. The indexers 110 also include algorithms that operate on sets of assets such as event segmenters which automatically sort, segment, and cluster an unorganized set of media into separate temporal events and sub-events. All of the generated metadata is recorded in the database 113 and is appropriately associated with its corresponding asset. In a preferred embodiment, the generated metadata may be stored in the triplestore 115, a type of database optimized for storing large quantities of unstructured data.
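One simple way to picture an event segmenter is a pass over capture times that starts a new event whenever the gap between consecutive assets exceeds a threshold. The Python sketch below assumes this gap-based heuristic and a hypothetical two-hour threshold; the indexers described above may use considerably more sophisticated clustering, and the function name `segment_events` is an assumption of this example.

```python
from datetime import datetime, timedelta

def segment_events(capture_times, max_gap=timedelta(hours=2)):
    """Group capture times into events, splitting on gaps larger than max_gap.

    The gap heuristic and the two-hour default are illustrative assumptions.
    """
    events = []
    for t in sorted(capture_times):
        if events and t - events[-1][-1] <= max_gap:
            events[-1].append(t)   # close enough in time: same event
        else:
            events.append([t])     # first asset or large gap: new event
    return events

times = [
    datetime(2007, 5, 12, 10, 0),
    datetime(2007, 5, 12, 10, 20),
    datetime(2007, 5, 12, 18, 5),
    datetime(2007, 5, 13, 9, 30),
    datetime(2007, 5, 13, 9, 45),
]
events = segment_events(times)
```

With these sample times the five assets fall into three temporal events.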
When the last semantic indexer has completed, or at least a sufficient number of indexers have completed, the system manager 107 will activate the story suggester 106 to determine if one or more appropriate stories should be created, which will result in generating an image product. The story suggester in turn will activate the inference engine 111 for evaluating the various rules stored in the rule base 114 to determine if any of the story rules stored therein can be satisfied. One preferred embodiment of the inference engine is the Prolog inference engine having the rule base 114 represented as a set of Prolog clauses stored in an XML file and evaluated by the Prolog engine as requested.
When the story suggester is searching for stories to create based upon an event, such as an anniversary, holiday, birthday, etc., the story suggester 106 requests that the inference engine 111 evaluate the Prolog clause suggestStoryByEvent, looking for valid bindings for several free variables, including but not necessarily limited to the user, the story type, the intended recipient, and the product type. If a valid set of variable bindings is identified, the story suggester will then obtain from the smart asset selector the appropriate set of assets to go with the suggested story, and then request that the product generator 108 create the desired product representation. The product generator will create one or more files of the appropriate format representing the image product, and store the resulting file(s) in the asset store 112. The system manager 107 is notified by the product generator when the image product has been generated, at which point the system manager alerts the story notifier service 105, which in turn causes the story notifier 103 to inform the user that a new product has been created. In addition to the notification methods described earlier, the notification may be in the form of a pop-up window on a display containing text and graphics information indicating that an image product has been created and is ready for viewing. The user may then view the product using the story viewer 102. The story viewer may be implemented as a browser such as Internet Explorer, or a video playback device such as Windows Media Player. In a preferred embodiment, the user has the option to request from the story viewer a hard-copy rendition of the product, such as a bound photo album, if appropriate. To display the product, the story viewer requests and obtains the necessary assets from the asset store 112.
The system manager may also launch the story suggester on a periodic basis, such as nightly, to determine if calendar event driven stories can be created from digital media files stored on the computer system. The reader will appreciate that alternative architectures may result in fundamentally the same behavior. For example, the story suggester 106 and smart asset selector 109 components may be combined into a single component, or the story suggester may directly invoke the smart asset selector to determine that the appropriate set of assets is available for a particular story.
With reference to
With reference to
With reference to
An embodiment of the present invention comprises a computer program executing on a computer system with a display that automatically creates a composite image product. A story theme is first chosen which defines a set of rules for selecting a number of assets to use in creating the image product. Based on the selected assets, a product representation is then selected which can include, for example, a hard-copy album, slide show, DVD, collage, multimedia presentation, screen saver, mug, t-shirt, greeting card, calendar, etc. These two steps are not completely independent; the product representation may impact the asset selection rules, or vice versa. An example story theme might be Mother's Day; a product representation might be a hard-copy album. The same asset selection rules may apply to other image product forms; for example, the images chosen to make a hard copy album might work just as well to make a DVD.
Part of the power of the program is that it allows automatic asset selection whereby the computer system selects a subset of images in an intelligent fashion so that, for example, all the pictures in a collection need not be included in the image product. The number of assets selected may be determined by the output product desired. For example, if a two minute multimedia presentation is selected at a transition rate of four seconds per slide, this would require thirty images. This constraint may be specified as part of a rule set.
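The arithmetic behind this constraint is simple division: 120 seconds at four seconds per slide yields thirty slides. A minimal Python sketch follows; the function name `required_asset_count` is hypothetical and not part of the described system.

```python
def required_asset_count(duration_seconds, seconds_per_slide):
    """Number of whole slides that fit in a presentation of the given length."""
    if seconds_per_slide <= 0:
        raise ValueError("seconds_per_slide must be positive")
    return duration_seconds // seconds_per_slide

# Two-minute presentation at four seconds per slide.
count = required_asset_count(2 * 60, 4)
```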
The computer system may generate image products based on calendar entries that identify significant dates. The dates may be personally significant, such as anniversaries or birthdays, or they may be holidays such as Mother's Day or New Year's Day. The data for these calendar dates may be input to the system by users or it may be inferred by the programmed computer system.
To illustrate calendar-driven stories, suppose user Alex is married, has young children and Mother's Day is May 13th. The programmed computer system can be set to automatically create an image product at a pre-selected time, for example, one week in advance of that date. The computer can be set to alert Alex that a Mother's Day image product has been generated by the computer system and is ready for viewing. The alert can be a pop-up on the computer screen generated by an RSS reader while Alex is using the computer, or it can be a text message sent to his cell phone using an SMS system, etc. The Mother's Day image story theme can be, for example, a multimedia product which includes pictures of Alex's wife and her family.
The specific logic for suggesting a Mother's Day story for a particular user in the preferred embodiment is expressed in Prolog, and has the English equivalent as follows:
R-1.1. Target date Date is a known recurring holiday Holiday
R-1.2. The Holiday is Mother's Day
R-1.3. The system user User is the spouse of the recipient Recipient
R-1.4. The recipient Recipient is a mother
Suppose the system manager 107 invokes the story suggester specifying the date May 13, 2007, and the rule base 114 includes the rule 0. The rule base may have other rules specifying different types of stories that are triggered based upon the date; the rule 0 is simply an example of one such rule. Assuming that Alex is a system user, and is married to Ann and Ann is a female parent, and that May 13, 2007 is the day Mother's Day was celebrated in the year 2007, the system will suggest that a Mother's Day Multimedia Album product be created for Alex, where the story type is Mother's Day Album. Facts such as the day that Mother's Day is celebrated in a given year may be represented as either an enumerated set of facts, one for each holiday and year, or using generalized rules. For example, in one preferred embodiment, the system explicitly has knowledge that Mother's Day is observed on the second Sunday of May.
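For illustration, conditions R-1.1 through R-1.4 and the generalized second-Sunday-of-May rule can be sketched in Python rather than Prolog. This is a minimal sketch under stated assumptions: the dictionaries `spouse_of` and `mothers` and the function names are hypothetical stand-ins for facts that the described system would hold in its database and rule base.

```python
from datetime import date, timedelta

def mothers_day(year):
    """Second Sunday of May in the given year."""
    may_first = date(year, 5, 1)
    # weekday(): Monday is 0 and Sunday is 6.
    days_to_first_sunday = (6 - may_first.weekday()) % 7
    return may_first + timedelta(days=days_to_first_sunday + 7)

# Hypothetical facts; a deployed system would read these from its database.
spouse_of = {"Alex": "Ann"}
mothers = {"Ann"}

def suggest_mothers_day_story(user, target_date):
    """Return (user, recipient, story type) if all four conditions hold, else None."""
    if target_date != mothers_day(target_date.year):    # R-1.1 and R-1.2
        return None
    recipient = spouse_of.get(user)                     # R-1.3
    if recipient is None or recipient not in mothers:   # R-1.4
        return None
    return (user, recipient, "Mother's Day Album")

suggestion = suggest_mothers_day_story("Alex", date(2007, 5, 13))
```

Unlike a Prolog engine, this sketch checks the conditions in a fixed order for one given user; an inference engine would instead search for every binding of user and recipient that satisfies the rule.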
The story type defines a set of rules used to pick the assets to use to make a particular product. The smart asset selector 109 executes the rule set requested by the story suggester to determine the appropriate set of assets for the product being created. In the preferred embodiment, the rules making up a rule set are expressed in Prolog, using a version of Prolog where clauses are written in a parenthesized prefix form known as S-expressions.
R-1.1. Begin with at most the two best pictures of the mother alone, shown as the rule 1001.
R-1.2. Next, at most the best three pictures of the mother with all children (no husband), shown as rule 1002.
R-1.3. Next, the best picture of the mother with each child individually from any year, shown as rule 1003.
R-1.4. Best pictures of the mother with her mother from any year (not shown).
R-1.5. Best pictures of the mother with family (children and husband) from past year (not shown).
R-1.6. Finally, at most two video clips, shown as rule 1004, where the video belongs to an event classified as type “Family Moments” and the video is less than 60 seconds in length.
“Best” may be defined according to a variety of programmed metrics, or a combination thereof, including various image value index (IVI) metrics. These criteria can be extended to other types of dates besides holidays. The above rules are merely exemplary; the Prolog language enables an arbitrary set of constraints to be defined. In a preferred embodiment, the exact definition of “best” is supplied by additional Prolog clauses.
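To make the flavor of such a rule set concrete, the following Python sketch applies three of the rules listed above (mother alone, mother with all children, and short video clips) to a small hypothetical collection. The asset records, the `score` field standing in for an image value index, and the function names are all assumptions made for this example; the preferred embodiment expresses the rules as Prolog clauses.

```python
# Hypothetical asset records; "score" stands in for an image value index.
assets = [
    {"id": "p1", "kind": "image", "people": {"Ann"}, "score": 0.9},
    {"id": "p2", "kind": "image", "people": {"Ann"}, "score": 0.7},
    {"id": "p3", "kind": "image", "people": {"Ann"}, "score": 0.8},
    {"id": "p4", "kind": "image", "people": {"Ann", "Jane"}, "score": 0.6},
    {"id": "v1", "kind": "video", "people": {"Ann", "Jane"}, "score": 0.5,
     "event_type": "Family Moments", "seconds": 45},
    {"id": "v2", "kind": "video", "people": {"Jane"}, "score": 0.9,
     "event_type": "Family Moments", "seconds": 90},
]

def best(candidates, limit):
    """Return the ids of up to `limit` candidates, highest score first."""
    ranked = sorted(candidates, key=lambda a: a["score"], reverse=True)
    return [a["id"] for a in ranked[:limit]]

def select_assets(assets, mother, children):
    """Apply three of the example rules in order and return the chosen ids."""
    images = [a for a in assets if a["kind"] == "image"]
    videos = [a for a in assets if a["kind"] == "video"]
    selected = []
    # At most the two best pictures of the mother alone.
    selected += best([a for a in images if a["people"] == {mother}], 2)
    # At most the three best pictures of the mother with all children only.
    with_children = {mother} | set(children)
    selected += best([a for a in images if a["people"] == with_children], 3)
    # At most two video clips under 60 seconds from a "Family Moments" event.
    selected += best([a for a in videos
                      if a["event_type"] == "Family Moments"
                      and a["seconds"] < 60], 2)
    return selected

selection = select_assets(assets, "Ann", {"Jane"})
```

Here the lowest-scoring solo picture and the 90-second clip are left out, which shows how a rule set yields a subset of the collection rather than every asset.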
The story suggester requests that the smart asset selector compute the set of assets matching the rule set “Mother's Day Album.” The smart asset selector in turn requests that the inference engine execute the associated rules, determining which assets satisfy the constraints specified by the rules. Continuing the previous example, given the rule set 0 as the rule set “Mother's Day Album”, which is shown in part in
A rule set specifies a set of assets. A rule set may also specify further constraints on the assets that are to be respected by the product generator. For example, a rule set may specify the order the assets are to be presented in the final product and/or how the assets are to be grouped. The scope of the invention includes all such constraints.
Another preferred embodiment of the present invention is in the form of an event driven story type. This story type is triggered based upon an upload of assets to the computer system. In one embodiment, the system, upon receipt of a set of assets, attempts to classify those assets as belonging to one or more event types. The system combines this event classification with additional information about the user to suggest a particular story type. In general, the programmed computer system includes:
an interest and activity ontology
a product catalog ontology, which associates specific product types with specific interests or activities
the ability to associate with people interests or activities from the interest and activity ontology.
The interest and activity ontology defines an extensible list of possible activities, interests and hobbies. For example, a subset of the ontology may include the following classes:
1.a) Indoor Sports
1.b) Outdoor Sports
2.a) Parties
2.b) Solemn Occasions
A full ontology class can be scaled to contain an arbitrary amount of information. The computer system, upon uploading of a set of assets, for example, a series of photos from a digital camera, attempts to first group those assets into events and then classify the events according to the interest and activity ontology. In one preferred embodiment, the programmed computer system classifies assets as belonging to one of the following example high level event types:
Outdoor Sports
Party
Family Moments
Vacation
These event types are selected because images can be categorized into these four categories using statistical techniques. These categories can be mapped to one or more classes from the previous activity and interest ontology. For example, the event type Outdoor Sports is mapped to the item 1.b Outdoor Sports in the ontology.
The product catalog likewise contains a set of possible product types, along with the activities/interests those products may be associated with:
Baseball Album (goes with baseball)
Soccer Album (goes with soccer)
Baseball DVD (goes with baseball)
Using this data, the system uses the following generalized rule:
Given the above, the system can suggest a themed story based upon an upload of a set of digital media assets. For example, suppose a father uploads a set of pictures from his daughter Jane's recent little league game, and the system knows the following information:
Jane likes baseball, known because either the system was explicitly told this by the user, or because the system was able to infer this information.
The baseball product is associated with the activity baseball, known because the manufacturer or vendor of the product has associated that metadata as part of the product description.
Baseball is a type of outdoor sport, which is a type of sport, known from an ontology of activities and interests that the system has been explicitly told, such as the previous interest and activity ontology.
The specific logic for picking a story based on automatically selecting a theme associated with a set of pictures is as follows in one preferred embodiment:
This rule, along with many other such rules, is stored in the rule repository 114 and executed by the inference engine 111 when requested by the story suggester 106, as illustrated in
With reference to
The previously described inference engine 111 of
Event E1 513 is owned by user Alex 501, as shown by link 514, so Alex satisfies rule clause 0. Event E1 contains pictures P1 through P1, 518. Moreover, Event E1 has activity type Outdoor Sports, shown by nodes 513 and 510 and “classifiedAs” link 512. Consequently, rule clause 0 is satisfied by binding the variable EventType to Outdoor Sports.
A set of pictures making up an event is considered to feature a particular person if that person is portrayed in the pictures. More complex definitions of what it means for a set of pictures to feature a person may be defined to require that the person be predominantly portrayed in those pictures, for example, appearing in a majority of the pictures, etc. Using the simple definition that an event features a person if the person appears in a picture belonging to the event, the rule 0 is satisfied by binding the variable Person to Jane, in light of the statement represented by 518, 515 and 503. Clause 0 is satisfied by binding User to Alex, supported by the statement represented by 501, 502 and 503, that Alex is a parent of Jane. Clause 0 is satisfied by binding ActivityType to the class baseball, supported by the statement represented by 503, 504 and 506, that Jane likes baseball. Given the binding of ActivityType to baseball, clause 0 is satisfied by binding Product to the baseball album, using 519, 520 and 506. Given that baseball is a subclass of Outdoor Sports (506, 505, 507), the variable binding of ActivityType to baseball and EventType to Outdoor Sports satisfies clause 0, and so the entire rule 0 is satisfied given the variable binding of User to Alex and Product to baseball album.
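The binding walk-through above can be pictured as a search over subject-predicate-object statements. The Python sketch below is illustrative only: of the predicate names used, only "classifiedAs" appears in the description, and the others ("ownedBy", "features", "parentOf", "likes", "goesWith", "subClassOf"), along with the function names, are assumptions of this example. It also checks only direct subclass links, whereas a fuller ontology would follow chains of subclasses.

```python
# Statements as (subject, predicate, object) triples. Only "classifiedAs"
# is named in the text; the other predicate names are assumptions.
triples = {
    ("E1", "ownedBy", "Alex"),
    ("E1", "classifiedAs", "Outdoor Sports"),
    ("E1", "features", "Jane"),
    ("Alex", "parentOf", "Jane"),
    ("Jane", "likes", "baseball"),
    ("Baseball Album", "goesWith", "baseball"),
    ("Soccer Album", "goesWith", "soccer"),
    ("baseball", "subClassOf", "Outdoor Sports"),
    ("soccer", "subClassOf", "Outdoor Sports"),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a known statement."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is a known statement."""
    return {s for s, p, o in triples if p == predicate and o == obj}

def suggest_products(event):
    """Yield (user, product) pairs whose bindings satisfy every condition."""
    for user in objects(event, "ownedBy"):
        for event_type in objects(event, "classifiedAs"):
            for person in objects(event, "features"):
                if person not in objects(user, "parentOf"):
                    continue
                for activity in objects(person, "likes"):
                    # The liked activity must fall under the event's type.
                    if event_type not in objects(activity, "subClassOf"):
                        continue
                    for product in subjects("goesWith", activity):
                        yield user, product

suggestions = sorted(suggest_products("E1"))
```

With these statements the only satisfying binding pairs Alex with the baseball album; the soccer album is excluded because no featured person is known to like soccer.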
As noted previously, the preferred embodiment uses a Prolog inferencing engine to search for solutions to rules, where the rules are represented using Prolog clauses, but other mechanisms for describing constraints may also be used.
Table 1 is intended to show some examples of rules and the associated metadata and/or algorithms required to make them work. These rules can be used in various combinations in a given rule set to facilitate automatic story generation. These rules are simply illustrative of the arbitrarily complex types of rules that may be expressed within the system.
With reference to
For familial relationships, the system does not require that the user enter all family relationships—one does not need to say, for example, that Jane is Ann's daughter, that Jane is Mary's grandchild, that Jane is Bill's niece, etc. Instead, the system requires only that the canonical relationships of spouse, and parent/child be entered; all other familial relationships are automatically inferred by the system. Relationships by marriage can likewise be inferred, such as mother-in-law, etc.; the system provides a way for the user to specify that such a relationship has terminated as a consequence of divorce.
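As a minimal sketch of this kind of inference, the Python fragment below derives grandchildren, nieces and nephews, and parents-in-law from spouse and parent/child facts alone. The people, the specific family tree, and the function names are hypothetical; gender-specific relationships and the handling of divorce are omitted for brevity.

```python
# Canonical facts only: spouse pairs and (parent, child) links.
# The family tree below is a hypothetical example.
spouses = {("Alex", "Ann")}
parent_of = {("Alex", "Jane"), ("Ann", "Jane"), ("Mary", "Ann"), ("Mary", "Bill")}

def children(person):
    return {c for p, c in parent_of if p == person}

def parents(person):
    return {p for p, c in parent_of if c == person}

def grandchildren(person):
    return {g for c in children(person) for g in children(c)}

def siblings(person):
    """People who share at least one parent with `person`."""
    return {s for p in parents(person) for s in children(p)} - {person}

def nieces_and_nephews(person):
    return {n for s in siblings(person) for n in children(s)}

def spouse(person):
    for a, b in spouses:
        if a == person:
            return b
        if b == person:
            return a
    return None

def parents_in_law(person):
    partner = spouse(person)
    return parents(partner) if partner else set()
```

Given only the six stated facts, the sketch infers that Jane is Mary's grandchild, that Jane is a niece or nephew of Bill, and that Mary is Alex's parent-in-law.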
With reference to
With reference to
It will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without departing from the spirit and scope of the invention. In particular, although one of the example preferred embodiments uses Prolog as the means for expressing and evaluating rules, the reader will appreciate that rules for suggesting stories may be formulated in other languages, and evaluated using other mechanisms than Prolog. Accordingly, the scope of protection of this invention is limited only by the following claims and their equivalents.