This invention relates to generating an output video customized to include a person-of-interest from single or multiple input video sequences.
Personal video recordings of public events, such as school concerts and school sports, are quite common and easily created with the advent of digital imaging. Since there is little or no consumable cost, most parents and friends are quite willing to make their own memory of a personal event. However, the image quality of these personal videos is usually poor due to low lighting levels at the venue, recording equipment of consumer grade or lower, and excessive distance between the recording device and the subject of interest.
Some venues offer recorded videos to the participants, audience members, or the public, for a fee. But these videos are generic in that the same video is offered to all customers. These videos also may or may not be of higher image quality than what could be produced by, for example, an audience member's personal recording device. It is also quite common that, if the venue does offer a video of the event, it does not allow personal video recording devices to be used during the event.
Accordingly, a need in the art exists for improved ways to generate desirable videos of an event.
The above-described need is addressed and a technical solution is achieved in the art by systems and methods for generating a video according to various embodiments of the present invention. In some embodiments of the present invention, one or more input video sequences and a set of person-of-interest (“POI”) information are received. The set of POI information identifies at least one person-of-interest. A particular video sequence is identified that prominently or relatively prominently displays at least the person-of-interest. The particular video sequence is identified from (a) the input video sequence(s), or (b) a portion of the input video sequence, if only one was received, or a portion of one of the input video sequences, if more than one was received. Then, a customized output video is generated from at least a portion or portions of the input video sequence(s), the customized output video being generated based at least upon the set of POI information to include at least the particular video sequence. The customized output video is stored in a processor-accessible memory system.
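By way of a non-limiting illustration, the overall flow just described can be summarized in the following Python sketch. The data structures, the prominence scores, and the way the output edit list is assembled are assumptions made purely for illustration and are not the disclosed implementation.

```python
# A minimal, self-contained sketch of the flow described above. The scoring
# and assembly logic are illustrative placeholders, not the disclosed method.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class POIInfo:
    person_ids: List[str]                        # identifiers of the person(s)-of-interest

@dataclass
class InputSequence:
    path: str
    prominence: Dict[str, float] = field(default_factory=dict)  # person_id -> how prominently shown

def find_prominent_sequence(sequences: List[InputSequence], poi: POIInfo) -> InputSequence:
    # Pick the sequence that shows the person(s)-of-interest most prominently.
    return max(sequences, key=lambda s: sum(s.prominence.get(p, 0.0) for p in poi.person_ids))

def generate_customized_video(sequences: List[InputSequence], poi: POIInfo) -> List[str]:
    prominent = find_prominent_sequence(sequences, poi)
    # The output is built from portions of the inputs and always includes the
    # sequence that prominently displays the person-of-interest.
    edit_list = [prominent.path] + [s.path for s in sequences if s is not prominent]
    return edit_list        # stand-in for the stored customized output video

if __name__ == "__main__":
    seqs = [InputSequence("wide.mp4", {"alice": 0.2}),
            InputSequence("closeup.mp4", {"alice": 0.9})]
    print(generate_customized_video(seqs, POIInfo(person_ids=["alice"])))
```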
Accordingly, an output video customized to include the person-of-interest is generated. It can be seen, then, that embodiments of the present invention allow a plurality of different output videos of the same event to be generated, each output video being customized to include its own set of persons-of-interest. In some embodiments, customers who wish to purchase a customized output video have the ability to specify the person or persons-of-interest they want in their customized output video.
In some embodiments, the input video sequence(s) include(s) images of an event spanning a period of time, and a set of times-of-interest (“TOI”) information is received. The TOI information identifies particular times-of-interest within the event's period of time. In these instances, the customized output video is generated to include video from the particular times-of-interest within the event's period of time based at least upon the TOI information. Accordingly, for example, these embodiments allow a customer who wants a customized output video of an event to select particular spans of time of the event that are of interest to the customer, thereby further increasing customization options.
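As an illustration of how such time-based customization might be carried out, the following sketch trims an input sequence to a list of requested times-of-interest using the third-party moviepy package (1.x API). The file names and time values are hypothetical.

```python
# Illustrative sketch only: keep just the requested times-of-interest from an
# input sequence, using moviepy.
from moviepy.editor import VideoFileClip, concatenate_videoclips

def trim_to_times_of_interest(input_path, times_of_interest, output_path):
    """times_of_interest: list of (start_seconds, end_seconds) within the event."""
    source = VideoFileClip(input_path)
    clips = [source.subclip(start, end) for start, end in times_of_interest]
    concatenate_videoclips(clips).write_videofile(output_path)

# e.g. keep only the third song and the finale of a recorded concert
# trim_to_times_of_interest("concert_wide.mp4", [(610, 830), (2400, 2650)], "custom.mp4")
```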
In some embodiments of the present invention, a plurality of input video sequences are received, and the customized output video is generated from at least a portion or portions of at least two of the received plurality of video sequences. In some of these embodiments, the customized output video is generated to include two video sequences from the plurality of input video sequences in a picture-in-picture configuration. Also, in some of these embodiments, one of the plurality of input video sequences represents a wide-angle view of an event, and another of the plurality of input video sequences represents a zoomed-in view of the event.
In addition to the embodiments described above, further embodiments will become apparent by reference to the drawings and by study of the following detailed description.
The present invention will be more readily understood from the detailed description of exemplary embodiments presented below considered in conjunction with the attached drawings, of which:
It is to be understood that the attached drawings are for purposes of illustrating the concepts of the invention and may not be to scale.
Embodiments of the present invention pertain to generating an output video customized to include a person-of-interest from single or multiple input video sequences. In this regard, some embodiments of the present invention relate to generating a plurality of different output videos of a same event, each output video being customized to include its own set of one or more persons-of-interest. Further, in some embodiments, customers who wish to purchase a customized output video have the ability to specify the person or persons-of-interest they want in their customized output video. Accordingly, many different output videos of an event can be generated such that each output video is customized specifically for the person or people who wish to purchase it.
The data processing system 102 includes one or more data processing devices that implement the processes of the various embodiments of the present invention, including the example processes described herein. The phrases “data processing device” or “data processor” are intended to include any data processing device, such as a central processing unit (“CPU”), a desktop computer, a laptop computer, a mainframe computer, a personal digital assistant, a Blackberry™, a digital camera, a cellular phone, or any other device for processing data, managing data, or handling data, whether implemented with electrical, magnetic, optical, biological components, or otherwise.
The data storage system 104 includes one or more processor-accessible memories configured to store the information needed to execute the processes of the various embodiments of the present invention. The data-storage system 104 may be a distributed data-storage system including multiple processor-accessible memories communicatively connected to the data processing system 102 via a plurality of computers and/or devices. On the other hand, the data storage system 104 need not be a distributed data-storage system and, consequently, may include one or more processor-accessible memories located within a single computer or device.
The phrase “processor-accessible memory” is intended to include any processor-accessible data storage device, whether volatile or nonvolatile, electronic, magnetic, optical, or otherwise, including but not limited to, floppy disks, hard disks, Compact Discs, DVDs, flash memories, ROMs, and RAMs.
The phrase “communicatively connected” is intended to include any type of connection, whether wired or wireless, between devices, data processors, or programs in which data may be communicated. Further, the phrase “communicatively connected” is intended to include a connection between devices and/or programs within a single computer, a connection between devices and/or programs located in different computers, and a connection between devices not located in computers at all. In this regard, although the data storage system 104 is shown separately from the data processing system 102, one skilled in the art will appreciate that the data storage system 104 may be stored completely or partially within the data processing system 102. Further in this regard, although the peripheral system 106 and the user interface system 108 are shown separately from the data processing system 102, one skilled in the art will appreciate that one or both of such systems may be stored completely or partially within the data processing system 102.
The peripheral system 106 may include one or more devices configured to provide information, including, for example, video sequences, to the data processing system 102 to facilitate generation of output video information as described herein. For example, the peripheral system 106 may include digital video cameras, cellular phones, regular digital cameras, or other computers. The data processing system 102, upon receipt of information from a device in the peripheral system 106, may store such information in the data storage system 104.
The user interface system 108 may include a mouse, a keyboard, a mouse and a keyboard, or any device or combination of devices from which data is input to the data processing system 102. In this regard, although the peripheral system 106 is shown separately from the user interface system 108, the peripheral system 106 may be included as part of the user interface system 108.
The user interface system 108 also may include a display device, a plurality of display devices (i.e. a “display system”), a computer accessible memory, one or more display devices and a computer accessible memory, or any device or combination of devices to which data is output by the data processing system 102.
As will be detailed below, the video event input data 200 includes one or more input video sequences and, optionally, additional audio or other information. Further, the video event input data 200 includes one or more sets of interest information each indicating at least one or more persons-of-interest. At least the set(s) of interest information are used by the data processing system 102 of the video production system 110 to generate the video event output data 250. The video event output data 250 includes one or more customized output videos generated by the video production system 110.
To elaborate, for example, each set of interest information 232, 234, . . . 236 identifies a person-of-interest 262, a time-of-interest 264, or other data of interest 266.
A set of interest information that identifies a time-of-interest is referred to herein as a set of time-of-interest (“TOI”) information. Times-of-interest identify any time information that is useful for producing the final video output. For example, a set of TOI information may identify particular times-of-interest within the event's period of time that are preferred for inclusion in a corresponding customized output video. Further, such times-of-interest may be associated with a particular set of POI information to facilitate designation of starting and ending times for highlighting the persons-of-interest in the corresponding customized video output.
Other data of interest 266 may include other identifiers of interest used to create a corresponding customized output video, such as audio markers or lighting markers that signify the start or termination of a particular event, or additional media content (such as music, voice-over, or animation) to be incorporated in the final output video. One skilled in the art will appreciate that additional content may include content for smell, touch, and taste as video display technology becomes more capable of incorporating these other stimuli.
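For illustration only, a set of interest information covering the three categories discussed above might be represented as follows; the field names and types are assumptions made for the sketch, with the reference numerals 262, 264, and 266 noted in comments.

```python
# Hedged sketch of one possible representation of a set of interest information.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class InterestSet:
    persons_of_interest: List[str] = field(default_factory=list)       # 262, e.g. seat IDs or player numbers
    times_of_interest: List[Tuple[float, float]] = field(default_factory=list)  # 264, (start, end) in seconds
    other_data_of_interest: Optional[dict] = None                       # 266, e.g. audio/lighting markers,
                                                                        # extra media (music, voice-over)
```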
In another embodiment, data of interest are identified during a review of the entire video. The reviewer can identify persons-of-interest and times-of-interest by an input method such as a touch screen or a mouse click.
Various methods are available to identify or mark the different inputs of interest. In order to identify the person-of-interest, in the example of events with fixed performer locations, such as a school band concert, a seat identification method may be used. Each input video sequence may then be predefined to capture a particular set of performer locations. For example, an input video sequence that captures a wide-angle view of all event performers will be predefined to have captured all performer locations. However, an input video sequence that captures a small group of performers may be predefined to have captured only those performer locations associated with the small group. Further, for sporting activities, a player's number may be used, and corresponding image recognition techniques known in the art may be used by the data processing system 102 to determine which input video sequences capture which players. Additionally, face recognition applications known in the art may be employed to identify a person-of-interest in an input video sequence.
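As a hedged illustration of face-based identification, the following sketch flags which frames of an input sequence show a person-of-interest by comparing sampled frames against a reference photograph, using the open-source face_recognition and OpenCV packages. The sampling rate is an arbitrary assumption, and the reference photo is assumed to contain exactly one face.

```python
# Illustrative only: flag frames of a video that appear to show a given person.
import cv2
import face_recognition

def frames_showing_person(video_path, reference_photo_path, sample_every_n=30):
    reference = face_recognition.load_image_file(reference_photo_path)
    reference_encoding = face_recognition.face_encodings(reference)[0]  # assumes one face in the photo
    hits, frame_index = [], 0
    capture = cv2.VideoCapture(video_path)
    while True:
        ok, frame_bgr = capture.read()
        if not ok:
            break
        if frame_index % sample_every_n == 0:
            frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
            encodings = face_recognition.face_encodings(frame_rgb)
            matches = [face_recognition.compare_faces([reference_encoding], enc)[0]
                       for enc in encodings]
            if any(matches):
                hits.append(frame_index)
        frame_index += 1
    capture.release()
    return hits   # frame indices where the person-of-interest appears
```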
In order to identify times-of-interest, automatic methods, manual methods, or both, may be used. In the example of a concert event, the data processing system 102 may be configured to automatically identify the start of each song that is being played. This may be accomplished by identifying pauses between songs or identifying applause. If a time-of-interest is identified as the third song in a concert event, the data processing system 102 may be configured to highlight the third song in a corresponding customized output video. In the example of a football game, a change in score could be used to identify times-of-interest. For example, if a time-of-interest is identified as a kickoff or a field goal kick, the data processing system 102 may be configured to highlight the time of a touchdown, since the next play will be the one of interest.
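One possible automatic approach to finding pauses between songs is sketched below: it scans the event audio for sustained low-energy windows. The window length, quiet threshold, and minimum pause duration are arbitrary assumptions that would require calibration in practice.

```python
# Illustrative sketch: detect candidate song boundaries as long quiet stretches.
import numpy as np
from scipy.io import wavfile

def find_pauses(wav_path, window_seconds=0.5, quiet_ratio=0.05, min_pause_seconds=2.0):
    rate, samples = wavfile.read(wav_path)
    if samples.ndim > 1:                       # mix stereo down to mono
        samples = samples.mean(axis=1)
    window = int(rate * window_seconds)
    n_windows = len(samples) // window
    rms = np.array([np.sqrt(np.mean(samples[i * window:(i + 1) * window].astype(float) ** 2))
                    for i in range(n_windows)])
    quiet = rms < quiet_ratio * rms.max()
    pauses, start = [], None
    for i, is_quiet in enumerate(quiet):
        if is_quiet and start is None:
            start = i
        elif not is_quiet and start is not None:
            if (i - start) * window_seconds >= min_pause_seconds:
                pauses.append((start * window_seconds, i * window_seconds))
            start = None
    return pauses   # (start, end) times of quiet gaps, i.e. candidate times-of-interest
```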
Another method to mark the times-of-interest may include a manual method performed by an attendee at the event. A stopwatch-type device supplied by the venue may allow the attendee to control the marking of times-of-interest. Such a stopwatch-type device may be synchronized with the video capture devices. As a time-of-interest occurs, the attendee clicks the stopwatch-type device to mark the time. The stopwatch-type device is able to handle multiple highlighted times as well as start and stop times.
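A minimal sketch of such a marking device's logic follows, assuming the device shares a common clock with the video capture devices; the class and field names are hypothetical.

```python
# Hypothetical stopwatch-style marker: records offsets from the recording start.
import time

class StopwatchMarker:
    def __init__(self, recording_start_time):
        self.recording_start_time = recording_start_time
        self.marks = []          # seconds into the recording; supports many marks

    def click(self):
        self.marks.append(time.time() - self.recording_start_time)

# e.g. marker = StopwatchMarker(time.time()); call marker.click() at each moment of interest
```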
Accordingly, one skilled in the art will appreciate that the invention is not limited to any particular technique for identifying persons-of-interest, times-of-interest, or other data-of-interest in input video sequences, and that any technique may be used.
As stated earlier, each output video 252, 254, . . . 256 is generated from at least a portion or portions of the input video sequences 212, 214, . . . 216 and a set of POI information. To accomplish this, according to an embodiment of the present invention, the data processing system 102 identifies a particular video sequence that prominently or relatively prominently displays at least the person-of-interest identified in the corresponding set of POI information. The particular video sequence is identified from (a) the input video sequence(s), or (b) a portion of the input video sequence, in the case that only one input video sequence was received, or a portion of one of the input video sequences, if more than one input video sequence was received. In this regard, each customized output video 252, 254, . . . 256 is generated based at least upon a corresponding set of POI information to include at least the particular video sequence.
In some embodiments of the present invention, at least one of the output videos 252, 254, . . . 256 has a picture-in-picture format having a smaller video-viewing area superimposed on a larger video-viewing area. In some of these embodiments, a particular video sequence that prominently or relatively prominently displays at least the person-of-interest is displayed in the smaller video-viewing area. Also, in embodiments where one of the input video sequences 212, 214, . . . 216 represents a wide-angle view of an event, and another of the input video sequences 212, 214, . . . 216 represents a zoomed-in view of the event, a customized output video may be generated to include a picture-in-picture format utilizing at least the wide-angle view and the zoomed-in view.
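By way of illustration only, a picture-in-picture composite of a wide-angle view and a zoomed-in view might be produced as follows using OpenCV. The inset size, its placement, and the assumption that the two inputs are time-aligned and of equal length are choices made for the sketch, not requirements.

```python
# Illustrative sketch: superimpose a shrunken zoomed-in view on a wide-angle view.
import cv2

def picture_in_picture(wide_path, zoom_path, out_path, inset_scale=0.3, margin=20):
    wide = cv2.VideoCapture(wide_path)
    zoom = cv2.VideoCapture(zoom_path)
    fps = wide.get(cv2.CAP_PROP_FPS)
    w = int(wide.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(wide.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    while True:
        ok_w, wide_frame = wide.read()
        ok_z, zoom_frame = zoom.read()
        if not (ok_w and ok_z):
            break
        inset = cv2.resize(zoom_frame, (int(w * inset_scale), int(h * inset_scale)))
        ih, iw = inset.shape[:2]
        # place the smaller video-viewing area in the lower-right corner
        wide_frame[h - ih - margin:h - margin, w - iw - margin:w - margin] = inset
        writer.write(wide_frame)
    wide.release(); zoom.release(); writer.release()
```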
For purposes of clarity, an example of a school band concert event will be provided. At the location of the event, lighting and recording equipment are optimized to obtain good image and sound quality. Several video cameras that supply the input video information 210 record the event from various views. A wide-angle camera at a distance captures the entire concert. A number of wide-angle cameras located closer to the stage capture smaller groups of players. Alternatively, a single video camera is used to capture the event if it has enough resolution to allow regions of interest to be cropped for the final video. Microphones, to supply the audio information 220, are located with the video equipment as well as at targeted locations near the performers.
An individual who is planning on attending, or who is attending, the school band concert prepays or selects a video product by identifying persons, as well as times or other items of interest. This information may be received by the system 110 through the user interface system 108.
For example, one parent requests to have a close-up portion of his daughter playing the violin during a selected solo on his customized output video. The parent identifies the location of his daughter, the name of the music piece for the solo, and other types of customizations desired. With this information, a particular video sequence from the input video sequences 212, 214, . . . 216 is identified by the data processing system 102. In this case, the particular video sequence may be a portion of an input video sequence captured by a camera focused on a small group of performers that has been zoomed-in and cropped to focus on the daughter (i.e., the person-of-interest).
At the same school band concert, another parent requests to have a close-up portion of her son playing the trombone on her customized output video.
The system 110 may have varying levels of automation. A fully automated system may have editing software that automatically selects portions of the video with prescribed action or content and trims the rest of the video, according to techniques known in the art. It would also merge (e.g., picture-in-picture) close-up segments into the final video. Such editing software may have the capability of identifying the appropriate cropped portion of the close-up segments to be inserted into the customized output video. In the above example, one parent requested to have a close-up portion of his daughter playing the violin in his output video, while another parent requested to have a close-up portion of her son playing the trombone in her video. The editing software would automatically identify and crop each portion specifically for each video product, according to techniques known in the art.
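As one hedged example of how such editing software might locate a close-up crop, the following sketch finds the largest face in a frame with OpenCV's bundled Haar cascade and crops around it with padding. The padding factor is an arbitrary assumption, and a production system would also track the selected performer across frames rather than processing frames independently.

```python
# Illustrative only: crop a frame to a region around the largest detected face.
import cv2

def crop_close_up(frame_bgr, pad=0.5):
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return frame_bgr                                  # nothing detected; keep the full frame
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])    # largest face
    px, py = int(w * pad), int(h * pad)
    y0, y1 = max(0, y - py), min(frame_bgr.shape[0], y + h + py)
    x0, x1 = max(0, x - px), min(frame_bgr.shape[1], x + w + px)
    return frame_bgr[y0:y1, x0:x1]
```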