The present application relates generally to computer ecosystems and more particularly to automatically curated content.
A computer ecosystem, or digital ecosystem, is an adaptive and distributed socio-technical system that is characterized by its sustainability, self-organization, and scalability. Inspired by environmental ecosystems, which consist of biotic and abiotic components that interact through nutrient cycles and energy flows, complete computer ecosystems consist of hardware, software, and services that in some cases may be provided by one company, such as Sony. The goal of each computer ecosystem is to provide consumers with everything that may be desired, including at least in part services and/or software that may be exchanged via the Internet. Moreover, interconnectedness and sharing among elements of an ecosystem, such as applications within a computing cloud, provide consumers with increased capability to organize and access data and present themselves as the future characteristic of efficient integrative ecosystems.
Two general types of computer ecosystems exist: vertical and horizontal computer ecosystems. In the vertical approach, virtually all aspects of the ecosystem are owned and controlled by one company, and are specifically designed to seamlessly interact with one another. Horizontal ecosystems, on the other hand, integrate aspects such as hardware and software that are created by other entities into one unified ecosystem. The horizontal approach allows for greater variety of input from consumers and manufacturers, increasing the capacity for novel innovations and adaptations to changing demands.
Present principles are directed to specific aspects of computer ecosystems, specifically, searching electronic videos for various purposes. Currently, many users have a large amount of video content that has been captured but is no longer being viewed. This is due to the onerous nature of video editing, which is both time consuming and difficult. There are no solutions available to permit videos to be edited, produced and put to music without significant user intervention.
Present principles facilitate automatic creation of video content that has been edited to include just the highlights for a given theme. The theme can be created by the user to identify a grouping of content that together makes up the theme. For example, if a video of a child growing up is requested for a birthday party, this can include all videos of that particular person growing up over time. Besides the theme, a time frame can be provided that can establish a length for the video output.
A cloud based algorithm may automatically view each frame of a video and automatically generate searchable tags that can be used for video creation. These tags can include facial recognition, geo-tagging, time tagging, object recognition, etc. The tagging can include searches of social networks, calendar information, emails, etc. to provide an even higher level of context. In addition, the video stream and audio stream can be further analyzed for indications of excitement, emotion, etc. lending itself to highlight generation. The user can start the process by uploading his videos to the cloud service. By using the combination of tags, highlights, and a theme and video length, the system can generate a video montage including background music, which excites the user and makes their stored videos come alive. This final output can be made available for download and distribution.
Accordingly, a device includes at least one computer readable storage medium bearing instructions executable by a processor, and at least one processor configured for accessing the computer readable storage medium to execute the instructions to configure the processor for recognizing at least one feature in respective electronic images of plural digital video streams. For each image, the processor automatically associates the image with an original metadata indicating the at least one feature. Also, for at least some segments of the plural video streams, the processor associates the segments with respective indices of scene excitement derived at least in part from motion vector analysis of the segments, and/or from object recognition on images in the segments. A user specification for a video montage including at least a montage subject is received, and based on the user specification, plural segments are selected from plural video streams. This selecting of plural segments is responsive to a determination that each selected segment satisfies an excitement threshold based at least in part on the respective index of scene excitement to render plural selected segments. The plural selected segments from plural video streams are assembled into a montage video stream.
In some examples, the processor when executing the instructions is further configured for presenting on a display a user interface (UI) permitting a user to specify parameters for a video montage to be created from video files in one or more libraries of video files. The UI can include a first selector by which a user can specify a theme or subject of the montage, a second selector by which a user can enter a desired length of the montage, and a third selector allowing a user to select only video clips for the montage that indicate excitement in the video clips. The UI may also include a music selector allowing a user to enter a music track identification to associate a music track with the montage, and/or a music selector allowing a user to indicate that the processor is to select a music track to associate with the montage. In some examples the UI may include an order selector allowing a user to specify whether the video clips are to be in chronological order in the montage or assembled in the montage in a non-chronological manner.
The index of scene excitement can be associated with motion vectors of a segment and/or can be associated with emotion in a segment as indicated by a face recognition algorithm.
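As a rough illustration of how an index of scene excitement might be derived from motion vectors, the following Python sketch maps the mean motion-vector magnitude of a segment onto a score between 0 and 1. The function name `excitement_index`, the `(dx, dy)` vector representation, and the `full_scale` normalization constant are assumptions made for this example and are not specified by the present application.

```python
from math import hypot
from typing import Sequence, Tuple

# Assumed representation: one (dx, dy) displacement per block, in pixels.
MotionVector = Tuple[float, float]


def excitement_index(motion_vectors: Sequence[MotionVector],
                     full_scale: float = 16.0) -> float:
    """Map the mean motion-vector magnitude of a segment to a 0..1 score.

    `full_scale` is a hypothetical magnitude (in pixels) treated as
    maximally exciting; larger mean magnitudes are clamped to 1.0.
    """
    if not motion_vectors:
        return 0.0
    total = sum(hypot(dx, dy) for dx, dy in motion_vectors)
    mean_magnitude = total / len(motion_vectors)
    return min(mean_magnitude / full_scale, 1.0)
```

An emotion score from a face recognition algorithm could be blended into the same 0 to 1 range in a similar way, although how the two signals would be weighted is left open here.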
In another aspect, a method includes presenting on a display a user interface (UI) permitting a user to specify parameters for a video montage to be created from video files in one or more libraries of video files. The method includes receiving via the UI the parameters, which include at least a montage subject. Based on the parameters, plural segments are selected from plural video streams and assembled into a montage video stream.
In another aspect, a system includes at least one computer readable storage medium bearing instructions executable by a processor which is configured for accessing the computer readable storage medium to execute the instructions to configure the processor for presenting on a display a user interface (UI) permitting a user to specify parameters for a video montage to be created from video files in one or more libraries of video files. The UI includes a first selector by which a user can specify a theme or subject of the montage, and a second selector by which a user can enter a desired length of the montage. The UI also includes a third selector allowing a user to select only video clips for the montage that indicate excitement in the video clips.
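The selection and assembly described in the aspects above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Segment` and `MontageSpec` types, their field names, the default threshold of 0.5, and the greedy "most exciting first" fill of the length budget are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    source: str          # identifier of the video stream the segment came from
    start_time: float    # capture time, used only for chronological ordering
    duration: float      # length of the segment in seconds
    subjects: frozenset  # tags from recognition, e.g. frozenset({"alice"})
    excitement: float    # index of scene excitement, assumed to lie in 0..1


@dataclass
class MontageSpec:
    subject: str                       # first selector: theme or subject
    max_length: float                  # second selector: desired length, seconds
    exciting_only: bool = True         # third selector: exciting clips only
    excitement_threshold: float = 0.5  # assumed value, not from the source
    chronological: bool = True


def build_montage(segments: List[Segment], spec: MontageSpec) -> List[Segment]:
    """Select segments matching the spec and return them in montage order."""
    candidates = [s for s in segments if spec.subject in s.subjects]
    if spec.exciting_only:
        candidates = [s for s in candidates
                      if s.excitement >= spec.excitement_threshold]
    # Assumed policy: fill the length budget with the most exciting clips first.
    candidates.sort(key=lambda s: s.excitement, reverse=True)
    chosen: List[Segment] = []
    total = 0.0
    for seg in candidates:
        if total + seg.duration <= spec.max_length:
            chosen.append(seg)
            total += seg.duration
    if spec.chronological:
        chosen.sort(key=lambda s: s.start_time)
    return chosen
```

The returned list only names which segments to concatenate and in what order; actual decoding, trimming, and encoding of the montage video stream would be handled by separate video tooling.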
The details of the present application, both as to its structure and operation, can be best understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which:
This disclosure relates generally to computer ecosystems including aspects of consumer electronics (CE) device based user information in computer ecosystems. A system herein may include server and client components, connected over a network such that data may be exchanged between the client and server components. The client components may include one or more computing devices including portable televisions (e.g. smart TVs, Internet-enabled TVs), portable computers such as laptops and tablet computers, and other mobile devices including smart phones and additional examples discussed below. These client devices may operate with a variety of operating environments. For example, some of the client computers may employ, as examples, operating systems from Microsoft, or a Unix operating system, or operating systems produced by Apple Computer or Google. These operating environments may be used to execute one or more browsing programs, such as a browser made by Microsoft or Google or Mozilla or other browser program that can access web applications hosted by the Internet servers discussed below.
Servers may include one or more processors executing instructions that configure the servers to receive and transmit data over a network such as the Internet. Or, a client and server can be connected over a local intranet or a virtual private network.
Information may be exchanged over a network between the clients and servers. To this end and for security, servers and/or clients can include firewalls, load balancers, temporary storages, proxies, and other network infrastructure for reliability and security. One or more servers may form an apparatus that implements methods of providing a secure community such as an online social website to network members.
As used herein, instructions refer to computer-implemented steps for processing information in the system. Instructions can be implemented in software, firmware or hardware and include any type of programmed step undertaken by components of the system.
A processor may be any conventional general purpose single- or multi-chip processor that can execute logic by means of various lines such as address lines, data lines, and control lines and registers and shift registers.
Software modules described by way of the flow charts and user interfaces herein can include various sub-routines, procedures, etc. Without limiting the disclosure, logic stated to be executed by a particular module can be redistributed to other software modules and/or combined together in a single module and/or made available in a shareable library.
Present principles described herein can be implemented as hardware, software, firmware, or combinations thereof; hence, illustrative components, blocks, modules, circuits, and steps are set forth in terms of their functionality.
Further to what has been alluded to above, logical blocks, modules, and circuits described below can be implemented or performed with a general purpose processor, a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device such as an application specific integrated circuit (ASIC), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A processor can be implemented by a controller or state machine or a combination of computing devices.
The functions and methods described below, when implemented in software, can be written in an appropriate language such as but not limited to C# or C++, and can be stored on or transmitted through a computer-readable storage medium such as a random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), compact disk read-only memory (CD-ROM) or other optical disk storage such as digital versatile disc (DVD), magnetic disk storage or other magnetic storage devices including removable thumb drives, etc. A connection may establish a computer-readable medium. Such connections can include, as examples, hard-wired cables including fiber optics and coaxial wires and digital subscriber line (DSL) and twisted pair wires. Such connections may include wireless communication connections including infrared and radio.
Components included in one embodiment can be used in other embodiments in any appropriate combination. For example, any of the various components described herein and/or depicted in the Figures may be combined, interchanged or excluded from other embodiments.
“A system having at least one of A, B, and C” (likewise “a system having at least one of A, B, or C” and “a system having at least one of A, B, C”) includes systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.
Now specifically referring to
Accordingly, to undertake such principles the CE device 12 can be established by some or all of the components shown in
In addition to the foregoing, the CE device 12 may also include one or more input ports 26 such as, e.g., a USB port to physically connect (e.g. using a wired connection) to another CE device and/or a headphone port to connect headphones to the CE device 12 for presentation of audio from the CE device 12 to a user through the headphones. The CE device 12 may further include one or more tangible computer readable storage media 28 such as disk-based or solid state storage, it being understood that the computer readable storage medium 28 may not be a carrier wave. Also in some embodiments, the CE device 12 can include a position or location receiver such as but not limited to a GPS receiver and/or altimeter 30 that is configured to e.g. receive geographic position information from at least one satellite and provide the information to the processor 24 and/or determine an altitude at which the CE device 12 is disposed in conjunction with the processor 24. However, it is to be understood that another suitable position receiver other than a GPS receiver and/or altimeter may be used in accordance with present principles to e.g. determine the location of the CE device 12 in e.g. all three dimensions.
Continuing the description of the CE device 12, in some embodiments the CE device 12 may include one or more cameras 32 that may be, e.g., a thermal imaging camera, a digital camera such as a webcam, and/or a camera integrated into the CE device 12 and controllable by the processor 24 to gather pictures/images and/or video in accordance with present principles. Also included on the CE device 12 may be a Bluetooth transceiver 34 and other Near Field Communication (NFC) element 36 for communication with other devices using Bluetooth and/or NFC technology, respectively. An example NFC element can be a radio frequency identification (RFID) element.
Further still, the CE device 12 may include one or more motion sensors 37 (e.g., an accelerometer, gyroscope, cyclometer, magnetic sensor, infrared (IR) motion sensors such as passive IR sensors, an optical sensor, a speed and/or cadence sensor, a gesture sensor (e.g. for sensing gesture command), etc.) providing input to the processor 24. The CE device 12 may include still other sensors such as e.g. one or more climate sensors 38 (e.g. barometers, humidity sensors, wind sensors, light sensors, temperature sensors, etc.) and/or one or more biometric sensors 40 providing input to the processor 24. In addition to the foregoing, it is noted that in some embodiments the CE device 12 may also include a kinetic energy harvester 42 to e.g. charge a battery (not shown) powering the CE device 12.
Still referring to
Now in reference to the afore-mentioned at least one server 54, it includes at least one processor 56, at least one tangible computer readable storage medium 58 that may not be a carrier wave such as disk-based or solid state storage, and at least one network interface 60 that, under control of the processor 56, allows for communication with the other CE devices of
Accordingly, in some embodiments the server 54 may be an Internet server, and may include and perform “cloud” functions such that the CE devices of the system 10 may access a “cloud” environment via the server 54 in example embodiments.
Now referring to
Proceeding to block 72, in some examples the processor may decide which one of plural software-implemented image recognition algorithms to apply. For example, the processor may have access to a facial recognition algorithm, a spatial recognition algorithm, an object recognition algorithm, a brand recognition algorithm, a geo-specific data recognition algorithm, and an algorithm for recognizing time specific events. The user may establish which algorithm to select, or the processor may undertake the selection automatically as described below. In some cases a single algorithm may provide the capability to recognize two or more of the recognition types above.
An algorithm for deciding which one of a set of specific recognition algorithms to apply is now described. The processor may determine that an image includes human faces by virtue of detecting pixel patterns with enclosed generally ovular borders. Having determined on this basis that a face exists in the image, a face recognition algorithm may be employed to compare features of the face as reflected in pixel patterns within the face image to a database of known faces to identify, at block 74, the person being imaged.
Or, the processor may determine that it should invoke a spatial recognition algorithm by determining that a continuous area of blue pixels or a continuous area of green pixels exceeds a threshold area, indicating a sky or sea or forest scene in the image. The spatial recognition algorithm can then be invoked to match the outlines of objects in the image to a database of tree and plant and water images, for example, and identify at block 74 the type of scene being imaged.
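One way the blue/green area test might be realized is sketched below in Python. The sketch labels each pixel as blue, green, or neither, finds the largest 4-connected region of same-colored pixels with a breadth-first search, and compares its share of the image to a threshold. The channel-dominance rule in `classify`, the margin of 40, and the default threshold of 0.3 are assumptions for illustration only; the image is assumed to be a rectangular list of rows of `(red, green, blue)` tuples.

```python
from collections import deque
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def classify(pixel: Pixel) -> Optional[str]:
    """Label a pixel "blue" or "green" when that channel clearly dominates."""
    r, g, b = pixel
    if b > g and b > r + 40:   # margin of 40 is an assumed value
        return "blue"
    if g > b and g > r + 40:
        return "green"
    return None


def largest_region_fraction(image: List[List[Pixel]]) -> float:
    """Fraction of the image covered by the largest contiguous blue or green area."""
    height = len(image)
    width = len(image[0]) if height else 0
    if height == 0 or width == 0:
        return 0.0
    labels = [[classify(p) for p in row] for row in image]
    seen = [[False] * width for _ in range(height)]
    best = 0
    for y in range(height):
        for x in range(width):
            if labels[y][x] is None or seen[y][x]:
                continue
            colour = labels[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            size = 0
            while queue:  # breadth-first flood fill over same-colour neighbours
                cy, cx = queue.popleft()
                size += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and labels[ny][nx] == colour):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best / (height * width)


def should_use_spatial_recognition(image: List[List[Pixel]],
                                   area_threshold: float = 0.3) -> bool:
    return largest_region_fraction(image) > area_threshold
```

A production system would more likely operate on downsampled frames with an image-processing library, but the pure-Python version keeps the thresholding logic visible.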
Or, the processor may determine that it should invoke an object recognition algorithm by virtue of detecting pixel patterns with enclosed borders of rectilinear shape, or of other non-human shapes such as purely circular shapes, elongated shapes indicating trains or other vehicles, etc. Having determined on this basis that an object such as a non-human object exists in the image, an object recognition algorithm may be employed to compare features of the objects as reflected in pixel patterns within the object image to a database of known objects to identify, at block 74, the object being imaged.
Yet again, the processor may determine that it should invoke a brand recognition algorithm by virtue of detecting pixel patterns that form letters, for example. Having determined on this basis that a brand name may appear in the image, a brand recognition algorithm may be employed to compare the brand name as reflected in pixel patterns to a database of known brand names to identify, at block 74, the brand being imaged.
Still further, the processor may determine that it should invoke a geo-specific (geography) recognition algorithm by virtue of detecting pixel patterns of enclosed boundaries that define objects of unusual size, e.g., objects larger than five meters in any particular dimension, as may be determined from both the pixel pattern and any existing focal length metadata that might accompany the image as appended by the imaging device from imager settings. Having determined on this basis that a geographically unique object such as Mt. Rushmore, the Eiffel Tower, etc. may appear in the image, a geography recognition algorithm may be employed to compare the geographic object as reflected in pixel patterns to a database of known geographic objects to identify, at block 74, the geographic area being imaged.
Time specific events may also be recognized using timestamps that may accompany the image from the imaging device, or using any of the algorithms above to recognize combinations of objects and then access a database of object combinations that are correlated to the times at which the objects appear together. As but one example, a face recognition algorithm may recognize the faces of two known celebrities in a single image, and then access a database of news feeds to determine when and at what events the two celebrities appeared together.
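The selection heuristics described above could be organized as a simple dispatch over low-level image cues, as in the following Python sketch. The `ImageCues` type and its field names are hypothetical and are assumed to be filled in by an earlier pixel-pattern pass; the five-meter size cue comes from the description, while the 0.3 area threshold is an assumed value. Because one image can trigger several cues, this sketch returns a list of candidate algorithms rather than a single choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageCues:
    """Low-level cues assumed to come from an earlier pixel-pattern pass."""
    has_oval_contour: bool = False            # enclosed, generally ovular border
    blue_green_area: float = 0.0              # largest contiguous fraction, 0..1
    has_rectilinear_contour: bool = False     # rectilinear or other non-human shape
    has_letter_shapes: bool = False           # pixel patterns that form letters
    largest_object_m: Optional[float] = None  # size estimate from focal-length metadata
    has_timestamp: bool = False               # timestamp supplied by the imaging device


def choose_algorithms(cues: ImageCues,
                      area_threshold: float = 0.3,
                      landmark_size_m: float = 5.0) -> List[str]:
    """Return the names of the recognition algorithms worth invoking."""
    chosen: List[str] = []
    if cues.has_oval_contour:
        chosen.append("face")
    if cues.blue_green_area > area_threshold:
        chosen.append("spatial")
    if cues.has_rectilinear_contour:
        chosen.append("object")
    if cues.has_letter_shapes:
        chosen.append("brand")
    if cues.largest_object_m is not None and cues.largest_object_m > landmark_size_m:
        chosen.append("geo")
    if cues.has_timestamp:
        chosen.append("time")
    return chosen
```

For example, an image with an oval contour and letter shapes would yield `["face", "brand"]`, and each named algorithm would then perform its own database comparison at block 74.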
Proceeding to block 76, one or more metadata fields associated with the image are automatically populated using information from the recognition that occurs at block 74 to describe the image and if desired curate the image into one or more image categories in a searchable database of images.
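A minimal Python sketch of populating metadata fields from recognition results follows. It assumes, purely for illustration, that each recognition result arrives as a `(kind, label)` pair such as `("person", "Alice")`; the function names `populate_metadata` and `matches` and the dictionary layout are hypothetical and not taken from the present application.

```python
from typing import Dict, Iterable, List, Tuple


def populate_metadata(recognitions: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (kind, label) recognition results into metadata fields.

    Each kind (e.g. "person", "place", "brand") becomes one field holding
    the distinct labels recognized for it, in first-seen order.
    """
    metadata: Dict[str, List[str]] = {}
    for kind, label in recognitions:
        values = metadata.setdefault(kind, [])
        if label not in values:
            values.append(label)
    return metadata


def matches(metadata: Dict[str, List[str]], term: str) -> bool:
    """Case-insensitive exact match of a search term against any field value."""
    wanted = term.lower()
    return any(wanted == value.lower()
               for values in metadata.values()
               for value in values)
```

With fields populated this way, a searchable database of images could index each field value, so that a query for a person or landmark retrieves every image tagged with it.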
Returning to block 78 in
As an example, suppose the prior searches indicate that the user previously searched for “Chevrolet” at least a threshold number of times. From this, it may be inferred, using for instance a database of synonyms such as a Thesaurus, that the user likes to image his vehicle and that the vehicle is a Chevrolet. In the context of the metadata in
With the above in mind, once the user has entered the specifications, a processor such as a cloud processor hosted on an Internet server or the processor of the CE device or other processor may execute the logic of
The user specifications of a montage from, e.g., the example UI shown in
At block 108, accompanying background music is added to the montage as an audio accompaniment. The user-specified background music title (or genre) may be added, or if the user desires the system to add the music, a library of music may be accessed and selected in various heuristic ways. In one example, the music library is the user's music library. In another example, when the subject is a person the music library is the subject's library. A general Internet music library may be accessed, or a music library identified with a particular digital ecosystem. In any case, for clips recognized as “happy”, upbeat music may be selected. Whether the music is upbeat or not can be determined based on its genre or on a type indicator associated with the music or on the tempo, with faster tempo indicating happy and slower tempo indicating not happy. For clips recognized as “sad”, slower, mellow music may be selected. For subjects in clips recognized by face recognition principles as being children, the music accompanying those clips may be sweet tunes or childhood tunes such as school songs. The system may select a separate audio clip for each respective video clip if desired, depending on the nature of the video clip.
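The tempo-based music heuristic can be sketched in Python as shown below. This is an illustrative sketch only: the `Track` type, the 110 beats-per-minute cutoff, and the tie-breaking rule (fastest track for happy clips, slowest otherwise) are assumptions. Genre or type indicators, which the description also allows, are not used here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Track:
    title: str
    tempo_bpm: float
    genre: str = ""  # carried along but unused in this tempo-only sketch


def is_upbeat(track: Track, tempo_cutoff: float = 110.0) -> bool:
    """Treat faster tempo as upbeat; the cutoff is an assumed value."""
    return track.tempo_bpm >= tempo_cutoff


def pick_track(mood: str, library: List[Track],
               tempo_cutoff: float = 110.0) -> Optional[Track]:
    """Pick an upbeat track for "happy" clips and a slower one otherwise."""
    want_upbeat = mood == "happy"
    matching = [t for t in library if is_upbeat(t, tempo_cutoff) == want_upbeat]
    if not matching:
        return None
    # Assumed tie-break: the fastest track for happy clips, the slowest otherwise.
    chooser = max if want_upbeat else min
    return chooser(matching, key=lambda t: t.tempo_bpm)
```

Calling `pick_track` once per clip, with the mood recognized for that clip, corresponds to the option of selecting a separate audio clip for each respective video clip.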
While the particular COMPUTER ECOSYSTEM WITH AUTOMATICALLY CURATED VIDEO MONTAGE is herein shown and described in detail, it is to be understood that the subject matter which is encompassed by the present invention is limited only by the claims.