This patent document relates to automated evaluation of acting performance that includes audio, video, or other types of multimedia content.
Good performing art professionals share certain characteristics whether they work in film, television, theater, or other multimedia productions featuring performances by persons or performing art professionals: they have the skills to portray their roles convincingly and connect with audiences by representing or conveying the emotions, motivations, and intentions of a character through expressions of the eyes, facial expressions, voice and speech, and physical gestures or movements of the body. Evaluation of acting performance, however, tends to be highly subjective and may require a significant amount of time and effort by trained professionals.
Described herein are techniques, subsystems, and systems to facilitate automated evaluation of acting performance, particularly for amateur performances. The disclosed techniques can be used to facilitate quick evaluation of performances in mass audition processes and/or acting competitions. The evaluation can focus on objective criteria to quickly filter out performances that fail to meet the standards for the auditions and/or competitions.
In one example aspect, the disclosed technology can be implemented to provide a system for automated evaluation of performance activities. This system includes a user interface configured to allow a user to create a customized automated evaluation service. The user can select a subset of machine learning engines from one or more available machine learning engines to create such a service. The user interface is further configured to receive a set of input data of a performance performed by a performer, the set of input data comprising at least video performance data or audio performance data. The system includes the one or more available machine learning engines, each configured to generate a sub-score in a specific area using the set of input data received from the user interface. The system also includes an aggregator configured to produce an output indicating a quality of the performance by aggregating the sub-scores generated by the subset of the one or more available machine learning engines. In some implementations, for example, the aggregator can be in communication with the one or more machine learning engines to receive the sub-scores generated by the one or more available machine learning engines, assign weighting factors to the sub-scores, respectively, based on one or more characteristics of the performance, and aggregate the sub-scores using the weighting factors to produce an output indicating a quality of the performance activity.
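As an illustration only, the weighted aggregation described above could be sketched in Python as follows; the engine names, weight values, and the weighted-average rule are assumptions made for the example, not the actual implementation.

```python
from typing import Dict

def aggregate_sub_scores(sub_scores: Dict[str, float],
                         weights: Dict[str, float]) -> float:
    """Combine per-engine sub-scores into a single quality score.

    sub_scores: engine name -> sub-score (e.g., on a 0-100 scale)
    weights:    engine name -> weighting factor chosen for the
                performance category (e.g., speech vs. drama)
    """
    total_weight = sum(weights.get(name, 0.0) for name in sub_scores)
    if total_weight == 0:
        raise ValueError("No positive weights for the selected engines")
    # Weighted average keeps the output on the same scale as the sub-scores.
    return sum(score * weights.get(name, 0.0)
               for name, score in sub_scores.items()) / total_weight

# Hypothetical example: a public-speech performance weighted toward audio analysis.
sub_scores = {"articulation": 82.0, "speech_rhythm": 75.0, "facial_expressiveness": 60.0}
weights = {"articulation": 0.5, "speech_rhythm": 0.4, "facial_expressiveness": 0.1}
print(aggregate_sub_scores(sub_scores, weights))
```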
In another example aspect, the disclosed technology can be implemented to provide a non-transitory computer program product having code stored thereon. The code (e.g., a software program), when executed by one or more processors in communication with each other via a network, can cause the one or more processors to receive a set of input data of a performance activity performed by a performer, wherein the set of input data comprises at least video performance data or audio performance data, determine one or more characteristics of the performance activity based on the at least video performance data or audio performance data, and build a customized evaluation service using a set of machine learning engines selected from one or more machine learning engines available in an evaluation system of performance activities. The set of machine learning engines is selected based on the one or more characteristics of the performance activity. Each of the set of machine learning engines determines a sub-score for the performance activity, and an output is generated to indicate a quality of the performance activity by aggregating the sub-scores determined by the set of machine learning engines.
In yet another example aspect, the disclosed technology can be implemented to provide a method for automatically evaluating a performance activity performed by a performer. This method includes receiving, via a user interface, a set of input data of the performance activity, wherein the set of input data comprises at least video performance data or audio performance data, wherein part of the at least video performance data or audio performance data represents a reference activity corresponding to the performance activity; and constructing a customized evaluation service using a set of machine learning engines selected from one or more available machine learning engines, wherein the set of machine learning engines is selected based on one or more characteristics of the performance activity. The constructing includes generating, using at least one of the set of machine learning engines, a machine-readable script based on the audio and/or video data of the reference activity, wherein the machine-readable script includes at least one or more lines spoken and/or one or more actions performed during the reference activity, and each of the one or more lines or the one or more actions is associated with a timestamp included in the machine-readable script. This method produces an output indicating a quality of the performance activity by comparing the performance activity with the reference activity using the machine-readable script.
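For illustration, a machine-readable script of the kind described above might be represented as a simple timestamped event list; the field names and values in the following sketch are hypothetical, not a prescribed format.

```python
import json

# Hypothetical structure of a machine-readable script generated from a
# reference performance; each line or action carries a timestamp in seconds.
reference_script = {
    "title": "Reference audition scene",
    "events": [
        {"type": "line",   "timestamp": 2.4, "text": "To be, or not to be"},
        {"type": "action", "timestamp": 5.1, "description": "turns toward the window"},
        {"type": "line",   "timestamp": 7.8, "text": "that is the question"},
    ],
}

print(json.dumps(reference_script, indent=2))
```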
These, and other, aspects of the disclosed technology are described in greater detail in the present document.
Television programs have taken on different forms nowadays. For example, variety shows that include a variety of acts such as musical performances, sketch comedy, magic, acrobatics, juggling, and/or ventriloquism are widespread in some parts of the world. Reality shows and various real-time competitions have also gained popularity amongst viewers. Some of the programs feature acting competitions, in which candidates are asked to provide acting performances and are evaluated in real time to determine who is the better actor or actress. In these programs, providing a standardized baseline for the evaluation can be difficult because artistic evaluations are often highly subjective. Some of the programs require mass auditions to select the proper cast. Quickly filtering out candidates that do not match the profiles of the roles can be challenging and often requires a tremendous amount of time and professional experience.
This patent document discloses techniques that can be implemented in various embodiments to provide an automated platform for evaluating performance activities using objective criteria. The results provided by the platform can be used as a baseline in real-time acting competitions, mass auditions, as well as other types of vocal and/or performing activities (e.g., public speech, gymnastics, and/or ice skating). For example, the platform can facilitate the evaluation process in mass auditions by quickly ruling out candidates who do not meet the minimum requirements of the roles based on features such as articulation, body motions, or facial expressions. The disclosed techniques can also be used in combination with subjective evaluation of the performances to iteratively select candidate performers in mass auditions and/or live acting competitions.
In some embodiments, the automated platform can be implemented as a cloud service that provides one or more modularly designed engines/sub-services to allow the users to customize the types of evaluations needed according to the genre or characteristics of the contents. The disclosed platform enables fully automated evaluation for recorded performances and is capable of processing a large amount of audio and video input data without imposing any burden on the professional crew.
Overview of the System
System Inputs
Different types of system inputs 101 can be provided for different types of acting tasks (e.g., drama acting, speech, singing, dancing, etc.). The inputs 101 can include at least one or more of the following:
Modular Architecture of the ML Core
As shown in
In some embodiments, for a particular category or genre of acting performances, the system 100 can be designed to provide a template of the ML processing engines to process the inputs 101 of such performances to facilitate the generation of processed outputs for evaluating the performance. Table 1 shows an example template selection of ML engines for different categories of inputs (e.g., speech, drama acting, etc.). For example, performances that focus on speech are given higher weights for speech and articulation analysis and lower weights for facial/bodily motion analysis. Performances that center on actions and body movements (e.g., dance, action) are given higher weights for the motion/movement analysis and lower weights for speech or facial analysis. Details about the ML engines (e.g., Articulation Analysis Engine, Facial Expressiveness Engine, Speech Rhythm Analysis Engine, Imitation Analysis Engine) are described below in connection with
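Although Table 1 is not reproduced here, a template of this kind could be captured as a plain configuration mapping from performance category to the selected engines and their weighting factors. The categories, engine names, and weight values in the following sketch are assumptions for illustration only.

```python
# Hypothetical templates mapping a performance category to the ML engines
# used for that category and their weighting factors (values are illustrative).
EVALUATION_TEMPLATES = {
    "speech": {
        "articulation": 0.45,
        "speech_rhythm": 0.35,
        "facial_expressiveness": 0.15,
        "body_motion": 0.05,
    },
    "drama": {
        "articulation": 0.20,
        "speech_rhythm": 0.15,
        "facial_expressiveness": 0.35,
        "body_motion": 0.30,
    },
    "dance": {
        "body_motion": 0.70,
        "facial_expressiveness": 0.20,
        "speech_rhythm": 0.10,
    },
}

def engines_for(category: str) -> dict:
    """Return the engine/weight template for a category; the user may further
    modify the returned selection through the configuration interface."""
    return dict(EVALUATION_TEMPLATES.get(category, {}))
```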
User Configuration Interface
To facilitate the flexible selection and configuration of the ML core, a user configuration interface 103 (e.g., a web interface) can be provided to the user. The user can be presented with a list of available engines for constructing a customized ML engine for the evaluation. Templates such as the one shown in Table 1 can also be provided to the user, allowing the user to make further modifications to the customized evaluation system.
Depending on the nature or the category of a performance to be evaluated, the user can select appropriate ML engines and assign respective weights or weighting factors to the ML engines, or modify an existing template, to build a customized ML engine for the automatic evaluation. For example, given a performance of a public speech, the user can select one or more audio processing engines (e.g., a speech recognition engine, an articulation assessment engine, etc.) to evaluate the quality of the speech. In this specific example, only one facial recognition engine is needed to determine whether the face of the presenter is recognizable during the speech. Greater weights can be assigned to the audio processing sub-scores and a smaller weight can be assigned to the facial recognition engine. As another example, for a drama performance, the user can select more engines for video analysis, such as a facial expression recognition engine and/or a gesture recognition engine, as compared to the number of engines needed for audio processing. Greater weights can be assigned to the video-processing sub-scores as compared to the sub-score(s) for the audio processing.
In some embodiments, the user can be prompted by the user interface 103 to provide additional input information, such as a textual description of the performance or a recording of the reference performance. The textual description can preferably be a machine-readable script that includes descriptions of the scene(s) and shot(s) of the performance.
System Outputs
As discussed above, the evaluation system 100 includes an aggregator 117 configured to generate a final score indicating the grading of the performance by aggregating sub-scores from different ML engines. The system 100 can also provide a textual description associated with the final score to summarize the overall acting performance. In addition to the final score, the system 100 can further output each of the sub-scores in its respective aspect and an associated textual description of the performance in that aspect.
In some embodiments, the outputs (e.g., the textual description, the final score, and sub-scores with corresponding descriptions) can be organized using a machine-readable script so that the evaluation results can be provided to the user or a final grader (e.g., via a user interface). For example, clips that are used as key indicators of the performance can be embedded or referenced in the script so that the user or the final grader can review the evaluation results by examining the key clips. Based on the outputs, the user or the final grader can incorporate subjective grading criteria into the process by adjusting the grading standard. The grading process can continue in an iterative manner to ensure that the evaluation results match the overall objective of the evaluation process.
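As a rough illustration of such an output organized in machine-readable form, the sketch below uses a hypothetical structure; the field names, scores, and clip paths are illustrative only and do not reflect a prescribed format.

```python
# Hypothetical shape of the evaluation output organized as a machine-readable
# script, with key clips referenced so a final grader can review them.
evaluation_output = {
    "final_score": 78.5,
    "summary": "Clear articulation; facial expressions occasionally flat.",
    "sub_scores": [
        {
            "engine": "articulation",
            "score": 85.0,
            "description": "Consistent pronunciation against reference waveforms.",
            "key_clips": ["clips/line_03_audio.wav"],
        },
        {
            "engine": "facial_expressiveness",
            "score": 68.0,
            "description": "Limited expression transitions in the second scene.",
            "key_clips": ["clips/scene_02_0312_0348.mp4"],
        },
    ],
}
```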
In some embodiments, the ML engines as part of the ML system 110 in
Based on the nature of the performance, one or more ML engines (e.g., 225a, 225c-d), such as speech recognition, gesture, and facial expression analysis engines, are selected to process the inputs. The inputs are transmitted to the selected ML engines for the engine-level processing to obtain sub-scores. The sub-scores are then aggregated and combined by the aggregator 230 to produce the final score. In the example shown in
In some embodiments, the engines are organized by the system based on complexity and comprehensiveness of the analysis performed by the engine, and users can select the appropriate engines based on the type of the input performances.
Some example low-level engines are described below.
1. Articulation Analysis Engine
Articulation refers to the ability to speak clearly and pronounce accurately.
In some embodiments, based on the extracted syllables, words, and sentences, the articulation analysis engine queries a database 430 to retrieve the reference waveforms of the relevant syllables and words. In some embodiments, the articulation analysis engine further takes the textual description 405 of the acting performance as an input. The textual description can be a machine-readable script that includes the lines spoken in the performer's audio samples and in the reference audio samples. The articulation analysis engine can query the database 430 to retrieve the reference waveforms of the relevant syllables and words based on the lines spoken. A comparison is then performed between the waveforms of the performer's audio samples and the retrieved reference waveforms based on a grading standard. The grading standard can be a default criterion associated with the type of the inputs (e.g., speech, drama, etc.) provided by the system template(s). The grading standard can also be specified by the user. For example, when the system is used to evaluate the performance in an iterative manner, the user can adjust the grading standard using subjective criteria. A sub-score for articulation analysis is then generated by the engine.
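One possible way to implement the waveform comparison, shown here purely as a sketch, is peak normalized cross-correlation between the performer's waveform and the retrieved reference waveform for each syllable or word; the 0-100 scoring scale and the averaging rule below are assumptions rather than the disclosed grading standard.

```python
import numpy as np

def articulation_similarity(performer: np.ndarray, reference: np.ndarray) -> float:
    """Similarity in [0, 1] between two mono waveforms of the same
    syllable/word, using peak normalized cross-correlation."""
    p = (performer - performer.mean()) / (performer.std() + 1e-9)
    r = (reference - reference.mean()) / (reference.std() + 1e-9)
    corr = np.correlate(p, r, mode="full") / min(len(p), len(r))
    return float(np.clip(corr.max(), 0.0, 1.0))

def articulation_sub_score(pairs) -> float:
    """Average similarity over (performer_waveform, reference_waveform) pairs,
    mapped onto a 0-100 sub-score; the grading standard would determine how
    this score is interpreted (e.g., a pass/fail threshold)."""
    sims = [articulation_similarity(p, r) for p, r in pairs]
    return 100.0 * float(np.mean(sims)) if sims else 0.0
```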
2. Facial Expressiveness Engine.
Facial expressiveness refers to the ability to display emotions with various facial expressions.
The facial expressiveness engine 500 performs basic video processing on the input video samples to identify faces shown in the video samples. Bounding boxes can be given to mark the locations of the faces. The basic video processing can also include a scaling operation to scale the video samples to a standard size to facilitate subsequent processing of the samples.
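A minimal sketch of this basic video processing step, assuming OpenCV's stock Haar cascade face detector and an arbitrarily chosen standard frame size, might look as follows; the detector choice and target size are assumptions for illustration.

```python
import cv2

# Scale each frame to a standard size and mark detected faces with bounding boxes.
STANDARD_SIZE = (640, 360)  # assumed standard size (width, height)
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_and_scale(frame):
    """Return the scaled frame and bounding boxes (x, y, w, h) of detected faces."""
    scaled = cv2.resize(frame, STANDARD_SIZE)
    gray = cv2.cvtColor(scaled, cv2.COLOR_BGR2GRAY)
    boxes = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return scaled, boxes
```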
The facial expressiveness engine 500 also includes one or more dimension reduction models (e.g., a neural network, principal component analysis, etc.) as part of the advanced image processing unit 520 to build feature vectors for different facial expressions based on the performer's photos. These feature vectors can be stored as reference feature vectors in a database 530 for future evaluations of the same performer's video samples.
The detected faces and the extracted feature vectors are then used to evaluate the facial expressiveness of the performer in the input video samples. In some embodiments, the facial expressiveness engine further includes a facial expression detection unit 540 that takes the textual description 505 of the acting performance as an input. The textual description can be a machine-readable script that includes different facial expression tags describing the expressions that appear in the video samples. A sub-score for facial expressiveness is then generated based on the detected faces, the extracted feature vectors, and a grading standard. The grading standard can be a default criterion associated with the type of the inputs (e.g., speech, drama, etc.) provided by the system template(s). The grading standard can also be specified by the user, such as a standard that incorporates subjective grading criteria.
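As one concrete, purely illustrative instance of the dimension reduction step, the sketch below builds PCA feature vectors from grayscale face crops of the performer's photos; the crop format and the number of components are assumptions, and other models (e.g., a neural network) could serve the same role.

```python
import numpy as np
from sklearn.decomposition import PCA

def build_reference_features(face_crops: np.ndarray, n_components: int = 32):
    """face_crops: array of shape (num_photos, H, W) holding grayscale face crops.

    Returns the fitted PCA model and one low-dimensional feature vector per
    photo, which can be stored as reference vectors keyed by the labeled
    expression for later comparison against the performer's video samples.
    """
    flattened = face_crops.reshape(len(face_crops), -1).astype(np.float32)
    pca = PCA(n_components=min(n_components, len(face_crops)))
    reference_vectors = pca.fit_transform(flattened)
    return pca, reference_vectors
```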
3. Musicality Analysis Engine
Musicality refers to the ability to carry an accurate tune and have a good vocal range. The input for this engine includes the performer's audio samples (e.g., a recorded song).
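As a sketch of how a musicality analysis might extract a pitch contour and estimate vocal range from the recorded song, the example below uses the pYIN pitch tracker in the librosa library; the frequency bounds and the use of semitone span as a proxy for vocal range are assumptions rather than the disclosed method.

```python
import librosa
import numpy as np

def pitch_contour_and_range(audio_path: str):
    """Return the voiced pitch contour (Hz) and the vocal range in semitones."""
    y, sr = librosa.load(audio_path, sr=None, mono=True)
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr)
    voiced_f0 = f0[voiced_flag & ~np.isnan(f0)]
    if voiced_f0.size == 0:
        return None, 0.0
    # Span in semitones between the lowest and highest sung pitches.
    range_semitones = 12.0 * np.log2(voiced_f0.max() / voiced_f0.min())
    return voiced_f0, float(range_semitones)
```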
Referring back to
Take speech rhythm analysis as an example,
In some embodiments, the speech rhythm analysis engine 700 further takes the textual description 705 of the acting performance as an input. The textual description 705 can be a machine-readable script that includes the lines spoken in the performer's audio samples and in the reference audio samples. The textual descriptions can be used to improve the accuracy of extracting the waveforms of audio units from the audio stream. The vectors of waveforms and the time lapses between any two consecutive units are then fed into the advanced audio comparison analysis to evaluate the similarity between the performer's audio samples and the reference samples. A sub-score for speech rhythm is calculated based on both the similarity analysis and the grading standard. The grading standard can be a default criterion associated with the type of the inputs (e.g., speech, drama, etc.) provided by the system template(s). The grading standard can also be specified by the user. For example, when the system is used to evaluate the performance in an iterative manner, the user can adjust the grading standard using subjective criteria.
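A minimal sketch of the rhythm comparison, assuming the audio units (syllables or words) have already been segmented and only their onset times are compared, is shown below; the deviation-based scoring rule is an assumption, not the disclosed grading standard.

```python
import numpy as np

def rhythm_sub_score(performer_onsets, reference_onsets) -> float:
    """Compare the time lapses between consecutive audio units (onset times in
    seconds) of the performer against the reference, returning a 0-100 score."""
    n = min(len(performer_onsets), len(reference_onsets))
    if n < 2:
        return 0.0
    perf_gaps = np.diff(np.asarray(performer_onsets[:n], dtype=float))
    ref_gaps = np.diff(np.asarray(reference_onsets[:n], dtype=float))
    # Relative deviation of each inter-unit gap from the corresponding reference gap.
    deviation = np.abs(perf_gaps - ref_gaps) / (ref_gaps + 1e-9)
    return float(100.0 * np.clip(1.0 - deviation.mean(), 0.0, 1.0))
```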
Referring back to
For example, as part of the evaluation process, the performer can be asked to play a scene selected from one of Shakespeare's pieces that are often used for auditions. The objective of the evaluation is to determine whether the performance of the performer is consistent with what is defined in the script.
In some embodiments, the advanced image processing unit 830 includes one or more dimension reduction models (neural network, principal component analysis (PCA), non-negative matrix factorization, linear discriminant analysis, generalized discriminant analysis, canonical correlation analysis, autoencoders, etc.) to build feature vectors for different facial expressions based on the performer's photos. Dimension reduction is the transformation of high-dimensional data (e.g., video or audio data that includes a large number of samples) into a low-dimensional space (e.g., selected number of video or audio features). These feature vectors can be stored as reference feature vectors in a database 840 for future evaluations of the same performer's video samples. The extracted feature vectors are then sent to a facial expression and body motion transition detection unit 850 that is configured to match the extracted feature vectors with the reference feature vectors stored in the database 840, and to align the detected transitions with the relevant tags in the script.
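As an illustrative sketch of matching an extracted feature vector against the reference feature vectors stored in the database, cosine similarity over a label-keyed dictionary could be used; the database layout here is hypothetical and stands in for database 840.

```python
import numpy as np

def match_expression(feature: np.ndarray, reference_db: dict) -> tuple:
    """reference_db: expression/action label -> reference feature vector.

    Returns the best-matching label and its cosine similarity, which can then
    be aligned against the relevant tag in the machine-readable script."""
    best_label, best_sim = None, -1.0
    for label, ref in reference_db.items():
        sim = float(np.dot(feature, ref) /
                    (np.linalg.norm(feature) * np.linalg.norm(ref) + 1e-9))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim
```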
The outputs from the audio processing unit 810 and the facial expression and body motion transition detection unit 850 are then fed into an artistic analysis unit 860. The artistic analysis unit 860 examines the alignment of the spoken syllables/words with the script, as well as the alignment of the facial expressions/body actions with the script. The artistic analysis unit 860 then generates a sub-score indicating a degree of imitation as compared to the reference sample(s) based on a grading standard. The grading standard can be a default criterion associated with the type of the inputs (e.g., speech, drama, etc.) provided by the system template(s). The grading standard can also be specified by the user. For “high-level” engines, the evaluation of the performance often involves more subjective criteria as compared to “low-level” engines that focus on specific aspects of the content. Therefore, the artistic analysis unit 860 can provide preliminary analysis results to the user and allow the user to adjust the grading standard to incorporate subjective criteria in the process. Once the user is content with the output of this module, the sub-score can be aggregated with the outputs of the other modules to produce a final evaluation.
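A simplified sketch of the alignment check is shown below: each scripted event (a line or an expression/action tag with a timestamp) is matched against detected events within a tolerance window, and the fraction matched is mapped to a sub-score. The tolerance value and scoring rule are assumptions rather than the disclosed grading standard.

```python
def imitation_sub_score(script_events, detected_events, tolerance=1.0) -> float:
    """script_events / detected_events: lists of (timestamp_seconds, label) tuples.

    A scripted event counts as matched if a detected event with the same label
    occurs within the tolerance window; the matched fraction is scaled to 0-100.
    """
    if not script_events:
        return 0.0
    matched = 0
    for ts, label in script_events:
        if any(abs(ts - d_ts) <= tolerance and label == d_label
               for d_ts, d_label in detected_events):
            matched += 1
    return 100.0 * matched / len(script_events)
```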
The engine is also configured to extract the features from the performer's audio and video data and transmit such data to the artistic analysis unit 920. The artistic analysis unit 920 examines the alignment of the syllables/words spoken by the performer with the items defined in the generated script, as well as the alignment of the facial expressions/body actions with the items in the script. The artistic analysis unit 920 then generates a sub-score indicating a degree of imitation as compared to the reference sample(s) based on a grading standard. The grading standard can be a default criterion associated with the type of the inputs (e.g., speech, drama, etc.) provided by the system template(s). Similar to the example shown in
The processor(s) 1005 may include central processing units (CPUs) to control the overall operation of, for example, the host computer. In certain embodiments, the processor(s) 1005 accomplish this by executing software or firmware stored in memory 1010. The processor(s) 1005 may be, or may include, one or more programmable general-purpose or special-purpose microprocessors, digital signal processors (DSPs), programmable controllers, application specific integrated circuits (ASICs), programmable logic devices (PLDs), or the like, or a combination of such devices.
The memory 1010 can be or include the main memory of the computer system. The memory 1010 represents any suitable form of random access memory (RAM), read-only memory (ROM), flash memory, or the like, or a combination of such devices. In use, the memory 1010 may contain, among other things, a set of machine instructions which, when executed by processor 1005, causes the processor 1005 to perform operations to implement embodiments of the presently disclosed technology.
Also connected to the processor(s) 1005 through the interconnect 1025 is an (optional) network adapter 1015. The network adapter 1015 provides the computer system 1000 with the ability to communicate with remote devices, such as storage clients and/or other storage servers, and may be, for example, an Ethernet adapter or Fibre Channel adapter.
Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, e.g., one or more engines of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing unit” or “data processing apparatus” encompasses various apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program (also known as a program, software, software application, machine-readable script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as an engine, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more engines, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include various forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that various illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.
This patent document is a continuation of and claims priority to and benefits of U.S. patent application Ser. No. 17/549,749, filed Dec. 13, 2021. The entire content of the before-mentioned patent application is incorporated by reference as part of the disclosure of this application.