The present invention relates to a method and associated system for providing visual mood based annotations for audio media.
Describing data is typically an inaccurate process with little flexibility. Data description within a system is typically a manual process. Manually describing data may be time consuming and may require a large amount of resources. Accordingly, there exists a need in the art to overcome at least some of the deficiencies and limitations described herein above.
The present invention provides a method comprising: receiving, by a computer processor of a computing apparatus, mood description data describing different human emotions/moods; receiving, by the computer processor, an audio file comprising audio data presented by an author; generating, by the computer processor, a mood descriptor file comprising portions of the audio data associated with specified descriptions of the mood description data; receiving, by the computer processor, a mood tag library file comprising mood tags describing and mapped to animated and/or still objects representing various emotions/moods; associating, by the computer processor based on the mood tags, each animated and/or still object of the animated and/or still objects with an associated description of the specified descriptions; synchronizing, by the computer processor, the animated and/or still objects with the portions of the audio data associated with the specified descriptions; and first presenting, by the computer processor to a listener, the animated and/or still objects synchronized with the portions of the audio data associated with the specified descriptions.
The present invention provides a computing system comprising a computer processor coupled to a computer-readable memory unit, the memory unit comprising instructions that when executed by the computer processor implement a method comprising: receiving, by the computer processor of a computing apparatus, mood description data describing different human emotions/moods; receiving, by the computer processor, an audio file comprising audio data presented by an author; generating, by the computer processor, a mood descriptor file comprising portions of the audio data associated with specified descriptions of the mood description data; receiving, by the computer processor, a mood tag library file comprising mood tags describing and mapped to animated and/or still objects representing various emotions/moods; associating, by the computer processor based on the mood tags, each animated and/or still object of the animated and/or still objects with an associated description of the specified descriptions; synchronizing, by the computer processor, the animated and/or still objects with the portions of the audio data associated with the specified descriptions; and first presenting, by the computer processor to a listener, the animated and/or still objects synchronized with the portions of the audio data associated with the specified descriptions.
The present invention provides a computer program product comprising a computer readable storage device storing computer readable program code, the computer readable program code comprising an algorithm that when executed by a computer processor of a computing system implements a method, said method comprising: receiving, by the computer processor of a computing apparatus, mood description data describing different human emotions/moods; receiving, by the computer processor, an audio file comprising audio data presented by an author; generating, by the computer processor, a mood descriptor file comprising portions of the audio data associated with specified descriptions of the mood description data; receiving, by the computer processor, a mood tag library file comprising mood tags describing and mapped to animated and/or still objects representing various emotions/moods; associating, by the computer processor based on the mood tags, each animated and/or still object of the animated and/or still objects with an associated description of the specified descriptions; synchronizing, by the computer processor, the animated and/or still objects with the portions of the audio data associated with the specified descriptions; and first presenting, by the computer processor to a listener, the animated and/or still objects synchronized with the portions of the audio data associated with the specified descriptions.
The present invention advantageously provides a simple method and associated system capable of describing data.
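Read purely as an illustration, the steps summarized above amount to a small data pipeline: a mood descriptor file ties time ranges of the audio to mood descriptions, a mood tag library maps those descriptions to still or animated objects, and a synchronization step schedules each object against its portion of the audio. The following Python sketch is a minimal, hypothetical rendering of that pipeline; the names (MoodSegment, SynchronizedEntry, build_synchronized_file, the sample library entries) are assumptions for illustration and are not defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class MoodSegment:
    """One portion of the audio data tagged with a mood description."""
    start_s: float          # segment start, seconds into the audio file
    end_s: float            # segment end
    mood: str               # e.g. "excited", "calm", "somber"

@dataclass
class SynchronizedEntry:
    """A visual object scheduled against a portion of the audio."""
    start_s: float
    end_s: float
    visual_object: str      # path to a still image or animation clip

def build_synchronized_file(descriptor: List[MoodSegment],
                            tag_library: Dict[str, str]) -> List[SynchronizedEntry]:
    """Associate each mood description with its visual object, keeping the timing."""
    synchronized = []
    for seg in descriptor:
        visual = tag_library.get(seg.mood, tag_library.get("neutral", ""))
        synchronized.append(SynchronizedEntry(seg.start_s, seg.end_s, visual))
    return synchronized

# Hypothetical inputs: a mood descriptor file and a mood tag library.
descriptor = [MoodSegment(0.0, 12.5, "excited"), MoodSegment(12.5, 40.0, "calm")]
library = {"excited": "excited_gesture.gif", "calm": "calm_face.png", "neutral": "neutral.png"}
print(build_synchronized_file(descriptor, library))
```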
System 2 enables a process for automatically or manually generating mood based annotations for audio media; the annotations are utilized to automatically generate and present (to a listener) visual media (synchronized with the audio media) representing a speaker's mood in order to keep the attention of the listener. An author of the audio media may control placement of the mood based annotations to be coupled with the audio media. Alternatively, a listener (of the audio media) may control placement of the mood based annotations to be coupled with the audio media. Computing system 14 receives input audio (i.e., speech data) from input audio files 5 and associates mood descriptions (i.e., tags describing various human emotions/moods) with portions of the input audio. The mood descriptions are associated with the mood based annotations (e.g., mood description objects) retrieved from a mood tag library 10. Software application 17 generates and presents (to a listener) a synchronized file 6 comprising the audio (i.e., the input audio) synchronized with the associated mood based annotations. The following examples describe various scenarios for generating audio files synchronized with mood based annotations:
System 2 provides the ability for an audio content author to manually inject moods (to be presented to a listener) in real time while he/she is recording the audio. For example, the input audio may be received from an author (of the input audio) via a recording device and the mood descriptions may be associated (by the author) with the input audio in real time as the input audio is being recorded (as the author speaks). In this scenario, software application 17 assigns the different mood descriptions (i.e., different mood descriptor tags) to the associated portions of the input audio (automatically based on a software analysis or manually based on commands from the author) at specified time frames (in the audio file) resulting in the generation of a mood descriptor file. The mood descriptions are associated with the mood based annotations (e.g., mood description objects such as still or animated video images) retrieved from a mood tag library 10. Software application 17 generates and presents (to a listener) a synchronized file 6 comprising the audio (i.e., the input audio) synchronized with the associated mood based annotations.
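One way to picture this real-time scenario, purely as a hedged sketch: each mood command from the author is stamped with the current recording offset and later expanded into timed segments of a mood descriptor file. The RecordingSession class and its methods below are hypothetical illustrations, not an interface defined by the disclosure.

```python
import time

class RecordingSession:
    """Hypothetical recorder that lets the author inject mood tags while speaking."""

    def __init__(self):
        self._start = None
        self.mood_events = []   # (offset_seconds, mood) pairs, in recording order

    def start(self):
        self._start = time.monotonic()

    def inject_mood(self, mood: str):
        """Called when the author triggers a 'mood' control during recording (assumes start() was called)."""
        offset = time.monotonic() - self._start
        self.mood_events.append((round(offset, 2), mood))

    def to_descriptor(self, total_length_s: float):
        """Turn point events into segments: each mood holds until the next event or the end of the audio."""
        segments = []
        for i, (offset, mood) in enumerate(self.mood_events):
            end = self.mood_events[i + 1][0] if i + 1 < len(self.mood_events) else total_length_s
            segments.append({"start_s": offset, "end_s": end, "mood": mood})
        return segments
```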
System 2 provides the ability to annotate audio after an audio recording has been completed. An annotation process produces a set of tags that act as descriptors associated with the mood of an author of the audio. Annotations may span intervals in the audio recording or be placed at specified points along the audio recording. For example, the input audio may be fully received from an author (of the input audio) via a recording device and the mood descriptions may be associated (by the author) with the input audio after the input audio has been recorded. In this scenario, the author manually assigns the different mood descriptions (i.e., different mood descriptor tags) to the associated portions of the input audio at specified time frames (in the audio file) resulting in the generation of a mood descriptor file. The mood descriptions are associated with the mood based annotations (e.g., mood description objects such as still or animated video images) retrieved from a mood tag library 10. Software application 17 generates and presents (to a listener) a synchronized file 6 comprising the audio (i.e., the input audio) synchronized with the associated mood based annotations.
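Because an annotation in this scenario may either span an interval or mark a single point, the resulting mood descriptor file might resemble the hypothetical JSON layout below. The file name, field names, and values are assumptions used only for illustration; the disclosure does not specify a file format.

```python
import json

# Hypothetical mood descriptor for an already-recorded audio file.
# "interval" tags span a range of the recording; "point" tags mark a single moment.
descriptor = {
    "audio_file": "recorded_talk.wav",
    "tags": [
        {"type": "interval", "start_s": 0.0,   "end_s": 95.0,  "mood": "enthusiastic"},
        {"type": "point",    "at_s": 130.5,                    "mood": "amused"},
        {"type": "interval", "start_s": 200.0, "end_s": 260.0, "mood": "serious"},
    ],
}

# Persist the descriptor so it can later be matched against the mood tag library.
with open("mood_descriptor.json", "w") as fh:
    json.dump(descriptor, fh, indent=2)
```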
System 2 provides the ability to produce a descriptor of moods (of the author of the audio) based on an automated examination of the audio (e.g., analysis of audio inflection points). For example, software application 17 may sense (by sensing voice inflections) that the author is very excited, thereby generating an emotion descriptor that describes rapid hand movement or extreme engagement of hand gestures. In this scenario, input audio may be received from an author (of the input audio) via a recording device and voice inflection points (of the author's voice) are detected and analyzed (by software application 17). The voice inflections may be analyzed by, inter alia, comparing the voice inflections to a predefined table or file describing different voice inflections for individuals including the author of the audio. Based on the analysis, the mood descriptions are automatically associated (by software application 17) with associated portions of the input audio at specified time frames resulting in the generation of a mood descriptor file. The mood descriptions are associated with the mood based annotations (e.g., mood description objects such as, inter alia, still or animated video images of rapid hand movement, extreme engagement of hand gestures, an animated video image of an excited person, an image of a person smiling or laughing, etc.) retrieved from a mood tag library 10. Software application 17 generates and presents (to a listener) a synchronized file 6 comprising the audio (i.e., the input audio) synchronized with the associated mood based annotations.
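The automated examination could be approximated, for illustration only, by comparing simple prosodic features such as pitch and energy against a per-speaker baseline, standing in for the predefined table of voice inflections mentioned above. The thresholds, baseline values, and mood labels in the sketch below are assumptions, not the analysis claimed by the disclosure.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    start_s: float
    end_s: float
    pitch_hz: float    # estimated fundamental frequency for this frame
    energy: float      # RMS energy for this frame

# Hypothetical per-speaker baseline, standing in for the predefined table or file
# describing different voice inflections for individuals, including the author.
BASELINE = {"pitch_hz": 120.0, "energy": 0.05}

def classify_mood(frame: Frame) -> str:
    """Rough, assumed mapping from voice inflection to a mood description."""
    if frame.pitch_hz > 1.3 * BASELINE["pitch_hz"] and frame.energy > 2.0 * BASELINE["energy"]:
        return "excited"      # could map to imagery of rapid hand movement or gestures
    if frame.pitch_hz < 0.8 * BASELINE["pitch_hz"]:
        return "somber"
    return "neutral"

def build_descriptor(frames: List[Frame]):
    """Assign a mood description to each analyzed portion of the audio."""
    return [{"start_s": f.start_s, "end_s": f.end_s, "mood": classify_mood(f)} for f in frames]

# Toy frames standing in for the output of an audio analysis front end.
frames = [Frame(0.0, 5.0, 180.0, 0.15), Frame(5.0, 10.0, 100.0, 0.04)]
print(build_descriptor(frames))
```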
Still yet, any of the components of the present invention could be created, integrated, hosted, maintained, deployed, managed, serviced, etc. by a service supplier who offers to provide mood based annotations for audio media. Thus the present invention discloses a process for deploying, creating, integrating, hosting, and/or maintaining computing infrastructure, comprising integrating computer-readable code into the computer system 90, wherein the code in combination with the computer system 90 is capable of performing a method for providing mood based annotations for audio media. In another embodiment, the invention provides a business method that performs the process steps of the invention on a subscription, advertising, and/or fee basis. That is, a service supplier, such as a Solution Integrator, could offer to provide mood based annotations for audio media. In this case, the service supplier can create, maintain, support, etc. a computer infrastructure that performs the process steps of the invention for one or more customers. In return, the service supplier can receive payment from the customer(s) under a subscription and/or fee agreement and/or the service supplier can receive payment from the sale of advertising content to one or more third parties.
While embodiments of the present invention have been described herein for purposes of illustration, many modifications and changes will become apparent to those skilled in the art. Accordingly, the appended claims are intended to encompass all such modifications and changes as fall within the true spirit and scope of this invention.
This application is a continuation application claiming priority to Ser. No. 14/856,904, filed Sep. 17, 2015, now U.S. Pat. No. 9,953,451, issued Apr. 24, 2018, which is a continuation application claiming priority to Ser. No. 14/596,494, filed Jan. 14, 2015, now U.S. Pat. No. 9,235,918, issued Jan. 12, 2016, which is a continuation application claiming priority to Ser. No. 13/153,751, filed Jun. 6, 2011, now U.S. Pat. No. 8,948,893, issued Feb. 3, 2014.
Number | Name | Date | Kind |
---|---|---|---|
5265248 | Moulios et al. | Nov 1993 | A |
5616876 | Cluts | Apr 1997 | A |
6795808 | Strubbe et al. | Sep 2004 | B1 |
7102643 | Moore et al. | Sep 2006 | B2 |
7257538 | Qian | Aug 2007 | B2 |
7372536 | Shah et al. | May 2008 | B2 |
7396990 | Lu et al. | Jul 2008 | B2 |
7400351 | Zhang et al. | Jul 2008 | B2 |
7921067 | Kemp et al. | Apr 2011 | B2 |
8126220 | Greig | Feb 2012 | B2 |
8443290 | Bill | May 2013 | B2 |
8948893 | Abuelsaad et al. | Feb 2015 | B2 |
9235918 | Abuelsaad et al. | Jan 2016 | B2 |
20020018074 | Buil et al. | Feb 2002 | A1 |
20020147628 | Specter et al. | Oct 2002 | A1 |
20030035412 | Wang et al. | Feb 2003 | A1 |
20050158037 | Okabayashi et al. | Jul 2005 | A1 |
20070088727 | Kindig | Apr 2007 | A1 |
20070157795 | Hung | Jul 2007 | A1 |
20070256545 | Lee et al. | Nov 2007 | A1 |
20070277092 | Basson et al. | Nov 2007 | A1 |
20080110322 | Lee et al. | May 2008 | A1 |
20080158334 | Reponen et al. | Jul 2008 | A1 |
20080163074 | Tu | Jul 2008 | A1 |
20090116684 | Andreasson | May 2009 | A1 |
20090278851 | Ach et al. | Nov 2009 | A1 |
20100191733 | Park et al. | Jul 2010 | A1 |
20100325135 | Chen et al. | Dec 2010 | A1 |
20110007142 | Perez et al. | Jan 2011 | A1 |
20110029112 | Kemp et al. | Feb 2011 | A1 |
20110100199 | Sugimoto et al. | May 2011 | A1 |
20110239137 | Bill | Sep 2011 | A1 |
20120310392 | Abuelsaad et al. | Dec 2012 | A1 |
20150127129 | Abuelsaad et al. | May 2015 | A1 |
20160004500 | Abuelsaad et al. | Jan 2016 | A1 |
Number | Date | Country |
---|---|---|
2010140278 | Jun 2010 | JP
2010105396 | Sep 2010 | WO |
Entry |
---|
Amendment filed Jun. 25, 2015 in response to Office Action (dated May 22, 2015) for U.S. Appl. No. 14/596,494, filed Jan. 14, 2015. |
Amendment filed Sep. 12, 2014 in response to Office Action (dated Jun. 18, 2014) for U.S. Appl. No. 13/153,751, filed Jun. 6, 2011. |
Laurier et al.; Mood Cloud: A Real-Time Music Mood Visualization Tool; CMMR, Computer Music Modeling and Retrieval 5th International Symposium; May 19-23, 2008; 5 pages. |
Notice of Allowance (dated Jul. 17, 2015) for U.S. Appl. No. 14/596,494, filed Jan. 14, 2015. |
Notice of Allowance (dated Sep. 29, 2014) for U.S. Appl. No. 13/153,751, filed Jun. 6, 2011. |
Office Action (dated Jun. 18, 2014) for U.S. Appl. No. 13/153,751, filed Jun. 6, 2011. |
Office Action (dated May 22, 2015) for U.S. Appl. No. 14/596,494, filed Jan. 14, 2015. |
Office Action (dated Aug. 10, 2017) for U.S. Appl. No. 14/856,904, filed Sep. 17, 2015. |
Amendment filed Nov. 7, 2017 in response to Office Action (dated Aug. 10, 2017) for U.S. Appl. No. 14/856,904, filed Sep. 17, 2015. |
Notice of Allowance (dated Dec. 18, 2017) for U.S. Appl. No. 14/856,904, filed Sep. 17, 2015. |
Number | Date | Country | |
---|---|---|---|
20180204367 A1 | Jul 2018 | US |
 | Number | Date | Country |
---|---|---|---|
Parent | 14856904 | Sep 2015 | US |
Child | 15920802 | US | |
Parent | 14596494 | Jan 2015 | US |
Child | 14856904 | US | |
Parent | 13153751 | Jun 2011 | US |
Child | 14596494 | US |