The present disclosure relates to music categorization.
A piece of music has a beginning and an end, regardless of its length, its type, whether it is entirely instrumental, vocal or a combination of both, and regardless of whether it is part of a collection of pieces, such as an album, or by itself, a single. Traditional music selection systems, such as APPLE ITUNES, tend to rely on music types, such as style and genre, and other factors, such as performer(s), decade, etc., to enable users to browse through vast libraries of music and make selections to listen, rent, buy, etc. For example, in such music selection systems, the music is often organized by the genre, style or type of music, e.g., jazz, classical, hip hop, rock and roll, electronic, etc., and within such genres, the music may be further classified by the artist, author, record label, era (e.g., 50's rock), etc.
Some music selection systems will also make recommendations for music based on user preferences and other factors. Pandora Media, Inc.'s PANDORA radio system, for example, allows users to pick music based on genre and artists, and will then recommend additional pieces the user may be interested in listening to based on the user's own identification system. This identification system is derived from the Music Genome Project. While the details of the Music Genome Project do not appear to be publicly available, certain unverified information about it is available on-line. For example, Wikipedia states that the Music Genome Project uses over 450 different musical attributes, combined into larger groups called focus traits, to make these recommendations. There are alleged to be thousands of focus traits, including rhythm syncopation, key tonality, vocal harmonies, and displayed instrumental proficiency. See, http://en.wikipedia.org/wiki/Music_Genome_Project (accessed Jan. 27, 2019).
According to this Wikipedia article, each piece is represented by a vector (a list of attributes) containing up to 450 or more attributes or “genes,” as noted above. Each gene corresponds to a characteristic of the music, for example, gender of lead vocalist, level of distortion on the electric guitar, type of background vocals, etc. Different genres of music will typically have different sets of genes, e.g., 150 genes for some types of music, 350 to 400 genes for other types, and as many as 450 genes for some forms of classical music. Each gene is assigned a number between 0 and 5, in half-integer increments. The assignment of gene values is performed by humans in a process that takes 20 to 30 minutes per piece. Some percentage of the pieces is further analyzed by other humans to ensure conformity. Distance functions are used to develop lists of pieces related to a selected piece based on the vector assigned to the selected piece.
While the Music Genome Project represents an ambitious and detailed identification system, it suffers from many shortcomings as a result of its inherent complexity. The most significant of these deficiencies is that, as implemented by PANDORA, it often recommends pieces as being similar to other pieces, but listeners of those pieces are not capable of identifying why those pieces were determined to be similar. For example, PANDORA allows users to select a “radio” that is based on the music by a particular artist, such as Madonna Radio, which will primarily play Madonna music mixed in with music by a variety of other artists that PANDORA considers to be similar. Many listeners find, however, that regardless of the artist selected for a radio, within a relatively short period of time, such as an hour, the music selection will go off in disparate directions, often ending up with holiday music and other types of music that are not remotely related to the selected artist. There may be very good reasons for this, considering the hundreds of attributes being used to make determinations of similarities between the pieces, but those similarities do not appear to relate to what most listeners hear or feel. Accordingly, a better solution is needed.
A method for categorizing music based on a sample set of RTP scores (rhythm, texture and pitch) for predetermined pieces of music is disclosed. Some RTP scores correspond to human-determined RTP scores. Each RTP score corresponds to a category among categories. Unless an unknown piece of music was previously RTP scored based on a unique identification, low-level data is extracted from the unknown piece and analyzed to identify RTP scores based on the sample set. The identified RTP scores are then used to categorize each piece of unknown music and playlists may be created based on the categories. Each RTP score corresponds to an intensity level within the corresponding category, which may also be used in creating playlists. The low-level data may be converted to mel-frequency cepstrum coefficient (MFCC) data that is input into a trained neural network to identify the RTP scores.
Throughout the drawings, reference numbers may be re-used to indicate correspondence between referenced elements. The drawings are provided to illustrate examples described herein and are not intended to limit the scope of the disclosure.
Embodiments of the present disclosure are primarily directed to music categorization. In particular, embodiments involve a music categorization system that objectively categorizes music based on rhythm, texture and pitch (RTP) values or scores, from which the mood or some other category of the music may be determined.
With respect to mood, when someone listens to a piece of music, the piece tends to evoke some emotion. This may be because of some personal connection a user has to the piece, such as memories or experiences related to the piece, but may also be because of the piece's inherent qualities. Since those inherent qualities may be represented by frequency-related data (i.e., frequencies, structure and organization), that frequency-related data may be used to identify those inherent qualities. The present disclosure describes how spectrograms, whether based on chromagrams or using other forms of spectrograms, may be used to objectively determine the inherent qualities of RTP, which may then be subjectively mapped to moods to identify pieces of music in a new manner.
Values for RTP may be determined holistically or based on low level data extracted from the music. An example of a holistic method for determining RTP is as follows. All music can be identified by its frequency-related data. Perhaps the simplest way of doing so is illustrated in
Accordingly, audio spectrograms based on a short-term Fourier transform, such as represented in
While the spectrogram visually represents some similarities and differences in the music, the time-domain signal representation makes the process of comparing spectrograms using correlation slow and inaccurate. One solution proposed for analyzing the characteristics of spectrogram images is disclosed by Y. Ke, D. Hoiem, and R. Sukthankar, Computer Vision for Music Identification, In Proceedings of Computer Vision and Pattern Recognition, 2005. In this paper, the authors propose determining these characteristics based on: “(a) differences of power in neighboring frequency bands at a particular time; (b) differences of power across time within a particular frequency band; (c) shifts in dominant frequency over time; (d) peaks of power across frequencies at a particular time; and (e) peaks of power across time within a particular frequency band.” Different filters are used to isolate these characteristics from the audio data. If the audio data is formatted in a particular music format, such as MP3, WAV, FLAC, etc., the compressed audio data would first be uncompressed before creating the spectrogram and applying the filters.
An alternative solution for analyzing spectrograms of music in this fashion is the CHROMAPRINT audio fingerprint used by the ACOUSTID database. CHROMAPRINT converts input audio at a sampling rate of 11025 Hz and a frame size of 4096 (0.371 s) with ⅔ overlap. CHROMAPRINT then processes the converted data by transforming the frequencies into musical notes, represented by 12 bins, one for each note, called “chroma features.” After some filtering and normalization, an image like that illustrated in
While the audio representation, or chromagram, of
The arrangement of filter images from
CHROMAPRINT uses 16 filters that can each produce an integer that can be encoded into 2 bits. When these are combined, the result is a 32-bit integer. This same process may be repeated for every subimage generated from the scanned image, resulting in an audio fingerprint, such as that illustrated in
Once an audio fingerprint has been determined for a piece of music having known RTP scores determined through other means (such as a human listener, a spectrum analyzer, or other electrical measurement tool), that audio fingerprint may be compared to other audio fingerprints having unknown RTP scores to see if a match can be found. If they match, then the corresponding pieces of music have the same or very similar RTP scores. If they do not match, then further comparisons may need to be run until the unknown RTP scores corresponding to the audio fingerprint have been identified. Although this holistic approach might involve a human listening to the music to determine known RTP scores corresponding to a sufficient number of pieces of music for comparative purposes, the approach is still much more efficient than the existing technique of relying on humans to listen to every piece of music.
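By way of illustration only, the packing of sixteen 2-bit filter outputs into a 32-bit integer and the comparison of two fingerprints may be sketched as follows. The function names, the use of a bit error rate as the comparison measure, and the 0.15 threshold are assumptions made for this sketch and are not taken from CHROMAPRINT or from the present disclosure.

```python
def pack_filters(values):
    """Pack sixteen 2-bit filter outputs (each 0..3) into one 32-bit integer."""
    if len(values) != 16:
        raise ValueError("expected 16 filter outputs")
    word = 0
    for v in values:
        if not 0 <= v <= 3:
            raise ValueError("each filter output must fit in 2 bits")
        word = (word << 2) | v  # the first filter ends up in the top 2 bits
    return word


def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two equal-length fingerprints,
    where each fingerprint is a list of 32-bit integers (one per subimage)."""
    if len(fp_a) != len(fp_b) or not fp_a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    differing = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return differing / (32 * len(fp_a))


def is_match(fp_a, fp_b, threshold=0.15):
    """Treat two fingerprints as matching when few bits differ.
    The threshold is a hypothetical value chosen only for illustration."""
    return bit_error_rate(fp_a, fp_b) <= threshold
```

In such a sketch, a piece with unknown RTP scores whose fingerprint matches that of a known piece would simply inherit the RTP scores of the known piece.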
A different embodiment based on spectrograms, but less holistically, is further described below. In this embodiment, illustrated in
In an embodiment, a greedy algorithm analyzes all of the low-level data extracted from each piece of music in the sample set to determine which low-level data contributes to correct solutions for RTP scores of each piece of music, based on the known RTP scores. The greedy algorithm may operate by sorting through the low-level data to select the best low-level data candidates for solving for correct RTP scores for each piece. Each best candidate may then be analyzed to determine if the candidate can be used to contribute to the solution. If the candidate can contribute to the solution, a value is assigned to each contributing candidate based on whether it fully or partially solves the solution. If there is no candidate that provides a full solution (as is almost always the case), a collection of contributing candidates is identified that either provides a complete solution or gets closest to the complete solution.
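One possible form of such a greedy selection is sketched below. The sketch makes several assumptions that are not part of the disclosure: each low-level candidate has already been scaled to the same range as the score being solved for, a collection of candidates estimates the score as the plain average of its members, and the quality of a collection is measured by mean absolute error against the known scores. All names are hypothetical.

```python
def mean_abs_error(selected, features, known):
    """Average absolute gap between the estimate (mean of the selected
    candidates) and the known score, taken over all pieces."""
    total = 0.0
    for i, target in enumerate(known):
        estimate = sum(features[name][i] for name in selected) / len(selected)
        total += abs(estimate - target)
    return total / len(known)


def greedy_select(features, known):
    """Greedy forward selection: repeatedly add the single candidate that
    lowers the error the most, and stop when no candidate helps."""
    selected = []
    best_err = float("inf")
    remaining = set(features)
    while remaining:
        name, err = min(
            ((c, mean_abs_error(selected + [c], features, known))
             for c in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if err >= best_err:
            break  # no remaining candidate contributes to the solution
        selected.append(name)
        best_err = err
        remaining.remove(name)
    return selected, best_err
```

When one candidate alone reproduces the known scores, the function returns only that candidate; otherwise it returns the collection that gets closest under the stated error measure.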
In an embodiment, the following low-level data may form a collection of contribution candidates for a solution for rhythm (R):
In an embodiment, the following low-level data may form a collection of contribution candidates for a solution for texture (T):
In an embodiment, the following low-level data may form a collection of contribution candidates for a solution for pitch (Pt):
Different low-level data extractors may extract different data from the spectrograms than that indicated above. In such a case, the greedy algorithm may identify different low-level data that forms the collection of candidates for a solution to either R, T or P.
In an embodiment, rather than use a greedy algorithm, the extracted low-level data for each piece of music may be converted to MFCCs (Mel-frequency cepstral coefficients) as an encoding step and then input into an artificial neural network. The layers of the neural network may extract data from the MFCCs for each piece of music and combine that MFCC data with other data to identify a RTP score for each piece of music, wherein the identification is based on the neural net being trained with known associations between MFCCs and RTP scores. The other data may include audio data augmentation, which may overcome problems associated with data scarcity and otherwise improve recognition performance. Audio data augmentation involves the creation of new synthetic training samples based on small perturbations in a training sample set to fill in gaps in the training data. A sufficiently large set of pieces of music with known RTP scores and other data, such as the audio data augmentation, may lead to a neural network sufficiently trained to determine unknown RTP scores for pieces of music with reasonably sufficient accuracy.
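As a minimal sketch of audio data augmentation, the following creates synthetic training samples by applying small perturbations to each labeled sample while keeping its RTP label. The particular perturbations (a small random gain change plus low-level Gaussian noise), the parameter values, and the function names are assumptions chosen for illustration; the disclosure does not specify them. Audio is assumed to be a list of floating point samples in the range -1.0 to 1.0.

```python
import random


def augment(samples, rng, noise_level=0.005, max_gain_change=0.1):
    """Return a perturbed copy of one audio clip: one random gain for the
    whole clip plus independent noise per sample, clipped to [-1.0, 1.0]."""
    gain = 1.0 + rng.uniform(-max_gain_change, max_gain_change)
    return [
        max(-1.0, min(1.0, gain * s + rng.gauss(0.0, noise_level)))
        for s in samples
    ]


def augment_training_set(training_set, copies, seed=0):
    """Extend a list of (samples, rtp_score) pairs with `copies` synthetic
    variants of each pair. The RTP label is carried over unchanged."""
    rng = random.Random(seed)
    extended = list(training_set)
    for samples, rtp in training_set:
        for _ in range(copies):
            extended.append((augment(samples, rng), rtp))
    return extended
```

The perturbations are kept small on the assumption that they should not change the rhythm, texture or pitch a listener would perceive, so the original RTP score remains a valid label for each synthetic sample.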
As noted above, RTP scores in an embodiment may range from 1 to 5 on a half point scale, i.e., 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5 and 5.0. As such the RTP scores may constitute a spectrum of scores ranging from (1.0,1.0,1.0) to (5.0,5.0,5.0). These RTP scores may be grouped in various ways. In an embodiment, such as step 606 of
The moods identified above are just examples and other words representing other moods may be utilized instead, including completely arbitrary words. However, it has been found that the moods conveyed by pieces of music with RTP scores such as (1.0,2.0,1.0), (1.0,1.0,2.0) and (2.0,2.0,2.0) are similar and therefore logically map to the same grouping of pieces of music. An example mapping of RTP scores to moods is illustrated in
Once the RTP scores have been grouped or mapped as desired, the RTP scores within a group may be further refined. For example, when RTP scores are mapped to moods, the RTP scores corresponding to a mood may fall along a spectrum of intensities. For example, RTP (1.0,1.0,1.0) may be the lowest intensity for sad, while RTP (3.0,5.0,2.0) may be the highest intensity for sad, with all other RTP scores corresponding to sad falling somewhere in between the lowest and highest RTP scores. Hence, in step 608, the intensity levels for RTP scores within each mood may be determined. Although other spectrums may be utilized, the above example may be used to group pieces of music corresponding to a mood as low, medium and high intensity (or any other suitable gradation) with respect to that mood, as will be further described below.
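The half-point RTP scale and one way of grading intensity within a mood may be sketched as follows. The ordering rule used here (sorting by the sum R+T+P and splitting the ordered scores into three equal bands) is purely an assumption for illustration, as is the membership of the example mood other than the two endpoint scores given above; the disclosure does not prescribe either.

```python
from itertools import product

# Half-point scale from 1.0 to 5.0: nine values per dimension.
SCALE = [1.0 + 0.5 * i for i in range(9)]


def all_rtp_scores():
    """Every (R, T, P) combination on the half-point scale (9**3 = 729)."""
    return list(product(SCALE, repeat=3))


def intensity_levels(mood_scores):
    """Label each RTP score assigned to one mood as low, medium or high.
    Assumption: intensity grows with R + T + P; ties are broken by the
    score tuple itself so that the result is deterministic."""
    ordered = sorted(mood_scores, key=lambda rtp: (sum(rtp), rtp))
    count = len(ordered)
    labels = {}
    for index, rtp in enumerate(ordered):
        band = (3 * index) // count  # 0, 1 or 2
        labels[rtp] = ("low", "medium", "high")[band]
    return labels
```

For example, calling `intensity_levels` on a hypothetical sad grouping of (1.0,1.0,1.0), (2.0,2.0,1.5) and (3.0,5.0,2.0) labels the first score low, the second medium and the third high.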
The description provided above for determining RTP scores for pieces of music may be based on averaging, where various averages are used to determine the RTP scores. For example, the entire piece of music may not be sampled to generate the spectrogram utilized to extract the low-level data. Rather, samples may be collected at different times during a piece of music, such as 10 second samples every 10 seconds, or different length samples at fixed or random points during the piece of music. For a piece of music with a consistent rhythm, texture and pitch throughout the entire piece of music, such as Pachelbel's Canon, written somewhere between 1680 and 1706, and considered the godfather of pop music because so much pop music is based on a similar repetition, this form of averaging may be sufficient to generate a singular RTP score that corresponds to the piece.
Other pieces of music may vary significantly throughout, such as starting softly and building up over time until there is a thunderous ending. Other pieces of music are literally all over the place and may have many different moods each with different intensity levels throughout. Bohemian Rhapsody by Queen, for example, is six minutes long and includes several sections, including an introduction, a ballad segment, an operatic passage, a hard rock part, and a reflective coda. For a piece of music like Bohemian Rhapsody, samples taken during the introduction, the ballad segment, the operatic passage, the hard rock part and the coda may result in completely different RTP scores. In an embodiment, samples may be taken during the entire piece or for sufficient lengths of time along a large enough set of points during each piece, such that different RTP scores may be determined for different parts of the same piece of music. For example, a piece may be 40% manic, 40% sad, and 20% happy, and may have different intensity levels within each of those corresponding moods. In order to simplify the current disclosure, only a single RTP score is determined for each piece of music, but it should be understood that multiple RTP scores may be determined for each piece of music.
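Where multiple RTP scores are determined for one piece, the share of the piece falling within each mood may be computed as in the following sketch. It assumes that the piece has already been divided into segments of equal duration and that a mood has already been assigned to each segment; the function name is hypothetical.

```python
from collections import Counter


def mood_profile(segment_moods):
    """Fraction of equally long segments assigned to each mood."""
    if not segment_moods:
        raise ValueError("at least one segment is required")
    counts = Counter(segment_moods)
    total = len(segment_moods)
    return {mood: count / total for mood, count in counts.items()}
```

Ten segments of which four are manic, four are sad and two are happy would yield the 40%, 40% and 20% split mentioned above.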
RTP scores may also be used to visualize moods to users through both color and shape. In the parent U.S. patent application Ser. No. 15/868,902, embodiments for utilizing color and shape to represent moods were discussed as a further way to verify the mood of a piece of music. RTP scores are accurate enough, however, that validating embodiments may not be required. Nevertheless, shape and color may still be utilized in other ways as further illustrated in
In an embodiment, as illustrated in
Once the pieces of music have been input in some manner, the pieces may be analyzed to determine the RTP scores, step 904, for the pieces. Once the pieces have been scored, the pieces may be mapped into different mood classifications as noted above, step 906, and as appropriate for the RTP score of the pieces. Once the moods of pieces have been determined, users may organize the RTP scored music in any manner they choose. If a user has input the music from their own computer, they can create playlists on their computer based, at least, on the moods of the music. If the music has been input from a music service, the user application of the music service may allow the user to create playlists based on moods and other factors. Likewise, a radio station with a collection of music may allow users to create playlists based on that collection and the moods assigned to the pieces and then listen to the playlist through a user application associated with the radio station.
In an embodiment, users may be able to create a playlist that includes pieces of music assigned to different moods and to customize that playlist further based on various factors, step 908. If the user does not wish to customize the playlist, the user may listen to the music based on just the mood classifications, 910. If the user wants to customize the playlist, 912, once the playlist has been customized, the user can listen to the custom playlist, 914.
An embodiment for customizing a playlist is illustrated in
While the intensity levels may cover a range, such as 1.0-5.0, with 1.0 being low, 3.0 being medium and 5.0 being high, as further described in the parent applications incorporated by reference herein, the ranges on the slider may correspond to spans within that range so that a user's choice within a range is not too limited. If a user's choice is truly limited to only RTP scored pieces of music with a high intensity level, the user may find that too few songs are selected for their liking. For this reason, once the user identifies an intensity level, the corresponding span is purposely designed to be a bit bigger, so more pieces of music will be included. For example, if a user selected an intensity level of 4.0, the span may cover a broader portion of the intensity range, such as 3.5-4.5 or 3.0-5.0, thereby allowing a larger number of pieces to be included with the selection, while still more or less honoring the user's intensity selection.
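The widening of a selected intensity level into a span may be sketched as follows, using the 1.0-5.0 range and the 3.5-4.5 and 3.0-5.0 examples given above. The function names, the symmetric half-width parameter and the clamping at the ends of the range are assumptions made for this sketch.

```python
def intensity_span(level, half_width=0.5, low=1.0, high=5.0):
    """Widen a chosen intensity level into a (min, max) span, clamped so
    that the span never leaves the overall intensity range."""
    if not low <= level <= high:
        raise ValueError("intensity level outside the supported range")
    return (max(low, level - half_width), min(high, level + half_width))


def select_by_intensity(pieces, level, half_width=0.5):
    """Keep the titles of (title, intensity) pairs that fall in the span."""
    span_min, span_max = intensity_span(level, half_width)
    return [title for title, intensity in pieces
            if span_min <= intensity <= span_max]
```

With these assumptions, `intensity_span(4.0)` gives (3.5, 4.5) and `intensity_span(4.0, half_width=1.0)` gives (3.0, 5.0), so a wider half-width admits more pieces while still centering on the user's selection.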
Once the primary filters 1000 have been selected, secondary filters 1010, if desired, may be used to further customize a playlist. For each mood selected in the primary filters 1000, the user may then choose to only include or only exclude pieces of music with certain characteristics, such as genre 1012, decade 1014 and artist 1016. These characteristics are only exemplary and other characteristics, typically based on metadata associated with each piece of music, may also be included. If a user selected genre 1012 for a first mood, the user may be able to further filter the playlist to include pieces of music that are of a certain genre, or exclude such pieces, for example, including jazz but excluding country. Likewise, a user could further filter by decade 1014, so as to exclude 1980's music but include 2000's music. Artists 1016 could also be included or excluded. Once the filtering has been completed, the user may then listen to the customized playlist 914.
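The include and exclude behavior of the secondary filters may be sketched as follows. The representation of each piece as a dictionary of metadata, the key names, and the function name are assumptions for this sketch only.

```python
def apply_filters(pieces, include=None, exclude=None):
    """Filter a playlist by metadata.

    `include` maps a metadata key (e.g. "genre") to the set of allowed
    values; a piece must satisfy every include rule. `exclude` maps a key
    to the set of banned values; a piece matching any exclude rule is
    dropped. Pieces are plain dictionaries of metadata.
    """
    include = include or {}
    exclude = exclude or {}
    result = []
    for piece in pieces:
        if any(piece.get(key) not in allowed
               for key, allowed in include.items()):
            continue
        if any(piece.get(key) in banned
               for key, banned in exclude.items()):
            continue
        result.append(piece)
    return result
```

For instance, `apply_filters(playlist, include={"genre": {"jazz"}}, exclude={"decade": {1980}})` would keep only jazz pieces that are not from the 1980's.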
As the RTP to mood mapping involves some subjective determination, i.e., identifying which RTP scores map to which mood and/or the name of the mood or category, some users may feel that certain pieces of music are assigned to the wrong mood and/or the mood is incorrectly identified. For this reason, users may be given the ability to change the moods assigned to certain RTP scores or group pieces into their own categories and to name those categories as they choose. If a user does not want to customize any aspect of a mood for a piece, then the user may be able to just listen to the pieces as classified, step 910. Alternatively, if the user wants to customize the moods or categorize the pieces based on moods, they may do so, step 912. In an embodiment, the user may want to categorize pieces with the same mood or perhaps different moods within a single category that they name themselves, such as “Sunday Listening,” which includes a grouping of pieces with different moods that a user likes to listen to on Sundays, step 914. Users may also be able to change the names of the moods from Manic, Excited, Happy, Cautious, Peaceful and Sad to whatever words they want. Hence, RTP to mood mapping may be more about RTP to category mapping, with the user having the ability to identify what the category is to be called.
In an embodiment, the customization of step 912 may be performed as illustrated in
To customize the RTP to mood mappings, the user may select or deselect one of the different mood clusters 1106 illustrated in the cube 1104. In an embodiment, the cube 1104 may be designed to not show any mood clusters 1106 until the user selects a mood 1108 by checking one of the boxes 1110 corresponding to a mood. Once a box 1110 is selected, the mood cluster of spheres corresponding to that mood may be shown in the cube 1104. In another embodiment, as shown in
Once a single mood has been selected for customization, the user may adjust the RTP to mood mappings 1112 through use of the sliders 1114. Each of the sliders 1114 corresponds to R, T and P, with two handles, illustrated by the darkened circles and a darkened area between the handles representing the range of R, T or P scores for the corresponding mood cluster 1106 illustrated in the cube 1104. As shown in
To customize the RTP to mood mappings for that mood cluster, the user may then select one of the handles on the sliders 1114 and move it up or down. In an embodiment, by selecting the upper handle on the R slider 1114, the user may move the handle up to 5 such that R now covers the range of 2 to 5. As the sliders 1114 are manipulated for one mood cluster 1106, the spheres corresponding to that mood cluster and any other mood cluster with an impacted RTP score may likewise change. For example, as illustrated in
Once a user has customized the mood clusters for a particular playlist, the user may be able to save the playlist with the customizations. A save function is not illustrated, but would be known to one of ordinary skill in the art. In a similar manner, all of the playlists may be modified one by one, or a user may be able to customize all playlists at one time. If a user is unhappy with a customization that the user has made to any playlist, the user could make further changes in the manner described above, or return to the default settings by selecting a return to default settings button (not shown).
A block diagram of a music categorization system based on the above disclosure is illustrated in
In an embodiment, the analyzer 1206 may be utilized to generate a static representation of the piece based on the low-level sampled frequency data, which may be a static visual representation, such as a spectrogram or mel spectrogram. The static visual representation may then be filtered by the filter 1208 to capture intensity differences or other differences represented in the static visual representation and to generate a filtered representation of the content. An encoder 1210 may then encode the filtered representation and create digitized representations of the content based on the encoded filtered representation, such as an audio fingerprint. Alternatively, the analyzer 1206 may utilize the spectrograms in a neural network to determine RTP scores as described herein. The analyzer 1206 may operate in conjunction with the user interface and display 1212 to generate imagery for display to a user over a display and to receive commands and input from the user.
In an embodiment, before a piece of music is processed to extract the low-level data and perform other processing, a music identification code may be obtained from the metadata file associated with the music, such as the International Standard Recording Code (ISRC), a Shazam code, or MusicBrainz Identifier (MBID). Each music identification code uniquely identifies the piece of music and may also be used to identify other information about a piece of music, such as an artist name, releases, recordings, etc. In an embodiment, a database is maintained of RTP scores determined for known music identification codes. A lookup may first be performed, prior to extracting data from a piece of music, to determine if an RTP score already exists for the piece of music, in which case the RTP score may be provided without performing any further analysis.
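The lookup-before-analysis flow may be sketched as follows. The in-memory dictionary standing in for the database, the `analyze` callback standing in for the extraction and scoring steps, and the storing of newly computed scores back into the dictionary are all assumptions made for this sketch.

```python
def score_piece(identifier, audio, known_scores, analyze):
    """Return (rtp_score, analyzed) for one piece of music.

    `identifier` is the music identification code read from metadata, or
    None when the piece has no such code. `known_scores` maps codes to
    previously determined RTP scores. `analyze` is a caller-supplied
    function that derives an RTP score from the audio; it is invoked only
    when the lookup fails.
    """
    if identifier is not None and identifier in known_scores:
        return known_scores[identifier], False  # no further analysis needed
    rtp = analyze(audio)
    if identifier is not None:
        # Assumption: remember the result so later lookups can reuse it.
        known_scores[identifier] = rtp
    return rtp, True
```

The second element of the returned pair indicates whether the low-level analysis actually ran, which makes it straightforward to confirm that already scored pieces are skipped.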
In an embodiment, a method for categorizing music comprises creating a sample set that includes a RTP score for a plurality of possible combinations of a rhythm score (R), a texture score (T), and a pitch score (P) respectively from a R range, a T range, and a P range, at least some of which RTP scores each correspond to a human-determined RTP score for a predetermined piece of music among a plurality of predetermined pieces of music, each RTP score corresponding to a category among a plurality of categories; extracting low-level data from each piece of music among a plurality of pieces of music to be RTP scored; analyzing the low-level data to determine computer-derived RTP scores for each piece of music among the plurality of pieces based on the sample set, each computer-derived RTP score corresponding to one RTP score in the sample set; utilizing the computer-derived RTP scores for each piece of music to determine a corresponding category for each piece of music among the plurality of categories; and creating a playlist based on pieces of music corresponding to one or more categories among the plurality of categories.
In the embodiment, wherein each RTP score further corresponds to an intensity level within the corresponding category.
In the embodiment, further comprising modifying the playlist based on the intensity levels of pieces of music within the one or more categories among the plurality of categories. In the embodiment, wherein the intensity levels of the pieces of music are human-derived. In the embodiment, wherein the intensity levels of the pieces of music are based on a spectrum of the human-derived RTP scores within each category. In the embodiment, further comprising modifying the playlist based on one or more of a music genre corresponding to each piece of music among the pieces of music, a decade during which each piece of music among the pieces of music was published, and an artist that performed each piece of music among the pieces of music. In the embodiment, wherein modifying the playlist includes excluding one or more of the music genre, the decade and the artist. In the embodiment, wherein modifying the playlist includes including one or more of the music genre, the decade and the artist.
In the embodiment, further comprising providing a user interface configured to enable a user to modify the computer-derived RTP scores for one or more pieces of music, wherein the modification changes the corresponding category for at least some of the one or more pieces of music. In the embodiment, wherein providing a user interface includes providing a three-dimensional image that includes positions for objects corresponding to at least a plurality of RTP scores, and wherein a plurality of objects corresponding to the computer-derived RTP scores for one category among the plurality of categories form a three-dimensional shape within the three-dimensional image. In the embodiment, wherein providing the user interface configured to enable the user to modify the computer-derived RTP scores for the one category includes enabling the user to adjust one or more of the R range, the T range, and the P range for the three-dimensional shape. In the embodiment, wherein providing the user interface configured to enable the user to modify the computer-derived RTP scores includes enabling the user to adjust one or more of the R range, the T range, and the P range for the computer-derived RTP scores of the one category.
In the embodiment, wherein the low-level data contributing to a determination of the R score of each computer-derived RTP score include one or more of: a beats per minute histogram, energy in a frequency band, and mel-frequency cepstrum coefficients. In the embodiment, wherein the low-level data contributing to a determination of the T score of each computer-derived RTP score include one or more of: Shannon entropy, a beats per minute histogram, and mel-frequency cepstrum coefficients. In the embodiment, wherein the low-level data contributing to a determination of the P score of each computer-derived RTP score include one or more of: a weighted mean of frequencies as a measure of a spectral centroid, Shannon entropy, and a beats per minute histogram.
In the embodiment, wherein analyzing includes converting the low-level data to mel-frequency cepstrum coefficient (MFCC) data; inputting the MFCC data to a neural network trained to extract the MFCC data and to combine the MFCC data with additional data to identify one RTP score for each piece of music, wherein the neural network is trained based on known associations between MFCC data and RTP scores. In the embodiment, wherein the additional data includes audio data augmentation data.
In an embodiment, a method for categorizing music comprises creating a sample set that includes a RTP score for a plurality of possible combinations of a rhythm score (R), a texture score (T), and a pitch score (P) respectively from a R range, a T range, and a P range, at least some of which RTP scores each correspond to a human-determined RTP score for a predetermined piece of music among a plurality of predetermined pieces of music, each RTP score corresponding to a category among a plurality of categories; extracting low-level data from each piece of music among a plurality of pieces of music to be RTP scored; converting the low-level data to mel-frequency cepstrum coefficient (MFCC) data; inputting the MFCC data to a neural network trained to extract the MFCC data and identify one RTP score for each piece of music, wherein the neural network is trained based on the sample set, each identified RTP score corresponding to one RTP score in the sample set; utilizing the identified RTP scores for each piece of music to determine a corresponding category for each piece of music among the plurality of categories; and creating a playlist based on pieces of music corresponding to one or more categories among the plurality of categories.
In the embodiment, wherein the neural net is further trained to combine the MFCC data with audio data augmentation data to identify the one RTP score for each piece of music.
In an embodiment, a method for categorizing music comprises creating a sample set that includes a RTP score for a plurality of possible combinations of a rhythm score (R), a texture score (T), and a pitch score (P) respectively from a R range, a T range, and a P range, wherein at least some of the RTP scores each correspond to a human-determined RTP score for a predetermined piece of music among a plurality of predetermined pieces of music, wherein each predetermined piece of music has a unique music identification code, wherein each unique music identification code corresponds to one RTP score, and wherein each RTP score corresponds to a category among a plurality of categories; identifying a music identification code associated with each piece of music among a plurality of pieces of music to be RTP scored; determining if the music identification code for each piece of music to be RTP scored matches any of the unique music identification codes corresponding to the predetermined pieces of music; when the music identification code matches one unique identification code among the unique identification codes, identifying the one RTP score as the RTP score corresponding to the matched unique music identification code; utilizing the one RTP score for each piece of music to determine a corresponding category for each piece of music among the plurality of categories; and creating a playlist based on pieces of music corresponding to one or more categories among the plurality of categories.
Conditional language used herein, such as, among others, “can,” “could,” “might,” “may,” “e.g.,” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain examples include, while other examples do not include, certain features, elements, and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more examples or that one or more examples necessarily include logic for deciding, with or without author input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular example. The terms “comprising,” “including,” “having,” and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations, and so forth. Also, the term “or” is used in its inclusive sense (and not in its exclusive sense) so that when used, for example, to connect a list of elements, the term “or” means one, some, or all of the elements in the list.
In general, the various features and processes described above may be used independently of one another, or may be combined in different ways. All possible combinations and subcombinations are intended to fall within the scope of this disclosure. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from the disclosed examples. The example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed examples.
While certain illustrative examples have been described, these examples have been presented by way of example only, and are not intended to limit the scope of the subject matter disclosed herein. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of certain of the subject matter disclosed herein.
This application is a continuation-in-part of U.S. patent application Ser. No. 15/868,902, filed Jan. 11, 2018; which is a continuation-in-part of U.S. patent application Ser. No. 14/671,973, filed Mar. 27, 2015, now U.S. Pat. No. 9,875,304, issued Jan. 23, 2018; which is a continuation-in-part of U.S. patent application Ser. No. 14/603,324, filed Jan. 22, 2015, now U.S. Pat. No. 10,061,476, issued Aug. 28, 2018; and is a continuation-in-part of U.S. patent application Ser. No. 14/603,325, filed Jan. 22, 2015; both of which are continuations-in-part of U.S. patent application Ser. No. 13/828,656, filed Mar. 14, 2013, now U.S. Pat. No. 9,639,871, issued May 2, 2017; the entire contents of each of which are incorporated herein by reference. U.S. patent application Ser. Nos. 14/603,324 and 14/603,325 both claim benefit under 35 U.S.C. § 119(e) of Provisional U.S. Patent Application No. 61/930,442, filed Jan. 22, 2014, and of Provisional U.S. Patent Application No. 61/930,444, filed Jan. 22, 2014, the entire contents of each of which are incorporated herein by reference. U.S. patent application Ser. No. 14/671,973 also claims benefit under 35 U.S.C. § 119(e) of Provisional Application No. 61/971,490, filed Mar. 27, 2014, the entire contents of which are incorporated herein by reference. U.S. patent application Ser. No. 15/868,902 is also a continuation-in-part of U.S. patent application Ser. No. 14/671,979, filed Mar. 27, 2015, the entire contents of which are incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
5561649 | Lee et al. | Oct 1996 | A |
6151571 | Pertrushin | Nov 2000 | A |
6411289 | Zimmerman | Jun 2002 | B1 |
6464585 | Miyamoto et al. | Oct 2002 | B1 |
6539395 | Gjerdingen et al. | Mar 2003 | B1 |
6657117 | Weare et al. | Dec 2003 | B2 |
6748395 | Picker et al. | Jun 2004 | B1 |
6964023 | Maes et al. | Nov 2005 | B2 |
6993532 | Platt et al. | Jan 2006 | B1 |
7022907 | Lu et al. | Apr 2006 | B2 |
7024424 | Platt et al. | Apr 2006 | B1 |
7080324 | Nelson et al. | Jul 2006 | B1 |
7115808 | Lu et al. | Oct 2006 | B2 |
7205471 | Looney et al. | Apr 2007 | B2 |
7206775 | Kaiser et al. | Apr 2007 | B2 |
7227074 | Ball | Jun 2007 | B2 |
7233684 | Fedorovskaya et al. | Jun 2007 | B2 |
7296031 | Platt et al. | Nov 2007 | B1 |
7396990 | Lu et al. | Jul 2008 | B2 |
7424447 | Fuzell-Casey et al. | Sep 2008 | B2 |
7427708 | Ohmura | Sep 2008 | B2 |
7485796 | Myeong et al. | Feb 2009 | B2 |
7541535 | Ball | Jun 2009 | B2 |
7562032 | Abbosh et al. | Jul 2009 | B2 |
7582823 | Kim et al. | Sep 2009 | B2 |
7626111 | Kim et al. | Dec 2009 | B2 |
7756874 | Hoekman et al. | Jul 2010 | B2 |
7765491 | Cotterill | Jul 2010 | B1 |
7786369 | Eom et al. | Aug 2010 | B2 |
7809793 | Kimura et al. | Oct 2010 | B2 |
7822497 | Wang | Oct 2010 | B2 |
7858868 | Kemp et al. | Dec 2010 | B2 |
7921067 | Kemp et al. | Apr 2011 | B2 |
8013230 | Eggink | Sep 2011 | B2 |
8229935 | Lee et al. | Jul 2012 | B2 |
8248436 | Kemp et al. | Aug 2012 | B2 |
8260778 | Ghatak | Sep 2012 | B2 |
8269093 | Naik et al. | Sep 2012 | B2 |
8346801 | Hagg et al. | Jan 2013 | B2 |
8354579 | Park et al. | Jan 2013 | B2 |
8390439 | Cruz-Hernandez et al. | Mar 2013 | B2 |
8407224 | Bach et al. | Mar 2013 | B2 |
8410347 | Kim et al. | Apr 2013 | B2 |
8505056 | Cannistraro et al. | Aug 2013 | B2 |
8686270 | Eggink et al. | Apr 2014 | B2 |
8688699 | Eggink et al. | Apr 2014 | B2 |
8855798 | Dimaria et al. | Oct 2014 | B2 |
8965766 | Weinstein et al. | Feb 2015 | B1 |
9165255 | Shetty et al. | Oct 2015 | B1 |
9195649 | Neuhauser et al. | Nov 2015 | B2 |
9788777 | Knight et al. | Oct 2017 | B1 |
9830896 | Wang et al. | Nov 2017 | B2 |
9842146 | Chen et al. | Dec 2017 | B2 |
20030045953 | Weare et al. | Mar 2003 | A1 |
20030133700 | Uehara et al. | Jul 2003 | A1 |
20030221541 | Platt et al. | Dec 2003 | A1 |
20050065781 | Tell et al. | Mar 2005 | A1 |
20050109194 | Gayama et al. | May 2005 | A1 |
20050109195 | Haruyama et al. | May 2005 | A1 |
20050211071 | Lu et al. | Sep 2005 | A1 |
20050234366 | Heinz et al. | Oct 2005 | A1 |
20050241465 | Goto et al. | Nov 2005 | A1 |
20050252362 | McHale | Nov 2005 | A1 |
20060047649 | Liang et al. | Mar 2006 | A1 |
20060096447 | Weare et al. | May 2006 | A1 |
20060143647 | Bill et al. | Jun 2006 | A1 |
20060170945 | Bill et al. | Aug 2006 | A1 |
20070079692 | Glatt et al. | Apr 2007 | A1 |
20070107584 | Kim et al. | May 2007 | A1 |
20070113725 | Oliver et al. | May 2007 | A1 |
20070113726 | Oliver et al. | May 2007 | A1 |
20070131096 | Lu et al. | Jun 2007 | A1 |
20080021851 | Alcalde et al. | Jan 2008 | A1 |
20080040362 | Aucouturier et al. | Feb 2008 | A1 |
20080184167 | Berrill et al. | Jul 2008 | A1 |
20080189754 | Yoon et al. | Aug 2008 | A1 |
20080235284 | Aarts et al. | Sep 2008 | A1 |
20080253695 | Sano et al. | Oct 2008 | A1 |
20080300702 | Gomez et al. | Dec 2008 | A1 |
20080314228 | Dreyfuss et al. | Dec 2008 | A1 |
20090069914 | Kemp et al. | Mar 2009 | A1 |
20090071316 | Oppenheimer et al. | Mar 2009 | A1 |
20090182736 | Ghatak et al. | Jul 2009 | A1 |
20090234888 | Holmes et al. | Sep 2009 | A1 |
20100011388 | Bull et al. | Jan 2010 | A1 |
20100042932 | Lehtiniemi et al. | Feb 2010 | A1 |
20100053168 | Kemp et al. | Mar 2010 | A1 |
20100086204 | Lessing et al. | Apr 2010 | A1 |
20100091138 | Nair et al. | Apr 2010 | A1 |
20100094441 | Mochizuki et al. | Apr 2010 | A1 |
20100223128 | Dukellis et al. | Sep 2010 | A1 |
20100223223 | Sandler et al. | Sep 2010 | A1 |
20100253764 | Sim et al. | Oct 2010 | A1 |
20100260363 | Glatt et al. | Oct 2010 | A1 |
20100325135 | Chen et al. | Dec 2010 | A1 |
20110112671 | Weinstein et al. | May 2011 | A1 |
20110184539 | Agevik et al. | Jul 2011 | A1 |
20110191674 | Rawley et al. | Aug 2011 | A1 |
20110202567 | Bach et al. | Aug 2011 | A1 |
20110239137 | Bill et al. | Sep 2011 | A1 |
20110242128 | Kang et al. | Oct 2011 | A1 |
20110252947 | Eggink et al. | Oct 2011 | A1 |
20110252951 | Leavitt et al. | Oct 2011 | A1 |
20110271187 | Sullivan et al. | Nov 2011 | A1 |
20110289075 | Nelson et al. | Nov 2011 | A1 |
20110314039 | Zheleva et al. | Dec 2011 | A1 |
20120090446 | Moreno et al. | Apr 2012 | A1 |
20120132057 | Kristensen et al. | May 2012 | A1 |
20120172059 | Kim et al. | Jul 2012 | A1 |
20120179693 | Knight et al. | Jul 2012 | A1 |
20120179757 | Jones et al. | Jul 2012 | A1 |
20120197897 | Knight et al. | Aug 2012 | A1 |
20120226706 | Choi et al. | Sep 2012 | A1 |
20120260789 | Ur et al. | Oct 2012 | A1 |
20120272185 | Dodson et al. | Oct 2012 | A1 |
20120296908 | Bach et al. | Nov 2012 | A1 |
20130032023 | Pulley et al. | Feb 2013 | A1 |
20130086519 | Fino et al. | Apr 2013 | A1 |
20130138684 | Kim et al. | May 2013 | A1 |
20130167029 | Friesen et al. | Jun 2013 | A1 |
20130173526 | Wong et al. | Jul 2013 | A1 |
20130178962 | DiMaria et al. | Jul 2013 | A1 |
20130204878 | Kim et al. | Aug 2013 | A1 |
20130205223 | Gilbert et al. | Aug 2013 | A1 |
20130247078 | Nikankin et al. | Sep 2013 | A1 |
20140052731 | Dahule et al. | Feb 2014 | A1 |
20140053710 | Serletic, II | Feb 2014 | A1 |
20140053711 | Serletic, II | Feb 2014 | A1 |
20140080606 | Gillet et al. | Mar 2014 | A1 |
20140085181 | Roseway et al. | Mar 2014 | A1 |
20140094156 | Uusitalo et al. | Apr 2014 | A1 |
20140140536 | Serletic, II | May 2014 | A1 |
20140180673 | Neuhauser et al. | Jun 2014 | A1 |
20140282237 | Fuzell-Casey et al. | Sep 2014 | A1 |
20140310011 | Biswas et al. | Oct 2014 | A1 |
20140372080 | Chu | Dec 2014 | A1 |
20150078583 | Ball et al. | Mar 2015 | A1 |
20150081064 | Ball et al. | Mar 2015 | A1 |
20150081065 | Ball et al. | Mar 2015 | A1 |
20150081613 | Ball et al. | Mar 2015 | A1 |
20150134654 | Fuzell-Casey | May 2015 | A1 |
20150179156 | Uemura et al. | Jun 2015 | A1 |
20150205864 | Fuzell-Casey et al. | Jul 2015 | A1 |
20150220633 | Fuzell-Casey et al. | Aug 2015 | A1 |
20160110884 | Fuzell-Casey et al. | Apr 2016 | A1 |
20160125863 | Henderson | May 2016 | A1 |
20160203805 | Strachan | Jul 2016 | A1 |
20160329043 | Kim et al. | Nov 2016 | A1 |
20160372096 | Lyske | Dec 2016 | A1 |
20170091983 | Sebastian et al. | Mar 2017 | A1 |
20170103740 | Hwang et al. | Apr 2017 | A1 |
20170206875 | Hwang et al. | Jul 2017 | A1 |
20170330540 | Quattro et al. | Nov 2017 | A1 |
20180033416 | Neuhauser et al. | Feb 2018 | A1 |
20180047372 | Scallie et al. | Feb 2018 | A1 |
20180049688 | Knight et al. | Feb 2018 | A1 |
20180053261 | Hershey | Feb 2018 | A1 |
Entry |
---|
www.picitup.com; Picitup's; PicColor product; copyright 2007-2010; accessed Feb. 2, 2015; 1 page. |
http://labs.tineye.com; Multicolor; Idee Inc.; copyright 2015; accessed Feb. 2, 2015, 1 page. |
http://statisticbrain.com/attention-span-statistics/; Statistics Brain; Statistic Brain Research Institute; accessed Feb. 2, 2015; 4 pages. |
Dukette et al.; “The Essential 20: Twenty Components of an Excellent Health Care Team”; RoseDog Books; 2009; p. 72-73. |
Music Genome Project; http://en.wikipedia.org/wiki/Music_Genome_Project; accessed Apr. 15, 2015; 4 pages. |
Ke et al.; “Computer Vision for Music Identification”; In Proceedings of Computer Vision and Pattern Recognition; 2005; vol. 1; p. 597-604. |
Lalinsky; “How does Chromaprint work?”; https://oxygene.sk/2011/01/how-does-chromaprint-work; Jan. 18, 2011; accessed Apr. 15, 2015; 3 pages. |
Gforce; http://www.soundspectrum.com; copyright 2015; accessed Apr. 15, 2015; 2 pages. |
Harlan J. Brothers; “Intervallic Scaling in the Bach Cello Suites”; Fractals; vol. 17 Issue 4; Dec. 2009; p. 537-545. |
Ke et al.; “Computer Vision for Music Identification”; IEEE Computer Vision and Pattern Recognition CVPR; Jun. 2005; 8 pages. |
Number | Date | Country | |
---|---|---|---|
20190199781 A1 | Jun 2019 | US |
Number | Date | Country | |
---|---|---|---|
61971490 | Mar 2014 | US | |
61930442 | Jan 2014 | US | |
61930444 | Jan 2014 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 14603325 | Jan 2015 | US |
Child | 13828656 | | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 15868902 | Jan 2018 | US |
Child | 16292193 | | US |
Parent | 14671979 | Mar 2015 | US |
Child | 15868902 | | US |
Parent | 14671973 | Mar 2015 | US |
Child | 14671979 | | US |
Parent | 13828656 | Mar 2013 | US |
Child | 14671973 | | US |
Parent | 14603324 | Jan 2015 | US |
Child | 13828656 | | US |
Parent | 13828656 | Mar 2013 | US |
Child | 14603324 | | US |