The invention relates to a method of generating a user profile of a user of a device for processing data representative of items of content, a respective recording of at least one perceptible content element being associated with each item of content, including
determining a set containing a plurality of recordings of at least one perceptible content element, each associated with one of a plurality of items of content associated with the user, and
generating data representative of a user profile indicating preferences of the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, such that at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element.
The invention also relates to a method of filtering a user profile of a user of a device for processing data representative of items of content, a respective recording of at least one perceptible content element being associated with each item of content, including
determining a set containing at least one recording of at least one perceptible content element, respectively associated with items of content associated with the user,
generating data representative of a user profile indicating preferences of the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, such that at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element, and transmitting the data representative of the user profile to a second device arranged to determine, using a similarity metric, whether any of one or more further profiles including a set of one or more parameter values within the feature space match the user profile.
The invention also relates to a method of filtering a user profile expressing preferences of a user of a device for processing data representative of items of content having respective recordings of perceptible content elements associated with them, including generating data representative of a user profile indicating preferences of the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, such that at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element, and determining, using a similarity metric, whether any of one or more further profiles including a set of one or more parameter values within the feature space match the user profile.
The invention also relates to a method of filtering a user profile expressing preferences of a user of a device for processing data representative of items of content having respective recordings of perceptible content elements associated with them, wherein
a second device receives from the device through a data link data representative of a user profile indicating preferences of the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, wherein at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element, and wherein
the second device retrieves one or more further profiles including a set of one or more parameter values within the feature space and determines, using a similarity metric, whether any of the retrieved profiles matches the user profile.
The invention also relates to a system for generating a user profile, configured to execute a method of generating a user profile according to the invention.
The invention also relates to a system for filtering a user profile, configured to execute a method of filtering a user profile according to the invention.
The invention also relates to a computer program.
In U.S. Pat. No. 6,545,209, systems and methods are described that associate closely related and/or similarly situated media entities with each other using inherent media entity characteristics. The songs from a database are classified according to digital signal processing techniques. Exemplary classifications for songs include, inter alia, tempo, sonic, melodic movement and musical consonance characterisations. The quantitative machine classification and qualitative human classifications for a given piece of media are placed into what is referred to as a classification chain. The technique maps a pre-defined parameter space to a psychoacoustic perceptual space defined by musical experts. This mapping enables content-based searching of media. Playlists may be generated, for example, from a single song and/or a user preference profile in accordance with an appropriate analysis and matching algorithm performed on the data store of the database. Nearest neighbour and/or other matching algorithms may be utilised to locate songs that are similar to the single song and/or are suited to the user profile. In the case of a song as input, first, a DSP analysis of the input song is performed to determine the attributes, qualities, likelihood of success, etc. of the song.
A problem of the known technique is that a user with broad tastes who requires a playlist matching the entire spectrum of his or her preferences must either submit a plurality of songs to generate a plurality of playlists, or resort to the less precise method of submitting a user profile describing subjective aspects of his or her preferred songs. The former option is relatively inefficient, whereas the latter may require repeated attempts, owing to the inherent inaccuracy of classifying favourite songs according to subjective aspects.
It is an object of the invention to provide a method of generating a user profile and methods of filtering a user profile, as well as corresponding systems and a computer program, of the types mentioned above, that permit relatively accurate and efficient filtering.
This object is achieved according to one aspect by the method of generating a user profile according to the invention, which is characterised in that at least one set of parameter values, each set containing at least one parameter value and quantifying a dimension of the feature space in the user profile, is obtained by applying a pre-determined analysis algorithm to each of a plurality of signals, each signal representing at least a section of a recording of at least one perceptible content element, such that the set of parameter values is based on a plurality of the recordings in the set of recordings.
Because the at least one set of parameter values is obtained by applying an analysis algorithm to each of the signals representing at least sections of the recordings of at least one perceptible content element, an objective measure of at least one property of the recording is obtained. Because the analysis algorithm is pre-determined, comparisons with parameter values generated by another entity using the same algorithm can accurately and predictably identify recordings or sets of recordings with similar properties to those on which the set of parameter values is based. Because the analysis is applied to each of a plurality of signals, such that the set of parameter values is based on a plurality of the recordings in a set of recordings associated with the user, the preferences of the user as expressed in the generated user profile are based on several preferred recordings of at least one perceptible content element. Thus, when such a user profile is filtered, i.e. matched against profiles characterising other users' preferences or characterising other recordings of a perceptible content element, the result is prevented from being based on a single recording that is an anomaly within the spectrum of the user's tastes. In the present context, a perceptible content element is assumed to be a sound element or a visual element of audiovisual content comprising one or both of such elements. Examples include the audio track in a music file, the audio track accompanying a movie, a sequence of images in a movie, the visual information in a picture file, etc.
In an embodiment, the data representative of the user profile includes data representative of the distribution over the plurality of recordings in the set of recordings of parameter values obtained by applying the pre-determined analysis algorithm to the signals representing at least sections of the respective recordings.
Thus, the user profile becomes suitable for indicating the extent of the user's taste with respect to the content. It becomes more accurate, allowing for better and thus faster searching for other users' tastes or for recordings of perceptible content elements or content items matching the preferences expressed in the user profile.
An embodiment includes applying a clustering algorithm to the parameter values obtained by applying the pre-determined analysis algorithm to the signals representing at least sections of the respective recordings, and including in the data representative of the user profile data indicating the position of the clusters along at least one quantified dimension of the feature space.
In an embodiment, the data representative of the user profile includes data indicating the strength of each cluster.
Thus, the significance of overlaps between clusters identified in two different user profiles can be assessed. Where, for example, of two clusters in a first user profile, one overlaps with a cluster indicated in a second user profile and one overlaps with a cluster indicated in a third user profile, the matches between the first and second and between the first and third user profile can be ranked.
In an embodiment the data representative of the user profile includes data indicating the spread of each cluster along at least one quantified dimension of the feature space.
This is useful to determine whether a particular preference of the user is marked or not. The similarity metric can advantageously be based on the area of overlap between clusters identified in different user profiles, rather than relying only on distances between the centres of clusters.
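By way of illustration only, an overlap-based similarity along a single quantified dimension might be sketched as follows. The `Cluster` type, the interpretation of the spread as a half-width about the centre, and the normalisation by the narrower cluster are assumptions made for this sketch; the description does not prescribe a particular formula.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical 1-D cluster summary taken from a user profile."""
    centre: float  # position along one quantified dimension
    spread: float  # assumed here to be the half-width about the centre


def overlap_length(a: Cluster, b: Cluster) -> float:
    """Length of the interval shared by the two clusters (0.0 if disjoint)."""
    lo = max(a.centre - a.spread, b.centre - b.spread)
    hi = min(a.centre + a.spread, b.centre + b.spread)
    return max(0.0, hi - lo)


def overlap_score(a: Cluster, b: Cluster) -> float:
    """Overlap relative to the width of the narrower cluster, in [0, 1]."""
    narrower = 2.0 * min(a.spread, b.spread)
    if narrower == 0.0:
        return 0.0
    return overlap_length(a, b) / narrower
```

Two clusters whose centres are equally far apart can thus score differently depending on how wide they are, which a purely centre-to-centre distance would not capture.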
In an embodiment, at least one further set of parameter values, each set containing at least one parameter value and quantifying at least one further dimension of the feature space, is obtained by applying at least one pre-determined analysis algorithm to each of a plurality of signals, each signal representing at least a section of a recording of at least one perceptible content element in the set of recordings,
at least one measure of a correlation between parameter values quantifying different dimensions of the feature space is determined, and
data representative of the determined measures is included among the data representative of the user profile.
This embodiment thus determines relations between dimensions. This is useful for identifying particular styles characterised by such a relation. The user profile and/or comparison of the user profile may also be made more efficient in cases where there is a marked relation between two dimensions. Data quantifying one of the two may be omitted from the user profile or comparison.
An embodiment of the method includes determining the items of content associated with the user on the basis of a user's selection of the items of content for processing by the user device.
Such an implicit selection makes the method unobtrusive and efficient, since separate selection options need not be provided in the user device.
According to another aspect of the invention a method of filtering a user profile of a user of a device for processing data representative of items of content, a respective recording of at least one perceptible element being associated with each item of content, is provided. The method includes
determining a set containing at least one recording of at least one perceptible content element, respectively associated with a user's favourite items of content associated with the user,
generating data representative of a user profile indicating preferences of the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, such that at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element, and transmitting the data representative of the user profile to a second device arranged to determine, using a similarity metric, whether any of one or more further profiles including a set of one or more parameter values within the feature space match the user profile, and is characterised in that
at least one set of at least one parameter value quantifying a dimension of the feature space in the user profile is obtained by applying a pre-determined analysis algorithm to at least one signal representing at least a section of a recording of at least one perceptible content element in the set of recordings.
Because at least one set of at least one parameter value quantifying a dimension of the feature space in the user profile is obtained by applying an analysis algorithm to at least one signal representing at least a section of a recording of at least one perceptible content element in the set of recordings, an objective measure of at least one property of the recording of at least one perceptible content element is obtained. Because the analysis algorithm is pre-determined, comparisons with parameter values generated by another entity using the same algorithm can accurately and predictably identify recordings or sets of recordings with similar properties to those of the recordings on which the set of parameter values is based. Because the set of at least one parameter value is included in the data representative of the user profile, and is transmitted to a second device, the method is more efficient than would be the case if the entire section of the recording were to be transmitted. Due to the increased accuracy and objective nature of the parameter values, the likelihood of a good match at the first try is increased, decreasing the amount of data exchanged between the user's device and the second device.
According to another aspect of the invention, a method of filtering a user profile expressing preferences of a user of a device for processing data representative of items of content having respective recordings of at least one perceptible content element associated with them is provided, including
generating data representative of a user profile associating preferences of the user with the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, such that at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element, and determining, using a similarity metric, whether any of one or more further profiles including a set of one or more parameter values within the feature space match the user profile, wherein the data representative of the user profile is generated by applying a method of generating a user profile according to the invention.
According to another aspect of the invention, a method of filtering a user profile expressing preferences of a user of a device for processing data representative of items of content having respective recordings of at least one perceptible content element associated with them is provided, wherein
a second device receives from the device through a data link data representative of a user profile associating preferences of the user with the user, wherein the data representative of the user profile includes one or more parameter values within an at least one-dimensional feature space, each dimension representing a property of at least a section of a recording of a perceptible content element, wherein at least one of the dimensions in the feature space represents a quantifiable property of at least a section of a recording of a perceptible content element, and wherein
the second device retrieves one or more further profiles including a set of one or more parameter values within the feature space and determines, using a similarity metric, whether any of the retrieved profiles matches the user profile, characterised in that the second device receives and determines whether any of the retrieved profiles matches a user profile in which at least one of the parameter values is obtainable by applying a pre-determined analysis algorithm to at least one signal representing at least a section of a recording in the set of recordings.
Because at least one set of at least one parameter value quantifying a dimension of the feature space in the user profile is obtained by applying an analysis algorithm to at least one signal representing at least a section of a recording of a perceptible content element in the set of recordings, an objective measure of at least one property of the recording is obtained. Because the analysis algorithm is pre-determined, comparisons with parameter values generated by another entity using the same algorithm can accurately and predictably identify recordings or sets of recordings with similar properties to those on which the set of parameter values is based. Because the set of at least one parameter value is included in the data representative of the user profile, and is received and used to determine matching profiles, the method is more efficient than would be the case if the entire section of the recording were to be received and then analysed. Due to the increased accuracy and objective nature of the parameter values, the likelihood of a good match at the first try is increased, decreasing the amount of data received and processed by the second device.
An embodiment, wherein the further profiles are formed by further user profiles, each expressing the preferences of a further user of a device for processing data representative of items of content having respective recordings of at least one perceptible content element associated with them, includes identifying further users having a further user profile matching the user profile.
Thus, an efficient and accurate way of matching users with similar tastes is provided, which does not rely only on subjective descriptions of the users' tastes as provided by them.
An embodiment includes, for each further user having a further user profile matching the user profile, retrieving a list of content items selected by that further user, and
generating a list of recommended content items for the user, wherein the recommended content items are selected from the retrieved lists of content items.
Thus, an efficient, accurate and more effective way of collaborative filtering is provided. The method is efficient and accurate, because it is based on a comparison of parameter values obtainable by applying a pre-determined analysis algorithm to recordings associated with the respective users. It is more effective than conventional collaborative filtering methods, because it does not rely only on an identification of the items of content that have been selected by several users. Thus, the items selected by users with similar tastes to the target user but who have no selected item of content in common with the target user are also considered for recommendation to the target user. Because the method does not rely only on subjective rankings provided by the users, the recommendations are more accurately tuned to the target user's tastes.
According to another aspect of the invention, the system for generating a user profile is configured to execute a method of generating a user profile according to the invention.
According to another aspect of the invention, the system for filtering a user profile is configured to execute a method of filtering a user profile according to the invention.
According to another aspect of the invention, there is provided a computer program including a set of instructions capable, when incorporated in a machine-readable medium, of causing a system having information processing capabilities to perform a method according to the invention.
The invention will now be explained in further detail with reference to the accompanying drawings, in which:
The methods outlined herein find application in all situations where it is desirable to express the musical tastes of one or more persons in an efficient and accurate manner. They are especially useful where this expression of musical tastes is to be communicated between devices. Although the methods employed differ slightly per application, the generation of a user profile is common to all applications.
In the following, an example will be described wherein the user profile is based on audio tracks, and usable to find audio tracks matching the tastes expressed in the user profile. However, the user profile could also be used to express preferences for items of content of a different type but having audio tracks associated with them. An example would be a video file, wherein the audio track expresses particular moods. The principles of the method are also applicable where a type of recording of a perceptible content element other than an audio track is analysed.
The method illustrated in
The method illustrated in
A feature vector (not shown) is generated (steps 4,5) for each of the audio tracks in the set 3. The feature vector contains a number of elements, each consisting of a parameter value that quantifies a dimension of a multi-dimensional feature space. The multi-dimensional feature space describes perceptually important properties of an audio track. Each value in the feature vector associated with a particular audio track is obtained by applying a pre-determined analysis algorithm to a signal representing at least a section of that particular audio track. In certain embodiments, several signals, each based on a different section of the audio track are analysed, so that different values in the feature vector relate to different sections.
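The generation of a feature vector from a signal can be sketched, purely by way of example, as below. The two properties computed here (root-mean-square energy and zero-crossing rate) are simple stand-ins chosen for this sketch and are not the analysis algorithms of the described method; the function names and the representation of the signal as a list of decoded sample values are likewise assumptions.

```python
import math
from typing import List, Sequence


def rms_energy(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of the section."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )
    return crossings / (len(samples) - 1)


def feature_vector(samples: Sequence[float]) -> List[float]:
    """One parameter value per dimension of a (here two-dimensional) feature space."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return [rms_energy(samples), zero_crossing_rate(samples)]
```

Because the same fixed functions are applied to every track, vectors computed on different devices remain directly comparable.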
The use of a computational method based on a pre-determined analysis algorithm ensures that the feature vector is an objective characterisation of perceptual properties of at least a section of the audio track concerned. It is more compact than the entire set of data encoding the audio track. Depending on the implementation, the analysis algorithm may take PCM (Pulse-Code Modulation) values, DCT (Discrete Cosine Transform) coefficients or any other convenient form of encoded audio file as input.
Suitable analysis algorithms for quantifying perceptually important properties of an audio track are known as such. For this reason, they are not described in any great detail herein. One example is described in Klapuri et al., “Analysis of the Meter of Acoustic Musical Signals”, IEEE Trans. Speech and Audio Proc. This article describes a method which analyses the meter of acoustic musical signals at the tactus, tatum and measure levels, which correspond to different time scales. The result can be used, for example, to identify the genre of music (classical, jazz, etc.). Another example of an algorithm that can be used in one of steps 4,5 is presented in Scheirer, E. D., “Tempo and beat analysis of acoustic musical signals”, J. Acoust. Soc. Am. 103 (1), January 1998. A further possibility is to model an audio track or section of an audio track using Mel Frequency Cepstral Coefficients, as employed also in speech recognition algorithms.
In
In a simple implementation, a user profile is generated on the basis of only one audio track, termed a seed song. This user profile can be transferred from a first device to a second device arranged to determine, using a similarity metric, whether any of one or more further feature vectors, each including parameter values determining a point in the same multi-dimensional feature space, match the user profile generated from the seed song. A distance metric is used for this purpose. One useful application, for example, is to determine whether another mobile music player has further audio tracks by the same artist stored therein, in the absence of tags identifying the audio tracks as being by that artist. Another useful application is to compile collections of audio tracks that capture a particular mood.
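A minimal sketch of such seed-song matching is given below, assuming a Euclidean distance and a fixed threshold. Both the choice of metric and the names `matching_tracks` and `max_distance` are assumptions for illustration; the description only requires that some distance metric be used.

```python
import math
from typing import Dict, List, Sequence


def matching_tracks(
    seed: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    max_distance: float,
) -> List[str]:
    """Return identifiers of candidate tracks within max_distance of the seed.

    Results are ordered from nearest to farthest in the feature space.
    """
    scored = []
    for track_id, vector in candidates.items():
        distance = math.dist(seed, vector)  # Euclidean distance (Python 3.8+)
        if distance <= max_distance:
            scored.append((distance, track_id))
    return [track_id for _, track_id in sorted(scored)]
```

In practice the dimensions would probably need to be scaled to comparable ranges before a Euclidean distance is meaningful; that step is omitted here.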
In the example illustrated in
Data clustering algorithms are known as such. The term refers to a partitioning of a data set into subsets (clusters) such that the data in each subset share some common trait, in particular proximity according to a pre-defined distance measure. Two types of data clustering exist and are suitable for implementing the step 8 illustrated in
Considering
Preferably, data indicating the strength of each cluster 10-12 is also included in the table 9 identifying the clusters 10-12. This could be a normalised count of the number of feature vectors in the cluster, for example. Clustering has the effect of separately identifying disparate preferences. Such information is lost if only an average of all feature vectors is taken. By also indicating the strength of each cluster, a ranking between different styles, artists, genres, etc. can be made. This is especially useful where the selection in step 2 is an implicit selection. If a user selects predominantly rock music and occasionally classical music, then this will be indicated in the user profile 7, which is generated by inserting data corresponding to some or all of the entries in the table 9 identifying the clusters 10-12 into the user profile 7 (step 13).
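The clustering of feature vectors and the derivation of a position, a spread and a strength for each cluster can be sketched as follows. The sketch assumes plain k-means with a fixed number of clusters, seeded deterministically from the first k vectors, and uses the per-dimension standard deviation as the spread; these choices, and the dictionary keys `centre`, `spread` and `strength`, are illustrative assumptions rather than the particular clustering algorithm of the embodiment.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def k_means(
    vectors: Sequence[Vector], k: int, iterations: int = 20
) -> Tuple[List[List[float]], List[List[Vector]]]:
    """Plain k-means; the first k vectors serve as initial centres."""
    centres = [list(v) for v in vectors[:k]]
    groups: List[List[Vector]] = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: math.dist(v, centres[i]))
            groups[nearest].append(v)
        for i, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centre
                centres[i] = [mean(column) for column in zip(*members)]
    return centres, groups


def cluster_table(vectors: Sequence[Vector], k: int) -> List[Dict[str, object]]:
    """Summarise each non-empty cluster by centre, spread and strength."""
    centres, groups = k_means(vectors, k)
    table = []
    for centre, members in zip(centres, groups):
        if not members:
            continue
        table.append({
            "centre": centre,
            "spread": [pstdev(column) for column in zip(*members)],
            "strength": len(members) / len(vectors),  # normalised count
        })
    return table
```

With four vectors near one point and two near another, the two table entries would carry strengths of roughly two thirds and one third, reflecting a predominant and an occasional preference.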
It is also useful to determine at least one measure of a correlation between parameter values quantifying the different dimensions X,Y of the feature space, so that data representative of the determined measures can be included among the data representative of the user profile 7. Preferably, a measure of a correlation is determined separately for each of the clusters 10-12, and added to the data indicating the position of the clusters. It will be apparent that the correlation is much stronger for a third cluster 12 than for the first two clusters 10-11. Thus, the measure of the correlation serves to characterise the clusters 10-12. Such a strong correlation could serve to identify a particular musical style, and thus a user's preference for this style.
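One possible measure of correlation is the Pearson coefficient, computed separately over the feature vectors of each cluster. The sketch below assumes this measure and a two-dimensional feature space; the function name `pearson` and the convention of returning 0.0 for a dimension without variation are choices made for the sketch only.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between the X and Y values of one cluster.

    Returns 0.0 when either dimension does not vary within the cluster.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return covariance / math.sqrt(var_x * var_y)
```

A value close to 1 or -1 indicates that the cluster lies along a line in the X,Y plane, whereas a value near 0 indicates no linear relation between the two dimensions.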
The user profile 7 further contains data suitable for identifying the user who has been profiled, so that the user is associated with the properties of the audio tracks in the set 3 of selected audio tracks. Besides parameter values quantifying a measurable property of an audio track, the user profile 7 will in some embodiments also include subjective information as provided by the user.
A use of the user profile 7 is illustrated in
First, a user profile 17 of the user of the first mobile music player 14 is generated in a step 18, and a user profile 19 of the user of the second mobile music player 15 is obtained through the communications link 16 in a further step 20. The second mobile music player 15 is configured to generate the user profile 19 in a step similar to the step 18 executed by the first mobile music player 14. The step 18 of generating the user profile 17 is implemented using the method illustrated in
An application of user profiles in a system for collaborative filtering is illustrated in
The server system 27 implements a method of filtering a user profile 29. In the illustrated embodiment, the server system 27 generates the user profile 29. In variants, the music players 23-26 may generate user profiles for their respective users and transmit them to the server system 27 for filtering.
The method of
In a first step 32, the user requesting a recommendation is identified. A list 33 identifying the audio tracks previously selected for download by the user is then obtained (step 34).
In a subsequent step 35, the user profile 29 is generated on the basis of the feature vectors obtained by applying analysis algorithms to the audio tracks identified in the list 33. Those feature vectors are clustered to generate the user profile 29. The user profile 29 is then compared (step 36) to each user profile in a set 37 of user profiles representing the preferences of users of the other music players 23-26.
The users for whom the matching user profiles have been generated are then identified (step 38), and lists 39 of audio files previously downloaded by them are selected (step 40). From these lists 39, a recommendation is made (step 41) for the user identified in the first step 32 (the target user). For example, the audio track appearing most often in the lists 39 may be suggested to the target user.
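The selection of a recommendation from the lists of matching users can be sketched as follows. Counting each track once per user, excluding tracks the target user has already downloaded, and breaking ties alphabetically are assumptions of this sketch; the function name `recommend` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def recommend(
    target_tracks: Iterable[str],
    matched_lists: Iterable[Iterable[str]],
    top_n: int = 1,
) -> List[str]:
    """Suggest the tracks appearing most often in the matching users' lists."""
    already_owned = set(target_tracks)
    counts: Counter = Counter()
    for tracks in matched_lists:
        # Count each track at most once per matching user.
        counts.update(set(tracks) - already_owned)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [track for track, _ in ranked[:top_n]]
```

Since the matching users were found through their profiles rather than through shared downloads, a suggestion can be made even when no matching user has a track in common with the target user.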
The effect of using the user profile 29 to identify other users with the same tastes and thence the lists 39 of audio files, is that a more consistent and precise match is obtained. Using a pre-determined analysis algorithm results in far less subjectivity across tracks and different collections of tracks, and will result in more consistent recommendations and comparisons. Because the dimensions in the feature space represent a quantifiable property of at least a section of an audio track, it is easier to calculate degrees of similarity between users' preferences. Also, a representation in terms of parameter values is relatively compact, at least more compact than the use of labels attached to the tracks by users. This in turn makes sharing of the data representative of the user profiles across different devices feasible.
It should be noted that the above-mentioned embodiments illustrate, rather than limit, the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word “comprising” does not exclude the presence of elements or steps other than those listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
For instance, the set 37 of user profiles illustrated in
Number | Date | Country | Kind |
---|---|---|---|
05110802.5 | Nov 2005 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB06/53977 | 10/27/2006 | WO | 00 | 5/13/2008 |