The invention relates to media playback systems. Particular aspects of the invention provide systems and methods for playing back audio tracks accessible to audio playback systems.
Audio playback systems may comprise data storage (e.g. solid state memory or hard drive memory) or may have access to external data storage (e.g. an optical CD) containing audio information (e.g. musical tracks). Audio playback systems may have the ability to acquire, store, maintain and play back such audio information. In typical audio playback systems, such audio information is provided in the form of files or the like (e.g. successive tracks on an audio CD). In some systems, such files may be organized hierarchically (e.g. in folders). In some systems, groups of files may be organized into “playlists”.
In conventional audio playback systems, tracks are played back in a predetermined sequential order. For example, the tracks on an audio CD may be played in the predetermined order in which they were recorded on the CD or the tracks in a playlist may be played back in the order determined by the playlist. Sequential playback may be undesirable because of its lack of variation. This drawback with sequential playback is particularly problematic where the playlist (e.g. a set of audio tracks) is looping on a frequent basis or many times over, such as in car stereo systems or in the background music systems of shopping centers and restaurants.
Conventional audio playback systems may also have a “random” playback mode. However, the random modes in conventional audio playback systems are typically oblivious to a set of audio tracks comprising different types of tracks. For example, an audio playback system may have access to a set of available audio tracks which includes some music tracks that are suitable for background music in a shopping mall (e.g. holiday music or music containing softer sounds) and some musical tracks that are not suitable for background music in a shopping mall (e.g. aggressive sounding music). Typically, the random playback modes of conventional audio playback devices do not discriminate between these types of tracks and a user is forced to create a playlist containing a subset of the available tracks.
Similarly, a user may be in the mood for a certain feel of music (e.g. music from related genres, music from related artists or music that is otherwise related), but does not want to sort through all of his or her hierarchically organized audio files to assemble a new playlist. For example, a person may want to listen to a mix of jazz and blues. Some audio playback systems provide the ability to play back tracks which have a particular artist or which have a particular genre. However, conventional audio playback systems do not provide the ability to automatically play back tracks from related genres or related artists without creating a completely new playlist.
Given the increasing volume of digital audio files, the increasing data storage capacities of modern audio playback systems and the ability of playback systems to access external audio files from sources such as the internet and the like, there is a general need for audio playback systems having improved ability to acquire, store, maintain and/or play back such audio information.
In drawings which show non-limiting embodiments of the invention:
Throughout the following description, specific details are set forth in order to provide a more thorough understanding of the invention. However, the invention may be practiced without these particulars. In other instances, well known elements have not been shown or described in detail to avoid unnecessarily obscuring the invention. Accordingly, the specification and drawings are to be regarded in an illustrative, rather than a restrictive, sense.
Particular aspects of the invention provide methods and apparatus for selecting a playback order of audio (or other media) tracks from a collection of accessible audio tracks. The methods and apparatus may be applied to selecting a playback order of audio tracks from a collection of different types of accessible audio tracks.
A user may interact with system 12 via input device 11 and output device 13. Input device 11 may comprise one or more of any suitable input device, such as a mouse, a keyboard, a series of buttons, a rolling input or the like, for example. Similarly, output device 13 may comprise one or more of any suitable output device, such as a flat screen display, an audio output device (e.g. speakers or headphones), a CRT monitor or the like for example. System 12 and/or software 16 may cause input device 11 and output device 13 to work together to provide a user interface 15 (e.g. a graphical and/or text-based user interface). In general, the invention disclosed herein should not be limited by the selection of data storage 18, input device 11 or output device 13. System 12 may comprise other components (not shown), such as amplifiers and the like, which are not germane to the present invention.
System 12 may be a stand-alone unit or may itself be a part of an external communication network (not shown), such as a local area communication network (LAN) or the internet, for example. External data storage 18B may be directly accessible by system 12 or may be accessible through such an external communication network. Software 16 may be executed by data processor 14 and may control how data processor 14 (and any other components of system 12) accesses data storage 18.
Data storage 18 is schematically depicted in
In the illustrated example of
Within the context of data storage 18, audio tracks 17 may be disorganized. By way of non-limiting example, audio tracks 17 may be stored in different directories or “folders”, audio tracks 17 may be stored on different data storage units (e.g. an optical disc drive and a magnetic hard drive), and audio tracks 17 may be stored in local data storage 18A and remote data storage 18B. In accordance with a particular embodiment of the invention shown in
As explained in more detail below, the links exiting a particular node may be assigned link strengths. Such link strengths may be based on similarities between the particular node and the other nodes in network 10. Such link strengths may be normalized such that the sum of the normalized link strengths exiting any given node is unity. In the illustrated embodiment of
Network 10 may be tangibly embodied as a plurality of related data entities which may be maintained and dynamically updated by system 12. Network 10 may be implemented in software or hardware or a combination of software and hardware. In specific embodiments, the data entities of network 10 may take the form of data structures. As described below, network 10 assists system 12 (and users of system 12) to manage audio tracks 17 contained in data storage 18. Users interact with network 10, and network 10 interacts with users, via user interface 15. For ease of explanation, network 10 may be conceptualized as a plurality of nodes A-F and links wAC, wCA . . . discussed herein.
In accordance with a particular embodiment of the invention, if system 12 plays back a particular audio track 17 corresponding to a particular node, then the probability of subsequently playing back a new audio track depends on the normalized link strength assigned to the link exiting the particular node and entering the node which represents the new audio track. For example, if system 12 plays back a particular audio track 17A corresponding to node A, the probability of subsequently playing back a new audio item 17B depends on the normalized link strength assigned to the link wAB exiting node A and entering node B.
Nodes A-F of network 10 have a one-to-one relationship with their corresponding audio tracks 17A-17F. Nodes A-F may be implemented as data structures. In some embodiments, the data structures associated with nodes A-F contain their corresponding audio tracks 17A-17F. Preferably, however, the data structures associated with nodes A-F contain information recognizable to system 12 and/or software 16 about how to access their corresponding audio tracks 17A-17F. By way of non-limiting example, such information may include: a uniform resource locator (URL); an internet protocol address; a directory path and filename; a memory address or the like. Information about how to access a particular audio track (e.g. audio track 17A) is referred to herein as a “pointer” to audio track 17A.
The concept of pointers is well understood by software engineers. Pointers may point to audio tracks 17 that reside in internal data storage 18A, to audio tracks 17 that reside in external data storage 18B and/or to audio tracks 17 that reside in part in internal data storage 18A and in part in external data storage 18B. In the case where pointers point to audio data that resides in external data storage 18B, such external data storage may be accessed via the internet or some other communication network.
Track metadata field 34 may itself comprise any number of sub-fields 34A, 34B . . . 34n. In the illustrated example, track metadata sub-field 34A represents the artist(s) that created the corresponding audio track, track metadata sub-field 34B represents the album from which the audio track came and track metadata sub-field 34n represents the genre(s) to which the track belongs. In some embodiments, one or more of these sub-fields 34A, 34B . . . 34n may comprise a vector list or the like having multiple entries. For example, an audio track may have a composer, a writer, and any number of performer(s) and each of these artists may be represented as an entry in a vector list incorporated into artist sub-field 34A. Similarly, an audio track may have multiple associated genres which may be represented as entries in a vector list incorporated into genre sub-field 34n. In some embodiments, one or more of these sub-fields 34A, 34B . . . 34n may themselves comprise sub-fields. For example, the genre(s) sub-field 34n may comprise a primary genre sub-field and one or more secondary genre sub-fields.
The metadata that is associated with an audio track 17 is not limited to the metadata shown in data structure 31. In general, data structure 31 may incorporate any suitable metadata into metadata field 34. Non-limiting examples of metadata include: title of the audio track; alternate titles; dates of writing, publication, recording and/or release of the track; ranking of the track on a “billboard chart” or similar popular music list; user ranking of the track; collaborative filter ranking of the track; information on revision of the track; and information relating to source materials used in the creation of the track.
In data structure 31, track audio data field 36 also comprises a number of sub-fields 36A, 36B . . . 36m. In the illustrated example, audio data sub-field 36A represents the track length, audio data sub-field 36B represents the track rhythmic properties (of which tempo is an example) and audio data sub-field 36m represents the track timbral properties. Audio data sub-fields 36A, 36B . . . 36m may also incorporate vector lists or sub-fields similar to those of metadata sub-fields 34A, 34B . . . 34n. The audio data that is associated with audio track 17 is not limited to the audio data shown in data structure 31. In general, data structure 31 may incorporate any suitable audio data into audio data field 36. Non-limiting examples of audio data include: bit rate of the audio track; encoding format of the audio track; a playback counter associated with the audio track; a last played time stamp relating to the audio track; audio track structural properties (e.g. an audio track may be segmented); and time dependent rhythmic and/or spectral properties. In embodiments where data item 17 comprises another type of media content (i.e. other than pure audio content), sub-fields 36A, 36B . . . 36m may comprise other types of media information.
The data used to populate the fields and sub-fields of data structure 31 may be obtained by, or otherwise provided to, network 10 via user input, via access to a communication network such as the internet, via accessing databases containing music information and/or by using audio analysis software, for example. In some cases, one or more properties of a data item 17 (e.g. metadata) may be associated with the data item 17 prior to the data item being added to network 10, such that system 12 and/or software 16 may obtain the properties when the data item is added (as a node) to network 10 and use these properties to populate the fields and sub-fields of data structure 31. The fields and sub-fields of data structure 31 need not be fully populated.
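By way of illustration only, a node data structure of the kind described above may be sketched as follows. The class and attribute names (`Node`, `TrackMetadata`, `TrackAudioData`, `tempo_bpm` and so on) and the example path are hypothetical and are not taken from the specification; unpopulated sub-fields are simply left empty.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrackMetadata:
    # Loosely mirrors sub-fields 34A, 34B ... 34n; lists allow multiple entries.
    artists: list[str] = field(default_factory=list)
    album: Optional[str] = None
    genres: list[str] = field(default_factory=list)


@dataclass
class TrackAudioData:
    # Loosely mirrors sub-fields 36A, 36B ... 36m; None marks an unpopulated sub-field.
    length_seconds: Optional[float] = None
    tempo_bpm: Optional[float] = None
    timbre: list[float] = field(default_factory=list)


@dataclass
class Node:
    # "Pointer" to the track: a URL, a directory path and filename, or similar.
    pointer: str
    metadata: TrackMetadata = field(default_factory=TrackMetadata)
    audio_data: TrackAudioData = field(default_factory=TrackAudioData)


# Hypothetical node whose audio data has not been populated yet.
node_a = Node(
    pointer="/music/track_a.flac",
    metadata=TrackMetadata(artists=["Example Artist"], genres=["jazz", "blues"]),
)
```

Keeping the pointer separate from the properties reflects the preferred arrangement in which the node refers to its audio track rather than containing it.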
System 12 and/or software 16 may maintain an entry/exit list which identifies the nodes in network 10 and maintains a list of the links that enter each node and a list of the links that exit from each node.
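One possible form of such an entry/exit list is sketched below. This is a minimal sketch under the assumption that nodes are identified by simple labels and links by (source, target) pairs; the class name `EntryExitList` and its methods are hypothetical.

```python
from collections import defaultdict


class EntryExitList:
    """Tracks, for every node, the links that enter it and the links that exit it."""

    def __init__(self):
        self.entering = defaultdict(set)  # node -> {(source, target), ...}
        self.exiting = defaultdict(set)   # node -> {(source, target), ...}

    def add_node(self, node):
        # Touching both maps registers a node even if it has no links yet.
        self.entering[node]
        self.exiting[node]

    def add_link(self, source, target):
        self.add_node(source)
        self.add_node(target)
        self.exiting[source].add((source, target))
        self.entering[target].add((source, target))

    def remove_link(self, source, target):
        self.exiting[source].discard((source, target))
        self.entering[target].discard((source, target))


links = EntryExitList()
links.add_link("A", "C")
links.add_link("C", "A")
links.add_link("A", "B")
```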
Data structures 31, 41 and 50 of
When new audio tracks 17 become accessible to system 12, new nodes may be added to network 10. When a new node is added to network 10, new links may be created between the newly-added node and one or more existing nodes in network 10. Such newly-created links may enter and/or exit the newly-added node. New links can be manually created (e.g. by a user) and/or automatically created (e.g. by software 16) when a new node is added to network 10 and/or during creation of network 10.
After creating these new links in block 110, a link strength may be determined for each of the newly-created links. The strength of each newly-created link may be manually determined (e.g. by user input) or automatically determined (e.g. by software 16) and may be based on the similarity between the audio tracks 17 represented by the nodes between which the link extends. The similarity between two audio tracks 17 may be derived from a comparison of the properties associated with the audio tracks. Some of the properties of an audio track 17 may populate the fields of the node data structure which represents the audio track 17 in network 10. For example, metadata field 34 and audio data field 36 of node data structure 31 may be populated by the properties of a corresponding audio track 17. For ease of explanation, the properties of an audio track 17 that populate the fields of the node data structure 31 representing the audio track 17 may be referred to herein as the “properties of the node” and/or the “properties associated with the node”.
The similarity between the properties of a pair of nodes or a pair of audio tracks 17 may be based on metadata field 34. For example:
The similarity between the properties of a pair of nodes or a pair of audio tracks 17 may be based on audio data field 36. For example:
In the particular embodiment of method 100, the strengths of the newly-created links are automatically determined on the basis of the properties of the newly-added node and the properties of the previously-existing nodes in network 10. This automatic determination of link strength may be based on the correlation (i.e. similarity) between the properties of the newly-added node and the properties of the existing nodes.
In the embodiment of
In accordance with one particular embodiment, the vector distance function d(x,y) for two arbitrary nodes X, Y is given by the Euclidean norm:
In other embodiments, the vector distance function d(x,y) has other forms. For example, the vector distance function d(x,y) may be given by the cosine distance function:
where ∥x∥=(x1x1+x2x2+ . . . +xkxk)^{1/2}. The cosine distance function outputs a result in the range of [−1,1], where an output of 1 corresponds to vectors that point in the same direction (identical vectors being one example).
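For illustration, the two kinds of vector comparison mentioned above may be computed as in the following sketch. The function names are hypothetical and the formulas are the standard textbook definitions of the Euclidean distance and the cosine of the angle between two vectors; they are offered as an assumption about the intended computation rather than as a restatement of the formulas of any particular embodiment.

```python
import math


def euclidean_distance(x, y):
    """Standard Euclidean distance between equal-length property vectors."""
    if len(x) != len(y):
        raise ValueError("property vectors must have the same length")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; the result lies in [-1, 1]."""
    if len(x) != len(y):
        raise ValueError("property vectors must have the same length")
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)
```

Note that the two measures run in opposite directions: a small Euclidean distance indicates similar vectors, whereas a cosine value close to 1 indicates similar vectors.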
Some of the properties (e.g. x1, x2, x3 . . . xk and y1, y2 . . . yk) associated with nodes X and Y may be coded into a numerical format to facilitate the calculation of a vector distance function d(x,y). In some cases, a particular property xi, yi may already exist in numerical format. Such numerical properties may include timbral properties 36m, rhythmic properties 36B and track length 36A. Inherently numerical properties may be scaled or normalized before being used in the calculation of a vector distance d. In other cases, where a particular property xj, yj is not inherently numeric, system 12 and/or software 16 may be provided with a mapping function Mj(x) which maps the jth property into an n-dimensional numerical space. System 12 and/or software 16 may be provided with a mapping function Mj(x) for each non-numeric property. Properties which are not inherently numeric include artist 34A, album 34B and genre 34n. The mapping functions Mj(x) may be based on empirical formulae developed by musicians, musicologists or the like. The mapping functions Mj(x) may take advantage of available music databases and similar resources, which may be local to system 12 and/or accessible to system 12 over a communication network such as the internet.
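A very simple mapping function of this kind is sketched below. It assumes a one-hot style encoding over a fixed genre vocabulary, which is only one of many possible mappings and is not derived from any empirical musicological formula; the helper names and the example vocabulary are hypothetical. A min-max scaling helper for inherently numerical properties is included for completeness.

```python
def make_one_hot_mapping(vocabulary):
    """Return a mapping function from a list of labels to an n-dimensional vector."""
    index = {label: i for i, label in enumerate(vocabulary)}

    def mapping(labels):
        vector = [0.0] * len(index)
        for label in labels:
            if label in index:  # unknown labels are ignored in this sketch
                vector[index[label]] = 1.0
        return vector

    return mapping


def min_max_scale(value, low, high):
    """Scale a numerical property into [0, 1] given its expected range."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


# Hypothetical vocabulary and track values, for illustration only.
genre_map = make_one_hot_mapping(["blues", "classical", "jazz", "rock"])
property_vector = genre_map(["jazz", "blues"]) + [min_max_scale(210.0, 0.0, 600.0)]
```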
In some embodiments, particular properties of the newly-added node and the previously-existing node may be given increased weight when determining the vector distance function d(x, y). In such cases, the weighted Euclidean norm vector distance function may be given by:
where ai represents a weighting coefficient assigned to the ith property. As an example, it may be desirable to give extra weight to similarities in the artist field 34A between a newly-added node and a previously-existing node. The artist field 34A may be property x3 in the newly-added node and property y3 in the previously-existing node. In such a case, the weighting coefficient a3 may have a relatively high value in comparison to other weighting coefficients. In some embodiments, the weighting coefficients ai may additionally or alternatively depend on an average value of the ith property.
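As a sketch of how per-property weighting might be applied, the following function scales each squared difference by its weighting coefficient before summing. This is one common form of weighted Euclidean norm and is an assumption made for illustration; the function name is hypothetical.

```python
import math


def weighted_euclidean_distance(x, y, weights):
    """Euclidean distance in which the i-th squared difference is scaled by weights[i]."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(a * (xi - yi) ** 2 for a, xi, yi in zip(weights, x, y)))
```

With all weights equal to 1 the function reduces to the unweighted Euclidean distance; a relatively large coefficient (for example, on the artist property) makes differences in that property dominate the result.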
As a part of block 120, the output of the vector distance function d(x,y) may be linearly scaled and/or linearly offset to provide a suitable vector distance range. Those skilled in the art will appreciate that there are many other distance functions and similar functions which can be used to compute a correlation/similarity between a pair of vectors.
Another technique for determining the similarity or correlation between two vectors involves using machine learning classifier models. For example, a classifier may be trained to map inherent properties of an audio track 17 (e.g. spectral properties, tempo, timbral properties and track length) into a metadata property, such as genre for example. The classifier may be trained using a set of training vectors. Preferably, the training vectors are developed from actual audio tracks. Each of the vectors in the training set is provided with the inherent properties being considered (e.g. spectral properties, tempo, timbral properties and track length) and a label corresponding to the metadata property. For example, where the metadata property is genre, the training set may include training vectors having labels, such as pop, rock, rap, classical, jazz, blues, or the like. Using the training set, the classifier develops a set of parameters that map the inherent properties of an audio track 17 (e.g. spectral properties, tempo, timbral properties and track length) into one of the labels of the training set. The classifier may then be used to predict a metadata property of arbitrary audio tracks 17 on the basis of the inherent properties of the audio tracks. In the example described above, the classifier may be provided with a vector corresponding to the inherent properties of an arbitrary audio track 17 (e.g. spectral properties, tempo, timbral properties and track length) and will predict the genre of the audio track.
To assess the similarity between the properties of two nodes, a classifier may be trained using a set of training vectors, where each vector in the training set is based on the properties of a pair of nodes. For example, as discussed above, the properties of a pair of nodes X, Y may be represented by a pair of vectors x=(x1, x2, . . . , xk), y=(y1, y2, . . . , yk). A vector in the training set may then be represented by concatenating the vectors x and y to form a training vector r, having the form r=(x1, x2, . . . , xk, y1, y2, . . . , yk). The labels of the training set may be a set of predetermined discrete similarity levels. Using this training set, the classifier develops a set of parameters that map a concatenated vector having the form (x1, x2, . . . , xk, y1, y2, . . . , yk) into one of the discrete similarity levels corresponding to the labels of the training set. The classifier may then be used to predict the similarity of a pair of arbitrary audio tracks 17 on the basis of the vectors x=(x1, x2, . . . , xk), y=(y1, y2, . . . , yk) representing the properties of the audio tracks. In the example described above, the classifier may be provided with a concatenated vector of the form (x1, x2, . . . , xk, y1, y2, . . . , yk) corresponding to the properties of the pair of arbitrary audio tracks 17 and will predict the similarity of the pair of audio tracks as one of the discrete similarity levels used in the training.
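The concatenation approach may be illustrated with the following sketch, which uses a one-nearest-neighbour classifier purely because it is short; the specification does not prescribe any particular classifier model. The class name, the three similarity labels and the training values are all made up for the purposes of the example.

```python
import math


def concatenate(x, y):
    """Form the vector r = (x1, ..., xk, y1, ..., yk) from a pair of property vectors."""
    return list(x) + list(y)


class NearestNeighbourClassifier:
    """Minimal 1-nearest-neighbour classifier, used here only as a stand-in model."""

    def fit(self, vectors, labels):
        self.vectors = [list(v) for v in vectors]
        self.labels = list(labels)
        return self

    def predict(self, vector):
        distances = [math.dist(vector, v) for v in self.vectors]
        return self.labels[distances.index(min(distances))]


# Hypothetical training pairs, each labelled with a discrete similarity level.
training_pairs = [
    (([0.1, 0.2], [0.1, 0.25]), "high"),
    (([0.5, 0.5], [0.7, 0.3]), "medium"),
    (([0.9, 0.1], [0.1, 0.9]), "low"),
]
classifier = NearestNeighbourClassifier().fit(
    [concatenate(x, y) for (x, y), _ in training_pairs],
    [label for _, label in training_pairs],
)
predicted_level = classifier.predict(concatenate([0.12, 0.2], [0.1, 0.22]))
```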
After creating links between the newly-added node and the previously-existing nodes in block 110 and determining the vector distances d assigned to each of the newly-created links in block 120, newly-created links having vector distances d less than a threshold θ may be removed (or otherwise excluded (e.g. by setting their vector distance d=0)) from network 10 in block 130. Threshold θ may be a user-configurable parameter or may be set as a predetermined threshold in network 10. Threshold θ need not be a constant value. Threshold θ may be a function of the particular vector distance function d(x,y) used in block 120 to determine the similarity of the newly-added node to the previously-existing nodes and/or one or more of the individual properties (e.g. x1, x2, x3 . . . xk and y1, y2, y3 . . . yk) used to determine the vector distance d in block 120.
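The block 130 thresholding may be sketched as a simple filter, assuming a constant threshold θ and a dictionary that maps each newly-created link, identified by a (source, target) pair, to its vector distance d. The function name and the example values are hypothetical.

```python
def apply_threshold(candidate_links, theta):
    """Keep only the newly-created links whose vector distance d is at least theta."""
    return {link: d for link, d in candidate_links.items() if d >= theta}


# Hypothetical links between a newly-added node "N" and existing nodes.
candidate_links = {("N", "A"): 0.62, ("N", "B"): 0.08, ("A", "N"): 0.55, ("B", "N"): 0.12}
remaining_links = apply_threshold(candidate_links, theta=0.1)
```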
In method 100, block 140 involves applying a calibration function ƒ(d) to the vector distances d of the remaining newly-added links. Preferably, the calibration function ƒ(d) is a non-linear function which may be used to de-emphasize newly-created links having statistically-outlying vector distances d and/or to improve the dynamic range of the vector distances d for the newly-created links. For example, if there are ten newly-created links after the block 130 thresholding operation and nine of the newly-created links have vector distances d in a range of [0.5, 0.65] and one of the newly-created links has a vector distance d of 0.95, then it may be useful to emphasize the range of vector distances between [0.5, 0.65] and to de-emphasize vector distances in a vicinity of 0.95, so as to provide more dynamic range for the vector distances d in the range [0.5, 0.65].
In accordance with one particular example, the calibration function ƒ(d) is given by:
f(d) := a·2^{b·d}·d^{c}
where d=d(x,y) is the vector distance function between nodes X and Y and a, b, c are numerical calibration parameters. Parameters a, b, c may be user-configurable parameters or may be pre-configured parameters. Parameters a, b, c need not be constant and may be functions of the particular vector distance function d(x,y) used to determine the vector distances d in block 120. Where the parameters a, b, c depend on certain properties of the nodes, the block 140 calibration may be used to provide weight to certain properties of the nodes.
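A minimal sketch of such a calibration function follows. It assumes that the calibration function has the form a multiplied by 2 raised to the power (b·d), multiplied by d raised to the power c, and that the parameters a, b, c are constants; both points are assumptions made for illustration, and the function name and default values are hypothetical.

```python
def calibrate(d, a=1.0, b=1.0, c=1.0):
    """Non-linear calibration of a vector distance d: a * 2**(b*d) * d**c."""
    return a * 2.0 ** (b * d) * d ** c
```

For example, with a=1, b=1 and c=1, a vector distance of 1.0 is calibrated to 2.0, and a vector distance of 0 is calibrated to 0.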
Additional calibration mechanisms may be provided as a part of block 140 (or elsewhere in method 100) for situations where a newly-added node has no links exiting from the node (i.e. all links exiting from the newly-added node were removed in the block 130 thresholding process) or a newly-added node has no links entering the node (i.e. all links entering the newly-added node were removed in the block 130 thresholding process). For example, for a newly-added node that has no exiting links, an additional calibration mechanism may comprise adding links (with some nominal value δ in the place of vector distance d) exiting from the newly-added node and entering all of the other nodes in network 10 (or some subset of the other nodes in network 10, such as the n nodes determined to be most similar to the newly-added node prior to the block 130 thresholding process). Similarly, for a newly-added node that has no entering links, an additional calibration mechanism may comprise adding links (with some nominal value ε in the place of vector distance d) exiting from every other node in network 10 (or some subset of the other nodes in network 10, such as the n nodes determined to be most similar to the newly-added node prior to the block 130 thresholding process) and entering the newly-added node.
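The fallback mechanism described above may be sketched as follows, assuming that links are held in a dictionary keyed by (source, target) pairs and that the subset of candidate nodes has already been chosen. The function and parameter names are hypothetical.

```python
def add_fallback_links(links, new_node, candidate_nodes, delta, epsilon):
    """Give an isolated new node nominal links; links maps (source, target) -> d."""
    has_exit = any(source == new_node for source, _ in links)
    has_entry = any(target == new_node for _, target in links)
    for node in candidate_nodes:
        if node == new_node:
            continue
        if not has_exit:
            links[(new_node, node)] = delta    # nominal value in place of d
        if not has_entry:
            links[(node, new_node)] = epsilon  # nominal value in place of d
    return links
```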
At the conclusion of the block 140 calibration process, the calibrated vector distances d (or the nominal values δ, ε) for each link may be retained in field 42 of the link data structure 41. Block 150 involves normalizing the vector distances d to obtain normalized link strengths. The block 150 link normalization process occurs for all nodes that have new exiting links or a change in their exiting links (i.e. as a result of blocks 110, 120, 130 and 140). Normalizing link strengths may be accomplished, for each node X, by dividing the calibrated vector distance d (or the nominal values δ, ε) of each individual link exiting from node X by the sum of the calibrated vector distances d (or the nominal values δ, ε) of all links exiting from node X. This may be accomplished by dividing the individual vector distance fields 42 of the link data structures 41 exiting node X by the sum of the vector distance fields 42 of the link data structures 41 exiting node X.
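The normalization for a single node X may be sketched as below, assuming that the calibrated vector distances of the links exiting node X are held in a dictionary keyed by destination node. The function name and the example values are hypothetical.

```python
def normalize_exiting_links(distances):
    """Divide each calibrated distance by the sum over all links exiting the node."""
    total = sum(distances.values())
    if total <= 0:
        raise ValueError("a node needs at least one exiting link with positive distance")
    return {target: d / total for target, d in distances.items()}


# Hypothetical calibrated distances for the links exiting one node.
strengths = normalize_exiting_links({"B": 0.6, "C": 0.3, "D": 0.3})
```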
In alternative embodiments, where data structure 41 does not include vector distance field 42, the vector distances d may be recalculated for all of the links exiting from a node which receives a new exiting link as a result of blocks 110-140.
As a part of the block 150 normalization, the previously-existing links exiting from each node being normalized may have their link strengths re-normalized (i.e. because of the addition of new links). The re-normalized link strength of these previously-existing links may be subjected to a new threshold test. If the re-normalized link strength of a previously-existing link has decreased (e.g. because of the presence of a new link) and the strength of the previously-existing link is now below some re-normalization threshold λ, then the previously-existing link may be discarded and the block 150 normalization procedure may be repeated for that node. The re-normalization threshold λ may be a user configurable or a predefined parameter and may be a global parameter or a parameter that is specific to each node. The re-normalization threshold λ need not be constant and may be a function of the total number of links exiting a particular node. For example, the re-normalization threshold λ may be relatively low where the number of links exiting a particular node is relatively high and the re-normalization threshold λ may be relatively high where the number of links exiting a particular node is relatively low.
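One way to sketch this re-normalization loop is shown below. It assumes a constant threshold λ and discards the weakest previously-existing link one at a time, repeating the normalization after each removal; discarding one link per pass is a design choice of this sketch, not a requirement of the method. The function name and the example values are hypothetical.

```python
def renormalize_with_pruning(distances, new_targets, lam):
    """Normalize exiting links, discarding old links whose strength falls below lam."""
    remaining = dict(distances)
    while remaining:
        total = sum(remaining.values())
        strengths = {target: d / total for target, d in remaining.items()}
        # Only previously-existing links are subject to the re-normalization threshold.
        weak = [t for t in strengths if t not in new_targets and strengths[t] < lam]
        if not weak:
            return strengths
        # Discard the weakest link, then repeat the normalization for this node.
        del remaining[min(weak, key=strengths.get)]
    return {}


# Hypothetical node with old links to C and D and a new link to B.
result = renormalize_with_pruning({"B": 0.5, "C": 0.45, "D": 0.05}, new_targets={"B"}, lam=0.1)
```

In this example the link to D has a normalized strength of 0.05, which is below λ=0.1, so it is discarded and the strengths of the links to B and C are re-normalized to sum to unity.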
After normalization in block 150, the sum of the normalized link strengths for all of the links exiting from a particular node is unity. The normalized link strength may be retained in field 43 of link data structure 41.
For ease of description, method 100 is described for the case of adding a single new node to an existing network 10. Those skilled in the art will appreciate that adding multiple new nodes (or even new networks incorporating a plurality of nodes and links) may involve an extension of method 100. Such an extension of method 100 may involve repetitive application of method 100, but may additionally or alternatively involve some economization of method 100 to account for the addition of multiple new nodes. For example, some of the method 100 procedures may be implemented in parallel for some or all of the newly-added nodes.
The normalized link strengths determined in method 100 may be used as probabilities for transitions from one node in network 10 to another node in network 10 via a link. A transition between nodes of network 10 via a link may correspond with playback of an audio track 17 represented by the first node followed by playback of an audio track 17 represented by the second node. Accordingly, the normalized link strengths determined in method 100 may be used by system 12 and/or software 16 to determine the track playback order. Because the normalized strengths of links connecting nodes having similar properties will tend to be higher than the normalized strengths of links connecting nodes having dissimilar properties, the probability of a transition between nodes having similar properties is greater than the probability of a transition between nodes having dissimilar properties. Accordingly, successive playback of audio tracks 17 that are similar to one another is more likely than successive playback of audio tracks 17 that are dissimilar to one another.
When the block 210 play command is activated, the ‘currently-selected track’ is played back in block 220. Selection of the currently-selected track is explained in more detail below. When the play command is activated in block 210 for the first time (e.g. after system 12 has been powered down or after a predetermined amount of time), then the block 220 playback may involve playing back the track associated with a predetermined node (i.e. setting the track associated with a predetermined node to be the currently-selected track), playing back the track associated with a random node (i.e. setting the track associated with a random node to be the currently-selected track) or playing back the track associated with a user-selected node (i.e. where the user selects the track associated with a particular node to be the currently-selected track before or after activating the block 210 play command).
In the absence of additional user input, method 200 proceeds through blocks 230, 240 and 250 to block 260. If it is determined (in block 260) that playback of the currently-selected track has not ended (block 260 NO output), then method 200 loops back to block 220 and continues playing the currently-selected track. If it is determined (in block 260) that playback of the currently-selected track has ended (block 260 YES output), then method 200 proceeds to block 270, where it updates a play history list as explained below.
Network 10 may maintain a play history list.
In block 270, play history list 300 is updated to reflect the fact that playback of the currently-selected track has just ended (block 260 YES output).
In block 272, a new track is selected for playback. Preferably, the block 272 selection of a new track for playback involves a transition from the node associated with the currently-selected track to a new node via a link that exits from the node associated with the currently-selected track and enters the new node. For example, in network 10 of
In accordance with one particular embodiment, if X denotes the node associated with the currently-selected track (i.e. whose playback has just ended), Y denotes another node in the network and there is a link exiting node X and entering node Y, then the track associated with node Y is selected to be the next track in block 272 with probability pXY, where pXY is the normalized link strength of the link from node X to node Y. Returning to the previous example of network 10 (
The block 272 track selection may be performed via a number of methods. In one particular embodiment, the normalized link strengths of the links exiting the node associated with the currently-selected track are assigned concatenating, non-overlapping domains in the range (0,1] and system 12 and/or software 16 generate a pseudo-random number in the range (0,1]. This pseudo-random number is used to select one of the links exiting from the node associated with the currently-selected track. Returning to the previous example of network 10 (
After selection of the new node for playback in block 272, method 200 proceeds to block 274, where the currently-selected track is updated to be the newly-selected track (i.e. the track selected in block 272). Method 200 then proceeds through block 276 (explained in more detail below) to block 220, where it begins to play back the new currently-selected track.
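The interval-based selection described for block 272 can be sketched as follows. This is a minimal Python illustration and not the implementation of system 12 or software 16: the function name `select_next_node`, the dictionary representation of the outgoing links and the node labels are assumptions made for the example. The sketch also assumes that the normalized link strengths of the links exiting the current node sum to 1.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def select_next_node(links, rng=random):
    """Pick a target node with probability equal to its normalized link strength.

    `links` is a hypothetical mapping {target_node: normalized_link_strength}
    for the links exiting the node of the currently-selected track.
    """
    targets = list(links)
    # Upper bounds of the contiguous sub-ranges (0, b1], (b1, b2], ..., (.., 1].
    bounds = list(accumulate(links[t] for t in targets))
    # random() returns a value in [0, 1); 1 - random() lies in (0, 1].
    r = 1.0 - rng.random()
    # Index of the first sub-range whose upper bound is >= r.
    idx = bisect_left(bounds, r)
    # Guard against rounding that leaves the last bound slightly below 1.
    return targets[min(idx, len(targets) - 1)]


if __name__ == "__main__":
    # Hypothetical node X with links to nodes B, C and D.
    print(select_next_node({"B": 0.25, "C": 0.5, "D": 0.25}))
```

Because each sub-range has a width equal to its link strength, a uniformly distributed number falls in the sub-range of node Y with probability pXY.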
During playback of the currently-selected track, a user may interact with system 12 by activating the ‘next’ command. As with the play command, a user may activate the next command using any suitable hardware or software input. In the illustrated embodiment of method 200, activation of the next command is detected in block 250. If the user does not activate the next command (block 250 NO output), then, in the absence of any other user input, method 200 loops through block 260 back to block 220, where it continues to play the currently-selected track. When the next command is activated (block 250 YES output), playback of the currently-selected track ends and method 200 proceeds through blocks 272, 274, 276 (as described above) to select and begin to play a new track. In method 200, block 270 is bypassed when a user activates the next command. In other embodiments, the play history list is updated when a user activates the next command.
During playback of the currently-selected track, a user may also interact with system 12 by activating a ‘restart’ command. As with the other user commands, a user may activate the restart command using any suitable hardware or software input. In method 200, activation of the restart command is detected in block 230. If the user does not activate the restart command (block 230 NO output), then, in the absence of other user input, method 200 loops back through blocks 240, 250 and 260 to block 220, where it continues to play the currently-selected track. If the restart command is activated (block 230 YES output), then playback of the currently-selected track is restarted in block 235 before proceeding back to block 220.
A user may also interact with system 12 by activating the ‘previous’ command. As with the other user commands, a user may activate the previous command using any suitable hardware or software input. In method 200, activation of the previous command is detected in block 240. If the user does not activate the previous command (block 240 NO output), then, in the absence of other user input, method 200 loops back through blocks 250 and 260 to block 220, where it continues to play the currently-selected track. If the previous command is activated (block 240 YES output), then playback of the currently-selected track ends and the currently-selected track is replaced (in block 245) with the track associated with the node corresponding to the most recently added pointer on the play history list. For example, if the previous command is activated while the play history list is play history list 300 of
Block 245 also involves removing the pointer to the node associated with the most recently played back track from the play history list.
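One way to picture the play history handling of blocks 270 and 245 is as a last-in, first-out list of node pointers. The sketch below is illustrative only: the class and method names are hypothetical, and the behavior when the list is empty (keeping the current track) is an assumption, since the text does not specify it.

```python
class PlayHistory:
    """Hypothetical last-in, first-out list of pointers to played nodes."""

    def __init__(self):
        self._pointers = []

    def record_finished(self, node):
        # Block 270: playback of `node` has just ended, so remember it.
        self._pointers.append(node)

    def previous(self, current):
        # Block 245: return the most recently added pointer and remove it
        # from the list. Assumption: with an empty history, stay on `current`.
        if not self._pointers:
            return current
        return self._pointers.pop()


if __name__ == "__main__":
    history = PlayHistory()
    history.record_finished("A")   # track A played to its end
    history.record_finished("B")   # then track B played to its end
    print(history.previous("C"))   # 'previous' while C is playing
```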
In some embodiments, selection of the new node for playback in block 272 involves the use of a taboo mechanism which helps to prevent repetition in playback. In accordance with one particular embodiment, before a track 17 is about to start being played back, a taboo list is updated with information about the track 17 and/or its associated node. In method 200, the taboo list is updated in block 276 (i.e. after the newly-selected track is updated to be the currently-selected track in block 274 and before playback of the new currently-selected track commences in block 220).
If, on the other hand, the preliminary new track selection is on the taboo list (block 620 YES output), then method 600 proceeds to block 640, where the difference between the current time and the playback time of the preliminary selected track (i.e. the playback time contained in the taboo list for the node associated with the preliminary selected track) is compared to a taboo threshold time TT. If the difference between the current time and the playback time of the preliminary selected track is greater than the taboo threshold time TT (block 640 YES output), then method 600 proceeds to block 630 where the preliminary new track selection is finalized as the new track. If the difference between the current time and the playback time of the preliminary selected track is less than or equal to the taboo threshold time TT (block 640 NO output), then the preliminary new track is rejected and method 600 returns to block 610, where a new preliminary track is selected and method 600 repeats itself.
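The accept/reject loop of blocks 610 through 640 can be sketched as follows. This is a hedged illustration under stated assumptions: the taboo list is modeled as a dictionary mapping each node to its last playback time, `pick_preliminary` stands in for whatever preliminary selection is used in block 610, and the `max_attempts` cap (with a `None` result when it is exhausted) is an addition of this example so that the loop always terminates.

```python
def select_with_taboo(pick_preliminary, taboo, now, threshold, max_attempts=1000):
    """Return a node that passes the taboo test, or None if none was found.

    pick_preliminary: hypothetical callable returning a candidate node (block 610).
    taboo: hypothetical dict {node: playback_time}.
    now, threshold: current time and taboo threshold time TT, in the same units.
    """
    for _ in range(max_attempts):
        candidate = pick_preliminary()       # block 610
        played_at = taboo.get(candidate)     # block 620: on the taboo list?
        if played_at is None:
            return candidate                 # block 630: not on the list
        if now - played_at > threshold:      # block 640: older than TT?
            return candidate                 # block 630: finalize selection
        # Otherwise the candidate is rejected and a new one is drawn.
    return None  # Assumption: signals that every attempt was rejected.


if __name__ == "__main__":
    candidates = iter(["A", "B", "C"])
    taboo_list = {"A": 95.0, "B": 100.0}
    print(select_with_taboo(lambda: next(candidates), taboo_list, 100.0, 10.0))
```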
The taboo threshold time TT may be a user-configurable parameter or may be a parameter that is automatically defined by software 16. The taboo threshold time TT need not be constant and may depend on many factors, such as the number of nodes in network 10 for example. In cases where the playback times of the data elements in taboo list 400 correspond to discrete intervals other than clock-based times, then the taboo threshold time TT need not be a clock-based time and may be a threshold number of discrete intervals.
Whenever a new data element is added to the taboo list in block 276, all data elements whose playback times are further away from the current time than the taboo threshold time TT (i.e. all data elements for which current time - playback time > TT) may be removed from the taboo list. This avoids having the taboo list grow indefinitely. If a taboo list mechanism is used, the taboo list may remain unaffected by activation of the previous command (block 240 of method 200) discussed above.
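The block 276 update with pruning might look like the sketch below. It assumes, purely for illustration, that the taboo list is a dictionary mapping each node to its playback time; the function name `record_playback` is hypothetical.

```python
def record_playback(taboo, node, now, threshold):
    """Add `node` to the taboo list and drop entries older than TT.

    taboo: hypothetical dict {node: playback_time}, modified in place.
    """
    taboo[node] = now
    # Collect expired nodes first so the dict is not resized while iterating.
    expired = [n for n, played_at in taboo.items() if now - played_at > threshold]
    for n in expired:
        del taboo[n]


if __name__ == "__main__":
    taboo_list = {"A": 0.0, "B": 50.0}
    record_playback(taboo_list, "C", now=60.0, threshold=20.0)
    print(taboo_list)
```

Pruning at insertion time keeps the list bounded by the number of tracks that can start within one threshold period.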
It may be possible, in some circumstances, that all of the nodes of network 10 are on the taboo list and the differences between the current time and the taboo list playback times for all of the nodes are less than the taboo threshold time TT. In method 600, a flag may be set to indicate this condition. In response to such a flag, method 600 may involve releasing a number n of nodes (preferably, the nodes corresponding to the oldest playback times) from the taboo list. Those skilled in the art will appreciate that there are other ways to overcome this condition. For example, all of the nodes may be released from the taboo list or the taboo threshold time TT may be reduced.
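Releasing the n nodes with the oldest playback times can be sketched as below. The dictionary representation of the taboo list and the function name `release_oldest` are assumptions of this example, not details taken from the described system.

```python
def release_oldest(taboo, n):
    """Remove the n entries with the oldest playback times and return them.

    taboo: hypothetical dict {node: playback_time}, modified in place.
    """
    # Sort nodes by playback time, oldest first, and take the first n.
    oldest = sorted(taboo, key=taboo.get)[:n]
    for node in oldest:
        del taboo[node]
    return oldest


if __name__ == "__main__":
    taboo_list = {"A": 30.0, "B": 10.0, "C": 20.0}
    print(release_oldest(taboo_list, 2), taboo_list)
```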
System 700 also comprises a probability assessor 706 for determining a probability of a transition from each media track 17 to one or more of the other media tracks 17 in set 702 based, at least in part, on the properties determined by media content analyzer 704. Probability assessor 706 may use vector distance functions as described above to assess probabilities and assign them to links of network 10 as described above. Probability assessor 706 may also receive input from external sources. System 700 also comprises a playlist generator 708 for selecting a sequence of media tracks 17 for playback based at least in part on the probabilities determined by probability assessor 706.
The probabilistic audio networks described above may be used in a variety of different kinds of audio playback systems/devices and a variety of different environments. Non-limiting examples of suitable systems/devices and environments include:
Probabilistic audio networks of the type described above may be created manually, automatically, or semi-automatically to reflect the preferences of specific users. Audio networks of the type described above (i.e. including a plurality of links and nodes) may be packaged and sold as pre-prepared audio networks. Such pre-prepared audio networks may be added to a user's existing network (in accordance with the methods of adding nodes discussed above) or may be installed as stand-alone networks. Such pre-prepared audio networks may correspond to, and be marketed as, the preferences of celebrities or other well-known persons, such as pop stars, actors, TV personalities, sports stars, etc. Such pre-prepared audio networks may also be designed for a specific purpose (e.g. playback in a bar, store or shopping center). Such pre-prepared networks may be commercially distributed via the internet or on storage media, such as CDs or DVDs, for example.
Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the invention. For example, one or more processors in an audio playback system may implement data processing steps in the methods described herein by executing software instructions retrieved from a program memory accessible to the processors. The invention may also be provided in the form of a program product. The program product may comprise any medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of the invention. Program products according to the invention may be in any of a wide variety of forms. The program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes and hard disk drives; optical data storage media including CD ROMs and DVDs; electronic data storage media including ROMs and flash RAM; or the like. Where specified, the program product may also comprise transmission-type media such as digital or analog communication links. The instructions may be present on the program product in encrypted and/or compressed formats.
Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a “means”) should be interpreted as including as equivalents of that component any component which performs the function of the described component (i.e., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated exemplary embodiments of the invention.
As will be apparent to those skilled in the art in the light of the foregoing disclosure, many alterations and modifications are possible in the practice of this invention without departing from the spirit or scope thereof. For example:
This application claims priority from U.S. patent application No. 60/636,290 filed 15 Dec. 2004 which is hereby incorporated by reference herein. This application is related to the co-pending application entitled SYSTEMS AND METHODS FOR STORING, MAINTAINING AND PROVIDING ACCESS TO INFORMATION which is filed together herewith and which is hereby incorporated by reference herein.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CA05/01896 | 12/15/2005 | WO | 00 | 6/14/2007 |
Number | Date | Country | |
---|---|---|---|
60636290 | Dec 2004 | US |