The present invention relates generally to entertainment systems. More particularly, the present invention relates to digital audio files used in connection with such entertainment systems.
The vast majority of vehicles currently in use incorporate vehicle entertainment systems for entertaining drivers and passengers during their travels. For example, vehicle audio systems provide information and musical programming to many motorists daily. These audio systems typically include an AM/FM radio receiver that receives radio frequency (RF) signals. These RF signals are then processed and rendered as audio output.
In addition to a radio receiver, vehicle audio systems often incorporate media players for playing prerecorded music. For example, some vehicle audio systems incorporate cassette or compact disc (CD) players. An increasing number of vehicle audio systems also incorporate media players for playing audio files stored in any of a variety of formats, including, for example, the MP3, MP3Pro, WMA, AAC, and Ogg-Vorbis formats. These digitally compressed formats allow storage media, such as CDs, to store many more songs relative to uncompressed formats, such as the CD Audio format. For example, with a compression of 10:1, it is possible to store well over 100 songs on a single CD or thousands of songs on a hard disc drive (HDD), depending on disc capacity.
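The storage figures above can be checked with a short calculation. The following is only a rough sketch: the four-minute song length and the 700 MiB disc capacity are assumptions chosen for illustration and are not stated in this description. The CD Audio data rate (44.1 kHz, 16-bit, two channels) is the standard rate for that format.

```python
# CD Audio: 44,100 samples/s x 2 bytes per sample x 2 channels.
CD_AUDIO_BYTES_PER_SECOND = 44_100 * 2 * 2


def songs_per_disc(capacity_bytes, song_seconds=240, compression_ratio=10):
    """Estimate how many songs fit on a medium of the given capacity.

    song_seconds and compression_ratio are illustrative assumptions.
    """
    uncompressed = song_seconds * CD_AUDIO_BYTES_PER_SECOND
    compressed = uncompressed / compression_ratio
    return int(capacity_bytes // compressed)


cd_capacity = 700 * 1024**2  # assumed 700 MiB data disc
print(songs_per_disc(cd_capacity))                        # 173
print(songs_per_disc(cd_capacity, compression_ratio=1))   # 17
```

Under these assumptions, a 10:1 compression ratio raises the capacity of a single disc from fewer than 20 songs to well over 100, consistent with the figure given above.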
In addition, the compressed data contains metadata information about the aural component in an ID3 tag. This metadata may include, for example, the song title, artist, album, and genre. Using this metadata provides an easy filter mechanism for the user to select criteria for browsing a song list or library. Metadata can be entered manually by the user or automatically captured through automatic naming software that accesses a data source, such as Gracenote. Gracenote is a CD database metadata lookup service that uses data from Internet users' manual data entries.
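As an illustration of how such metadata may be laid out, the sketch below parses a tag in the older ID3v1 layout: a fixed 128-byte block beginning with the marker "TAG", in which the genre is stored as a one-byte index into a predefined list. This description does not specify which ID3 version is used, so the choice of ID3v1 is an assumption. The function name `parse_id3v1` and the sample values are hypothetical.

```python
import struct

# ID3v1 layout: marker, title, artist, album, year, comment, genre index.
ID3V1_LAYOUT = struct.Struct("3s30s30s30s4s30sB")  # 128 bytes in total


def parse_id3v1(tag_bytes):
    """Return the metadata fields of an ID3v1 tag, or None if it is not one."""
    if len(tag_bytes) != ID3V1_LAYOUT.size:
        return None
    marker, title, artist, album, year, _comment, genre = ID3V1_LAYOUT.unpack(tag_bytes)
    if marker != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes (or spaces) to their fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "genre_index": genre,
    }


def _field(value, width):
    """Helper for building a sample tag (illustration only)."""
    return value.encode("latin-1").ljust(width, b"\x00")


sample_tag = (
    b"TAG"
    + _field("Example Song", 30)
    + _field("Example Artist", 30)
    + _field("Example Album", 30)
    + _field("2000", 4)
    + _field("", 30)
    + bytes([17])  # hypothetical genre index
)
print(parse_id3v1(sample_tag))
```

In a real file, the 128-byte block would be read from the end of the audio file rather than constructed in memory as it is here.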
One of the advantages of metadata is that media players can provide an extensive library browsing capability using metadata fields such as the artist name, the album name, and the genre. The genre field is particularly useful because it allows the user to filter songs within a song library that fit a particular mood. However, for the genre field to be truly effective, the grouping classification must be meaningful to the user. Currently, the genre lists used by automatic naming software are often too large and fine-grained for the user to differentiate among genres. For example, it may not be clear whether a particular song should be classified as “rock,” “soft rock,” or “classic rock.” Automatic naming software generally supports more than 100 genre types. For example, Gracenote currently supports over 250 genre types. Furthermore, with Gracenote, there can be only one genre defined per audio file. That is, a particular song can be classified under “rock,” but not under both “rock” and “soft rock” simultaneously. Because there is no standard for encoding genre metadata, the user can easily become confused when using genre metadata as a way to browse or filter songs for play from a song library. In addition, in the context of in-vehicle applications, scrolling through a list of up to 255 genres can result in excessive head-down time and driver distraction, increasing the risk of accidents.
According to various example embodiments of the present invention, a genre set is reduced from a large set of distinct genre classifications to a smaller set of broader genre classifications. The genre set can be reduced using statistical classification techniques, such as, for example, similarity matrices, or the reduced set may be arbitrarily defined a priori.
One embodiment of the invention is directed to a method to classify a plurality of audio files. Metadata associated with the audio files is received as input. The metadata comprises a set of original genre entries. The set of original genre entries is correlated with a set of consolidated genre entries. The number of consolidated genre entries is less than the number of original genre entries. The consolidated genre entries are associated with the audio files as a function of the correlation of the set of original genre entries with the set of consolidated genre entries. This method may be embodied in processor-readable media.
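The method steps above can be sketched in a few lines. This is a minimal illustration rather than the claimed implementation: the correlation is modeled as a simple lookup table, and the table contents, the `consolidate_genres` name, and the "Other" fallback are all hypothetical.

```python
# Hypothetical correlation of original genre entries with consolidated entries.
CONSOLIDATION = {
    "rock": "Rock",
    "soft rock": "Rock",
    "classic rock": "Rock",
    "country": "Country",
    "bluegrass": "Country",
    "bebop": "Jazz",
}


def consolidate_genres(files, table, fallback="Other"):
    """Associate a consolidated genre with each audio file's metadata.

    files is a list of metadata dicts, each with an original "genre" entry.
    Returns new dicts; the input metadata is left unchanged.
    """
    result = []
    for meta in files:
        original = meta.get("genre", "").strip().lower()
        consolidated = table.get(original, fallback)
        result.append({**meta, "consolidated_genre": consolidated})
    return result


library = [
    {"title": "Track A", "genre": "Soft Rock"},
    {"title": "Track B", "genre": "Bluegrass"},
    {"title": "Track C", "genre": "Polka"},
]
for entry in consolidate_genres(library, CONSOLIDATION):
    print(entry["title"], "->", entry["consolidated_genre"])
```

The table maps six original entries onto three consolidated entries, so the consolidated set is smaller than the original set, as the method requires.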
In another embodiment, an entertainment subsystem includes a media subsystem configured to retrieve data from a data storage medium storing audio files. A microprocessor is operatively coupled to the media subsystem and configured to receive as input metadata associated with the audio files, the metadata comprising a set of original genre entries. The microprocessor is also configured to correlate the set of original genre entries with a set of consolidated genre entries, the number of consolidated genre entries being less than the number of original genre entries, and to associate the consolidated genre entries with the audio files as a function of the correlation of the set of original genre entries with the set of consolidated genre entries.
Various embodiments may provide certain advantages. For instance, reducing a genre set to a manageable number of genre classifications may reduce the amount of time that a user spends scrolling through a listing of genre classifications. As a result, driver interaction with the user interface and the resulting potential for driver distraction may be reduced. In addition, locating individual audio files may be facilitated.
Additional objects, advantages, and features of the present invention will become apparent from the following description and the claims that follow, considered in conjunction with the accompanying drawings.
According to various embodiments, a media system reduces a genre set from a large set of distinct genre classifications to a smaller set of broader genre classifications. The genre set is reduced using statistical classification techniques, such as, for example, similarity matrices. In this way, the amount of time that a user spends scrolling through a listing of genre classifications can be reduced, thereby reducing driver interaction with the user interface and the resulting potential for driver distraction. In addition, locating individual audio files may be facilitated.
The following description of various embodiments implemented in a vehicle-based entertainment system is to be construed by way of illustration rather than limitation. This description is not intended to limit the invention or its applications or uses. For example, while various embodiments are described as being implemented in a vehicle-based media system, it will be appreciated that the principles of the invention are applicable to media systems operable in other environments, such as home media systems.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of various embodiments. It will be apparent to one skilled in the art that some embodiments may be practiced without some or all of these specific details. In other instances, well known components and process steps have not been described in detail.
Various embodiments may be described in the general context of processor-executable instructions, such as program modules, being executed by a processor. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed processing environments in which tasks are performed by remote processing devices that are linked through a communications network or other data transmission medium. In a distributed processing environment, program modules and other data may be located in both local and remote storage media, including memory storage devices.
Referring now to the drawings,
A media subsystem 116 is configured to read a storage medium, such as a CD, a CD-ROM, a Secure Digital (SD) memory card, a MultiMedia Card (MMC), or a hard disk drive (HDD). The media subsystem 116 receives the storage medium, for example, via a receptacle 118 formed in the head unit 102. The storage medium may store data, such as audio files in a variety of formats, including but not limited to the MP3, MP3Pro, WMA, AAC, and Ogg-Vorbis formats. These digitally compressed formats allow storage media, such as CDs, to store many more songs relative to uncompressed formats, such as the CD Audio format. For example, with a compression of 10:1, it is possible to store well over 100 songs on a single CD or thousands of songs on an HDD, depending on disc capacity.
When a storage medium is inserted into the media subsystem 116, for example, through the receptacle 118 in the head unit 102, the media subsystem 116 reads data from the storage medium and communicates the data to a microprocessor 120, typically via a buffer (not shown). The data is then provided to one or more additional components, including, but not limited to, a digital signal processor (DSP) and a digital-to-analog converter (DAC), which convert the digital data signal to an analog signal. Speakers 122 then generate sound in response to the analog signal.
The microprocessor 120 is typically configured to operate with one or more types of processor-readable media. Processor-readable media can be any available media that can be accessed by the microprocessor 120 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, processor-readable media may include storage media and communication media. Storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as processor-readable instructions, data structures, program modules, or other data. Storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVDs) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the microprocessor 120. Communication media typically embodies processor-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above are also intended to be included within the scope of processor-readable media.
Certain storage media can store hundreds or thousands of audio files. As disclosed above, the audio files may be stored in any of a variety of formats, including but not limited to the MP3, MP3Pro, WMA, AAC, and Ogg-Vorbis formats. These audio files may be associated with one or more metadata fields that store information relating to the audio files. Metadata facilitates sorting and filtering audio files, particularly large numbers of audio files. The metadata fields may include, for example, artist name, track title, and album title fields.
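To illustrate how metadata fields may support sorting and filtering, the following minimal sketch filters a small library on arbitrary field values and returns the matches in a stable order. The `Track` type, the `filter_tracks` function, and the sample entries are hypothetical and are not taken from this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    album: str
    genre: str


def filter_tracks(tracks, **criteria):
    """Return tracks whose metadata matches every criterion (case-insensitive),
    sorted by artist, then album, then title."""
    def matches(track):
        return all(
            getattr(track, field).lower() == value.lower()
            for field, value in criteria.items()
        )

    return sorted(
        (t for t in tracks if matches(t)),
        key=lambda t: (t.artist, t.album, t.title),
    )


library = [
    Track("Track B", "Artist Two", "Album Y", "Rock"),
    Track("Track A", "Artist One", "Album X", "Rock"),
    Track("Track C", "Artist One", "Album X", "Country"),
]
for track in filter_tracks(library, genre="rock"):
    print(track.artist, "-", track.title)
```

Filtering on several fields at once, for example `filter_tracks(library, artist="Artist One", genre="country")`, narrows the list further.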
Another example of a metadata field that is useful for classifying audio files is a genre field. The genre field stores genre information associated with the audio files, such as whether a particular audio file encodes a classic rock song, a country song, etc. As described above, however, the genre lists used by automatic naming software are often too large and fine-grained for the user to differentiate among genres. For example, it may not be clear whether a particular song, such as Eric Clapton's “Layla,” should be classified as “rock,” “soft rock,” or “classic rock.” Automatic naming software generally supports more than 100 genre types. For example, Gracenote currently supports over 250 genre types. Furthermore, with Gracenote, there can be only one genre defined per audio file. That is, a particular song can be classified under “rock,” but not under both “rock” and “soft rock” simultaneously. Because there is no standard for encoding genre metadata, the user can easily become confused when using genre metadata as a way to browse or filter songs for play from a song library. In addition, in the context of in-vehicle applications, scrolling through a list of up to 255 genres can result in excessive head-down time and driver distraction, increasing the risk of accidents.
According to various embodiments, the vehicle entertainment system 100 employs a customizable, reduced genre set that is reclassified from the original genre metadata. Thus, for example, while the original genre metadata may represent 100 or more distinct genre labels, the reduced genre set represents a significantly smaller number, e.g., fewer than 20. By employing the reduced genre set, the vehicle entertainment system 100 may reduce the amount of time that a user spends scrolling through a listing of genre classifications. As a result, driver interaction with the user interface and the resulting potential for driver distraction may be reduced. In addition, locating individual audio files may be facilitated.
The vehicle entertainment system 100 reclassifies audio files using the reduced genre set by, for example, employing similarity matrix correlations or statistical classification techniques.
The original genre entries 162 can be correlated with the consolidated genre entries 172 in a number of ways.
Using the similarity matrix 180, genre reduction and reclassification are based on a set of correlation criteria. For example, the example similarity matrix 180 of
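As a minimal sketch of a similarity-matrix correlation, the code below assigns each original genre entry to the consolidated genre entry with which it has the highest similarity score. The specific correlation criteria are not reproduced here, so the scores, the 0.5 threshold, the "Other" fallback, and the `correlate` function are illustrative assumptions only.

```python
# Columns of the hypothetical similarity matrix: consolidated genre entries.
CONSOLIDATED = ["Rock", "Country", "Jazz"]

# Rows: original genre entries, with one similarity score per column.
SIMILARITY = {
    "soft rock":    [0.90, 0.30, 0.10],
    "classic rock": [0.95, 0.20, 0.10],
    "country rock": [0.60, 0.70, 0.00],
    "bebop":        [0.10, 0.00, 0.90],
    "spoken word":  [0.10, 0.10, 0.20],
}


def correlate(similarity, consolidated, threshold=0.5, fallback="Other"):
    """Map each original genre to its most similar consolidated genre.

    Entries whose best score is below the threshold go to the fallback.
    """
    mapping = {}
    for original, scores in similarity.items():
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            mapping[original] = consolidated[best]
        else:
            mapping[original] = fallback
    return mapping


for original, target in correlate(SIMILARITY, CONSOLIDATED).items():
    print(f"{original} -> {target}")
```

With these illustrative scores, "country rock" lands under "Country" because 0.70 exceeds 0.60, and "spoken word" falls through to the fallback because none of its scores reaches the threshold.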
Another way to correlate the original genre entries 162 with the consolidated genre entries 172 involves a cluster analysis.
A hierarchical tree dendrogram may be used to depict various levels of cluster solutions, which correspond to candidate sets of consolidated genre entries 172.
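The levels of a dendrogram can be illustrated with a small agglomerative clustering sketch. It starts with every genre in its own cluster and repeatedly merges the two most similar clusters, recording the cluster solution at each level. The average-linkage rule, the pairwise similarity values, and the function names are assumptions made for illustration, since this description does not fix a particular clustering procedure.

```python
from itertools import combinations


def cluster_levels(items, similarity):
    """Agglomerative clustering with average linkage.

    Returns one cluster solution per level, from len(items) clusters down to 1.
    Pairs missing from the similarity table are treated as having similarity 0.
    """
    def sim(a, b):
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    def linkage(c1, c2):
        return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [frozenset([item]) for item in items]
    levels = [list(clusters)]
    while len(clusters) > 1:
        # Merge the pair of clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        levels.append(list(clusters))
    return levels


def solution_with(levels, k):
    """Pick the cluster solution that contains exactly k clusters."""
    return next(level for level in levels if len(level) == k)


GENRES = ["rock", "soft rock", "classic rock", "country", "bluegrass"]
PAIR_SIMILARITY = {  # hypothetical values
    ("rock", "classic rock"): 0.90,
    ("rock", "soft rock"): 0.80,
    ("soft rock", "classic rock"): 0.70,
    ("country", "bluegrass"): 0.85,
    ("rock", "country"): 0.30,
    ("classic rock", "country"): 0.25,
    ("soft rock", "country"): 0.20,
}

levels = cluster_levels(GENRES, PAIR_SIMILARITY)
for cluster in solution_with(levels, 2):
    print(sorted(cluster))
```

Each entry in `levels` corresponds to one horizontal cut through the dendrogram, so choosing a smaller or larger `k` yields a coarser or finer candidate set of consolidated genre entries.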
For customized genre groupings, the user can select specific categories that can either be expanded into subcategories or collapsed into fewer categories. The user can also globally collapse or expand a genre set from an optimal number of categories, e.g., 18-20 in the example shown in
As demonstrated by the foregoing discussion, various embodiments may provide certain advantages, particularly in the context of vehicle entertainment systems in which the potential for driver distraction should be reduced. The vehicle entertainment system reduces the genre set to a manageable number of genre classifications, thereby reducing the amount of time that a user spends scrolling through a listing of genre classifications. As a result, driver interaction with the user interface and the resulting potential for driver distraction may be reduced. In addition, locating individual audio files may be facilitated.
It will be understood by those who practice the invention and those skilled in the art that various modifications and improvements may be made to the invention without departing from the spirit and scope of the disclosed embodiments. The scope of protection afforded is to be determined solely by the claims and by the breadth of interpretation allowed by law.