The present invention relates to a search device and associated methods, especially to a search device and associated methods using music emotions for retrieving, browsing and organizing music collections.
As storage technology improves, a single database can store thousands of pieces of music. Therefore, efficient methods of organizing and searching music are needed.
Research about organizing, browsing and searching music has been reported in the field of music emotion classification (MEC). MEC provides a more convenient and efficient way of organizing and searching music by using music emotions, rather than a conventional text-based search method.
A typical approach is to categorize music collections into a number of classes or taxonomies (such as happy, sad, angry or relaxing). Thus, a person can search, organize and browse the music collections without remembering a music title, an artist name or any keyword relating to the desired music.
However, such classification faces two challenges: the correct emotion of a piece of music is difficult to determine, and the description of each emotion varies from person to person.
The objective of a search device and associated methods in accordance with the present invention is to automatically sense a desired music emotion and automatically retrieve a corresponding piece of music for the particular desired music emotion sensed.
The search device and associated methods in accordance with the present invention use music emotions to browse, organize and retrieve music collections.
The search device comprises a processor and an interface. The processor uses machine learning techniques to determine music emotion according to music features and organizes music by emotions for browsing and retrieving music collections. The interface connects to the processor and allows a person to retrieve desired music from the processor.
Methods associated with the search device comprise a processor initialization method, a method of loading new music into the search device and several methods of retrieving desired music from the search device.
With reference to
The search device (1) may be a mobile phone, a computer or an MPEG-1 Audio Layer-3 (MP3) player and comprises a processor (10) and an interface (20).
The processor (10) organizes music by emotions for browsing and retrieving music collections and comprises a database (13), a feature extractor (11) and an emotional predictor (12).
The database (13) stores multiple music collections, music information and corresponding emotional values.
The feature extractor (11) extracts features of music from audio spectrums, frequencies, harmonic signal distributions, power distribution, loudness, level, pitch, dissonance, timbral texture, rhythmic content and pitch content.
The emotional predictor (12) predicts music's emotion using machine learning techniques based on extracted features and gives each musical selection two emotional values.
The emotional values comprise an arousal value (Ra) and a valence value (Rv).
The music information comprises music title, artist, lyrics, music type, language and record company.
The interface (20) connects to the processor (10) and comprises a platform (21). The interface (20) allows a person to input a description of emotion and retrieve the desired music from the database (13) based on the specified music emotion.
The platform (21) is a two-dimensional coordinate system and may be a flat panel touch display.
The coordinate system comprises multiple parallel longitudinal arousal sensors and multiple parallel valence sensors. The valence sensors intersect the arousal sensors, and the intersections define multiple emotional coordinates Ei (Ra, Rv).
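The mapping from a sensor intersection to an emotional coordinate may be sketched as follows. The sketch assumes an evenly spaced grid with valence along the columns and arousal along the rows; the function name `grid_to_coordinate` and the grid sizes are hypothetical and are not taken from the specification.

```python
def grid_to_coordinate(col, row, n_cols, n_rows):
    """Map the sensor intersection (col, row) to an emotional coordinate (Ra, Rv).

    Assumptions (not from the specification): sensors are evenly spaced,
    columns index valence, rows index arousal, and both axes span [-1.0, 1.0].
    """
    if n_cols < 2 or n_rows < 2:
        raise ValueError("grid needs at least two sensors per axis")
    if not (0 <= col < n_cols and 0 <= row < n_rows):
        raise ValueError("intersection lies outside the sensor grid")
    rv = -1.0 + 2.0 * col / (n_cols - 1)  # valence from the column index
    ra = -1.0 + 2.0 * row / (n_rows - 1)  # arousal from the row index
    return (ra, rv)
```

Under these assumptions, the centre intersection of a 5 by 5 grid corresponds to the neutral coordinate (0.0, 0.0).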
The desired music emotion may comprise the emotional coordinate that conforms to the emotional values, the optional music information or both.
With further reference to
The processor initialization method (3) uses machine learning techniques to build a data model for music emotion predictions and comprises acts of sampling (30), subjective testing (31), feature extraction (32) and setting data model (33).
The act of sampling (30) comprises choosing multiple clips of music from different fields and languages, converting the music clips to a uniform audio format (such as 22 kHz, 16-bit, mono waveform audio format (WAV)) and possibly trimming the music clips to 25 seconds for efficient sampling.
The act of subjective testing (31) surveys a group of people, records their emotional assessment of each music clip as two basic values and averages the basic values as a basis for the music emotion of each music clip.
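The downmixing and trimming parts of the act of sampling may be sketched as follows. The sketch assumes the clip is already decoded into a list of per-channel sample tuples, takes 22050 Hz as one reading of "22 kHz", and keeps the first 25 seconds, since the specification does not state which part of a clip is kept. Resampling and 16-bit quantization are omitted, and all function names are hypothetical.

```python
def to_mono(frames):
    """Average the channels of each frame; frames is a list of per-channel tuples."""
    return [sum(frame) / len(frame) for frame in frames]


def trim(samples, sample_rate, seconds=25):
    """Keep at most `seconds` of audio, counted from the start of the clip (assumption)."""
    return samples[: sample_rate * seconds]


def prepare_clip(frames, sample_rate=22050, seconds=25):
    """Downmix a decoded clip to mono and trim it to a uniform length."""
    return trim(to_mono(frames), sample_rate, seconds)
```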
Each basic value of the arousal point (Ra) and the valence point (Rv) of the emotional coordinates ranges from −1.0 to 1.0 ([−1.0, 1.0]).
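The averaging in the act of subjective testing may be sketched as follows. The sketch assumes each surveyed person supplies one (arousal, valence) pair per music clip and that a plain arithmetic mean is used; the function name `average_ratings` is hypothetical.

```python
from statistics import fmean


def average_ratings(ratings):
    """Average per-person (arousal, valence) pairs into one (Ra, Rv) for a clip."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for ra, rv in ratings:
        # Each basic value must lie in the range [-1.0, 1.0].
        if not (-1.0 <= ra <= 1.0 and -1.0 <= rv <= 1.0):
            raise ValueError("ratings must lie in [-1.0, 1.0]")
    return (fmean(r[0] for r in ratings), fmean(r[1] for r in ratings))
```

Because every input lies in [−1.0, 1.0], the averaged values also lie in that range.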
The act of feature extraction (32) extracts features of each music clip from the feature extractor and may apply a conventional spectral contrast algorithm, a conventional Daubechies wavelets coefficient histogram (DWCH) algorithm, a conventional PsySound program and a conventional Marsyas program.
The spectral contrast algorithm extracts features through audio spectrums, frequencies and harmonic signal distributions of the clipped music.
The Daubechies wavelets coefficient histogram (DWCH) algorithm extracts features by calculating the average, variance or skewness of the power distribution of the music clip.
The PsySound program extracts features of loudness, level, pitch and dissonance from the music clip.
The Marsyas program extracts features of timbral texture, rhythmic content and pitch content from the music clip.
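The average, variance and skewness mentioned for the DWCH algorithm may be illustrated by the following sketch. It computes only these three statistics for an arbitrary list of values and does not perform the wavelet decomposition or histogram construction of the DWCH algorithm itself; the function name `distribution_moments` is hypothetical.

```python
def distribution_moments(values):
    """Return (mean, variance, skewness) of a list of numbers.

    Uses population (divide-by-n) moments; skewness is the third central
    moment divided by the variance raised to the power 1.5.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        return (mean, 0.0, 0.0)  # a constant signal has no skew
    third = sum((v - mean) ** 3 for v in values) / n
    return (mean, variance, third / variance ** 1.5)
```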
The act of setting a data model (33) sets a data model for the emotional predictor and executes at least one conventional regression algorithm to formulate the prediction of music emotion into a regression function. The conventional regression algorithm may be selected from multiple linear regression (MLR), support vector regression (SVR) and the AdaBoost.RT algorithm.
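As one possible illustration of the regression step, the following sketch fits a multiple linear regression by solving the normal equations. It assumes that one such model is fitted for the arousal value and another for the valence value, which the specification does not state explicitly; the function names `fit_linear` and `predict` are hypothetical, and a production system would more likely rely on an existing numerical library.

```python
def fit_linear(X, y):
    """Least-squares weights (bias first) for y ~ w0 + w1*x1 + ... + wd*xd."""
    rows = [[1.0] + list(x) for x in X]  # prepend a constant 1 for the bias
    d = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y)
    M = [[sum(r[i] * r[j] for r in rows) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(d):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][d] / M[i][i] for i in range(d)]


def predict(w, x):
    """Evaluate the fitted regression function on one feature vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

In use, `fit_linear` would be called once with the averaged arousal values and once with the averaged valence values from the subjective test, giving two regression functions that together yield (Ra, Rv) for a feature vector.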
The method of loading new music (4) predicts emotional values of new music collections with the emotional predictor, senses the emotional coordinates of the desired music emotion and retrieves corresponding music. The method comprises acts of extracting music features (40), determining emotional values (41) and matching emotional values (42).
The act of extracting music features (40) extracts features of new music by using feature extraction (32).
The act of determining emotional values (41) predicts emotional values of music based on the data model and stores the emotional values in the database (13).
The act of matching emotional values (42) obtains the desired music emotion through the interface (20), selects the emotional values stored in the database (13) that are nearest to the desired music emotion and returns the corresponding music to the interface (20).
Therefore, music in the database (13) can be browsed and retrieved by giving the emotional coordinates Ei (Ra, Rv) of the desired music emotion to the interface (20), and the processor (10) will return the music nearest to the desired emotion.
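The nearest-emotion lookup may be sketched as follows. The sketch assumes that "nearest" means the smallest Euclidean distance in the (Ra, Rv) plane and that the database can be viewed as a mapping from music title to emotional values; the function name `nearest_music` and the parameter `k` are hypothetical.

```python
import math


def nearest_music(database, desired, k=1):
    """Return the k titles whose (Ra, Rv) lie closest to the desired coordinate.

    database: dict mapping title -> (Ra, Rv).
    Distance metric: Euclidean (an assumption; the text only says "nearest").
    """
    ranked = sorted(database, key=lambda title: math.dist(database[title], desired))
    return ranked[:k]
```

For example, with tracks stored at (0.8, 0.7), (−0.6, −0.5) and (0.1, 0.0), a desired emotion of (0.7, 0.9) returns the first track.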
Since the search device (1) automatically predicts the emotion of the music without any manual intervention, music collections are represented as emotional coordinates in a two-dimensional coordinate system.
A retrieval method using the search device (1) is selected from a query-by-emotion-point (QBEP) method, a query-by-emotion-trajectory (QBET) method, a query-by-artist-emotion (QBAE) method and a query-by-emotion-region (QBER) method.
The QBEP method comprises acts of emotional coordinate indication and music collection retrieval.
The act of emotional coordinate indication comprises indicating an emotional coordinate of desired music emotion from the platform.
The act of music collection retrieval comprises receiving the emotional coordinate and retrieving the music with the nearest emotional coordinate.
The QBET method comprises acts of emotional coordinate indication and music collection retrieval.
The act of emotional coordinate indication comprises sensing a trajectory of desired music emotion from the platform and quantizing emotional coordinates in the trajectory.
The act of music collection retrieval comprises retrieving corresponding music collections and generating a music playlist of the music collections.
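The QBET method may be sketched as follows. The sketch assumes that quantizing means snapping each sensed point to a grid of fixed spacing and dropping consecutive repeats, and that each quantized point contributes the nearest track not already in the playlist; the function names and the default spacing of 0.25 are hypothetical.

```python
import math


def quantize(trajectory, step=0.25):
    """Snap each (Ra, Rv) point to a grid of spacing `step`, dropping consecutive repeats."""
    points = []
    for ra, rv in trajectory:
        snapped = (round(ra / step) * step, round(rv / step) * step)
        if not points or points[-1] != snapped:
            points.append(snapped)
    return points


def playlist_from_trajectory(database, trajectory, step=0.25):
    """Build a playlist that follows the trajectory; database maps title -> (Ra, Rv)."""
    playlist = []
    for point in quantize(trajectory, step):
        remaining = [t for t in database if t not in playlist]
        if not remaining:
            break  # every track is already in the playlist
        playlist.append(min(remaining, key=lambda t: math.dist(database[t], point)))
    return playlist
```

A trajectory drawn from a calm region toward an excited region would then yield a playlist whose emotion changes gradually in the same direction.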
The QBAE method is a combination of the QBEP method and a conventional artist-based retrieval method, retrieves corresponding music collections of a particular artist with corresponding music emotions and comprises acts of desired music emotion determination, music collection retrieval and artist retrieval.
The act of desired music emotion determination may be selection of emotional coordinates or an artist.
If the desired music emotion is selected based on an artist, the act of music collection retrieval matches the artist with the desired music emotion in the database, and the matching emotional coordinates are displayed on the platform.
If the desired music emotion is selected based on emotional coordinates, the act of artist retrieval generates an artist list corresponding to the emotional coordinates.
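The two branches of the QBAE method may be sketched as follows. The sketch assumes each database record holds an artist name and an emotional coordinate, and that an artist corresponds to a coordinate when at least one of the artist's tracks lies within a fixed Euclidean radius of it; the function names, the record layout and the default radius of 0.3 are hypothetical.

```python
import math


def emotions_of_artist(database, artist):
    """Artist selected: return {title: (Ra, Rv)} for display on the platform."""
    return {title: rec["emotion"]
            for title, rec in database.items()
            if rec["artist"] == artist}


def artists_near(database, desired, radius=0.3):
    """Coordinate selected: return a sorted list of artists with a track near `desired`."""
    return sorted({rec["artist"]
                   for rec in database.values()
                   if math.dist(rec["emotion"], desired) <= radius})
```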
The QBER method retrieves multiple pieces of music to generate a music playlist in a corresponding region and comprises indicating emotional coordinates and retrieving a music collection.
The act of emotional coordinate indication comprises sensing a free region of a desired music emotion from the platform and quantizing the emotional coordinates in the region.
The act of music collection retrieval comprises retrieving corresponding music collections of the emotional coordinates and generating a music playlist of the music collections.
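The QBER method may be sketched as follows. The sketch assumes the free region sensed from the platform has been reduced to a closed polygon of emotional coordinates and uses a standard ray-casting test to decide which tracks fall inside it; the function names are hypothetical.

```python
def inside(point, polygon):
    """Ray-casting test: is `point` inside the closed polygon (list of vertices)?"""
    x, y = point
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit  # each crossing to the right toggles inside/outside
    return hit


def playlist_in_region(database, polygon):
    """Return every title whose emotional coordinate lies inside the region."""
    return [title for title, emotion in database.items() if inside(emotion, polygon)]
```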
By using the foregoing methods and the associated device, music of a particular emotion can be easily retrieved without knowing the music title or artist.
Number | Date | Country | Kind |
---|---|---|---|
097148087 | Dec 2008 | TW | national |