The invention concerns generally the technological field of furnishing terminal equipment of communication systems with selectable audio characteristics. Especially the invention concerns a method and arrangement for providing a large degree of selectability to individual users concerning ringing tones and other sounds emitted by their terminal equipment.
Portable terminals of cellular radio systems have conventionally been mobile telephones, but the development trend at the priority date of this patent application is towards more versatile terminal equipment with features from e.g. palmtop computers, telephones, positioning devices and personal digital assistants (PDAs). The conventional way of producing a ringing tone in a portable terminal is to use a buzzer which is optimized for efficiency in producing a high output sound pressure level. The buzzers that are most commonly used only accept a single square wave as an input waveform. A square input wave on a constant frequency gives rise to a monophonic output buzz with constant pitch. It is possible to play simple monophonic melodies with the buzzer by composing the input signal as a sequence of relatively short square wave trains. It is possible to use the loudspeaker of the mobile terminal to emit more versatile sounds, but in practice it may be difficult to obtain a reasonably high output sound pressure level without sacrificing compact size, efficiency in energy consumption and usability in the telephone mode.
Manufacturers have conventionally provided their mobile terminals with a selection of alternative ringing tones by storing a number of different buzzer input sequences into the terminal's memory. A user can select one of these preprogrammed tones by performing a simple programming step. Practical experience has shown that consumers are eager to personalize their mobile terminals according to their own taste, which has led to a phenomenal success of services that sell downloadable ringing tones. The known method of downloading a ringing tone from a network requires the user to send an SMS (Short Message Service) message to a certain ringing tone server coupled to the fixed parts of the cellular network, said message indicating the user's willingness to download a new ringing tone and preferably also identifying a particular melody which the user is interested in. The server responds with a specifically formatted SMS message that contains machine-readable instructions which the portable terminal can use to reproduce the ringing tone in question.
Although the selectability and downloading services described above have concentrated on ringing tones, it would be possible to use similar methods and arrangements to select personal tones or melodies for all occasions when the portable terminal emits an indicatory audio signal. Such occasions comprise but are not limited to indicator tones for key depressing, alarm sounds for battery depletion and other threatening events as well as amusing sounds for games.
The drawbacks of the prior art arrangements for providing selectability to portable terminals' audio characteristics are related to the limited sound reproduction capability on one hand and to the shortage of various resources on the other. With resources we mean the memory space and allocatable processing capability of the portable terminal itself as well as the allocatable transmission resources between the terminal and the fixed parts of the cellular radio network. We will illustrate the resource question with some examples.
At the priority date of this patent application one of the most popular ways of distributing arbitrary high quality audio sequences in electronic form is MP3 or MPEG-2 Layer 3 coded audio, where MPEG originally comes from Moving Picture Experts Group. The MP3 audio encoding is based on a method where an original audio sequence is recorded, digitized and compressed by performing a number of mathematical transformations on short consecutive frames of the digitized signal. One minute of MP3 encoded audio signal results in approximately 8 Mbits of data depending on the used compression rate. If we set the minimum temporal length of a ringing tone at ten seconds, a single melody would require about 1.3 Mbits of memory when stored. This is far too much regarding the limited amount of memory allocatable to ringing tones in known portable terminals. The downloading of such a ten-second audio sequence over the known GSM (Global System for Mobile telecommunications) digital cellular network at 9.6 kbit/s would take well over two minutes, which is unacceptable in terms of network loading and communication cost. Decoding an MP3 encoded bitstream into a form suitable for playback requires quite intensive processing.
At the priority date of this patent application there is one portable terminal on the market, known by the registered trademark “Nokia 9110 Communicator” of Nokia Corporation, that supports the playback of arbitrary audio tones encoded by Pulse Code Modulation or PCM. A typical 8-bit PCM encoded wave file that represents ten seconds of emitted signal with relatively low audio quality has the size of 640 kbits. Although this is considerably less than what is required by the MP3 encoded sequence, it is still too much for large-scale downloading.
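The storage and transfer figures quoted above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the MP3 bitrate of 128 kbit/s and the PCM sampling rate of 8 kHz are our assumptions, chosen because they reproduce the quoted figures of roughly 8 Mbits per minute and 640 kbits per ten seconds; the text itself does not state them.

```python
# Back-of-envelope check of the size and download-time figures in the text.
# ASSUMPTIONS (not stated in the text): 128 kbit/s MP3, 8 kHz mono PCM.

MP3_BITRATE_BPS = 128_000      # assumed MP3 bitrate, bits per second
PCM_SAMPLE_RATE_HZ = 8_000     # assumed PCM sampling rate
PCM_BITS_PER_SAMPLE = 8        # 8-bit PCM, as stated in the text
GSM_DATA_RATE_BPS = 9_600      # GSM data rate quoted in the text


def mp3_size_bits(seconds: int) -> int:
    """Size of an MP3 clip of the given length at the assumed bitrate."""
    return MP3_BITRATE_BPS * seconds


def pcm_size_bits(seconds: int) -> int:
    """Size of an uncompressed mono PCM clip of the given length."""
    return PCM_SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE * seconds


def download_seconds(size_bits: int, rate_bps: int = GSM_DATA_RATE_BPS) -> float:
    """Idealized transfer time, ignoring protocol overhead."""
    return size_bits / rate_bps


if __name__ == "__main__":
    for name, size in (("MP3", mp3_size_bits(10)), ("PCM", pcm_size_bits(10))):
        print(f"{name}: {size / 1000:.0f} kbit, "
              f"{download_seconds(size):.0f} s at 9.6 kbit/s")
```

Under these assumptions even the PCM file needs more than a minute of airtime for a ten-second tone, which is consistent with the conclusion that neither format suits large-scale downloading.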
It is an object of the present invention to provide a method and an arrangement for offering a wide variety of selectable audio characteristics to the users of terminal equipment with reasonable requirements concerning memory space, processing capability and transmission resources. It is a further object of the invention to provide compatibility of the method and arrangement with a large selection of terminal types and operating software. An additional object of the invention is to make it easy for the user to tailor the audio characteristics of terminal equipment according to personal taste.
The objects of the invention are achieved by presenting audio sequences in a form with a score information part and an instrument information part. The instrument information part contains synthesis parameters that define the timbre, or the synthesized sound or sequence of sounds. The score information part contains instructions that define the usage of the instrument information. Additionally there is provided compatibility information describing the compatibility of such audio sequences with known terminal capabilities.
The method according to the first embodiment of the invention is characterized in that it comprises the steps of
The method according to the second embodiment of the invention is characterized in that it comprises the steps of
The invention also applies to an apparatus which comprises a network device. It is characterized in that the network device comprises
According to the invention a service provider or a similarly acting other body maintains a database that comprises a plurality of sound packets. A sound packet is understood in this context as an entity that comprises a piece of musical score information and a set of parameters that relate to the “instruments” or synthesized sound sources which should be used to play the score. A sound packet is preferably self-contained in the sense that once it has been loaded into terminal equipment with appropriate processing and audio outputting capabilities, it enables the terminal to output a certain passage of audio signal where the synthesized sounds described by the parameters perform the presentation written into the score information. Said database contains also information about the compatibility of the stored sound packets with the capabilities of known terminal types. For downloading into a certain terminal equipment of known type only those sound packets are made available that do not exceed the terminal's capabilities.
The novel features which are considered as characteristic of the invention are set forth in particular in the appended Claims. The invention itself, however, both as to its construction and its method of operation, together with additional objects and advantages thereof, will be best understood from the following description of specific embodiments when read in connection with the accompanying drawings.
a illustrates an advantageous database arrangement,
b illustrates another advantageous database arrangement,
a illustrates a software tool for applying the invention,
b illustrates further software tools for applying the invention,
The idea of organizing a piece of music electronically into a score information part and a parameter or instrument information part is known as such. In the following we will first describe some known solutions of this kind.
Within the field of musical synthesizers there are known the concepts of patches and patch maps. Each stored synthesized instrument sound is designated with an associated patch number, and the table that correlates patch numbers with instruments is known as the patch map. One of the major standards controlling musical synthesizing and exchange of information related thereto between electronic devices is MIDI (Musical Instrument Digital Interface). It is possible to compose a piece of synthesized music with one synthesizer and transfer it in digital form into another synthesizer. The digital representation of the piece of music contains information about e.g. which patch number(s) should be associated with each individual “channel” or voice in a musical score. If a receiving synthesizer uses the same patch map as the one with which the piece was composed, it is able to play back the piece exactly as it was at the composing stage. Within MIDI the most commonly used standard for instrument mapping is known as the GM or General MIDI. Known extensions to it are known as XG, GS and GM 2.0.
None of these instrument mapping standards actually describes how the actual instrument voice should be produced. Known sound synthesis technologies are e.g. FM (Frequency Modulation), wavetable synthesis and physical modelling.
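The patch map concept described above can be pictured as a simple lookup table. The sketch below is a minimal illustration in Python: the four General MIDI program numbers are taken from the GM level 1 instrument list, whereas the second patch map, the function name and the example score are hypothetical and serve only to show why both synthesizers must share the same map.

```python
from typing import Dict

# Excerpt of the General MIDI level 1 instrument list (1-based program numbers).
GENERAL_MIDI_EXCERPT: Dict[int, str] = {
    1: "Acoustic Grand Piano",
    41: "Violin",
    57: "Trumpet",
    74: "Flute",
}

# HYPOTHETICAL patch map of a synthesizer that does not follow General MIDI.
OTHER_PATCH_MAP: Dict[int, str] = {
    1: "Church Organ",
    41: "Synth Pad",
}


def resolve_instruments(channel_patches: Dict[int, int],
                        patch_map: Dict[int, str]) -> Dict[int, str]:
    """Translate the patch number of each channel into an instrument name."""
    return {
        channel: patch_map.get(patch, "<unmapped>")
        for channel, patch in channel_patches.items()
    }


# A score only states which patch number each channel uses.
score_channels = {1: 1, 2: 41, 3: 74}

if __name__ == "__main__":
    print(resolve_instruments(score_channels, GENERAL_MIDI_EXCERPT))
    print(resolve_instruments(score_channels, OTHER_PATCH_MAP))
```

The same score resolves to different instruments, or to none at all, when the receiving device uses another patch map. The lookup also says nothing about how each instrument voice is actually synthesized.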
For downloading sounds that can be associated to patch numbers in a patch map a SoundFont® file format has been introduced by Creative Labs Corporation where a collection of 16-bit digital samples is associated with synthesis information required to articulate the digital signal in the audio domain. The MIDI Manufacturers Association or MMA has also introduced a sound sample downloading format known as Downloadable Sounds level 1 (DLS-1). Recently these sound downloading formats have been merged into a new standard known as DLS-2. It is also known as SASBF or Structured Audio Sample Bank Format within the MPEG-4 multimedia standard. Commercial implementations of DLS-2 do not exist at the priority date of this patent application.
Staccato Systems Inc. has introduced an audio technology known as SynthScript® Down Loadable Algorithms or DLA, which is based on physical modelling of instrument voices. A processing engine known as the SynthCore® is required to convert a SynthScriptDLA text file into playing music. The processing engine also supports the GM, XG and DLS-1 synthesis mechanisms referred to above.
Additionally there is known a musical data file format known as the Rich Music Format or RMF. It determines how a single file format can be used to incorporate all sample, performance and copyright information of a piece of music. The performance portion is based on the MIDI file model with some extended control functions.
Although the above-described methods and arrangements for representing audio sequences are known to the public at the priority date of the present patent application, they are not directly applicable to ringtone and other audio characteristics download services for portable terminals. In the following we describe the method and apparatus according to the invention, making use of the above-mentioned known concepts at appropriate points.
FIG. 1 illustrates the conceptual composition of a sound packet according to an advantageous embodiment of the invention. The sound packet 100 comprises a score information part 101 which may be regarded as a song book or music case that contains the notes which should be played and related synthesis instructions. The score information part may consist of score data subparts 102, 103 each of which comprises the score of a single song. Each score data subpart may further comprise sub-subparts each of which comprises the score of a single voice in that song. Additionally the sound packet comprises an instrument information part 104 which contains the instrument data, i.e. the parameters that a musical synthesizer needs to set up the “band” that should be used to play the score(s) contained in the score information part 101. These parameters are most advantageously organized into instrument data subparts 105, 106 so that each instrument data subpart defines a single instrument that may be used to play one or more of the voices defined by the score information subparts 102, 103.
Previously we have noted that the invention does not concern only the generation of ringing tones but it can be applied to the generation of other indicative audio signals as well. We may designate the latter class of voices generally as User Interface or UI sounds. In the embodiment of
Additionally
In order to facilitate the handling of sound packets it is advantageous to include into the sound packet structure a header part 121 which comprises general information like an identifier 122 of the sound packet, compatibility information 123 describing the compatibility of the sound packet with different known terminal types or just laying out some minimum allocatable resources (like processing capacity in MIPS and allocatable memory in kbits) required to use the sound packet, and copyright information 124 concerning the sound packet if applicable. The invention does not limit the contents of the header part 121.
A separate header part could also be included in each score information part 101, instrument information part 104, UI sounds part 107 and/or generic audio part 110, or even to every subpart and/or sub-subpart. Such header part could comprise e.g. specified copyright information and/or resource requirement information concerning only that part of the sound packet.
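The layered structure of a sound packet described above can be sketched as a set of nested records. The following Python dataclasses are a minimal illustration only: the reference numerals in the comments come from the text, while all class names, field names, the raw byte encoding and the size helper are our assumptions, since the invention does not fix any concrete format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Header:                                   # header part 121
    identifier: str                             # identifier 122
    compatible_terminal_types: List[str] = field(default_factory=list)  # 123
    min_mips: Optional[float] = None            # 123: minimum processing capacity
    min_memory_kbits: Optional[int] = None      # 123: minimum allocatable memory
    copyright_notice: Optional[str] = None      # copyright information 124


@dataclass
class ScoreData:                                # score data subpart 102, 103
    title: str
    voices: List[bytes] = field(default_factory=list)  # one sub-subpart per voice


@dataclass
class InstrumentData:                           # instrument data subpart 105, 106
    name: str
    synthesis_parameters: bytes = b""


@dataclass
class SoundPacket:                              # sound packet 100
    header: Header
    scores: List[ScoreData] = field(default_factory=list)            # part 101
    instruments: List[InstrumentData] = field(default_factory=list)  # part 104
    ui_sounds: List[bytes] = field(default_factory=list)             # part 107
    generic_audio: List[bytes] = field(default_factory=list)         # part 110

    def payload_size_bits(self) -> int:
        """Total size of all data parts in bits (header excluded)."""
        n_bytes = sum(len(v) for s in self.scores for v in s.voices)
        n_bytes += sum(len(i.synthesis_parameters) for i in self.instruments)
        n_bytes += sum(len(u) for u in self.ui_sounds)
        n_bytes += sum(len(g) for g in self.generic_audio)
        return 8 * n_bytes


example_packet = SoundPacket(
    header=Header(identifier="demo-001", min_memory_kbits=1),
    scores=[ScoreData(title="Demo song", voices=[b"ab", b"c"])],
    instruments=[InstrumentData(name="lead", synthesis_parameters=b"xy")],
    ui_sounds=[b"z"],
)
```

A per-part header, as suggested in the preceding paragraph, could be modelled by giving each of the inner records an optional `Header` field of its own.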
The sound packet approach illustrated in
The size of a sound packet 100 in bits, as well as the processing capability required to play back the piece of music described therein in the intended tempo, will depend heavily on the used synthesis technology, the accuracy and quality of the synthesized sounds, the diversity of the band or number of different instrument sounds, and the number of simultaneous voices, i.e. polyphony. It is possible to compose e.g. a very simple sound packet where only a single coarsely encoded instrument voice plays one or few notes, or an immensely complex sound packet where a doubled symphony orchestra with high-quality instrument voices performs a Wagner overture backwards in quadrupled tempo. The processing capacity required to decode and play back a sound packet is mostly determined by the degree of polyphony associated with the song to be played, i.e. the number of simultaneously playing voices.
It is a part of the invention that the resource requirements of a certain sound packet, and/or the known terminal equipment types with which it is compatible, are indicated in some way. Compatibility with a certain terminal equipment type means in this context that it is known that a normal terminal equipment of that type has enough allocatable memory and processing capability to download, store and play back that sound packet. Above we have noted that one way of indicating compatibility is to provide within the sound packet a header part where compatibility with known terminal types or the minimum amount of allocatable resources is explicitly recited. However, the compatibility information need not be an explicit part of the sound packet at all.
The invention does not limit the form of the score information part and the instrument information part, although it is regarded as advantageous to use a form taken from the above-mentioned existing standards. A score information part of a sound packet may be quite compact relative to the instrument information part. In practice, score information parts and instrument information parts are represented in different forms. It is possible e.g. to use the known SMS format, SAOL format or Csound score data format for scores, and a wavetable or physical modelling method for the instruments. It is also possible to use a common RMF or Rich Music Format file that encompasses both the score information part and the instrument information part.
a illustrates a structure of sound packets stored in a database schematically shown as 200. Said database is most advantageously maintained in a service provider's computer with fixed connections to a cellular radio network. The sound packets themselves 201, 202, 203, 204, 205 and 206 are most advantageously stored only once, i.e. only one copy (except for a potential back-up copy) of each sound packet appears in the database. In order to make only those sound packets available to a particular terminal type that are compatible with the allocatable resources in that terminal type the database or its associated handling functions comprise a terminal type selector block 210 as well as a number of terminal type blocks 211, 212 and 213. Each terminal type block is a collection of pointers where each pointer points to one sound packet which is known to be compatible with the terminal type in question. The idea behind this arrangement is that when a query is made to the database, it is first checked by the functions of block 210 whether the query comprises an indication of a particular terminal type. If such an indication is found, the appropriate terminal type block 211, 212 or 213 is called and the pointers in the called terminal type block are noted so that only those sound packets are made available for querying that are compatible with the terminal type in question. It is left to the discretion of eventual implementers to decide whether a query with no terminal type indication is answered by making no sound packets available, by making all sound packets available or in some other way. The invention does not limit the number of sound packets or terminal type blocks in the database, or the number of pointer connections between a terminal type block and sound packets.
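The pointer-based arrangement described above may be easier to follow as a small sketch. The Python class below is a hypothetical illustration: every packet is stored once, each terminal type block is a list of packet identifiers acting as pointers, and the query method plays the role of the terminal type selector. The class and method names, and the choice to answer a query without a terminal type with an empty list, are our assumptions; the text explicitly leaves that policy to the implementer.

```python
from typing import Dict, List, Optional


class SoundPacketDatabase:
    """Single-copy packet store with per-terminal-type pointer lists."""

    def __init__(self) -> None:
        self._packets: Dict[int, bytes] = {}         # packet id -> the one stored copy
        self._type_blocks: Dict[str, List[int]] = {}  # terminal type -> pointers

    def add_packet(self, packet_id: int, payload: bytes) -> None:
        self._packets[packet_id] = payload

    def mark_compatible(self, terminal_type: str, packet_id: int) -> None:
        """Add a pointer from a terminal type block to a stored packet."""
        if packet_id not in self._packets:
            raise KeyError(f"unknown sound packet {packet_id}")
        block = self._type_blocks.setdefault(terminal_type, [])
        if packet_id not in block:
            block.append(packet_id)

    def query(self, terminal_type: Optional[str] = None) -> List[int]:
        """Selector: return ids of the packets available to this query."""
        if terminal_type is None:
            return []  # ASSUMED policy; the text leaves this choice open
        return list(self._type_blocks.get(terminal_type, []))

    def fetch(self, packet_id: int) -> bytes:
        return self._packets[packet_id]


db = SoundPacketDatabase()
for pid in (201, 202, 203, 204, 205, 206):
    db.add_packet(pid, b"packet-%d" % pid)
db.mark_compatible("type-211", 201)
db.mark_compatible("type-211", 203)
db.mark_compatible("type-212", 203)
db.mark_compatible("type-212", 206)
```

Because the blocks hold only identifiers, packet 203 is shared by two terminal types without being stored twice.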
b illustrates an alternative database arrangement where a database 200′ again comprises a number of sound packets 201, 202, 203, 204, 205 and 206. Instead of a terminal type based selection arrangement the database or its associated handling functions comprise a compatibility wizard 220. When a query is made to the database, the compatibility wizard 220 checks whether the query comprises an indication of allocatable memory space and processing capability. If such indications exist, the compatibility wizard 220 checks from the known capacity requirements of the sound packets 201, 202, 203, 204, 205 and 206 which of them are within the limits set by the indicated allocatable memory space and processing capability. The compatibility wizard 220 then makes only those sound packets available for querying that are compatible with the indicated allocatable resources.
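The filtering performed by the compatibility wizard can be sketched as a comparison of each packet's known requirements against the limits given in the query. In the Python illustration below the record layout, the units (kbits and MIPS, borrowed from the header description) and the requirement values are hypothetical, as is the decision to return nothing when the query carries no resource indication.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PacketRequirements:
    packet_id: int
    memory_kbits: int   # memory needed to store the packet
    mips: float         # processing capacity needed for playback


def compatibility_wizard(packets: Iterable[PacketRequirements],
                         memory_kbits: Optional[int],
                         mips: Optional[float]) -> List[int]:
    """Return the ids of the packets that fit within the indicated resources."""
    if memory_kbits is None or mips is None:
        return []  # ASSUMED policy for queries without a resource indication
    return [
        p.packet_id
        for p in packets
        if p.memory_kbits <= memory_kbits and p.mips <= mips
    ]


# HYPOTHETICAL requirement figures for the six packets of the example database.
CATALOGUE = [
    PacketRequirements(201, memory_kbits=4, mips=0.5),
    PacketRequirements(202, memory_kbits=16, mips=1.0),
    PacketRequirements(203, memory_kbits=32, mips=2.0),
    PacketRequirements(204, memory_kbits=64, mips=2.0),
    PacketRequirements(205, memory_kbits=8, mips=6.0),
    PacketRequirements(206, memory_kbits=256, mips=10.0),
]
```

Unlike the terminal type blocks, this approach needs no per-type bookkeeping: a new terminal type is served as soon as it can state its allocatable memory and processing capability.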
Other arrangements than those in
The operation of the database 300 in
The database and function structure shown in
Previously we have noted that a score information part corresponds roughly to a song book, a score data subpart corresponds to a song in the song book and a score data sub-subpart corresponds to the notes of a single voice in the song. In a very versatile embodiment following the database architecture of
Within the embodiment of
It is possible to make the terminal type identification automatic in order to get rid of steps 403 to 406. The most straightforward way of doing this is to make the terminal send its type identification to the database already at step 402. The terminal type may be explicitly given, or the terminal may transmit for example its IMEI code (International Mobile Equipment Identity) or a corresponding code a part of which is the serial number of the terminal. The manufacturers usually apply some systematic scheme in assigning serial numbers to different terminal types, so it may be possible to arrange the database to compare the transmitted serial number to a simple table and deduce the terminal type according to the range of serial numbers into which the transmitted serial number falls. Another way of at least partly simplifying steps 403 to 406 is to make the database place its request 403 for the terminal type in such machine-readable form that the terminal does not need to bother the user with steps 404 and 405; the terminal could send its type-indicating answer 406 automatically.
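The table lookup that deduces the terminal type from a serial number range can be sketched as follows. This Python illustration uses a binary search over sorted, non-overlapping ranges; the ranges, the type names and the function name are entirely hypothetical, and the sketch does not attempt to parse a real IMEI code.

```python
import bisect
from typing import List, Optional, Tuple

# HYPOTHETICAL table: (first serial, last serial, terminal type),
# sorted by first serial, with no overlapping ranges.
SERIAL_RANGES: List[Tuple[int, int, str]] = [
    (100_000, 199_999, "type-A"),
    (200_000, 349_999, "type-B"),
    (500_000, 599_999, "type-C"),
]
_RANGE_STARTS = [first for first, _, _ in SERIAL_RANGES]


def terminal_type_from_serial(serial: int) -> Optional[str]:
    """Return the terminal type whose range contains `serial`, else None."""
    # Index of the last range that starts at or before the serial number.
    index = bisect.bisect_right(_RANGE_STARTS, serial) - 1
    if index < 0:
        return None
    _, last, terminal_type = SERIAL_RANGES[index]
    return terminal_type if serial <= last else None


if __name__ == "__main__":
    for serial in (150_000, 400_000):
        print(serial, terminal_type_from_serial(serial))
```

A serial number that falls into a gap between ranges yields no type, in which case the database could fall back to the explicit request of steps 403 to 406.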
In any case we assume that the database has become aware of the terminal type or otherwise specified limitations concerning allocatable capacity. At step 407 the database composes a selection list consisting of only those stored sound packets which are compatible with the indicated terminal type. At step 408 it sends the composed selection list to the terminal, which displays it to the user at step 409. The user makes his selection at step 410 and the terminal forwards it to the database at step 411. This triggers the actual downloading at step 412. The downloaded sound packet is stored into the memory of the terminal at step 413. If necessary, a previously stored sound packet is at the same time removed from the memory either automatically or after having asked the user for confirmation. The completion of the downloading is indicated to the user at step 414.
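The storing step 413, including the removal of previously stored sound packets, can be sketched as a small capacity-bounded store on the terminal side. The Python class below is a hypothetical illustration: the names, the oldest-first removal order and the confirmation callback are our assumptions. The callback stands for the option of asking the user for confirmation; the default removes packets automatically.

```python
from collections import OrderedDict
from typing import Callable, List


class TerminalPacketMemory:
    """Capacity-bounded store for downloaded sound packets (sizes in kbits)."""

    def __init__(self, capacity_kbits: int) -> None:
        self.capacity_kbits = capacity_kbits
        self._stored: "OrderedDict[str, int]" = OrderedDict()  # oldest first

    def used_kbits(self) -> int:
        return sum(self._stored.values())

    def stored_ids(self) -> List[str]:
        return list(self._stored)

    def store(self, packet_id: str, size_kbits: int,
              confirm: Callable[[str], bool] = lambda victim: True) -> bool:
        """Store a new packet, removing the oldest ones if room is needed.

        Assumes `packet_id` is not already stored. Nothing is removed unless
        every required removal is confirmed.
        """
        if size_kbits > self.capacity_kbits:
            return False  # can never fit, even into an empty memory
        victims: List[str] = []
        freed = 0
        for victim_id, victim_size in self._stored.items():
            if self.used_kbits() - freed + size_kbits <= self.capacity_kbits:
                break
            victims.append(victim_id)
            freed += victim_size
        if not all(confirm(v) for v in victims):
            return False
        for v in victims:
            del self._stored[v]
        self._stored[packet_id] = size_kbits
        return True
```

For example, with a capacity of 100 kbits, storing packets of 60, 30 and then 50 kbits removes the first packet to make room for the third, while a refused confirmation leaves the memory unchanged.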
In
On the basis of the method illustrated in
a and 5b give a schematic overview of the software tools that are required to implement an advantageous embodiment of the invention.
b illustrates some software tools that are mainly meant to run in a computer 510 rather than terminal equipment, although as the borderline between portable terminals of cellular radio systems and portable computers is getting blurred, this assumption is by no means limiting. A combiner/converter tool 511 is meant to be a basic tool for combining separate score files, instrument information and possibly separate UI sound sequences and generic audio files into sound packets. Conversions may be needed if the original files are in other formats than what are specified as the allowable information formats within a sound packet. The combiner/converter tool is most advantageously equipped with a compatibility unit that may refuse to let the user compose a certain sound packet if its memory or processing capacity requirements would be beyond the capabilities of a given terminal type or beyond explicitly given limiting values. At least the compatibility unit should be able to provide a completed sound packet with an identifier that either explicitly announces the suitability of the sound packet for certain terminal types or at least lays down the memory or processing capacity requirements thereof. It is assumed that using a combiner/converter tool 511 should not require specific musical expertise.
A composer tool or sequencer 512 also appears in
A personal digital assistant or PDA 609 may also be used to communicate a sound packet to the terminal equipment 601 by any means including but not being limited to data calls, infrared connections, LPRF connections and direct cable. The PDA 609 may have received the sound packet either directly from a database or from the devices 605, 606, 607 or 608 of the above-explained PC computer environment. Another possible sound packet communication channel is through a bidirectional TV/Set Top Box connection and a corresponding device 610. Naturally data calls, infrared connections, LPRF connections, direct cables and other means may be used to transfer sound packets from other portable terminals 611 or older mobile telephones 612.
The terminal equipment 701 also needs to comprise a processor 703 with its associated circuitry so that it is able to convert the digital information contained within a sound packet into an audio frequency signal that can be led to an acoustic transducer. The required processing capability is not exceptionally high if the previously explained file formats are used which have a lower degree of polyphony than e.g. the minimum polyphony of the GM-1 or GM-2 specification. The same applies to the memory 704: as long as the sound packet approach is used to guarantee that only that information needs to be stored that will actually be used for reproducing the desired acoustic functions, the memory technology of the priority date of this patent application suffices for implementing the required amount of memory into terminal equipment.
Finally the terminal equipment 701 needs to comprise an acoustic transducer 705 that is preferably more advanced than the monophonic square-wave driven buzzers of conventional mobile telephones. Constructing small-sized lightweight loudspeakers is not difficult as such, so it is merely a conventional engineering task to select a suitable transducer type and integrate it into the structures of the terminal equipment.
The architecture of the terminal equipment 701 must enable the communication of received information from the transceiver 702 to the processor 703 and further to the memory 704. Additionally the processor 703 must be able to read data from the memory 704 and to transmit it over the transceiver 702 to a cellular radio network. For emitting the audible signals represented in sound packets the processor 703 must be able to read stored sound packet data from the memory 704, to process it into an audio frequency signal and to direct the result to the transducer 705 for converting it into acoustic form. All these connections are easily implemented by a person skilled in the art.
We will conclude by discussing an alternative approach to the actual transmission of sound packets between a database coupled to a network and a number of terminals. Previously we have assumed that each downloading of a sound packet takes place at an explicit order from a certain terminal so that the sound packet is delivered to that terminal only. No actual limitations have been placed regarding the transmission channel, but there is certain implicit pointing towards point-to-point connections through cellular radio networks and/or packet-switched communication networks between computers. However, it is possible to arrange for a broadcast-type delivery of sound packets either so that a certain collection of sound packets is transmitted at certain intervals irrespective of whether some terminal has ordered a transmission or not, or so that each terminal has at least a limited opportunity of influencing the selection of sound packets that is available through broadcasting.
From the sound packet database 801 and the other content sources 802 there are connections to a multiplexing and channel encoding block 803 which is a part of a larger transmission station 804. Said multiplexing and channel encoding block 803 constructs a multiplexed transmission stream according to the employed standard(s), e.g. DVB, and feeds it into a broadcast transmitter 805, also known as the head-end. The multiplexed transmission stream is transmitted through a broadcast transmission channel 806 which may be e.g. a cable television network or a radio transmission system involving repeater stations in link masts and/or in satellites.
A terminal system 807 comprises a receiver 808 that is arranged to receive and at least partially decode the received multiplexed transmission stream. Partial decoding means in this context that the receiver may be able to decode one or few components of the multiplexed transmission stream even when it is unable to touch the other components. In this patent application we discuss the use of sound packets, so we may assume that the receiver and decoder block 808 is able to decode at least that part of the multiplexed transmission stream that contains the information originally obtained from the sound packet database 801. The decoded information is fed into a processor 809 and a memory 810, and based on this information the processor 809 is able to construct an audio frequency signal stream that is fed into the acoustic transducer 811 for outputting an acoustic signal. A receiving buffer may be needed between blocks 808 and 809.
Up to this point the arrangement of
It should be noted that the terminal system 807 need not be a single device. It can involve two or more devices like a cable television receiver with integrated set-top box features and a mobile telephone. The local communication connection between them may exploit one or several of the short-range communication technologies referred to in association with
A unidirectional embodiment of distributing sound packets through an arrangement according to
An even further alternative is to feed into the multiplexer and channel encoder block 803 such sound packets that include sounds from the movies or other programs that are currently coming from the other content sources block 802. This would require some kind of synchronization in the operation of blocks 801 and 802. It could be commercially very attractive if a user who is enthusiastically watching a new music video or box office hit movie from television could simultaneously download the theme songs and/or the characters' key lines (like the notorious “I'll be back!” from a known American action movie) into his terminal equipment to be used as ringing tones and other sounds by simply activating the local communication link between the terminal equipment and the television set.
In any case the sound packets will be multiplexed and channel encoded into the transmission stream so that basically the same selection of sound packets is available to every terminal system, or at least to every terminal system having similar capabilities. It is then the responsibility of the terminal system to screen the available selection of sound packets so that only compatible ones are presented as selectable options to the user, to perform the actual selection on the basis of user action and to store the selected sound packet to memory.
A simple “semi-bidirectional” embodiment of distributing sound packets through an arrangement according to
A more versatile and truly bidirectional arrangement could be such where the terminal system 807 and the sound packet database 801 conducted an initiation, terminal type identification and selection process like steps 401 to 411 in
An advantageous addition to the invention is the use of encryption to protect sound packets and/or their parts against illegal copying, editing or use after a predetermined time limit etc. The sound packets or their parts may be stored in the databases in already encrypted form, or the encryption may take place dynamically in association with the downloading to terminal equipment. The terminal equipment must naturally then be equipped with suitable decryption means. The use of encryption for protecting stored and/or transmitted pieces of digital data is known as such. The invention does not limit the nature or implementation of the encrypting-decrypting process.
Although we have in the foregoing discussed exclusively the possibility of storing audio-related presentation instructions to the score information parts, the invention may also be applied to the transfer of other kinds of presentation information, like MIDI-type control commands for lighting or synchronized karaoke words for the songs to be performed.
Number | Date | Country | Kind |
---|---|---|---|
19991865 | Sep 1999 | FI | national |
PCT/FI00/00737 | Aug 2000 | WO | international |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10070055 | May 2002 | US |
Child | 10963397 | Oct 2004 | US |