The invention relates to a method and a system for reducing the reception of unwanted messages by using feature patterns.
With the increasing spread of Internet telephony (voice over IP, or VoIP for short), it is expected that VoIP users will be increasingly exposed to so-called SPIT (SPAM over Internet Telephony). At present, advertising calls to conventional PSTN (Public Switched Telephone Network) users are normally charged to the caller. Calls to VoIP users, in contrast, can be made almost free of cost to the caller owing to the different charging model, which leads to the expectation of a massive SPIT influx in the future. The possibility of sending recorded voice files en masse, in particular, should be of interest to advertisers. It must be assumed that the VoIP users affected will demand suitable measures from their respective VoIP providers in order to be protected against unwanted calls.
Countermeasures against SPIT include, inter alia, so-called white lists and black lists. A white list contains, for a user X, user-specific information relating to those other users Y in the communication network who have been graded as trustworthy and are thus authorized to call user X. A black list, in contrast, contains user-specific information relating to those other users Y who have been graded as not trustworthy and are thus not authorized to call user X.
However, SPIT protection with the aid of white and black lists is ineffective in the case of an unknown user calling for the first time since the user-specific data of the unknown user cannot be contained either in a white list or a black list of the called user in this case.
It is also conceivable to classify a message as SPIT on the basis of its similarity to a message previously recognized as SPIT. If a message occurs in batches, this is also a strong indication of an unwanted message.
However, an exact comparison, for example a pure comparison at the level of the bit streams representing the messages, does not achieve this goal, since even a slight modification that is inaudible to the called party, for example due to recoding or an accidental delay at the beginning of the message, would lead to a difference between the messages compared.
The invention discloses a method and a system by which the reception of unwanted messages in a communication network is reduced.
One embodiment of the invention is a method for determining a feature pattern for a voice message, the voice message being present in the form of a numerically coded audio signal generated by sampling. The method comprises at least the following steps for determining the feature pattern on the basis of the numerically coded audio signal:
In a first step, non-voice portions of the audio signal are suppressed by filtering out irrelevant frequency ranges during an application of a suitable signal filter to the audio signal, particularly application of a bandpass filter.
In a second step, a mapping rule (SQR) is applied for mapping all elements of the numerically coded audio signal into the range of the positive numbers.
In a third step, a sampling rate of the audio signal, characterizing the sampling, is adapted.
In a fourth step, the new range of values, produced by the adaptation of the sampling rate, of all elements of the numerically coded audio signal is normalized with respect to a maximum value and a mean value.
The invention also relates to a system for carrying out the method represented and to devices and a corresponding communication network.
The invention entails the advantage that the reception of unwanted messages is reduced.
An example of the embodiment of the invention is represented in the drawings and will be described in greater detail in the text which follows.
According to the invention, a feature pattern FP is determined for a message M. In this context, the message M is a voice message in a communication network, for example a Voice over IP communication network. The message M is available in the form of a numerically coded audio signal generated by sampling. The method according to the invention is characterized by a plurality of steps during which the feature pattern FP is determined on the basis of the numerically coded audio signal. The determination of the feature pattern FP is irreversible here; the message M thus cannot be reconstructed from the feature pattern FP.
The feature pattern FP determined can be, for example, stored and/or transmitted to entities within or outside the communication network for further processing. It is also possible to compare the feature pattern FP determined with a second feature pattern FP of a second message M and to determine whether the two messages match one another in content.
Firstly, non-voice portions of the audio signal are suppressed in a first step by filtering out irrelevant frequency ranges during an application of a suitable signal filter to the audio signal. In this context, the application of a bandpass filter BPF is particularly advantageous since the bandpass filter BPF mainly leaves the frequency range relevant to voice unchanged but largely filters out non-voice portions.
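The first step can be illustrated by the following minimal sketch, which is an assumption and not the filter prescribed by the invention. It cascades a first-order high-pass stage and a first-order low-pass stage to form a simple bandpass filter. The function name `band_pass` and the cut-off frequencies of 300 Hz and 3400 Hz (a conventional telephony voice band) are hypothetical choices that do not appear in the description.

```python
import math


def band_pass(samples, rate_hz, low_hz=300.0, high_hz=3400.0):
    """Illustrative bandpass: first-order high-pass followed by first-order low-pass.

    low_hz and high_hz are assumed cut-off frequencies, not values from the text.
    """
    if not samples:
        return []
    dt = 1.0 / rate_hz
    rc_hp = 1.0 / (2.0 * math.pi * low_hz)   # time constant of the high-pass stage
    rc_lp = 1.0 / (2.0 * math.pi * high_hz)  # time constant of the low-pass stage
    a_hp = rc_hp / (rc_hp + dt)
    a_lp = dt / (rc_lp + dt)

    out = []
    hp_prev = 0.0
    lp_prev = 0.0
    x_prev = samples[0]  # start from the first sample so a constant input yields zero
    for x in samples:
        hp = a_hp * (hp_prev + x - x_prev)     # suppresses slow changes below low_hz
        lp = lp_prev + a_lp * (hp - lp_prev)   # suppresses fast changes above high_hz
        out.append(lp)
        hp_prev, lp_prev, x_prev = hp, lp, x
    return out
```

A first-order design has gentle slopes, so a practical system would probably use a steeper filter; the sketch only shows the principle of leaving the voice band largely unchanged while attenuating the rest.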
In a second step, a mapping rule SQR is applied for mapping all elements of the numerically coded audio signal (samples) into the range of the positive numbers. The mapping rule SQR is advantageously implemented, for example, as a squaring module or an absolute-value module: in the case of the squaring module, all elements of the numerically coded audio signal are squared; in the case of the absolute-value module, the absolute value is formed for all elements of the numerically coded audio signal.
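The two variants of the mapping rule named above can be sketched as follows. The function name `map_to_nonnegative` and the `mode` parameter are hypothetical and serve only as an illustration.

```python
def map_to_nonnegative(samples, mode="square"):
    """Map every sample to a value >= 0 by squaring or by taking the absolute value."""
    if mode == "square":
        return [x * x for x in samples]
    if mode == "abs":
        return [abs(x) for x in samples]
    raise ValueError("mode must be 'square' or 'abs'")
```

Squaring emphasizes loud passages more strongly than the absolute value does, which may be one consideration when choosing between the two variants.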
In a third step, a sampling rate of the audio signal, characterizing the sampling, is adapted by means of an addition module AS. The addition module AS in each case incrementally combines a set of elements of the numerically coded audio signal, resulting in an altered sampling rate of the audio signal. The number n of samples combined per second is adjustable.
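A minimal sketch of such an addition module is given below, under the assumption that consecutive, non-overlapping groups of samples are summed. The name `sum_blocks` and the parameter `block_size` are hypothetical; discarding an incomplete group at the end of the signal is likewise an assumption.

```python
def sum_blocks(samples, block_size):
    """Sum consecutive groups of block_size samples into one output value each.

    The output sampling rate is the input rate divided by block_size.
    An incomplete trailing group is discarded (an assumption of this sketch).
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    full_blocks = len(samples) // block_size
    return [
        sum(samples[b * block_size:(b + 1) * block_size])
        for b in range(full_blocks)
    ]
```

For an input sampled at 8000 Hz, for example, a `block_size` of 800 would yield 10 values per second of audio.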
In a fourth step, the new range of values, produced by the adaptation of the sampling rate, of all elements of the numerically coded audio signal is normalized with respect to a maximum value and a mean value by means of a normalizer RA. The normalizer RA preferably performs a linear transformation of the samples of the audio signal in such a manner that a normalization to a maximum value of 1 and a mean value of 0 is carried out.
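One linear transformation that satisfies both conditions is y = (x − mean) / (max − mean). The sketch below uses this form as an assumption; the function name `normalize` and the handling of a constant signal are hypothetical.

```python
def normalize(samples):
    """Linearly rescale samples so that the maximum is 1 and the mean is 0."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    peak = max(samples)
    if peak == mean:
        # Constant signal: no scaling is possible, return all zeros (assumption).
        return [0.0 for _ in samples]
    return [(x - mean) / (peak - mean) for x in samples]
```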
Following the method shown, all modified elements of the numerically coded audio signal are output. The result of the method represented is a sequence of numbers between −1 and 1 which represents the feature pattern FP for the message M.
The sequence of steps represented above is variable and not restricted to the sequence shown. In particular, steps can be left out, reordered or carried out several times.
In a further embodiment of the invention, in an additional restriction step, the duration in time of the audio signal is restricted to a predetermined measure, wherein the restriction step can be carried out at any point in the method. The limiting of the length preferably occurs as early as possible in the sequence of steps in order to minimize the computing effort in the subsequent steps.
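The restriction step can be sketched as a simple truncation, assuming that the predetermined measure is given in seconds. The name `restrict_duration` and its parameters are hypothetical.

```python
def restrict_duration(samples, rate_hz, max_seconds):
    """Keep at most max_seconds of audio, counted from the start of the signal."""
    max_samples = max(0, int(rate_hz * max_seconds))
    return samples[:max_samples]
```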
In a further embodiment of the invention, the DC portion of the audio signal is removed before the bandpass filter BPF is applied, the DC portion representing the long-term mean value of the audio signal.
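Since the DC portion is described as the long-term mean value, its removal can be sketched by subtracting the mean of the available samples. The function name `remove_dc` is hypothetical, and using the mean over the whole signal as the long-term mean is an assumption.

```python
def remove_dc(samples):
    """Subtract the mean value of the signal from every sample."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]
```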
For the comparison of a second feature pattern FP2 of a second message M2 with a first feature pattern FP1 of a first message M1, the cross correlation function c(k) of the two feature patterns is determined. This function c(k) is defined as follows for two data series s1(i) and s2(j), the two data series representing the samples of the first and of the second message, respectively:
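The defining formula is not reproduced in this text. A commonly used discrete form of the cross-correlation, given here as an assumption that may differ from the original in index convention or normalization, is:

$$c(k) = \sum_{i} s_1(i)\, s_2(i + k)$$

where the sum runs over all indices i for which both s1(i) and s2(i + k) exist, and k is the relative shift between the two data series.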
If one of the result values of the correlation function c(k) exceeds a predetermined threshold value, the messages are classified as identical. Otherwise, the messages are assessed as being nonidentical.
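The threshold decision can be sketched as follows. The sketch assumes the usual unnormalized discrete cross-correlation, in which s1(i) is multiplied by s2(i + k) and summed over the overlapping indices; the names `cross_correlation`, `is_same_message` and the `threshold` parameter are hypothetical, and no particular threshold value is implied by the description.

```python
def cross_correlation(s1, s2):
    """Return a dict mapping each shift k to c(k) = sum_i s1[i] * s2[i + k]."""
    result = {}
    for k in range(-(len(s1) - 1), len(s2)):
        total = 0.0
        for i, a in enumerate(s1):
            j = i + k
            if 0 <= j < len(s2):  # only overlapping positions contribute
                total += a * s2[j]
        result[k] = total
    return result


def is_same_message(fp1, fp2, threshold):
    """Classify two feature patterns as identical if any c(k) exceeds threshold."""
    c = cross_correlation(fp1, fp2)
    return bool(c) and max(c.values()) > threshold
```

Because every shift k is evaluated, a delay at the beginning of one message moves the position of the correlation peak but does not remove it.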
In a further embodiment of the invention, a continuous or a multi-step measure for the equality of two messages M1, M2 can be derived from the maximum value of c(k). In this context, a continuous measure for the equality can take an infinite number of intermediate values, whereas a multi-step measure has only a finite number of intermediate steps.
In a further embodiment of the invention, the ratio C1/C0 between the maximum C1 of the cross-correlation function c(k) and the maximum C0 of the autocorrelation function (the feature pattern of the first message M1 correlated with itself) can also be used for determining a measure for the equality of two messages M1, M2.
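The ratio of the cross-correlation maximum to the autocorrelation maximum can be sketched as follows, together with one possible way of turning the continuous ratio into a multi-step measure. All names (`max_correlation`, `similarity_ratio`, `similarity_level`) and the step boundaries 0.5 and 0.8 are hypothetical; an unnormalized discrete correlation is assumed.

```python
def max_correlation(s1, s2):
    """Maximum over all shifts k of sum_i s1[i] * s2[i + k]."""
    best = float("-inf")
    for k in range(-(len(s1) - 1), len(s2)):
        total = sum(
            s1[i] * s2[i + k]
            for i in range(len(s1))
            if 0 <= i + k < len(s2)
        )
        best = max(best, total)
    return best


def similarity_ratio(fp1, fp2):
    """Continuous measure: cross-correlation maximum divided by autocorrelation maximum."""
    c0 = max_correlation(fp1, fp1)
    if c0 <= 0:
        return 0.0  # empty or all-zero pattern: no meaningful ratio (assumption)
    return max_correlation(fp1, fp2) / c0


def similarity_level(ratio, bounds=(0.5, 0.8)):
    """Multi-step measure: number of assumed boundaries the ratio reaches."""
    return sum(1 for b in bounds if ratio >= b)
```

Dividing by the autocorrelation maximum makes the measure independent of the length and overall energy of the first feature pattern, so that a ratio close to 1 indicates a near-identical message.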
In a further embodiment of the invention, the threshold value predetermined with respect to the correlation function c(k) or the reference value for a multi-step classification can be determined from the auto- and cross-correlation functions of other messages stored in the system.
The method according to the invention is efficient since a feature pattern FP for a message M only contains a small amount of data. In this manner, the feature space based on a message M is greatly reduced. The small amount of data per feature pattern FP allows, for example, very efficient storage and/or retransmission of a feature pattern FP within a communication system. In contrast to a bit-by-bit comparison of messages M or a comparison of values derived directly from the audio signal of a message M such as, for example, hash values, the method according to the invention is also suitable for comparing messages which have been digitized independently of one another, for example after transmission by an analog voice network or recoding of the messages. Furthermore, the method according to the invention is insensitive to a certain degree of superimposed interfering noise in different variants of a message M. Messages M of equal or almost equal content can be recognized reliably and robustly. Messages whose content is in principle identical can be reliably recognized even when there are relatively small differences between the two messages M1, M2 such as, for example, a different form of address or the insertion of small individual portions into one of the messages M1, M2. The method thus makes it possible to determine that two messages M1, M2 carry the same voice information with high probability. The resulting size of the feature patterns FP1, FP2 can be influenced here by adapting the data rate and by limiting the length of the audio signal.
A further advantage of the invention is that, although a feature pattern FP1 for a message M1 is suitable for comparison with a second feature pattern FP2 for a second message M2, the original voice message can no longer be calculated back from a feature pattern FP1, FP2. Only in this way can the method also be used in a distributed analysis system in which feature patterns are transmitted in the communication network for the purpose of comparison without the receiver obtaining knowledge of the original voice message from them.
In one embodiment of the invention, the method according to the invention is carried out by a voice box server.
In a further embodiment of the invention, the method according to the invention is carried out by at least one client and at least one server in a communication network, wherein the client determines a feature pattern FP for a message M and wherein the server carries out the comparison of feature patterns FP for various messages M. Here, the client is, for example, a network-based voice box system or a terminal such as, for example, an answering machine. The server is provided, for example, by a network operator as part of an answering machine service. As an alternative, the server can also be offered by an independent operator.
Number | Date | Country | Kind |
---|---|---|---|
10 2006 032 543.5 | Jul 2006 | DE | national |
This application is a national stage application of PCT/EP2007/057266, filed Jul. 13, 2007, which claims the benefit of priority to German Application No. 10 2006 032 543.5, filed Jul. 13, 2006, the contents of which are hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2007/057266 | 7/13/2007 | WO | 00 | 12/23/2009 |