The present invention relates to methods and systems for classifying heart sounds recorded from a living subject into classes describing whether or not murmurs due to coronary artery stenosis are present in the heart sound.
Coronary artery disease is the single most common cause of death from cardiovascular disease in the western world. The heart muscle receives its blood supply through the coronary arteries, and atherosclerosis is the most common pathophysiologic process occurring in the coronary arteries giving rise to coronary artery disease (CAD). Atherosclerosis is a process that builds up plaques within the artery, and the blood flow can therefore be reduced or even blocked by the plaque. The constantly working heart requires a continuous and efficient blood supply in order to work properly. Defects in the blood supply may be very severe and even fatal. Increasing degrees of luminal diameter reduction or stenosis of the coronary artery will first limit reserve flow, then reduce flow at rest and may finally totally occlude the vessel.
There is a need for clinicians and other medical professionals to measure/detect coronary artery stenosis in order to diagnose CAD. Once a diagnosis has been made, treatment can be started.
Today several non-invasive techniques for measuring/detecting the severity of a stenosis or its presence inside a coronary artery exist. This can be done by magnetic resonance imaging (MRI), in vivo intravascular ultrasound (IVUS) or optical coherence tomography (OCT). However, the above-mentioned techniques are all rather complicated and expensive to use and therefore only patients with specific symptoms are offered such examinations. The consequence is that most patients have a critical stenosis when examined.
Clinicians and other medical professionals have long relied on auscultatory sounds to aid in the detection and diagnosis of physiological conditions. For instance, a clinician may utilize a stethoscope to monitor and record heart sounds in order to detect heart valve diseases. Furthermore, the recorded heart sounds could be digitized, saved and stored as data files for later analysis. Devices have been developed that apply algorithms to electronically recorded auscultatory sounds. One example is an automated blood-pressure monitoring device. Other examples include analysis systems that attempt to automatically detect physiological conditions based on the analysis of auscultatory sounds. For instance, artificial neural networks have been discussed as one possible mechanism for analyzing auscultatory sounds and providing an automated diagnosis or suggested diagnosis. Using these conventional techniques, it is difficult to provide an automated device for diagnosis of coronary stenosis using auscultatory sounds. Moreover, it is often difficult to implement the conventional techniques in a manner that may be applied in real-time or pseudo real-time to aid the clinician.
The object of the present invention is to solve the above-mentioned problems.
This is achieved by a method for classifying a cardiovascular sound recorded from a living subject. The method comprises the step of extracting at least two signal parameters from said cardiovascular sound, said at least two signal parameters characterize at least two different properties of at least a part of said cardiovascular sound. The method further comprises the step of classifying said cardiovascular sound using said at least two signal parameters in a multivariate classification method.
Hereby a simple method for classifying cardiovascular sounds is achieved, and the method is furthermore very robust since different properties of the cardiovascular sound are taken into account and used in a multivariate classification method. The cardiovascular sound related to turbulence consists of at least two components: a broadband component caused by turbulent blood flow colliding with the arterial wall and a narrow-band component related to the resonance frequency of the artery wall; different variables describing different properties are therefore needed in order to perform a robust classification. The different properties describe different characteristics of the cardiovascular sound and would therefore be uncorrelated and thus provide different information about the cardiovascular sound. Different properties could for instance be the time duration of the diastolic segment of the cardiovascular sound, the time duration of the systolic cardiovascular sound, the most dominant frequency component of the sound, the bandwidth of different frequency components, the energy in two frequency bands, the mobility of part of the signal, the complexity of the signal, the power ratio between different parts of the signal, e.g. two different segments or two different frequency bands, or morphological characteristics such as correlation ratios between different segments or amplitude change over time. The method could easily be implemented in any kind of data processor unit and could therefore e.g. be integrated in a software program which clinicians and doctors could use in order to classify the cardiovascular sound. Furthermore, the method could be integrated in a digital stethoscope, and the stethoscope could therefore be used to classify a patient's cardiovascular sound. Since doctors and other clinicians are familiar with a stethoscope, they could easily be taught to use the stethoscope to classify the cardiovascular sound.
The result is that the classification could assist the doctor or other clinicians to diagnose whether or not the patient suffers from CAD.
In another embodiment of the method, at least one of said at least two signal parameters is a frequency parameter describing a property in the frequency domain of at least a part of said cardiovascular sound. Hereby the frequency components of the cardiovascular sound could be used as a parameter in the multivariable classification method. Frequency parameters are very good parameters for classifying whether or not murmurs due to stenosis are present in a cardiovascular sound because the stenosis would change the frequency components of the cardiovascular sound.
In another embodiment of the method, at least one of said at least two signal parameters describes a property in the time domain of at least a part of said cardiovascular sound. Hereby time properties of the cardiovascular sound could be used as a parameter in the multivariable classification method. Time properties like the mobility or number of turning points are good indicators of whether or not murmurs due to stenosis are present in the cardiovascular sound. Furthermore, by using both time and frequency parameters a very robust classification of the cardiovascular sound is achieved since time and frequency properties are often uncorrelated.
In another embodiment of the method, at least one of said frequency parameters is a frequency level parameter describing a frequency level property of at least a part of said cardiovascular sound. Hereby it is achieved that a frequency level property of the cardiovascular sound is used in the multivariable classification method. The murmurs would typically change the frequency level of the cardiovascular sound, and by using parameters describing the frequency level of the sound a robust classification of the cardiovascular sound could be achieved.
In another embodiment of the method, at least one of said at least two signal parameters is a frequency bandwidth parameter describing a frequency bandwidth property of at least a part of said cardiovascular sound. Hereby the bandwidth of, for instance, dominating frequency components could be used in the multivariable classification method. The advantage of using a frequency bandwidth property of the cardiovascular sound is that murmurs often have a limited frequency bandwidth, and the frequency bandwidth parameter would therefore be a good indicator of whether or not murmurs due to stenosis are present in the cardiovascular sound.
In another embodiment of the method, at least one of said frequency level properties characterizes the most powerful frequency component of at least a part of said cardiovascular sound. This parameter is very useful, as the murmurs due to stenosis typically have a dominating frequency component between 200 and 800 Hz, and if the most powerful frequency component is inside this interval, it would be a good indication of the presence of murmurs due to stenosis.
In another embodiment of the method, at least one of said frequency bandwidth properties characterizes the bandwidth of the most powerful frequency component of at least a part of said cardiovascular sound. Hereby the bandwidth of the most powerful frequency component could be used in the multivariable classification method. This bandwidth would most likely depend on whether or not murmurs due to stenosis are present in the cardiovascular sound.
In another embodiment of the method, at least one of said time parameters is a property characterizing the mobility of at least a part of said cardiovascular sound. The mobility is a good indicator of whether or not murmurs due to stenosis are present in the cardiovascular sound. The mobility describes the variance of the sound, and since murmurs would cause larger variance in the sound the mobility would be a good indicator.
In another embodiment of the method, the method further comprises the step of dividing said cardiovascular sound into at least one sub-segment and at least one of said signal parameters is extracted from said at least one sub-segment. Hereby it is achieved that the cardiovascular sound could be divided into sub-segments, e.g. into a systolic part and a diastolic part. Thereby relevant sub-segments could be used to extract the above-described different parameters.
In another embodiment of the method, the method further comprises the step of modelling at least a part of said cardiovascular sound and at least one of said signal parameters is extracted from said model. Hereby time models and frequency models of the cardiovascular sound or sub-segments of the sound could e.g. be used to extract the above-described parameters. The advantage of using models is that the models could enhance the signal properties, e.g. by using an envelope function or an autoregressive model. Furthermore, models would simplify and optimize the calculation process when the method is implemented in a data processor.
In another embodiment of the method, the multivariate classification method is a discriminant function. Hereby a simple and fast implementation of the classification method is achieved. Furthermore, any number of parameters could be used in the discriminant function, and the different parameters could also be weighted differently depending on the parameters' significance. The discriminant function could also be trained using cardiovascular test sounds recorded from patients suffering from stenosis and healthy patients. Thereby the weights of the different parameters could be optimized to experimental data.
The invention further relates to a system for classifying a cardiovascular sound recorded from a living subject, said system comprising processing means for extracting at least two signal parameters from said cardiovascular sound, said at least two signal parameters characterizing at least two different properties of at least a part of said cardiovascular sound, and processing means for classifying said cardiovascular sound using said at least two signal parameters in a multivariate classification method. Hereby a system for classifying a cardiovascular sound can be constructed and hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means for extracting at least two signal parameters from said cardiovascular sound is adapted to extract at least one frequency parameter describing a property in the frequency domain of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means for extracting at least two signal parameters from said cardiovascular sound is adapted to extract at least one time parameter describing a property in the time domain of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means adapted to extract at least one of said frequency parameters are further adapted to extract at least one frequency level parameter describing a frequency level property of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means adapted to extract at least one frequency parameter is further adapted to extract at least one frequency bandwidth parameter describing a frequency bandwidth property of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means adapted to extract at least one frequency level property is further adapted to extract the most powerful frequency component of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means adapted to extract at least one of said frequency bandwidth properties are further adapted to extract the bandwidth of the most powerful frequency component of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said processing means for extracting at least one time parameter are further adapted to extract the mobility of at least a part of said cardiovascular sound. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said system further comprises processing means for dividing said cardiovascular sound into at least one sub-segment and at least one of said signal parameters is extracted from said at least one sub-segment. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said system further comprises processing means for modelling at least a part of said cardiovascular sound and in that said processing means for extracting at least two signal parameters from said cardiovascular sound are further adapted to extract at least one of said parameters from said model. Hereby the same advantages as described above are achieved.
In a further embodiment of the system, said multivariate classification method used by said processing means for classification of said cardiovascular sound is a discriminant function. Hereby the same advantages as described above are achieved.
The invention further relates to a computer-readable medium having stored therein instructions for causing a processing unit to execute a method as described above. Hereby the same advantages as described above are achieved.
The invention further relates to a stethoscope comprising recording means adapted to record a cardiovascular sound from a living subject, storing means adapted to store said recorded cardiovascular sound, a computer-readable medium and a processing unit, said computer-readable medium having stored therein instructions for causing said processing unit to execute a method according to claims 1-12 and thereby classify said recorded cardiovascular sound. Hereby the method according to the present invention can be implemented in a stethoscope and the above-described advantages are achieved.
The invention further relates to a server device connected to a communication network comprising receiving means adapted to receive a cardiovascular sound recorded from a living subject through said communication network, storing means adapted to store said received cardiovascular sound, a computer-readable medium and a processing unit, said computer-readable medium having stored therein instructions for causing said processing unit to execute a method as described above and thereby classify said received cardiovascular sound. Hereby the method according to the present invention can be implemented in a server connected to a communication network. The server could then perform the above-described method and the above-described advantages are achieved.
In another embodiment of the server, said receiving means are further adapted to receive said cardiovascular sound from a client connected to said communication network. Hereby a clinician/doctor could send a cardiovascular sound to the server using a client device such as a laptop. The server could thereafter classify the received cardiovascular sound. The above-described advantages are hereby achieved.
In another embodiment of the server, the server device further comprises means for sending said classification of said cardiovascular sound to at least one client unit connected to said communication network. Hereby the result of the classification can be sent back to a client, and the clinician/doctor can therefore receive the result of the classification. The above-described advantages are hereby achieved.
After the method has been initialized (301), the method receives the test signal (302) as a data file (303). The test signal would be the heart sound from a person (304) recorded and digitized into a data file, e.g. by a digital stethoscope (305). The test signal would be similar to the heart sound illustrated in
When the signal has been filtered (307), relevant segments are selected for further analysis. In one embodiment a part of the diastolic segment is selected for further analysis as the murmur due to stenosis is most likely to be audible in the diastolic segment.
A mathematical model of the signal in the selected segment is hereafter calculated/developed (308) using the sampled heart sound in the data file. The model is used to extract parameters that characterize the sound in the segment and could be used to categorize whether or not the murmurs due to stenosis exist in the sound segment. In the present embodiment an autoregressive all-pole parametric estimation (AR-model) is used to model the signal. In the AR-model the sampled sound signal, y, from the data file is modelled as a linear combination of M past values of the signal and the present input, u, driving the sound generating process. The model can be described by the following equation:
y(n)=A1y(n−1)+A2y(n−2)+ . . . +AMy(n−M)+u(n)
where M represents the model order, Ap the AR coefficients and n the sample number. The AR coefficients are determined through an autocorrelation and by minimizing the error associated with the model.
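By way of illustration only, the autocorrelation-based estimation of the AR coefficients described above could be sketched as follows in Python. The function name and the use of the Yule-Walker normal equations are illustrative assumptions, not part of the claimed embodiment:

```python
import numpy as np

def ar_coefficients(y, M=2):
    """Estimate AR coefficients A_1..A_M for the model
    y(n) = A_1*y(n-1) + ... + A_M*y(n-M) + u(n)
    from autocorrelation estimates (Yule-Walker normal equations)."""
    y = np.asarray(y, dtype=float)
    N = len(y)
    # Biased autocorrelation estimates r[0..M]
    r = np.array([np.dot(y[:N - k], y[k:]) / N for k in range(M + 1)])
    # Solve the normal equations R a = r[1..M], minimizing the model error
    R = np.array([[r[abs(i - j)] for j in range(M)] for i in range(M)])
    return np.linalg.solve(R, r[1:])
```

Applied to a sufficiently long segment generated by a stable AR process, the estimate converges towards the true coefficients.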
The AR model in this embodiment is used to extract frequency parameters describing the heart sound. A second order model M=2 is preferred because it makes a better separation between the frequency parameters extracted from a heart sound with murmurs present and the frequency parameters extracted from a heart sound without murmurs present.
Thereafter different parameters are extracted (309) from the sampled signal and the AR model using signal processing techniques. Some parameters could be extracted from the selected segments. Each parameter characterizes the heart sound in the selected segments and could therefore be used to categorize the heart sound, e.g. whether or not murmurs due to stenosis are present in the heart sound. The parameters can in this embodiment be the number of turning points per signal length, TP; the mobility of the signal, MB; pole magnitude, PM; normalized AR-peak frequency, NF; and AR spectral ratio, SR.
The number of turning points TP is extracted from the sampled signal in the time domain, and it is found by calculating the number of turns the signal performs in the time domain per unit time. This could be done by determining the number of local maxima in a time period. Thus:

TP=(number of local maxima)/N

where N is the number of samples in the segment.
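As an illustrative sketch (not part of the claimed embodiment), the local-maxima count per signal length could be computed as follows; a turn is taken to be a sample where the first difference changes sign from positive to negative:

```python
import numpy as np

def turning_points(y):
    """TP: number of local maxima per sample of the segment y."""
    y = np.asarray(y, dtype=float)
    d = np.diff(y)
    # A local maximum: slope positive before the sample, negative after
    maxima = np.count_nonzero((d[:-1] > 0) & (d[1:] < 0))
    return maxima / len(y)
```

A 10 Hz sinusoid sampled over one second, for example, yields ten local maxima.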
The mobility MB is extracted from the sampled signal in the time domain and found by calculating the variance, σy, of the signal in the time domain and the variance of the signal's first derivative, σy′. The mobility is hereafter found by:

MB=σy′/σy
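A minimal sketch of the mobility computation, assuming the conventional Hjorth definition (square root of the ratio of the variance of the first difference to the variance of the signal); the exact normalization in the embodiment may differ:

```python
import numpy as np

def mobility(y):
    """MB: Hjorth-style mobility of the segment y, using the first
    difference as a discrete approximation of the derivative."""
    y = np.asarray(y, dtype=float)
    return np.sqrt(np.var(np.diff(y)) / np.var(y))
```

As expected from the definition, a rapidly varying signal has higher mobility than a slowly varying one of the same length.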
The pole magnitude PM is found by transforming the AR-model into the z-domain and calculating the magnitude of the poles in z-domain described by the AR-spectrum.
The normalized AR peak frequency NF is based on the assumption that murmurs due to stenosis are more likely to be found in the diastolic segment than in the systolic segment. The NF is found by calculating the angle of the poles in the AR-spectrum in the z-plane and transforming this into a frequency of both a diastolic segment and a systolic segment. If the absolute difference between the two is less than 25 Hz, which is typical in cases where no murmurs due to stenosis are present, then 25 Hz is subtracted from the diastolic peak frequency. If the average diastolic frequency is more than 50 Hz greater than the average systolic peak frequency, which is typical when murmurs due to stenosis are present, then 25 Hz is added to the average peak diastolic frequency.
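By way of illustration only, the pole magnitude PM and the AR-peak frequency obtained from the pole angle could be extracted from a second order model as sketched below; the ±25 Hz normalization adjustments described above are omitted, and the function name and sampling rate are illustrative:

```python
import numpy as np

def pole_features(a, fs):
    """PM and peak frequency from AR(2) coefficients a = (A1, A2),
    for the model y(n) = A1*y(n-1) + A2*y(n-2) + u(n).
    The poles are the roots of z^2 - A1*z - A2 in the z-plane."""
    poles = np.roots([1.0, -a[0], -a[1]])
    p = poles[np.argmax(np.abs(poles))]            # dominant pole
    pm = np.abs(p)                                 # pole magnitude PM
    freq = np.abs(np.angle(p)) * fs / (2 * np.pi)  # pole angle -> Hz
    return pm, freq
```

Placing a conjugate pole pair at radius 0.9 and an angle corresponding to 300 Hz recovers exactly those values.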
The AR spectral ratio SR is found by calculating the ratio of the energy in the frequency range 200-500 Hz to the energy in the frequency range 500-1000 Hz of a diastolic segment.
The extracted parameters are thereafter used in a multiparametric discriminant function in order to classify whether or not the sound segment contains murmurs due to stenosis (310). In this embodiment a linear discriminant function is used to classify the sound segments. The linear discriminant function combines weighted features into a discriminant score g(x) and could be described by:
g(x)=w1x1+w2x2+w3x3+ . . . +wkxk+w0=wTx+w0 [3.4]
where x is the feature vector consisting of the extracted parameters, k represents the number of features, i represents the classes and w is a weight vector that holds the discriminant coefficients. In the case where only two classes must be separated, a single discriminant function is used. A two-class classifier is called a dichotomizer. A dichotomizer normally classifies the feature vectors with the decision border g(x)=0 (due to the constant w0). If the discriminant score g(x) is greater than zero, the segment is assigned to class 1; otherwise it is assigned to class 2. Since g is a linear function, g(x)=0 defines a hyperplane decision surface, dividing the multidimensional space into two half-spaces. The discriminant score g(x) is the algebraic distance to the hyperplane. The discriminant function needs to be trained in order to find the weight values, w, and make a safe and robust classification of the sound segments. The training procedure needs to be performed before using the system, and its purpose is to find the optimal weight values of w so that the hyperplane separates the feature vectors optimally. The training procedure is in one embodiment carried out by using 18 test sounds recorded from 18 test persons, where nine test persons have coronary stenosis and nine test persons do not. The discriminant training procedure is performed using the statistical software program SPSS v.12.0 for Windows (SPSS Inc., Chicago, Ill., USA). The above-mentioned parameters are extracted from the 18 training sounds and used as statistical inputs to the software program. The resulting discriminant could be:
g(x)=164.709MB−0.061NF−78.027PM+27.188SR+91.878TP+33.712 [3.5]
where MB is the mobility of the signal, NF the AR-peak frequency, PM the pole magnitude, SR the AR spectral ratio and TP the number of turning points.
If the result of the discriminant function is larger than zero (g(x)>0) then the sound segment does not contain murmurs due to stenosis (312). On the other hand, if the discriminant function is smaller than zero (g(x)<0) then the sound segment contains murmurs due to stenosis (311).
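The scoring and decision rule of equations [3.4] and [3.5] could be sketched as follows; the decimal commas in the printed equation are read as decimal points, which is an assumption on our part:

```python
def discriminant_score(MB, NF, PM, SR, TP):
    """Linear discriminant score g(x) per equation [3.5]."""
    return (164.709 * MB - 0.061 * NF - 78.027 * PM
            + 27.188 * SR + 91.878 * TP + 33.712)

def classify(MB, NF, PM, SR, TP):
    """g(x) > 0: no murmurs due to stenosis; otherwise murmurs present."""
    g = discriminant_score(MB, NF, PM, SR, TP)
    return "no murmurs" if g > 0 else "murmurs"
```

With all parameters at zero the constant term dominates and the segment falls on the "no murmurs" side of the hyperplane; a sufficiently high peak frequency alone drives the score negative.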
The discriminant function could by a person skilled in the art easily be adjusted to include additional or fewer parameters in order to develop a proper discriminant function that can be used to classify the heart sound. Further parameters could for instance be:
The complexity, CP, of the sampled signal in the time domain. This parameter is based on the ratio of the mobility of the first derivative of the signal to the mobility of the signal itself:

CP=(σy″/σy′)/(σy′/σy)

where y″ is the second derivative of the filtered heart sound signal. The complexity measure is relatively sensitive to noisy signals since it is based on the second derivative.
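A sketch of the complexity computation, assuming the conventional Hjorth definition (mobility of the first derivative divided by the mobility of the signal); the normalization in the embodiment may differ:

```python
import numpy as np

def complexity(y):
    """CP: Hjorth-style complexity, built from first and second
    differences; sensitive to noise via the second derivative y''."""
    y = np.asarray(y, dtype=float)
    d1 = np.diff(y)   # approximates y'
    d2 = np.diff(d1)  # approximates y''
    mob = lambda s, ds: np.sqrt(np.var(ds) / np.var(s))
    return mob(d1, d2) / mob(y, d1)
```

For a pure sinusoid the mobility of the derivative equals the mobility of the signal, so the complexity is close to one.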
Further, the AR-peak frequency (PF) could be extracted and used in the discriminant function. The AR-peak frequency could be found by calculating the angle of the AR poles in the z-plane.
The parameters used in the discriminant function could be extracted from different segments of the heart sound, e.g. a number of different diastolic segments where a number of parameters is extracted from each diastolic segment. Thereafter an average value of each parameter could be calculated and used as input in the discriminant function.
a illustrates an embodiment of the system according to the present invention where a server (401) is programmed to execute the method described in
The system according to the present invention could also be implemented as an all in one digital stethoscope. The stethoscope would therefore automatically perform the analysis described in
The remaining diastolic segments are hereafter divided into sub-segments (502) with a duration of 37.5 ms or 300 samples. This is done because the blood flow in the coronary artery is not constant during a diastole, and the murmurs due to stenosis would therefore not be constant.
The variance of the signal in all sub-segments is then calculated, and the sub-segments with a variance larger than 1.3 times the median variance of all sub-segments are then discarded (503). Hereby sub-segments comprising high noise spikes are removed.
Thereafter (504) non-stationary sub-segments are removed. This is done by dividing each sub-segment into sub-sub-segments with a duration of 3.75 ms or 30 samples and then calculating the variance of each sub-sub-segment. Thereby an outline of the variance throughout the sub-segment is constructed. The variance of the outline is then calculated, and the sub-segment is removed if the variance of the outline is larger than 1.
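The two rejection steps above could be sketched as follows. The sampling rate of 8000 Hz is inferred from the stated 300 samples per 37.5 ms; the absolute outline-variance threshold of 1 is taken from the text and depends on the scaling of the recorded signal:

```python
import numpy as np

def select_subsegments(diastole, fs=8000, sub_ms=37.5, sub_sub_ms=3.75):
    """Split a diastolic segment into 37.5 ms sub-segments, then discard
    (a) sub-segments whose variance exceeds 1.3x the median variance
    (noise spikes) and (b) non-stationary sub-segments, judged from the
    variance of a 3.75 ms variance outline (threshold 1)."""
    n = int(round(fs * sub_ms / 1000.0))
    m = int(round(fs * sub_sub_ms / 1000.0))
    subs = [diastole[i:i + n] for i in range(0, len(diastole) - n + 1, n)]
    variances = np.array([np.var(s) for s in subs])
    med = np.median(variances)
    kept = []
    for s, v in zip(subs, variances):
        if v > 1.3 * med:
            continue  # high noise spike
        outline = [np.var(s[j:j + m]) for j in range(0, len(s) - m + 1, m)]
        if np.var(outline) > 1.0:
            continue  # non-stationary
        kept.append(s)
    return kept
```

For example, a recording of ten identical low-amplitude sub-segments with one amplified by a factor 100 keeps the nine clean sub-segments and discards the spike.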
At this point a number of sub-segments have been discarded in order to remove noisy and non-stationary sub-segments. This would typically result in 30-50 sub-segments from a cardiovascular recording of approximately 10 seconds.
The remaining sub-segments are thereafter used in step (308) and (309) as described in
The purpose of the segmentation method is to classify the recorded heart sound into systolic, diastolic and noise segments. The illustrated method includes steps of noise reduction (602) followed by envelope creation (603). The noise reduction could be implemented as a high-pass filter followed by removal of high-amplitude friction noise spikes due to external noise, like movement of the stethoscope during recording, and thereafter a low-pass filter. The purpose of the envelope creation is to enhance the trend of the signal. The envelope is in this embodiment created by calculating the Shannon energy of the signal:
se(n)=−x(n)2·log x(n)2
where x is the signal and se is the Shannon energy. The high amplitude components in the signal are weighted higher than low amplitude components when calculating the Shannon energy. The envelope (613) of the heart sound (601) calculated using the Shannon energy is shown in the figure, and it can be seen that the heart sounds S1 and S2 are enhanced.
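A minimal sketch of the envelope creation, assuming the conventional Shannon energy definition with a minus sign (making the energy non-negative for amplitudes normalized to [-1, 1]) and a small guard term against log(0); both are assumptions on our part:

```python
import numpy as np

def shannon_envelope(x, eps=1e-12):
    """Shannon energy se(n) = -x(n)^2 * log(x(n)^2) of the signal x,
    after normalizing the amplitude to [-1, 1]."""
    x = np.asarray(x, dtype=float)
    x = x / (np.max(np.abs(x)) + eps)   # normalize amplitude
    return -x**2 * np.log(x**2 + eps)
```

The weighting property described above is visible directly: a mid-level amplitude of 0.5 receives a much larger Shannon energy than a low amplitude of 0.1, and the energy peaks at |x| = e^(-1/2) with the value 1/e.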
In order to classify the detected sounds into systolic segments, diastolic segments and noise components based on interval durations on either side of the heart sounds S1 and S2, it is necessary to know how long the intervals between S1's and S2's are. Therefore, the durations of the heart cycles (systolic and diastolic intervals) are extracted from an autocorrelation of the envelope (604). This process is described in detail in
Candidate S1's and S2's are then detected (605) using the time intervals extracted above and a threshold (614) on the envelope (613). To reduce the number of detected noise spikes, a minimum requirement is applied to the candidate segments, which effectively removes some of the erroneously detected noise spikes. In some recordings there is a big difference between the intensity of S1 and S2 sounds. This causes a problem since some of the low intensity sounds may be missed by the threshold. As a result the segmentation method performs a test for missing S1 and S2 sounds (606). If it can be determined that some segments are missing, the threshold procedure is rerun (607) using lower local thresholds.
Once the signal has been divided into segments as described above, interval parameters and frequency parameters for each segment are then extracted (608). The parameters aid in the classification of the sounds into systolic segments and diastolic segments.
The interval parameters are four Boolean parameters extracted for each sound by comparing the time duration to the previous sound and to the next sound with the time intervals extracted using the autocorrelation. The parameters are:
AfterSys, AfterDia, BeforeSys and BeforeDia.
The frequency parameter divides the sounds into low frequency and high frequency sounds by calculating the median frequency of the sound. This is useful information as the first heart sound is expected to be a low frequency sound and the second heart sound is expected to be a high frequency sound.
The parameters are passed into a Bayesian network where the probability of a segment being an S1, S2 or noise sound is computed (609). The figure illustrates a bar chart (615) of the probability calculated for each sound in the heart signal (601). Each sound would typically have one dominating probability indicating the type (S1, S2 or noise) of the sound. Thereby all sounds are classified into S1, S2 and noise sounds. However, the probabilities of the three types would in some cases be more or less equal, and in such cases it is not possible to classify the sound into an S1, S2 or noise sound using the Bayesian network.
The probabilities are used in the last step (610) to divide and verify the heart signal into systole and diastole segments. This is done by using the position of the identified S1 and S2 sounds to mark the beginning of a systolic and diastolic sound segment respectively.
The final result of the method (611) is the beginnings and ends of all identified systoles and diastoles. Therefore a “train” (616) of alternating systoles (617) and diastoles (618) can be created. Once the systoles and diastoles have been identified they can be used in further data handling, e.g. to extract further parameters from these segments and thereafter use the parameters to classify the medical condition of the recorded heart sound.
a illustrates the envelope autocorrelation with the normalized autocorrelation at the y-axis (NA) and the displacement (m) of the shifted envelope at the x-axis.
b illustrates the displacement (m1) when the shifted envelope (701) is displaced by the duration of the systole corresponding to the unshifted envelope (702). The y-axis shows the amplitude (A) of the envelope and the x-axis the time (t). The S1's in the displaced envelope are multiplied by the S2's in the unshifted envelopes resulting in the first peak (703) seen in the autocorrelation.
c illustrates the displacement (m2) when the shifted envelope (701) is displaced by the duration of the diastole corresponding to the unshifted envelope (702). The displaced S2's are multiplied by the S1's in the unshifted envelope resulting in the second peak (704) seen in the autocorrelation.
d illustrates the displacement (m3) when the shifted envelope (701) is displaced by the duration of the cardiac cycle corresponding to the unshifted envelope (702). The S1's in the displaced envelope are multiplied by the S1's in the unshifted envelope, and the S2's in the displaced envelope are multiplied by the S2's in the unshifted envelope. When this occurs the dominating peak (705) in the autocorrelation is produced.
The intervals between the heart sounds can therefore be found by measuring the distances between the peaks in the autocorrelation, as described above.
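The peak-based interval measurement described above can be sketched as follows. The helper function, the sampling rate and the synthetic impulse-train envelope are illustrative assumptions and not part of the original disclosure:

```python
import numpy as np

def autocorr_lags(envelope, fs):
    """Return the lags (in seconds) of local maxima of the envelope
    autocorrelation; the first peaks correspond to the systole,
    diastole and cardiac-cycle durations (m1, m2, m3 in FIG. 7)."""
    env = envelope - envelope.mean()
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]  # non-negative lags
    ac = ac / ac[0]                                          # normalized autocorrelation (NA)
    # local maxima, skipping lag 0
    peaks = [m for m in range(1, len(ac) - 1)
             if ac[m] > ac[m - 1] and ac[m] >= ac[m + 1]]
    return [m / fs for m in peaks]

fs = 100                            # Hz (assumed)
env = np.zeros(500)                 # 5 s synthetic envelope
for beat in range(5):
    env[beat * 100] = 1.0           # S1 once per 1.0 s cycle
    env[beat * 100 + 30] = 0.8      # S2 0.3 s after each S1 (systole)
lags = autocorr_lags(env, fs)
# peaks are expected near 0.3 s (systole), 0.7 s (diastole) and 1.0 s (cycle)
```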
P(a|b)=xc [8.1]
If the above equation describes the initial conditional probability, the posterior probability would be:
P(b|a)=xp [8.2]
According to Bayes' rule, the relation between the posterior probability and the conditional probability is:
P(b|a)=P(a|b)·P(b)/P(a) [8.3]
where P(a) is the prior probability for the event a, and P(b) is the prior probability for the event b. Equation [8.3] only describes the relation between one parent and one child, but since the event a can be the combination of several events {a1, a2, . . . , an}, the equation can be expanded to:
P(b|a1,a2, . . . ,an)=P(a1,a2, . . . ,an|b)·P(b)/P(a1,a2, . . . ,an) [8.4]
Since the goal is to find the probability for the different states of b when a1, a2, . . . , an are known, P(a1, a2, . . . , an) is just a normalizing constant k, and [8.4] can be simplified to:
P(b|a1,a2, . . . ,an)=k·P(a1,a2, . . . ,an|b)·P(b) [8.5]
If the child events (a1, a2 . . . an) are conditionally independent, equation [8.5] can be generalized to:
P(b|a1,a2, . . . ,an)=k·P(b)·P(a1|b)·P(a2|b) . . . P(aN|b) [8.6]
where N is the number of known events a. Equation [8.6] is useful in determining the probability of the event b if the states of all a events are known and all a events are conditionally independent. A Bayesian network based on equation [8.6] is called a naive Bayesian network because it requires conditional independence of the children.
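Equation [8.6] amounts to the standard naive-Bayes computation: multiply the prior for each state of b by the conditionals of the observed children, then normalize. A minimal sketch, with purely illustrative probability values and event names:

```python
def naive_bayes_posterior(priors, conditionals, evidence):
    """Posterior over the states of b given binary child events (eq. [8.6])."""
    scores = {}
    for state, prior in priors.items():
        p = prior
        for event, observed in evidence.items():
            p_true = conditionals[state][event]        # P(ai=true | b=state)
            p *= p_true if observed else (1.0 - p_true)
        scores[state] = p
    k = 1.0 / sum(scores.values())                     # normalizing constant k
    return {state: k * p for state, p in scores.items()}

post = naive_bayes_posterior(
    priors={"S1": 0.4, "S2": 0.4, "noise": 0.2},
    conditionals={"S1":    {"AfterSys": 0.90, "LowFreq": 0.86},
                  "S2":    {"AfterSys": 0.10, "LowFreq": 0.20},
                  "noise": {"AfterSys": 0.30, "LowFreq": 0.15}},
    evidence={"AfterSys": True, "LowFreq": True})
# the posterior mass concentrates on S1
```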
The task for the Bayesian network is to evaluate the type of each detected sound above the detection threshold. For each of these sounds, the posterior probability of being an S1 sound, an S2 sound or a noise component is calculated and the Bayesian network is constructed using one parent and five children. The parent is a sound above the envelope threshold (801), and the children are the five parameters described above: Frequency (802), AfterSys (803), AfterDia (804), BeforeSys (805) and BeforeDia (806). When determining the posterior probability for the type of a particular sound, the prior probability for the different states of a sound type P(S) and the conditional probabilities must be known, i.e. the conditional probabilities that “AfterSys” is in a given state when S is a given type, P(AfterSys|S). This posterior probability requires definition of P(S), P(AfterSys|S), P(AfterDia|S), P(BeforeSys|S), P(BeforeDia|S) and P(Frequency|S) before the equation [8.6] can be used to calculate the posterior probability of a sound being a particular type of sound.
The prior probability that a sound is an S1, an S2 or a noise component changes between recordings. In the optimal recording, where no noise components are detected, the prior probability for noise is zero, P(S=noise)=0. If this is the case and an equal number of S1's and S2's are detected, the prior probability that the detected sound is an S1 is 50%, and similarly for S2. Therefore, P(S=S1)=P(S=S2)=0.5 if P(S=noise)=0. However, this optimal condition cannot be assumed for real signals, where noise sounds will also be detected. This increases the prior probability that a given sound is noise.
The exact probability of a detected sound being noise, P(S=noise), can be defined if the number of detected noise sounds, Nnoise, and the total number of detected sounds, Nsounds, are known. For instance, if four noise sounds are detected, Nnoise=4, and the total number of detected sounds is 20, the probability that the sound being examined is a noise sound is P(S=noise)=4/20. However, in most signals Nnoise is unknown, and an estimate of Nnoise is therefore necessary. This estimate can be based on already available information, since the duration of a heart cycle is known from the envelope autocorrelation (804). The expected number of cardiac cycles in one recording, Ncycles, can therefore be calculated by dividing the length of the recording by the duration of a cardiac cycle. The combined number of S1's and S2's in a recording is then twice the number of cardiac cycles. The prior probability of the sound type would therefore be:
P(S=noise)=(Nsounds−2·Ncycles)/Nsounds [8.7]
and the prior probability that the detected sound is an S1 or an S2:
P(S=S1)=P(S=S2)=(1−P(S=noise))/2 [8.8]
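The prior estimate described above could be sketched as follows; the variable names and the clipping of the noise estimate at zero are assumptions, not part of the original disclosure:

```python
def sound_type_priors(n_sounds, rec_length, cycle_length):
    """Estimate P(S=S1), P(S=S2) and P(S=noise) for one recording."""
    n_cycles = rec_length / cycle_length           # cycle length from the envelope autocorrelation
    n_noise = max(0.0, n_sounds - 2.0 * n_cycles)  # detections beyond the expected S1's and S2's
    p_noise = n_noise / n_sounds
    p_s = (1.0 - p_noise) / 2.0                    # remaining mass split evenly between S1 and S2
    return {"S1": p_s, "S2": p_s, "noise": p_noise}

priors = sound_type_priors(n_sounds=20, rec_length=8.0, cycle_length=1.0)
# 8 cycles -> 16 expected S1/S2 sounds, so 4 of the 20 detections are presumed noise
```

This reproduces the worked example in the text: 4 presumed noise sounds out of 20 detections gives P(S=noise)=4/20.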
The conditional probability that an S1 is followed by an S2 sound after an interval corresponding to the duration of a systole, P(AfterSys|S=S1), depends on several factors. The S1 sounds will normally be followed by S2 sounds after an interval of duration equal to the systole. Deviations from this can occur, e.g. when S1 is the last sound in the recording, or if S2 is missing because it does not exceed the detection threshold. It may also occur that a sound is nevertheless detected after a weak (below-threshold) S2, because noise occurs in the tolerance window associated with those sounds. The probability that "AfterSys" is false if the sound is an S1 sound may thus be calculated as
P(AfterSys=false|S=S1)=P(EndSound∪(SingleSound∩¬NoiseInWin)) [8.9]
where “EndSound” is an event describing that the sound is the last sound in the recording, “SingleSound” describes that S1 is not followed by an S2 because the next S2 sound is not detected due to sub-threshold amplitude, and “NoiseInWin” describes noise occurring in the window where the S2 sound was expected. The conditional probability that “AfterSys” is true, given that the examined sound is an S1 sound, is then:
P(AfterSys=true|S=S1)=1−P(AfterSys=false|S=S1) [8.10]
If the examined sound is an S2 sound, it is not likely that any sound occurs after an interval corresponding to the systolic duration, since the next S1 sound will occur after the duration of the diastole. An exception is if a noise sound occurs in the window, with probability P(NoiseInWin), or if the systole and diastole durations are equal. If the duration of the diastole equals the duration of the systole, the S1 sound which follows the S2 sound after the duration of a diastole occurs in both the systole tolerance window and the diastole tolerance window. This happens if the heart rate of the subject is high. The probability that a sound occurs in both tolerance windows is equal to the degree of overlap between the systole and diastole tolerance windows and is termed P(Overlap). Therefore, the conditional probability that a sound occurs in the window after the systole duration, if the examined sound is an S2 sound, is:
P(AfterSys=true|S=S2)=P(Overlap∪NoiseInWin) [8.11]
The conditional probability that a sound does not occur after a systole duration, if the examined sound is an S2, is the opposite of the conditional probability that it does occur:
P(AfterSys=false|S=S2)=1−P(AfterSys=true|S=S2) [8.12]
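The union probabilities appearing in equations such as [8.11] can be evaluated, under an added assumption of independence between the constituent events (an assumption not stated in the original text), via the inclusion-exclusion identity P(A∪B)=P(A)+P(B)−P(A)·P(B). A sketch with illustrative values:

```python
def p_union(*probs):
    """P(at least one of several independent events occurs)."""
    p_none = 1.0
    for p in probs:
        p_none *= (1.0 - p)      # all events fail independently
    return 1.0 - p_none

# illustrative values: P(Overlap)=0.05, P(NoiseInWin)=0.10
p_aftersys_true_s2 = p_union(0.05, 0.10)        # cf. equation [8.11]
p_aftersys_false_s2 = 1.0 - p_aftersys_true_s2  # cf. equation [8.12]
```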
The conditional probability that a detected noise sound is followed by another sound after the systole duration, P(AfterSys|S=noise), is based on the probability that a sound of any kind is present in a segment with the length of the used tolerance window. This can be estimated as the tolerance window length multiplied by the number of detected sounds minus one, divided by the recording length. The conditional probability that a noise sound is followed by another sound after a systole duration is therefore:
P(SoundInWin)=((Nsound−1)·Systot)/RecLength [8.13]
where Nsound is the number of sounds within the recording, Systot is the duration of a systole and RecLength is the length of the recording. The conditional probability that a noise is not followed by another sound after the systole interval is the opposite:
P(AfterSys=false|S=noise)=1−P(SoundInWin) [8.14]
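The window-occupancy estimate described above, tolerance window length times (number of sounds minus one) divided by the recording length, could be sketched as follows; the function name and the clipping at 1.0 are assumptions:

```python
def p_sound_in_win(n_sounds, win_length, rec_length):
    """Chance that any of the other detected sounds falls in one tolerance window."""
    return min(1.0, (n_sounds - 1) * win_length / rec_length)

p = p_sound_in_win(n_sounds=21, win_length=0.1, rec_length=10.0)
# 20 other sounds * 0.1 s window / 10 s recording -> 0.2
```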
The conditional probabilities for P(AfterDia|S), P(BeforeSys|S) and P(BeforeDia|S) are based on the same assumptions used to define P(AfterSys|S). These conditional probabilities can be found in the tables below:
It has previously been found that the frequency parameter classified 86% of the S1 sounds as low-frequency and 80% of the S2 sounds as high-frequency; 85% of all noise sounds were classified as high-frequency. This information was used as the conditional probabilities for the frequency parameter, P(Frequency|S):
When all conditional probabilities are found, equation [8.6] is used by the Bayesian network to calculate the posterior probabilities for all detected sounds. This way, three probabilities are calculated for each sound that reflect how likely it is that the current sound is a given type.
It should be noted that the above-mentioned embodiments rather illustrate than limit the invention, and that those skilled in the art will be able to suggest many alternative embodiments without departing from the scope of the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/DK2006/000374 | 6/26/2006 | WO | 00 | 10/5/2009 |