Auscultation, the act of listening to the sounds of internal organs, is a valuable and simple diagnostic tool for detecting heart dysfunction, because of its non-invasive ability to provide useful information concerning the integrity and function of the heart valves and also on the hemodynamics of the heart. But, a disturbing percentage of medical graduates cannot properly diagnose heart conditions using a stethoscope. The art of listening to heart sounds and interpreting their meaning is difficult to master as the sounds are the result of several events of short duration that occur in a very small interval of time. The poor sensitivity of human ears in the low frequency range, the range in which the heart sounds occur, makes this task even more difficult.
Augmenting the information available to the physician with automatic auscultation (e.g., computer-aided auscultation using digital signal processing techniques to display a representation of heart sounds along with diagnostic information) may greatly improve the chances of correct diagnosis and avoid the need for costly screening tests. The aim of automatic auscultation is not necessarily to replace the human expert but to provide auxiliary information to help the human expert make an informed decision. An important part of automatic auscultation is the robust detection of heart rate and the location of primary heart sounds.
In automatic auscultation, heart sounds may be recorded using a diagnostic sound recording device such as an electronic stethoscope and displayed graphically in a phonocardiogram (PCG), in which the x-axis represents time and the y-axis represents a measure of the intensity of sound, i.e., amplitude. The audio signal resulting from a recording of heart sounds is a multi-component signal that includes primary heart sound components and abnormal components. The primary heart sound components, S1 and S2, are composite acoustic signals generated by valve closures (i.e., S1 is caused by the closure of the mitral and tricuspid valves and S2 is caused by the closure of the aortic and pulmonary valves). The abnormal components may be clicks, snaps, and murmurs (i.e., noises associated with the damage of valves and improper functioning of valves), which can indicate abnormalities in heart structures. Two other components may also be present in the heart sounds, S3 and S4. S3 occurs at the beginning of diastole just after S2 and may, in some cases, be an indication of an abnormality. S4 occurs at the beginning of systole just before S1, and may also, in some cases, be an indication of an abnormality.
The localization of the abnormal components indicates different dysfunctional causes. For example, the diagnosis of heart valve disorders is based on the presence of different kinds of murmurs in the cardiac cycle. A cardiac cycle is delimited by a single systole and a single diastole. Some of the features indicative of different types of murmurs include the location of the murmur, i.e., whether the murmur is present in systole or diastole, the intensity of the murmur relative to the primary heart sound components, and the shape of the murmur. Accordingly, the major components of the cardiac cycle need to be separated to aid in diagnosis.
Segmentation of heart sounds into associated cardiac cycles and the detection of the location of S1 and S2 is a primary step prior to the automated analysis of heart sounds for diagnostic purposes. Thus, robust detection and segmentation of heart sounds is needed for automatic auscultation. Various approaches for heart sound segmentation have been proffered, including using a reference electrocardiogram (ECG) signal and/or carotid pulse, using PCG signals only in the time and/or frequency domains, or using a wavelet transform. More specifically, in one known segmentation approach, an adaptive tracking algorithm based on a wavelet transform is used. This approach relies on information regarding the physical position of the recording to identify S1. Further, this approach, although robust to high-frequency noise, may cause false detection when noises overlap in frequency.
In another known segmentation approach, the audio signal is filtered to suppress high frequency murmurs and then the peaks of the energy profile are picked to locate S1 and S2. This approach requires the heart rate be known and used as auxiliary input to detect the S1 and S2 locations. Further, filtering can be detrimental in detection of clicks and snaps that occur very close to S1 and S2. In addition, this approach may not perform well when there is spectral overlap between S1 and S2 and pathological conditions with high energy content. In yet another known segmentation approach, ECG signals are used to perform segmentation. In this approach, the Shannon energy measure is used to segment S1 and S2. Again, this approach may not perform well when there is overlap between the primary heart sounds and murmurs.
Embodiments of the invention provide methods, systems, and computer readable media for heart sound identification. Embodiments provide for the location of the primary heart sounds, S1 and S2, in an audio signal of heart sounds in a manner that is robust in the presence of pathological heart conditions such as rumbles, murmurs, clicks, and snaps. Kurtosis in the time domain is used to distinguish an S1 or S2 peak from some types of murmur peaks and kurtosis in the frequency domain is used to distinguish an S2 peak from peaks associated with a late systolic murmur. In addition, in some embodiments, timing based error correction is applied to help ensure that the peaks selected for S1 and S2 are appropriate. Further, some embodiments include heart rate detection that is computationally inexpensive and works for a wide range of heart rates. In addition, some embodiments include diagnostic support for identifying pathological heart conditions indicated in the audio signal.
Particular embodiments in accordance with the invention will now be described, by way of example only, and with reference to the accompanying drawings:
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
In general, embodiments of the invention provide for robust identification of the primary heart sounds S1 and S2 in the presence of pathological conditions such as diastolic rumble, systolic murmurs, ejection clicks, etc. The primary heart sounds may be located even when a pathological heart condition masks one or both of the primary heart sounds and for a wide range of heart rates (e.g., 38 to 300 beats per minute (BPM)). More specifically, in one or more embodiments of the invention, in an audio signal of heart sounds, the locations of peaks corresponding to S1 and S2 in each cardiac cycle in the signal are identified. Further, kurtosis in the time domain is used to distinguish the S1 peaks and the S2 peaks from the peaks of some types of murmurs. In addition, kurtosis in the frequency domain may be used to distinguish the S2 peaks from the peaks of a late systolic murmur and/or the presence of S3 peaks. In some embodiments of the invention, timing based error correction is used to further ensure that peak locations selected for S1 and S2 are appropriate.
In some embodiments of the invention, after all of the S1 and S2 peaks in the audio signal are located, the heart rate may be determined based on the number of S1 peaks located and the sampling frequency. Further, in one or more embodiments of the invention, the locations of the S1 and S2 peaks may be used in conjunction with information about the location of murmurs found while identifying S1 and S2 and information regarding the correction of S1 and/or S2 peaks during timing based error correction to provide additional diagnostic information for the classification of murmurs and other pathological conditions indicated by the heart sounds. An annotated graphical representation of the heart sounds (i.e., a phonocardiogram) that shows the locations of the S1 and S2 peaks may also be displayed. In some embodiments of the invention, the heart rate and/or any additional diagnostic information regarding pathological conditions found in the heart sounds may also be displayed.
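By way of illustration only, the heart rate determination mentioned above might be sketched as follows. The description states only that the heart rate is based on the number of S1 peaks and the sampling frequency; the specific formula (cardiac cycles divided by the time spanned by the first and last S1 peaks) and the function name `heart_rate_bpm` are assumptions made for this sketch.

```python
def heart_rate_bpm(s1_locations, fs):
    """Estimate heart rate in beats per minute from S1 peak locations.

    `s1_locations` holds S1 peak positions as sample indices in ascending
    order and `fs` is the sampling frequency in Hz. The formula is an
    assumed one, not a formula given in the description.
    """
    if len(s1_locations) < 2:
        raise ValueError("at least two S1 peaks are needed")
    cycles = len(s1_locations) - 1  # one cardiac cycle per S1-to-S1 interval
    span_seconds = (s1_locations[-1] - s1_locations[0]) / fs
    return 60.0 * cycles / span_seconds


# Four S1 peaks one second apart at 4000 Hz: three cycles in three seconds.
rate = heart_rate_bpm([0, 4000, 8000, 12000], fs=4000)
```

Because the estimate uses only the peak count and two subtractions, it stays computationally inexpensive, which is consistent with the stated goal.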
The transmission of the digital audio signal to the processing device (104) may be wired or wireless. More specifically, the sound capture device (102) may be directly connected to the processing device (104) (e.g., using a USB port) or may be communicatively coupled to the processing device (104) by a network (not specifically shown). The network may be a wide area network (WAN) such as the Internet, a wireless network, a local area network (LAN), or a combination of networks.
The processing device (104) is a computing system (e.g., a microprocessor, a personal computer, a laptop computer, a server, a mainframe, a personal digital assistant, a television, a mobile phone, an iPod, an MP3 player, etc.) configured to receive the digital audio signal from the sound capture device (102) and to process the signal to identify the primary heart sounds, S1 and S2, in each cardiac cycle recorded in the signal. The processing device (104) may also be configured to determine the heart rate once the primary heart sounds are identified. Further, the processing device (104) may be configured to provide additional diagnostic information regarding pathological conditions present in the recorded heart sounds. The processing device also includes functionality to generate an annotated PCG of the digital signal and provide the PCG to the output device (106) for display. The annotations in the PCG may include locations of S1 and S2, the heart rate, and/or the additional diagnostic information. More specifically, the processing device includes functionality to store executable instructions implementing a method for heart sound identification as described herein and to execute those instructions.
The transmission of the PCG to the output device (106) may be wired or wireless. More specifically, the output device (106) may be directly connected to the processing device (104) (e.g., using a USB port, a controller card, control circuitry, etc.) or may be communicatively coupled to the processing device (104) by a network (not specifically shown). The network may be a wide area network (WAN) such as the Internet, a wireless network, a local area network (LAN), or a combination of networks.
The output device (106) is configured to receive the PCG from the processing device (104) and to display the PCG. The output device (106) may be any display device capable of displaying the PCG such as, for example, a computer monitor, a display screen of a handheld computing device, etc. The output device (106) may also be another computing system that includes a display device.
The system of
The processing device (204) is one or more processors configured to receive the digital audio signal from the sound capture device (202). More specifically, the processing device may be a digital signal processor (DSP), a microprocessor, or a combination of a DSP and a microprocessor. The processing device (204) is further configured to process the signal to identify the primary heart sounds, S1 and S2, in each cardiac cycle recorded in the signal. The processing device (204) may also be configured to determine the heart rate once the primary heart sounds are identified. Further, the processing device (204) may be configured to provide additional diagnostic information regarding pathological conditions present in the recorded heart sounds. The processing device also includes functionality to generate an annotated PCG of the digital signal and provide the PCG to the output device (206) for display. The annotations in the PCG may include locations of S1 and S2, the heart rate, and/or the additional diagnostic information. More specifically, the processing device (204) includes functionality to store executable instructions implementing a method for heart sound identification as described herein and to execute those instructions.
The output device (206) is a display screen included in the body of the digital stethoscope (208) and operatively connected to the processing device (204) by control circuitry. Further, the output device (206) is configured to receive the PCG from the processing device (204) and to display the PCG.
As shown in
Subsequently, the initial S1 peak in the audio signal is identified within a search window beginning at the start of the audio signal (302). The length of this search window is an important factor in detecting S1 and S2 locations. Normal heart rate in healthy adults is usually between 60 and 100 BPM. However, heart rates for newborns and children under the age of one can range from 100 to 180 BPM. If the window length is too small, the first S2 peak in the audio signal may be identified as the subsequent S1 peak (i.e., the S1 peak at the beginning of the next cardiac cycle). If the window length is too large, the subsequent S1 peak may not be found if the heart rate is at the higher end of the heart rate range. In one or more embodiments of the invention, two window lengths are used, a large window length and a small window length. The large window length, which is also the default window length, is used initially, and, as is explained in more detail below, if the use of this large window length fails to appropriately locate S1 and S2 peaks, the search window is decreased to the small window length and the audio signal is processed again using the smaller search window. Further, as is described in more detail below, a hop length (i.e., the distance to the starting location of the next search window) is also decreased. In one or more embodiments of the invention, the large window length is 200 ms and the small window length is 100 ms.
The initial S1 peak may be identified by finding a maximum value, i.e., the amplitude of the highest peak, and a minimum value, i.e., the amplitude of the lowest peak, within the search window. If the difference between the maximum value and the minimum value is greater than a predetermined amount, the highest peak may be the initial S1 peak. If the difference between the maximum value and the minimum value is less than or equal to the predetermined amount, then the length of the search window is increased by a predetermined number of milliseconds and a new maximum value and minimum value are found. In one or more embodiments of the invention, this predetermined amount is 0.8 and the predetermined number of milliseconds is 50 ms.
The process of increasing the search window length and finding a new maximum value and minimum value is repeated until either a maximum value and a minimum value are found for which the difference is greater than the predetermined amount or a maximum length of the search window is reached. In one or more embodiments of the invention, this maximum length is 1200 ms. If the maximum length of the search window is reached without finding an acceptable maximum value and minimum value, then the maximum value within the maximum search window length is selected as a possible initial S1 peak if the maximum value is greater than a predetermined amount. If this maximum value is not greater than the predetermined amount, an error is indicated and processing of the audio signal terminates. In one or more embodiments of the invention, this predetermined amount is 0.25.
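The growing-window search for the initial S1 candidate described above may be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes the signal amplitudes are normalized to the range [-1, 1] (suggested by the 0.8 and 0.25 thresholds), it uses the largest and smallest samples in the window as stand-ins for the highest and lowest peaks, and the function name `find_initial_s1_candidate` is hypothetical. The subsequent murmur check is not part of this sketch.

```python
def find_initial_s1_candidate(signal, fs, window_ms=200, step_ms=50,
                              max_window_ms=1200, range_threshold=0.8,
                              fallback_threshold=0.25):
    """Return the sample index of a candidate initial S1 peak.

    `signal` is assumed to be a sequence of amplitudes normalized to
    [-1, 1]; `fs` is the sampling frequency in Hz.
    """
    length_ms = window_ms
    while True:
        n = max(1, round(length_ms * fs / 1000))
        window = signal[:n]
        hi_index = max(range(len(window)), key=lambda i: window[i])
        hi, lo = window[hi_index], min(window)
        if hi - lo > range_threshold:
            return hi_index  # difference exceeds the predetermined amount
        if length_ms >= max_window_ms:
            break  # maximum search window length reached
        length_ms = min(length_ms + step_ms, max_window_ms)
    # Fallback: accept the maximum within the largest window if it is
    # large enough on its own.
    if hi > fallback_threshold:
        return hi_index
    raise ValueError("no acceptable initial S1 peak found")
```

A caller would pass the returned index to the murmur check and, if the candidate turns out to be a murmur peak, continue the search with a longer window.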
Once a peak that may be the initial S1 peak is located, this candidate peak is checked using time domain kurtosis to see if it may be a murmur peak. As one of ordinary skill in the art would know, an S1 (or an S2) may peak earlier than a murmur. In the methods described herein, this known early occurrence is exploited to distinguish an S1 peak (or S2 peak) from a later occurring murmur peak. Specifically, time domain kurtosis (i.e., kurtosis of the signal as it varies in the time domain) is used to distinguish an S1 peak (or S2 peak) from a murmur peak. Three kurtosis values are calculated in the time domain: a kurtosis (K) of the segment of the audio signal that is a predetermined number of milliseconds on either side of the candidate peak, a kurtosis (K1) of the segment that is the predetermined number of milliseconds before the candidate peak, and a kurtosis (K2) of the segment that is the predetermined number of milliseconds after the candidate peak. In one or more embodiments of the invention, the predetermined number of milliseconds is 100. K is usually higher for an S1 peak (or an S2 peak) than for a murmur peak. Also, the difference between K1 and K2 for an S1 peak (or an S2 peak) is much larger than for a murmur peak. Accordingly, if K is greater than a predetermined value, V, or if the absolute difference between K1 and K2 is greater than a predetermined value, V2, then the candidate peak is not a murmur. Otherwise, the candidate peak is a murmur. In one or more embodiments of the invention, V is 4.0 and V2 is 6.0.
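The time domain kurtosis test described above may be illustrated with the following sketch. The description does not state which kurtosis convention is used; this sketch assumes the Pearson definition (fourth standardized moment, equal to 3.0 for a Gaussian) and treats a flat segment as having kurtosis 0. The names `kurtosis` and `is_murmur_peak` are hypothetical.

```python
def kurtosis(samples):
    """Pearson kurtosis (assumed convention): m4 / m2**2."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0.0:
        return 0.0  # flat segment; kurtosis is undefined, treated as 0 here
    return m4 / (m2 * m2)


def is_murmur_peak(signal, peak, fs, half_width_ms=100, v=4.0, v2=6.0):
    """Return True if the candidate peak at index `peak` looks like a murmur."""
    half = round(half_width_ms * fs / 1000)
    start = max(0, peak - half)
    end = min(len(signal), peak + half + 1)
    k = kurtosis(signal[start:end])        # both sides of the candidate
    k1 = kurtosis(signal[start:peak + 1])  # segment before the candidate
    k2 = kurtosis(signal[peak:end])        # segment after the candidate
    not_murmur = k > v or abs(k1 - k2) > v2
    return not not_murmur


# An isolated impulse is sharply peaked, so it is not flagged as a murmur.
impulse = [0.0] * 401
impulse[200] = 1.0
flag = is_murmur_peak(impulse, 200, fs=1000)
```

A sustained oscillation spreads its energy across the whole segment, so both K and the difference between K1 and K2 stay small and the candidate is flagged as a murmur.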
If the candidate peak is found to be a murmur peak, the search for the initial S1 peak is continued as described above with an increased search window length. The location of this murmur peak may also be stored for later use in providing additional diagnostic information to identify the murmur. If the candidate peak is not found to be a murmur peak, then it is identified as the initial S1 peak.
After identifying the initial S1 peak, the search window is moved by a sufficient number of milliseconds, i.e., a hop length, to a location before the subsequent S1 peak (i.e., the S1 peak at the beginning of the next cardiac cycle) (304). More specifically, the beginning of the search window is moved to a location that is a hop length away from the initial S1 peak. For purposes of locating the first S1 peak after the initial S1 peak, the length of this search window may be the same as the initial length of the search window used in identifying the initial S1 peak, i.e., either the large window length or the small window length. In one or more embodiments of the invention, the hop length is 400 ms if the large window length is used and 200 ms if the small window length is used.
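The pairing of window lengths and hop lengths, and the placement of the next search window, might be represented as in the following sketch. The `SearchConfig` class, the constants `LARGE` and `SMALL`, and the function `next_window` are hypothetical names introduced for illustration; only the millisecond values come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchConfig:
    window_ms: int  # initial search window length
    hop_ms: int     # distance from an S1 peak to the next window start


LARGE = SearchConfig(window_ms=200, hop_ms=400)  # default configuration
SMALL = SearchConfig(window_ms=100, hop_ms=200)  # used after a restart


def next_window(s1_index, fs, config):
    """Return (start, end) sample indices of the next search window."""
    start = s1_index + round(config.hop_ms * fs / 1000)
    end = start + round(config.window_ms * fs / 1000)
    return start, end


# With an S1 peak at sample 1000 and fs = 4000 Hz, the large configuration
# places the window 1600 samples (400 ms) after the peak.
start, end = next_window(1000, 4000, LARGE)
```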
Referring again to
If the maximum tolerance is reached without finding an acceptable peak, the maximum search window length is increased by a predetermined amount, the tolerance is returned to its initial value, and the above described search for an acceptable peak is repeated until either an acceptable peak is found or the maximum search window length reaches a predetermined length limit. In one or more embodiments of the invention, this predetermined amount is 100 ms and the predetermined length limit is 1200 ms.
If the predetermined length limit is reached without finding an acceptable peak, the maximum value, i.e., the amplitude of the highest peak, within the search window with a length of the predetermined length limit is found. If this maximum value is greater than a predetermined percentage of the amplitude of the previous S1 peak, then this highest peak may be the subsequent S1 peak. In one or more embodiments of the invention, this predetermined percentage is thirty-three percent. In some instances, a peak that is much smaller than the previous S1 peak may be the subsequent S1 peak. For example, if a ventricular septal defect is present, the subsequent S1 peak can be much smaller than the previous S1 peak. Also, improper recording or a change in auscultation location (i.e., where the stethoscope is placed on the chest) can cause variations in the amplitudes of S1 peaks. If this highest peak does not have sufficient amplitude, then if a murmur peak was found while identifying the previous S1 peak, the murmur peak is identified as the subsequent S1 peak. If no murmur peak was found, an error is indicated and the processing of the audio signal terminates.
Once a peak that may be the subsequent S1 peak is located, this candidate peak is checked using time domain kurtosis as described above to see if it may be a murmur peak. If the candidate peak is found to be a murmur peak, the search for the subsequent S1 peak is continued as described above with an increased search window length. The location of this murmur peak may also be stored for later use in providing additional diagnostic information to identify the murmur. If the candidate peak is not found to be a murmur peak, then it is identified as the subsequent S1 peak.
Referring again to
If the maximum value does not meet the criterion (or, in embodiments in which the murmur check is performed, the maximum value is found to be a murmur peak), then a maximum value, i.e., the amplitude of the highest peak, is found for a segment that begins at the same location as above and ends at a location determined by the sum of the location of the previous S1 peak and a predetermined percentage of the length of the search window in which the subsequent S1 peak was found. In one or more embodiments of the invention, this predetermined percentage is seventy-five percent. If this maximum value is greater than a predetermined percentage of the difference between the maximum value and minimum value found when identifying the initial S1 peak, then this highest peak may be the S2 peak. In one or more embodiments of the invention, this predetermined percentage is 12.5 percent.
If the maximum value does not meet this criterion, then if the previous S1 peak is the initial S1 peak, the previous S1 peak is actually an S2 peak that occurred at the beginning of the audio signal. Although not specifically shown in
If the previous S1 peak is not the initial S1 peak, then the peak at the location determined by the sum of the location of the previous S1 peak and an average distance between an S1 peak and an S2 peak may be the S2 peak. In one or more embodiments of the invention, the default average distance between an S1 peak and an S2 peak is 350 ms. As is explained in more detail below, the average distance may be adjusted as S1 and S2 peaks are located.
Once a peak that may be the S2 peak is located, this candidate S2 peak is checked to see if it is an S3 peak or an opening snap peak. This check may be performed as follows. First, the maximum value, i.e., the amplitude of the highest peak, is found for a segment that begins at a location determined by the sum of the location of the previous S1 peak and the predetermined duration of an S1 peak and ends at a location determined by the difference between the location of the candidate S2 peak and a predetermined percentage of the predetermined duration of an S2 peak. In one or more embodiments of the invention, this predetermined percentage is thirty-three percent. If this maximum value is less than a predetermined percentage of the amplitude of the candidate S2 peak, then the candidate S2 peak is not an S3 peak or an opening snap peak and is identified as the S2 peak. In one or more embodiments of the invention, this predetermined percentage is fifty percent.
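The amplitude portion of the S3/opening snap check described above may be sketched as follows. The predetermined durations of an S1 peak and an S2 peak are passed in as arguments (in samples) rather than given default values, and the function name `find_peak_before_s2_candidate` is hypothetical. The sketch uses the largest sample in the segment as a stand-in for the highest peak.

```python
def find_peak_before_s2_candidate(signal, prev_s1, s2_candidate,
                                  s1_duration, s2_duration,
                                  duration_fraction=0.33,
                                  amplitude_fraction=0.50):
    """Return the index of an earlier peak that rivals the candidate S2
    peak, or None if the candidate S2 peak can be accepted as the S2 peak.

    All locations and durations are in samples.
    """
    start = prev_s1 + s1_duration
    end = s2_candidate - int(duration_fraction * s2_duration)
    if start >= end:
        return None  # no room for another peak between S1 and the candidate
    best = max(range(start, end), key=lambda i: signal[i])
    if signal[best] < amplitude_fraction * signal[s2_candidate]:
        return None  # candidate is not an S3 or opening snap peak
    return best      # new peak; to be checked with frequency domain kurtosis
```

When an index is returned, the candidate S2 peak may in fact be an S3 or opening snap peak, and the returned peak is examined further before a decision is made.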
If the maximum value meets the amplitude criteria, the new peak is checked using frequency domain kurtosis (i.e., kurtosis of the signal as it varies in the frequency domain) to determine whether it is a late systolic murmur peak. More specifically, kurtosis of the Fourier transform magnitude is used to determine if the new peak is due to a murmur. The magnitude of the Fourier transform of a segment beginning at the location of the candidate S2 peak is computed and the associated kurtosis measure, G1, is found. Similarly, the magnitude of the Fourier transform of a segment beginning at the location of the new peak is computed and the associated kurtosis measure, G2, is found. In one or more embodiments of the invention, the length of the segments is the nearest power of two to the length in samples that equals 50 ms of time. For example, the length is 512 if the sampling frequency is 11025 Hz and 256 if the sampling frequency is 4000 Hz. If the absolute difference between the geometric mean of G1 and G2 and the arithmetic mean of G1 and G2 is greater than a predetermined value and if G1 is greater than G2, then the new peak is identified as a possible murmur peak. In one or more embodiments of the invention, this predetermined value is 3.5.
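The frequency domain kurtosis comparison described above may be sketched as follows. Several choices here are assumptions rather than details from the description: the Pearson kurtosis convention, taking the kurtosis over the full (two-sided) magnitude spectrum, and the use of a naive discrete Fourier transform so that the sketch needs only the standard library (a practical implementation would use an FFT). The function names are hypothetical.

```python
import cmath
import math


def segment_length(fs, duration_ms=50):
    """Power of two nearest to the number of samples in `duration_ms`."""
    target = fs * duration_ms / 1000.0
    lower = 2 ** math.floor(math.log2(target))
    upper = 2 * lower
    return lower if target - lower <= upper - target else upper


def dft_magnitude(samples):
    """Magnitude of a naive O(n^2) DFT of `samples`."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n)]


def kurtosis(values):
    """Pearson kurtosis (assumed convention); 0.0 for a flat input."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) if m2 > 0.0 else 0.0


def is_late_systolic_murmur(signal, s2_candidate, new_peak, fs,
                            threshold=3.5):
    """Return True if the new peak is a possible late systolic murmur."""
    n = segment_length(fs)
    g1 = kurtosis(dft_magnitude(signal[s2_candidate:s2_candidate + n]))
    g2 = kurtosis(dft_magnitude(signal[new_peak:new_peak + n]))
    geometric = math.sqrt(g1 * g2)
    arithmetic = (g1 + g2) / 2.0
    return abs(geometric - arithmetic) > threshold and g1 > g2


# Segment lengths for the two sampling frequencies given in the text.
lengths = (segment_length(11025), segment_length(4000))  # (512, 256)
```

Since the arithmetic mean minus the geometric mean of two non-negative values equals half the squared difference of their square roots, the test fires only when G1 and G2 differ substantially.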
If the new peak is not found to be a murmur peak, then it is identified as the S2 peak and the candidate S2 peak is identified as a possible S3 peak or opening snap peak. In one or more embodiments of the invention, the location of the possible S3/opening snap peak may be stored for later use in providing additional diagnostic information regarding the presence of S3/opening snap peaks in the heart sounds. If the new peak is a possible late systolic murmur, then the candidate S2 peak is identified as the S2 peak. In one or more embodiments of the invention, the location of the late systolic murmur peak may be stored for later use in providing additional diagnostic information to identify the murmur.
Once the S2 peak is identified, a check is then made to verify that the distance between the previous S1 peak and the S2 peak is smaller than the distance between the S2 peak and the subsequent S1 peak (308). If this distance check fails, then different actions are taken depending on whether or not S1 and S2 peaks are being identified for the initial cardiac cycle or a subsequent cardiac cycle (320). If the initial S1 and S2 peaks of the first full cardiac cycle in the audio signal are being identified, then the previous S1 peak is actually an S2 peak that occurred at the beginning of the audio signal. The beginning of the search window is moved to a location that is a hop length away from the subsequent S1 peak. The subsequent S1 peak is identified as the initial/previous S1 peak (i.e., the S1 peak at the beginning of the initial cardiac cycle in the audio signal) (322) and the method loops back to (304) to repeat the identification of the subsequent S1 peak and the S2 peak.
If the initial S1 and S2 peaks are not being identified (320), then a check is made to determine if there are peaks between the previous S1 peak and the S2 peak that are not murmurs (324). More specifically, a check is made to determine if there is a valid S1 peak and a valid S2 peak between the identified previous S1 peak and the identified S2 peak. If valid S1 and S2 peaks are found, the length of the search window is too large. The processing of the audio signal is restarted (302) using the small window length and a smaller hop length. In one or more embodiments of the invention, the smaller hop length is one half of the hop length used with the large window length. Further, the average distances between peaks (discussed below) and the expected durations of S1 and S2 are also reset to smaller initial values. In one or more embodiments of the invention, the smaller initial values are one half of the initial values used with the large window length.
If valid S1 and S2 peaks are not found, then a check is made to determine if the distance between the previous S1 peak and the subsequent S1 peak is acceptable (326). This check is made because it is possible for an S2 peak to be selected as the subsequent S1 peak if the next S1 is a small peak. In one or more embodiments of the invention, a check is made to determine if the distance between the previous S1 peak and the subsequent S1 peak is within a predetermined percentage of an average distance between S1 peaks. In one or more embodiments of the invention, the default average distance between S1 peaks is 800 ms and, as is explained in more detail below, the average distance is adjusted as S1 peaks are located in the audio signal. Further, in one or more embodiments of the invention, the predetermined percentage is twenty percent.
If the distance is acceptable, then no change is made to the identified subsequent S1 peak and the method continues with timing based error correction (312) as described below. If the distance is not acceptable, a new maximum value is found within an acceptable distance of the previous S1 peak, this new peak is identified as the subsequent S1 peak (328), and the method continues with timing based error correction (312) as described below. In one or more embodiments of the invention, the new maximum value is found in the segment beginning at a location determined by the sum of the location of the previous S1 peak and the difference between the average distance between S1 peaks and a predetermined percentage of the average distance (i.e., location+average distance−percentage of average distance) and ending at a location determined by the sum of the location of the previous S1 peak, the average distance between S1 peaks, and the predetermined percentage of the average distance (i.e., location+average distance+percentage of average distance). In one or more embodiments of the invention, the predetermined percentage is ten percent.
The check to determine if there is a valid S1 peak and a valid S2 peak between the identified previous S1 peak and the identified S2 peak may be done as follows. The two largest peaks, peak1 and peak2, between the previous S1 peak and the S2 peak are located, where peak1 refers to the peak closer to the S2 peak and peak2 refers to the peak closest to the previous S1 peak. If the difference in amplitude between the previous S1 peak and peak1 is greater than a predetermined amount or the difference in amplitude between the S2 peak and peak2 is greater than the predetermined amount, then there are no valid peaks between the previous S1 peak and the S2 peak and no other checking needs to be performed. In one or more embodiments of the invention, this predetermined amount is 0.3. Otherwise, if the distance between peak1 and peak2 is smaller than a predetermined percentage of the distance between the previous S1 peak and the S2 peak, then there are no valid peaks between the previous S1 peak and the S2 peak and no further checking needs to be performed. In one or more embodiments of the invention, this predetermined percentage is twenty-five percent.
If the distance between peak1 and peak2 does not meet this criterion, then a check is made to determine if peak1 and peak2 are murmur peaks. This check is made using time domain kurtosis. More specifically, the kurtosis, h1, of the segment beginning and ending a predetermined number of milliseconds on either side of the location of peak1 is computed and the kurtosis, h2, of the segment beginning and ending the predetermined number of milliseconds on either side of the location of peak2 is computed. In one or more embodiments of the invention, this predetermined number of milliseconds is 75 ms. If the absolute value of the ratio of the maximum of h1 and h2 to the minimum of h1 and h2 is greater than a predetermined value, then peak1 and peak2 are murmur peaks and there are no valid peaks between the previous S1 peak and the S2 peak. In one or more embodiments of the invention, this predetermined value is 1.2. If the absolute value does not meet this criterion, then there are valid peaks between the previous S1 peak and the S2 peak.
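The three-stage check for valid S1 and S2 peaks between the previous S1 peak and the S2 peak may be sketched as follows. The sketch takes the locations of peak1 and peak2 as inputs rather than performing peak picking, interprets "difference in amplitude" as an absolute difference, assumes the Pearson kurtosis convention, and treats a flat segment as invalid; all of these are assumptions, and the function names are hypothetical.

```python
def kurtosis(samples):
    """Pearson kurtosis (assumed convention); 0.0 for a flat segment."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2) if m2 > 0.0 else 0.0


def has_valid_intermediate_peaks(signal, prev_s1, s2, peak1, peak2, fs,
                                 amp_limit=0.3, min_gap_fraction=0.25,
                                 half_width_ms=75, ratio_limit=1.2):
    """Return True if peak1 and peak2 look like a valid S1/S2 pair.

    All locations are sample indices; amplitudes are assumed normalized.
    """
    # Stage 1: amplitudes must be comparable to the identified S1 and S2.
    if (abs(signal[prev_s1] - signal[peak1]) > amp_limit
            or abs(signal[s2] - signal[peak2]) > amp_limit):
        return False
    # Stage 2: the two peaks must be far enough apart.
    if abs(peak1 - peak2) < min_gap_fraction * (s2 - prev_s1):
        return False
    # Stage 3: similar time domain kurtosis around both peaks.
    half = round(half_width_ms * fs / 1000)

    def local_kurtosis(center):
        return kurtosis(signal[max(0, center - half):center + half + 1])

    h1, h2 = local_kurtosis(peak1), local_kurtosis(peak2)
    low = min(h1, h2)
    if low == 0.0:
        return False  # degenerate segment; treated as invalid (assumption)
    return abs(max(h1, h2) / low) <= ratio_limit
```

A True result corresponds to the case in which the search window is too large and processing is restarted with the small window length.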
Referring again to
Timing based error correction compares certain distances (i.e., amount of time elapsed) between the previous S1 peak, the S2 peak, the subsequent S1 peak, and/or the previous S2 peak (i.e., the S2 peak identified for the previous cardiac cycle) against expected distances between such peaks. If an actual distance exceeds an expected distance by more than a predetermined threshold, an attempt is made to locate a peak that is within the expected distance. If such a peak is located, it is identified as the subsequent S1 peak or the S2 peak, depending on which distance is being checked. Further, if changes are made to either the subsequent S1 peak or the S2 peak during the correction process, information regarding the changes may be stored for later use in providing additional diagnostic information related to the identification of murmurs. For example, if any subsequent S1 peak is corrected, this correction may be indicative of aortic stenosis. In addition, if S2 peaks are corrected, aortic regurgitation may be present.
In one or more embodiments of the invention, timing based error correction is performed as follows. Initially, the distance between the previous S1 peak and the subsequent S1 peak is checked. If the distance is not within a predetermined percentage of the average distance between two S1 peaks, then a search is performed for a nearby peak that meets this criterion and could possibly be the “real” subsequent S1 peak. In one or more embodiments of the invention, the predetermined percentage is twenty percent. Further, in one or more embodiments of the invention, the average distance between two S1 peaks is initially set to 800 ms. This average distance is updated using the actual distance between the previous S1 peak and the subsequent S1 peak after the timing based error correction process is complete.
To locate a nearby peak, the highest peak is found in the segment that begins at the location of the previous S1 peak plus the average distance between S1 peaks less the predetermined percentage of the average distance (location+average distance−percentage of average distance) and ends at the location of the previous S1 peak plus the average distance plus the predetermined percentage of the average distance (location+average distance+percentage of average distance). If the amplitude of the new peak is greater than a predetermined percentage of the amplitude of the previous S1 peak, then the new peak is identified as the subsequent S1 peak. Otherwise, the subsequent S1 peak is not changed. In one or more embodiments of the invention, this amplitude percentage is thirty-three percent.
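The check and search for the subsequent S1 peak may be sketched as follows. This is a minimal sketch under stated assumptions: `envelope` is taken to be a non-negative amplitude sequence, all locations and distances are in samples, and the "highest peak" is taken to be the largest sample value in the window. All names are hypothetical and are not drawn from the described embodiments.

```python
def within_tolerance(distance, average_distance, tolerance_percent=20):
    """True if distance is within the given percentage of the average."""
    margin = average_distance * tolerance_percent / 100
    return abs(distance - average_distance) <= margin


def find_nearby_peak(envelope, anchor, average_distance,
                     tolerance_percent=20, amplitude_percent=33):
    """Highest sample in [anchor+avg-margin, anchor+avg+margin], or None.

    The candidate is accepted only if its amplitude exceeds
    amplitude_percent of the amplitude at the anchor peak.
    """
    margin = average_distance * tolerance_percent / 100
    start = max(0, int(anchor + average_distance - margin))
    end = min(len(envelope), int(anchor + average_distance + margin) + 1)
    if start >= end:
        return None  # window falls outside the signal
    candidate = max(range(start, end), key=lambda i: envelope[i])
    if envelope[candidate] > envelope[anchor] * amplitude_percent / 100:
        return candidate
    return None


def correct_subsequent_s1(envelope, prev_s1, next_s1, average_s1_s1):
    """Return the (possibly corrected) location of the subsequent S1 peak."""
    if within_tolerance(next_s1 - prev_s1, average_s1_s1):
        return next_s1
    candidate = find_nearby_peak(envelope, prev_s1, average_s1_s1)
    return next_s1 if candidate is None else candidate
```

In this sketch the originally selected peak is retained whenever no sufficiently strong candidate is found in the window, mirroring the statement that the subsequent S1 peak is otherwise not changed.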
Next, the distance between the previous S2 peak and the S2 peak is checked. If the distance is not within a predetermined percentage of the average distance between two S2 peaks, then a search is performed for a nearby peak that meets this criterion and could possibly be the “real” S2 peak. In one or more embodiments of the invention, this predetermined percentage is twenty percent. Further, in one or more embodiments of the invention, the average distance between two S2 peaks is initially set to 800 ms. This average distance is updated using the actual distance between the previous S2 peak and the S2 peak after the timing based error correction process is complete.
To locate a nearby peak, the highest peak is found in the segment that begins at the location of the previous S2 peak plus the average distance between S2 peaks less the predetermined percentage of the average distance (location+average distance−percentage of average distance) and ends at the location of the previous S2 peak plus the average distance plus the predetermined percentage of the average distance (location+average distance+percentage of average distance). If the amplitude of the new peak is greater than a predetermined percentage of the amplitude of the previous S2 peak, then the new peak is identified as the S2 peak. Otherwise, the S2 peak is not changed. In one or more embodiments of the invention, this amplitude percentage is thirty-three percent.
Next, the distance between the previous S1 peak and the S2 peak is checked. If the distance is not within a predetermined percentage of the average distance between an S1 peak and an S2 peak in the same cardiac cycle, then a search is performed for a nearby peak that meets this criterion and could possibly be the “real” S2 peak. In one or more embodiments of the invention, the predetermined percentage is twenty percent. Further, in one or more embodiments of the invention, the average distance between an S1 peak and an S2 peak in the same cardiac cycle is initially set to 350 ms. This average distance is updated using the actual distance between the previous S1 peak and the S2 peak after the timing based error correction process is complete.
To locate a nearby peak, the highest peak is found in the segment that begins at the location of the previous S1 peak plus the average distance between an S1 peak and an S2 peak in the same cardiac cycle less the predetermined percentage of the average distance (location+average distance−percentage of average distance) and ends at the location of the previous S1 peak plus the average distance plus the predetermined percentage of the average distance (location+average distance+percentage of average distance). If the amplitude of the new peak is greater than a predetermined percentage of the amplitude of the previous S2 peak, then the new peak is identified as the S2 peak. Otherwise, the S2 peak is not changed. In one or more embodiments of the invention, this amplitude percentage is thirty-three percent.
Finally, the distance between the S2 peak and the subsequent S1 peak is checked. If the distance is not within a predetermined percentage of the average distance between an S2 peak in one cardiac cycle and the S1 peak of the next cardiac cycle, then a search is performed for a nearby peak that meets this criterion and could possibly be the “real” subsequent S1 peak. In one or more embodiments of the invention, the predetermined percentage is twenty percent. Further, in one or more embodiments of the invention, the average distance between an S2 peak in one cardiac cycle and the S1 peak of the next cardiac cycle is initially set to 450 ms. This average distance is updated using the actual distance between the S2 peak and the subsequent S1 peak after the timing based error correction process is complete.
To locate a nearby peak, the highest peak is found in the segment that begins at the location of the S2 peak plus the average distance between an S2 peak in one cardiac cycle and the S1 peak of the next cardiac cycle less the predetermined percentage of the average distance (location+average distance−percentage of average distance) and ends at the location of the S2 peak plus the average distance plus the predetermined percentage of the average distance (location+average distance+percentage of average distance). If the amplitude of the new peak is greater than a predetermined percentage of the amplitude of the subsequent S1 peak, then the new peak is identified as the subsequent S1 peak. Otherwise, the subsequent S1 peak is not changed. In one or more embodiments of the invention, this amplitude percentage is thirty-three percent.
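The four timing checks described above differ only in the anchor peak, the peak being verified, and the average distance used. As a worked illustration, the following sketch computes the search window for each check from the initial average distances and the twenty percent tolerance given in the text. The table layout and function names are hypothetical, and distances are expressed in milliseconds relative to the anchor peak.

```python
# (anchor peak, peak being verified, initial average distance in ms),
# listed in the order in which the checks are described.
TIMING_CHECKS = [
    ("previous S1", "subsequent S1", 800),
    ("previous S2", "S2", 800),
    ("previous S1", "S2", 350),
    ("S2", "subsequent S1", 450),
]


def search_window(anchor_ms, average_ms, tolerance_percent=20):
    """Return (start, end) = anchor + average -/+ percentage of average."""
    margin = average_ms * tolerance_percent / 100
    return (anchor_ms + average_ms - margin,
            anchor_ms + average_ms + margin)


def describe_windows(tolerance_percent=20):
    """Window bounds for each check, measured from its anchor peak."""
    rows = []
    for anchor, target, average_ms in TIMING_CHECKS:
        start, end = search_window(0, average_ms, tolerance_percent)
        rows.append((anchor, target, start, end))
    return rows


if __name__ == "__main__":
    for anchor, target, start, end in describe_windows():
        print(f"{target}: {start:.0f} to {end:.0f} ms after {anchor}")
```

With the initial values, the windows are 640 to 960 ms for both the S1-to-S1 and S2-to-S2 checks, 280 to 420 ms for the S1-to-S2 check, and 360 to 540 ms for the S2-to-S1 check. These bounds shift as the average distances are updated.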
Referring again to
If valid S1 and S2 peaks are not found between the previous S1 peak and the S2 peak, then if the end of the audio signal has not been reached (316), the next S2 peak and S1 peak in the audio signal are located. The beginning of the search window is moved to a location that is a hop length away from the subsequent S1 peak. The method then loops back to identify the next S1 peak and S2 peak in the audio signal (304) as described above. Note that the subsequent S1 peak becomes the previous S1 peak in the new iteration.
If the end of the audio signal has been reached (316), then the heart rate and/or other diagnostic information may be calculated and displayed (318) in a PCG. In addition, the locations of the S1 and S2 peaks may be demarcated in the PCG using symbols, colors, and/or any other suitable demarcation scheme. Further, in one or more embodiments of the invention, the heart rate and/or other diagnostic information may also be calculated and displayed along with the PCG as the audio signal is being analyzed rather than waiting until the end of the signal is reached.
The heart rate may be determined based on the number of S1 peaks located in the audio signal and the sampling frequency of the signal. More specifically, if Ls is the number of S1 peaks, Fs is the sampling frequency of the audio signal, x is the location of the first S1 peak in the signal, and y is the location of the last S1 peak in the signal, then the heart rate is equal to ((Ls−1)*60*Fs)/(y−x) BPM.
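The heart rate formula may be illustrated with the following sketch. It assumes that the S1 peak locations are sample indices in ascending order. The function name is hypothetical.

```python
def heart_rate_bpm(s1_locations, fs):
    """Heart rate = ((Ls - 1) * 60 * Fs) / (y - x), in beats per minute.

    s1_locations: sample indices of the S1 peaks, in ascending order.
    fs: sampling frequency of the audio signal in Hz.
    """
    ls = len(s1_locations)
    if ls < 2:
        raise ValueError("at least two S1 peaks are needed")
    x = s1_locations[0]    # location of the first S1 peak
    y = s1_locations[-1]   # location of the last S1 peak
    if y <= x:
        raise ValueError("S1 locations must be in ascending order")
    return (ls - 1) * 60 * fs / (y - x)
```

For example, four S1 peaks at samples 1000, 4200, 7400, and 10600 in a signal sampled at 4000 Hz span three cardiac cycles in 9600 samples (2.4 s), giving (3*60*4000)/9600 = 75 BPM.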
The other diagnostic information that may be calculated and displayed depends upon what information was stored during the analysis of the audio signal. For example, the type of a murmur is generally indicated by where in the cardiac cycle the murmur is located. A diastolic murmur sound occurs after the S2 sound, and a systolic murmur sound occurs between the S1 sound and the S2 sound, with an early systolic murmur sound occurring close to the S1 sound and a late systolic murmur sound occurring close to the S2 sound. If the locations of potential murmurs as detected by the previously described kurtosis computations are stored, this information can be used in conjunction with S1 and S2 locations to help determine what type of murmur is present. Information saved during timing based error correction regarding correction of S1 and S2 peaks may also be used to provide diagnostic information. As previously mentioned, if any S1 peak is corrected by the timing based error correction, aortic stenosis may be indicated. Further, if S2 peaks are corrected by timing based error correction, aortic regurgitation may be indicated. In addition, once a murmur peak is located, it is possible to provide the time duration of the murmur and information regarding the intensity and frequency content of the murmur.
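The location-based labeling of murmurs may be sketched as follows. The text does not state how "close to" the S1 or S2 sound is quantified, so this sketch assumes, purely for illustration, that the first and last thirds of the S1-to-S2 interval count as early and late systole, respectively. The function name, the `near_fraction` parameter, and the returned labels are hypothetical.

```python
def classify_murmur(murmur_loc, s1_loc, s2_loc, next_s1_loc,
                    near_fraction=1 / 3):
    """Label a stored murmur location by its position in the cardiac cycle.

    near_fraction is an assumed cutoff for "close to" S1 or S2; the
    described embodiments do not specify one.
    """
    if s1_loc < murmur_loc < s2_loc:
        # Relative position within systole: 0 at S1, 1 at S2.
        position = (murmur_loc - s1_loc) / (s2_loc - s1_loc)
        if position <= near_fraction:
            return "early systolic"
        if position >= 1 - near_fraction:
            return "late systolic"
        return "systolic"
    if s2_loc < murmur_loc < next_s1_loc:
        return "diastolic"
    return "unclassified"
```

A label produced this way is only auxiliary information for the human expert; it may be combined with any stored S1 or S2 corrections when presenting diagnostic information.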
Turning now to
With the definitions provided in Table 1, the flow graph in
Initially, an audio signal of heart sounds is received and normalized and t is set to 1 (400). A peak is then located within a search window that meets the criteria for being the initial S1 peak (i.e., S1 (t)) in the signal (401-406). The located peak is then tested to see if it is a murmur peak (407). The test for a murmur peak is described below in reference to
When the initial S1 peak is located (409), the search window is moved (410), and a peak that meets the criteria for being the next S1 peak (i.e., S1 (t+1)) in the audio signal is located within the search window (411-423). The located peak is then tested to see if it is a murmur peak (424). The test for a murmur peak is described below in reference to
When the next S1 peak is located (426), a peak between the previous S1 peak and the next S1 peak is located that meets the criteria for being the S2 peak (427-435). Once this candidate S2 peak is located, the candidate S2 peak is checked to see if it is actually an S3 peak or a late systolic murmur peak (436-439). If the candidate S2 peak is found to be an S3 peak or a late systolic murmur peak, another peak is identified as the S2 peak (440). Otherwise, the candidate S2 peak is accepted as the S2 peak, pending timing based error correction. The checking of the candidate S2 peak to see if it is a late systolic murmur includes performing frequency domain kurtosis (438-439).
Once the S2 peak is located, further checks are performed to ensure that the peaks located for the previous S1 peak, the next S1 peak, and the S2 peak are actually the previous (or initial) S1 peak, the next S1 peak, and the S2 peak (441-448). One of the checks that may be performed is a check to see if there are peaks between the previous S1 peak and the S2 peak that are not murmurs (444-445), i.e., that there are peaks between peaks selected as the previous S1 peak and the S2 peak that may also be S1 and S2 peaks. The check for non-murmur peaks is performed only if the distance between the previous S1 peak and the S2 peak is greater than the distance between the S2 peak and the next S1 peak. This check for non-murmur peaks is described below in reference to
If the further checks are successfully completed, then if the first iteration of the S1/S2 location process has been completed (449) (i.e., the S1 peak and S2 peak for the first cardiac cycle in the audio stream have been located), timing based error correction is performed to further ensure that the peaks located for S2 and the next S1 are the correct peaks (450-465). As was previously discussed, timing based error correction uses various average distances between S1 and/or S2 peaks to verify the current selections for the S2 peak and the next S1 peak. After timing based error correction is performed, the average distances are updated based on the locations of the S1 and S2 peaks located in the current iteration (466).
A final check is then made to ensure that the peaks located for the previous S1 peak and the S2 peak are actually the previous (or initial) S1 peak and the S2 peak (441-448). This final check is a check to see if there are peaks between the previous S1 peak and the S2 peak that are not murmurs, i.e., that there are peaks between peaks selected as the previous S1 peak and the S2 peak that may also be S1 and S2 peaks (467-468). This check for non-murmur peaks is described below in reference to
If non-murmur peaks are not found between the previous S1 peak and the S2 peak, and the end of the audio signal has not been reached (469), the method loops back to (410) to locate the next S2 peak and the next S1 peak in the audio signal. If the end of the audio signal has been reached, then the heart rate and other diagnostic information may be calculated and displayed in a PCG of the audio signal (470). In addition, the locations of the S1 and S2 peaks may be demarcated in the PCG using symbols, colors, and/or any other suitable demarcation scheme.
Embodiments of the methods described herein may be implemented on virtually any type of computing system. For example, as shown in
Further, those skilled in the art will appreciate that one or more elements of the aforementioned computer system (1900) may be located at a remote location and connected to the other elements over a network. Further, embodiments of the invention may be implemented on a distributed system having a plurality of nodes, where each portion of the system and software instructions may be located on a different node within the distributed system. In one embodiment of the invention, the node may be a computer system. Alternatively, the node may be a processor with associated physical memory. The node may alternatively be a processor with shared memory and/or resources. Further, software instructions to perform embodiments of the invention may be stored on a computer readable medium such as a compact disc (CD), a diskette, a tape, a file, or any other computer readable storage device.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.
It is therefore contemplated that the appended claims will cover any such modifications of the embodiments as fall within the true scope and spirit of the invention.
The present application claims priority to U.S. Provisional Patent Application No. 61/023,581, filed on Jan. 25, 2008, entitled “Robust Heart Rate Detection In The Presence of Pathological Conditions.”
Number | Date | Country
---|---|---
61023581 | Jan 2008 | US