The present invention generally relates to systems and methods for distinguishing between sounds and, in particular, to systems and methods for determining a volume of ingested fluid by monitoring swallowing sounds.
One of the major concerns of a nursing mother is the verification that her infant has ingested an appropriate amount of breast milk. Knowing that the appropriate amount of milk has been ingested gives the mother feedback on the level of hydration, the fullness of the infant's stomach, and the infant's general comfort. Solving this problem would allow the mother to feel more secure and confident about the infant's feeding, and could signal a potential problem if the infant is not receiving enough milk.
There is a need for a system and method to monitor fluid intake of an individual and provide an indication of the intake volume so that an assessment can be made about the individual's status and needs.
In one form, the invention determines the number of swallows to indicate the volume of fluid ingested. The system and method acoustically monitor the number of times a swallow occurs during feeding using an acoustic sensor coupled with conditioning and processing. By multiplying the number of swallows by the volume per swallow, the total volume of fluid ingested can be determined.
Other objects and features will be in part apparent and in part pointed out hereinafter.
Corresponding reference characters indicate corresponding parts throughout the drawings.
Newborns consume about 30 to 90 ml (1 to 3 US fluid ounces), and after the age of four weeks, infants consume about 120 ml (4 US fluid ounces) per feed. Each infant is different, and as it grows the amount will increase. The system and method of the invention assists in determining an infant's intake, particularly during breast feeding.
A processor 114, such as a microprocessor or a personal computer, receives the conditioned signal 112. Alternatively, the processor 114 may be a DSP (digital signal processor) chip such as TI-6713 manufactured by Texas Instruments, Inc. The processor 114 is programmed with algorithms and/or instructions for discriminating swallowing sounds from other detected sounds and determining a number of swallows indicated by the conditioned signal 112. In addition, the processor 114 is programmed with instructions for calculating the volume of the fluid ingested as a function of the determined number of swallows. An output indicator, such as an audio device or a display 116, driven by the processor 114 provides an indication (such as the number of swallows and the total volume consumed) that corresponds to the calculated volume. It is also contemplated that the A/D converter 110C may be an integral part of the processor 114.
In one embodiment, the sensor 104 may be a digital sensor which directly provides a digital signal 108D to the processor 114, so that the signal conditioning circuit is optional and need not be employed. In this embodiment, the processor 114 would be programmed with filtering instructions.
In the event that sensor 104 is a microphone, a sensor driver 118 may be employed to excite its membranes in order to convert mechanical energy to electrical energy. It is contemplated that the sensor driver 118 may be different for various types of microphones. For example, a piezoelectric accelerometer, which requires a constant current source, may be used as the sensor 104. One embodiment of a microphone circuit for a piezoelectric accelerometer is shown in
Those skilled in the art will recognize that several types of microphones may be used for sensor 104. In one embodiment, contact microphones, or accelerometers, or vibration sensors may be employed because of the higher S/N ratio such devices provide. An example of such an accelerometer is model no. 352C22 manufactured by PCB Piezotronics. Both qualitative and quantitative observations may be used in the course of making a hardware selection for the sensor.
A power supply 120, such as a battery or an external source, may be used to provide power to the components of the system 102. Also, a start/reset device 122 such as a switch or button may be connected to the processor to turn on, reset or turn off the system 102.
In one embodiment, the invention comprises a method for measuring a volume of a fluid ingested by swallows, comprising:
Noise in an acoustic system such as this may be defined as any electronic signal other than the “sound of liquid swallowing”. Hence, as described below, an acoustic signature of vocalization, breathing or air swallowing could also be considered as noise. Although these signatures may have a similar amplitude and frequency range as acoustic signatures of swallowing, it has been found that the sound of swallowing has a very distinct wave shape as observed from time domain and frequency-time domain analysis (see below), as compared to the signature of various noises.
Quantitatively, noise could be defined as background noise or electronic white noise. In some embodiments, the sensor should be selected such that signal 108 corresponding to a swallow has an amplitude which is above this background noise. For example, the S/N ratio can be calculated as:
$$\mathrm{S/N\ (dB)} = 10\log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}\right) = 20\log_{10}\left(\frac{A_{\mathrm{signal}}}{A_{\mathrm{noise}}}\right)$$
where P=power and A=amplitude (e.g., current, voltage).
From qualitative analysis and familiarity with the sensing approaches noted below, signals with an S/N ratio below 10 dB were found to be unidentifiable. Most signals that were identifiable had S/N ratios above 15 dB.
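The S/N thresholds above can be illustrated with a minimal Python sketch. It assumes the conventional decibel definitions (10·log10 of a power ratio, 20·log10 of an amplitude ratio). The function names and the "marginal" label for the 10 to 15 dB band are hypothetical and are not taken from the described system.

```python
import math


def snr_db_from_power(signal_power, noise_power):
    """S/N ratio in dB from signal and noise power."""
    return 10.0 * math.log10(signal_power / noise_power)


def snr_db_from_amplitude(signal_amplitude, noise_amplitude):
    """S/N ratio in dB from signal and noise amplitude (e.g., voltage or current)."""
    return 20.0 * math.log10(signal_amplitude / noise_amplitude)


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def identifiability(snr_db):
    """Map an S/N ratio onto the observations reported in the text.

    Below 10 dB signals were unidentifiable; most identifiable signals were
    above 15 dB. The label for the band in between is an assumption.
    """
    if snr_db < 10.0:
        return "unidentifiable"
    if snr_db > 15.0:
        return "likely identifiable"
    return "marginal"
```

For example, a swallow segment whose RMS amplitude is ten times that of a quiet background segment gives `snr_db_from_amplitude(10.0, 1.0)`, i.e. 20 dB, which falls in the range where most signals were identifiable.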
Microphone placement may also be an important consideration when designing such a system 102. In order to develop a base line system and prove feasibility of detecting sounds of swallowing, a piezoelectric accelerometer may be selected. In one embodiment, this accelerometer may be taped on the front of an infant's neck. However, subsequent studies suggest that similar acoustic data could be collected through a condenser microphone placed inside a breast feeding pillow. Thus, as illustrated in
In one embodiment, a hardware filter such as band pass filter 110A may be employed to eliminate signals outside the desired range of frequency (500 Hz<f<5 kHz). One embodiment of such a band pass filter is shown in
To prove feasibility, a “Behringer UB802” mixer may be used to filter and amplify acoustic signals. As indicated earlier, acoustic signals of swallowing lie between 500 Hz and 5 kHz. Hence, signals below 500 Hz (i.e., sounds of breathing, heartbeat, etc.) could be eliminated by adjusting the gain of this frequency band to −12 dB. All signals above 5 kHz (i.e. sounds like higher pitch vocalization noises, high frequency electronic noises, etc.) could be eliminated by adjusting the gain of this frequency band to −12 dB. Finally, all signals that belonged within the desired frequency range of 500 Hz-5 kHz were amplified 10-12 dB by the amplifier 110B.
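Where the filtering is performed digitally rather than in hardware, the 500 Hz to 5 kHz pass band could be approximated in software. The following is a minimal sketch using a single second-order (biquad) band-pass section with the widely used "Audio EQ Cookbook" coefficient formulas. The choice of a single biquad, the geometric-mean center frequency, and all function names are assumptions for illustration. This is not the filter circuit of band pass filter 110A, and a single section has gentle skirts compared with a mixer's equalizer.

```python
import math


def bandpass_coefficients(low_hz, high_hz, sample_rate_hz):
    """Biquad band-pass coefficients (unity gain at the center frequency)."""
    center_hz = math.sqrt(low_hz * high_hz)   # geometric mean of the band edges
    q = center_hz / (high_hz - low_hz)        # quality factor from the bandwidth
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_biquad(samples, b, a):
    """Filter samples with a direct-form I biquad."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def steady_state_gain(freq_hz, sample_rate_hz, b, a, n=8820, skip=1000):
    """Amplitude gain for a test tone, ignoring the initial transient."""
    tone = [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]
    filtered = apply_biquad(tone, b, a)
    return rms(filtered[skip:]) / rms(tone[skip:])
```

With a 44.1 kHz sample rate (an assumed value), a tone near the center of the band passes essentially unchanged, while a 50 Hz tone, in the range attributed above to breathing and heartbeat, is strongly attenuated.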
In one embodiment, the amplifier 110B may be configured to amplify the signal within the desired range, such as by using a two stage amplification circuit. Two stage amplification helps in achieving the desired S/N ratio. All signals within the desired frequency range of 500 Hz-5 kHz are amplified 10-12 dB after the band pass filter 110A by a dual stage amplifier such as the amplifier 110B, as illustrated in
In one embodiment, the analog to digital converter 110C may include a 24 bit sound card. A simplified version of a 2 bit flash A/D converter is shown in the
According to one embodiment of the invention, acoustic data is collected during breast feeding and analyzed to identify the acoustic features of interest and, particularly, the sound of swallowing. The acoustic data may be classified into several categories, including but not limited to:
1) Milk Swallow, including only swallows of a substantial volume of milk;
2) Air Swallow, including dry swallows or other swallows that do not involve the ingestion of milk;
3) Breathing, including inhaling and exhaling;
4) Vocalization, including vocal cord sounds from the infant or others; and
5) Noise, including scratching and other ambient noises.
Each of these categories of data may be analyzed in time and frequency domains. Some of the distinct features observed in these classes are illustrated in figures below.
Milk Swallow: In the time domain an acoustic signature of a milk swallow is divided into three distinct parts: a “click”, a “chug” and a “click”, as illustrated in
In the frequency domain, it was observed that the sounds of swallows fall between 500 Hz and 5 kHz. In the time and frequency domains, a pattern unique to milk swallows was also observed, as shown in
Vocalization: This category of data includes sounds of crying, coughing, etc. Although vocalization was observed to have a similar frequency as milk swallow, it was observed that the waveform of the signal is different in time and frequency-time scales, as shown in
Noise: This category included background noises such as scratching caused by movement, electronic noise and other signals that could not be classified into the categories mentioned above.
The peak frequency of such signals was different from other categories of acoustic data, as illustrated in
1) variations within different infants;
2) variations within the same infant;
3) microphone placement that affects S/N ratio; and
4) unpredictable ambient noise like scratching, etc.
In one embodiment, the processor 114 may be trained to detect an acoustic signature of milk swallows. Alternatively or in addition, an algorithm based on the observations described above may be used to develop an inclusion criterion for detecting an acoustic signature of milk swallows, as illustrated in
Parameters such as the peak amplitude and duration of each of the “click”, “chug” and “click” parts were used, as shown in
In one embodiment, it is contemplated that sounds of swallowing may be classified using a wave-shape type of algorithm, which tends to work well with this type of data set. However, such a wave-shape algorithm may be sensitive and could fail when subjected to a highly variable environment. Hence, there may be a need to implement a more robust algorithm.
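A wave-shape inclusion criterion of the kind described above could be sketched as follows. The sketch assumes the signal has already been split into segments, each summarized by a peak amplitude and a duration. The `Segment` and `Criterion` types, the function names, and every threshold value are hypothetical placeholders, not values taken from the studies.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    peak_amplitude: float  # normalized peak amplitude of the segment
    duration_ms: float     # segment duration in milliseconds


@dataclass
class Criterion:
    # All thresholds are placeholder assumptions for illustration only.
    min_peak: float = 0.1
    click_max_ms: float = 30.0
    chug_min_ms: float = 50.0
    chug_max_ms: float = 400.0


def is_click(seg, c):
    return seg.peak_amplitude >= c.min_peak and seg.duration_ms <= c.click_max_ms


def is_chug(seg, c):
    return (seg.peak_amplitude >= c.min_peak
            and c.chug_min_ms <= seg.duration_ms <= c.chug_max_ms)


def count_milk_swallows(segments, criterion=None):
    """Count non-overlapping click, chug, click sequences."""
    c = criterion or Criterion()
    count = 0
    i = 0
    while i + 2 < len(segments):
        first, middle, last = segments[i], segments[i + 1], segments[i + 2]
        if is_click(first, c) and is_chug(middle, c) and is_click(last, c):
            count += 1
            i += 3  # consume the whole swallow
        else:
            i += 1  # slide forward by one segment
    return count
```

Because the thresholds are fixed, a criterion like this is simple and fast but shares the sensitivity to variable conditions noted above.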
In one embodiment, it is contemplated that sounds of swallowing may be classified using a probabilistic model (e.g., HMM) as noted herein. Such a model computes the probability of occurrence of an event/state based on a large training set of data. However, although this algorithm may be fairly robust, it may also be computationally intensive. Hence, a third approach may be an HMM/ANN hybrid technique including elements from the wave-shape algorithm.
In accordance with the above, the processor 114 executes determining instructions 200 as illustrated in
Although developing such algorithms as noted above can be successful in detecting and discriminating sounds of liquid swallows, other techniques and/or approaches are contemplated for developing an algorithm. One such approach is based on Automatic Speech Recognition (ASR) and uses Hidden Markov Models (HMM) to create instructions for execution by processor 114 that are capable of identifying different sounds present during nursing. Other alternatives in use in ASR fall into the areas of discriminant analysis (linear, multiple, non-linear), pattern recognition, and pattern classification. Another such approach, also common in ASR, is based on Artificial Neural Networks (ANN) for processing and discriminating sounds of liquid swallows. ANNs are trained on a large body of data, “learning” the patterns of different kinds of signals, and classify a test pattern as the best matching pattern. Once trained, they are computationally efficient and easy to implement. Hidden Markov Models are more complex, but are typically used to handle variable duration signals, such as those in swallowing. Both methods are robust to a certain degree of missing features or noise. One contemplated embodiment uses a hybrid of the ANN computational efficiency and the HMM variable duration modes.
The adaptive speech recognition approach uses a large set of data to train a processor for specific parameters. This trained processor can then stochastically predict acoustic signatures of swallowing. This approach is very useful for a large data set with a lot of variations.
In one embodiment, Automatic Speech Recognition (ASR) and Hidden Markov Models (HMM) are used to create instructions that are capable of identifying different sounds present during nursing.
In one approach from two clinical studies, training samples of five classes of sounds were isolated: Milk swallow; Air swallow; Breathing; Vocalization (crying, grunting, etc); and Other.
Automatic Speech Recognition includes feature extraction approaches which tend to extract and reduce the dimensionality of the speech signal to mediate online processing. The following is a description of one embodiment of an ASR approach using the swallowing sound example illustrated in
Often the first step in decomposing the spectral content of a speech signal is the Short-Time Fourier Transform (STFT), which is typically displayed with time along the x-axis and frequency along the y-axis, with the DC signal at the bottom. In this example, 256 channels are used. A common approach is then to convert these to cepstral coefficients, typically 12 channels plus one for the power. The cepstral coefficients have the advantage that they are largely uncorrelated, which is helpful for pattern recognition.
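The framing and cepstral steps described above can be sketched in plain Python. This minimal illustration computes a real cepstrum (the inverse transform of the log magnitude spectrum) for each windowed frame and keeps 12 coefficients plus a log power term. The Hann window, the frame and hop lengths, and the function names are assumptions, and production ASR front ends commonly add a perceptual (e.g., mel) frequency warping that is omitted here.

```python
import cmath
import math


def hann_window(n):
    """Hann window used to taper each frame before the transform."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitude spectrum via a direct DFT (slow, but dependency-free)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n)]


def frame_features(frame, num_ceps=12, floor=1e-10):
    """Return num_ceps real-cepstrum coefficients plus one log power term."""
    n = len(frame)
    log_mag = [math.log(max(m, floor)) for m in dft_magnitudes(frame)]
    # Skip quefrency 0, which only carries overall level; power is kept separately.
    ceps = [sum(lm * math.cos(2.0 * math.pi * k * q / n)
                for k, lm in enumerate(log_mag)) / n
            for q in range(1, num_ceps + 1)]
    power = sum(x * x for x in frame) / n
    return ceps + [math.log(max(power, floor))]


def stft_features(samples, frame_len=256, hop=128):
    """Slide a window over the samples and extract features for each frame."""
    window = hann_window(frame_len)
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        features.append(frame_features(frame))
    return features
```

One consequence of keeping the power in its own channel is that doubling the signal amplitude leaves the 12 cepstral coefficients unchanged and shifts only the power term.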
Thereafter, it is common to apply the RASTA algorithm for environmental adaptation and simulating auditory masking. The RASTA output is normally weighted to give higher bands more equal footing with the power. Lastly, it is common to also calculate the “delta-ceps”, namely, the trajectory or direction vector of the coefficients over time, and similarly to weight them with a gain factor.
In typical ASR applications, one may use the RASTA and the delta-ceps together to give a 26-feature vector (13 from each) at each time window, typically 10 ms (100 Hz). However, the adaptation afforded by the RASTA algorithm may diminish the effect of the power levels too severely. Therefore, this current example will only use the first two cepstral coefficients for simplicity and demonstration.
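The stacking of 13 static features with 13 "delta-ceps" into a 26-feature vector can be sketched as follows. The sketch uses a simple central difference for the trajectory, with one-sided differences at the first and last frames. That difference scheme, the `delta_gain` parameter, and the function names are assumptions. The RASTA filtering step is not shown.

```python
def delta_features(features):
    """Per-coefficient rate of change across neighboring time frames."""
    n = len(features)
    deltas = []
    for t in range(n):
        lo = max(t - 1, 0)
        hi = min(t + 1, n - 1)
        span = hi - lo  # 2 in the interior, 1 at the edges, 0 for a single frame
        if span == 0:
            deltas.append([0.0] * len(features[t]))
        else:
            deltas.append([(b - a) / span
                           for a, b in zip(features[lo], features[hi])])
    return deltas


def stack_with_deltas(features, delta_gain=1.0):
    """Append gain-weighted deltas to each frame's static features."""
    return [list(static) + [delta_gain * d for d in dyn]
            for static, dyn in zip(features, delta_features(features))]
```

With one 13-element feature vector every 10 ms, `stack_with_deltas` yields one 26-element vector per frame. The example in this description would instead pass only the first two cepstral coefficients through the same steps.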
Having extracted the features from the training set, recognition models for different types of sounds are created. In ASR, where there is a large volume of available training data, sub-models are built that recognize short building blocks of words (e.g. phonemes). Because of the limited amount of training data in this example, a recognition model for each training sound is created.
The basis for the models used in this example is the Hidden Markov Model (HMM). Rather than describe the theory, which can easily be found elsewhere, the example will describe an implementation. It is assumed for this example that the process of swallowing consists of a number of states (e.g., esophageal opening, contractions, etc.) which, in concert, perform the sequential actions of swallowing, but cannot be directly observed; hence, they are referred to as hidden states. Each state has a probability of staying in the current state, or moving into the next state of swallowing. In this way, the model captures variability in the duration of each stage. Moreover, each state can result in a variety of sounds (or none). Thus, an HMM is a doubly probabilistic model of a system.
In the analysis of sounds to determine swallowing, two of the common problems to which HMMs are applied are addressed: 1) learning the parameters of an HMM that fit an observation or group of observations, and 2) recognition of a given observation as belonging to a particular model.
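The recognition problem (item 2 above) can be illustrated with a minimal sketch of the forward algorithm for a left-to-right HMM, in which each state either stays put or advances to the next state. For brevity the sketch assumes discrete, pre-labeled observation symbols, whereas a real system would score continuous feature vectors. The class name, the symbol labels, and all probabilities in the usage example are hypothetical, and parameter learning (item 1) is not shown.

```python
import math


class LeftToRightHMM:
    def __init__(self, stay_probs, emissions):
        # stay_probs[i]: probability of remaining in state i at the next step;
        # the remainder, 1 - stay_probs[i], is the probability of advancing.
        # emissions[i]: dict mapping an observation symbol to its probability
        # in state i.
        self.stay = stay_probs
        self.emit = emissions

    def log_likelihood(self, observations):
        """Log probability of the observations (scaled forward algorithm)."""
        n = len(self.stay)
        alpha = [0.0] * n
        alpha[0] = self.emit[0].get(observations[0], 0.0)  # start in state 0
        total = 0.0
        for t, obs in enumerate(observations):
            if t > 0:
                new_alpha = [0.0] * n
                for j in range(n):
                    reach = alpha[j] * self.stay[j]
                    if j > 0:
                        reach += alpha[j - 1] * (1.0 - self.stay[j - 1])
                    new_alpha[j] = reach * self.emit[j].get(obs, 0.0)
                alpha = new_alpha
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")  # the model cannot produce this sequence
            total += math.log(scale)
            alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        return total


def classify(observations, models):
    """Return the name of the model with the highest log likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(observations))
```

As a usage example, a three-state "milk swallow" model whose states favor a click, a chug, and a click will score the sequence click, chug, chug, click far higher than a one-state "breathing" model. The self-transition on the middle state absorbs the longer chug, which is how the model handles variable duration.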
In this example, HMM coefficients were determined from 39 training samples. The HMM states for the example sound are illustrated in
Again, because of the limited sample size, full cross validation, or the “leave one out” method, may be performed. For example, the first swallowing sound is tested against all the models, except for the model that was built on this sound. Each of the 39 sounds is subsequently tested in the same way.
For each sound, a model is found which provides a match, and the model class (e.g. one of the milk swallowing models) is compared to the “true” classification of the sound. This gives resulting true and false positives which can be evaluated.
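The "leave one out" procedure above can be sketched generically. Each labeled sound acts as its own model, every sound is scored against all models except its own, and the label of the best match is compared with the true label. The `score` callback stands in for whatever model-matching measure is used (for example, an HMM log likelihood). It, the function names, and the toy data in the usage note are assumptions.

```python
def leave_one_out(samples, score):
    """samples: list of (label, data); score(model_data, test_data) -> float.

    Returns a list of (true_label, predicted_label) pairs, where the
    prediction is the label of the best-scoring model other than the one
    built from the test sample itself.
    """
    results = []
    for i, (true_label, test_data) in enumerate(samples):
        best_label, best_score = None, float("-inf")
        for j, (model_label, model_data) in enumerate(samples):
            if j == i:
                continue  # leave out the model built on this sound
            s = score(model_data, test_data)
            if s > best_score:
                best_label, best_score = model_label, s
        results.append((true_label, best_label))
    return results


def true_false_positives(results, target):
    """Count true and false positives for one target class."""
    tp = sum(1 for truth, pred in results if pred == target and truth == target)
    fp = sum(1 for truth, pred in results if pred == target and truth != target)
    return tp, fp
```

With toy one-dimensional data and `score = lambda a, b: -abs(a - b)`, the samples `[("milk", 1.0), ("milk", 1.1), ("air", 5.0), ("air", 5.2), ("milk", 4.9)]` give two true positives and one false positive for the "milk" class.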
In summary, the system and method of the invention address the problem of ambiguity in the amount of milk ingested by an infant during nursing. In one embodiment, the invention comprises a small microphone that would be placed on or near an infant's throat to pick up the audio signal of each swallow and a receiver that gathers the signal and displays a result. As noted above, various analog and/or digital audio filtering techniques may be applied so that the receiver is sensitive to the characteristic signal from a swallow. Other external ambient audio signals are ignored to avoid affecting the measurement. The number of swallows during a feeding is counted. The beginning and ending of the feeding may be triggered by a switch or button activated by the mother.
The following non-limiting examples are provided to further illustrate various options of the present invention. There could be several variations of the display and processor. For example, the sensor 104 may be wireless and transmit its signal to the filter 110A or to the processor 114, if the signal is digital. The swallow volume may be tuned or changed depending on the infant's age or size. It is contemplated that the sensor 104 could be enclosed in a soft, loosely held strap located around the infant's neck, or be placed under the infant, or be attached to the infant's clothing, or not contact the infant at all, depending on the sensitivity and selectivity of the sensor and filter. The processor 114 may have an internal memory that stores the fluid volume from previous feedings, allowing the mother to evaluate a longer-term window of feeding and hydration and enabling trend monitoring.
In one example, the display 116 may be provided with a red/green light indicator that turns green as it processes/detects sounds of swallowing. This reassures the mother that feeding is going on properly. This device will not require any user inputs.
In another example, the processor 114 processes sounds of swallowing and the display 116 displays the number of swallows and an estimate of the volume of milk ingested, obtained by multiplying the number of swallows by the average volume of milk ingested by an infant during each swallow. The processor 114 may be programmed to receive specific user inputs such as the age, weight and/or gender of the infant in order to adjust the processing of the sounds and/or the calculations. For example, with increased age and/or weight, the average volume per swallow may be increased. Also, the volume per swallow may be different for males and females, also depending on weight and/or age.
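The volume calculation described above amounts to a single multiplication, sketched below together with a simple feeding history of the kind an internal memory might hold. The per-swallow volume is left as an input because no value is given in this description. The 0.5 ml figure in the usage note is a placeholder only, and the `FeedingLog` class and its methods are hypothetical.

```python
def estimate_volume_ml(swallow_count, volume_per_swallow_ml):
    """Total ingested volume = number of swallows x volume per swallow."""
    if swallow_count < 0 or volume_per_swallow_ml < 0:
        raise ValueError("count and per-swallow volume must be non-negative")
    return swallow_count * volume_per_swallow_ml


class FeedingLog:
    """Keeps estimated volumes from previous feedings for trend monitoring."""

    def __init__(self):
        self.volumes_ml = []

    def record(self, swallow_count, volume_per_swallow_ml):
        volume = estimate_volume_ml(swallow_count, volume_per_swallow_ml)
        self.volumes_ml.append(volume)
        return volume

    def average_ml(self):
        """Mean estimated volume over the recorded feedings."""
        if not self.volumes_ml:
            return 0.0
        return sum(self.volumes_ml) / len(self.volumes_ml)
```

For example, 150 counted swallows at a placeholder 0.5 ml per swallow give `estimate_volume_ml(150, 0.5)`, i.e. 75.0 ml. Adjusting for age, weight or gender would change only the per-swallow value passed in.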
One advantage to this acoustic system is that it operates passively and no external signal (such as ultrasound analysis) is transmitted from the system to the infant to monitor swallowing. Thus, the infant does not have to be subjected to any external signals.
When introducing elements of the present invention or the preferred embodiment(s) thereof, the articles “a”, “an”, “the” and “said” are intended to mean that there are one or more of the elements. The terms “comprising”, “including” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements.
In view of the above, it will be seen that the several objects of the invention are achieved and other advantageous results attained.
As various changes could be made in the above constructions, products, and methods without departing from the scope of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.
Having described the invention in detail, it will be apparent that modifications and variations are possible without departing from the scope of the invention defined in the appended claims.