In modern medical practices physicians often dictate medical reports into a telephone transcription center. As a physician dictates a report via telephone into the transcription center the report is recorded into an audio file, typically in digital format, and stored into a database. The transcription center will stop recording audio when it receives a hang up signal from the telephone system. The recorded audio file is then accessed either by a manual transcriptionist for transcribing, or more recently, the audio file is processed by a speech recognition engine. In the latter case, the speech recognition engine processes the audio into text and prepares a written report of the original dictated report.
It has been found that the transcription center will sometimes continue to record audio even after the physician has completed the dictation and hung up the telephone. In these cases the transcription center does not recognize a hang up signal from the telephone system, or the telephone fails to immediately send a hang up signal and close the telephone line. This results in audio files recorded at the transcription center that often include anomaly portions, e.g., hang up sounds, busy signals, fast busy signals and extended superfluous “dead” air portions. It is not uncommon for audio files to contain a limited amount of actual valuable audio, i.e., medical dictation, and a substantial portion of anomaly signals. In some cases it has been found that the anomaly portion of any particular audio file can be up to and even more than ten minutes in length.
Audio files containing anomalies can waste valuable database, transcriptionist and overall transcription system resources. Further, such audio files have been known to crash speech recognition engines and to provide poor samples for the acoustic signature of a particular physician during speech recognition engine training.
Various systems have been implemented to solve this problem. For example, there are systems that detect busy signals; however, such systems tend to run a Fast Fourier Transform (“FFT”) or some other kind of signal detection, which may be slow or may require special hardware. It is desirable to provide a system that performs the searching and detection of anomalies in sound recordings in a fast and efficient manner and without the need for additional expensive software or hardware.
The present invention includes a system and method for detecting and repairing audio recordings that contain busy signals and extended periods of silence by searching for clusters of silence by reviewing the amplitude in an audio recording sample and listing each silence and sample time.
In a first aspect, the present invention includes a method for detecting and repairing audio recordings, where the method includes detecting an anomaly in the audio recording, detecting at least one silence in the audio recording, determining the length of the at least one silence, detecting a busy signal in the audio recording, and identifying the detected busy signal. In some embodiments the present invention includes identifying fast or slow busy signals or identifying the rate of the busy signals, where in some embodiments a rate of X equals fast and a rate of Y equals slow.
The present invention may also include the steps of detecting a final silence and detecting a hang-up silence. In some embodiments detecting a hang-up silence may include the step of determining the location of said hang-up silence in the audio recording, or determining the location of the hang-up silence by subtracting the length of the detected busy signal from the location of the first inter-busy signal.
In a second aspect, the present invention includes a method for fixing busy signals found in an audio recording, where the method includes the steps of determining whether said busy signal is fixable, determining whether fast busy signal equals a pre-determined value X, determining whether slow busy signal equals a pre-determined value Y, determining whether hang-up silence equals a pre-determined value Z, determining whether final silence equals a pre-determined value XX, determining spacing between said fast busy signal and slow busy signal, and determining spacing between fast busy signal and final silence. In some embodiments X, Y, Z and XX have the following values: X=1; Y=1; Z=1; and XX=1.
In a third aspect, the present invention includes a method for fixing silences located in an audio recording, the method including the steps of determining whether a silence is fixable, determining whether fast busy signal equals a pre-determined value X, determining whether slow busy signal equals a pre-determined value Y, determining whether hang-up silence equals a pre-determined value Z, determining whether final silence equals a pre-determined value XX, determining spacing between said fast busy signal and slow busy signal, and determining spacing between fast busy signal and final silence. In some embodiments X, Y, Z and XX have the following values: X=0; Y=0; Z=0; and XX=1.
In a fourth aspect, the present invention includes a method for identifying a busy signal or a final silence, the method including the steps of determining whether said busies or final silence exist, determining whether fast busy signal equals a pre-determined value X, determining whether slow busy signal equals a pre-determined value Y, determining whether hang-up silence equals a pre-determined value Z, determining whether final silence equals a pre-determined value XX, determining spacing between said fast busy signal and slow busy signal, and determining spacing between fast busy signal and final silence. In some embodiments X, Y, Z and XX have the following values: X>0 OR Y>0; Z=1; and XX>0.
In a fifth aspect, the present invention includes a method for fixing anomalies in audio recordings, for use in routines such as the busy-fixing and silence-fixing routines, where the method includes the steps of truncating said audio recording at a pre-determined anomaly point and outputting said truncated audio recording. In some embodiments the truncating step includes an anomaly point equal to the hang-up point or an anomaly point equal to the first silence point. In some embodiments the truncated audio recording is output to a speech recognition engine.
In a sixth aspect, the present invention includes a computer apparatus structured and arranged to detect and repair audio recordings that contain busy signals and extended periods of silence by searching for clusters of silence by reviewing the amplitude in an audio recording sample and listing each silence and sample time.
The above advantages and features are of representative embodiments only, and are presented only to assist in understanding the invention. It should be understood that they are not to be considered limitations on the invention as defined by the claims, or limitations on equivalents to the claims. Additional features and advantages of the invention will become apparent from the drawings, the following description, and the claims.
While the specification concludes with claims particularly pointing out and distinctly claiming the present invention, it is believed the same will be better understood from the following description taken in conjunction with the accompanying drawings, which illustrate, in a non-limiting fashion, the best mode presently contemplated for carrying out the present invention, and in which like reference numerals designate like parts throughout the figures, wherein:
For simplicity and illustrative purposes, the principles of the present invention are described by referring mainly to exemplary embodiments thereof. However, one of ordinary skill in the art would readily recognize that the same principles are equally applicable to, and can be implemented in, all types of network systems, and that any such variations do not depart from the true spirit and scope of the present invention. Moreover, in the following detailed description, references are made to the accompanying figures, which illustrate specific embodiments. Electrical, mechanical, logical and structural changes may be made to the embodiments without departing from the spirit and scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense and the scope of the present invention is defined by the appended claims and their equivalents.
In practicing the invention, audio recordings in the form of digitized samples may be analyzed in certain predetermined amounts of time, for example, every 10-100 milliseconds. Each sample may be analyzed to determine the amplitude and to search for strings or series of amplitudes that are at or very near zero. A first review of the samples may search for clusters of silence in the audio recording data where certain speech signals can be located along with associated gaps therein. A list or index may be prepared identifying the location of non-silences and silences in the speech, along with the time period and number of each. In some embodiments, certain pre-determined minimum and maximum lengths of non-silence and/or silences can be set so that non-silences and silences in speech outside the pre-determined lengths can be bypassed during the search.
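The amplitude scan described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the function name `find_silences`, the default `threshold` of 2 amplitude units, and the representation of each silence as a `(start, length)` pair measured in samples are all hypothetical choices made for the example.

```python
from typing import List, Optional, Sequence, Tuple

def find_silences(samples: Sequence[int],
                  threshold: int = 2,
                  min_len: int = 1,
                  max_len: Optional[int] = None) -> List[Tuple[int, int]]:
    """List every run of near-zero samples as (start_index, length).

    threshold, min_len and max_len are assumed tuning values; runs whose
    length falls outside [min_len, max_len] are bypassed.
    """
    silences = []
    run_start = None
    for i, value in enumerate(samples):
        if abs(value) <= threshold:
            if run_start is None:
                run_start = i            # a new silence begins here
        elif run_start is not None:
            silences.append((run_start, i - run_start))
            run_start = None
    if run_start is not None:            # silence runs to the end of the data
        silences.append((run_start, len(samples) - run_start))
    return [(start, length) for start, length in silences
            if length >= min_len and (max_len is None or length <= max_len)]
```

Because the scan only compares each amplitude against a threshold, it makes a single pass over the samples and needs no frequency analysis.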
As an example, a busy signal located within an audio recording will likely be characterized as a very regular series of silences. Such an example is easily identifiable as a slow busy signal by locating a series of silences having a pre-determined number of samples in length. A fast busy signal may be identified by locating a silence period in between the individual signals of the fast busy signal. By using such exemplary criteria it may be possible to locate repeated periods of silence, which periods of silence may have a pre-determined regular length as well. Using the above described exemplary steps it is possible to determine where a fast busy signal is located within an audio file apart from the location of a slow busy signal. It is further possible by use of the present invention to determine the presence of a sequence of a particular pattern in each of the slow busy and fast busy signals as well as mixed patterns including alternating fast and slow busy signals. Such determinations may provide easily obtainable locations for possible truncation points in the audio file, so that the audio file can be repaired for speech recognition processes.
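One way to picture the distinction between slow and fast busy signals is to label each silence by its duration and then look for repeated runs of the same label. The sketch below is hypothetical: the names `classify_gap` and `find_busy_runs`, the 8000 Hz sample rate, the 20% tolerance, and the nominal gap durations of 0.5 s (slow) and 0.25 s (fast) are assumptions chosen for illustration and would need to be set for the telephone system in use. For brevity it checks only the silence lengths, not the spacing between them.

```python
from typing import List, Optional, Tuple

Silence = Tuple[int, int]  # (start_sample, length_in_samples)

def classify_gap(length: int, sample_rate: int = 8000,
                 tol: float = 0.2) -> Optional[str]:
    """Label a silence 'slow' or 'fast' by its duration, else None."""
    seconds = length / sample_rate
    # Assumed nominal inter-tone gaps; not values taken from the description.
    for label, nominal in (("slow", 0.5), ("fast", 0.25)):
        if abs(seconds - nominal) <= tol * nominal:
            return label
    return None

def find_busy_runs(silences: List[Silence], min_repeats: int = 3,
                   sample_rate: int = 8000) -> List[Tuple[str, int, int]]:
    """Return (label, first_gap_start, gap_count) for each regular run."""
    runs = []
    current_label, current = None, []

    def flush():
        if current_label is not None and len(current) >= min_repeats:
            runs.append((current_label, current[0][0], len(current)))

    for silence in silences:
        label = classify_gap(silence[1], sample_rate)
        if label != current_label:
            flush()                      # close the previous run, if any
            current_label, current = label, []
        current.append(silence)
    flush()
    return runs
```

A mixed pattern, such as a slow busy signal followed by a fast busy signal, then appears as two consecutive entries in the returned list.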
In utilizing the present invention, an audio recording must first be obtained. An audio recording in digital format is preferable, but not critical. There are embodiments of the present invention that may be implemented for use with analog recordings. Also preferable is a recording that is encoded with certain pre-determined characteristics of busy signals. The invention is designed for U.S. versions of PBX and telephone systems; however, the present invention may be implemented for other non-U.S. telephone systems, such as European systems, where the duration of certain busy signals is different than in standard U.S. practice.
Turning now to
The system in box 30 identifies strings of busy signals (described below). The search determines the location and length of each silence for comparison against a pre-determined amount of time (or length). As an example, the code may search for samples having an amplitude at or around 0 to identify silences, and for runs of around 50 such samples to identify lengths of silence. In box 30 the system identifies a busy signal period by determining the locations of strings of silences that are about the same length. Alternatively, the search may be conducted for strings in the relative silence. Also, alternatively, if busy signals are not located, strings of absolute silences may be searched for. The location of any strings of silences at or near the same length may indicate the location of data in the audio recording which may be truncated at the end of the process.
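The grouping of silences that are "about the same length" can be illustrated without assuming any particular busy cadence. In the hypothetical sketch below, the function name `strings_of_equal_silences`, the 10% relative tolerance, and the minimum of three silences per string are assumptions made for the example; silences are `(start, length)` pairs in samples.

```python
from typing import List, Tuple

Silence = Tuple[int, int]  # (start_sample, length_in_samples)

def strings_of_equal_silences(silences: List[Silence],
                              rel_tol: float = 0.1,
                              min_count: int = 3) -> List[List[Silence]]:
    """Group consecutive silences whose lengths are about the same.

    Each silence is compared with the first silence of the current string;
    rel_tol and min_count are assumed tuning values.
    """
    strings, current = [], []
    for silence in silences:
        if current:
            reference = current[0][1]
            if abs(silence[1] - reference) > rel_tol * reference:
                if len(current) >= min_count:
                    strings.append(current)
                current = []             # start a fresh string
        current.append(silence)
    if len(current) >= min_count:
        strings.append(current)
    return strings
```

The start of the first silence in each returned string is then a candidate location for the data to be truncated.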
In box 40 the system determines the location of a specified period, likely at the end or near the end of the audio file, which is at or near absolute silence.
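A minimal sketch of the final-silence search follows, assuming the silence runs all the way to the last sample of the file. The function name `find_final_silence`, the amplitude `threshold` of 2, and the minimum length of 50 samples are illustrative assumptions; a fuller version would also accept a silence that ends slightly before the end of the file.

```python
from typing import Optional, Sequence, Tuple

def find_final_silence(samples: Sequence[int], threshold: int = 2,
                       min_len: int = 50) -> Optional[Tuple[int, int]]:
    """Return (start, length) of the trailing near-silent run, or None."""
    start = len(samples)
    # Walk backwards from the end while the amplitude stays near zero.
    while start > 0 and abs(samples[start - 1]) <= threshold:
        start -= 1
    length = len(samples) - start
    return (start, length) if length >= min_len else None
```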
In box 50, the system determines the location of the final hang-up silence. The hang-up silence is the silence period just after the call is terminated and just before the first busy signal starts. Since this system locates the first of the evenly spaced silence periods in between the busy signal tones, the system looks for the short silence period starting one evenly spaced period before the first silence period.
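The arithmetic for locating the hang-up silence might be sketched as follows. The sketch assumes that the "evenly spaced period" is the distance between the starts of two successive inter-busy silences; the function name `locate_hangup_silence` and the `slack` of 25% of one period are hypothetical.

```python
from typing import List, Optional, Tuple

Silence = Tuple[int, int]  # (start_sample, length_in_samples)

def locate_hangup_silence(silences: List[Silence],
                          busy_gaps: List[Silence],
                          slack: float = 0.25) -> Optional[Silence]:
    """Find the silence starting about one busy period before the first gap.

    busy_gaps are the evenly spaced silences between busy tones.
    """
    if len(busy_gaps) < 2:
        return None                      # the period cannot be measured
    period = busy_gaps[1][0] - busy_gaps[0][0]
    expected_start = busy_gaps[0][0] - period
    for start, length in silences:
        if abs(start - expected_start) <= slack * period:
            return (start, length)
    return None
```

For example, with inter-busy silences starting at samples 24000 and 32000, the period is 8000 samples, so the hang-up silence is expected to start near sample 16000.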
In box 60 the system determines whether any busy signal identified is fixable. In some embodiments, if a slow busy is found, a fast busy is found, a hang-up silence is found, and a final silence is found, and there is a certain ordering of the events, then the busy signal is determined to be fixable.
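The fixability tests can be written as simple predicates over the number of each event found. The sketch below uses the example values given in the summary for the second through fourth aspects (X, Y, Z and XX); the `Findings` class and the function names are hypothetical, the semicolon-separated conditions of the fourth aspect are assumed to combine with a logical AND, and the spacing and ordering checks are left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Findings:
    fast_busy: int        # count compared against X
    slow_busy: int        # count compared against Y
    hangup_silence: int   # count compared against Z
    final_silence: int    # count compared against XX

def _counts(f: Findings):
    return (f.fast_busy, f.slow_busy, f.hangup_silence, f.final_silence)

def is_fixable_busy(f: Findings) -> bool:
    """X=1, Y=1, Z=1, XX=1 (spacing checks omitted in this sketch)."""
    return _counts(f) == (1, 1, 1, 1)

def is_fixable_silence(f: Findings) -> bool:
    """X=0, Y=0, Z=0, XX=1."""
    return _counts(f) == (0, 0, 0, 1)

def busy_or_final_silence_identified(f: Findings) -> bool:
    """(X>0 OR Y>0), Z=1 and XX>0; the AND combination is an assumption."""
    return ((f.fast_busy > 0 or f.slow_busy > 0)
            and f.hangup_silence == 1 and f.final_silence > 0)
```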
If the busy signal is determined to be fixable, then the system attempts to fix the audio recording by properly truncating the audio recording at a pre-determined location (discussed below) in box 70. In determining whether a signal is fixable, a sequence of experienced events in the recording is compared to a set of pre-determined events. For example, if the system determines a slow busy signal followed directly by a fast busy signal, followed directly by a period of silence, followed directly by the end of the file, then the sequence may fit a pre-designed template of anomalies programmed into the system search mechanism. If the experienced sequence does fit the pre-designed template, the system flows to box 70, where the audio recording is truncated, preferably at the beginning of the slow busy signal, and a repaired wave file is written out.
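The template comparison and truncation just described could look roughly like the following, using Python's standard `wave` module. The event names in `TEMPLATE`, the `(name, start_frame)` event representation, and the function names `fits_template` and `truncate_wav` are assumptions made for this sketch, which handles only uncompressed PCM wave files.

```python
import wave
from typing import Sequence, Tuple

Event = Tuple[str, int]  # (event name, start position in frames)

# Assumed template mirroring the example sequence in the text.
TEMPLATE = ("slow_busy", "fast_busy", "silence", "end_of_file")

def fits_template(events: Sequence[Event],
                  template: Sequence[str] = TEMPLATE) -> bool:
    """True if the experienced event names match the template exactly."""
    return tuple(name for name, _ in events) == tuple(template)

def truncate_wav(src_path: str, dst_path: str, cut_frame: int) -> None:
    """Copy the first cut_frame frames of src_path into dst_path."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(min(cut_frame, src.getnframes()))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames)  # the frame count in the header is updated
```

With an event list that fits the template, the start position of the first event (the slow busy signal) would be passed as `cut_frame`.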
If the system fails to determine a fixable busy signal, the system flows to box 80, where the system determines the location of a fixable silence in the audio file. If the system locates a pre-determined sequence of silences, the system flows to box 90, where the audio file is truncated at the appropriate location. If the system does not locate a fixable silence, the system flows to box 100 to determine the presence of any busy signal anywhere in the file, or to determine the location of a final silence in the audio file. In the event the system cannot locate an anomaly, typically where the anomaly does not match a pre-determined set of criteria exactly, the system will return an error (box 110) notifying the user that the audio file cannot be fixed.
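The order in which boxes 60 through 110 are visited can be summarized as a small dispatch function. This is a hypothetical sketch: the name `choose_truncation_point`, the boolean inputs, and the `UnfixableAudioError` exception are assumptions, and the handling of box 100 (raising an error in either case, with a different message) is one possible reading of the flow.

```python
class UnfixableAudioError(Exception):
    """Raised when the audio file cannot be fixed (box 110)."""

def choose_truncation_point(fixable_busy: bool, fixable_silence: bool,
                            anomaly_seen: bool, busy_cut_point: int,
                            silence_cut_point: int) -> int:
    """Return the position at which the recording should be truncated."""
    if fixable_busy:                     # box 60 -> box 70
        return busy_cut_point
    if fixable_silence:                  # box 80 -> box 90
        return silence_cut_point
    # box 100 -> box 110: nothing matched a pre-determined pattern
    if anomaly_seen:
        raise UnfixableAudioError(
            "anomaly present but it does not match the criteria exactly")
    raise UnfixableAudioError("no anomaly located; file cannot be fixed")
```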
Turning now to
Referring to
Referring now to
While the invention has been described with reference to the exemplary embodiments thereof, those skilled in the art will be able to make various modifications to the described embodiments without departing from the true spirit and scope. The terms and descriptions used herein are set forth by way of illustration only and are not meant as limitations. In particular, although the method has been described by examples, the steps of the method may be performed in a different order than illustrated or simultaneously. Those skilled in the art will recognize that these and other variations are possible within the spirit and scope as defined in the following claims and their equivalents.
For the convenience of the reader, the above description has focused on a representative sample of all possible embodiments, a sample that teaches the principles of the invention and conveys the best mode contemplated for carrying it out. The description has not attempted to exhaustively enumerate all possible variations. Further undescribed alternative embodiments are possible. It will be appreciated that many of those undescribed embodiments are within the literal scope of the following claims, and others are equivalent.
This application claims the benefit of priority of U.S. Provisional Patent Application Ser. No. 60/506,762, entitled “METHOD, SYSTEM AND APPARATUS FOR REPAIRING AUDIO RECORDING,” filed Sep. 30, 2003, which is hereby incorporated by reference in its entirety. This application relates to co-pending U.S. patent application Ser. No. 10/447,290, entitled “SYSTEM AND METHODS UTILIZING NATURAL LANGUAGE PATIENT RECORDS,” filed on May 29, 2003; co-pending U.S. patent application Ser. No. 10/413,405, entitled “SYSTEMS AND METHODS FOR CODING INFORMATION,” filed on Apr. 15, 2003; co-pending U.S. patent application Ser. No. 11/068,493, entitled “A SYSTEM AND METHOD FOR NORMALIZATION OF A STRING OF WORDS,” filed on Feb. 28, 2005; co-pending U.S. patent application Ser. No. 10/448,320, entitled “METHOD, SYSTEM, AND APPARATUS FOR DATA REUSE,” filed on May 30, 2003; co-pending U.S. patent application Ser. No. 10/787,889, entitled “SYSTEM, METHOD AND APPARATUS FOR PREDICTION USING MINIMAL AFFIX PATTERNS,” filed on Feb. 27, 2004; co-pending U.S. patent application Ser. No. 10/448,317, entitled “METHOD, SYSTEM, AND APPARATUS FOR VALIDATION,” filed on May 30, 2003; co-pending U.S. patent application Ser. No. 10/448,325, entitled “METHOD, SYSTEM, AND APPARATUS FOR VIEWING DATA,” filed on May 30, 2003; co-pending U.S. patent application Ser. No. 10/953,448, entitled “SYSTEM AND METHOD FOR DOCUMENT SECTION SEGMENTATIONS,” filed on Sep. 30, 2004; co-pending U.S. patent application Ser. No. 10/953,471, entitled “SYSTEM AND METHOD FOR MODIFYING A LANGUAGE MODEL AND POST-PROCESSOR INFORMATION,” filed on Sep. 29, 2004; co-pending U.S. patent application Ser. No. 10/951,291, entitled “SYSTEM AND METHOD FOR CUSTOMIZING SPEECH RECOGNITION INPUT AND OUTPUT,” filed on Sep. 27, 2004; co-pending U.S. patent application Ser. No. 10/953,474, entitled “SYSTEM AND METHOD FOR POST PROCESSING SPEECH RECOGNITION OUTPUT,” filed on Sep. 29, 2004; co-pending U.S. patent application Ser. No. 11/069,203, entitled “SYSTEM AND METHOD FOR GENERATING A PHASE PRONUNCIATION,” filed on Feb. 28, 2005; co-pending U.S. patent application Ser. No. 11/007,626, entitled “SYSTEM AND METHOD FOR ACCENTED MODIFICATION OF A LANGUAGE MODEL,” filed on Dec. 7, 2004; co-pending U.S. patent application Ser. No. 10/948,625, entitled “METHOD, SYSTEM, AND APPARATUS FOR ASSEMBLY, TRANSPORT AND DISPLAY OF CLINICAL DATA,” filed on Sep. 23, 2004; and co-pending U.S. patent application Ser. No. 10/840,428, entitled “CATEGORIZATION OF INFORMATION USING NATURAL LANGUAGE PROCESSING AND PREDEFINED TEMPLATES,” filed on Sep. 23, 2004, all of which are hereby incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
20050207541 A1 | Sep 2005 | US |
Number | Date | Country | |
---|---|---|---|
60506762 | Sep 2003 | US |