Field
Embodiments included herein generally relate to the processing of Hidden Markov Models (HMMs). More particularly, embodiments relate to a processing engine configured to process different types of HMM structures.
Background
Hidden Markov Models (HMMs) are used in many applications such as, for example, speech recognition, text-to-speech applications, handwriting recognition, gesture recognition, bioinformatics, and cryptanalysis. These applications can implement different types of HMM structures such as, for example, Ergodic models, left-to-right models, and parallel path left-to-right models. Each of these HMM models includes one or more states, in which a score can be attributed to each HMM model based on its respective one or more states.
For example, a probability can be computed to assess whether a particular HMM model produces an observed sequence of states. Here, a score can be attributed to how well the particular HMM model matches the observed sequence of states. The score can be calculated by enumerating every possible state sequence in the HMM model. For HMM models with a high number of states and/or a complex structure, this can be a computationally-intensive process. This issue is further exacerbated by the limitation of HMM processing engines being only capable of handling one type of HMM structure (e.g., Ergodic HMM structure, left-to-right HMM structure, or parallel path left-to-right HMM structure), thus limiting the implementation of the HMM processing across different HMM applications.
Therefore, there is a need for an HMM processing engine configured to efficiently process different types of HMM structures.
An embodiment includes a method for processing an HMM data structure. The method includes receiving Hidden Markov Model (HMM) information from an external system. The method also includes processing back pointer data and first HMM state scores for one or more NULL states in the HMM information. Second HMM state scores are updated for one or more non-NULL states in the HMM information based on at least one predecessor state. Further, the method includes transferring the second HMM state scores to the external system.
Another embodiment includes an apparatus for processing an HMM data structure. The apparatus includes a state type fetch module, a processing module, and an output list module. The state type fetch module is configured to determine a presence of one or more NULL and non-NULL states in HMM information. The processing module is configured to process back pointer data and first HMM state scores for the one or more NULL states and to process second HMM state scores for one or more non-NULL states based on at least one predecessor state. The processing module is also configured to update the back pointer data based on the HMM information associated with the one or more non-NULL states. Further, the output list module is configured to transfer the updated back pointer data and second HMM state scores to an external system.
A further embodiment includes a tangible computer readable medium having stored therein one or more sequences of one or more instructions for execution by one or more processors to perform a method for processing a Hidden Markov Model (HMM) structure. The method includes receiving Hidden Markov Model (HMM) information from an external system. The method also includes processing back pointer data and first HMM state scores for one or more NULL states in the HMM information. Second HMM state scores are updated for one or more non-NULL states in the HMM information based on at least one predecessor state. Further, the method includes transferring the second HMM state scores to the external system.
Further features and advantages of the embodiments disclosed herein, as well as the structure and operation of various embodiments, are described in detail below with reference to the accompanying drawings. It is noted that the invention is not limited to the specific embodiments described herein. Such embodiments are presented herein for illustrative purposes only. Additional embodiments will be apparent to a person skilled in the relevant art based on the teachings contained herein.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the embodiments and to enable a person skilled in the relevant art to make and use the invention.
Embodiments will now be described with reference to the accompanying drawings. In the drawings, generally, like reference numbers indicate identical or functionally similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
Input device 110 is configured to receive an incoming signal such as, for example, an incoming voice signal, video signal, multi-media signal, non-visible light signals, handwritten text, or a signal of any kind representative of a gesture. Input device 110 can convert the incoming signal to a digitally-formatted signal and transfer the digitally-formatted signal to processing unit 120 and/or co-processing unit 140 for further processing. For example, if the incoming signal is a voice signal, input device 110 can convert acoustical vibrations associated with the incoming voice signal to an analog signal. Input device 110 can further digitize the analog signal using an analog-to-digital converter (not shown in
Processing unit 120 and co-processing unit 140 can be used to process the resulting digital signal from input device 110. For example, processing unit 120 and co-processing unit 140 can be used to decode speech.
Although system 100 is described in the context of a speech recognition environment, based on the description herein, a person skilled in the relevant art will recognize that the embodiments disclosed herein can be applied to other types of environments such as, for example and without limitation, text-to-speech applications, handwriting recognition, gesture recognition, bioinformatics, and cryptanalysis. These other types of environments are within the spirit and scope of the embodiments disclosed herein.
Element 320 provides the HMM information. Element 320 includes:
Element 330 provides the <STATE> field. In an embodiment, the <STATE> field occurs once per HMM state. Each <STATE> field includes:
Element 340 provides the <OBS_PROB> field. In an embodiment, the <OBS_PROB> field occurs once per HMM state and indicates the observation probability for the HMM state. In an embodiment, if HMM states and observation probabilities (OBS_PROB) do not have a one-to-one mapping (e.g., senones shared among multiple HMM states), OBS_PROB_ID can be used to identify the observation probability associated with a particular HMM state. As would be understood by a person skilled in the relevant art, multiple HMM states in an HMM structure can share the same senone (i.e., the same value for <OBS_PROB>, or senone score). In an embodiment, the value of the observation probability for the HMM state can either be stored in the OBS_PROB field or referred to in the OBS_PROB_ID field (e.g., as a pointer).
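The two ways of obtaining an observation probability described above (stored directly, or referenced through OBS_PROB_ID) can be sketched as follows. This is a minimal illustration only: the `StateRecord` class, the `senone_scores` table, and the example values are hypothetical and are not part of the described data structure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StateRecord:
    """Hypothetical in-memory view of one <STATE> entry (names are illustrative)."""
    obs_prob: Optional[float] = None   # value stored directly in OBS_PROB
    obs_prob_id: Optional[int] = None  # reference into a shared senone-score table


def observation_probability(state: StateRecord,
                            senone_scores: Sequence[float]) -> float:
    """Return the state's observation probability, inline or looked up by ID."""
    if state.obs_prob is not None:
        return state.obs_prob
    if state.obs_prob_id is None:
        raise ValueError("state has neither OBS_PROB nor OBS_PROB_ID")
    return senone_scores[state.obs_prob_id]


# Example values are made up for illustration.
senone_scores = [-1.5, -0.25]
s0 = StateRecord(obs_prob_id=1)
s1 = StateRecord(obs_prob_id=1)  # shares the same senone as s0
s2 = StateRecord(obs_prob=-3.0)  # value stored inline
```

Here `s0` and `s1` resolve to the same shared score, which is the case in which a one-to-one mapping between states and observation probabilities does not hold.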
Element 350 provides the <PATH> field. In an embodiment, the <PATH> field occurs once per HMM state and includes:
Element 360 provides the <PATH_INFO> field. In an embodiment, the <PATH_INFO> field occurs once per predecessor HMM state and includes:
Element 370 provides the <STATE_TYPE_FLAGS> field. Each bit in this field corresponds to a respective “type” of an HMM state, according to an embodiment. For example, each bit in the <STATE_TYPE_FLAGS> field can either have a value of ‘0’ or ‘1’. A value of ‘0’ indicates that the HMM state is “NULL” or “non-emitting.” Conversely, a value of ‘1’ indicates that the HMM state is “non-NULL” or “emitting.”
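As an illustration of the flag encoding, the sketch below decodes a bit field into per-state types. The bit ordering (bit i, counting from the least significant bit, describes state i) and the function name `is_emitting` are assumptions made for this example; the description above does not specify them.

```python
def is_emitting(state_type_flags: int, state_index: int) -> bool:
    """Return True if the bit for state_index is 1 (non-NULL / emitting)."""
    return bool((state_type_flags >> state_index) & 1)


# Assumed ordering: bit 0 is state 0, bit 1 is state 1, and so on.
flags = 0b110  # state 0 is NULL; states 1 and 2 are non-NULL
state_types = ["non-NULL" if is_emitting(flags, i) else "NULL"
               for i in range(3)]
```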
HMM data structure 300 can be applied to any HMM structure including, but not limited to, Ergodic, left-to-right, and parallel path left-to-right HMM structures.
The HMM structures of
In referring to Table 1, in an embodiment, the state pointer fields (<STATE>_PTR) can have the following values:
In Table 2, for st_ptr1, OBS_PROB_ID is a pointer to the value of the observation probability for state S0 (i.e., ob_prob1). Similarly, for st_ptr2 and st_ptr3, their respective OBS_PROB_IDs are pointers to the values of the observation probabilities for states S1 and S2, respectively.
In an embodiment and in referring to Table 1, the path pointer fields (<PATH>_PTR) can have the following values:
In Table 3, for path_ptr1, the first predecessor path refers to path “a00” and the second path refers to path “ain0” in
In referring to Table 4, pointers have a value that indicates the location of the information within the HMM data structure. For example, HMM data structure 600 of
For pointers that refer to other pointers in HMM data structure 600, the HMM information can be located in a similar manner as described above. For example, in referring to byte index 24—PRED_STATE_PTR for path_ptr1—in Table 4, this field has a value of 8. Here, the value of 8 represents the byte index in HMM data structure 600 for HMM state information associated with st_ptr1 (state S0). The state information for path_ptr2 and path_ptr3 can be located in HMM data structure 600 in a similar manner. Also, the observation probability information for OBS_PROB_ID for st_ptr1, st_ptr2, and st_ptr3 can be located in HMM data structure 600 in a similar manner—e.g., OBS_PROB_ID for st_ptr1 has a value of 17, which refers to the byte index of ob_prob1.
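The pointer-following described above can be sketched with a flat byte buffer. Only three facts are taken from the text: the field at byte index 24 holds the value 8, the state information for st_ptr1 starts at byte index 8, and OBS_PROB_ID for st_ptr1 holds the value 17. Everything else (one-byte fields, the position of OBS_PROB_ID within the state record, the buffer length, and the stored observation value) is a hypothetical choice made so the example runs.

```python
# Hypothetical layout: every field occupies a single byte.
OBS_PROB_ID_OFFSET = 1  # assumed position of OBS_PROB_ID inside a state record

buf = bytearray(32)
buf[24] = 8                       # PRED_STATE_PTR for path_ptr1 -> st_ptr1
buf[8 + OBS_PROB_ID_OFFSET] = 17  # OBS_PROB_ID for st_ptr1 -> ob_prob1
buf[17] = 42                      # made-up encoded value of ob_prob1


def deref(data: bytearray, byte_index: int) -> int:
    """Read the byte index (or value) stored at byte_index."""
    return data[byte_index]


state_index = deref(buf, 24)                                  # byte index of st_ptr1
obs_index = deref(buf, state_index + OBS_PROB_ID_OFFSET)      # byte index of ob_prob1
ob_prob1 = deref(buf, obs_index)                              # stored observation value
```

Because each pointer is simply a byte index into the same buffer, following a chain of pointers is a sequence of indexed reads, with no separate address translation.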
Although the above description of HMM data structure 600 is in the context of HMM structure 500 of
In referring to
In referring to
Upon receipt of the HMM information, interface 722 transfers the HMM information to HMM processing engine 724 and memory device 726 via a data bus 723 and a data bus 727, respectively. In an embodiment, interface 722 transfers the HMM information to HMM processing engine 724 and memory device 726 in a format conforming to HMM data structure 300 of
In an embodiment, HMM processing engine 724 updates HMM state scores and writes the updated state scores to memory device 726, via a data bus 725, based on the format defined by HMM data structure 300 of
Further, in an embodiment, HMM processing engine 724 has a priori knowledge of the HMM data structure that it receives from interface 722 and accesses state and path information based on their respective indices in the HMM data structure. For example, in referring to Table 4 above, HMM processing engine 724 has knowledge of the various fields of HMM data structure 600 in
In referring to
In an embodiment, state pointers associated with a NULL state (<STATE_TYPE_FLAGS> bit = 0) are processed first, followed by state pointers associated with a non-NULL state (<STATE_TYPE_FLAGS> bit = 1). If no state pointers are associated with a NULL state, only the state pointers associated with non-NULL states are processed, as described below.
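The processing order can be sketched as a stable partition of the state pointers by their type bit. The function name `processing_order`, the string labels, and the bit ordering (bit i describes the i-th state pointer) are assumptions for illustration.

```python
from typing import List, Sequence


def processing_order(state_ptrs: Sequence[str], state_type_flags: int) -> List[str]:
    """List NULL-state pointers first, then non-NULL ones, keeping relative order."""
    null_ptrs = [p for i, p in enumerate(state_ptrs)
                 if not (state_type_flags >> i) & 1]
    non_null_ptrs = [p for i, p in enumerate(state_ptrs)
                     if (state_type_flags >> i) & 1]
    return null_ptrs + non_null_ptrs


ptrs = ["st_ptr1", "st_ptr2", "st_ptr3"]
mixed = processing_order(ptrs, 0b101)     # st_ptr2 is the only NULL state
all_non_null = processing_order(ptrs, 0b111)  # no NULL states present
```

When no NULL states are present, the NULL list is empty and the result is just the non-NULL pointers in their original order.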
For state pointers associated with a NULL state, state type fetch module 820 passes these state pointers to SPPOP fetch module 830. In addition to the state pointers associated with a NULL state, SPPOP fetch module 830 receives probability and path information from the HMM data structure (e.g., <OBS_PROB> and <PATH>_PTR) stored in input list module 810. State score updater module 840 updates the state scores of the HMM data structure based on the state, probability, and path information received from SPPOP fetch module 830. In an embodiment, the updated state scores are transferred from state score updater module 840 to input list module 810. The updated state scores are also transferred from state score updater module 840 to output list module 860 via state write module 850, according to an embodiment. In addition, updated back pointer data is transferred to input list module 810 and output list module 860, according to an embodiment. Any remaining state pointers associated with a NULL state are processed in a similar manner.
In step 920, for each predecessor state associated with the HMM state, a temporary value (TEMP) is set equal to the sum of the HMM state's current score (CURR_SCR), a transition probability associated with the predecessor state (TRANS_PROB), and a score of the predecessor state (PRED_CURR_SCR):
TEMP=CURR_SCR+TRANS_PROB+PRED_CURR_SCR.
In referring to HMM data structure 600 of
In step 930, if the temporary value is greater than the maximum value (TEMP>MAX), then in step 940, the maximum value is set to the temporary value: MAX=TEMP. In addition, the back pointer (BKPTR) for the HMM state is set to the predecessor state: BKPTR=PRED_STATE_PTR. Otherwise, method 900 continues to step 950.
In step 950, for any remaining predecessor states associated with a NULL state, steps 920-940 are repeated. For example, in referring to HMM data structure 600 of
In step 960, after looping through the predecessor states to find the maximum value (MAX), the HMM state's current score (CURR_SCR) is set equal to the maximum value (MAX): CURR_SCR=MAX. State score updater module 840 of
In step 970, after the back pointer and state score information have been updated for each predecessor state, state score updater module 840 transfers the back pointer and state score information to input list module 810 and to output list module 860 (via state write module 850). In an embodiment, state score updater module 840 writes the updated back pointer and state score information to input list module 810 and output list module 860 (via state write module 850) in a format consistent with HMM data structure 300 of
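The NULL-state update of steps 920 through 960 can be sketched as below. This is an illustrative reading, not the described hardware. The `State` class and list-index pointers are hypothetical, the initial value of MAX is assumed to be negative infinity, scores are assumed to be log-domain values (they are added rather than multiplied), and a state with no predecessors is assumed to be left unchanged.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class State:
    curr_scr: float
    # Each predecessor path: (PRED_STATE_PTR as a list index, TRANS_PROB)
    preds: List[Tuple[int, float]] = field(default_factory=list)
    bkptr: Optional[int] = None


def update_null_state(states: List[State], idx: int) -> None:
    """Update CURR_SCR and BKPTR of the NULL state at position idx in place."""
    state = states[idx]
    if not state.preds:       # assumption: nothing to update without predecessors
        return
    max_val = -math.inf       # assumption: MAX starts below any reachable score
    bkptr = state.bkptr
    for pred_ptr, trans_prob in state.preds:
        # Step 920: TEMP = CURR_SCR + TRANS_PROB + PRED_CURR_SCR
        temp = state.curr_scr + trans_prob + states[pred_ptr].curr_scr
        if temp > max_val:    # step 930
            max_val = temp    # step 940
            bkptr = pred_ptr
    state.curr_scr = max_val  # step 960: CURR_SCR = MAX
    state.bkptr = bkptr


# Made-up example: state 2 is a NULL state with predecessors 0 and 1.
states = [State(-1.0), State(-2.0), State(0.0, preds=[(0, -0.5), (1, -0.25)])]
update_null_state(states, 2)
```

In this example the candidate values are -1.5 (via state 0) and -2.25 (via state 1), so state 2 ends with a score of -1.5 and a back pointer to state 0.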
For state pointers associated with a non-NULL state, state type fetch module 820 passes these state pointers to SPPOP fetch module 830. In addition to the state pointers associated with a non-NULL state, SPPOP fetch module 830 receives probability and path information from the HMM data structure (e.g., <OBS_PROB> and <PATH>_PTR) stored in input list module 810. State score updater module 840 updates the state scores of the HMM data structure based on the state, probability, and path information received from SPPOP fetch module 830.
In an embodiment, the updated state scores are transferred from state score updater module 840 to state write module 850, which then transfers the updated state scores to output list module 860. In addition to state scores, back pointers (BKPTR) for each state can also be updated based on a predecessor state with higher probability. Any remaining state pointers associated with a non-NULL state are processed in a similar manner.
In step 1020, for each predecessor state associated with the HMM state, a temporary value (TEMP) is set equal to the sum of the HMM state's current score (CURR_SCR), a transition probability associated with the predecessor state (TRANS_PROB), and a score of the predecessor state (PRED_CURR_SCR):
TEMP=CURR_SCR+TRANS_PROB+PRED_CURR_SCR.
In referring to HMM data structure 600 of
In step 1030, if the temporary value is greater than the maximum value (TEMP>MAX), then in step 1040, the maximum value is set to the temporary value: MAX=TEMP. In addition, the back pointer (BKPTR) for the HMM state is set to the predecessor state: BKPTR=PRED_STATE_PTR. Otherwise, method 1000 continues to step 1050.
In step 1050, for any remaining predecessor states associated with a non-NULL state, steps 1020-1040 are repeated. For example, in referring to HMM data structure 600 of
In step 1060, after looping through the predecessor states to find the maximum value (MAX), the HMM state's current score (CURR_SCR) is set equal to the sum of the maximum value and an observation probability associated with the HMM state: CURR_SCR=MAX+OBS_PROB. State score updater module 840 of
In step 1070, after the back pointer and state score information have been updated, state score updater module 840 transfers the back pointer and state score information to output list module 860 (via state write module 850). In an embodiment, state score updater module 840 writes the updated back pointer and state score information to output list module 860 (via state write module 850) in a format consistent with HMM data structure 300 of
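The non-NULL update of steps 1020 through 1060 differs from the NULL case only in the final step, where the observation probability is added. A minimal functional sketch follows. The function name, the tuple layout of `preds`, and the negative-infinity starting value for MAX are assumptions, and scores are again assumed to be log-domain values.

```python
import math
from typing import Optional, Sequence, Tuple


def update_non_null_state(
    curr_scr: float,
    obs_prob: float,
    preds: Sequence[Tuple[str, float, float]],
) -> Tuple[float, Optional[str]]:
    """Return (new CURR_SCR, BKPTR) for one non-NULL state.

    Each entry of preds is (PRED_STATE_PTR, TRANS_PROB, PRED_CURR_SCR).
    """
    max_val = -math.inf  # assumption: MAX starts below any reachable score
    bkptr = None
    for pred_state_ptr, trans_prob, pred_curr_scr in preds:
        temp = curr_scr + trans_prob + pred_curr_scr  # step 1020
        if temp > max_val:                            # step 1030
            max_val = temp                            # step 1040
            bkptr = pred_state_ptr
    return max_val + obs_prob, bkptr                  # step 1060


# Made-up example with two predecessor paths.
new_scr, new_bkptr = update_non_null_state(
    curr_scr=0.0,
    obs_prob=-0.5,
    preds=[("st_ptr1", -1.0, -2.0), ("st_ptr2", -0.25, -1.5)],
)
```

The candidates are -3.0 and -1.75, so MAX is -1.75, the back pointer refers to st_ptr2, and the new score is -1.75 + (-0.5) = -2.25.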
In addition to updating HMM back pointer and state score information, HMM processing engine 724 of
In step 1110, HMM information is received by, for example, HMM module 720 from external system 710 of
In an embodiment, the external system can include a memory device (e.g., memory device 130 of
In step 1120, back pointer and state score data for one or more NULL states in the HMM information are updated. For each predecessor state associated with the one or more NULL states, the back pointer and state score data are updated based on a comparison of a summation of a state score associated with the one or more NULL states, a transition probability associated with the one or more NULL states, and a score of a predecessor state to a maximum value. An example of this comparison is discussed above with respect to step 930 of
In step 1130, the back pointer data is updated based on the HMM information associated with the one or more non-NULL states. For each predecessor state associated with the one or more non-NULL states, the HMM state score is updated based on a comparison of a summation of a state score associated with the one or more non-NULL states, a transition probability associated with the one or more non-NULL states, and a score of a predecessor state to a maximum value. An example of this comparison is discussed above with respect to step 1030 of
In step 1140, an HMM state score for one or more non-NULL states in the HMM information is updated based on at least one predecessor state. Similar to step 1060 of
In step 1150, the updated back pointer data and HMM state score are transferred to the external system. This transfer of updated back pointer data and HMM state score is similar to step 1070 of
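The overall flow of steps 1110 through 1150 can be sketched as a single pass over received HMM information. This is one possible reading rather than a definitive implementation. The dictionary keys, the `in_scr` and `out_scr` lists (standing in for the input and output lists), and the example values are hypothetical. The sketch follows the earlier description in which updated NULL-state scores are written back to the input list, while updated non-NULL scores are written only to the output list.

```python
import math
from typing import Dict, List, Optional, Tuple


def process_frame(states: List[Dict]) -> Tuple[List[float], List[Optional[int]]]:
    """Return (updated scores, back pointers) for one pass over the states."""
    in_scr = [s["score"] for s in states]  # scores read during processing
    out_scr = list(in_scr)                 # scores returned to the caller
    bkptr: List[Optional[int]] = [None] * len(states)

    def best_pred(i: int) -> Tuple[float, Optional[int]]:
        best, ptr = -math.inf, None        # assumed initial MAX
        for pred, trans in states[i]["preds"]:
            temp = in_scr[i] + trans + in_scr[pred]
            if temp > best:
                best, ptr = temp, pred
        return best, ptr

    # NULL states first; their new scores are visible to later updates.
    for i, s in enumerate(states):
        if not s["emitting"] and s["preds"]:
            best, bkptr[i] = best_pred(i)
            in_scr[i] = out_scr[i] = best
    # Non-NULL states next; the observation probability is added at the end.
    for i, s in enumerate(states):
        if s["emitting"] and s["preds"]:
            best, bkptr[i] = best_pred(i)
            out_scr[i] = best + s["obs"]
    return out_scr, bkptr


# Made-up three-state example; 'preds' holds (predecessor index, TRANS_PROB).
hmm = [
    {"emitting": False, "score": 0.0, "obs": 0.0, "preds": [(2, -0.5)]},
    {"emitting": True, "score": 0.0, "obs": -1.0, "preds": [(0, -0.25), (1, -4.0)]},
    {"emitting": True, "score": -2.0, "obs": -0.5, "preds": [(1, -0.25), (2, -1.0)]},
]
scores, back_pointers = process_frame(hmm)
```

In this example the NULL state 0 is updated to -2.5 first, and state 1 then selects it as its best predecessor. State 2 still reads the pre-update score of state 1, because non-NULL results go only to the output list.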
A benefit, among others, of the embodiments disclosed herein is the flexibility of the HMM processing engine (e.g., HMM processing engine 724 of
Various aspects of embodiments of the present invention may be implemented in software, firmware, hardware, or a combination thereof.
It should be noted that the simulation, synthesis and/or manufacture of various embodiments of this invention may be accomplished, in part, through the use of computer readable code, including general programming languages (such as C or C++), hardware description languages (HDL) such as, for example, Verilog HDL, VHDL, Altera HDL (AHDL), or other available programming and/or schematic capture tools (such as circuit capture tools). This computer readable code can be disposed in any known computer-usable medium including a semiconductor, magnetic disk, optical disk (such as CD-ROM, DVD-ROM). As such, the code can be transmitted over communication networks including the Internet. It is understood that the functions accomplished and/or structure provided by the systems and techniques described above can be represented in a core that is embodied in program code and can be transformed to hardware as part of the production of integrated circuits.
Computer system 1200 includes one or more processors, such as processor 1204. Processor 1204 may be a special purpose or a general-purpose processor such as, for example, processing unit 120 and co-processing unit 140 of
Computer system 1200 also includes a main memory 1208, preferably random access memory (RAM), and may also include a secondary memory 1210. Secondary memory 1210 can include, for example, a hard disk drive 1212, a removable storage drive 1214, and/or a memory stick. Removable storage drive 1214 can include a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, or the like. The removable storage drive 1214 reads from and/or writes to a removable storage unit 1218 in a well-known manner. Removable storage unit 1218 can comprise a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 1214. As will be appreciated by a person skilled in the relevant art, removable storage unit 1218 includes a computer-usable storage medium having stored therein computer software and/or data.
Computer system 1200 (optionally) includes a display interface 1202 (which can include input and output devices such as keyboards, mice, etc.) that forwards graphics, text, and other data from communication infrastructure 1206 (or from a frame buffer not shown) for display on display unit 1230.
In alternative implementations, secondary memory 1210 can include other similar devices for allowing computer programs or other instructions to be loaded into computer system 1200. Such devices can include, for example, a removable storage unit 1222 and an interface 1220. Examples of such devices can include a program cartridge and cartridge interface (such as those found in video game devices), a removable memory chip (e.g., EPROM or PROM) and associated socket, and other removable storage units 1222 and interfaces 1220 which allow software and data to be transferred from the removable storage unit 1222 to computer system 1200.
Computer system 1200 can also include a communications interface 1224. Communications interface 1224 allows software and data to be transferred between computer system 1200 and external devices. Communications interface 1224 can include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like. Software and data transferred via communications interface 1224 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 1224. These signals are provided to communications interface 1224 via a communications path 1226. Communications path 1226 carries signals and can be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, a RF link or other communications channels.
In this document, the terms “computer program medium” and “computer-usable medium” are used to generally refer to tangible media such as removable storage unit 1218, removable storage unit 1222, and a hard disk installed in hard disk drive 1212. Computer program medium and computer-usable medium can also refer to tangible memories, such as main memory 1208 and secondary memory 1210, which can be memory semiconductors (e.g., DRAMs, etc.). These computer program products provide software to computer system 1200.
Computer programs (also called computer control logic) are stored in main memory 1208 and/or secondary memory 1210. Computer programs may also be received via communications interface 1224. Such computer programs, when executed, enable computer system 1200 to implement embodiments of the present invention as discussed herein. In particular, the computer programs, when executed, enable processor 1204 to implement processes of embodiments of the present invention, such as the steps in the methods illustrated by flowchart 900 of
Embodiments are also directed to computer program products including software stored on any computer-usable medium. Such software, when executed in one or more data processing devices, causes the data processing device(s) to operate as described herein. Embodiments of the present invention employ any computer-usable or -readable medium, known now or in the future. Examples of computer-usable mediums include, but are not limited to, primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard drives, floppy disks, CD ROMS, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage devices, etc.), and communication mediums (e.g., wired and wireless communications networks, local area networks, wide area networks, intranets, etc.).
It is to be appreciated that the Detailed Description section, and not the Summary and Abstract sections, is intended to be used to interpret the claims. The Summary and Abstract sections may set forth one or more but not all example embodiments of the present invention as contemplated by the inventors, and thus, are not intended to limit the present invention and the appended claims in any way.
Embodiments of the present invention have been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the relevant art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by a person skilled in the relevant art in light of the teachings and guidance.
The breadth and scope of the present invention should not be limited by any of the above-described example embodiments, but should be defined only in accordance with the following claims and their equivalents.
Number | Date | Country | |
---|---|---|---|
20150106405 A1 | Apr 2015 | US |