The present invention relates to translation of text from a source language to a target language and more particularly to machine translation of such text.
The advent of the information revolution and the Internet has resulted in a need for the availability of documents in different languages. This multilingualism has in turn triggered a need for machine translation systems that are easily adaptable, quicker to train, fast, reasonably accurate, and cost effective. Such systems substantially extend the reach of knowledge and information. Statistical machine translation systems, which are based on the principles of information theory and statistics, have benefited from the availability of increased electronic data storage capacity and processing power. Such translation systems can be trained for a particular language pair, thus reducing deployment time and cost, and enabling easier maintenance and optimization for specific domain or language usage.
Consider, for example, translation of text in a source language (say French sentence f) into a target language (say English sentence e). Every target language sentence may be viewed as a possible translation of a source language sentence. For each such possible target sentence e of the source sentence f, there exists a score or probability that the target sentence e is a faithful translation of source sentence f (P(e|f)). More specifically, the string e that maximizes this score is the best translation:
Best e = argmaxe P(e|f)
By Bayes' theorem, P(e|f) = P(f|e)·P(e)/P(f). Since P(f) is fixed for a given source sentence f, maximizing P(e|f) is equivalent to maximizing the product P(f|e)·P(e):
Best e = argmaxe P(f|e)·P(e)
A machine translation system thus has three main components: a translation model that assigns a probability or score P(f|e) to the event in which a target string e is translated to a source string f, a language model that assigns a probability or score P(e) to a target string e, and a decoder. The decoder takes a previously unseen sentence f and attempts to determine the sentence e that maximizes P(e|f) or, equivalently, maximizes P(f|e)·P(e).
Decoding is a discrete optimization problem whose goal is to determine a target sentence or portion of text that optimally corresponds to a source sentence or portion of text. The decoding problem is known to belong to a class of problems popularly known as NP-hard problems. NP-hard problems are computationally difficult and solutions thereof elude polynomial time algorithms.
In the decoding problem, it is required to find the most probable translation of a given portion of text in a source language. The language and translation models are also given. Thus, decoding represents a combinatorial search problem whose search space is prohibitively large. The challenge is in devising a scheme for efficiently searching the solution space for a solution.
Conventional decoders are primarily concerned with providing a solution under real world constraints such as limited memory, processing power and time. Consequently, speed and/or accuracy of decoding is/are compromised. Since the space of possible translated sentences or text portions is extremely large, conventional decoders typically examine only a portion of that space and thus risk missing good solutions.
Decoding time is generally a function of sentence or text length, and conventional decoders are frequently unable to translate relatively long sentences in a satisfactory amount of time. Whilst speed of decoding is of particular importance to real-time translation applications such as web page translation, bulk document translation, real time speech to speech translation systems, etc., accuracy of decoding is of prime importance in applications such as the translation of government documents and technical manuals.
U.S. Pat. No. 5,477,451, entitled “Method and System for Natural Language Translation”, issued to Brown, P. F., et al. on Dec. 19, 1995 and assigned to International Business Machines Corporation, relates to statistical translation methods and systems and more particularly to translation and language models for use by a decoder. The subject matter disclosed in U.S. Pat. No. 5,477,451 is incorporated herein by reference.
Wang, Y.-Y., and Waibel, A., in a paper entitled “Decoding Algorithm in Statistical Machine Translation”, published in the Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL), Madrid, Spain, July 1997, describe a stack decoding algorithm for statistical translation.
Tillmann, C., Vogel, S., Ney, H., and Zubiaga, A., in a paper entitled “A DP-based Search Using Monotone Alignments in Statistical Translation”, published in the Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL), Madrid, Spain, July 1997, describe a search algorithm for statistical translation based on dynamic programming.
Germann, U., et al., in a paper entitled “Fast Decoding and Optimal Decoding for Machine Translation”, published in the Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL), Toulouse, France, 2001, compare the speed and output quality of a stack decoder with a fast greedy decoder and a slow but optimal decoder that treats decoding as an integer-programming optimization problem.
The stack and integer programming decoders are slow and are thus not particularly useful for applications that require fast translation. The greedy decoder, on the other hand, is fast but compromises on accuracy. Dynamic programming, while fast, suffers from a monotonicity constraint.
A need thus exists for a translation means or decoder that performs well in terms of both speed and accuracy. A need also exists for a decoder that can translate relatively long sentences in real time with a satisfactory degree of accuracy.
Aspects of the present invention provide a method, an apparatus and a computer program product for decoding source text in a first language into target text in a second language. The source text is decoded into an intermediate text portion based on a fixed alignment between words in the source text and words in the intermediate text portion, and an improved alignment between the words in the source text and the words in the intermediate text portion is then determined. The decoding and alignment determination steps are alternately repeated while a decoding improvement in the intermediate text portion can be obtained. Finally, the intermediate text portion is output as the target text. The alternate repetition of the decoding and alignment determination steps may itself be repeated for each of a plurality of lengths of the intermediate text portion.
Decoding may initially be performed based on an initial alignment that maps words in the source text to word positions in the intermediate text portion.
The decoded text may comprise an optimal translation for a fixed alignment, which may be generated based on dynamic programming.
The alignment may comprise an optimal alignment but may alternatively comprise an improved alignment relative to a previous alignment.
Aspects of the present invention also provide a method, an apparatus and a computer program product for translating source text in a first language to translated text in a second language. An alignment between words in the source text and positions of words in the translated text is determined, and an optimal translation of the source text is generated based on the alignment. The alignment determination and translation are performed repeatedly for each of a plurality of lengths of the translated text.
A small number of embodiments are described hereinafter, by way of example only, with reference to the accompanying drawings.
Embodiments of methods, apparatuses and computer program products are described herein for statistical translation decoding of text from a source language into a target language. The embodiments described relate to translation of French into English. However, it is not intended that the present invention be limited in this manner as the principles of the present invention have general applicability to translation between other source and target languages. Embodiments of the invention may also perform translation of portions of text other than sentences such as paragraphs, pages and n-grams.
Conceptually, the translation model comprises a table of probability scores P(f|e) that are indicative of the degree of association of every possible pair of English and French sentences <e, f> and the language model comprises a table of probability scores P(e) for every possible English sentence e. Construction of the tables is difficult, at least on account of the number of conceivable sentences in any language being substantially large. Approximations are used in generating the probability tables and the search problem is thus a decoding problem to determine an optimal English sentence e given a novel French sentence f. Determination of the optimal English sentence is computationally hard and requires efficient and accurate search techniques.
Suppose that a French sentence f has |f| words denoted by f1, f2, . . . , fj, . . . , f|f| and a corresponding English sentence e has |e| words denoted by e1, e2, . . . , ei, . . . , e|e|. Although a word-by-word translation is insufficient for complete and accurate translation of the sentence f to the sentence e, a relationship nonetheless exists between the individual words of the two sentences. Such a relationship is known as an alignment. The alignment between the individual words of sentences f and e is denoted by a, which is a tuple of order |f|. The individual elements of the tuple α1, α2, . . . , αj, . . . , α|f| are integers in the range of 1 to |e|, each of which denotes the position of the English word to which the corresponding French word is aligned. Each French word is aligned to exactly one English word. Numerous alignments are possible and, given the above model, the fundamental probability is the joint probability distribution P(e,a|f), where the alignment a is hidden. Such a model comprises individual word-to-word translation probabilities, alignment probabilities and language model probabilities. When two or more French words align to a single English word, the number of French words generated by the single English word is known as the fertility of that word. Each English word has an associated fertility probability, which provides an indication of how many French words that particular English word may correspond to.
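By way of a hedged illustration only, the alignment tuple and the fertility of a target position can be represented as in the following minimal C++ sketch. The type and function names are ours and merely illustrative; position 0 conventionally denotes the empty (null) word in the IBM models.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// An alignment for a source sentence of m words is a tuple of m integers,
// where a[j] in [0..l] gives the target position to which the (j+1)-th
// source word is aligned (0 denotes the empty word).
using Alignment = std::vector<std::size_t>;

// Fertility of target position i: the number of source words that the
// alignment maps to position i.
std::size_t fertility(const Alignment& a, std::size_t i) {
    std::size_t count = 0;
    for (std::size_t pos : a)
        if (pos == i) ++count;
    return count;
}

int main() {
    // Source sentence of 4 words, target positions 1..3. Source words 1
    // and 2 both align to target position 1, so e1 has fertility 2.
    Alignment a = {1, 1, 2, 3};
    std::cout << "fertility of e1: " << fertility(a, 1) << '\n';  // prints 2
}
```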
The decoding problem may be defined as one of finding the most probable translation ê in English (target language) of a given French (source language) sentence f in accordance with the fundamental equation of Statistical Machine Translation:
ê=argmaxe Pr(f|e)Pr(e) (1)
Rewriting the translation model Pr(f|e) as Σa Pr(f,a|e), where a denotes an alignment between the source sentence and the target sentence, the decoding problem can be restated as:
ê=argmaxe Σa Pr(f,a|e)Pr(e) (2)
Even when the translation model is as simple as the IBM Model 1 and the language model Pr(e) is a bigram language model, the decoding problem is NP-hard. IBM models 1 to 5 relate to statistical translation models, as described in U.S. Pat. No. 5,477,451, the subject matter of which is incorporated herein by reference. Practical solutions to equation 2 focus on finding sub-optimal solutions. However, a relatively simpler equation may be obtained by relaxing equation 2:
(ê, â)=argmax(e,a)Pr(f,a|e)Pr(e) (3)
Solving equation 3 is a joint optimization problem in that a pair (ê, â) is searched for.
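The relationship between equations 2 and 3 is worth making explicit. Since Pr(f|e) is a sum of non-negative terms over alignments, the largest single term is a lower bound on the sum, so equation 3 maximizes a lower bound on the objective of equation 2:

```latex
\Pr(f \mid e) = \sum_{a} \Pr(f, a \mid e) \;\ge\; \max_{a} \Pr(f, a \mid e)
\quad\Longrightarrow\quad
\max_{e} \Pr(f \mid e)\Pr(e) \;\ge\; \max_{(e,a)} \Pr(f, a \mid e)\Pr(e)
```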
Two basic observations are particularly relevant for devising a solution for equation 3. The first observation is that given a target length l and an alignment ã that maps source words to target positions, it is simple to compute the optimal target sentence ê. For reference purposes, this procedure is known as FIXED_ALIGNMENT_DECODING. The optimal solution for FIXED_ALIGNMENT_DECODING can be computed in O(m) time, where m is the length of the source sentence, for IBM models 1 to 5 using dynamic programming.
The second observation is that for a given target sentence ẽ, it is simple to compute an improved or optimal alignment â that maps the source words to the target words:
â = argmaxa Pr(f,a|ẽ)   (4)
The optimal alignment between the source and target sentences can be determined using the Viterbi algorithm, which is well known and comprehensively described in the literature. For IBM models 1 and 2, the Viterbi alignment can be computed using a straightforward algorithm in O(ml) time. For higher models, an approximate Viterbi alignment can be computed by an iterative local search procedure, which searches in the neighbourhood of the current best alignment for a better alignment. The first iteration can begin with any arbitrary alignment (e.g., the Viterbi alignment of IBM Model 2). It is possible to implement one iteration of local search in O(ml) time. Typically, the number of iterations is bounded in practice by O(m) and the local search therefore takes O(m2l) time. However, the methods, apparatuses and computer program products described herein do not specifically require computation of an optimal alignment. Any alignment that improves the current alignment can be used. It is straightforward to identify such an alignment using restricted swaps and moves in O(m) time. For reference purposes, the term ‘Viterbi’ is used to denote any linear time algorithm for computing an improved alignment between a source sentence and an associated translation.
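The following C++ sketch illustrates one such linear pass of restricted improvement, here using adjacent swaps only. The names are ours, and the score callback is assumed to evaluate Pr(f, a|e) for the fixed sentence pair; a practical implementation would update the score incrementally in O(1) per swap rather than re-evaluating it.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Alignment = std::vector<std::size_t>;
// score(a) evaluates Pr(f, a | e) for the fixed sentences f and e.
using ScoreFn = std::function<double(const Alignment&)>;

// One improvement pass: try swapping the alignment targets of adjacent
// source words and keep any swap that raises the score. Returns true if
// the alignment was improved. With an O(1) incremental score update the
// pass is O(m); the full re-evaluation shown here is for clarity only.
bool improveOnce(Alignment& a, const ScoreFn& score) {
    bool improved = false;
    double best = score(a);
    for (std::size_t j = 0; j + 1 < a.size(); ++j) {
        std::swap(a[j], a[j + 1]);
        double s = score(a);
        if (s > best) {                 // keep the swap
            best = s;
            improved = true;
        } else {                        // undo the swap
            std::swap(a[j], a[j + 1]);
        }
    }
    return improved;
}
```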
At step 320, the French sentence f provided in step 310 is decoded into an English sentence using an AlignmentAlternatingSearch decoding method that returns a translated English sentence a_E and an associated score a_score. The AlignmentAlternatingSearch decoding method iteratively improves an initial estimate of the alignment between the source and target sentences.
At step 330, the French sentence f provided in step 310 is decoded into an English sentence using a TargetAlternatingSearch decoding method that returns a translated English sentence t_E and an associated score t_score. The TargetAlternatingSearch decoding method iteratively improves an initial estimate of the target sentence t_E.
At step 340, a determination is made whether the score a_score returned by the AlignmentAlternatingSearch decoding method is higher than the score t_score returned by the TargetAlternatingSearch decoding method. If a_score>t_score (Y), the translated English sentence a_E is output as the better translation at step 350. Otherwise (N), the translated English sentence t_E is output as the better translation at step 360.
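Steps 320 to 360 amount to running both searches and keeping the higher-scoring result, as the following C++ sketch shows. The Decoder callbacks stand in for the two methods described above; the signatures are our assumption.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Sentence = std::vector<std::string>;
// A decoding method returns a translation together with its score.
using Decoder = std::function<std::pair<Sentence, double>(const Sentence&)>;

// Run both alternating-search decoders on f and return whichever
// translation scores higher (steps 320 to 360 of the description).
Sentence decodeBest(const Sentence& f,
                    const Decoder& alignmentAlternatingSearch,
                    const Decoder& targetAlternatingSearch) {
    auto [a_E, a_score] = alignmentAlternatingSearch(f);
    auto [t_E, t_score] = targetAlternatingSearch(f);
    return (a_score > t_score) ? a_E : t_E;
}
```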
A source sentence f of length m words (m>0) is input at step 410.
The length l and alignment ã of the target sentence may optionally be specified at step 420. A determination is made at step 430 whether the length l of the target sentence ê is specified. If not (N), the length l of the target sentence ê is assumed to be the same as the length m of the source sentence f at step 435. In either case, processing continues at step 440. A determination is made at step 440 whether the alignment ã between the source sentence f and the target sentence ê is specified. If not (N), an alignment ã between the source sentence f and the target sentence ê is guessed at step 445. The alignment ã may represent a trivial alignment that maps the source word fj to target position j (i.e., ãj=j) or may be guessed more intelligently. In either case, processing continues at step 450.
At step 450, an optimal translation ê of the source sentence f is computed with the length l of the target sentence and the alignment ã between the source and target sentences kept fixed. The optimal translation ê is computed by maximising Pr(f,ã|e)Pr(e), that is, by solving the equation ê = argmaxe Pr(f,ã|e)Pr(e) for the fixed alignment (i.e., by solving FIXED_ALIGNMENT_DECODING using the dynamic programming technique described hereinafter).
The optimal translation ê is returned at step 460. As the above equation for fixed alignment decoding can be solved in O(m) time, the method of steps 410 to 460 (NaiveDecode) executes in time linear in the length m of the source sentence.
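A C++ sketch of the NaiveDecode method of steps 410 to 460 follows. The fixedAlignmentDecode callback stands for the FIXED_ALIGNMENT_DECODING solver described hereinafter; all names are ours, and the defaulting logic mirrors steps 430 to 450.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

using Sentence  = std::vector<std::string>;
using Alignment = std::vector<std::size_t>;

// Solves FIXED_ALIGNMENT_DECODING: optimal translation of f for a fixed
// target length l and alignment a.
using FixedAlignmentDecoder =
    std::function<Sentence(const Sentence& f, std::size_t l, const Alignment& a)>;

// NaiveDecode (steps 410-460): default the target length to m and the
// alignment to the trivial one (source word j -> target position j) when
// they are not specified, then decode with both kept fixed.
Sentence naiveDecode(const Sentence& f,
                     std::optional<std::size_t> length,
                     std::optional<Alignment> alignment,
                     const FixedAlignmentDecoder& fixedAlignmentDecode) {
    const std::size_t m = f.size();
    const std::size_t l = length.value_or(m);      // step 435
    Alignment a;
    if (alignment) {
        a = *alignment;
    } else {                                       // step 445: trivial guess
        a.resize(m);
        for (std::size_t j = 0; j < m; ++j) a[j] = j + 1;
    }
    return fixedAlignmentDecode(f, l, a);          // step 450
}
```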
A source sentence f of length m words (m>0) is input at step 510.
The optimal target language sentence ê and alignment ã between the source sentence f and target sentence ê are initialized to null at step 520.
At step 530, a processing loop variable l, which corresponds to the length of the target sentence ê, is initialized for execution of steps 540 to 585 for each value of l from m/2 to 2m, where m is the length of the source sentence f. Other ranges of sentence length may alternatively be selected; however, a range of target sentence lengths from m/2 to 2m will likely be appropriate in most cases.
At step 540, a processing loop variable a is initialized for execution of steps 550 to 575 for each alignment between the source sentence f and the target sentence ê.
At step 550, a target sentence e is computed using the linear time NaiveDecode algorithm described hereinbefore with reference to steps 410 to 460.
At step 560, a determination is made whether the target sentence e returned in step 550 is better than the stored best translation ê. If so (Y), the stored best translation ê and the associated alignment â are updated. In either case, processing continues at step 570.
If there is another alignment to process (Y), at step 570, the next alignment is loaded at step 575 and processing returns to step 550 according to the processing loop initiated in step 540. If there are no more alignments to process (N), at step 570, processing continues at step 580.
If there is another length to process (Y), at step 580, the next length is loaded at step 585 and processing returns to step 540 according to the processing loop initiated in step 530. If there are no more lengths to process (N), at step 580, the optimal translation ê and associated alignment are returned at step 590.
The NaiveOptimalDecode algorithm of steps 510 to 590 examines every candidate alignment for each candidate target sentence length and consequently executes in exponential time.
NaiveDecode is a linear time decoding algorithm that can be used to compute a sub-optimal solution for equation 3 (the relaxed version of equation 2), whereas NaiveOptimalDecode is an exponential time decoding algorithm that can be used to compute the optimal solution. It is thus desirable to obtain an algorithm or method that is close to NaiveDecode in complexity but close to NaiveOptimalDecode in quality. The complexity of NaiveOptimalDecode may be reduced by carefully reducing the number of alignments that are examined. For example, if only a small number g(m) of alignments in NaiveOptimalDecode are examined, a solution may be found in O(mg(m)) time.
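The structure of NaiveOptimalDecode, and the restriction that yields an O(m·g(m)) variant, can be sketched in C++ as follows. The alignment enumerator is pluggable, so supplying a generator of only g(m) candidate alignments per length gives the reduced complexity; all names are ours.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

using Sentence  = std::vector<std::string>;
using Alignment = std::vector<std::size_t>;

using FixedAlignmentDecoder =
    std::function<Sentence(const Sentence&, std::size_t, const Alignment&)>;
using Scorer = std::function<double(const Sentence& f, const Alignment& a,
                                    const Sentence& e)>;
// Enumerates the candidate alignments to examine for a given target
// length; restricting this set to g(m) alignments per length reduces the
// overall complexity from exponential to O(m * g(m)).
using AlignmentEnumerator =
    std::function<std::vector<Alignment>(const Sentence& f, std::size_t l)>;

Sentence naiveOptimalDecode(const Sentence& f,
                            const FixedAlignmentDecoder& fixedAlignmentDecode,
                            const Scorer& score,
                            const AlignmentEnumerator& alignmentsFor) {
    const std::size_t m = f.size();
    Sentence bestE;
    double bestScore = -std::numeric_limits<double>::infinity();
    // Steps 530-585: examine each candidate target length...
    for (std::size_t l = std::max<std::size_t>(1, m / 2); l <= 2 * m; ++l) {
        // ...and each candidate alignment for that length (steps 540-575).
        for (const Alignment& a : alignmentsFor(f, l)) {
            Sentence e = fixedAlignmentDecode(f, l, a);   // step 550
            double s = score(f, a, e);
            if (s > bestScore) {                          // step 560
                bestScore = s;
                bestE = e;
            }
        }
    }
    return bestE;
}
```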
A source sentence f of length m words (m>0) is input at step 605.
The optimal target language sentence e(0) and the alignment a(0) between the source sentence f and target sentence e(0) are initialized to null at step 610.
At step 615, a processing loop variable l, which corresponds to the length of the target sentence e(0), is initialized for execution of steps 620 to 660 for each value of l from m/2 to 2m, where m is the length of the source sentence f. Other ranges of sentence length may alternatively be selected; however, a range of target sentence lengths from m/2 to 2m will likely be appropriate in most cases.
At step 620, the variables e and a are initialized to null.
At step 625, an initial alignment is guessed from the source French sentence. The initial alignment can be trivially determined, say by mapping each word in the source French sentence f to a word position in the target sentence e, or can be guessed more intelligently. A processing loop is also initialized for execution of steps 630 to 640 while an improvement in the current solution is possible.
At step 630, a target sentence e is computed using the linear time NaiveDecode algorithm described hereinbefore with reference to steps 410 to 460. The source sentence f, the target sentence length l and the current alignment a are passed to NaiveDecode, which returns a target sentence e.
At step 635, an improved alignment for the target sentence e computed in step 630 is computed using the Viterbi algorithm. The source sentence f and the target sentence e are passed to the Viterbi algorithm, which returns an improved alignment a.
At step 640, a determination is made whether a further improvement in the target sentence e is possible. For example, a determination may be made whether the score for the current target sentence is better than the previous score by a sufficient amount.
If an improvement is possible (Y), processing returns to step 630 according to the processing loop initiated in step 625. If an improvement is not possible or is not of sufficient magnitude (N), step 645 determines whether the current translation is better than the previously stored best translation. If a better translation (Y), the current target sentence e and associated alignment a are stored as the optimal target sentence e(0) and associated alignment a(0), respectively, at step 650.
If there is another length to process (Y), at step 655, the next length is loaded at step 660 and processing returns to step 620 according to the processing loop initiated in step 615. If there are no more lengths to process (N), at step 655, the optimal translation e(0) is returned at step 665.
AlignmentAlternatingSearch searches for a good translation by varying the length of the target sentence. For a sentence length l, the algorithm finds a translation of length l and then iteratively improves the translation. In each iteration, the algorithm solves two subproblems: FIXED_ALIGNMENT_DECODING and VITERBI_ALIGNMENT. The inputs to each iteration are the source sentence f, the target sentence length l, and an alignment a between the source and target sentences. AlignmentAlternatingSearch finds a better translation e for f by solving FIXED_ALIGNMENT_DECODING using NaiveDecode. Having computed e, the algorithm computes a better alignment â between e and f by solving VITERBI_ALIGNMENT using the Viterbi algorithm. The new alignment thus found is used by AlignmentAlternatingSearch in the subsequent iteration. At the end of each iteration, AlignmentAlternatingSearch checks whether it has made progress and ultimately returns the best translation of the source sentence f and its score across a range of target sentence lengths.
The analysis of AlignmentAlternatingSearch is complicated by the fact that the number of iterations depends on the input (i.e., NaiveDecode and Viterbi are repeatedly executed while an improvement in the solution is possible). It is reasonable to assume that the length m of the source sentence is an upper bound on the number of iterations; in practice, however, the number of iterations is typically O(1). There are 3m/2 candidate sentence lengths for the translation (l varies from m/2 to 2m) and both NaiveDecode and Viterbi are O(m). Therefore, the time complexity of AlignmentAlternatingSearch is O(m2).
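For concreteness, the AlignmentAlternatingSearch method of steps 605 to 665 can be sketched in C++ as follows. The fixedAlignmentDecode, viterbiAlignment and score callbacks are the assumed primitives discussed above (names ours); the inner loop alternates between the two subproblems while the score improves.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <utility>
#include <vector>

using Sentence  = std::vector<std::string>;
using Alignment = std::vector<std::size_t>;

using FixedAlignmentDecoder =
    std::function<Sentence(const Sentence&, std::size_t, const Alignment&)>;
// Returns an improved (e.g., Viterbi) alignment of f to e.
using AlignmentImprover =
    std::function<Alignment(const Sentence& f, const Sentence& e)>;
using Scorer = std::function<double(const Sentence&, const Alignment&,
                                    const Sentence&)>;

std::pair<Sentence, double> alignmentAlternatingSearch(
    const Sentence& f, const FixedAlignmentDecoder& fixedAlignmentDecode,
    const AlignmentImprover& viterbiAlignment, const Scorer& score) {
    const std::size_t m = f.size();
    Sentence bestE;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (std::size_t l = std::max<std::size_t>(1, m / 2); l <= 2 * m; ++l) {
        Alignment a(m);                              // step 625: trivial guess,
        for (std::size_t j = 0; j < m; ++j)          // kept within [1..l]
            a[j] = j % l + 1;
        double prev = -std::numeric_limits<double>::infinity();
        for (;;) {
            Sentence e = fixedAlignmentDecode(f, l, a);        // step 630
            a = viterbiAlignment(f, e);                        // step 635
            double s = score(f, a, e);
            if (s > bestScore) { bestScore = s; bestE = e; }   // steps 645-650
            if (s <= prev) break;                              // step 640
            prev = s;
        }
    }
    return {bestE, bestScore};
}
```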
A source sentence f of length m words (m>0) is input at step 705.
The optimal target language sentence e(0) and the alignment a(0) between the source sentence f and target sentence e(0) are initialized to null at step 710.
At step 715, a processing loop variable l, which corresponds to the length of the target sentence e(0), is initialized for execution of steps 720 to 760 for each value of l from m/2 to 2m, where m is the length of the source sentence f. A different range of target sentence lengths may be selected if appropriate, as described hereinbefore.
At step 720, the variables e and a are initialized to null.
At step 725, an initial target sentence is guessed from the source French sentence. The initial sentence can be determined trivially, say by picking the best English word translation for each source word in the French sentence, or can be guessed more intelligently. A processing loop is also initialized for execution of steps 730 to 740 while an improvement in the current solution is possible.
At step 730, the VITERBI_ALIGNMENT problem is solved: an improved alignment a for the target sentence e is computed using the Viterbi algorithm. At step 735, FIXED_ALIGNMENT_DECODING is performed: the source sentence f, the length l and the alignment a are passed to NaiveDecode, which returns a target sentence e.
At step 740, a determination is made whether a further improvement in the target sentence e is possible. For example, a determination may be made whether the score for the current target sentence is better than the previous score by a sufficient amount.
If an improvement is possible (Y), processing returns to step 730 according to the processing loop initiated in step 725. If an improvement is not possible or is not of sufficient magnitude (N), step 745 determines whether the current translation is better than the previously stored best translation. If a better translation (Y), the current target sentence e and associated alignment a are stored as the optimal target sentence e(0) and associated alignment a(0), respectively, at step 750.
If there is another length to process (Y), at step 755, the next length is loaded at step 760 and processing returns to step 720 according to the processing loop initiated in step 715. If there are no more lengths to process (N), at step 755, the optimal translation e(0) is returned at step 765.
TargetAlternatingSearch searches for a good translation by varying the length of the target sentence. For a sentence length l, the algorithm guesses an initial translation of length l and then iteratively improves the translation. In each iteration, the algorithm solves two subproblems: VITERBI_ALIGNMENT and FIXED_ALIGNMENT_DECODING. The inputs to each iteration are the source sentence f, the target sentence length l, and the current translation e. TargetAlternatingSearch first computes a better alignment â between e and f by solving VITERBI_ALIGNMENT using the Viterbi algorithm, and then finds a better translation e for f by solving FIXED_ALIGNMENT_DECODING using NaiveDecode. The new translation thus found is used by TargetAlternatingSearch in the subsequent iteration. At the end of each iteration, TargetAlternatingSearch checks whether it has made progress and ultimately returns the best translation of the source sentence f and its score across a range of target sentence lengths.
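TargetAlternatingSearch admits an analogous C++ sketch, differing from the previous one in the seeding step and in solving VITERBI_ALIGNMENT before FIXED_ALIGNMENT_DECODING within each iteration; again, all names are ours.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <utility>
#include <vector>

using Sentence  = std::vector<std::string>;
using Alignment = std::vector<std::size_t>;
using FixedAlignmentDecoder =
    std::function<Sentence(const Sentence&, std::size_t, const Alignment&)>;
using AlignmentImprover =
    std::function<Alignment(const Sentence&, const Sentence&)>;
using Scorer = std::function<double(const Sentence&, const Alignment&,
                                    const Sentence&)>;
// Seeds an initial translation of f of length l, e.g., the best
// word-for-word translation of each source word (step 725).
using Seeder = std::function<Sentence(const Sentence& f, std::size_t l)>;

std::pair<Sentence, double> targetAlternatingSearch(
    const Sentence& f, const FixedAlignmentDecoder& fixedAlignmentDecode,
    const AlignmentImprover& viterbiAlignment, const Scorer& score,
    const Seeder& seedTranslation) {
    const std::size_t m = f.size();
    Sentence bestE;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (std::size_t l = std::max<std::size_t>(1, m / 2); l <= 2 * m; ++l) {
        Sentence e = seedTranslation(f, l);                    // step 725
        double prev = -std::numeric_limits<double>::infinity();
        for (;;) {
            Alignment a = viterbiAlignment(f, e);              // step 730
            e = fixedAlignmentDecode(f, l, a);                 // step 735
            double s = score(f, a, e);
            if (s > bestScore) { bestScore = s; bestE = e; }   // steps 745-750
            if (s <= prev) break;                              // step 740
            prev = s;
        }
    }
    return {bestE, bestScore};
}
```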
The AlignmentAlternatingSearch and TargetAlternatingSearch decoding methods described hereinbefore each alternate between solving FIXED_ALIGNMENT_DECODING and VITERBI_ALIGNMENT and each execute in O(m2) time.
Fixed Alignment Decoding
Each of NaiveDecode, NaiveOptimalDecode, TargetAlternatingSearch and AlignmentAlternatingSearch uses a linear time algorithm for FIXED_ALIGNMENT_DECODING, which finds the optimal translation given the length l of the target sentence and the alignment ã that maps source words to target positions. A dynamic programming based solution to this problem is based on a new formulation of the IBM translation models.
Consider a source French sentence f of m words f1, f2, . . . , fm, an alignment ã represented by ã1, ã2, . . . , ãm, and a partial target sentence e comprising words e1, e2, . . . , ei, . . . Let φ(i) be the fertility of the English word ei at target position i. The alignment ã maps each of the source words fj, j=1, . . . , m, to a target position in the range [0, . . . , l]. A mapping ψ from [0, . . . , l] to subsets of {1, . . . , m} is defined as follows:
ψ(i) = { j : j ∈ {1, . . . , m} ∧ ãj = i }  ∀ i = 0, . . . , l.
Table 1, below, shows the breaking up of Pr(f, ã|e) into constituents Ti, Di and Ni:
As a consequence, Pr(f, ã|e) Pr(e) can be written as:
The foregoing reformulation of the optimization function of the decoding problem allows dynamic programming to be used to solve FIXED_ALIGNMENT_DECODING efficiently. Notably, each word ei has only a constant number of candidates in the vocabulary. Therefore, the set of words e1, . . . , el that maximises the above optimization function can be found in O(m) time using a standard dynamic programming algorithm.
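A hedged C++ sketch of that dynamic program follows, under a bigram language model. The candidates and localScore callbacks abstract the per-position candidate set and the per-position factor (translation, distortion, fertility and language model terms, in log space); both are our assumptions about how the constituents of Table 1 would be supplied.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

using Sentence = std::vector<std::string>;

// candidates(i) returns the constant-size set of target-word candidates
// for position i (e.g., the best inverse translations of the source words
// aligned to i). localScore(i, prev, w) returns the log of the position-i
// factor plus the bigram language-model term log Pr(w | prev).
using Candidates = std::function<std::vector<std::string>(std::size_t)>;
using LocalScore = std::function<double(std::size_t, const std::string&,
                                        const std::string&)>;

// Viterbi-style dynamic program over target positions 1..l: because the
// objective factorizes per position and the language model is bigram, it
// suffices to keep, for each candidate word at position i, the best
// prefix ending in that word. With a constant number of candidates per
// position, the run time is linear in l.
Sentence fixedAlignmentDecode(std::size_t l, const Candidates& candidates,
                              const LocalScore& localScore) {
    struct Cell { double score; int back; std::string word; };
    std::vector<std::vector<Cell>> dp(l + 1);
    dp[0] = {{0.0, -1, "<s>"}};                     // sentence-start symbol
    for (std::size_t i = 1; i <= l; ++i) {
        for (const std::string& w : candidates(i)) {
            Cell best{-std::numeric_limits<double>::infinity(), -1, w};
            for (std::size_t p = 0; p < dp[i - 1].size(); ++p) {
                double s = dp[i - 1][p].score +
                           localScore(i, dp[i - 1][p].word, w);
                if (s > best.score) { best.score = s; best.back = (int)p; }
            }
            dp[i].push_back(best);
        }
    }
    std::size_t bi = 0;                             // best final cell
    for (std::size_t p = 1; p < dp[l].size(); ++p)
        if (dp[l][p].score > dp[l][bi].score) bi = p;
    Sentence e(l);
    for (std::size_t i = l; i >= 1; --i) {          // trace back e1..el
        e[i - 1] = dp[i][bi].word;
        bi = (std::size_t)dp[i][bi].back;
    }
    return e;
}
```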
Computer Implementation of French to English Translation Decoder Embodiment
The algorithms were implemented in the C++ computer programming language and executed on an IBM RS-6000 dual processor workstation with 1 GB of RAM. A French-English translation model (based on IBM Model 3) was built by training over a corpus of 100,000 sentence pairs from the Hansard corpus. The translation direction was from French to English. The English language model used for decoding was built by training over a corpus consisting of about 800 million words. The test sentences were divided into several classes based on length, with 300 test French sentences in each length class. Four algorithms were implemented, namely: NaiveDecode, NaiveOptimalDecode, AlignmentAlternatingSearch and TargetAlternatingSearch.
In order to provide comparative results, the dynamic programming based Held-Karp algorithm of Tillmann (2001) was also implemented. Average times taken for translation of each length class were computed for each of the five algorithms and are shown in FIG. 8.
The graph of FIG. 8 compares the average translation times of the five algorithms for each of the sentence length classes.
At step 910, source text in a first language is decoded based on a fixed alignment between words in the source text and words in the target text. An alignment between words in the source text and words in the target text is determined at step 920. Either of steps 910 and 920 may be executed initially. If step 910 is executed first, an initial alignment may be guessed or estimated. Alternatively, if step 920 is executed first, an initial decoded text may be generated.
Steps 910 and 920 are repeated at step 930 while a decoding improvement in the target text can be obtained. Thereafter, the target text in a second language is output at step 940.
At step 1010, an alignment between words in the source text and positions of words in the target text is determined. At step 1020, an optimal translation of the source text is generated based on the alignment determined in step 1010. At step 1030, steps 1010 and 1020 are repeated for each of a plurality of lengths of the translated text.
The computer software involves a set of programmed logic instructions that may be executed by the computer system 1100 for instructing the computer system 1100 to perform predetermined functions specified by those instructions. The computer software may be expressed or recorded in any language, code or notation that comprises a set of instructions intended to cause a compatible information processing system to perform particular functions, either directly or after conversion to another language, code or notation.
The computer software program comprises statements in a computer language. The computer program may be processed using a compiler into a binary format suitable for execution by the operating system. The computer program is programmed in a manner that involves various software components, or code means, that perform particular steps of the methods described hereinbefore.
The components of the computer system 1100 comprise: a computer 1120, input devices 1110, 1115 and a video display 1190. The computer 1120 comprises: a processing unit 1140, a memory unit 1150, an input/output (I/O) interface 1160, a communications interface 1165, a video interface 1145, and a storage device 1155. The computer 1120 may comprise more than one of any of the foregoing units, interfaces, and devices.
The processing unit 1140 may comprise one or more processors that execute the operating system and the computer software under control of the operating system. The memory unit 1150 may comprise random access memory (RAM), read-only memory (ROM), flash memory and/or any other type of memory known in the art for use under direction of the processing unit 1140.
The video interface 1145 is connected to the video display 1190 and provides video signals for display on the video display 1190. User input to operate the computer 1120 is provided via the input devices 1110 and 1115, comprising a keyboard and a mouse, respectively. The storage device 1155 may comprise a disk drive or any other suitable non-volatile storage medium.
Each of the components of the computer 1120 is connected to a bus 1130 that comprises data, address, and control buses, to allow the components to communicate with each other via the bus 1130.
The computer system 1100 may be connected to one or more other similar computers via the communications interface 1165 using a communication channel 1185 to a network 1180, represented as the Internet.
The computer software program may be provided as a computer program product, and recorded on a portable storage medium. In this case, the computer software program is accessible by the computer system 1100 from the storage device 1155. Alternatively, the computer software may be accessible directly from the network 1180 by the computer 1120. In either case, a user can interact with the computer system 1100 using the keyboard 1110 and mouse 1115 to operate the programmed computer software executing on the computer 1120.
The computer system 1100 has been described for illustrative purposes. Accordingly, the foregoing description relates to an example of a particular type of computer system suitable for practicing the methods and computer program products described hereinbefore and hereinafter. Other configurations or types of computer systems can equally well be used to practice the methods and computer program products described hereinbefore and hereinafter, as would be readily understood by persons skilled in the art.
Embodiments of methods, apparatuses and computer program products have been described hereinbefore for performing statistical translation decoding. The foregoing description provides exemplary embodiments only, and is not intended to limit the scope, applicability or configurations of the invention. Rather, the description of the exemplary embodiments provides those skilled in the art with descriptions for implementing an embodiment of the invention. Various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the claims hereinafter.