Tibetan Character Constituent Analysis Method, Tibetan Sorting Method And Corresponding Devices

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit and priority of Chinese Patent Application No. 201610528753.9 filed Jul. 5, 2016. The entire disclosure of the above application is incorporated herein by reference.

FIELD

The present invention relates to the field of natural language processing, in particular to a Tibetan character constituent analysis method, a Tibetan sorting method and corresponding devices.

BACKGROUND

As with other languages, automatic computer sorting of Tibetan is widely used in various fields of Tibetan information technology, including the sorting of Tibetan dictionaries and thesauruses, information retrieval, text sorting and the like. Research on automatic computer sorting of Tibetan has continued without interruption since research on Tibetan information technology began in the early 1980s. With the development of Tibetan information technology, the prior art generally sorts Tibetan by means of an automatic Tibetan sorting algorithm.

However, the existing sorting algorithms and models are imperfect, error-prone and overly complicated, so the existing Tibetan sorting methods lack universality and compatibility, which makes automatic computer Tibetan sorting inconvenient to use.

SUMMARY

The present invention provides a Tibetan character constituent analysis method, a Tibetan sorting method and corresponding devices, which have universality and compatibility, and can facilitate the use of automatic computer Tibetan sorting.

In one aspect, a Tibetan character constituent analysis method is provided, including: S10, acquiring a Tibetan text to be analyzed; S20, using Tibetan characters in the Tibetan text as the input of a preset finite state automaton group; and S30, acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the Tibetan characters in the Tibetan text are correctly spelled; the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i=(Σ_i, Q_i, δ_i, q_i, F_i); Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and F_i; δ_i represents a state transition function of the finite state automaton M_i, which maps the direct product Q_i×Σ_i of Q_i and Σ_i to Q_i; q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; F_i represents a finite set of termination states of the finite state automaton M_i, and F_i⊂Q_i; and i is a positive integer, and i≤24.

In another aspect, a Tibetan sorting method is provided, including: S10, acquiring at least two Tibetan characters to be sorted; S20, respectively using the at least two Tibetan characters to be sorted as the input of a preset finite state automaton group; S30, acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled; and S40, sorting the at least two Tibetan characters according to the constituents of the at least two Tibetan characters to acquire a sorting result; the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i=(Σ_i, Q_i, δ_i, q_i, F_i); Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and F_i; δ_i represents a state transition function of the finite state automaton M_i, which maps the direct product Q_i×Σ_i of Q_i and Σ_i to Q_i; q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; F_i represents a finite set of termination states of the finite state automaton M_i, and F_i⊂Q_i; and i is a positive integer, and i≤24.

In a third aspect, a Tibetan sorting method is provided, including: S10, acquiring at least two Tibetan words to be sorted; S20, respectively acquiring Tibetan characters in the at least two Tibetan words; S30, respectively using the Tibetan characters in the at least two Tibetan words as the input of a preset finite state automaton group; S40, acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled; and S50, sorting the at least two Tibetan words according to the constituents of each Tibetan character in the at least two Tibetan words to acquire a sorting result; the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i=(Σ_i, Q_i, δ_i, q_i, F_i); Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and F_i; δ_i represents a state transition function of the finite state automaton M_i, which maps the direct product Q_i×Σ_i of Q_i and Σ_i to Q_i; q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; F_i represents a finite set of termination states of the finite state automaton M_i, and F_i⊂Q_i; and i is a positive integer, and i≤24.

In a fourth aspect, a Tibetan character constituent analysis device is provided, including:

a text acquisition module, used for acquiring a Tibetan text to be analyzed;

a text input module, connected with the text acquisition module and used for using Tibetan characters in the Tibetan text as the input of a preset finite state automaton group; and

a constituent analysis module, connected with the text input module and used for acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the Tibetan characters in the Tibetan text are correctly spelled;

the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i=(Σ_i, Q_i, δ_i, q_i, F_i); Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and F_i; δ_i represents a state transition function of the finite state automaton M_i, which maps the direct product Q_i×Σ_i of Q_i and Σ_i to Q_i; q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; F_i represents a finite set of termination states of the finite state automaton M_i, and F_i⊂Q_i; and i is a positive integer, and i≤24.

In a fifth aspect, a Tibetan sorting device is provided, including:

a Tibetan character acquisition module, used for acquiring at least two Tibetan characters to be sorted;

a Tibetan character input module, connected with the Tibetan character acquisition module and used for respectively using the at least two Tibetan characters to be sorted as the input of a preset finite state automaton group;

a constituent analysis module, connected with the Tibetan character input module and used for acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled; and

a sorting module, connected with the constituent analysis module and used for sorting the at least two Tibetan characters according to the constituents of the at least two Tibetan characters to acquire a sorting result;

the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i=(Σ_i, Q_i, δ_i, q_i, F_i); Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and F_i; δ_i represents a state transition function of the finite state automaton M_i, which maps the direct product Q_i×Σ_i of Q_i and Σ_i to Q_i; q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; F_i represents a finite set of termination states of the finite state automaton M_i, and F_i⊂Q_i; and i is a positive integer, and i≤24.

In a sixth aspect, a Tibetan sorting device is provided, including:

a Tibetan word acquisition module, used for acquiring at least two Tibetan words to be sorted;

a Tibetan character acquisition module, connected with the Tibetan word acquisition module and used for respectively acquiring Tibetan characters in the at least two Tibetan words;

a Tibetan character input module, connected with the Tibetan character acquisition module and used for respectively using the Tibetan characters in the at least two Tibetan words as the input of a preset finite state automaton group;

a sorting module, connected with the constituent analysis module and used for sorting the at least two Tibetan words according to the constituents of each Tibetan character in the at least two Tibetan words to acquire a sorting result.

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention can solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

DRAWINGS

FIG. 1 is a flowchart of a Tibetan character constituent analysis method provided by a first embodiment of the present invention;

FIG. 2 is a flowchart of a Tibetan sorting method provided by a second embodiment of the present invention;

FIG. 3 is a flowchart of a Tibetan sorting method provided by a third embodiment of the present invention;

FIG. 4 is a schematic diagram of a structure of a Tibetan character constituent analysis device provided by a fourth embodiment of the present invention;

FIG. 5 is a schematic diagram of a structure of a Tibetan sorting device provided by a fifth embodiment of the present invention;

FIG. 6 is a schematic diagram of a structure of a Tibetan sorting device provided by a sixth embodiment of the present invention.

DETAILED DESCRIPTION

The present invention will be further illustrated below in combination with the accompanying drawings and embodiments. These exemplary implementations serve merely to illustrate the present invention and do not constitute any form of limitation on its actual protection scope.

First Embodiment

As shown in FIG. 1, the embodiment of the present invention provides a Tibetan character constituent analysis method, including the following steps.

Step 101, a Tibetan text to be analyzed is acquired.

In the embodiment, the Tibetan text acquired in step 101 may contain only one Tibetan character or a plurality of Tibetan characters, and this is not limited herein. Specifically, when the Tibetan text contains a plurality of Tibetan characters, the acquired Tibetan text can first be segmented with a character as the unit to acquire at least one Tibetan character; for example, the text can be segmented according to the Tibetan character separator, the vertical character, the double-vertical character and the space character.

In particular, a Tibetan text containing a plurality of Tibetan characters may also be a Tibetan word composed of a plurality of Tibetan characters; in this case, the acquired Tibetan text can be segmented according to a specific separator or other signs, and this is not limited herein.
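The segmentation of step 101 can be sketched as follows. This is a minimal illustration only: it assumes that the "Tibetan character separator", the "vertical character" and the "double-vertical character" correspond to the Unicode code points U+0F0B, U+0F0D and U+0F0E respectively, and the function name `segment` is hypothetical.

```python
import re

# Assumed code points (not stated in the text above):
#   U+0F0B  Tibetan character separator (tsheg)
#   U+0F0D  vertical character (shad)
#   U+0F0E  double-vertical character (nyis shad)
# The ASCII space is the fourth delimiter.
_DELIMITERS = re.compile("[\u0f0b\u0f0d\u0f0e ]+")


def segment(text):
    """Split a Tibetan text into Tibetan characters, dropping empty pieces."""
    return [piece for piece in _DELIMITERS.split(text) if piece]


# Two Tibetan characters separated by U+0F0B and closed by U+0F0D.
sample = "\u0f56\u0f7c\u0f51\u0f0b\u0f61\u0f72\u0f42\u0f0d"
characters = segment(sample)
print(len(characters))
```

Each element of the returned list would then be passed to the finite state automaton group in step 102.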

Step 102, the Tibetan characters in the Tibetan text are used as the input of a preset finite state automaton group.

In the embodiment, when the Tibetan text contains only one Tibetan character, the step 102 specifically includes: using the Tibetan character as the input of the preset finite state automaton group; and when the Tibetan text contains a plurality of Tibetan characters, the step 102 specifically includes: respectively using the Tibetan characters in the Tibetan text as the input of the preset finite state automaton group.

In the embodiment, the finite state automaton group includes 24 finite state automata, wherein any finite state automaton M_i=(Σ_i, Q_i, δ_i, q_i, F_i); Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and F_i; δ_i represents a state transition function of the finite state automaton M_i, which maps the direct product Q_i×Σ_i of Q_i and Σ_i to Q_i; q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; F_i represents a finite set of termination states of the finite state automaton M_i, and F_i⊂Q_i; and i is a positive integer, and i≤24.

In the embodiment, 24 Tibetan spelling formal grammars are preset, and each Tibetan spelling formal grammar corresponds to one finite state automaton; at least one Tibetan character is used as the input of each preset finite state automaton in sequence. The finite set of the terminal symbols of the Tibetan spelling formal grammar G_i is a subset of a set L consisting of 30 Tibetan consonants, 5 reverse scripts, 4 vowel symbols and 1 long vowel symbol, and includes the characters (symbols) that actually occur in a sentence of the language (here, a Tibetan character belonging to a certain structure). The set of the non-terminal symbols of the Tibetan spelling formal grammar G_i includes symbols that do not actually occur in a sentence of the language but act as variables in derivation, and are equivalent to grammatical categories in the language. For example, a non-terminal symbol can be a variable in the SVO (Subject Verb Object) word order of Chinese or the SOV (Subject Object Verb) word order of Tibetan; it does not occur in a specific sentence, that is, it works implicitly and cannot be seen.

Elements in the finite set of the terminal symbols and the finite set of the non-terminal symbols correspond to specific Tibetan spelling formal grammars. The initial state of the finite state automaton M_i is the state in which the automaton has just started to work, that is, the state in which the automaton first receives input characters; and the termination state refers to a final state of the automaton. Specifically, the automata in the finite state automaton group can be deterministic or nondeterministic; to facilitate understanding and improve implementation efficiency, deterministic automata are taken as an example for illustration in the embodiment.

In the embodiment, the process of acquiring the finite state automaton group can include: acquiring the Tibetan spelling formal grammar G_i, wherein G_i=(T_i, V_i, S_i, P_i); acquiring a termination state identifier E_i of the finite state automaton M_i; judging whether the finite set P_i of production rules of the Tibetan spelling formal grammar G_i contains a production rule S_i→ε; if so, taking F_i={S_i, E_i}; if not, taking F_i={E_i}; and acquiring the finite state automaton M_i according to T_i, V_i, S_i and F_i, wherein T_i represents the finite set of the terminal symbols of the Tibetan spelling formal grammar G_i; S_i represents a start symbol of the Tibetan spelling formal grammar G_i, and S_i ∈ V_i; ε represents a null character; the finite set Σ_i of the input characters of the finite state automaton M_i is equivalent to the finite set T_i of the terminal symbols of the Tibetan spelling formal grammar G_i; and the initial state q_i of the finite state automaton M_i is equivalent to the start symbol S_i of the Tibetan spelling formal grammar G_i.

Wherein, the process of acquiring the Tibetan spelling formal grammar includes: acquiring the finite set T_i of the terminal symbols, wherein T_i is a subset of the set L, and the set L includes 30 Tibetan consonants, 5 reverse scripts, 4 vowel symbols and 1 long vowel symbol; acquiring the finite set V_i of the non-terminal symbols; acquiring the start symbol S_i, wherein S_i ∈ V_i; acquiring the finite set P_i of the production rules; and acquiring the corresponding Tibetan spelling formal grammar G_i according to T_i, V_i, S_i and P_i. Wherein, the process of acquiring the finite set P_i of the production rules can include: at first, acquiring a preset Tibetan spelling grammar formal description system; and then acquiring the finite set P_i of the production rules according to the Tibetan spelling grammar formal description system.
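The construction described above can be sketched in code. The text specifies Σ_i=T_i, q_i=S_i, Q_i=V_i∪F_i and the rule for F_i, but does not spell out how δ_i is filled in; the sketch below assumes the usual textbook construction for right-linear grammars (a production A→aB adds B to δ(A, a), and a production A→a adds the termination state E to δ(A, a)). The names `grammar_to_automaton` and `accepts` and the toy grammar are hypothetical. Because a grammar may contain both A→a and A→aB, the result is treated as a nondeterministic automaton and simulated with a set of current states.

```python
from collections import defaultdict


def grammar_to_automaton(T, V, S, P, E="E"):
    """Build (Sigma, Q, delta, q0, F) from a right-linear grammar (T, V, S, P).

    P maps a non-terminal to a list of right-hand sides. Each right-hand side
    is a tuple: () for the null production, (a,) for A -> a, (a, B) for A -> aB.
    """
    F = {E}
    if () in P.get(S, []):            # P contains S -> null character
        F.add(S)
    delta = defaultdict(set)
    for head, bodies in P.items():
        for body in bodies:
            if len(body) == 1:        # A -> a   (assumed: move to E)
                delta[(head, body[0])].add(E)
            elif len(body) == 2:      # A -> aB  (assumed: move to B)
                delta[(head, body[0])].add(body[1])
    Q = set(V) | F                    # Q is the union of V and F
    return set(T), Q, dict(delta), S, F


def accepts(automaton, symbols):
    """Return True if the symbol sequence ends in a termination state."""
    _, _, delta, q0, F = automaton
    current = {q0}
    for symbol in symbols:
        nxt = set()
        for state in current:
            nxt |= delta.get((state, symbol), set())
        if not current or not nxt:
            return False
        current = nxt
    return bool(current & F)


# Toy grammar (illustrative only, not one of the 24 grammars): S -> a | aB, B -> b
toy = grammar_to_automaton({"a", "b"}, {"S", "B"}, "S",
                           {"S": [("a",), ("a", "B")], "B": [("b",)]})
print(accepts(toy, ["a"]), accepts(toy, ["a", "b"]), accepts(toy, ["b"]))
```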

In the embodiment, the preset Tibetan spelling grammar formal description system can be established according to a set theory method, and the specific form is as follows:

Tibetan spelling grammar 1: elements in a set Root={b₁, b₂, b₃, b₄, b₅, . . . , b₃₀, b₃₁, b₃₂, b₃₃, b₃₄, b₃₅} respectively correspond to the 30 Tibetan consonants and the 5 Tibetan reverse scripts, and any Tibetan character corresponding to b_i ∈ Root can constitute a root of a Tibetan character.

Tibetan spelling grammar 2: for a set Prefix={b₃, b₁₁, b₁₅, b₁₆, b₂₃}, Prefix⊂Root, any Tibetan character corresponding to b_j ∈ Prefix, (j=3, 11, 15, 16, 23) can constitute a prefix of the Tibetan character.

Tibetan spelling grammar 3: for a set Suffix={b₃, b₄, b₁₁, b₁₂, b₁₅, b₁₆, b₂₃, b₂₅, b₂₆, b₂₈}, Suffix⊂Root, any Tibetan character corresponding to b_j ∈ Suffix, (j=3, 4, 11, 12, 15, 16, 23, 25, 26, 28) can constitute a suffix of the Tibetan character.

Tibetan spelling grammar 4: for a set Postfix={b₁₁, b₂₈}, Postfix⊂Suffix⊂Root, any Tibetan character corresponding to b_j ∈ Postfix, (j=11, 28) can constitute a postfix of the Tibetan character.

Tibetan spelling grammar 5: for a set Superfix={b₂₅, b₂₆, b₂₈}, Superfix⊂Root, any Tibetan character corresponding to b_j ∈ Superfix, (j=25, 26, 28) can constitute a superfix of the Tibetan character.

Tibetan spelling grammar 6: for a set Subfix={b₂₀, b₂₄, b₂₅, b₂₆}, Subfix⊂Root, any Tibetan character corresponding to b_j ∈ Subfix, (j=20, 24, 25, 26) can constitute a subfix of the Tibetan character.

Tibetan spelling grammar 7: for a set Vowel=Vowel₁∪{a}, Vowel₁={i, u, e, o} corresponds to the 4 Tibetan vowel characters, and a represents the Tibetan long vowel character. The Tibetan roots corresponding to b_j ∈ Root, (j=1, 23, 5, 7, . . . , 33, 34, 35) can be spelled with the vowel characters corresponding to v ∈ Vowel; u and a can only be spelled below consonants, and the remaining 3 vowel characters can only be spelled above consonants.

Tibetan spelling grammar 8: when the Tibetan roots corresponding to b_j ∈ Root, (j=1, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17, 19, 29) are spelled with the superfixes corresponding to b_i ∈ Superfix, (i=25, 26, 28), the following grammar rules must be satisfied:

1. b_j ∈ Root, (j=1, 3, 4, 7, 8, 9, 11, 12, 15, 16, 17, 19) can only be spelled with b₂₅ ∈ Superfix.

2. b_j ∈ Root, (j=1, 3, 4, 5, 7, 9, 11, 13, 15, 29) can only be spelled with b₂₆ ∈ Superfix.

3. b_j ∈ Root, (j=1, 3, 4, 8, 9, 11, 12, 13, 15, 16, 17) can only be spelled with b₂₈ ∈ Superfix.

Tibetan spelling grammar 9: when the Tibetan roots corresponding to b_j ∈ Root, (j=1, 2, 3, 8, 9, 10, 11, 13, 14, 15, 16, 18, 21, 22, 25, 26, 27, 28, 29) are spelled with the subfixes corresponding to b_i ∈ Subfix, (i=20, 24, 25, 26), the following grammar rules must be satisfied:

1. b_j ∈ Root, (j=1, 2, 3, 8, 11, 18, 21, 22, 25, 26, 27, 29) can only be spelled with b₂₀ ∈ Subfix.

2. b_j ∈ Root, (j=1, 2, 3, 13, 14, 15, 16) can only be spelled with b₂₄ ∈ Subfix.

3. b_j ∈ Root, (j=1, 2, 3, 9, 10, 11, 13, 14, 15, 16, 28, 29) can only be spelled with b₂₅ ∈ Subfix.

4. b_j ∈ Root, (j=1, 3, 15, 22, 25, 28) can only be spelled with b₂₆ ∈ Subfix.

5. b_j ∈ Root, (j=29) can only be spelled with b₁₄ ∈ Subfix.

(Note: the spelling form of b₂₉ with b₁₄ occurs in modern Tibetan in order to spell the [f] sound of other languages. According to the traditional Tibetan spelling grammar, b₂₉ cannot be used as a superfix and b₁₄ cannot be used as a subfix; therefore, as a special case, when b₂₉ is spelled with b₁₄, b₁₄ is deemed the “subfix”.)

Tibetan spelling grammar 10: when the Tibetan roots corresponding to b_i ∈ Root, (i=1, 3, 12, 13, 15, 16, 17) are simultaneously spelled with the superfixes corresponding to b_j ∈ Superfix, (j=25, 28) and the subfixes corresponding to b_k ∈ Subfix, (k=20, 24, 25), the following grammar rules must be satisfied:

1. When being spelled with b₂₅ ∈ Superfix, b₁ ∈ Root can be simultaneously spelled with b₂₄ ∈ Subfix; and when being spelled with b₂₈ ∈ Superfix, b₁ ∈ Root can be simultaneously spelled with b_k ∈ Subfix, (k=24, 25).

2. When being spelled with b₂₅ ∈ Superfix, b₃ ∈ Root can be simultaneously spelled with b₂₄ ∈ Subfix; and when being spelled with b₂₈ ∈ Superfix, b₃ ∈ Root can be simultaneously spelled with b_k ∈ Subfix, (k=24, 25).

3. When being spelled with b₂₈ ∈ Superfix, b₁₂ ∈ Root can be simultaneously spelled with b₂₅ ∈ Subfix.

4. When being spelled with b₂₈ ∈ Superfix, b₁₃ ∈ Root can be simultaneously spelled with b_k ∈ Subfix, (k=24, 25).

5. When being spelled with b₂₈ ∈ Superfix, b₁₅ ∈ Root can be simultaneously spelled with b_k ∈ Subfix, (k=24, 25).

6. When being spelled with b₂₅ ∈ Superfix, b₁₆ ∈ Root can be simultaneously spelled with b₂₄ ∈ Subfix; and when being spelled with b₂₈ ∈ Superfix, b₁₆ ∈ Root can be simultaneously spelled with b_k ∈ Subfix, (k=24, 25).

7. When being spelled with b₂₅ ∈ Superfix, b₁₇ ∈ Root can be simultaneously spelled with b₂₀ ∈ Subfix.

Tibetan spelling grammar 11: when the Tibetan roots corresponding to b_i ∈ Root, (i=1, 3, 4, 7, 8, 9, 11, 12, 17, 19) are simultaneously spelled with the prefixes corresponding to b₁₅ ∈ Prefix and the superfixes corresponding to b_j ∈ Superfix, (j=25, 26, 28), the following grammar rules must be satisfied:

1. b_i ∈ Root, (i=1, 3, 4, 7, 8, 9, 11, 12, 17, 19) can be spelled with b₂₅ ∈ Superfix.

2. b_i ∈ Root, (i=9, 11) can be spelled with b₂₆ ∈ Superfix.

3. b_i ∈ Root, (i=1, 3, 4, 8, 9, 11, 12, 17) can be spelled with b₂₈ ∈ Superfix.

Tibetan spelling grammar 12: when the Tibetan roots corresponding to b_i ∈ Root, (i=1, 2, 3, 11, 13, 14, 15, 16, 22, 25, 28) are simultaneously spelled with the prefixes corresponding to b_j ∈ Prefix, (j=11, 15, 16, 23) and the subfixes corresponding to b_k ∈ Subfix, (k=20, 24, 25, 26), the following grammar rules must be satisfied:

1. b_i ∈ Root, (i=1, 3, 13, 15, 16) can be spelled with b₁₁ ∈ Prefix and b₂₄ ∈ Subfix.

2. b_i ∈ Root, (i=1, 3, 13, 15) can be spelled with b₁₁ ∈ Prefix and b₂₅ ∈ Subfix.

3. b_i ∈ Root, (i=1, 3) can be spelled with b₁₅ ∈ Prefix and b₂₄ ∈ Subfix.

4. b_i ∈ Root, (i=1, 3, 28) can be spelled with b₁₅ ∈ Prefix and b₂₅ ∈ Subfix.

5. b_i ∈ Root, (i=1, 22, 25, 28) can be spelled with b₁₅ ∈ Prefix and b₂₆ ∈ Subfix.

6. b_i ∈ Root, (i=2, 3) can be spelled with b₁₆ ∈ Prefix and b_k ∈ Subfix, (k=24, 25).

7. b_i ∈ Root, (i=2, 3, 14, 15) can be spelled with b₂₃ ∈ Prefix and b₂₄ ∈ Subfix.

8. b_i ∈ Root, (i=2, 3, 11, 14, 15) can be spelled with b₂₃ ∈ Prefix and b₂₅ ∈ Subfix.

Tibetan spelling grammar 13: when the Tibetan roots corresponding to b_i ∈ Root, (i=1, 3) are spelled with the prefixes corresponding to b₁₅ ∈ Prefix, the superfixes corresponding to b_j ∈ Superfix, (j=25, 28) and the subfixes corresponding to b_k ∈ Subfix, (k=24, 25), the following grammar rules must be satisfied:

1. b_i ∈ Root, (i=1, 3) can be spelled with b₁₅ ∈ Prefix, b₂₅ ∈ Superfix and b₂₄ ∈ Subfix.

2. b_i ∈ Root, (i=1, 3) can be spelled with b₁₅ ∈ Prefix, b₂₈ ∈ Superfix and b₂₅ ∈ Subfix.

3. b_i ∈ Root, (i=1, 3) can be spelled with b₁₅ ∈ Prefix, b₂₈ ∈ Superfix and b₂₄ ∈ Subfix.

Tibetan spelling grammar 14: when being spelled with the prefixes corresponding to b_j ∈ Prefix, (j=3, 11, 15, 16, 23), the Tibetan roots corresponding to b_i ∈ Root, (i=1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 24, 27, 28) must be simultaneously spelled with the vowel symbols corresponding to v ∈ Vowel, Vowel={i, u, e, o}, or one suffix corresponding to b_k ∈ Suffix, (k=3, 4, 11, 12, 15, 16, 23, 25, 26, 28), and the following grammar rules must be satisfied:

1. b_i ∈ Root, (i=5, 8, 9, 11, 12, 17, 21, 22, 24, 27, 28) can only be spelled with b₃ ∈ Prefix.

2. b_i ∈ Root, (i=1, 3, 4, 13, 15, 16) can only be spelled with b₁₁ ∈ Prefix.

3. b_i ∈ Root, (i=1, 3, 5, 9, 11, 17, 21, 22, 27, 28) can only be spelled with b₁₅ ∈ Prefix.

4. b_i ∈ Root, (i=2, 3, 4, 6, 7, 8, 10, 11, 12, 18, 19) can only be spelled with b₁₆ ∈ Prefix.

5. b_i ∈ Root, (i=2, 3, 6, 7, 10, 11, 14, 15, 18, 19) can only be spelled with b₂₃ ∈ Prefix.

Tibetan spelling grammar 15: the Tibetan roots corresponding to b_j ∈ Root, (j=1, 2, 3, 4, 5, 6, 7, 8, 9, 10, . . . , 21, 22, 23, 24, 25, 26, 27, 28, 29, 30) can be spelled with any suffix corresponding to b_i ∈ Suffix, (i=3, 4, 11, 12, 15, 16, 23, 25, 26, 28).

Tibetan spelling grammar 16: the use of the Tibetan postfixes is only related to the suffixes. The Tibetan suffixes corresponding to b_i ∈ Suffix, (i=3, 4, 12, 15, 16, 25, 26) can be spelled with the postfixes corresponding to b_j ∈ Postfix, (j=11, 28), and the following grammar rules must be satisfied:

1. b₁₁ ∈ Postfix can only be spelled with b_i ∈ Suffix, (i=12, 25, 26).

2. b₂₈ ∈ Postfix can only be spelled with b_i ∈ Suffix, (i=3, 4, 15, 16).

Tibetan spelling grammar 17: when being spelled with the Tibetan subfixes corresponding to b_j ∈ Subfix, (j=24, 25), the Tibetan roots corresponding to b_i ∈ Root, (i=3, 11, 14) can be simultaneously spelled with the Tibetan subfixes corresponding to b₂₀ ∈ Subfix. The specific rules are as follows:

1. When being spelled with b₂₅ ∈ Subfix, b_i ∈ Root, (i=3, 11) can be simultaneously spelled with b₂₀ ∈ Subfix.

2. When being spelled with b₂₄ ∈ Subfix, b₁₄ ∈ Root can be simultaneously spelled with b₂₀ ∈ Subfix.

Tibetan spelling grammar 18: the Tibetan consonants corresponding to b₂₉ ∈ Root can be spelled with the Tibetan consonants corresponding to b₁₄ ∈ Root, and b₁₄ ∈ Root is correspondingly located below b₂₉ ∈ Root.

Tibetan spelling grammar 19: when being spelled with the Tibetan consonants corresponding to b₁₄ ∈ Root, the Tibetan consonants corresponding to b₂₉ ∈ Root can be simultaneously spelled with the Tibetan suffixes corresponding to b_i ∈ Suffix, (i=3, 4, 11, 12, 15, 16, 23, 25, 26, 28).

Tibetan spelling grammar 20: the Tibetan characters having no suffix can be spelled with the Tibetan consonants corresponding to b₂₃ ∈ Root, and at this time, the Tibetan consonants corresponding to b₂₃ ∈ Root must be spelled with the vowel symbols (i, e, u, o) corresponding to v ∈ Vowel, Vowel={i, u, e, o}.

Tibetan spelling grammar 21: besides the special spelling in the grammars 17, 18, 19 and 20, the Tibetan characters are spelled according to the sequence of the prefixes, the superfixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes.
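The set-theoretic description above lends itself to a direct encoding. The following is an illustrative sketch only: each b_j is represented by its integer index j, the constant and function names are hypothetical, and only grammars 1 to 6, 8, 16 and 21 are encoded.

```python
# Grammars 1-6: constituent sets, each b_j represented by its index j.
ROOT = set(range(1, 36))
PREFIX = {3, 11, 15, 16, 23}
SUFFIX = {3, 4, 11, 12, 15, 16, 23, 25, 26, 28}
POSTFIX = {11, 28}
SUPERFIX = {25, 26, 28}
SUBFIX = {20, 24, 25, 26}

# Subset relations stated in grammars 2-6.
assert POSTFIX <= SUFFIX <= ROOT
assert PREFIX <= ROOT and SUPERFIX <= ROOT and SUBFIX <= ROOT

# Grammar 8: superfix index -> root indices it may be spelled with.
SUPERFIX_ROOTS = {
    25: {1, 3, 4, 7, 8, 9, 11, 12, 15, 16, 17, 19},
    26: {1, 3, 4, 5, 7, 9, 11, 13, 15, 29},
    28: {1, 3, 4, 8, 9, 11, 12, 13, 15, 16, 17},
}

# Grammar 16: postfix index -> suffix indices it may follow.
POSTFIX_SUFFIXES = {
    11: {12, 25, 26},
    28: {3, 4, 15, 16},
}

# Grammar 21: spelling sequence of the constituents.
SLOT_ORDER = ("prefix", "superfix", "root", "subfix", "vowel", "suffix", "postfix")


def superfix_allowed(superfix, root):
    """Check a superfix/root pair against grammar 8."""
    return root in SUPERFIX_ROOTS.get(superfix, set())


def postfix_allowed(suffix, postfix):
    """Check a suffix/postfix pair against grammar 16."""
    return suffix in POSTFIX_SUFFIXES.get(postfix, set())


print(superfix_allowed(25, 1), superfix_allowed(26, 8))
print(postfix_allowed(12, 11), postfix_allowed(12, 28))
```

A table of this kind makes it straightforward to check that the root indices listed in the header of a grammar equal the union of the indices in its numbered rules.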

In the embodiment, T_i represents the finite set of the terminal symbols of the Tibetan spelling formal grammar G_i; S_i represents the start symbol of the Tibetan spelling formal grammar G_i, and S_i ∈ V_i; ε represents a null character; the finite set Σ_i of the input characters of the finite state automaton M_i is equivalent to the finite set T_i of the terminal symbols of the Tibetan spelling formal grammar G_i; and the initial state q_i of the finite state automaton M_i is equivalent to the start symbol S_i of the Tibetan spelling formal grammar G_i. Here, S_i stands for any possible sentence (a Tibetan character in the present application) in the language L(G_i) generated by the grammar G_i, so S_i is a special non-terminal symbol.

Specifically, the specific forms of the 24 Tibetan spelling formal grammars G₁ to G₂₄ are as follows:

Tibetan spelling formal grammar G₁: the spelling formal grammar G₁ of the Tibetan roots and the vowel symbols is a quadruple (T₁, V₁, S₁, P₁), wherein:

(1) terminal symbol

T₁=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₅, . . . , b₃₅}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o, a}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁={S₁, B_1,1, B_1,2};

(3) S₁ is a non-terminal symbol in V₁ and is the start symbol; and

(4) the production set of the grammar G₁ is: P₁={

S₁→b₁|b₂|b₃|b₄|b₅| . . . |b₃₀|b₃₁|b₃₂|b₃₃|b₃₄|b₃₅,

S₁→b₁B_1,1|b₂B_1,1|b₃B_1,1|b₄B_1,1|b₅B_1,1| . . . |b₃₀B_1,1,

S₁→b₃₁B_1,2|b₃₂B_1,2|b₃₃B_1,2|b₃₄B_1,2|b₃₅B_1,2,

B_1,1→i|u|e|o|a,

B_1,2→i|u|e|o}
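As an illustration of how a recognizer for G₁ can also return the constituents, the following sketch accepts a root alone or a root followed by one vowel, following P₁: roots b₁ to b₃₀ lead to B_1,1 (any of i, u, e, o, a), and roots b₃₁ to b₃₅ lead to B_1,2 (i, u, e, o only). Each b_j is represented by its integer index and each vowel by its letter; the function name `parse_g1` and the returned dictionary layout are assumptions made for this example.

```python
VOWELS_B11 = {"i", "u", "e", "o", "a"}   # right-hand sides of B_1,1
VOWELS_B12 = {"i", "u", "e", "o"}        # right-hand sides of B_1,2


def parse_g1(symbols):
    """Return the constituents if `symbols` is in L(G1), otherwise None."""
    if len(symbols) not in (1, 2):
        return None
    root = symbols[0]
    if not (isinstance(root, int) and 1 <= root <= 35):
        return None
    if len(symbols) == 1:                 # S1 -> b_j
        return {"root": root, "vowel": None}
    vowel = symbols[1]
    allowed = VOWELS_B11 if root <= 30 else VOWELS_B12
    if vowel not in allowed:              # S1 -> b_j B_1,1 or b_j B_1,2
        return None
    return {"root": root, "vowel": vowel}


print(parse_g1([30, "a"]))
print(parse_g1([31, "a"]))
```

The second call is rejected because P₁ offers no production that places the long vowel a after b₃₁.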

With respect to a Tibetan spelling structure 2:

Tibetan spelling formal grammar G₂: the spelling formal grammar G₂ of the Tibetan superfixes, the roots and the vowels is a quadruple (T₂, V₂, S₂, P₂), wherein:

(1) terminal symbol

T₂=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₅, b₇, b₈, b₉, b₁₁, b₁₂, b₁₃, b₁₅, b₁₆, b₁₇, b₁₉, b₂₅, b₂₆, b₂₈, b₂₉}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₂={S₂, B_2,1, B_2,2, B_2,3, B_2,4};

(3) S₂ is a non-terminal symbol in V₂ and is the start symbol; and

(4) the production set of the grammar G₂ is: P₂={

S₂→b₂₅B_2,1|b₂₆B_2,2|b₂₈B_2,3,

B_2,1→b₁|b₃|b₄|b₇|b₈|b₉|b₁₁|b₁₂|b₁₅|b₁₆|b₁₇|b₁₉,

B_2,1→b₁B_2,4|b₃B_2,4|b₄B_2,4|b₇B_2,4|b₈B_2,4|b₉B_2,4|b₁₁B_2,4|b₁₂B_2,4|b₁₅B_2,4|b₁₆B_2,4|b₁₇B_2,4|b₁₉B_2,4,

B_2,2→b₁|b₃|b₄|b₅|b₇|b₉|b₁₁|b₁₃|b₁₅|b₂₉,

B_2,2→b₁B_2,4|b₃B_2,4|b₄B_2,4|b₅B_2,4|b₇B_2,4|b₉B_2,4|b₁₁B_2,4|b₁₃B_2,4|b₁₅B_2,4|b₂₉B_2,4,

B_2,3→b₁|b₃|b₄|b₈|b₉|b₁₁|b₁₂|b₁₃|b₁₅|b₁₆|b₁₇,

B_2,3→b₁B_2,4|b₃B_2,4|b₄B_2,4|b₈B_2,4|b₉B_2,4|b₁₁B_2,4|b₁₂B_2,4|b₁₃B_2,4|b₁₅B_2,4|b₁₆B_2,4|b₁₇B_2,4,

B_2,4→i|u|e|o}
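Because P₂ contains no recursion, the language L(G₂) is finite and can be enumerated directly from the productions. The sketch below is illustrative only: it encodes P₂ as a dictionary (b_j as the integer j, vowels as letters) and derives every terminal string from S₂. The helper names `with_optional_vowel` and `derive` are hypothetical.

```python
def with_optional_vowel(roots, vowel_state):
    """Expand a root list into productions B -> b_j and B -> b_j <vowel_state>."""
    return [(r,) for r in roots] + [(r, vowel_state) for r in roots]


P2 = {
    "S2": [(25, "B2,1"), (26, "B2,2"), (28, "B2,3")],
    "B2,1": with_optional_vowel([1, 3, 4, 7, 8, 9, 11, 12, 15, 16, 17, 19], "B2,4"),
    "B2,2": with_optional_vowel([1, 3, 4, 5, 7, 9, 11, 13, 15, 29], "B2,4"),
    "B2,3": with_optional_vowel([1, 3, 4, 8, 9, 11, 12, 13, 15, 16, 17], "B2,4"),
    "B2,4": [("i",), ("u",), ("e",), ("o",)],
}


def derive(productions, symbol):
    """Yield every terminal string derivable from `symbol` (finite language)."""
    for body in productions[symbol]:
        if len(body) == 1:                # A -> a
            yield (body[0],)
        else:                             # A -> aB
            for tail in derive(productions, body[1]):
                yield (body[0],) + tail


language = set(derive(P2, "S2"))
print(len(language))
print((25, 1, "i") in language, (26, 8) in language)
```

Enumerating the language in this way gives a simple reference set against which an automaton built from the same grammar can be tested.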

With respect to a Tibetan spelling structure 3:

Tibetan spelling formal grammar G₃: the spelling formal grammar G₃ of the Tibetan roots, the subfixes and the vowel symbols is a quadruple (T₃, V₃, S₃, P₃), wherein:

(1) terminal symbol

T₃=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₈, b₉, b₁₀, b₁₁, b₁₃, b₁₄, b₁₅, b₁₆, b₁₈, b₂₀, b₂₁, b₂₂, b₂₄, b₂₅, b₂₆, b₂₇, b₂₈, b₂₉}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₃={S₃, B_3,1, B_3,2, B_3,3, B_3,4, B_3,5, B_3,6, B_3,7, B_3,8, B_3,9, B_3,10};

(3) S₃ is a non-terminal symbol in V₃ and is the start symbol; and

(4) the production set of the grammar G₃ is: P₃={

S₃→b₁B_3,1|b₃B_3,1,

S₃→b₂B_3,2,

S₃→b₁₁B_3,3|b₂₉B_3,3,

S₃→b₈B_3,4|b₁₈B_3,4|b₂₁B_3,4|b₂₆B_3,4|b₂₇B_3,4,

S₃→b₉B_3,5|b₁₀B_3,5,

S₃→b₁₃B_3,6|b₁₄B_3,6|b₁₆B_3,6,

S₃→b₂₂B_3,7|b₂₅B_3,7,

S₃→b₂₈B_3,8,

S₃→b₁₅B_3,9,

B_3,1→b₂₀|b₂₄|b₂₅|b₂₆,

B_3,1→b₂₀B_3,10|b₂₄B_3,10|b₂₅B_3,10|b₂₆B_3,10,

B_3,2→b₂₀|b₂₄|b₂₅,

B_3,2→b₂₀B_3,10|b₂₄B_3,10|b₂₅B_3,10,

B_3,3→b₂₀|b₂₅,

B_3,3→b₂₀B_3,10|b₂₅B_3,10,

B_3,4→b₂₀,

B_3,4→b₂₀B_3,10,

B_3,5→b₂₅,

B_3,5→b₂₅B_3,10,

B_3,6→b₂₄|b₂₅,

B_3,6→b₂₄B_3,10|b₂₅B_3,10,

B_3,7→b₂₀|b₂₆,

B_3,7→b₂₀B_3,10|b₂₆B_3,10,

B_3,8→b₂₅|b₂₆,

B_3,8→b₂₅B_3,10|b₂₆B_3,10,

B_3,9→b₂₄|b₂₅|b₂₆,

B_3,9→b₂₄B_3,10|b₂₅B_3,10|b₂₆B_3,10,

B_3,10→i|u|e|o}

With respect to a Tibetan spelling structure 4:

Tibetan spelling formal grammar G₄: the spelling formal grammar G₄ of the superfixes, the Tibetan roots, the subfixes and the vowel symbols is a quadruple (T₄, V₄, S₄, P₄), wherein:

(1) terminal symbol

T₄=T_B∪T_o, wherein T_B={b₁, b₃, b₁₂, b₁₃, b₁₅, b₁₆, b₁₇, b₂₀, b₂₄, b₂₅, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₄={S₄, B_4,1, B_4,2, B_4,3, B_4,4, B_4,5, B_4,6, B_4,7};

(3) S₄ is a non-terminal symbol in V₄ and is the start symbol; and

(4) the production set of the grammar G₄ is: P₄={

S₄→b₂₅B_4,1,

S₄→b₂₈B_4,2,

B_4,1→b₁B_4,3|b₃B_4,3|b₁₆B_4,3,

B_4,1→b₁₇B_4,4,

B_4,2→b₁B_4,5|b₃B_4,5|b₁₃B_4,5|b₁₅B_4,5|b₁₆B_4,5,

B_4,2→b₁₂B_4,6,

B_4,3→b₂₄,

B_4,3→b₂₄B_4,7,

B_4,4→b₂₀,

B_4,4→b₂₀B_4,7,

B_4,5→b₂₄|b₂₅,

B_4,5→b₂₄B_4,7|b₂₅B_4,7,

B_4,6→b₂₅,

B_4,6→b₂₅B_4,7,

B_4,7→i|u|e|o}

With respect to a Tibetan spelling structure 5:

Tibetan spelling formal grammar G₅: the spelling formal grammar G₅ of the Tibetan prefixes, the superfixes, the roots and the vowel symbols is a quadruple (T₅, V₅, S₅, P₅), wherein:

(1) terminal symbol

T₅=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₇, b₈, b₉, b₁₁, b₁₂, b₁₅, b₁₇, b₁₉, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₅={S₅, B_5,1, B_5,2, B_5,3, B_5,4, B_5,5};

(3) S₅ is a non-terminal symbol in V₅ and is the start symbol; and

(4) the production set of the grammar G₅ is: P₅={

S₅→b₁₅B_5,1,

B_5,1→b₂₈B_5,2,

B_5,1→b₂₆B_5,3,

B_5,1→b₂₅B_5,4,

B_5,2→b₁|b₃|b₄|b₈|b₉|b₁₁|b₁₂|b₁₇,

B_5,2→b₁B_5,5|b₃B_5,5|b₄B_5,5|b₈B_5,5|b₉B_5,5|b₁₁B_5,5|b₁₂B_5,5|b₁₇B_5,5,

B_5,3→b₉|b₁₁,

B_5,3→b₉B_5,5|b₁₁B_5,5,

B_5,4→b₁|b₃|b₄|b₇|b₈|b₉|b₁₁|b₁₂|b₁₇|b₁₉,

B_5,4→b₁B_5,5|b₃B_5,5|b₄B_5,5|b₇B_5,5|b₈B_5,5|b₉B_5,5|b₁₁B_5,5|b₁₂B_5,5|b₁₇B_5,5|b₁₉B_5,5,

B_5,5→i|u|e|o}

With respect to a Tibetan spelling structure 6:

Tibetan spelling formal grammar G₆: the spelling formal grammar G₆ of the Tibetan prefixes, the roots, the subfixes and the vowel symbols is a quadruple (T₆, V₆, S₆, P₆), wherein:

(1) terminal symbol

T₆=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₁₁, b₁₃, b₁₄, b₁₅, b₁₆, b₂₂, b₂₃, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₆={S₆, B_6,1, B_6,2, B_6,3, B_6,4, B_6,5, B_6,6, B_6,7, B_6,8, B_6,9, B_6,10, B_6,11};

(3) S₆ is a non-terminal symbol in V₆ and is the start symbol; and

(4) the production set of the grammar G₆ is: P₆={

S₆→b₁₁B_6,1|b₁₅B_6,2|b₁₆B_6,3|b₂₃B_6,4,

B_6,1→b₁₆B_6,5,

B_6,1→b₁B_6,9|b₃B_6,9|b₁₃B_6,9|b₁₅B_6,9,

B_6,2→b₁B_6,6,

B_6,2→b₂₂B_6,7|b₂₅B_6,7,

B_6,2→b₂₈B_6,8,

B_6,2→b₃B_6,9,

B_6,3→b₂B_6,9|b₃B_6,9,

B_6,4→b₂B_6,9|b₃B_6,9|b₁₄B_6,9|b₁₅B_6,9,

B_6,4→b₁₁B_6,10,

B_6,5→b₂₄,

B_6,5→b₂₄B_6,11,

B_6,6→b₂₄|b₂₅|b₂₆,

B_6,6→b₂₄B_6,11|b₂₅B_6,11|b₂₆B_6,11,

B_6,7→b₂₆,

B_6,7→b₂₆B_6,11,

B_6,8→b₂₅|b₂₆,

B_6,8→b₂₅B_6,11|b₂₆B_6,11,

B_6,9→b₂₄|b₂₅,

B_6,9→b₂₄B_6,11|b₂₅B_6,11,

B_6,10→b₂₅,

B_6,10→b₂₅B_6,11,

B_6,11→i|u|e|o}

With respect to a Tibetan spelling structure 7:

Tibetan spelling formal grammar G₇: the spelling formal grammar G₇ of the Tibetan prefixes, the superfixes, the roots, the subfixes and the vowel symbols is a quadruple (T₇, V₇, S₇, P₇), wherein:

(1) terminal symbol

T₇=T_B∪T_o, wherein:

T_B={b₁, b₃, b₁₅, b₂₄, b₂₅, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₇={S₇, B_7,1, B_7,2, B_7,3, B_7,4, B_7,5, B_7,6};

(3) S₇ is a non-terminal symbol in V₇ and is the start symbol; and

(4) the production set of the grammar G₇ is: P₇={

S₇→b₁₅B_7,1,

B_7,1→b₂₈B_7,2,

B_7,1→b₂₅B_7,3,

B_7,2→b₁B_7,4|b₃B_7,4,

B_7,3→b₁B_7,5|b₃B_7,5,

B_7,4→b₂₄|b₂₅,

B_7,4→b₂₄B_7,6|b₂₅B_7,6,

B_7,5→b₂₄,

B_7,5→b₂₄B_7,6,

B_7,6→i|u|e|o}

With respect to a Tibetan spelling structure 8:

Tibetan spelling formal grammar G₈: the spelling formal grammar G₈ of the Tibetan prefixes, the roots and the vowel symbols is a quadruple (T₈, V₈, S₈, P₈), wherein:

(1) terminal symbol

T₈=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₅, b₆, b₇, b₈, b₉, b₁₀, b₁₁, b₁₂, b₁₃, b₁₄, b₁₅, b₁₆, b₁₇, b₁₈, b₁₉, b₂₁, b₂₂, b₂₃, b₂₄, b₂₇, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₈={S₈, B_8,1, B_8,2, B_8,3, B_8,4, B_8,5, B_8,6};

(3) S₈ is a non-terminal symbol in V₈ and is the start symbol; and

(4) the production set of the grammar G₈ is: P₈={

S₈→b₃B_8,1|b₁₁B_8,2|b₁₅B_8,3|b₁₆B_8,4|b₂₃B_8,5,

B_8,1→b₅B_8,6|b₈B_8,6|b₉B_8,6|b₁₁B_8,6|b₁₂B_8,6|b₁₇B_8,6|b₂₁B_8,6|b₂₂B_8,6|b₂₄B_8,6|b₂₇B_8,6|b₂₈B_8,6,

B_8,2→b₁B_8,6|b₃B_8,6|b₄B_8,6|b₁₃B_8,6|b₁₅B_8,6|b₁₆B_8,6,

B_8,3→b₁B_8,6|b₃B_8,6|b₅B_8,6|b₉B_8,6|b₁₁B_8,6|b₁₇B_8,6|b₂₁B_8,6|b₂₂B_8,6|b₂₇B_8,6|b₂₈B_8,6,

B_8,4→b₂B_8,6|b₃B_8,6|b₄B_8,6|b₆B_8,6|b₇B_8,6|b₈B_8,6|b₁₀B_8,6|b₁₁B_8,6|b₁₂B_8,6|b₁₈B_8,6|b₁₉B_8,6,

B_8,5→b₂B_8,6|b₃B_8,6|b₆B_8,6|b₇B_8,6|b₁₀B_8,6|b₁₁B_8,6|b₁₄B_8,6|b₁₅B_8,6|b₁₈B_8,6|b₁₉B_8,6,

B_8,6→i|u|e|o}

With respect to a Tibetan spelling structure 9:

Tibetan spelling formal grammar G₉: the spelling formal grammar G₉ of the Tibetan prefixes, the roots, the vowel characters and the suffixes is a quadruple (T₉, V₉, S₉, P₉), wherein:

(1) terminal symbol

T₉=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₅, b₆, b₇, b₈, b₉, b₁₀, b₁₁, b₁₂, b₁₃, b₁₄, b₁₅, b₁₆, b₁₇, b₁₈, b₁₉, b₂₁, b₂₂, b₂₃, b₂₄, b₂₅, b₂₆, b₂₇, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₉={S₉, B_9,1, B_9,2, B_9,3, B_9,4, B_9,5, B_9,6, B_9,7};

(3) S₉ is a non-terminal symbol in V₉ and is the start symbol; and

(4) the production set of the grammar G₉ is: P₉={

S₉→b₃B_9,1|b₁₁B_9,2|b₁₅B_9,3|b₁₆B_9,4|b₂₃B_9,5,

B_9,1→b₅B_9,7|b₈B_9,7|b₉B_9,7|b₁₁B_9,7|b₁₂B_9,7|b₁₇B_9,7|b₂₁B_9,7|b₂₂B_9,7|b₂₄B_9,7|b₂₇B_9,7|b₂₈B_9,7,

B_9,1→b₅B_9,6|b₈B_9,6|b₉B_9,6|b₁₁B_9,6|b₁₂B_9,6|b₁₇B_9,6|b₂₁B_9,6|b₂₂B_9,6|b₂₄B_9,6|b₂₇B_9,6|b₂₈B_9,6,

B_9,2→b₁B_9,7|b₃B_9,7|b₄B_9,7|b₁₃B_9,7|b₁₅B_9,7|b₁₆B_9,7,

B_9,2→b₁B_9,6|b₃B_9,6|b₄B_9,6|b₁₃B_9,6|b₁₅B_9,6|b₁₆B_9,6,

B_9,3→b₁B_9,7|b₃B_9,7|b₅B_9,7|b₉B_9,7|b₁₁B_9,7|b₁₇B_9,7|b₂₁B_9,7|b₂₂B_9,7|b₂₇B_9,7|b₂₈B_9,7,

B_9,3→b₁B_9,6|b₃B_9,6|b₅B_9,6|b₉B_9,6|b₁₁B_9,6|b₁₇B_9,6|b₂₁B_9,6|b₂₂B_9,6|b₂₇B_9,6|b₂₈B_9,6,

B_9,4→b₂B_9,7|b₃B_9,7|b₄B_9,7|b₆B_9,7|b₇B_9,7|b₈B_9,7|b₁₀B_9,7|b₁₁B_9,7|b₁₂B_9,7|b₁₈B_9,7|b₁₉B_9,7,

B_9,4→b₂B_9,6|b₃B_9,6|b₄B_9,6|b₆B_9,6|b₇B_9,6|b₈B_9,6|b₁₀B_9,6|b₁₁B_9,6|b₁₂B_9,6|b₁₈B_9,6|b₁₉B_9,6,

B_9,5→b₂B_9,7|b₃B_9,7|b₆B_9,7|b₇B_9,7|b₁₀B_9,7|b₁₁B_9,7|b₁₄B_9,7|b₁₅B_9,7|b₁₈B_9,7|b₁₉B_9,7,

B_9,5→b₂B_9,6|b₃B_9,6|b₆B_9,6|b₇B_9,6|b₁₀B_9,6|b₁₁B_9,6|b₁₄B_9,6|b₁₅B_9,6|b₁₈B_9,6|b₁₉B_9,6,

B_9,6→iB_9,7|uB_9,7|eB_9,7|oB_9,7,

B_9,7→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 10:

Tibetan spelling formal grammar G₁₀: the spelling formal grammar G₁₀ of the Tibetan prefixes, the superfixes, the roots, the vowel symbols and the suffixes is a quadruple (T₁₀, V₁₀, S₁₀, P₁₀), wherein:

(1) terminal symbol

T₁₀=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₇, b₉, b₁₁, b₁₂, b₁₅, b₁₆, b₁₇, b₁₉, b₂₃, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₀={S₁₀, B_10,1, B_10,2, B_10,3, B_10,4, B_10,5, B_10,6};

(3) S₁₀ is a non-terminal symbol in V₁₀ and is the start symbol; and

(4) the production set of the grammar G₁₀ is: P₁₀={

S₁₀→b₁₅B_10,1,

B_10,1→b₂₈B_10,2|b₂₆B_10,3|b₂₅B_10,4,

B_10,2→b₁B_10,6|b₃B_10,6|b₄B_10,6|b₈B_10,6|b₉B_10,6|b₁₁B_10,6|b₁₂B_10,6|b₁₇B_10,6,

B_10,2→b₁B_10,5|b₃B_10,5|b₄B_10,5|b₈B_10,5|b₉B_10,5|b₁₁B_10,5|b₁₂B_10,5|b₁₇B_10,5,

B_10,3→b₉B_10,6|b₁₁B_10,6,

B_10,3→b₉B_10,5|b₁₁B_10,5,

B_10,4→b₁B_10,6|b₃B_10,6|b₄B_10,6|b₇B_10,6|b₈B_10,6|b₉B_10,6|b₁₁B_10,6|b₁₂B_10,6|b₁₇B_10,6|b₁₉B_10,6,

B_10,4→b₁B_10,5|b₃B_10,5|b₄B_10,5|b₇B_10,5|b₈B_10,5|b₉B_10,5|b₁₁B_10,5|b₁₂B_10,5|b₁₇B_10,5|b₁₉B_10,5,

B_10,5→iB_10,6|uB_10,6|eB_10,6|oB_10,6,

B_10,6→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 11:

Tibetan spelling formal grammar G₁₁: the spelling formal grammar G₁₁ of the Tibetan prefixes, the roots, the subfixes, the vowel symbols and the suffixes is a quadruple (T₁₁, V₁₁, S₁₁, P₁₁), wherein:

(1) terminal symbol

T₁₁=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₁₁, b₁₂, b₁₃, b₁₄, b₁₅, b₁₆, b₂₂, b₂₃, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₁={S₁₁, B_11,1, B_11,2, B_11,3, B_11,4, B_11,5, B_11,6, B_11,7, B_11,8, B_11,9, B_11,10, B_11,11, B_11,12};

(3) S₁₁ is a non-terminal symbol in V₁₁ and is the start symbol; and

(4) the production set of the grammar G₁₁ is: P₁₁={

S₁₁→b₁₁B_11,1|b₁₅B_11,2|b₁₆B_11,3|b₂₃B_11,4,

B_11,1→b₁₆B_11,5,

B_11,1→b₁B_11,9|b₃B_11,9|b₁₃B_11,9|b₁₅B_11,9,

B_11,2→b₁B_11,6,

B_11,2→b₂₂B_11,7|b₂₅B_11,7,

B_11,2→b₂₈B_11,8,

B_11,2→b₃B_11,9,

B_11,3→b₂B_11,9|b₃B_11,9,

B_11,4→b₂B_11,9|b₃B_11,9|b₁₄B_11,9|b₁₅B_11,9,

B_11,4→b₁₁B_11,10,

B_11,5→b₂₄B_11,12,

B_11,5→b₂₄B_11,11,

B_11,6→b₂₄B_11,12|b₂₅B_11,12|b₂₆B_11,12,

B_11,6→b₂₄B_11,11|b₂₅B_11,11|b₂₆B_11,11,

B_11,7→b₂₆B_11,12,

B_11,7→b₂₆B_11,11,

B_11,8→b₂₅B_11,12|b₂₆B_11,12,

B_11,8→b₂₅B_11,11|b₂₆B_11,11,

B_11,9→b₂₄B_11,12|b₂₅B_11,12,

B_11,9→b₂₄B_11,11|b₂₅B_11,11,

B_11,10→b₂₅B_11,12,

B_11,10→b₂₅B_11,11,

B_11,11→iB_11,12|uB_11,12|eB_11,12|oB_11,12,

B_11,12→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 12:

Tibetan spelling formal grammar G₁₂: the spelling formal grammar G₁₂ of the Tibetan prefixes, the superfixes, the roots, the subfixes, the vowel symbols and the suffixes is a quadruple (T₁₂, V₁₂, S₁₂, P₁₂), wherein:

(1) terminal symbol

T₁₂=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₁₁, b₁₂, b₁₅, b₁₆, b₂₃, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₂={S₁₂, B_12,1, B_12,2, B_12,3, B_12,4, B_12,5, B_12,6, B_12,7};

(3) S₁₂ is a non-terminal symbol in V₁₂ and is the start symbol; and

(4) the production set of the grammar G₁₂ is: P₁₂={

S₁₂→b₁₅B_12,1,

B_12,1→b₂₈B_12,2,

B_12,1→b₂₅B_12,3,

B_12,2→b₁B_12,4|b₃B_12,4,

B_12,3→b₁B_12,5|b₃B_12,5,

B_12,4→b₂₄B_12,7|b₂₅B_12,7,

B_12,4→b₂₄B_12,6|b₂₅B_12,6,

B_12,5→b₂₄B_12,7,

B_12,5→b₂₄B_12,6,

B_12,6→iB_12,7|uB_12,7|eB_12,7|oB_12,7,

B_12,7→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 13:

Tibetan spelling formal grammar G₁₃: the spelling formal grammar G₁₃ of the Tibetan prefixes, the roots, the vowel symbols, the suffixes and the postfixes is a quadruple (T₁₃, V₁₃, S₁₃, P₁₃), wherein:

(1) terminal symbol

T₁₃=T_B∪T_o, wherein:

(2) non-terminal symbol set

V₁₃={S₁₃, B_13,1, B_13,2, B_13,3, B_13,4, B_13,5, B_13,6, B_13,7, B_13,8, B_13,9};

(3) S₁₃ is a non-terminal symbol in V₁₃ and is the start symbol; and

(4) the production set of the grammar G₁₃ is: P₁₃={

S₁₃→b₃B_13,1|b₁₁B_13,2|b₁₅B_13,3|b₁₆B_13,4|b₂₃B_13,5,

B_13,1→b₅B_13,6|b₈B_13,6|b₉B_13,6|b₁₁B_13,6|b₁₂B_13,6|b₁₇B_13,6|b₂₁B_13,6|b₂₂B_13,6|b₂₄B_13,6|b₂₇B_13,6|b₂₈B_13,6,

B_13,2→b₁B_13,6|b₃B_13,6|b₄B_13,6|b₁₃B_13,6|b₁₅B_13,6|b₁₆B_13,6,

B_13,3→b₁B_13,6|b₃B_13,6|b₅B_13,6|b₉B_13,6|b₁₁B_13,6|b₁₇B_13,6|b₂₁B_13,6|b₂₂B_13,6|b₂₇B_13,6|b₂₈B_13,6,

B_13,4→b₂B_13,6|b₃B_13,6|b₄B_13,6|b₆B_13,6|b₇B_13,6|b₈B_13,6|b₁₀B_13,6|b₁₁B_13,6|b₁₂B_13,6|b₁₈B_13,6|b₁₉B_13,6,

B_13,5→b₂B_13,6|b₃B_13,6|b₆B_13,6|b₇B_13,6|b₁₀B_13,6|b₁₁B_13,6|b₁₄B_13,6|b₁₅B_13,6|b₁₈B_13,6|b₁₉B_13,6,

B_13,6→iB_13,7|uB_13,7|eB_13,7|oB_13,7,

B_13,6→b₃B_13,8|b₄B_13,8|b₁₅B_13,8|b₁₆B_13,8,

B_13,6→b₁₂B_13,9|b₂₅B_13,9|b₂₆B_13,9,

B_13,7→b₃B_13,8|b₄B_13,8|b₁₅B_13,8|b₁₆B_13,8,

B_13,7→b₁₂B_13,9|b₂₅B_13,9|b₂₆B_13,9,

B_13,8→b₂₈,

B_13,9→b₁₁}

With respect to a Tibetan spelling structure 14:

Tibetan spelling formal grammar G₁₄: the spelling formal grammar G₁₄ of the Tibetan prefixes, the superfixes, the roots, the vowel symbols, the suffixes and the postfixes is a quadruple (T₁₄, V₁₄, S₁₄, P₁₄), wherein:

(1) terminal symbol

T₁₄=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₁₁, b₁₂, b₁₃, b₁₅, b₁₆, b₁₇, b₂₀, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₄={S₁₄, B_14,1, B_14,2, B_14,3, B_14,4, B_14,5, B_14,6, B_14,7, B_14,8};

(3) S₁₄ is a non-terminal symbol in V₁₄ and is the start symbol; and

(4) the production set of the grammar G₁₄ is: P₁₄={

S₁₄→b₁₅B_14,1,

B_14,1→b₂₈B_14,2|b₂₆B_14,3|b₂₅B_14,4,

B_14,2→b₁B_14,5|b₃B_14,5|b₄B_14,5|b₈B_14,5|b₉B_14,5|b₁₁B_14,5|b₁₂B_14,5|b₁₇B_14,5,

B_14,3→b₉B_14,5|b₁₁B_14,5,

B_14,4→b₁B_14,5|b₃B_14,5|b₄B_14,5|b₇B_14,5|b₈B_14,5|b₉B_14,5|b₁₁B_14,5|b₁₂B_14,5|b₁₇B_14,5|b₁₉B_14,5,

B_14,5→iB_14,6|uB_14,6|eB_14,6|oB_14,6,

B_14,5→b₃B_14,7|b₄B_14,7|b₁₅B_14,7|b₁₆B_14,7,

B_14,5→b₁₂B_14,8|b₂₅B_14,8|b₂₆B_14,8,

B_14,6→b₃B_14,7|b₄B_14,7|b₁₅B_14,7|b₁₆B_14,7,

B_14,6→b₁₂B_14,8|b₂₅B_14,8|b₂₆B_14,8,

B_14,7→b₂₈,

B_14,8→b₁₁}

With respect to a Tibetan spelling structure 15:

Tibetan spelling formal grammar G₁₅: the spelling formal grammar G₁₅ of the Tibetan prefixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes is a quadruple (T₁₅, V₁₅, S₁₅, P₁₅), wherein:

(1) terminal symbol

T₁₅=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₁₁, b₁₂, b₁₃, b₁₄, b₁₅, b₁₆, b₂₂, b₂₃, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₅={S₁₅, B_15,1, B_15,2, B_15,3, B_15,4, B_15,5, B_15,6, B_15,7, B_15,8, B_15,9, B_15,10, B_15,11, B_15,12, B_15,13, B_15,14};

(3) S₁₅ is a non-terminal symbol in V₁₅ and is the start symbol; and

(4) the production set of the grammar G₁₅ is: P₁₅={

S₁₅→b₁₁B_15,1|b₁₅B_15,2|b₁₆B_15,3|b₂₃B_15,4,

B_15,1→b₁₆B_15,5,

B_15,1→b₁B_15,9|b₃B_15,9|b₁₃B_15,9|b₁₅B_15,9,

B_15,2→b₁B_15,6,

B_15,2→b₂₂B_15,7|b₂₅B_15,7,

B_15,2→b₂₈B_15,8,

B_15,2→b₃B_15,9,

B_15,3→b₂B_15,9|b₃B_15,9,

B_15,4→b₂B_15,9|b₃B_15,9|b₁₄B_15,9|b₁₅B_15,9,

B_15,4→b₁₁B_15,10,

B_15,5→b₂₄B_15,11,

B_15,6→b₂₄B_15,11|b₂₅B_15,11|b₂₆B_15,11,

B_15,7→b₂₆B_15,11,

B_15,8→b₂₅B_15,11|b₂₆B_15,11,

B_15,9→b₂₄B_15,11|b₂₅B_15,11,

B_15,10→b₂₅B_15,11,

B_15,11→iB_15,12|uB_15,12|eB_15,12|oB_15,12,

B_15,11→b₃B_15,13|b₄B_15,13|b₁₅B_15,13|b₁₆B_15,13,

B_15,11→b₁₂B_15,14|b₂₅B_15,14|b₂₆B_15,14,

B_15,12→b₃B_15,13|b₄B_15,13|b₁₅B_15,13|b₁₆B_15,13,

B_15,12→b₁₂B_15,14|b₂₅B_15,14|b₂₆B_15,14,

B_15,13→b₂₈,

B_15,14→b₁₁}

With respect to a Tibetan spelling structure 16:

Tibetan spelling formal grammar G₁₆: the spelling formal grammar G₁₆ of the Tibetan prefixes, the superfixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes is a quadruple (T₁₆, V₁₆, S₁₆, P₁₆), wherein:

(1) terminal symbol

T₁₆=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₁₁, b₁₂, b₁₅, b₁₆, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₆={S₁₆, B_16,1, B_16,2, B_16,3, B_16,4, B_16,5, B_16,6, B_16,7, B_16,8, B_16,9};

(3) S₁₆ is a non-terminal symbol in V₁₆ and is the start symbol; and

(4) the production set of the grammar G₁₆ is: P₁₆={

S₁₆→b₁₅B_16,1,

B_16,1→b₂₈B_16,2,

B_16,1→b₂₅B_16,3,

B_16,2→b₁B_16,4|b₃B_16,4,

B_16,3→b₁B_16,5|b₃B_16,5,

B_16,4→b₂₄B_16,6|b₂₅B_16,6,

B_16,5→b₂₄B_16,6,

B_16,6→iB_16,7|uB_16,7|eB_16,7|oB_16,7,

B_16,6→b₃B_16,8|b₄B_16,8|b₁₅B_16,8|b₁₆B_16,8,

B_16,6→b₁₂B_16,9|b₂₅B_16,9|b₂₆B_16,9,

B_16,7→b₃B_16,8|b₄B_16,8|b₁₅B_16,8|b₁₆B_16,8,

B_16,7→b₁₂B_16,9|b₂₅B_16,9|b₂₆B_16,9,

B_16,8→b₂₈,

B_16,9→b₁₁}

With respect to a Tibetan spelling structure 17:

Tibetan spelling formal grammar G₁₇: the spelling formal grammar G₁₇ of the Tibetan roots, the vowel symbols and the suffixes is a quadruple (T₁₇, V₁₇, S₁₇, P₁₇), wherein:

(1) terminal symbol

T₁₇=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₅, . . . , b₃₀}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₇={S₁₇, B_17,1, B_17,2};

(3) S₁₇ is a non-terminal symbol in V₁₇ and is the start symbol; and

(4) the production set of the grammar G₁₇ is: P₁₇={

S₁₇→b₁B_17,1|b₂B_17,1|b₃B_17,1|b₄B_17,1|b₅B_17,1| . . . |b₃₀B_17,1,

S₁₇→b₁B_17,2|b₂B_17,2|b₃B_17,2|b₄B_17,2|b₅B_17,2| . . . |b₃₀B_17,2,

B_17,1→iB_17,2|uB_17,2|eB_17,2|oB_17,2,

B_17,2→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 18:

Tibetan spelling formal grammar G₁₈: the spelling formal grammar G₁₈ of the Tibetan superfixes, the roots, the vowel symbols and the suffixes is a quadruple (T₁₈, V₁₈, S₁₈, P₁₈), wherein:

(1) terminal symbol

T₁₈=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₅, b₇, b₈, b₉, b₁₁, b₁₂, b₁₃, b₁₅, b₁₆, b₁₇, b₁₉, b₂₃, b₂₅, b₂₆, b₂₈, b₂₉}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₈={S₁₈, B_18,1, B_18,2, B_18,3, B_18,4, B_18,5};

(3) S₁₈ is a non-terminal symbol in V₁₈ and is the start symbol; and

(4) the production set of the grammar G₁₈ is: P₁₈={

S₁₈→b₂₅B_18,1|b₂₆B_18,2|b₂₈B_18,3,

B_18,1→b₁B_18,5|b₃B_18,5|b₄B_18,5|b₇B_18,5|b₈B_18,5|b₉B_18,5|b₁₁B_18,5|b₁₂B_18,5|b₁₅B_18,5|b₁₆B_18,5|b₁₇B_18,5|b₁₉B_18,5,

B_18,1→b₁B_18,4|b₃B_18,4|b₄B_18,4|b₇B_18,4|b₈B_18,4|b₉B_18,4|b₁₁B_18,4|b₁₂B_18,4|b₁₅B_18,4|b₁₆B_18,4|b₁₇B_18,4|b₁₉B_18,4,

B_18,2→b₁B_18,5|b₃B_18,5|b₄B_18,5|b₅B_18,5|b₇B_18,5|b₉B_18,5|b₁₁B_18,5|b₁₃B_18,5|b₁₅B_18,5|b₂₉B_18,5,

B_18,2→b₁B_18,4|b₃B_18,4|b₄B_18,4|b₅B_18,4|b₇B_18,4|b₉B_18,4|b₁₁B_18,4|b₁₃B_18,4|b₁₅B_18,4|b₂₉B_18,4,

B_18,3→b₁B_18,5|b₃B_18,5|b₄B_18,5|b₈B_18,5|b₉B_18,5|b₁₁B_18,5|b₁₂B_18,5|b₁₃B_18,5|b₁₅B_18,5|b₁₆B_18,5|b₁₇B_18,5,

B_18,3→b₁B_18,4|b₃B_18,4|b₄B_18,4|b₈B_18,4|b₉B_18,4|b₁₁B_18,4|b₁₂B_18,4|b₁₃B_18,4|b₁₅B_18,4|b₁₆B_18,4|b₁₇B_18,4,

B_18,4→iB_18,5|uB_18,5|eB_18,5|oB_18,5,

B_18,5→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 19:

Tibetan spelling formal grammar G₁₉: the spelling formal grammar G₁₉ of the Tibetan roots, the subfixes, the vowel symbols and the suffixes is a quadruple (T₁₉, V₁₉, S₁₉, P₁₉), wherein:

(1) terminal symbol

T₁₉=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₈, b₉, b₁₀, b₁₁, b₁₂, b₁₃, b₁₄, b₁₅, b₁₆, b₁₈, b₂₀, b₂₁, b₂₂, b₂₃, b₂₄, b₂₅, b₂₆, b₂₇, b₂₈, b₂₉}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₁₉={S₁₉, B_19,1, B_19,2, B_19,3, B_19,4, B_19,5, B_19,6, B_19,7, B_19,8, B_19,9, B_19,10, B_19,11};

(3) S₁₉ is a non-terminal symbol in V₁₉ and is the start symbol; and

(4) the production set of the grammar G₁₉ is: P₁₉={

S₁₉→b₁B_19,1|b₃B_19,1,

S₁₉→b₂B_19,2,

S₁₉→b₁₁B_19,3|b₂₉B_19,3,

S₁₉→b₈B_19,4|b₁₈B_19,4|b₂₁B_19,4|b₂₆B_19,4|b₂₇B_19,4,

S₁₉→b₉B_19,5|b₁₀B_19,5,

S₁₉→b₁₃B_19,6|b₁₄B_19,6|b₁₆B_19,6,

S₁₉→b₂₂B_19,7|b₂₅B_19,7,

S₁₉→b₂₈B_19,8,

S₁₉→b₁₅B_19,9,

B_19,1→b₂₀B_19,11|b₂₄B_19,11|b₂₅B_19,11|b₂₆B_19,11,

B_19,1→b₂₀B_19,10|b₂₄B_19,10|b₂₅B_19,10|b₂₆B_19,10,

B_19,2→b₂₀B_19,11|b₂₄B_19,11|b₂₅B_19,11,

B_19,2→b₂₀B_19,10|b₂₄B_19,10|b₂₅B_19,10,

B_19,3→b₂₀B_19,11|b₂₅B_19,11,

B_19,3→b₂₀B_19,10|b₂₅B_19,10,

B_19,4→b₂₀B_19,11,

B_19,4→b₂₀B_19,10,

B_19,5→b₂₅B_19,11,

B_19,5→b₂₅B_19,10,

B_19,6→b₂₄B_19,11|b₂₅B_19,11,

B_19,6→b₂₄B_19,10|b₂₅B_19,10,

B_19,7→b₂₀B_19,11|b₂₆B_19,11,

B_19,7→b₂₀B_19,10|b₂₆B_19,10,

B_19,8→b₂₅B_19,11|b₂₆B_19,11,

B_19,8→b₂₅B_19,10|b₂₆B_19,10,

B_19,9→b₂₄B_19,11|b₂₅B_19,11|b₂₆B_19,11,

B_19,9→b₂₄B_19,10|b₂₅B_19,10|b₂₆B_19,10,

B_19,10→iB_19,11|uB_19,11|eB_19,11|oB_19,11,

B_19,11→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 20:

Tibetan spelling formal grammar G₂₀: the spelling formal grammar G₂₀ of the superfixes, the Tibetan roots, the subfixes, the vowel symbols and the suffixes is a quadruple (T₂₀, V₂₀, S₂₀, P₂₀), wherein:

(1) terminal symbol

T₂₀=T_B∪T_o, wherein:

T_B={b₁, b₃, b₄, b₁₁, b₁₂, b₁₃, b₁₅, b₁₆, b₁₇, b₂₀, b₂₃, b₂₄, b₂₅, b₂₆, b₂₈}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₂₀={S₂₀, B_20,1, B_20,2, B_20,3, B_20,4, B_20,5, B_20,6, B_20,7, B_20,8};

(3) S₂₀ is a non-terminal symbol in V₂₀ and is the start symbol; and

(4) the production set of the grammar G₂₀ is: P₂₀={

S₂₀→b₂₅B_20,1,

S₂₀→b₂₈B_20,2,

B_20,1→b₁B_20,3|b₃B_20,3|b₁₆B_20,3,

B_20,1→b₁₇B_20,4,

B_20,2→b₁B_20,5|b₃B_20,5|b₁₃B_20,5|b₁₅B_20,5|b₁₆B_20,5,

B_20,2→b₁₂B_20,6,

B_20,3→b₂₄B_20,8,

B_20,3→b₂₄B_20,7,

B_20,4→b₂₀B_20,8,

B_20,4→b₂₀B_20,7,

B_20,5→b₂₄B_20,8|b₂₅B_20,8,

B_20,5→b₂₄B_20,7|b₂₅B_20,7,

B_20,6→b₂₅B_20,8,

B_20,6→b₂₅B_20,7,

B_20,7→iB_20,8|uB_20,8|eB_20,8|oB_20,8,

B_20,8→b₃|b₄|b₁₁|b₁₂|b₁₅|b₁₆|b₂₃|b₂₅|b₂₆|b₂₈}

With respect to a Tibetan spelling structure 21:

Tibetan spelling formal grammar G₂₁: the spelling formal grammar G₂₁ of the Tibetan roots, the vowel symbols, the suffixes and the postfixes is a quadruple (T₂₁, V₂₁, S₂₁, P₂₁), wherein:

(1) terminal symbol

T₂₁=T_B∪T_o, wherein:

(2) non-terminal symbol set

V₂₁={S₂₁, B_21,1, B_21,2, B_21,3, B_21,4, B_21,5, B_21,6, B_21,7};

(3) S₂₁ is a non-terminal symbol in V₂₁ and is the start symbol; and

(4) the production set of the grammar G₂₁ is: P₂₁={

S₂₁→b₁B_21,1|b₂B_21,1| . . . |b₁₀B_21,1|b₁₂B_21,1|b₁₃B_21,1| . . . |b₂₂B_21,1|b₂₄B_21,1|b₂₅B_21,1| . . . |b₃₀B_21,1,

S₂₁→b₁₁B_21,2,

S₂₁→b₂₃B_21,3,

B_21,1→iB_21,4|uB_21,4|eB_21,4|oB_21,4,

B_21,1→b₃B_21,7|b₄B_21,7|b₁₅B_21,7|b₁₆B_21,7,

B_21,2→iB_21,5|uB_21,5|eB_21,5|oB_21,5,

B_21,3→b₄B_21,7|b₁₆B_21,7,

B_21,3→iB_21,6|uB_21,6|eB_21,6|oB_21,6,

B_21,4→b₃B_21,7|b₄B_21,7|b₁₅B_21,7|b₁₆B_21,7,

B_21,5→b₃B_21,7|b₄B_21,7|b₁₅B_21,7|b₁₆B_21,7,

B_21,6→b₃B_21,7|b₄B_21,7|b₁₅B_21,7|b₁₆B_21,7,

B_21,7→b₂₈}

With respect to a Tibetan spelling structure 22:

Tibetan spelling formal grammar G₂₂: the spelling formal grammar G₂₂ of the Tibetan superfixes, the roots, the vowel symbols, the suffixes and the postfixes is a quadruple (T₂₂, V₂₂, S₂₂, P₂₂), wherein:

(1) terminal symbol

T₂₂=T_B∪T_o, wherein:

(2) non-terminal symbol set

V₂₂={S₂₂, B_22,1, B_22,2, B_22,3, B_22,4, B_22,5, B_22,6, B_22,7};

(3) S₂₂ is a non-terminal symbol in V₂₂ and is the start symbol; and

(4) the production set of the grammar G₂₂ is: P₂₂={

S₂₂→b₂₅B_22,1|b₂₆B_22,2|b₂₈B_22,3,

B_22,1→b₁B_22,4|b₃B_22,4|b₄B_22,4|b₇B_22,4|b₈B_22,4|b₉B_22,4|b₁₁B_22,4|b₁₂B_22,4|b₁₅B_22,4|b₁₆B_22,4|b₁₇B_22,4|b₁₉B_22,4,

B_22,2→b₁B_22,4|b₃B_22,4|b₄B_22,4|b₅B_22,4|b₇B_22,4|b₉B_22,4|b₁₁B_22,4|b₁₃B_22,4|b₁₅B_22,4|b₂₉B_22,4,

B_22,3→b₁B_22,4|b₃B_22,4|b₄B_22,4|b₈B_22,4|b₉B_22,4|b₁₁B_22,4|b₁₂B_22,4|b₁₃B_22,4|b₁₅B_22,4|b₁₆B_22,4|b₁₇B_22,4,

B_22,4→iB_22,7|uB_22,7|eB_22,7|oB_22,7,

B_22,4→b₁₂B_22,5|b₂₅B_22,5|b₂₆B_22,5,

B_22,4→b₃B_22,6|b₄B_22,6|b₁₅B_22,6|b₁₆B_22,6,

B_22,7→b₁₂B_22,5|b₂₅B_22,5|b₂₆B_22,5,

B_22,7→b₃B_22,6|b₄B_22,6|b₁₅B_22,6|b₁₆B_22,6,

B_22,5→b₁₁,

B_22,6→b₁₈}

With respect to a Tibetan spelling structure 23:

Tibetan spelling formal grammar G₂₃: the spelling formal grammar G₂₃ of the Tibetan roots, the subfixes, the vowel symbols, the suffixes and the postfixes is a quadruple (T₂₃, V₂₃, S₂₃, P₂₃), wherein:

(1) terminal symbol

T₂₃=T_B∪T_o, wherein:

T_B={b₁, b₂, b₃, b₄, b₈, b₉, b₁₀, b₁₁, b₁₂, b₁₃, b₁₄, b₁₅, b₁₆, b₁₈, b₂₀, b₂₁, b₂₂, b₂₄, b₂₅, b₂₆, b₂₇, b₂₈, b₂₉}, the elements thereof correspond to the Tibetan consonant characters; and T_o={i, u, e, o}, the elements thereof correspond to the Tibetan vowel characters;

(2) non-terminal symbol set

V₂₃={S₂₃, B_23,1, B_23,2, B_23,3, B_23,4, B_23,5, B_23,6, B_23,7, B_23,8, B_23,9, B_23,10, B_23,11, B_23,12, B_23,13};

(3) S₂₃ is a non-terminal symbol in V₂₃ and is the start symbol; and

(4) the production set of the grammar G₂₃ is: P₂₃={

S₂₃→b₁B_23,1|b₃B_23,1,

S₂₃→b₂B_23,2,

S₂₃→b₁₁B_23,3|b₂₉B_23,3,

S₂₃→b₈B_23,4|b₁₈B_23,4|b₂₁B_23,4|b₂₆B_23,4|b₂₇B_23,4,

S₂₃→b₉B_23,5|b₁₀B_23,5,

S₂₃→b₁₃B_23,6|b₁₄B_23,6|b₁₆B_23,6,

S₂₃→b₂₂B_23,7|b₂₅B_23,7,

S₂₃→b₂₈B_23,8,

S₂₃→b₁₅B_23,9,

B_23,1→b₂₀B_23,10|b₂₄B_23,10|b₂₅B_23,10|b₂₆B_23,10,

B_23,2→b₂₀B_23,10|b₂₄B_23,10|b₂₅B_23,10,

B_23,3→b₂₀B_23,10|b₂₅B_23,10,

B_23,4→b₂₀B_23,10,

B_23,5→b₂₅B_23,10,

B_23,6→b₂₄B_23,10|b₂₅B_23,10,

B_23,7→b₂₀B_23,10|b₂₆B_23,10,

B_23,8→b₂₅B_23,10|b₂₆B_23,10,

B_23,9→b₂₄B_23,10|b₂₅B_23,10|b₂₆B_23,10,

B_23,10→iB_23,11|uB_23,11|eB_23,11|oB_23,11,

B_23,10→b₁₂B_23,12|b₂₅B_23,12|b₂₆B_23,12,

B_23,10→b₃B_23,13|b₄B_23,13|b₁₅B_23,13|b₁₆B_23,13,

B_23,11→b₁₂B_23,12|b₂₅B_23,12|b₂₆B_23,12,

B_23,11→b₃B_23,13|b₄B_23,13|b₁₅B_23,13|b₁₆B_23,13,

B_23,12→b₁₁,

B_23,13→b₁₈}

With respect to a Tibetan spelling structure 24:

Tibetan spelling formal grammar G₂₄: the spelling formal grammar G₂₄ of the Tibetan superfixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes is a quadruple (T₂₄, V₂₄, S₂₄, P₂₄), wherein:

(1) terminal symbol

T₂₄=T_B∪T_o, wherein:

(2) non-terminal symbol set

V₂₄={S₂₄, B_24,1, B_24,2, B_24,3, B_24,4, B_24,5, B_24,6, B_24,7, B_24,8, B_24,9, B_24,10};

(3) S₂₄ is a non-terminal symbol in V₂₄ and is the start symbol; and

(4) the production set of the grammar G₂₄ is: P₂₄={

S₂₄→b₂₅B_24,1,

S₂₄→b₂₈B_24,2,

B_24,1→b₁B_24,3|b₃B_24,3|b₁₆B_24,3,

B_24,1→b₁₇B_24,4,

B_24,2→b₁B_24,5|b₃B_24,5|b₁₃B_24,5|b₁₅B_24,5|b₁₆B_24,5,

B_24,2→b₁₂B_24,6,

B_24,3→b₂₄B_24,7,

B_24,4→b₂₀B_24,7,

B_24,5→b₂₄B_24,7|b₂₅B_24,7,

B_24,6→b₂₅B_24,7,

B_24,7→iB_24,8|uB_24,8|eB_24,8|oB_24,8,

B_24,7→b₁₂B_24,9|b₂₅B_24,9|b₂₆B_24,9,

B_24,7→b₃B_24,10|b₄B_24,10|b₁₅B_24,10|b₁₆B_24,10,

B_24,8→b₁₂B_24,9|b₂₅B_24,9|b₂₆B_24,9,

B_24,8→b₃B_24,10|b₄B_24,10|b₁₅B_24,10|b₁₆B_24,10,

B_24,9→b₁₁,

B_24,10→b₁₈}

In the embodiment, the process of acquiring a newly added non-terminal symbol E_i includes: judging whether the finite set P_i of the production rules of the Tibetan spelling formal grammar G_i contains a production rule B→x, wherein B∈V_i and x∈T_i; and if so, acquiring E_i∈δ_i(B, x), wherein δ_i(B, x)=φ. E_i is one of the non-terminal symbols.
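The construction described above can be illustrated by the following minimal sketch, which derives a transition table from the production set P₇ of the grammar G₇ listed earlier. The names `build_transitions`, `FINAL` and the tuple encoding of a production are hypothetical and are introduced only for illustration. The sketch assumes the textbook construction for right-linear grammars, in which every production B→x contributes the added state E_i to δ_i(B, x) and every production B→xC contributes C; it is not asserted to be the exact procedure of the embodiment.

```python
from collections import defaultdict

FINAL = "E"  # hypothetical label for the newly added symbol E_i


def build_transitions(productions):
    """Build delta from (lhs, terminal, rhs) triples.

    rhs is the non-terminal that follows the terminal, or None for a
    production of the form B -> x, which leads to the added state E_i.
    """
    delta = defaultdict(set)
    for lhs, terminal, rhs in productions:
        delta[(lhs, terminal)].add(FINAL if rhs is None else rhs)
    return delta


# Production set P7 of grammar G7, transcribed from the listing above.
G7 = [
    ("S7", "b15", "B7_1"),
    ("B7_1", "b28", "B7_2"),
    ("B7_1", "b25", "B7_3"),
    ("B7_2", "b1", "B7_4"), ("B7_2", "b3", "B7_4"),
    ("B7_3", "b1", "B7_5"), ("B7_3", "b3", "B7_5"),
    ("B7_4", "b24", None), ("B7_4", "b25", None),
    ("B7_4", "b24", "B7_6"), ("B7_4", "b25", "B7_6"),
    ("B7_5", "b24", None),
    ("B7_5", "b24", "B7_6"),
    ("B7_6", "i", None), ("B7_6", "u", None),
    ("B7_6", "e", None), ("B7_6", "o", None),
]

delta7 = build_transitions(G7)
```

In this sketch `delta7[("B7_4", "b24")]` contains both the added state and `B7_6`, which reflects that the vowel symbol after the subfix is optional in the grammar G₇.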

Step 103, the constituents of the Tibetan characters are acquired according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the Tibetan characters in the Tibetan text are correctly spelled.

In the embodiment, the process of determining the target finite state automaton through the step 103 can include: each finite state automaton in the finite state automaton group sequentially receives at least one Tibetan character from the initial state and transfers the state; if a certain finite state automaton in the finite state automaton group can enter the termination state after transferring the state, the Tibetan text to be checked is correctly spelled; if none of the finite state automata in the finite state automaton group can enter the termination state after transferring the state, the Tibetan text to be checked is wrongly spelled. The finite state automaton which determines that the Tibetan text to be checked is correctly spelled is the target finite state automaton.

Wherein, the operation of transferring the state can be as follows: when the finite state automaton M_i is in a certain state q_m (q_m∈Q_i) and receives an input character x (x∈Σ_i), if the state transition function δ_i defines a transition δ_i(q_m, x), the automaton enters the state q_m+1 (q_m+1∈δ_i(q_m, x)); otherwise, the state of the automaton is not changed.
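The determination of the target finite state automaton can be sketched as follows, using only the grammars G₄ and G₇ as a two-member group for brevity. The names `make_delta`, `accepts`, `find_target` and `ACCEPT` are hypothetical. As an assumption of this sketch, an input character for which no transition is defined causes that automaton to fail, which is one reading of "the state of the automaton is not changed" (an automaton that stays in place cannot consume the character and so cannot reach the termination state on it).

```python
from collections import defaultdict

ACCEPT = "E"  # hypothetical label for the termination state


def make_delta(rules):
    """Collect (state, symbol, next_state) triples into a transition table."""
    delta = defaultdict(set)
    for state, symbol, nxt in rules:
        delta[(state, symbol)].add(nxt)
    return delta


def accepts(delta, start, symbols):
    """Run one automaton; True if it can end in the termination state."""
    current = {start}
    for symbol in symbols:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
        if not current:  # no transition defined: treated as failure here
            return False
    return ACCEPT in current


def optional_vowel(state, symbols, vowel_state):
    """Rules for 'state -> x' and 'state -> x vowel_state'."""
    return [(state, s, t) for s in symbols for t in (ACCEPT, vowel_state)]


VOWELS = ("i", "u", "e", "o")

g4 = [("S4", "b25", "B4_1"), ("S4", "b28", "B4_2"), ("B4_1", "b17", "B4_4"),
      ("B4_2", "b12", "B4_6")]
g4 += [("B4_1", r, "B4_3") for r in ("b1", "b3", "b16")]
g4 += [("B4_2", r, "B4_5") for r in ("b1", "b3", "b13", "b15", "b16")]
g4 += optional_vowel("B4_3", ["b24"], "B4_7")
g4 += optional_vowel("B4_4", ["b20"], "B4_7")
g4 += optional_vowel("B4_5", ["b24", "b25"], "B4_7")
g4 += optional_vowel("B4_6", ["b25"], "B4_7")
g4 += [("B4_7", v, ACCEPT) for v in VOWELS]

g7 = [("S7", "b15", "B7_1"), ("B7_1", "b28", "B7_2"), ("B7_1", "b25", "B7_3")]
g7 += [("B7_2", r, "B7_4") for r in ("b1", "b3")]
g7 += [("B7_3", r, "B7_5") for r in ("b1", "b3")]
g7 += optional_vowel("B7_4", ["b24", "b25"], "B7_6")
g7 += optional_vowel("B7_5", ["b24"], "B7_6")
g7 += [("B7_6", v, ACCEPT) for v in VOWELS]

GROUP = {4: (make_delta(g4), "S4"), 7: (make_delta(g7), "S7")}


def find_target(group, symbols):
    """Return the index of the first automaton that accepts, else None."""
    for index, (delta, start) in group.items():
        if accepts(delta, start, symbols):
            return index
    return None
```

With this group, `find_target(GROUP, ["b15", "b28", "b1", "b24", "u"])` selects automaton 7, while an input such as `["b15", "b1"]` is accepted by neither automaton and is therefore reported as wrongly spelled.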

In the embodiment, the process of acquiring the constituents of the Tibetan characters through the step 103 can include: at first, acquiring a target Tibetan spelling formal grammar corresponding to the target finite state automaton; and then, acquiring the constituents of the Tibetan characters according to the target Tibetan spelling formal grammar.

In the embodiment, the constituents of the Tibetan characters are in one-to-one correspondence with the Tibetan spelling formal grammars. Specifically, the constituents of the Tibetan characters have 24 basic spelling structures as follows:

Basic spelling structure 1 of the Tibetan characters: the Tibetan roots are spelled with the vowel symbols.

Basic spelling structure 2 of the Tibetan characters: the Tibetan superfixes, the roots and the vowels are spelled.

Basic spelling structure 3 of the Tibetan characters: the Tibetan roots, the subfixes and the vowel symbols are spelled.

Basic spelling structure 4 of the Tibetan characters: the superfixes, the Tibetan roots, the subfixes and the vowel symbols are spelled.

Basic spelling structure 5 of the Tibetan characters: the Tibetan prefixes, the superfixes, the roots and the vowel symbols are spelled.

Basic spelling structure 6 of the Tibetan characters: the Tibetan prefixes, the roots, the subfixes and the vowel symbols are spelled.

Basic spelling structure 7 of the Tibetan characters: the Tibetan prefixes, the superfixes, the roots, the subfixes and the vowel symbols are spelled.

Basic spelling structure 8 of the Tibetan characters: the Tibetan prefixes, the roots and the vowel symbols are spelled.

Basic spelling structure 9 of the Tibetan characters: the Tibetan prefixes, the roots, the vowel characters and the suffixes are spelled.

Basic spelling structure 10 of the Tibetan characters: the Tibetan prefixes, the superfixes, the roots, the vowel symbols and the suffixes are spelled.

Basic spelling structure 11 of the Tibetan characters: the Tibetan prefixes, the roots, the subfixes, the vowel symbols and the suffixes are spelled.

Basic spelling structure 12 of the Tibetan characters: the Tibetan prefixes, the superfixes, the roots, the subfixes, the vowel symbols and the suffixes are spelled.

Basic spelling structure 13 of the Tibetan characters: the Tibetan prefixes, the roots, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 14 of the Tibetan characters: the Tibetan prefixes, the superfixes, the roots, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 15 of the Tibetan characters: the Tibetan prefixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 16 of the Tibetan characters: the Tibetan prefixes, the superfixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 17 of the Tibetan characters: the Tibetan roots, the vowel symbols and the suffixes are spelled.

Basic spelling structure 18 of the Tibetan characters: the Tibetan superfixes, the roots, the vowel symbols and the suffixes are spelled.

Basic spelling structure 19 of the Tibetan characters: the Tibetan roots, the subfixes, the vowel symbols and the suffixes are spelled.

Basic spelling structure 20 of the Tibetan characters: the superfixes, the Tibetan roots, the subfixes, the vowel symbols and the suffixes are spelled.

Basic spelling structure 21 of the Tibetan characters: the Tibetan roots, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 22 of the Tibetan characters: the Tibetan superfixes, the roots, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 23 of the Tibetan characters: the Tibetan roots, the subfixes, the vowel symbols, the suffixes and the postfixes are spelled.

Basic spelling structure 24 of the Tibetan characters: the Tibetan superfixes, the roots, the subfixes, the vowel symbols, the suffixes and the postfixes are spelled.

It should be noted that the vowel symbols in the basic spelling structure 8 of the Tibetan characters are essential, and apart from this, the vowel symbols in the other structures are optional.
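As a rough illustration, the structures 7 to 24 listed above can be encoded as a lookup keyed on which constituents are present. The sketch below is not the automaton group itself: the names `STRUCTURES` and `structure_number` are hypothetical, the structures 1 to 6 are defined earlier in the description and are omitted here, and the root is assumed to be always present.

```python
# Presence flags, in order: prefix, superfix, subfix, suffix, postfix.
# The root is assumed present; the vowel symbol is checked separately
# because it is optional everywhere except in structure 8.
STRUCTURES = {
    (1, 1, 1, 0, 0): 7,
    (1, 0, 0, 0, 0): 8,
    (1, 0, 0, 1, 0): 9,
    (1, 1, 0, 1, 0): 10,
    (1, 0, 1, 1, 0): 11,
    (1, 1, 1, 1, 0): 12,
    (1, 0, 0, 1, 1): 13,
    (1, 1, 0, 1, 1): 14,
    (1, 0, 1, 1, 1): 15,
    (1, 1, 1, 1, 1): 16,
    (0, 0, 0, 1, 0): 17,
    (0, 1, 0, 1, 0): 18,
    (0, 0, 1, 1, 0): 19,
    (0, 1, 1, 1, 0): 20,
    (0, 0, 0, 1, 1): 21,
    (0, 1, 0, 1, 1): 22,
    (0, 0, 1, 1, 1): 23,
    (0, 1, 1, 1, 1): 24,
}


def structure_number(prefix, superfix, subfix, vowel, suffix, postfix):
    """Return the serial number (7..24) for the given constituents, or None.

    Each argument is truthy when the constituent is present and 0 otherwise.
    None means the combination is not one of the structures 7 to 24.
    """
    key = tuple(int(bool(x)) for x in (prefix, superfix, subfix, suffix, postfix))
    number = STRUCTURES.get(key)
    if number == 8 and not vowel:
        # Structure 8 requires the vowel symbol.
        return None
    return number
```

For example, a character with a superfix, a root and a suffix maps to structure 18 whether or not it carries a vowel symbol, while a prefix plus root only matches structure 8 when a vowel symbol is present.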

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

Second Embodiment

As shown in FIG. 2, the embodiment of the present invention provides a Tibetan sorting method, including:

Step 201, at least two Tibetan characters to be sorted are acquired.

In the embodiment, the at least two Tibetan characters acquired in the step 201 can be independent Tibetan characters and can also be a Tibetan text composed of a plurality of Tibetan characters, and this is not limited herein. Particularly, when the Tibetan text of at least two Tibetan characters is acquired, the Tibetan text can be segmented at first, the segmentation process is similar to the segmentation mode in the step 101 as shown in FIG. 1, and thus will not be repeated redundantly herein.

Step 202, the at least two Tibetan characters to be sorted are respectively used as the input of a preset finite state automaton group.

Step 203, the constituents of the Tibetan characters are acquired according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled.

In the embodiment, the process of acquiring the constituents of the Tibetan characters in the step 202 and the step 203 is similar to that in the step 102 and the step 103 as shown in FIG. 1, and thus will not be repeated redundantly herein.

Step 204, the at least two Tibetan characters are sorted according to the constituents of the at least two Tibetan characters to acquire a sorting result.

In the embodiment, for any two Tibetan characters in the at least two Tibetan characters, the sorting process in the step 204 includes:

2041, judging whether the two Tibetan characters conform to a preset constituent rule according to the constituents of the two Tibetan characters; if so, executing 2042; otherwise, executing 2044;

2042, judging whether the roots of the two Tibetan characters are the same; if so, executing 2043; otherwise, executing 2044;

2043, sequentially comparing the constituents of the two Tibetan characters according to the sequence of prefixes, superfixes, subfixes, vowels, suffixes and postfixes; executing 2045;

2044, sequentially comparing the constituents of the two Tibetan characters according to the sequence of superfixes, prefixes, subfixes, vowels, suffixes and postfixes; executing 2045; and

2045, if the comparison result is that the former Tibetan character in the two Tibetan characters is larger than the latter Tibetan character, exchanging the sequence of the two Tibetan characters; and otherwise, keeping the sequence of the two Tibetan characters unchanged.

Wherein, 2041 includes: acquiring spelling structure serial numbers of the two Tibetan characters according to the constituents of the two Tibetan characters; and judging whether the two Tibetan characters conform to the preset constituent rule according to the spelling structure serial numbers of the two Tibetan characters, wherein the constituent rule includes: the spelling structure serial number of the first Tibetan character in the two Tibetan characters belongs to a set {2, 4, 18, 20, 22, 24}, and the spelling structure serial number of the second Tibetan character in the two Tibetan characters belongs to a set {5, 7, 10, 12, 14, 16}; or, the spelling structure serial number of the first Tibetan character in the two Tibetan characters belongs to the set {5, 7, 10, 12, 14, 16}, and the spelling structure serial number of the second Tibetan character in the two Tibetan characters belongs to the set {2, 4, 18, 20, 22, 24}.
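The branch selection in 2041 to 2044 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: each Tibetan character is represented as a hypothetical dictionary holding its spelling structure serial number under `"sn"` and a numeric code for each constituent (0 when absent), and the names `SET_A`, `SET_B`, `conforms_to_rule` and `compare_characters` are invented for the example. The sketch follows the listed comparison sequences literally and does not add any root comparison beyond the equality test of 2042.

```python
# The two serial-number sets named in the constituent rule.
SET_A = {2, 4, 18, 20, 22, 24}
SET_B = {5, 7, 10, 12, 14, 16}

# Comparison sequences of 2043 and 2044.
ORDER_2043 = ("prefix", "superfix", "subfix", "vowel", "suffix", "postfix")
ORDER_2044 = ("superfix", "prefix", "subfix", "vowel", "suffix", "postfix")


def conforms_to_rule(first_sn, second_sn):
    """2041: one serial number lies in SET_A and the other in SET_B."""
    return ((first_sn in SET_A and second_sn in SET_B)
            or (first_sn in SET_B and second_sn in SET_A))


def compare_characters(a, b):
    """Return -1, 0 or 1 for two characters given as constituent dictionaries."""
    if conforms_to_rule(a["sn"], b["sn"]) and a["root"] == b["root"]:
        order = ORDER_2043  # 2041 and 2042 both answered "yes"
    else:
        order = ORDER_2044  # either test answered "no"
    for name in order:
        if a[name] != b[name]:
            return -1 if a[name] < b[name] else 1
    return 0
```

A result of 1 corresponds to the case in 2045 where the former character is larger and the two characters are exchanged.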

In the embodiment, the constituents of the Tibetan character can be summarized as including the following 7 symbols: the root, the prefix, the superfix, the subfix, the vowel, the suffix and the postfix. When the constituents of the Tibetan character do not contain one or several certain symbols, the corresponding symbol mark of the Tibetan character is 0.

In the embodiment, once any two Tibetan characters in the at least two Tibetan characters can be compared via the above process, all of the at least two Tibetan characters can be sorted by adopting a bubble sort algorithm or another sorting method.
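A bubble sort driven by a pairwise comparison could look like the sketch below. It is illustrative only: `bubble_sort` and `compare_codes` are hypothetical names, and the stand-in comparator simply compares tuples of numeric codes rather than running the constituent comparison of the step 204.

```python
def compare_codes(a, b):
    """Stand-in comparator: returns -1, 0 or 1 for two tuples of codes."""
    return (a > b) - (a < b)


def bubble_sort(items, compare):
    """Return a sorted copy, exchanging adjacent items when the former is larger."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if compare(result[i], result[i + 1]) > 0:
                # The former item is larger: exchange the two items.
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            break  # no exchange in this pass, so the list is already sorted
    return result
```

Any comparator returning a negative, zero or positive value can be passed in place of `compare_codes`, so the same loop would work with a constituent-based comparison.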

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

Third Embodiment

As shown in FIG. 3, the embodiment of the present invention provides a Tibetan sorting method, including:

Step 301, at least two Tibetan words to be sorted are acquired.

Step 302, Tibetan characters in the at least two Tibetan words are respectively acquired.

In the embodiment, the at least two Tibetan words can be segmented to acquire the Tibetan characters; and the at least two Tibetan words can be divided according to a specific separator and other signs to acquire the Tibetan characters, which will not be repeated redundantly herein.

Step 303, the Tibetan characters in the at least two Tibetan words are respectively used as the input of a preset finite state automaton group.

Step 304, the constituents of the Tibetan characters are acquired according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled.

In the embodiment, the process of acquiring the constituents of the Tibetan characters in the step 303 and the step 304 is similar to that in the step 102 and the step 103 as shown in FIG. 1, and thus will not be repeated redundantly herein.

Step 305, the at least two Tibetan words are sorted according to the constituents of each Tibetan character in the at least two Tibetan words to acquire a sorting result.

In the embodiment, for any two Tibetan words in the at least two Tibetan words, the sorting process in the step 305 includes:

3051, respectively acquiring first Tibetan characters in the two Tibetan words;

3052, judging whether the two Tibetan characters conform to a preset constituent rule according to the constituents of the Tibetan characters; if so, executing 3053; otherwise, executing 3055;

3053, judging whether the roots of the Tibetan characters are the same; if so, executing 3054; otherwise, executing 3055;

3054, sequentially comparing the constituents of the Tibetan characters according to the sequence of prefixes, superfixes, subfixes, vowels, suffixes and postfixes; executing 3056;

3055, sequentially comparing the constituents of the Tibetan characters according to the sequence of superfixes, prefixes, subfixes, vowels, suffixes and postfixes; executing 3056; and

3056, if the comparison result is that the Tibetan character in the former Tibetan word is larger than the corresponding Tibetan character in the latter Tibetan word, exchanging the sequence of the two Tibetan words; if the comparison result is that the Tibetan character in the former Tibetan word is smaller than the corresponding Tibetan character in the latter Tibetan word, keeping the sequence of the two Tibetan words unchanged; and if the comparison result is that the Tibetan character in the former Tibetan word is equal to the corresponding Tibetan character in the latter Tibetan word, acquiring the next Tibetan characters in the two Tibetan words, and executing 3052 to 3056 until all the Tibetan characters in the two Tibetan words are completely compared.

Wherein, the process of judging whether the two Tibetan characters conform to the constituent rule in 3052 is similar to that provided in the second embodiment, and thus will not be repeated redundantly herein.
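The character-by-character loop of 3051 to 3056 can be sketched as below. This is a hedged illustration: `compare_words` is a hypothetical name, the per-character comparison is passed in as a function, and the handling of words of different lengths (the shorter word is placed first when all shared positions are equal) is an assumption, since the description does not state it.

```python
def compare_words(word_a, word_b, compare_characters):
    """Return -1, 0 or 1 for two words given as sequences of characters.

    compare_characters(a, b) must return a negative, zero or positive value.
    """
    for char_a, char_b in zip(word_a, word_b):
        result = compare_characters(char_a, char_b)
        if result != 0:
            # First differing position decides the order of the two words.
            return -1 if result < 0 else 1
    # Assumption (not stated in the description): if one word is a prefix
    # of the other, the shorter word sorts first.
    return (len(word_a) > len(word_b)) - (len(word_a) < len(word_b))
```

A positive result corresponds to exchanging the two words in 3056, and a negative result to keeping their sequence unchanged.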

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

Fourth Embodiment

As shown in FIG. 4, the embodiment of the present invention provides a Tibetan character constituent analysis device, including:

a text acquisition module 401, used for acquiring a Tibetan text to be analyzed;

a text input module 402, connected with the text acquisition module and used for using Tibetan characters in the Tibetan text as the input of a preset finite state automaton group; and

a constituent analysis module 403, connected with the text input module and used for acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the Tibetan characters in the Tibetan text are correctly spelled.

In the embodiment, the process of implementing Tibetan character constituent analysis through the text acquisition module 401, the text input module 402 and the constituent analysis module 403 is similar to the process provided by the first embodiment of the present invention, and thus will not be repeated redundantly herein.

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

Fifth Embodiment

As shown in FIG. 5, the embodiment of the present invention provides a Tibetan sorting device, including:

a Tibetan character acquisition module 501, used for acquiring at least two Tibetan characters to be sorted;

a Tibetan character input module 502, connected with the Tibetan character acquisition module and used for respectively using the at least two Tibetan characters to be sorted as the input of a preset finite state automaton group;

a constituent analysis module 503, connected with the Tibetan character input module and used for acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled; and

a sorting module 504, connected with the constituent analysis module and used for sorting the at least two Tibetan characters according to the constituents of the at least two Tibetan characters to acquire a sorting result;

the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i = (Σ_i, Q_i, δ_i, q_i, F_i); the Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; the Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and the F_i; the δ_i represents a state transition function of the finite state automaton M_i acquired by mapping from a direct product Q_i × Σ_i of Q_i and Σ_i to Q_i; the q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; the F_i represents a finite set of termination states of the finite state automaton M_i, and F_i ⊂ Q_i; and the i is a positive integer, and i ≦ 24.
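To make the five-tuple concrete, the sketch below runs an input through a small group of automata and reports which one accepts it. Everything in it is a toy assumption: the two automata work on the symbol classes "R" (root), "V" (vowel symbol) and "S" (suffix) rather than on Tibetan letters, they do not reproduce any grammar G_i of the invention, the transition function is partial (a missing transition means rejection), and the names `accepts` and `find_target` are hypothetical.

```python
def accepts(automaton, symbols):
    """Run one automaton (sigma, states, delta, start, finals) on the input."""
    sigma, states, delta, start, finals = automaton
    state = start
    for symbol in symbols:
        if symbol not in sigma or (state, symbol) not in delta:
            return False  # no transition defined: the spelling is rejected
        state = delta[(state, symbol)]
    return state in finals


def find_target(group, symbols):
    """Return the 1-based index of the first accepting automaton, or None."""
    for index, automaton in enumerate(group, start=1):
        if accepts(automaton, symbols):
            return index
    return None


SIGMA = {"R", "V", "S"}

# Toy automaton 1: a root followed by an optional vowel symbol.
M1 = (SIGMA, {"q0", "q1", "q2"},
      {("q0", "R"): "q1", ("q1", "V"): "q2"},
      "q0", {"q1", "q2"})

# Toy automaton 2: a root, an optional vowel symbol, then a suffix.
M2 = (SIGMA, {"q0", "q1", "q2", "q3"},
      {("q0", "R"): "q1", ("q1", "V"): "q2",
       ("q1", "S"): "q3", ("q2", "S"): "q3"},
      "q0", {"q3"})

GROUP = [M1, M2]
```

With this toy group, `find_target(GROUP, ["R", "V", "S"])` selects the second automaton, and an input that no automaton accepts yields `None`, which corresponds to a spelling that is not determined to be correct.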

In the embodiment, the process of implementing Tibetan sorting through the Tibetan character acquisition module 501, the Tibetan character input module 502, the constituent analysis module 503 and the sorting module 504 is similar to the process provided by the second embodiment of the present invention, and thus will not be repeated redundantly herein.

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

Sixth Embodiment

As shown in FIG. 6, the embodiment of the present invention provides a Tibetan sorting device, including:

a Tibetan word acquisition module 601, used for acquiring at least two Tibetan words to be sorted;

a Tibetan character acquisition module 602, connected with the Tibetan word acquisition module and used for respectively acquiring Tibetan characters in the at least two Tibetan words;

a Tibetan character input module 603, connected with the Tibetan character acquisition module and used for respectively using the Tibetan characters in the at least two Tibetan words as the input of a preset finite state automaton group;

a constituent analysis module 604, connected with the Tibetan character input module and used for acquiring the constituents of the Tibetan characters according to a target finite state automaton, when the target finite state automaton in the finite state automaton group determines that the input Tibetan characters are correctly spelled; and

a sorting module 605, connected with the constituent analysis module and used for sorting the at least two Tibetan words according to the constituents of each Tibetan character in the at least two Tibetan words to acquire a sorting result;

the finite state automaton group includes 24 finite state automata, and any finite state automaton M_i = (Σ_i, Q_i, δ_i, q_i, F_i); the Σ_i represents a finite set of terminal symbols of a preset Tibetan spelling formal grammar G_i; the Q_i represents a union of a finite set V_i of non-terminal symbols of the Tibetan spelling formal grammar G_i and the F_i; the δ_i represents a state transition function of the finite state automaton M_i acquired by mapping from a direct product Q_i × Σ_i of Q_i and Σ_i to Q_i; the q_i represents an initial state of the finite state automaton M_i, and q_i ∈ Q_i; the F_i represents a finite set of termination states of the finite state automaton M_i, and F_i ⊂ Q_i; and the i is a positive integer, and i ≦ 24.

In the embodiment, the process of implementing Tibetan sorting through the Tibetan word acquisition module 601 to the sorting module 605 is similar to the process provided by the third embodiment of the present invention, and thus will not be repeated redundantly herein.

The present invention has the following beneficial effects: the Tibetan text to be analyzed is used as the input of the finite state automaton group, and the constituents of the Tibetan characters are acquired according to the target finite state automaton which determines that the Tibetan characters are correct, therefore Tibetan character constituent analysis is achieved, and Tibetan sorting can be further achieved according to the constituents of the Tibetan characters. As the finite state automaton group corresponds to the Tibetan spelling formal grammar, the technical solutions provided by the embodiments of the present invention solve the problem that the existing Tibetan sorting methods have no universality or compatibility, which is inconvenient for the use of automatic computer Tibetan sorting.

The order of the above embodiments is only for the purpose of convenient description, and does not represent the advantages and disadvantages of the embodiments.

Finally, it should be noted that the above embodiments are merely used for illustrating the technical solutions of the present invention, rather than limiting them; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they could still make modifications to the technical solutions recorded in the foregoing embodiments or make equivalent substitutions to a part of technical features therein; and these modifications or substitutions do not make the essence of the corresponding technical solutions depart from the spirit and the scope of the technical solutions of the embodiments of the present invention.
