SYSTEM AND METHOD FOR SEMANTIC ANALYSIS OF SPEECH

Information

  • Patent Application
  • 20180190270
  • Publication Number
    20180190270
  • Date Filed
    June 14, 2016
  • Date Published
    July 05, 2018
Abstract
A system and method for semantic analysis of speech, the semantic speech analysis system being used for implementing semantic analysis of speech in a preset field, comprising: a storage unit (1), used for storing semantic sentences in the preset field, each semantic sentence corresponding to an address, the semantic sentences comprising characters and keywords, each keyword corresponding to a tag, and a word list being prearranged in the storage unit (1), used for storing the address of the semantic sentence in which each word appears and/or the address of the semantic sentence in which each tag appears; an acquisition unit (2), used for acquiring speech sentences to be analysed; an indexing unit (3), being respectively connected to the storage unit (1) and the acquisition unit (2), and being used for searching the semantic sentences in the storage unit (1) on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order; and an analysis unit (4), connected to the indexing unit (3), and being used for using a fuzzy match algorithm, on the basis of the sorted candidate sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.
Description
TECHNICAL FIELD

The invention relates to the field of natural language understanding of speech, and more specifically to a system and method for highly robust semantic analysis of speech.


BACKGROUND

Speech recognition draws on several disciplines, including phonetics, linguistics, mathematical signal processing and pattern recognition. With the development of smart devices, direct and friendly interaction between people and those devices has become an important issue. Because spoken natural language is natural and convenient for users, human-computer interaction based on it has become a trend that draws increasing attention from industry. The key technology of spoken natural language interaction is the semantic understanding of spoken language, that is, analysing the user's spoken sentence to obtain the intent the user wants to express and the corresponding keywords. Generally, semantic understanding of speech is achieved by collecting or writing the corresponding semantic sentences manually and then matching the sentence to be analysed against those semantic sentences to obtain the analysis result. Most present methods of semantic analysis of speech are based on some form of grammatical matching, such as regular grammar or context-free grammar, which requires the speech sentence to be analysed to be exactly the same as a semantic sentence for the analysis to succeed. As a result, the architect constructing the semantic understanding system needs a lot of time to collect semantic sentences. In addition, inaccurate recognition by the front-end speech recognition module causes the semantic analysis to fail, and because the sentence to be analysed must be matched against a large number of semantic sentences, the analysis takes a long time and efficiency is low.


SUMMARY OF THE INVENTION

To address the deficiencies of the present methods for semantic analysis of speech, the invention provides a system and method for rapidly and accurately finding, in a large-scale semantic sentence database, sentences similar to the speech sentences to be analysed, and for providing an accurate result.


The solution is as follows:


A system for semantic analysis of speech, used for implementing semantic analysis of speech in a preset field, comprising:


a storage unit, used for storing semantic sentences in the preset field, each semantic sentence corresponding to an address, the semantic sentences comprising characters and keywords, each keyword corresponding to a tag, and a word list being prearranged in the storage unit, used for storing an address of the semantic sentence in which each word appears and/or an address of the semantic sentence in which each tag appears;


an acquisition unit, used for acquiring speech sentences to be analysed;


an indexing unit, being respectively connected to the storage unit and the acquisition unit, and being used for searching the semantic sentences in the storage unit on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order; and


an analysis unit, connected to the indexing unit, and being used for using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.


Preferably, the indexing unit comprises:


an extraction module, used for extracting a keyword in the speech sentences to be analysed, the keyword being same as that in the storage unit, and acquiring a tag corresponding to the keyword;


a substitution module, connected to the extraction module, and being used for replacing the keyword in the speech sentences to be analysed with the tag corresponding to the keyword, to form substituted speech sentences;


an indexing module, connected to the substitution module, and being used for searching in the word list in the storage unit, on the basis of the character and the tag in the substituted speech sentences, to acquire the address of the semantic sentence matching the character and/or the address of the semantic sentence matching the tag;


a sorting module, connected to the indexing module, and being used for sorting the semantic sentences matching the character and/or the semantic sentence matching the tag in the substituted speech sentences, by comparing a similarity of the substituted speech sentences, to acquire the sorted candidate semantic sentences.


Preferably, the sorting module uses a score formula to acquire a score of similarity compared by the candidate semantic sentences and the substituted speech sentences;


the score formula is:






S=(S1+S2)/2


wherein S represents the similarity score between the candidate semantic sentences and the substituted speech sentences, S1 represents the proportion of the matched characters and/or tags relative to the substituted speech sentences, and S2 represents the proportion of the matched characters and/or tags relative to the candidate semantic sentences.


Preferably, the step of the analysis unit using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed is that:


establishing a finite state automaton network, upon which the speech sentences to be analysed are rated, comparing the scores of the speech sentences to be analysed, and setting the one with the highest score as the analysis result.


Preferably, the word list is a hash table.


A method for semantic analysis of speech, applying to the system for semantic analysis of speech according to claim 1, comprising the steps of:


S1, acquiring speech sentences to be analysed;


S2, searching the semantic sentences in the storage unit on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order;


S3, using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.


Preferably, the step S2 is:


S21, extracting a keyword in the speech sentences to be analysed, the keyword being same as that in the storage unit, and acquiring a tag corresponding to the keyword;


S22, replacing the keyword in the speech sentences to be analysed with the tag corresponding to the keyword, to form substituted speech sentences;


S23, searching in the word list in the storage unit, on the basis of the character and the tag in the substituted speech sentences, to acquire an address of the semantic sentence matching the character and/or an address of the semantic sentence matching the tag;


S24, sorting the semantic sentences matching the character and/or the semantic sentence matching the tag in the substituted speech sentences, by comparing a similarity of the substituted speech sentences, to acquire the sorted candidate semantic sentences.


Preferably, the step S24 uses a score formula to acquire a score of similarity compared by the candidate semantic sentences and the substituted speech sentences;


the score formula is:






S=(S1+S2)/2;


wherein S represents the similarity score between the candidate semantic sentences and the substituted speech sentences, S1 represents the proportion of the matched characters and/or tags relative to the substituted speech sentences, and S2 represents the proportion of the matched characters and/or tags relative to the candidate semantic sentences.


Preferably, the step S3 is:


S31, establishing a finite state automaton network for each of the candidate semantic sentences;


S32, rating the speech sentences to be analysed upon the finite state automaton network;


S33, comparing scores of the speech sentences to be analysed, setting a highest score of the speech sentences to be analysed as the analysis result.


Preferably, the word list is a hash table.


The beneficial effect of the solution mentioned above is as follows:


In the system for semantic analysis of speech, the indexing unit rapidly searches out the sentences corresponding to the speech sentences to be analysed, which increases matching efficiency; the fuzzy match algorithm tolerates inconsistencies between the speech sentences to be analysed and the candidate semantic sentences, which provides fault tolerance and increases the robustness of the system. In the method for semantic analysis of speech, the sentences related to the speech sentences to be analysed can be found rapidly, which increases matching efficiency, so that sentences similar to the speech sentences to be analysed are found rapidly and accurately in a large-scale semantic sentence database and an accurate result is output.





BRIEF DESCRIPTIONS OF THE DRAWINGS

The accompanying drawings, together with the specification, illustrate exemplary embodiments of the present disclosure, and, together with the description, serve to explain the principles of the present invention.



FIG. 1 is a module diagram of the system for semantic analysis of speech according to an embodiment of the invention;



FIG. 2 is a flow diagram of the method for semantic analysis of speech according to an embodiment of the invention;



FIG. 3 is a flow diagram of searching the semantic sentences in the storage unit of the invention;



FIG. 4 is a flow diagram of analysing the speech sentences to be analysed of the invention;



FIG. 5 is a diagram of the inverted index of the sentences of the invention;



FIG. 6 is a diagram of the finite state automaton corresponding to the sentences of the invention.





DETAILED DESCRIPTION

The present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like reference numerals refer to like elements throughout.


The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” or “includes” and/or “including” or “has” and/or “having” when used herein, specify the presence of stated features, regions, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, regions, integers, steps, operations, elements, components, and/or groups thereof.


Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.


As used herein, “around”, “about” or “approximately” shall generally mean within 20 percent, preferably within 10 percent, and more preferably within 5 percent of a given value or range. Numerical quantities given herein are approximate, meaning that the term “around”, “about” or “approximately” can be inferred if not expressly stated.


As used herein, the term “plurality” means a number greater than one.


Hereinafter, certain exemplary embodiments according to the present disclosure will be described with reference to the accompanying drawings.


As shown in FIG. 1, a system for semantic analysis of speech, used for implementing semantic analysis of speech in a preset field, comprising:


a storage unit 1, used for storing semantic sentences in the preset field, each semantic sentence corresponding to an address, the semantic sentences comprising characters and keywords, each keyword corresponding to a tag, and a word list being prearranged in the storage unit 1, used for storing an address of the semantic sentence in which each word appears and/or an address of the semantic sentence in which each tag appears;


an acquisition unit 2, used for acquiring speech sentences to be analysed;


an indexing unit 3, being respectively connected to the storage unit 1 and the acquisition unit 2, and being used for searching the semantic sentences in the storage unit on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order; and


an analysis unit 4, connected to the indexing unit 3, and being used for using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.


In the embodiment, the indexing unit 3 rapidly searches out the sentences corresponding to the speech sentences to be analysed, which increases matching efficiency. The fuzzy match algorithm tolerates inconsistencies between the speech sentences to be analysed and the candidate semantic sentences during analysis, so that the architect constructing the semantic understanding system does not need to compile a large number of sentences that differ only slightly. It also provides fault tolerance for errors of the front-end speech recognition and increases the robustness of the system.


In a preferred embodiment, the indexing unit 3 comprises:


an extraction module 31, used for extracting a keyword in the speech sentences to be analysed, the keyword being same as that in the storage unit 1, and acquiring a tag corresponding to the keyword;


a substitution module 32, connected to the extraction module 31, and being used for replacing the keyword in the speech sentences to be analysed with the tag corresponding to the keyword, to form substituted speech sentences;


an indexing module 34, connected to the substitution module 32, and being used for searching in the word list in the storage unit 1, on the basis of the character and the tag in the substituted speech sentences, to acquire the address of the semantic sentence matching the character and/or the address of the semantic sentence matching the tag;


a sorting module 33, connected to the indexing module 34, and being used for sorting the semantic sentences matching the character and/or the semantic sentence matching the tag in the substituted speech sentences, by comparing a similarity of the substituted speech sentences, to acquire the sorted candidate semantic sentences.


In the embodiment, the indexing unit 3 is used for searching out the candidate semantic sentences similar to the speech sentences to be analysed when the speech sentences to be analysed are provided.


After the speech sentences to be analysed are acquired, their keywords are extracted. Keywords may be detected by the word list: all possible character strings in the speech sentences to be analysed are traversed and looked up in the word list, and for each one found, its position in the speech sentences to be analysed is recorded. Keywords may also be detected by a statistical model, for example one trained using Conditional Random Fields (CRF). The keywords in the speech sentences to be analysed are then replaced with the corresponding tags, and the tags and the unreplaced characters are searched. In the embodiment, each character or tag is looked up in the word list to obtain the addresses of the semantic sentences in which it appears, and the number of characters and tags matched between each semantic sentence and the sentence being searched is recorded. The results are sorted by similarity score, and the sentence with the highest score is selected as the candidate semantic sentence.
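
The word-list detection and tag substitution described above can be sketched as follows. This is only an illustrative sketch: it assumes the sentence has already been split into whitespace-separated tokens, and the keyword dictionary `KEYWORD_TAGS` and the function names `find_keywords` and `substitute` are hypothetical, not part of the disclosure. The CRF-based detector is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword list: keyword token sequence -> tag (illustrative data only).
KEYWORD_TAGS: Dict[Tuple[str, ...], str] = {
    ("zhang", "shan"): "$name",
    ("li", "si"): "$name",
}

Hit = Tuple[int, int, str]  # (start, end, tag), end is exclusive


def find_keywords(tokens: List[str]) -> List[Hit]:
    """Look up every contiguous token span in the keyword list and record its position."""
    hits: List[Hit] = []
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            tag = KEYWORD_TAGS.get(tuple(tokens[start:end]))
            if tag is not None:
                hits.append((start, end, tag))
    return hits


def substitute(tokens: List[str], hits: List[Hit]) -> List[str]:
    """Replace each hit with its tag, left to right, skipping overlapping hits."""
    out: List[str] = []
    pos = 0
    for start, end, tag in sorted(hits):
        if start < pos:
            continue  # overlaps a span that was already replaced
        out.extend(tokens[pos:start])
        out.append(tag)
        pos = end
    out.extend(tokens[pos:])
    return out


tokens = "call zhang shan".split()
print(substitute(tokens, find_keywords(tokens)))  # ['call', '$name']
```

Scanning every span is quadratic in the sentence length, which is acceptable for short spoken commands; a trie over the keyword list would be a natural alternative for longer inputs.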


In a preferred embodiment, the sorting module 33 uses a score formula to acquire a score of similarity compared by the candidate semantic sentences and the substituted speech sentences;


the score formula is:






S=(S1+S2)/2


wherein S represents the similarity score between the candidate semantic sentences and the substituted speech sentences, S1 represents the proportion of the matched characters and/or tags relative to the substituted speech sentences, and S2 represents the proportion of the matched characters and/or tags relative to the candidate semantic sentences.
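
One possible reading of the score formula is sketched below. It assumes that "matched" means the multiset overlap of tokens (characters or tags) between the two sentences, that S1 divides that overlap by the length of the substituted speech sentence, and that S2 divides it by the length of the candidate semantic sentence. The function name `similarity` is hypothetical.

```python
from collections import Counter
from typing import List


def similarity(query: List[str], candidate: List[str]) -> float:
    """Return S = (S1 + S2) / 2 under the assumptions stated above."""
    if not query or not candidate:
        return 0.0
    # Multiset intersection: each token counts as often as it occurs in both sentences.
    matched = sum((Counter(query) & Counter(candidate)).values())
    s1 = matched / len(query)      # share of the substituted speech sentence that is matched
    s2 = matched / len(candidate)  # share of the candidate semantic sentence that is matched
    return (s1 + s2) / 2


print(similarity(["call", "$name"], ["call", "$name"]))  # 1.0
```

Averaging the two proportions penalises both a candidate that covers only part of the query and a candidate that contains much more than the query.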


In a preferred embodiment, the step of the analysis unit 4 using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed is that:


establishing a finite state automaton network, upon which the speech sentences to be analysed are rated, comparing the scores of the speech sentences to be analysed, and setting the one with the highest score as the analysis result.


In the embodiment, the analysis unit 4 can establish a finite state automaton network for each of the candidate semantic sentences. Each character or each tag functions as an arc of the finite state automaton network. FIG. 6 shows a diagram in which a sentence corresponds to a finite state automaton. The speech sentences to be analysed are analysed and rated based on the finite state automaton network. Specifically, the keywords in the speech sentences to be analysed are replaced by the corresponding tags on the basis of the keyword analysis results. Assuming the speech sentences to be analysed have n keyword analysis results, there are 2^n possible tag substitutions, since each result may or may not be replaced. The substitutions whose tag positions conflict are removed from the 2^n possibilities, and those remaining are the candidate tag-substituted sentences to be analysed. A fuzzy match is then executed between the substituted speech sentences and the finite state automaton network generated from each sentence. There are many methods for fuzzy matching, such as the method introduced in “Error-tolerant Finite-state Recognition with Applications to Morphological Analysis and Spelling Correction,” which is prior art and is not explained further here. The fuzzy match method can rapidly calculate the extent of the match via a dynamic programming algorithm. The best sentence is selected based on the score, and the corresponding analysis result is acquired.
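
The enumeration of candidate tag-substituted sentences can be illustrated with a short sketch. It assumes each keyword analysis result is given as a `(start, end, tag)` span over a token list, and that two results "conflict" when their spans overlap; the function name and the `$surname` tag are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Hit = Tuple[int, int, str]  # (start, end, tag), end is exclusive


def candidate_substitutions(tokens: List[str], hits: List[Hit]) -> List[List[str]]:
    """Enumerate all subsets of the n hits and keep those without overlapping spans."""
    results: List[List[str]] = []
    for size in range(len(hits) + 1):
        for subset in combinations(hits, size):
            ordered = sorted(subset)
            # With spans sorted by start, checking neighbours detects any overlap.
            if any(a[1] > b[0] for a, b in zip(ordered, ordered[1:])):
                continue
            out: List[str] = []
            pos = 0
            for start, end, tag in ordered:
                out.extend(tokens[pos:start])
                out.append(tag)
                pos = end
            out.extend(tokens[pos:])
            results.append(out)
    return results


tokens = ["call", "zhang", "shan"]
hits = [(1, 3, "$name"), (1, 2, "$surname")]
for sentence in candidate_substitutions(tokens, hits):
    print(sentence)
```

With the two overlapping results above, three of the four subsets survive: the unsubstituted sentence, `call $name`, and `call $surname shan`.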


Further, the analysing and rating procedure allows insertion, deletion and/or replacement operations between the speech sentences to be analysed and the semantic sentences, and the number of such operations is limited by a predetermined threshold. When the number of operations is less than the predetermined threshold, the speech sentences to be analysed match the corresponding semantic sentences. When the number of operations is more than the predetermined threshold, the speech sentences to be analysed do not match the corresponding semantic sentences.
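
The thresholded matching can be illustrated with a plain dynamic-programming edit distance over tokens. This sketch stands in for the error-tolerant finite-state recognition cited above and is not that algorithm: it covers only the case where the automaton of a semantic sentence is a simple linear chain. The text leaves open the case where the number of operations equals the threshold; the sketch assumes it counts as a match. Function names are hypothetical.

```python
from typing import List


def edit_distance(a: List[str], b: List[str]) -> int:
    """Minimum number of token insertions, deletions and replacements turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x from a
                cur[j - 1] + 1,          # insert y into a
                prev[j - 1] + (x != y),  # replace x with y, or keep it if equal
            ))
        prev = cur
    return prev[-1]


def fuzzy_match(query: List[str], candidate: List[str], threshold: int) -> bool:
    """Assumption: a distance equal to the threshold is treated as a match."""
    return edit_distance(query, candidate) <= threshold


print(fuzzy_match(["please", "call", "$name"], ["call", "$name"], threshold=1))  # True
```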


In a preferred embodiment, the word list is a hash table.


As shown in FIG. 2, there is a method for semantic analysis of speech, applying to the system for semantic analysis of speech, comprising the steps of:


S1, acquiring speech sentences to be analysed;


S2, searching the semantic sentences in the storage unit on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order;


S3, using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.


In the embodiment, by this method the sentence corresponding to the speech sentences to be analysed can be searched out rapidly, so that the match efficiency is increased, so as to find the sentence similar to the speech sentences to be analysed rapidly and accurately in a large-scale semantic sentence database and to provide an accurate result.


As shown in FIG. 3, in a preferred embodiment, the step S2 is:


S21, extracting a keyword in the speech sentences to be analysed, the keyword being same as that in the storage unit 1, and acquiring a tag corresponding to the keyword;


S22, replacing the keyword in the speech sentences to be analysed with the tag corresponding to the keyword, to form substituted speech sentences;


S23, searching in the word list in the storage unit 1, on the basis of the character and the tag in the substituted speech sentences, to acquire an address of the semantic sentence matching the character and/or an address of the semantic sentence matching the tag;


S24, sorting the semantic sentences matching the character and/or the semantic sentence matching the tag in the substituted speech sentences, by comparing a similarity of the substituted speech sentences, to acquire the sorted candidate semantic sentences.


In the embodiment, the method for semantic analysis of speech comprises two parts: an off-line phase and an on-line phase. The off-line phase comprises collecting and arranging the semantic sentences in the corresponding field according to the defined requirements. These semantic sentences conform to the speech standard, and each keyword that needs to be analysed out of a semantic sentence is represented by a tag. For example, a possible sentence in the telephone field is “call Zhang Shan.” As “Zhang Shan” is the name keyword to be analysed, the keyword is replaced with a tag: “Zhang Shan” is replaced by “$name,” so that the sentence after substitution is “call $name.” An index is built for the semantic sentences in every field. The index is common to the characters and tags in the semantic sentences, and a tag is indexed in the same way as a character. As shown in FIG. 5, an inverted hash index is used. The hash table stores all the characters and tags in the semantic sentences; each character and each tag is followed by a list, and each element in the list stores the address (ID) of a semantic sentence in which that character or tag appears.
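
A minimal sketch of the off-line index construction is given below. It assumes the semantic sentences are already tag-substituted and tokenised, and that sentence addresses are integer IDs; the function name `build_index` and the sample sentences other than “call $name” are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(sentences: Dict[int, List[str]]) -> Dict[str, List[int]]:
    """Map each character or tag to the sorted IDs of the sentences containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for sentence_id, tokens in sentences.items():
        for token in tokens:
            index[token].add(sentence_id)  # tags are indexed exactly like characters
    return {token: sorted(ids) for token, ids in index.items()}


sentences = {
    0: ["call", "$name"],
    1: ["send", "message", "to", "$name"],
}
index = build_index(sentences)
print(index["$name"])  # [0, 1]
print(index["call"])   # [0]
```

A Python `dict` is itself a hash table, so each lookup of a character or tag takes constant time on average regardless of how many semantic sentences are stored.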


The on-line phase comprises rapidly searching out, by means of the index, the candidate semantic sentences similar to the speech sentences to be analysed when they are provided. The steps are as follows:


After the speech sentences to be analysed are acquired, their keywords are extracted. Keywords may be detected by the word list: a hash index is established for each character in the word list, all possible character strings in the speech sentences to be analysed are traversed and looked up in the hash table, and for each one found, its position in the speech sentences to be analysed is recorded. Keywords may also be detected by a statistical model, for example one trained using Conditional Random Fields (CRF). The keywords in the speech sentences to be analysed are then replaced with the corresponding tags, in the same way as in the off-line phase. The tags and the unreplaced characters in the speech sentences to be analysed are searched in the index. In the embodiment, each character or tag is looked up in the inverted hash index to obtain the addresses (IDs) of the semantic sentences in which it appears, and the number of characters and tags matched between each semantic sentence and the sentence being searched is recorded. The results are sorted by similarity score, and the sentence with the highest score is selected as the candidate semantic sentence.
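
The on-line lookup can be sketched as a counting pass over the inverted index. The sketch assumes the index maps each character or tag to a list of sentence IDs, and that `lengths` gives the number of distinct characters and tags in each semantic sentence; both structures, the function name `retrieve`, and the use of distinct tokens when computing S1 and S2 are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def retrieve(index: Dict[str, List[int]],
             lengths: Dict[int, int],
             query: List[str]) -> List[Tuple[int, float]]:
    """Rank the sentences sharing at least one token with the query by S = (S1 + S2) / 2."""
    distinct = set(query)
    matched: Counter = Counter()
    for token in distinct:
        for sentence_id in index.get(token, []):
            matched[sentence_id] += 1  # one more token shared with this sentence
    scored = [
        (sid, (m / len(distinct) + m / lengths[sid]) / 2)
        for sid, m in matched.items()
    ]
    # Highest score first; ties broken by sentence ID for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


index = {"call": [0], "$name": [0, 1], "send": [1], "message": [1], "to": [1]}
lengths = {0: 2, 1: 4}
print(retrieve(index, lengths, ["call", "$name"]))  # [(0, 1.0), (1, 0.375)]
```

Only sentences that share at least one character or tag with the query are ever touched, which is what lets the search avoid comparing the query with every stored semantic sentence.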


In a preferred embodiment, the step S24 uses a score formula to acquire a score of similarity compared by the candidate semantic sentences and the substituted speech sentences;


the score formula is:






S=(S1+S2)/2;


wherein S represents the similarity score between the candidate semantic sentences and the substituted speech sentences, S1 represents the proportion of the matched characters and/or tags relative to the substituted speech sentences, and S2 represents the proportion of the matched characters and/or tags relative to the candidate semantic sentences.


As shown in FIG. 4, in a preferred embodiment, the step S3 is:


S31, establishing a finite state automaton network for each of the candidate semantic sentences;


S32, rating the speech sentences to be analysed upon the finite state automaton network;


S33, comparing scores of the speech sentences to be analysed, setting a highest score of the speech sentences to be analysed as the analysis result.


In the embodiment, a finite state automaton network can be established for each of the candidate semantic sentences. Each character or each tag functions as an arc of the finite state automaton network. FIG. 6 shows a diagram in which a sentence corresponds to a finite state automaton. The speech sentences to be analysed are analysed and rated based on the finite state automaton network. Specifically, the keywords in the speech sentences to be analysed are replaced by the corresponding tags on the basis of the keyword analysis results. Assuming the speech sentences to be analysed have n keyword analysis results, there are 2^n possible tag substitutions, since each result may or may not be replaced. The substitutions whose tag positions conflict are removed from the 2^n possibilities, and those remaining are the candidate tag-substituted sentences to be analysed. A fuzzy match is then executed between the substituted speech sentences and the finite state automaton network generated from each sentence. There are many methods for fuzzy matching, such as the method introduced in “Error-tolerant Finite-state Recognition with Applications to Morphological Analysis and Spelling Correction,” which is prior art and is not explained further here. The fuzzy match method can rapidly calculate the extent of the match via a dynamic programming algorithm. The best sentence is selected based on the score, and the corresponding analysis result is acquired.


Further, the analysing and rating procedure allows insertion, deletion and/or replacement operations between the speech sentences to be analysed and the semantic sentences, and the number of such operations is limited by a predetermined threshold. When the number of operations is less than the predetermined threshold, the speech sentences to be analysed match the corresponding semantic sentences. When the number of operations is more than the predetermined threshold, the speech sentences to be analysed do not match the corresponding semantic sentences.


The foregoing describes only the preferred embodiments of the invention and does not thereby limit the embodiments or the scope of the invention. Those skilled in the art should realize that schemes obtained from the content of the specification and figures of the invention fall within the scope of the invention.

Claims
  • 1.-11. (canceled)
  • 12. A system for semantic analysis of speech, used for implementing semantic analysis of speech in a preset field, comprising: a storage unit, used for storing semantic sentences in the preset field, each semantic sentence corresponding to an address, the semantic sentences comprising characters and keywords, each keyword corresponding to a tag, and a word list being prearranged in the storage unit, used for storing an address of the semantic sentence in which each word appears and/or an address of the semantic sentence in which each tag appears; an acquisition unit, used for acquiring speech sentences to be analysed; an indexing unit, being respectively connected to the storage unit and the acquisition unit, and being used for searching the semantic sentences in the storage unit on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order; and an analysis unit, connected to the indexing unit, and being used for using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.
  • 13. The system for semantic analysis of speech according to claim 12, wherein the indexing unit comprises: an extraction module, used for extracting a keyword in the speech sentences to be analysed, the keyword being same as that in the storage unit, and acquiring a tag corresponding to the keyword; a substitution module, connected to the extraction module, and being used for replacing the keyword in the speech sentences to be analysed with the tag corresponding to the keyword, to form a substituted speech sentences; an indexing module, connected to the substitution module, and being used for searching in the word list in the storage unit, on the basis of the character and the tag in the substituted speech sentences, to acquire the address of the semantic sentence matching the character and/or the address of the semantic sentence matching the tag; a sorting module, connected to the indexing module, and being used for sorting the semantic sentences matching the character and/or the semantic sentence matching the tag in the substituted speech sentences, by comparing a similarity of the substituted speech sentences, to acquire the sorted candidate semantic sentences.
  • 14. The system for semantic analysis of speech according to claim 13, wherein the sorting module uses a score formula to acquire a score of similarity compared by the candidate semantic sentences and the substituted speech sentences; the score formula is: S=(S1+S2)/2, wherein, S represents the score of similarity compared by the candidate semantic sentences and the substituted speech sentences, S1 represents the portion of the character and/or the tag in the candidate semantic sentences to the substituted speech sentences, S2 represents the portion of the character and/or the tag in the candidate semantic sentences to the candidate semantic sentences.
  • 15. The system for semantic analysis of speech according to claim 12, wherein the analysis unit using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed comprises: establishing a finite state automaton network, upon which the speech sentences to be analysed are rated; comparing scores of the speech sentences to be analysed; and setting a highest score of the speech sentences to be analysed as the analysis result.
  • 16. The system for semantic analysis of speech according to claim 12, wherein the word list is a hash table.
  • 17. A method for semantic analysis of speech, applying to the system for semantic analysis of speech according to claim 1, comprising the steps of: S1, acquiring speech sentences to be analysed; S2, searching the semantic sentences in the storage unit on the basis of the speech sentences to be analysed, acquiring candidate semantic sentences matching the speech sentences to be analysed and a corresponding candidate order; S3, using a fuzzy match algorithm, on the basis of the sorted candidate semantic sentences, to analyse the speech sentences to be analysed, and acquiring analysis results.
  • 18. The method for semantic analysis of speech according to claim 17, wherein the step S2 is: S21, extracting a keyword in the speech sentences to be analysed, the keyword being the same as that in the storage unit, and acquiring a tag corresponding to the keyword; S22, replacing the keyword in the speech sentences to be analysed with the tag corresponding to the keyword, to form substituted speech sentences; S23, searching in the word list in the storage unit, on the basis of the character and the tag in the substituted speech sentences, to acquire an address of the semantic sentence matching the character and/or an address of the semantic sentence matching the tag; S24, sorting the semantic sentences matching the character and/or the semantic sentences matching the tag in the substituted speech sentences, by comparing their similarity to the substituted speech sentences, to acquire the sorted candidate semantic sentences.
  • 19. The method for semantic analysis of speech according to claim 18, wherein the step S24 uses a score formula to acquire a score of similarity between the candidate semantic sentences and the substituted speech sentences; the score formula is: S=(S1+S2)/2; wherein S represents the score of similarity between the candidate semantic sentences and the substituted speech sentences, S1 represents the proportion of the character and/or the tag in the candidate semantic sentences to the substituted speech sentences, and S2 represents the proportion of the character and/or the tag in the candidate semantic sentences to the candidate semantic sentences.
  • 20. The method for semantic analysis of speech according to claim 17, wherein the step S3 is: S31, establishing a finite state automaton network for each of the candidate semantic sentences; S32, rating the speech sentences to be analysed upon the finite state automaton network; S33, comparing scores of the speech sentences to be analysed, setting a highest score of the speech sentences to be analysed as the analysis result.
  • 21. The method for semantic analysis of speech according to claim 18, wherein the word list is a hash table.
  • 22. A semantic speech analysis system, for implementing semantic analysis of speech in a preset field, comprising: a storage unit used for storing semantic sentences in the preset field, each semantic sentence corresponding to an address, the semantic sentences comprising characters and keywords, each keyword corresponding to a tag, and a word list being prearranged in the storage unit; said storage unit storing the address of the semantic sentence in which each word appears and the address of the semantic sentence in which each tag appears; an acquisition unit used for acquiring speech sentences to be analyzed; an indexing unit respectively connected to the storage unit and the acquisition unit and being used for searching the semantic sentences in the storage unit on the basis of the speech sentences to be analyzed; and an analysis unit connected to the indexing unit and using a fuzzy match algorithm on a basis of sorted candidate sentences and acquiring analysis results.
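The indexing and sorting steps recited in claims 13, 14, 18 and 19 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The sample sentences, the tag names such as `<city>`, the use of whitespace-separated words as the indexed "characters", and every function name are hypothetical. The sketch also assumes one reading of the score formula: S1 is the number of shared tokens divided by the number of distinct tokens in the substituted speech sentence, and S2 is the same count divided by the number of distinct tokens in the candidate semantic sentence.

```python
from collections import defaultdict

# Hypothetical semantic sentences, keyed by address. Tags such as "<city>"
# stand in for keywords, as in the substituted form described in the claims.
SEMANTIC_SENTENCES = {
    0: ["weather", "in", "<city>"],
    1: ["play", "<song>"],
    2: ["navigate", "to", "<city>"],
}

# Hypothetical keyword -> tag mapping held by the storage unit.
KEYWORD_TAGS = {"paris": "<city>", "yesterday": "<song>"}


def build_word_list(sentences):
    """Word list as a hash table: token or tag -> addresses where it appears."""
    word_list = defaultdict(set)
    for address, tokens in sentences.items():
        for token in tokens:
            word_list[token].add(address)
    return word_list


def substitute(tokens, keyword_tags):
    """Replace each known keyword with its tag (steps S21 and S22)."""
    return [keyword_tags.get(token, token) for token in tokens]


def similarity(candidate, query):
    """S = (S1 + S2) / 2 under the assumed reading of S1 and S2."""
    query_set, candidate_set = set(query), set(candidate)
    if not query_set or not candidate_set:
        return 0.0
    shared = len(query_set & candidate_set)
    s1 = shared / len(query_set)      # share of the substituted speech sentence
    s2 = shared / len(candidate_set)  # share of the candidate semantic sentence
    return (s1 + s2) / 2


def rank_candidates(tokens, sentences, word_list, keyword_tags):
    """Look up candidate addresses (S23) and sort them by score (S24)."""
    query = substitute(tokens, keyword_tags)
    addresses = set()
    for token in query:
        addresses |= word_list.get(token, set())
    scored = [(similarity(sentences[a], query), a) for a in addresses]
    # Highest score first; ties broken by address for a stable order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


word_list = build_word_list(SEMANTIC_SENTENCES)
ranked = rank_candidates(
    ["weather", "in", "paris", "please"],
    SEMANTIC_SENTENCES, word_list, KEYWORD_TAGS,
)
print(ranked)
```

In this sketch only sentences that share at least one token or tag with the substituted input are scored, which is the purpose of the word list: the later fuzzy matching stage then works on a short sorted candidate list instead of every stored semantic sentence.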
Priority Claims (1)
Number Date Country Kind
201510385309.1 Jun 2015 CN national
PCT Information
Filing Document Filing Date Country Kind
PCT/CN2016/085763 6/14/2016 WO 00