This application claims the benefit of priority to Taiwan Patent Application No. 112143775, filed on Nov. 14, 2023. The entire content of the above identified application is incorporated herein by reference.
Some references, which may include patents, patent applications and various publications, may be cited and discussed in the description of this disclosure. The citation and/or discussion of such references is provided merely to clarify the description of the present disclosure and is not an admission that any such reference is “prior art” to the disclosure described herein. All references cited and discussed in this specification are incorporated herein by reference in their entireties and to the same extent as if each reference was individually incorporated by reference.
The present disclosure relates to a chat robot, and more particularly to a method and a system for processing natural language messages to have a dialogue according to user semantics, user preferences, and real-time environmental information.
Artificial intelligence (AI) is developing rapidly in various fields, and one of its applications is the natural language chatbot, which is able to process natural languages and automatically generate contents. One example is the chat generative pre-trained transformer (ChatGPT) developed by OpenAI. Such a natural language chatbot utilizes a generative artificial intelligence technology that is pre-trained on large amounts of data and then generates new data correlated with the original data. An intelligent model is created after a deep-learning process (such as a generative adversarial network (GAN)).
Taking ChatGPT as an example, the model is trained on large amounts of network data and can chat with users in natural language. However, its responses are usually standard answers generated through the learning process, and are not adapted in real time to the instant status of the user. Despite being a natural language chatbot, ChatGPT fails to provide contents correlated with the instant status of the user.
In response to the above-referenced technical inadequacies, the present disclosure provides a method for processing natural language messages and a natural-language-message processing system thereof. A user can chat with a chatbot that is implemented by natural language processing (NLP) and generative artificial intelligence (generative AI) technologies. Furthermore, by referring to a user preference and real-time environmental information, dialogue contents generated by the chatbot can be consistent with the user's personal requirements and an instant circumstance.
In the natural-language-message processing system that is implemented by a computer system, a cloud server is provided, and a processing circuit is used to perform the method for processing the natural language messages. The cloud server allows the user to initiate an online dialogue procedure via a user interface. The cloud server receives user-input contents via a dialogue interface. Then, semantic features of the user-input contents can be extracted. The semantic features, user data, and the real-time environmental information are referred to for determining the dialogue contents that are consistent with the current circumstance. The semantic features, the user data, and the real-time environmental information are then processed by a natural language model operated in the online dialogue procedure, so as to generate the dialogue contents. After being imported to the online dialogue procedure, the dialogue contents are outputted.
The content received by the system via the dialogue interface can be a text, a voice, or an audiovisual content. When the content received by the system is the voice or the audiovisual content, a textization process converts the content into the text, and a semantic analysis process is performed on the text, so as to retrieve semantic features of the text.
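The handling of the different input types can be illustrated with a minimal Python sketch. The names `UserInput`, `to_text`, `semantic_features`, and `STOPWORDS` are hypothetical and do not come from the disclosure; the `transcribe` callable stands in for whatever speech-to-text service performs the textization process, and the keyword filter is only a placeholder for a real semantic analysis process.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"what", "should", "i", "the", "a", "to"}


@dataclass
class UserInput:
    kind: str     # "text", "voice", or "audiovisual"
    payload: str  # the text itself, or a reference to the media


def to_text(item: UserInput, transcribe: Callable[[str], str]) -> str:
    """Return text for any supported input kind (the textization step)."""
    if item.kind == "text":
        return item.payload
    if item.kind in ("voice", "audiovisual"):
        # Assumption: an external service converts the media into text.
        return transcribe(item.payload)
    raise ValueError(f"unsupported input kind: {item.kind}")


def semantic_features(text: str) -> List[str]:
    """Placeholder for semantic analysis: lowercase keywords minus stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]
```

In this sketch, text passes through unchanged, while voice and audiovisual inputs are converted first, so the later steps only ever operate on text.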
In an aspect, the natural language model operated in the cloud server employs a transformer model for conducting machine translation, document summarization, and document generation, so as to generate the dialogue contents. Further, the cloud server performs a vector operation on the user-input content, the user preference, and the real-time environmental information, annotates the text, calculates a vector of each of words, and retrieves correlated contents based on vector distances between the words. Therefore, the dialogue contents that are consistent with the user preference and the real-time environmental information can be obtained.
Further, the vector operation is performed on historical dialogue records recorded in a database of the cloud server or a memory, so as to generate the dialogue contents that are consistent with the user's current emotion.
These and other aspects of the present disclosure will become apparent from the following description of the embodiment taken in conjunction with the following drawings and their captions, although variations and modifications therein may be effected without departing from the spirit and scope of the novel concepts of the disclosure.
The described embodiments may be better understood by reference to the following description and the accompanying drawings, in which:
The present disclosure is more particularly described in the following examples that are intended as illustrative only since numerous modifications and variations therein will be apparent to those skilled in the art. Like numbers in the drawings indicate like components throughout the views. As used in the description herein and throughout the claims that follow, unless the context clearly dictates otherwise, the meaning of “a,” “an” and “the” includes plural reference, and the meaning of “in” includes “in” and “on.” Titles or subtitles can be used herein for the convenience of a reader, which shall have no influence on the scope of the present disclosure.
The terms used herein generally have their ordinary meanings in the art. In the case of conflict, the present document, including any definitions given herein, will prevail. The same thing can be expressed in more than one way. Alternative language and synonyms can be used for any term(s) discussed herein, and no special significance is to be placed upon whether a term is elaborated or discussed herein. A recital of one or more synonyms does not exclude the use of other synonyms. The use of examples anywhere in this specification including examples of any terms is illustrative only, and in no way limits the scope and meaning of the present disclosure or of any exemplified term. Likewise, the present disclosure is not limited to various embodiments given herein. Numbering terms such as “first,” “second” or “third” can be used to describe various components, signals or the like, which are for distinguishing one component/signal from another one only, and are not intended to, nor should be construed to impose any substantive limitations on the components, signals or the like.
The present disclosure relates to a method for processing natural language messages and a natural-language-message processing system thereof. The method for processing the natural language messages can be operated in a cloud server for implementing the natural-language-message processing system. The cloud server provides social media services via a network, and invites users to join the social media for sharing texts, pictures, and audiovisual contents. The cloud server also provides respective chat robots (“chatbots”) in various fields, and allows the users to chat through the services provided by the cloud server. The cloud server employs artificial intelligence technologies (such as a machine learning algorithm and the natural language processing (NLP) technology) to learn data in various fields, so as to train the chatbots for providing chatting services. The cloud server also obtains a user preference by learning the activity data that the user generates in the social media. The chatbot relies on chatting semantics of the user, the user preference, and real-time environmental information to generate dialogue contents that are consistent with the user's personal requirements and current environmental features.
Regarding the system implemented by the cloud server, reference can be made to
The diagram shows a cloud server 100 that is implemented by a computer system, a database, and a network, and various functional modules are implemented through collaboration of software and hardware. As shown in the diagram, a natural language processing module 101 that is used to perform the method for processing the natural language messages is provided. The natural language processing module 101 embodies a chatbot that is capable of processing natural languages. A machine-learning module 103 is used to operate a machine-learning algorithm for training a natural language model and learning network activities of the user by a deep-learning method, so as to establish the user preference. Accordingly, the chatbot can generate the dialogue contents that are consistent with the user preference. The cloud server 100 provides an external system interface module 105 that includes circuitry and related software to connect with an external system (e.g., a first external system 111 and a second external system 112) via a network (e.g., a network 10) and retrieve data via an application program interface (API). The cloud server 100 provides a user interface module 107 that allows a user device 150 to connect with the cloud server 100 over a network connection. The cloud server 100 operates a web server for providing network services and allowing an application program executed in the user device 150 to obtain a corresponding service provided by the cloud server 100.
According to the system framework shown in the diagram, the cloud server 100 includes a built-in or an external database (such as an audiovisual database 110), and provides data services. For example, the user device 150 is allowed to access the audiovisual contents shared by other users and stored in the audiovisual database 110 via the network 10. The database also includes texts and pictures shared by other users. The cloud server 100 includes a user database 120 that is used to store user data, and the user data includes user personal data, uploaded texts, pictures, and audiovisual contents, and activity data relating to the network services provided by the cloud server 100. The activity data records the user's network activities, such as browsed contents, follows, likes, shares, and subscriptions. The activity data forms a user profile. Furthermore, the user database 120 stores and updates the user data in accordance with a time dimension when the dialogue contents are continuously produced over time. The user database 120 also records the user's historical dialogue records, which become dialogue records to be learned by the machine-learning method in the natural language model. The cloud server 100 includes a vector database 130, and the vector database 130 is used to record structured data that is formed by performing a vectorization algorithm on the various texts, pictures, and audiovisual contents. The structured data is configured to be compared for matching various personalized data.
According to the schematic diagram of the system framework shown in the diagram, the cloud server 100 retrieves data from the external system via the network 10 or a connection under a specific protocol. The external system is exemplified as the first external system 111 and the second external system 112. For example, the external system is a server set up by a government or an enterprise for providing open data. The cloud server 100 uses the external system interface module 105 to retrieve real-time data that meets a specific requirement via an application program interface (API) provided by each external system. The real-time data can be real-time weather, real-time traffic, real-time news, network information relating to a real-time location, etc.
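As a rough illustration of how an interface module might gather real-time data from several external systems, the following Python sketch calls a set of provider functions and tolerates individual failures. The function `collect_environment` and the `Provider` signature are hypothetical; the disclosure does not specify the APIs of the external systems, so each provider here is simply a callable that takes a latitude and a longitude.

```python
from typing import Any, Callable, Dict

# Assumption: each external system is wrapped in a callable that takes
# a latitude and a longitude and returns that system's real-time data.
Provider = Callable[[float, float], Any]


def collect_environment(lat: float, lon: float,
                        providers: Dict[str, Provider]) -> Dict[str, Any]:
    """Query every provider; a failing provider yields None instead of aborting."""
    info: Dict[str, Any] = {}
    for name, fetch in providers.items():
        try:
            info[name] = fetch(lat, lon)
        except Exception:
            # One unavailable external system should not block the dialogue.
            info[name] = None
    return info
```

Recording a missing value rather than raising is a design choice of this sketch: it lets the chatbot still respond when, for example, the traffic service is unreachable but the weather service is available.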
The user device 150 executes an application program provided by the cloud server 100. For example, the cloud server 100 provides a social media service, and the user device 150 executes a corresponding social media application program that accesses the social media service via the user interface module 107 of the cloud server 100. In particular, the cloud server 100 uses the natural language processing module 101 to provide a natural language chatbot, such that the user can chat with the chatbot via a dialogue interface 115. On the other hand, the cloud server 100 uses the machine-learning module 103 to learn the activity data generated when the user accesses the various services provided by the cloud server 100. The activity data is, for example, data generated when the user operates the social media application program and the dialogue interface 115. The machine-learning module 103 can learn interest features of the user, so as to establish the user data.
It is worth mentioning that the texts, the pictures, and the audiovisual contents retrieved by the cloud server 100 are not structured data, and can be converted to vectorized data by way of encoding, so as to easily acquire the meaning and facilitate data searching. Further, the vectorized data can be used to compare with a search keyword provided by the user. For example, a distance function is used to calculate a distance between the search keyword and the vectorized data stored in a database. The closer the distance is, the more relevant the data is. Therefore, the user can search data through the vector database 130.
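The distance-based search can be sketched in a few lines of Python. This is a minimal illustration that assumes cosine distance as the distance function and a small in-memory dictionary in place of the vector database 130; the names `cosine_distance` and `search`, and the three-dimensional example vectors, are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; smaller values mean more relevant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def search(query: Sequence[float],
           store: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the keys of the k stored vectors closest to the query."""
    ranked = sorted(store, key=lambda key: cosine_distance(query, store[key]))
    return ranked[:k]


# Toy vectors chosen by hand for illustration.
store = {
    "hiking video": [0.9, 0.1, 0.0],
    "cooking post": [0.1, 0.9, 0.0],
    "trail photo": [0.8, 0.2, 0.1],
}
closest = search([1.0, 0.0, 0.0], store)
```

Here the query vector points in nearly the same direction as the two outdoor items, so those are ranked ahead of the cooking post. A production vector database would typically use an approximate nearest-neighbor index rather than a full sort.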
In one of the embodiments, the vector database 130 of the cloud server 100 supports multi-mode search services for texts and images, and provides structured information. For example, the various texts, pictures, and audiovisual content are textified, and a vector operation is performed on the textified contents for obtaining the vectorized data. The vectorized data can be used in a search service. The vectorized data can be used in a natural language processing process. The natural language processing process uses the natural language model to map the vectorized data to a vector space. Taking words and phrases inputted by the user as an example, the vector operation is performed on the words and the phrases, so as to obtain word vectors.
In one embodiment of the present disclosure, the method for processing the natural language messages can be implemented as a chatbot operated in the cloud server 100. The chatbot can use the natural language to chat with the users in texts and voices. In addition to responding to the user's inputted messages, the cloud server 100 can determine the personality and habits of the user by retrieving the user data before chatting. Further, the cloud server 100 can also acquire a real-time status from the external system (e.g., the first external system 111 or the second external system 112). For example, the cloud server 100 obtains the local weather and news according to the location of the user, such that the response to the user is not only based on the user preference but also reflects an actual status.
The chatting service provided by the system can be one of the functions operated in the social media. The user activity in the social media can serve as the data for the system to learn the user preference. The user activity data also forms structured data in the system. Reference is made to
The social media platform data 21 is non-public data in the system. The system operating the method for processing the natural language messages retrieves viewer data 211 (data of the users accessing various contents provided by the cloud server), creator data 212 that the system provides about creators of the various contents, and commercial data 213 that the system provides for enterprises to create enterprise data for advertising purposes. Further, the system provides location-based services, and thus can obtain geographic location-related location data 214.
The user data 23 is non-public data in the system. The user data 23 includes data edited by the user himself. The user data 23 also includes viewer data 231 retrieved from the various user activity data. The viewer data 231 includes interest data of the users who act as viewers, and the interest data is obtained by the system through a machine-learning method. The interest data includes recent interest data, history interest data, and location-related interest data.
Creator data 232 of the user data 23 is data relating to the users who act as creators. The creator data 232 includes the content types and the locations that are of interest to the creators, as learned by the system through the machine-learning method. The locations of interest to the creators include geographic locations or specific locations within a place.
When the user is an enterprise, commercial data 233 of the user data 23 includes an enterprise commercial type and product features thereof that are obtained by the system through a machine-learning process.
The user activity data 25 is non-public data in the system. The user activity data 25 includes statistical data of activities in various services provided by the cloud server. The user activity data 25 also includes data obtained through the machine-learning process. The user activity data 25 mainly includes viewer data 251, creator data 252, and commercial data 253.
The viewer data 251 of the user activity data 25 includes viewing rates, viewing times, and activity data (such as follows, likes, comments, and subscriptions) when the users use the services provided by the cloud server. The creator data 252 of the user activity data 25 includes statistical data of the users who act as creators. The statistical data includes a quantity of followers of channels or accounts, views of the created contents, and account viewing rates. The commercial data 253 includes a quantity of followers, content views, and overall impression data obtained when the user is an enterprise.
The social media platform data 21, the user data 23, and the user activity data 25 are obtained by the cloud server that collects and learns data. The social media platform data 21, the user data 23, and the user activity data 25 form a basis for the system to operate natural language processing and a generative artificial intelligence technology, so as to implement the chatting service. The cloud server uses a processing circuit to process the above data for implementing a chatbot that can meet the requirements of personalization and instantaneousness.
In one embodiment of the present disclosure, the natural language model operated in the cloud server performs a vector operation on the content inputted by the user (via the dialogue interface), the user preference, and the real-time environmental information, annotates the texts, calculates a vector of each of the words, and queries a database for retrieving correlated contents based on vector distances among the words. Accordingly, the dialogue contents being consistent with the user preference and the real-time environmental information can be obtained. In an online dialogue procedure, a transformer model is used to conduct machine translation, document summarization, and document generation on the textized data. Reference is made to
The above-described chatting process implemented by the method for processing the natural language messages can be achieved by a graphical user interface (GUI) initiated by an application program executed in a user device. The application program can be a social media application program used to display the dialogue contents between the chatbot and the user via the graphical user interface. Reference is made to
In the flowchart shown in
According to one embodiment of the present disclosure, as shown in
On the other hand,
After that, a dialogue interface is initiated and provided for the user to input texts, pictures, or a specific audiovisual content. For example, the user can share a link to the audiovisual content. The cloud server then uses the user interface module to receive the user-input content (step S303).
According to one embodiment of the present disclosure, the online dialogue procedure implements a chatbot that applies a natural language model. The chatbot can chat with users via a dialogue interface and perform the method for processing the natural language messages on each of the user-input contents. Referring to a dialogue interface 70 shown in
In the meantime, the cloud server uses a user interface module to retrieve the user-input content. The content retrieved via the dialogue interface can be texts, voices, or audiovisual contents. When the contents to be retrieved are the voices or the audiovisual contents, the contents can be textized through a textization process, so as to be converted into texts. The texts are then processed by a semantic analysis process, so as to extract semantic features (step S305). In the above steps, the cloud server retrieves user data from a user database and the real-time environmental information from an external system, such as via the external system interface module 105 shown in
Afterwards, the semantic features that are consistent with the user-input content can be determined. For example, after a database is queried, the semantic features can be obtained by filtering. The user preference retrieved from the user data and the real-time environmental information can be obtained (step S309), and then processed by the natural language model operated in the online dialogue procedure, so as to generate the dialogue contents (step S311). The dialogue contents can then be imported to the online dialogue procedure, and outputted via the dialogue interface (step S313).
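The flow of steps S305 to S313 can be sketched as a small Python pipeline. This is only an illustrative outline: `TurnContext`, `build_context`, `render_prompt`, and `handle_turn` are hypothetical names, the `extract` and `generate` callables stand in for the semantic analysis process and the natural language model respectively, and the plain-text prompt layout is an assumption rather than a format given in the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TurnContext:
    features: List[str]           # semantic features (step S305)
    preferences: List[str]        # user preference from the user data (step S309)
    environment: Dict[str, str]   # real-time environmental information (step S309)


def build_context(user_text: str, preferences: List[str],
                  environment: Dict[str, str],
                  extract: Callable[[str], List[str]]) -> TurnContext:
    """Combine the three inputs that the model refers to."""
    return TurnContext(extract(user_text), preferences, environment)


def render_prompt(ctx: TurnContext) -> str:
    """Assumed prompt layout: one labeled line per information source."""
    env = ", ".join(f"{k}={v}" for k, v in sorted(ctx.environment.items()))
    return (f"features: {', '.join(ctx.features)}\n"
            f"preferences: {', '.join(ctx.preferences)}\n"
            f"environment: {env}")


def handle_turn(user_text: str, preferences: List[str],
                environment: Dict[str, str],
                extract: Callable[[str], List[str]],
                generate: Callable[[str], str]) -> str:
    """Produce the dialogue content for one turn (steps S311 and S313)."""
    ctx = build_context(user_text, preferences, environment, extract)
    return generate(render_prompt(ctx))
```

Passing `extract` and `generate` as parameters keeps the sketch independent of any particular model, so either can be replaced by a real implementation without changing the flow.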
Further, the natural language model operated in the cloud server uses multi-dimensional data recorded in the database or the system memory. The data also includes historical dialogue records generated in the same online dialogue procedure. In addition to the semantic features of the dialogue, the user preference, and the real-time environmental information, the historical dialogue records generated in the current online dialogue procedure are also taken into consideration (step S315) before the chatbot generates the dialogue contents (such as in step S309). This helps the natural language model generate the dialogue contents that are consistent with the current circumstance (step S311).
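One simple way to keep the historical dialogue records of a single online dialogue procedure available to the model is a bounded in-memory buffer, as in the following Python sketch. The class `DialogueSession` and its `max_turns` limit are hypothetical; the disclosure only states that the records are kept in a database or a memory.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class DialogueSession:
    """Keeps the most recent turns of one online dialogue procedure."""

    def __init__(self, max_turns: int = 20) -> None:
        # Assumption: older turns are discarded once the limit is reached.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def record(self, speaker: str, text: str) -> None:
        self._turns.append((speaker, text))

    def history(self, last_n: Optional[int] = None) -> List[Tuple[str, str]]:
        """Return all stored turns, or only the last_n most recent ones."""
        turns = list(self._turns)
        if last_n is None:
            return turns
        if last_n <= 0:
            return []
        return turns[-last_n:]
```

The turns returned by `history` can then be supplied to the model together with the semantic features, the user preference, and the real-time environmental information.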
It should be noted that the historical dialogue records generated in the same online dialogue procedure can indicate the current circumstance and reflect the user's current emotion and requirements. As shown in the embodiment of
Referring to
In one further example, reference is made to the dialogue interface 80 shown in
According to one further embodiment of the present disclosure, the dialogue contents generated by the natural language model operated in the chatbot according to semantic features of the user-input content, the user preference, and the real-time environmental information can include multiple recommendation options, multiple recommended audiovisual contents, and/or links to multiple recommended friends. Reference is made to the exemplary example shown in
In the online dialogue procedure, the dialogue interface 90 shown in
Similarly, if the user expresses his wish to watch an audiovisual content, the chatbot can provide multiple recommended audiovisual contents through the recommendation options 902. If the chatbot determines that the user intends to find friends with similar interests, the recommendation options 902 can be links to multiple recommended friends.
Further, the user uses the input field 906 to respond to these recommendation options 902 by inputting a dialogue content 903. As such, the chatbot can rely on the semantics of the dialogue content 903 to provide a dialogue content 904, and multiple recommended contents 905 can be further provided based on the semantics of the above dialogue contents. In continuance to the above example, when the user responds that he desires to choose one of the meals, the chatbot firstly acquires real-time weather and traffic from an external system and the current location of the user, and then provides restaurant options having the chosen meal. Furthermore, if it is determined that the weather is bad and there is a traffic jam on the road, the chatbot will correspondingly recommend other restaurant options that are easily accessible for the user.
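The restaurant example can be sketched as a small ranking function in Python. The `Restaurant` fields, the `recommend` function, and the rule of sorting by travel time under bad weather or heavy traffic are assumptions made for illustration; the disclosure does not specify how accessibility is measured or how the options are ordered.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Restaurant:
    name: str
    meals: Set[str]
    travel_minutes: int  # hypothetical accessibility measure
    rating: float        # hypothetical quality score


def recommend(restaurants: List[Restaurant], meal: str,
              bad_weather: bool, heavy_traffic: bool,
              limit: int = 3) -> List[str]:
    """Return restaurant names that serve the meal, ordered for current conditions."""
    candidates = [r for r in restaurants if meal in r.meals]
    if bad_weather or heavy_traffic:
        # Under poor conditions, prefer the places that are easiest to reach.
        candidates.sort(key=lambda r: r.travel_minutes)
    else:
        # Otherwise prefer the best-rated places.
        candidates.sort(key=lambda r: r.rating, reverse=True)
    return [r.name for r in candidates[:limit]]


places = [
    Restaurant("Noodle House", {"ramen"}, 25, 4.8),
    Restaurant("Corner Ramen", {"ramen"}, 5, 4.1),
    Restaurant("Pasta Bar", {"pasta"}, 3, 4.5),
]
```

With these sample values, the same request for ramen yields a different first recommendation depending on whether the real-time conditions are favorable.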
Reference is made to
In the flowchart of
It should be noted that, according to certain embodiments of the method for processing the natural language messages provided in the present disclosure, an artificial intelligence technology is used to learn natural languages and conduct natural language understanding on the natural languages for categorizing the texts and performing semantic analysis on the texts. When the user-input dialogue contents are being processed, a deep-learning method of a transformer model (introduced by Google Brain researchers in 2017) is incorporated to process the user-input natural language contents, which have a time-sequence attribute. If the user-input contents are not texts, the contents are first converted into texts through the textization process. Thus, in the online dialogue procedure, the transformer model can be used to conduct machine translation, document summarization, and document generation.
After the semantic features of the dialogue contents inputted by the user are extracted, in view of the user preference and the current location of the user, or the interested locations analyzed from the dialogue contents of the user, the system also obtains the real-time environmental information from an external system based on the current location (step S407). It should be noted that the real-time environmental information includes one or any combination of real-time weather, real-time traffic, real-time news, and network information relating to a real-time location (e.g., POIs on a map or evaluations of POIs) obtained from one or more external systems.
Afterwards, the system uses the vector database to calculate a closest answer based on the semantic features, the user preference, and the real-time environmental information, optionally together with the historical dialogue records (step S409). It is worth mentioning that the data in the vector database is structured data obtained by the vector operation, and the system can rely on the vector distances to acquire the words with similar semantics from the data in the vector database. For example, the vector distance between the word “computer” appearing in the dialogue content and the word “computation” in the vector database is shorter than the vector distance between the words “computer” and “running”.
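The ordering of the distances in this example can be reproduced with a toy Python sketch. As a loudly stated assumption, the sketch represents each word by counts of its character trigrams, which captures overlap in spelling only; it is not the vector operation of the disclosure, and a real system would use learned embeddings that reflect meaning. The names `trigram_vector` and `distance` are hypothetical.

```python
import math
from collections import Counter


def trigram_vector(word: str) -> Counter:
    """Toy word vector: counts of the word's character trigrams."""
    w = word.lower()
    return Counter(w[i:i + 3] for i in range(len(w) - 2))


def distance(a: str, b: str) -> float:
    """Cosine distance between the toy vectors of two words."""
    va, vb = trigram_vector(a), trigram_vector(b)
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # words shorter than three characters have no trigrams
    return 1.0 - dot / (norm_a * norm_b)


near = distance("computer", "computation")
far = distance("computer", "running")
```

Because “computer” and “computation” share the trigrams “com”, “omp”, “mpu”, and “put”, while “computer” and “running” share none, `near` is smaller than `far` in this toy representation.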
In the present embodiment, the vector operation is performed on the user-input content, the interested contents of the user, and the real-time environmental information (plus the historical dialogue records if required), and then the texts obtained by the vector operation are annotated. The vector of each of the words is calculated. The correlated contents are then obtained based on the vector distances between the words in the texts. Accordingly, the dialogue contents that are consistent with the user preference and the real-time environmental information can be obtained. In one further embodiment of the present disclosure, when the vector operation is performed on the historical dialogue records, the chatbot can generate the dialogue contents that are consistent with the user's current emotion. For example, the same topic in the same online dialogue procedure can be continued, and the corresponding wordings that are consistent with the emotion can be used.
Further, the system can obtain the audiovisual contents that are consistent with the interests of the user by querying the audiovisual database based on the above-mentioned information (step S411). The chatbot uses the natural language processing technology to process the user-input contents, and uses a generative artificial intelligence technology to generate the dialogue contents (step S413). The dialogue contents are then outputted via the dialogue interface (step S415). In one further aspect, the above steps are repeated in the online dialogue procedure, and the chatbot can use a natural language in texts or voices to chat with the user and provide contents (e.g., audiovisual contents or texts) that match the user's interests and the current circumstance.
The foregoing description of the exemplary embodiments of the disclosure has been presented only for the purposes of illustration and description and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Many modifications and variations are possible in light of the above teaching.
The embodiments were chosen and described in order to explain the principles of the disclosure and their practical application so as to enable others skilled in the art to utilize the disclosure and various embodiments and with various modifications as are suited to the particular use contemplated. Alternative embodiments will become apparent to those skilled in the art to which the present disclosure pertains without departing from its spirit and scope.
| Number | Date | Country | Kind |
|---|---|---|---|
| 112143775 | Nov 2023 | TW | national |