This application is based on and claims priority under 35 U.S.C. § 119 of an Indian patent application number 201741030547, filed on Aug. 29, 2017, in the Indian Intellectual Property Office and an Indian patent application number 201741030547, filed on Aug. 21, 2018, in the Indian Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entirety.
The present disclosure relates to electronic devices. More particularly, the disclosure relates to a method and an electronic device for providing cognitive semiotics based multimodal predictions.
In general, electronic devices dominate all aspects of modern life. Over a period of time, the manner in which electronic devices display information on a user interface has become intelligent, efficient, and less obtrusive.
Electronic devices, such as, for example, a mobile phone, a portable game console, or the like, provide a user interface that includes an on-screen keyboard which allows a user to enter input (i.e., text) into the user interface by touching virtual keys displayed on a touch screen display. Further, various electronic messaging systems allow users to communicate with each other using one or more different types of communication media, such as text, emoticons, icons, images, video, and/or audio. Using such electronic methods, many electronic messaging systems allow users to communicate quickly with other users.
Electronic messaging systems that include the ability to send text messages allow a sender to communicate with other users without requiring the recipient to be immediately available to respond. For example, instant messaging, SMS messaging, and similar communication methods allow a user to quickly send a text message to another user, and the recipient can view the message at any time after receiving it. Additionally, electronic messaging systems that allow users to send messages consisting primarily of text also use less network bandwidth and storage resources than other types of communication methods.
Basic predictive text input solutions have been introduced for assisting with input on an electronic device. These solutions include predicting which word a user is entering and offering a suggestion for completing the word. But these solutions can have limitations, often requiring the user to input most or all of the characters in a word before the solution suggests the word the user is trying to input.
Some conventional methods for instant messaging have the limitation that the recommendation modules and relevance modules in the electronic device do not extract the typography or the multimodal contents (e.g., ideograms, text, images, GIFs, semiotics, etc.) of the input provided by a user for instant messaging. Further, these methods do not automatically predict the next set of multimodal contents for the user based on the previous multimodal contents provided by the user.
The above information is presented as background information only to help the reader to understand the present invention. Applicants have made no determination and make no assertion as to whether any of the above might be applicable as Prior Art with regard to the present application.
Aspects of the disclosure are to address at least the above-mentioned problems and/or disadvantages and to provide at least the advantages described below.
Accordingly, an aspect of the disclosure is to provide a method and electronic device for providing cognitive semiotics based multimodal predictions.
Another aspect of the disclosure is to generate one or more context based multimodal predictions in accordance with a detected input from a language model.
Another aspect of the disclosure is to display one or more context based multimodal predictions in the electronic device.
Another aspect of the disclosure is to perform one or more actions in accordance with the detected input from a user.
Another aspect of the disclosure is to extract one or more semiotics in the language model in accordance with the user input.
Another aspect of the disclosure is to generate one or more context based multimodal predictions based on the one or more semiotics in the language model.
Another aspect of the disclosure is to modify a layout of a touch screen keyboard for a subsequent input based on the detected input.
Another aspect of the disclosure is to provide multimodal predictions by applying rich text aesthetics based on the context of the detected input.
Another aspect of the disclosure is to provide one or more semiotic predictions in response to a received message.
Another aspect of the disclosure is to prioritize the one or more context based multimodal predictions based on the one or more semiotics in the language model.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.
In accordance with an aspect of the disclosure, a method for providing context based multimodal predictions in an electronic device is provided. The method includes detecting an input on a touch screen keyboard displayed on a screen of the electronic device. Further, the method includes generating one or more context based multimodal predictions in accordance with the detected input from a language model. Furthermore, the method includes displaying the one or more context based multimodal predictions in the electronic device.
In accordance with an aspect of the disclosure, the input comprises at least one of a text, a character, a symbol and a sequence of words.
In accordance with an aspect of the disclosure, the context based multimodal predictions comprise at least one of graphical objects, ideograms, non-textual representations, words, characters, and symbols.
In accordance with an aspect of the disclosure, the method includes performing one or more actions in accordance with the detected input.
In accordance with an aspect of the disclosure, the one or more actions include modifying a layout of the touch screen keyboard for a subsequent input based on the detected input.
In accordance with an aspect of the disclosure, the one or more actions in accordance with the detected input include at least one of providing rich text aesthetics based on the context of the detected input, switching the layout of the keyboard while detecting the user input, predicting one or more characters based on the context of the detected input, capitalizing one or more characters or one or more words based on the context of the detected input, recommending one or more suggestions in accordance with the user input, providing one or more semiotic predictions in response to a received message, and understanding text with punctuation.
In accordance with an aspect of the disclosure, generating the one or more context based multimodal predictions in accordance with the detected input from the language model includes analyzing the detected input with one or more semiotics in the language model. The method includes extracting the one or more semiotics in the language model in accordance with the user input. The method includes generating the one or more context based multimodal predictions based on the one or more semiotics in the language model. Further, the method includes feeding the one or more semiotics to the language model after the input for predicting a next set of multimodal predictions.
In accordance with an aspect of the disclosure, the language model includes representations of the multimodal predictions with semiotics data corresponding to a text obtained from a plurality of data sources. The semiotics data is classified based on a context associated with the text.
In accordance with an aspect of the disclosure, each text obtained from the plurality of data sources is represented as semiotics data in the language model for generating the one or more context based multimodal predictions.
In accordance with an aspect of the disclosure, the one or more context based multimodal predictions are prioritized based on the one or more semiotics in the language model.
In accordance with another aspect of the disclosure, the disclosure provides a method for providing context based multimodal predictions in an electronic device. The method includes generating a language model containing semiotics data corresponding to a text obtained from a plurality of data sources. The method includes detecting an input on a touch screen keyboard displayed on a screen of the electronic device. Further, the method includes generating one or more context based multimodal predictions in accordance with the detected input from the language model. Furthermore, the method includes displaying the one or more context based multimodal predictions in the electronic device.
In accordance with another aspect of the disclosure, the disclosure provides an electronic device for providing context based multimodal predictions. The electronic device includes a multimodal prediction module configured to detect an input on a touch screen keyboard displayed on a screen of the electronic device. The multimodal prediction module is configured to generate one or more context based multimodal predictions in accordance with the detected input from a language model. The multimodal prediction module is configured to display the one or more context based multimodal predictions in the electronic device.
In accordance with another aspect of the disclosure, the disclosure provides an electronic device for providing context based multimodal predictions. The electronic device includes a language model generation module and a multimodal prediction module. The language model generation module is configured to generate a language model containing semiotics data corresponding to a text obtained from a plurality of data sources. The multimodal prediction module is configured to detect an input on a touch screen keyboard displayed on a screen of the electronic device. The multimodal prediction module is configured to generate one or more context based multimodal predictions in accordance with the detected input from the language model. Further, the multimodal prediction module is configured to display the one or more context based multimodal predictions in the electronic device.
Other aspects, advantages, and salient features of the disclosure will become apparent to those skilled in the art from the following description, which, taken in conjunction with the annexed drawings, discloses various embodiments of the disclosure.
Before undertaking the DETAILED DESCRIPTION below, it may be advantageous to set forth definitions of certain words and phrases used throughout this patent document: the terms “include” and “comprise,” as well as derivatives thereof, mean inclusion without limitation; the term “or,” is inclusive, meaning and/or; the phrases “associated with” and “associated therewith,” as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, or the like; and the term “controller” means any device, system or part thereof that controls at least one operation, such a device may be implemented in hardware, firmware or software, or some combination of at least two of the same. It should be noted that the functionality associated with any particular controller may be centralized or distributed, whether locally or remotely.
Moreover, various functions described below can be implemented or supported by one or more computer programs, each of which is formed from computer readable program code and embodied in a computer readable medium. The terms “application” and “program” refer to one or more computer programs, software components, sets of instructions, procedures, functions, objects, classes, instances, related data, or a portion thereof adapted for implementation in a suitable computer readable program code. The phrase “computer readable program code” includes any type of computer code, including source code, object code, and executable code. The phrase “computer readable medium” includes any type of medium capable of being accessed by a computer, such as read only memory (ROM), random access memory (RAM), a hard disk drive, a compact disc (CD), a digital video disc (DVD), or any other type of memory. A “non-transitory” computer readable medium excludes wired, wireless, optical, or other communication links that transport transitory electrical or other signals. A non-transitory computer readable medium includes media where data can be permanently stored and media where data can be stored and later overwritten, such as a rewritable optical disc or an erasable memory device.
Definitions for certain words and phrases are provided throughout this patent document; those of ordinary skill in the art should understand that in many, if not most, instances such definitions apply to prior as well as future uses of such defined words and phrases.
The above and other aspects, features, and advantages of certain embodiments of the disclosure will be more apparent from the following description taken in conjunction with the drawings, in which:
Throughout the drawings, it should be noted that like reference numbers are used to depict the same or similar elements, features, and structures.
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of various embodiments of the disclosure as defined by the claims and their equivalents. It includes various specific details to assist in that understanding, but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the various embodiments described herein can be made without departing from the scope and spirit of the disclosure. In addition, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.
The terms and words used in the following description and claims are not limited to the bibliographical meanings, but are merely used by the inventor to enable a clear and consistent understanding of the disclosure. Accordingly, it should be apparent to those skilled in the art that the following description of various embodiments of the disclosure is provided for illustration purposes only and not for the purpose of limiting the disclosure as defined by the appended claims and their equivalents.
It is to be understood that the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a component surface” includes reference to one or more of such surfaces.
The various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.
The term “or” as used herein, refers to a non-exclusive or, unless otherwise indicated. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein can be practiced and to further enable those skilled in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
As is traditional in the field, embodiments may be described and illustrated in terms of blocks which carry out a described function or functions. These blocks, which may be referred to herein as units, modules, managers, or the like, are physically implemented by analog and/or digital circuits such as logic gates, integrated circuits, microprocessors, microcontrollers, memory circuits, passive electronic components, active electronic components, optical components, hardwired circuits, and the like, and may optionally be driven by firmware and/or software. The circuits may, for example, be embodied in one or more semiconductor chips, or on substrate supports such as printed circuit boards and the like. The circuits constituting a block may be implemented by dedicated hardware, or by a processor (e.g., one or more programmed microprocessors and associated circuitry), or by a combination of dedicated hardware to perform some functions of the block and a processor to perform other functions of the block. Each block of the embodiments may be physically separated into two or more interacting and discrete blocks without departing from the scope of the disclosure. Likewise, the blocks of the embodiments may be physically combined into more complex blocks without departing from the scope of the disclosure.
The embodiments herein provide a method for providing context based multimodal predictions in an electronic device. The method includes detecting an input on a touch screen keyboard displayed on a screen of the electronic device. Further, the method includes generating one or more context based multimodal predictions in accordance with the detected input from a language model. Furthermore, the method includes displaying the one or more context based multimodal predictions in the electronic device.
In some embodiments, the method includes generating a language model containing semiotics data corresponding to a text obtained from a plurality of data sources. The information/knowledge/text obtained from the plurality of data sources is represented as semiotics data in the language model and the semiotics data is classified based on a context associated with the text. The language model with semiotics data can be generated at the electronic device or can be generated external to the electronic device (i.e., for example at a server).
The method and system may be used to provide cognitive semiotics based multimodal predictions in the electronic device. With the method, the multimodal content in the data corpus collected from various sources is interpreted. The data corpus includes web data (such as blogs, posts, and other crawled website content) as well as user data (such as SMS, MMS, and email data). The data is represented as at least one semiotic for the at least one multimodal content by processing and representing the data corpus with rich annotation.
The method includes generating a tunable semiotic language model on the processed data corpus and preloading the language model in the electronic device for predicting the multimodal content while the user is typing or before the user composes the multimodal content. Furthermore, the method includes generating a user language model dynamically in the electronic device from the user-typed data.
Referring now to the drawings and more particularly to
Referring to
Referring to
The
Referring to
In the
The language model generation module 110 includes an interpreter 110a, a representation controller 110b and a semiotics modeling controller 110c.
In an embodiment, the interpreter 110a may be configured to extract knowledge, information, text, or the like from a plurality of data sources. The knowledge, information, and text include natural language text, sentences, words, phrases, or the like. In an example, the interpreter 110a may be configured to extract the knowledge and patterns of various multimodal contents, such as ideograms, text, images, GIFs, etc., in the text obtained from the plurality of data sources, which include web data (for example, blogs, websites, and SNS posts) and user data (including SMS, MMS, and email), along with multimodal contents.
In an embodiment, the representation controller 110b may be configured to represent the knowledge, information and text obtained from the plurality of data sources to corresponding semiotics data. Each text obtained from the plurality of data sources is converted to semiotics data. The representation controller 110b may be configured to identify the semiotics for the multimodal contents.
The representation controller 110b converts each text to semiotics data. An example illustration of the text which is converted to semiotics data is shown in the below table.
In an embodiment, the representation controller 110b processes and understands typography, quantity, and multimodal content (ideograms, text, images, GIFs, voice, etc.) to represent the semiotics data. The representation controller 110b processes the text with rich annotations.
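As an illustrative, non-limiting example, the rich-annotation step may be pictured as a small pre-processing pass over the corpus, as in the following Python sketch. The tag names (<NT>, <Time>, <EMO_party>) and the regular expressions are hypothetical stand-ins used only for explanation and are not the actual annotation scheme of the representation controller 110b.

    # Sketch of representing corpus text as semiotics data (rich annotation).
    # Tag names and patterns are illustrative assumptions only.
    import re

    def annotate(sentence: str) -> str:
        s = sentence.lower()
        # Quantity / ordinal numbers -> numeric-token semiotic
        s = re.sub(r"\b\d+(st|nd|rd|th)\b", "<NT>", s)
        # Clock times -> time semiotic
        s = re.sub(r"\b\d{1,2}:\d{2}\s?(am|pm)?\b", "<Time>", s)
        # A couple of ideograms -> ideogram semiotics
        s = s.replace("🎉", "<EMO_party>").replace("😊", "<EMO_smile>")
        return s

    corpus = [
        "Congrats on 7th anniversary 🎉",
        "Departure time is 8:00 PM",
    ]
    semiotic_corpus = [annotate(line) for line in corpus]
    # -> ["congrats on <NT> anniversary <EMO_party>", "departure time is <Time>"]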
The semiotics modeling controller 110c processes the semiotic data set. In some embodiments, the semiotics modeling controller 110c may be configured to prioritize the semiotics data in the semiotic data set. Thus, the semiotics modeling controller 110c generates the language model by processing and tuning the semiotics data.
In an embodiment, the multimodal prediction module 120 may be configured to generate context based multimodal predictions in accordance with the detected input from a language model. The multimodal prediction module 120 may be configured to communicate with the language model generation module 110 to identify the semiotics data corresponding to the detected input in the language model.
In an embodiment, the multimodal prediction module 120 may be configured to analyze the detected input with one or more semiotics in the language model. Further, the multimodal prediction module 120 may be configured to extract the semiotics data in the language model in accordance with the user input. After extracting the semiotics data in the language model, the multimodal prediction module 120 may be configured to generate the context based multimodal predictions based on the one or more semiotics in the language model.
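As an illustrative, non-limiting example, the analyze/extract/generate/feed-back behavior of the multimodal prediction module 120 may be summarized as in the following Python sketch. The class, the method names, and the model contents are assumptions made only for explanation.

    # Sketch of the runtime loop: analyze the input, extract its semiotics, generate
    # predictions from the semiotic language model, and feed the semiotics back for
    # the next prediction round. Names and model contents are illustrative only.
    class MultimodalPredictor:
        def __init__(self, semiotic_lm, annotate):
            self.semiotic_lm = semiotic_lm   # semiotic context -> ranked multimodal predictions
            self.annotate = annotate         # text -> semiotic representation (see sketch above)
            self.history = []                # semiotics fed back for subsequent predictions

        def on_input(self, typed_text: str):
            # Analyze the detected input and extract its semiotics.
            semiotics = self.annotate(typed_text)
            # Generate context based multimodal predictions from the language model.
            predictions = self.semiotic_lm.get(semiotics, [])
            # Feed the extracted semiotics back for predicting the next set of predictions.
            self.history.append(semiotics)
            return predictions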
The processor 130 is coupled with the multimodal prediction module 120, and the memory 140. The processor 130 is configured to execute instructions stored in the memory 140 and to perform various actions for providing the context based multimodal predictions. The memory 140 also stores instructions to be executed by the processor 130. The memory 140 may include non-volatile storage elements.
Although the
When the user inputs text in the electronic device 100, the semiotics recognition handler 120a interprets the multimodal contents of the text and identifies the semiotics associated with the multimodal contents. Further, the semiotics are stored in the semiotic language modeling manager 120b to predict the next semiotics and next words and to generate a reverse interpretation. The action manager 120c may be configured to perform one or more actions to display the predicted multimodal content on the user interface of the electronic device 100.
In an embodiment, the action manager 120c may be configured to perform one or more actions which include modifying the layout of the touch screen keyboard, providing rich text aesthetics, predicting ideograms, capitalizing words automatically or the like. The various actions performed by the action manager 120c are described in conjunction with figures in the later parts of the description.
Further, if the calculation results in a loss, the loss is propagated back to the neural network, and if there is no loss, the tunable semiotics are stored in the tunable semiotics language modeling as shown in the
Herein, the selector may be represented as a vector.
selector_c = m_c * y_i    Equation (1)
where m_c is the mask vector for a certain category c (c may be rich text, hypertext, special time and date semiotics, and so on) and y_i is the i-th training target. The selector vector is C bits long if the total number of categories of semiotics/words is C. The dot product between two vectors is represented by *.
Further, a loss coefficient may be represented as:
lossCoefficient = selector * coefficientVector    Equation (2)
where coefficientVector is the vector of non-zero coefficients for the different categories of semiotics/words. In the trivial case, all elements of coefficientVector are 1. Tuning the coefficientVector allows different categories of semiotics to be modeled differently, and it can even be set as a trainable parameter, which would allow the semiotic-assigned training corpus to dictate the coefficient terms.
Accordingly, the calculation based on loss may be represented as:
loss = Σ_{i=1}^{N} (lossCoefficient * CE(y_{p,i}, y_i)) / Σ_{i=1}^{N} (lossCoefficient)    Equation (3)
where * is a simple product, CE is the cross-entropy loss, and N training examples are considered in the embodiments of the disclosure.
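As an illustrative, non-limiting example, Equations (1) to (3) may be read as a category-weighted cross-entropy. The following NumPy sketch is one interpretation under stated assumptions: y_i is taken to be a one-hot target over a vocabulary of V semiotics/words, each m_c is a 0/1 mask over that vocabulary marking the entries of category c, and the function and variable names are invented for the example.

    # Selector-weighted cross-entropy of Equations (1)-(3), written with NumPy.
    # Shapes assumed for illustration:
    #   y_true (N, V) one-hot targets over a vocabulary of V semiotics/words
    #   y_pred (N, V) predicted probabilities
    #   masks  (C, V) row c is the mask vector m_c for category c
    #   coeff  (C,)   non-zero coefficient per category (all ones in the trivial case)
    import numpy as np

    def selector_weighted_loss(y_true, y_pred, masks, coeff, eps=1e-12):
        num = den = 0.0
        for y_i, yp_i in zip(y_true, y_pred):
            selector = masks @ y_i                  # Eq. (1): selector_c = m_c * y_i, C bits long
            loss_coeff = selector @ coeff           # Eq. (2): lossCoefficient
            ce = -np.sum(y_i * np.log(yp_i + eps))  # CE(y_p,i, y_i)
            num += loss_coeff * ce
            den += loss_coeff
        return num / max(den, eps)                  # Eq. (3): coefficient-weighted average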
At step 304, the method includes generating one or more context based multimodal predictions in accordance with the detected input from the language model. The method allows the multimodal prediction module 120 to generate the one or more context based multimodal predictions in accordance with the detected input from the language model.
At step 306, the method includes displaying the one or more context based multimodal predictions in the electronic device 100. The method allows the multimodal prediction module 120 to display the one or more context based multimodal predictions in the electronic device 100. The various example illustrations in which the electronic device 100 provides context based multimodal predictions are described in conjunction with the figures.
The various actions, acts, blocks, steps, or the like in the flow diagram 300 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some of the actions, acts, blocks, steps, or the like may be omitted, added, modified, skipped, or the like without departing from the scope of the invention.
Further, before the step 302, the method may include generating a language model containing semiotics data corresponding to a text obtained from a plurality of data sources. The method allows the language model generation module 110 to generate the language model containing semiotics data corresponding to a text obtained from a plurality of data sources.
At step 404, the method includes extracting one or more semiotics in the language model in accordance with the user input. The method allows the multimodal prediction module 120 to extract the one or more semiotics in the language model in accordance with the user input.
At step 406, the method includes generating one or more context based multimodal predictions based on the one or more semiotics in the language model. The method allows the multimodal prediction module 120 to generate the one or more context based multimodal predictions based on the one or more semiotics in the language model. Further, the method includes feeding the semiotics data back to the language model after the user input, for predicting the next set of multimodal predictions.
The various actions, acts, blocks, steps, or the like in the flow diagram 400 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some of the actions, acts, blocks, steps, or the like may be omitted, added, modified, skipped, or the like without departing from the scope of the invention.
The multimodal prediction module 120 interprets the user input (e.g., congrats on 7th anniversary, congrats on 51st anniversary). Further, the multimodal prediction module 120 identifies the semiotic for the multimodal content (e.g., congrats on <I_NT> anniversary, congrats on <B_NT> anniversary) and generates a semiotics language model which is preloaded in the electronic device 100. When the user types a message (e.g., congrats on 5th), the multimodal prediction module 120 identifies the semiotics of the typed text (e.g., 5th as <NT>) and forwards the identified <NT> to the semiotics modeling controller 110c. Further, the multimodal prediction module 120 retrieves various multimodal predictions (e.g., <I_NT> anniversary) and displays them on the user interface of the electronic device 100. Thus, the multimodal prediction module 120 predicts the words 'Anniversary', 'Birthday', and 'Season' based on the user input. The predictions are provided by applying rich text aesthetics. Thus, the predictions such as 'Anniversary', 'Birthday', and 'Season' are provided with bold and italicized aesthetics as shown in the
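As an illustrative, non-limiting example, the numeric-token flow described above may be mocked up as in the following Python sketch. The tiny in-memory model and the rendering of the bold/italic aesthetics are assumptions made for illustration; only the tags <NT>, <I_NT>, and <B_NT> come from the description above.

    # Sketch of the <NT> prediction example: the typed ordinal is mapped to the <NT>
    # semiotic, the semiotic context is looked up, and the predictions are rendered
    # with rich text aesthetics. Model contents are illustrative only.
    import re

    SEMIOTIC_LM = {
        "congrats on <NT>": [("Anniversary", "bold+italic"),
                             ("Birthday",    "bold+italic"),
                             ("Season",      "bold+italic")],
    }

    def to_semiotics(text: str) -> str:
        return re.sub(r"\b\d+(st|nd|rd|th)\b", "<NT>", text.lower())

    def predict(text: str):
        context = to_semiotics(text)   # "Congrats on 5th" -> "congrats on <NT>"
        return [f"<{style}>{word}</{style}>" for word, style in SEMIOTIC_LM.get(context, [])]

    print(predict("Congrats on 5th"))
    # ['<bold+italic>Anniversary</bold+italic>', '<bold+italic>Birthday</bold+italic>', ...]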
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
With the above described method, when the user swipes from 'O' to 'N', the multimodal prediction module 120 identifies the semiotics classified as <Time> in the language model. Thus, the multimodal prediction module 120 predicts PM, even though the user swipes from 'O' to 'N'. Thus, the multimodal prediction module 120 provides the text as DEPARTURE TIME IS 8:00 PM as shown in the
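As an illustrative, non-limiting example, this behavior may be pictured as a re-ranking of ambiguous swipe candidates by the semiotic class of the preceding context, as in the following Python sketch. The candidate set and the probabilities are invented for illustration.

    # Illustrative re-ranking of ambiguous swipe candidates ('ON' vs 'PM') using the
    # semiotic class of the preceding context. Probabilities are made-up numbers.
    CONTEXT_CLASS_LM = {
        # P(word | semiotic class of the context)
        "<Time>": {"PM": 0.62, "AM": 0.30, "ON": 0.08},
        "<None>": {"ON": 0.55, "PM": 0.05},
    }

    def context_class(text: str) -> str:
        # The context "departure time is 8:00" carries the <Time> semiotic in this sketch.
        return "<Time>" if ":" in text else "<None>"

    def rank_swipe_candidates(context: str, candidates):
        probs = CONTEXT_CLASS_LM[context_class(context)]
        return sorted(candidates, key=lambda w: probs.get(w, 0.0), reverse=True)

    print(rank_swipe_candidates("departure time is 8:00", ["ON", "PM"]))
    # ['PM', 'ON'] -- the swipe from 'O' to 'N' is resolved to PM under the <Time> context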
Referring to
The embodiments disclosed herein can be implemented using at least one software program running on at least one hardware device and performing network management functions to control the elements.
The foregoing description of the specific embodiments will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the disclosure has been shown and described with reference to various embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents.
Although the present disclosure has been described with various embodiments, various changes and modifications may be suggested to one skilled in the art. It is intended that the present disclosure encompass such changes and modifications as fall within the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
201741030547 | Aug 2017 | IN | national
201741030547 | Aug 2018 | IN | national |