Copyright Application #1-6101841221 “Political Action Figures”, Assignee Very Important Puppets Inc., of Niagara Falls N.Y.
US Trademark Application #87751076 “VIP VERY IMPORTANT PUPPETS”, Assignee Very Important Puppets Inc., of Niagara Falls N.Y.
US Trademark Application #88012274, “Artificial Intelligence Meets Bobbleheads AIXB”, Assignee Very Important Puppets Inc., of Niagara Falls N.Y.
Natural Language Processing (NLP): The area of Computer Science and Artificial Intelligence that deals with the representation and manipulation of human natural language from a computational perspective.
Natural Language Understanding (NLU): Subfield of Natural Language Processing that deals with the problems of interpreting the meaning of natural language and representing it in a computational format.
Natural Language Generation (NLG): The subfield of Natural Language Processing that deals with the problems of generating fluent natural language utterances from formal representations of information.
Application Programming Interface (API): Set of clearly defined and documented function definitions and protocols for building application software.
Parsing: In the context of Natural Language Processing, parsing is the process of automatically analyzing a natural language text to compose a formal representation of its syntactic structure (syntactic parsing) or its meaning (semantic parsing).
Neural Network: Computing paradigm that learns a statistical model from features representing positive and negative examples, and predicts features of unseen instances by generalizing its internal representation.
Feature vector: In order to be processable by automated algorithms, the features of a text or a speech must be univocally encoded in a fixed-size numeric vector, called a feature vector.
Training: In the context of supervised machine learning, training is the process of feeding a statistical model (e.g., a neural network) with a large quantity of labeled examples, in order to compute a generalized statistical model from the data.
Pragmatic: Pragmatics is the area of linguistics and semantics that studies the relationship between utterances and their context, including the speaker's intentions, beliefs and implicatures. Pragmatic features of a text are features encoding the aspects of the text relevant to pragmatics.
Tweet: Short text message broadcast publicly on the popular microblogging platform Twitter (https://twitter.com).
While the use of chatbots in commercial websites and standalone tools like Amazon Echo is becoming commonplace, all these systems focus on parsing the natural language from the user's query and producing the right answers or executing commands in the tersest, most correct language. These are desirable properties, of course, but the unidimensional scope of the technology involved creates an army of bots all with the same style of conversation. They are generic and boring, resembling old-style robots more than human-like companions with our quirks and unique style. Users hesitate to interact with these robotic personalities because the conversation does not flow. There is no skeuomorphic, or ‘oh-so-human’, experience. The elements of humor and idiosyncratic personality are missing. A more personalized, customized user interface is needed for virtual assistants.
We reference this dialogue with a recent winner of the Loebner Prize, a Turing Test competition for chatbots:
Despite the friendly interaction, the conversation lacks any human touch, such as a humorous remark. The chatbot also appears to lack empathy for its human counterpart, or interest in solving the human's problem.
We introduce a novel concept for virtual assistants whose main feature is the integration of custom personalities to generate human-like dialogue acts, together with a new way of integrating human-like personality traits within the dialogue system. Think of this as the indie version of big tech companies' robots. We developed a flexible, innovative model of personality based on a multi-dimensional representation. Our proprietary Personality Font model is trained on a large quantity of speeches, interviews and Tweets produced by actual, opinionated people. The data we collect is manually labeled by experts according to a carefully crafted set of rules, which map semantic and pragmatic features, such as function, intent, sentiment, and syntax. Machine learning algorithms are trained on our data set in order to match the input from the user to the most appropriate answer, not only in terms of accuracy but also by injecting humorous and subjective features into the conversation. The result is a library of virtual assistants who talk to you in a caricatured version of their human alter ego. This creates a much more engaging, enjoyable, human-like conversation.
Our dialogue engine is based on a supervised neural model or supervised statistical model. It is trained on a large quantity of speech acts or natural language text from different sources, such as transcriptions from famous politicians, which ensures highly opinionated and subjective content (jokes, funny remarks, insults). Our dialogue engine's encoded linguistic and pragmatic features enrich the complexity of the language and the function of the specific utterances. The data is labeled according to semantic and pragmatic features such as function, intent and sentiment. The data is then used to train a machine learning algorithm to match the linguistic and semantic features extracted from the input query with the most appropriate parameters to influence the agent's reply. This machine learning module is coupled with an innovative framework of human personality.
A neural network model is trained on our data set in order to match the input from the user to the most appropriate answer in terms of accuracy, while also introducing humorous and subjective features into the conversation. In particular, we use a Long Short-Term Memory (LSTM) recurrent neural network for this task, because it is able to learn latent structures from sequence-based input (i.e., natural language text). The LSTM is trained on the annotated data and is then used to predict the conversation-relevant features to apply to new instances of dialogue.
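The recurrence that lets an LSTM carry information across a sequence can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions: the weights are random and untrained, the gate names, function names and vector sizes are hypothetical, and a production system would rely on a trained layer from a library such as Keras rather than hand-written code.

```python
import math
import random

GATES = ("input", "forget", "output", "candidate")

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, weights, biases):
    """One LSTM time step over plain Python lists (standard gate equations)."""
    z = x + h_prev  # concatenate the current input with the previous hidden state

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(weights[gate], biases[gate])]

    i = [sigmoid(a) for a in affine("input")]        # how much new content to write
    f = [sigmoid(a) for a in affine("forget")]       # how much old memory to keep
    o = [sigmoid(a) for a in affine("output")]       # how much memory to expose
    g = [math.tanh(a) for a in affine("candidate")]  # proposed new content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def encode_sequence(vectors, hidden_size, seed=0):
    """Run an untrained LSTM over a sequence of feature vectors.

    Returns the final hidden state, which a trained model would pass to a
    classifier that predicts the conversation-relevant features.
    """
    rng = random.Random(seed)
    width = len(vectors[0]) + hidden_size
    weights = {g: [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                   for _ in range(hidden_size)] for g in GATES}
    biases = {g: [0.0] * hidden_size for g in GATES}
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in vectors:
        h, c = lstm_step(x, h, c, weights, biases)
    return h
```

Each element of `vectors` stands for the feature vector of one token, so the final hidden state summarizes the whole utterance in a fixed-size form.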
We implemented this complex model in an API, integrated with state-of-the-art Natural Language Processing functions, to allow developers to create personality-aware, customized conversational agents that can adapt to different communication scenarios. The diagram shows the architecture of the system.
We integrate two complete pipelines for Natural Language Understanding (lexical and morpho-syntactic analysis, semantic parsing, sentiment and topic modeling) and Natural Language Generation (content determination, micro-planning, surface realization). The NLU pipeline is augmented by a supervised model based on neural networks that predicts the pragmatic aspects of the language and connects to the personality matrix, so that the NLG pipeline can produce the most suitable personality for the conversation and the user. The dialogue follows a realistic pattern by including questions and remarks from the VIP to continue the conversation (also known as “threading”). How the conversation develops is also influenced by the particular conversational agent's personality.
We further introduce a formal model of human personality, based on a multi-dimensional representation of a number of features relevant to the dialogue. This framework allows us to customize the tone of the dialogue according to several parameters, and to adapt it dynamically to the specific user.
The personality model is used at both ends of the conversation. In the natural language understanding (NLU) module, the user input is analyzed and classified according to features relevant to the tone of the conversation. At generation time (NLG), the agent's replies are filtered based on its personality traits and the pragmatic features that fit the current conversation best. A carefully crafted set of rules maps the semantic and pragmatic features extracted from the text to the appropriate dimensions of the personality matrix, allowing the conversational engine to “read” the personality of the user in the NLU phase and to produce the most suitable replies in the NLG phase.
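The two roles of the personality model described above can be sketched as follows. The dimension names, the rules and the scoring scheme are hypothetical placeholders chosen for illustration; the specification does not disclose the actual rule set or the dimensions of the personality matrix.

```python
# Hypothetical personality dimensions, each scored between 0.0 and 1.0.
DIMENSIONS = ("humor", "formality", "aggressiveness")

# Hypothetical rules: an extracted (feature, value) pair nudges some dimensions.
RULES = {
    ("sentiment", "negative"): {"aggressiveness": 0.4},
    ("intent", "joke"): {"humor": 0.5},
    ("syntax", "complex"): {"formality": 0.3},
}

def read_user_personality(features):
    """NLU side: map extracted features onto the personality dimensions."""
    profile = dict.fromkeys(DIMENSIONS, 0.0)
    for feature in features:
        for dimension, delta in RULES.get(feature, {}).items():
            profile[dimension] += delta
    return profile

def pick_reply(candidates, agent_traits, user_profile):
    """NLG side: keep the candidate whose style best fits agent and user.

    candidates is a list of (text, style) pairs, where style assigns a score
    to every dimension. The target style is the agent's own traits shifted by
    what was read from the user, clamped to the range [0, 1].
    """
    target = {d: max(0.0, min(1.0, agent_traits[d] + user_profile[d]))
              for d in DIMENSIONS}

    def distance(style):
        return sum((style[d] - target[d]) ** 2 for d in DIMENSIONS)

    return min(candidates, key=lambda candidate: distance(candidate[1]))[0]
```

For example, an agent with a high humor score talking to a user who has just made a joke would prefer a teasing alarm confirmation over a neutral "The alarm is set."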
The technology we developed is implemented in an API: our tool to build personality-enhanced conversational agents using the architecture described above. By integrating our API, developers can create conversational agents, e.g. chatbots for websites, that are able to: 1) understand the topic as well as the tone of the conversation; 2) match the input with the most appropriate response; 3) include personalized remarks and conversational prompts in the interaction.
Our API implements full-fledged pipelines for NLU and NLG, comprising lexical, morphological and syntactic analysis, semantic parsing and sentiment analysis in the NLU module, and content determination, macro-planning, micro-planning and surface realization in the NLG module. We leverage available state-of-the-art libraries such as the Stanford Parser¹ and SimpleNLG² for Natural Language Processing, and high-level libraries for parallel neural computation like Keras, and connect them together by mapping their respective representations with custom code. The diagram depicts the system architecture in detail. In the NLU module, a series of tasks is carried out in sequential order: tokenization (word and sentence segmentation), morpho-syntactic analysis, lexical analysis (classification of the meaning of the words), sentiment analysis (extraction of subjective opinions and their polarity) and topic modeling (classification of the topic of the discourse). The NLG pipeline also follows a standard architecture: macro-planning (deciding what to say), micro-planning (deciding how to structure the utterance) and surface realization (transforming the content representation into a coherent linear form, i.e. a sentence). The macro-planning module is informed by the content of the input extracted by the NLU module. The micro-planning module is instead informed by the personality model.
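The flow from NLU to NLG can be sketched with toy stand-ins for each stage. The keyword lexicons, function names and the sample personality below are assumptions made for illustration only; the described system uses the Stanford Parser, SimpleNLG and neural models, none of which are reproduced here.

```python
import re

# Toy lexicons standing in for trained sentiment and topic models.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "terrible"}
TOPIC_KEYWORDS = {
    "weather": {"rain", "umbrella", "forecast"},
    "music": {"song", "music", "play"},
}

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def analyze_sentiment(tokens):
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def classify_topic(tokens):
    counts = {name: len(words & set(tokens)) for name, words in TOPIC_KEYWORDS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"

def understand(text):
    """NLU: run the analysis stages in sequential order."""
    tokens = tokenize(text)
    return {"tokens": tokens,
            "sentiment": analyze_sentiment(tokens),
            "topic": classify_topic(tokens)}

def macro_plan(analysis):
    """Decide what to say, based only on the content extracted by the NLU."""
    return {"topic": analysis["topic"],
            "acknowledge_mood": analysis["sentiment"] == "negative"}

def micro_plan(content, personality):
    """Decide how to say it; this is the step informed by the personality."""
    parts = []
    if content["acknowledge_mood"]:
        parts.append(personality["sympathy"])
    parts.append(personality["topic_lines"].get(content["topic"], personality["fallback"]))
    return parts

def realize(parts):
    """Surface realization: linearize the planned parts into one utterance."""
    return " ".join(parts)

def respond(text, personality):
    return realize(micro_plan(macro_plan(understand(text)), personality))

# Hypothetical personality used only to exercise the pipeline.
GRUMPY = {
    "sympathy": "Tough luck.",
    "topic_lines": {"weather": "Take an umbrella."},
    "fallback": "Say that again.",
}
```

Calling `respond("I hate the rain", GRUMPY)` walks the input through every stage. Swapping in a different personality dictionary changes only the micro-planning output, which mirrors the division of responsibilities described above.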
A conversational threading module is responsible for keeping the conversation alive, by including extra content at the end of the replies. These could be traditional conversational prompts (“how are you?” “fine, and you?”), requests for clarification (“look for Chinese restaurants in the area” “there are four, when do you want to have dinner?”), or more personality-driven remarks such as funny comments or friendly banter. The latter type of conversational threading is new and made possible by the integration of the personality matrix into the NLU/NLG pipeline. Specifically, the personality traits influence the function of the reasoner module that decides what type of conversational style (that includes threading) should be produced.
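A minimal sketch of such a threading step is given below. The three thread types follow the text, but the humor threshold, the example lines and the function names are hypothetical, and the real reasoner module is presumably far richer than this single rule.

```python
# Hypothetical follow-up lines for each type of conversational thread.
THREADS = {
    "prompt": ["And how are you?", "Anything else?"],
    "clarification": ["When do you want to have dinner?"],
    "banter": ["Try to keep up.", "You are welcome, by the way."],
}

def choose_thread_type(answer_complete, humor):
    """Tiny stand-in for the reasoner: pick the kind of follow-up to append."""
    if not answer_complete:
        return "clarification"  # missing information must be requested first
    return "banter" if humor >= 0.5 else "prompt"

def add_thread(reply, answer_complete, humor, turn=0):
    """Append a follow-up to the reply so that the conversation keeps going.

    turn rotates through the available lines to avoid repeating the same one.
    """
    kind = choose_thread_type(answer_complete, humor)
    options = THREADS[kind]
    return f"{reply} {options[turn % len(options)]}"
```

With these assumptions, `add_thread("There are four.", answer_complete=False, humor=0.9)` yields a clarification request regardless of the humor score, while a complete answer from a highly humorous agent ends in banter.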
In one embodiment, a personality agent is based upon a real person or fictitious character. The personality is recognizable yet dynamic due to continual training and updating of the utterance database.
In another embodiment, we developed the personality matrix, a simple yet flexible theoretical model of personality based on a multi-dimensional representation. It is a matrix of static personalities defined by a user. One application of this technology is customer service. A personality-aware conversational agent, i.e., a chatbot, is capable of identifying angry or dissatisfied customers and adjusting the conversational tone accordingly. Similarly, the agent may identify a highly educated speaker and select a personality from the “academic” dimension of the matrix. User interface and engagement are improved when interacting with a psycho-socially matched personality. Corporations which conduct a high volume of customer service calls may use the present dialogue engine to build a customized conversational agent that understands and generates natural language according to the client's personality.
The conversational devices pair with the mobile device of the user and are ready to start a conversation. They do what other general-purpose virtual assistants do, namely answering questions by looking up Wikipedia, searching for websites and videos, managing the calendar, playing music, and retrieving information such as the weather forecast. However, instead of sounding like the usual robotic know-it-all companion, they interact with their user (or between themselves!) in quirky and funny ways, thanks to their unique personality encoded in their dialogue engine.
The application also includes popular assistant functions, including alarms, calendar, weather, and sports. What's a better way to wake up than mariachi ringtones? How about Mr. President warning you to “get to work or be fired!”, with our library of pre-recorded wake up calls! Keep forgetting to bring an umbrella to work? Not if Dear Leader advises you of 80% chance of nuclear showers, with our custom weather updates! The personality models add layers of humor to dialogue while processing assistant commands. For example, rather than saying, “The alarm is set”, the agent may add an insult or a joke, such as “You better wake up or you'll be fired,” or, “That's very late, do you really want to sleep in?”, thereby increasing the engagement and enjoyment of interacting with the agent.
The application may also stream music according to the agent's reported preferences, observed dialog themes, or recurring words. In the event that the agent is modeled after a musician, the application may also stream music created by that musician. A music licensing model is described in the claims.
Other interactive conversational features include:
The inventors contemplate “digital swag”, or graphic assets associated with a conversational agent, which may be selected from a library of assets and purchased. The digital assets might include clothes, or other artifacts typically associated with the agent. A pricing strategy may price the conversational and interactive features discussed herein, thereby enabling some features as built-in and others unlocked upon purchase.
Users may instruct their device to activate sleep mode by addressing the agent by its name and uttering a command.
The user interface indicates to a user when they are engaged in an audio chat with others.
Device pairing data and conversation history may be stored on a paired device, such as a smartphone, or locally on the device, or in a remote data center. Devices may have a unique digital certificate or signature which identifies the device to a paired computer or smartphone. This affinity between account and serial number prevents unauthorized access to the user's information.
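One possible way to bind a serial number to a paired account is a challenge-response check with a keyed hash. This sketch swaps in an HMAC over a shared per-device key in place of a full digital certificate, and the function names, the key registry and the serial format are all assumptions; the specification does not say which mechanism is used.

```python
import hashlib
import hmac
import secrets

def new_challenge():
    """Fresh random challenge issued by the paired phone or computer."""
    return secrets.token_bytes(16)

def sign_challenge(serial, challenge, device_key):
    """Device side: prove possession of the key stored at pairing time."""
    message = serial.encode("utf-8") + b"|" + challenge
    return hmac.new(device_key, message, hashlib.sha256).hexdigest()

def is_authorized(serial, challenge, signature, paired_keys):
    """Phone side: accept only serial numbers paired with this account.

    paired_keys maps each paired serial number to the key shared at pairing.
    """
    device_key = paired_keys.get(serial)
    if device_key is None:
        return False  # unknown serial number: never paired with this account
    expected = sign_challenge(serial, challenge, device_key)
    return hmac.compare_digest(expected, signature)  # constant-time comparison
```

Because every check uses a new challenge, a captured signature cannot simply be replayed later, and a device whose serial number is not in `paired_keys` is rejected before any conversation history is exposed.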
The trade name for said conversational agents is ‘Personality Font’. A Personality Font is a virtual assistant trained with speech patterns from speakers belonging to defined psychological profiles. Whereas a voice font is solely an auditory representation of speech, based on voice recordings, a Personality Font governs the selection of words in a conversation. Beyond voice fonts, a Personality Font is a model designed with specific voice, syntactic, lexical, semantic, and psychological parameters. Each user-selected personality model will generate contextually appropriate conversation, so the same question will elicit different responses from each font.
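As a rough illustration of how one request can yield different replies from different fonts, a Personality Font can be pictured as a small record of parameters plus font-specific wording. The field names, the two sample fonts and the template mechanism are hypothetical; an actual font is a trained model rather than a lookup table.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class PersonalityFont:
    """Hypothetical container for the parameters that define one font."""
    name: str
    voice_font: str   # identifier of the audio voice paired with this font
    formality: float  # psychological and stylistic parameters, 0.0 to 1.0
    humor: float
    templates: Dict[str, str] = field(default_factory=dict)  # intent -> wording

    def reply(self, intent, **slots):
        """Render this font's wording for an intent, filling named slots."""
        template = self.templates.get(intent, "I have nothing to say about that.")
        return template.format(**slots)

BOSS = PersonalityFont(
    name="The Boss", voice_font="boss-voice", formality=0.2, humor=0.8,
    templates={"alarm_set": "{time} it is. You better wake up or you'll be fired."},
)
PROFESSOR = PersonalityFont(
    name="The Professor", voice_font="professor-voice", formality=0.9, humor=0.2,
    templates={"alarm_set": "Your alarm has been scheduled for {time}."},
)
```

Both fonts handle the same `alarm_set` intent, yet `BOSS.reply("alarm_set", time="7:00")` and `PROFESSOR.reply("alarm_set", time="7:00")` produce different sentences.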
The NLU pipeline labeling system is further defined by:
http://changingminds.org/techniques/conversation/articles/articles.htm
https://en.wikipedia.org/wiki/Syntactic_category
http://www1.appstate.edu/~mcgowant/grice.htm
http://ipip.ori.org/AlphabeticalItemList.htm
https://console.bluemix.net/docs/services/personality-insights/science.html#science
https://venturebeat.com/2013/10/11/how-ibms-michelle-zhou-figured-out-my-personality-from-200-tweets-interview/view-all/
https://developers.google.com/actions/design/how-conversations-work#repair
https://developers.google.com/actions/discovery/
https://developers.google.com/actions/design/principles
https://carla.umn.edu/articulation/polia/pdf files/communicative functions.pdf
https://www.aisb.org.uk/events/loebner-prize
https://www.statista.com/statistics/702926/united-states-digital-voice-assistants-survey-usage/
https://www.stanford.edu/software/ex⋅parser,shtml
https://github.com/simplenlg/simplenlg