The invention relates in general to artificial intelligence and, in particular, to a computer-implemented system and method for providing an artificial intelligence powered digital meeting assistant.
As Covid-19 spread across the globe, in-person contact was discouraged and, in some jurisdictions, prohibited outside of the household. Communication between people was forced to occur via other means, including online or via text and email. Even as concerns about the spread of Covid-19 lessen, with more and more of the population getting vaccinated, many meetings are still held online via an internet-based communication platform. Important information, tasks, assignments, and other data are communicated via such platforms during meetings.
Currently, some internet-based communication platforms, such as Zoom and Microsoft Teams, allow a user to record a meeting. Users can later listen to or watch the recording to obtain any missed details. However, if reviewing a particular part of the meeting is desired, a user must either watch or listen to the full meeting or attempt to locate the correct portion using fast-forward and rewind features, which is inconvenient and time consuming. Further, no analysis of topics discussed during the meeting is automatically performed. Instead, a user must generate a summary or independently analyze the subject matter.
Accordingly, a need exists for a meeting assistant that communicates with a communication platform to access and analyze data during meetings on the communication platform for generating meeting summaries and identifying action items discussed during those meetings. Preferably, the summary and action items are identified with high precision and recall. Additionally, the summary and action items can be used to populate task or project management software in an automated fashion.
A digital meeting assistant can be used to generate a summary and list of action items discussed in a meeting performed via an internet-based communication platform such as Zoom or Microsoft Teams. The summary and action items can be made available or provided to a user, such as a meeting participant, and are helpful in placing meeting material directly in front of the user. Conversational data obtained during the meeting can also be assigned to participants as speakers of the conversational data.
An embodiment provides a computer-implemented system and method for providing an artificial intelligence powered digital meeting assistant. A recording of conversational data from a meeting facilitated via an internet-based communication platform is obtained. The conversational data from the recording is transcribed and a summary of the meeting is generated based on the conversational data. A list of action items to be performed by one or more participants of the meeting is generated based on the conversational data. The summary and the list of action items are provided to the participants.
Still other embodiments of the invention will become readily apparent to those skilled in the art from the following detailed description, wherein embodiments of the invention are described by way of illustrating the best mode contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and the scope of the invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.
Covid-19 forced many businesses and organizations to adopt a work from home or hybrid work policy. Even as more of the world gets vaccinated, many organizations are still allowing employees or members to work remotely or in a hybrid fashion that includes both remote and in-office work. Accordingly, numerous meetings are still being conducted through telephone calls or more popularly, via internet-based communication platforms.
Although many of the communication platforms offer recordings of the meetings, a recording must be rewatched to identify a missed portion of the meeting, or attempts must be made to find a particular part of the meeting using the fast-forward and rewind features, both of which are inconvenient and time consuming. Summarizing meeting notes, distilling action items and task assignments, and finding salient points of discussion from meeting transcription data using advanced machine learning and natural language processing (NLP) techniques can make reviewing a meeting conducted via an internet-based communication platform simple and efficient, which in turn may lead to a higher percentage of completion of tasks assigned during the meeting.
Providing a digital assistant that automatically summarizes a meeting and generates action items is helpful for users and utilizes data from communication platforms that already exist.
A recording 25 of the meeting can be made and stored in a database 24 interconnected to the communication server 22. The recording can be processed to generate a summary of the meeting and a list of action items assigned during the meeting. A meeting server 14 can access the recording from the database 24 of the communication server 22 for storage and processing. The meeting server 14 includes modules, such as a summarizer 15, action generator 16, and searcher 17. The summarizer 15 generates a summary 20 of the meeting based on the recording or a transcription of the audio recording, while the action generator 16 generates a list of action items 21 discussed or assigned during the meeting. The summary 20 and list of action items 21 are stored in a database 18 interconnected to the meeting server 14, along with the recording 19 from the communication server 22. The searcher 17 performs a search of the summary or list of action items based on a query provided by a participant of the meeting or another user. In one embodiment, the summary and list of action items are generated for each meeting conducted via the webpage or application 13a-b. In a further embodiment, the communication server 22 and meeting server 14, as well as the databases 18, 24, can be cloud-based.
In one embodiment, each of the servers and computing devices can include a processor, such as a central processing unit (CPU), graphics processing unit (GPU), or a mixture of CPUs and GPUs, though other kinds of processors or mixtures of processors are possible. The modules can be implemented as a computer program or procedure written as source code in a conventional programming language and presented for execution by the processors as object or byte code. Alternatively, the modules can also be implemented in hardware, either as integrated circuitry or burned into read-only memory components, and each of the computing devices and servers can act as a specialized computer. For instance, when the modules are implemented as hardware, that particular hardware is specialized to perform the computations and communication described above and other computers cannot be used. Additionally, when the modules are burned into read-only memory components, the computer storing the read-only memory becomes specialized to perform the operations described above in a way that other computers cannot. The various implementations of the source code and object and byte codes can be held on a computer-readable storage medium, such as a floppy disk, hard drive, digital video disk (DVD), random access memory (RAM), read-only memory (ROM), and similar storage mediums. Other types of modules and module functions are possible, as well as other physical hardware components.
Once generated, the summary and action items can be provided to one or more participants of the meeting via a link, as a document, or as text in a message, such as an email or text.
The summary can provide a meeting participant or individual that was unable to attend the meeting with notes regarding salient topics discussed.
In a first phase, important phrases and utterances can be identified (step 42). An ensemble-based approach can be used to identify whether an utterance is summary worthy or not. For example, models such as BERT, a deep learning model that creates a contextual vectorization of every utterance, as well as GloVe and Word2Vec embeddings, can be used to make the decision of inclusion. Additionally, LexRank and TextRank, which are graph-based importance ranking algorithms, can also be used to determine which phrases or utterances should be utilized in the summary. Those phrases or utterances determined as not summary worthy can be removed (step 43).
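The following is a minimal sketch of one graph-based ranking pass of the kind LexRank and TextRank perform, assuming the utterances have already been transcribed into a list of strings; the TF-IDF similarity measure, the `keep_ratio` threshold, and the function name are illustrative assumptions rather than the claimed ensemble.

```python
# Minimal TextRank-style sketch: rank utterances by graph centrality
# and keep the top-scoring ones as "summary worthy". The TF-IDF
# similarity and keep_ratio cutoff are illustrative assumptions, not
# the claimed BERT/GloVe/Word2Vec ensemble.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_utterances(utterances, keep_ratio=0.3):
    # Build a similarity graph: nodes are utterances, edges weighted
    # by TF-IDF cosine similarity between utterance pairs.
    tfidf = TfidfVectorizer().fit_transform(utterances)
    graph = nx.from_numpy_array(cosine_similarity(tfidf))
    # PageRank centrality approximates each utterance's importance.
    scores = nx.pagerank(graph)
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = set(ranked[: max(1, int(len(utterances) * keep_ratio))])
    # Preserve the original meeting order for surviving utterances.
    return [u for i, u in enumerate(utterances) if i in keep]
```

In the full ensemble, such graph scores would presumably be combined with the deep learning model's inclusion decision before an utterance is kept or removed.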
Subsequently, co-reference resolution can be performed (step 44) as a second phase. During a conversation, a concept or a speaker is often only explicitly mentioned once during the initiation, after which they are referred to by their pronoun form. Extracting such pronoun-laden utterances out of context does not make much sense. Hence, each pronoun must be resolved to its proper noun form to make complete sense, which can be accomplished using heuristic rules and machine learning algorithms. Knowing and understanding who is speaking is important to determine the statements, views, and opinions made by each participant.
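As a narrow illustration of the heuristic-rule component, the sketch below resolves only first-person pronouns using the transcript's speaker labels; resolving third-person pronouns across turns would require the machine learning algorithms mentioned above. The data shapes and function name are assumptions.

```python
# Heuristic sketch of speaker-based pronoun resolution for a labeled
# transcript. Only first-person pronouns are resolved, to the
# utterance's own speaker; full coreference across turns would need a
# trained model. Input format is an assumption.
import re

def resolve_first_person(turns):
    """turns: list of (speaker_name, utterance_text) pairs."""
    resolved = []
    for speaker, text in turns:
        # Rewrite the possessive "my" as "<name>'s" first, then swap
        # "I"/"me" for the speaker's name so the utterance stands alone.
        text = re.sub(r"\bmy\b", speaker + "'s", text)
        text = re.sub(r"\b(I|me)\b", speaker, text)
        resolved.append((speaker, text))
    return resolved

# Example: ("Alice", "I will send my notes") ->
#          ("Alice", "Alice will send Alice's notes")
```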
In a third phase, utterance normalization can be performed (step 45). The dialog during a meeting is often in active form, which is less useful in an overview or summarization setting. A conversion from active to passive has to be performed to make the dialog presentable as a summary. The conversion is performed using a combination of a deep learning model and a classical NLP technique called Abstract Meaning Representation (AMR). A model first encodes the text of the transcription into a graph form to extract the "core meaning" from an utterance and remove all surface-level syntactic variation. Afterwards, the text is decoded back to natural language form, with the decoder biased to create passive sentences from the utterance graph.
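A sketch of the encode/decode round trip is shown below using amrlib, an open-source AMR toolkit chosen here purely for illustration (the source does not name a library, and its pretrained models must be downloaded separately). The described biasing of the decoder toward passive sentences would require a custom generation model and is not shown.

```python
# Sketch of the AMR encode/decode round trip with the amrlib toolkit
# (an assumption; the method does not specify a library). Parsing
# strips surface syntax into a graph; generation decodes it back to
# natural language. Passive-biased decoding is not implemented here.
import amrlib

stog = amrlib.load_stog_model()   # sentence-to-graph (AMR parser)
gtos = amrlib.load_gtos_model()   # graph-to-sentence (AMR generator)

utterance = "John will review the budget tomorrow."
graphs = stog.parse_sents([utterance])   # encode to AMR graph form
sentences, _ = gtos.generate(graphs)     # decode back to text
print(sentences[0])
```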
Along with the summary, the list of action items helps place important information from a meeting directly in front of the participants. Specifically, extracting action items from the conversational data with a designation of Assigner and Assignee facilitates completion of the action items by providing the Assignee a list of tasks to be performed.
Specifically, part-of-speech (POS) tags can be used to extract the action items from the conversational data, as described below. For example, multiple levels of rules and filters, which have been derived by analyzing the data and language, can be used to identify and extract the action items. The extracted items can help readers understand the crux of the meeting even if they were absent from the meeting. Furthermore, the extracted action items would serve as an assistant by reminding assigners and assignees about the tasks discussed in the meeting.
Filtering can be performed using pre-trained machine learning models and an AI-powered, rule-based system for extracting the action items, with rules derived after analyzing a significant amount of data and applied as different filters. A particular verb filter can be applied (step 51) and sentences in the transcript or recording that do not pass the filter can be removed (step 52). For example, only those sentences pass the verb filter in which a modal auxiliary verb (MD) and a base form verb (VB) are present and the MD auxiliary verb is followed by the VB verb in the sentence. Modal verbs are generally used to show whether something is believed to be certain, possible, or impossible. Modal verbs can also be used to talk about ability, ask permission, and make requests and offers. Verb form can also be helpful in identifying tasks for action items, since task assignments are frequently in the present or future tense.
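One way to sketch the verb filter is with NLTK's Penn Treebank POS tagger, as below; the exact tag logic of the claimed filter is an assumption, and NLTK's 'punkt' and 'averaged_perceptron_tagger' data packages must be downloaded first.

```python
# Sketch of the verb filter using NLTK's Penn Treebank POS tags: a
# sentence passes only if a modal auxiliary (MD) is present and is
# later followed by a base form verb (VB). Whether "followed by" must
# be immediate is an assumption; intervening adverbs are allowed here.
import nltk

def passes_verb_filter(sentence):
    tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(sentence))]
    if "MD" not in tags or "VB" not in tags:
        return False
    # The MD must precede the VB, as in "you should review the report".
    return tags.index("MD") < tags.index("VB")

assert passes_verb_filter("You should review the report by Friday.")
assert not passes_verb_filter("The report was reviewed yesterday.")
```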
A second filter, an action filter, can also be applied (step 53) to the transcript of the recording simultaneously with or after the verb filter has been applied. The verb filter may allow unnecessary items in the output. For example, if someone is asking for some kind of permission or making any type of request, the sentence would pass the verb filter but still should not be identified as an action item. These types of sentences can be filtered out using two types of filters. First, if a modal verb is followed immediately by a noun or pronoun, the sentence is most probably a question and can be filtered out. Second, if a past-form modal verb, e.g., "should" or "would", is not followed by "be", the sentence is also filtered out; e.g., sentences containing only "should" would be filtered out, but sentences containing "should be" would not be filtered out.
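The two action-filter rules can be sketched in the same POS-tag setting, again as an illustrative assumption rather than the claimed rule set:

```python
# Sketch of the action filter, applied after the verb filter.
# Rule 1: a modal immediately followed by a noun or pronoun is likely
# a question ("Could you ...?") and is dropped.
# Rule 2: "should"/"would" must be followed by "be" to survive.
import nltk

NOUN_PRON_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}

def passes_action_filter(sentence):
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    for i, (word, tag) in enumerate(tagged):
        nxt = tagged[i + 1] if i + 1 < len(tagged) else None
        # Rule 1: modal immediately followed by a noun/pronoun.
        if tag == "MD" and nxt and nxt[1] in NOUN_PRON_TAGS:
            return False
        # Rule 2: "should"/"would" not followed by "be" is dropped.
        if word.lower() in {"should", "would"}:
            if not (nxt and nxt[0].lower() == "be"):
                return False
    return True

assert not passes_action_filter("Could you share the slides?")
assert passes_action_filter("The draft should be finished by Monday.")
```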
The assigner and assignee of a task can be determined to identify the individual assigning the task and the individual assigned the task for accountability purposes. Further, if questions arise regarding the task, the identities of the assigner and assignee are helpful for follow up.
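The source does not specify the determination logic, so the following is a purely illustrative heuristic: the speaker of an action-item sentence is treated as the assigner, and the first participant named in the coreference-resolved sentence as the assignee.

```python
# Purely illustrative assigner/assignee heuristic; the actual
# determination logic is not specified by the method. Assumes the
# sentence has already been coreference-resolved to proper names.
def identify_roles(speaker, sentence, participants):
    assignee = next((p for p in participants if p in sentence), speaker)
    return {"assigner": speaker, "assignee": assignee}

roles = identify_roles(
    speaker="Alice",
    sentence="Bob should send the revised budget by Friday.",
    participants=["Alice", "Bob", "Carol"],
)
# -> {"assigner": "Alice", "assignee": "Bob"}
```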
Providing users with a meeting summary allows all meeting participants to become apprised of important points discussed without listening to the entire meeting, by intelligently extracting a succinct summary of both long and short meetings in an automated fashion. Automated task creation using structured data extracted from meeting data promotes efficient project and task management, as well as completion. As the textual summary is stored in a database, text-based searching algorithms can be used to perform intelligent search. Making all the meeting summaries and lists of action items searchable by participants brings value to everyone.
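As one example of such a text-based search algorithm (the source does not specify which is used), stored summaries can be ranked against a query by TF-IDF cosine similarity:

```python
# Minimal sketch of text-based search over stored summaries using
# TF-IDF cosine similarity, one common text-search approach; the
# method's actual search algorithm is not specified. Returns the
# top-k summaries ranked by relevance to the query.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def search_summaries(query, summaries, top_k=3):
    matrix = TfidfVectorizer().fit_transform(summaries + [query])
    # Last row is the query; compare it against every summary row.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [(summaries[i], float(scores[i])) for i in ranked]
```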
The digital assistant can also perform additional features with respect to the meeting via the internet-based communication platform, including searching a set of documents associated with the meeting.
While the invention has been particularly shown and described as referenced to the embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope of the invention.
Number | Date | Country
---|---|---
63283173 | Nov 2021 | US