The invention relates generally to conflict resolution training, and more specifically to a training system and method that trains conflict-response professionals how to speak and act in ways that can aid in the de-escalation of a variety of conflict scenarios.
Law enforcement, security, and mental health professionals are frequently confronted with individuals in conflict or in an emotional situation that requires the professional to intercede with the goal of remedying or at least diminishing the conflict or emotional situation to a level of normalcy that is compliant with local laws and/or directives. It is not enough for these professionals to be “book” educated with the local laws and/or directives. That is, book educations must be reinforced with interactive situational training in order for the professionals to gain the operational expertise and interpersonal skills needed for the critical initial engagement with individuals in conflict, and the subsequent shaping and control of a potential conflict scenario.
Current conflict-situation training approaches include lecture-style classes, live role-playing sessions, and computer-based role-playing programs. However, each of these approaches falls short of being an effective conflict de-escalation training tool. For example, lecture-style classes merely provide non-interactive education in local laws and/or directives along with suggestions of how to engage an individual in a conflict or in an emotional situation and approaches to ensure governance compliance. Thus, the classes only provide “book” knowledge with no “hands-on” interactive training that is crucial if law enforcement, security, and mental health professionals are to be trained in de-escalation tactics.
Live role-playing sessions employ human actors or role players that attempt to simulate how an individual speaks and acts in a conflict or emotional situation. However, this type of training is completely dependent on how well an actor/role player plays their part and stays “in character”. Unfortunately, both of these requirements can be very difficult to achieve, especially when the actor/role player is known to the trainee or is used over and over for multiple trainees. More importantly, it is extremely difficult for a live actor to repeat their role multiple times in order to train different approaches to de-escalation of an event or type of situation. In addition, actors and role players generally have a limited repertoire of personality and response presentations, thereby limiting the breadth of conflict or emotional situation experiences that can be provided for a trainee.
Computer-based role-playing programs generally provide a trainee with an interactive training session in front of a computer and its display. The computer will typically be programmed with a variety of conflict or emotional situation events where each such event is defined by a pre-determined tree structure of operational events based on event responses. That is, a trainee’s responses during the course of an event cause the program to traverse a pre-determined path. However, these programs do not dynamically adjust the event-response tree structure and, therefore, cannot provide a trainee with a training scenario that can be different every time. Thus, the value of computer-based role-playing programs is greatly diminished after a trainee has trained a couple of times on a type of event.
Accordingly, it is an object of the present invention to provide a system and method for conflict de-escalation training.
Another object of the present invention is to provide a conflict de-escalation training system and method that dynamically teaches a conflict resolution trainee how to speak and act in ways that will likely lead an individual in crisis toward a behavior that is more in compliance with local laws and/or directives, acceptable behavior, and the training goals and standards established for the trainee.
Still another object of the present invention is to provide a conflict de-escalation training system and method that adjusts to the speech and actions of a trainee.
Yet another object of the present invention is to provide a conflict de-escalation training system and method that provides robotic speech and actions in response to the speech and actions of a trainee.
Other objects and advantages of the present invention will become more obvious hereinafter in the specification and drawings.
In accordance with the present invention, a conflict de-escalation training system includes a reproducer of sound and images and at least one device adapted to detect speech, proximity, and movement of a human trainee relative to the reproducer. At least one database stores trigger event definitions and natural language understanding (NLU) rules. An intent analyzer, executed by the at least one hardware processor, ascertains an intent of the speech of the human trainee using the NLU rules. A trigger analyzer executed by at least one hardware processor generates a trigger enable flag when one of the trigger event definitions is satisfied by at least one of the speech, the proximity, and the movement of the human trainee. A compliancy state generator executed by the at least one hardware processor determines a state value when the trigger enable flag is generated. The state value is based on at least one of the intent of the speech, the proximity of the human trainee, and the movement of the human trainee. The state value is indicative of a level of compliance of an imaginary individual being subjected to the speech, the proximity, and the movement of the human trainee. An output generator, coupled to the reproducer and executed by the at least one hardware processor, generates a facial image and a speech response for the imaginary individual based on the state value. The facial image and speech response are provided to the reproducer for output thereby.
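By way of a non-limiting illustration only, the processing chain summarized above (detection devices feeding an intent analyzer, trigger analyzer, compliancy state generator, and output generator) could be sketched as follows. The rule tables, state-change values, and function names are illustrative assumptions and are not the claimed implementation; a real system would use a full NLU model rather than keyword matching.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """What the detection devices report about the trainee."""
    speech: str          # transcribed trainee speech
    proximity_ft: float  # distance from trainee to reproducer
    moving_fast: bool    # sudden-movement indicator (e.g., from accelerometers)

# Hypothetical NLU rules mapping phrases to intents.
NLU_RULES = {"hello": "greeting", "hands up": "command", "calm": "reassurance"}

def intent_analyzer(speech):
    """Ascertain an intent of the trainee's speech using the NLU rules."""
    for phrase, intent in NLU_RULES.items():
        if phrase in speech.lower():
            return intent
    return "unknown"

def trigger_analyzer(obs, intent):
    """Return True (trigger enable flag) when a trigger definition is satisfied."""
    return intent != "unknown" or obs.proximity_ft < 3.0 or obs.moving_fast

def compliancy_state_generator(state, obs, intent):
    """Adjust the compliancy state value from intent, proximity, and movement."""
    delta = {"greeting": +1, "reassurance": +2, "command": -1}.get(intent, 0)
    if obs.proximity_ft < 3.0 or obs.moving_fast:
        delta -= 1  # crowding or sudden movement reduces compliance
    return state + delta

def output_generator(state):
    """Select a facial image / speech response for the current compliancy level."""
    return "compliant response" if state >= 0 else "non-compliant response"

obs = Observation(speech="Hello there", proximity_ft=5.0, moving_fast=False)
intent = intent_analyzer(obs.speech)
state = 0
if trigger_analyzer(obs, intent):
    state = compliancy_state_generator(state, obs, intent)
print(intent, state, output_generator(state))  # greeting 1 compliant response
```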
Other objects, features and advantages of the present invention will become apparent upon reference to the following description of the preferred embodiments and to the drawings, wherein corresponding reference characters indicate corresponding parts throughout the several views of the drawings and wherein:
The present invention is a conflict de-escalation training system and method. The term “conflict” as used herein refers to any type of interpersonal engagement between a professional (e.g., police, security personnel, mental health worker, etc.) and an individual who is in an emotional state, or who may enter an emotional state due to the presence of or engagement with the professional, or due to an external event/stimulus causing the initial emotional state. The term “de-escalation” as used herein refers to a reduction or possible prevention of the conflict by virtue of the professional’s selected use of speech and actions.
The system and method provide a human trainee (hereinafter referred to simply as “trainee”) an interactive experience with an imaginary individual (hereinafter referred to as “trainer bot”) presented on a “platform” that the trainee must initially engage and then interact with over the course of an event. Briefly, the present invention makes each engagement/interaction event a dynamic training session that allows a trainee to see in real-time how his/her speech and actions affect speech (and actions in some embodiments) of the trainer bot as the trainer bot becomes more compliant or less compliant in response to the trainee’s speech and actions. As used herein, “speech” of the trainee or trainer bot includes any words, utterances, any type of verbalized sounds, etc.
As used herein, the words “compliant” or “compliance” are defined in terms of an individual’s emotional and/or physical behavior being in compliance or non-compliance with one or more of an inquiry and/or demand of the trainee. For example, a trainee’s inquiries and/or demands could be in accordance with local laws and/or directives. The local laws and/or directives are those that the trainee is trying to advance and encourage during a conflict-type engagement and interaction with the goal of reducing or preventing conflict in the engagement/interaction. However, in other instances, a trainee may make an inquiry and/or demand that an engaged individual is not required to answer under local laws and/or directives. Since an individual’s compliance is not binary (i.e., compliance is generally not a matter of full compliance or full non-compliance), the present invention’s system and method dynamically and incrementally adjust the trainer bot’s level of compliance in real-time so that a trainee receives real-time feedback for their choice of speech and actions.
As mentioned above, the trainer bot presents on a “platform” that can be realized in a variety of ways without departing from the scope of the present invention. By way of illustration, several non-limiting platform embodiments will be described herein. At a minimum, the platform is a reproducer of sound and images that can be seen/heard by the trainee during a training session. In general, the images are facial images of a human face, and the sound is the speech (e.g., recordings, synthesized speech, etc.) emanating from the images of the human face. That is, the reproducer produces speech sounds and coordinated facial images of an imaginary individual where the speech sounds and facial images are those of the trainer bot. In some embodiments of the present invention, the facial images and sounds are coordinated with one another in a video stream of facial movements associated with the coordinated speech. In some embodiments of the present invention, the reproducer is mounted on a base with the base positioning the reproducer at a fixed or variable height above a ground surface to simulate a particular height of a trainer bot’s “face” commensurate with a person being in one of a standing, sitting, or lying down position. In some embodiments of the present invention, the base is a mobile base capable of motorized movement on a ground surface where such motorized base movement is coordinated with the sound/images on the reproducer as will be explained further below. In some embodiments of the present invention, the base can include a humanoid dummy with the reproducer being positioned at the head location of the dummy. In some embodiments of the present invention, a base’s humanoid dummy can have mechanized arms whose movements are coordinated with sound/images on the reproducer as will be explained further below.
In some embodiments of the present invention, a mobile base capable of motorized movement supports a humanoid dummy having mechanized arms where the base’s motorized movements as well as those of the mechanized arms are coordinated with sound/images on the reproducer as will be explained further below.
In some embodiments of the present invention, the conflict de-escalation training system and method will employ artificial intelligence (“AI”) and machine learning (“ML”) apparatus and methods using non-transitory computer readable media storing machine readable and executable instructions that provide AI- and ML-based conflict de-escalation training in accordance with the present invention. In terms of the apparatus and methods using such non-transitory computer readable media, the elements thereof can be realized by any combination of hardware processors and programming to implement the elements’ functionalities. Such combinations include, but are not limited to, processor-executable instructions stored on a non-transitory machine-readable medium and one or more hardware processors for executing the instructions. The instructions can be stored at the processor(s) or separate from the processor(s) without departing from the scope of the present invention. The one or more hardware processors can be co-located at local or remote locations or can be located at a combination of local and remote locations without departing from the scope of the present invention. For example, a remote special-function hardware processor could be used to improve the efficacy and efficiency of an element’s functionality. Still further and in some cases, an element’s functionality can be implemented in on-board circuitry of the trainer bot.
Referring now to the drawings and more particularly to
System 10 includes a sound and image reproducer 20, one or more speech/action detection devices 30, and one or more hardware processors 40. Reproducer 20 can be any audio/image/video reproduction device or combination of devices that can reproduce speech sounds and facial images/videos. Devices 30 detect speech 102 of trainee 100, the proximity of trainee 100 relative to reproducer 20, and movement 104 of trainee 100. By way of an illustrative example, devices 30 can include a microphone 32, a video camera 34, and proximity/motion sensors 36 (e.g., Light Detection and Ranging (LIDAR) device or sonar-based sensors). As used herein, the term “proximity” includes the sensing of threshold distances (e.g., at least 3 feet away, less than 6 feet away, etc.) and/or an actual spatial separation distance between the trainee and trainer bot. Devices 30 could also include accelerometers for detecting quick or sudden movements of trainee 100 where fast movements could be indicative of the trainee’s intent used by system 10 when generating a trainer bot’s responses as will be explained further below. Devices 30 can be completely co-located with reproducer 20, completely remotely-located with respect to reproducer 20, or can be realized by a combination of co-located and remotely-located devices without departing from the scope of the present invention. Hardware processors 40 can be any processing device, and can be completely co-located with reproducer 20, completely remotely-located with respect to reproducer 20, or can be realized by a combination of co-located and remotely-located hardware processors without departing from the scope of the present invention.
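As a non-limiting sketch of the proximity sensing just described, a raw sensor distance (e.g., from a LIDAR or sonar device) could be classified against threshold distances as well as reported as an actual spatial separation. The threshold values and function name here are illustrative assumptions.

```python
def proximity_flags(distance_ft, near_ft=3.0, far_ft=6.0):
    """Classify a measured trainee-to-reproducer distance against thresholds.

    Returns both threshold indicators (e.g., "at least 3 feet away",
    "less than 6 feet away") and the actual spatial separation distance.
    """
    return {
        "within_near": distance_ft < near_ft,  # closer than the near threshold
        "within_far": distance_ft < far_ft,    # closer than the far threshold
        "distance_ft": distance_ft,            # actual spatial separation
    }

print(proximity_flags(4.5))
# {'within_near': False, 'within_far': True, 'distance_ft': 4.5}
```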
In some embodiments of the present invention and as illustrated in
A number of non-limiting embodiments of base 50 are illustrated schematically in
Referring now to
Apparatus 400 provides the functional elements for the system and method of the present invention. The four essential elements include a trigger analyzer 410, an intent analyzer 420, a compliancy state generator 430, and an output generator 440. Each of the functional elements is executed by one or more of the above-described hardware processors 40 (
Apparatus 400 also provides databases of information used by the functional elements during an event training session. The two essential databases include a trigger definition database 450 and a “natural language understanding” (“NLU”) rules database 460. Trigger definition database 450 defines trainee speech (e.g., words, phrases, utterances, etc.) and types of trainee actions (e.g., types of body movements, movement towards/away from trainer bot, etc.) that, when detected/satisfied, cause the generation of a trigger enable flag indicating that the trainee’s speech and/or actions will impact the compliancy state of the trainer bot as will be explained further below. NLU rules database 460 can be an existing or developed set of rules that map speech words/phrases to an intent associated with the words/phrases. In some embodiments, NLU rules database 460 can be accessed from a third-party provider via the internet or other network device. In some embodiments, one or both of databases 450 and 460 can be continually updated using ML technologies applied during event training sessions. In this way, the databases adapt to local speech patterns and dialects, local laws, and/or local directives for a particular organization.
Trigger definition database 450 can be configured in a variety of ways without departing from the scope of the present invention. For example, trigger database 450 can be configured simply as a global set of trigger definitions that apply to a trainee’s speech/action intent regardless of the type of event scenario. However, in some embodiments of the present invention, trigger definition database 450 is divided into types of event scenarios (e.g., traffic stop, domestic dispute, engagement with a drunk and disorderly individual, engagement with an individual attempting to commit self-harm, etc.) where the same speech or action can have a different intent depending on the type of event scenario with which it is associated.
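One non-limiting way such a scenario-partitioned trigger definition database could be organized is sketched below. The scenario names come from the examples above; the phrase-to-intent pairings are illustrative assumptions, and the same phrase is shown mapping to different intents (or no trigger at all) depending on the active scenario.

```python
# Hypothetical scenario-partitioned trigger definitions: the same phrase can
# map to a different intent, or to no trigger, depending on the event scenario.
TRIGGER_DEFINITIONS = {
    "traffic_stop": {
        "license and registration": "lawful_request",
        "step out of the car": "escalating_command",
    },
    "domestic_dispute": {
        "step out of the car": None,  # not meaningful in this scenario
        "let's talk about this": "de_escalating_request",
    },
}

def check_trigger(scenario, speech):
    """Return the matched intent (trigger satisfied) or None (no trigger)."""
    for phrase, intent in TRIGGER_DEFINITIONS.get(scenario, {}).items():
        if intent and phrase in speech.lower():
            return intent
    return None

print(check_trigger("traffic_stop", "Step out of the car, please"))
# escalating_command
```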
Apparatus 400 also includes a trainer bot response model 470 used by output generator 440 to generate trainer bot responses. Response model 470 can range from simple database structures storing predetermined responses to a large managed model that is continually updated using machine learning. Such a managed model is similar to a script having a finite number of phrases or “responses” where each response is categorized under an intent. For example, under a “Greetings” intent, the following phrases could be included: “Hello”, “How are you”, “Hi there”, “Nice to see you”, and “What do you want”. Upon activation of the “Greetings” intent, output generator 440 could be configured to choose from one of these five phrases at random. Action-based intents and responses can be handled in a similar fashion. The creator of a scenario can be responsible for generating a model with a script full of responses (e.g., phrases, actions, or phrases and actions) that they want the trainer bot to be able to say/perform. The creation of compliant and non-compliant responses under each intent provides for a well-rounded scenario. The compliancy state value can be used to determine which model is actively running. Intents and their subsequent responses can vary from model to model.
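A minimal sketch of such a response model follows, using the “Greetings” phrases from the example above. The split of those phrases into compliant and non-compliant models, and the threshold at which the compliancy state value switches the active model, are illustrative assumptions rather than the claimed implementation.

```python
import random

# Compliant and non-compliant response models; the compliancy state value
# determines which model is actively running (threshold is an assumption).
COMPLIANT_MODEL = {
    "Greetings": ["Hello", "How are you", "Hi there", "Nice to see you"],
}
NON_COMPLIANT_MODEL = {
    "Greetings": ["What do you want"],
}

def generate_response(intent, compliancy_state):
    """Pick a random scripted phrase for the intent from the active model."""
    model = COMPLIANT_MODEL if compliancy_state >= 0 else NON_COMPLIANT_MODEL
    return random.choice(model.get(intent, ["(no scripted response)"]))
```

For example, `generate_response("Greetings", 2)` would return one of the four compliant greetings at random, while a negative compliancy state would yield the non-compliant “What do you want”.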
In general, apparatus 400 receives (trainee-generated) inputs from devices 30 and generates trainer bot responses for output to reproducer 20 and, when included in the system, one or more of motors 60A and 60B. Apparatus 400 performs these functions throughout a training event/session as will be described further below.
Referring additionally now to
At the start of a training event/session (
Following initialization of a compliancy state value for the event/session, the interactive portion of the event/session begins when a trainee speaks and/or acts/moves where such speech/actions are detected by the above-described devices 30 as indicated at block 510 in
At decision block 530, trigger analyzer 410 evaluates one or more of the trainee’s speech/utterances, the trainee’s proximity to the trainer bot, and the trainee’s movements in front of the trainer bot to see if any of the trigger definitions (stored in database 450) are satisfied, in which case a trigger enable flag is set so that processing proceeds to block 540. More specifically, trigger analyzer 410 compares one or more of the trainee’s speech, proximity, and movements with the various trigger enabling definitions in database 450. If decision block 530 indicates that a trigger definition is satisfied, processing proceeds to block 540 where a new compliancy state value is determined by compliancy state generator 430 based on the ascertained intent as will be explained further below. If no trigger definition is satisfied at decision block 530, the trigger enable flag is not set and the current compliancy state value remains the same; the trainer bot, at block 550, is then free to generate speech, base movement, and/or arm movement based on the current compliancy state value and the intent analyzed by intent analyzer 420.
Determination of the compliancy state value at block 540 can be accomplished in a variety of ways without departing from the scope of the present invention. For example, in some embodiments of the present invention, each trigger-flag-enabling intent can have a corresponding compliancy state change value (e.g., +1, +2, +3, -1, -2, -3, etc.) assigned thereto that is used to increase or decrease the current compliancy state value. In other embodiments of the present invention, an algorithm can be employed at block 540 to determine the compliancy state value. By way of an illustrative example, one such algorithm has the following four variables:
These values work in a relation-based formula to determine if the system should transition towards compliancy or non-compliancy. For example, if the value of C changes relative to the value of L (either from increasing or decreasing), the difference between L and C is checked against CT and NT. If the difference is greater than or equal to either of these thresholds, the compliancy state value will transition to match the crossed threshold, and the thresholds are updated. In still other embodiments of the present invention, changes in the thresholds can be tracked over time as a measure used to change compliancy.
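A hedged sketch of this relation-based formula follows. Because the four variables are only named in the text, their readings here are assumptions: L is taken as a baseline level recorded at the last transition, C as the current accumulated value, CT as the compliancy-transition threshold, and NT as the non-compliancy-transition threshold.

```python
def update_compliancy(L, C, CT, NT, state):
    """Transition the compliancy state when the L-to-C difference crosses a threshold.

    Returns (new_state, new_L); the baseline L is updated on a transition so
    that the thresholds are measured from the point of the last transition
    ("the thresholds are updated").
    """
    diff = C - L
    if diff >= CT:       # moved far enough toward compliancy
        return "compliant", C
    if -diff >= NT:      # moved far enough toward non-compliancy
        return "non_compliant", C
    return state, L      # no threshold crossed; keep the current state

# Trace a short session: the state transitions only when a threshold is crossed.
state, L = "neutral", 0
for C in [1, 2, 3, 2, -1]:
    state, L = update_compliancy(L, C, CT=3, NT=3, state=state)
print(state)  # non_compliant after the final large negative swing
```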
Assuming decision block 530 indicates that a trigger definition is satisfied, processing is passed to block 540 where a new compliancy state value is determined by compliancy state generator 430 based on the ascertained intent as described above. Additionally or alternatively, some embodiments of block 540 could use the proximity of the trainee to the trainer bot and/or the trainee’s movements to determine an adjustment to the compliancy state value. For the illustrated embodiment where an event/session has an initialized compliancy state value, compliancy state generator 430 could positively or negatively increment the compliancy state value based on the ascertained intent. For embodiments where there is not an initialized compliancy state value, block 540 could determine the initial compliancy state value that would then be adjusted as the event/session continued.
The new or updated compliancy state value determined at block 540 (
As mentioned above, the present invention can employ ML techniques to continually update one or more of the databases to increase the AI capabilities of the system. For example, ML techniques could be used to update response model 470 and the NLU rules database 460 to improve the system’s ability to ascertain speech intent at block 520.
In some embodiments of the present invention and as mentioned above, a trainee’s speech and actions could be recorded and stored for later playback. The trainer bot’s speech and actions could also be recorded. In this way, a trainee can review an event/session with or without a live instructor to achieve a spectator’s perspective of their handling of an interactive event. Accordingly,
The advantages of the present invention are numerous. Conflict de-escalation training is predicated on how a trainee’s speech and actions affect a trainer bot’s compliancy state-based responses. Trainee feedback is instantaneous thereby encouraging a trainee to maintain or modify their approach to a given situation “on the fly”. The system and method are readily adaptable to simple sound/image trainer bot responses as well as the more complex and realistic sound, image, and movement trainer bot responses.
Although the invention has been described relative to specific embodiments thereof, there are numerous variations and modifications that will be readily apparent to those skilled in the art of operational environment management requiring interpersonal dialogue with citizens (that can be replicated in a training situation through a trainer bot) in light of the above teachings. It is therefore to be understood that, within the scope of the appended claims, the invention may be practiced other than as specifically described.
Pursuant to 35 U.S.C. §119, the benefit of priority from provisional application 63/259,786, with a filing date of Aug. 12, 2021, is claimed for this non-provisional application.
Number | Date | Country
---|---|---
63259786 | Aug 2021 | US