The present disclosure relates to artificial intelligence and natural language processing (including large language models). In particular, the present disclosure relates to systems and methods for governance, risk, compliance, and cybersecurity for artificial intelligence natural language processing systems.
Natural language processing models (“NLP models”), including large language models (LLMs), receive natural language input from a human user, which can be a question or other input for which a computer-generated response is desired. The user input is referred to as a “prompt”. The natural language processing models generate a natural language output, also referred to as a “completion”, in response to the prompt. The NLP models are developed using artificial intelligence (AI), e.g., machine learning techniques, and large unstructured data sets. Over time, the models build databases that enable prediction of the probability of various possible completions based on tokens, which are words or parts of words in a natural language, such as English. After a model is built, the model can be directed to use additional resources, such as electronic documents, to complete the prompt from a user.
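By way of a simplified, non-limiting illustration, next-token prediction from counted token co-occurrences can be sketched as follows. This is a toy bigram model over a tiny corpus; production LLMs learn such probabilities with neural networks over far larger data sets:

```python
from collections import Counter, defaultdict

def build_bigram_model(corpus: str):
    """Count, for each token, how often each other token follows it.
    These counts stand in for the probability tables described above."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model, token: str):
    """Return the most probable next token after `token`, or None if the
    token was never seen in training."""
    if token not in model:
        return None
    return model[token].most_common(1)[0][0]
```

For example, after training on a corpus in which “the” is followed by “cat” more often than by any other token, `predict_next(model, "the")` returns `"cat"`.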
Known NLP modeling techniques currently suffer from a number of challenges, including vulnerabilities to attempts by users to “hack” the AI algorithms at the heart of the NLP models. Hacking efforts can be designed to extract training or other confidential information, among other purposes. Human users may also attempt to use the AI for illegitimate purposes against work policies or enter prompts that are beyond the capabilities of the AI algorithm to process.
In addition, NLP models are prone to incorrect or incomplete completions, output of potentially harmful or toxic language, the release of confidential information, and other problems.
For such models to be used in regulated industries, retention of all inputs and outputs, including prompts, completions, and other associated data, may be required for regulatory compliance as well as for internal organizational purposes.
There is a desire and need to overcome these vulnerabilities and deficiencies of the current use of NLP models, including LLMs.
A system for governance, risk, compliance, and cybersecurity for an artificial intelligence natural language processing model comprises a natural language processing (NLP) model that is configured to generate a completion result in response to a received prompt; a prompt filter module coupled to the NLP model for receiving the prompt from a user before it reaches the NLP model, the prompt filter module being configured to review the prompt for content and to determine, based on the content of the prompt, whether to pass the prompt through to the NLP model; a governance filter module coupled to the NLP model that is configured to control data sources that the NLP model uses to generate the completion; and a result filter module coupled to the NLP model and receiving completion output therefrom, the result filter module being configured to review the completion result for completeness, accuracy, and certain types of banned content, and to block the result from the user if the completion result is incomplete, inaccurate, or contains banned content.
In certain embodiments, the prompt filter module is configured to detect confidential or personal identification information (CPII), and to send the CPII to the result filter module, bypassing the NLP model. In such embodiments, the result filter module is further configured to reinsert the CPII that bypasses the NLP model into the completion result.
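By way of non-limiting illustration, the CPII bypass and reinsertion described above can be sketched as follows. The regular-expression patterns and placeholder-token format shown are hypothetical stand-ins for whatever detection method an embodiment employs (a production system might use a trained entity recognizer instead):

```python
import re

# Hypothetical detection patterns for illustration only.
CPII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def extract_cpii(prompt: str):
    """Replace detected CPII with placeholder tokens, returning the
    sanitized prompt (which goes to the NLP model) and a mapping that
    bypasses the model for later reinsertion."""
    mapping = {}
    sanitized = prompt
    for label, pattern in CPII_PATTERNS.items():
        for i, match in enumerate(pattern.findall(sanitized)):
            token = f"[{label}_{i}]"
            mapping[token] = match
            sanitized = sanitized.replace(match, token, 1)
    return sanitized, mapping

def reinsert_cpii(completion: str, mapping: dict) -> str:
    """Restore the bypassed CPII into the completion result."""
    for token, value in mapping.items():
        completion = completion.replace(token, value)
    return completion
```

In this sketch, the NLP model only ever sees placeholder tokens such as `[SSN_0]`, and the result filter module restores the original values before the completion is delivered to the user.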
It is initially noted that a “module” as used herein is a software program or group of software programs and related data that performs and/or controls a group of related processes. A module can include applications, interfaces, libraries, scripts, procedure calls, and generally any code and data that is tailored for the processes that the module performs and controls. A module can be executed using a single hardware processor, or multiple processors acting in concert. The processors can be hosted locally, externally (e.g., on the cloud) or any combination of the two. Multiple modules may be coupled together such that data and instructions can be passed between each module.
The present disclosure describes systems and methods to overcome the aforementioned deficiencies and challenges. The disclosed systems and methods utilize modules that process the content of the prompts and completions to accomplish these goals. A number of modules are configured to review and modify (e.g., block, remove, allow) prompts based on the detected presence of certain types of content, such as, but not limited to, content involving governance, risk, compliance, and cybersecurity (GRCC) issues. In an example embodiment, a system for GRCC of an artificial intelligence natural language processing large language model (AI/NLP LLM) system is disclosed. The system includes a prompt filter module, a governance filter module, a result (or completion) filter module, and a prompt/completion database. According to embodiments of the method, one or more administrators set guidelines for individual users at an organization in alignment with organizational policies.
In various implementations, the various modules of the system are configured to restrict both the user and the NLP model in terms of the prompts that can be input, the documents or other data that can be accessed, the behavior of the NLP model, and the completion result sent in response to the prompt.
Prompts that are passed are received and processed by the NLP model 210. As noted above, the processing performed by the NLP model 210 is affected by the governance filter module 115, which is configured by the administrator according to the use case. More specifically, the governance filter module 115 determines the source documents from which the NLP model 210 derives content, by selecting specific documentary content 124 and/or by triggering a database query builder 118. The database query builder, in turn, accesses information from one or more databases. Both the documentary sources and other sources derived from database queries are used by the NLP model 210 to “complete” the prompt. Embodiments of the NLP model 210 are trained to avoid incorrect completions or “hallucinations” that can occur when the model is not trained using data related to the content of the prompt.
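As a non-limiting sketch, the governance filter module's restriction of sources, and its handoff to the database query builder, might take the following form. The policy fields, document names, and query format here are hypothetical, chosen only to illustrate the control flow described above:

```python
from dataclasses import dataclass, field

@dataclass
class GovernancePolicy:
    """Hypothetical policy object populated by the administrator per use case."""
    allowed_documents: set = field(default_factory=set)
    allowed_tables: set = field(default_factory=set)

def select_sources(policy: GovernancePolicy, requested_docs, requested_tables):
    """Return only the documentary sources and database queries that the
    governance policy permits the NLP model to use for the completion."""
    docs = [d for d in requested_docs if d in policy.allowed_documents]
    queries = [f"SELECT content FROM {t}" for t in requested_tables
               if t in policy.allowed_tables]
    return docs, queries
```

Any requested document or table absent from the policy is silently dropped, so the NLP model never receives content outside the administrator-approved scope.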
The output (“result”) of the NLP model 210 is delivered to a result filter module 230. The result filter 230 checks the result for completeness, hallucinations, etc. and, in some embodiments (as shown), is also configured to reinsert confidential information that, for confidentiality purposes, bypassed the AI/NLP model. All prompts, completions, and other pertinent information are delivered to the archive 220. The compliance platform 225 enables administrative personnel to review overall system performance, the performance of specific filter components, users, etc. In addition, compliance e-discovery tools can be used to provide the compliance and regulatory team the ability to create and save different types of queries and analyses.
The prompt filter module 125 further includes a banned topic filter module 316. The banned topic filter module 316 can be configured by an administrator 110 to identify systemwide and specific topics, e.g., controversial topics, for blocking from delivery to the NLP model. Words, phrases, and other textual elements related to such topics can be removed from the prompt.
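By way of non-limiting illustration, removal of banned-topic phrases from a prompt can be sketched as follows. The phrase list shown is hypothetical; in practice the administrator 110 would configure the systemwide and specific topics:

```python
import re

# Hypothetical administrator-configured phrase list, for illustration only.
BANNED_PHRASES = {"insider trading", "election fraud"}

def filter_banned_topics(prompt: str):
    """Remove banned phrases from the prompt; return the cleaned prompt
    and a flag indicating whether any banned content was detected."""
    cleaned = prompt
    hit = False
    for phrase in BANNED_PHRASES:
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        if pattern.search(cleaned):
            hit = True
            cleaned = pattern.sub("", cleaned)
    # Collapse the whitespace left behind by removed phrases.
    return re.sub(r"\s+", " ", cleaned).strip(), hit
```

The returned flag can be archived with the prompt so that pass/fail information reaches archive 220, as described below.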
In the embodiment depicted in
Yet another filter module that can be incorporated in the prompt filter module 125 is a toxic language filter module 320. The toxic language filter module 320 is configured to scan the prompt for words and phrases indicative of obscenity, hate speech, discrimination, etc.
It is to be appreciated that the modules 312-320 of the prompt filter module 300 discussed above are exemplary and that in various embodiments, some of the modules can be combined, and in others, additional modules can be added to analyze and potentially block other types of content from reaching the NLP model. Prompts that are processed by the prompt filter module 125, as well as pass and fail information, are sent to archive 220. Passed prompts are delivered to the NLP model 210 (at times with confidential information obfuscated), and notifications are sent to the administrator 110 when a prompt fails.
Database query results and documents from the access filter module 416, if any, are passed through to a content filter module 418. In addition, the agent policy module 402 determines whether user-selected files 425 and external (e.g., public) files 430 may be processed by the content filter module 418. The content filter module 418 is configured to limit the internal and external documents according to permissions (set by the administrator 110) as well as relevance to the prompt. A duplicate filter module 422 is configured to determine and eliminate duplicate copies of the received documents in order to reduce the time and cost of the prompt completion. A cybersecurity filter 423 is configured to detect and block malicious prompt injections and other types of cyberattacks, as controlled by the administrator 110. A relevance ranker module 424 orders the documents and files according to factors including, but not limited to, relevance to the prompt, recency, accuracy, and other metrics. The relevance ranker module 424 can also be configured to reduce the number of documents fed to the NLP module for the sake of efficiency and to avoid exceeding any size limitation of the specific NLP model 210 being used.
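By way of non-limiting illustration, the duplicate filter module 422 and relevance ranker module 424 might be sketched as follows. The content hashing and term-overlap scoring shown are simple hypothetical stand-ins for whatever deduplication and ranking methods an embodiment employs:

```python
import hashlib

def deduplicate(documents):
    """Drop byte-identical duplicate documents, keeping first occurrences,
    to reduce the time and cost of the prompt completion."""
    seen, unique = set(), []
    for doc in documents:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

def rank_by_relevance(prompt: str, documents, limit: int = 5):
    """Order documents by naive term overlap with the prompt, then truncate
    so the NLP model's input-size limitation is not exceeded."""
    terms = set(prompt.lower().split())
    def overlap(doc):
        return len(terms & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:limit]
```

An embodiment could substitute semantic similarity, recency, or accuracy scores for the term-overlap metric without changing this overall flow.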
Turning now to
It is to be appreciated that the modules 502-514 of the result filter module 130 discussed above are exemplary and that in various embodiments, the modules can be combined, and in others, additional modules can be added to analyze and potentially block other types of content from reaching the NLP model.
Passed completions are delivered to the user 505. The user 505 can then provide feedback on the completion. For example, the user 505 can grade the completion based on a sliding scale or by a binary good/bad mark. The feedback is stored in the archive 220.
The methods and processes described herein are performed by computing devices (e.g., user devices, physical servers, workstations, storage arrays, cloud computing resources, etc.) that communicate and interoperate over one or more networks to perform the described functions. Each such computing device typically includes a processor (or multiple processors) that executes program instructions or modules stored in a memory or other non-transitory computer-readable storage medium or device (e.g., solid state storage devices, disk drives, etc.). The various functions disclosed herein may be embodied in such program instructions, or may be implemented in application-specific circuitry (e.g., ASICs or FPGAs) of the computer system. Where the computer system includes multiple computing devices, these devices can be, but need not be, co-located. The results of the disclosed methods and tasks can be persistently stored by transforming physical storage devices, such as solid-state memory chips or magnetic disks, into a different state. In some embodiments, the computer system may be a cloud-based computing system whose processing resources are shared by multiple distinct business entities or other users.
The methods described in connection with the embodiments disclosed herein can be embodied directly in hardware, in a software module executed by a processor device, or in a combination of the two. A software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of a non-transitory computer-readable storage medium.
Certain of the modules described herein can communicate with other modules and/or devices (e.g., databases) using data connections over a data network. Data connections can be any known arrangement for wired (e.g., high-speed fiber) or wireless data communication, using any suitable communication protocol, as known in the art.
It is to be understood that any structural and functional details disclosed herein are not to be interpreted as limiting the systems and methods, but rather are provided as a representative embodiment and/or arrangement for teaching one skilled in the art one or more ways to implement the methods.
It is to be further understood that like numerals in the drawings represent like elements through the several figures, and that not all components and/or steps described and illustrated with reference to the figures are required for all embodiments or arrangements.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Terms of orientation are used herein merely for purposes of convention and referencing, and are not to be construed as limiting. However, it is recognized that these terms could be used with reference to a viewer. Accordingly, no limitations are implied or to be inferred.
Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
While the invention has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes can be made and equivalents can be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications will be appreciated by those skilled in the art to adapt a particular instrument, situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims.
This application is based on and claims priority to U.S. Provisional Patent Application 63/624,573, filed Jan. 24, 2024, the entire contents of which is incorporated by reference herein as if expressly set forth in its respective entirety herein.