This disclosure relates to threat modeling and, more particularly, to systems and methods for continuous automated threat modeling.
Threat modeling is a structured representation of information that lets users and/or administrators understand the security vulnerabilities of a computer application or a computer system. That is, threat modeling is a systematic approach used in cybersecurity settings to identify, assess, and mitigate potential risks and vulnerabilities in software systems, applications, and/or networks. Threat modeling involves analyzing a system's architecture, functionality, and potential threats to understand the system's security posture, and it enables informed decision-making regarding security controls, countermeasures, and validations.
Prior to this disclosure, attempts to produce accurate and useful threat models typically took the form of one-off activities performed during a system's design phase or of manual processes involving little to no automation. These traditional forms of threat modeling can be time-consuming, resource-intensive, and subject to human bias and human error. They have led to outdated threat assessments, reactive (rather than proactive) security measures, and an inability to adapt to rapidly evolving threats.
In one implementation, a system for continuous automated threat modeling based on prompt engineering using large language models includes a threat modeling engine configured to ingest an application profile, a workload context, and a software template and to generate a threat model after ingesting the application profile, the workload context, and the software template; a threat prompt generator comprising a prompt generation pipeline; a large language model integrated with the prompt generation pipeline and configured to generate a threat prompt, wherein the threat prompt is based on an annotated threat configuration, a threat taxonomy, and a prompt template; and a continuous automation module, wherein the continuous automation module retrieves the threat prompt and performs a security assessment of the threat prompt.
One or more of the following features may be included. The system may include a threat artifact generator in communication with the threat modeling engine. The system may include the annotated threat configuration comprising a third-party threat intelligence program. The system may include the continuous automation module comprising a code generator in communication with the large language model. The system may include the threat modeling engine further comprising a threat engine orchestrator in communication with a threat configuration composer, one or more threat artifacts, and a Relative Attacker Attractiveness analyzer. The system may include the threat modeling engine generating the one or more threat artifacts and the Relative Attacker Attractiveness analyzer. The system may include the Relative Attacker Attractiveness analyzer receiving information from a threat and risk catalog and the threat engine orchestrator, analyzing the information, generating a percentage value, and then sending the percentage value to a threat score generator. The system may include the threat prompt generator further comprising a prompt template composer configured to compose a threat prompt template, save the threat prompt template in a threat template repository, obtain an annotation from a threat data annotator, communicate with a threat taxonomy database, and send information to a prompt generation pipeline. The system may include the prompt generation pipeline being configured to query and train the large language model with the threat prompts. The system may include the continuous automation module further comprising a policy generator in communication with the large language model. The system may include the continuous automation module further comprising automatic and continuous generation of at least one threat report that is communicated to the threat modeling engine. The system may include the large language model being integrated with a policy generator and a code generator. The system may include the continuous automation module further comprising the large language model developing policies and code via queries and training from prior threat reports and/or prior threat prompts. The system may include the continuous automation module further comprising a template patching notification.
In another implementation, a method for continuous automated threat modeling based on prompt engineering using large language models includes ingesting an application profile, a workload context, and a software template; identifying potential threats and vulnerabilities of a system; annotating a threat configuration; incorporating data that mitigates the potential threats and vulnerabilities of the system; generating a threat prompt based on an annotated threat configuration, a threat taxonomy, a prompt template, and prompt engineering; integrating a large language model; and enabling continuous automation of the system via automatic retrieval of the threat prompt and automatic performance of a security assessment on the threat prompt.
One or more of the following features may be included. The method may include generating threat artifacts based on the identified potential threats and vulnerabilities of the system. The method may include the threat prompt being based on fine-tuning, a sampling strategy, and a diversity algorithm.
In another implementation, a computer-readable medium stores instructions that, when executed by a computing device, cause the computing device to perform a method for continuous automated threat modeling based on prompt engineering using large language models, the method including ingesting an application profile, a workload context, and a software template; identifying potential threats and vulnerabilities of a system; annotating a threat configuration; incorporating data that mitigates the potential threats and vulnerabilities of the system; generating a threat prompt based on an annotated threat configuration, a threat taxonomy, a prompt template, and prompt engineering; integrating a large language model; and enabling continuous automation of the system via automatic retrieval of the threat prompt and automatic performance of a security assessment on the threat prompt.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features and advantages will become apparent from the description, the drawings, and the claims.
Various embodiments in accordance with the present disclosure will be described with reference to the drawings, in which:
Referring to
Application profile 102 may include, but is not limited to, policies and guidelines for a particular application; attributes pertaining to the privacy risk of application profile 102; protected health information (“PHI”); personally identifiable information (“PII”); key or significant financial systems for obtaining and recording financial information; externally-facing features; business-critical system attributes; service-level agreements; and computer-system availability, among other application profile components, which, for example, may include information from customer databases.
Workload context 104 may include information related to workloads in a computing environment. A workload may be an application, a piece of software, or a program. A workload may also refer to the amount of work that software imposes on underlying computing resources. A workload context may include, but is not limited to, information surrounding an application and its resources, such as service-level agreements, computing platform and version, workload type (e.g., virtual, container, serverless, etc.), cloud characteristics, deployment type, contents of a service list, and characteristics related to processing (e.g., central processing unit, or CPU) and memory resources.
Software template 106 may include, but is not limited to, a programming language (e.g., Python), a version number, a software framework (e.g., a micro web framework written in a particular programming language), an assets directory, a threat framework (e.g., a model for identifying computer security threats, such as STRIDE or PASTA or LINDDUN), a dependency file (e.g., a text file), a software type (e.g., frontend or backend), an application programming interface (API) type (e.g., REST, GraphQL, gRPC, etc.), a threshold for test coverage, access types for one or more databases, one or more messaging queues, one or more storage queues, acceptable data formats (e.g., XML or JSON), and required security tests (e.g., denial of service, CPU cycle theft, ransomware, identity theft, theft of PII, database compromises, cross-site scripting, etc.).
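By way of a non-limiting illustration, application profile 102, workload context 104, and software template 106 may be represented as structured documents. The following minimal Python sketch shows hypothetical contents for each of the three ingested inputs; the field names and values are illustrative assumptions only, not required formats:

# Illustrative only: hypothetical field names and values for the three inputs
# ingested by the threat modeling engine; real schemas may differ.
application_profile = {
    "name": "payments-service",                # assumed example application
    "data_classifications": ["PII", "PHI"],
    "externally_facing": True,
    "business_critical": True,
    "availability_sla": "99.9%",
}

workload_context = {
    "platform": "kubernetes-1.29",             # assumed platform and version
    "workload_type": "container",
    "deployment_type": "cloud",
    "cpu_request": "500m",
    "memory_request": "512Mi",
}

software_template = {
    "language": "python",
    "framework": "flask",                      # a micro web framework
    "threat_framework": "STRIDE",
    "api_type": "REST",
    "data_formats": ["JSON"],
    "required_security_tests": ["denial_of_service", "cross_site_scripting"],
}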
Continuing with
Threat modeling engine 108, after generating threat artifacts 112 and providing a threat score from threat score generator 120 and one or more threat models based on, for example, threat artifacts 112, may exchange information with a threat prompt generator 122, which may include a prompt template composer 124, one or more threat prompt templates 126, a threat taxonomy 128 database configured to consider classifications of threat-related information, a threat data annotator 130 configured to annotate threat prompts, one or more threat configurations 132, and a threat prompt generation pipeline 134 configured to generate and distribute threat prompts 136 (for instance, threat prompts 136 may ultimately be based on, among other things, fine-tuning, i.e., using the weights of an already trained network as starting values for training a new network; a sampling strategy; and/or a diversity algorithm) to a threat model & mitigation large language model (LLM) 138.
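As one non-limiting example of a diversity algorithm of the kind referenced above, prompt generation pipeline 134 might select a varied subset of candidate threat prompts 136 before distributing them to threat model & mitigation LLM 138. The following Python sketch uses a simple greedy selection over word overlap; the similarity measure, threshold, and function names are assumptions for illustration only:

def jaccard_similarity(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two candidate threat prompts."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def select_diverse_prompts(candidates: list[str], max_similarity: float = 0.6) -> list[str]:
    """Greedily keep candidate prompts that differ sufficiently from every
    prompt already selected (an illustrative diversity algorithm)."""
    selected: list[str] = []
    for prompt in candidates:
        if all(jaccard_similarity(prompt, kept) < max_similarity for kept in selected):
            selected.append(prompt)
    return selected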
Prompt template composer 124 may obtain information from threat modeling engine 108 to compose prompt templates ready to proceed through prompt generation pipeline 134. Prompt template composer 124 may exchange information with threat prompt templates 126, with threat prompt templates 126 being located, for example, in a software repository that may be called a threat template repository. Prompt template composer 124 may use templates from the repository in conjunction with threat taxonomy 128, with threat taxonomy 128 including, but not limited to, categories or classifications of threats that inform prompt template composer 124 so that it can provide either standard or unique solutions against those threats. Prompt template composer 124 may also combine information from threat taxonomy 128 with information from threat data annotator 130 to provide threat prompts that act in accordance with parameters either set by a user or readable by large language model 140 (e.g., in consideration of fine-tuning, sampling strategies, diversity algorithms, etc.). One or more threat configurations 132 may be modified to control which parameters threat data annotator 130 may focus on, prioritize, or consider in its annotation activities.
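For illustration, prompt template composer 124 might merge a stored threat prompt template with a classification from threat taxonomy 128 and annotations from threat data annotator 130, with one of threat configurations 132 controlling which annotations are considered. In the Python sketch below, the template text, configuration keys, and function name are assumptions for illustration only:

# Hypothetical threat prompt template of the kind stored in a threat template repository.
THREAT_PROMPT_TEMPLATE = (
    "Threat category: {category}\n"
    "Component: {component}\n"
    "Annotations: {annotations}\n"
    "Describe likely attack paths and propose mitigations and security tests."
)

def compose_threat_prompt(taxonomy_entry: dict, annotations: list[dict],
                          threat_configuration: dict) -> str:
    """Fill the template with a taxonomy classification and only those
    annotations the threat configuration asks the annotator to prioritize."""
    focus = set(threat_configuration.get("annotation_focus", []))
    relevant = [a["text"] for a in annotations if not focus or a.get("kind") in focus]
    return THREAT_PROMPT_TEMPLATE.format(
        category=taxonomy_entry["category"],
        component=taxonomy_entry.get("component", "unspecified"),
        annotations="; ".join(relevant) or "none",
    )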
Prompt template composer 124 may, following any interaction with any component of threat prompt generator 122, exchange a prompt template with prompt generation pipeline 134. The prompt template may include generated threat prompts and/or security tests for mitigating and validating issues (e.g., in a given-when-then format for writing down test cases) posed by the generated threat prompts. Prompt generation pipeline 134 may be a set of data processing elements connected in series, where the output of one element is the input of the next element. Prompt generation pipeline 134 may also comprise a sequence of computing processes (commands, program runs, tasks, threads, procedures, etc.). In operation, prompt generation pipeline 134 may import an API from any programming language (e.g., OpenAI through HTTP requests via Python bindings, Node.js, etc.), import a module (e.g., the OS module in Python, Python's random module), import a data serialization language (e.g., YAML), and/or import file formats (e.g., JSON). With the imported items, prompt generation pipeline 134 may then search the environment of the calling process (e.g., via the function getenv()), obtain a required security test based on prompt template composer 124, use Python with statements to open a software template or the required security test as a string or a stream, and employ loops (e.g., for loops) to generate responses as threat prompts 136 that may be readable by large language model 140.
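The following minimal Python sketch illustrates one such pipeline stage consistent with the description above: it checks the environment of the calling process for credentials, reads a software template and its required security tests with with statements, and loops over the tests to generate threat prompts. The file formats, environment variable name, and prompt wording are assumptions for illustration only:

import json
import os

import yaml  # PyYAML, used here to read a hypothetical YAML software template

def generate_threat_prompts(template_path: str, tests_path: str) -> list[str]:
    """Read a software template and its required security tests, then emit one
    threat prompt per required test (illustrative pipeline stage)."""
    # The environment variable name is an assumption; a real pipeline would look
    # up credentials for whichever LLM API it integrates with.
    if os.getenv("LLM_API_KEY") is None:
        raise RuntimeError("LLM_API_KEY is not set")

    with open(template_path) as template_file:
        software_template = yaml.safe_load(template_file)

    with open(tests_path) as tests_file:
        required_tests = json.load(tests_file)  # e.g., ["denial_of_service", ...]

    prompts = []
    for test in required_tests:
        prompts.append(
            f"For a {software_template['language']}/{software_template['framework']} "
            f"service assessed under {software_template['threat_framework']}, describe "
            f"threats relevant to '{test}' and propose mitigations and security tests "
            f"in given-when-then format."
        )
    return prompts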
Threat prompts 136 from prompt generation pipeline 134 may then be sent to threat model & mitigation LLM 138 in the form of one or more queries (e.g., tests) and/or one or more trainings (for training large language model 140) and may thereby include prompt engineering techniques (e.g., crafting, priming, refining, or probing prompts within a scope or within the capabilities of, for example, large language model 140). Prompt engineering may be performed manually by a user or administrator, or automatically within system 100 and/or large language model 140. Threat prompt generation pipeline 134 may be integrated with large language model 140 while generating a threat prompt based on, for example, information from threat taxonomy 128, threat data annotator 130 (e.g., which may include annotations derived within system 100 or from a third-party intelligence program source), and/or prompt templates 126. Large language model 140 may use one or more threat prompts 136 to develop one or more mitigations with regard to the contents of the one or more threat prompts 136. Large language model 140 may, for instance, generate code (e.g., via code generator 148) to either update code present in a computer system or adjust security controls in the computer system. Code generator 148 may generate security tests 150, infrastructure as code 152 (i.e., the managing and provisioning of infrastructure through code instead of through manual processes), policy as code (e.g., writing code in a language to manage and automate policies, such as automatic testing and automatic deployment policies), application code 156, and/or managed security services 158 (e.g., services that have been outsourced to a service provider or that include third-party intelligence data or at least one third-party threat intelligence program). Large language model 140 may also, for instance, generate policies (e.g., via policy generator 142) to control security and risk management of a system. Policy generator 142 may generate scan policy standards 144 and/or application security vulnerability standards 146. Large language model 140 may, either independently or in conjunction with information from items generated by policy generator 142 and/or code generator 148, generate one or more threat reports 160. The one or more threat reports 160 may then, via a continuous automation module (not shown in
Threat model & mitigation LLM 138 may also provide information pertaining to template patching 162 (e.g., via a template patching notification, which may contain patching information or indicate that no patching is necessary), which a system administrator may use as a notification to patch software template 106 and to further strengthen or improve system 100 in performing its continuous automated threat modeling functionality.
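To further illustrate how threat model & mitigation LLM 138, threat reports 160, and template patching 162 may fit together, the Python sketch below sends each threat prompt to a placeholder LLM client, collects the suggested mitigations into a threat report, and flags template patching when a response refers to the software template. The llm_complete callable, the report fields, and the patching heuristic are hypothetical assumptions, not a required interface:

from typing import Callable

def run_security_assessment(threat_prompts: list[str],
                            llm_complete: Callable[[str], str]) -> dict:
    """Query a large language model with each threat prompt, gather suggested
    mitigations into a threat report, and flag template patching when any
    response mentions the software template (an illustrative heuristic)."""
    findings = []
    needs_template_patch = False
    for prompt in threat_prompts:
        mitigation = llm_complete(prompt)  # placeholder for the actual LLM call
        findings.append({"prompt": prompt, "mitigation": mitigation})
        if "template" in mitigation.lower():
            needs_template_patch = True
    return {"findings": findings, "template_patching_required": needs_template_patch}

A continuous automation module could then communicate such a report back to threat modeling engine 108 and surface the patching flag to an administrator as a template patching notification.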
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated. A number of implementations have been described. Having thus described the disclosure of the present application in detail and by reference to embodiments thereof, it will be apparent that modifications and variations are possible without departing from the scope of the disclosure defined in the appended claims.