The present technology pertains to network analytics, and more specifically to annotating process and user information in a network environment.
In a network environment, capturing agents or sensors can be placed at various devices or elements in the network to collect flow data and network statistics from different locations. The collected data from the capturing agents can be analyzed to monitor and troubleshoot the network. The data collected from the capturing agents can provide valuable details about the status, security, or performance of the network, as well as any network elements. Information about the capturing agents can also help interpret the data from the capturing agents, in order to infer or ascertain additional details from the collected data. For example, understanding the placement of a capturing agent relative to other capturing agents in the network can provide a context to the data reported by the capturing agents, which can further help identify specific patterns or conditions in the network. Unfortunately, however, information gathered from the capturing agents distributed throughout the network is often limited and may not include certain types of useful information. Moreover, as the network grows and changes, the information can quickly become outdated.
As data centers grow in size and complexity, the tools that manage them must be able to effectively identify inefficiencies while implementing appropriate security policies. Traditionally, network administrators have to manually implement security policies, manage access control lists (ACLs), configure firewalls, identify misconfigured or infected machines, etc. These tasks can become exponentially more complicated as a network grows in size and require an intimate knowledge of a large number of data center components. Furthermore, malicious attacks or misconfigured machines can shut down a data center within minutes while it could take a network administrator hours or days to determine the root problem and provide a solution. What is needed is a broad and deep network monitoring system that can automatically determine the network topology, map application dependencies, monitor traffic flow, dynamically analyze network performance, identify problems, implement policies, and present a network administrator with an interface reflecting the current state of the data center. The traffic monitoring system herein disclosed can provide such functionality.
In order to describe the manner in which the above-recited and other advantages and features of the disclosure can be obtained, a more particular description of the principles briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only exemplary embodiments of the disclosure and are not therefore to be considered to be limiting of its scope, the principles herein are described and explained with additional specificity and detail through the use of the accompanying drawings in which:
Various embodiments of the disclosure are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without parting from the spirit and scope of the disclosure.
Overview
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or can be learned by practice of the herein disclosed principles. The features and advantages of the disclosure can be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the disclosure will become more fully apparent from the following description and appended claims, or can be learned by the practice of the principles set forth herein.
A flow is conventionally represented as a 5-tuple comprising a source address, destination address, source port, destination port, and protocol. Thus, if a user desired to search flow data, the user could only search based on these attributes.
NetFlow exposes other attributes of flows, but none of the additional attributes of this invention, nor does NetFlow enable users to customize the attributes of flows. A flow can be tagged with metadata to provide additional information about the flow, such that flows are searchable based on tags and flows having common tags can be aggregated to visualize flow data. Users can also define custom tags and rules by which flows should be tagged.
Advantages include the ability to search flows based on tags and improved visualization of flows. Industry use: the technology can be used by public cloud competitors (of Nimbus/CCS) (e.g., Amazon, Google, Microsoft, Rackspace, Oracle, etc.). Indicators of use include product documentation, a UI, or claims that a product allows a user to search flows based on non-conventional attributes or visualize flows according to non-conventional attributes.
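The tagging and tag-based search described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `FlowKey`, `Flow`, `TagRule`, `apply_rules`, and `search_by_tag`, along with the sample addresses and tag names, are assumptions introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class FlowKey:
    """Conventional 5-tuple identifying a flow."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass
class Flow:
    """A flow plus the metadata tags attached to it (hypothetical structure)."""
    key: FlowKey
    tags: Set[str] = field(default_factory=set)


@dataclass
class TagRule:
    """User-defined rule: apply `tag` to every flow whose key satisfies `matches`."""
    tag: str
    matches: Callable[[FlowKey], bool]


def apply_rules(flows: List[Flow], rules: List[TagRule]) -> None:
    # Attach each rule's tag to every flow the rule matches.
    for flow in flows:
        for rule in rules:
            if rule.matches(flow.key):
                flow.tags.add(rule.tag)


def search_by_tag(flows: List[Flow], tag: str) -> List[Flow]:
    # Search on a tag rather than on the 5-tuple attributes.
    return [flow for flow in flows if tag in flow.tags]


flows = [
    Flow(FlowKey("10.0.0.1", "10.0.0.9", 40000, 443, "tcp")),
    Flow(FlowKey("10.0.0.2", "10.0.0.9", 40001, 53, "udp")),
]
rules = [
    TagRule("web", lambda k: k.dst_port == 443),
    TagRule("dns", lambda k: k.dst_port == 53 and k.protocol == "udp"),
]
apply_rules(flows, rules)
web_flows = search_by_tag(flows, "web")
```

Expressing a rule as a predicate paired with a tag keeps user-defined rules independent of how flows are stored; aggregation for visualization could group flows by tag in the same way.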
The approaches set forth herein can be used to annotate process and user information related to network flows captured by various capturing agents or sensors deployed throughout a virtualized compute environment. The capturing agents can be packet inspection sensors configured to monitor, capture, and/or report network traffic information at the various locations. The capturing agents can be deployed on virtual machines, hypervisors, servers, and network devices (e.g., physical switches) on the network. The various capturing agents can capture traffic from their respective locations (e.g., traffic processed by their hosts), and report captured data to one or more devices, such as a collector system or a processing engine. The captured data can include any traffic and/or process information captured by the capturing agents including reports or control flows generated by other capturing agents.
The data reported from the various capturing agents can be used to determine the particular process or user involved with a given flow being reported. For example, capturing agents deployed throughout the network can be configured to identify the process or operating system user account that is responsible for generating or processing a network flow and report such findings to a collector in the form of a control flow. The reported process and user information can be used to understand the relationships of the flows and the corresponding processes and users, and may drive further analytics on the network.
Disclosed are systems, methods, and computer-readable storage media for annotating process and user information in a network. A system may include a virtual machine, a hypervisor hosting the virtual machine, and a network device such as a switch communicatively connected to the hypervisor. The virtual machine can have a first capturing agent or sensor that is configured to monitor a first network flow associated with the virtual machine. The first capturing agent can generate a first control flow based on the first network flow. The first control flow can include first metadata that describes the first network flow. The first capturing agent can label the first control flow with a first identifier of a first process executing on the virtual machine, thus yielding a first labeled control flow. The first process can be associated with the first network flow. The first capturing agent can then transmit the first labeled control flow to a collector via the network.
The hypervisor may also have a second capturing agent. The second capturing agent can be configured to monitor a second network flow associated with the hypervisor, and the second network flow can include at least the first labeled control flow. The second capturing agent can generate a second control flow based on the second network flow. The second control flow can include second metadata that describes the second network flow. The second capturing agent can then label the second control flow with a second identifier of a second process executing on the hypervisor, thus yielding a second labeled control flow. The second process can be associated with the second network flow. Next, the second capturing agent can transmit the second labeled control flow to the collector via the network.
In addition, the network device can have a third capturing agent that is configured to monitor a third network flow associated with the network device. The third network flow can include the first labeled control flow and/or the second labeled control flow. The third capturing agent can generate a third control flow based on the third network flow, and the third control flow may include third metadata describing the third network flow. The third capturing agent can then label the third control flow with a third identifier of a third process that is executing on the network device and associated with the third network flow, thus yielding a third labeled control flow. Finally, the third capturing agent can transmit the third labeled control flow to the collector via the network.
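The three-stage labeling sequence above (virtual machine, hypervisor, network device) can be sketched as follows. This is a simplified illustration under stated assumptions: the `ControlFlow` and `CapturingAgent` names, the `includes` metadata key, the process identifiers, and the use of a plain list as the collector are all hypothetical and are not drawn from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ControlFlow:
    agent_id: str                      # capturing agent that generated the report
    metadata: dict                     # describes the observed network flow
    process_id: Optional[int] = None   # label: process associated with the flow


class CapturingAgent:
    """Hypothetical agent that generates, labels, and transmits control flows."""

    def __init__(self, agent_id: str, collector: List[ControlFlow]):
        self.agent_id = agent_id
        self.collector = collector     # stand-in for a collector reached via the network

    def report(self, observed_flow: dict, process_id: int) -> ControlFlow:
        # Generate a control flow whose metadata describes the observed flow.
        control = ControlFlow(self.agent_id, metadata=dict(observed_flow))
        # Label it with the identifier of the associated process.
        control.process_id = process_id
        # Transmit the labeled control flow to the collector.
        self.collector.append(control)
        return control


collector: List[ControlFlow] = []
vm_agent = CapturingAgent("vm-agent", collector)
hv_agent = CapturingAgent("hypervisor-agent", collector)
sw_agent = CapturingAgent("switch-agent", collector)

# The hypervisor's flow includes the VM's labeled control flow; the network
# device's flow can include both earlier labeled control flows.
first = vm_agent.report({"flow": "F1"}, process_id=4242)
second = hv_agent.report({"flow": "F2", "includes": [first.agent_id]}, process_id=17)
third = sw_agent.report(
    {"flow": "F3", "includes": [first.agent_id, second.agent_id]}, process_id=3
)
```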
Description
The disclosed technology addresses the need in the art for understanding data reported from capturing agents on a virtualized network. Disclosed are systems, methods, and computer-readable storage media for determining relative placement and topology of capturing agents deployed throughout a network. A description of an example network environment, as illustrated in
Leaf routers 104 can be responsible for routing and/or bridging tenant or endpoint packets and applying network policies. Spine routers 102 can perform switching and routing within fabric 112. Thus, network connectivity in fabric 112 can flow from spine routers 102 to leaf routers 104, and vice versa.
Leaf routers 104 can provide servers 1-5 (106A-E) (collectively “106”), hypervisors 1-4 (108A-108D) (collectively “108”), and virtual machines (VMs) 1-5 (110A-110E) (collectively “110”) access to fabric 112. For example, leaf routers 104 can encapsulate and decapsulate packets to and from servers 106 in order to enable communications throughout environment 100. Leaf routers 104 can also connect other devices, such as device 114, with fabric 112. Device 114 can be any network-capable device(s) or network(s), such as a firewall, a database, a server, a collector 118 (further described below), an engine 120 (further described below), etc. Leaf routers 104 can also provide any other servers, resources, endpoints, external networks, VMs, services, tenants, or workloads with access to fabric 112.
VMs 110 can be virtual machines hosted by hypervisors 108 running on servers 106. VMs 110 can include workloads running on a guest operating system on a respective server. Hypervisors 108 can provide a layer of software, firmware, and/or hardware that creates and runs the VMs 110. Hypervisors 108 can allow VMs 110 to share hardware resources on servers 106, and the hardware resources on servers 106 to appear as multiple, separate hardware platforms. Moreover, hypervisors 108 and servers 106 can host one or more VMs 110. For example, server 106A and hypervisor 108A can host VMs 110A-B.
In some cases, VMs 110 and/or hypervisors 108 can be migrated to other servers 106. For example, VM 110A can be migrated to server 106C and hypervisor 108B. Servers 106 can similarly be migrated to other locations in network environment 100. For example, a server connected to a specific leaf router can be changed to connect to a different or additional leaf router. In some cases, some or all of servers 106, hypervisors 108, and/or VMs 110 can represent tenant space. Tenant space can include workloads, services, applications, devices, and/or resources that are associated with one or more clients or subscribers. Accordingly, traffic in network environment 100 can be routed based on specific tenant policies, spaces, agreements, configurations, etc. Moreover, addressing can vary between one or more tenants. In some configurations, tenant spaces can be divided into logical segments and/or networks and separated from logical segments and/or networks associated with other tenants.
Any of leaf routers 104, servers 106, hypervisors 108, and VMs 110 can include capturing agent 116 (also referred to as a “sensor”) configured to capture network data, and report any portion of the captured data to collector 118. Capturing agents 116 can be processes, agents, modules, drivers, or components deployed on a respective system (e.g., a server, VM, hypervisor, leaf router, etc.), configured to capture network data for the respective system (e.g., data received or transmitted by the respective system), and report some or all of the captured data to collector 118.
For example, a VM capturing agent can run as a process, kernel module, or kernel driver on the guest operating system installed in a VM and configured to capture data (e.g., network and/or system data) processed (e.g., sent, received, generated, etc.) by the VM. Additionally, a hypervisor capturing agent can run as a process, kernel module, or kernel driver on the host operating system installed at the hypervisor layer and configured to capture data (e.g., network and/or system data) processed (e.g., sent, received, generated, etc.) by the hypervisor. A server capturing agent can run as a process, kernel module, or kernel driver on the host operating system of a server and configured to capture data (e.g., network and/or system data) processed (e.g., sent, received, generated, etc.) by the server. And a network device capturing agent can run as a process or component in a network device, such as leaf routers 104, and configured to capture data (e.g., network and/or system data) processed (e.g., sent, received, generated, etc.) by the network device.
Capturing agents 116 or sensors can be configured to report the observed data and/or metadata about one or more packets, flows, communications, processes, events, and/or activities to collector 118. For example, capturing agents 116 can capture network data as well as information about the system or host of the capturing agents 116 (e.g., where the capturing agents 116 are deployed). Such information can also include, for example, data or metadata of active or previously active processes of the system, operating system user identifiers, metadata of files on the system, system alerts, networking information, etc. Capturing agents 116 may also analyze all the processes running on the respective VMs, hypervisors, servers, or network devices to determine specifically which process is responsible for a particular flow of network traffic. Similarly, capturing agents 116 may determine which operating system user(s) is responsible for a given flow. Reported data from capturing agents 116 can provide details or statistics particular to one or more tenants. For example, reported data from a subset of capturing agents 116 deployed throughout devices or elements in a tenant space can provide information about the performance, use, quality, events, processes, security status, characteristics, statistics, patterns, conditions, configurations, topology, and/or any other information for the particular tenant space.
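One way to picture how an agent might attribute a flow to a process and an operating system user is a lookup against a table of open sockets. The sketch below assumes such a table is already available as a dictionary; how a real agent would obtain it is platform-specific and is not addressed here. The names `Owner`, `Endpoint`, and `annotate_flow`, and all sample values, are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class Owner(NamedTuple):
    pid: int
    process: str
    os_user: str


# (local address, local port, protocol) of a socket on the agent's host
Endpoint = Tuple[str, int, str]


def annotate_flow(flow: dict, socket_table: Dict[Endpoint, Owner]) -> dict:
    """Return a copy of `flow` annotated with process and user, when known."""
    owner = socket_table.get((flow["src_addr"], flow["src_port"], flow["protocol"]))
    annotated = dict(flow)
    if owner is not None:
        annotated.update(pid=owner.pid, process=owner.process, os_user=owner.os_user)
    return annotated


# Assumed snapshot of sockets on the host where the agent runs.
socket_table = {("10.0.0.5", 40000, "tcp"): Owner(4242, "nginx", "www-data")}
flow = {
    "src_addr": "10.0.0.5", "src_port": 40000,
    "dst_addr": "10.0.0.9", "dst_port": 443, "protocol": "tcp",
}
annotated = annotate_flow(flow, socket_table)
```

A flow whose local endpoint is absent from the table is returned without annotation rather than with a guessed owner.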
Collectors 118 can be one or more devices, modules, workloads and/or processes capable of receiving data from capturing agents 116. Collectors 118 can thus collect reports and data from capturing agents 116. Collectors 118 can be deployed anywhere in network environment 100 and/or even on remote networks capable of communicating with network environment 100. For example, one or more collectors can be deployed within fabric 112 or on one or more of the servers 106. One or more collectors can be deployed outside of fabric 112 but connected to one or more leaf routers 104. Collectors 118 can be part of servers 106 and/or separate servers or devices (e.g., device 114). Collectors 118 can also be implemented in a cluster of servers.
Collectors 118 can be configured to collect data from capturing agents 116. In addition, collectors 118 can be implemented in one or more servers in a distributed fashion. As previously noted, collectors 118 can include one or more collectors. Moreover, each collector can be configured to receive reported data from all capturing agents 116 or a subset of capturing agents 116. For example, a collector can be assigned to a subset of capturing agents 116 so the data received by that specific collector is limited to data from the subset of capturing agents.
Collectors 118 can be configured to aggregate data from all capturing agents 116 and/or a subset of capturing agents 116. Moreover, collectors 118 can be configured to analyze some or all of the data reported by capturing agents 116. For example, collectors 118 can include analytics engines (e.g., engines 120) for analyzing collected data. Environment 100 can also include separate analytics engines 120 configured to analyze the data reported to collectors 118. For example, engines 120 can be configured to receive collected data from collectors 118 and aggregate the data, analyze the data (individually and/or aggregated), generate reports, identify conditions, compute statistics, visualize reported data, troubleshoot conditions, visualize the network and/or portions of the network (e.g., a tenant space), generate alerts, identify patterns, calculate misconfigurations, identify errors, generate suggestions, generate testing, and/or perform any other analytics functions. Analytics engines can determine dependencies of components within the network. For example, if component A routinely sends data to component B but component B never sends data to component A, then engines 120 can determine that component B is dependent on component A, but A is likely not dependent on component B. If, however, component B also sends data to component A, then they are likely interdependent. These components can be processes, virtual machines, hypervisors, VLANs, etc. Once an engine has determined component dependencies, it can then form a component (“application”) dependency map. This map can be instructive when engines 120 attempt to determine the root cause of a failure (because failure of one component can cascade and cause failure of its dependent components) or when engines 120 attempt to predict what will happen if a component is taken offline.
Additionally, engines can associate edges of an application dependency map with expected latency, bandwidth, etc. for that individual edge. Analytics engines can establish patterns and norms for component behavior. For example, an engine can determine that certain processes (when functioning normally) will only send a certain amount of traffic to a certain VM using a small set of ports. Engines can establish these norms by analyzing individual components or by analyzing data coming from similar components (e.g., VMs with similar configurations). Similarly, engines can determine expectations for network operations. For example, an engine can determine the expected latency between two components, the expected throughput of a component, response times of a component, typical packet sizes, traffic flow signatures, etc. In some embodiments, engines can combine their dependency maps with pattern analysis to create reaction expectations. For example, if traffic increases with one component, other components may predictably increase traffic in response (or latency, compute time, etc.).
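The dependency inference described above can be sketched as follows. The sketch follows the rule as stated in the text (the receiver is treated as dependent on the sender, and traffic in both directions yields interdependence). The `min_count` threshold standing in for "routinely", and the names `build_dependency_map` and `affected_by_failure`, are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_dependency_map(
    observations: Iterable[Tuple[str, str]], min_count: int = 2
) -> Dict[str, Set[str]]:
    """Map each component to the set of components it depends on.

    `observations` holds (sender, receiver) pairs. A pair seen at least
    `min_count` times is treated as routine traffic (assumed threshold).
    """
    counts = Counter(observations)
    depends_on: Dict[str, Set[str]] = defaultdict(set)
    for (sender, receiver), seen in counts.items():
        if seen >= min_count:
            depends_on[receiver].add(sender)
    return dict(depends_on)


def affected_by_failure(depends_on: Dict[str, Set[str]], failed: str) -> Set[str]:
    """Components that directly or transitively depend on `failed`."""
    affected: Set[str] = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for component, deps in depends_on.items():
            if current in deps and component != failed and component not in affected:
                affected.add(component)
                frontier.append(component)
    return affected


observations = [("A", "B")] * 3 + [("B", "C")] * 2 + [("C", "D")]
dependency_map = build_dependency_map(observations)
cascade = affected_by_failure(dependency_map, "A")
```

Here the single `("C", "D")` observation falls below the threshold, so `D` is not recorded as dependent on `C`; the cascade from a failure of `A` reaches `B` and then `C`.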
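As a rough illustration of establishing a norm for component behavior, the sketch below summarizes past per-interval traffic samples by their mean and population standard deviation and flags values far from that norm. The choice of statistic, the three-sigma cutoff, and the names `build_norm` and `deviates` are assumptions; the text does not specify how norms are computed.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def build_norm(samples: Sequence[float]) -> Tuple[float, float]:
    """Summarize normal behavior as (mean, population standard deviation)."""
    return mean(samples), pstdev(samples)


def deviates(value: float, norm: Tuple[float, float], max_sigma: float = 3.0) -> bool:
    """True when `value` lies more than `max_sigma` deviations from the mean."""
    mu, sigma = norm
    if sigma == 0:
        # With no observed variation, any different value is a deviation.
        return value != mu
    return abs(value - mu) > max_sigma * sigma


# Hypothetical kilobytes sent per interval by one process under normal operation.
samples = [100, 110, 90, 105, 95]
norm = build_norm(samples)
normal = deviates(104, norm)
abnormal = deviates(500, norm)
```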
While collectors 118 and engines 120 are shown as separate entities, this is for illustration purposes as other configurations are also contemplated herein. For example, any of collectors 118 and engines 120 can be part of a same or separate entity. Moreover, any of the collector, aggregation, and analytics functions can be implemented by one entity (e.g., collectors 118) or separately implemented by multiple entities (e.g., engine 120 and/or collectors 118).
Each of the capturing agents 116 can use a respective address (e.g., internet protocol (IP) address, port number, etc.) of their host to send information to collectors 118 and/or any other destination. Collectors 118 may also be associated with their respective addresses such as IP addresses. Moreover, capturing agents 116 can periodically send information about flows they observe to collectors 118. Capturing agents 116 can be configured to report each and every flow they observe. Capturing agents 116 can report a list of flows that were active during a period of time (e.g., between the current time and the time of the last report). The consecutive periods of time of observance can be represented as pre-defined or adjustable time series. The series can be adjusted to a specific level of granularity. Thus, the time periods can be adjusted to control the level of details in statistics and can be customized based on specific requirements, such as security, scalability, storage, etc. The time series information can also be implemented to focus on more important flows or components (e.g., VMs) by varying the time intervals. The communication channel between a capturing agent and collector 118 can also create a flow in every reporting interval. Thus, the information transmitted or reported by capturing agents 116 can also include information about the flow created by the communication channel.
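The adjustable reporting granularity described above can be sketched by grouping flow observations into consecutive time periods of a configurable width. The function name `bucket_flows`, the integer timestamps, and the flow identifiers are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bucket_flows(
    observations: Iterable[Tuple[int, str]], interval: int
) -> Dict[int, List[str]]:
    """Group (timestamp, flow_id) observations into periods of `interval` seconds.

    Returns a mapping from each period's start time to the sorted flow
    identifiers that were active during that period.
    """
    buckets = defaultdict(set)
    for timestamp, flow_id in observations:
        period_start = (timestamp // interval) * interval
        buckets[period_start].add(flow_id)
    return {start: sorted(ids) for start, ids in sorted(buckets.items())}


observations = [(1, "F1"), (4, "F2"), (12, "F1"), (27, "F3")]
fine = bucket_flows(observations, interval=10)    # finer granularity
coarse = bucket_flows(observations, interval=30)  # coarser granularity
```

A smaller interval yields more periods with more detail per flow, while a larger interval reduces the volume of reported data, which is the trade-off the text ties to requirements such as security, scalability, and storage.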
Hypervisor 208 (otherwise known as a virtual machine monitor) can be a layer of software, firmware, and/or hardware that creates and runs VMs 202. Guest operating systems 206 running on VMs 202 can share virtualized hardware resources created by hypervisor 208. The virtualized hardware resources can provide the illusion of separate hardware components. Moreover, the virtualized hardware resources can perform as physical hardware components (e.g., memory, storage, processor, network interface, etc.), and can be driven by hardware resources 212 on server 106A. Hypervisor 208 can have one or more network addresses, such as an internet protocol (IP) address, to communicate with other devices, components, or networks. For example, hypervisor 208 can have a dedicated IP address which it can use to communicate with VMs 202, server 106A, and/or any remote devices or networks.
Hardware resources 212 of server 106A can provide the underlying physical hardware that drives operations and functionalities provided by server 106A, hypervisor 208, and VMs 202. Hardware resources 212 can include, for example, one or more memory resources, one or more storage resources, one or more communication interfaces, one or more processors, one or more circuit boards, one or more buses, one or more extension cards, one or more power supplies, one or more antennas, one or more peripheral components, etc. Additional examples of hardware resources are described below with reference to
Server 106A can also include one or more host operating systems (not shown). The number of host operating systems can vary by configuration. For example, some configurations can include a dual boot configuration that allows server 106A to boot into one of multiple host operating systems. In other configurations, server 106A may run a single host operating system. Host operating systems can run on hardware resources 212. In some cases, hypervisor 208 can run on, or utilize, a host operating system on server 106A. Each of the host operating systems can execute one or more processes, which may be programs, applications, modules, drivers, services, widgets, etc. Each of the host operating systems may also be associated with one or more OS user accounts.
Server 106A can also have one or more network addresses, such as an internet protocol (IP) address, to communicate with other devices, components, or networks. For example, server 106A can have an IP address assigned to a communications interface from hardware resources 212, which it can use to communicate with VMs 202, hypervisor 208, leaf router 104A in
VM capturing agents 204A-C (collectively “204”) can be deployed on one or more of VMs 202. VM capturing agents 204 can be data and packet inspection agents or sensors deployed on VMs 202 to capture packets, flows, processes, events, traffic, and/or any data flowing into, out of, or through VMs 202. VM capturing agents 204 can be configured to export or report any data collected or captured by the capturing agents 204 to a remote entity, such as collectors 118, for example. VM capturing agents 204 can communicate or report such data using a network address of the respective VMs 202 (e.g., VM IP address).
VM capturing agents 204 can capture and report any traffic (e.g., packets, flows, etc.) sent, received, generated, and/or processed by VMs 202. For example, capturing agents 204 can report every packet or flow of communication sent and received by VMs 202. The communication channel between capturing agents 204 and collectors 118 creates a flow in every monitoring period or interval, and the flow generated by capturing agents 204 may be denoted as a control flow. Moreover, any communication sent or received by VMs 202, including data reported from capturing agents 204, can create a network flow. VM capturing agents 204 can report such flows in the form of a control flow to a remote device, such as collectors 118 illustrated in
VM capturing agents 204 can also report multiple flows as a set of flows. When reporting a set of flows, VM capturing agents 204 can include a flow identifier for the set of flows and/or a flow identifier for each flow in the set of flows. VM capturing agents 204 can also include one or more timestamps and other information as previously explained.
VM capturing agents 204 can run as a process, kernel module, or kernel driver on guest operating systems 206 of VMs 202. VM capturing agents 204 can thus monitor any traffic sent, received, or processed by VMs 202, any processes running on guest operating systems 206, any users and user activities on guest operating system 206, any workloads on VMs 202, etc.
Hypervisor capturing agent 210 can be deployed on hypervisor 208. Hypervisor capturing agent 210 can be a data inspection agent or a sensor deployed on hypervisor 208 to capture traffic (e.g., packets, flows, etc.) and/or data flowing through hypervisor 208. Hypervisor capturing agent 210 can be configured to export or report any data collected or captured by hypervisor capturing agent 210 to a remote entity, such as collectors 118, for example. Hypervisor capturing agent 210 can communicate or report such data using a network address of hypervisor 208, such as an IP address of hypervisor 208.
Because hypervisor 208 can see traffic and data originating from VMs 202, hypervisor capturing agent 210 can also capture and report any data (e.g., traffic data) associated with VMs 202. For example, hypervisor capturing agent 210 can report every packet or flow of communication sent or received by VMs 202 and/or VM capturing agents 204. Moreover, any communication sent or received by hypervisor 208, including data reported from hypervisor capturing agent 210, can create a network flow. Hypervisor capturing agent 210 can report such flows in the form of a control flow to a remote device, such as collectors 118 illustrated in
Hypervisor capturing agent 210 can also report multiple flows as a set of flows. When reporting a set of flows, hypervisor capturing agent 210 can include a flow identifier for the set of flows and/or a flow identifier for each flow in the set of flows. Hypervisor capturing agent 210 can also include one or more timestamps and other information as previously explained, such as process and user information.
As previously explained, any communication captured or reported by VM capturing agents 204 can flow through hypervisor 208. Thus, hypervisor capturing agent 210 can observe and capture any flows or packets reported by VM capturing agents 204, including any control flows. Accordingly, hypervisor capturing agent 210 can also report any packets or flows reported by VM capturing agents 204 and any control flows generated by VM capturing agents 204. For example, VM capturing agent 204A on VM 1 (202A) captures flow 1 (“F1”) and reports F1 to collector 118 on
When reporting F1, hypervisor capturing agent 210 can report F1 as a message or report that is separate from the message or report of F1 transmitted by VM capturing agent 204A on VM 1 (202A). However, hypervisor capturing agent 210 can also, or otherwise, report F1 as a message or report that includes or appends the message or report of F1 transmitted by VM capturing agent 204A on VM 1 (202A). In other words, hypervisor capturing agent 210 can report F1 as a separate message or report from VM capturing agent 204A's message or report of F1, and/or a same message or report that includes both a report of F1 by hypervisor capturing agent 210 and the report of F1 by VM capturing agent 204A at VM 1 (202A). In this way, VM capturing agents 204 at VMs 202 can report packets or flows received or sent by VMs 202, and hypervisor capturing agent 210 at hypervisor 208 can report packets or flows received or sent by hypervisor 208, including any flows or packets received or sent by VMs 202 and/or reported by VM capturing agents 204.
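The two reporting options described above (a separate report of F1, or a single report that also includes the VM agent's report of F1) can be sketched as follows. The dictionary layout, the `includes` key, and the function names `separate_report` and `combined_report` are assumptions introduced for illustration only.

```python
from typing import Dict, List


def separate_report(agent_id: str, flow_id: str) -> Dict:
    """Report of a flow as seen by one agent, independent of other reports."""
    return {"agent": agent_id, "flow": flow_id}


def combined_report(agent_id: str, flow_id: str, observed_reports: List[Dict]) -> Dict:
    """Report of a flow that also appends other agents' reports of the same flow."""
    report = separate_report(agent_id, flow_id)
    report["includes"] = [r for r in observed_reports if r["flow"] == flow_id]
    return report


# The VM agent reports F1; the hypervisor agent observes that report in transit.
vm_report = separate_report("vm-agent-204A", "F1")

# Option 1: the hypervisor agent reports F1 as its own separate message.
hv_separate = separate_report("hypervisor-agent-210", "F1")

# Option 2: the hypervisor agent reports F1 and appends the VM agent's report.
hv_combined = combined_report("hypervisor-agent-210", "F1", [vm_report])
```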
Hypervisor capturing agent 210 can run as a process, kernel module, or kernel driver on the host operating system associated with hypervisor 208. Hypervisor capturing agent 210 can thus monitor any traffic sent and received by hypervisor 208, any processes associated with hypervisor 208, etc.
Server 106A can also have server capturing agent 214 running on it. Server capturing agent 214 can be a data inspection agent or sensor deployed on server 106A to capture data (e.g., packets, flows, traffic data, etc.) on server 106A. Server capturing agent 214 can be configured to export or report any data collected or captured by server capturing agent 214 to a remote entity, such as collector 118, for example. Server capturing agent 214 can communicate or report such data using a network address of server 106A, such as an IP address of server 106A.
Server capturing agent 214 can capture and report any packet or flow of communication associated with server 106A. For example, server capturing agent 214 can report every packet or flow of communication sent or received by one or more communication interfaces of server 106A. Moreover, any communication sent or received by server 106A, including data reported from capturing agents 204 and 210, can create a network flow associated with server 106A. Server capturing agent 214 can report such flows in the form of a control flow to a remote device, such as collector 118 illustrated in
Server capturing agent 214 can also report multiple flows as a set of flows. When reporting a set of flows, server capturing agent 214 can include a flow identifier for the set of flows and/or a flow identifier for each flow in the set of flows. Server capturing agent 214 can also include one or more timestamps and other information as previously explained.
Any communications captured or reported by capturing agents 204 and 210 can flow through server 106A. Thus, server capturing agent 214 can observe or capture any flows or packets reported by capturing agents 204 and 210. In other words, network data observed by capturing agents 204 and 210 inside VMs 202 and hypervisor 208 can be a subset of the data observed by server capturing agent 214 on server 106A. Accordingly, server capturing agent 214 can report any packets or flows reported by capturing agents 204 and 210 and any control flows generated by capturing agents 204 and 210. For example, capturing agent 204A on VM 1 (202A) captures flow 1 (F1) and reports F1 to collector 118 as illustrated on
When reporting F1, server capturing agent 214 can report F1 as a message or report that is separate from any messages or reports of F1 transmitted by capturing agent 204A on VM 1 (202A) or capturing agent 210 on hypervisor 208. However, server capturing agent 214 can also, or otherwise, report F1 as a message or report that includes or appends the messages or reports or metadata of F1 transmitted by capturing agent 204A on VM 1 (202A) and capturing agent 210 on hypervisor 208. In other words, server capturing agent 214 can report F1 as a separate message or report from the messages or reports of F1 from capturing agent 204A and capturing agent 210, and/or a same message or report that includes a report of F1 by capturing agent 204A, capturing agent 210, and capturing agent 214. In this way, capturing agents 204 at VMs 202 can report packets or flows received or sent by VMs 202, capturing agent 210 at hypervisor 208 can report packets or flows received or sent by hypervisor 208, including any flows or packets received or sent by VMs 202 and reported by capturing agents 204, and capturing agent 214 at server 106A can report packets or flows received or sent by server 106A, including any flows or packets received or sent by VMs 202 and reported by capturing agents 204, and any flows or packets received or sent by hypervisor 208 and reported by capturing agent 210.
Server capturing agent 214 can run as a process, kernel module, or kernel driver on the host operating system or a hardware component of server 106A. Server capturing agent 214 can thus monitor any traffic sent and received by server 106A, any processes associated with server 106A, etc.
In addition to network data, capturing agents 204, 210, and 214 can capture additional information about the system or environment in which they reside. For example, capturing agents 204, 210, and 214 can capture data or metadata of active or previously active processes of their respective system or environment, operating system user identifiers, metadata of files on their respective system or environment, timestamps, network addressing information, flow identifiers, capturing agent identifiers, etc. Moreover, capturing agents 204, 210, 214 are not specific to any operating system environment, hypervisor environment, network environment, or hardware environment. Thus, capturing agents 204, 210, and 214 can operate in any environment.
As previously explained, capturing agents 204, 210, and 214 can send information about the network traffic they observe. This information can be sent to one or more remote devices, such as one or more servers, collectors, engines, etc. Each capturing agent can be configured to send respective information using a network address, such as an IP address, and any other communication details, such as port number, to one or more destination addresses or locations. Capturing agents 204, 210, and 214 can send metadata about one or more flows, packets, communications, processes, events, etc.
Capturing agents 204, 210, and 214 can periodically report information about each flow or packet they observe. The information reported can contain a list of flows or packets that were active during a period of time (e.g., between the current time and the time at which the last information was reported). The communication channel between the capturing agent and the destination can create a flow in every interval. For example, the communication channel between capturing agent 214 and collector 118 can create a control flow. Thus, the information reported by a capturing agent can also contain information about this control flow. For example, the information reported by capturing agent 210 to collector 118 can include a list of flows or packets that were active at hypervisor 208 during a period of time, as well as information about the communication channel between capturing agent 210 and collector 118 that capturing agent 210 used to report the information.
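By way of illustration only, the periodic reporting described above might be sketched as follows. The class name `IntervalReporter`, its fields, and the dictionary layout of the report are hypothetical and are not drawn from the disclosure; the sketch merely assumes that an agent records the last time each flow was seen and, at each reporting time, lists the flows active since the previous report together with an identifier for its own control flow.

```python
from dataclasses import dataclass, field


@dataclass
class IntervalReporter:
    """Hypothetical helper that batches flows observed between two reports."""

    agent_id: str                # capturing agent identifier
    control_flow_id: str         # identifier of the agent-to-collector channel
    last_report_time: float = 0.0
    _last_seen: dict = field(default_factory=dict)  # flow_id -> last time seen

    def observe(self, flow_id: str, timestamp: float) -> None:
        """Record that a flow was active at the given time."""
        self._last_seen[flow_id] = timestamp

    def build_report(self, now: float) -> dict:
        """List flows active since the previous report, plus the control flow."""
        start = self.last_report_time
        flows = sorted(f for f, ts in self._last_seen.items() if start < ts <= now)
        # Keep only observations newer than this report's window.
        self._last_seen = {f: ts for f, ts in self._last_seen.items() if ts > now}
        self.last_report_time = now
        return {
            "agent_id": self.agent_id,
            "interval": (start, now),
            "flows": flows,
            "control_flow": self.control_flow_id,
        }
```

In this sketch each call to `build_report` covers the window between the previous report and the current time, so a flow observed after one report appears only in the next one.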
In this example, leaf router 104A can include network resources 222, such as memory, storage, communication, processing, input, output, and other types of resources. Leaf router 104A can also include operating system environment 224. The operating system environment 224 can include any operating system, such as a network operating system, embedded operating system, etc. Operating system environment 224 can include processes, functions, and applications for performing networking, routing, switching, forwarding, policy implementation, messaging, monitoring, and other types of operations.
Leaf router 104A can also include capturing agent 226. Capturing agent 226 can be an agent or sensor configured to capture network data, such as flows or packets, sent, received, or processed by leaf router 104A. Capturing agent 226 can also be configured to capture other information, such as processes, statistics, users, alerts, status information, device information, etc. Moreover, capturing agent 226 can be configured to report captured data to a remote device or network, such as collector 118 shown in
Leaf router 104A can be configured to route traffic to and from other devices or networks, such as server 106A. Accordingly, capturing agent 226 can also report data reported by other capturing agents on other devices. For example, leaf router 104A can be configured to route traffic sent and received by server 106A to other devices. Thus, data reported from capturing agents deployed on server 106A, such as VM and hypervisor capturing agents on server 106A, would also be observed by capturing agent 226 and can thus be reported by capturing agent 226 as data observed at leaf router 104A. Such a report can be a control flow generated by capturing agent 226. Data reported by the VM and hypervisor capturing agents on server 106A can therefore be a subset of the data reported by capturing agent 226.
Capturing agent 226 can run as a process or component (e.g., firmware, module, hardware device, etc.) in leaf router 104A. Moreover, capturing agent 226 can be installed on leaf router 104A as a software or firmware agent. In some configurations, leaf router 104A itself can act as capturing agent 226. Moreover, capturing agent 226 can run within operating system 224 and/or separate from operating system 224.
Moreover, VM capturing agent 204A at VM 110A, hypervisor capturing agent 210 at hypervisor 108A, network device capturing agent 226 at leaf router 104A, and any server capturing agent at server 106A (e.g., capturing agent running on host environment of server 106A) can send reports 244 (also referred to as control flows) to collector 118 based on the packets or traffic 242 captured at each respective capturing agent. Reports 244 from VM capturing agent 204A to collector 118 can flow through VM 110A, hypervisor 108A, server 106A, and leaf router 104A. Reports 244 from hypervisor capturing agent 210 to collector 118 can flow through hypervisor 108A, server 106A, and leaf router 104A. Reports 244 from any other server capturing agent at server 106A to collector 118 can flow through server 106A and leaf router 104A. Finally, reports 244 from network device capturing agent 226 to collector 118 can flow through leaf router 104A. Although reports 244 are depicted as being routed separately from traffic 242 in
Reports 244 can include any portion of packets or traffic 242 captured at the respective capturing agents. Reports 244 can also include other information, such as timestamps, process information, capturing agent identifiers, flow identifiers, flow statistics, notifications, logs, user information, system information, etc. Some or all of this information can be appended to reports 244 as one or more labels, metadata, or as part of the header, trailer, or payload of the packet(s). For example, if a user opens a browser on VM 110A and navigates to examplewebsite.com, VM capturing agent 204A of VM 110A can determine which user (i.e., operating system user) of VM 110A (e.g., username “johndoe85”) and which process being executed on the operating system of VM 110A (e.g., “chrome.exe”) were responsible for the particular network flow to and from examplewebsite.com. Once such information is determined, the information can be included in report 244 as labels, for example, and report 244 can be transmitted from VM capturing agent 204A to collector 118. Such additional information can help system 240 to gain insight into flow information at the process and user level, for instance. This information can be used for security, optimization, and determining structures and dependencies within system 240. Moreover, reports 244 can be transmitted to collector 118 periodically as new packets or traffic 242 are captured by a capturing agent. Further, each capturing agent can send a single report or multiple reports to collector 118. For example, each of the capturing agents 116 can be configured to send a report to collector 118 for every flow, packet, message, communication, or network data received, transmitted, and/or generated by its respective host (e.g., VM 110A, hypervisor 108A, server 106A, and leaf router 104A). As such, collector 118 can receive a report of a same packet from multiple capturing agents.
For example, a packet received by VM 110A from fabric 112 can be captured and reported by VM capturing agent 204A. Since the packet received by VM 110A will also flow through leaf router 104A and hypervisor 108A, it can also be captured and reported by hypervisor capturing agent 210 and network device capturing agent 226. Thus, for a packet received by VM 110A from fabric 112, collector 118 can receive a report of the packet from VM capturing agent 204A, hypervisor capturing agent 210, and network device capturing agent 226.
Similarly, a packet sent by VM 110A to fabric 112 can be captured and reported by VM capturing agent 204A. Since the packet sent by VM 110A will also flow through leaf router 104A and hypervisor 108A, it can also be captured and reported by hypervisor capturing agent 210 and network device capturing agent 226. Thus, for a packet sent by VM 110A to fabric 112, collector 118 can receive a report of the packet from VM capturing agent 204A, hypervisor capturing agent 210, and network device capturing agent 226.
On the other hand, a packet originating at, or destined to, hypervisor 108A, can be captured and reported by hypervisor capturing agent 210 and network device capturing agent 226, but not VM capturing agent 204A, as such packet may not flow through VM 110A. Moreover, a packet originating at, or destined to, leaf router 104A, will be captured and reported by network device capturing agent 226, but not VM capturing agent 204A, hypervisor capturing agent 210, or any other capturing agent on server 106A, as such packet may not flow through VM 110A, hypervisor 108A, or server 106A.
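The three cases above can be summarized in a short sketch. It assumes, purely for illustration, that the hosts are modeled as an ordered path from the fabric side inward; the names `PATH` and `observing_agents` are hypothetical and not part of the disclosure.

```python
from typing import List

# Hosts ordered from the fabric side inward, paired with their capturing agents.
PATH = [
    ("leaf router 104A", "network device capturing agent 226"),
    ("hypervisor 108A", "hypervisor capturing agent 210"),
    ("VM 110A", "VM capturing agent 204A"),
]


def observing_agents(endpoint: str) -> List[str]:
    """Return the agents on the path between the fabric and `endpoint`, inclusive."""
    hosts = [host for host, _ in PATH]
    depth = hosts.index(endpoint)  # raises ValueError for an unknown host
    return [agent for _, agent in PATH[: depth + 1]]
```

Under this model, a packet to or from VM 110A yields all three agents, a packet to or from hypervisor 108A yields the hypervisor and network device agents, and a packet to or from leaf router 104A yields only network device capturing agent 226.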
Each of the capturing agents 204A, 210, 226 can include a respective unique capturing agent identifier on each of reports 244 it sends to collector 118, to allow collector 118 to determine which capturing agent sent the report. Reports 244 can be used to analyze network and/or system data and conditions for troubleshooting, security, visualization, configuration, planning, and management. Capturing agent identifiers in reports 244 can also be used to determine which capturing agents reported what flows. This information can then be used to determine capturing agent placement and topology, as further described below, as well as mapping individual flows to processes and users. Such additional insights gained can be useful for analyzing the data in reports 244, as well as troubleshooting, security, visualization, configuration, planning, and management.
Server 106A and hypervisor 108A can receive flow 302 from leaf router 104A. Hypervisor 108A can then forward the received flow 302 to VM 110A. Hypervisor capturing agent 210 can also capture the received flow 302 and send a new control flow 306, reporting the received flow 302, to collector 118. Hypervisor capturing agent 210 may include in control flow 306 any additional information such as process information and user information related to hypervisor 108A and flow 302. Leaf router 104A can receive control flow 306, reporting flow 302, originating from hypervisor capturing agent 210, and forward flow 306 to collector 118. Network device capturing agent 226 can also capture control flow 306 received from hypervisor capturing agent 210, and send a new control flow 308, reporting flow 306, to collector 118. Again, network device capturing agent 226 may include in control flow 308 any additional information such as process information and user information related to network device 104A and flow 306.
Moreover, VM 110A can receive flow 302 from hypervisor 108A. At this point, flow 302 has reached its intended destination: VM 110A. Accordingly, VM 110A can then process flow 302. Once flow 302 is received by VM 110A, VM capturing agent 204A can capture received flow 302 and send a new control flow 310, reporting the receipt of flow 302, to collector 118. VM capturing agent 204A can include in control flow 310 any additional information such as process information and user information related to VM 110A and flow 302.
Hypervisor 108A can receive control flow 310 from VM capturing agent 204A, and forward it to leaf router 104A. Hypervisor capturing agent 210 can also capture flow 310, received from VM capturing agent 204A and reporting the receipt of flow 302, and send a new control flow 312, reporting flow 310, to collector 118. Hypervisor capturing agent 210 may include in control flow 312 any additional information such as process information and user information related to hypervisor 108A and flow 310.
Leaf router 104A can receive flow 310 forwarded from hypervisor 108A, and forward it to collector 118. Network device capturing agent 226 can also capture flow 310, forwarded from hypervisor 108A and reporting the receipt of flow 302 at VM 110A, and send a new control flow 314, reporting flow 310, to collector 118. Network device capturing agent 226 may include in control flow 314 any additional information such as process information and user information related to network device 104A and flow 310.
Leaf router 104A can receive control flow 312 from hypervisor capturing agent 210 and forward it to collector 118. Network device capturing agent 226 can also capture flow 312 and send a new control flow 316, reporting flow 312, to collector 118. Network device capturing agent 226 may include in control flow 316 any additional information such as process information and user information related to network device 104A and flow 312.
As described above, in this example, flow 302, sent from fabric 112 and destined for VM 110A, can be reported by network device capturing agent 226, hypervisor capturing agent 210, and VM capturing agent 204A to collector 118. In addition, hypervisor capturing agent 210 and network device capturing agent 226 can each report the communication from VM 110A to collector 118, reporting flow 302 to collector 118. Moreover, network device capturing agent 226 can report any communications from hypervisor capturing agent 210 reporting flows or communications captured by hypervisor capturing agent 210. As one of skill in the art will understand, control flows 304, 306, 308, 310, 312, 314, 316 need not be reported to collector 118 in the order presented in this disclosure, as long as each control flow is transmitted or forwarded to another device after the flow that it reports is received. For example, control flow 314, which reports flow 310, may be transmitted to collector 118 either before or after each of control flows 308, 312, 316 is transmitted or forwarded to collector 118, as long as control flow 314 is transmitted sometime after flow 310 is received at leaf router 104A. This applies to other control flows illustrated throughout this disclosure, especially those shown in
Referring to
Server 106A and hypervisor 108A can receive flow 324 from leaf router 104A. Hypervisor 108A can process received flow 324. Hypervisor capturing agent 210 can also capture received flow 324 and send a new control flow 320, reporting received flow 324, to collector 118. Hypervisor capturing agent 210 may include in control flow 320 any additional information such as process information and user information related to hypervisor 108A and flow 324. Leaf router 104A can receive flow 320, reporting flow 324, from hypervisor capturing agent 210, and forward control flow 320 to collector 118. Network device capturing agent 226 can also capture flow 320 received from hypervisor capturing agent 210, and send a new control flow 322, reporting flow 320, to collector 118. Network device capturing agent 226 may include in control flow 322 any additional information such as process information and user information related to network device 104A and flow 320.
As described above, in this example, flow 324, sent from fabric 112 and destined for hypervisor 108A, can be reported by network device capturing agent 226 and hypervisor capturing agent 210 to collector 118. In addition, network device capturing agent 226 can report the communication from hypervisor 108A to collector 118, reporting flow 324 to collector 118.
Referring to
Referring to
VM capturing agent 204A can also capture flow 330 and send a new control flow 332, reporting flow 330, to collector 118. VM capturing agent 204A may include in control flow 332 any additional information such as process information and user information related to VM 110A and flow 330. Hypervisor capturing agent 210 can also capture flow 330 and send a new control flow 334, reporting flow 330, to collector 118. Hypervisor capturing agent 210 may include in control flow 334 any additional information such as process information and user information related to hypervisor 108A and flow 330. Similarly, network device capturing agent 226 can capture flow 330, and send a new control flow 336, reporting flow 330, to collector 118. Network device capturing agent 226 may include in control flow 336 any additional information such as process information and user information related to network device 104A and flow 330.
Hypervisor capturing agent 210 can also capture flow 332, reporting flow 330 by VM capturing agent 204A, and send a new control flow 338, reporting flow 332, to collector 118. Hypervisor capturing agent 210 may include in control flow 338 any additional information such as process information and user information related to hypervisor 108A and flow 332.
Network device capturing agent 226 can similarly capture flow 332, reporting flow 330 by VM capturing agent 204A, and send a new control flow 340, reporting flow 332, to collector 118. Network device capturing agent 226 may include in control flow 340 any additional information such as process information and user information related to network device 104A and flow 332. Moreover, network device capturing agent 226 can capture flow 338, reporting flow 332 from hypervisor capturing agent 210, and send a new control flow 342, reporting flow 338, to collector 118. Network device capturing agent 226 may include in control flow 342 any additional information such as process information and user information related to network device 104A and flow 338.
As described above, in this example, flow 330, sent from VM 110A and destined for fabric 112, can be reported by network device capturing agent 226, hypervisor capturing agent 210, and VM capturing agent 204A to collector 118. In addition, hypervisor capturing agent 210 and network device capturing agent 226 can each report the communication (i.e., control flow) from VM 110A to collector 118, reporting flow 330 to collector 118. Network device capturing agent 226 can also report any communications from hypervisor capturing agent 210 reporting flows or communications captured by hypervisor capturing agent 210.
Referring to
Hypervisor capturing agent 210 can also capture flow 344 and send a new control flow 346, reporting flow 344, to collector 118. Hypervisor capturing agent 210 may include in control flow 346 any additional information such as process information and user information related to hypervisor 108A and flow 344. Similarly, network device capturing agent 226 can capture flow 344, and send a new control flow 348, reporting flow 344, to collector 118. Again, network device capturing agent 226 may include in control flow 348 any additional information such as process information and user information related to network device 104A and flow 344.
Network device capturing agent 226 can also capture flow 346, reporting flow 344 by hypervisor capturing agent 210, and send a new control flow 350, reporting flow 346, to collector 118. Network device capturing agent 226 may include in control flow 350 any additional information such as process information and user information related to network device 104A and flow 346.
Referring to
VM capturing agent 204A can be configured to report to collector 118 traffic sent, received, or processed by VM 110A. Hypervisor capturing agent 210 can be configured to report to collector 118 traffic sent, received, or processed by hypervisor 108A. Finally, network device capturing agent 226 can be configured to report to collector 118 traffic sent, received, or processed by leaf router 104A.
Collector 118 can thus receive flows 402 from VM capturing agent 204A, flows 404 from hypervisor capturing agent 210, and flows 406 from network device capturing agent 226. Flows 402, 404, and 406 can include control flows. Flows 402 can include flows captured by VM capturing agent 204A at VM 110A.
Flows 404 can include flows captured by hypervisor capturing agent 210 at hypervisor 108A. Flows captured by hypervisor capturing agent 210 can also include flows 402 captured by VM capturing agent 204A, as traffic sent and received by VM 110A will be received and observed by hypervisor 108A and captured by hypervisor capturing agent 210.
Flows 406 can include flows captured by network device capturing agent 226 at leaf router 104A. Flows captured by network device capturing agent 226 can also include flows 402 captured by VM capturing agent 204A and flows 404 captured by hypervisor capturing agent 210, as traffic sent and received by VM 110A and hypervisor 108A is routed through leaf router 104A and can thus be captured by network device capturing agent 226.
Collector 118 can collect flows 402, 404, and 406, and store the reported data. Collector 118 can also forward some or all of flows 402, 404, and 406, and/or any respective portion thereof, to engine 120. Engine 120 can process the information, including any process information and user information, received from collector 118 to identify patterns, conditions, statuses, network or device characteristics; log statistics or history details; aggregate and/or process the data; generate reports, timelines, alerts, graphical user interfaces; detect errors, events, inconsistencies; troubleshoot networks or devices; configure networks or devices; deploy services or devices; reconfigure services, applications, devices, or networks; etc. In particular, collector 118 or engine 120 can map individual flows that traverse VM 110A, hypervisor 108A, and/or leaf router 104A to specific processes or users that are associated with VM 110A, hypervisor 108A, and/or leaf router 104A. For example, collector 118 or engine 120 can determine that a particular flow that originated from VM 110A and was destined for fabric 112 was sent by an OS user named X on VM 110A via a process named Y on VM 110A. It may be determined that the same flow was received by a process named Z on hypervisor 108A and forwarded to a process named W on leaf router 104A.
While engine 120 is illustrated as a separate entity, other configurations are also contemplated herein. For example, engine 120 can be part of collector 118 and/or a separate entity. Indeed, engine 120 can include one or more devices, applications, modules, databases, processing components, elements, etc. Moreover, collector 118 can represent one or more collectors. For example, in some configurations, collector 118 can include multiple collection systems or entities, which can reside in one or more networks.
Since flow 1 (502) has been observed by VM 110A, hypervisor 108A, and leaf router 104A, it can be captured and reported to collector 118 by VM capturing agent 204A at VM 110A, hypervisor capturing agent 210 at hypervisor 108A, and network device capturing agent 226 at leaf router 104A. On the other hand, since flow 2 (504) has been observed by hypervisor 108A and leaf router 104A but not by VM 110A, it can be captured and reported to collector 118 by hypervisor capturing agent 210 at hypervisor 108A and network device capturing agent 226 at leaf router 104A, but not by VM capturing agent 204A at VM 110A. Finally, since flow 3 (506) has only been observed by leaf router 104A, it can be captured and reported to collector 118 only by capturing agent 226 at leaf router 104A.
The reports or control flows received by collector 118 can include information identifying the reporting capturing agent. For example, when transmitting a report to collector 118, each capturing agent can include a unique capturing agent identifier, which the collector 118 and/or any other entity reviewing the reports can use to map a received report with the reporting capturing agent. Furthermore, the reports or control flows received by collector 118 can include information identifying the process or the user responsible for the flow being reported. Collector 118 can use such information to map the flows to corresponding processes or users.
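As a non-limiting sketch of such a mapping, a collector could index incoming reports by flow identifier and, within each flow, by capturing agent identifier. The function name `index_reports` and the report keys (`flow_id`, `agent_id`, `process`, `user`) are assumptions made for illustration, not a format defined by the disclosure.

```python
from collections import defaultdict


def index_reports(reports):
    """Group reports by flow, recording the process and user seen at each agent."""
    index = defaultdict(dict)
    for report in reports:
        index[report["flow_id"]][report["agent_id"]] = {
            "process": report.get("process"),  # None when the report has no label
            "user": report.get("user"),
        }
    return dict(index)


# Example: one flow reported by three agents with different local processes.
example = index_reports([
    {"flow_id": "flow-1", "agent_id": "204A", "process": "Y", "user": "X"},
    {"flow_id": "flow-1", "agent_id": "210", "process": "Z"},
    {"flow_id": "flow-1", "agent_id": "226", "process": "W"},
])
```

The resulting index answers both questions raised above: the keys under a flow show which capturing agents reported it, and the values show which process or user each agent associated with it.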
Thus, based on the reports from capturing agents 204A, 210, and 226, collector 118 and/or a separate entity (e.g., engine 120) can determine that flow 1 (502) was observed and reported by capturing agent 204A at VM 110A, capturing agent 210 at hypervisor 108A, and capturing agent 226 at leaf router 104A; flow 2 (504) was observed and reported by capturing agent 210 at hypervisor 108A and capturing agent 226 at leaf router 104A; and flow 3 (506) was only observed and reported by capturing agent 226 at leaf router 104A. Based on this information, collector 118 and/or a separate entity, can determine the placement of capturing agents 204A, 210, 226 within VM 110A, hypervisor 108A, and leaf router 104A, as further described below. In other words, this information can allow a device, such as collector 118, to determine which of capturing agents 204A, 210, 226 is located at VM 110A, which is located at hypervisor 108A, and which is located at leaf router 104A. If any of VM 110A, hypervisor 108A, and leaf router 104A is moved to a different location (e.g., VM 110A moved to server 106C and hypervisor 108B), the new flows collected by collector 118 can be used to detect the new placement and topology of VM 110A, hypervisor 108A, and leaf router 104A and/or their respective capturing agents. Furthermore, the process and/or user information included in the control flows received at collector 118 may also assist in determining how VM 110A, hypervisor 108A, and/or leaf router 104A may move to a different location within the network. For example, by recognizing that a new device that just appeared in the network is sending out a flow that matches the process and/or user profiles of a previously known device, such as VM 110A, collector 118 can determine that the new device is actually VM 110A that just moved to a different location (e.g., from server 1 (106A) to server 4 (106D)) within the network topology.
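One simple way to picture this inference, offered only as a sketch and not necessarily the technique the disclosure goes on to describe, is to order the capturing agents by the sets of flows they reported. It assumes that each inner agent's flows form a strict subset of the flows reported by the agent enclosing it; the function name `infer_nesting` and the agent labels are hypothetical.

```python
def infer_nesting(reported):
    """Order agents from innermost to outermost by their reported flow sets.

    `reported` maps an agent identifier to the set of flow identifiers it reported.
    """
    order = sorted(reported, key=lambda agent: len(reported[agent]))
    for inner, outer in zip(order, order[1:]):
        # Each agent's flows must be a strict subset of the next agent's flows.
        if not reported[inner] < reported[outer]:
            raise ValueError(f"{inner} is not nested inside {outer}")
    return order


placement = infer_nesting({
    "226": {"flow 1", "flow 2", "flow 3"},
    "204A": {"flow 1"},
    "210": {"flow 1", "flow 2"},
})
# placement lists the VM agent first, then the hypervisor agent, then the leaf router agent.
```

If the flow sets do not nest, the sketch raises an error rather than guessing, since the subset assumption would not hold for agents on unrelated hosts.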
Flow identifier (e.g., unique identifier associated with the flow).
Capturing agent identifier (e.g., data uniquely identifying reporting capturing agent).
Timestamp (e.g., time of event, report, etc.).
Interval (e.g., time between current report and previous report, interval between flows or packets, interval between events, etc.).
Duration (e.g., duration of event, duration of communication, duration of flow, duration of report, etc.).
Flow direction (e.g., egress flow, ingress flow, etc.).
Application identifier (e.g., identifier of application associated with flow, process, event, or data).
Port (e.g., source port, destination port, layer 4 port, etc.).
Destination address (e.g., interface address associated with destination, IP address, domain name, network address, hardware address, virtual address, physical address, etc.).
Source address (e.g., interface address associated with source, IP address, domain name, network address, hardware address, virtual address, physical address, etc.).
Interface (e.g., interface address, interface information, etc.).
Protocol (e.g., layer 4 protocol, layer 3 protocol, etc.).
Event (e.g., description of event, event identifier, etc.).
Flag (e.g., layer 3 flag, flag options, etc.).
Tag (e.g., virtual local area network tag, etc.).
Process (e.g., process identifier, etc.).
User (e.g., OS username, etc.).
Bytes (e.g., flow size, packet size, transmission size, etc.).
The listing 700 includes a non-limiting example of fields in a report. Other fields and data items are also contemplated herein, such as handshake information, system information, network address associated with capturing agent or host, operating system environment information, network data or statistics, process statistics, system statistics, etc. The order in which these fields are illustrated is also exemplary and can be rearranged in any other way. One or more of these fields can be part of a header, a trailer, or a payload of one or more packets. Moreover, one or more of these fields can be applied to the one or more packets as labels. Each of the fields can include data, metadata, and/or any other information relevant to the fields.
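For illustration, the example fields above could be represented as a simple record. The class name `FlowReport`, the attribute names, the chosen types, and the JSON serialization are all assumptions of this sketch; the disclosure does not prescribe an encoding.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FlowReport:
    """Hypothetical record mirroring the example report fields."""

    flow_id: str
    agent_id: str                      # capturing agent identifier
    timestamp: float
    interval: Optional[float] = None
    duration: Optional[float] = None
    direction: Optional[str] = None    # e.g., "ingress" or "egress"
    application_id: Optional[str] = None
    port: Optional[int] = None
    destination_address: Optional[str] = None
    source_address: Optional[str] = None
    interface: Optional[str] = None
    protocol: Optional[str] = None
    event: Optional[str] = None
    flag: Optional[str] = None
    tag: Optional[str] = None
    process: Optional[str] = None
    user: Optional[str] = None
    byte_count: Optional[int] = None

    def to_json(self) -> str:
        """Serialize the report, omitting fields that were not set."""
        fields = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(fields, sort_keys=True)
```

Omitting unset fields reflects the point that the listing is non-limiting: a given report may carry only some of the fields.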
Having disclosed some basic system components and concepts, the disclosure now turns to the exemplary method embodiments shown in
In
At step 804, capturing agent 116 can generate a control flow based on the network flow. The control flow can include metadata describing the network flow. The metadata can relate to network data, an active process of the system, a previously active process of the device, and/or a file that is present on the device. The metadata can also relate to operating system user identifiers, timestamps, network addressing information, flow identifiers, capturing agent identifiers, time interval, interval duration, flow direction, application identifier, port, destination address, source address, interface, protocol, event, flag, tag, user, size, handshake information, statistics, etc. with regard to the network flow being monitored and reported.
At step 806, capturing agent 116 can determine which process executing on the first device is associated with the network flow to yield process information. The process information may include the process identifier of the process. Furthermore, the process information may include information about the OS username associated with the process. The identified process may be responsible for sending, receiving, or otherwise processing the network flow. The process can belong to the operating system environment of the first device. Capturing agent 116 can further determine which OS user of the first device is associated with the network flow to yield user information.
The capturing agent 116 can determine which kernel module has been loaded and/or query the operating system to determine which process is executing on the first device. The capturing agent 116 can also determine process ownership information to identify which user has executed a particular service or process.
At step 808, capturing agent 116 can label the control flow with the process information to yield a labeled control flow. Capturing agent 116 can further label the control flow with user information. The process and/or user information can be applied or added to the control flow as part of a header, a trailer, or a payload.
At step 810, capturing agent 116 can transmit the labeled control flow to a second device in the network. The second device can be a collector that is configured to receive a plurality of control flows from a plurality of devices, particularly from their capturing agents, and analyze the plurality of control flows to determine relationships between network flows and corresponding processes. Those other devices can also be VMs, hypervisors, servers, network devices, etc. equipped with VM capturing agents, hypervisor capturing agents, server capturing agents, network device capturing agents, etc. The second device can map the relationships between the network flows and the corresponding processes within the first device and other devices in the plurality of devices. The second device or another device can utilize this information to identify patterns, conditions, statuses, network or device characteristics; log statistics or history details; aggregate and/or process the data; generate reports, timelines, alerts, graphical user interfaces; detect errors, events, inconsistencies; troubleshoot networks or devices; configure networks or devices; deploy services or devices; reconfigure services, applications, devices, or networks; etc.
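Steps 804 through 810 can be pictured with the following sketch, which takes an already captured network flow as its input. Everything in it is hypothetical: the lookup tables `PORT_TO_PID` and `PID_TO_PROCESS` stand in for whatever operating system queries an agent would actually perform, the dictionary layout is assumed, and a Python list stands in for the collector.

```python
# Hypothetical lookup tables standing in for operating system queries.
PORT_TO_PID = {49152: 4242}
PID_TO_PROCESS = {4242: {"pid": 4242, "name": "chrome.exe", "user": "johndoe85"}}


def generate_control_flow(flow, agent_id):
    """Step 804: build metadata describing the captured network flow."""
    return {
        "agent_id": agent_id,
        "flow_id": flow["flow_id"],
        "source_port": flow["source_port"],
        "labels": {},
    }


def determine_process(flow):
    """Step 806: find the local process (and its OS user) tied to the flow."""
    pid = PORT_TO_PID.get(flow["source_port"])
    return PID_TO_PROCESS.get(pid)  # None when no process is known


def label_control_flow(control_flow, process_info):
    """Step 808: attach process and user information as labels."""
    labeled = dict(control_flow)
    labeled["labels"] = dict(control_flow["labels"])
    if process_info is not None:
        labeled["labels"].update(
            process=process_info["name"],
            pid=process_info["pid"],
            user=process_info["user"],
        )
    return labeled


def transmit(labeled, collector_inbox):
    """Step 810: stand-in for sending the labeled control flow to the collector."""
    collector_inbox.append(labeled)


def report_flow(flow, agent_id, collector_inbox):
    """Run steps 804 through 810 for one captured flow."""
    control_flow = generate_control_flow(flow, agent_id)
    labeled = label_control_flow(control_flow, determine_process(flow))
    transmit(labeled, collector_inbox)
    return labeled
```

For a flow whose source port is present in the table, the labeled control flow carries the process name, process identifier, and OS username; for an unknown port, the control flow is transmitted with no labels.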
In
At step 904, capturing agent 116 can generate a control flow based on the network flow. The control flow can include metadata describing the network flow. The metadata can relate to network data, an active process of the system, a previously active process of the device, and/or a file that is present on the device. The metadata can also relate to processes, timestamps, network addressing information, flow identifiers, capturing agent identifiers, time interval, interval duration, flow direction, application identifier, port, destination address, source address, interface, protocol, event, flag, tag, size, handshake information, statistics, etc. with regard to the network flow being monitored and reported.
At step 906, capturing agent 116 can determine which user of the first device is associated with the network flow to yield user information. The user can be an operating system user account. The user information may include the username or the user identifier associated with the user. The user may be an OS user of the first device's OS environment. The user may be associated with a process that sends, receives, or otherwise processes the network flow. Capturing agent 116 can further determine which process executing on the first device is associated with the network flow to yield process information.
At step 908, capturing agent 116 can label the control flow with the user information to yield a labeled control flow. Capturing agent 116 can further label the control flow with process information. The process and/or user information can be applied or added to the control flow as part of a header, a trailer, or a payload.
At step 910, capturing agent 116 can transmit the labeled control flow to a second device in the network. The second device can be a collector that is configured to receive a plurality of control flows from a plurality of devices, particularly from their capturing agents, and analyze the plurality of control flows to determine relationships between network flows and corresponding processes. Those other devices can also be VMs, hypervisors, servers, network devices, etc. equipped with VM capturing agents, hypervisor capturing agents, server capturing agents, network device capturing agents, etc. The second device can map the relationships between the network flows and the corresponding users associated with the first device or another device in the plurality of devices. The second device or some other device can utilize this information to identify patterns, conditions, statuses, network or device characteristics; log statistics or history details; aggregate and/or process the data; generate reports, timelines, alerts, graphical user interfaces; detect errors, events, inconsistencies; troubleshoot networks or devices; configure networks or devices; deploy services or devices; reconfigure services, applications, devices, or networks; etc.
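The labeling and mapping steps described above can be illustrated with a minimal sketch. The names ControlFlow, label_control_flow, and map_flows_to_users, as well as the field names, are hypothetical and chosen only for illustration; the sketch assumes the user and process information is carried in a payload-style dictionary, which is just one of the header, trailer, or payload options mentioned above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ControlFlow:
    """Metadata describing one observed network flow (illustrative fields only)."""
    flow_id: str
    src: str
    dst: str
    protocol: str
    agent_id: str
    # Assumption: labels travel as a payload-style dictionary.
    labels: dict = field(default_factory=dict)


def label_control_flow(flow: ControlFlow,
                       user: Optional[str] = None,
                       process: Optional[str] = None) -> ControlFlow:
    """Return a labeled copy of the control flow; the input is left unchanged."""
    labels = dict(flow.labels)
    if user is not None:
        labels["user"] = user
    if process is not None:
        labels["process"] = process
    return ControlFlow(flow.flow_id, flow.src, flow.dst,
                       flow.protocol, flow.agent_id, labels)


def map_flows_to_users(labeled_flows):
    """Collector side: group flow IDs by (capturing agent, user)."""
    mapping = {}
    for f in labeled_flows:
        key = (f.agent_id, f.labels.get("user"))
        mapping.setdefault(key, []).append(f.flow_id)
    return mapping
```

In this sketch, a collector that receives labeled control flows from many capturing agents could call map_flows_to_users to relate each flow to the user reported by the agent that observed it.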
For the Tetration analytics, sensors are placed in the VMs, hypervisors, and switches within the ACI environment. The sensors can be used to analyze traffic to and from each point along the path of a packet, namely, the VM, hypervisor, and switch. This layered sensor structure allows for granular packet statistics and data at each hop. However, it is very important to be able to detect the characteristics and context of each sensor as this information can be used as part of the analytics. For example, when analyzing and collecting data from a sensor, it would be very helpful to know if the sensor resides in a VM, hypervisor, or switch, and what OS and environment is running on the sensor's system.
Determining whether a sensor in the Tetration platform resides in a VM or hypervisor and identifying the underlying environment of the system associated with the sensor.
Advantages include: This invention is extremely important for the Tetration platform, as it provides essential information that drives the Tetration platform. For example, the sensors in the VMs, hypervisors, and switches provide important analytics for the network. This invention helps us better understand the analytics data collected by the sensors. This information can be used for security, planning, deployment, determining dependencies, and troubleshooting.
The current invention can be used to identify the characteristics and context of each sensor within the Tetration platform. Specifically, this invention can be used to determine whether a sensor resides in a VM, hypervisor, or switch, and which OS or environment is running on the system where the sensor resides. For example, a sensor can monitor and analyze the system where the sensor resides and any traffic associated with that system in order to determine whether the sensor is in a VM or hypervisor, and identify the underlying environment (e.g., OS). By determining whether the sensor is in a VM or hypervisor, we can make various types of inferences about the traffic collected and monitored by the sensor as well as the statistics and activity at each hop. This information can also help us identify the structure or topology of the network, the communication path of traffic, and the security conditions of the network. Thus, this invention provides very important information that drives the Tetration platform.
The Tetration platform uses sensors placed within the VMs, hypervisors, and switches (ToRs) for data analytics. Each of the sensors needs to be properly configured and set up to perform its operations and transmit its data to the appropriate collector (or any other device). However, as changes are made to the network or systems experience problems, these sensors need to be updated in order to continue functioning properly. In the case of large data centers, it is extremely difficult to update and maintain the proper configurations and settings on all sensors. This is particularly difficult in the context of the Tetration platform, which involves many collectors and sensors.
This invention provides a centralized mechanism which tracks collector information, such as status, location, and collector-to-sensor mappings, as well as sensor information, such as location of specific sensors, and updates the configuration settings of sensors as necessary to maintain accurate and up-to-date collector-to-sensor mappings.
Advantages include: This invention can detect current collector and sensor status, conditions, and updates, and can dynamically update and maintain proper collector-to-sensor mappings from a centralized location. This invention can thus provide a feasible solution for ensuring that collectors and sensors are always functioning properly.
Industry use: There are other solutions for providing centralized upgrades of software and settings. However, we are not aware of any centralized solutions for tracking current sensor and collector information (e.g., status, location, settings, and mappings) and dynamically updating sensors as necessary to maintain proper and functioning collector-to-sensor mappings and other sensor settings.
Sensors need to have certain configuration settings to run in the Tetration platform, such as where their corresponding collectors are located. This invention provides a centralized mechanism which tracks collector information, such as status, location, and collector-to-sensor mappings, as well as sensor information, such as location of specific sensors, and updates the configuration settings of sensors as necessary to maintain accurate and up-to-date collector to sensor mappings. For example, if a sensor is configured to send traffic data to a specific collector and that collector goes down, the centralized system can detect that the collector is down and the sensor needs updated configuration settings. The centralized system can then determine which collector should be assigned to that sensor, and dynamically update the configuration settings of the sensor to point the sensor to the new collector. In this way, the centralized system can maintain the accuracy of the configuration settings of the sensors and ensure that the sensors are always connected to a collector even when an assigned collector goes down or the sensor otherwise experiences problems contacting its assigned collector.
To identify if there are changes on collectors, we run analytics on the collectors. The analytics can help identify which collectors are functioning and which are non-functioning. The analytics are based on data pushed from the collectors to a monitoring system, which can be the centralized system. In some cases, the trigger to switch from a collector to another collector can be based on health, where health can include memory usage, CPU utilization, bandwidth, or errors. In mapping collectors to sensors, the centralized system can use the analytics to load balance collectors.
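As a rough illustration of the health-based reassignment described above, the following sketch moves sensors away from unhealthy collectors and onto the least-loaded healthy collector. The CollectorHealth fields, the threshold values, and the function names are assumptions made for this example and are not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class CollectorHealth:
    name: str
    up: bool
    cpu: float     # fraction of CPU in use, 0.0 to 1.0
    memory: float  # fraction of memory in use, 0.0 to 1.0
    errors: int    # error count reported in the last interval


def is_healthy(c, max_cpu=0.9, max_memory=0.9, max_errors=100):
    """Hypothetical health rule; the thresholds are illustrative only."""
    return (c.up and c.cpu <= max_cpu
            and c.memory <= max_memory and c.errors <= max_errors)


def rebalance(mapping, health):
    """mapping: sensor -> collector name; health: collector name -> CollectorHealth.

    Sensors already on a healthy collector stay put; the others are moved to
    the healthy collector that currently serves the fewest sensors.
    """
    healthy = [name for name, h in health.items() if is_healthy(h)]
    if not healthy:
        raise RuntimeError("no healthy collector available")
    load = {name: 0 for name in healthy}
    new_mapping = {}
    for sensor, collector in mapping.items():
        if collector in load:
            new_mapping[sensor] = collector
            load[collector] += 1
    for sensor in mapping:
        if sensor not in new_mapping:
            # Ties are broken by collector name so the result is deterministic.
            target = min(load, key=lambda name: (load[name], name))
            new_mapping[sensor] = target
            load[target] += 1
    return new_mapping
```

A centralized system could run a function like rebalance whenever the analytics pushed from the collectors change, and then push the resulting mapping to the affected sensors as updated configuration settings.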
Detecting Virtual Switching Devices and Forwarding Models Used by Hypervisors in Tetration Platform
A hypervisor may host multiple VMs which can communicate with each other and the Internet. The hypervisor will also include virtual switching devices. The virtual switching devices forward data between the VMs and the Internet. When handling or forwarding packets, the virtual switching devices typically use different forwarding models depending on the type of virtual switching device, such as Linux bridge, Open vSwitch, vNic, or other software switch. It is important to understand what type of switching device and forwarding model is used by a hypervisor in order to optimize connections and properly attach VMs to the virtual switching device.
Conventional solutions can identify the name of the virtual switching device by looking at the hypervisor's configuration. For example, conventional systems can determine that a virtual switching device may be named vSwitch 1. In some cases, a device type can be inferred from the name given to the device by an administrator. However, the name does not always indicate the exact type of the device, particularly if the name attributed to the device is vague or nondescriptive. Moreover, looking at each individual hypervisor's configuration can be unfeasible and does not scale.
This invention provides a mechanism for identifying the type of virtual switching devices and forwarding models used by hypervisors and VMs in the Tetration platform.
Advantages include: By identifying the type of virtual switching device used by the hypervisor and VMs, the sensor can determine the forwarding model used by the particular virtual switching device(s). This information can be used to determine how a component such as a VM should attach to the virtual switching device(s). Thus, the device type information can help optimize the connections between VMs, hypervisors, and virtual switching devices. In addition, by knowing the forwarding model used by a virtual switching device, the sensors and collectors in the Tetration platform can determine how to collect data and which data to collect or ignore. This can be determined based on known behavior of a particular forwarding model.
Industry use: Conventional solutions can identify the name of the virtual switching device by looking at the hypervisor's configuration. We are not aware of any prior art that can extract this information from traffic analyzed by sensors.
The current invention provides sensors on the hypervisors in the Tetration platform, which can capture and analyze packets to and from the virtual switching devices on the hypervisors. The data extracted from the captured packets can be used to determine what type of virtual switching device(s) are used by the hypervisors. For example, a sensor in a hypervisor can analyze traffic to determine if a virtual switching device used by the hypervisor is a Linux bridge, an Open vSwitch, or a vNic. By identifying the type of virtual switching device used by the hypervisor and VMs, the sensor can determine the forwarding model used by the particular virtual switching device(s). This information can then be used to determine how a component such as a VM should attach to the virtual switching device(s). Thus, the device type information can help optimize the connections between VMs, hypervisors, and virtual switching devices. In addition, by knowing the forwarding model used by a virtual switching device, which can be ascertained from the device type, the sensors and collectors can determine how to collect data and which data to collect or ignore. For example, by knowing the forwarding model of the virtual switching device, the sensor can avoid collecting redundant data, because it can determine what data may be redundant based on the known behavior of the virtual switching device(s) identified from the forwarding model.
Self Policing of Resources Used by Sensors in VMs and Hypervisors
The Tetration platform implements sensors in the VMs, hypervisors, and switches in the ACI in order to perform analytics and collect information for troubleshooting, planning, deployment, and security. As the amount of traffic seen by the sensors grows and events such as errors or even potential attacks occur, the sensors can begin to consume more resources, such as bandwidth, memory, CPU utilization, network traffic, etc. However, it is important to monitor and manage the amount of resources used by the sensors to ensure that the sensors themselves do not become a bottleneck or negatively impact the system's or network's performance.
This invention allows sensors to track and monitor themselves inside of the system (i.e., the hypervisor or VM) to identify activity and resource usage. The sensors can detect high resource usage and take corrective actions. The sensors can be designed with a core layer running the sensor logic and an outer shell running the monitoring logic.
Advantages include: The implementation of sensors in VMs, hypervisors, and switches can be very useful for analytics, management, and troubleshooting. The self-monitoring mechanism allows the sensors to run properly and efficiently and avoid creating unnecessary burdens on the network and systems.
Industry use: We are not aware of any prior art that teaches sensors policing themselves as disclosed herein or implementing the sensor architecture used in this invention.
This invention allows sensors to track and monitor themselves inside of the system (i.e., the hypervisor or VM). The sensors can continuously monitor themselves. In some cases, the sensors can perform synchronistic monitoring. Moreover, the sensor can track its usage of resources, such as bandwidth, memory, CPU utilization, network traffic, etc. The sensor can then detect resource usage above a particular threshold, which can be predetermined or configured, and take corrective actions or develop a plan. Thresholds can be based on rules which can be specific for a context, service, device, or performance requirement. In some cases, the sensor can relaunch itself if resource usage is above a threshold, as a mechanism of corrective action.
To implement this self-monitoring concept, the sensors can have a particular architecture with two layers. The first layer can be a core layer which corresponds to the sensor's logic. The second layer can correspond to an outer layer which contains the logic for performing the self-monitoring and policing.
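The two-layer arrangement can be sketched as follows. This is only an illustration under stated assumptions: the class names CoreSensor and MonitoredSensor are hypothetical, resource usage is supplied by a caller-provided function rather than read from the operating system, and relaunching is the only corrective action shown.

```python
class CoreSensor:
    """Core layer: stands in for the sensor's own capture logic."""

    def __init__(self):
        self.launches = 0

    def start(self):
        self.launches += 1

    def stop(self):
        pass


class MonitoredSensor:
    """Outer layer: polices the resources consumed by the core layer."""

    def __init__(self, core, read_usage, thresholds):
        self.core = core
        # Assumption: read_usage() returns e.g. {"cpu": 0.4, "memory": 0.2}.
        self.read_usage = read_usage
        self.thresholds = thresholds
        self.core.start()

    def check(self):
        """Relaunch the core if any resource exceeds its threshold.

        Returns the sorted names of the resources that were over threshold.
        """
        usage = self.read_usage()
        exceeded = sorted(name for name, limit in self.thresholds.items()
                          if usage.get(name, 0) > limit)
        if exceeded:
            self.core.stop()
            self.core.start()
        return exceeded
```

Keeping the monitoring logic in the outer layer means that the corrective action, here a relaunch, can be applied without the core capture logic needing to know about thresholds or rules.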
In some embodiments, detected resource usage over a threshold can also be used to identify potential issues or threats in the system or network. For example, collected resource usage statistics of a sensor can identify normal or expected consumption of the sensor. If the sensor detects a large spike in the amount of resources it uses, this abnormal resource consumption can be used to detect a threat or attack, such as a DDoS attack, as the number of hits and consequently the amount of resources used would be expected to spike significantly during an attack.
Further, the actions taken by a sensor can be based on the amount of resource usage detected, the type of resource usage or pattern, prior resource usage information, predefined rules, a current context, and other factors. The actions can include rebooting or relaunching the sensor, turning off the sensor, sending an alert, limiting usage of the sensor, or turning off systems, components, services, or even network segments.
The conditions detected by the self-monitoring of the sensors can also be used to guide what information should be reported by a sensor. For example, if the amount of resources used by a sensor is determined to be excessive, the sensor can then be instructed to limit the amount of information it reports to lower its consumption of resources through the reporting process.
Dealing with Compromised Sensors in VMs and Hypervisors
The Tetration platform implements sensors in the VMs, hypervisors, and switches in the ACI in order to perform analytics and collect information for troubleshooting, planning, deployment, and security. While this provides numerous benefits and advantages, the addition of elements in the network can also present the risk of attacks. For example, sensors can be hacked or compromised by hackers. A compromised or hacked sensor can create additional resource consumption, which can negatively impact network and system performance, or result in data breaches or further attacks. As a result, it is important to protect against attacks and provide a solution for compromised sensors.
This invention allows for detection of compromised sensors in VMs and hypervisors in a tenant space, and provides various mechanisms of correction for the compromised sensors. The data from the compromised sensors can be manipulated to limit the impact on the network and infer additional attacks or threats on other devices.
Advantages include: This invention provides a mechanism for quick and accurate detection and correction of compromised sensors in VMs and hypervisors. This provides numerous performance and security advantages when running sensors for analytics.
Industry use: Analyzing activity to determine if a device is compromised is known. However, we are unaware of any prior art that teaches detecting compromised sensors within VMs and hypervisors and taking corrective actions as described in this invention.
This invention allows for detection and correction of compromised sensors in the VMs and hypervisors. Below we describe the detection aspect of this invention, followed by corrective actions.
Detection: Detection can be performed using two checkpoints. The first checkpoint can be based on the data reported to the collector by the sensors. For example, the collector can collect historical statistics and usage information to determine what amount of usage and what types of behaviors are considered normal. Accordingly, as the collector receives data from the sensors, it can compare the reported data with previous data to determine if there is abnormal activity or behavior. For example, based on data and statistics reported by sensors to the collector, the collector can determine one thousand hits to a mysql database associated with a sensor is the average or expected amount of activity reported by the sensor. If the sensor suddenly reports a million hits to the mysql database, the collector can determine that this amount of activity is abnormal, which would raise a flag at the first checkpoint.
The second checkpoint can be a comparison of data reported by the sensor in the hypervisor with data reported by the sensor in the hardware switch. For example, if the sensor in the hypervisor reports one million hits to the mysql database, then the sensor on the hardware switch should also report one million hits. Thus, if the sensor on the hardware switch otherwise reports a significantly different amount of hits, such as one thousand hits, then the collector can infer that this discrepancy is a result of the sensor in the hypervisor being compromised.
The first and second checkpoints can be used together as multiple layers of detection to verify or confirm suspicious activity. However, in some cases, the invention can limit the detection to one checkpoint or detection mechanism, which can be selected based on a context, rule, or needs and requirements.
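The two checkpoints can be sketched as simple comparisons. The function names, the factor of 10 used for the historical comparison, and the 10% tolerance used for the hypervisor-versus-switch comparison are assumptions for illustration; the text does not specify particular thresholds.

```python
from statistics import mean


def abnormal_vs_history(history, reported, factor=10.0):
    """Checkpoint 1: the reported count is far above the historical average."""
    if not history:
        return False
    return reported > factor * mean(history)


def disagrees_with_switch(hypervisor_count, switch_count, tolerance=0.1):
    """Checkpoint 2: the hypervisor sensor and the hardware switch sensor
    report significantly different counts for the same traffic."""
    reference = max(hypervisor_count, switch_count)
    if reference == 0:
        return False
    return abs(hypervisor_count - switch_count) / reference > tolerance


def is_compromised(history, hypervisor_count, switch_count, require_both=True):
    """Combine the checkpoints as layered detection, or use either one alone."""
    first = abnormal_vs_history(history, hypervisor_count)
    second = disagrees_with_switch(hypervisor_count, switch_count)
    return (first and second) if require_both else (first or second)
```

With a history averaging one thousand hits, a hypervisor sensor reporting one million hits while the hardware switch sensor reports one thousand raises both checkpoints. If both sensors report one million hits, only the first checkpoint is raised, which in this sketch is treated as unusual traffic rather than as a compromised sensor.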
Corrective Actions: When a sensor is identified as being compromised, the flows from the sensor can be annotated to indicate such data is not reliable. The annotation can ensure that the collector does not rely on the data and statistics from the compromised sensor, or otherwise performs a verification procedure.
In addition, when a sensor is compromised, the data from the sensor can be aggregated or summarized and the amount of data retained can be limited. Moreover, when a sensor is compromised, the snapshots or time frames of data reported by the sensor can be modified. For example, the time frames can be increased so the compromised sensor is forced to report data at larger intervals. This can reduce the amount of data reported and collected, and the amount of bandwidth and resources used by the compromised sensor in reporting data. In some cases, the reporting of data by a compromised sensor can be stopped altogether for a period of time or indefinitely until the sensor is fixed. The larger time frames or snapshots can also ensure that the amount of data from the compromised sensor is less granular to reduce unnecessary or false/incorrect data reported by the compromised sensor and collected by the collector.
The data from a compromised sensor can also be analyzed to infer additional statistics or details. For example, by detecting that a sensor is compromised, the system can also infer that other sensors in the tenant space are also compromised. This can be based on the structure of the tenant space, the topology, and the relationships between sensors, for example. Moreover, the data from the compromised sensor can be analyzed to determine a state or condition of the compromised sensor and system, and other sensors or systems.
In some cases, corrective actions can be taken when a sensor is detected as being compromised and/or when a server is compromised. To protect the pipeline, the sensors can be instructed to stop reporting flows or limit the number of flows reported. The collector can also be instructed to start dropping or shedding loads or data from the compromised sensors, and protect the devices that sit behind the collector.
The Tetration platform implements sensors in the VMs, hypervisors, and switches to collect traffic data and statistics from the respective devices. This data can be reported to collectors which collect the data and provide it to a Tetration engine which performs analytics for troubleshooting, planning, deployment, and security. Thus, the collectors are an important aspect of the Tetration platform. If a collector stops functioning properly, this can disrupt the collection and reporting of data and consequently the analytics. Accordingly, it is important to provide high availability of collectors. This is particularly so as the size of the data center grows, which increases the likelihood and impact of a disabled or malfunctioning collector.
Sensors can send their data to a primary collector and a secondary collector, both of which collect and report the data from the sensors at all times. A centralized system can receive the data from the primary and secondary collectors and identify any duplicates in order to deduplicate the data before sending it to the pipeline.
Advantages include: The current mechanism of HA and deduplication is more reliable and accurate than traditional HA mechanisms which rely on heartbeats and zookeepers to maintain an active and inactive device. Specifically, the current mechanism ensures that data is not lost during a transition of roles between primary and secondary collectors.
Industry use: High availability of systems is generally known. Prior art solutions implement heartbeats for HA to coordinate roles between devices. We are not aware of prior art solutions which perform the HA and deduplication mechanism of this invention.
For every flow, we have a primary collector and a secondary collector. Every sensor sends the same data to two different collectors, a primary collector and a secondary collector. The different collectors can then send the data from the sensors to a centralized location which runs a deduplicator that knows what data to keep and what data to ignore. The centralized location can then send the deduplicated data to the pipeline for collection and analytics.
For example, a sensor can be instructed to send its data to a primary collector and a secondary collector. Some of this data will inherently be duplicate data. However, by sending the data to both a primary collector and a secondary collector, the sensor can ensure that its data is collected even in the case that a collector experiences a problem or fails to receive data from the sensor. Both collectors then report the data from the sensor to the centralized location. Accordingly, the centralized location is guaranteed to receive the data from the sensor even if one of the collectors experiences problems or fails to receive data from the sensor as previously noted. The centralized location can then deduplicate the data to remove any duplicate data reported from the primary and secondary collectors to ensure that unnecessary data is not ultimately reported to the pipeline.
To deduplicate the data, the centralized system can analyze the data it receives from the primary and secondary collectors to identify the respective sensors, flow IDs, and collectors associated with each received flow. The centralized system can compare the received flows, including the respective sensor, flow, and collector IDs of the received flows as well as the timestamps, to determine which flow it should keep. Any duplicate or redundant data can then be discarded by the centralized system. The deduplicated data can then be pushed to the pipeline for collection and analysis.
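A minimal sketch of such a deduplicator is shown below. The FlowRecord fields and the rule of preferring the primary collector's copy when both copies arrive are assumptions for illustration; the text only states that sensor, flow, and collector IDs and timestamps are compared to decide which flow to keep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    sensor_id: str
    flow_id: str
    timestamp: int
    collector_id: str
    payload: str


def deduplicate(records, primary="primary"):
    """Keep one record per (sensor, flow, timestamp).

    Assumption: when both collectors report the same flow, the copy from the
    collector named by `primary` is kept; otherwise the first copy seen wins.
    """
    kept = {}
    for rec in records:
        key = (rec.sensor_id, rec.flow_id, rec.timestamp)
        current = kept.get(key)
        if current is None or (rec.collector_id == primary
                               and current.collector_id != primary):
            kept[key] = rec
    return list(kept.values())
```

Because a record is kept as soon as either collector reports it, a flow that reaches only the secondary collector still appears in the deduplicated output.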
This mechanism can ensure that data is not lost if a collector goes down for any period of time or fails to receive data from a sensor. For example, in cases where high availability is performed using heartbeats or a zookeeper, data can be lost during specific time frames (albeit often small) between the time the active device experiences a problem and the inactive device takes over. On the other hand, this mechanism ensures that data is not lost when a collector experiences a problem because both collectors collect and report the sensor data at all times. The duplicate or redundant data can then be deduplicated by the centralized system.
The distribution of collectors to flows can be determined using a table with rows containing hash values calculated by applying a hash function to a flow key. Thus, each flow is assigned a row in the table, and collectors are then assigned to each cell and row.
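One possible reading of this table is sketched below: each row holds two cells, a primary and a secondary collector, and a flow is mapped to a row by hashing its flow key. The round-robin fill of the table, the use of SHA-256, and the function names are assumptions for this example; the text does not name a hash function or an assignment policy.

```python
import hashlib


def build_table(collectors, num_rows):
    """Each row has two cells, (primary, secondary), filled round-robin."""
    if len(collectors) < 2:
        raise ValueError("need at least two collectors")
    n = len(collectors)
    return [(collectors[i % n], collectors[(i + 1) % n])
            for i in range(num_rows)]


def row_for_flow(flow_key, num_rows):
    """Hash the flow key to a row index.

    hashlib is used instead of the built-in hash() because string hashing in
    Python is randomized per process, and every component must agree on the row.
    """
    digest = hashlib.sha256(repr(flow_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_rows


def collectors_for_flow(table, flow_key):
    """Return the (primary, secondary) collectors assigned to a flow."""
    return table[row_for_flow(flow_key, len(table))]
```

A flow key here could be, for example, a tuple of source address, source port, destination address, destination port, and protocol; any stable, hashable description of the flow would serve the same purpose in this sketch.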
Determining Packet Loss at Different Points in a Distributed Sensor Collector Architecture
This invention is implemented within an architecture for observing and capturing information about network traffic in a datacenter as described below.
Network traffic coming out of a compute environment (whether from a container, VM, hardware switch, hypervisor or physical server) is captured by entities called sensors, which can be deployed in different environments as mentioned later. Such capturing agents will be referred to as “Sensors”. Sensors export data or metadata of the observed network activity to collection agents called “Collectors.” Collectors can be a group of processes running on a single machine or a cluster of machines. For the sake of simplicity we will treat all collectors as one logical entity and refer to it as one Collector in our discussion. In an actual deployment at datacenter scale, there will be more than just one collector, each responsible for handling export data from a group of sensors.
Collectors are capable of preprocessing and analyzing the data collected from sensors. A Collector is also capable of sending the processed or unprocessed data to a cluster of processes responsible for analysis of network data. The entities which receive the data from the Collector can be a cluster of processes, and we will refer to this logical group as the Pipeline. Note that sensors and collectors are not limited to observing and processing just network data, but can also capture other system information like currently active processes, active file handles, socket handles, status of I/O devices, memory, etc.
A network will often experience different amounts of packet loss at different points within the path of a flow. It is important to identify the amount of packet loss at each point to fine tune and improve the network.
This invention allows a centralized system to collect and aggregate data captured from sensors at each point within a communication path over a specific period of time and compare the information reported at each point to identify packet loss at each point.
Advantages include: This mechanism can be implemented in a live environment and can accurately and efficiently ascertain packet loss at each point within a network.
Industry use: Prior art solutions implement a request/reply model when trying to identify packet loss at different points. A system will send a request at each point and will identify packet loss if a reply is not received. However, unlike the current invention, this model cannot be implemented in a live environment. Moreover, this model is not as efficient or accurate as the current invention.
The current invention implements sensors within VMs, hypervisors, servers, and hardware switches which capture data sent and received at each of these points and reports the data to a collector which can aggregate and maintain the reported, sensed data. The collector can transmit the collected data from each sensor to the pipeline (e.g., Tetration engine), which can analyze the aggregated data and identify precise amounts of packet loss at each point.
The pipeline can identify packet loss at each point by comparing data or packets captured and reported by sensors at each point. This comparison can be performed per flow, per link, or on a host basis. Moreover, the pipeline can perform the comparison for data captured within a specific time window. For example, the pipeline can compare data from each point within a 30 minute time window. The pipeline can then identify packet loss at each point and determine if there is a problem at a specific point within the link, path, or flow.
For example, the pipeline can analyze an aggregate of data captured for a 30 minute window of communications from S1 to H1 to S2. Based on the aggregated data, the pipeline can determine that S1 reported 100% of the packets, H1 reported 90% of the packets, and S2 reported 80% of the packets. Here, the pipeline can thus determine that there is a 10% packet loss at each of H1 and S2.
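The arithmetic in this example can be sketched as follows. The function name loss_per_hop and the choice to express loss at each point as a percentage of the packets reported at the first point of the path are assumptions made to match the figures in the example above.

```python
def loss_per_hop(path, packet_counts):
    """Compute packet loss at each point of a path for one time window.

    path: ordered list of points, e.g. ["S1", "H1", "S2"].
    packet_counts: point -> number of packets the sensor there reported.
    Returns point -> loss at that point, as a percentage of the packets
    reported at the first point of the path.
    """
    sent = packet_counts[path[0]]
    if sent == 0:
        raise ValueError("no packets reported at the first point")
    losses = {}
    for prev, cur in zip(path, path[1:]):
        losses[cur] = 100.0 * (packet_counts[prev] - packet_counts[cur]) / sent
    return losses


# S1 reports 1000 packets, H1 reports 900 (90%), S2 reports 800 (80%).
print(loss_per_hop(["S1", "H1", "S2"], {"S1": 1000, "H1": 900, "S2": 800}))
# prints {'H1': 10.0, 'S2': 10.0}
```

The same comparison could be run per flow, per link, or per host by choosing which packet counts are aggregated into packet_counts for the window.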
In a virtualized compute infrastructure, the problem is to detect the placement relationship of the various components that can be used to capture packets, or metadata of packets, flowing through it. A packet inspection agent called a sensor can be deployed on a virtual machine, on a hypervisor, or inside a physical switch. All three types of sensors mentioned above can export information about the captured packets or flows to a central entity called a Collector for processing. The sensor could read an externally maintained config to figure out whether it is deployed on a virtual machine, hypervisor, or physical switch. However, using an external file to solve this problem requires a person to update the config each time new sensors are deployed or the same sensor moves to a different virtual machine. Detecting the placement relationship of a sensor without an external configuration file, based on just the packet or flow information that is being exported by the sensor, is the problem that is solved in this patent.
Detecting the sensor-collector topology in a network for understanding the placement of sensors and collectors associated with a reported flow.
Advantages include: The topology and placement information ascertained through this invention drives the Tetration platform and is used by most of the features which rely on data captured by sensors and collected by collectors. This invention is very important for the analytics performed by the Tetration platform.
Industry use: We are not aware of any prior art solutions for detecting the placement information for sensors and collectors as performed in this context.
Sensors have previously been deployed in the manner described in this document. The placement of such sensors has been compiled manually by a person, or by software that is aware of the relative placement of the sensors. Using APIs or interfaces provided by the hypervisor environment, one can determine whether a sensor is deployed on a VM or a hypervisor. Management software used for provisioning virtual machines, such as VMware vSphere or vCenter, also knows which VMs are present on which hypervisor. By integrating with such management or deployment software, one can determine the relative placement of sensors. However, no prior art determines the placement solely from the flow metadata exported by the sensors or network capture agents.
Agents for capturing network data have previously been deployed inside a virtual machine or hypervisor. The new technique presented in this patent is automatically detecting the environment in which such a sensor is placed by collectively analyzing the data reported by all of the sensors. Another new technique presented in this patent is automatically detecting the relationship of these sensors in terms of their placement.
Processes or agents that run on a system to capture network data are referred to as capturing agents or simply “Sensors” in this document. Such sensors can report metadata about the packets they observe, or report a subset of the captured network data, to a collection and aggregation entity that may be running on the same or a different system. Apart from network data, a sensor may also capture additional information about the system it is running on. The additional data can include, but is not limited to, data or metadata about active or previously active processes of the system, along with metadata about files present on the system. The collection entity could be a single process or a cluster of processes. A single collection entity or process is referred to as a Collector in this document.
Sensors or network capture agents could be present and running inside multiple environments. We list three such possible environments.
a. As a process or kernel module or kernel driver on a guest Operating System installed in a Virtual machine.
b. As a process or kernel module or kernel driver on the host operating system installed at the hypervisor layer.
c. As a process or a component in a physical network gear that is capable of routing or switching. The network gear could provide a way to install such an agent, or the network gear itself could act as such an agent. The network gear or its component would have to export metadata about the packets or flows it observed, to a Collector.
In each of the above scenarios where a sensor can be placed, the sensor has the ability to observe all packets that flow through the system, and it can talk to the Collector using an IP address. In a datacenter or a large deployment, there can be millions of Sensors running and reporting network information to the Collector. The Collector can perform a number of processing activities on the reported data, ranging from network diagnostics to security-related applications. Knowing whether reported sensor data came from a sensor deployed inside a VM, inside a hypervisor, or inside networking gear is very important for a number of algorithms that process the gathered data. The use cases of these algorithms are not discussed in this patent.
The network data observed by a sensor A inside a VM is a subset of the network data observed by a sensor B inside the hypervisor on which the VM is running. Further, the network data observed by sensor B running inside the hypervisor is in turn a subset of the network data observed by a sensor C running either inside or as part of the networking gear to which the hypervisor or the physical machine is connected. Knowing whether sensor B is placed in the hypervisor that contains the VM where sensor A is placed is very important for many algorithms that analyze the captured data. This relationship can be constructed manually by a person who has deployed the sensors. It might also be possible to query the hypervisor environment using hypervisor-specific APIs and management interfaces provided by various hypervisor environments such as Xen, VMware, KVM, etc. This patent presents a new way of determining the relationship from the captured flow data. The technique does not depend on a particular hypervisor environment or on the management solutions provided by various environments. The technique also enables detection of VM movements, so that the relationship can be updated automatically.
All sensors send information about the network traffic they have observed to the Collector. Each sensor knows the IP address and port number used to send information to the Collector. All sensors periodically send information about each and every flow they have observed to the Collector. The information sent contains a list of flows that were active between the current time and the time at which the last information was sent to the Collector. The communication channel between the sensor and the Collector also creates a flow in every interval. Let us denote this flow as CF, or control flow. The information sent by a sensor will also contain information about the control flow, since it is also a valid flow in the system.
Consider the following setup for purposes of explanation:
1. Sensor S1 is deployed in a VM that is running inside a hypervisor. The IP address of the VM is IP1.
2. Sensor S2 is deployed in the hypervisor mentioned in 1 above. The IP address of the hypervisor is IP2, which is different from IP1.
3. Sensor S3 is deployed in or as part of the physical network switch or NIC. The IP address of the switch is IP3. This network switch is placed such that all network traffic coming out of and going into the hypervisor mentioned in 2 goes through this switch.
Based on the above placement of sensors, the following holds true:
1. All flows seen and reported by S1 will also be seen and reported by S2.
2. All flows seen and reported by S2 will also be seen and reported by S3.
Thus,
1. Flow F1, which is generated inside the VM and seen by S1, will be reported by S1, S2, and S3 to the Collector. So, the control flow denoting the communication between S1 and the Collector will be seen and reported by S1, S2, and S3 to the Collector.
2. Flow F2, generated inside the hypervisor, will be seen and reported by S2 and S3 but not S1. So the control flow denoting the communication between S2 and the Collector will be seen and reported by S2 and S3 to the Collector.
3. Flow F3, generated by the switch, will be seen only by the switch itself and reported to the Collector by S3 alone.
At the Collector, after collecting information from all sensors, we will have the following relation:
1. F1 reported by S1, S2, S3
2. F2 reported by S2, S3
3. F3 reported by S3.
Here is the algorithm that determines the relationship of one sensor to the others.
1. For each flow, get the list of sensors reporting it. Call this list L. List L contains the sensor ids of all sensors that reported the flow.
2. For every id ‘Si’ in list L, do the following:
a. Emit a tuple {Si, set of all sensors in L except Si}.
3. Collect all the tuples at the end of Step 2.
4. For every sensor with id Si, do the following:
a. Get a list of all tuples where Si is the first element.
b. Take the intersection of the sets that are the second element in the tuples gathered above. Call this intersection set ‘Front sensors’. It represents the list of sensors that can see all flows that sensor Si can see. In our example, for S1, the set of Front sensors will be {S2, S3}.
c. Take the union of the sets that are the second element in the tuples gathered in Step a. Compute the difference between the union set and the intersection set. Call this difference set the ‘Rear sensors’. It represents the list of sensors all of whose flows can be seen by sensor Si. In our example, for S1, the set of Rear sensors will be the empty set. For S2, the set of Rear sensors is {S1}.
Using the above algorithm, the Collector, or any process that analyzes the flow metadata exported by the sensors, can determine the relative placement of the sensors with respect to each other.
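The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by the Collector: the function name sensor_placement and the input format (a mapping from flow id to the set of reporting sensor ids) are assumptions made for the example.

```python
from collections import defaultdict


def sensor_placement(flow_reports):
    """flow_reports maps a flow id to the set of sensor ids that reported it."""
    # Steps 1-3: for each flow, emit (Si, all other sensors reporting that flow).
    tuples = defaultdict(list)
    for sensors in flow_reports.values():
        for si in sensors:
            tuples[si].append(set(sensors) - {si})

    placement = {}
    for si, others in tuples.items():
        # Step 4b: sensors that reported every flow that Si reported.
        front = set.intersection(*others)
        # Step 4c: sensors seen alongside Si on some flows, minus the front sensors.
        rear = set.union(*others) - front
        placement[si] = {"front": front, "rear": rear}
    return placement


# The example relation collected at the Collector.
reports = {
    "F1": {"S1", "S2", "S3"},
    "F2": {"S2", "S3"},
    "F3": {"S3"},
}
result = sensor_placement(reports)
```

For the example relation, this yields Front sensors {S2, S3} and no Rear sensors for S1, Front sensors {S3} and Rear sensors {S1} for S2, and no Front sensors and Rear sensors {S1, S2} for S3.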
Conventionally, policies statically define how network objects (e.g., endpoints, endpoint groups, firewalls, network gear, etc.) can communicate with other network objects. However, it may be desirable to have conditional policies that allow (or deny) connections in some instances and deny (or allow) connections in other instances.
Policies can be defined to take into account a host's behavior (i.e., “reputation” or “vulnerability index”). Suppose a host can be designated as “Good,” “OK,” or “Bad,” and can move among these states over its lifetime. A policy can be established enabling connection to a certain EPG based on one of these states.
Advantages include: A host can move from one reputational EPG to another based on the host's behavior, and policies do not need to be manually updated to account for the changes to the reputation of the host.
Industry use: Microsoft has the concept of “Dynamic Access Control” but this is implemented at the user level rather than host to host.
The Tetration system introduces the concept of “reputation” or a “vulnerability index” for a host (discussed in detail elsewhere). Policies can be defined that can take into account a host's reputation (e.g., “Good,” “OK,” or “Bad”). For example, we can define “Good”, “OK,” and “Bad” EPGs. Host A can initially have a “Good” or “OK” reputation, and thus is a member of the “Good” or “OK” EPG. A policy P can be defined that allows members of the “Good” or “OK” EPG to access EPG B. According to this policy, A will be able to connect to B. Suppose that A is subsequently exposed to a malware attack resulting in A's reputation being reduced to “Bad,” and its EPG membership changing from the “Good” or “OK” EPG to the “Bad” EPG. Because A is now in the “Bad” EPG, A cannot connect to B under policy P.
As another example, suppose that policy Q is a rule that allows members of the “Bad” EPG to access image update servers in EPG C. Under policy Q, A can access the image update servers in C to update its software so that A can be remediated.
As should be understood, there can be more or fewer classifications than “Good,” “OK,” and “Bad.” Further, thresholds for the classifications can be configured by the user. For instance, a host may be designated as “Good” if it has a reputation between 0.7 and 1 (assuming a reputation scale of −1 to 1), “OK” if it has a reputation less than 0.7 but greater than 0, and “Bad” if it has a negative reputation.
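As a rough illustration of policies P and Q, the following Python sketch classifies a host by its reputation score and checks whether a connection is allowed. The function names and the policy table layout are hypothetical. The text does not say how a score of exactly 0 is classified; this sketch places it in “Bad” as an assumption.

```python
def classify(reputation):
    """Map a reputation score on the -1 to 1 scale to a reputational EPG."""
    if reputation >= 0.7:
        return "Good"
    if reputation > 0:
        return "OK"
    # Assumption: a score of exactly 0 is treated like a negative score.
    return "Bad"


# Policy P lets the "Good" and "OK" EPGs reach EPG B.
# Policy Q lets the "Bad" EPG reach the image update servers in EPG C.
POLICIES = [
    {"name": "P", "allowed_sources": {"Good", "OK"}, "destination": "B"},
    {"name": "Q", "allowed_sources": {"Bad"}, "destination": "C"},
]


def can_connect(reputation, destination_epg):
    """Return True if any policy allows the host's current EPG to reach the destination."""
    source_epg = classify(reputation)
    return any(
        source_epg in policy["allowed_sources"]
        and policy["destination"] == destination_epg
        for policy in POLICIES
    )
```

With these definitions, host A with a reputation of 0.8 can connect to B. After its reputation drops to -0.4, it can no longer reach B but can still reach C for remediation, without any change to the policy table.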
Increasing robustness of a host's “reputation” or “vulnerability” index.
Reputations or vulnerabilities can be quantified for hosts in a data center. Reputation scores or vulnerability indexes can be derived from external sources, such as malware trackers or whois.
Advantages include: Reputations or vulnerabilities of a host in a data center can be augmented using external metadata.
Industry use: We believe that a host's “reputation” or “vulnerability” index in the context of a data center is a novel concept.
The Tetration system introduces the concept of “reputation” or a “vulnerability index” for a host (discussed in detail elsewhere). The reputation score or vulnerability index can be helpful for a variety of use cases, such as enabling conditional policies based on reputation/vulnerability, separating malicious versus non-malicious behavior, and determining effectiveness of policies, among other examples.
In an embodiment, a host can have a “Good,” “OK,” or “Bad” reputation, although there can be more or fewer classifications in other embodiments. Further, thresholds for the classifications can be configured by the user. For example, a host may be designated as “Good” if it has a reputation between 0.7 and 1 (assuming a reputation scale of −1 to 1), “OK” if it has a reputation less than 0.7 but greater than 0, and “Bad” if it has a negative reputation.
The reputation score can be calculated exclusively from analysis of network traffic in a data center. But we can also leverage external sources for further enhancing reputation scores. For example, we can crawl malware trackers (e.g., https://zeustracker.abuse.ch/monitor.php?filter=all), which identify IP addresses that have been infected by particular malware. The reputation of a host in a data center can be reduced if that host has communicated with an external host that has been infected by malware.
We can also crawl whois to determine what IP addresses have been properly allocated to legitimate entities. If a host in a data center is communicating with an unallocated IP address, we can reduce the reputation of that host.
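One possible way to fold these external signals into a score is sketched below in Python. The penalty values, the function name adjust_reputation, and the representation of crawled data (a set of infected IP addresses and a list of allocated networks) are all assumptions made for illustration; the text does not specify how much a reputation is reduced.

```python
import ipaddress


def adjust_reputation(score, peers, infected_ips, allocated_networks,
                      malware_penalty=0.5, unallocated_penalty=0.2):
    """Lower a host's reputation for each suspicious external peer it contacted."""
    for peer in peers:
        address = ipaddress.ip_address(peer)
        # Peer appears in the data crawled from a malware tracker.
        if peer in infected_ips:
            score -= malware_penalty
        # Peer is outside every network known to be properly allocated.
        if not any(address in network for network in allocated_networks):
            score -= unallocated_penalty
    # Keep the result on the -1 to 1 scale.
    return max(-1.0, min(1.0, score))


# Hypothetical crawled data using documentation address ranges.
infected = {"203.0.113.5"}
allocated = [ipaddress.ip_network("203.0.113.0/24")]
after_malware = adjust_reputation(0.8, ["203.0.113.5"], infected, allocated)
after_unallocated = adjust_reputation(0.8, ["198.51.100.7"], infected, allocated)
```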
Determining the extent that security policies are being utilized or not being utilized by a data center.
By including sensors at the various components of a data center (e.g., virtual machine, hypervisor, physical network gear), network traffic in the data center can be analyzed to determine which policies are being utilized (or not being utilized) and the extent (e.g., number of flows, number of packets, bytes, etc.) those policies are being utilized.
Advantages include:
i) Smart ordering of policies: policies can be ordered according to utilization. For example, higher-usage policies can be ordered higher in the policy rule set, or higher-usage policies can be stored in the memory of network gear.
ii) Garbage collection: policies that are not being utilized (e.g., no flows, no packets, no IP addresses communicating on the connection) can be removed.
Industry use: There does not appear to be any prior art relating to monitoring of utilization (or non-utilization) of policies between endpoints or endpoint groups in a data center. However, there appear to be providers in the related space of security policy management for firewalls and network devices (e.g., AlgoSec, FireMon, SolarWinds, Skybox Security, Tufin).
The Tetration policy pipeline is composed of four major steps/modules:
(1) Application Dependency Mapping
In this stage, network traffic is analyzed to determine a respective graph for each application operating in a data center (discussed in detail elsewhere). That is, particular patterns of traffic will correspond to an application, and the interconnectivity or dependencies of the application are mapped to generate a graph for the application. In this context, an “application” refers to a set of networking components that provides connectivity for a given set of workloads. For example, in a conventional three-tier architecture for an application, the servers and other components of the web tier, application tier, and data tier would make up an application.
(2) Policy Generation
Whitelist rules are then derived for each application graph determined in (1) (discussed in detail elsewhere). As is known in the art, in a blacklist model, all communication is open unless explicitly denied, whereas a whitelist model requires communication to be explicitly defined before being permitted. Conventional systems use a blacklist model. One of the advantages of the Tetration system is implementation of a whitelist model, which may be more secure than a blacklist model. For instance, using a whitelist model is recognized by the Australian Signals Directorate to be the #1 approach for mitigating targeted cyber attacks (http://www.asd.gov.au/infosec/top-mitigations/top-4-strategies-explained.htm).
As an example of whitelist rule generation, suppose there is an edge of an application graph between E1 (e.g., endpoint, endpoint group) and E2. Permissible traffic flows on a set of ports of E1 to one or more ports of E2. A policy can be defined to reflect the permissible traffic from the set of ports of E1 to the one or more ports of E2.
(3) Flow Pre-Processing
After the application dependencies are mapped and the policies are defined, network traffic is pre-processed in the policy pipeline for further analysis. For each flow, the source endpoint of the flow is mapped to a source endpoint group (EPG) and the destination endpoint of the flow is mapped to a destination EPG. Each flow can also be “normalized” by determining which EPG corresponds to the client, and which EPG corresponds to the server.
(4) Flow Analysis
Each pre-processed flow is then analyzed to determine which policies are being enforced and the extent (e.g., number of packets, number of flows, number of bytes, etc.) those policies are being enforced within the data center.
This flow analysis occurs continuously, and the Tetration system allows a user to specify a window of time (e.g., time of day, day of week or month, month(s) in a year, etc.) to determine which policies are being implemented (or not being implemented) and how often those policies are being implemented.
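The pre-processing and analysis stages can be illustrated with a short Python sketch that maps each flow to its EPGs and accumulates per-policy counters. The data layout is an assumption made for the example: flows are dictionaries that are already normalized so that "src" is the client and "dst" is the server, and each policy is a (source EPG, destination EPG, destination port) triple.

```python
from collections import Counter


def policy_utilization(flows, endpoint_to_epg, policies):
    """Count flows, packets, and bytes matched by each whitelist policy."""
    usage = {name: Counter() for name in policies}
    for flow in flows:
        # Flow pre-processing: map endpoints to their endpoint groups.
        key = (endpoint_to_epg.get(flow["src"]),
               endpoint_to_epg.get(flow["dst"]),
               flow["dst_port"])
        # Flow analysis: credit every policy the flow matches.
        for name, rule in policies.items():
            if key == rule:
                usage[name]["flows"] += 1
                usage[name]["packets"] += flow["packets"]
                usage[name]["bytes"] += flow["bytes"]
    return usage


epgs = {"10.0.0.1": "web", "10.0.0.2": "app", "10.0.0.3": "db"}
policies = {
    "web-to-app": ("web", "app", 8080),
    "app-to-db": ("app", "db", 5432),
    "web-to-db": ("web", "db", 5432),
}
flows = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dst_port": 8080, "packets": 10, "bytes": 1200},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dst_port": 8080, "packets": 4, "bytes": 300},
    {"src": "10.0.0.2", "dst": "10.0.0.3", "dst_port": 5432, "packets": 7, "bytes": 900},
]
usage = policy_utilization(flows, epgs, policies)

# Smart ordering: most-used policies first.
ranked = sorted(usage, key=lambda name: usage[name]["flows"], reverse=True)
# Garbage collection candidates: policies that matched no flows.
unused = [name for name in usage if usage[name]["flows"] == 0]
```

Restricting the flows passed in to a user-specified time window would give the per-window view described above.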
Annotation
A flow is a collection of packets having the same source address, destination address, source port, destination port, protocol, tenant id, and starting timestamp. However, this key/signature alone may not be particularly helpful to users trying to understand the data. It is therefore desirable to tag flows so that users can search the flow data and so that the flow data can be presented more meaningfully.
A high-level overview of the pipeline with the key components for flow annotation is provided as the attached figure. Generally, flow data is collected by sensors incorporated at various levels of a data center (e.g., virtual machine, hypervisor, physical switch, etc.) and provided to a Collector. The Collector may perform certain processing on the raw flow data, such as de-duping, and then that data is stored in the HDFS. The Compute Engine processes the flow data in the HDFS, including annotating each flow with certain metadata based on specified rules in order to classify each flow. This enables the UI to present meaningful views of flows or allows users to search flows based on tags.
Each flow is annotated according to certain default tags, such as Attack, Policy, Geo, Bogon, Whitelist, etc. Attack refers to whether a flow has been determined to be a malicious flow. Policy refers to whether a flow is compliant or non-compliant with policy. Geo refers to the geographic location from which the flow originated. This is determined based on IP address. Bogon refers to whether a flow corresponds to an IP address that has not yet been allocated by the IANA. Whitelist refers to a flow that has been determined to be a “good” flow.
Tagging can be hierarchical. For example, in addition to annotating a flow as an Attack flow, the Tetration system can also specify the type of attack, e.g., malware, scan, DDoS, etc. As another example, the Geo tag can classify a flow according to country, state, city, etc.
The Tetration system also enables users to tag flows based on custom tags according to rules that they define. The custom tags and rules can be input by users via the UI coupled to a Rules module. In an embodiment, the Rules module translates the user-defined tags and rules into machine-readable code (e.g., JSON, XML) to integrate the new tags into the HDFS. On the next iteration of the processing by the Compute Engine, the custom tags will be applied to the flows. The rules can be managed through a Rule Management module that enables users to perform tag-based analytics (e.g., rank custom tags based on usage), share rules and custom tags among different tenants, associate tags with a hierarchy (e.g., classify tags as associated with certain organizations, or classify tags as relating to networking, etc.), and alias tags (i.e., same rules with different names).
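A simplified Python sketch of rule-based annotation follows. The JSON rule format (tag, field, equals), the function name annotate, and the use of a list of allocated networks for the Bogon check are hypothetical; the text only states that rules are translated into machine-readable code such as JSON or XML.

```python
import ipaddress
import json

# Hypothetical machine-readable form of one user-defined rule.
CUSTOM_RULES = json.loads(
    '[{"tag": "Database", "field": "dst_port", "equals": 5432}]'
)


def annotate(flow, allocated_networks, custom_rules):
    """Return the list of tags that apply to a flow."""
    tags = []
    # Default tag: Bogon, for a source address outside every allocated network.
    source = ipaddress.ip_address(flow["src"])
    if not any(source in network for network in allocated_networks):
        tags.append("Bogon")
    # Custom tags: apply each user-defined rule in turn.
    for rule in custom_rules:
        if flow.get(rule["field"]) == rule["equals"]:
            tags.append(rule["tag"])
    return tags


# Assumed allocation data using a documentation address range.
allocated = [ipaddress.ip_network("192.0.2.0/24")]
tags = annotate({"src": "192.0.2.10", "dst_port": 5432}, allocated, CUSTOM_RULES)
```

Other default tags such as Attack, Policy, and Geo would be produced by their own analyses and appended in the same way.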
Policy Simulation
Determining how changes to the data center (e.g., adding or removing a policy, modifying endpoint group membership, etc.) will affect network traffic.
Policy changes and changes to endpoint group (EPG) membership can be evaluated prior to implementing such changes in a live system. Historical ground truth flows can be used to simulate network traffic based on the policy or EPG membership changes. Real-time flows can also be used to simulate the effects on network traffic based on implementation of an experimental policy set or experimental set of EPGs.
Advantages include:
i) Capable of determining impact on an application due to changes to policies or EPG membership.
ii) Capable of determining impact of future attacks to a data center based on policy or EPG membership changes.
Industry use: There does not appear to be any prior art relating to simulation of policies between endpoints or endpoint groups in a data center. However, there appear to be providers in the related space of security policy management for firewalls and network devices (e.g., AlgoSec, FireMon, SolarWinds, Skybox Security, Tufin).
The Tetration policy pipeline is composed of four major steps/modules:
(1) Application Dependency Mapping
In this stage, network traffic is analyzed to determine a respective graph for each application operating in a data center (discussed in detail elsewhere). That is, particular patterns of traffic will correspond to an application, and the interconnectivity or dependencies of the application are mapped to generate a graph for the application. In this context, an “application” refers to a set of networking components that provides connectivity for a given set of workloads. For example, in a conventional three-tier architecture for an application, the servers and other components of the web tier, application tier, and data tier would make up an application.
(2) Policy Generation
Whitelist rules are then derived for each application graph determined in (1) (discussed in detail elsewhere). As is known in the art, in a blacklist model, all communication is open unless explicitly denied, whereas a whitelist model requires communication to be explicitly defined before being permitted. Conventional systems use a blacklist model. One of the advantages of the Tetration system is implementation of a whitelist model, which may be more secure than a blacklist model. For instance, using a whitelist model is recognized by the Australian Signals Directorate to be the #1 approach for mitigating targeted cyber attacks (http://www.asd.gov.au/infosec/top-mitigations/top-4-strategies-explained.htm).
As an example of whitelist rule generation, suppose there is an edge of an application graph between E1 (e.g., endpoint, endpoint group) and E2. Permissible traffic flows on a set of ports of E1 to one or more ports of E2. A policy can be defined to reflect the permissible traffic from the set of ports of E1 to the one or more ports of E2.
(3) Flow Pre-Processing
After the application dependencies are mapped and the policies are defined, network traffic is pre-processed in the policy pipeline for further analysis. For each flow, the source endpoint of the flow is mapped to a source endpoint group (EPG) and the destination endpoint of the flow is mapped to a destination EPG. Each flow can also be “normalized” by determining which EPG corresponds to the client, and which EPG corresponds to the server.
(4) Flow Analysis
Each pre-processed flow is then analyzed to determine various metrics, such as whether a flow is in compliance with security policies, which policies and to what extent those policies are being utilized, etc.
This flow analysis occurs continuously, and the Tetration system allows a user to specify a window of time (e.g., time of day, day of week or month, month(s) in a year, etc.) to determine the number of non-compliant events that occurred during that period.
In addition to evaluating policies actually existing in the data plane, the policy pipeline also enables “what if” analysis, such as analyzing what would happen to network traffic upon adding a new policy, removing an existing policy or changing membership of EPG groups (e.g., adding new endpoints to an EPG, removing endpoints from an EPG, moving an endpoint from one EPG to another).
In one embodiment, historical ground truth flows are utilized for simulating network traffic based on a “what if” experiment. This is referred to as back-testing. In another embodiment, real-time flows can be evaluated against an experimental policy set or experimental set of EPGs to understand how changes to particular policies or EPGs affect network traffic in the data center.
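Back-testing can be illustrated with a small Python sketch that replays historical flows against an experimental whitelist or an experimental EPG mapping and reports the flows that would no longer be permitted. The function name back_test and the data layout (flows as dictionaries, policies as (source EPG, destination EPG, destination port) triples) are assumptions made for the example.

```python
def back_test(flows, endpoint_to_epg, whitelist):
    """Return the historical flows that the given whitelist would not permit."""
    denied = []
    for flow in flows:
        key = (endpoint_to_epg.get(flow["src"]),
               endpoint_to_epg.get(flow["dst"]),
               flow["dst_port"])
        if key not in whitelist:
            denied.append(flow)
    return denied


# Historical ground truth flows and the policies currently in place.
history = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dst_port": 8080},
    {"src": "10.0.0.2", "dst": "10.0.0.3", "dst_port": 5432},
]
epgs = {"10.0.0.1": "web", "10.0.0.2": "app", "10.0.0.3": "db"}
current = {("web", "app", 8080), ("app", "db", 5432)}

# "What if" 1: remove the app-to-db policy.
without_db_policy = back_test(history, epgs, current - {("app", "db", 5432)})
# "What if" 2: move endpoint 10.0.0.2 from the "app" EPG to the "web" EPG.
moved = back_test(history, {**epgs, "10.0.0.2": "web"}, current)
```

In the first experiment only the database flow would be denied. In the second, both historical flows would be denied, because neither matches a policy once the endpoint changes groups.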
Collapsing and Placement of Applications
To provide visibility of data flows in a multi-tier application and help network teams understand the dataflow of an application and develop the application's dataflow.
The invention is directed to an application dependency map visualized in a collapsible tree flow chart. The tree flow chart is collapsible and displays the policies/relationships between each logical entity that carries a multi-tier application. The collapsible multi-tier application UI displays the data flows of a multi-tier application. A multi-tier application can have various aspects of the application running on various hosts. The UI displays the hierarchy and the policies or dependencies between each logical entity running the application. The UI is collapsible, allowing the user to drill down on any node/logical entity representing hosts, databases, or an application tier. Making the UI collapsible makes it easier to consume.
The UI displays various nodes, and interacting with a node shows an exploded view of that node. A node is any logical entity, for example, an application tier of the multi-tier application, a database tier, or a host tier. The exploded view of the node produces new nodes that have edges connecting them with the exploded node. The edges represent policies between the new nodes and between the new nodes and the exploded node. For example, the original node can be a host running the application. The exploded view displays new nodes, which represent all neighbors the host communicates with. The new nodes are usually exploded to the right of the exploded node to demonstrate the hierarchy between the logical entities.
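The explode and collapse behavior can be modeled with a small data structure, sketched here in Python. The class name DependencyMap and its methods are hypothetical, and the sketch covers only the underlying graph bookkeeping, not the rendering.

```python
class DependencyMap:
    """Tracks which nodes of an application dependency graph are exploded."""

    def __init__(self, edges):
        # Each edge is a (source, destination, policy) triple.
        self.edges = list(edges)
        self.expanded = set()

    def explode(self, node):
        """Mark a node as exploded and return its neighbors with the connecting policy."""
        self.expanded.add(node)
        return [(dst, policy) for src, dst, policy in self.edges if src == node]

    def collapse(self, node):
        """Hide the neighbors of a previously exploded node."""
        self.expanded.discard(node)

    def visible_edges(self):
        """Edges currently shown: those leaving an exploded node."""
        return [edge for edge in self.edges if edge[0] in self.expanded]


edges = [
    ("web", "app", "allow tcp/8080"),
    ("app", "db", "allow tcp/5432"),
]
dependency_map = DependencyMap(edges)
neighbors = dependency_map.explode("web")
```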
The collapsible tree flow chart uses the data gathered from the Tetration layer. The data used and made visible in the collapsible tree flow chart are (1) data flows from one logical entity to another logical entity; (2) the policies that govern the data flows from one logical entity to another logical entity; (3) what host the data flow came from; (4) what host group the data flow came from; and (5) what subnet the data flow came from. The UI is customizable. The user can select elements to adjust subnet groupings and cluster groupings. Additionally, the user can upload side information. Examples of side information are DNS names, host names, etc.
Currently, the problem with tree flow charts is that they only show the flow of information between parent and child. They do not show all the relationships between all the entities. Furthermore, if there is a large number of parents and children, the flow chart becomes unmanageable and difficult to consume.
Custom Events Processor for Network Event
Malware and other malicious processes can be very harmful to a network. Given the amount of data, flows, and processes running on a network, it can be very difficult to detect malware and malicious events. Some types of malicious events, while very harmful to the network, can be extremely difficult to detect. For example, malicious command-in-control processes can be very difficult to identify, particularly when hidden. This can be complicated by the fact that certain commands, while inherently dubious, may be triggered accidentally or by fluke without any malicious intent. Accordingly, it would be valuable to provide a solution that captures events on a network from different perspectives and analyzes the resulting patterns to determine whether a process is truly malicious.
This invention collects sensed data to generate a lineage of every network process. A statistical model can be implemented to then detect patterns based on the lineage of the process and identify any anomalies or malicious events.
Advantages include: This invention can provide a better understanding of processes, particularly with EPGs, and help to detect any anomalies or malicious events when a command or process is executed in the network. This invention can be implemented in a wide variety of contexts using statistical models.
Industry use: Malware and spoofing prior art solutions. However, we are not aware of any solutions that implement a statistical model to generate process lineage mappings and identify anomalies.
This invention is implemented within an architecture for observing and capturing information about network traffic in a datacenter as described below.
Network traffic coming out of a compute environment (whether from a container, VM, hardware switch, hypervisor or physical server) is captured by entities called sensors, which can be deployed in different environments as mentioned later. Such capturing agents will be referred to as “Sensors”. Sensors export data or metadata of the observed network activity to collection agents called “Collectors.” Collectors can be a group of processes running on a single machine or a cluster of machines. For the sake of simplicity, we will treat all collectors as one logical entity and refer to it as one Collector in our discussion. In an actual deployment at datacenter scale, there will be more than one collector, each responsible for handling export data from a group of sensors.
Collectors are capable of preprocessing and analyzing the data collected from sensors. They are also capable of sending the processed or unprocessed data to a cluster of processes responsible for analysis of network data. The entities that receive the data from the Collector can be a cluster of processes, and we will refer to this logical group as the Pipeline. Note that sensors and collectors are not limited to observing and processing just network data, but can also capture other system information like currently active processes, active file handles, socket handles, status of I/O devices, memory, etc.
In this context, we can capture data from sensors and use the data to develop a lineage for every process. The lineage can then be used to identify anomalies as further described below.
Every process in a network can have some type of lineage. The current invention performs an analysis of commands and processes in the network to identify a lineage of a process. The lineage can be specifically important and relevant with endpoint groups (EPGs). The lineage can help identify certain types of patterns which may indicate anomalies or malicious events.
For example, the system can identify a process at system Y when command X is executed. Command X may have been observed to be triggered by command Z. We then know that the lineage for the process at system Y is command Z followed by command X. This information can be compared with processes and commands as they are executed and initialized to identify any hidden command-in-control or other anomalies.
To detect anomalies, other factors can also be taken into account. For example, factors which are inherently dubious can be used in the calculus. To illustrate, a process for running a scan on the network is inherently dubious. Thus, we can use the process lineage (i.e., lineage of the process for scanning the network) to determine if the scan was executed by a malicious command or malware. For example, if the scan follows the expected lineage mapped out for that process then we may be able to determine that the scan is legitimate or an accident/fluke. On the other hand, if the scan was triggered by an external command (i.e., command from the outside), then we can infer that this scan is part of an attack or malicious event. Similarly, if the scan does not follow the previously-established lineage (e.g., scan was started by a parent process that is not in the lineage), we can determine that the scan is part of a malicious event.
This invention can use a statistical model, such as Markov chains, to study the lineage patterns and detect anomalies. The lineage patterns ascertained through the statistical model can be based on data collected by the sensors on the various devices in the network (VMs, hypervisors, switches, etc.). The statistical models and lineage information can be used in other contexts and may be applied with EPGs for understanding processes and anomalies.
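As one possible illustration of a Markov chain over lineages, the sketch below estimates first-order transition probabilities between parent and child commands from observed lineages and then scores a new lineage. The function names, the start marker, and the floor probability assigned to unseen transitions are assumptions made for this example only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence

START = "<start>"  # hypothetical marker for the beginning of a lineage


def fit_transitions(lineages: Iterable[Sequence[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P(child command | parent command) from observed lineages."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for lineage in lineages:
        previous = START
        for command in lineage:
            counts[previous][command] += 1
            previous = command
    return {
        parent: {child: n / sum(children.values()) for child, n in children.items()}
        for parent, children in counts.items()
    }


def lineage_probability(
    lineage: Sequence[str],
    transitions: Dict[str, Dict[str, float]],
    floor: float = 1e-6,
) -> float:
    """Multiply transition probabilities; unseen transitions get a small floor."""
    probability = 1.0
    previous = START
    for command in lineage:
        probability *= transitions.get(previous, {}).get(command, floor)
        previous = command
    return probability


observed = [("init", "cron", "scan")] * 3 + [("init", "sshd", "bash")]
model = fit_transitions(observed)

print(lineage_probability(("init", "cron", "scan"), model))
print(lineage_probability(("init", "sshd", "bash", "scan"), model))
```

A lineage with a very low probability under the fitted model, such as a scan launched from an interactive shell in this toy data, would be a candidate anomaly.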
The lineage information can be used to detect a command-in-control for a process and determine if the command is a hidden command or not. For example, if the command is not in the lineage, we can expect the command to be a hidden command. Hidden commands can be inherently dubious and more likely to be malicious. However, based on our statistical model, we can identify whether the hidden command may be a fluke or accident, or whether it is indeed a malicious event.
Discovering Causal Temporal Patterns
Problem to solve: event sequences reveal a temporal structure of various applications running in a computing network. Discovering temporal patterns (sequences) can be an important component of various network-related tasks, such as normalcy modeling and discovering suspect behavior, and building application profiles. There is a need to efficiently determine causal temporal patterns.
The present technology determines causal temporal patterns in a computing network based upon various attributes of network flows, such as server port, packets sent, processes involved in the communications, and timing information recorded (per host) when data is exchanged (e.g., flowlets).
In some embodiments, event co-occurrences can be analyzed within time windows for each host to determine sequential patterns. For example, for requests from host A on a port of host D, host B either becomes a client of host D or host F for 50% of the requests.
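A minimal sketch of such a per-host, windowed co-occurrence analysis is given below. The event tuple layout, the labels, and the function name are hypothetical. The sketch only counts how often one event is followed by another within the window; it does not by itself establish causality or remove noise.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[float, str, str]  # (timestamp in seconds, host, event label)


def follow_rates(events: Iterable[Event], window: float) -> Dict[Tuple[str, str], float]:
    """For each ordered pair (a, b), return the fraction of occurrences of a
    that are followed by b on the same host within `window` seconds."""
    by_host: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for timestamp, host, label in sorted(events):
        by_host[host].append((timestamp, label))

    occurrences: Counter = Counter()
    followed: Counter = Counter()
    for sequence in by_host.values():
        for i, (timestamp, label) in enumerate(sequence):
            occurrences[label] += 1
            later_labels = set()
            for later_timestamp, later_label in sequence[i + 1:]:
                if later_timestamp - timestamp > window:
                    break  # the sequence is time-ordered, so stop early
                if later_label != label:
                    later_labels.add(later_label)
            for later_label in later_labels:
                followed[(label, later_label)] += 1
    return {pair: n / occurrences[pair[0]] for pair, n in followed.items()}


# Host B receives requests from A, then contacts either D or F.
events = [
    (0.0, "B", "in:A"), (1.0, "B", "out:D"),
    (10.0, "B", "in:A"), (11.0, "B", "out:F"),
    (20.0, "B", "in:A"), (21.0, "B", "out:D"),
    (30.0, "B", "in:A"), (31.0, "B", "out:F"),
]
rates = follow_rates(events, window=2.0)
print(rates)
```

With this toy data, each of the two outbound events follows half of the inbound requests, mirroring the 50% split in the example above.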
In some embodiments, algorithms for determining temporal patterns can also remove noise and coincidences, be robust to non-deterministic relations, discover and remove periodic events, and be scalable (in both memory and time efficiency).
Intra-Datacenter DDOS Detection
Traditionally, DDOS prevention occurs on the periphery of a network. With shared hosting and complicated internal systems, DDOS attacks can arise from within the datacenter, thus avoiding the traditional DDOS prevention techniques.
The present technology is directed to monitoring traffic at a process, virtual machine, hypervisor, switch, etc. level. When any of a variety of traffic parameters goes outside of an established normal distribution, the system can determine whether the traffic is illegitimate traffic (possibly a DDOS attack) and notify a system administrator.
The primary benefit of this system is that it does not require complicated rules and configurations; instead, it relies on establishing a baseline and comparing traffic to that baseline. Further, it has the ability to monitor and manage traffic at a process and virtual machine level.
A distributed denial of service (DDOS) attack is where illegitimate traffic overwhelms a service, effectively shutting it down. Typically, these attacks come from a botnet or collection of botnets where each infected computer in the botnet is instructed to attack an individual machine or service. Various techniques are used to overcome these attacks, like using cloud-based services to accommodate the excess traffic, firewalls to filter requests related to improper ports or protocols, and “black holing” data by dropping all requests. These approaches are satisfactory for external traffic, but, because they are implemented on the periphery, are ineffective at combatting illegitimate intra-datacenter traffic. It would be too expensive to deploy firewalls throughout the datacenter and usually administrators want a way to solve the problem instead of weathering it.
The present technology is directed to monitoring traffic at a process, virtual machine, hypervisor, top of rack, switch, etc. level, detecting irregular traffic that might be indicative of a DOS attack or a misconfigured machine, and taking appropriate action. Irregular traffic can be discovered by developing a signature of normal traffic from a particular process/virtual machine/hypervisor/etc. and comparing it to current traffic. The signature can include packet count, byte count, service/host connection counts, TCP flags, port, protocol, port count, geolocation, user (of a process), process ID, etc. The signature can be created using statistics and analytics.
The signature can be a long-term distribution. For a given period of time (second, minute, hour, etc.) the system can record data pertaining to all of the above-listed parameters and include the data in a running distribution. Individual parameters can be analyzed independently, or aggregate values can be created by combining parameters. The system can then compare short term data to the long term distribution and determine how likely it is that the short term data describes illegitimate traffic. This determination can be calculated using a human created formula or through machine learning techniques. The system can present a system administrator with a confidence indicator that represents how likely it is that the traffic comprises illegitimate traffic.
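For illustration only, the comparison of short-term data against a long-term running distribution might be sketched as follows for a single parameter. The class and function names are hypothetical, the running mean and variance use Welford's online update, and the confidence formula (the fraction of a normal distribution lying closer to the mean than the observation) is merely one example of a human-created formula, not a required one.

```python
import math


class RunningStats:
    """Running mean and variance of one traffic parameter (Welford's update)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, value: float) -> None:
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def anomaly_confidence(stats: RunningStats, value: float) -> float:
    """Return a value in [0, 1]; higher means further from the long-term norm.
    Assumes the parameter is roughly normally distributed."""
    if stats.std == 0.0:
        return 0.0 if value == stats.mean else 1.0
    z = abs(value - stats.mean) / stats.std
    return math.erf(z / math.sqrt(2))


# Long-term distribution of packets per second for one process (toy values).
long_term = RunningStats()
for packets_per_second in [100, 102, 98, 101, 99]:
    long_term.add(packets_per_second)

print(anomaly_confidence(long_term, 100))  # typical short-term value
print(anomaly_confidence(long_term, 150))  # far outside the baseline
```

Aggregate values can be handled the same way by feeding a combined metric into a separate `RunningStats` instance.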
When irregular traffic is discovered, a system administrator can be notified and presented with appropriate actions that should correct the irregular traffic. Actions can include shutting down the process or virtual machine, blocking traffic from the virtual machine (via the hypervisor), or blocking the port/protocol/subnet corresponding to the traffic. More advanced filtering can be applied according to the detected anomaly. For example, the system can detect that short packets sent to port X from subnet Y in China are anomalous traffic and can filter traffic that meets those criteria. The filtering criteria can be optimized to capture a limited percentage of legitimate traffic while capturing a high amount of illegitimate traffic.
The DDOS detection is done by running the following three phases:
1. Typical traits—Stats during typical normal operations for EPG (manual configuration or derived through ADM), hosts, host pairs, flows, etc. The stats could, for example, include distribution of number of unique destination ports opened on a host/server in a fixed interval. We can keep (mean, variance), or hand-crafted buckets (for example, 1, 2-10, 11-100, 100+ for ports).
2. Anomaly detection—Detect when stats for a particular host, host-pair or flows are outside the normal range. For example, if a host typically has src ports in the 1 and 2-10 range, but the current batch saw src ports in 100+ range, this would be considered an anomaly.
3. DDOS detection through Aggregation—By aggregating anomalies from multiple hosts and host-pairs, we can detect a DDOS attempt.
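The three phases above can be sketched, under simplifying assumptions, using the hand-crafted port buckets as the only statistic. All function names, the treatment of the "100+" bucket as strictly greater than 100, and the 50% aggregation threshold are assumptions for this example.

```python
from typing import Dict, Iterable, Set


def port_bucket(unique_ports: int) -> str:
    """Map a count of unique ports to the buckets 1, 2-10, 11-100, 100+."""
    if unique_ports <= 1:
        return "1"
    if unique_ports <= 10:
        return "2-10"
    if unique_ports <= 100:
        return "11-100"
    return "100+"


def typical_traits(history: Dict[str, Iterable[int]]) -> Dict[str, Set[str]]:
    """Phase 1: record which buckets each host occupied during normal operation."""
    return {host: {port_bucket(n) for n in counts} for host, counts in history.items()}


def anomalous_hosts(traits: Dict[str, Set[str]], current: Dict[str, int]) -> Set[str]:
    """Phase 2: flag hosts whose current bucket was never seen in their history.
    Hosts without any history are also flagged in this sketch."""
    return {
        host for host, n in current.items()
        if port_bucket(n) not in traits.get(host, set())
    }


def epg_under_attack(epg_hosts: Set[str], anomalies: Set[str],
                     min_fraction: float = 0.5) -> bool:
    """Phase 3: aggregate per-host anomalies across an EPG."""
    if not epg_hosts:
        return False
    return len(epg_hosts & anomalies) / len(epg_hosts) >= min_fraction


history = {"h1": [1, 3, 5], "h2": [1, 2], "h3": [4, 8]}
current = {"h1": 250, "h2": 400, "h3": 6}

traits = typical_traits(history)
anomalies = anomalous_hosts(traits, current)
print(sorted(anomalies))
print(epg_under_attack({"h1", "h2", "h3"}, anomalies))
```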
Keys for the Table—
The table contains data for datacenter, EPG, host, host-pair and server-flows (src + dst + server port). We can use a single table for these. The idea is that we would aggregate anomalies from a lower granularity, along with stats, to more confidently detect DDOS. For example, within an EPG, we would consider the number of hosts reporting anomalies as well as stats for the EPG to detect that an EPG is under attack.
Values in the Table—
We would maintain a traits table in our BD pipeline, which would include the following features: (1) packets: (a) mean and std of the number of packets (from some flow stats, the distribution of log(packets) appears Gaussian, and hence we may keep the mean and std of log(packets)), (b) change in packets from the last period; (2) bytes: (a) mean and std of log(sent bytes) and log(received bytes), (b) change in bytes from the last period; (3) number of client/server ports: (a) from stats on existing flows, the distribution across the buckets (1, 2-10, 11-100, 100+) appears interesting; (4) connection rates: (a) number of unique connections in a unit time, (b) number of unique flows in a unit time; (5) number of unique hosts: (a) total number of unique hosts that a given host communicates with; (6) scan stats: (a) stats on stateless scans, (b) stats on stateful scans such as SYN, RST, etc.
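As a hedged illustration of the packet features, the sketch below computes the mean and standard deviation of log(packets) and the change from the last period for one key of the traits table. The function name and the dictionary field names are hypothetical, and the sketch assumes at least two periods with positive packet counts.

```python
import math
import statistics
from typing import Dict, Sequence


def packet_traits(packets_per_period: Sequence[int]) -> Dict[str, float]:
    """Summarize per-period packet counts for one table key (e.g., one host)."""
    logs = [math.log(count) for count in packets_per_period if count > 0]
    return {
        "log_mean": statistics.mean(logs),
        "log_std": statistics.stdev(logs),  # sample standard deviation
        "change": float(packets_per_period[-1] - packets_per_period[-2]),
    }


row = packet_traits([100, 1000, 10000])
print(row)
```

The byte features can be computed in the same manner by passing sent or received byte counts instead of packet counts.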
Example Devices
The interfaces 1068 are typically provided as interface cards (sometimes referred to as “line cards”). Generally, they control the sending and receiving of data packets over the network and sometimes support other peripherals used with the router 1010. Among the interfaces that may be provided are Ethernet interfaces, frame relay interfaces, cable interfaces, DSL interfaces, token ring interfaces, and the like. In addition, various very high-speed interfaces may be provided such as fast token ring interfaces, wireless interfaces, Ethernet interfaces, Gigabit Ethernet interfaces, ATM interfaces, HSSI interfaces, POS interfaces, FDDI interfaces and the like. Generally, these interfaces may include ports appropriate for communication with the appropriate media. In some cases, they may also include an independent processor and, in some instances, volatile RAM. The independent processors may control such communications intensive tasks as packet switching, media control and management. By providing separate processors for the communications intensive tasks, these interfaces allow the master microprocessor 1062 to efficiently perform routing computations, network diagnostics, security functions, etc.
Although the system shown in
Regardless of the network device's configuration, it may employ one or more memories or memory modules (including memory 1061) configured to store program instructions for the general-purpose network operations and mechanisms for roaming, route optimization and routing functions described herein. The program instructions may control the operation of an operating system and/or one or more applications, for example. The memory or memories may also be configured to store tables such as mobility binding, registration, and association tables, etc.
To enable user interaction with the computing device 1100, an input device 1145 can represent any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, speech and so forth. An output device 1135 can also be one or more of a number of output mechanisms known to those of skill in the art. In some instances, multimodal systems can enable a user to provide multiple types of input to communicate with the computing device 1100. The communications interface 1140 can generally govern and manage the user input and system output. There is no restriction on operating on any particular hardware arrangement and therefore the basic features here may easily be substituted for improved hardware or firmware arrangements as they are developed.
Storage device 1130 is a non-volatile memory and can be a hard disk or other types of computer readable media which can store data that are accessible by a computer, such as magnetic cassettes, flash memory cards, solid state memory devices, digital versatile disks, cartridges, random access memories (RAMs) 1125, read only memory (ROM) 1120, and hybrids thereof.
The storage device 1130 can include software modules 1132, 1134, 1136 for controlling the processor 1110. Other hardware or software modules are contemplated. The storage device 1130 can be connected to the system bus 1105. In one aspect, a hardware module that performs a particular function can include the software component stored in a computer-readable medium in connection with the necessary hardware components, such as the processor 1110, bus 1105, display 1135, and so forth, to carry out the function.
Chipset 1160 can also interface with one or more communication interfaces 1190 that can have different physical interfaces. Such communication interfaces can include interfaces for wired and wireless local area networks, for broadband wireless networks, as well as personal area networks. Some applications of the methods for generating, displaying, and using the GUI disclosed herein can include receiving ordered datasets over the physical interface or be generated by the machine itself by processor 1155 analyzing data stored in storage 1170 or 1175. Further, the machine can receive inputs from a user via user interface components 1185 and execute appropriate functions, such as browsing functions by interpreting these inputs using processor 1155.
It can be appreciated that example systems 1100 and 1150 can have more than one processor 1110 or be part of a group or cluster of computing devices networked together to provide greater processing capability.
For clarity of explanation, in some instances the present technology may be presented as including individual functional blocks including functional blocks comprising devices, device components, steps or routines in a method embodied in software, or combinations of hardware and software.
In some embodiments the computer-readable storage devices, mediums, and memories can include a cable or wireless signal containing a bit stream and the like. However, when mentioned, non-transitory computer-readable storage media expressly exclude media such as energy, carrier signals, electromagnetic waves, and signals per se.
Methods according to the above-described examples can be implemented using computer-executable instructions that are stored or otherwise available from computer readable media. Such instructions can comprise, for example, instructions and data which cause or otherwise configure a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Portions of computer resources used can be accessible over a network. The computer executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, firmware, or source code. Examples of computer-readable media that may be used to store instructions, information used, and/or information created during methods according to described examples include magnetic or optical disks, flash memory, USB devices provided with non-volatile memory, networked storage devices, and so on.
Devices implementing methods according to these disclosures can comprise hardware, firmware and/or software, and can take any of a variety of form factors. Typical examples of such form factors include laptops, smart phones, small form factor personal computers, personal digital assistants, rackmount devices, standalone devices, and so on. Functionality described herein also can be embodied in peripherals or add-in cards. Such functionality can also be implemented on a circuit board among different chips or different processes executing in a single device, by way of further example.
The instructions, media for conveying such instructions, computing resources for executing them, and other structures for supporting such computing resources are means for providing the functions described in these disclosures.
Although a variety of examples and other information were used to explain aspects within the scope of the appended claims, no limitation of the claims should be implied based on particular features or arrangements in such examples, as one of ordinary skill would be able to use these examples to derive a wide variety of implementations. Further, although some subject matter may have been described in language specific to examples of structural features and/or method steps, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to these described features or acts. For example, such functionality can be distributed differently or performed in components other than those identified herein. Rather, the described features and steps are disclosed as examples of components of systems and methods within the scope of the appended claims. Moreover, claim language reciting “at least one of” a set indicates that one member of the set or multiple members of the set satisfy the claim.
It should be understood that features or configurations herein with reference to one embodiment or example can be implemented in, or combined with, other embodiments or examples herein. That is, terms such as “embodiment”, “variation”, “aspect”, “example”, “configuration”, “implementation”, “case”, and any other terms which may connote an embodiment, as used herein to describe specific features or configurations, are not intended to limit any of the associated features or configurations to a specific or separate embodiment or embodiments, and should not be interpreted to suggest that such features or configurations cannot be combined with features or configurations described with reference to other embodiments, variations, aspects, examples, configurations, implementations, cases, and so forth. In other words, features described herein with reference to a specific example (e.g., embodiment, variation, aspect, configuration, implementation, case, etc.) can be combined with features described with reference to another example. Precisely, one of ordinary skill in the art will readily recognize that the various embodiments or examples described herein, and their associated features, can be combined with each other.
A phrase such as an “aspect” does not imply that such aspect is essential to the subject technology or that such aspect applies to all configurations of the subject technology. A disclosure relating to an aspect may apply to all configurations, or one or more configurations. A phrase such as an aspect may refer to one or more aspects and vice versa. A phrase such as a “configuration” does not imply that such configuration is essential to the subject technology or that such configuration applies to all configurations of the subject technology. A disclosure relating to a configuration may apply to all configurations, or one or more configurations. A phrase such as a configuration may refer to one or more configurations and vice versa. The word “exemplary” is used herein to mean “serving as an example or illustration.” Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs.
Configuration and image manager 1202 can provision and maintain sensors 1204. Because many sensors 1204 reside within virtual machine images, configuration and image manager 1202 can be the component that also provisions virtual machine images.
Sensors 1204 can reside on every node and component of the data center (e.g., virtual machine, hypervisor, slice, blade, switch, router, gateway, etc.). Sensors 1204 can monitor traffic to and from the component, report on environmental data related to the component (e.g., component IDs, statuses, etc.), and perform actions related to the component (e.g., shut down a process, block ports, redirect traffic, etc.). Sensors 1204 can send their records over a high-speed connection to the collectors 1208 for storage. As mentioned previously, sensors 1204 can comprise a piece of software (e.g., running on a virtual machine, container, or hypervisor), an ASIC (e.g., a component of a switch, gateway, router, or standalone packet monitor), or an independent unit (e.g., a device connected to a switch's monitoring port or a device connected in series along a main trunk of a datacenter). For clarity and simplicity in this description, the term “component” is used to denote a component of the network (i.e., a process, module, slice, blade, hypervisor, machine, switch, router, gateway, etc.). It should be understood that various software and hardware configurations can be used as sensors 1204. Sensors 1204 can be lightweight, minimally impeding normal traffic and compute resources in a datacenter. Software sensors 1204 can “sniff” packets being sent over their host network interface card (NIC) or individual processes can be configured to report traffic to sensors 1204. In some embodiments, sensors 1204 are on every virtual machine, hypervisor, switch, etc. This layered sensor structure allows for granular packet statistics and data at each hop of data transmission. In some embodiments, sensors 1204 are prevented from being installed in certain places. For example, in a shared hosting environment, customers may have exclusive control of VMs, thus preventing network administrators from installing a sensor on those VMs.
As sensors 1204 capture traffic flows, they can continuously send reports to collectors 1208. The reports can relate to a packet, collection of packets, flow, group of flows, open ports, port knocks, etc. The reports can also include other details such as the VM bios ID, sensor ID, associated process ID, associated process name, process user name, sensor private key, geo-location of sensor, environmental details, etc. The reports can comprise data describing the connection information on all layers of the OSI model. For example, the reports can include Ethernet signal strength, destination MAC address, IP address, protocol, port number, encryption data, requesting process, etc.
Sensors 1204 can preprocess reports before sending. For example, sensors 1204 can remove extraneous or duplicative data or they can create a summary of the data (e.g., latency, packets and bytes sent per traffic flow, flagging abnormal activity, etc.). In some embodiments, sensors 1204 are configured to only capture certain types of connection information and they disregard the rest. Because it can be overwhelming for a system to capture every packet, sensors can be configured to capture only a representative sample of packets (for example, every 12,000th packet).
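One way such sensor-side preprocessing could look is sketched below: packets are sampled at a fixed interval and summarized per traffic flow before being reported. The packet tuple layout and function names are hypothetical, and fixed-interval sampling is only one of many possible sampling strategies.

```python
from collections import defaultdict
from itertools import islice
from typing import Dict, Iterable, Iterator, Tuple

FlowKey = Tuple[str, str, int]  # (source, destination, destination port)
Packet = Tuple[FlowKey, int]    # (flow key, size in bytes)


def sample_every_nth(packets: Iterable[Packet], n: int) -> Iterator[Packet]:
    """Yield the n-th, 2n-th, 3n-th, ... packet of the stream."""
    return islice(packets, n - 1, None, n)


def summarize(packets: Iterable[Packet]) -> Dict[FlowKey, Dict[str, int]]:
    """Count packets and bytes per traffic flow."""
    summary: Dict[FlowKey, Dict[str, int]] = defaultdict(
        lambda: {"packets": 0, "bytes": 0}
    )
    for flow, size in packets:
        summary[flow]["packets"] += 1
        summary[flow]["bytes"] += size
    return dict(summary)


packets = [(("a", "b", 80), 100)] * 4 + [(("a", "c", 443), 60)] * 2

print(summarize(packets))
print(list(sample_every_nth(packets, 3)))
```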
Sensors 1204 can perform actions. For example, a sensor installed on a VM can close, quarantine, restart, or throttle a process. Sensors 1204 can create and enforce firewall policies (e.g., block access to ports, protocols, or addresses). In some embodiments, sensors 1204 receive instructions to perform such actions; alternatively, sensors 1204 can act independently and without external direction.
Sensors 1204 can send reports to one or multiple collectors 1208. Sensors 1204 can be assigned to send reports to a primary collector and a secondary collector. In some embodiments, sensors 1204 are not assigned a collector, but determine an optimal collector through a discovery process. Sensors 1204 can change where they send their reports if their environment changes, for example, if a certain collector experiences failure or if the sensor 1204 is migrated to a new location and is closer to a different collector. In some embodiments, sensors 1204 send different reports to different collectors. For example, sensors 1204 can send reports related to one type of process to one collector and reports related to another type of process to another collector.
Collectors 1208 can serve as a repository for the data recorded by the sensors. In some embodiments, collectors 1208 are directly connected to the top of rack switch; alternatively, collectors 1208 can be located near the end of row or elsewhere on or off premises. The placement of collectors 1208 can be optimized according to various priorities such as network capacity, cost, and system responsiveness. In some embodiments, collectors' 1208 data storage is located in an in-memory database such as dashDB by IBM. This approach benefits from rapid random access speeds that typically are required for analytics software. Alternatively, collectors 1208 can utilize solid state drives, disk drives, magnetic tape drives, or a combination of the foregoing according to cost, responsiveness, and size requirements. Collectors 1208 can utilize various database structures such as a normalized relational database or NoSQL database.
In some embodiments, collectors 1208 only serve as network storage for the traffic monitoring system 1200. Alternatively, collectors 1208 can organize, summarize, and preprocess data. For example, collectors 1208 can tabulate how often packets of certain sizes or types are transmitted from different virtual machines. Collectors 1208 can also characterize the traffic flows going to and from various network components. In some embodiments, collectors 1208 can match packets based on sequence numbers, thus identifying traffic flows and connection links. In some embodiments, collectors 1208 flag anomalous data. Because it would be inefficient to retain all data indefinitely, collectors 1208 can routinely replace detailed reports with consolidated summaries. In this manner, collectors 1208 can retain a complete dataset describing one period (e.g., the past minute), with a smaller report of another period (e.g., the previous), and progressively consolidated reports of other times (day, week, month, year, etc.). By organizing, summarizing, and preprocessing the data, collectors 1208 can help traffic monitoring system 1200 scale efficiently. Although collectors 1208 are generally herein referred to as a plural noun, a single machine or cluster of machines are contemplated to be sufficient, especially for smaller datacenters. In some embodiments, collectors 1208 serve as sensors 1204 as well.
In some embodiments, collectors 1208 receive data that does not come from sensors 1204. For example, they can receive data external to traffic monitoring system 1200 such as security reports, white-lists, IP watchlists, whois data, power status, temperature readings, etc.
Configuration and image manager 1202 can configure and manage sensors 1204. When a new virtual machine is instantiated or when an existing one is migrated, configuration and image manager 1202 can provision and configure a new sensor on the machine. In some embodiments configuration and image manager 1202 can monitor the health of sensors 1204. For example, configuration and image manager 1202 might request status updates or initiate tests. In some embodiments, configuration and image manager 1202 also manages and provisions virtual machines.
In some embodiments, configuration and image manager 1202 can verify and validate sensors 1204. For example, sensors 1204 can be provisioned a unique ID that is created using a one-way hash function of its BIOS UUID and a secret key stored on configuration and image manager 1202. This unique ID can be a large number that is difficult for an imposter sensor to guess. In some embodiments, configuration and image manager 1202 can keep sensors 1204 up to date by installing new versions of their software and applying patches. It can get these updates from a local source or automatically from the Internet.
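As a non-limiting sketch, the unique ID could be derived with a keyed one-way hash. HMAC-SHA256 is used here as an assumed choice of one-way function; the description above does not mandate any particular construction, and the function names and sample values are hypothetical.

```python
import hashlib
import hmac


def sensor_id(bios_uuid: str, secret_key: bytes) -> str:
    """Derive a sensor ID from the BIOS UUID and the manager's secret key."""
    return hmac.new(secret_key, bios_uuid.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_sensor(bios_uuid: str, claimed_id: str, secret_key: bytes) -> bool:
    """Recompute the ID and compare in constant time."""
    return hmac.compare_digest(sensor_id(bios_uuid, secret_key), claimed_id)


secret = b"example-secret-held-by-the-manager"  # illustrative value only
uuid = "4C4C4544-0000-1000-8000-B1C2D3E4F5A6"   # illustrative BIOS UUID

issued = sensor_id(uuid, secret)
print(is_valid_sensor(uuid, issued, secret))      # genuine sensor
print(is_valid_sensor(uuid, "0" * 64, secret))    # imposter guessing an ID
```

Because the secret key never leaves the configuration and image manager in this sketch, an imposter that knows only a BIOS UUID cannot compute the matching ID.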
Analytics module 1220 can have a wide bandwidth connection to the various collectors 1208 and can process the data stored therein. Analytics module 1220 can accomplish various tasks in its analysis, some of which are herein disclosed. In some embodiments, traffic monitoring system 1200 can automatically determine network topology. Using data provided from sensors 1204, traffic monitoring system 1200 can determine what type of devices exist on the network (brand and model of switches, gateways, machines, etc.), where they are physically located (e.g., latitude and longitude, building, datacenter, room, row, rack, machine, etc.), how they are interconnected (120 Gb Ethernet, fiber-optic, etc.), and what the strength of each connection is (bandwidth, latency, etc.). Automatically determining the network topology can assist with integration of traffic monitoring system 1200 within an already established datacenter. Furthermore, analytics module 1220 can detect changes of network topology without the need for further configuration.
Analytics module 1220 can determine dependencies of components within the network. For example, if component A routinely sends data to component B but component B never sends data to component A, then analytics module 1220 can determine that component B is dependent on component A, but A is likely not dependent on component B. If, however, component B also sends data to component A, then they are likely interdependent. These components can be processes, virtual machines, hypervisors, VLANs, etc. Once analytics module 1220 has determined component dependencies, it can then form a component (“application”) dependency map. This map can be instructive when analytics module 1220 attempts to determine the root cause of a failure (because failure of one component can cascade and cause failure of its dependent components) or when analytics module 1220 attempts to predict what will happen if a component is taken offline. Additionally, analytics module 1220 can associate edges of an application dependency map with expected latency, bandwidth, etc. for that individual edge.
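The dependency rule described above can be illustrated with the following sketch, which classifies each pair of communicating components from observed (sender, receiver) records. The function name, the output strings, and the `min_count` threshold used to approximate "routinely" are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_dependencies(
    flows: Iterable[Tuple[str, str]], min_count: int = 1
) -> Dict[Tuple[str, str], str]:
    """Classify component pairs from observed (sender, receiver) records.
    Following the rule above, a one-way sender -> receiver relationship
    means the receiver is treated as dependent on the sender."""
    counts = Counter(flows)
    edges = {
        edge for edge, n in counts.items()
        if n >= min_count and edge[0] != edge[1]
    }
    result: Dict[Tuple[str, str], str] = {}
    for sender, receiver in edges:
        pair = tuple(sorted((sender, receiver)))
        if (receiver, sender) in edges:
            result[pair] = "interdependent"
        else:
            result[pair] = f"{receiver} depends on {sender}"
    return result


flows = [("A", "B"), ("A", "B"), ("C", "D"), ("D", "C")]
dependency_map = classify_dependencies(flows)
print(dependency_map)
```

Edges of the resulting map could additionally carry expected latency or bandwidth, for example by storing those values alongside each classification.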
Analytics module 1220 can establish patterns and norms for component behavior. For example, it can determine that certain processes (when functioning normally) will only send a certain amount of traffic to a certain VM using a small set of ports. Analytics module 1220 can establish these norms by analyzing individual components or by analyzing data coming from similar components (e.g., VMs with similar configurations). Similarly, analytics module 1220 can determine expectations for network operations. For example, it can determine the expected latency between two components, the expected throughput of a component, response times of a component, typical packet sizes, traffic flow signatures, etc. In some embodiments, analytics module 1220 can combine its dependency map with pattern analysis to create reaction expectations. For example, if traffic increases with one component, other components may predictably increase traffic in response (or latency, compute time, etc.).
In some embodiments, analytics module 1220 uses machine learning techniques to identify which patterns are desirable or unwanted. For example, a network administrator can indicate network states corresponding to an attack and network states corresponding to normal operation. Analytics module 1220 can then analyze the data to determine which patterns most correlate with the network being in a desirable or undesirable state. In some embodiments, the network can operate within a trusted environment for a time so that analytics module 1220 can establish baseline normalcy. In some embodiments, analytics module 1220 contains a database of norms and expectations for various components. This database can incorporate data from sources external to the network. Analytics module 1220 can then create access policies for how components can interact. In some embodiments, policies can be established external to traffic monitoring system 1200 and analytics module 1220 can detect the policies and incorporate them into this framework. A network administrator can manually tweak the policies. Policies can dynamically change and be conditional on events. These policies can be enforced on the components. Policy engine 1222 can maintain these policies and receive user input to change the policies.
Policy engine 1222 can configure analytics module 1220 to establish what network policies exist or should be maintained. For example, policy engine 1222 may specify that certain machines should not intercommunicate or that certain ports are restricted. Network and security policy controller 1224 can set the parameters of policy engine 1222. In some embodiments, policy engine 1222 is accessible via the presentation module.
Over time, components may occasionally exhibit anomalous behavior. Analytics module 1220 can analyze the frequency and severity of the anomalous behavior to determine a reputation score for the component. Analytics module 1220 can use the reputation score of a component to selectively enforce policies. For example, if a component has a high reputation score, analytics module 1220 may allow the component to periodically violate its relevant policy; while if the component frequently violates its relevant policy, its reputation score may be lowered. Analytics module 1220 can correlate observed reputation score with characteristics of a component. For example, a particular virtual machine with a particular configuration may be more prone to misconfiguration and receive a lower reputation score. In some embodiments, policies are strictly followed, but explicitly factor in a component's reputation score. When a new component is placed in the network, analytics module 1220 can assign a starting reputation score similar to the scores of similarly configured components. The expected reputation score for a given component configuration can be externally sourced outside of the datacenter. A network administrator can be presented with expected reputation scores for various components before installation, thus assisting the network administrator in choosing components and configurations that will result in high reputation scores.
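Purely as an illustration, a reputation score that reflects both the frequency and the severity of anomalous behavior might be computed as sketched below. The scoring formula, the 0-to-1 scale, the tolerance threshold, and all names are assumptions; the description above does not prescribe a specific formula.

```python
from typing import Sequence


def reputation_score(severities: Sequence[float], observed_periods: int) -> float:
    """Score in [0, 1]; each anomaly has a severity in [0, 1].
    More frequent or more severe anomalies lower the score."""
    if observed_periods <= 0:
        raise ValueError("observed_periods must be positive")
    penalty = sum(severities) / observed_periods
    return max(0.0, 1.0 - penalty)


def should_enforce(score: float, tolerance: float = 0.8) -> bool:
    """Selective enforcement: tolerate an occasional violation from a
    component whose reputation score is at or above the tolerance."""
    return score < tolerance


def starting_score(similar_scores: Sequence[float], default: float = 0.5) -> float:
    """Seed a new component with the mean score of similarly configured ones."""
    if not similar_scores:
        return default
    return sum(similar_scores) / len(similar_scores)


print(reputation_score([0.5, 0.5], observed_periods=10))
print(should_enforce(reputation_score([1.0] * 20, observed_periods=10)))
print(starting_score([0.9, 0.7]))
```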
Some anomalous behavior can be indicative of a misconfigured component or a malicious attack. Certain attacks are easy to detect if they originate outside of the datacenter, but can prove difficult to detect and isolate if they originate from within the datacenter. One such attack could be a distributed denial of service (DDoS) attack, in which a component or group of components attempts to overwhelm another component with spurious transmissions and requests. Detecting an attack or other anomalous network traffic can be accomplished by comparing the expected network conditions with actual network conditions. For example, if a traffic flow varies from its historical signature (packet size, TCP header options, etc.), it may be an attack.
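As a minimal sketch of comparing a flow against its historical signature, the example below flags numeric features that deviate strongly from past observations. The use of a standard-deviation threshold, the feature names, and the function `deviates_from_signature` are assumptions for illustration; other statistical or learned models could be used instead.

```python
from statistics import mean, pstdev


def deviates_from_signature(history, observed, threshold=3.0):
    """Return the features whose observed value lies more than `threshold`
    standard deviations from the historical mean for that feature."""
    flagged = []
    for feature, value in observed.items():
        past = history.get(feature, [])
        if len(past) < 2:
            continue  # not enough history to judge this feature
        mu, sigma = mean(past), pstdev(past)
        if sigma == 0:
            # History never varied, so any change at all is a deviation.
            if value != mu:
                flagged.append(feature)
        elif abs(value - mu) / sigma > threshold:
            flagged.append(feature)
    return flagged


history = {"packet_size": [500, 510, 490, 505, 495]}
suspicious = deviates_from_signature(history, {"packet_size": 1400})
```

A flagged feature is only an indication that the flow may warrant further analysis; it does not by itself establish that an attack is occurring.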
Once undesirable traffic is identified, analytics module 1220 can enforce and modify policies in order to mitigate the effects of the traffic. For example, a virtual machine may be prevented from communicating on certain ports. Analytics module 1220 can use the sensors 1204 to enforce these policies, including restarting a component. For example, if analytics module 1220 determines that an individual process is causing the attack, it can direct the sensor located on that virtual machine to terminate or restart the process. This enables other processes on the virtual machine and other network components to continue normal operation without interruption.
In some embodiments, analytics module 1220 can simulate changes in the network. For example, analytics module 1220 can simulate what may result if a machine is taken offline, if a connection is severed, or if a new policy is implemented. This type of simulation can provide a network administrator with greater information on what policies to implement. In some embodiments, the simulation may serve as a feedback loop for policies. For example, there can be a policy that if certain policies would affect certain services (as predicted by the simulation) those policies should not be implemented. Analytics module 1220 can use simulations to discover vulnerabilities in the datacenter. In some embodiments, analytics module 1220 can determine which services and components will be affected by a change in policy. Analytics module 1220 can then take necessary actions to prepare those services and components for the change. For example, it can send a notification to administrators of those services and components, it can initiate a migration of the components, it can shut the components down, etc.
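One simple way to picture such a simulation is a traversal of a dependency map to find which components would be affected if a machine were taken offline. The sketch below assumes a hypothetical `depends_on` mapping from each component to the set of components it relies on; the disclosure does not limit the simulation to this approach.

```python
def affected_by_removal(depends_on, removed):
    """Return every component that directly or transitively depends on `removed`."""
    affected = set()
    frontier = [removed]
    while frontier:
        current = frontier.pop()
        for component, deps in depends_on.items():
            if current in deps and component not in affected:
                affected.add(component)
                frontier.append(component)
    return affected


# Hypothetical three-tier application plus a batch job sharing the database.
depends_on = {
    "web": {"app"},
    "app": {"db"},
    "db": set(),
    "batch": {"db"},
}
impacted = affected_by_removal(depends_on, "db")
```

The resulting set could then drive the preparatory actions described above, such as notifying the administrators of the impacted services before the change is applied.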
In some embodiments, analytics module 1220 can supplement its analysis by initiating synthetic traffic flows and synthetic attacks on the datacenter. These artificial actions can assist analytics module 1220 in gathering data to enhance its model. In some embodiments, these synthetic flows and synthetic attacks are used to verify the integrity of sensors 1204, collectors 1208, and analytics module 1220.
In some cases, a traffic flow is expected to be reported by a sensor, but that sensor fails to report it. This situation could be an indication that the sensor has failed or become compromised. By comparing the reports from multiple sensors 1204 spread throughout the datacenter, analytics module 1220 can determine if a certain sensor is failing to report a particular traffic flow.
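The cross-checking of sensor reports can be illustrated with the following sketch. It assumes, hypothetically, that for each flow the set of sensors expected to observe it is known (for example, from the network topology), and that reports arrive as `(sensor, flow)` pairs; the names used are illustrative only.

```python
def silent_sensors(expected_observers, reports):
    """Map each sensor to the flows it was expected to report but did not.

    expected_observers: dict of flow id -> list of sensors on the flow's path
    reports: set of (sensor, flow id) pairs actually received
    """
    missing = {}
    for flow, sensors in expected_observers.items():
        seen_by = {s for s in sensors if (s, flow) in reports}
        # Only suspect a sensor when at least one peer did report the flow;
        # if nobody reported it, the flow may simply not have occurred.
        if not seen_by:
            continue
        for sensor in sensors:
            if sensor not in seen_by:
                missing.setdefault(sensor, []).append(flow)
    return missing


expected = {"flow-1": ["vm-a", "hypervisor-1", "switch-1"]}
received = {("vm-a", "flow-1"), ("switch-1", "flow-1")}
suspects = silent_sensors(expected, received)
```

A sensor that appears repeatedly in the result may have failed or become compromised and could be prioritized for inspection.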
Presentation module 1226 can comprise serving layer 1228, authentication module 1220, web frontend 1222, and public alert module 1224 connected to third party tools 1226. The reports that analytics module 1220 generates as it processes the data may not be in a human-readable form, or they may be too large for an administrator to navigate. Presentation module 1226 can take the reports generated by analytics module 1220 and further summarize, filter, and organize the reports as well as create intuitive presentations of the reports.
Serving layer 1228 can be the interface between presentation module 1226 and analytics module 1220. As analytics module 1220 generates reports, predictions, and conclusions, serving layer 1228 can summarize, filter, and organize the information that comes from analytics module 1220. In some embodiments, serving layer 1228 can request raw data from a sensor, collector, or analytics module 1220.
Web frontend 1222 can connect with serving layer 1228 to present the data from serving layer 1228 in a page for human presentation. For example, web frontend 1222 can present the data in bar charts, core charts, tree maps, acyclic dependency maps, line graphs, tables, etc. Web frontend 1222 can be configured to allow a user to “drill down” on information sets to get a filtered data representation specific to the item the user wishes to “drill down” to, for example, individual traffic flows, components, etc. Web frontend 1222 can also be configured to allow a user to filter by search. This search filter can use natural language processing to analyze the network administrator's input. There can be options to view data relative to the current second, minute, hour, day, etc. Web frontend 1222 can allow a network administrator to view traffic flows, application dependency maps, network topology, etc.
In some embodiments, web frontend 1222 is solely configured to present information. In some embodiments, web frontend 1222 can receive inputs from a network administrator to configure traffic monitoring system 1200 or components of the datacenter. These instructions can be passed through serving layer 1228, sent to configuration and image manager 1202, or sent to policy engine 1222. Authentication module 1220 can verify the identity and privileges of the network administrator. In some embodiments, authentication module 1220 can grant network administrators different rights according to established policies.
Public alert module 1224 can identify network conditions that satisfy specified criteria and push alerts to third party tools 1226. Public alert module 1224 can use reports generated or accessible through analytics module 1220. One example of third party tools 1226 is a security information and event management system. Third party tools 1226 may retrieve information from serving layer 1228 through an API.
The various elements of traffic monitoring system 1200 can exist in various configurations. For example, collectors 1208 can be a component of sensors 1204. In some embodiments, elements other than analytics module 1220 perform some calculating and summarizing to ease the task of analytics module 1220.
Data then passes to analytics module 1210, which comprises discovery engines 1314 and analysis engines 1316. Discovery engines 1314 can comprise a flow engine that uses packet data to identify traffic flows. Discovery engines 1314 can also comprise engines to identify host traits, process characteristics, application traits, and policy and data traits. Further, discovery engines 1314 can comprise an application dependency mapping (ADM) engine as well as an engine to determine network topology (not depicted). These engines can discover the condition of network elements.
Analysis engines 1316 ingest the conditions and traits determined by discovery engines 1314 to identify network states and cross correlations. For example, analysis engines 1316 can comprise an attack detection engine, a search engine, a policy engine, and a DDoS detection engine.
The depicted engines can work independently or in concert. Analysis engines 1316 can ingest data from multiple discovery engines 1314. Discovery engines 1314 may perform analysis functions and analysis engines 1316 may perform discovery functions. Analytics module 1210 can comprise engines that are neither discovery engines 1314 nor analysis engines 1316.
Data can then flow from analytics module 1210 to presentation module 1216, which can comprise a persistence and API segment 1310 and a user interface and serving segment 1312. Persistence and API segment 1310 can comprise various database programs and access protocols, for example, Spark, SQL, Hive, Kafka, Druid, Mongo, Java Database Connectivity (JDBC), and Ruby on Rails. User interface and serving segment 1312 can comprise various interfaces, for example, ad hoc queries 1318, third party SIEMs 1226, and full stack web server 1222.
A hypervisor may host multiple VMs which can communicate with each other and the Internet. The hypervisor will also include virtual switching devices. The virtual switching devices carry data between the VMs and the Internet. When handling or forwarding packets, the virtual switching devices typically use different forwarding models depending on the type of virtual switching device, such as Linux bridge, Open vSwitch, vNic, or other software switch. It is important to understand what type of switching device and forwarding model is used by a hypervisor in order to optimize connections and properly attach VMs to the virtual switching device.
A Tetration policy pipeline is composed of four steps/modules:
(1) Application Dependency Mapping
In this stage, network traffic is analyzed to determine a respective graph for each application operating in a data center (discussed in detail elsewhere). That is, particular patterns of traffic will correspond to an application, and the interconnectivity or dependencies of the application are mapped to generate a graph for the application. In this context, an “application” refers to a set of networking components that provides connectivity for a given set of workloads. For example, in a conventional three-tier architecture for an application, the servers and other components of the web tier, application tier, and data tier would make up an application.
(2) Policy Generation
Whitelist rules are then derived for each application graph determined in (1) (discussed in detail elsewhere). As is known in the art, in a blacklist model, all communication is open unless explicitly denied, whereas a whitelist model requires communication to be explicitly defined before being permitted. Conventional systems use a blacklist model. One of the advantages of the Tetration system is implementation of a whitelist model, which may be more secure than a blacklist model. For instance, using a whitelist model is recognized by the Australian Signals Directorate to be the #1 approach for mitigating targeted cyber attacks (http://www.asd.gov.au/infosec/top-mitigations/top-4-strategies-explained.htm).
As an example of whitelist rule generation, suppose there is an edge of an application graph between E1 (e.g., endpoint, endpoint group) and E2. Permissible traffic flows from a set of ports of E1 to one or more ports of E2. A policy can be defined to reflect the permissible traffic from the set of ports of E1 to the one or more ports of E2.
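The derivation of a whitelist rule from an application graph edge can be sketched as follows. The `Rule` structure, the function names, and the example port numbers are hypothetical and are used only to illustrate whitelist semantics, under which traffic is denied unless a rule explicitly permits it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src: str
    src_ports: frozenset
    dst: str
    dst_ports: frozenset


def rule_from_edge(src, src_ports, dst, dst_ports):
    """Turn one application-graph edge into an explicit allow rule."""
    return Rule(src, frozenset(src_ports), dst, frozenset(dst_ports))


def is_permitted(rules, src, src_port, dst, dst_port):
    """Whitelist model: permit only traffic that matches some rule."""
    return any(
        r.src == src and r.dst == dst
        and src_port in r.src_ports and dst_port in r.dst_ports
        for r in rules
    )


# Hypothetical edge: E1 may send from ports 40000-40009 to port 443 of E2.
rules = {rule_from_edge("E1", range(40000, 40010), "E2", [443])}
allowed = is_permitted(rules, "E1", 40001, "E2", 443)
blocked = is_permitted(rules, "E2", 443, "E1", 22)
```

Because nothing is permitted by default, traffic in the reverse direction or to an unlisted port is rejected without any explicit deny rule.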
(3) Flow Pre-Processing
After the application dependencies are mapped and the policies are defined, network traffic is pre-processed in the policy pipeline for further analysis. For each flow, the source endpoint of the flow is mapped to a source endpoint group (EPG) and the destination endpoint of the flow is mapped to a destination EPG. Each flow can also be “normalized” by determining which EPG corresponds to the client, and which EPG corresponds to the server.
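A minimal sketch of this pre-processing step is shown below. The flow record fields, the `endpoint_to_epg` lookup table, and the heuristic that the side using a known service port is the server are all assumptions made for illustration; the disclosure does not specify how the client and server roles are determined.

```python
def preprocess(flow, endpoint_to_epg, server_ports):
    """Map a flow's endpoints to EPGs and normalize client/server roles."""
    src_epg = endpoint_to_epg.get(flow["src_ip"], "unknown")
    dst_epg = endpoint_to_epg.get(flow["dst_ip"], "unknown")
    # Assumed heuristic: a known service port on the source side only means
    # this record is a server-to-client response, so the roles are swapped.
    if flow["src_port"] in server_ports and flow["dst_port"] not in server_ports:
        client, server, port = dst_epg, src_epg, flow["src_port"]
    else:
        client, server, port = src_epg, dst_epg, flow["dst_port"]
    return {
        "src_epg": src_epg,
        "dst_epg": dst_epg,
        "client_epg": client,
        "server_epg": server,
        "server_port": port,
    }


epgs = {"10.0.0.5": "web", "10.0.1.7": "app"}
request = {"src_ip": "10.0.0.5", "src_port": 51000,
           "dst_ip": "10.0.1.7", "dst_port": 8080}
response = {"src_ip": "10.0.1.7", "src_port": 8080,
            "dst_ip": "10.0.0.5", "dst_port": 51000}
normalized = [preprocess(f, epgs, {8080}) for f in (request, response)]
```

After normalization, both directions of the same conversation identify the same client EPG, server EPG, and server port, so they can be matched against a single policy.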
(4) Flow Analysis
Each pre-processed flow is then analyzed to determine which policies are being enforced and the extent (e.g., number of packets, number of flows, number of bytes, etc.) to which those policies are being enforced within the data center.
This flow analysis occurs continuously, and the Tetration system allows a user to specify a window of time (e.g., time of day, day of week or month, month(s) in a year, etc.) to determine which policies are being implemented (or not being implemented) and how often those policies are being implemented.
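The flow analysis over a user-specified time window can be sketched as follows. The record layout, the policy lookup keyed by client EPG, server EPG, and server port, and the function name `policy_usage` are hypothetical choices made for this example.

```python
from collections import defaultdict
from datetime import datetime


def policy_usage(flows, policies, start, end):
    """Tally flows, packets, and bytes per policy for start <= time < end.

    Returns (usage, unused), where `unused` lists the policies that matched
    no flow in the window.
    """
    usage = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for flow in flows:
        if not (start <= flow["time"] < end):
            continue
        key = (flow["client_epg"], flow["server_epg"], flow["server_port"])
        name = policies.get(key)
        if name is None:
            continue  # flow is not covered by any known policy
        usage[name]["flows"] += 1
        usage[name]["packets"] += flow["packets"]
        usage[name]["bytes"] += flow["bytes"]
    unused = sorted(set(policies.values()) - set(usage))
    return dict(usage), unused


policies = {("web", "app", 8080): "web-to-app", ("app", "db", 5432): "app-to-db"}
flows = [
    {"time": datetime(2016, 5, 11, 9, 0), "client_epg": "web",
     "server_epg": "app", "server_port": 8080, "packets": 10, "bytes": 1000},
    {"time": datetime(2016, 5, 11, 23, 0), "client_epg": "app",
     "server_epg": "db", "server_port": 5432, "packets": 4, "bytes": 400},
]
usage, unused = policy_usage(
    flows, policies, datetime(2016, 5, 11, 8), datetime(2016, 5, 11, 12))
```

Reporting the unused policies alongside the tallies reflects the ability, described above, to determine which policies are not being implemented during the selected window.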
This application is a continuation of U.S. Non-Provisional patent application Ser. No. 16/237,187, filed on Dec. 31, 2018, which is a continuation of U.S. Non-Provisional patent application Ser. No. 15/152,163, filed on May 11, 2016, now U.S. Pat. No. 10,171,319, which claims the benefit of U.S. Provisional Patent Application No. 62/171,899, filed on Jun. 5, 2015, the full disclosures of each are hereby expressly incorporated by reference in their entireties.
20140092884 | Murphy et al. | Apr 2014 | A1 |
20140096058 | Molesky et al. | Apr 2014 | A1 |
20140105029 | Jain et al. | Apr 2014 | A1 |
20140115219 | Ajanovic et al. | Apr 2014 | A1 |
20140115403 | Rhee et al. | Apr 2014 | A1 |
20140137109 | Sharma et al. | May 2014 | A1 |
20140140213 | Raleigh et al. | May 2014 | A1 |
20140140244 | Kapadia et al. | May 2014 | A1 |
20140143825 | Behrendt et al. | May 2014 | A1 |
20140149490 | Luxenberg et al. | May 2014 | A1 |
20140156814 | Barabash et al. | Jun 2014 | A1 |
20140156861 | Cruz-Aguilar et al. | Jun 2014 | A1 |
20140164607 | Bai et al. | Jun 2014 | A1 |
20140165200 | Singla | Jun 2014 | A1 |
20140165207 | Engel et al. | Jun 2014 | A1 |
20140173623 | Chang et al. | Jun 2014 | A1 |
20140192639 | Smirnov | Jul 2014 | A1 |
20140201717 | Mascaro et al. | Jul 2014 | A1 |
20140208296 | Dang et al. | Jul 2014 | A1 |
20140215443 | Voccio et al. | Jul 2014 | A1 |
20140215573 | Cepuran | Jul 2014 | A1 |
20140215621 | Xaypanya et al. | Jul 2014 | A1 |
20140280499 | Basavaiah et al. | Sep 2014 | A1 |
20140280908 | Rothstein et al. | Sep 2014 | A1 |
20140281030 | Cui et al. | Sep 2014 | A1 |
20140286174 | Lizuka et al. | Sep 2014 | A1 |
20140286354 | Van De Poel et al. | Sep 2014 | A1 |
20140289854 | Mahvi | Sep 2014 | A1 |
20140298461 | Hohndel et al. | Oct 2014 | A1 |
20140317278 | Kersch et al. | Oct 2014 | A1 |
20140317737 | Shin et al. | Oct 2014 | A1 |
20140331276 | Frascadore et al. | Nov 2014 | A1 |
20140331280 | Porras et al. | Nov 2014 | A1 |
20140331304 | Wong | Nov 2014 | A1 |
20140351203 | Kunnatur et al. | Nov 2014 | A1 |
20140351415 | Harrigan et al. | Nov 2014 | A1 |
20140359695 | Chari et al. | Dec 2014 | A1 |
20150006714 | Jain | Jan 2015 | A1 |
20150009840 | Pruthi et al. | Jan 2015 | A1 |
20150026809 | Altman et al. | Jan 2015 | A1 |
20150033305 | Shear et al. | Jan 2015 | A1 |
20150036480 | Huang et al. | Feb 2015 | A1 |
20150036533 | Sodhi et al. | Feb 2015 | A1 |
20150039751 | Harrigan et al. | Feb 2015 | A1 |
20150046882 | Menyhart et al. | Feb 2015 | A1 |
20150058976 | Carney et al. | Feb 2015 | A1 |
20150067143 | Babakhan et al. | Mar 2015 | A1 |
20150082151 | Liang et al. | Mar 2015 | A1 |
20150082430 | Sridhara et al. | Mar 2015 | A1 |
20150085665 | Kompella et al. | Mar 2015 | A1 |
20150095332 | Beisiegel et al. | Apr 2015 | A1 |
20150112933 | Satapally | Apr 2015 | A1 |
20150113133 | Srinivas et al. | Apr 2015 | A1 |
20150124608 | Agarwal et al. | May 2015 | A1 |
20150128133 | Pohlmann | May 2015 | A1 |
20150138993 | Forster et al. | May 2015 | A1 |
20150142962 | Srinivas et al. | May 2015 | A1 |
20150170213 | O'Malley | Jun 2015 | A1 |
20150195291 | Zuk et al. | Jul 2015 | A1 |
20150222939 | Gallant et al. | Aug 2015 | A1 |
20150249622 | Phillips et al. | Sep 2015 | A1 |
20150256555 | Choi et al. | Sep 2015 | A1 |
20150261842 | Huang et al. | Sep 2015 | A1 |
20150261886 | Wu et al. | Sep 2015 | A1 |
20150271008 | Jain et al. | Sep 2015 | A1 |
20150271255 | Mackay et al. | Sep 2015 | A1 |
20150295945 | Canzanese, Jr. et al. | Oct 2015 | A1 |
20150347554 | Vasantham et al. | Dec 2015 | A1 |
20150356297 | Guri et al. | Dec 2015 | A1 |
20150358352 | Chasin et al. | Dec 2015 | A1 |
20160006753 | McDaid et al. | Jan 2016 | A1 |
20160019030 | Shukla et al. | Jan 2016 | A1 |
20160021131 | Heilig | Jan 2016 | A1 |
20160026552 | Holden et al. | Jan 2016 | A1 |
20160034560 | Setayesh et al. | Feb 2016 | A1 |
20160036636 | Erickson et al. | Feb 2016 | A1 |
20160036833 | Ardeli et al. | Feb 2016 | A1 |
20160036837 | Jain et al. | Feb 2016 | A1 |
20160050132 | Zhang et al. | Feb 2016 | A1 |
20160072815 | Rieke et al. | Mar 2016 | A1 |
20160080414 | Kolton et al. | Mar 2016 | A1 |
20160087861 | Kuan et al. | Mar 2016 | A1 |
20160094394 | Sharma et al. | Mar 2016 | A1 |
20160094529 | Mityagin | Mar 2016 | A1 |
20160103692 | Guntaka et al. | Apr 2016 | A1 |
20160105350 | Greifeneder et al. | Apr 2016 | A1 |
20160112270 | Danait et al. | Apr 2016 | A1 |
20160112284 | Pon et al. | Apr 2016 | A1 |
20160119234 | Valencia Lopez et al. | Apr 2016 | A1 |
20160127395 | Underwood et al. | May 2016 | A1 |
20160147585 | Konig et al. | May 2016 | A1 |
20160148251 | Thomas et al. | May 2016 | A1 |
20160162308 | Chen et al. | Jun 2016 | A1 |
20160162312 | Doherty et al. | Jun 2016 | A1 |
20160173446 | Nantel | Jun 2016 | A1 |
20160173535 | Barabash et al. | Jun 2016 | A1 |
20160191476 | Schutz et al. | Jun 2016 | A1 |
20160205002 | Rieke et al. | Jul 2016 | A1 |
20160216994 | Sefidcon et al. | Jul 2016 | A1 |
20160217022 | Velipasaoglu et al. | Jul 2016 | A1 |
20160234083 | Ahn et al. | Aug 2016 | A1 |
20160269442 | Shieh | Sep 2016 | A1 |
20160269482 | Jamjoom et al. | Sep 2016 | A1 |
20160277435 | Salajegheh | Sep 2016 | A1 |
20160294691 | Joshi | Oct 2016 | A1 |
20160308908 | Kirby et al. | Oct 2016 | A1 |
20160337204 | Dubey et al. | Nov 2016 | A1 |
20160357424 | Pang et al. | Dec 2016 | A1 |
20160357546 | Chang et al. | Dec 2016 | A1 |
20160357957 | Deen et al. | Dec 2016 | A1 |
20160359592 | Kulshreshtha et al. | Dec 2016 | A1 |
20160359628 | Singh et al. | Dec 2016 | A1 |
20160359658 | Yadav et al. | Dec 2016 | A1 |
20160359673 | Gupta et al. | Dec 2016 | A1 |
20160359677 | Kulshreshtha et al. | Dec 2016 | A1 |
20160359678 | Madani et al. | Dec 2016 | A1 |
20160359679 | Parasdehgheibi et al. | Dec 2016 | A1 |
20160359680 | Parasdehgheibi et al. | Dec 2016 | A1 |
20160359686 | Parasdehgheibi et al. | Dec 2016 | A1 |
20160359695 | Yadav et al. | Dec 2016 | A1 |
20160359696 | Yadav et al. | Dec 2016 | A1 |
20160359697 | Scheib et al. | Dec 2016 | A1 |
20160359698 | Deen et al. | Dec 2016 | A1 |
20160359699 | Gandham et al. | Dec 2016 | A1 |
20160359700 | Pang et al. | Dec 2016 | A1 |
20160359701 | Pang et al. | Dec 2016 | A1 |
20160359703 | Gandham et al. | Dec 2016 | A1 |
20160359704 | Gandham et al. | Dec 2016 | A1 |
20160359705 | Parasdehgheibi et al. | Dec 2016 | A1 |
20160359708 | Gandham et al. | Dec 2016 | A1 |
20160359709 | Deen et al. | Dec 2016 | A1 |
20160359711 | Deen et al. | Dec 2016 | A1 |
20160359712 | Alizadeh Attar et al. | Dec 2016 | A1 |
20160359740 | Parasdehgheibi et al. | Dec 2016 | A1 |
20160359759 | Singh et al. | Dec 2016 | A1 |
20160359872 | Yadav et al. | Dec 2016 | A1 |
20160359877 | Kulshreshtha et al. | Dec 2016 | A1 |
20160359878 | Prasad et al. | Dec 2016 | A1 |
20160359879 | Deen et al. | Dec 2016 | A1 |
20160359880 | Pang et al. | Dec 2016 | A1 |
20160359881 | Yadav et al. | Dec 2016 | A1 |
20160359888 | Gupta et al. | Dec 2016 | A1 |
20160359889 | Yadav et al. | Dec 2016 | A1 |
20160359890 | Deen et al. | Dec 2016 | A1 |
20160359891 | Pang et al. | Dec 2016 | A1 |
20160359897 | Yadav et al. | Dec 2016 | A1 |
20160359905 | Touboul et al. | Dec 2016 | A1 |
20160359912 | Gupta et al. | Dec 2016 | A1 |
20160359913 | Gupta et al. | Dec 2016 | A1 |
20160359914 | Deen et al. | Dec 2016 | A1 |
20160359915 | Gupta et al. | Dec 2016 | A1 |
20160359917 | Rao et al. | Dec 2016 | A1 |
20160373481 | Sultan et al. | Dec 2016 | A1 |
20170024453 | Raja et al. | Jan 2017 | A1 |
20170034018 | Parasdehgheibi et al. | Feb 2017 | A1 |
20170048121 | Hobbs et al. | Feb 2017 | A1 |
20170070582 | Desai et al. | Mar 2017 | A1 |
20170085483 | Mihaly et al. | Mar 2017 | A1 |
20170208487 | Ratakonda et al. | Jul 2017 | A1 |
20170250880 | Akens et al. | Aug 2017 | A1 |
20170250951 | Wang et al. | Aug 2017 | A1 |
20170289067 | Lu et al. | Oct 2017 | A1 |
20170295141 | Thubert et al. | Oct 2017 | A1 |
20170302691 | Singh et al. | Oct 2017 | A1 |
20170324518 | Meng et al. | Nov 2017 | A1 |
20170331747 | Singh et al. | Nov 2017 | A1 |
20170346736 | Chander et al. | Nov 2017 | A1 |
20170364380 | Frye, Jr. et al. | Dec 2017 | A1 |
20180006911 | Dickey | Jan 2018 | A1 |
20180007115 | Nedeltchev et al. | Jan 2018 | A1 |
20180013670 | Kapadia et al. | Jan 2018 | A1 |
20180145906 | Yadav et al. | May 2018 | A1 |
Number | Date | Country |
---|---|---|
101093452 | Dec 2007 | CN |
101770551 | Jul 2010 | CN |
102521537 | Jun 2012 | CN |
103023970 | Apr 2013 | CN |
103716137 | Apr 2014 | CN |
104065518 | Sep 2014 | CN |
107196807 | Sep 2017 | CN |
0811942 | Dec 1997 | EP |
1076848 | Jul 2002 | EP |
1383261 | Jan 2004 | EP |
1450511 | Aug 2004 | EP |
2045974 | Apr 2008 | EP |
2043320 | Apr 2009 | EP |
2860912 | Apr 2015 | EP |
2887595 | Jun 2015 | EP |
3069241 | Aug 2018 | EP |
2009-016906 | Jan 2009 | JP |
1394338 | May 2014 | KR |
WO 0145370 | Jun 2001 | WO |
WO 2007014314 | Feb 2007 | WO |
WO 2007070711 | Jun 2007 | WO |
WO 2008069439 | Jun 2008 | WO |
WO 2013030830 | Mar 2013 | WO |
WO 2015042171 | Mar 2015 | WO |
WO 2015099778 | Jul 2015 | WO |
WO 2015118454 | Aug 2015 | WO |
WO 2016004075 | Jan 2016 | WO |
WO 2016019523 | Feb 2016 | WO |
Entry |
---|
“Borg: The Predecessor to Kubernetes,” Apr. 23, 2015, 2 pages, available at https://kubernetes.io/blog/2015/04/borg-predecessor-to-kubernetes/. |
“Kubernetes Components,” Aug. 28, 2020, 4 pages, available at https://kubernetes.io/docs/concepts/overview/components/. |
“Nodes,” Jan. 12, 2021, 6 pages, available at https://kubernetes.io/docs/concepts/architecture/nodes/. |
“OpenTracing,” 10 pages, available at https://github.com/opentracing/specification/blob/master/specification.md. |
“Pods,” Jan. 12, 2021, 5 pages, available at https://kubernetes.io/docs/concepts/workloads/pods/pod/. |
“The OpenTracing Semantic Specification,” 8 pages, available at https://opentracing.io/docs/. |
“What is Kubernetes,” Oct. 22, 2020, 3 pages, available at https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/. |
Al-Fuqaha, Ala, et al., “Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications,” IEEE Communications Surveys & Tutorials, vol. 17, No. 4, Nov. 18, 2015, pp. 2347-2376. |
Aniszczyk, Chris, “Distributed Systems Tracing with Zipkin” Jun. 7, 2012, 3 pages, available at https://blog.twitter.com/engineering/en_us/a/2012/distributed-systems-tracing-with-zipkin.html. |
Arista Networks, Inc., “Application Visibility and Network Telemetry using Splunk,” Arista White Paper, Nov. 2013, 11 pages. |
Australian Government Department of Defence, Intelligence and Security, “Top 4 Strategies to Mitigate Targeted Cyber Intrusions,” Cyber Security Operations Centre Jul. 2013, http://www.asd.gov.au/infosec/top-mitigations/top-4-strategies-explained.htm. |
Author Unknown, “Blacklists & Dynamic Reputation: Understanding Why the Evolving Threat Eludes Blacklists,” www.damballa.com, 9 pages, Damballa, Atlanta, GA, USA. |
Aydin, Galip, et al., “Architecture and Implementation of a Scalable Sensor Data Storage and Analysis Using Cloud Computing and Big Data Technologies,” Journal of Sensors, vol. 2015, Article ID 834217, Feb. 2015, 11 pages. |
Ayers, Andrew, et al: “TraceBack: First Fault Diagnosis by Reconstruction of Distributed Control Flow,” Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, vol. 40, No. 6, Jun. 12, 2005, 13 pages. |
Baah, George K., et al: “The Probabilistic Program Dependence Graph and Its Application to Fault Diagnosis,” IEEE Transactions on Software Engineering, IEEE Service Center, Los Alamitos, CA, US, vol. 36, No. 4, Jul. 1, 2010 (Jul. 1, 2010), pp. 528-545, XP011299543, ISSN: 0098-5589. |
Backes, Michael, et al., “Data Lineage in Malicious Environments,” IEEE 2015, pp. 1-13. |
Bauch, Petr, “Reader's Report of Master's Thesis, Analysis and Testing of Distributed NoSQL Datastore Riak,” May 28, 2015, Brno. 2 pages. |
Bayati, Mohsen, et al., “Message-Passing Algorithms for Sparse Network Alignment,” Mar. 2013, 31 pages. |
Berezinski, Przemyslaw, et al., “An Entropy-Based Network Anomaly Detection Method,” Entropy, 2015, vol. 17, www.mdpi.com/journal/entropy, pp. 2367-2408. |
Berthier, Robin, et al., “Nfsight Netflow-based Network Awareness Tool,” 2010, 16 pages. |
Bhuyan, Dhiraj, “Fighting Bots and Botnets,” 2006, pp. 23-28. |
Blair, Dana, et al., U.S. Appl. No. 62/106,006, filed Jan. 21, 2015, entitled “Monitoring Network Policy Compliance.” |
Breen, Christopher, “MAC 911, How to dismiss Mac App Store Notifications,” Macworld.com, Mar. 24, 2014, 3 pages. |
Brocade Communications Systems, Inc., “Chapter 5—Configuring Virtual LANs (VLANs),” Jun. 2009, 38 pages. |
Chandran, Midhun, et al., “Monitoring in a Virtualized Environment,” GSTF International Journal on Computing, vol. 1, No. 1, Aug. 2010. |
Chari, Suresh, et al., “Ensuring continuous compliance through reconciling policy with usage,” Proceedings of the 18th ACM symposium on Access control models and technologies (SACMAT '13). ACM, New York, NY, USA, 49-60. |
Chen, Xu, et al., “Automating network application dependency discovery: experiences, limitations, and new solutions,” 8th USENIX conference on Operating systems design and implementation (OSDI'08), USENIX Association, Berkeley, CA, USA, 117-130. |
Choi, Chang Ho, et al: “CSMonitor: A Visual Client/Server Monitor for CORBA-based Distributed Applications,” Software Engineering Conference, 1998. Proceedings. 1998 Asia Pacific Taipei, Taiwan Dec. 2-4, 1998, Los Alamitos, CA, USA, IEEE Comput. Soc, US, Dec. 2, 1998, pp. 338-345, XP010314829, DOI: 10.1109/APSEC.1998.733738: ISBN: 978-0-8186-9183-6. |
Chou, C.W., et al., “Optical Clocks and Relativity,” Science vol. 329, Sep. 24, 2010, pp. 1630-1633. |
Cisco Systems, “Cisco Network Analysis Modules (NAM) Tutorial,” Cisco Systems, Inc., Version 3.5. |
Cisco Systems, Inc. “Cisco, Nexus 3000 Series NX-OS Release Notes, Release 5.0(3)U3(1),” Feb. 29, 2012, Part No. OL-26631-01, 16 pages. |
Cisco Systems, Inc., “Addressing Compliance from One Infrastructure: Cisco Unified Compliance Solution Framework,” 2014. |
Cisco Systems, Inc., “Cisco—VPN Client User Guide for Windows,” Release 4.6, Aug. 2004, 148 pages. |
Cisco Systems, Inc., “Cisco 4710 Application Control Engine Appliance Hardware Installation Guide,” Nov. 2007, 66 pages. |
Cisco Systems, Inc., “Cisco Application Dependency Mapping Service,” 2009. |
Cisco Systems, Inc., “Cisco Data Center Network Architecture and Solutions Overview,” Feb. 2006, 19 pages. |
Cisco Systems, Inc., “Cisco IOS Configuration Fundamentals Configuration Guide: Using AutoInstall and Setup,” Release 12.2, first published Apr. 2001, last updated Sep. 2003, 32 pages. |
Cisco Systems, Inc., “Cisco VN-Link: Virtualization-Aware Networking,” White Paper, Mar. 2009, 10 pages. |
Cisco Systems, Inc., “Cisco, Nexus 5000 Series and Cisco Nexus 2000 Series Release Notes, Cisco NX-OS Release 5.1(3)N2(1b), NX-OS Release 5.1(3)N2(1a) and NX-OS Release 5.1 (3)N2(1),” Sep. 5, 2012, Part No. OL-26652-03 CO, 24 pages. |
Cisco Systems, Inc., “Nexus 3000 Series NX-OS Fundamentals Configuration Guide, Release 5.0(3)U3(1): Using PowerOn Auto Provisioning,” Feb. 29, 2012, Part No. OL-26544-01, 10 pages. |
Cisco Systems, Inc., “Quick Start Guide, Cisco ACE 4700 Series Application Control Engine Appliance,” Software Version A5(1.0), Sep. 2011, 138 pages. |
Cisco Systems, Inc., “Routing and Bridging Guide, Cisco ACE Application Control Engine,” Software Version A5(1.0), Sep. 2011, 248 pages. |
Cisco Systems, Inc., “VMWare and Cisco Virtualization Solution: Scale Virtual Machine Networking,” Jul. 2009, 4 pages. |
Cisco Systems, Inc., “White Paper—New Cisco Technologies Help Customers Achieve Regulatory Compliance,” 1992-2008. |
Cisco Systems, Inc., “A Cisco Guide to Defending Against Distributed Denial of Service Attacks,” May 3, 2016, 34 pages. |
Cisco Systems, Inc., “Cisco Application Visibility and Control,” Oct. 2011, 2 pages. |
Cisco Systems, Inc., “Cisco Remote Integrated Service Engine for Citrix NetScaler Appliances and Cisco Nexus 7000 Series Switches Configuration Guide,” Last modified Apr. 29, 2014, 78 pages. |
Cisco Systems, Inc., “Cisco Tetration Platform Data Sheet”, Updated Mar. 5, 2018, 21 pages. |
Cisco Technology, Inc., “Cisco IOS Software Release 12.4T Features and Hardware Support,” Feb. 2009, 174 pages. |
Cisco Technology, Inc., “Cisco Lock-and-Key: Dynamic Access Lists,” http://www.cisco.com/c/en/us/support/docs/security-vpn/lock-key/7604-13.html; Updated Jul. 12, 2006, 16 pages. |
Cisco Systems, Inc., “Cisco Application Control Engine (ACE) Troubleshooting Guide—Understanding the ACE Module Architecture and Traffic Flow,” Mar. 11, 2011, 6 pages. |
Costa, Raul, et al., “An Intelligent Alarm Management System for Large-Scale Telecommunication Companies,” In Portuguese Conference on Artificial Intelligence, Oct. 2009, 14 pages. |
De Carvalho, Tiago Filipe Rodrigues, “Root Cause Analysis in Large and Complex Networks,” Dec. 2008, Repositorio.ul.pt, pp. 1-55. |
Di Lorenzo, Guisy, et al., “EXSED: An Intelligent Tool for Exploration of Social Events Dynamics from Augmented Trajectories,” Mobile Data Management (MDM), pp. 323-330, Jun. 3-6, 2013. |
Duan, Yiheng, et al., “Detective: Automatically Identify and Analyze Malware Processes in Forensic Scenarios via DLLs,” IEEE ICC 2015—Next Generation Networking Symposium, pp. 5691-5696. |
Feinstein, Laura, et al., “Statistical Approaches to DDoS Attack Detection and Response,” Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX '03), Apr. 2003, 12 pages. |
Foundation for Intelligent Physical Agents, “FIPA Agent Message Transport Service Specification,” Dec. 3, 2002, http://www.fipa.org; 15 pages. |
George, Ashley, et al., “NetPal: A Dynamic Network Administration Knowledge Base,” 2008, pp. 1-14. |
Gia, Tuan Nguyen, et al., “Fog Computing in Healthcare Internet of Things: A Case Study on ECG Feature Extraction,” 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, Oct. 26, 2015, pp. 356-363. |
Goldsteen, Abigail, et al., “A Tool for Monitoring and Maintaining System Trustworthiness at Run Time,” REFSQ (2015), pp. 142-147. |
Grove, David, “Call Graph Construction in Object-Oriented Languages,” ACM OOPSLA, Oct. 1997, 18 pages. |
Hamadi, S., et al., “Fast Path Acceleration for Open vSwitch in Overlay Networks,” Global Information Infrastructure and Networking Symposium (GIIS), Montreal, QC, pp. 1-5, Sep. 15-19, 2014. |
Heckman, Sarah, et al., “On Establishing a Benchmark for Evaluating Static Analysis Alert Prioritization and Classification Techniques,” IEEE, 2008; 10 pages. |
Hewlett-Packard, “Effective use of reputation intelligence in a security operations center,” Jul. 2013, 6 pages. |
Hideshima, Yusuke, et al., “Starmine: A Visualization System for Cyber Attacks,” https://www.researchgate.net/publication/221536306, Feb. 2006, 9 pages. |
Huang, Hing-Jie, et al., “Clock Skew Based Node Identification in Wireless Sensor Networks,” IEEE, 2008, 5 pages. |
Ihler, Alexander, et al: “Learning to Detect Events With Markov-Modulated Poisson Processes,” ACM Transactions on Knowledge Discovery From Data, vol. 1, No. 3, Dec. 1, 2007, pp. 13-1 to 13-23. |
InternetPerils, Inc., “Control Your Internet Business Risk,” 2003-2015, https://www.internetperils.com. |
Ives, Herbert, E., et al., “An Experimental Study of the Rate of a Moving Atomic Clock,” Journal of the Optical Society of America, vol. 28, No. 7, Jul. 1938, pp. 215-226. |
Janoff, Christian, et al., “Cisco Compliance Solution for HIPAA Security Rule Design and Implementation Guide,” Cisco Systems, Inc., Updated Nov. 14, 2015, part 1 of 2, 350 pages. |
Janoff, Christian, et al., “Cisco Compliance Solution for HIPAA Security Rule Design and Implementation Guide,” Cisco Systems, Inc., Updated Nov. 14, 2015, part 2 of 2, 588 pages. |
Joseph, Dilip, et al., “Modeling Middleboxes,” IEEE Network, Sep./Oct. 2008, pp. 20-25. |
Kent, S., et al. “Security Architecture for the Internet Protocol,” Network Working Group, Nov. 1998, 67 pages. |
Kerrison, Adam, et al., “Four Steps to Faster, Better Application Dependency Mapping—Laying the Foundation for Effective Business Service Models,” BMCSoftware, 2011. |
Kim, Myung-Sup, et al., “A Flow-based Method for Abnormal Network Traffic Detection,” IEEE, 2004, pp. 599-612. |
Kraemer, Brian, “Get to know your data center with CMDB,” TechTarget, Apr. 5, 2006, http://searchdatacenter.techtarget.com/news/118820/Get-to-know-your-data-center-with-CMDB. |
Lab SKU, “VMware Hands-on Labs—HOL-SDC-1301” Version: 20140321-160709, 2013; http://docs.hol.vmware.com/HOL-2013/holsdc-1301_html_en/ (part 1 of 2). |
Lab SKU, “VMware Hands-on Labs—HOL-SDC-1301” Version: 20140321-160709, 2013; http://docs.hol.vmware.com/HOL-2013/holsdc-1301_html_en/ (part 2 of 2). |
Lachance, Michael, “Dirty Little Secrets of Application Dependency Mapping,” Dec. 26, 2007. |
Landman, Yoav, et al., “Dependency Analyzer,” Feb. 14, 2008, http://jfrog.com/confluence/display/DA/Home. |
Lee, Sihyung, “Reducing Complexity of Large-Scale Network Configuration Management,” Ph.D. Dissertation, Carnegie Mellon University, 2010. |
Li, Ang, et al., “Fast Anomaly Detection for Large Data Centers,” Global Telecommunications Conference (GLOBECOM 2010), Dec. 2010, 6 pages. |
Li, Bingbong, et al., “A Supervised Machine Learning Approach to Classify Host Roles on Line Using sFlow,” in Proceedings of the first edition workshop on High performance and programmable networking, 2013, ACM, New York, NY, USA, 53-60. |
Liu, Ting, et al., “Impala: A Middleware System for Managing Autonomic, Parallel Sensor Systems,” In Proceedings of the Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '03), ACM, New York, NY, USA, Jun. 11-13, 2003, pp. 107-118. |
Lu, Zhonghai, et al., “Cluster-based Simulated Annealing for Mapping Cores onto 2D Mesh Networks on Chip,” Design and Diagnostics of Electronic Circuits and Systems, pp. 1-6, Apr. 16-18, 2008. |
Matteson, Ryan, “Depmap: Dependency Mapping of Applications Using Operating System Events: a Thesis,” Master's Thesis, California Polytechnic State University, Dec. 2010. |
Moe, Johan, et al: “Understanding Distributed Systems via Execution Trace Data,” May 12-13, 2001, 8 pages. |
Natarajan, Arun, et al., “NSDMiner: Automated Discovery of Network Service Dependencies,” Institute of Electrical and Electronics Engineers INFOCOM, Feb. 2012, 9 pages. |
Navaz, A.S. Syed, et al., “Entropy based Anomaly Detection System to Prevent DDoS Attacks in Cloud,” International Journal of computer Applications (0975-8887), vol. 62, No. 15, Jan. 2013, pp. 42-47. |
Neverfail, “Neverfail IT Continuity Architect,” 2015, https://web.archive.org/web/20150908090456/http://www.neverfailgroup.com/products/it-continuity-architect. |
Nilsson, Dennis K., et al., “Key Management and Secure Software Updates in Wireless Process Control Environments,” In Proceedings of the First ACM Conference on Wireless Network Security (WiSec '08), ACM, New York, NY, USA, Mar. 31-Apr. 2, 2008, pp. 100-108. |
Nunnally, Troy, et al., “P3D: A Parallel 3D Coordinate Visualization for Advanced Network Scans,” IEEE 2013, Jun. 9-13, 2013, 6 pages. |
O'Donnell, Glenn, et al., “The CMDB Imperative: How to Realize the Dream and Avoid the Nightmares,” Prentice Hall, Feb. 19, 2009. |
Ohta, Kohei, et al., “Detection, Defense, and Tracking of Internet-Wide Illegal Access in a Distributed Manner,” 2000, pp. 1-16. |
Online Collins English Dictionary, 1 page (Year: 2018). |
Pathway Systems International Inc., “How Blueprints does Integration,” Apr. 15, 2014, 9 pages, http://pathwaysystems.com/company-blog/. |
Pathway Systems International Inc., “What is Blueprints?” 2010-2016, http://pathwaysystems.com/blueprints-about/. |
Popa, Lucian, et al., “Macroscope: End-Point Approach to Networked Application Dependency Discovery,” CoNEXT'09, Dec. 1-4, 2009, Rome, Italy, 12 pages. |
Prasad, K. Munivara, et al., “An Efficient Detection of Flooding Attacks to Internet Threat Monitors (ITM) using Entropy Variations under Low Traffic,” Computing Communication & Networking Technologies (ICCCNT '12), Jul. 26-28, 2012, 11 pages. |
Sachan, Mrinmaya, et al., “Solving Electrical Networks to incorporate Supervision in Random Walks,” May 13-17, 2013, pp. 109-110. |
Sammarco, Matteo, et al., “Trace Selection for Improved WLAN Monitoring,” Aug. 16, 2013, pp. 9-14. |
Shneiderman, Ben, et al., “Network Visualization by Semantic Substrates,” Visualization and Computer Graphics, vol. 12, No. 5, pp. 733-740, Sep.-Oct. 2006. |
Sigelman, Benjamin H., et al., “Dapper, a Large-Scale Distributed Systems Tracing Infrastructure,” Google Technical Report dapper-2010-1, Apr. 2010, 14 pages, available at https://research.google/pubs/pub36356/. |
Thomas, R., “Bogon Dotted Decimal List,” Version 7.0, Team Cymru NOC, Apr. 27, 2012, 5 pages. |
Virtualization, Bosch, Apr. 2010, Lehigh University, pp. 1-33. |
Voris, Jonathan, et al., “Bait and Snitch: Defending Computer Systems with Decoys,” Columbia University Libraries, Department of Computer Science, 2013, pp. 1-25. |
Wang, Ru, et al., “Learning directed acyclic graphs via bootstrap aggregating,” 2014, 47 pages, http://arxiv.org/abs/1406.2098. |
Wang, Yongjun, et al., “A Network Gene-Based Framework for Detecting Advanced Persistent Threats,” Nov. 2014, 7 pages. |
Witze, Alexandra, “Special relativity aces time trial, ‘Time dilation’ predicted by Einstein confirmed by lithium ion experiment,” Nature, Sep. 19, 2014, 3 pages. |
Woodberg, Brad, “Snippet from Juniper SRX Series” Jun. 17, 2013, 1 page, O'Reilly Media, Inc. |
Zatrochova, Zuzana, “Analysis and Testing of Distributed NoSQL Datastore Riak,” Spring, 2015, 76 pages. |
Zhang, Yue, et al., “Cantina: A Content-Based Approach to Detecting Phishing Web Sites,” May 8-12, 2007, pp. 639-648. |
Cisco Systems, Inc., “CCNA 2 v3.1 Module 1 WANs and Routers” Cisco.com, May 14, 2018, 26 pages. |
Cisco Systems, Inc., “CCNA 2 v3.1 Module 2 Introduction to Routers” Cisco.com, Jan. 18, 2018, 23 pages. |
Citrix, “AppFlow: next-generation application performance monitoring,” 2011, 8 pages. |
Goins, Adrian, et al., “Diving Deep into Kubernetes Networking”, Jan. 2019, 42 pages. |
Hogg, Scott, “Not your Father's Flow Export Protocol (Part 2), What is AppFlow and how does it differ from other flow analysis protocols,” Core Networking, Mar. 19, 2014, 6 pages. |
Number | Date | Country |
---|---|---|
20220038353 A1 | Feb 2022 | US |
Number | Date | Country |
---|---|---|
62171899 | Jun 2015 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 16237187 | Dec 2018 | US |
Child | 17503097 | | US |
Parent | 15152163 | May 2016 | US |
Child | 16237187 | | US |