The present invention relates generally to the field of machine learning, and more particularly to filtering and returning retraining data to counter drift and bias in machine learning models at the edge.
The “edge” is a term that refers to a location, far from the cloud or a big data center, where you have a computer device (edge device) capable of running (edge) applications. Edge computing is the act of running workloads on these edge devices. Machine learning at the edge is a concept that brings the capability of running machine learning models locally to edge devices. These machine learning models can be invoked by the edge application. Machine learning at the edge is important for many scenarios where raw data is collected from sources far from the cloud.
Aspects of an embodiment of the present invention disclose a method, computer program product, and computer system for filtering and returning retraining data to counter drift and bias in machine learning models at the edge. A processor receives, at an edge device running a local instance of a machine learning (ML) model, a set of inference data comprising a plurality of datapoints, wherein the local instance of the ML model is a deployed version of the ML model running in a cloud environment, and wherein the ML model was trained in the cloud environment and then deployed to the edge device. A processor runs the plurality of datapoints through one or more filters to determine a probability for each datapoint of whether a respective datapoint should be sent back to the cloud environment and used for retraining the ML model. A processor determines, for each datapoint, whether the probability for the respective datapoint meets a send back threshold that is required to be met before the respective datapoint is sent back to the cloud environment.
Embodiments of the present invention recognize that machine learning models are never completely accurate. These models need monitoring and retraining even after they are deployed. The most valuable data for retraining is real data points from categories on which a machine learning model performed inference poorly. However, unless there is always live human feedback, it is difficult to know where the model is failing. For example, if we can detect that a machine learning model is performing poorly for a certain group, additional data from that group would be key for retraining. Data seen on edge devices that are using the machine learning model may differ significantly from data used for training or even from data other devices see. This leads to the deployed models becoming less and less accurate. Knowing when data consistency has changed is a big challenge.
Collecting data on which the machine learning model makes biased and/or incorrect inferences, as well as data that exemplifies drift, i.e., data that is significantly different from the training data, is essential to model retraining. However, edge devices usually get very large amounts of inference data, sometimes even a constant stream. All of this inference data is desirable from a machine learning point of view as generally more data leads to a better model. However, all the data cannot simply be sent back to the cloud as sending back large amounts of data negates many of the upsides of operating at the edge. Constantly sending back large amounts of data overwhelms network bandwidth and slows other data transfers over a network. Additionally, sending back all the data from edge devices, which can operate in fleets of thousands, leaves the data scientist with an impractical amount of data that they must sift through, select from for retraining, and label. Embodiments of the present invention recognize that deciding which of these data points to collect at the edge and send back for monitoring and retraining, without overwhelming the network, is a challenge.
Thus, embodiments of the present invention provide a system and method for filtering and returning retraining data to counter drift and bias in machine learning models at the edge. The system comprises a main machine learning model running in the cloud that is deployed to a plurality of edge devices. Each edge device must determine which of the inference data to send back to the cloud data center for retraining of the main machine learning model. There are three main categories of data to send back: (1) data from groups that the model may show bias against, (2) data that the model encounters that may be different from original training data (i.e., drift), and (3) feedback data that comes from transactions where end users disagree with the model's predictions. For each category, the goal is to collect and send data back to the cloud that would be valuable for retraining and improving the quality of the model, while discarding repeat and/or insignificant data.
Embodiments of the present invention utilize an intelligent device data filter program that runs directly on an edge device along with the machine learning model. In the case that the edge device has limited compute power or storage resources (e.g., small Internet of Things (IoT) devices), the filtering workload of the program can run fully on an edge server or can be split between the edge device and an edge server with the more computationally heavy filtering done by a machine learning model on the edge server. Alternatively, the intelligent device data filter program can be configured to run only when the machine learning model on the edge device and/or edge server is not utilizing as much of its compute resources, i.e., the use of hardware in the edge device and/or edge server (e.g., processors, RAM, etc.) to perform tasks required and handled by the edge device and/or edge server.
Data from the edge will be sent back to the cloud according to a priority ranking for each data point. By using this filtering method before data transport, embodiments of the present invention significantly minimize the amount of data that needs to travel through the network while providing the data that is most useful for model monitoring and retraining. Data transfer will occur when network conditions are observed to have sufficient bandwidth.
Implementation of embodiments of the invention may take a variety of forms, and exemplary implementation details are discussed subsequently with reference to the Figures.
When IDDF program 112 cannot be run completely on edge computing device 110B, IDDF program 112 can be hosted by an edge server (e.g., edge server 130B) that the edge computing device 110B is in communication with, as shown in
In embodiments in which the user feedback filter is enabled with a user interface (e.g., user interface 116B), the filtering done on the edge device (e.g., edge computing device 110B) is done by an edge user flagging datapoints that the edge user believes are incorrect predictions (as described in sub-step 211 of
User interface 116C provides an interface between IDDF program 112 on edge computing device 110C and a user of edge computing device 110C. In one embodiment, user interface 116C is an edge interface of the ML application (as described above) software running on edge computing device 110C. In one embodiment, user interface 116C may be a graphical user interface (GUI) or a web user interface (WUI) that can display text, documents, web browser windows, user options, application interfaces, and instructions for operation, and include the information (such as graphic, text, and sound) that a program presents to a user and the control sequences the user employs to control IDDF program 112. User interface 116C optionally enables a user of edge computing device 110C to provide user feedback on inference data received by edge computing device 110C, in which this user feedback acts as one filter of IDDF program 112, as described in sub-step 211 of
Edge server 130C is a computer server that is located physically close to edge computing device 110C. In general, edge servers are located physically close to endpoints (i.e., edge devices) to reduce latency for operation of the endpoints. Edge server 130C can be a standalone computing device, a management server, a web server, a mobile computing device, or any other electronic device or computing system capable of receiving, sending, and processing data. In other embodiments, edge server 130C can represent a server computing system utilizing multiple computers as a server system, such as in a cloud computing environment. In another embodiment, edge server 130C can be a laptop computer, a tablet computer, a netbook computer, a personal computer (PC), a desktop computer, or any programmable electronic device capable of communicating with edge computing device 110C, cloud data center 120, and other computing devices (not shown) within distributed data processing environment 100 via network 105. In another embodiment, edge server 130C represents a computing system utilizing clustered computers and components (e.g., database server computers, application server computers, etc.) that act as a single pool of seamless resources when accessed within distributed data processing environment 100. Edge server 130C includes an instance of IDDF program 112 in which the filtering and model scoring is done on edge server 130C as opposed to edge computing device 110A or 110B to not overwhelm an edge device with too big of a computational workload, as described below with reference to
When IDDF program 112 cannot be run completely on an edge device, IDDF program 112 can be hosted by an edge server (e.g., edge server 130C) that the edge device is in communication with, as shown in
In embodiments in which the user feedback filter is enabled with a user interface (e.g., user interface 116C), the filtering done on the edge server (e.g., edge server 130C) is done by an edge user flagging datapoints that the edge user believes are incorrect predictions (as described in sub-step 211 of
In step 205, IDDF program 112 receives inference data. In an embodiment, at an edge device, IDDF program 112 receives inference data. For example, at edge computing device 110A, IDDF program 112 receives inference data from local ML model 114A. Inference data can include edge device information, type of edge device, location of edge device, time that the local ML model was used, input data to the local ML model, output data of the local ML model, feedback data from the user if provided on performance of the local ML model, frequency of use of the local ML model, user profile information associated with the edge computing device or user accessing the model (i.e., individual characteristics of the user), information from users that opt in to link other applications or social media profiles to the edge computing device, etc.
Input data to a local ML model (e.g., local ML model 114A) can include data that the edge device is employed to collect, i.e., data coming from surrounding environmental sensors or a GUI, and the type of data depends completely on the type of model deployed and how the sensors are configured. For example, suppose local ML model 114C on edge computing device 110C runs on a camera used to detect whether there is an intruder coming into someone's house. The camera is constantly collecting data, so there will be a constant stream of data at the edge. However, most of the data is useless because usually there is no one walking around the door of the house, so it can be discarded at the edge. Thus, the goal is to send back only the most interesting datapoints (e.g., when someone is trying to open the door). At every timepoint though (as determined by the data scientist, e.g., possibly every frame of the video or possibly every second), local ML model 114C still runs inference on the live video (inference data for the model) and determines whether there is an intruder.
In step 210, IDDF program 112 runs the inference data through a set of filters and/or models. In an embodiment, at an edge computing device and/or an edge server (e.g., edge computing device 110A or 110B and/or edge server 130B or 130C), IDDF program 112 runs the inference data through a set of filters as a first step in determining whether the inference data should be sent back to cloud data center 120. Only inference data that makes it through at least one of the filters and/or models will move on to step 215 and may be referred to as filtered inference data. In some embodiments, IDDF program 112 runs the inference data through an edge user flag filter 211, a bias detection model 212, a drift detection model 213, and a data scientist criteria filter 214. If the inference data is flagged by at least one of the filters and/or models, then IDDF program 112 proceeds with that filtered inference data to step 215. In other embodiments where user feedback is not provided, IDDF program 112 runs the inference data through only a bias detection model 212, a drift detection model 213, and a data scientist criteria filter 214. In some embodiments, IDDF program 112 runs the inference data through each filter and/or model consecutively in any order according to computing resources of the edge device, e.g., edge computing device 110A or 110B and/or the edge server, e.g., edge server 130B or 130C. In other embodiments, IDDF program 112 runs the inference data through each filter and/or model simultaneously.
In embodiments in which an edge computing device (e.g., edge computing device 110B) does not have enough computing resources to perform or handle all of the filters and models, IDDF program 112 performs this step 210 partially on edge computing device 110B and partially on edge server 130B, as shown in
Sub-step 211 applies in embodiments in which a user interface, e.g., user interface 116A, is available to a user for providing feedback on outputs of the local ML model, e.g., local ML model 114A, 114B, or 114C. In an embodiment, IDDF program 112 enables a user to provide feedback on datapoints (i.e., predictions/inferences) output by the local ML model (e.g., local ML model 114A, 114B, or 114C). User feedback can be in the form of the user disagreeing with a prediction/inference or marking the prediction/inference as questionable, i.e., any indication by the user that the prediction/inference datapoint is incorrect or should be reviewed. Responsive to IDDF program 112 receiving user feedback about a datapoint, i.e., the inference data, IDDF program 112 flags the datapoint for sending back to cloud data center 120. For a flagged datapoint, IDDF program 112 outputs a binary result of one (1) for each flagged datapoint. For unflagged datapoints or for all the datapoints in embodiments with no user interface, IDDF program 112 outputs a binary result of zero (0) for each datapoint.
In sub-step 212, IDDF program 112 runs the inference data through a bias detection model. In an embodiment, IDDF program 112 runs the inference data through a bias detection model, i.e., a classification (supervised predictive) ML model trained on the cloud using the same training data as ML model 122. The training data is constantly being updated from the original training dataset with prior runs of IDDF program 112 sending back filtered data that is used for further training. The bias detection model is trained on whether ML model 122 made the correct prediction for each instance of the data. This allows the bias detection model to estimate whether new data received at the edge is likely to be misclassified by the local ML model (e.g., local ML model 114A, 114B, or 114C). For training the bias detection model, IDDF program 112 feeds the training and testing datasets through ML model 122 and labels data into two groups: the datapoints the model categorizes correctly and the datapoints the model categorizes incorrectly. These become the new labels for the bias detection model, which is trained to detect whether the model is likely to categorize incoming datapoints correctly or not. The original model was trained to predict something (e.g., cost, temperature, etc.). That feature of the data (or column) is the target, i.e., the thing the model predicts based on the other data. However, that piece of data is irrelevant in the bias detection model, so that feature is replaced with the new feature of whether that instance of data was categorized correctly or not. This new feature becomes the new target feature since that is the target for the bias detection model to predict. IDDF program 112 runs the bias detection model to predict whether the local ML model (e.g., local ML model 114A, 114B, or 114C) is likely to categorize incoming datapoints correctly or not.
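The relabeling step described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `relabel_for_bias_detection`, the column name `was_correct`, and the toy stand-in for ML model 122 are all hypothetical, and correctness is assumed to mean exact equality between the prediction and the original target, which only suits classification-style outputs.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def relabel_for_bias_detection(
    rows: List[Row],
    target: str,
    predict: Callable[[Row], float],
) -> List[Row]:
    """Replace the original target column with a 0/1 'was_correct' target.

    `predict` stands in for the main ML model; it receives only the
    non-target features of a row.
    """
    relabeled = []
    for row in rows:
        # Drop the original target so it is not a feature of the new dataset.
        features = {k: v for k, v in row.items() if k != target}
        # Assumption: "correct" means the prediction equals the original target.
        was_correct = 1 if predict(features) == row[target] else 0
        relabeled.append({**features, "was_correct": was_correct})
    return relabeled


# Hypothetical stand-in for the main model: predicts class 1 when x > 0.5.
def toy_model(features: Row) -> float:
    return 1.0 if features["x"] > 0.5 else 0.0


data = [
    {"x": 0.9, "label": 1.0},
    {"x": 0.2, "label": 1.0},  # the toy model gets this one wrong
    {"x": 0.1, "label": 0.0},
]
bias_training_rows = relabel_for_bias_detection(data, "label", toy_model)
```

The resulting rows could then be used to fit any supervised classifier whose target is `was_correct`; the choice of classifier is left open here, as it is in the description.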
The bias detection model aims to predict how similar a certain datapoint (i.e., the inference datapoint run through the local ML model) is with previous datapoints that the local ML model (e.g., local ML model 114A, 114B, or 114C) has predicted incorrectly, thereby revealing model biases. The bias detection model detects whether the inference data is likely to have been a datapoint that the local ML model (e.g., local ML model 114A, 114B, or 114C) misclassified based on the datapoint's similarity to previous datapoints the main or local ML model performed poorly on. The bias detection model assigns a confidence score to the datapoint to quantify how likely the main or local ML model was biased towards this datapoint. In an embodiment, IDDF program 112 receives the confidence score output by the bias detection model for the inference data or datapoint. The confidence score can be between zero (0) and one (1), in which the higher the confidence score, the more likely it is necessary to send the datapoint back to the cloud data center for use in retraining ML model 122 before a copy is redeployed as a local ML model, e.g., local ML model 114A, 114B, or 114C on edge computing device 110A, 110B or edge server 130C.
In sub-step 213, IDDF program 112 runs the inference data through a drift detection model. In an embodiment, IDDF program 112 runs the inference data through a drift detection model, i.e., an ML model trained on the cloud using the same training data as the main ML model (e.g., ML model 122). The drift detection model operates as a lightweight anomaly detection ML model that can be based on any unsupervised clustering algorithm, such as Density-based Spatial Clustering of Applications with Noise (DBSCAN), Cluster-based Local Outlier Factor (CBLOF), tree-based methods such as Isolation Forests, or simple statistical calculations like Box Plots based on how features deviate from the training set. The choice of algorithm depends on the type most suitable for the dataset and edge device computing constraints. The drift detection model outputs a decimal score normalized to fall within the zero (0) to one (1) range with 0 meaning clearly similar to prior data and 1 meaning extremely deviant.
The purpose of the drift detection model is to identify whether any of the inference data at the edge has drifted far from what the most recent ML model on cloud data center 120 (e.g., ML model 122) was trained to see. The drift detection model detects how different this datapoint is from most other points seen in training the ML model 122 on cloud data center 120 and outputs a corresponding drift/anomaly score. The more different this datapoint is, the more likely the local model may have performed poorly on the datapoint, and therefore, the more valuable it is to send the datapoint back to cloud data center 120 for use in retraining the ML model 122.
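As one illustration of the simple statistical option mentioned above (Box Plots), the following Python sketch scores a single numeric feature by how far it falls outside the interquartile range of the training values. The function names, the linear scaling, and the `max_iqrs` cut-off are assumptions made for this example; the description does not prescribe a particular normalization.

```python
import statistics
from typing import List, Tuple


def fit_box_plot(train_values: List[float]) -> Tuple[float, float]:
    """Return the first and third quartiles of one training feature."""
    q1, _median, q3 = statistics.quantiles(train_values, n=4)
    return q1, q3


def drift_score(value: float, q1: float, q3: float, max_iqrs: float = 3.0) -> float:
    """Score in [0, 1]: 0 inside the box, 1 at or beyond max_iqrs IQRs outside it.

    Assumption: the score grows linearly with distance from the box.
    """
    if q1 <= value <= q3:
        return 0.0
    iqr = q3 - q1
    if iqr == 0:
        # Degenerate training feature: any value outside the box is fully deviant.
        return 1.0
    distance = (q1 - value) if value < q1 else (value - q3)
    return min(distance / (max_iqrs * iqr), 1.0)


# Hypothetical training values for a single sensor feature.
train = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
q1, q3 = fit_box_plot(train)

typical = drift_score(15.0, q1, q3)    # inside the box
moderate = drift_score(32.0, q1, q3)   # some distance above the box
extreme = drift_score(100.0, q1, q3)   # far outside the training range
```

For multi-feature datapoints, one plausible extension is to take the maximum per-feature score, though a clustering-based or tree-based detector may be a better fit when features interact.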
In sub-step 214, IDDF program 112 performs a data scientist criteria filter on the inference data. In an embodiment, IDDF program 112 compares the inference data or datapoint to criteria that a data scientist has deemed important for model analysis and retraining. The criteria may include, but are not limited to, datapoints with values within a particular range or predictions the original model made that the data scientist marked as potentially incorrect or biased. The criteria are set by a data scientist for the most recent ML model, e.g., ML model 122 of cloud data center 120. In an embodiment, IDDF program 112 scores each datapoint run through the data scientist criteria filter from zero (0) to one (1) based on how well the datapoint satisfies the given criteria.
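A minimal Python sketch of such a criteria filter follows. It assumes, purely for illustration, that each criterion is a yes/no predicate and that the score is the fraction of criteria a datapoint satisfies; the feature names `temperature` and `prediction` and the specific criteria are hypothetical.

```python
from typing import Callable, Dict, List

Datapoint = Dict[str, float]
Criterion = Callable[[Datapoint], bool]


def criteria_score(datapoint: Datapoint, criteria: List[Criterion]) -> float:
    """Return the fraction of criteria satisfied, a value in [0, 1]."""
    if not criteria:
        return 0.0
    satisfied = sum(1 for criterion in criteria if criterion(datapoint))
    return satisfied / len(criteria)


# Hypothetical criteria a data scientist might configure.
criteria: List[Criterion] = [
    lambda d: 30.0 <= d["temperature"] <= 40.0,  # value within a range of interest
    lambda d: d["prediction"] == 1.0,            # prediction class marked for review
]

score = criteria_score({"temperature": 35.0, "prediction": 0.0}, criteria)
```

Other scoring schemes (for example, weighted criteria or graded closeness to a range) would fit the description equally well.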
In step 215, IDDF program 112 runs outputs from the set of filters through a send back ML model and receives a final score output by the send back ML model. In embodiments as shown in
In several embodiments, IDDF program 112 is designed to have the four filters and send back ML model separate such that the filters can be updated independently. In other embodiments, IDDF program 112 can be trained as one stacked model or layered neural network with the filters and send back ML model combined.
In an embodiment, the send back ML model is trained to assign a final score to datapoints based on a prioritization (i.e., weighting) in which the output from the edge user flag filter has the highest priority, the outputs of the bias detection model and drift detection model have the next highest priority, and the output from the data scientist criteria filter has the lowest priority. This order runs from datapoints that have definitely been misclassified (as determined by the edge user flag filter) and require further analysis, to datapoints that are likely to have been misclassified because they are either different from datapoints seen before or are similar to previously misclassified datapoints, and finally to datapoints that a data scientist has an interest in collecting for further analysis. The send back ML model is trained by creating a synthetic dataset with four feature columns each representing an output from each of the four filters of step 210, which each have a value between 0 and 1. Each row is scored according to the prioritization and the importance of collecting a certain datapoint. For example, the send back ML model assigns a final score of one (1) if the datapoint has both a very high drift score and a 1 from the edge user flag, whereas the model assigns a final score of zero (0) to a datapoint if it has a very low data scientist criteria score but is otherwise normal.
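The construction of the synthetic training dataset can be sketched in Python as follows. The specific weights, the clamped weighted sum used to label each row, and the function names are assumptions chosen only to respect the stated priority order (user flag, then bias and drift, then data scientist criteria); the description does not fix these values.

```python
import random
from typing import List, Tuple

# Hypothetical weights: user flag > bias = drift > data scientist criteria.
WEIGHTS = (0.7, 0.4, 0.4, 0.1)

Features = Tuple[float, float, float, float]


def priority_label(user_flag: float, bias: float, drift: float, criteria: float) -> float:
    """Label one synthetic row with a final score clamped to [0, 1]."""
    raw = sum(w * v for w, v in zip(WEIGHTS, (user_flag, bias, drift, criteria)))
    return min(raw, 1.0)


def make_synthetic_dataset(n_rows: int, seed: int = 0) -> List[Tuple[Features, float]]:
    """Generate (features, final_score) rows for training a send back model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        features = (
            float(rng.randint(0, 1)),  # edge user flag is binary
            rng.random(),              # bias detection confidence score
            rng.random(),              # drift/anomaly score
            rng.random(),              # data scientist criteria score
        )
        rows.append((features, priority_label(*features)))
    return rows


dataset = make_synthetic_dataset(100)

flagged_and_drifted = priority_label(1.0, 0.0, 0.95, 0.0)  # clamps to 1.0
only_low_criteria = priority_label(0.0, 0.0, 0.0, 0.05)    # close to zero
```

Any regression model could then be fit to `dataset`; with these assumed weights, the two example rows land at or near the final scores of one and zero given in the text.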
In decision 220, IDDF program 112 determines whether the final score meets a send back threshold. In an embodiment, IDDF program 112 compares the final score output by the send back ML model to a send back threshold, i.e., a threshold score between zero (0) and one (1) required for sending back data to the cloud data center, e.g., 0.8. The send back threshold is based on network conditions and can be adjusted by an administrator (e.g., data scientist that creates/updates the filters/models 211-214) according to network conditions. In order to not flood the network with data, the send back threshold for sending data back can have a direct relationship with network traffic, i.e., when network traffic is high, the send back threshold increases, and when network traffic is low, the send back threshold decreases.
In an embodiment, IDDF program 112 measures network traffic by sending a ping and measuring the Round Trip Time (RTT) of a packet of data. This provides a very lightweight measure of network latency, which has a direct effect on performance/throughput of a window-based protocol or any request-response protocol. IDDF program 112 sends pings to an IP address responsible for model monitoring every 15 seconds by default, but this interval can be adjusted between 2 seconds and 30 seconds depending on the level of granularity desired. In an embodiment, an RTT of >400 ms is considered slow, and thus, IDDF program 112 increases the send back threshold by an amount given by the equation: (1−current threshold)/2, so that the threshold never exceeds one (1). An RTT between 250 ms and 400 ms is considered normal, and thus the send back threshold is maintained. An RTT <250 ms is considered fast, and thus IDDF program 112 decreases the send back threshold by 0.1 (until the threshold is 0.1) for every ping that is measured to be less than 250 ms.
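The threshold adjustment rule can be sketched in Python as below. The sketch reads the equation (1−current threshold)/2 as the size of the increase, which is one interpretation of the text; the function name, the rounding used to limit floating-point noise, and the sample RTT values are assumptions. Measuring the RTT itself (the ping) is outside the sketch.

```python
def adjust_send_back_threshold(threshold: float, rtt_ms: float) -> float:
    """Return the updated send back threshold after one RTT measurement."""
    if rtt_ms > 400:
        # Slow network: move the threshold halfway toward 1, never past it.
        return min(threshold + (1.0 - threshold) / 2.0, 1.0)
    if rtt_ms < 250:
        # Fast network: lower the threshold by 0.1, with a floor of 0.1.
        # Rounding is an assumption added to suppress floating-point drift.
        return max(round(threshold - 0.1, 10), 0.1)
    # Normal network (250 ms to 400 ms): keep the threshold unchanged.
    return threshold


# Hypothetical sequence of RTT measurements, one per ping interval.
threshold = 0.8
history = []
for rtt in (500.0, 300.0, 120.0, 120.0):
    threshold = adjust_send_back_threshold(threshold, rtt)
    history.append(threshold)
```

Starting from 0.8, a slow ping raises the threshold to about 0.9, a normal ping leaves it there, and each fast ping lowers it by 0.1.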
If IDDF program 112 determines the final score does not meet the send back threshold (decision 220, NO branch), then IDDF program 112 proceeds to step 225 and discards (i.e., deletes or removes in some way) a preset percentage of the inference data at the edge and sends the remaining percentage to the cloud labelled as non-filtered data, e.g., 90% is discarded at edge computing device 110A and 10% is sent back to cloud data center 120 and that 10% of inference data is labelled as non-filtered data. If IDDF program 112 determines the final score does meet the send back threshold (decision 220, YES branch), then IDDF program 112 proceeds to step 230 and queues the inference data to be sent back to the cloud (e.g., cloud data center 120). Thus, only a subset of the inference data is sent back to cloud data center 120 with datapoints that do not meet the send back threshold being discarded at the edge.
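The branching at decision 220 can be illustrated with a short Python sketch. The text specifies a preset percentage to discard but not how the retained datapoints are chosen, so random sampling is assumed here; the function name, the returned labels, and the default `keep_fraction` of 0.10 (mirroring the 90%/10% example) are likewise hypothetical.

```python
import random


def route_datapoint(
    final_score: float,
    threshold: float,
    rng: random.Random,
    keep_fraction: float = 0.10,
) -> str:
    """Decide what happens to one datapoint after scoring.

    Returns "queue_filtered" (YES branch), or on the NO branch either
    "send_non_filtered" or "discard".
    """
    if final_score >= threshold:
        return "queue_filtered"
    # Assumption: the retained share of below-threshold data is sampled at random.
    if rng.random() < keep_fraction:
        return "send_non_filtered"
    return "discard"


rng = random.Random(42)
decisions = [route_datapoint(score, 0.8, rng) for score in (0.95, 0.80, 0.30, 0.10)]
```

Passing a seeded `random.Random` keeps the sampling reproducible for testing; a deployment could instead use a deterministic scheme such as keeping every tenth below-threshold datapoint.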
Various aspects of the present disclosure are described by narrative text, flowcharts, block diagrams of computer systems and/or block diagrams of the machine logic included in computer program product (CPP) embodiments. With respect to any flowcharts, depending upon the technology involved, the operations can be performed in a different order than what is shown in a given flowchart. For example, again depending upon the technology involved, two operations shown in successive flowchart blocks may be performed in reverse order, as a single integrated step, concurrently, or in a manner at least partially overlapping in time.
A computer program product embodiment (“CPP embodiment” or “CPP”) is a term used in the present disclosure to describe any set of one, or more, storage media (also called “mediums”) collectively included in a set of one, or more, storage devices that collectively include machine readable code corresponding to instructions and/or data for performing computer operations specified in a given CPP claim. A “storage device” is any tangible device that can retain and store instructions for use by a computer processor. Without limitation, the computer readable storage medium may be an electronic storage medium, a magnetic storage medium, an optical storage medium, an electromagnetic storage medium, a semiconductor storage medium, a mechanical storage medium, or any suitable combination of the foregoing. Some known types of storage devices that include these mediums include: diskette, hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or Flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile disk (DVD), memory stick, floppy disk, mechanically encoded device (such as punch cards or pits/lands formed in a major surface of a disc) or any suitable combination of the foregoing. A computer readable storage medium, as that term is used in the present disclosure, is not to be construed as storage in the form of transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide, light pulses passing through a fiber optic cable, electrical signals communicated through a wire, and/or other transmission media. 
As will be understood by those of skill in the art, data is typically moved at some occasional points in time during normal operations of a storage device, such as during access, de-fragmentation or garbage collection, but this does not render the storage device as transitory because the data is not transitory while it is stored.
Computer 301 may take the form of a desktop computer, laptop computer, tablet computer, smart phone, smart watch or other wearable computer, mainframe computer, quantum computer or any other form of computer or mobile device now known or to be developed in the future that is capable of running a program, accessing a network or querying a database, such as remote database 330. As is well understood in the art of computer technology, and depending upon the technology, performance of a computer-implemented method may be distributed among multiple computers and/or between multiple locations. On the other hand, in this presentation of computing environment 300, detailed discussion is focused on a single computer, specifically computer 301, to keep the presentation as simple as possible. Computer 301 may be located in a cloud, even though it is not shown in a cloud in
Processor set 310 includes one, or more, computer processors of any type now known or to be developed in the future. Processing circuitry 320 may be distributed over multiple packages, for example, multiple, coordinated integrated circuit chips. Processing circuitry 320 may implement multiple processor threads and/or multiple processor cores. Cache 321 is memory that is located in the processor chip package(s) and is typically used for data or code that should be available for rapid access by the threads or cores running on processor set 310. Cache memories are typically organized into multiple levels depending upon relative proximity to the processing circuitry. Alternatively, some, or all, of the cache for the processor set may be located “off chip.” In some computing environments, processor set 310 may be designed for working with qubits and performing quantum computing.
Computer readable program instructions are typically loaded onto computer 301 to cause a series of operational steps to be performed by processor set 310 of computer 301 and thereby effect a computer-implemented method, such that the instructions thus executed will instantiate the methods specified in flowcharts and/or narrative descriptions of computer-implemented methods included in this document (collectively referred to as “the inventive methods”). These computer readable program instructions are stored in various types of computer readable storage media, such as cache 321 and the other storage media discussed below. The program instructions, and associated data, are accessed by processor set 310 to control and direct performance of the inventive methods. In computing environment 300, at least some of the instructions for performing the inventive methods may be stored in block 316 in persistent storage 313.
Communication fabric 311 is the signal conduction path that allows the various components of computer 301 to communicate with each other. Typically, this fabric is made of switches and electrically conductive paths, such as the switches and electrically conductive paths that make up busses, bridges, physical input/output ports and the like. Other types of signal communication paths may be used, such as fiber optic communication paths and/or wireless communication paths.
Volatile memory 312 is any type of volatile memory now known or to be developed in the future. Examples include dynamic type random access memory (RAM) or static type RAM. Typically, volatile memory 312 is characterized by random access, but this is not required unless affirmatively indicated. In computer 301, the volatile memory 312 is located in a single package and is internal to computer 301, but, alternatively or additionally, the volatile memory may be distributed over multiple packages and/or located externally with respect to computer 301.
Persistent storage 313 is any form of non-volatile storage for computers that is now known or to be developed in the future. The non-volatility of this storage means that the stored data is maintained regardless of whether power is being supplied to computer 301 and/or directly to persistent storage 313. Persistent storage 313 may be a read only memory (ROM), but typically at least a portion of the persistent storage allows writing of data, deletion of data and re-writing of data. Some familiar forms of persistent storage include magnetic disks and solid state storage devices. Operating system 322 may take several forms, such as various known proprietary operating systems or open source Portable Operating System Interface-type operating systems that employ a kernel. The code included in block 316 typically includes at least some of the computer code involved in performing the inventive methods.
Peripheral device set 314 includes the set of peripheral devices of computer 301. Data communication connections between the peripheral devices and the other components of computer 301 may be implemented in various ways, such as Bluetooth connections, Near-Field Communication (NFC) connections, connections made by cables (such as universal serial bus (USB) type cables), insertion-type connections (for example, secure digital (SD) card), connections made through local area communication networks and even connections made through wide area networks such as the internet. In various embodiments, UI device set 323 may include components such as a display screen, speaker, microphone, wearable devices (such as goggles and smart watches), keyboard, mouse, printer, touchpad, game controllers, and haptic devices. Storage 324 is external storage, such as an external hard drive, or insertable storage, such as an SD card. Storage 324 may be persistent and/or volatile. In some embodiments, storage 324 may take the form of a quantum computing storage device for storing data in the form of qubits. In embodiments where computer 301 is required to have a large amount of storage (for example, where computer 301 locally stores and manages a large database) then this storage may be provided by peripheral storage devices designed for storing very large amounts of data, such as a storage area network (SAN) that is shared by multiple, geographically distributed computers. IoT sensor set 325 is made up of sensors that can be used in Internet of Things applications. For example, one sensor may be a thermometer and another sensor may be a motion detector.
Network module 315 is the collection of computer software, hardware, and firmware that allows computer 301 to communicate with other computers through WAN 302. Network module 315 may include hardware, such as modems or Wi-Fi signal transceivers, software for packetizing and/or de-packetizing data for communication network transmission, and/or web browser software for communicating data over the internet. In some embodiments, network control functions and network forwarding functions of network module 315 are performed on the same physical hardware device. In other embodiments (for example, embodiments that utilize software-defined networking (SDN)), the control functions and the forwarding functions of network module 315 are performed on physically separate devices, such that the control functions manage several different network hardware devices. Computer readable program instructions for performing the inventive methods can typically be downloaded to computer 301 from an external computer or external storage device through a network adapter card or network interface included in network module 315.
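The separation of control functions from forwarding functions described above can be illustrated with a minimal sketch. The class names `Controller` and `ForwardingDevice`, the method names, and the addresses and port labels are hypothetical and are not taken from this document or from any particular SDN product; the sketch only shows one controller managing several forwarding devices.

```python
from dataclasses import dataclass, field


@dataclass
class ForwardingDevice:
    """Forwarding function: looks up a destination in a table it did not compute."""
    name: str
    table: dict[str, str] = field(default_factory=dict)  # destination -> output port

    def forward(self, destination: str) -> str:
        # Unknown destinations are dropped in this sketch.
        return self.table.get(destination, "drop")


@dataclass
class Controller:
    """Control function: decides routes and installs them on several devices."""
    devices: list[ForwardingDevice] = field(default_factory=list)

    def install_route(self, destination: str, ports: dict[str, str]) -> None:
        # ports maps a device name to the output port to use for this destination.
        for device in self.devices:
            if device.name in ports:
                device.table[destination] = ports[device.name]


switch_a = ForwardingDevice("switch_a")
switch_b = ForwardingDevice("switch_b")
controller = Controller([switch_a, switch_b])
controller.install_route("10.0.0.7", {"switch_a": "port2", "switch_b": "port5"})

print(switch_a.forward("10.0.0.7"))
print(switch_b.forward("10.0.0.7"))
print(switch_a.forward("10.0.0.9"))
```

In this sketch the forwarding devices hold no routing logic of their own, which mirrors the case where the control functions run on a device that is physically separate from the devices doing the forwarding.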
WAN 302 is any wide area network (for example, the internet) capable of communicating computer data over non-local distances by any technology for communicating computer data, now known or to be developed in the future. In some embodiments, the WAN 302 may be replaced and/or supplemented by local area networks (LANs) designed to communicate data between devices located in a local area, such as a Wi-Fi network. The WAN and/or LANs typically include computer hardware such as copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and edge servers.
End user device (EUD) 303 is any computer system that is used and controlled by an end user (for example, a customer of an enterprise that operates computer 301) and may take any of the forms discussed above in connection with computer 301. EUD 303 typically receives useful data from the operations of computer 301. For example, in a hypothetical case where computer 301 is designed to provide a recommendation to an end user, this recommendation would typically be communicated from network module 315 of computer 301 through WAN 302 to EUD 303. In this way, EUD 303 can display, or otherwise present, the recommendation to an end user. In some embodiments, EUD 303 may be a client device, such as a thin client, heavy client, mainframe computer, desktop computer, and so on.
Remote server 304 is any computer system that serves at least some data and/or functionality to computer 301. Remote server 304 may be controlled and used by the same entity that operates computer 301. Remote server 304 represents the machine(s) that collect and store useful data for use by other computers, such as computer 301. For example, in a hypothetical case where computer 301 is designed and programmed to provide a recommendation based on historical data, then this historical data may be provided to computer 301 from remote database 330 of remote server 304.
Public cloud 305 is any computer system available for use by multiple entities that provides on-demand availability of computer system resources and/or other computer capabilities, especially data storage (cloud storage) and computing power, without direct active management by the user. Cloud computing typically leverages sharing of resources to achieve coherence and economies of scale. The direct and active management of the computing resources of public cloud 305 is performed by the computer hardware and/or software of cloud orchestration module 341. The computing resources provided by public cloud 305 are typically implemented by virtual computing environments that run on various computers making up the computers of host physical machine set 342, which is the universe of physical computers in and/or available to public cloud 305. The virtual computing environments (VCEs) typically take the form of virtual machines from virtual machine set 343 and/or containers from container set 344. It is understood that these VCEs may be stored as images and may be transferred among and between the various physical machine hosts, either as images or after instantiation of the VCE. Cloud orchestration module 341 manages the transfer and storage of images, deploys new instantiations of VCEs and manages active instantiations of VCE deployments. Gateway 340 is the collection of computer software, hardware, and firmware that allows public cloud 305 to communicate through WAN 302.
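The orchestration duties described above (storing images, transferring them between physical hosts, deploying new instantiations, and managing active ones) can be sketched as a small in-memory model. This is only an illustrative sketch: the names `Image`, `Host`, and `Orchestrator`, and all method names, are hypothetical and do not describe the actual interface of cloud orchestration module 341 or of any real orchestration product.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Image:
    """A stored VCE image; kind is "vm" or "container" in this sketch."""
    name: str
    kind: str


@dataclass
class Host:
    """One physical machine that can store images and run instances."""
    name: str
    images: dict[str, Image] = field(default_factory=dict)
    running: dict[int, Image] = field(default_factory=dict)


class Orchestrator:
    """Hypothetical orchestrator tracking images and active instances."""

    def __init__(self, hosts: list[Host]) -> None:
        self.hosts = {host.name: host for host in hosts}
        self._ids = count(1)  # simple source of unique instance ids

    def store_image(self, host_name: str, image: Image) -> None:
        self.hosts[host_name].images[image.name] = image

    def transfer_image(self, image_name: str, src: str, dst: str) -> None:
        # Copies the image; the source host keeps its own copy.
        self.hosts[dst].images[image_name] = self.hosts[src].images[image_name]

    def deploy(self, image_name: str, host_name: str) -> int:
        # Instantiates a new active instance from an image stored on the host.
        host = self.hosts[host_name]
        instance_id = next(self._ids)
        host.running[instance_id] = host.images[image_name]
        return instance_id

    def stop(self, instance_id: int, host_name: str) -> None:
        del self.hosts[host_name].running[instance_id]


host1, host2 = Host("host1"), Host("host2")
orchestrator = Orchestrator([host1, host2])
orchestrator.store_image("host1", Image("web", "container"))
orchestrator.transfer_image("web", "host1", "host2")
first = orchestrator.deploy("web", "host1")
second = orchestrator.deploy("web", "host2")
orchestrator.stop(first, "host1")

print(sorted(host2.images), list(host2.running))
```

Deploying from a host that does not hold the image raises a `KeyError` in this sketch, which reflects the assumption that an image must be transferred to a host before it is instantiated there.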
Some further explanation of virtual computing environments (VCEs) will now be provided. VCEs can be stored as “images.” A new active instance of the VCE can be instantiated from the image. Two familiar types of VCEs are virtual machines and containers. A container is a VCE that uses operating-system-level virtualization. This refers to an operating system feature in which the kernel allows the existence of multiple isolated user-space instances, called containers. These isolated user-space instances typically behave as real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can utilize all resources of that computer, such as connected devices, files and folders, network shares, CPU power, and quantifiable hardware capabilities. However, programs running inside a container can only use the contents of the container and devices assigned to the container, a feature which is known as containerization.
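The restriction that a program inside a container can use only the container's contents and assigned devices can be pictured with a toy model. The sketch below does not perform real operating-system-level isolation; the `Container` class, its methods, and the file and device names are hypothetical and serve only to illustrate the visibility rule.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Toy stand-in for an isolated user-space instance."""
    name: str
    files: dict[str, str] = field(default_factory=dict)  # contents of the container
    devices: frozenset[str] = frozenset()                # devices assigned to it

    def read_file(self, path: str) -> str:
        if path not in self.files:
            raise FileNotFoundError(f"{path} is not visible inside {self.name}")
        return self.files[path]

    def open_device(self, device: str) -> str:
        if device not in self.devices:
            raise PermissionError(f"{device} is not assigned to {self.name}")
        return f"{self.name} opened {device}"


# Resources of the whole (hypothetical) computer.
host_files = {"/etc/hosts": "127.0.0.1 localhost", "/app/config": "mode=edge"}
host_devices = {"camera0", "gpu0", "usb1"}

# The container receives only a subset of the host's files and devices.
sandbox = Container(
    name="sandbox",
    files={"/app/config": host_files["/app/config"]},
    devices=frozenset({"camera0"}),
)

print(sandbox.read_file("/app/config"))
print(sandbox.open_device("camera0"))

try:
    sandbox.open_device("gpu0")
except PermissionError as err:
    print("denied:", err)
```

A program running directly on the host would see every entry in `host_files` and `host_devices`, whereas code given only the `sandbox` object can reach one file and one device.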
Private cloud 306 is similar to public cloud 305, except that the computing resources are only available for use by a single enterprise. While private cloud 306 is depicted as being in communication with WAN 302, in other embodiments a private cloud may be disconnected from the internet entirely and only accessible through a local/private network. A hybrid cloud is a composition of multiple clouds of different types (for example, private, community or public cloud types), often respectively implemented by different vendors. Each of the multiple clouds remains a separate and discrete entity, but the larger hybrid cloud architecture is bound together by standardized or proprietary technology that enables orchestration, management, and/or data/application portability between the multiple constituent clouds. In this embodiment, public cloud 305 and private cloud 306 are both part of a larger hybrid cloud.