The present disclosure relates generally to information handling systems. More particularly, the present disclosure relates to systems and methods for analyzing the validity or quality of a network fabric design.
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use, such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
The dramatic increase in computer usage and the growth of the Internet have led to a significant increase in networking. Networks, comprising such information handling systems as switches and routers, have not only grown more prevalent, but they have also grown larger and more complex. A network fabric can comprise a large number of information handling system nodes that are interconnected in a vast and complex mesh of links.
Furthermore, as businesses and personal lives increasingly rely on networked services, networks play an increasingly central and critical role in modern society. Thus, it is important that a network fabric be well designed and function reliably. However, given the size and complexity of modern network fabrics, it is difficult to ascertain the quality of a network design, particularly when designing the network. Sometimes, it is not until a network design has been implemented and used that it is known whether it was a good design or whether it has issues that affect its validity/quality, such as dependability, efficiency, stability, reliability, and/or expandability. For example, a network may have a network fabric design that can result in a single point of failure or may have a design that inefficiently utilizes the information handling systems of the network.
Accordingly, it is highly desirable to have ways to gauge the quality of a network fabric.
References will be made to embodiments of the disclosure, examples of which may be illustrated in the accompanying figures. These figures are intended to be illustrative, not limiting. Although the accompanying disclosure is generally described in the context of these embodiments, it should be understood that it is not intended to limit the scope of the disclosure to these particular embodiments. Items in the figures may not be to scale.
In the following description, for purposes of explanation, specific details are set forth in order to provide an understanding of the disclosure. It will be apparent, however, to one skilled in the art that the disclosure can be practiced without these details. Furthermore, one skilled in the art will recognize that embodiments of the present disclosure, described below, may be implemented in a variety of ways, such as a process, an apparatus, a system/device, or a method on a tangible computer-readable medium.
Components, or modules, shown in diagrams are illustrative of exemplary embodiments of the disclosure and are meant to avoid obscuring the disclosure. It shall also be understood that, throughout this discussion, components may be described as separate functional units, which may comprise sub-units, but those skilled in the art will recognize that various components, or portions thereof, may be divided into separate components or may be integrated together, including, for example, being in a single system or component. It should be noted that functions or operations discussed herein may be implemented as components. Components may be implemented in software, hardware, or a combination thereof.
Furthermore, connections between components or systems within the figures are not intended to be limited to direct connections. Rather, data between these components may be modified, re-formatted, or otherwise changed by intermediary components. Also, additional or fewer connections may be used. It shall also be noted that the terms “coupled,” “connected,” “communicatively coupled,” “interfacing,” “interface,” or any of their derivatives shall be understood to include direct connections, indirect connections through one or more intermediary devices, and wireless connections. It shall also be noted that any communication, such as a signal, response, reply, acknowledgement, message, query, etc., may comprise one or more exchanges of information.
Reference in the specification to “one or more embodiments,” “preferred embodiment,” “an embodiment,” “embodiments,” or the like means that a particular feature, structure, characteristic, or function described in connection with the embodiment is included in at least one embodiment of the disclosure and may be in more than one embodiment. Also, the appearances of the above-noted phrases in various places in the specification are not necessarily all referring to the same embodiment or embodiments.
The use of certain terms in various places in the specification is for illustration and should not be construed as limiting. The terms “include,” “including,” “comprise,” and “comprising” shall be understood to be open terms, and any examples are provided by way of illustration and shall not be used to limit the scope of this disclosure.
A service, function, or resource is not limited to a single service, function, or resource; usage of these terms may refer to a grouping of related services, functions, or resources, which may be distributed or aggregated. The terms memory, database, information base, data store, tables, hardware, cache, and the like may be used herein to refer to a system component or components into which information may be entered or otherwise recorded. The terms “data” and “information,” along with similar terms, may be replaced by other terminologies referring to a group of one or more bits and may be used interchangeably. The terms “packet” or “frame” shall be understood to mean a group of one or more bits. The term “frame” shall not be interpreted as limiting embodiments of the present invention to Layer 2 networks; and the term “packet” shall not be interpreted as limiting embodiments of the present invention to Layer 3 networks. The terms “packet,” “frame,” “data,” or “data traffic” may be replaced by other terminologies referring to a group of bits, such as “datagram” or “cell.” The words “optimal,” “optimize,” “optimization,” and the like refer to an improvement of an outcome or a process and do not require that the specified outcome or process has achieved an “optimal” or peak state.
It shall be noted that: (1) certain steps may optionally be performed; (2) steps may not be limited to the specific order set forth herein; (3) certain steps may be performed in different orders; and (4) certain steps may be done concurrently.
Any headings used herein are for organizational purposes only and shall not be used to limit the scope of the description or the claims. Each reference/document mentioned in this patent document is incorporated by reference herein in its entirety.
Because computer networking is a critical function in modern society, it is important that the design of information handling system nodes and connections (or links), which together form the network fabric, be done well.
Due to the complexity of modern network designs, a number of tools have been created to help in the design, operation, management, and/or troubleshooting of physical and virtual network topologies. One such tool is the SmartFabric Director (SFD), by Dell Technologies Inc. (also Dell EMC) of Round Rock, Texas, which dramatically simplifies the definition, provisioning, monitoring, and troubleshooting of physical underlay fabrics with intelligent integration, visibility, and control for virtualized overlays. As a part of an initial (i.e., Day-0) deployment, SFD uses a wiring diagram, which may be imported into the system. This wiring diagram may be a JSON (JavaScript Object Notation) object that represents the physical topology to be managed. This JSON file may include such elements as: (1) managed switches (which may also be referred to as fabric elements); (2) managed connections between switches; (3) switch attributes (such as model type (e.g., Z9264, S4128), role (e.g., spine, leaf, border), etc.); (4) connection attributes: link-id (e.g., ethernet1/1/1), link speed (e.g., 10G, 100G), link role (e.g., uplink, link aggregation group (LAG) internode link (also known as a virtual link trunking interface (VLTi))); and (5) other administrative items (e.g., management-IP (Internet Protocol) for switches).
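By way of illustration only, such a wiring-diagram object might resemble the following sketch, shown here as a Python dictionary that mirrors the JSON structure described above; the fabric name, device identifiers, and attribute values are hypothetical:

```python
import json

# Hypothetical wiring-diagram object mirroring the JSON elements described above.
wiring_diagram = {
    "fabric_name": "example-fabric",           # hypothetical administrative name
    "switches": [                              # (1) managed switches / fabric elements
        {"id": "spine-1", "model": "Z9264", "role": "spine",
         "management_ip": "10.0.0.1"},         # (3) switch attributes, (5) management-IP
        {"id": "leaf-1", "model": "S4128", "role": "leaf",
         "management_ip": "10.0.0.11"},
    ],
    "connections": [                           # (2) managed connections between switches
        {"source": "spine-1", "source_link_id": "ethernet1/1/1",
         "destination": "leaf-1", "destination_link_id": "ethernet1/1/1",
         "link_speed": "100G", "link_role": "uplink"},   # (4) connection attributes
    ],
}

# Serialize to JSON for import into a controller such as SFD.
print(json.dumps(wiring_diagram, indent=2))
```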
To help generate a wiring diagram, other tools are also available. For example, Dell also provides Dell EMC’s Fabric Design Center (FDC) to help create a wiring diagram for a network fabric. Once the wiring diagram has been created, it may be imported into an SFD Controller for deployment.
While these tools aid in generating wiring diagrams and in deploying and managing fabrics, it is not apparent which designs are better than others. Given the vastness and complexity of some network fabrics, it may take experience, actual deployment, or both to gauge whether a network fabric will have issues that affect its validity/quality, such as dependability, efficiency, stability, reliability, and/or expandability.
In addition to fundamental problems with the network fabric design, a number of potential issues can exist in a design. For example, the following is a non-exhaustive list of issues that can exist in a wiring diagram: (1) missing fabric elements (e.g., missing a border switch); (2) missing one or more connections (e.g., uplink, VLTi, etc.); (3) platform compatibility issues; (4) feature compatibility issues; (5) end-of-life issues with older models; (6) platform capability issues (e.g., a lower-end device with limited capacity should preferably not be used in a key role, such as a spine node); and (7) insufficient link bandwidth (e.g., not enough bandwidth between a spine-leaf or leaf-leaf pair).
Fabric analysis is generally a manual process in which the wiring diagram is analyzed by hand after it has been created. There may be some rule-based approaches to aid the analysis, but such approaches have limitations in scalability, performance, and adaptability.
Since typical deployment fabrics are Clos networks, there are established best-practice guidelines on how to build them. For the reasons stated above, it would be very useful to have an analysis tool that can gauge these design-level issues prior to deployment.
Accordingly, embodiments herein help automate the analysis of network fabric designs. In one or more embodiments, the analysis functionality may be incorporated into design and/or deployment and management tools. For example, a fabric design center tool may include a feature or features that allow a user to build a fabric (e.g., the “Build Your Own Fabric” section of Dell EMC’s FDC) and may include a “Fabric Analysis” embodiment that analyzes the wiring diagram. Thus, the Fabric Analysis feature takes a wiring diagram and analyzes it using one or more embodiments as explained in this document. In one or more embodiments, the analysis feature may generate a real-valued score (e.g., 0.0 ≤ score ≤ 1.0) that represents the strength of the design, and the score may be mapped to various categories. For example, in one or more embodiments, a qualitative policy may have three categories or classifications, such as green (good/acceptable), yellow (caution/potential issues), and red (do not use/critical issues).
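A minimal sketch of such a threshold-based qualitative policy is shown below; the cutoff values are hypothetical and would be chosen per embodiment:

```python
def classify_design(score: float) -> str:
    """Map a real-valued design score (0.0-1.0) to a qualitative category.

    The 0.75 and 0.40 cutoffs are hypothetical illustration values only.
    """
    if score >= 0.75:
        return "green"   # good/acceptable
    if score >= 0.40:
        return "yellow"  # caution/potential issues
    return "red"         # do not use/critical issues


print(classify_design(0.82))  # -> "green"
```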
It shall be noted that different, fewer, or more categories may be used. For example, the set of classes may be associated with certain issues or potential issues with the network fabric. By classifying the issues or potential issues, a network designer or administrator may take one or more corrective actions.
In one or more embodiments, appropriate or corrective actions may include a design audit by an advanced services team (i.e., expert(s) in the field) for new recommendations. The audit may be performed at various degrees of complexity and may involve checking for the presence of common issues, for example: (a) checking all devices in the topology for end-of-life date; (b) checking if there is sufficient redundancy in the design (i.e., every leaf/spine is a pair); (c) checking connection bandwidth between leaf-pairs and spine-pairs to ensure sufficiency; (d) checking if a border leaf is present; and (e) checking to see if the devices are being used appropriately based on their capability (i.e., low-end device should not be used as a spine device). Additionally, or alternatively, the one or more corrective actions may involve making a change or changes based upon classification(s) identified by the neural network system.
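By way of illustration only, a few of the simpler audit checks listed above might be sketched programmatically as follows; the field names and the set of low-end models are hypothetical assumptions, not a description of any particular product:

```python
from collections import Counter

LOW_END_MODELS = {"S4128"}  # hypothetical set of models unsuited for the spine role

def audit_wiring_diagram(diagram: dict) -> list[str]:
    """Return a list of human-readable findings for a wiring-diagram dictionary."""
    findings = []
    roles = Counter(sw["role"] for sw in diagram["switches"])

    # (d) check that a border switch is present
    if roles.get("border", 0) == 0:
        findings.append("No border switch found in the topology.")

    # (b) check redundancy: every leaf/spine role should appear in pairs
    for role in ("leaf", "spine"):
        if roles.get(role, 0) % 2 != 0:
            findings.append(f"Role '{role}' is not deployed in pairs ({roles.get(role, 0)} found).")

    # (e) check that low-capability devices are not used as spines
    for sw in diagram["switches"]:
        if sw["role"] == "spine" and sw["model"] in LOW_END_MODELS:
            findings.append(f"Low-end model {sw['model']} used as spine ({sw['id']}).")

    return findings
```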
In one or more embodiments, an adjacency matrix, Â (Â ∈ ℝ^{n×n}), is generated (210) to represent the multigraph. In one or more embodiments, the adjacency matrix is an n × n matrix, where n is the number of nodes in the multigraph. The adjacency matrix may be augmented to include (215) information related to edge features or attributes, such as link type, link speed, the number of links connected between two nodes, etc.
In one or more embodiments, a degree matrix, D (D ∈ ℝ^{n×n}), which is an n × n diagonal matrix representing the degree of each node in the multigraph, is created (220). In one or more embodiments, the degree represents the number of links of a node, which may count bi-directional links as two separate links or, in embodiments, may treat them as a single link.
In one or more embodiments, the adjacency matrix, Â, and the degree matrix, D, may be combined and normalized (225) to build a normalized adjacency matrix A that will be used as an input to train the neural network. In one or more embodiments, the following formula may be used to obtain the normalized adjacency matrix:
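Assuming the symmetric normalization commonly used with graph convolutional networks, the normalized adjacency matrix may be written as A = D^{-1/2} Â D^{-1/2}. A minimal numpy sketch under that assumption, with hypothetical example values, follows:

```python
import numpy as np

def normalize_adjacency(A_hat: np.ndarray) -> np.ndarray:
    """Symmetrically normalize an adjacency matrix: D^{-1/2} @ A_hat @ D^{-1/2}."""
    degrees = A_hat.sum(axis=1)                      # node degrees (row sums)
    d_inv_sqrt = np.zeros_like(degrees)
    nonzero = degrees > 0
    d_inv_sqrt[nonzero] = degrees[nonzero] ** -0.5   # guard against isolated nodes
    D_inv_sqrt = np.diag(d_inv_sqrt)
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

# Hypothetical 3-node multigraph: entries count parallel links between nodes,
# with self-loops added along the diagonal (a common GCN convention).
A_hat = np.array([[1., 2., 0.],
                  [2., 1., 1.],
                  [0., 1., 1.]])
A = normalize_adjacency(A_hat)
```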
In one or more embodiments, a feature matrix is created (230) for the nodes of the network fabric.
In compiling the dataset, a variety of node values and link values were used for generating wiring-diagram multigraphs. Examples include:
In generating the dataset, care was taken to have a balanced distribution with equal numbers of good and not-so-good wiring diagrams. In one or more embodiments, the dataset was divided into an 80-10-10 distribution representing training, cross-validation, and testing, respectively.
In one or more embodiments, a 3-layer GCN may be used, but it shall be noted that other configurations or different numbers of layers may be used.
A general formula for generating convolutions on the graph at any level may be expressed as:
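Assuming the standard graph-convolutional formulation, the layer-wise propagation rule may be written as H^{(l+1)} = σ(A H^{(l)} W^{(l)}), where H^{(0)} = X is the input feature matrix, A is the normalized adjacency matrix, W^{(l)} is the trainable weight matrix of layer l, and σ is a non-linear activation function (e.g., ReLU).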
A processing pipeline for the 3-layer GCN is depicted in FIG. 12.
After flowing through multiple GCN layers 1210, 1220, and 1230 with different convolution filters, the hidden-layer output of the last layer is fed into a softmax non-linear function 1235 that produces a probability distribution over possible score values summing to 1.
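The softmax is the standard normalized exponential; for a vector of logits z output by the last layer, softmax(z)_i = exp(z_i) / Σ_j exp(z_j), so the outputs are non-negative and sum to 1.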
In one or more embodiments, these scores may be used as the predicted scores, in which the category with the highest score may be selected as the output category. For training, the predicted score may be compared to the ground-truth score to compute a loss. The losses may be used to update one or more parameters of the GCN using, for example, gradient descent.
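A compact training sketch under the assumptions above (symmetric normalization, three graph-convolution layers, a mean-pooled readout, and a softmax/cross-entropy loss) is shown below; the layer sizes, pooling choice, and hyperparameters are illustrative assumptions rather than a description of the actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNClassifier(nn.Module):
    """Minimal 3-layer graph convolutional classifier for whole-graph labels."""

    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.w0 = nn.Linear(in_dim, hidden_dim, bias=False)
        self.w1 = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, num_classes, bias=False)

    def forward(self, X: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # Each layer applies the propagation rule H' = sigma(A @ H @ W).
        h = F.relu(A @ self.w0(X))
        h = F.relu(A @ self.w1(h))
        h = A @ self.w2(h)
        # Mean-pool node representations into one graph-level logit vector
        # (an assumed readout; other pooling schemes are possible).
        return h.mean(dim=0)

model = GCNClassifier(in_dim=16, hidden_dim=32, num_classes=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# X: n x d feature matrix, A: n x n normalized adjacency, y: ground-truth class.
X = torch.randn(8, 16)
A = torch.eye(8)           # placeholder normalized adjacency for illustration
y = torch.tensor(1)

logits = model(X, A)
loss = F.cross_entropy(logits.unsqueeze(0), y.unsqueeze(0))  # softmax + NLL
optimizer.zero_grad()
loss.backward()
optimizer.step()
```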
In one or more embodiments, the training process may be repeated until a stop condition has been reached. In one or more embodiments, a stop condition may include: (1) a set number of iterations have been performed; (2) an amount of processing time has been reached; (3) convergence (e.g., the difference between consecutive iterations is less than a first threshold value); (4) divergence (e.g., the performance deteriorates); (5) an acceptable outcome has been reached; and (6) a set or sets of data have been fully processed. After training has completed, a trained GCN is output. As explained in the next section, the trained GCN model may be used for predicting the validity/quality of a wiring diagram for a network fabric design.
A degree matrix is also created (1320). In one or more embodiments, the degree value for a network element represents its number of connections.
In one or more embodiments, the adjacency matrix and the degree matrix are used (1325) to compute a normalized adjacency matrix, which may be computed using:
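In one or more embodiments, this may be the same symmetric normalization used during training, e.g., A = D^{-1/2} Â D^{-1/2}, where Â is the adjacency matrix and D is the degree matrix.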
In like manner as for the training process, for each networking element, a feature representation is generated using one or more features about or related to the networking element. For the network fabric, the feature representations for the networking elements may be formed (1330) into a feature matrix. In one or more embodiments, the feature matrix comprises a feature representation for each network element in the network fabric.
In one or more embodiments, the feature matrix is input (1335) into a trained graph convolution network (GCN) that uses the input feature matrix and the normalized adjacency matrix to compute a classification probability for each class from a set of classes. Responsive to a classification probability exceeding a threshold value, a classification associated with that class is assigned (1340) to the network fabric. As noted previously, the classification classes may be generalized categories (e.g., green (good/acceptable), yellow (caution/potential issues), or red (do not use/critical issues)). Additionally, or alternatively, the neural network may comprise a set of neural networks that provide multiclass classification in which each identified class specifies a certain issue. For example, there may be classes related to missing links, poor redundancy, missing a fabric element, wrong configuration, incompatibility of devices or links, capacity issues, etc.
In any event, depending upon the assigned classification, one or more actions may be taken (1345). Actions may include deploying the network fabric as designed. Corrective actions may include redesigning the network fabric to correct for one or more defects, which may be identified by the assigned classification.
In one or more embodiments, appropriate or corrective actions may include a design audit by an advanced services team (i.e., expert(s) in the field) for new recommendations. The audit may be performed at various degrees of complexity and may involve checking for the presence of common issues, for example: (a) checking all devices in the topology for end-of-life date; (b) checking if there is sufficient redundancy in the design (i.e., every leaf/spine is a pair); (c) checking connection bandwidth between leaf-pairs and spine-pairs to ensure sufficiency; (d) checking if a border leaf is present; and (e) checking to see if the devices are being used appropriately based on their capability (i.e., low-end device should not be used as a spine device). Alternatively, or additionally, identified classes of issues may be used to correct issues in the design, which may be fixed programmatically based upon the identified issues.
In tests, it was found that the trained model performed extremely well on all objective tasks of determining incomplete or incorrect topologies, incompatible equipment, and missing connections.
Initial latent design issues (e.g., Day-0 issues) may manifest themselves as problems during Day-N deployment and beyond. These initially undetected latent issues may cause significant deployment delays and increased operating expenses. For example, a single issue with the wiring diagram for an initial release may require the whole virtual appliance to be redeployed from scratch. Thus, embodiments herein help the deployment engineer identify problems associated with creating the physical wiring diagram. Furthermore, as compared with any rule-based system or expert-based approach, embodiments provide several benefits:
Scalability: Due to the number of permutations in the graph, a rule-based system would require millions of rules, which is impractical, if not impossible, to produce and is not scalable. In comparison, an embodiment has a fixed set of parameters and is spatially invariant; it learns high dimensional patterns from training examples to give predictions.
Adaptability: The nature of the problem is such that it may be said that there are no fixed “good” or “bad” wiring diagrams. For example, a moderate-bandwidth device deployed in a high-capacity, high-bandwidth fabric is less desirable than deploying the same switch in a small-scale fabric. Embodiments adapt to the overall context of the fabric to predict a score for the viability of such a deployment.
Performance: Neural network models, such as GCNs, performed very well on all objective tasks. Furthermore, tests performed on an embodiment provided superior performance.
Continued Improvement: As more data becomes available, the neural network model may be retrained for improved classification and/or may be augmented to learn additional classifications.
Wide deployment and usage: A trained neural model can be readily and widely deployed. Thus, less skilled administrators can use the trained neural model and receive the benefits that would otherwise be unavailable to them given their limited experience with network fabric deployments.
Ease of usage and time savings: A trained neural model can be easily deployed and used. Furthermore, once trained, it is very inexpensive to run the model on a wiring diagram. Thus, as networks evolve, it is easy, fast, and cost effective to gauge the quality of these changed network designs.
During the solution deployment of a converged or non-converged infrastructure, either in a private cloud or a hybrid cloud, networking systems design teams work with customers to design a network architecture based on the desired functionality and requirements. The output of this process is generally a wiring diagram of the physical topology that gets handed to a deployment team for installation at the customer data center.
This wiring diagram may represent the physical topology to be managed and may include such elements as: (1) managed fabric elements; (2) managed connections between fabric elements; (3) fabric element attributes (such as hardware model, software version, role (e.g., spine, leaf, border, core, edge), protocol support, etc.); (4) connection attributes: interface-id (e.g., ethernet1/1/1), interface speed (e.g., 10G, 100G), interface role (e.g., uplink, link aggregation group (LAG) internode link (also known as a virtual link trunking interface (VLTi))); and (5) other administrative items (e.g., management-IP (Internet Protocol) for switches).
In addition to the functional aspects of the wiring design, each infrastructure element has or represents additional attributes that must be considered in design and deployment situations. One set of these additional attributes that became particularly prominent during the COVID-19 pandemic was the availability of products. COVID-19, and its related lockdowns, travel restrictions, impacts on labor, and impacts on domestic and foreign trade, illustrated how vulnerable supply chains can be. The inability to source even one component of a multicomponent system, especially a critical component such as a processor, was enough to impact an infrastructure’s deployment. These infrastructure deployments are often critical, which means their timely deployment and proper functioning are necessary. While improper design can cause functional problems, selection of an infrastructure element that suffers a supply chain delay, thereby delaying deployment, is at least as negatively impactful as a poor design. Thus, when creating a fabric design, particularly a design that must be deployed according to a set timetable, it can be extremely important to also consider supply-related factors when gauging the robustness of a network fabric design.
Currently, to the extent that these factors are considered in the overall design, it is a manual process involving multiple teams. Given the complexity and the number of people involved in the process, the possibility for human error in either the design or in failing to appreciate a design-related issue—including a potential supply chain issue—is very high. Thus, it would be extremely useful to have an automatic network wiring diagram analyzer that could identify issues prior to deployment.
Accordingly, presented herein are systems and methods for network fabric design analysis to validate the robustness of a fabric architecture by modeling the problem as a graph classification problem using a neural network, such as a supervised deep-learning-based graph neural network (GNN). In one or more embodiments, a fabric topology design wiring diagram may be input either as a schematic or an object file (e.g., a JSON file), and the model generates a real-valued score (e.g., 0.0 ≤ score ≤ 1.0) to represent the strength of the design, with a higher score indicating a better design. Additionally, or alternatively, the model (which may be a set of models) may output values related to one or more elements of the infrastructure design to help pinpoint issues.
By way of general background, in one or more embodiments, a supplied infrastructure design, which may be but is not limited to a network fabric design, may be transformed into a graph, such as a directed acyclic multigraph, with nodes and unique identity edges. Infrastructure elements may be considered as graph nodes with their characteristics and capabilities as attributes or features of the graph node, and edges may be correlated to link connections (e.g., networking or communication connections) between the devices. It shall be noted that there may be multiple edges with unique attributes between two nodes. In one or more embodiments, feature extraction is performed on nodes and edges. The GNN model is first trained with labeled data (which may be organic data, synthetic data, or both), making this a supervised learning methodology. For example, a company, such as Dell, has been generating or working with infrastructure designs for itself and for its customers for decades. This vast set of historic data may be mined to obtain designs and to correlate designs with actual results (e.g., issues when actually deployed, supply chain delays, etc.), and labels may be assigned analytically, by experts, or both. In one or more embodiments, these designs may be used to generate additional training data by using permutations of topologies. The training dataset may be divided (e.g., 80-10-10) into training, cross-validation, and testing datasets, with good and not-so-good wiring diagrams distributed uniformly across the datasets.
Once trained, the trained model may be used for predicting the strength/quality (or potential issues) of an infrastructure design prior to deployment. Predicted output score(s) may be interpreted by employing a threshold-based qualitative policy, in which a score below a certain threshold may be flagged for re-inspection. In one or more embodiments, a classification label identifying a specific issue or issues and/or one or more quality indicators may be employed. It shall be noted that the neural network may be configured to provide such information for each node, edge, or both of the infrastructure deployment design.
In one or more embodiments, the infrastructure deployment design may be related to a new deployment or may be an addition or upgrade to an existing system. In one or more embodiments, in the case of an addition/upgrade, the analysis may be performed on just the new infrastructure deployment design. Alternatively, or additionally, the infrastructure deployment design may be extended to include the existing system (or the portion of the existing system that will be integrated with the new infrastructure deployment design) as part of the overall infrastructure deployment design that is analyzed. The existing portion may be detailed to include all of its nodes and edges, may be aggregated into one or more black-box portions that represent a block or blocks of infrastructure elements, or a mixture thereof.
In one or more embodiments, for each edge, a feature representation is generated (1410) using one or more features about the edge. The features about the edge may include one or more features, such as link type, link speed, number of links connected between two nodes, type of connections, components of the link (e.g., fibre optics, CAT-5, whether small form-factor pluggable (SFP) modules or transceivers are being used, etc.), etc. It shall be noted that one or more features for the edge may include supply-chain-related information. For example, it may be that certain transceivers may have several weeks lead time before they are available. If the timing is inconsistent with the deployment schedule for the infrastructure deployment design, then the neural network may be trained to signal an alert given the input and its training.
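As an illustration only, an edge's attributes (including a supply-chain-related field) might be encoded into a numeric feature vector as sketched below; the attribute names, vocabularies, and lead-time threshold are hypothetical:

```python
import numpy as np

LINK_TYPES = ["uplink", "vlti", "lag"]        # hypothetical link-type vocabulary

def encode_edge(edge: dict, max_lead_time_weeks: int = 4) -> np.ndarray:
    """Encode one edge's attributes as a numeric feature vector."""
    one_hot = [1.0 if edge["link_type"] == t else 0.0 for t in LINK_TYPES]
    speed_gbps = float(edge["speed_gbps"])                # e.g., 10, 100
    num_links = float(edge["num_links"])                  # parallel links between nodes
    # Supply-chain feature: flag transceivers whose lead time exceeds the schedule.
    lead_time_flag = 1.0 if edge["transceiver_lead_time_weeks"] > max_lead_time_weeks else 0.0
    return np.array(one_hot + [speed_gbps, num_links, lead_time_flag])

edge = {"link_type": "uplink", "speed_gbps": 100, "num_links": 2,
        "transceiver_lead_time_weeks": 6}
print(encode_edge(edge))  # -> [1. 0. 0. 100. 2. 1.]
```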
In one or more embodiments, an adjacency matrix related to the edge feature representations is generated (1415).
In one or more embodiments, a degree matrix, D (e.g., D ∈ ℝ^{n×n}), which may be an n × n diagonal matrix representing the degree of each node in the multigraph, is created (1515). In one or more embodiments, the degree represents the number of links of a node, which may count bi-directional links as two separate links or, in embodiments, may treat them as a single link.
In one or more embodiments, the initial adjacency matrix, which has been augmented, and the degree matrix may be combined and normalized (1520) to build an adjacency matrix A that will be used as an input into the neural network. In one or more embodiments, the following formula may be used to obtain the final adjacency matrix:
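Assuming the same symmetric normalization described above, the final adjacency matrix may be computed as A = D^{-1/2} Â D^{-1/2}, where Â denotes the augmented initial adjacency matrix and D denotes the degree matrix.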
Given the infrastructure elements’ features, a feature matrix may be generated (1425).
Because any attribute of the node may be used as a feature, in one or more embodiments, categorical (e.g., nominal, ordinal) features may be converted (1610) into numeric features using label encoders, one-hot vector encoding, or other encoding methodologies, including using an encoder model or models. It shall be noted that conversion of non-numeric features may also be performed for edges, where needed.
The feature representations for the nodes in the infrastructure deployment design may be combined (1615) to create a feature matrix, X, that represents all features from all nodes. In one or more embodiments, the feature matrix is an n × d matrix (X ∈ ℝ^{n×d}).
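For example, categorical node attributes (such as role) might be one-hot encoded and stacked with numeric capability features into the n × d feature matrix X, as sketched below; the attribute names and vocabularies are illustrative assumptions:

```python
import numpy as np

ROLES = ["spine", "leaf", "border"]            # hypothetical categorical vocabulary

def encode_node(node: dict) -> np.ndarray:
    """One-hot encode the node role and append numeric capability features."""
    role_one_hot = [1.0 if node["role"] == r else 0.0 for r in ROLES]
    numeric = [float(node["port_count"]), float(node["max_speed_gbps"])]
    return np.array(role_one_hot + numeric)

nodes = [
    {"role": "spine", "port_count": 64, "max_speed_gbps": 100},
    {"role": "leaf",  "port_count": 48, "max_speed_gbps": 25},
]
X = np.stack([encode_node(n) for n in nodes])  # feature matrix X with shape (n, d)
```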
In compiling the dataset, a variety of node values and link values were used for generating wiring-diagram multigraphs. As noted above, in generating the dataset, care was taken to have a balanced distribution with equal numbers of good and not-so-good wiring diagrams. In one or more embodiments, the dataset was divided into an 80-10-10 distribution representing training, cross-validation, and testing, respectively.
Given a training dataset, the feature matrices, adjacency matrices, and ground-truth scores are input into a neural network to train the neural network. In one or more embodiments, the neural network may be a graph neural network (GNN). Additionally, or alternatively, the neural network may comprise a set of neural networks.
A general formula for a layer operation may be expressed as:
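As with the GCN discussed earlier, assuming a graph-convolution-style layer, the rule may be written as H^{(l+1)} = σ(A H^{(l)} W^{(l)}), where H^{(0)} = X is the feature matrix, A is the (normalized) adjacency matrix, W^{(l)} is the trainable weight matrix of layer l, and σ is a non-linear activation function.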
A processing pipeline for the 3-layer GNN is depicted in FIG. 17.
After flowing through multiple GNN layers, the hidden-layer output of the last layer may be fed into a softmax non-linear function 1735 that produces (1430) a probability distribution over possible score values summing to 1.
In one or more embodiments, these scores may be used as the predicted scores, in which the category with the highest score, or categories with scores above a threshold, may be selected as the output category or categories. For training, the predicted score may be compared to the ground-truth score to compute (1435) a loss. The losses may be used to update (1440) one or more parameters of the GNN using, for example, gradient descent and backpropagation.
In one or more embodiments, the training process may be repeated (1445) until a stop condition has been reached. In one or more embodiments, a stop condition may include: (1) a set number of iterations have been performed; (2) an amount of processing time has been reached; (3) convergence (e.g., the difference between consecutive iterations is less than a first threshold value); (4) divergence (e.g., the performance deteriorates); (5) an acceptable outcome has been reached; and (6) a set or sets of data have been fully processed. After training has completed, a trained GNN is output (1450). As explained in the next section, the trained GNN model may be used for analyzing the validity/quality of a wiring diagram for an infrastructure design.
In like manner as for the training process, for each infrastructure element, a feature representation is generated (1820) using one or more features about or related to the infrastructure element. For the infrastructure deployment design, the feature representations for the infrastructure elements may be formed (1825) into a feature matrix. In one or more embodiments, the feature matrix comprises a feature representation for each infrastructure element or a subset of the infrastructure elements in the infrastructure deployment design.
In one or more embodiments, the feature matrix is input (1830) into a trained graph neural network that uses the input feature matrix and the adjacency matrix to compute a classification probability for each class from a set of classes regarding the infrastructure deployment. Responsive to a classification probability exceeding a threshold value, a classification associated with that class may be output. As noted previously, the classification classes may be generalized categories (e.g., green (good/acceptable), yellow (caution/potential issues), or red (do not use/critical issues)). Additionally, or alternatively, the neural network may provide multiclass classification in which each identified class specifies a certain issue. For example, there may be classes related to missing links, poor redundancy, missing a fabric element, wrong configuration, incompatibility of devices or links, capacity issues, supply chain issues, etc. Also, by way of example, the neural network (which may comprise a plurality of neural networks) may provide an output for a specifically requested node or edge or for all the nodes and edges.
Depending upon the assigned classification(s), one or more actions may be taken (1840). Actions may include deploying the infrastructure deployment design as designed. Corrective actions may include redesigning the infrastructure deployment design to correct for one or more defects in the design and/or due to concerns related to a component of the infrastructure deployment design being identified as affected by a supply chain issue (and a different component should be used instead)—all of which may be identified by the assigned classification(s).
In one or more embodiments, appropriate or corrective actions may include a design audit by an advanced services team (i.e., expert(s) in the field) for new recommendations. The audit may be performed at various degrees of complexity and may involve checking for the presence of common issues, for example: (a) checking all devices in the topology for end-of-life date; (b) checking if there is sufficient redundancy in the design (i.e., every leaf/spine is a pair); (c) checking connection bandwidth between leaf-pairs and spine-pairs to ensure sufficiency; (d) checking if a border leaf is present; (e) checking to see if the devices are being used appropriately based on their capability (i.e., low-end device should not be used as a spine device); and (f) checking supply chain-related issues. Alternatively, or additionally, identified classes of issues may be used to correct issues in the design, which may be fixed programmatically based upon the identified issues.
In one or more embodiments, aspects of the present patent document may be directed to, may include, or may be implemented on one or more information handling systems (or computing systems). An information handling system/computing system may include any instrumentality or aggregate of instrumentalities operable to compute, calculate, determine, classify, process, transmit, receive, retrieve, originate, route, switch, store, display, communicate, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data. For example, a computing system may be or may include a personal computer (e.g., laptop), tablet computer, mobile device (e.g., personal digital assistant (PDA), smart phone, phablet, tablet, etc.), smart watch, server (e.g., blade server or rack server), a network storage device, camera, or any other suitable device and may vary in size, shape, performance, functionality, and price. The computing system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, read only memory (ROM), and/or other types of memory. Additional components of the computing system may include one or more disk drives, one or more network ports for communicating with external devices as well as various input and output (I/O) devices, such as a keyboard, mouse, stylus, touchscreen, and/or video display. The computing system may also include one or more buses operable to transmit communications between the various hardware components.
A number of controllers and peripheral devices may also be provided, as shown in FIG. 19.
In the illustrated system, all major system components may connect to a bus 1916, which may represent more than one physical bus. However, various system components may or may not be in physical proximity to one another. For example, input data and/or output data may be remotely transmitted from one physical location to another. In addition, programs that implement various aspects of the disclosure may be accessed from a remote location (e.g., a server) over a network. Such data and/or programs may be conveyed through any of a variety of machine-readable medium including, for example: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and holographic devices; magneto-optical media; and hardware devices that are specially configured to store or to store and execute program code, such as application specific integrated circuits (ASICs), programmable logic devices (PLDs), flash memory devices, other non-volatile memory (NVM) devices (such as 3D XPoint-based devices), and ROM and RAM devices.
The information handling system 2000 may include a plurality of I/O ports 2005, a network processing unit (NPU) 2015, one or more tables 2020, and a central processing unit (CPU) 2025. The system includes a power supply (not shown) and may also include other components, which are not shown for sake of simplicity.
In one or more embodiments, the I/O ports 2005 may be connected via one or more cables to one or more other network devices or clients. The network processing unit 2015 may use information included in the network data received at the node 2000, as well as information stored in the tables 2020, to identify a next device for the network data, among other possible activities. In one or more embodiments, a switching fabric may then schedule the network data for propagation through the node to an egress port for transmission to the next destination.
Aspects of the present disclosure may be encoded upon one or more non-transitory computer-readable media with instructions for one or more processors or processing units to cause steps to be performed. It shall be noted that the one or more non-transitory computer-readable media shall include volatile and/or non-volatile memory. It shall be noted that alternative implementations are possible, including a hardware implementation or a software/hardware implementation. Hardware-implemented functions may be realized using ASIC(s), programmable arrays, digital signal processing circuitry, or the like. Accordingly, the “means” terms in any claims are intended to cover both software and hardware implementations. Similarly, the term “computer-readable medium or media” as used herein includes software and/or hardware having a program of instructions embodied thereon, or a combination thereof. With these implementation alternatives in mind, it is to be understood that the figures and accompanying description provide the functional information one skilled in the art would require to write program code (i.e., software) and/or to fabricate circuits (i.e., hardware) to perform the processing required.
It shall be noted that embodiments of the present disclosure may further relate to computer products with a non-transitory, tangible computer-readable medium that have computer code thereon for performing various computer-implemented/processor-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present disclosure, or they may be of the kind known or available to those having skill in the relevant arts. Examples of tangible computer-readable media include, for example: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and holographic devices; magneto-optical media; and hardware devices that are specially configured to store or to store and execute program code, such as application specific integrated circuits (ASICs), programmable logic devices (PLDs), flash memory devices, other non-volatile memory (NVM) devices (such as 3D XPoint-based devices), and ROM and RAM devices. Examples of computer code include machine code, such as produced by a compiler, and files containing higher level code that are executed by a computer using an interpreter. Embodiments of the present disclosure may be implemented in whole or in part as machine-executable instructions that may be in program modules that are executed by a processing device. Examples of program modules include libraries, programs, routines, objects, components, and data structures. In distributed computing environments, program modules may be physically located in settings that are local, remote, or both.
One skilled in the art will recognize that no computing system or programming language is critical to the practice of the present disclosure. One skilled in the art will also recognize that a number of the elements described above may be physically and/or functionally separated into modules and/or sub-modules or combined together.
It will be appreciated to those skilled in the art that the preceding examples and embodiments are exemplary and not limiting to the scope of the present disclosure. It is intended that all permutations, enhancements, equivalents, combinations, and improvements thereto that are apparent to those skilled in the art upon a reading of the specification and a study of the drawings are included within the true spirit and scope of the present disclosure. It shall also be noted that elements of any claims may be arranged differently including having multiple dependencies, configurations, and combinations.
This patent application is a continuation-in-part application of and claims priority benefit under 35 USC §120 to co-pending and commonly-owned U.S. Pat. App. No. 16/920,345, filed on 2 Jul. 2020, entitled “NETWORK FABRIC ANALYSIS,” and listing Vinay Sawal as inventors (Docket No. 119323.01 (20110-2398)), which patent document is incorporated by reference herein in its entirety and for all purposes.
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 16920345 | Jul 2020 | US |
| Child | 18348118 | | US |