Embodiments disclosed herein relate generally to process management services. More particularly, embodiments disclosed herein relate to systems and methods for managing a process performed by a host device using a management controller of the host device.
Computing devices may provide computer-implemented services. The computer-implemented services may be used by users of the computing devices and/or devices operably connected to the computing devices. The computer-implemented services may be performed with hardware components such as processors, memory modules, storage devices, and communication devices. The operation of these components may impact the performance of the computer-implemented services.
Embodiments disclosed herein are illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.
Various embodiments will be described with reference to details discussed below, and the accompanying drawings will illustrate the various embodiments. The following description and drawings are illustrative and are not to be construed as limiting. Numerous specific details are described to provide a thorough understanding of various embodiments. However, in certain instances, well-known or conventional details are not described in order to provide a concise discussion of embodiments disclosed herein.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in conjunction with the embodiment can be included in at least one embodiment. The appearances of the phrases “in one embodiment” and “an embodiment” in various places in the specification do not necessarily all refer to the same embodiment.
References to an “operable connection” or “operably connected” means that a particular device is able to communicate with one or more other devices. The devices themselves may be directly connected to one another or may be indirectly connected to one another through any number of intermediary devices, such as in a network topology.
In general, embodiments disclosed herein relate to methods and systems for managing, at least in part, a process performed by a host device. To manage a process being performed, the system may include a data processing system (e.g., host device) with in-band components that may perform various processes such as startup processes, software installation processes, and/or other types of processes. While performing a process, the data processing system may be unable to perform other processes, provide a limited selection of information regarding the process being performed via select interfaces (e.g., graphical user interfaces), manage performance of the process being performed, and/or provide information regarding progress or completion of the process. For example, a limited amount of information (e.g., progress status bar) regarding an installation process being performed may be provided through a graphical user interface. Thus, a person may be required to be local to the data processing system to view the limited information regarding the process via the graphical user interface. However, the limited information available to the person via the graphical user interface may be inadequate for the person to identify whether the process is performing nominally (e.g., as expected for the type of process) or determine whether the process will be completed by the data processing system in the future. Consequently, the person tasked with managing the performance of the process may be unable to identify anomalies in progress of the process to ensure continued performance of the process to reach completion.
In order to manage continued performance of a process performed by the data processing system, the future progression status of the process may be inferred (e.g., using an inference model) based on current operating states of the data processing system at different points in time while performing the process. Obtaining current operating states at different points in time may include utilizing information regarding progress of the process displayed on a display of a graphical user interface of the data processing system. For example, an out-of-band component (e.g., management controller) of the data processing system may obtain screenshots of information displayed on the display at different points in time during performance of a process. By utilizing the screenshots at the different points in time and a trained inference model (e.g., inference model trained to generate inferences indicating an expected progression status of the process at a point in time), inferred process states of the process (e.g., identification of whether the process is performing nominally) may be obtained.
To obtain the future progression status of a process, the inferred process states may be ingested into an inference model (e.g., trained to generate inferences indicating whether the process will be completed at a point in time in the future) resulting in a predicted progression status of the process being performed. The predicted progression status may indicate a future progression status of the process such as “failed installation state”, “delayed installation state”, “expected completion of process”, and/or any other identifiers for the future state of the process being performed by the data processing system.
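The two-stage approach described above — inferring a current state from each screenshot, then predicting a future progression status from the series of inferred states — can be sketched as follows. The model logic here is a deliberately naive stand-in (a threshold check and linear extrapolation); an actual embodiment would use trained inference models, and the function names, thresholds, and status strings are illustrative assumptions rather than details from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class InferredProcessState:
    """Current progression status inferred from one screenshot."""
    timestamp: float         # seconds since the process started
    percent_complete: float  # progress extracted from the screenshot
    nominal: bool            # whether progress matches expectations

def infer_process_state(timestamp: float, percent_complete: float,
                        expected_rate: float = 1.0) -> InferredProcessState:
    """Stand-in for the first model: classify the current state."""
    expected = expected_rate * timestamp
    return InferredProcessState(timestamp, percent_complete,
                                nominal=percent_complete >= 0.5 * expected)

def predict_progression(states: list[InferredProcessState],
                        deadline: float) -> str:
    """Stand-in for the second model: predict the future progression
    status from a series of inferred states via linear extrapolation."""
    if len(states) < 2:
        raise ValueError("need at least two inferred process states")
    first, last = states[0], states[-1]
    rate = ((last.percent_complete - first.percent_complete)
            / (last.timestamp - first.timestamp))
    if rate <= 0:
        return "failed installation state"
    finish_time = last.timestamp + (100.0 - last.percent_complete) / rate
    return ("expected completion of process" if finish_time <= deadline
            else "delayed installation state")
```

A trained model would replace the extrapolation, but the interface — a series of per-screenshot states in, a predicted progression status out — mirrors the flow described above.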
The system may communicate the predicted progression status of the process (e.g., future operating state of the data processing system) to an individual tasked with managing performance of the process. By doing so, the data processing system may continue to perform the process to facilitate completion of the process thereby obtaining an updated data processing system that may provide computer-implemented services.
In an embodiment, a method for managing a process performed by a host device is provided. The method may include obtaining at least two inferred process states for a process being performed by the host device, the process only providing information regarding progress of the process via an interface, the process only being locally manageable using hardware resources of the host device, and a first inferred process state of the at least two inferred process states being based on a first screenshot of the interface taken at a first point in time; obtaining a first inference that indicates progression of the process in the future using the at least two inferred process states; managing continued performance of the process based on the progression of the process to facilitate completion of the process to obtain an updated host device; and providing computer-implemented services using the updated host device.
The first screenshot may be obtained by a management controller of the host device, and the first screenshot being obtained while the host device is performing the process.
A second of the at least two inferred process states may be based on a second screenshot of the interface taken at a second point in time, and the second screenshot being obtained by the management controller of the host device.
Obtaining the at least two inferred process states may include: obtaining the first inferred process state, the first inferred process state indicating a current state of the process, and the first inferred process state being obtained using: a first inference model, and at least the first screenshot, the first inference may be obtained using a second inference model.
Obtaining the at least two inferred process states may further include: obtaining telemetry data for the hardware resources, the telemetry data may include measurements of characteristics of the hardware resources while the host device may be performing the process; and obtaining hardware data for the hardware resources, the hardware data specifying the hardware resources that are contributing to performance of the process, the first inferred process state may also be obtained using the telemetry data and the hardware data.
Obtaining the first inferred process state may include: ingesting at least the first screenshot, the telemetry data, and the hardware data into the first inference model, the first inference model generating the first inferred process state based on the first screenshot, the telemetry data, and the hardware data.
The first inference model may be trained to generate inferences indicating a current progression status of the process. The current progression status may indicate whether the current progress of the process is nominal, and a progress status of the process indicates how much of the process has been completed.
Obtaining the first inference for the progression may include: ingesting the at least two inferred process states in an inference model to obtain the progression of the process, wherein the progression being a predicted progression status for the process at a point in time in the future, the predicted progression status indicating whether performance of the process is complete at the point in time in the future.
Managing continued performance of the process may include: in an instance of the managing of the continued performance where the progression indicates that the process is not complete at the point in time in the future: obtaining instructions for a management controller of the host device to perform a restart of the host device, the instructions indicating actions performable by the management controller to complete the restart; and performing, by the management controller, the actions to restart the performance of the process.
Managing continued performance of the process may further include: in an instance of the managing of the continued performance where the progression indicates that the process is not complete at the point in time in the future: obtaining management actions for a management controller of the host device, the management actions being actions performable by the management controller to modify operation of the hardware resources; and performing, by the management controller, the management actions to obtain updated hardware resources to facilitate continued performance of the process.
In an embodiment, a non-transitory media is provided that may include instructions that when executed by a processor cause the computer-implemented method to be performed.
In an embodiment, a data processing system is provided that may include the non-transitory media and a processor, and may perform the computer-implemented method when the computer instructions are executed by the processor.
Turning to
To provide the computer implemented services, the system may include, for example, data processing system 100. Data processing system 100 may provide the computer implemented services.
As part of and/or to prepare to provide the computer implemented services, data processing system 100 may perform various processes such as startup processes, software installation processes, and/or other types of processes. These processes may be performed by in-band components of data processing systems such as, for example, processors, memory modules, storage devices, etc.
When performing these processes, for example, (i) the in-band components may be unable to perform other processes, (ii) information regarding the processes being performed may only be available via select interfaces (e.g., graphical user interfaces, command line interfaces, etc.), (iii) only a limited selection of information regarding the performance of the processes may be available via the select interfaces (e.g., the graphical user interfaces may only include a progress status bar), (iv) the processes may not include general management interfaces through which performance of the processes may be managed, information regarding completion of the processes may be obtained, errors or undesired activity due to the processes may be tracked, etc., and/or (v) other types of limitations on the utility of data processing system 100 may be caused by the performance of the processes. While described with respect to a graphical user interface in various places throughout this application, it is to be appreciated that any type of interface may be used as a part of the disclosed embodiments.
For example, consider a scenario where a data processing system lacks management software such as an operating system and an installation for the operating system is initiated. The installer for the operating system may lack management interfaces, and information regarding the installation process may only be provided through a graphical user interface (e.g., only locally available). Further, the graphical user interface may only provide a limited amount of information regarding the installation. Thus, when such processes are performed by data processing system 100, a person may need to be local to the data processing system to view the graphical user interface to understand the state of the installation process.
However, even if a person is present and able to view the graphical user interface, the limited available information may be insufficient for the person to diagnose whether the process is being performed nominally (e.g., as expected for the type of the process) or abnormally (e.g., stalled, failed, etc.). Consequently, when such processes are performed, they may be performed abnormally without the abnormality being apparent to administrators tasked with managing the data processing system through viewing of the graphical user interfaces.
Thus, even when informed using available interfaces, administrators may be unable to diagnose the operating states of data processing systems performing such processes. For example, while performing processes as discussed above, the data processing system may enter an undesired operating state (e.g., a frozen installation state, a failed installation state, a frozen startup state, a failed startup state, etc.) without the change in state being detected.
In addition, even if a diagnosis of the operating state of the data processing system is obtainable, the limited available information may be insufficient for the person to determine progression of the process in the future. For example, while performing processes as discussed above, the data processing system may not provide any obvious indication of future anomalous behavior (e.g., continuous reboots, failure scenarios, etc.) based on a current operating state of the data processing system. Consequently, administrators may not accurately predict how long processes will take to complete and/or whether the processes being performed are likely to complete (e.g., even if the process is progressing at a point in time at which the administrator reviews information provided by the interfaces).
In general, embodiments disclosed herein may provide methods, systems, and/or devices for managing processes performed by a data processing system. To manage the processes performed by the data processing system, the actual state of operation of the data processing system at multiple points in time may be identified. To identify the actual state of the data processing system at different points in time, out-of-band components of a data processing system such as a management controller may obtain information at different points in time presented by a graphical user interface managed by a process performed by the data processing system. The information may be used (in isolation and/or combination with other information) to infer the actual operating state of the data processing system at the respective point in time. By doing so, the actual operating state at different points in time may be used to identify progression of the process in the future (e.g., future operating state). The progression of the process in the future may be used to manage the continued performance of the process to facilitate completion of the process being performed by the data processing system.
To provide the above noted functionality, the system may include data processing system 100, management system 102, and communication system 104. Each of these components is discussed below.
Data processing system 100 may provide all, or a portion, of the computer-implemented services. For example, data processing system 100 may provide computer-implemented services to users of data processing system 100 and/or other computing devices operably connected to data processing system 100.
To facilitate the computer implemented services, data processing system 100 may participate in process management services provided in cooperation with management system 102. To participate in the process management services, data processing system 100 may (i) obtain a first screenshot and a second screenshot of a graphical user interface displayed on a display of a host device at a first point in time and at a second point in time, respectively, (ii) obtain a first inference model (e.g., trained to generate inferences indicating a current progression status of processes), (iii) obtain, using the first screenshot, a first inferred process state (e.g., identifying an operating state of the data processing system at a first point in time), (iv) obtain, using the second screenshot, a second inferred process state (e.g., identifying an operating state of the data processing system at a second point in time), (v) obtain an inference indicating a progression of the process in the future using the at least two inferred process states, and/or (vi) facilitate management of progression of the process being performed by the host device to facilitate completion of the process by the host device based on the identified progression of the process in the future (e.g., future operating state of the data processing system). Refer to
Management system 102 may also participate in the process management services. When participating in the process management services, management system 102 may (i) obtain the first screenshot, the second screenshot, and/or other information regarding operation of data processing system 100 from data processing system 100 at different points in time, (ii) identify the operating state of data processing system 100 at different points in time based on the obtained information from data processing system 100, (iii) identify the progression of the process in the future (e.g., future operating state) of data processing system 100 based on the identified operating states of data processing system 100 at different points in time, (iv) perform management actions to manage the continued performance of the process to facilitate completion of the process based on the identified progression of the process in the future (e.g., future operating state of data processing system 100), and/or (v) perform other processes to facilitate management of data processing system 100.
When providing the functionalities, data processing system 100 and/or management system 102 (and/or components thereof) may perform all, or a portion, of the methods and/or actions shown in
Data processing system 100 and/or management system 102 (and/or components thereof) may be implemented using a computing device such as a host or a server, a personal computer (e.g., desktops, laptops, and tablets), a “thin” client, a personal digital assistant (PDA), a Web enabled appliance, a mobile phone (e.g., Smartphone), an embedded system, local controllers, an edge node, and/or any other type of data processing device or system. For additional details regarding computing devices, refer to
Any of the components illustrated in
While illustrated in
Turning to
To provide computer implemented services, data processing system 100 may include any quantity of hardware resources 150. Hardware resources 150 may be in-band hardware components, and may include a processor operably coupled to memory, storage, and/or other hardware components.
The processor may host various management entities such as operating systems, drivers, network stacks, and/or other software entities that provide various management functionalities. For example, the operating system and drivers may provide abstracted access to various hardware resources. Likewise, the network stack may facilitate packaging, transmission, routing, and/or other functions with respect to exchanging data with other devices.
For example, the network stack may support transmission control protocol/internet protocol communication (TCP/IP) (e.g., the Internet protocol suite) thereby allowing the hardware resources 150 to communicate with other devices via packet switched networks and/or other types of communication networks.
The processor may also host various applications that provide the computer implemented services. The applications may utilize various services provided by the management entities and use (at least indirectly) the network stack to communicate with other entities.
However, use of the network stack and the services provided by the management entities may place the applications at risk of indirect compromise. For example, if any of these entities trusted by the applications are compromised, these entities may subsequently compromise the operation of the applications. For example, if various drivers and/or the communication stack are compromised, communications to/from other devices may be compromised. If the applications trust these communications, then the applications may also be compromised.
For example, to communicate with other entities, an application may generate and send communications to a network stack and/or driver, which may subsequently transmit a packaged form of the communication via channel 170 to a communication component, which may then send the packaged communication (in a yet further packaged form, in some embodiments, with various layers of encapsulation being added depending on the network environment outside of data processing system 100) to another device via any number of intermediate networks (e.g., via wired/wireless channels 176 that are part of the networks).
To reduce the likelihood of the applications and/or other in-band entities from being indirectly compromised, data processing system 100 may include management controller 152 and network module 160. Each of these components of data processing system 100 is discussed below.
Management controller 152 may be implemented, for example, using a system on a chip or other type of independently operating computing device (e.g., independent from the in-band components, such as hardware resources 150, of a host data processing system 100). Management controller 152 may provide various management functionalities for data processing system 100. For example, management controller 152 may monitor various ongoing processes performed by the in-band components, and may manage power distribution, thermal management, and/or other functions of data processing system 100.
To do so, management controller 152 may be operably connected to various components via sideband channels 174 (in
For example, to reduce the likelihood of indirect compromise of an application hosted by hardware resources 150, management controller 152 may enable information from other devices to be provided to the application without traversing the network stack and/or management entities of hardware resources 150. To do so, the other devices may direct communications including the information to management controller 152. Management controller 152 may then, for example, send the information via sideband channels 174 to hardware resources 150 (e.g., to store it in a memory location accessible by the application, such as a shared memory location, a mailbox architecture, or other type of memory-based communication system) to provide it to the application. Thus, the application may receive and act on the information without the information passing through potentially compromised entities. Consequently, the information may be less likely to also be compromised, thereby reducing the possibility of the application becoming indirectly compromised. Similar processes may be used to facilitate outbound communications from the applications.
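The memory-based communication described above (e.g., a mailbox architecture) can be illustrated with a minimal sketch. The `Mailbox` class and its method names are hypothetical; a real implementation would use a shared memory region reachable over the sideband channels rather than an in-process queue, but the pattern — the management controller deposits messages, the application polls for them, and the host network stack is never involved — is the same.

```python
import queue

class Mailbox:
    """Minimal sketch of a memory-based mailbox between the management
    controller (out-of-band) and an in-band application."""

    def __init__(self):
        self._slots = queue.Queue()

    def deposit(self, message: bytes) -> None:
        # Invoked on behalf of the management controller via the
        # sideband path, bypassing the host network stack.
        self._slots.put(message)

    def poll(self):
        # Invoked by the in-band application; returns None when no
        # message is waiting.
        try:
            return self._slots.get_nowait()
        except queue.Empty:
            return None
```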
For example, to provide information regarding the status of processes performed by hardware resources 150, management controller 152 may enable information from hardware resources 150 to be provided to other devices without traversing the network stack and/or when an operating system of the data processing system 100 has not been installed. To do so, management controller 152 may obtain information regarding the status of hardware resources 150 via sideband channels 174. Management controller 152 may then, for example, utilize the information to obtain an inference (e.g., using an inference model) indicating whether the process is progressing according to an expected progression schedule for the respective process. Thus, management controller 152 may provide the status of the process being performed by hardware resources 150 to external devices without compromising the application and/or without installation of an operating system by data processing system 100.
Management controller 152 may be operably connected to communication components of data processing system 100 via separate channels (e.g., 172) from the in-band components, and may implement or otherwise utilize a distinct and independent network stack (e.g., TCP/IP). Consequently, management controller 152 may communicate with other devices independently of any of the in-band components (e.g., does not rely on any hosted software, hardware components, etc.). Accordingly, compromise of any of hardware resources 150 and hosted components may not result in indirect compromise of management controller 152 or of entities hosted by management controller 152.
To facilitate communication with other devices, data processing system 100 may include network module 160. Network module 160 may provide communication services for in-band components and out-of-band components (e.g., management controller 152) of data processing system 100. To do so, network module 160 may include traffic manager 162 and interfaces 164.
Traffic manager 162 may include functionality to (i) discriminate traffic directed to various network endpoints advertised by data processing system 100, and (ii) forward the traffic to/from the entities associated with the different network endpoints. For example, to facilitate communications with other devices, network module 160 may advertise different network endpoints (e.g., different media access control address/internet protocol addresses) for the in-band components and out-of-band components. Thus, other entities may address communications to these different network endpoints. When such communications are received by network module 160, traffic manager 162 may discriminate and direct the communications accordingly (e.g., over channel 170 or channel 172, in the example shown in
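The endpoint discrimination performed by traffic manager 162 can be sketched as a routing table keyed on the advertised addresses. The MAC addresses and channel labels below are illustrative assumptions, not values from the disclosure; they show only that frames addressed to the in-band and out-of-band endpoints are forwarded over distinct channels.

```python
# Hypothetical addresses for the two advertised network endpoints.
IN_BAND_MAC = "aa:bb:cc:00:00:01"   # hardware resources 150
OOB_MAC = "aa:bb:cc:00:00:02"       # management controller 152

def route_frame(dest_mac: str) -> str:
    """Return the internal channel a received frame is forwarded over,
    discriminating by destination endpoint as traffic manager 162 does."""
    routes = {
        IN_BAND_MAC: "channel-170",  # toward the in-band components
        OOB_MAC: "channel-172",      # toward the management controller
    }
    return routes.get(dest_mac, "drop")
```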
Accordingly, traffic directed to management controller 152 may never flow through any of the in-band components. Likewise, outbound traffic from the out-of-band component may never flow through the in-band components.
To support inbound and outbound traffic, network module 160 may include any number of interfaces 164. Interfaces 164 may be implemented using any number and type of communication devices which may each provide wired and/or wireless communication functionality. For example, interfaces 164 may include a wide area network card, a WiFi card, a wireless local area network card, a wired local area network card, an optical communication card, and/or other types of communication components. These components may support any number of wired/wireless channels 176.
Thus, from the perspective of an external device, the in-band components and out-of-band components of data processing system 100 may appear to be two independent network entities that may be independently addressable and otherwise unrelated to one another.
To further clarify embodiments disclosed herein, data flow diagrams in accordance with an embodiment are shown in
Turning to
To identify the future operating state of a process being performed by a data processing system (e.g., 100), inferred process states (e.g., inferred operating states) at different points in time while a process is being performed may be obtained. Although one data structure representing an inferred process state (e.g., process state 226) is illustrated in
In order to predict a future process state of a process performed by the data processing system, management controller 152 (e.g., an out-of-band component of data processing system 100) may obtain image derived data 208, telemetry data 218, and hardware data 224 for the data processing system. For example, the aforementioned data may be collected when the data processing system (and/or processes performed by it) is operating in a manner where it may not report its operating state to other entities. An inferred operating state (e.g., process state 226) based on the data collected at a point in time may be used to identify a progression status of the process at the point in time. Multiple such process states may be collected and used to form a time series. The time series may then be used to predict the future process state of the process.
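Assembling the time series of inferred process states might look like the following sketch. The `capture_screenshot` and `infer_state` callables are hypothetical stand-ins for the management controller's image capture process and the first inference model, respectively; the loop simply samples at a fixed interval and accumulates the per-sample states that later feed the prediction model.

```python
import time

def collect_process_states(capture_screenshot, infer_state,
                           interval_s: float, samples: int) -> list:
    """Build the time series of inferred process states: capture a
    screenshot at each sampling point, infer a state from it, and
    collect the states in order."""
    series = []
    for _ in range(samples):
        image = capture_screenshot()       # out-of-band capture
        series.append(infer_state(image))  # first inference model
        time.sleep(interval_s)
    return series
```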
To obtain the inferred operating state of the data processing system, inference generation process 210 may be performed to obtain inferences (e.g., process state 226). To obtain the inferences, an inference model (e.g., trained inference model 212) may be obtained. Inferences (e.g., indicating an expected progression status of processes being performed by in-band components of data processing system 100) may be obtained using trained inference model 212. Trained inference model 212 may be trained to generate inferences (e.g., as part of inference generation 210) indicating the progression status (e.g., process state 226) of the process being performed by the hardware resources of the data processing system (e.g., 100 shown in
Trained inference model 212 may be obtained using a variety of processes (e.g., generation, acquisition from another entity, etc.). For example, management controller 152 (shown in
Prior to obtaining an inference (e.g., process state 226), image derived data 208, telemetry data 218, and/or hardware data 224 may be obtained, which may serve as input to the trained inference model 212 during inference generation process 210. Image derived data 208, telemetry data 218, and/or hardware data 224 may include different types of data relating to in-band components of the data processing system (e.g., 100;
To obtain telemetry data 218, telemetry process 216 may be performed. To perform telemetry process 216, telemetry data request 214 may be obtained, for example, from a management controller (e.g., 152;
To obtain hardware data 224, hardware process 222 may be performed. To perform hardware process 222, hardware data request 220 may be obtained, for example, from a management controller (e.g., 152;
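The telemetry and hardware collection processes described above might be sketched as follows. The sensor names and inventory structures are illustrative assumptions, since the disclosure does not specify their formats: telemetry data is modeled as a snapshot of measurement callables, and hardware data as the subset of inventoried resources contributing to the process.

```python
def collect_telemetry(sensors: dict) -> dict:
    """Telemetry process sketch: snapshot measurements of hardware
    characteristics while the host performs the process."""
    return {name: read() for name, read in sensors.items()}

def collect_hardware_data(inventory: list, process_components: set) -> list:
    """Hardware process sketch: identify which inventoried hardware
    resources are contributing to performance of the process."""
    return [device for device in inventory if device in process_components]
```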
To obtain image derived data (e.g., 208), image preprocessing process 206 may be performed. During image preprocessing process 206, image 204 may be ingested and analyzed. The image (e.g., 204) may be used in various image processing processes to enhance the quality of the image, derive information from the image, and/or otherwise obtain information from image 204. Image preprocessing process 206 may include utilizing techniques (e.g., optical character recognition, textual information preprocessing, textual analysis using natural language processing, etc.) to extract visual features and detect objects or regions of interest (e.g., within image 204). For example, image preprocessing process 206 may include (i) identifying areas of interest in the screenshot (e.g., areas where graphical representations of progress are shown, areas where characters corresponding to words describing events/activities/etc. that occur during the process, etc.), (ii) segmenting the screenshot into segments to obtain screenshot segments, and/or (iii) extracting information from each screenshot segment by, for example, classifying the screenshot segments based on the areas of interest in the screenshot to obtain screenshot segment classifications corresponding to the screenshot segments. In some instances, the areas of interest in the screenshot (e.g., image 204) may define a group of pixels of the screenshot including informational content usable to infer the information (e.g., image derived data 208) regarding the progression of the process.
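A toy stand-in for image preprocessing process 206 is sketched below. A real embodiment would apply optical character recognition and object detection to pixel data; here the screenshot is modeled as rows of characters so the segmentation and extraction steps — finding a progress-bar region, reading its fill percentage, and collecting textual messages — can be shown in a runnable form. The row encoding is an assumption made purely for illustration.

```python
def preprocess_screenshot(rows: list[str]) -> dict:
    """Segment a text-modeled screenshot into areas of interest and
    extract image derived data: a progress percentage and any textual
    messages describing events that occur during the process."""
    derived = {"percent_complete": None, "messages": []}
    for row in rows:
        filled = row.count("#")
        # A row of only '#' and '-' characters models a progress bar.
        if filled and set(row) <= {"#", "-"}:
            derived["percent_complete"] = 100.0 * filled / len(row)
        elif row.strip():
            derived["messages"].append(row.strip())
    return derived
```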
Image 204 may include a screenshot of the information regarding progress of the process displayed, via a graphical user interface, on a display of the data processing system. For example, image 204 may include a status bar and a percentage of the status bar being filled indicating the progress of the process being performed by data processing system 100.
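By way of illustration, the status-bar case above may be sketched as follows. This is a minimal, hypothetical example (the pixel values, bar region, and brightness threshold are not taken from any embodiment); a real implementation would operate on captured frame data.

```python
# Illustrative sketch: estimating progress from a status-bar region of a
# screenshot. Pixel values and the fill threshold are hypothetical.

def status_bar_fill(pixels, threshold=128):
    """Estimate the filled fraction of a horizontal status bar.

    `pixels` is one row of grayscale values (0-255) spanning the bar;
    columns brighter than `threshold` are treated as 'filled'.
    """
    if not pixels:
        return 0.0
    filled = sum(1 for p in pixels if p > threshold)
    return filled / len(pixels)

# Example: a 10-pixel bar whose left 6 pixels are bright (filled).
row = [255] * 6 + [0] * 4
progress = status_bar_fill(row)  # 0.6, i.e., the bar is ~60% filled
```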
To obtain the image (e.g., 204), image capture process 202 may be performed. Image capture process 202 may include receiving image request 200 to obtain a screenshot of the graphical user interface displayed on a display of the data processing system. During image capture process 202, management controller (e.g., 152 shown in
After performing the various collection processes for the in-band components, the collected data (e.g., image derived data 208, telemetry data 218, and/or hardware data 224) may be used to infer the operating state of the process being performed by the data processing system at a point in time.
To infer the operating state of the data processing system (e.g., process state 226), image derived data 208, telemetry data 218, and/or hardware data 224 may be used in inference generation process 210. During inference generation process 210, different types of data relating to in-band components may be ingested into an inference model (e.g., trained inference model 212). Inference generation process 210 may include ingesting telemetry data 218, hardware data 224, and/or image derived data 208 into trained inference model 212 to obtain an inferred state of the process being performed by the data processing system, which may be stored as process state 226. In other words, an inferred progression state of the process may be obtained because the process may not report its state.
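The ingestion of the different in-band data types into one inference may be sketched as follows. The simple rule-based classifier stands in for the trained inference model, and the field names and thresholds are hypothetical illustrations only.

```python
# Illustrative sketch of inference generation: features derived from the
# screenshot, telemetry, and hardware data are combined into one feature
# vector and passed to a model. The rule below is a hypothetical stand-in
# for a trained inference model.

def build_feature_vector(image_derived, telemetry, hardware):
    """Flatten the three in-band data sources into a single feature vector."""
    return [
        image_derived["progress_fraction"],  # e.g., status-bar fill, 0.0-1.0
        telemetry["cpu_temp_c"],             # processor temperature sample
        hardware["cpu_count"],               # contributing hardware resources
    ]

def infer_process_state(features):
    """Stand-in for the trained model: classify current progress."""
    progress, cpu_temp, _ = features
    if cpu_temp > 95.0 or progress == 0.0:
        return "abnormal"
    return "nominal"

features = build_feature_vector(
    {"progress_fraction": 0.6},
    {"cpu_temp_c": 71.0},
    {"cpu_count": 2},
)
state = infer_process_state(features)  # "nominal"
```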
Process state 226 may indicate whether the process performed by the data processing system (e.g., 100) is progressing as expected (e.g., the process is being performed nominally for a process of its type given the hardware resources of the system performing the process) or is performing abnormally (e.g., stalled, failed, progressing slower than expected, etc.). Process state 226 may identify, for example, whether the operating state of the data processing system is aligned with the expectations (e.g., performance, timing, etc.) for a similar type of process being performed by data processing systems with similar hardware resources at a specific point in time during performance of the process.
Once obtained, process state 226 (and other process states for other points in time) may be used to identify progression of the process in the future (e.g., a future operating state of the data processing system). To identify progression of the process in the future, process progression process 228 may be performed. During process progression process 228, the inferred process states may be ingested by a second inference model (e.g., a different inference model than the one used in inference generation process 210) to predict the process state of the process in the future. Refer to
Once obtained, the future process state may be used to manage the continued progression of the process being performed by the data processing system to facilitate completion of the process. To manage the continued progression of the process, process management process 230 may be performed. During process management process 230, the inferred future process state (e.g., progression) may be provided to an external device to facilitate management of the continued progression of the process by an individual, may be used to automatically initiate activity to change the progression (e.g., to avoid having the prediction come true), and/or may be used as a basis for performing other types of actions for managing continued progression of the process to facilitate completion of the process being performed by the data processing system. For example, management controller 152 may provide the predicted process state to management system 102 (shown in
For example, in a scenario in which the predicted process state indicates that performance of the process will not be completed by a point in time in the future (e.g., failure of completion of the process), various actions may be performed to modify operation of the data processing system by the management controller (e.g., 152) (e.g., as initiated by process management process 230). In this example, management controller 152 may communicate the future failure of the process to complete by the predicted point in time to an external device operated by an individual to facilitate management of actions to manage the continued performance of the process to facilitate completion of the process being performed by the data processing system. The management controller may perform management actions indicated by an individual (e.g., via an external operating system) to modify the operation of the hardware resources (e.g., hardware resources 150 shown in
In this example, to perform the actions, management controller 152 may receive management actions (e.g., communicated by an external device using sideband channels 172 shown in
Continuing with the example, if the predicted process state indicates that the process will be completed at a point in time in the future (e.g., successful completion of the process by the data processing system), then process management process 230 may include providing a message (e.g., via management controller 152;
Thus, using the data flow illustrated in
Turning to
To predict the future process states, any number of inferred process states (e.g., 226A-226N) associated with various points in time may be obtained.
Once obtained, the process states may be used to predict a future process state of the process.
To predict the future process state of the process, the process states may be ingested by inference generation process 262. During inference generation process 262, multiple inferred process states (e.g., at least two inferred process states of a process) at different points in time may be ingested into an inference model (e.g., trained inference model 260).
Trained inference model 260 may be obtained by reading the inference model from storage, receiving the inference model from another device, generating the inference model, and/or via other processes. Refer to
Once the process states are ingested, trained inference model 260 may output predicted future process state 264 for the process. The predicted future process state may indicate the process state (e.g., stalled, failed, on track, etc.) of the process at a future point in time.
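The mapping from a sequence of inferred process states to a predicted future state may be sketched as follows. The rule below is a hypothetical stand-in for trained inference model 260, and the state labels are illustrative only.

```python
# Illustrative sketch of the second (progression) model: a sequence of
# inferred process states at successive points in time is mapped to a
# predicted future state. This simple rule stands in for a trained model.

def predict_future_state(process_states):
    """Predict the future state from at least two inferred process states."""
    if len(process_states) < 2:
        raise ValueError("at least two inferred process states are required")
    recent = process_states[-2:]
    if all(s == "abnormal" for s in recent):
        return "failed"    # consecutive abnormal states suggest failure
    if recent[-1] == "abnormal":
        return "stalled"   # a single recent abnormal state suggests a stall
    return "on track"

prediction = predict_future_state(["nominal", "nominal", "abnormal"])  # "stalled"
```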
Turning to
With reference to
To obtain training data 242, training data generation process 240 may be performed. During training data generation process 240, information relating to progress of processes being performed (e.g., by hardware resources 150 shown in
To obtain image data 232, schedule data 234, and/or hardware data 236, an out-of-band component (e.g., a management controller) of the data processing system (e.g., performing the process) may communicate with in-band components of the data processing system.
Image data 232 may include a screenshot of a graphical user interface displayed on a display of a data processing system during performance of the process at a point in time. The screenshot may include information regarding progress of the process being shown on the display of the data processing system. For example, a management controller may communicate a request for data corresponding to information being shown on a display to the graphics adapter (e.g., and/or another type of hardware component that manages display of information on a display). For example, management controller 152 may read information relating to values of the pixels in the display (e.g., provided by the graphics adapter) and use the information to construct an image (e.g., screenshot) of what would be displayed on a display. Image data 232 may include a screenshot of any type of information regarding progress of the process being performed (e.g., which may be displayed on a display of the data processing system). For example, image data 232 may include a status bar and a percentage of the status bar being filled indicating the progression of the process at the point in time in which the information is obtained from the graphics adapter and/or another type of hardware device that manages the display of information on a display.
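The construction of a screenshot from pixel values read out by a management controller may be sketched as follows. The flat row-major buffer layout and the dimensions are hypothetical; actual frame-buffer formats vary by graphics adapter.

```python
# Illustrative sketch of screenshot construction: a flat frame buffer of
# pixel values (as might be read out-of-band from a graphics adapter) is
# reshaped into rows forming an image. Layout and sizes are hypothetical.

def framebuffer_to_image(buffer, width, height):
    """Reshape a flat pixel buffer into a height x width image (row-major)."""
    if len(buffer) != width * height:
        raise ValueError("buffer size does not match dimensions")
    return [buffer[r * width:(r + 1) * width] for r in range(height)]

# Example: a 4x2 'display' read as one flat buffer of grayscale values.
flat = [0, 0, 255, 255, 0, 0, 255, 255]
image = framebuffer_to_image(flat, width=4, height=2)
# image[0] -> [0, 0, 255, 255]
```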
While described with respect to just image data 232, image derived data may also be used as a feature of the training data. Consequently, trained inference models may use image data and/or image derived data to predict process states.
Schedule data 234 may include information regarding a duration of time that has passed since the start of the process for the point in time at which the other data is being collected (e.g., image data 232, hardware data 236, etc.). For example, a management controller of the data processing system may obtain schedule data 234 at various points in time at which image data 232 and/or hardware data 236 is collected to be used in identifying a relationship between progression of the process being performed over a period of time. Schedule data 234 may be used to obtain an expected progression of the process within a time interval.
Hardware data 236 may include any information relating to the hardware resources that are contributing to the performance of a process and/or telemetry data (e.g., health related data) relating to the hardware resources of the data processing system. In some instances, a data processing system may include different hardware resources (e.g., processors, storage devices, etc.) that may contribute to performance of processes in different ways. For example, a data processing system with a high performance processor may perform processes at faster speeds than a low performance processor. As such, the parameters of hardware resources contributing to performance of a process may be obtained and used during training data generation process 240 to identify any variance of progression of processes based on hardware resources of the data processing system.
In addition, hardware data 236 may include information relating to the telemetry data of the hardware resources while performing a process at different points in time to identify a range of performance measurements for different instances of processes being performed. The telemetry data may include measurements of characteristics of the hardware resources of the data processing system. For example, hardware data 236 may include temperature data of a processor while performing a process (e.g., at a point in time). Hardware data 236 may be collected over a period of time while the data processing system is performing a process. Hardware data 236 may be used to identify a range of performance metrics that may indicate if any performance related issues for the hardware resources contribute to the progress of the process. For example, a management controller may collect temperature data of the processor while performing a process (e.g., over a period of time) in both successful and unsuccessful installations of the process. In this instance, the temperature data of the processor may be used to define a temperature range within which the processor should remain in order to successfully complete installation of a process. Any temperature data identifying a temperature outside the temperature range may indicate an error or issue with performance of the processor.
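The derivation of such a temperature range, and the flagging of later out-of-range samples, may be sketched as follows. The sample values and the margin parameter are hypothetical illustrations.

```python
# Illustrative sketch: deriving an acceptable temperature range from
# telemetry collected during successful performances of a process, then
# flagging later samples that fall outside it. Values are hypothetical.

def temperature_range(successful_samples, margin=2.0):
    """Define the acceptable range from temperatures seen in successful runs."""
    return (min(successful_samples) - margin, max(successful_samples) + margin)

def out_of_range(sample, bounds):
    """Return True if a telemetry sample falls outside the derived range."""
    low, high = bounds
    return sample < low or sample > high

bounds = temperature_range([62.0, 70.5, 68.0])  # (60.0, 72.5)
alert = out_of_range(88.0, bounds)              # True: possible issue
```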
State data 238 may be the progress state ascribed to the process corresponding to image data 232, schedule data 234, and hardware data 236. Thus, state data 238 may serve as a label, with image data 232, schedule data 234, and hardware data 236 being features for this label. State data 238 may be provided by a subject matter expert who reviews the features and defines the label in terms of the features.
Training data 242 may then be obtained by associating image data 232, schedule data 234, and hardware data 236 as features with the corresponding label of state data 238.
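The association of features with a label to form one training-data entry may be sketched as follows. The field names and sample values are hypothetical.

```python
# Illustrative sketch of training data generation: each training sample
# associates the collected features (image, schedule, and hardware data)
# with the state label supplied by a subject matter expert.

def make_training_sample(image_data, schedule_data, hardware_data, state_label):
    """Associate features with their label as one training-data entry."""
    return {
        "features": {
            "image": image_data,
            "schedule": schedule_data,
            "hardware": hardware_data,
        },
        "label": state_label,
    }

sample = make_training_sample(
    image_data={"progress_fraction": 0.4},
    schedule_data={"elapsed_s": 120},
    hardware_data={"cpu_temp_c": 66.0},
    state_label="nominal",
)
```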
Thus, training data generation process 240 may result in the generation of training data 242. Accordingly, an inference model may be trained using training data 242 (e.g., to set the weights of neurons and/or other features of the inference model) to predict whether a process is progressing as expected for the instance of the process being performed (e.g., nominally or abnormally).
To obtain trained inference model 212, inference model training process 244 may be performed. During inference model training process 244, training data 242 may be ingested into a machine learning model (e.g., a deep learning model or other type of model). Once ingested, the weights and structure of the machine learning model may be adapted (e.g., values for nodes of intermediate layers may be selected, connections may be pruned/added, etc.) to generate inferences based on image data 232, schedule data 234, hardware data 236, and/or state data 238 (e.g., and/or other training data; training data 242 may include any amount of training data, and state data 238, image data 232, schedule data 234, and hardware data 236 may be one feature-label association of training data 242).
Trained inference model 212 may generate inferences indicating an expected progression status of a process (e.g., an inferred operating state of a device) being performed by a data processing system (e.g., while the data processing system is unable or unwilling to report its operating state) based on new instances of measured image data, schedule data, and hardware data (e.g., in aggregate, “process data”). For example, trained inference model 212 may ingest process data relating to progress of a process at a point in time (e.g., obtained from in-band components of a data processing system), such as image derived data (e.g., 208 shown in
Accordingly, the obtained trained inference model 212 may be used, as part of the first data flow shown in
Turning to
With reference to
To obtain training data 256, training data generation process 254 may be performed. During training data generation process 254, information relating to current progression status of processes being performed (e.g., by hardware resources 150 shown in
To ascertain the progression status for a process at a point in time in the future, patterns or sequences of process states (e.g., current progression status of a process at different points in time) may be identified. These patterns or sequences may be identified (i) through computer automation such as use of an inference model (e.g., machine learning model) and/or (ii) in cooperation with a person that may be a subject matter expert (the inference model may be treated as a subject matter expert in automated approaches).
Through computer automation, the inference model may take an inferred process state (e.g., process state 226 shown in
To perform training data generation process 254, process states data 252 may be obtained. Process states data 252 may include any number of process states (e.g., inferred process states) obtained at different points in time during performance of a process by a data processing system (e.g., data processing system 100 shown in
Process states data 252 may include information relating to current progression status of a process (e.g., whether the current progress of the process is nominal or abnormal) at a point in time while the process is being performed by the data processing system. For example, a first inferred process state may be obtained using a first screenshot at a first point in time, and a second inferred process state may be obtained using a second screenshot at a second point in time. In this example, the first inferred process state may indicate the current progression status of the process at the first point in time is performing nominally and the second inferred process state may indicate the current progression status of the process at the second point in time is performing abnormally. The first inferred process state and the second inferred process state (e.g., process states data 252) of the process being performed by the data processing system may be used to identify a pattern or schedule of progress of the process that leads to a certain outcome of the process (e.g., progression data 250).
To obtain progression data 250, a subject matter expert (e.g., computer engineer or another inference model) may review the inferred process states (e.g., process states data 252) for a process over a period of time and generate an identification of a future progression status for the process based on the schedule of progression for the process. For example, the subject matter expert may review a sequence of inferred process states (e.g., current progression status at different points in time) for a process being performed by a data processing system and assign a label for the pattern of the inferred process states over a time interval that indicates whether performance of the process may be completed at a point in time in the future.
In some instances, the label for sequences of inferred process states (e.g., progression data 250) may differentiate future progression statuses (e.g., future operating states) of a process being performed. For example, the label for a schedule of inferred process states may include identification of anomalies in the performance of the operating state at a point in time in the future such as “stalled installation state”, “frozen installation state”, “time-delayed installation state”, “failure installation state”, and/or any other identifying labels of the future progression status of the process (e.g., outcome or result of the process being performed).
Training data 256 may then be obtained by associating process states data 252 (e.g., inferred process states at different points in time) as a progression status schedule with the corresponding label of progression data 250.
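The pairing of a progression-status schedule with its expert-assigned label may be sketched as follows. The state names and the label string are hypothetical illustrations.

```python
# Illustrative sketch: a sequence of inferred process states over a time
# interval is associated with the label a subject matter expert assigns
# to that pattern, forming one feature-label entry of the training data.

def label_state_sequence(states, label):
    """Pair a progression-status schedule with its future-state label."""
    return {"schedule": list(states), "label": label}

sequence_sample = label_state_sequence(
    ["nominal", "abnormal", "abnormal"],
    "stalled installation state",
)
```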
Thus, training data generation process 254 may result in the generation of training data 256. Accordingly, an inference model may be trained using training data 256 (e.g., to set the weights of neurons and/or other features of the inference model) to predict progression of a process in the future.
To obtain trained inference model 260, inference model training process 258 may be performed. During inference model training process 258, training data 256 may be ingested into a machine learning model (e.g., a deep learning model or other type of model). Once ingested, the weights and structure of the machine learning model may be adapted (e.g., values for nodes of intermediate layers may be selected, connections may be pruned/added, etc.) to generate inferences based on progression data 250 and/or process states data 252 (e.g., and/or other training data; training data 256 may include any amount of training data, and progression data 250 and process states data 252 may be one feature-label association of training data 256).
Trained inference model 260 may generate inferences indicating a future progression status of a process (e.g., a future operating state of a device) being performed by a data processing system based on new instances of measured progression data 250 and process states data 252. For example, trained inference model 260 may ingest process state data relating to current progression status of a process at a point in time (e.g., obtained using an inference model), such as process state 226A (e.g., shown in
Accordingly, the obtained trained inference model 260 may be used, as part of the data flow shown in
As discussed above, the components of
Turning to
At operation 300, at least two inferred process states for a process being performed by the host device may be obtained. The at least two inferred process states may be obtained by (i) reading the at least two inferred process states from storage, (ii) receiving the at least two inferred process states from another device, (iii) generating the at least two inferred process states, and/or (iv) via other methods.
The at least two inferred process states may be obtained via generation by (i) obtaining a first inferred process state using a first inference model and at least a first screenshot and/or (ii) obtaining a second inferred process state using the first inference model and at least a second screenshot. Obtaining the first inferred process state may be facilitated by (i) obtaining a first inference model, (ii) obtaining at least the first screenshot, (iii) obtaining telemetry data for the hardware resources (e.g., measurements of characteristics of the hardware resources while the host device is performing the process), (iv) obtaining hardware data for the hardware resources (e.g., data specifying the hardware resources that are contributing to performance of the process), (v) ingesting at least the first screenshot, the telemetry data, and the hardware data into the first inference model, (vi) generating, using the first inference model, the first inferred process state based on the first screenshot, the telemetry data, and the hardware data, and/or (vii) any other methods.
The first inference model may be obtained by (i) receiving it from an external device, and/or (ii) via generation. The first inference model may be obtained via generation by training a machine learning model using training data (e.g., feature-label associations of previously collected screenshots, telemetry data, and hardware data with corresponding state labels). Refer to
The first screenshot (and/or any additional screenshots at different points in time) may be obtained by a management controller communicating with any of the in-band components of the host device to obtain the information regarding the progress of the process being performed by the host device. For example, the management controller may read information from a graphics adapter and/or another type of hardware component of the host device that manages display of information on a display of an interface of the host device. The information may be kept, for example, in a data storage structure (e.g., frame buffer) that defines information for pixels in a display and the management controller may read the information from the frame buffer in order to obtain a first screenshot of the information being displayed on a display of the host device.
Obtaining the second inferred process state may be performed in a similar manner to that of the first inferred process state. The second inferred process state may be based on a second screenshot of the interface taken at a second point in time (e.g., while the host device is performing the process). The second inferred process state may be obtained by (i) obtaining a second screenshot (e.g., information regarding the progress of the process at a second point in time obtained via the management controller), (ii) obtaining telemetry data (e.g., at a second point in time), (iii) obtaining hardware data (e.g., at a second point in time), (iv) ingesting the second screenshot, the telemetry data, and the hardware data into the first inference model, and/or (v) generating, using the first inference model, the second inferred process state.
At operation 302, a first inference that indicates progression of the process in the future may be obtained. The first inference may be obtained by (i) receiving the first inference from another device, and/or (ii) via generation. The first inference may be obtained via generation by (i) obtaining a second inference model, (ii) ingesting the at least two inferred process states in the second inference model, and/or (iii) generating, using the second inference model, the first inference based on the at least two inferred process states. The at least two inferred process states may indicate the progression of the process at two different periods of time.
At operation 304, the continued performance of the process may be managed in order to facilitate completion of the process to obtain an updated host device. The continued performance of the process may be managed based on the progression of the process. The progression of the process (e.g., the first inference) may be obtained using the second inference model (as discussed above in operation 302) and may indicate whether the process is complete at a point in time in the future.
Managing continued performance of the process may include, in an instance of the managing of the continued performance where the progression indicates that the process is not complete at the point in time in the future: (i) obtaining instructions for a management controller of the host device to perform a restart of the host device, and/or (ii) performing actions to restart the performance of the process by the management controller. For example, management system 102 (e.g., shown in
Managing continued performance of the process may further include, in an instance of the managing of the continued performance where the progression indicates that the process is not complete at the point in time in the future: (i) obtaining management actions for a management controller of the host device, and/or (ii) performing, by the management controller, the management actions to obtain updated hardware resources to facilitate continued performance of the process. Performing the management actions may update the operation of the host device thereby obtaining an updated host device.
At operation 306, computer-implemented services may be provided using the updated host device. The computer-implemented services may be provided by the host device by continued operation of the updated host device. The continued operation of the updated host device, when compared to the operation of the original host device, may improve the likelihood that the process being performed will be completed in the future.
Any of the components illustrated in
In one embodiment, system 400 includes processor 401, memory 403, and devices 405-407 connected via a bus or an interconnect 410. Processor 401 may represent a single processor or multiple processors with a single processor core or multiple processor cores included therein. Processor 401 may represent one or more general-purpose processors such as a microprocessor, a central processing unit (CPU), or the like. More particularly, processor 401 may be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processor 401 may also be one or more special-purpose processors such as an application specific integrated circuit (ASIC), a cellular or baseband processor, a field programmable gate array (FPGA), a digital signal processor (DSP), a network processor, a graphics processor, a communications processor, a cryptographic processor, a co-processor, an embedded processor, or any other type of logic capable of processing instructions.
Processor 401 may communicate with memory 403, which in one embodiment can be implemented via multiple memory devices to provide for a given amount of system memory. Memory 403 may include one or more volatile storage (or memory) devices such as random access memory (RAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), static RAM (SRAM), or other types of storage devices. Memory 403 may store information including sequences of instructions that are executed by processor 401, or any other device. For example, executable code and/or data of a variety of operating systems, device drivers, firmware (e.g., basic input/output system or BIOS), and/or applications can be loaded in memory 403 and executed by processor 401. An operating system can be any kind of operating system, such as, for example, Windows® operating system from Microsoft®, Mac OS®/iOS® from Apple, Android® from Google®, Linux®, Unix®, or other real-time or embedded operating systems such as VxWorks.
System 400 may further include IO devices such as devices (e.g., 405, 406, 407, 408) including network interface device(s) 405, optional input device(s) 406, and other optional IO device(s) 407. Network interface device(s) 405 may include a wireless transceiver and/or a network interface card (NIC). The wireless transceiver may be a WiFi transceiver, an infrared transceiver, a Bluetooth transceiver, a WiMax transceiver, a wireless cellular telephony transceiver, a satellite transceiver (e.g., a global positioning system (GPS) transceiver), or other radio frequency (RF) transceivers, or a combination thereof. The NIC may be an Ethernet card.
Input device(s) 406 may include a mouse, a touch pad, a touch sensitive screen (which may be integrated with a display device of optional graphics subsystem 404), a pointer device such as a stylus, and/or a keyboard (e.g., physical keyboard or a virtual keyboard displayed as part of a touch sensitive screen). For example, input device(s) 406 may include a touch screen controller coupled to a touch screen. The touch screen and touch screen controller can, for example, detect contact and movement or break thereof using any of a plurality of touch sensitivity technologies, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with the touch screen.
IO devices 407 may include an audio device. An audio device may include a speaker and/or a microphone to facilitate voice-enabled functions, such as voice recognition, voice replication, digital recording, and/or telephony functions. Other IO devices 407 may further include universal serial bus (USB) port(s), parallel port(s), serial port(s), a printer, a network interface, a bus bridge (e.g., a PCI-PCI bridge), sensor(s) (e.g., a motion sensor such as an accelerometer, gyroscope, a magnetometer, a light sensor, compass, a proximity sensor, etc.), or a combination thereof. IO device(s) 407 may further include an imaging processing subsystem (e.g., a camera), which may include an optical sensor, such as a charged coupled device (CCD) or a complementary metal-oxide semiconductor (CMOS) optical sensor, utilized to facilitate camera functions, such as recording photographs and video clips. Certain sensors may be coupled to interconnect 410 via a sensor hub (not shown), while other devices such as a keyboard or thermal sensor may be controlled by an embedded controller (not shown), dependent upon the specific configuration or design of system 400.
To provide for persistent storage of information such as data, applications, one or more operating systems and so forth, a mass storage (not shown) may also couple to processor 401. In various embodiments, to enable a thinner and lighter system design as well as to improve system responsiveness, this mass storage may be implemented via a solid state device (SSD). However, in other embodiments, the mass storage may primarily be implemented using a hard disk drive (HDD) with a smaller amount of SSD storage to act as an SSD cache to enable non-volatile storage of context state and other such information during power down events so that a fast power up can occur on re-initiation of system activities. Also a flash device may be coupled to processor 401, e.g., via a serial peripheral interface (SPI). This flash device may provide for non-volatile storage of system software, including a basic input/output software (BIOS) as well as other firmware of the system.
Storage device 408 may include computer-readable storage medium 409 (also known as a machine-readable storage medium or a computer-readable medium) on which is stored one or more sets of instructions or software (e.g., processing module, unit, and/or processing module/unit/logic 428) embodying any one or more of the methodologies or functions described herein. Processing module/unit/logic 428 may represent any of the components described above. Processing module/unit/logic 428 may also reside, completely or at least partially, within memory 403 and/or within processor 401 during execution thereof by system 400, memory 403 and processor 401 also constituting machine-accessible storage media. Processing module/unit/logic 428 may further be transmitted or received over a network via network interface device(s) 405.
Computer-readable storage medium 409 may also be used to persistently store some of the software functionalities described above. While computer-readable storage medium 409 is shown in an exemplary embodiment to be a single medium, the term “computer-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “computer-readable storage medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of embodiments disclosed herein. The term “computer-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media, or any other non-transitory machine-readable medium.
Processing module/unit/logic 428, components and other features described herein can be implemented as discrete hardware components or integrated in the functionality of hardware components such as ASICs, FPGAs, DSPs or similar devices. In addition, processing module/unit/logic 428 can be implemented as firmware or functional circuitry within hardware devices. Further, processing module/unit/logic 428 can be implemented in any combination of hardware devices and software components.
Note that while system 400 is illustrated with various components of a data processing system, it is not intended to represent any particular architecture or manner of interconnecting the components, because such details are not germane to embodiments disclosed herein. It will also be appreciated that network computers, handheld computers, mobile phones, servers, and/or other data processing systems, which have fewer or more components, may also be used with embodiments disclosed herein.
Some portions of the preceding detailed descriptions have been presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as those set forth in the claims below, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Embodiments disclosed herein also relate to an apparatus for performing the operations herein. Such an apparatus may operate under the control of a computer program stored in a non-transitory computer-readable medium. A non-transitory machine-readable medium includes any mechanism for storing information in a form readable by a machine (e.g., a computer). For example, a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium (e.g., read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, and flash memory devices).
The processes or methods depicted in the preceding figures may be performed by processing logic that comprises hardware (e.g., circuitry, dedicated logic, etc.), software (e.g., embodied on a non-transitory computer readable medium), or a combination of both. Although the processes or methods are described above in terms of some sequential operations, it should be appreciated that some of the operations described may be performed in a different order. Moreover, some operations may be performed in parallel rather than sequentially.
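The point above, that independent operations of a described method may run in parallel rather than in the order in which they were presented, can be illustrated with a small sketch. The operation names below are hypothetical placeholders, not operations from the disclosed processes.

```python
from concurrent.futures import ThreadPoolExecutor

# Two independent operations of a method (hypothetical placeholders).
def operation_a():
    return "a-done"

def operation_b():
    return "b-done"

# Submit both operations at once; the executor may interleave or
# overlap their execution rather than running them sequentially.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(operation_a), pool.submit(operation_b)]
    results = [f.result() for f in futures]
```

Because the operations are independent, the overall result is the same regardless of which one completes first.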
Embodiments disclosed herein are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of embodiments disclosed herein.
In the foregoing specification, embodiments have been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the embodiments disclosed herein as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.