Embodiments described herein generally relate to the field of programmable devices. More particularly, embodiments described herein relate to repairing or recovering a failed computer program (e.g., software, firmware, etc.) installed on one or more interconnected programmable devices.
Programmable devices—such as internet of things (IoT) devices, mobile computing devices, cloud computing devices, logical computing devices, virtual computing devices—can make up a computer system comprised of interconnected programmable devices. In such a computer system, each programmable device includes one or more computer programs (e.g., software, firmware, etc.) for performing its operations and functionalities.
As improvements in technology continue to make programmable devices more accessible and efficient, the number of interconnected programmable devices might increase. Consequently, some computer systems may include numerous interconnected programmable devices (e.g., tens, hundreds, thousands, millions, billions, etc.). In such systems, one problem that could arise is a scalability problem. This problem may occur when one or more programmable devices fail due to one or more faulty computer programs installed thereon, which in turn results in a need for recovery or repair of the faulty computer program(s) installed thereon. One current approach taken by an enterprise information technology (IT) system that services a computer system comprised of interconnected devices relies on a central configuration server to update a failed programmable device of the system with a known, good image of a computer program (e.g., software, firmware, etc.) installed on the device. In this current approach, the user of the programmable device plays an important role in notifying the central configuration server when a problem occurs (e.g., when the programmable device fails, etc.). For example, a user of a failed programmable device may open a service call ticket to be serviced by a service facility, and talk to an agent from the service facility who diagnoses the problem and recommends a repair action.
As the number of interconnected programmable devices that make up a computer system increases, these devices may become too numerous for the approach described in the preceding paragraph to work. This is because service facilities may not have enough resources to service the numerous devices that could fail. The problem described in this paragraph is further compounded by a potential lack of user interface capabilities on the programmable devices (or on computing systems that are available to the users of failed devices), which could prevent facilitating detection, diagnostics, and repair of the users' devices. An inability to resolve failed devices can, in turn, cause a negative impact on the availability of one or more interconnected devices. Consequently, a lack of resources to enable recovery and repair of computer programs installed on one or more interconnected programmable devices of a computer system may add risks to the operational integrity of the computer system.
The problem described above is also compounded in computer systems comprised of interconnected programmable devices because such systems rely on centralized communication models, otherwise known as the server/client model. The servers used in the server/client model are potential bottlenecks and failure points that can disrupt the functioning of an entire computer system. Additionally, these servers are vulnerable to security compromises (e.g., man-in-the-middle attacks, etc.) because all data associated with the multiple devices of the computer system must pass through the servers. Consequently, a server tasked with recovery or repair of computer programs installed on a failed programmable device may fail, which is undesirable.
Embodiments described herein relate to recovering or repairing computer program(s) installed on one or more interconnected programmable devices of a computer system using a distributed ledger that is available to multiple devices of the computer system. The embodiments described herein have numerous advantages, which are directed to improving computer functionality. One advantage of the embodiments described herein is that these embodiments can assist with addressing the scalability problem described in the background section of this document. For example, one or more of the embodiments described herein can make a failed programmable device in a computer system comprised of interconnected programmable devices auto-recoverable using a distributed ledger that is available to multiple programmable devices in the computer system. For this example, the distributed ledger facilitates a device-driven recovery, failover, or replacement strategy, which may be referred to herein as a “self-reliant” strategy or “self-reliance.” Another advantage of the embodiments described herein is that such embodiments can provide an alternative to the central communication model of repairing or recovering computer programs (i.e., the client/server model). 
Furthermore, at least one of the embodiments described herein can assist with one or more of the following: (i) minimizing or eliminating failure rates of devices in a computer system comprised of interconnected programmable devices, which in turn assist with preventing other devices in the system from becoming disabled; (ii) minimizing or eliminating risks to the operational integrity of a computer system comprised of interconnected programmable devices caused by failed devices; (iii) minimizing or eliminating the use of servers as the only watchdog devices used for recovering or repairing computer programs installed on interconnected programmable devices of a computer system because such servers are potential bottlenecks and failure points that can disrupt the functioning of an entire computer system; and (iv) minimizing or eliminating vulnerabilities caused by security compromises (e.g., man-in-the-middle attacks, etc.) because the data associated with the multiple interconnected devices of a computer system does not have to be communicated using a centralized communication model.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. It will be apparent, however, to one skilled in the art that the embodiments described herein may be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order to avoid obscuring the embodiments described herein. References to numbers without subscripts or suffixes are understood to reference all instances of subscripts and suffixes corresponding to the referenced number. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter in the embodiments described herein. As such, resort to the claims is necessary to determine the inventive subject matter in the embodiments described herein. Reference in the specification to “one embodiment,” “an embodiment,” “another embodiment,” or their variations means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one of the embodiments described herein, and multiple references to “one embodiment,” “an embodiment,” “another embodiment,” or their variations should not be understood as necessarily all referring to the same embodiment.
As used herein, the term “programmable device” and its variations refer to a physical object that includes electronic components configured to receive, transmit, and/or process data. For one embodiment, one or more of the electronic components may be embedded within the physical object, such as in wearable devices and mobile devices (e.g., self-driving vehicles). For one embodiment, the device may also include actuators, motors, control functions, sensors, and/or other components to perform one or more tasks without human intervention, such as drones, self-driving vehicles, and/or automated transporters. The programmable device can refer to a computing device, such as (but not limited to) a mobile computing device, a laptop computer, a wearable computing device, a network device, an internet of things (IoT) device, a cloud computing device, a vehicle, a smart lock, etc.
As used herein, the terms a “program,” a “computer program,” and their variations refer to one or more computer instructions that are executed by a programmable device to perform a task. Examples include, but are not limited to, software and firmware.
As used herein, “software recovery services,” “software recovery,” “software repair,” “recovery,” “repair,” and their variations refer to modification, re-installation, and/or deletion of a computer program installed on a programmable device to a known, good configuration of the computer program. For brevity, the terms “software recovery” or “software recovery services” will be used to refer to “software recovery services,” “software recovery,” “software repair,” “recovery,” and “repair,” as described herein. Software recovery services include, but are not limited to, a rollback operation to roll back a computer program that is currently installed on a programmable device to the last known, good configuration of the computer program. Examples of rolling back a computer program include, but are not limited to, a major version rollback, a minor version rollback, a patch, a hotfix, a maintenance release, and a service pack. As such, rolling back a computer program includes moving from a version of a computer program to another version, as well as moving from one state of a version of a computer program to another state of the same version of the computer program. Rollbacks can be used for fixing security vulnerabilities and other bugs, improving the device's functionality by adding new features, improving power consumption and performance, repairing failed programmable devices, etc. Rollbacks may be viewed as important features in the lifecycles of programmable devices. Additional details about software recovery services are described below in connection with one or more of
As used herein, the term “a computer system” can refer to a single programmable device or a plurality of programmable devices working together to perform a function or an operation described as being performed on or by a computer system. For one embodiment of a computer system comprised of multiple programmable devices, one or more of the devices can perform at least one function or at least one operation that is different from one or more functions or operations that are performed by one or more other devices of the system. For one example, a first device of a computer system can perform a first function or operation that differs from a second function or operation performed by a second device of the computer system. For another embodiment of a computer system comprised of multiple programmable devices, one or more of the devices can have at least one function or at least one operation performed on it that is different from one or more functions or operations that are performed on one or more other devices of the system. For example, a first device of a computer system can have a first function or operation performed on it that differs from a second function or operation that is performed on a second device of the computer system.
As used herein, a “computer network,” a “network,” and their variations refer to a plurality of interconnected programmable devices that can exchange data with each other. For example, a computer network can enable the devices of a computer system comprised of interconnected programmable devices to communicate with each other. Examples of computer networks include, but are not limited to, a peer-to-peer network, any type of data network such as a local area network (LAN), a wide area network (WAN) such as the Internet, a fiber network, a storage network, or a combination thereof, wired or wireless. In a computer network, interconnected programmable devices exchange data with each other using a communication mechanism, which refers to one or more facilities that allow communication between devices in the network. The connections between interconnected programmable devices are established using either wired or wireless communication links. The communication mechanisms also include networking hardware (e.g., switches, gateways, routers, network bridges, modems, wireless access points, networking cables, line drivers, hubs, repeaters, etc.).
As used herein, a “watchdog system,” a “watchdog device,” a “watchdog,” and their variations refer to hardware (e.g., one or more processing units, electronic circuitry, etc.), software (e.g., a computer program executed by one or more processing units or electronic circuitry, etc.), or a combination of both that sends out messages (e.g., a signal, a ping packet, etc.) to a programmable device on a periodic basis with the aim of receiving a response from the programmable device. When the watchdog does not receive a response to its message from the programmable device within a predetermined period of time, the watchdog device can initiate one or more software recovery services for the device that failed to respond to the watchdog device as described in connection with one or more of the embodiments set forth herein. The predetermined period of time can be measured from the time when the watchdog message was transmitted by the watchdog device or from the time when the watchdog message was received by the client device. For one embodiment, the watchdog device is a programmable device configured to perform the operations described in this paragraph.
As used herein, a “watchdog message,” a “watchdog ping,” a “message,” a “ping,” and their variations refer to a signal that is sent by a watchdog device to a programmable device in a computer system comprised of interconnected programmable devices, which the programmable device must respond to within a predetermined amount of time to indicate that a computer program installed on the programmable device is operating without fault (e.g., the program is operating as expected, etc.). A response to the watchdog message may also be referred to herein as a “watchdog response message.”
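The watchdog behavior defined in the two preceding paragraphs can be illustrated with a minimal sketch. The class name `Watchdog`, its method names, and the `on_failure` callback are hypothetical and do not appear in the embodiments described herein; the sketch assumes the predetermined period of time is measured from when the watchdog message was transmitted, and it takes timestamps as arguments so that no real network or clock is required.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Watchdog:
    """Hypothetical watchdog that pings devices and flags missed responses."""

    timeout_s: float                   # the "predetermined period of time"
    on_failure: Callable[[str], None]  # e.g., initiate software recovery services
    _pending: Dict[str, float] = field(default_factory=dict)  # device -> send time

    def send_ping(self, device_id: str, now: float) -> None:
        # Record when the watchdog message was transmitted.
        self._pending[device_id] = now

    def receive_response(self, device_id: str, now: float) -> bool:
        # A response counts only if it arrives inside the timeout window.
        sent = self._pending.get(device_id)
        if sent is None or now - sent > self.timeout_s:
            return False
        del self._pending[device_id]
        return True

    def check_timeouts(self, now: float) -> List[str]:
        # Any ping still unanswered after the window marks the device as failed.
        failed = [d for d, sent in self._pending.items() if now - sent > self.timeout_s]
        for device_id in failed:
            del self._pending[device_id]
            self.on_failure(device_id)
        return failed
```

In this sketch a late response is rejected rather than accepted, so a device that answers after the window has closed is still reported as failed on the next call to `check_timeouts`.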
As used herein, the term “distributed ledger” and its variations refer to a database that is available to multiple programmable devices and/or multiple watchdogs of a computer system comprised of interconnected programmable devices. One key feature of a distributed ledger is that there is no central data store where a master copy of the distributed ledger is maintained. Instead, the distributed ledger is stored in many different data stores, and a consensus protocol ensures that each copy of the ledger is identical to every other copy of the distributed ledger. A distributed ledger can, for example, be based on a blockchain-based technology, which is known in the art of cryptography and cryptocurrencies (e.g., Bitcoin, Ethereum, etc.). The distributed ledger may provide a publicly and/or non-publicly verifiable ledger used for software recovery in one or more programmable devices and/or one or more watchdog devices in a computer system comprised of interconnected programmable devices. Changes in the distributed ledger (e.g., successful responses to watchdog messages, failed responses to watchdog messages, etc.) represent working conditions of one or more computer programs installed on one or more programmable devices of a computer system comprised of interconnected programmable devices. These changes may be added to and/or recorded in the distributed ledger. For one embodiment, multiple programmable devices and/or watchdog devices of a computer system comprised of interconnected programmable devices are required to validate changes, add them to their copy of the distributed ledger, and broadcast their updated distributed ledger to the entire computer system. Each of the programmable devices and/or watchdog devices having the distributed ledger may validate changes according to a validation protocol.
For one embodiment, the validation protocol defines a process by which the interconnected programmable devices of the computer system agree on changes and/or additions to the distributed ledger. For one embodiment, the validation protocol may include the proof-of-work protocol implemented by Bitcoin or a public consensus protocol. For another embodiment, the validation protocol may include a private and/or custom validation protocol. The distributed ledger enables the interconnected devices in a computer system comprised of interconnected programmable devices to agree via the validation protocol on one or more changes and/or additions to the distributed ledger (e.g., to include successful responses to watchdog messages, to include failed responses to watchdog messages, etc.).
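As one illustration of a private and/or custom validation protocol, the following sketch applies a change to every copy of a ledger only when a strict majority of validators accept it. This is an assumption made for illustration: it is a simple majority vote, not the proof-of-work protocol mentioned above, and the names `is_well_formed`, `agree_on_change`, and `apply_if_agreed`, as well as the `device` and `outcome` fields, are hypothetical.

```python
from typing import Callable, Dict, List

Change = Dict[str, str]
Validator = Callable[[Change], bool]


def is_well_formed(change: Change) -> bool:
    # Hypothetical rule: a change names a device and a recognized outcome.
    return bool(change.get("device")) and change.get("outcome") in {"responded", "no_response"}


def agree_on_change(change: Change, validators: List[Validator]) -> bool:
    """Return True when a strict majority of validators accept the change."""
    votes = sum(1 for validate in validators if validate(change))
    return votes * 2 > len(validators)


def apply_if_agreed(ledgers: List[List[Change]], change: Change,
                    validators: List[Validator]) -> bool:
    # Every copy of the ledger receives the change, or none does.
    if not agree_on_change(change, validators):
        return False
    for ledger in ledgers:
        ledger.append(dict(change))
    return True
```

Applying the change to all copies or to none keeps the copies identical, which is the property the consensus protocol is described as ensuring.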
Each of the client devices 102A-N can be an internet of things (IoT) device, a mobile computing device, a cloud computing device, a logical computing device, or a virtual computing device. Also, each of the client devices 102A-N can include electronic components 130A-N. Examples of the components 130A-N include: processing unit(s) (such as microprocessors, co-processors, other types of integrated circuits (ICs), etc.); corresponding memory; and/or other related circuitry. For one embodiment, each of the client devices 102A-N includes a corresponding one of the self-reliance logic/modules 101, which implements a distributed ledger 103. The ledger 103 is used for software recovery of one or more computer programs installed on one or more of the client devices 102A-N. The distributed ledger 103 can, for one embodiment, be distributed across at least two of the devices 102A-N and 104A-N. In this way, the distributed ledger 103 may be used to avoid one or more shortcomings of a central communication technique used for software recovery of computer programs (i.e., the server/client model). Furthermore, and as shown in
Each of the self-reliance logic/modules 101 can be implemented as at least one of hardware (e.g., electronic circuitry of the processing unit(s), dedicated logic, etc.), software (e.g., one or more instructions associated with a computer program executed by the processing unit(s), software run on a general-purpose computer system or a dedicated machine, etc.), or a combination thereof. For one embodiment, each of the self-reliance logic/modules 101 performs one or more embodiments of techniques for software recovery of a computer program installed on one or more interconnected client devices 102A-N, as described herein.
For some embodiments, each of the self-reliance logic/modules 101 of the client devices 102A-N is implemented as one or more special-purpose processors with tamper resistance features. These types of specialized processors are commonly known as tamper resistant processors. Examples of such special-purpose processors include a trusted platform module (TPM) cryptoprocessor, an application specific integrated circuit (ASIC), an application-specific instruction set processor (ASIP), a field programmable gate array (FPGA), a digital signal processor (DSP), any type of cryptographic processor, an embedded processor, a co-processor, or any other type of logic with tamper resistance features that is capable of processing instructions. In this way, the self-reliance logic/modules 101 and the distributed ledger 103 can be implemented and maintained in a secure manner that assists with minimizing or preventing security vulnerabilities, as well as with improving the resilience of the client devices 102A-N against software failure. For a further embodiment, the self-reliance logic/modules 101 and/or the distributed ledger 103 may be maintained separately from the components 130A-N. For example, the self-reliance logic/modules 101 may be implemented as one or more special-purpose processors that are separate from the components 130A-N.
In the computer system 100, each of the client devices 102A-N includes one or more computer programs (e.g., software, firmware, etc.) for performing its operations and functionalities. Furthermore, each of the client devices 102A-N's computer program(s) may be rolled back as the computer program(s) fail and/or become faulty. These rollbacks are usually in the form of major version rollbacks, minor version rollbacks, patches, hotfixes, maintenance releases, service packs, etc. The goal of rolling back computer program(s) installed on the programmable devices 102A-N is to bring such a device back to a known, good operational state (prior to the failure or faulty operation of the client device). Rollbacks can assist with fixing security vulnerabilities and other bugs, returning the device's functionality back to usable operational states, or returning power consumption and performance back to a normal state. Such rollbacks, therefore, can be viewed as important features in the lifecycles of IoT devices, mobile computing devices, cloud computing devices, logical computing devices, and virtual computing devices.
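A rollback to the last known, good configuration can be sketched as follows. The `Version` record, its `known_good` flag, and the function names are hypothetical; the sketch assumes a device keeps an ordered install history in which the final entry is the currently installed (and possibly faulty) version.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    name: str         # e.g., "2.1.0"
    known_good: bool  # whether this configuration was observed to work


def last_known_good_index(history: List[Version]) -> Optional[int]:
    """Index of the most recent known-good version before the current (last) one."""
    for i in range(len(history) - 2, -1, -1):
        if history[i].known_good:
            return i
    return None


def roll_back(history: List[Version]) -> List[Version]:
    # Drop every version installed after the last known-good one.
    index = last_known_good_index(history)
    if index is None:
        return history  # nothing safe to roll back to; leave unchanged
    return history[: index + 1]
```

The search skips the currently installed version because, in this sketch, a rollback is only requested after that version has failed.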
For a specific embodiment, each of the self-reliance logic/modules 101 is implemented in a trusted execution environment (TEE) of one or more processors of the client devices 102A-N. Examples of TEEs can be included in processors and/or cryptoprocessors based on Intel Software Guard Extensions (SGX) technology, processors and/or cryptoprocessors based on Intel Converged Security and Manageability Engine (CSME) technology, processors and/or cryptoprocessors based on Intel Trusted Execution Technology (TXT) technology, processors and/or cryptoprocessors based on Trusted Platform Module (TPM) technology, processors and/or cryptoprocessors based on ARM TrustZone technology, etc. In this way, the TEE acts as an isolated environment for the distributed ledger 103 that runs in parallel with the other computer programs (e.g., software, firmware, etc.) installed on the client devices 102A-N. For one example, a self-reliance logic/module 101 can be implemented in a TEE of a TPM cryptoprocessor, an ASIC, an ASIP, an FPGA, a DSP, any type of cryptographic processor, an embedded processor, a co-processor, or any other type of logic with tamper resistance features that is capable of processing instructions.
Each of the watchdog devices 104A-N in the computer system 100 is a computer system that executes various types of processing including transmission of watchdog messages and receipt thereof. Also, each of the watchdog devices 104A-N can include electronic components 131A-N. Examples of the components 131A-N include: processing unit(s) (such as microprocessors, co-processors, other types of integrated circuits (ICs), etc.); corresponding memory; and/or other related circuitry. As such, each of the watchdog devices 104A-N can be any of various types of computers, including general-purpose computers, workstations, personal computers, servers, etc. For one embodiment, the watchdog devices 104A-N in the computer system 100 are associated with an external entity (e.g., a service facility that provides software recovery services 199, etc.). As such, the watchdog devices 104A-N can assist with delivery of software recovery service(s) 199 without having a user contact a service facility that provides software recovery services 199 to initiate software recovery operations. Examples of a service facility that provides software recovery services 199 include, but are not limited to, Internet-based service facilities that facilitate software recovery of computer programs installed on one or more client devices 102A-N. Additional details about software recovery services 199 are discussed below in connection with at least
A rollback, for some embodiments, can be in the form of a software image (e.g., a disk image, a process image, etc.). For other embodiments, a rollback can be in the form of a bundle (e.g., a directory with a standardized hierarchical structure that holds executable code and the resources used by that code, etc.).
The client devices 102A-N and the watchdog devices 104A-N communicate within the computer system 100 via one or more networks 105. These network(s) 105 comprise one or more different types of computer networks, such as the Internet, enterprise networks, data centers, fiber networks, storage networks, WANs, and/or LANs. Each of the networks 105 may provide wired and/or wireless connections between the devices 102A-N and the watchdog devices 104A-N that operate in the electrical and/or optical domain, and also employ any number of network communication protocols (e.g., TCP/IP). For example, one or more of the networks 105 within the computer system 100 may be a wireless fidelity (Wi-Fi®) network, a Bluetooth® network, a Zigbee® network, and/or any other suitable radio-based network as would be appreciated by one of ordinary skill in the art upon viewing this disclosure. It is to be appreciated by those having ordinary skill in the art that the network(s) 105 may also include any required networking hardware, such as network nodes that are configured to transport data over network(s) 105. Examples of network nodes include, but are not limited to, switches, gateways, routers, network bridges, modems, wireless access points, networking cables, line drivers, hubs, and repeaters. For one embodiment, at least one of the client devices 102A-N and/or at least one of the watchdog devices 104A-N implements the functionality of a network node.
One or more of the networks 105 within the computer system 100 may be configured to implement computer virtualization, such as virtual private network (VPN) and/or cloud based networking. For one embodiment, at least one of the client devices 102A-N and/or at least one of the watchdog devices 104A-N comprises a plurality of virtual machines (VMs), containers, and/or other types of virtualized computing systems for processing computing instructions and transmitting and/or receiving data over network(s) 105. Furthermore, at least one of the client devices 102A-N and/or at least one of the watchdog devices 104A-N may be configured to support a multi-tenant architecture, where each tenant may implement its own secure and isolated virtual network environment. Although not illustrated in
For some embodiments, the network(s) 105 comprise a cellular network for use with at least one of the client devices 102A-N and/or at least one of the watchdog devices 104A-N. For these embodiments, the cellular network may be capable of supporting a variety of the client devices 102A-N and/or the watchdog devices 104A-N that include, but are not limited to, computers, laptops, and/or a variety of mobile devices (e.g., mobile phones, self-driving vehicles, ships, and drones). The cellular network can be used in lieu of or together with at least one of the other networks 105 described above. Cellular networks are known, so they are not described in detail in this document.
In some situations, the computer program(s) installed on the client devices 102A-N are meant to operate without any setbacks or negative ramifications. However, one or more of these computer programs can sometimes introduce problems (e.g., faulty operation of a device, disabling of the device, etc.). In some scenarios, a faulty computer program installed on a single one of the client devices 102A-N (e.g., client device 102A, etc.) can disable one or more client devices 102A-N (e.g., one or more client devices 102B-N, etc.), which can in turn cause risks to the operational integrity of the computer system 100. Software recovery service(s) 199 can be used to assist with resolving a faulty computer program that is installed on one or more of the client devices 102A-N by re-installing previous versions of the installed computer program that were known to operate as intended.
The distributed ledger 103, as implemented by the self-reliance logic/modules 101, can assist with minimizing or eliminating at least one of the problems described in the immediately preceding paragraph. This is because the distributed ledger 103 operates based on the concept of decentralized consensus, as opposed to the currently utilized concept of centralized consensus. Centralized consensus is the basis of the client/server model and it requires one central database or server for deciding how or which software recovery service(s) are provided to the client device(s) 102A-N, and as a result, this can create a single point of failure that is susceptible to security vulnerabilities. In contrast, the distributed ledger 103 operates based on a decentralized scheme that does not require a central database for deciding how or which software recovery service(s) are provided to one or more of the client devices 102A-N. For one embodiment, the computer system 100 enables its nodes (e.g., the client devices 102A-N, the watchdog devices 104A-N, etc.) to continuously and sequentially record the watchdog communications between the client devices 102A-N and the watchdog devices 104A-N in a unique chain—that is, in the distributed ledger 103. For one embodiment, the distributed ledger 103 is an append-only record of the watchdog communications between the client devices 102A-N and the watchdog devices 104A-N that is based on a combination of cryptography and blockchain technology. For this embodiment, each successive block of the distributed ledger 103 comprises a unique fingerprint of an immediately preceding watchdog communication between the client devices 102A-N and the watchdog devices 104A-N. 
This unique fingerprint can include at least one of: (i) a hash as is known in the art of cryptography (e.g., SHA, RIPEMD, Whirlpool, Scrypt, HAS-160, etc.); or (ii) a digital signature generated with a public key, a private key, or the hash as is known in the art of generating digital signatures. Examples of digital signature algorithms include secure asymmetric key digital signing algorithms. One advantage of the distributed ledger 103 is that it can assist with software recovery even in instances when a portion of the computer system 100 is unavailable, which in turn removes the need for the central database or server that is required in the client/server model. Another advantage of the distributed ledger 103 is that it can assist with software recovery even in instances when users of failed client devices 102A-N have not contacted a service facility that can provide software recovery service(s) 199, which can in turn assist with automatic software recovery of failed client devices 102A-N in the computer system 100 and with improving resilience against failure within the computer system 100. Yet another advantage of the distributed ledger 103 is that it can prevent unnecessary rollback operations from being performed on a failed one of the client devices 102A-N. In particular, the distributed ledger 103 can assist with ensuring that a rollback operation is performed no more than once on a failed one of the client devices 102A-N. For example, when the client device 102A receives a first watchdog message from the watchdog device 104A and a second watchdog message from the watchdog device 104B at or around the same time, the self-reliance logic/module 101 records a response from the client device 102A to either one of the watchdog messages as a response to both messages in the distributed ledger 103.
For this example, the records created by the self-reliance logic/module 101 in the distributed ledger 103 are communicated via the network(s) 105 to every other copy of the distributed ledger 103 that is stored on or available to the other self-reliance logic/module 101. In this way, and for this example, the distributed ledger 103 enables all of the client devices 102A-N and/or the watchdog devices 104A-N to maintain a record of responses to watchdog messages, which can assist with determining points of failure and initiating software recovery service(s) 199.
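The append-only record and the single-response example above can be sketched together. The sketch assumes SHA-256 as the fingerprint (one member of the SHA family mentioned above) computed over a JSON encoding of the preceding block; the `Ledger` class, its method names, the block fields, and the ping identifiers are hypothetical and are not taken from the embodiments described herein. Digital signatures and replication over the network(s) are omitted.

```python
import hashlib
import json
from typing import Dict, List, Optional

GENESIS = "0" * 64  # placeholder fingerprint for the first block


def fingerprint(block: Dict) -> str:
    # SHA-256 over a canonical JSON encoding of the block.
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Ledger:
    """Hypothetical append-only record of watchdog communications."""

    def __init__(self) -> None:
        self.blocks: List[Dict] = []

    def append(self, device: str, ping_ids: List[str], outcome: str) -> Dict:
        # Each block carries the fingerprint of the immediately preceding block.
        prev = fingerprint(self.blocks[-1]) if self.blocks else GENESIS
        block = {"prev": prev, "device": device,
                 "ping_ids": sorted(ping_ids), "outcome": outcome}
        self.blocks.append(block)
        return block

    def is_answered(self, ping_id: str) -> bool:
        return any(ping_id in b["ping_ids"] for b in self.blocks)

    def record_response(self, device: str, pending_ping_ids: List[str]) -> Optional[Dict]:
        # One response is recorded against every ping not yet covered by a block.
        unanswered = [p for p in pending_ping_ids if not self.is_answered(p)]
        if not unanswered:
            return None
        return self.append(device, unanswered, "responded")

    def verify(self) -> bool:
        # Recompute the chain; any edit to an earlier block breaks a later link.
        prev = GENESIS
        for block in self.blocks:
            if block["prev"] != prev:
                return False
            prev = fingerprint(block)
        return True
```

Because a single block covers both watchdog messages, a second watchdog consulting the ledger finds its ping already answered and has no reason to trigger a separate recovery for the same device.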
The distributed ledger 103, as a blockchain, includes information stored in its header that is accessible to the client device(s) 102A-N and/or the watchdog devices 104A-N, which enables the client device(s) 102A-N and/or the watchdog devices 104A-N to “view” one or more of: (i) watchdog messages that have been transmitted to the client device(s) 102A-N by the watchdog devices 104A-N; and (ii) responses to the watchdog messages that have been transmitted by the client device(s) 102A-N to the watchdog devices 104A-N. In this way, the distributed ledger 103 is a software design approach that binds the client devices 102A-N and/or the watchdog devices 104A-N together such that they commonly obey the same consensus process for releasing or recording what information they hold, and where all related interactions are verified by cryptography. The distributed ledger 103 can be a private blockchain or a public blockchain. Furthermore, the distributed ledger 103 can be a permissioned blockchain or a permissionless blockchain.
One issue associated with distributed ledgers that are based on blockchain technology is that they are resource-intensive. That is, they require a large amount of processing power, storage capacity, and computational resources that grow as the ledger is replicated on more and more devices. This issue is based, at least in part, on the requirement that every node or device that includes a ledger must process every transaction in order to ensure security, which can become computationally expensive. As such, each device that includes the ledger may require access to a sizable amount of computational resources. On programmable devices with fixed or limited computational resources (e.g., mobile devices, vehicles, smartphones, laptops, tablets, media players, microconsoles, IoT devices, etc.), processing a ledger may prove difficult.
At least one embodiment of the distributed ledger 103 described herein can assist with minimizing the resource-intensive issue described above. For one embodiment, the distributed ledger 103 is not constructed as a monolithic blockchain with all of its blocks existing on all of the client devices 102A-N and/or the watchdog devices 104A-N. Instead, the distributed ledger 103 is constructed as a light ledger based on, for example, the light client protocol for the Ethereum blockchain, the light client protocol for the Bitcoin blockchain, etc. In this way, the distributed ledger 103 may be replicated on the client devices 102A-N and/or the watchdog devices 104A-N on an as-needed basis. For one embodiment, any one of the client devices 102A-N and/or the watchdog devices 104A-N that is resource-constrained will only store the most recent blocks of the ledger 103 (as opposed to all of the blocks of the ledger 103). For this embodiment, the number of blocks stored by a particular device or entity can be determined dynamically based on its storage and processing capabilities. For example, any one of the client devices 102A-N and/or the watchdog devices 104A-N can store (and also process) only the current block and the immediately following block of the ledger 103. This ensures that any consensus protocols required to add new blocks to the ledger 103 can be executed successfully without requiring all the client devices 102A-N and/or the watchdog devices 104A-N to store the ledger 103 as a large monolithic blockchain. For another embodiment, each block of the ledger 103 may be based on a light client protocol such that the block is broken into two parts: (a) a block header showing metadata about which one of the watchdog communications (i.e., watchdog messages and responses to the watchdog messages) was committed to the block; and (b) a transaction tree that contains the actual data for the committed watchdog communication in the block.
For this embodiment, the block header can include at least one of the following: (i) a hash of the previous block's block header; (ii) a Merkle root of the transaction tree; (iii) a proof of work nonce; (iv) a timestamp associated with the committed watchdog communication in the block; (v) a Merkle root for verifying existence of the committed watchdog communication in the block; or (vi) a Merkle root for verifying which one of the client devices 102A-N and/or the watchdog devices 104A-N generated the committed watchdog communication. For this embodiment, the client devices 102A-N and/or the watchdog devices 104A-N having the ledger 103 can use the block headers to keep track of the entire ledger 103, and request a specific block's transaction tree only when processing operations need to be performed on the ledger 103 (e.g., adding a new block to the ledger 103, etc.). For yet another embodiment, the ledger 103 can be made more resource-efficient by being based on the epoch Slasher technique associated with the light client protocol for the Ethereum blockchain.
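As a non-limiting sketch of the header/transaction-tree split described above, a device could keep only block headers and check that they chain together. The sketch assumes SHA-256, a simple pairwise Merkle construction that duplicates the last node on odd levels, and fixed 8-byte integer encodings; `BlockHeader`, `merkle_root`, and `headers_link` are hypothetical names chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Compute a Merkle root over raw transaction bytes."""
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


@dataclass(frozen=True)
class BlockHeader:
    prev_header_hash: bytes  # (i) hash of the previous block's header
    tx_root: bytes           # (ii) Merkle root of the transaction tree
    nonce: int               # (iii) proof of work nonce
    timestamp: int           # (iv) timestamp of the committed communication

    def header_hash(self) -> bytes:
        payload = (
            self.prev_header_hash
            + self.tx_root
            + self.nonce.to_bytes(8, "big")
            + self.timestamp.to_bytes(8, "big")
        )
        return sha256(payload)


def headers_link(headers: list) -> bool:
    """Check that each header references the hash of the header before it."""
    return all(
        headers[i].prev_header_hash == headers[i - 1].header_hash()
        for i in range(1, len(headers))
    )


txs0 = [b"watchdog message 104A->102A", b"response 102A->104A"]
txs1 = [b"watchdog message 104B->102A"]
genesis = BlockHeader(b"\x00" * 32, merkle_root(txs0), 0, 1)
second = BlockHeader(genesis.header_hash(), merkle_root(txs1), 0, 2)
```

Under these assumptions, a resource-constrained device can store `genesis` and `second` alone and fetch `txs0` or `txs1` only when a transaction tree must be processed.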
In some instances, a blockchain synchronization algorithm is required to maintain the ledger 103 across the client devices 102A-N and/or the watchdog devices 104A-N. Here, the blockchain synchronization algorithm enables nodes of the computer system 100 (e.g., one or more of the client devices 102A-N and/or the watchdog devices 104A-N) to perform a process of adding transactions to the ledger 103 and agreeing on the contents of the ledger 103. The blockchain synchronization algorithm allows for one or more of the client devices 102A-N and/or the watchdog devices 104A-N to use the ledger 103, as a blockchain, to distinguish legitimate transactions (i.e., watchdog communications comprised of watchdog messages and responses thereto) from attempts by an attacker to compromise the computer system 100 or to include false/faulty/flawed information in it (e.g., man-in-the-middle attacks, etc.).
Executing the blockchain synchronization algorithm is designed to be resource-intensive so that the individual blocks of the ledger 103 must contain a proof to be considered valid. Examples of proofs include, but are not limited to, a proof of work and a proof of stake. Each block's proof is verified by the client devices 102A-N and/or the watchdog devices 104A-N when they receive the block. In this way, the blockchain synchronization algorithm assists with allowing the client devices 102A-N and/or the watchdog devices 104A-N to reach a secure, tamper-resistant consensus. For one embodiment, the blockchain synchronization algorithm is embedded in the computer system 100 and performed by at least one of the client devices 102A-N and/or the watchdog devices 104A-N. For example, one or more of the client devices 102A-N and/or the watchdog devices 104A-N may include an FPGA or other type of processor that is dedicated to performing and executing the blockchain synchronization algorithm. For this example, the FPGA or other type of processor generates the proofs for the blocks to be included in the ledger 103. Also, and for this example, the blocks are added to the ledger 103 only through verification and consensus (as described above). The blockchain synchronization algorithm can be performed by: (i) any of the client devices 102A-N and/or the watchdog devices 104A-N; or (ii) multiple of the devices 102A-N and/or the watchdog devices 104A-N. For a further embodiment, generating proofs for new blocks is performed in response to automatically determining the complexity of the operation given the availability of resources in the computer system 100. In this way, the resources of the computer system 100 can be utilized more efficiently.
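By way of example only, generating and verifying a proof of work for a block might look like the following sketch. It assumes SHA-256 and expresses difficulty as a number of leading zero hexadecimal digits; the function names and the sample block contents are hypothetical.

```python
import hashlib


def block_hash(data: bytes, nonce: int) -> str:
    """Hash the block data together with a candidate nonce."""
    return hashlib.sha256(data + nonce.to_bytes(8, "big")).hexdigest()


def generate_proof(data: bytes, difficulty: int) -> int:
    """Search for the first nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify_proof(data: bytes, nonce: int, difficulty: int) -> bool:
    """Verification takes one hash, whereas generation may take many."""
    return block_hash(data, nonce).startswith("0" * difficulty)


block_data = b"watchdog response 102A->104A"
nonce = generate_proof(block_data, 2)  # low difficulty keeps the sketch fast
is_valid = verify_proof(block_data, nonce, 2)
```

The asymmetry between `generate_proof` and `verify_proof` is what allows a dedicated processor to produce proofs while every receiving device checks them cheaply. A lower `difficulty` could be selected when fewer resources are available, although any such policy is an assumption of this sketch.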
For another embodiment, the blockchain synchronization algorithm is performed outside of the computer system 100 by, for example, a synchronization device (not shown). This synchronization device can be paired to one or more of the client devices 102A-N and/or the watchdog devices 104A-N having the ledger 103. For example, one or more of the client devices 102A-N may be paired via network(s) 105 to a synchronization device outside the system 100. For this example, the synchronization device includes electronic components that are similar to components 130A-N (which are described above). Also, and for this example, each transaction is communicated to the synchronization device via the network(s) 105 using one or more secure communication techniques. Here, the synchronization device generates the proof required for verification and consensus and communicates it back to the system 100. For one embodiment, each transaction comprises one or more of: (i) a watchdog message; (ii) a record of a transmitted or received watchdog message; (iii) a response to a watchdog message; and (iv) a record of a transmitted or received response to a watchdog message.
For yet another embodiment, the ledger 103 may be maintained across the system 100 without using the blockchain synchronization algorithm. As a first example, the ledger 103 may be implemented as a distributed database. For a second example, the ledger 103 may be maintained across the system 100 as a distributed version control system (DVCS), which is also sometimes known as a distributed revision control system (DRCS). Examples of a DVCS include, but are not limited to, ArX, BitKeeper, Codeville, Darcs, DCVS, Fossil, Git, and Veracity.
The ledger 103 can also be made as a combination of the immediately preceding embodiments. For one embodiment, the ledger 103 is implemented with the blockchain synchronization algorithm in response to determining that resources of the system 100 are sufficient for the resource-intensive synchronization process. For this embodiment, the ledger 103 is implemented without the blockchain synchronization algorithm in response to determining that resources of the system 100 are not enough for the synchronization process.
Enabling the client devices 102A-N and/or enabling the watchdog devices 104A-N to record watchdog communications (e.g., a watchdog message, a response to a watchdog message, etc.) to the ledger 103 can be based on the enhanced privacy identification (EPID) protocol, e.g., the zero-knowledge proof protocol. For an embodiment based on the zero-knowledge proof protocol, one or more of the client devices 102A-N and/or the watchdog devices 104A-N (e.g., device 102A, device 104A, etc.) acts as a verifier that determines whether other ones of the client devices 102A-N and/or the watchdog devices 104A-N are members of a group of devices that have been granted the privilege to have their actions processed and added to the blockchain represented as the ledger 103. For this embodiment, each of the client devices 102A-N and/or the watchdog devices 104A-N that has privilege to access the ledger 103 cryptographically binds its corresponding public-key to the zero-knowledge proof sent to the verifier, resulting in that public-key being recognized as an identity that has obtained permission to perform actions on the blockchain represented as the ledger 103. For one embodiment, the client device(s) 102A-N and/or the watchdog device(s) 104A-N acting as the verifier adds the verified public-key to the ledger 103. Thus, the ledger 103 can maintain its own list of client devices 102A-N and/or watchdog devices 104A-N that can interact with the ledger 103. In this way, the client device(s) 102A-N and/or the watchdog device(s) 104A-N acting as the verifier ensures that any of the devices 102A-N and/or watchdog devices 104A-N that writes to the ledger 103 is authorized to do so.
To assist with security, and for one embodiment, the ledger 103 can be accessible to the watchdog device(s) 104A-N only via public key cryptography. Here, public keys associated with the ledger 103 can be disseminated to the watchdog device(s) 104A-N, on an as-needed basis, with private keys associated with the ledger 103, which would be known only to users of the client devices 102A-N. In this way, public key cryptography can be used for two functions: (i) using the public key to authenticate that a watchdog message originated with one of the watchdog devices 104A-N that is a holder of the paired private key; or (ii) encrypting a watchdog message provided by one of the watchdog devices 104A-N with the public key to ensure that only the client devices 102A-N, which would be the holders of the paired private key, can decrypt and respond to the watchdog message. For example, and for one embodiment, the watchdog device 104A cannot commit watchdog communications (e.g., a watchdog message, a response to a watchdog message, etc.) to the ledger 103 unless the watchdog device 104A is granted access to the ledger 103 via public key cryptography and/or unless the watchdog device 104A has been verified via the zero-knowledge proof protocol described above. While the public key may be publicly available to the watchdog devices 104A-N, a private key and/or prior verification via the zero-knowledge proof protocol will be necessary to commit watchdog communications (e.g., a watchdog message, a response to a watchdog message, etc.) to the ledger 103. For this example, the private key can be provided to the watchdog device 104A via the network(s) 105 by the logic/module 101 of client device 102A in response to input provided to the client device 102A by a user. Based on a combination of public key cryptography and/or the verification via the zero-knowledge proof protocol, the watchdog device 104A is enabled to commit watchdog communications (e.g., a watchdog message, a response to a watchdog message, etc.) to the ledger 103.
As shown by the immediately preceding example, only users of the client devices 102A-N can provide the watchdog devices 104A-N with access to the ledger 103. This has an advantage of minimizing or eliminating the risk of security vulnerabilities (e.g., man-in-the-middle attacks, eavesdropping, unauthorized data modification, denial-of-service attacks, sniffer attacks, identity spoofing, etc.) because the users will always know which ones of the watchdog devices 104A-N have been granted access to their devices 102A-N via the ledger 103. For one embodiment, the private key can include information that grants the watchdog devices 104A-N with access to the ledger 103 for a limited period of time (e.g., 10 minutes, 1 hour, any other time period, etc.). Thus, security is further bolstered by preventing watchdog device(s) 104A-N from having unfettered access to the devices 102A-N and/or the ledger 103.
One feature of the distributed ledger 103, which is based on blockchain technology, is the ability to resolve forks attributable to the devices 102A-N and/or the watchdog devices 104A-N that have access to the ledger 103 attempting to add blocks to the end of the chain by finding a nonce that produces a valid hash for a given block of data. When two blocks are found that both claim to reference the same previous block, a fork in the chain is created. Some of the devices 102A-N and/or the watchdog devices 104A-N in the system 100 will attempt to find the next block on one end of the fork while other ones of the devices 102A-N and/or the watchdog devices 104A-N in the system 100 will work from the other end of the fork. Eventually one of the forks will surpass the other in length, and the longest chain is accepted by consensus as the valid chain. This is usually achieved using a consensus algorithm or protocol. Therefore, intruders attempting to change a block must not only re-find a valid hash for each subsequent block, but must do it faster than everyone else working on the currently accepted chain. Thus, after a certain number of blocks have been chained onto a particular block, it becomes a resource-intensive task to falsify contents of a block, which assists with minimizing or eliminating security vulnerabilities. For one embodiment, this ability to resolve forks can be used to perform rollback operations that are necessary to deal with one or more faulty computer programs.
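For illustration, the longest-chain rule for resolving a fork can be sketched as below. Blocks are represented here simply as identifier strings, which is an assumption made for brevity; `resolve_fork` and `fork_point` are hypothetical helper names.

```python
def resolve_fork(branches: list) -> list:
    """Return the longest branch; on a tie, the first branch listed wins."""
    return max(branches, key=len)


def fork_point(a: list, b: list) -> int:
    """Return how many leading blocks the two branches share."""
    shared = 0
    for block_a, block_b in zip(a, b):
        if block_a != block_b:
            break
        shared += 1
    return shared


branch_a = ["genesis", "b1", "b2a"]
branch_b = ["genesis", "b1", "b2b", "b3b"]

accepted = resolve_fork([branch_a, branch_b])
shared = fork_point(branch_a, branch_b)
# Blocks on the losing branch after the fork point are the ones discarded.
discarded = branch_a[shared:]
```

In this sketch the blocks in `discarded` are dropped in favor of the accepted branch, which is loosely analogous to rolling back to the last commonly agreed state.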
Detecting flaws in the configurations of the computer program may occur as a result of audits, forensics, or other investigation of configurations installed on the client devices 102A-N. The investigation can include, but is not limited to, investigations performed based on information recorded into the ledger 103. The one or more logic/modules 101 can detect a flaw in a computer program installed on the client devices 102A-N using one or more software configuration management (SCM) techniques. One example of an SCM technique is a watchdog timing technique and/or a heartbeat timing technique that can be used to detect a flaw that results from a computer program installed on the client devices 102A-N. A watchdog timing technique includes, for example, the client device 102A periodically resetting a timer before the timer expires to indicate that there are no errors in the operation of the device 102A. When the client device 102A does not reset its timer, it is assumed that the operation of device 102A is flawed. Thus, the one or more logic/modules 101 can detect the flaw in a computer program installed on the client device 102A when the one or more logic/modules 101 determine that the client device 102A failed to reset its timer during execution of a computer program. A heartbeat timing technique generally includes the client device 102A transmitting a heartbeat signal with a payload to another device (e.g., any of watchdog devices 104, etc.) in the computer system (e.g., system 100, etc.) to indicate that the device 102A is operating properly. Thus, one or more logic/modules 101 can detect the flaw in a computer program installed on client device 102A when the one or more logic/modules 101 determine that the client device 102A failed to transmit its heartbeat signal on time during execution of an installed computer program by the client device 102A.
The watchdog timing technique and/or the heartbeat timing technique can be implemented in a processor (e.g., fault-tolerant microprocessor, etc.) of the client device 102A. For another example of an SCM technique, exception handling techniques (e.g., language level features, checking of error codes, etc.) can be used by the logic/module 101 to determine that a computer program installed on the client device 102A is flawed. For a specific example of an exception handling technique that applies when the client device 102A includes or executes a script, the one or more logic/modules 101 can determine that the computer program installed on the client device 102A is flawed when the one or more logic/modules 101 determine that the client device 102A failed to output or return a result message (e.g., an exit status message, a result value, etc.) to indicate that the script was successfully run or executed during execution of the installed computer program by the client device 102A. The one or more logic/modules 101 can request the result message from the processor(s) of the client device 102A running or executing the script. In response to detecting the flawed computer program, at least one of the logic/modules 101 can initiate performance of a rollback operation to return the computer program to a previous state—that is, to return the computer program from a defective state to a properly functioning state recorded in a block of the ledger 103. This is important in situations where the actual effect of an update may be unknown or speculative, which could result in a computer program that is in an inconsistent state.
For one embodiment, the operations performed in the immediately preceding paragraph are performed in response to one or more logic/modules 101 inspecting the ledger 103 to determine that a client device (e.g., the client device 102A, etc.) failed to respond to a watchdog message or failed to transmit a watchdog response message within a predetermined amount of time. For a further embodiment, the logic/modules 101 communicate messages to each other to report that a client device (e.g., the client device 102A, etc.) failed to respond to a watchdog message or failed to transmit a watchdog response message within a predetermined amount of time. When the logic/module 101 of the faulty client device (e.g., the client device 102A, etc.) receives the message reporting the faulty device, then the logic/module 101 of the faulty client device can initiate one or more software recovery services 199.
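As one hedged illustration, inspecting recorded watchdog exchanges for a missed deadline might be sketched as follows. The record layout (`WatchdogExchange`), the use of seconds as the time unit, and the function names are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchdogExchange:
    client_id: str
    sent_at: float                       # when the watchdog message was sent
    responded_at: Optional[float] = None  # None while no response is recorded


def missed_deadline(exchange: WatchdogExchange, timeout_s: float, now: float) -> bool:
    """True when the response was late, or is still absent after the timeout."""
    if exchange.responded_at is not None:
        return exchange.responded_at - exchange.sent_at > timeout_s
    return now - exchange.sent_at > timeout_s


def faulty_clients(records: list, timeout_s: float, now: float) -> list:
    """List the clients with at least one missed deadline in the records."""
    return sorted(
        {r.client_id for r in records if missed_deadline(r, timeout_s, now)}
    )


records = [
    WatchdogExchange("102A", sent_at=0.0),                    # never answered
    WatchdogExchange("102B", sent_at=0.0, responded_at=1.0),  # answered in time
    WatchdogExchange("102C", sent_at=8.0),                    # still within timeout
]
suspects = faulty_clients(records, timeout_s=5.0, now=10.0)
```

With the sample records above, only the client that never answered is reported, and a recovery service could then be initiated for that client.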
In
Technique 200 begins at operation 210, where a watchdog device 104A sends a first watchdog message to the client device 102A. One embodiment of technique 200 can optionally include operation 217, which includes the watchdog device 104A committing a record of the first watchdog message being sent to the distributed ledger 103. Next, at operation 211, the self-reliance logic/module 101 in the client device 102A can respond to the first watchdog message within a predetermined period of time to indicate that the computer program(s) 206 are operating without any issues (i.e., as expected). As shown, operations 212A-B include a record of the successful response to the first watchdog message being committed to the ledger 103. Operation 212A can be performed by the watchdog device 104A and operation 212B can be performed by the self-reliance logic/module 101 of the client device 102A. For one embodiment, only one of operations 212A-B is performed. For another embodiment, both operations 212A-B are performed.
Technique 200 further includes operation 213, where the watchdog device 104A communicates a second watchdog message to the self-reliance logic/module 101 of the client device 102A. One embodiment of the technique 200 can optionally include operation 218, which includes the watchdog device 104A committing a record of the second watchdog message being sent to the distributed ledger 103. As shown in
Next, technique 200 proceeds to operation 215, where the self-reliance logic/module 101 of the client device 102A detects that the computer program(s) 206 are faulty or failing. The detection can be performed in response to the self-reliance logic/module 101 performing operation 214B. Alternatively, or additionally, the detection can be performed in response to the self-reliance logic/module 101 inspecting the ledger 103 after one or more of operations 214A-B. After operation 215, technique 200 proceeds to operation 216. Here, the self-reliance logic/module 101 initiates software recovery service(s) 199, which are described in connection with
Referring briefly to
With regard again to
Referring now to
A self-reliance logic/module of any one of the client devices 102A-N (e.g., one or more of the logic/modules 101) may perform the technique 400 when the watchdog devices 104A-B and the client devices 102A-N have a contract to communicate watchdog messages with each other. For one embodiment, each contract can be a smart contract—that is, a state stored in the blockchain represented as the distributed ledger 103 that facilitates, authenticates, and/or enforces performance of a contract between the watchdog devices 104A-B and the client devices 102A-N. Consequently, a smart contract is one feature of the ledger 103, as a blockchain, that can assist the one or more self-reliance logic/modules 101 with locating faulty or flawed computer program(s) installed in one or more of the client devices 102A-N. This is beneficial because a smart contract can enable the ledger 103 to remain stable, even as account servicing roles are transferred or passed between the watchdog devices 104A-B. Technique 400, as described below and in connection with
Technique 400 begins at operation 402, where a self-reliance logic/module of the client device 102A monitors a computer program installed on a client device 102A with the ledger 103. For one embodiment, SCM techniques as described above in connection with
Operation 403 includes the client device 102A generating a watchdog communication (e.g., a watchdog response message, etc.) and transmitting the watchdog communication to one or more watchdog devices 104A-B. For one embodiment, operation 403 is performed in accord with one or more of
Technique 400 proceeds to operation 404, where one or more records of the watchdog communication are committed to the distributed ledger 103. For one embodiment, the one or more records include one or more of: (i) a record of a transmitted watchdog response message, which can be committed to the ledger 103 by the client device 102A; (ii) a record of a received watchdog response message, which can be committed to the ledger 103 by the one of watchdog devices 104A-N that received the watchdog response message; (iii) a record of a transmitted watchdog message, which can be committed to the ledger 103 by the one of watchdog devices 104A-N that transmitted the watchdog message; and (iv) a record of a received watchdog message, which can be committed to the ledger 103 by the client device 102A that received the watchdog message.
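The four record types enumerated above, together with the kind of device that commits each, could be modeled along the following lines. This is a sketch under assumed names (`RecordKind`, `LedgerRecord`, `may_commit`) and does not reflect any particular ledger format.

```python
from dataclasses import dataclass
from enum import Enum


class RecordKind(Enum):
    RESPONSE_TRANSMITTED = "response_transmitted"  # (i)
    RESPONSE_RECEIVED = "response_received"        # (ii)
    MESSAGE_TRANSMITTED = "message_transmitted"    # (iii)
    MESSAGE_RECEIVED = "message_received"          # (iv)


# Which role commits each kind of record, following items (i)-(iv) above.
COMMITTER_ROLE = {
    RecordKind.RESPONSE_TRANSMITTED: "client",
    RecordKind.RESPONSE_RECEIVED: "watchdog",
    RecordKind.MESSAGE_TRANSMITTED: "watchdog",
    RecordKind.MESSAGE_RECEIVED: "client",
}


@dataclass(frozen=True)
class LedgerRecord:
    kind: RecordKind
    committed_by: str  # device identifier, e.g. "102A" or "104A"
    message_id: str    # hypothetical identifier tying records together


def may_commit(kind: RecordKind, role: str) -> bool:
    """Check that a device role matches the expected committer for a record kind."""
    return COMMITTER_ROLE[kind] == role


record = LedgerRecord(RecordKind.RESPONSE_TRANSMITTED, "102A", "msg-1")
```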
Next, at operation 405, the self-reliance logic/module of the client device 102A can detect whether the client device 102A has failed due to faulty computer program(s) installed thereon. Local failure detection refers to the self-reliance logic/module of the client device 102A determining that faulty computer program(s) installed thereon have caused the client device 102A to fail. Local detection is determined based on inspecting the ledger 103 and/or on internal SCM techniques, for example, as described above in accord with
Technique 400 proceeds to operation 407 when remote failure is detected. Here, the self-reliance logic/module of the client device 102A transmits a failure message to the self-reliance logic/module of the failed device, which can cause the self-reliance logic/module of the failed device to trigger software recovery service(s) as described below in connection with operation 408 (or above in connection with one or more of
Operation 409 includes the self-reliance logic/module of the client device 102A determining whether the flawed computer program(s) installed on the client device 102A can be recovered locally using data from the client device 102A. An example of such data is the replicant image 207 of
When a failover device is unavailable, technique 400 proceeds to operation 411. Here, a determination is made as to whether the failed client device 102A is repairable by a servicing entity (e.g., a service technician, a drone, etc.) or replaceable by an entity (e.g., a service technician, a drone, a delivery vehicle, etc.). When the failed client device 102A is repairable or replaceable, then technique 400 proceeds to operation 415. Here, the self-reliance logic/module(s) of the client device 102A communicates via the network(s) 105 with the appropriate service facility to dispatch installation of a replacement device or servicing of the failed client device 102A. For one embodiment, operation 415 is performed automatically and/or without a user of the client device 102A initiating communication with the appropriate service facility.
Technique 400 also includes operation 416, which occurs after operations 411 and 413-415 have been performed. For an embodiment, technique 400 proceeds to operation 416 from operation 411 whether or not operation 415 can be performed. For one embodiment, technique 400 proceeds to operation 416 after performance of operations 413-415. At operation 416, a determination is made as to whether the failure of the program(s) has been resolved. When the failure has been resolved, then technique 400 returns to operation 402 (which is described above). Alternatively, when the failure has not been resolved, then technique 400 proceeds to operation 412. Here, the failed client device 102A is decommissioned. For one embodiment, the self-reliance logic/module of the client device 102A decommissions the client device 102A. For another embodiment, the self-reliance logic/module of the client device 102A communicates with the appropriate entities (e.g., an enterprise IT service facility, etc.) that can perform the decommissioning process via network(s) 105.
For one embodiment, the ledger 103 can be generated during operation 402 by creating a genesis block (when the ledger 103 lacks any blocks) or appending a block to an already existing ledger 103. For one embodiment, a self-reliance logic/module registers the client devices 102A-N and/or the watchdog devices 104A-N with the ledger 103 by committing, to the ledger 103, a record of a communicated watchdog message and/or a record of a communicated watchdog response message.
Programmable device 500 is illustrated as a point-to-point interconnect system, in which the first processing element 570 and second processing element 580 are coupled via a point-to-point interconnect 550. Any or all of the interconnects illustrated in
As illustrated in
Each processing element 570, 580 may include at least one shared cache 546. The shared cache 546A, 546B may store data (e.g., computing instructions) that are utilized by one or more components of the processing element, such as the cores 574A, 574B and 584A, 584B, respectively. For example, the shared cache may locally cache data stored in a memory 532, 534 for faster access by components of the processing elements 570, 580. For one or more embodiments, the shared cache 546A, 546B may include one or more mid-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache, a last level cache (LLC), or combinations thereof. The memory 532, 534 may include software instructions representing one or more self-reliance logic/modules 101, which include a distributed ledger 103 that is accessible by each of the processing elements 570 and 580. Each of the logic/modules 101 and the distributed ledger 103 is described above in connection with at least
While
First processing element 570 may further include memory controller (MC) logic 572 and point-to-point (P-P) interconnects 576 and 578. Similarly, second processing element 580 may include a MC 582 and P-P interconnects 586 and 588. As illustrated in
Processing element 570 and processing element 580 may be coupled to an I/O subsystem 590 via respective P-P interconnects 576 and 586 through links 552 and 554. As illustrated in
In turn, I/O subsystem 590 may be coupled to a first link 516 via an interface 596. In one embodiment, first link 516 may be a Peripheral Component Interconnect (PCI) bus, or a bus such as a PCI Express bus or another I/O interconnect bus, although the scope of the present invention is not so limited.
As illustrated in
Note that other embodiments are contemplated. For example, instead of the point-to-point architecture of
The programmable devices depicted in
Program instructions may be used to cause a general-purpose or special-purpose processing system that is programmed with the instructions to perform the operations described herein. Alternatively, the operations may be performed by specific hardware components that contain hardwired logic for performing the operations, or by any combination of programmed computer components and custom hardware components. The methods described herein may be provided as a computer program product that may include a machine readable medium having stored thereon instructions that may be used to program a processing system or other device to perform the methods. The term “machine readable medium” used herein shall include any medium that is capable of storing or encoding a sequence of instructions for execution by the machine and that cause the machine to perform any one of the methods described herein. The term “machine readable medium” shall accordingly include, but not be limited to, tangible, non-transitory memories such as solid-state memories, optical and magnetic disks. Furthermore, it is common in the art to speak of software, in one form or another (e.g., program, procedure, process, application, module, logic, and so on) as taking an action or causing a result. Such expressions are merely a shorthand way of stating that the execution of the software by a processing system causes the processor to perform an action or produce a result.
At least one embodiment is disclosed and variations, combinations, and/or modifications of the embodiment(s) and/or features of the embodiment(s) made by a person having ordinary skill in the art are within the scope of the disclosure. Alternative embodiments that result from combining, integrating, and/or omitting features of the embodiment(s) are also within the scope of the disclosure. Where numerical ranges or limitations are expressly stated, such express ranges or limitations may be understood to include iterative ranges or limitations of like magnitude falling within the expressly stated ranges or limitations (e.g., from about 1 to about 10 includes 2, 3, 4, etc.; greater than 0.10 includes 0.11, 0.12, 0.13, etc.). The use of the term “about” means ±10% of the subsequent number, unless otherwise stated.
Use of the term “optionally” with respect to any element of a claim means that the element is required, or alternatively, the element is not required, both alternatives being within the scope of the claim. Use of broader terms such as comprises, includes, and having may be understood to provide support for narrower terms such as consisting of, consisting essentially of, and comprised substantially of. Accordingly, the scope of protection is not limited by the description set out above but is defined by the claims that follow, that scope including all equivalents of the subject matter of the claims. Each and every claim is incorporated as further disclosure into the specification and the claims are embodiment(s) of the present disclosure.
The following examples pertain to further embodiments.
Example 1 includes a machine readable medium storing instructions for recovery of a program installed on a client device, comprising instructions that when executed cause a watchdog device to: transmit, to the client device, a request for an indication of an expected operation of a program installed on the client device; commit, to a distributed ledger on a plurality of interconnected devices, a first record responsive to receiving a response to the request from the client device within a predetermined period of time, the client device and the watchdog device being among the plurality of interconnected devices; commit, to the distributed ledger, a second record responsive to not receiving a response to the request within the predetermined period of time; and initiate a software recovery service for the client device responsive to committing the second record.
In Example 2, the subject matter of example 1 can optionally include that the instructions further comprise instructions that when executed cause the watchdog device to commit the request to the distributed ledger.
In Example 3, the subject matter of claim 1 or 2 can optionally include that the software recovery service for the client device includes one or more of the following: a first software recovery service that includes replacing the program with a known configuration of the program stored in an image; a second software recovery service that includes transferring one or more operations performed by the client device to a second client device, the second client device being one of the plurality of interconnected devices; a third software recovery service that includes decommissioning the client device; and a fourth software recovery service that includes dispatching a replacement device to replace the client device or a servicing entity to repair the client device.
In Example 4, the subject matter of claim 1, 2, or 3 can optionally include that the distributed ledger stores records of successful responses and indications of failure to respond in separate blocks of a blockchain.
In Example 5, the subject matter of claim 1, 2, 3, or 4 can optionally include that each transmitted response is generated according to a predetermined schedule.
In Example 6, the subject matter of claim 1, 2, 3, 4, or 5 can optionally include that the watchdog device includes at least one tamper resistant processor for executing at least some of the instructions in a secure environment in order to minimize or prevent security vulnerabilities.
In Example 7, the subject matter of claim 1, 2, 3, 4, 5, or 6 can optionally include that the instructions further comprise instructions that when executed cause the watchdog device to: determine, based on the distributed ledger, that the program is faulty.
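By way of illustration only, the watchdog-side flow of Examples 1 and 2 might be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the `Ledger` class is an in-memory stand-in for the distributed ledger, and the names `poll_client`, `send_request`, `recover`, and `timeout_s`, as well as the record field names, are hypothetical and do not appear in the examples above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Ledger:
    """In-memory stand-in for the distributed ledger (assumption)."""
    records: List[dict] = field(default_factory=list)

    def commit(self, record: dict) -> None:
        # A real ledger would replicate the record across the
        # interconnected devices; here it is only appended locally.
        self.records.append(record)


def poll_client(
    client_id: str,
    send_request: Callable[[str, float], Optional[str]],
    ledger: Ledger,
    recover: Callable[[str], None],
    timeout_s: float = 1.0,
) -> bool:
    """Poll one client once and record the outcome.

    `send_request` is a hypothetical transport hook: it returns the
    client's response, or None if no response arrived within
    `timeout_s` (the predetermined period of time).
    """
    # Example 2: the request itself may also be committed to the ledger.
    ledger.commit({"type": "request", "client": client_id})

    response = send_request(client_id, timeout_s)
    if response is not None:
        # First record: a response was received in time.
        ledger.commit(
            {"type": "response", "client": client_id, "payload": response}
        )
        return True

    # Second record: no response within the predetermined period.
    ledger.commit({"type": "no_response", "client": client_id})
    # Recovery is initiated responsive to committing the second record.
    recover(client_id)
    return False
```

In this sketch the recovery hook is called only after the second record has been committed, which mirrors the ordering recited in Example 1. The choice of which software recovery service to run (for instance, one of the four listed in Example 3) is left to the caller-supplied `recover` function.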
Example 8 includes a method for recovery of a program installed on a client device, the method comprising: transmitting, to the client device and by a watchdog device, a request for an indication of an expected operation of a program installed on the client device; committing, to a distributed ledger on a plurality of interconnected devices, a first record responsive to receiving a response to the request from the client device within a predetermined period of time, the client device and the watchdog device being among the plurality of interconnected devices; committing, to the distributed ledger, a second record responsive to not receiving a response to the request within the predetermined period of time; and initiating a software recovery service for the client device responsive to committing the second record.
In Example 9, the subject matter of claim 8 can optionally include that the method further comprises committing the request to the distributed ledger.
In Example 10, the subject matter of claim 8 or 9 can optionally include that the software recovery service for the client device includes one or more of the following: a first software recovery service that includes replacing the program with a known configuration of the program stored in an image; a second software recovery service that includes transferring one or more operations performed by the client device to a second client device, the second client device being one of the plurality of interconnected devices; a third software recovery service that includes decommissioning the client device; and a fourth software recovery service that includes dispatching a replacement device to replace the client device or a servicing entity to repair the client device.
In Example 11, the subject matter of claim 8, 9, or 10 can optionally include that the distributed ledger stores records of successful responses and indications of failure to respond in separate blocks of a blockchain.
In Example 12, the subject matter of claim 8, 9, 10, or 11 can optionally include that each transmitted response is generated according to a predetermined schedule.
In Example 13, the subject matter of claim 8, 9, 10, 11, or 12 can optionally include that the method further comprises determining, based on the distributed ledger, that the program is faulty.
Example 14 includes a watchdog device for recovery of a program installed on a client device, the watchdog device comprising: one or more processors; and a memory coupled to the one or more processors and storing instructions, comprising instructions that when executed cause the one or more processors to: transmit, to the client device, a request for an indication of an expected operation of a program installed on the client device; commit, to a distributed ledger on a plurality of interconnected devices, a first record responsive to receiving a response to the request from the client device within a predetermined period of time, the client device and the watchdog device being among the plurality of interconnected devices; commit, to the distributed ledger, a second record responsive to not receiving a response to the request within the predetermined period of time; and initiate a software recovery service for the client device responsive to committing the second record.
In Example 15, the subject matter of claim 14 can optionally include that the instructions further comprise instructions that when executed cause the one or more processors to commit the request to the distributed ledger.
In Example 16, the subject matter of claim 14 or 15 can optionally include that the software recovery service for the client device includes one or more of the following: a first software recovery service that includes replacing the program with a known configuration of the program stored in an image; a second software recovery service that includes transferring one or more operations performed by the client device to a second client device, the second client device being one of the plurality of interconnected devices; a third software recovery service that includes decommissioning the client device; and a fourth software recovery service that includes dispatching a replacement device to replace the client device or a servicing entity to repair the client device.
In Example 17, the subject matter of claim 14, 15, or 16 can optionally include that the distributed ledger stores records of successful responses and indications of failure to respond in separate blocks of a blockchain.
In Example 18, the subject matter of claim 14, 15, 16, or 17 can optionally include that each transmitted response is generated according to a predetermined schedule.
In Example 19, the subject matter of claim 14, 15, 16, 17, or 18 can optionally include that the one or more processors includes at least one tamper resistant processor for executing at least some of the instructions in a secure environment in order to minimize or prevent security vulnerabilities.
In Example 20, the subject matter of claim 14, 15, 16, 17, 18, or 19 can optionally include that the instructions further comprise instructions that when executed cause the one or more processors to determine, based on the distributed ledger, that the program is faulty.
Example 21 includes a machine readable medium storing instructions for recovery of a program installed on a client device, comprising instructions that when executed cause the client device to: transmit, to a watchdog device, a message indicating an expected operation of a program installed on the client device; commit, to a distributed ledger on a plurality of interconnected devices, a first record responsive to transmitting the message to the watchdog device within a predetermined period of time, the client device and the watchdog device being among the plurality of interconnected devices; commit, to the distributed ledger, a second record responsive to not transmitting the message to the watchdog device within the predetermined period of time; and initiate a software recovery service for the client device responsive to committing the second record.
Example 22 includes a method for recovery of a program installed on a client device, the method comprising: transmitting, by the client device and to a watchdog device, a message indicating an expected operation of a program installed on the client device; committing, to a distributed ledger on a plurality of interconnected devices, a first record responsive to transmitting the message to the watchdog device within a predetermined period of time, the client device and the watchdog device being among the plurality of interconnected devices; committing, to the distributed ledger, a second record responsive to not transmitting the message to the watchdog device within the predetermined period of time; and initiating a software recovery service for the client device responsive to committing the second record.
Example 23 includes a client device for recovery of an installed program, comprising: one or more processors; and a memory coupled to the one or more processors and storing instructions, wherein the instructions comprise instructions that when executed cause at least some of the one or more processors to: transmit, to a watchdog device, a message indicating an expected operation of a program installed on the client device; commit, to a distributed ledger on a plurality of interconnected devices, a first record responsive to transmitting the message to the watchdog device within a predetermined period of time, the client device and the watchdog device being among the plurality of interconnected devices; commit, to the distributed ledger, a second record responsive to not transmitting the message to the watchdog device within the predetermined period of time; and initiate a software recovery service for the client device responsive to committing the second record.
In Example 24, the subject matter of claim 23 can optionally include that the one or more processors includes at least one tamper resistant processor for executing at least some of the instructions in a secure environment in order to minimize or prevent security vulnerabilities.
In Example 25, the subject matter of claim 23 or 24 can optionally include that the client device further comprises: an auxiliary power source configured to power the tamper resistant processor independently of other components of the client device.
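As a further illustration, several of the examples above recite storing records of successful responses and indications of failure to respond in separate blocks of a blockchain, and determining, based on the distributed ledger, that the program is faulty. The following is one possible sketch, assuming a simple SHA-256 hash chain with one record per block. The function names, the record fields, and in particular the fault rule (the most recent `threshold` outcomes for a client are all failures to respond) are hypothetical choices made for illustration; the examples do not specify how the determination is made.

```python
import hashlib
import json
from typing import List

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first block


def make_block(prev_hash: str, record: dict) -> dict:
    """Build a block whose hash covers the record and the previous hash."""
    body = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"prev": prev_hash, "record": record, "hash": digest}


def append_record(chain: List[dict], record: dict) -> dict:
    """Store one record in its own block at the end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = make_block(prev_hash, record)
    chain.append(block)
    return block


def chain_is_valid(chain: List[dict]) -> bool:
    """Check that every block links to its predecessor and is unmodified."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if make_block(block["prev"], block["record"])["hash"] != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True


def program_is_faulty(chain: List[dict], client_id: str, threshold: int = 3) -> bool:
    """Hypothetical rule: faulty if the client's latest `threshold`
    outcome records are all failures to respond."""
    outcomes = [
        block["record"]["type"]
        for block in chain
        if block["record"].get("client") == client_id
        and block["record"].get("type") in ("response", "no_response")
    ]
    recent = outcomes[-threshold:]
    return len(recent) == threshold and all(t == "no_response" for t in recent)
```

Because each block's hash covers the previous block's hash, altering an earlier record changes the hash recomputed for that block, which `chain_is_valid` would detect. A deployed ledger would additionally need a consensus mechanism among the interconnected devices, which this sketch omits.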
It is to be understood that the above description is intended to be illustrative, and not restrictive. For example, the above-described embodiments may be used in combination with each other. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention therefore should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
In this document, reference has been made to blockchain technologies, such as ethereum and bitcoin. ETHEREUM may be a trademark of the Ethereum Foundation (Stiftung Ethereum). BITCOIN may be a trademark of the Bitcoin Foundation. These and any other marks referenced herein may be common law or registered trademarks of third parties affiliated or unaffiliated with the applicant or the assignee. Use of these marks is by way of example and shall not be construed as descriptive or to limit the scope of the embodiments described herein to material associated only with such marks.