Embodiments of the present invention relate generally to electrical memory devices and, in particular, to a method and system for tracking and recording changes to a memory and restoring memory based on the recorded changes.
One exemplary embodiment is a method for tracking memory changes. The method includes defining a change-track area of memory having at least one memory address range for which changes will be tracked and allocating a protected log region of memory for storing a change-track log. An operational mode for change tracking is selected from among a plurality of modes, the selected operational mode having criteria (or a criterion) for tracking memory changes. Memory transactions are detected using a memory logging module. The method also includes generating a transaction record for each memory transaction that occurs in the change-track area of memory and which meets the criteria, and storing the transaction record in the change-track log.
Another exemplary embodiment is a system for tracking memory changes. The system includes a memory, a processor and a memory transaction logging module. The memory includes a first region identified as a change-track region and a second region allocated as a protected log region, the change-track region having one or more memory areas for which changes will be tracked. The processor can be coupled to the memory via a bus or any other suitable wired or wireless link. The memory transaction logging module is adapted to: provide an operational mode for tracking changes including criteria (or a criterion) for logging memory transactions; detect memory transactions on the bus; and store transaction details for any memory transaction that is directed to the change-track region of memory and that meets the criteria.
In operation, certain memory transactions can be recorded and later retrieved. For example, as the processor 112 executes processes 114-118 the processor 112 may access the memory 102 via memory transactions (e.g., read and write operations). It may be desirable to record certain ones of the memory transactions for later use such as recovery from errors, restoration of another memory, diagnostic or other operational analysis, or the like.
In the exemplary embodiment shown in
Memory transactions can be detected by a memory logging module that can be a single module or a distributed module within the system 100. For example, the memory logging module can include software instructions that, when executed, cause the processor to detect memory transactions, generate memory transaction records, and log those memory transaction records that meet the criteria described below for recording. Once detected, memory transactions can be evaluated according to address and operational criteria. A memory transaction record can be generated for those transactions meeting the defined or selected criteria. The memory transaction records can be recorded to the protected log region 110. The protected log region 110 may be allocated as a protected region of memory in order to help ensure that the log memory does not get corrupted by a process executing on the processor 112. The memory transaction records can include details such as the type of transaction, the address value (or values) referenced, and the subject data value. Other information could also be recorded such as control signals, a time stamp, device or processor identification, or the like. In general, any information that may be useful in later restoring another processor's memory, or in reconstructing or analyzing memory transactions, can be stored as part of the transaction record.
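By way of illustration only, a memory transaction record of the kind described above might be sketched as follows. The class name TransactionRecord, the helper make_record and all field names are hypothetical and are not taken from the disclosure; the sketch merely assumes a record carrying the transaction type, the address, the subject data value, a time stamp and a processor identification.

```python
from dataclasses import dataclass
from typing import Optional
import time


@dataclass(frozen=True)
class TransactionRecord:
    """Hypothetical layout for one change-track log entry."""
    kind: str                           # type of transaction, e.g. "read" or "write"
    address: int                        # address value referenced
    value: Optional[int] = None         # subject data value (may be omitted)
    timestamp: float = 0.0              # optional time stamp
    processor_id: Optional[int] = None  # optional processor identification


def make_record(kind, address, value=None, processor_id=None):
    # Stamp the record with a monotonic clock reading at creation time.
    return TransactionRecord(kind, address, value, time.monotonic(), processor_id)


rec = make_record("write", 0x1000, 0xFF, processor_id=0)
```

A frozen dataclass is used here so that a record cannot be altered after it is generated, which loosely mirrors the intent of a protected log; other representations would serve equally well.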
Although shown in
As memory transactions occur on the bus 120 (or another memory-processor interface such as a cache memory boundary), a record of the transactions can be made by the change-track logging module. It will be appreciated that the memory 102 and the bus 120 may comprise multiple physical and/or logical devices and a memory transaction may span two or more of these physical and/or logical devices.
As shown in
The modified software can be implemented directly (i.e., included in the source code and built as part of the software) or implemented as a library that can “overload” memory functions (i.e., include functions that replace existing memory functions with ones containing software code for memory change-track logging) and can be included in the software processes at the compile stage or linking stage (or any stage in the software design, build or execution process). Also, the modified software can include modified firmware or modified operating system software.
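As a non-limiting sketch of the library approach, the following shows a memory wrapper whose write operation is replaced by one that also performs change-track logging. The class TrackedMemory, its attributes and the half-open (start, end) representation of address ranges are assumptions made for illustration only.

```python
class TrackedMemory:
    """Hypothetical memory wrapper whose write path also logs tracked changes."""

    def __init__(self, size, tracked_ranges):
        self._cells = bytearray(size)
        self._ranges = list(tracked_ranges)  # half-open (start, end) address ranges
        self.log = []                        # stands in for the change-track log

    def _tracked(self, address):
        return any(start <= address < end for start, end in self._ranges)

    def __getitem__(self, address):
        return self._cells[address]

    def __setitem__(self, address, value):
        # Perform the ordinary write first, then log it if it is in a tracked range.
        self._cells[address] = value
        if self._tracked(address):
            self.log.append(("write", address, value))


mem = TrackedMemory(64, [(16, 32)])
mem[4] = 1    # outside the tracked range: written but not logged
mem[20] = 7   # inside the tracked range: written and logged
```

Because application code continues to use ordinary indexing, the logging behavior can be introduced without changing the call sites, which is the property the overloading approach relies on.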
The address ranges A-C shown in
The processor 112 (and other processors described herein) can be any device that executes an algorithm or otherwise processes data, including, but not limited to, sequential microprocessors, vector microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), graphics processing units (GPUs), application specific integrated circuits (ASICs), or the like. As used herein “processor” can refer to a unitary processor or a distributed processor; processor can also refer to a multi-core processor including a hardware implementation (e.g., multiple processors packaged in a single device or constructed on a single substrate) or a firmware implementation (e.g., so-called “soft core” configured onto a programmable logic device). The memory 102 (and other memory devices described herein) can include any internal or external device or combination of devices used to store data or instructions or both.
It will be appreciated that different schemes can be used to store the memory transactions in the change-track log such as storing the most recent, storing all, or another scheme based on a contemplated embodiment.
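For example, and without limitation, the "store all" and "store most recent" schemes might be sketched as below. The function make_log and the scheme names are hypothetical; the sketch assumes that a bounded double-ended queue is an acceptable stand-in for a log that retains only the most recent entries.

```python
from collections import deque


def make_log(scheme, capacity=None):
    """Return a container implementing one hypothetical storage scheme."""
    if scheme == "all":
        return []                      # unbounded: every transaction is kept
    if scheme == "most_recent":
        return deque(maxlen=capacity)  # bounded: oldest entries are discarded
    raise ValueError(f"unknown scheme: {scheme}")


recent = make_log("most_recent", capacity=3)
everything = make_log("all")
for address in range(5):
    recent.append(("write", address))
    everything.append(("write", address))
```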
The recorded memory transactions can be retrieved and used to synchronize two memories. For example, in a multi-processor system one processor may experience an error condition that requires the processor to be reset or rebooted. During reboot time, the processor is not processing data and therefore may be out of synchronization with other processors once it resumes executing application software. The recorded memory transactions can be retrieved and transmitted to the processor being rebooted in order to bring its memory up to a current state and make it possible for the rebooted processor to resume synchronous operation with the other processors in the system. The multi-processor restore operation will be described in greater detail below with respect to
In addition to being useful for restoring a rebooted or reset processor's memory, the memory change-track log may also have other uses such as diagnostics, audit trail, or any other use where retrieving recorded memory transactions may be required, desired or helpful.
The process of tracking changes to a memory may be carried out continuously or intermittently. Intermittent operation may be initiated in response to a signal. For example, in the scenario described above in which one processor of a multi-processor system requires resetting (or rebooting), the function of resetting the processor could also include generating a signal to request that memory change tracking be initiated. Thus, as the processor is being reset, memory transactions are being recorded in order to restore the processor once it resumes operation. Other types of signals can be used to initiate or terminate memory change tracking and can be determined based on a system design or operational requirements.
In operation, the software executing on the processor 206 can be modified to include a separate log process 214 that monitors the memory transactions of the other processes (208-212) and records the transactions meeting defined address and operational mode criteria. The separate logging process 214 can make it possible to provide additional security by isolating the logging process. The separate logging process 214 may be incorporated into the software for execution on the processor 206 via the direct source modification or library options described above, or by other suitable methods such as applying a patch to the operating system kernel.
The interrupt handler embodiment of the system 300 may not require modification of the software of the processes (308-312). Further, the interrupt handler embodiment may provide additional security because the logging mechanism is independent from the application processes (308-312).
Processors typically include an interrupt generating/handling mechanism that can be used to detect events such as memory transactions and generate an interrupt signal in response (e.g., a cache miss protocol). A process (i.e., an interrupt handler) can be assigned to respond to the interrupt signal. For processors or other devices that do not include an implementation of a conventional interrupt/interrupt handler system, other hardware or software mechanisms can be implemented to provide an analogous function that can execute independently on the device. For example, a separate hardware circuit implemented on an FPGA could be designed and implemented to monitor memory transactions and record those transactions meeting address and/or other criteria.
In operation, the processor 306 includes an interrupt mechanism that can be used to generate an interrupt when a memory transaction occurs. When one of the processes (308-312) performs a memory transaction, an interrupt is raised and the log handler 314 is executed; the memory transaction can be stored locally in the internal log 315. The log handler 314 can evaluate the memory transaction to determine if the transaction is occurring in a change-track region of the memory and also to determine if the memory transaction meets any other operational criteria. Although shown as part of the processor 306, the log handler 314 can be an external process or circuit. In addition, the log could be external to the memory (302) or can be kept within memory internal to the processor (306). Such arrangements of the log can also be applied to other embodiments described herein, but have been omitted for the sake of clarity in describing other features.
The system 300 can include means for describing a change-track area of memory such as a table or other data structure including one or more address ranges that correspond to one or more logical or physical devices that can be internal or external to the system 300. The change-track area description can be provided by the processor 306, by the log handler 314 or by another internal or external circuit, process or module.
The system 300 can include means for providing data storage space for use as the protected change-track log 304. The providing means can include allocation of memory by the processor 306, a configuration file or data that indicates a portion of the memory to be allocated as the log 304, or any other suitable means for allocating or defining a region of memory for use as the log 304.
The system 300 can also include means for selecting an operational mode for tracking memory changes having criteria for changes to be tracked. The operational mode can be selected by the processor 306 while executing and may be changeable during execution depending on an operational context. The operational mode could be selected by the log handler 314. Alternatively, the operational mode could be selected by configuration data, hardwired configuration selection, an internal or external device, circuit or module, or any other means suitable for selecting an operational mode. The operational mode can be selected from among the operational modes described above or other modes suitable for use in a contemplated embodiment.
The system 300 can also include means for detecting any memory transaction that occurs in the change-track area and which meets the criteria. Hardware or software memory transaction modules, circuits or devices can be used to detect relevant memory transactions. Also, the system 300 includes means for generating an interrupt signal based on the detection of a transaction of interest. Typically, interrupt signals are generated by hardware sections dedicated to interrupt processing, but other interrupt generation means can be used including dedicated hardware for generating interrupts, a software interrupt generator, or any other means suitable for generating an interrupt signal. The interrupt signal can be a hardware signal (e.g., a signal line provided internally or externally from a device, circuit or module), a software signal (e.g., a semaphore, or the like) or both.
As mentioned, the system 300 includes a log handler 314. The log handler 314 includes means for handling the interrupt signal by storing transaction information associated with the transaction of interest in the change-track log. Such means can include a software module that has been defined as the interrupt handler associated with the interrupt signal for memory transactions. Other interrupt handling structures can be used including an integrated or stand-alone hardware interrupt handler (e.g., an FPGA or portion of an FPGA configured for use as an interrupt handler).
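The interrupt handler arrangement can be illustrated, under simplifying assumptions, by the following software-only sketch. The classes SimulatedProcessor and LogHandler, the writes_only flag and the callback-based "interrupt" are all hypothetical stand-ins; an actual embodiment would typically rely on the interrupt hardware of the processor rather than on a method call.

```python
class SimulatedProcessor:
    """Toy processor that raises a software 'interrupt' on every memory access."""

    def __init__(self, memory_size):
        self.memory = [0] * memory_size
        self._handler = None

    def set_interrupt_handler(self, handler):
        self._handler = handler

    def _raise_interrupt(self, kind, address, value):
        if self._handler is not None:
            self._handler(kind, address, value)

    def write(self, address, value):
        self.memory[address] = value
        self._raise_interrupt("write", address, value)

    def read(self, address):
        value = self.memory[address]
        self._raise_interrupt("read", address, value)
        return value


class LogHandler:
    """Hypothetical handler: logs accesses in the tracked area that meet the mode."""

    def __init__(self, tracked_ranges, writes_only=True):
        self.tracked_ranges = list(tracked_ranges)
        self.writes_only = writes_only
        self.log = []

    def __call__(self, kind, address, value):
        in_area = any(start <= address < end for start, end in self.tracked_ranges)
        meets_mode = kind == "write" or not self.writes_only
        if in_area and meets_mode:
            self.log.append((kind, address, value))


cpu = SimulatedProcessor(memory_size=32)
handler = LogHandler(tracked_ranges=[(8, 16)])
cpu.set_interrupt_handler(handler)
cpu.write(2, 11)   # outside the change-track area
cpu.write(9, 22)   # inside the area and a write: logged
cpu.read(9)        # inside the area but a read: ignored in writes_only mode
```

In this sketch the code performing the reads and writes has no knowledge of the handler, which reflects the independence of the logging mechanism from the application processes.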
In operation, the monitoring device 405 monitors memory transactions (e.g., by directly monitoring the signals on the bus or any other means of interconnection 414) without interfering in those transactions. Any transactions that occur in the change-track area and also meet any other applicable criteria can be recorded to the log region 404.
Because the monitoring device 405 can monitor memory transactions without modifying, or placing a burden on, the processes or the memory itself, the monitoring device 405 option can make it possible for memory transaction monitoring to occur without an adverse impact on system performance or throughput. However, the monitoring device 405 may require additional specialized hardware for implementation.
In operation, the monitoring device 505 monitors memory transactions (e.g., by directly monitoring the signals on the interconnection means 514 via link 516) without interfering in those transactions. Any transactions that occur in the change-track area and also meet any other applicable criteria can be recorded to the log region 504. Also, the monitoring device 505 can provide recorded memory transactions to an external device or system via external link 518.
In operation, memory transactions can be generated by each of the processors (606 and 624). The memory transactions associated with one or both processors may be monitored by the monitoring device 618 and a transaction can be recorded if it meets the criteria for change-track logging. It should be appreciated that although two memories and one monitoring device are shown, a system can include additional memories and/or monitoring devices. As such, the embodiment may be implemented with two or more processors.
The monitoring device 618 can record certain memory transactions relating to the processors (606 and 624) to the respective protected log (604 and 622) of the processor. In addition to the protected log regions (604 and 622) the monitoring device 618 can also record memory transactions from one or more processors to a remote log or device 636. In some cases, a remote device may not be able to sustain a data transfer rate sufficient to keep up with a device such as the monitoring device. The monitoring device 618 can use the local log buffer 638 to buffer memory transactions for transfer to the remote log/device 636. Alternatively, an embodiment could include logging the memory transactions to the local log buffer 638.
By providing a monitoring device that can monitor the memory transactions of two or more processors, it is possible for the monitoring device 618 to compare memory transactions from the various processors. The comparison of memory transactions can occur in “real-time” (i.e., as they are occurring) or in a time-delayed manner. This type of comparison can be used in various embodiments described below in more detail.
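A time-delayed comparison of the memory transactions of two processors might, for example, be sketched as follows. The function first_divergence and the tuple representation of a transaction are hypothetical, and the sketch assumes that redundant processors are expected to produce identical transaction sequences.

```python
from itertools import zip_longest


def first_divergence(stream_a, stream_b):
    """Return the index of the first differing transaction, or None if none differ."""
    for index, (a, b) in enumerate(zip_longest(stream_a, stream_b)):
        # zip_longest pads the shorter stream with None, so a missing
        # transaction is also reported as a divergence.
        if a != b:
            return index
    return None


log_a = [("write", 1, 5), ("write", 2, 6)]
log_b = [("write", 1, 5), ("write", 2, 7)]
mismatch_at = first_divergence(log_a, log_b)
in_step = first_divergence(log_a, list(log_a))
```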
In operation, the second memory 712 can be updated to a current state by the memory track change controller 711 transferring changes (hatched areas) made to regions of interest (704 and 706) in the first memory 702 to the corresponding regions of interest (714 and 716) in the second memory 712. The memory changes, while shown for illustration purposes as coming from the first memory 702 to the second memory 712, can also be provided from the protected log 708 of the first memory 702.
In step 804, a change-track area of memory is defined or described. The change track area includes at least one memory address range for which changes will be tracked. Processing continues to step 806.
In step 806, a protected log region of memory for storing a change-track log is allocated. Processing continues to step 808.
In step 808, an operational mode for change tracking is selected from among a plurality of modes. As described above, the selected operational mode can have criteria (or a criterion) for tracking memory changes. Processing continues to step 810.
In step 810, memory transactions are detected using a memory logging module or process. Processing continues to step 812.
In step 812, a transaction record can be generated for each memory transaction that occurs in the change-track area of memory and which meets the criteria. The memory transaction record can include a defined data structure or can be as simple as a few items of data. Processing continues to step 814.
In step 814, the transaction record is stored in the change-track log. If tracking is still enabled (815), then processing can continue back to step 810. If tracking is disabled, processing can move to step 816 where processing ends. It will be appreciated that steps 802-816 can be repeated in whole or in part in order to accomplish a contemplated memory change-track logging task.
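Steps 804 through 814 can be summarized, purely as an illustrative sketch, by the function below. The mode names "writes_only" and "all_accesses", the MODES table and the function track_changes are assumptions introduced for the example and do not correspond to any particular operational mode named in the disclosure.

```python
# Hypothetical operational modes, each supplying a criterion on the transaction type.
MODES = {
    "writes_only": lambda kind: kind == "write",
    "all_accesses": lambda kind: True,
}


def track_changes(transactions, tracked_ranges, mode):
    meets_criteria = MODES[mode]  # step 808: select an operational mode
    log = []                      # step 806: stands in for the protected log region
    for kind, address, value in transactions:  # step 810: detect transactions
        # step 804: the change-track area is given as half-open address ranges
        in_area = any(start <= address < end for start, end in tracked_ranges)
        if in_area and meets_criteria(kind):
            record = {"kind": kind, "address": address, "value": value}  # step 812
            log.append(record)                                           # step 814
    return log


observed = [("write", 0x10, 1), ("read", 0x12, 1), ("write", 0x40, 2)]
log = track_changes(observed, tracked_ranges=[(0x10, 0x20)], mode="writes_only")
```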
It should be appreciated that the exemplary method shown in
In step 904, change-track data is retrieved from the change-track log. As described above the change-track data can include address values, data values and other information. Processing continues to step 906.
In step 906, data is retrieved as needed from the memory corresponding to the processor processing the change-track data retrieval operation. Data may be needed from memory in cases where the change-track log does not include data values (e.g., only address values were recorded to the change-track log). The data retrieved from the memory can be joined with the change-track log data for transfer to the processor or other system. Processing continues to step 908.
In step 908, the change-track log data (and possibly data from memory) is provided to another processor or device. Optionally, in a step not shown, the data can be validated or checked for validity prior to being provided to another processor or device. Processing continues to step 910, where processing ends. It will be appreciated that steps 902-910 can be repeated in whole or in part in order to accomplish a contemplated change-track data retrieval operation.
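The retrieval operation of steps 904 through 908 might be sketched as follows, assuming for illustration that each log entry is a dictionary that always holds an address and only sometimes holds a data value. The function retrieve_change_data and the dictionary-based memory are hypothetical.

```python
def retrieve_change_data(log, memory):
    """Join log entries with memory contents where the log holds addresses only."""
    updates = []
    for entry in log:                 # step 904: read the change-track log
        address = entry["address"]
        value = entry.get("value")
        if value is None:             # step 906: fetch the missing value from memory
            value = memory[address]
        updates.append((address, value))
    return updates                    # step 908: ready to provide to another device


memory = {0x10: 5, 0x14: 9}
log = [{"address": 0x10, "value": 3}, {"address": 0x14}]
updates = retrieve_change_data(log, memory)
```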
One consideration applicable to any of the embodiments described above that include a log region is how to determine an appropriate log region size. The size of a log region can be based on a number of factors including frequency of memory access, size of the change-track memory area being monitored and length of time that the logging system is predicted to be active. Another consideration is how to handle a situation in which the log region becomes full. Several options can be implemented to handle a log region overflow situation, such as automatically suspending logging activity or continuing logging with some scheme for data replacement. Data replacement can follow a first-in-first-out (FIFO) method, a last-in-first-out (LIFO) method, an n-way redundant index-based replacement strategy, or the like. In general, any now known or later developed scheme or method for handling overflow conditions of buffers or queues can be implemented.
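Two of the overflow options mentioned above, suspending logging and FIFO replacement, can be illustrated by the following sketch. The class BoundedLog and the policy names "suspend" and "fifo" are hypothetical and are offered only as one possible arrangement.

```python
from collections import deque


class BoundedLog:
    """Hypothetical fixed-capacity log with a selectable overflow policy."""

    def __init__(self, capacity, policy):
        if policy not in ("suspend", "fifo"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.entries = deque()
        self.suspended = False

    def append(self, record):
        """Store a record; return False if it was dropped because logging suspended."""
        if len(self.entries) < self.capacity:
            self.entries.append(record)
            return True
        if self.policy == "suspend":
            self.suspended = True   # log is full: stop recording
            return False
        self.entries.popleft()      # FIFO replacement: evict the oldest record
        self.entries.append(record)
        return True


fifo_log = BoundedLog(capacity=2, policy="fifo")
suspend_log = BoundedLog(capacity=2, policy="suspend")
for record in (1, 2, 3):
    fifo_log.append(record)
    suspend_log.append(record)
```

The FIFO policy favors the most recent history, whereas suspension preserves the earliest history; which is preferable would depend on the use contemplated for the log.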
Another possible implementation for handling log region overflow conditions is to define a method by which the protected log region can grow. For example, if an overflow condition is imminent or is occurring, the logging process could request an additional allocation of protected log region memory in order to continue logging transactions. Alternatively, the logging process could change logging locations in a hierarchical fashion if the primary log region becomes full. For example, the logging process could send transactions to an external storage device if the local log region becomes full. A mechanism may also be provided for the logging process to release the additionally allocated memory once a need for it has passed (e.g., once logging has stopped and the logged data has been retrieved). A method or process for determining a rollover or overflow location can be implemented in a hierarchical based location system so that a track change processor can determine a subsequent logging location if a currently used location becomes full. For example, referring to
Another embodiment for addressing storage issues can include the logging process having multiple operating modes (in addition to the modes discussed above). For example, the logging process may operate in a mode that uses less memory (e.g., logging only address data) for a majority of processing time and may periodically (e.g., once per time period or every n-memory accesses) switch into a mode that stores more data (e.g., recording address values and data values) and which may provide higher fidelity.
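As one hedged illustration of such mode switching, the sketch below logs only the address for most writes and records both the address and the data value on every n-th write. The function log_with_periodic_detail and its period parameter are hypothetical.

```python
def log_with_periodic_detail(writes, period):
    """Log addresses only, except every `period`-th write, which also keeps the value."""
    log = []
    for count, (address, value) in enumerate(writes, start=1):
        if count % period == 0:
            log.append((address, value))  # higher-fidelity mode: address and data
        else:
            log.append((address, None))   # low-memory mode: address only
    return log


writes = [(1, 10), (2, 20), (3, 30), (4, 40)]
mixed_log = log_with_periodic_detail(writes, period=2)
```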
As mentioned above, an embodiment can be used as an update and synchronization mechanism for processor lockstep operations (e.g., redundant processors). In such an embodiment, memory transactions of one of the processors can be recorded making it possible to update another processor's memory at a later time by transferring changes made to memory during a given period of time to the other processor. If a processor is determined to be out of lock step (or experiencing an error condition) it may need to be reset or rebooted as described above. The other processors may continue operating and may initiate memory change-track logging. Once the processor experiencing an error has been rebooted and resumes execution, the memory contents can be updated as described above with respect to
Conventional systems may require that all processors halt while one is rebooting in order to maintain lockstep. In contrast, an embodiment of the present invention can make it possible for the processors that have not suffered an error to continue processing data during a majority of the rebooting process. Because memory changes for the processors that have not suffered an error can be recorded while the erroneous processor is being rebooted, the rebooted processor can be brought into synchronization once it begins operating after reset or reboot. An initial bulk transfer of memory values (e.g., a baseline memory image) may need to be made to the rebooted processor. At a certain point in the rebooting process, all of the processors may need to be halted for a brief period while any updates recorded in the change-track log are transferred to the rebooted processor and applied to the memory image in the rebooted processor. Once the changes have been applied, all of the processors can resume operation and will be operating in lockstep or in some other synchronized fashion. So, an embodiment can make it possible for an increased duration of system functionality (or “up-time”) during a processor reset or reboot as compared to some conventional systems without memory change-track logging.
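The restore sequence described above, in which a baseline memory image is transferred and the logged changes are then applied, might be sketched as follows. The function restore_memory and the (address, value) form of a log entry are assumptions for illustration only.

```python
def restore_memory(baseline_image, change_log):
    """Rebuild a current memory image from a baseline plus logged writes."""
    restored = bytearray(baseline_image)  # initial bulk transfer (baseline image)
    for address, value in change_log:     # replay in logged order: later writes win
        restored[address] = value
    return restored


baseline = bytes([0, 0, 0, 0])
changes = [(1, 5), (3, 7), (1, 6)]        # address 1 is written twice
restored = restore_memory(baseline, changes)
```

Replaying the entries in the order in which they were logged is what allows a later write to the same address to supersede an earlier one.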
An embodiment can also be used to determine if redundant processors are in lock-step or some other synchronized fashion (see, e.g., the description of
Another embodiment of this invention that incorporates the memory transaction method can be employed in comparing data from processors that are not performing traditional lockstep operations. Transaction-based or batch processing are exemplary areas of computing related to this type of processing in which the correctness of processing is determined by comparing the final result of a collection of operations that are together assumed to be atomic. Examples of transaction-based processing include cryptographic processing, image processing, and data mining operations. Testing for failures in this method of computing contrasts with lockstep processing in that individual operations are not checked (i.e., only the final output is checked), so the processing, and therefore the checking, does not have to occur simultaneously. Therefore, a single processor can execute the same algorithm on the same data set in a repetitive and sequential manner (also known as temporal redundant processing) and the outputs of each of the sequential atomic processing groups can be compared by monitor 618 in
In contrast, an embodiment could include logging memory transactions using the memory change-track logging method. Periodically, a delimiter could be inserted into the log at checkpoint intervals. Thus, it can be possible to reduce processor downtime while check-pointing is occurring because memory change logging can occur while the processor is operating. An embodiment also may make it possible to reduce the time for a roll-back operation because only those memory values that have changed since the last check-pointing operation need to be copied back into memory.
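A minimal sketch of check-pointing with log delimiters is given below. It assumes, beyond what is stated above, that a snapshot of memory as of the last checkpoint is retained so that changed values can be copied back; the class CheckpointedMemory, the CHECKPOINT delimiter object and the method names are all hypothetical.

```python
CHECKPOINT = object()  # hypothetical delimiter inserted into the log at checkpoints


class CheckpointedMemory:
    def __init__(self, size):
        self.cells = [0] * size
        self.snapshot = list(self.cells)  # assumed copy of memory at last checkpoint
        self.log = []                     # addresses written, plus delimiters

    def write(self, address, value):
        self.cells[address] = value
        self.log.append(address)

    def _changed_since_checkpoint(self):
        changed = set()
        for entry in reversed(self.log):  # scan back to the most recent delimiter
            if entry is CHECKPOINT:
                break
            changed.add(entry)
        return changed

    def checkpoint(self):
        # Refresh the snapshot only for addresses changed since the last delimiter.
        for address in self._changed_since_checkpoint():
            self.snapshot[address] = self.cells[address]
        self.log.append(CHECKPOINT)

    def rollback(self):
        # Copy back only the values that changed since the last checkpoint.
        changed = self._changed_since_checkpoint()
        for address in changed:
            self.cells[address] = self.snapshot[address]
        while self.log and self.log[-1] is not CHECKPOINT:
            self.log.pop()                # discard the rolled-back entries
        return len(changed)


m = CheckpointedMemory(4)
m.write(0, 1)
m.checkpoint()
m.write(0, 9)
m.write(2, 5)
copied_back = m.rollback()
```

In this sketch both the checkpoint and the roll-back touch only the addresses recorded since the last delimiter, rather than the whole memory.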
An embodiment can also be used for data coherency schemes (e.g., cache coherency). A region of interest of the cache or memory can be tagged with the appropriate cache coherency state names for a given coherency scheme (e.g., write-exclusive, read-only, etc.) and the logging device or process can be used to detect incorrect accesses to a given area of memory (e.g., writing to a read-only location). Also, an embodiment can be configured to provide updated data to other processors (see, e.g.,
Another embodiment can be used for operating system security. For example, a memory change-track logging device or process can be adapted to help ensure that an operating system is maintaining security of certain sensitive regions of memory and that safeguards for those regions are enforced. In a hardware implementation of the logging device or process, additional security may be provided by having a security check function occur in hardware as compared to software security functions that may be easier to compromise. A modification to a sensitive area of memory can be recorded and checked to ensure that safeguard measures are not being circumvented. The memory change-track device or method can be adapted to shut down the processor if a specified sensitive area of memory is accessed (e.g., modifications to the operating system, sensitive application instructions, protected system password areas, or the like). Also, an embodiment (e.g., that shown in
The exemplary embodiments described above have included a processor executing three processes for illustration purposes. It will be appreciated that an actual implementation or embodiment may include more or fewer processes executing on a processor.
An embodiment of the present invention can be used to handle situations in which one or more processors or memories encounters a fault. For example, a fault can arise from the interaction of ionizing radiation with the processor(s) and/or memory device(s). Specific examples of ionizing radiation include highly-energetic particles such as protons, ions, and neutrons. A flux of highly-energetic particles can be present in environments including terrestrial and space environments. As used herein, the phrase “space environment” refers to the region beyond about 50 miles (80 km) in altitude above the earth.
Faults can arise from any source in any application environment such as from the interaction of ionizing radiation with one or more of the processors or memories. In particular, faults can arise from the interaction of ionizing radiation with the processor(s) in the space environment. It should be appreciated that ionizing radiation can also arise in other ways, for example, from impurities in solder used in the assembly of electronic components and circuits containing electronic components. These impurities typically cause a very small fraction (e.g., <<1%) of the error rate observed in space radiation environments.
An embodiment can be constructed and adapted for use in a space environment, generally considered as 80 km altitude or greater, and included as part of the electronics system of one or more of the following: a satellite, or spacecraft, a space probe, a space exploration craft or vehicle, an avionics system, a telemetry or data recording system, a communications system, or any other system where distributed memory synchronized processing may be useful. Additionally, the embodiment can be constructed and adapted for use in a manned or unmanned aircraft including avionics, telemetry, communications, navigation systems or a system for use on land or water.
Embodiments of the method, system and apparatus for memory change track logging may be implemented on a general-purpose computer, a special-purpose computer, a programmed microprocessor or microcontroller and/or peripheral integrated circuit element, an ASIC or other integrated circuit, a digital signal processor, a graphics processor, a hardwired electronic or logic circuit such as a discrete element circuit, a programmable logic device such as a PLD, PLA, FPGA, PAL, or the like. In general, any process capable of implementing the structures, functions or steps described herein can be used to implement embodiments of the method, system, or device for memory change track logging.
Furthermore, embodiments of the disclosed method, system, and device for memory change track logging may be readily implemented, fully or partially, in software using, for example, assembly language, high level language, symbolic or graphical language, modeling language, and/or object or object-oriented software development environments that provide portable software code that can be used on a variety of computer platforms. Alternatively, embodiments of the disclosed method, system, and device for memory change track logging can be implemented partially or fully in hardware using, for example, standard logic circuits or a VLSI design. Other hardware or software can be used to implement embodiments depending on the speed and/or efficiency requirements of the systems, the particular function, and/or a particular software or hardware system, microprocessor, or microcomputer system being utilized. Embodiments of the method, system, and device for memory change track logging can be implemented in hardware and/or software using any known or later developed systems or structures, devices and/or software by those of ordinary skill in the applicable art from the functional description provided herein and with a general basic knowledge of the computer and electrical arts.
Moreover, embodiments of the disclosed method, system, and device for memory change track logging can be implemented in software executed on a programmed general-purpose computer, a special purpose computer, a microprocessor, a microcontroller, a soft-core configured as a portion of an FPGA, or the like. Also, the memory change track logging method of this invention can be implemented as a program embedded on a personal computer such as a JAVA® or CGI script, as a resource residing on a server or graphics workstation, as a routine embedded in a dedicated processing system, or the like. The method and system can also be implemented by physically incorporating the method for memory change track logging into a software and/or hardware system, such as the hardware and/or software systems of a satellite.
It is, therefore, apparent that there is provided in accordance with the present invention, a method, system, and apparatus for memory change track logging. While this invention has been described in conjunction with a number of embodiments, it is evident that many alternatives, modifications and variations would be or are apparent to those of ordinary skill in the applicable arts. Accordingly, applicants intend to embrace all such alternatives, modifications, equivalents and variations that are within the spirit and scope of this invention.