This invention relates generally to software development systems, and more specifically to reproducing non-regular bugs using CPU lockstep and virtual machine live migration methods.
Software development efforts require effective debugging tools. A key task in fixing bugs is finding and reproducing them so that problematic conditions can be accurately identified and corrected. Some bugs, such as those that occur regularly, can be easily localized and reproduced by re-running the logical scenario that gave rise to the problem. The developer can add logs to the re-execution and use other debugging tools until the problematic scenario is clear. The more problematic bugs are probabilistic bugs, which are those that happen only once in a while or on an irregular basis. The first time that such a bug happens, the developer lacks the evidence to analyze it and must wait until it happens again. This makes the reproduction of such bugs costly and ultimately may impair the quality of the product delivered to customers. The main issue is consistently reproducing a phenomenon that usually happens with low probability. Even after creating a fix, one cannot be absolutely sure that the reproduction is successful or that the bug is fixed.
In a multi-threaded computing environment, probabilistic bugs are usually caused by a special and unexpected case of interactions between threads. These kinds of bugs can be exceedingly difficult to analyze, and as the thread count in each process grows, the analysis becomes even more complicated. Bug reconstruction complexity also increases with the number of processes. Having many processes or services interacting with each other increases the occurrence of such bugs. Even if a bug occurs multiple times, it can manifest slightly differently in each occurrence, which makes the analysis more difficult. Furthermore, the very act of debugging software code may alter the fault condition. For example, in certain race conditions, the debugger can change the system (e.g., by setting CPU registers) in a way that causes the fault condition to change or even disappear. Bug reconstruction complexity also increases in a containerized system with cross-container interaction. The advent of containerization has given rise to an increase in unpredicted interaction between containers, which further complicates the reproduction of bugs in these environments.
What is needed, therefore, is a testing method and system that accurately reconstructs a software bug condition and consistently reproduces the condition one hundred percent of the time.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. VMotion and vLockstep are trademarks of VMware Corporation. RecoverPoint is a trademark of DellEMC Inc.
In the following drawings like reference numerals designate like structural elements. Although the figures depict various examples, the one or more embodiments and implementations described herein are not limited to the examples depicted in the figures.
A detailed description of one or more embodiments is provided below along with accompanying figures that illustrate the principles of the described embodiments. While aspects of the invention are described in conjunction with such embodiment(s), it should be understood that it is not limited to any one embodiment. On the contrary, the scope is limited only by the claims and the invention encompasses numerous alternatives, modifications, and equivalents. For the purpose of example, numerous specific details are set forth in the following description in order to provide a thorough understanding of the described embodiments, which may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the embodiments has not been described in detail so that the described embodiments are not unnecessarily obscured.
It should be appreciated that the described embodiments can be implemented in numerous ways, including as a process, an apparatus, a system, a device, a method, or a computer-readable medium such as a computer-readable storage medium containing computer-readable instructions or computer program code, or as a computer program product, comprising a computer-usable medium having a computer-readable program code embodied therein. In the context of this disclosure, a computer-usable medium or computer-readable medium may be any physical medium that can contain or store the program for use by or in connection with the instruction execution system, apparatus or device. For example, the computer-readable storage medium or computer-usable medium may be, but is not limited to, a random-access memory (RAM), read-only memory (ROM), or a persistent store, such as a mass storage device, hard drives, CDROM, DVDROM, tape, erasable programmable read-only memory (EPROM or flash memory), or any magnetic, electromagnetic, optical, or electrical means or system, apparatus or device for storing information. Alternatively, or additionally, the computer-readable storage medium or computer-usable medium may be any combination of these devices or even paper or another suitable medium upon which the program code is printed, as the program code can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. Applications, software programs or computer-readable instructions may be referred to as components or modules. Applications may be hardwired or hard coded in hardware or take the form of software executing on a general-purpose computer or be hardwired or hard coded in hardware such that when the software is loaded into and/or executed by the computer, the computer becomes an apparatus for practicing the invention. 
In this specification, these implementations, or any other form that the described embodiments may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention.
Some embodiments of the invention involve software development of software products and programs that provide or enable the use of application software in a distributed system, such as a very large-scale wide area network (WAN), metropolitan area network (MAN), or cloud-based network system. However, those skilled in the art will appreciate that embodiments are not limited thereto, and may include smaller-scale networks, such as LANs (local area networks). Thus, aspects of the one or more embodiments described herein may be implemented on one or more computers executing software instructions, and the computers may be networked in a client-server arrangement or similar distributed computer network.
In general, the nature of the bugs or problems discovered by the test regime 22 and debugger 28 depends on a great many factors, such as complexity of the code, deployment environment, production constraints, and so on. Accurately reproducing any detected bugs is a critical process in successfully debugging the program code. As stated above, in modern large-scale networks, bug reproduction, especially in programs that involve multi-threaded race conditions and/or containerized systems, is generally quite difficult, as it is a great challenge to consistently reproduce a program fault that usually happens with low probability. Embodiments of the test regime 22 include a bug reproduction process 26 that consistently reproduces bugs by combining multi-point-in-time replication (like RecoverPoint), CPU lockstep, and certain constructs used in implementing live migration of virtual machines, such as VMware VMotion functionality. Once the bug or fault condition is adequately or consistently reproduced, the results can be sent to the debugger 28 for detailed analysis and correction. Although the test regime 22 is illustrated as a unitary process executed by server 12, embodiments are not so limited. For example, certain tasks of the test regime 22, such as debugger 28, may be offloaded and performed by different servers. In addition, though
In an embodiment, the deployed software comprises application or other software programs executed by one or more servers and/or clients in a large-scale virtual machine system, though embodiments are not so limited.
Virtualization technology has allowed computer resources to be expanded and shared through the deployment of multiple instances of operating systems and applications run as virtual machines (VMs). A virtual machine network is managed by a hypervisor or virtual machine monitor (VMM) program that creates and runs the virtual machines. The server on which a hypervisor runs one or more virtual machines is the host machine, and each virtual machine is a guest machine. The hypervisor presents the guest operating systems with a virtual operating platform and manages the execution of the guest operating systems. Multiple instances of a variety of operating systems may share the virtualized hardware resources. For example, different OS instances (e.g., Linux and Windows) can all run on a single physical computer.
In an embodiment, system 100 illustrates a virtualized network in which a hypervisor program 112 supports a number (n) of VMs 104. A network server supporting the VMs (e.g., network server 102) represents a host machine and target VMs (e.g., 104) represent the guest machines. Target VMs may also be organized into one or more virtual data centers 106 representing a physical or virtual network of many virtual machines (VMs), such as on the order of thousands of VMs each. These data centers may be supported by their own servers and hypervisors 122.
The data sourced in system 100 by or for use by the target VMs may be any appropriate data, such as database data that is part of a database management system. In this case, the data may reside on one or more hard drives (118 and/or 114) and may be stored in the database in a variety of formats (e.g., XML or RDMS). For example, computer 108 may represent a database server that instantiates a program that interacts with the database.
The data may be stored in any number of persistent storage locations and devices, such as local client storage, server storage (e.g., 118), or network storage (e.g., 114), which may at least be partially implemented through storage device arrays, such as RAID components. In an embodiment network 100 may be implemented to provide support for various storage architectures such as storage area network (SAN), Network-attached Storage (NAS), or Direct-attached Storage (DAS) that make use of large-scale network accessible storage devices 114, such as large capacity drive (optical or magnetic) arrays. In an embodiment, the target storage devices, such as disk array 114 may represent any practical storage device or set of devices, such as fiber-channel (FC) storage area network devices, and OST (OpenStorage) devices. In a preferred embodiment, the data source storage is provided through VM or physical storage devices, and the target storage devices represent disk-based targets implemented through virtual machine technology.
An application 117 or any other relevant software program executed in system 100 may be developed and debugged using the test regime 22 of system 10. As such, it is subject to the bug reproduction process 26 as it is executed on a target VM (e.g., VM1) in the system in the event that a problem or bug condition is manifested. In a test scenario, operation of the target VM is monitored by the test regime 22 of the development server and any detected bugs are reproduced by the bug reproduction process 26.
In an embodiment, the bug reproduction process 26 uses three different technologies in combination.
In a VM environment, such as
The second technology 205 of the bug reproduction process 26 is live migration of VMs from one physical server to another physical server, such as provided by the VMware VMotion product and VM State. In VMware VMotion, a VM is moved between two hypervisors with little disruption. This is done by pausing the VM, capturing the full VM state including memory on the source hypervisor, transferring this information to the target hypervisor, and then resuming the VM. The transfer is typically completed on the order of a few milliseconds, though in reality it is often done through multiple cycles to reduce the final stopping time. The VM live migration process 205 captures certain state information just prior to the VM transfer. This includes: (1) all virtual CPU registers, buffers and state; (2) the contents of all the VM memory; (3) BIOS (basic input/output system) values; and (4) all virtual hardware information and states. The set of these captured values at a specific point in time is called the “VM State” of the system at that point in time.
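As an illustration only, the four categories of captured values listed above could be grouped into a single record. The following Python sketch is not VMotion's actual data format; the `VMState` class, the `capture_vm_state` helper, and the toy VM dictionary are all hypothetical names invented for this example. It shows one property that any such capture would need: the captured values are copies, so the VM can keep running without altering them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VMState:
    """Hypothetical record of the four captured categories at one point in time."""
    timestamp: float
    cpu_state: Dict[str, int]         # (1) virtual CPU registers, buffers and state
    memory: bytes                     # (2) contents of all VM memory
    bios_values: Dict[str, str]       # (3) BIOS values
    virtual_hardware: Dict[str, str]  # (4) virtual hardware information and states


def capture_vm_state(vm: dict, timestamp: float) -> VMState:
    """Copy every category so later changes to the live VM cannot alter the capture."""
    return VMState(
        timestamp=timestamp,
        cpu_state=dict(vm["cpu_state"]),
        memory=bytes(vm["memory"]),
        bios_values=dict(vm["bios_values"]),
        virtual_hardware=dict(vm["virtual_hardware"]),
    )


# Illustrative toy VM; all values are invented for the example.
toy_vm = {
    "cpu_state": {"pc": 4096, "sp": 65520},
    "memory": bytearray(b"\x00\x01\x02\x03"),
    "bios_values": {"boot_order": "disk"},
    "virtual_hardware": {"nic0": "connected"},
}
state = capture_vm_state(toy_vm, timestamp=100.0)

# The VM keeps running after the capture; the captured state is unaffected.
toy_vm["cpu_state"]["pc"] = 4100
toy_vm["memory"][0] = 0xFF
```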
Although embodiments of the VM live migration component 205 are described with reference to the VMware VMotion program, embodiments are not so limited and any other similar live migration program may be used. With respect to VMotion, live migration of a virtual machine from one physical server to another is enabled by three underlying technologies. First, the entire state of a virtual machine is encapsulated by a set of files stored on shared storage such as Fibre Channel or iSCSI Storage Area Network (SAN) or Network Attached Storage (NAS). A clustered Virtual Machine File System (VMFS) allows multiple installations of the hypervisor server to access the same virtual machine files concurrently. Second, the active memory and precise execution state of the virtual machine are rapidly transferred over a high-speed network, allowing the virtual machine to switch almost instantaneously from running on the source server to the destination server. VMotion keeps the transfer period imperceptible to users by keeping track of ongoing memory transactions in a bitmap. Once the entire memory and system state has been copied over to the target server, VMotion suspends the source virtual machine, copies the bitmap to the target server, and resumes the virtual machine on the target server. Third, the networks being used by the virtual machine are also virtualized by the underlying hypervisor server to preserve network identity and network connections after migration. VMotion manages the virtual MAC (media access control) address as part of the process.
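The iterative copy with change tracking described above can be illustrated with a small simulation. This is a simplified sketch, not VMotion's implementation: memory is modeled as a list of integer pages, a Python set stands in for the bitmap, and the final step copies the remaining changed pages directly rather than transferring the bitmap itself. The function name `live_migrate` and its parameters are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def live_migrate(
    source: List[int],
    writes_per_round: List[List[Tuple[int, int]]],
) -> Tuple[List[Optional[int]], List[int]]:
    """Simulate pre-copy migration of `source` pages to a new target.

    writes_per_round[i] lists the (page, value) writes the still-running
    guest performs while copy round i is in progress.
    Returns the target pages and the pages copied during the final pause.
    """
    target: List[Optional[int]] = [None] * len(source)
    dirty = set(range(len(source)))  # initially every page must be copied

    for writes in writes_per_round:
        to_copy, dirty = dirty, set()
        for page in to_copy:
            target[page] = source[page]
        # The guest keeps running: its writes are tracked as dirty pages.
        for page, value in writes:
            source[page] = value
            dirty.add(page)

    # Source VM is suspended here, so no further writes can occur.
    stop_and_copy = sorted(dirty)
    for page in stop_and_copy:
        target[page] = source[page]
    return target, stop_and_copy


source_pages = [0, 1, 2, 3]
target_pages, final_pages = live_migrate(source_pages, [[(1, 10)], [(2, 20)]])
```

In this toy run only one page remains to be copied while the VM is paused, which is the reason for using multiple cycles: the pause shrinks to the time needed for the last few changed pages.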
As shown in
In an embodiment, RecoverPoint is used to perform any-point-in-time (PiT) replication, which is used together with the quiescing function to create multiple application-consistent point-in-time snapshots. In general, RecoverPoint for virtual machines uses a journal-based implementation to hold the PiT information of all changes made to the protected data. It provides the shortest recovery time to the latest PiT via journal technology that enables recovery to just seconds or fractions of a second before data corruption occurred. RecoverPoint for VMs is a fully virtualized hypervisor-based replication and automated disaster recovery solution. As shown in
Embodiments include a record-rewind-replay approach at a VM level in the bug reproduction process 26. The process creates static snapshots at various points in time, and also a log describing CPU events, allowing a user or analyst to exactly replay CPU execution scenarios. This allows the use of debuggers or other inspection tools to better analyze bugs or fault conditions in the executing code. The use of CPU lockstep together with the RecoverPoint replication software allows the bug reproduction process to capture I/O interrupts and events and maintain CPU execution cycles that are in sync with the data stored on the disk storage media. The use of RecoverPoint replication ensures that all the data is correct at any point in time, and CPU lockstep synchronizes the CPU cycles and events with the data and disk storage state. This allows the bug reproduction process to capture the system state at any point in time with respect to all pertinent aspects: CPU, I/O, data, storage state, and so on. Any event, fault or bug scenario can be repeated as many times as desired. This allows playback of CPU execution sequences at a granularity of one instruction at a time, and such repeated sequences happen in the exact same sequence as originally executed and in sync with the same clock as the CPU.
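The general idea of a journal that allows storage to be rewound to any point in time can be sketched as follows. This is a generic illustration of journal-based replication, not RecoverPoint's actual design or API; the `WriteJournal` class, its methods, and the block-level model are hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Tuple


class WriteJournal:
    """Toy journal: a base disk image plus a time-ordered list of block writes."""

    def __init__(self, base_image: Dict[int, bytes]) -> None:
        self._base = dict(base_image)
        self._entries: List[Tuple[float, int, bytes]] = []

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        # Assumption: writes are journaled in nondecreasing timestamp order.
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("journal entries must be time-ordered")
        self._entries.append((timestamp, block, data))

    def image_at(self, timestamp: float) -> Dict[int, bytes]:
        """Rebuild the disk image as it was at `timestamp` (inclusive)."""
        image = dict(self._base)
        for entry_time, block, data in self._entries:
            if entry_time > timestamp:
                break
            image[block] = data
        return image


journal = WriteJournal({0: b"A", 1: b"B"})
journal.record_write(1.0, 0, b"C")
journal.record_write(2.0, 1, b"D")
image_before_second_write = journal.image_at(1.5)
```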
Changes in the state are then captured, step 304. State changes are captured so that the process can replay a given scenario (e.g., bug condition, fault, error, etc.), meaning that the replay will go over the exact states that the original scenario went through. The state change capture step comprises certain sub-steps. Once in a given period, the process will capture the VM state by first quiescing the VM. It then writes the state of memory, storage, CPU, and any other relevant component to disk or other storage. Such storage is usually not the VM disk itself, but rather a datastore or other storage device. A RecoverPoint (or similar) snapshot backup of the VM is then taken.
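The ordering of these sub-steps can be sketched in Python. Everything here is a stand-in: `ToyVM`, `capture_cycle`, the dictionary used as a datastore, and the list used as a snapshot store are hypothetical, and resuming the VM after the capture is an assumption of the sketch rather than a step stated above.

```python
import copy
from typing import Dict, List, Tuple


class ToyVM:
    """Minimal stand-in for a VM that can be quiesced and resumed."""

    def __init__(self) -> None:
        self.running = True
        self.state = {"cpu": {"pc": 0}, "memory": [0, 0], "storage": {"blk0": "a"}}

    def quiesce(self) -> None:
        self.running = False

    def resume(self) -> None:
        self.running = True


def capture_cycle(
    vm: ToyVM,
    datastore: Dict[float, dict],
    snapshots: List[Tuple[float, dict]],
    now: float,
) -> None:
    """One periodic capture: quiesce, write full state, take a snapshot."""
    vm.quiesce()
    try:
        # Full machine state goes to a separate datastore, not the VM disk.
        datastore[now] = copy.deepcopy(vm.state)
        # Stand-in for the replication snapshot of the VM's storage.
        snapshots.append((now, copy.deepcopy(vm.state["storage"])))
    finally:
        vm.resume()  # assumption: the VM continues running after capture


vm = ToyVM()
datastore: Dict[float, dict] = {}
snapshots: List[Tuple[float, dict]] = []

vm.state["cpu"]["pc"] = 42
capture_cycle(vm, datastore, snapshots, now=10.0)
vm.state["cpu"]["pc"] = 43  # execution continues after the capture
```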
In between states, the process uses vLockstep (or similar) technology to capture all CPU external events and/or interrupts, step 306. These events are then stored in an event log, step 308. The event log can be embodied as a simple file with time-stamped entries, or it can be a time-ordered data structure, or any other similar data storage construct. As is generally known, an interrupt is a signal to the CPU generated by a hardware component or software process indicating an event that needs immediate attention, and that causes interruption of the code currently being executed by the CPU.
The CPU external events include all network traffic data that is received or processed or otherwise impacts the VM, interrupts (e.g., I/O interrupts, fault conditions, etc.), and any other relevant processing markers, along with the precise timing information of these events. The timing information is precise to the CPU instruction level timing to maintain sync with the CPU clock. Keeping all this data persistently allows the system to accurately reproduce the bug that caused the problem or anomaly. It should be noted that the states captured in steps 302 and 304 of process 300 comprise the full machine state, and not only disk images as in usual replication snapshot backups.
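A time-ordered event log of the kind described above might look like the following sketch. It is not the vLockstep log format, which is not described here. The `ExternalEvent` and `EventLog` names, the use of an instruction count as the timing marker, and the JSON-lines serialization are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class ExternalEvent:
    instruction_count: int  # hypothetical instruction-level timing marker
    kind: str               # e.g. "network" or "io_interrupt"
    payload: str            # event data, kept as text for simplicity


class EventLog:
    """Append-only log that keeps events in execution order."""

    def __init__(self) -> None:
        self._events: List[ExternalEvent] = []

    def append(self, event: ExternalEvent) -> None:
        last = self._events[-1] if self._events else None
        if last is not None and event.instruction_count < last.instruction_count:
            raise ValueError("events must be logged in execution order")
        self._events.append(event)

    def events(self) -> List[ExternalEvent]:
        return list(self._events)

    def dumps(self) -> str:
        """Serialize as one JSON object per line so the log can be persisted."""
        return "\n".join(json.dumps(asdict(e)) for e in self._events)

    @classmethod
    def loads(cls, text: str) -> "EventLog":
        log = cls()
        for line in text.splitlines():
            log.append(ExternalEvent(**json.loads(line)))
        return log


log = EventLog()
log.append(ExternalEvent(1200, "network", "packet-1"))
log.append(ExternalEvent(1875, "io_interrupt", "disk-read-complete"))
restored = EventLog.loads(log.dumps())
```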
During operation of the VM some time along time line 402, it is presumed that a bug condition is detected. In general, the approximate time at which such a problem occurred is known. The bug reproduction process allows the user to replay the sequence surrounding the bug condition.
Since the approximate time at which the problem occurred is known, the process can return the VM to a state before it occurred, such as to time T 504, by using certain RecoverPoint disaster recovery functions and loading the captured state created using the VMotion VM migration techniques.
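Choosing which captured state to return to can be sketched as a search over the capture times. The helper name `latest_state_before` and the `margin` parameter are hypothetical; the margin is one possible way to account for the problem time being known only approximately.

```python
from bisect import bisect_right
from typing import Sequence


def latest_state_before(
    capture_times: Sequence[float],
    problem_time: float,
    margin: float = 0.0,
) -> float:
    """Return the latest capture time at or before (problem_time - margin).

    capture_times must be sorted in ascending order.
    """
    index = bisect_right(capture_times, problem_time - margin) - 1
    if index < 0:
        raise LookupError("no captured state precedes the problem time")
    return capture_times[index]


capture_times = [0.0, 60.0, 120.0, 180.0]
rewind_to = latest_state_before(capture_times, problem_time=130.0)
rewind_with_margin = latest_state_before(capture_times, problem_time=130.0, margin=15.0)
```

A larger margin rewinds further back, which costs more replay time but lowers the risk of restoring a state in which the fault has already begun.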
In step 606, the process feeds the VM with the events from the event log. This is done by having the VM re-execute the events from the event log. This process essentially guarantees the ability to reproduce the same problem 502 that was experienced starting at time T 504. A user can then play the event log through a debugger or other tools that will shed light on the issue, step 608. The problematic issue is guaranteed to be reproduced, as the lockstep event log (data and timing) information is built in a way that ensures exact replay across multiple CPU scenarios. Therefore replaying the log on the same VM will exhibit the same behavior, thus reproducing the bug condition. With reference to
The bug reproduction process provides an effective way of reconstructing bug scenarios in VMs. Embodiments use application-consistent snapshots and the capture of a full VM state to create a completely consistent point-in-time state of the target VM, and then replay captured CPU lockstep events to consistently and repeatedly replay a particular scenario, such as a fault or bug condition. This allows 100% reproduction of bug scenarios, especially those that are extremely difficult to reproduce.
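Why restoring a captured state and re-injecting logged events at their recorded instruction counts yields the same outcome every time can be shown with a toy deterministic machine. This is a conceptual sketch only: the `replay` function, the two-field state, and the arithmetic performed per instruction are invented for the example and do not model a real CPU or the lockstep mechanism.

```python
from typing import Dict, List, Tuple

MODULUS = 1_000_003  # arbitrary bound to keep the toy value small


def replay(
    initial_state: Dict[str, int],
    events: List[Tuple[int, int]],
    until: int,
) -> Dict[str, int]:
    """Run a toy deterministic machine from a restored state.

    initial_state: {"instructions": n, "value": v} as captured earlier.
    events: (instruction_count, data) pairs taken from the event log.
    until: instruction count at which to stop.
    """
    state = dict(initial_state)
    # Events before the restored state are already reflected in it.
    pending = sorted(e for e in events if e[0] >= state["instructions"])
    next_event = 0
    while state["instructions"] < until:
        # Inject external events exactly at their recorded instruction count.
        while next_event < len(pending) and pending[next_event][0] == state["instructions"]:
            state["value"] = (state["value"] * 31 + pending[next_event][1]) % MODULUS
            next_event += 1
        # Execute one deterministic instruction.
        state["value"] = (state["value"] + 1) % MODULUS
        state["instructions"] += 1
    return state


restored = {"instructions": 0, "value": 0}
logged_events = [(2, 5)]
first_run = replay(restored, logged_events, until=4)
second_run = replay(restored, logged_events, until=4)
shifted_run = replay(restored, [(3, 5)], until=4)
```

Here the two runs with the same log end in the same state, while delivering the same event one instruction later ends in a different one, which is why the log records timing at instruction-level precision.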
Embodiments of the processes and techniques described above can be implemented on any appropriate software development system or network server system. Such embodiments may include other or alternative data structures or definitions as needed or appropriate.
The network of
Arrows such as 1045 represent the system bus architecture of computer system 1005. However, these arrows are illustrative of any interconnection scheme serving to link the subsystems. For example, speaker 1040 could be connected to the other subsystems through a port or have an internal direct connection to central processor 1010. The processor may include multiple processors or a multicore processor, which may permit parallel processing of information. Computer system 1005 is intended to illustrate one example of a computer system suitable for use with the present system. Other configurations of subsystems suitable for use with the present invention will be readily apparent to one of ordinary skill in the art.
Computer software products may be written in any of various suitable programming languages. The computer software product may be an independent application with data input and data display modules. Alternatively, the computer software products may be classes that may be instantiated as distributed objects. The computer software products may also be component software.
An operating system for the system 1005 may be one of the Microsoft Windows® family of systems (e.g., Windows Server), Linux, Mac OS X, IRIX32, or IRIX64. Other operating systems may be used. Microsoft Windows is a trademark of Microsoft Corporation.
The computer may be connected to a network and may interface to other computers using this network. The network may be an intranet, internet, or the Internet, among others. The network may be a wired network (e.g., using copper), telephone network, packet network, an optical network (e.g., using optical fiber), or a wireless network, or any combination of these. For example, data and other information may be passed between the computer and components (or steps) of a system of the invention using a wireless network using a protocol such as Wi-Fi (IEEE standards 802.11, 802.11a, 802.11b, 802.11e, 802.11g, 802.11i, 802.11n, 802.11ac, and 802.11ad, among other examples), near field communication (NFC), radio-frequency identification (RFID), mobile or cellular wireless. For example, signals from a computer may be transferred, at least in part, wirelessly to components or other computers.
In an embodiment, with a web browser executing on a computer workstation system, a user accesses a system on the World Wide Web (WWW) through a network such as the Internet. The web browser is used to download web pages or other content in various formats including HTML, XML, text, PDF, and PostScript, and may be used to upload information to other parts of the system. The web browser may use uniform resource locators (URLs) to identify resources on the web and hypertext transfer protocol (HTTP) in transferring files on the web.
For the sake of clarity, the processes and methods herein have been illustrated with a specific flow, but it should be understood that other sequences may be possible and that some may be performed in parallel, without departing from the spirit of the invention. Additionally, steps may be subdivided or combined. As disclosed herein, software written in accordance with the present invention may be stored in some form of computer-readable medium, such as memory or CD-ROM, or transmitted over a network, and executed by a processor. More than one computer may be used, such as by using multiple computers in a parallel or load-sharing arrangement or distributing tasks across multiple computers such that, as a whole, they perform the functions of the components identified herein; i.e., they take the place of a single computer. Various functions described above may be performed by a single process or groups of processes, on a single computer or distributed over several computers. Processes may invoke other processes to handle certain tasks. A single storage device may be used, or several may be used to take the place of a single storage device.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in a sense of “including, but not limited to.” Words using the singular or plural number also include the plural or singular number respectively. Additionally, the words “herein,” “hereunder,” “above,” “below,” and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word “or” is used in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list, all of the items in the list and any combination of the items in the list.
All references cited herein are intended to be incorporated by reference. While one or more implementations have been described by way of example and in terms of the specific embodiments, it is to be understood that one or more implementations are not limited to the disclosed embodiments. To the contrary, it is intended to cover various modifications and similar arrangements as would be apparent to those skilled in the art. Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.