The present application claims the priority benefit of U.S. patent application Ser. No. 17/371,127, filed on Jul. 9, 2021, titled “TEST CYCLE TIME REDUCTION AND OPTIMIZATION,” the disclosure of which is incorporated herein by reference.
Continuous integration of software involves integrating working copies of software into mainline software, in some cases several times a day. Before integrating the working copy of software, the working copy must be tested to ensure it operates as intended. Testing working copies of software can be time consuming, especially when following typical testing protocols which require executing an entire test plan every test cycle. An entire test plan often takes hours to complete, which wastes computing resources and developer time.
The present technology, roughly described, automatically reduces the time to a first failure in a series of tests. Detecting failures in tests, such as unit tests, as early as possible allows engineers to assess and attend to any issues sooner rather than waiting until the unit tests are complete. The present system collects test data as unit tests are executed on code. The historical collection of data, as well as details for the most recent source code under test, is used to train a machine-learned model, for example one that uses gradient boosted decision trees. The trained model predicts a likelihood of failure for each unit test. The likelihood predictions are then used to set the test execution order so that the tests most likely to fail are executed first.
In operation, a test agent operates in a testing environment and communicates with an intelligence server. When a test within the testing environment is about to execute, the test agent communicates with the intelligence server by providing the build number, commit-id, and other information, for example in one or more files sent by the test agent to the intelligence server. The intelligence server receives the information, processes the information using a call graph, and generates a test list. An artificial intelligence model is then trained with historical data and data for each current source code set to be tested. The training data may be modified to make it suitable for ingestion by the model. Once the model is trained, for each set, the model receives current data for each unit test and outputs a prediction of the likelihood that the particular test will fail. The system then orders the tests from most likely to fail to least likely to fail. The ordered tests are then executed in the determined order. When a test fails, an engineer can address the source code that is subject to the test earlier rather than later due to the order of the unit tests, thereby saving engineer time and resources.
In some instances, the present technology provides a method for testing software. The method begins by detecting a test event initiated by a testing program and associated with testing a first software at a testing server, the test event detected by an agent executing within the testing program at the testing server, the test event associated with a plurality of tests for the first software. The method continues by receiving, by the agent on the testing server from a remote server, a list of tests to be performed in response to the test event, the received list of tests being a subset of the plurality of tests. Each test in the list of tests is then ordered according to a likelihood of failure, and the ordered tests are executed by the agent on the testing server.
In some instances, a non-transitory computer readable storage medium has embodied thereon a program, the program being executable by a processor to perform a method for automatically testing software code. The method may begin with detecting a test event initiated by a testing program and associated with testing a first software at a testing server, the test event detected by an agent executing within the testing program at the testing server, the test event associated with a plurality of tests for the first software. The method continues by receiving, by the agent on the testing server from a remote server, a list of tests to be performed in response to the test event, the received list of tests being a subset of the plurality of tests. Each test in the list of tests is then ordered according to a likelihood of failure, and the ordered tests are executed by the agent on the testing server.
In some instances, a system for automatically testing software code includes a server having a memory and a processor. One or more modules can be stored in the memory and executed by the processor to detect a test event initiated by a testing program and associated with testing a first software at a testing server, the test event detected by an agent executing within the testing program at the testing server, the test event associated with a plurality of tests for the first software; receive, by the agent on the testing server from a remote server, a list of tests to be performed in response to the test event, the received list of tests being a subset of the plurality of tests; order each test in the list of tests according to a likelihood of failure; and execute the ordered tests by the agent on the testing server.
The present technology, roughly described, automatically reduces the time to a first failure in a series of tests. Detecting failures in tests, such as unit tests, as early as possible allows engineers to assess and attend to any issues sooner rather than waiting until the unit tests are complete. The present system collects test data as unit tests are executed on code. The historical collection of data, as well as details for the most recent source code under test, is used to train a machine-learned model, for example one that uses gradient boosted decision trees. The trained model predicts a likelihood of failure for each unit test. The likelihood predictions are then used to set the test execution order so that the tests most likely to fail are executed first.
In operation, a test agent operates in a testing environment and communicates with an intelligence server. When a test within the testing environment is about to execute, the test agent communicates with the intelligence server by providing the build number, commit-id, and other information, for example in one or more files sent by the test agent to the intelligence server. The intelligence server receives the information, processes the information using a call graph, and generates a test list. An artificial intelligence model is then trained with historical data and data for each current source code set to be tested. The training data may be modified to make it suitable for ingestion by the model. Once the model is trained, for each set, the model receives current data for each unit test and outputs a prediction of the likelihood that the particular test will fail. The system then orders the tests from most likely to fail to least likely to fail. The ordered tests are then executed in the determined order. When a test fails, an engineer can address the source code that is subject to the test earlier rather than later due to the order of the unit tests, thereby saving engineer time and resources.
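By way of illustration, a minimal sketch of the agent-side flow appears below. The endpoint path, payload fields, and the stand-in test runner are hypothetical conveniences; the actual protocol between the test agent and the intelligence server is not limited to this form.

```python
# A minimal sketch of the agent-side flow, assuming a hypothetical HTTP
# endpoint on the intelligence server and a stand-in test runner.
import requests

INTELLIGENCE_SERVER = "https://intelligence.example.com"  # hypothetical

def run_test(test_name: str) -> bool:
    """Stand-in for invoking an actual unit test; True means the test passed."""
    print(f"running {test_name}")
    return True

def run_ordered_tests(build_number: str, commit_id: str) -> None:
    """Report commit metadata, receive an ordered test list, and execute it."""
    payload = {"build_number": build_number, "commit_id": commit_id}
    resp = requests.post(f"{INTELLIGENCE_SERVER}/test-list", json=payload, timeout=30)
    resp.raise_for_status()
    ordered_tests = resp.json()["tests"]  # ordered most likely to fail first

    for test_name in ordered_tests:
        if not run_test(test_name):
            # The first failure surfaces as early as possible, so an
            # engineer can begin addressing it right away.
            print(f"first failure: {test_name}")
            break
```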
The present system addresses a technical problem of efficiently testing portions of software to be integrated into a main software system used by customers. Currently, when a portion of software is to be integrated into a main software system, a test plan is executed against the entire portion. The entire test plan includes many tests, often takes hours to complete, and consumes large amounts of processing and memory resources as well as developer time.
The present system provides a technical solution to the technical problem of efficiently testing software by intelligently selecting a subset of tests from a test plan and executing only that subset. The present system identifies portions of the system that have changed, or for which a test has been changed or added, and adds the corresponding tests to a test list. An agent within the test environment then executes the identified tests. The portions of the system can be method classes, allowing for a very precise list of tests identified for execution.
Network 140 may be implemented by one or more networks suitable for communication between electronic devices, including but not limited to a local area network, a wide area network, a private network, a public network, a wired network, a wireless network, a Wi-Fi network, an intranet, the Internet, a cellular network, a plain old telephone service network, and any combination of these networks.
Testing server 110 may include testing software 120. Testing software 120 tests software that is under development. The testing software can test the software under development in steps. For example, the testing software may test a first portion of the software using a first step 122, and so on with additional steps through an nth step 126.
A testing agent 124 may execute within or in communication with the testing software 120. The testing agent may control testing for a particular stage or type of testing for the software being developed. In some instances, the testing agent may detect the start of the particular testing, and initiate a process to identify which tests of a test plan to execute in place of every test in the test plan. Testing agent 124 is discussed in more detail below.
Intelligence server 150 may communicate with testing server 110 and data store 160, and may access a call graph stored in data store 160. Intelligence server 150 may identify a subgroup of tests for testing agent 124 to execute, providing for a more efficient testing experience at testing server 110. Intelligence server 150 may, in some instances, generate likelihood of failure scores and order tests in order of likelihood of failure. Intelligence server 150 is discussed in more detail below.
Data store 160 may store a call graph 162 and may process queries for the call graph. The queries may include storing a call graph, retrieving a call graph, updating portions of a call graph, retrieving data within the call graph, and other queries.
AI platform 170 may implement one or more artificial intelligence models that can be trained and applied to test data, current and historical, to predict the likelihood of failure for each test. The platform may implement a machine learning model that utilizes gradient boosted decision trees to predict the likelihood of a unit test failure. In some instances, the primary data are git-commit graphs of historical unit test results.
An intelligence server can also include score generator 350. Score generator 350 can, in some implementations, implement one or more artificial intelligence models that can be trained and applied to test data, current and historical, to predict the likelihood of failure for each test. Hence, the artificial intelligence models of the present system can be implemented on intelligence server 150, AI platform 170, or both. Score generator 350 may implement a machine learning model that utilizes gradient boosted decision trees to predict the likelihood of a unit test failure. In some instances, the primary data are git-commit graphs of historical unit test results.
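As a minimal sketch of how such a score generator might be trained, assuming per-test tabular features derived from commit history and historical results (the specific feature names and the scikit-learn library are illustrative assumptions; the description specifies only a gradient boosted decision tree model):

```python
# A minimal training sketch; feature columns are hypothetical stand-ins
# for data derived from git-commit graphs and historical test results.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Each row describes one (commit, unit test) pair; the label records
# whether that test failed (1) or passed (0) on that commit.
X_train = np.array([
    # [files_changed, lines_changed, recent_failure_count, test_age_days]
    [3, 120, 2, 400],
    [1,  10, 0,  30],
    [7, 560, 5, 900],
    [2,  45, 1, 120],
])
y_train = np.array([1, 0, 1, 0])

model = GradientBoostingClassifier(
    learning_rate=0.1,  # tuning parameters discussed later in the description
    n_estimators=100,   # number of trees
    max_depth=3,        # depth of each decision tree
)
model.fit(X_train, y_train)

# predict_proba returns [P(pass), P(fail)] per row; the failure
# probability serves as the likelihood-of-failure score.
scores = model.predict_proba(X_train)[:, 1]
```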
In some instances, the code to be tested is updated, or some other event that triggers a test is detected. A complete set of tests for the code may be executed at step 415.
A call graph may be generated with relationships between methods and tests, and stored at step 420. Generating a call graph may include detecting properties for the methods in the code. Detecting the properties may include retrieving method class information by an intelligence server based on files associated with the updated code. The call graph may be generated by the intelligence server and stored with the method class information by the intelligence server. The call graph may be stored on the intelligence server, a data store, or both.
In some instances, generating the call graph begins when the code to be tested is accessed by an agent on the testing server. Method class information is retrieved by the agent. The method class information may be retrieved in the form of one or more files associated with changes made to the software under test. The method class information, for example the files for the changes made to the code, is then transmitted by the agent to an intelligence server. The method class information is received by the intelligence server from the testing agent. The method class information is then stored either locally or at a data store by the intelligence server.
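One simple way to represent such a call graph, sketched below, is an in-memory mapping from each method to the tests that exercise it. The class and method names are hypothetical; the actual storage format on the intelligence server or data store is not specified by this description:

```python
# A minimal in-memory call graph sketch; the real system may persist this
# structure on the intelligence server or in a separate data store.
from collections import defaultdict

class CallGraph:
    """Maps each method (by class-qualified name) to the tests that reach it."""

    def __init__(self) -> None:
        self._method_to_tests: dict[str, set[str]] = defaultdict(set)

    def record(self, method: str, test: str) -> None:
        """Record that `test` exercises `method` (observed during instrumented runs)."""
        self._method_to_tests[method].add(test)

    def tests_for(self, changed_methods: list[str]) -> set[str]:
        """Return the subset of tests that cover any changed method."""
        selected: set[str] = set()
        for method in changed_methods:
            selected |= self._method_to_tests.get(method, set())
        return selected
```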
A test server initiates tests at step 425. The agent may detect the start of a particular step in the test at step 430. A subset of tests is then selected for the updated code based on the call graph generated by the intelligence server at step 435. Selecting a subset of tests may include accessing files associated with the changed code, parsing the received files to identify method classes associated with those files, and generating a test list from the identified method classes using a call graph. Selecting a subset of tests for updated code based on the call graph is disclosed in U.S. patent application Ser. No. 17/371,127, filed Jul. 9, 2021, titled “Test Cycle Time Reduction and Optimization,” the disclosure of which is incorporated herein by reference.
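Continuing the sketch above, the selection step might look like the following, with the file-parsing step reduced to a hypothetical stub since the parsing details are covered in the incorporated application:

```python
def parse_method_classes(path: str) -> list[str]:
    # Hypothetical stub; real parsing extracts class/method identifiers
    # from the received change files.
    return []

def select_test_subset(changed_files: list[str], graph: CallGraph) -> set[str]:
    """Parse changed files into method classes, then query the call graph."""
    changed_methods: list[str] = []
    for path in changed_files:
        changed_methods.extend(parse_method_classes(path))
    return graph.tests_for(changed_methods)
```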
The tests in the subset of tests are intelligently ordered by a prediction engine in order of likelihood to fail at step 440. To intelligently order the tests, a likelihood of failure is predicted for each test. The prediction may be made by a prediction engine, implemented in some instances as a machine learning model utilizing gradient boosted decision trees. Intelligently ordering tests is discussed in more detail below.
A test agent receives the ordered test list from the intelligence server at step 445. The test list is generated by the intelligence server and the AI model(s), which use the call graph to select tests from a comprehensive test plan. The test list includes a subset of tests from the test plan that would normally be performed on the software under test, and the tests are ordered based on likelihood of failure. The subset of tests only includes tests for methods that were changed and tests that have been changed or added.
The test agent executes the ordered test list comprising the subset of tests at step 450. In some instances, a test agent executes the test list with instrumentation on. This allows data to be collected during the tests.
At test completion, the testing agent accesses and parses the test results and uploads the results with an automatically generated call graph at step 455. Parsing the test results may include looking for new methods as well as results of previous tests. The results may be uploaded to the intelligence server and may include all or a new portion of a call graph, or new information from which the intelligence server may generate a call graph. The intelligence server may then take the automatically generated call graph portion and place it in the appropriate position within a master call graph. The call graph is then updated, whether it is stored locally at the intelligence server or remotely on the data store.
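The merge step might look like the sketch below, which folds a freshly generated fragment into the master graph using the CallGraph sketch above. The placement logic here is an assumption; the description says only that the new portion is placed in the appropriate position:

```python
def merge_into_master(master: CallGraph, fragment: CallGraph) -> None:
    """Fold a freshly generated call graph fragment into the master graph."""
    # Reaching into the sketch's internal dict for brevity; a real
    # implementation would expose an iteration API on the graph.
    for method, tests in fragment._method_to_tests.items():
        for test in tests:
            master.record(method, test)
```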
Test subsets are accessed at step 515. The subsets are the tests that have been determined to be executed in this test cycle. The score prediction engine may be tuned at step 520. Tuning the score prediction engine may be implemented with additional parameters. Parameters that may tune a score generator engine include the learning rate, the number of trees to use within the machine learning model, and the depth of the decision trees.
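A minimal tuning sketch over exactly those parameters appears below, under the assumption of a scikit-learn style gradient boosted model; the actual tuning procedure is not specified by this description:

```python
# Hedged tuning sketch: grid search over the parameters named above.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "learning_rate": [0.05, 0.1, 0.2],  # learning rate
    "n_estimators": [50, 100, 200],     # number of trees
    "max_depth": [2, 3, 4],             # depth of the decision trees
}

search = GridSearchCV(GradientBoostingClassifier(), param_grid, cv=3)
# Calling search.fit(X_train, y_train) with the historical feature matrix
# from the earlier training sketch would select the best combination.
```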
Code change commit graph data and historical test result data are fed to the prediction engine at step 525. In some instances, the score generator can be implemented as a gradient boosted decision tree. A score can be in the form of a weight from 0 to 1, generated from the data fed to the score generator. In this case, a score of 0.5 or higher indicates a likelihood of failure, and a score below 0.5 suggests a lower or no likelihood of failure.
The output of the prediction engine for each test is received at step 530. The tests are then ordered from the highest predicted likelihood of failure to the lowest.
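Putting the scoring and ordering together, a minimal sketch follows; it assumes the trained model and per-test feature rows from the earlier training sketch:

```python
def order_tests(model, tests_with_features):
    """tests_with_features: list of (test_name, feature_row) pairs."""
    scored = [
        (name, model.predict_proba([features])[0, 1])  # P(fail) in [0, 1]
        for name, features in tests_with_features
    ]
    # Sort descending so the most failure-prone tests run first.
    return [name for name, score in sorted(scored, key=lambda s: s[1], reverse=True)]
```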
The components of computer system 1100, including processor unit 1110 and main memory 1120, are described below.
Mass storage device 1130, which may be implemented with a magnetic disk drive, an optical disk drive, a flash drive, or other device, is a non-volatile storage device for storing data and instructions for use by processor unit 1110. Mass storage device 1130 can store the system software for implementing embodiments of the present invention for purposes of loading that software into main memory 1120.
Portable storage device 1140 operates in conjunction with a portable non-volatile storage medium, such as a floppy disk, compact disk or digital video disc, USB drive, memory card or stick, or other portable or removable memory, to input and output data and code to and from the computer system 1100.
Input devices 1160 provide a portion of a user interface. Input devices 1160 may include an alpha-numeric keypad, such as a keyboard, for inputting alpha-numeric and other information, a pointing device such as a mouse, a trackball, stylus, cursor direction keys, microphone, touch-screen, accelerometer, and other input devices. Additionally, the system 1100 may include output devices, described below.
Display system 1170 may include a liquid crystal display (LCD) or other suitable display device. Display system 1170 receives textual and graphical information and processes the information for output to the display device. Display system 1170 may also receive input as a touch-screen.
Peripherals 1180 may include any type of computer support device to add additional functionality to the computer system. For example, peripheral device(s) 1180 may include a modem or a router, printer, and other device.
The system 1100 may also include, in some implementations, antennas, radio transmitters, and radio receivers 1190. The antennas and radios may be implemented in devices such as smart phones, tablets, and other devices that may communicate wirelessly. The one or more antennas may operate at one or more radio frequencies suitable to send and receive data over cellular networks, Wi-Fi networks, commercial device networks such as Bluetooth networks, and other radio frequency networks. The devices may include one or more radio transmitters and receivers for processing signals sent and received using the antennas.
The components contained in the computer system 1100 are those typically found in computer systems that may be suitable for use with embodiments of the present technology, and are intended to represent a broad category of such computer components that are well known in the art.
The foregoing detailed description of the technology herein has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen to best explain the principles of the technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the technology be defined by the claims appended hereto.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 17/371,127 | Jul. 2021 | US |
| Child | 17/545,577 | | US |