This invention pertains generally to computer software development and, more particularly, to the testing and verification of the operation of application programming interfaces (APIs).
In developing software for particular computer systems, e.g., operating systems, developers use specific facilities to access the functionality of the operating system. Typically these facilities are referred to as APIs (application programming interfaces). In order for the developed software to function correctly on the target operating system, it is important that the APIs function correctly.
Typically, in order to test entities such as APIs, a tester will manually write code to expose and test each API. However, there are a number of drawbacks to this type of testing. First and foremost, it is very labor intensive, and as such creates additional costs and inefficiencies in the testing process. Moreover, since human intervention is required, the possibility of errors increases. Finally, a user may not always think to test certain avenues of use of the API, thus often yielding an incomplete test.
Currently, the testing of APIs is thus a balance between cost and completeness, with an attendant risk of errors or omissions regardless of cost. A new testing regimen is needed whereby human intervention is minimized and testing completeness and accuracy are improved.
In embodiments of the invention, a developer or tester automatically performs invalid parameter tests and stress tests on exported APIs written in native code without requiring specific domain knowledge about the APIs. An example of testable APIs includes those exported in WINDOWS DLLs in the WINDOWS™ brand operating systems from MICROSOFT™ Corporation of Redmond, Wash. In an embodiment of the invention, the developer or tester performs surface-level checks on the APIs which may be stateful or stateless, through the use of random or directed parameters. The APIs in a DLL are determined in an embodiment of the invention by going through the symbol file which is generated at the time of binary compilation.
In an embodiment of the invention, the developer or tester determines vulnerabilities in the APIs to invalid parameter data and exposes potential crashes or memory leaks. This is quite beneficial in that it can be an important way to discover security vulnerabilities in the operating system. Unlike current test tools which are very specific to a set of APIs and directed to functional validation of the APIs, the present invention provides a generic, flexible tool to test a range of APIs without any prior knowledge regarding the APIs.
Additional features and advantages of the invention will be apparent from the following detailed description of illustrative embodiments which proceeds with reference to the accompanying figures.
While the appended claims set forth the features of the present invention with particularity, the invention and its advantages are best understood from the following detailed description taken in conjunction with the accompanying drawings, of which:
Methods and systems for automated testing of APIs will now be described with respect to various embodiments. The skilled artisan will readily appreciate that the methods and systems described herein are merely exemplary and that variations can be made without departing from the spirit and scope of the invention.
In overview, embodiments of the invention provide a mechanism for quickly testing a wide range of APIs with respect to both stress response and valid and invalid parameter response. The present invention will be more completely understood through the following detailed description, which should be read in conjunction with the attached drawings. In this description, like numbers refer to similar elements within various embodiments of the present invention.
The invention is illustrated as being implemented in a suitable computing environment. Although not required, the invention will be described in the general context of computer-executable instructions, such as procedures, being executed by a personal computer. Generally, procedures include program modules, routines, functions, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced in a variety of computer system configurations, including hand-held devices, multi-processor systems, and microprocessor-based or programmable consumer electronics devices. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices. The term computer system may be used to refer to a system of computers such as may be found in a distributed computing environment.
In overview, before proceeding to a discussion of specific methods, the Tester 105 according to embodiments of the invention extracts an identification of exposed API functions from the PDB 107 so that those functions may be tested. In addition to random testing of functions, a user may also specify specific functions to test via user interface 109 linked to Tester 105. APIs typically support one or both of two types of functions, namely internal and exported. Exported functions are those exposed or made available to other entities outside of the operating system 101. It is exported functions that are of greatest concern since these functions can compromise or be used to compromise the security of the operating system.
The Tester application 105 queries a DLL's PDB file using the Debug Interface Access (DIA) SDK or otherwise and makes a list of all functions available in the DLL that would be of interest for API testing. It will be appreciated that there are other means to glean information from the PDB file or from a similar file. Although the PDB format is not used with respect to all operating systems, it will be appreciated that many such operating systems support formats similar to PDB. It will thus be readily appreciated that symbol extraction and other operations described herein with reference to the PDB format may also be executed with respect to any other such formats and the relevant information can be obtained in such operating environments as well.
According to an embodiment of the invention, a function is considered to be of interest if it has the following properties: (1) the function is an exported function, and (2) the function has one or more parameters. For each function of test interest, the Tester application 105 captures parameter information, including the number of parameters and the data type of each parameter. Once the list of functions is available, depending on the setting provided, the application either outputs the list of functions so that another application can prune the list, or begins testing each function in the list, calling it with random or directed sets of parameter data.
Once a test body of API functions has been identified using the PDB 107, and the list has been pruned in an embodiment of the invention, the Tester 105 proceeds to test the APIs using both random testing with random parameters as well as directed testing using purposefully chosen data parameters in an embodiment of the invention. Examples of the types of parameters that are used for testing include Pointers to data buffers or Pointers to basic types, Pointers to structs, Basic integer data types, Character strings, and Handles or other complex data types.
With respect to Pointers to data buffers or Pointers to basic types, one or more of a number of testing techniques are used in an embodiment of the invention. Example techniques include providing a random memory address that has not actually been allocated, allocating a limited amount of memory and providing a pointer to that memory, allocating memory starting at address 0 and providing a pointer to that memory, and/or providing other addresses as defined by one or more other programs.
For Pointers to structs, similar testing techniques are used in an embodiment of the invention. Example techniques include providing a random memory address that has not been allocated, determining the size of the struct and allocating the required amount of memory, filling the memory with garbage non-zero data and providing a pointer to that memory, allocating memory that is less than what would be required given the size of that struct, providing other addresses as defined by another program calling into this program, and/or allocating memory starting at address 0.
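By way of illustration only, the struct pointer techniques listed above may be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of the Tester 105: the struct SampleStruct and the helper struct_pointer_cases are hypothetical names, the standard Python ctypes module stands in for a native allocator, the random address is not checked against the allocator, and the address 0 case is modeled simply as a null pointer. None of the pointers is dereferenced here.

```python
import ctypes
import random


class SampleStruct(ctypes.Structure):
    """Hypothetical struct standing in for an API's struct parameter."""
    _fields_ = [
        ("id", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("value", ctypes.c_double),
    ]


def struct_pointer_cases(struct_type, rng):
    """Build candidate arguments for a pointer-to-struct parameter."""
    size = ctypes.sizeof(struct_type)

    # Correctly sized buffer filled with garbage, non-zero bytes.
    garbage = bytes(rng.randint(1, 255) for _ in range(size))
    full = (ctypes.c_char * size).from_buffer_copy(garbage)

    # Buffer smaller than the struct requires.
    short_size = size // 2
    short = (ctypes.c_char * short_size).from_buffer_copy(garbage[:short_size])

    # Random pointer-sized value; assumed, not verified, to be unallocated.
    pointer_bits = 8 * ctypes.sizeof(ctypes.c_void_p)
    random_address = ctypes.c_void_p(rng.getrandbits(pointer_bits))

    return {
        "random_address": random_address,
        "garbage_full_size": full,
        "undersized": short,
        "address_zero": ctypes.c_void_p(0),
    }


cases = struct_pointer_cases(SampleStruct, random.Random(0))
for label, candidate in cases.items():
    print(label, ctypes.sizeof(candidate))
```

Each entry would be passed in turn as the struct argument of the function under test, with the caller watching for a crash or memory leak.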
With respect to Basic integer data types, one or more of a number of testing techniques are used in an embodiment of the invention. Example techniques include setting the value to 0, setting the value to the maximum value that the type can represent, and/or setting the value to the lowest value that the type can represent, i.e., the negative number of highest magnitude if the value is signed.
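The boundary values described above follow directly from the width and signedness of the integer type. The following is a minimal sketch, assuming two's complement representation; the helper name integer_boundary_values is hypothetical and is used only for illustration.

```python
def integer_boundary_values(bits, signed):
    """Return zero, the maximum, and (for signed types) the minimum value."""
    if signed:
        highest = (1 << (bits - 1)) - 1   # e.g. 2**31 - 1 for a 32-bit int
        lowest = -(1 << (bits - 1))       # negative number of highest magnitude
        return [0, highest, lowest]
    # For an unsigned type the lowest value is 0, which is already listed.
    return [0, (1 << bits) - 1]


print(integer_boundary_values(32, True))   # [0, 2147483647, -2147483648]
print(integer_boundary_values(8, False))   # [0, 255]
```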
With respect to Character strings, one or more of a number of testing techniques are used in an embodiment of the invention. Example techniques include providing NULL pointers, providing empty strings, providing extra-long strings (e.g., approximately 512 characters in length), providing additional variations for UNICODE or ANSI strings, and/or providing valid strings as inputs.
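The string inputs described above may be generated along the following lines. This is a sketch only: the helper names string_test_cases and encode_variants are hypothetical, None stands in for a NULL pointer, ASCII encoding is assumed as a stand-in for an ANSI code page, and UTF-16LE is assumed for the UNICODE variant.

```python
EXTRA_LONG_LENGTH = 512  # approximate length given in the description


def string_test_cases(valid_sample):
    """Return a NULL stand-in, an empty, an extra-long, and a valid string."""
    return [None, "", "A" * EXTRA_LONG_LENGTH, valid_sample]


def encode_variants(value):
    """Return (ANSI-style bytes, UTF-16LE bytes), each null-terminated."""
    if value is None:
        return (None, None)  # NULL stays NULL in both variants
    ansi = value.encode("ascii", errors="replace") + b"\x00"
    wide = value.encode("utf-16-le") + b"\x00\x00"
    return (ansi, wide)


for case in string_test_cases("hello"):
    ansi, wide = encode_variants(case)
    print(None if ansi is None else len(ansi), None if wide is None else len(wide))
```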
Finally, with respect to Handles, one or more of a number of testing techniques are used in an embodiment of the invention. Example techniques include providing random parameters for handles and/or providing valid handles as passed in from another program.
An example algorithm for testing a function is as follows:
This process, or another test routine, is repeated for each function in the function list indefinitely or until a predetermined time, such as may be defined by the user, has elapsed.
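The repetition described above may be sketched as a loop bounded by a user-defined duration. This is a minimal sketch under stated assumptions: run_tests, make_args and fragile are hypothetical names, the functions under test are ordinary Python callables rather than native exports, and a raised exception stands in for the crash that a native API would exhibit.

```python
import time


def run_tests(functions, make_args, duration_seconds, clock=time.monotonic):
    """Call every listed function repeatedly until the duration has elapsed."""
    failures = []
    deadline = clock() + duration_seconds
    while clock() < deadline:
        for name, fn in functions.items():
            args = make_args(name)  # random or directed parameter data
            try:
                fn(*args)
            except Exception as exc:  # stand-in for a crash in native code
                failures.append((name, args, repr(exc)))
            if clock() >= deadline:
                break
    return failures


def fragile(value):
    """Hypothetical function under test that mishandles a null argument."""
    if value is None:
        raise ValueError("null argument")
    return len(value)


report = run_tests({"fragile": fragile}, lambda name: (None,), 0.01)
print(len(report) > 0)
```

The clock parameter is included only so that the loop can be driven by a deterministic counter when the sketch itself is exercised.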
A process for the extraction of functions from the PDB file according to an embodiment of the invention is shown in
In step 205, the extraction process determines whether the symbol just read represents a function. If not, the process returns to step 201. Otherwise, the process flows to step 207, where the extraction application determines whether a user has specified a list of functions to test. If the user has specified such a list, the process flows to step 209, whereat the extraction application determines whether the function corresponding to the symbol just read is on the user-supplied list. If not, the process returns to step 201. Otherwise, the process flows to step 211, where the extraction application determines whether the function is an exported function. If not, the process returns to step 201. Otherwise, the process flows to step 213, whereupon the extraction application determines whether the function accepts any parameters.
If the function does not accept any parameters, the process returns to step 201. Otherwise, the process flows to step 215. At step 215, the extraction application extracts the function information for the function from the PDB file. Finally, at step 217, the extraction application inserts the function into the list of functions to be tested.
In summary then, the extraction application determines which functions are represented in the PDB file. If the user has specified a list of functions to test, then the extraction application will compile a list of function information for the functions in the list to the extent that those functions are represented in the PDB file. Otherwise the extraction application will compile a list of function information for all exported functions represented in the PDB file.
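The decisions made at steps 205 through 217 may be sketched as a simple filter. This is an illustrative sketch only: the Symbol record and the extract_functions helper are hypothetical, and the symbols are supplied as in-memory records rather than being read from a PDB file through the DIA SDK.

```python
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple


class Symbol(NamedTuple):
    """Hypothetical symbol record; a real reader would fill it from the PDB."""
    name: str
    is_function: bool
    is_exported: bool
    param_types: Tuple[str, ...]


def extract_functions(symbols: Iterable[Symbol],
                      user_list: Optional[Set[str]] = None) -> List[Symbol]:
    to_test: List[Symbol] = []
    for sym in symbols:
        if not sym.is_function:                                  # step 205
            continue                                             # read next symbol
        if user_list is not None and sym.name not in user_list:  # steps 207, 209
            continue
        if not sym.is_exported:                                  # step 211
            continue
        if not sym.param_types:                                  # step 213
            continue
        to_test.append(sym)                                      # steps 215, 217
    return to_test


SAMPLE = [
    Symbol("g_counter", False, False, ()),
    Symbol("CreateWidget", True, True, ("char*", "int")),
    Symbol("DestroyWidget", True, True, ("void*",)),
    Symbol("InternalHelper", True, False, ("int",)),
    Symbol("GetVersion", True, True, ()),
]

print([s.name for s in extract_functions(SAMPLE)])
print([s.name for s in extract_functions(SAMPLE, {"DestroyWidget", "InternalHelper"})])
```

With no user-supplied list, only the exported functions that accept parameters are retained; with a list, the result is further narrowed to the named functions that pass the same checks.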
Having discussed the manner in which functions can be identified and extracted, a process of testing the various functions will now be described by reference to
If it is determined at step 305 that the command line arguments are indeed valid, then the process flows to step 309, whereat the test application loads the DLL in question into its process space. Next, at step 311, the test application determines whether the user has specified a list of functions to test. If the user has specified a list of functions to test, then the process flows to step 313, whereat the test application reads the file and creates a list of user-specified functions. Subsequently the process flows to node A of the flow chart of
The flow chart of
If at step 417 the test application determines that the user has requested to receive a listing of the functions extracted from the PDB file, then the process flows to step 421. At step 421 the test application outputs to the user interface a list of functions that can be tested based on the information contained in the PDB file. Finally, at step 423, the test application cleans up and exits and the process ends.
As noted above, the test application and other components of the invention described herein operate on one or more computing devices.
That said, one example system for implementing the invention includes a general purpose computing device in the form of a computer 510. Components of the computer 510 may include, but are not limited to, a processing unit 520, a system memory 530, and a system bus 521 that couples various system components including the system memory to the processing unit 520. The system bus 521 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
The computer 510 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by the computer 510 and include both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 510. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above are included within the scope of computer-readable media.
The system memory 530 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 531 and random access memory (RAM) 532. By way of example, and not limitation,
The computer 510 may also include other removable and non-removable, volatile and nonvolatile computer storage media. By way of example only,
The computer system may include interfaces for additional types of removable non-volatile storage devices. For instance, the computer may have a USB port 553 that can accept a USB flash drive (UFD) 554, or an SD card slot 557 that can accept a Secure Digital (SD) memory card 558. A USB flash drive is a flash memory device that is fitted with a USB connector that can be inserted into a USB port on various computing devices. An SD memory card is a stamp-sized flash memory device. Both the USB flash drive and SD card offer high storage capacity in a small package and high data transfer rates. Other types of removable storage media may also be used for implementing the invention.
The drives and their associated computer storage media, discussed above and illustrated in
The computer 510 preferably operates or is adaptable to operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 580. The remote computer 580 may be a personal computer, a server, a router, a peer device or other network node, and typically includes some or all of the elements described above relative to the computer 510, although only a memory storage device 581 has been illustrated in
When used in a LAN environment, the computer 510 is connectable to the LAN 571 through a network interface or adapter 570. The computer 510 may also include a modem 572 or other means for establishing communications over the WAN 573. The modem 572, which may be internal or external, may be connected to the system bus 521 by way of the user input interface 560 or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 510, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
It will be appreciated that a new and useful system for automated API testing has been described. Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. For example, although the term “list” is used herein, it will be appreciated that a listing is not required to be plural but comprises one or more items. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
All references, including publications, patent applications, patents and appendices, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
The use of the terms “a” and “an” and “the” and similar referents in the context of describing the invention (especially in the context of the following claims) is to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Any recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.