This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2021-038744, filed on Mar. 10, 2021, the entire contents of which are incorporated herein by reference.
The present invention relates to a software performance verification system and a software performance verification method.
Patent Document 1 describes a software performance prediction system that predicts the performance of software. In connection with software development, the software performance prediction system receives input of an initial source code or a changed source code from multiple terminals. The received source code is registered in a database as source code information. The software performance of the changed source code is compared with that of the old source code included in the registered source code information so as to calculate the rate of performance reduction between the new and old source codes. It is then determined whether the rate of performance reduction exceeds a predetermined value; if it does, the result of the determination is reported to the outside.
Patent Document 2 describes a system analysis apparatus that predicts a performance bottleneck in system development. The system analysis apparatus detects past system design information that is similar to development system design information to obtain a result of the similar system detection. A search is made for past system function parts that are similar to development system function parts included in the development system design information so as to obtain a result of the similar function search. Using development system requirements, past system requirements, the development system function parts, the past system function parts, and the result of past system measurement, the system analysis apparatus acquires system part performance information in which the development system function parts and the past system function parts are associated with performance information. From the result of the similar system detection and from the system part performance information, the system analysis apparatus detects a system function part that constitutes a performance bottleneck.
Patent Document 1: JP-2012-234448-A
Patent Document 2: JP-2020-149681-A
In software development, it is necessary to meet two categories of requirements: requirements for the functions to be incorporated in the software (referred to as “function requirements” hereunder) and requirements for the performance to be achieved by the software (referred to as “performance requirements” hereunder). Whether the function requirements are met can be verified by developers on the basis of the source code at each step of software development. On the other hand, it is difficult to determine at the source code level whether the performance requirements are met. Verification of software performance by use of so-called performance analysis tools (profilers) cannot be carried out until a later stage of development, when the source code and the data necessary for executable code generation (compiling, building, etc.) have been prepared. If the performance requirements are then found not to be satisfied, a large number of rework man-hours are needed, which significantly affects production efficiency. The verification of software performance also involves a large number of preparation man-hours, because it is necessary to set test cases that envisage various execution states and to prepare large quantities of test data for each of the test cases.
Patent Document 1 compares the performance of the software based on the changed source code with that of the software based on the old source code included in the registered source code information. Making the comparison, however, requires compiling and building the changed source code. It follows that the performance cannot be verified until the necessary source code and data have been prepared. Further, the technology described in Patent Document 1 does not envision programming languages that need not be compiled, such as interpreter-type languages. Furthermore, because Patent Document 1 obtains the rate of performance reduction between the new and old source codes, test data must be prepared for each of the new and old source codes.
Patent Document 2 compares the design information of the development system with that of the past system, i.e., makes the comparison at the level of design information, in order to improve the efficiency of system design. Because it assumes no verification of performance issues at the level of specific code, the technology described in Patent Document 2 is not intended to reduce the burden of performance verification at the time of software development.
The present invention has been made in view of the above circumstances and aims to provide a software performance verification system and a software performance verification method for efficiently verifying the performance of software.
In achieving the foregoing and other objects of the present invention and according to one aspect thereof, there is provided a software performance verification system configured by use of an information processing apparatus. The software performance verification system includes: a storage section that stores the code of a program configuring software; a partial code extraction section that extracts a partial code as part of the code; a feature vector generation section that generates a feature vector based on the partial code; and a performance verification processing section that inputs the feature vector of a partial code targeted for verification to a performance verification model and generates, as a verification result of the partial code, information based on the output obtained from the model, the performance verification model being a machine learning model trained by use of learning data that includes the feature vector of a partial code for learning and performance information indicative of the performance of software implemented on the basis of that partial code.
The foregoing and other problems and the solutions to these problems will become evident from a reading of the following detailed description of preferred embodiments taken in conjunction with the appended drawings.
The present invention makes it possible to verify the performance of software efficiently.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. The ensuing description and the drawings are examples intended to explain the present invention and thus may be simplified or abbreviated as needed for purposes of clarification. The present invention may be implemented in various other embodiments. Unless specifically noted, each of the constituent elements involved may be singular or multiple. Throughout the ensuing description and the appended drawings, like reference signs designate like or corresponding constituent elements having identical or similar functions, and the explanations of such elements will be omitted where they are redundant. In the ensuing description, the character “S” prefixed to a reference sign denotes a processing step.
The performance verification model 216 is made to learn (trained) using learning data (training data) having an existing code description associated with information indicative of the performance of the code of interest (referred to as “performance information” hereunder) as a label (correct data).
The language for describing the code targeted for verification is not necessarily limited to any specific language. Any language is acceptable as long as it can be interpreted by an average-ability programmer. The method by which to generate an executable code based on the above code is not necessarily limited to any specific method. For example, the method may involve using a compiler language that requires compiling and building (also referred to as linking and making) in generating the executable code, or a sequential execution type interpreter language.
The above code is not necessarily limited to any specific type. For example, the code may be a webpage description language, script language, language for describing applications running on the server side, language for describing systems such as an operating system, language for describing embedded software, or language for describing batch processing.
The verification target code may be a code prepared anew at the time of development or a code updated or added at the time of maintenance. The above code may be one that is described by users such as software development engineers in a stand-alone development environment, or one that is generated in a joint development environment using a repository environment in which multiple information processing apparatuses are interconnected via a communication network.
What follows is a description, for example, of the case where the verification target code is the source code of a compiler language.
The performance verification model 216, not necessarily limited to any specific type, may be assumed to be of a type that performs binary classification or multi-class classification, for example. The machine learning schemes for implementing the performance verification model 216 include, for example, DNN (Deep Neural Network), SVM (Support Vector Machine), decision tree, and k-nearest neighbors (k-NN). The performance verification model 216 is expressed, for example, by a matrix that includes features and weight information (parameters) regarding each feature.
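Purely as an illustrative sketch (the embodiment itself prescribes no concrete code), a binary-classification performance verification model of the kind mentioned above could be realized as follows. The logistic-regression scheme, the two-element feature layout ([loop count, memory-operation count]), and the toy learning data are all assumptions of this sketch.

```python
import math

def train_binary_model(samples, labels, lr=0.5, epochs=200):
    """Train a logistic-regression binary classifier mapping a feature
    vector to a performance label (1 = performance problem likely)."""
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            out = 1.0 / (1.0 + math.exp(-z))
            diff = out - y  # difference between model output and label
            w = [wi - lr * diff * xi for wi, xi in zip(w, x)]
            b -= lr * diff
    return w, b

def predict(model, x):
    """Classify a feature vector with the trained parameters."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

# Toy learning data: feature vectors are [loop count, memory-operation count].
X = [[1, 0], [0, 1], [5, 4], [6, 5]]
y = [0, 0, 1, 1]
model = train_binary_model(X, y)
```

The model here is exactly the "features and weight information (parameters)" representation described above: a weight per feature plus a bias, adjusted from the difference between output and label.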
The code used to generate the learning data (referred to as “existing code” hereunder) is, for example, the code used by an existing information processing system (referred to as the “existing similar system” hereunder) that is similar in function and configuration to the information processing system to be implemented using the verification target code (referred to as the “verification target system” hereunder). The similarities in function and configuration between the verification target system and the existing similar system are determined on the basis of, for example, the application fields of the systems, the methods of implementing the systems, the environments in which the software constituting the systems is executed, the types of users using the systems, and commonalities between the programming languages used for development.
What is used as the label associated with the existing code for generating the learning data is, for example, information obtained from an execution log acquired in the production and test environments regarding the executable code based on the existing code, and information obtained from the results of test runs and simulations carried out on the executable code. Alternatively, the performance information set by those well versed in the existing similar system may be used as the label, for example.
As depicted in
As illustrated in
Of these sections, the partial code extraction section 120 extracts from a source code group (verification target) 111 a description corresponding to each method in a verification target method list 112 (the description is referred to as “partial code (verification target)” hereunder). The partial code extraction section 120 outputs each partial code (verification target) thus extracted as a partial code group (verification target) 113. The verification target method list 112 is a list of the names of the methods targeted for verification. The content of the list is set by the user, for example.
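The extraction performed by the partial code extraction section 120 might be sketched as follows. The brace-matching parser and the C-like source text are simplifying assumptions of this illustration (real source code would require a proper parser), not part of the embodiment.

```python
import re

def extract_partial_codes(source, method_names):
    """Extract the description (partial code) of each listed method.
    A simplified parser: it assumes each listed method name occurs once
    in the source and that braces are balanced."""
    partial = {}
    for name in method_names:
        # Locate "<name>(<params>) {" -- the start of the method description.
        m = re.search(re.escape(name) + r"\s*\([^)]*\)\s*\{", source)
        if not m:
            continue
        depth = 0
        for i in range(m.end() - 1, len(source)):  # start at the opening brace
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
                if depth == 0:
                    break
        partial[name] = source[m.start():i + 1]
    return partial

# Stand-in for the source code group (verification target) 111.
SOURCE_GROUP = """\
int sumLoop(int n) {
    int s = 0;
    for (int i = 0; i < n; i++) { s += i; }
    return s;
}
void helper() { }
"""
# Stand-in for the verification target method list 112.
partial_codes = extract_partial_codes(SOURCE_GROUP, ["sumLoop"])
```

Only the methods named in the list are extracted; `helper` is left out, mirroring how the verification target method list 112 restricts the partial code group (verification target) 113.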
The feature vector generation section 130 converts each partial code (verification target) in the partial code group (verification target) 113 into a feature vector, and outputs the converted feature vectors as a feature vector group (verification target) 114.
The performance verification processing section 140 inputs each feature vector in the feature vector group (verification target) 114 to the performance verification model 216. Given the input, the performance verification model 216 outputs performance information. The performance verification processing section 140 generates as a verification result 115 information based on the performance information output from the performance verification model 216. The performance verification processing section 140 offers the verification result 115 to the user via a user interface, for example.
As depicted in
Of these sections, the partial code extraction section 220 extracts a code (referred to as “partial code (for learning)” hereunder) from the source codes (referred to as “source code group (for learning) 211” hereunder) used to train the performance verification model 216. The partial code extraction section 220 outputs the extracted partial codes (for learning) as a partial code group (for learning) 213.
The feature vector generation section 230 converts the partial codes (for learning) in the partial code group (for learning) 213 into feature vectors. The feature vector generation section 230 outputs the converted feature vectors as a feature vector group (for learning) 214.
The learning data generation section 240 generates at least one learning data item (a set of a feature vector and a label) by associating each feature vector in the feature vector group (for learning) 214 with a corresponding label from among performance labels 212. The learning data generation section 240 outputs the learning data items thus generated as a learning data group 215.
The learning processing section 250 inputs the feature vector of each learning data item in the learning data group 215 to the performance verification model 216. The learning processing section 250 uses a difference between the output of the performance verification model 216 given the input feature vector on one hand and the label associated with the input feature vector on the other hand in order to adjust the parameters of the performance verification model 216. By so doing, the learning processing section 250 trains the performance verification model 216.
Below is a detailed description of each of the functions indicated in
From the source code group (verification target) 111, the partial code extraction section 120 extracts as a partial code (verification target) the description of the method corresponding to each method in the verification target method list 112, for example. The partial code extraction section 220 extracts as a partial code (for learning) the description of each method included in the source code group (for learning) 211, for example.
As indicated in
From the source code group (verification target) 111, the partial code extraction section 120 extracts as the partial code (verification target) the code that includes the description of a given method in the verification target method list 112 and the description of the related method group of that method. The partial code extraction section 220 extracts as the partial code (for learning) the code that includes the description of a given method in the source code group (for learning) 211 and the description of the related method group of that method, for example.
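One possible sketch of extracting a partial code together with its related method group follows. The pre-extracted method bodies and the call-detection heuristic (a known method name followed by an opening parenthesis) are assumptions of this illustration.

```python
import re

# Hypothetical pre-extracted method bodies (name -> description), standing in
# for the per-method output of the partial code extraction section.
METHOD_BODIES = {
    "readData":  "void readData() { parseLine(); }",
    "parseLine": "void parseLine() { for(;;){ } }",
    "writeLog":  "void writeLog() { }",
}

def related_partial_code(name, bodies):
    """Collect the given method and, transitively, the known methods it
    calls (its related method group), joined into one partial code."""
    seen, stack = [], [name]
    while stack:
        cur = stack.pop()
        if cur in seen or cur not in bodies:
            continue
        seen.append(cur)
        # A known method name followed by '(' inside the body counts as a call.
        for callee in bodies:
            if callee != cur and re.search(r"\b" + callee + r"\s*\(", bodies[cur]):
                stack.append(callee)
    return "\n".join(bodies[n] for n in seen)

code = related_partial_code("readData", METHOD_BODIES)
```

Here the partial code for `readData` carries along the loop-heavy `parseLine` it calls, so performance characteristics of related methods are reflected in the feature vector, while the unrelated `writeLog` is excluded.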
When the metrics values in the partial code are used for feature vector generation as described above, it is possible to generate a feature vector having the metrics values of the partial code as the features, and to execute performance verification based on the differences between the partial code metrics values.
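A metrics-based feature vector of the kind just described might be generated as follows. The choice of metrics (line count, loop keywords, memory-operation keywords, maximum brace-nesting depth) is an illustrative assumption of this sketch.

```python
def metrics_feature_vector(partial_code):
    """Feature vector whose elements are simple code metrics:
    [line count, loop keyword count, memory-operation count, max nesting depth]."""
    lines = partial_code.count("\n") + 1
    loops = sum(partial_code.count(k) for k in ("for", "while"))
    mem_ops = sum(partial_code.count(k) for k in ("malloc", "new ", "memcpy"))
    depth = max_depth = 0
    for ch in partial_code:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return [lines, loops, mem_ops, max_depth]

vec = metrics_feature_vector(
    "void f() {\n  for (int i=0;i<n;i++) {\n    p = malloc(8);\n  }\n}"
)
```

Two partial codes then differ in their vectors exactly where their metrics differ, which is what allows performance verification based on differences between partial code metrics values.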
Specifically, the feature vector generation section converts into common form expressions the words (method name (function name), variable type, variable name, data type, storage class specifier, etc.) described in the partial code group extracted from the source code group (e.g., source code group configuring one information processing system such as an application system). The expressions are each associated with a different sign to generate a word dictionary. In the example in
The feature vector generation section then converts into common form expressions the words included in the partial code targeted for feature vector conversion, acquires from the word dictionary the signs corresponding to the converted expressions, and generates as a feature vector the vector having the acquired signs as its elements. In
When the feature vector is generated on the basis of the combination of the words described in the partial code as explained above, it is possible to generate, for example, a feature vector that captures the structure of the partial code as the feature, and thereby to carry out performance verification based on structural differences in the partial code.
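A word-dictionary-based feature vector of the kind described above might be sketched as follows. The common-form rule used here (language keywords map to themselves, all other identifiers collapse to a single form "VAR") and the sign assignment are assumptions of this illustration.

```python
import re

KEYWORDS = {"int", "void", "for", "while", "return", "if"}

def common_form(word):
    """Convert one word of the partial code into its common form expression."""
    return word if word in KEYWORDS else "VAR"

def build_word_dictionary(partial_codes):
    """Associate each distinct common form expression with a different sign."""
    dictionary = {}
    for code in partial_codes:
        for word in re.findall(r"[A-Za-z_]\w*", code):
            form = common_form(word)
            if form not in dictionary:
                dictionary[form] = len(dictionary) + 1  # next unused sign
    return dictionary

def to_feature_vector(code, dictionary):
    """Vector whose elements are the signs of the code's common-form words."""
    return [dictionary[common_form(w)] for w in re.findall(r"[A-Za-z_]\w*", code)]

codes = ["int add(int a, int b) { return a + b; }"]
dictionary = build_word_dictionary(codes)
vec = to_feature_vector(codes[0], dictionary)
```

Because the signs preserve the order of the words, the resulting vector reflects the structure of the partial code, as the preceding paragraph notes.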
In this example, the learning processing section 250 inputs the feature vector from the learning data (combination of the feature vector and label) into the performance verification model 216. Given the input, the performance verification model 216 outputs a value (“0.3” in this example). On the basis of the difference between this output value and the label of the learning data (“0.0” in this example), the learning processing section 250 adjusts the parameters of the performance verification model 216.
In this example, the learning processing section 250 inputs the feature vector from the learning data (combination of the feature vector and label) into the performance verification model 216. Given the input, the performance verification model 216 outputs values from different viewpoints (“numerous loops: 0.3,” “numerous memory operations: 0.3” in this example). On the basis of the difference between each of the output values and the label of the learning data (“numerous loop occurrences: 1,” “numerous memory operation occurrences: 0” in this example), the learning processing section 250 adjusts the parameters of the performance verification model 216.
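The multi-viewpoint case might be sketched as one logistic output per viewpoint, each adjusted by the difference between its output value and its own label. The viewpoint names, feature layout, and toy data below are assumptions of this sketch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_multi_viewpoint(samples, labels, viewpoints, lr=0.5, epochs=300):
    """Train one logistic output per viewpoint; each output's parameters are
    adjusted by the difference between that output and its own label."""
    n = len(samples[0])
    model = {v: ([0.0] * n, 0.0) for v in viewpoints}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            for k, v in enumerate(viewpoints):
                w, b = model[v]
                out = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
                diff = out - y[k]  # output vs. label, per viewpoint
                w = [wi - lr * diff * xi for wi, xi in zip(w, x)]
                model[v] = (w, b - lr * diff)
    return model

def verify(model, x):
    """Output one value per viewpoint for a feature vector."""
    return {v: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for v, (w, b) in model.items()}

# Features: [loop count, memory-operation count]; one label per viewpoint.
X = [[5, 0], [0, 4], [6, 1], [1, 5], [0, 0]]
Y = [[1, 0], [0, 1], [1, 0], [0, 1], [0, 0]]
model = train_multi_viewpoint(X, Y, ["numerous loops", "numerous memory operations"])
result = verify(model, [7, 0])
```

A loop-heavy input then scores high on the "numerous loops" viewpoint and low on "numerous memory operations", matching the per-viewpoint outputs described above.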
As explained above, the performance verification system 1 of this embodiment extracts a partial code from the source code, converts the extracted partial code into a feature vector, and inputs the converted feature vector to the performance verification model 216, thereby verifying the performance of the source code (i.e., the performance of the executable code generated from the source code including the partial code). Because the performance verification system 1 verifies the performance on the basis of the source code in this manner, it is possible to verify the performance of the source code even before its executable code is generated. This allows users such as developers to verify, as needed while describing the source code, what the performance will be when its executable code is executed. Fewer rework man-hours are therefore needed than when the verification is made after generation of the executable code. It is also possible to detect, at an early stage, code in the source code that can become a performance bottleneck. Since the performance verification system 1 verifies the performance on the basis of the source code, there is no need to set test cases or to prepare test data. The performance verification system 1 can also be applied to programming languages that need not be compiled, such as interpreter-type languages. In this manner, the performance verification system 1 of this embodiment significantly reduces the burden of verifying the performance of software at the time of development or maintenance and thereby permits efficient system development and maintenance.
Explained below are usage examples (application examples) of the performance verification system 1.
Utilizing the performance verification system 1 in this manner allows a user to easily and quickly verify the performance of a source code being developed or maintained, for example, by use of an integrated development environment (IDE) running on the user apparatus 2, while the user is describing that source code. This enables the user to proceed with development or maintenance work while verifying the performance of the source code as needed. It is thus possible to efficiently prepare high-quality software that meets the performance requirements.
The joint development environment 60 is communicably connected with the performance verification section 100. Upon receipt of a source code and a verification request regarding the source code from the user apparatus 2, the joint development environment 60 transmits the verification request together with the received source code to the performance verification section 100. On receiving the verification request, the performance verification section 100 generates a verification result 115 regarding the source code and transmits the generated verification result 115 to the joint development environment 60.
If the performance indicated by the received verification result 115 meets preset performance requirements, the joint development environment 60 registers the updated source code with the repository 61. On the other hand, if the performance indicated by the received verification result 115 fails to meet the preset performance requirements, the joint development environment 60 does not register the updated source code with the repository 61. The joint development environment 60 proceeds to transmit to the user apparatuses 2 a notification indicating that the source code has failed to meet the performance requirements.
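The registration gate of the joint development environment 60 might be sketched as follows. The function and data structure names are hypothetical, and the numeric requirement thresholds (model output scores compared against preset limits, lower being better) are illustrative assumptions.

```python
def gate_registration(source, verification_result, requirements, repository, notifications):
    """Register the updated source code only if its verification result 115
    meets every preset performance requirement; otherwise queue a failure
    notification for the user apparatuses."""
    meets = all(verification_result.get(name, 1.0) <= limit
                for name, limit in requirements.items())
    if meets:
        repository.append(source)  # register with the repository 61
    else:
        notifications.append(
            f"source failed performance requirements: {verification_result}")
    return meets

repo, notes = [], []
REQUIREMENTS = {"numerous loops": 0.5, "numerous memory operations": 0.5}

ok = gate_registration("int f() { return 0; }",
                       {"numerous loops": 0.2, "numerous memory operations": 0.1},
                       REQUIREMENTS, repo, notes)
bad = gate_registration("int g() { /* hot loop */ return 1; }",
                        {"numerous loops": 0.9, "numerous memory operations": 0.1},
                        REQUIREMENTS, repo, notes)
```

Only the passing source code reaches the repository; the failing one produces a notification instead, mirroring the behavior described above.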
In the case where the performance verification system 1 is used in this manner, only the source code group 62 meeting the performance requirements is managed with the repository 61 of the joint development environment 60. This makes it possible to consistently ensure the quality of the software.
The information processing apparatus 10 of this example may be implemented partially or in total by use of virtual information processing resources provided by virtualization and process space separation technologies, such as virtual servers offered by cloud systems, for example. Also, the functions provided by the information processing apparatus 10 may be implemented partially or in total by services offered by cloud systems through an API (Application Program Interface), for example. Further, one information processing apparatus 10 may be shared by at least two of the performance verification system 1, user apparatus 2, and joint development environment 60.
In
The main storage device 102 stores programs and data. For example, the main storage device 102 is a ROM (Read Only Memory), RAM (Random Access Memory), or nonvolatile memory (NVRAM (Non Volatile RAM)).
The auxiliary storage device 103 is, for example, an SSD (Solid State Drive), a hard disk drive, an optical storage device (CD (Compact Disc), DVD (Digital Versatile Disc), etc.), a storage system, a reader/writer for recording media such as IC cards, SD cards, and optical recording media, or a storage region of a cloud server. Programs and data may be read into the auxiliary storage device 103 by way of a reader for recording media or via the communication device 106. The programs and data held (stored) in the auxiliary storage device 103 are read into the main storage device 102 as needed.
The input device 104 is an interface that receives input from the outside. For example, the input device 104 is a keyboard, a mouse, a touch panel, a card reader, a pen-input tablet, or a voice input device.
The output device 105 is an interface that outputs diverse information such as processing progress and processing results. For example, the output device 105 is a display device (liquid crystal monitor, LCD (Liquid Crystal Display), graphic card, etc.) for visualizing the diverse information, a device for vocalizing the diverse information (audio output device (speakers, etc.)), or a device for transcribing the diverse information (printer, etc.). For example, the information processing apparatus 10 may be configured to output and input information to and from other devices via the communication device 106.
The input device 104 and the output device 105 constitute a user interface that presents and receives information to and from the user.
The communication device 106 (communication section) is a device that conducts communication with other devices. The communication device 106 is a wired or wireless communication interface that implements communication with other devices via a communication network (control system network 50, information/control system network 51, information system network 52). For example, the communication device 106 is an NIC (Network Interface Card), a wireless communication module, or a USB module.
An operating system, a file system, a DBMS (DataBase Management System) (relational database, NoSQL, etc.), a KVS (Key-Value Store), and other diverse software (middleware and various applications for implementing the user interface, such as a GUI (Graphical User Interface) using the input device 104 and output device 105) may be installed in the information processing apparatus 10, for example.
The functions provided by the performance verification system 1, user apparatus 2, and joint development environment 60 are implemented by the processor 101 reading and executing relevant programs held in the main storage device 102 or by use of the hardware of the information processing apparatus 10 (FPGA, ASIC, AI chip, etc.), for example. The diverse data held by the performance verification system 1, user apparatus 2, and joint development environment 60 are stored into the main storage device 102 and auxiliary storage device 103 (storage section).
The diverse functions of the performance verification system 1 may be implemented partially or wholly by use of various known data mining methods such as text data mining, various known processing methods (morphological analysis, parsing, semantic analysis, context analysis, feature extraction, word embeddings, named entity extraction, text classification, series labeling), or various known machine learning methods (deep learning such as DNN (Deep Neural Network) and RNN (Recurrent Neural Network)), for example.
It is to be understood that while the invention has been described in conjunction with a specific embodiment, it is evident that many alternatives, modifications and variations are possible within the scope of this invention. For example, whereas the above-described embodiment gives detailed and comprehensive explanations of this invention, the invention is not necessarily limited to any embodiment having all the configurations and components discussed above. Also, the above-described configurations or components may be partially deleted, changed, or supplemented as needed with suitable configurations or components to constitute another valid embodiment of the present invention.
For example, variations of the above-described partial code extraction method (
Whereas the embodiment above is described using examples in which the partial codes are extracted in units of “methods” from the code, the partial codes may be extracted from the code in units of other processing blocks such as “function” or “class.” Alternatively, the partial codes may be extracted from the code in accordance with an extraction method set by the user.
The above-described configurations, functional sections, processing sections, and processing means may be implemented partially or in total by hardware such as suitably designed integrated circuits. The above-described configurations and functions may be implemented by software run by a processor interpreting and executing the programs for implementing the respective functions. The information such as programs, tables, and files for implementing the functions may be placed in a recording device such as a memory, hard disk, or SSD (Solid State Drive); or on recording media including IC cards, SD cards, and DVDs.
The functional sections, processing sections, and databases of each information processing apparatus discussed above are shown to be arranged only in an illustrative manner. These functional sections, processing sections, and databases may be optimally arranged from the viewpoint of the performance, processing efficiency, and communication efficiency of the hardware and software included in the relevant devices.
The configuration (schema, etc.) of the database for holding the above-mentioned diverse data may be changed flexibly from the viewpoint of efficient resource usage, improvement in processing efficiency, improvement in access efficiency, or improvement in search efficiency.
Number | Date | Country | Kind
---|---|---|---
2021-038744 | Mar 2021 | JP | national
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/JP2022/005632 | 2/14/2022 | WO |