Embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
The subject program 21 is a set of programs that are analysis subjects of the system shown in
The graphical structure of the business model (shown on the left side of
By the way, in the physical model 22 and business model 24 shown in
In the present embodiment, association of the physical model 22 with the business model 24 is managed using the association model 23. The association model 23 is represented by dotted lines 58 and 59 in
First, the controller 40 reads an analysis order given by the user and input from the keyboard 33 or pointing device 34, starts the program analyzer 41 to analyze the subject program 21, and generates the physical model 22 (Step 101). Subsequently, the controller 40 reads a model registration order input by the user, and starts the model register/modifier 45. The model register/modifier 45 reads the business model and association model input by the user, and registers the business model 24 and association model 23 (Step 102). Subsequently, the controller 40 makes a decision whether the order given by the user is data driven analyzing, function driven analyzing or termination (Step 103).
If the order given by the user is “data driven analyzing” as a result of the decision made at Step 103, then the controller 40 starts the data driven analyzer 42 for a business function specified by the user, and displays a result of processing on a screen (Step 104). Here, the data driven analyzing is processing of extracting a subgraph associated with the specified business function from the physical model, with data of the business model/association model taken as the starting point. Details of Step 104 will be described later with reference to
If the order given by the user is “function driven analyzing” as a result of the decision made at Step 103, then the controller 40 starts the function driven analyzer 43 for a business function specified by the user, and displays a result of processing on the screen (Step 105). Here, the function driven analyzing is processing of extracting a subgraph associated with the specified business function from the physical model, with a function portion of the business model/association model taken as the starting point. Details of Step 105 will be described later with reference to
On the view displayed on the screen at Step 104 or 105, an input by the user as to whether modification is necessary, and as to the modification method, is accepted. Upon receiving this input, the controller 40 updates the associated business model or association model (Step 106), and returns to the state in which an order is accepted (Step 103). By thus repeating the process of Steps 103 to 106, the user ascertains the difference between the business model and the physical model, and gives a modification order. As a result, the precision of the business model and association model can be raised gradually.
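By way of illustration only, the order-accepting loop of Steps 103 to 106 may be sketched as follows. The function name `run_controller`, the order strings used as dictionary keys, and the callback `apply_modification` are hypothetical names introduced for this sketch; the analyzers are passed in as plain callables and are not the analyzers 42 and 43 themselves.

```python
def run_controller(orders, analyzers, apply_modification):
    """Dispatch user orders until "termination" is received (Steps 103 to 106).

    orders: iterable of (order, business_function, modification) tuples,
            where modification is None when the user requests no change.
    analyzers: mapping from an order name to a callable taking a business function.
    apply_modification: callable that updates the business/association model.
    """
    results = []
    for order, business_function, modification in orders:   # Step 103
        if order == "termination":
            break
        # Step 104 (data driven) or Step 105 (function driven)
        results.append((order, analyzers[order](business_function)))
        if modification is not None:
            apply_modification(modification)                 # Step 106
    return results


# Stub analyzers standing in for the real subgraph extraction.
applied = []
demo_results = run_controller(
    orders=[
        ("data driven analyzing", "order receiving registration", "associate PGM-y"),
        ("function driven analyzing", "order receiving registration", None),
        ("termination", None, None),
    ],
    analyzers={
        "data driven analyzing": lambda f: "data view of " + f,
        "function driven analyzing": lambda f: "function view of " + f,
    },
    apply_modification=applied.append,
)
```

Passing the analyzers as a mapping keeps the decision of Step 103 to a single lookup, which is one possible design and not necessarily that of the controller 40.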
First, the data driven analyzer 42 conducts retrieval in the business function column 87 in the business I/O relation table 82 included in the business model 24, and thereby obtains a set S of related logical data (Step 111). For example, if the business function specified by the user is “order receiving registration”, the data driven analyzer 42 conducts retrieval in the business function column 87 in the business I/O relation table 82 by using “order receiving registration” as a key, obtains a set S containing three pieces of logical data “order receiving”, “person in charge”, and “order receiving slip”, and stores the set S in a storage area in the memory 10.
Subsequently, the data driven analyzer 42 conducts retrieval in the logical data column 93 in the data association model 91 (
Subsequently, the data driven analyzer 42 selects one piece of data from the set s of physical data obtained at Step 112, and stores the data in a variable v contained in a storage area in the memory 10 (Step 113). Subsequently, the data driven analyzer 42 conducts retrieval in a direction of arrows along edges of a graph on the physical model by taking physical data specified by the variable v as the starting point, obtains a set of paths starting from the physical data v and leading to arbitrary data on the graph, and stores the set of the paths in a variable P contained in a storage area in the memory 10 (Step 114). For example, supposing FILE-a in the graph shown in
Subsequently, the data driven analyzer 42 stores one path selected from the variable P obtained at Step 114 in a variable p contained in a storage area in the memory 10 (Step 115). If the last vertex in the variable p is contained in the data set s obtained at Step 112, all vertexes on the path p selected at Step 115 are provided with ◯ (Step 116). Here, vertexes mean physical data and programs included in a certain path. For example, as for “FILE-a→PGM-x→FILE-n→PGM-y→FILE-o→PGM-w→FILE-c”, the last vertex “FILE-c” is contained in the set s. With respect to “FILE-a, PGM-x, FILE-n, PGM-y, FILE-o, PGM-w and FILE-c” which are vertexes on this path, therefore, “◯” is stored in the mark columns 64 and 66 of associated records in the program table 60 and the physical data table 61 (
Processing at Steps 115 and 116 is conducted on all paths contained in the variable P obtained at Step 114 (Step 117). In addition, processing at Steps 113 to 117 is conducted on all physical data contained in the set s obtained at Step 112 (Step 118). For example, if processing is executed on the graph shown in
Subsequently, the data driven analyzer 42 provides physical data that is included in physical data input or output by programs provided with “◯” at Step 116 and that is not provided with the mark “◯”, with a mark “Δ” (Step 119). In the example shown in
Subsequently, the data driven analyzer 42 provides physical data that is included in physical data provided with the mark “◯” at Step 116 and that is input to or output from a program not provided with the mark “◯”, with a mark “Δ” (Step 120). In the example shown in
Finally, the display unit 44 transmits a subgraph of a result of processing conducted up to Step 120 to the display apparatus. The display apparatus diagrammatically displays the subgraph of the result of processing (Step 121).
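The marking of Steps 113 to 120 may be sketched as follows, as a minimal illustration under stated assumptions and not as the definitive implementation. The physical model is assumed to be a directed graph whose edges always join one program and one physical data; paths are assumed to be loop-free and to contain at least one edge; the edge list, the program PGM-v, and the function names are hypothetical. How a vertex qualifying for both marks is displayed is not specified here, so the sketch simply returns two sets.

```python
from collections import defaultdict


def simple_paths(graph, start):
    """Return every loop-free path with at least one edge that starts at `start`."""
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        for nxt in graph.get(path[-1], ()):
            if nxt not in path:            # skip vertexes already on the path
                extended = path + [nxt]
                paths.append(extended)
                stack.append(extended)
    return paths


def data_driven_marks(edges, programs, s):
    """Return (circle, triangle): vertexes marked with a circle and data marked with a triangle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    circle = set()
    for v in s:                                   # Steps 113 and 118
        for p in simple_paths(graph, v):          # Steps 114, 115 and 117
            if p[-1] in s:                        # Step 116
                circle.update(p)

    triangle = set()
    for src, dst in edges:
        prog, data = (src, dst) if src in programs else (dst, src)
        if prog in circle and data not in circle:   # Step 119
            triangle.add(data)
        if data in circle and prog not in circle:   # Step 120
            triangle.add(data)
    return circle, triangle


# Hypothetical physical model built around the path quoted in the text.
EDGES = [
    ("FILE-a", "PGM-x"), ("PGM-x", "FILE-n"), ("FILE-n", "PGM-y"),
    ("PGM-y", "FILE-o"), ("FILE-o", "PGM-w"), ("PGM-w", "FILE-c"),
    ("FILE-b", "PGM-w"),   # side input of a marked program
    ("PGM-y", "FILE-d"),   # side output of a marked program
    ("FILE-n", "PGM-v"),   # marked data read by an unmarked program
]
PROGRAMS = {"PGM-x", "PGM-y", "PGM-w", "PGM-v"}
circle, triangle = data_driven_marks(EDGES, PROGRAMS, s={"FILE-a", "FILE-c"})
```

In this hypothetical model the only path that ends in the set s is the one from FILE-a to FILE-c, so its seven vertexes receive the circle mark, while FILE-b, FILE-d and FILE-n fall on the boundary and receive the triangle mark.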
A frame line 132 represents a business function “order receiving registration”. Figures indicating physical data and programs surrounded by the frame line 132 represent physical data and programs processed by the data driven analyzing (
The user ascertains such a screen, and makes a decision as to whether modification is necessary and as to the modification method. For example, the user's modification order supposed in the example shown in
(1) Associate the program PGM-y with the business function “order receiving registration”.
(2) Associate the physical data FILE-b with logical data “person in charge”.
(3) Register logical data “inquiry about appointed date of delivery” in business model as new output data, and associate the physical data FILE-d with the logical data “inquiry about appointed date of delivery”.
The controller 40 reads such an order given by the user, from the pointing device 34 such as a mouse. The data driven analyzer 42 conducts update processing of the business model and association model at Step 116 in
Subgraphs as shown in
First, the data driven analyzer 42 makes a decision whether the last vertex in the path p is included in the specified set s of physical data (Step 181). If the last vertex in the path p is not included, the processing is finished. If the last vertex in the path p is included, the data driven analyzer 42 stores ◯ in the mark column 66 in the physical data table 61 associated with vertexes that are elements of the set s of physical data contained on the path p (Step 182), and selects one of sections obtained by dividing the path p with elements of the set s (Step 183).
The data driven analyzer 42 examines vertexes in the section selected at Step 183, and determines whether a vertex having an identifier of a connected component added thereto is included in the vertexes (Step 184). If a vertex having an identifier added thereto is not present, the data driven analyzer 42 issues a new identifier, and adds the new identifier to all vertexes in that section (Step 185). If a vertex having an identifier added thereto is present and only one identifier is used in the whole section, the data driven analyzer 42 adds this identifier to all vertexes in the section (Step 186). If there are a plurality of identifiers in this section, the data driven analyzer 42 selects one of the identifiers and replaces other identifiers with the selected identifier (Step 187). By the way, the identifier replacing processing is conducted on the whole physical model. Thereafter, the data driven analyzer 42 adds the selected identifier to all vertexes in the subject section (Step 186).
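The identifier handling of Steps 181 to 187, repeated over every section of a path, may be sketched as follows. This is an illustrative sketch only: the class name `ComponentLabeler`, the use of integers as identifiers, the choice of the smallest identifier when several are found, and the reading that a section consists of the vertexes lying strictly between elements of the set s are all assumptions made for the example.

```python
import itertools


def split_sections(path, s):
    """Split a path into runs of vertexes separated by elements of s (Step 183)."""
    sections, current = [], []
    for v in path:
        if v in s:
            if current:
                sections.append(current)
            current = []
        else:
            current.append(v)
    if current:
        sections.append(current)
    return sections


class ComponentLabeler:
    def __init__(self, s):
        self.s = set(s)
        self.circle = set()       # elements of s marked with a circle
        self.component = {}       # vertex -> connected component identifier
        self._ids = itertools.count(1)

    def process_path(self, path):
        if path[-1] not in self.s:                              # Step 181
            return
        self.circle.update(v for v in path if v in self.s)      # Step 182
        for section in split_sections(path, self.s):            # Steps 183 and 188
            found = {self.component[v] for v in section if v in self.component}  # Step 184
            if not found:
                chosen = next(self._ids)                        # Step 185
            else:
                chosen = min(found)
                # Step 187: replace the other identifiers across the whole model.
                for v, cid in list(self.component.items()):
                    if cid in found:
                        self.component[v] = chosen
            for v in section:                                   # Step 186
                self.component[v] = chosen


labeler = ComponentLabeler(s={"FILE-a", "FILE-c", "FILE-e"})
for p in (
    ["FILE-a", "PGM-x", "FILE-n", "PGM-y", "FILE-o", "PGM-w", "FILE-c"],
    ["FILE-e", "PGM-u", "FILE-c"],
    ["FILE-a", "PGM-x", "FILE-n", "PGM-u", "FILE-c"],   # joins the two components
    ["FILE-a", "PGM-q"],                                # ends outside s, ignored
):
    labeler.process_path(p)
```

The third hypothetical path touches vertexes carrying two different identifiers, which triggers the replacement of Step 187 so that both earlier sections end up in one connected component.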
Until no unprocessed section remains on the path p, the data driven analyzer 42 repeats the processing of Steps 183 to 187 (Step 188). Owing to the processing heretofore described, it is possible to set an identifier for the vertexes inside the subgraph for each connected component, and display as shown in
(1) Delete a business model and an association model associated with the physical data FILE-e.
(2) Delete a business model and an association model associated with the physical data FILE-f.
(3) Divide the business function into ranges surrounded by the frame lines 166, 167 and 168, and associate programs contained in the ranges with functions obtained by the division.
The controller 40 reads such an order given by the user, from the pointing device 34 such as a mouse. The data driven analyzer 42 conducts update processing of the business model and association model at Step 116 in
First, the function driven analyzer 43 conducts retrieval in the business function column 95 in the function association table 92 by using a business function specified by the user, and thereby obtains a set F of programs with which the subject business function is associated (Step 141). For example, if the specified business function is “order receiving registration”, contents of the set F become F={PGM-x, PGM-z, PGM-w} as shown in
Subsequently, the function driven analyzer 43 selects one program from the set F, and stores the program in a variable f contained in a storage area in the memory 10 (Step 142). The function driven analyzer 43 conducts retrieval in a direction of arrows along edges of a graph on the physical model by taking the program f as the starting point, obtains a set of paths starting from f and leading to an arbitrary vertex, and stores the set of the paths in a variable P contained in a storage area in the memory 10 (Step 143).
The function driven analyzer 43 takes one path from the set P of the paths obtained at Step 143, and stores the path in a variable p contained in a storage area in the memory 10 (Step 144). If the last vertex in the path p is contained in the program set F obtained at Step 141, the function driven analyzer 43 stores ◯ in the mark columns 64 and 66 of records associated with physical models (the program table 60 or the physical data table 61) of all vertexes (programs or physical data) on the path p (Step 145). The function driven analyzer 43 conducts processing of Steps 144 and 145 on all paths contained in the variable P obtained at Step 143 (Step 146). In addition, the function driven analyzer 43 conducts processing at Steps 142 to 146 on all programs contained in the set F obtained at Step 141 (Step 147). If processing is executed on the graph shown in
Subsequently, the function driven analyzer 43 provides physical data that is included in physical data input or output by programs provided with the mark “◯” at Step 145 and (1) that is only input or output by a program provided with the mark “◯” or (2) that is input to a program that is not provided with the mark “◯”, with a mark “Δ” (Step 148). For example, in
Finally, the function driven analyzer 43 transmits a subgraph obtained by the processing conducted at Steps 141 to 148 to the display apparatus 32. The display apparatus 32 displays the subgraph of the processing result in the same way as
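Steps 141 to 147 may be sketched as follows; the Δ marking of Step 148 is deliberately left out. This is a sketch under assumptions: the function association table is modelled as a list of (program, business function) pairs, paths are taken to be loop-free with at least one edge, and the edge list, the file FILE-p and the business function "shipping" are hypothetical.

```python
def function_driven_marks(edges, function_association, business_function):
    """Return (F, circle) for the specified business function."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    # Step 141: programs associated with the specified business function.
    F = {prog for prog, func in function_association if func == business_function}

    circle = set()
    for f in F:                                   # Steps 142 and 147
        stack = [[f]]
        while stack:                              # Step 143: enumerate paths from f
            path = stack.pop()
            for nxt in graph.get(path[-1], ()):
                if nxt in path:                   # keep paths loop-free
                    continue
                p = path + [nxt]                  # Step 144: one path p
                if nxt in F:                      # Step 145: last vertex is in F
                    circle.update(p)
                stack.append(p)                   # Step 146: continue with longer paths
    return F, circle


FUNCTION_ASSOCIATION = [
    ("PGM-x", "order receiving registration"),
    ("PGM-z", "order receiving registration"),
    ("PGM-w", "order receiving registration"),
    ("PGM-y", "shipping"),
]
EDGES_F = [
    ("FILE-a", "PGM-x"), ("PGM-x", "FILE-n"), ("FILE-n", "PGM-y"),
    ("PGM-y", "FILE-o"), ("FILE-o", "PGM-w"), ("PGM-w", "FILE-c"),
    ("FILE-n", "PGM-z"), ("PGM-z", "FILE-p"),
]
F, circle_f = function_driven_marks(EDGES_F, FUNCTION_ASSOCIATION, "order receiving registration")
```

In this hypothetical model PGM-y is marked although it is associated with a different business function, because it lies on a path between two programs of the set F; such a vertex is a candidate for a modification order by the user.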
In the first embodiment, it is assumed that the physical data is directly associated with a data storage area of a record, a variable, a file and a table in a program. The embodiment method described above may be extended to the case where data having different contents of meaning is stored in the data storage area. As practical cases, it is assumed that data having different contents of meaning exists as different records in a file and that different types of data occupy the same memory area each time a program is executed.
Also in processing of the physical model, although the program is considered as the vertex of a graph, the embodiment may be extended to the case where a plurality of different functions exist mixedly in a program. Description will be made on the second embodiment by incorporating the description of the first embodiment.
Since the physical data is represented by a combination of a data storage area and a data restriction, the physical data is not determined uniquely by the data storage area alone. The data ID column 201 is therefore provided as a new key for distinguishing among records.
A program function table 190 is a substitute for the table 60. The purpose of this table 190 is the physical implementation of processing for managing program functions, similar to the table 60. A function ID column 191 is provided as a unique key of the table because a function cannot be specified uniquely among a plurality of functions in the same program by the program name alone.
A physical I/O association table 210 is a substitute for the table 62 and represents association of an input with an output of the program function/physical data in the extended physical model.
The program function is represented by a function ID column 212. One record of the physical I/O association table 210 represents one input or output association of the program function with the data storage area. Data to be input and output is represented by a combination of a data column 213 and a data restriction column 214. Input/output is distinguished by an I/O classification column 215 similar to the first embodiment. The data restriction column 214 stores a restrictive condition for each storage area imposed on input or output data in the data column 213. It is assumed that the restrictive condition may take an arithmetic equation, an inequality or a logical equation using the field of each record, a special value “−” meaning that no condition is imposed on a field, or a special value “false” meaning that a record is not input or output.
A set of records having the same function ID describes all restrictions imposed on input/output items for the program function. For example, a set of records 220 to 223 describes restrictions imposed on input/output items for the program function with the function ID=F1. The data restriction column for input data stores the condition imposed on the input record of a program to distinguish among a plurality of different program functions contained in the program. A combination of a program and an input restriction indirectly represents a portion of the program to be executed when the conditions are satisfied.
For example, the restrictive condition “y=1” 218 shown in
Data restriction of output data is the restriction to be satisfied by the output data when the program is executed under a given input restriction.
For example, the restrictive condition “y=1” 219 indicates that “a field y of the input record in FILE-n with the program function F1 is 1”. The record 222 means that “the field y of FILE-n record which is an output of the program PGM-x is 1” under the assumed input condition of the program function F1. The data restriction column 214 of the record for FILE-m is “false” in the physical I/O association table 210, which means that there is no output for FILE-m under the same assumed condition.
In the extended physical model of the second embodiment, a key for identifying the physical data is a data ID, and a key for identifying the program function is a function ID. A data ID column 233 and a function ID column 235 are therefore provided in the association model to represent association with the business model.
For example, a record 236 shown in
A record 237 shown in
Similar to the first embodiment, work starts in an initial state by assuming a model which can be estimated initially. In the second embodiment, the physical model cannot be generated perfectly only by analysis of the program. A precision of the model is raised gradually by interactive processing for adding information from the user, similar to the first embodiment.
Also in the system based upon the extended model with restrictions, data driven analyzing and function driven analyzing can be conducted in a manner similar to the first embodiment. Because of a change in the physical model, the following description is incorporated basically for
(1) The physical data corresponds to a combination of a data storage area and a restriction imposed on the data storage area, and is identified by the data ID in processing.
(2) The program corresponds to a combination of a physical program and a restriction imposed on an input of the program (program function), and is identified by the function ID in processing.
(3) If a path is traced from a vertex representing the physical data to a vertex representing the program function of inputting the physical data, a set of function IDs are acquired and only the vertexes corresponding to the set are coupled by arrows, the set satisfying:
(3-1) a program corresponding to a program function inputs data in a data storage area corresponding to the physical data, and
(3-2) both the restrictive condition imposed on the input item of the program corresponding to the program function and the restrictive condition imposed on the physical data are satisfied.
(3-3) First, by using the data ID as a key, the physical data table 200 is searched to acquire records (Step 300).
(3-4) Data and a value in the data restriction column of each acquired record are stored in variables d and c (Step 301).
(3-5) By using the data name d and I/O classification “I”, the physical I/O association table 210 is searched, and a search result is stored in an array R (Step 302).
(3-6) One record is picked out from the array R, and the function ID and a value in the data restriction column of the record are stored in variables p and cl (Step 303).
(3-7) It is evaluated by using a theorem proving apparatus or the like whether both the restrictions c and cl are satisfied, and if satisfied, the function ID p is stored in a result array A (Step 304).
(3-8) The above-described operations are repeated until all records in the array R are processed (Step 305). After all records in the array R are processed, processing shown in
(4) If a path is traced from a vertex representing the program function to a vertex representing the physical data output from the program function, a set of data IDs are acquired and only the vertexes corresponding to the set are coupled by arrows, the set satisfying:
(4-1) a program corresponding to a program function outputs data in a data storage area corresponding to the physical data, and
(4-2) both the restrictive condition imposed on the output data storage area corresponding to the physical data and the restrictive condition imposed on the output item of the program function are satisfied.
(4-3) By using the given function ID and I/O classification “O”, the physical I/O association table 210 is searched to acquire records, and the records are stored in the array R (Step 310).
(4-4) One record is picked out from the array R, and a data name and a value in the data restriction column of each acquired record are stored in variables n and cl (Step 311).
(4-5) By using the data name n, the physical data table 200 is searched to acquire records, and the records are stored in an array Q (Step 312).
(4-6) One record is picked out from the array Q, and a data ID and a value in the data restriction column of the record are set to variables d and c (Step 313).
(4-7) It is evaluated by using a theorem proving apparatus or the like whether both the restrictions c and cl are satisfied, and if satisfied, the data ID d is stored in a result array A (Step 314).
(4-8) The above-described operations are repeated for all records in the arrays R and Q (Steps 315 and 316). After all records in the arrays Q and R are processed, processing shown in
As described above, also in the system in which the physical data and program functions are extended to models with restrictions, the same data driven analyzing and function driven analyzing as in the system without restrictions can be conducted.
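The two look-ups of Steps 300 to 305 and Steps 310 to 316 may be sketched as follows, purely as an illustration. The theorem proving apparatus is replaced here by a deliberately small stand-in, `jointly_satisfiable`, which only understands restrictions of the form (field, "==" or "!=", value), the value None for “no condition”, and False for “not input or output”; a practical system would need a far more general evaluator. The data IDs D1 to D5, the function ID F2, the dictionary keys, and the table contents are hypothetical, and the variable written cl in the text appears as `c1` in the code.

```python
def jointly_satisfiable(c, c1):
    """Stand-in for the theorem proving apparatus: can c and c1 hold together?"""
    if c is False or c1 is False:       # "false": the record is never input or output
        return False
    if c is None or c1 is None:         # no condition imposed on one side
        return True
    (field, op, value), (field1, op1, value1) = c, c1
    if field != field1:                 # conditions on different fields never conflict
        return True
    if op == "==" and op1 == "==":
        return value == value1
    if op == "==" or op1 == "==":       # one equality against one inequality
        return value != value1
    return True                         # two inequalities leave some value free


def functions_reading(data_id, data_table, io_table):
    """Function IDs that input the physical data `data_id` (Steps 300 to 305)."""
    A = []
    for rec in data_table:
        if rec["data_id"] != data_id:                     # Step 300
            continue
        d, c = rec["data"], rec["restriction"]            # Step 301
        R = [r for r in io_table if r["data"] == d and r["io"] == "I"]   # Step 302
        for r in R:                                       # Steps 303 and 305
            p, c1 = r["function_id"], r["restriction"]
            if jointly_satisfiable(c, c1) and p not in A: # Step 304
                A.append(p)
    return A


def data_written_by(function_id, data_table, io_table):
    """Data IDs output by the program function `function_id` (Steps 310 to 316)."""
    A = []
    R = [r for r in io_table
         if r["function_id"] == function_id and r["io"] == "O"]   # Step 310
    for r in R:                                           # Steps 311 and 316
        n, c1 = r["data"], r["restriction"]
        Q = [q for q in data_table if q["data"] == n]     # Step 312
        for q in Q:                                       # Steps 313 and 315
            d, c = q["data_id"], q["restriction"]
            if jointly_satisfiable(c, c1) and d not in A: # Step 314
                A.append(d)
    return A


DATA_TABLE = [
    {"data_id": "D1", "data": "FILE-a", "restriction": ("y", "==", 1)},
    {"data_id": "D2", "data": "FILE-a", "restriction": ("y", "!=", 1)},
    {"data_id": "D3", "data": "FILE-n", "restriction": ("y", "==", 1)},
    {"data_id": "D4", "data": "FILE-n", "restriction": ("y", "!=", 1)},
    {"data_id": "D5", "data": "FILE-m", "restriction": None},
]
IO_TABLE = [
    {"function_id": "F1", "data": "FILE-a", "io": "I", "restriction": ("y", "==", 1)},
    {"function_id": "F1", "data": "FILE-n", "io": "O", "restriction": ("y", "==", 1)},
    {"function_id": "F1", "data": "FILE-m", "io": "O", "restriction": False},
    {"function_id": "F2", "data": "FILE-a", "io": "I", "restriction": ("y", "!=", 1)},
    {"function_id": "F2", "data": "FILE-n", "io": "O", "restriction": ("y", "!=", 1)},
]
```

With these hypothetical tables, an arrow is drawn from the physical data D1 only to the function F1, and from F1 only to D3, because the “false” restriction on FILE-m excludes D5 and the restriction y=1 conflicts with that of D4.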
However, in order to efficiently operate the system using models with restrictions, it is necessary for the user to find restrictions imposed on the physical model and register the restrictions in the model. This modification of the physical model can be conducted interactively in a manner similar to modification of the business model and association model at Step 106 shown in
(5) If there is physical data input to or output from a plurality of functions, these functions and physical data are highlighted. Restrictions imposed on each input/output of the functions are displayed. For example, a worker studies division of the physical data by imposing a restriction on a data storage area, referring to the restriction imposed on each input/output of the functions.
(6) A function having an input or output from a plurality of physical data having different restrictions and the same data storage area is highlighted. Restrictions imposed on the plurality of physical data are displayed. For example, a worker studies imposing the restrictions imposed on data on each input/output of the functions.
(7) Under the given restriction imposed on an input, restriction imposed on output data of the program function is evaluated and displayed. It is assumed that this evaluation can be conducted by using already existing technologies.
Processing of model generation in the system supporting the extended model with restrictions will be described specifically by using the following examples. In the following, mapping of the order receiving registration processing is made precise, and processing of domestic order registration is made understandable from data mapping processing. For example, it is first assumed that knowledge that a flag y in a record is 1 is obtained from interviews with an end user of an analysis target system.
A worker using the embodiment system registers this knowledge in the system at Step 106 to use it as the restriction imposed on the record of FILE-a, and in the association model, the record on which the restriction y=1 is imposed is associated with domestic order receiving. In the business model, business of inputting domestic order receiving and outputting an order receiving slip is newly defined as a “domestic order receiving registration” function, and mapping is conducted again.
In this example, FILE-a is classified on the basis of data restriction and is displayed as icons 240 and 241. A conditional formula in { } of the icon indicates the restrictive condition imposed on each record.
The icon 240 pertains to domestic order receiving. The system highlights an icon of PGM-x because data having different restrictions and the same data storage area of FILE-a is input to PGM-x.
The user pays attention to inputs of highlighted PGM-x, and registers physical data restrictions y=1 and y!=1 as the restrictions imposed on the inputs of PGM-x to divide the function into two functions and conduct mapping again.
Icons of functions are merely copied at this time. Therefore, flows under the icons 263 and 264 are not easy to observe. In order to analyze downstream flows, it is possible to instruct evaluation of execution results of the program functions corresponding to the icons 263 and 264.
For example, for the icon 263, an output of execution result of PGM-x is evaluated on the assumption that the input record satisfies the condition formula y=1. It is assumed that the condition formula y=1 is obtained for the output record n of PGM-x and that an output record m is not output. For the icon 264, an output of execution result of PGM-x is evaluated on the assumption that the input record satisfies the condition formula y!=1. It is assumed that the condition formula y!=1 is obtained for the output record n of PGM-x. Since these evaluations are not necessarily executable, a worker is required to supplement knowledge if not executable.
Next, the user pays attention to conditions imposed on outputs of a plurality of functions to the highlighted FILE-n 275 to impose the restriction condition on FILE-n and divide it into two physical data.
In the system having the configuration described above and in the embodiment using the extended model with restrictions, a portion regarding the “domestic order receiving registration” is extracted by order receiving registration processing so that programs and data regarding the “domestic order receiving registration” can be identified.
As described so far, the reverse engineering support system of the present invention can support a process of understanding an information system constituted of a number of programs by utilizing technologies of program analysis.
It should be further understood by those skilled in the art that although the foregoing description has been made on embodiments of the invention, the invention is not limited thereto and various changes and modifications may be made without departing from the spirit of the invention and the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
2006-224828 | Aug 2006 | JP | national |