Neural network resource sizing apparatus for database applications

Description

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other aspects, features and advantages of the invention will be more apparent from the following more particular description thereof, presented in conjunction with the following drawings wherein:

FIG. 1 shows a graphical representation of a perceptron showing inputs p1 through pR input to the perceptron layer that yields resource utilization outputs a1 through aS.

FIG. 2 is an equivalent view of FIG. 1 using different neural network notation.

FIG. 3 shows the perceptron learning rule formulas.

FIG. 4 shows the weighting matrix W having row indices associated with the destination neuron and column indices associated with the given input.

FIG. 5 shows a flowchart detailing a method for obtaining training data.

FIG. 6 shows a portal interface to an embodiment of the apparatus.

FIG. 7 shows a first test schema for generating a resource learning session.

FIG. 8 shows a second test schema for generating a resource learning session.

FIG. 9 shows an architectural diagram having a test server for training the neural net and also a portal with an HTML interface and a webservice interface utilizing XML.

FIG. 10 shows an embodiment of the XML input and output used by the webservice interface.

DETAILED DESCRIPTION

A neural network resource sizing apparatus for database applications will now be described. In the following exemplary description numerous specific details are set forth in order to provide a more thorough understanding of embodiments of the invention. It will be apparent, however, to an artisan of ordinary skill that the present invention may be practiced without incorporating all aspects of the specific details described herein. In other instances, specific features, quantities, or measurements well known to those of ordinary skill in the art have not been described in detail so as not to obscure the invention. Readers should note that although examples of the invention are set forth herein, the claims, and the full scope of any equivalents, are what define the metes and bounds of the invention.

FIG. 1 shows a graphical representation of a perceptron showing inputs p1 through pR input to the perceptron layer that yields resource utilization outputs a1 through aS. Inputs p1 through pR may be configured as follows in one embodiment of the invention:

p1=number of records

p2=number of lookups

p3=number of images

p4=number of PDF files

p5=number of fields

p6=number of BLOBs

p7=width of all fields

Outputs from Perceptron Layer may be as follows in one embodiment of the invention:

a1=amount of recommended processing power in a desired benchmark (SPEC, Dhrystone, etc.)

a2=amount of recommended memory

a3=amount of recommended disk

a4=amount of recommended network throughput

Each neuron in the Perceptron Layer is represented as a summation symbol followed by a hardlim, i.e., hard-limit transfer function. The hardlim transfer function returns a zero or one. The perceptron neuron produces a zero if the net input into the hardlim transfer function is less than zero, or a one if the net input to the hardlim transfer function is equal to or greater than zero. The hardlim transfer function allows the perceptron neuron to split the input space into two regions. The weighting matrix W correlates the weights of each input against each neuron. By applying multiple vectors of inputs and recommended outputs to the neural network, the neural network is trained to output recommended resource capacities for a given database application version.
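By way of illustration only, the perceptron layer described above may be sketched in Python as follows. The function names, the bias vector b and the numeric values are assumptions chosen for the example and do not appear in the figures; each row of W holds the weights of one neuron and each column corresponds to one input.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if the net input is >= 0, else 0."""
    return 1 if n >= 0 else 0


def perceptron_layer(W, b, p):
    """Return the outputs a1..aS for inputs p1..pR.

    W is a list of S rows (one per neuron) of R weights (one per input),
    b is a list of S biases, and p is a list of R inputs.
    """
    outputs = []
    for row, bias in zip(W, b):
        net_input = sum(w * x for w, x in zip(row, p)) + bias
        outputs.append(hardlim(net_input))
    return outputs


# Hypothetical values: S = 2 neurons, R = 3 inputs.
W = [[1.0, -1.0, 0.5],
     [-2.0, 0.0, 1.0]]
b = [0.0, -1.0]
p = [1.0, 2.0, 2.0]
a = perceptron_layer(W, b, p)
```

In this sketch the first neuron has a net input of exactly zero and therefore outputs one, while the second has a negative net input and outputs zero.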

FIG. 2 is an equivalent view of FIG. 1 using different neural network notation. In this diagram Input is shown as a bar to indicate that it is a vector of size R. Regardless of the notation used, the inputs, outputs and training are the same.

FIG. 3 shows the perceptron learning rule formulas. The goal of training the perceptron is to minimize the error “e”, which is the difference between the target vector “t” and the neuron response vector “a”. The weights are altered based on the input vector “p”: the new weight vector w(new) is calculated from w(old), the error “e” and the input vector “p”. For example, if an input vector is presented and the output is correct, then the weight vector “w” is not altered. If the neuron output is zero and should be one, then “a” is zero, “t” is one and hence “e”=“t”−“a”=1, and input vector “p” is added to the weight vector “w”. If the neuron output is one and should be zero, then the input vector “p” is subtracted from the weight vector “w”. Similarly, the bias can also be updated based on the error “e”. One skilled in the art of neural networks will understand that many tools or different types of calculations may be performed to produce an updated weighting matrix W.
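A minimal sketch of one update for a single neuron, assuming the standard perceptron rule consistent with the description above (e = t − a, w(new) = w(old) + e·p, b(new) = b(old) + e), is given below. The function name train_step and the sample values are hypothetical.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if the net input is >= 0, else 0."""
    return 1 if n >= 0 else 0


def train_step(w, b, p, t):
    """Apply one perceptron learning step for a single neuron.

    w: weight vector, b: bias, p: input vector, t: target (0 or 1).
    Returns the updated weights, the updated bias and the error e.
    """
    a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
    e = t - a                                  # -1, 0 or +1
    w_new = [wi + e * pi for wi, pi in zip(w, p)]
    b_new = b + e
    return w_new, b_new, e


# Output is one but should be zero: p is subtracted from w.
w1, b1, e1 = train_step([0.0, 0.0], 0.0, [1.0, 2.0], 0)
# Output is now correct: nothing changes.
w2, b2, e2 = train_step(w1, b1, [1.0, 2.0], 0)
```

When the error is zero the weights and bias are returned unchanged, which matches the case in which the output is already correct.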

FIG. 4 shows the weighting matrix W having row indices associated with the destination neuron and column indices associated with the given input. The weighting matrix W comprises the various weight vectors and is updated as more and more test data is used to train the system. In this manner, the neural network may be utilized to recommend resource capacities for database application implementations not yet observed. Any updated training information based on existing installations may also be applied to the neural network to further improve the accuracy of the apparatus. Any known software package may be utilized to implement the neural network, such as for example MATHEMATICA®.

FIG. 5 shows a flowchart detailing a method for obtaining training data. Processing starts at 500. The database is loaded with a first test schema. The order in which test schemas are loaded and utilized to obtain training data does not matter and the input of simple schemas before more complex schemas is exemplary only. A performance load is run on the database application at 502. There are many tools that may be utilized in order to simulate a load on the database application. The resulting utilization of CPU, RAM, disk and/or network resources is obtained at 503. If there are no more tests to run as determined at 504, then training data is returned at 508 and processing completes at 509. If there are more tests to run as determined at 504, then the database is loaded with the next test schema at 505. A performance load is placed on the database application with the new test schema at 506. The resulting utilization of CPU, RAM, disk and/or network resources is obtained at 507. If there are more tests to run at 504, then another schema is loaded and tested; otherwise, the training data is returned at 508 and processing ends at 509. By obtaining a number of resource output results for different database application parameter scenarios, accurate recommended resource output results may be provided.
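The flow described above may be sketched as a simple loop, offered as an illustration under stated assumptions rather than as the implementation. The callables load_schema, run_load and read_utilization are hypothetical stand-ins for whatever database loader, load tool and monitoring tool are actually used, and the stub measurements below are placeholders, not measured results.

```python
def collect_training_data(schemas, load_schema, run_load, read_utilization):
    """Run one load test per schema and pair parameters with utilization."""
    training_data = []
    for schema in schemas:             # loop ends when no tests remain (504)
        load_schema(schema)            # load the first or next test schema
        run_load(schema)               # place a performance load (502 / 506)
        usage = read_utilization()     # CPU, RAM, disk, network (503 / 507)
        training_data.append((schema["parameters"], usage))
    return training_data               # training data returned (508)


# Hypothetical stand-ins so the sketch can be run on its own.
loaded = []

def fake_load_schema(schema):
    loaded.append(schema["name"])

def fake_run_load(schema):
    pass

def fake_read_utilization():
    # Placeholder zeros only; real values would come from monitoring tools.
    return {"cpu": 0.0, "ram": 0.0, "disk": 0.0, "network": 0.0}

schemas = [
    {"name": "schema_1", "parameters": {"records": 5000}},
    {"name": "schema_2", "parameters": {"records": 100000}},
]
data = collect_training_data(
    schemas, fake_load_schema, fake_run_load, fake_read_utilization
)
```

Each returned pair holds the database application parameters of one test schema and the resource output results observed for it.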

FIG. 6 shows a portal interface to an embodiment of the apparatus. In this figure portlet 600 is shown that may be embedded in another webpage for example. In other embodiments of the invention, a webservice may be utilized in addition to, or in place of, the graphical user interface shown in FIG. 6. In this embodiment of the portlet, the user inputs database application parameters such as for example the number of records, number of lookups, number of images, number of PDF files, number of BLOBs and number of fields in the database for the given schema in input area 601. Calculate button 602 is pressed and recommended resource output results are shown in recommended resource output results area 603. Optionally, recommended servers or hardware products that meet the required capacities may be shown. A recommended server may be shown if the recommended resource capacities are within the bounds of the recommended server for example.
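One possible way to pick a recommended server, sketched here under the assumption that a catalog of servers with known capacities is available, is to return the first catalog entry whose capacities bound every recommended capacity. The catalog entries, key names and figures below are hypothetical.

```python
def recommend_server(required, catalog):
    """Return the name of the first server that covers every requirement.

    required: mapping of resource name to recommended capacity.
    catalog: list of servers, assumed to be ordered from smallest to largest.
    Returns None when no server in the catalog is large enough.
    """
    for server in catalog:
        capacity = server["capacity"]
        if all(capacity.get(name, 0) >= amount
               for name, amount in required.items()):
            return server["name"]
    return None


# Hypothetical catalog and requirement.
catalog = [
    {"name": "small", "capacity": {"ram_gb": 8, "disk_gb": 250}},
    {"name": "large", "capacity": {"ram_gb": 64, "disk_gb": 2000}},
]
choice = recommend_server({"ram_gb": 16, "disk_gb": 300}, catalog)
```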

Webservice embodiments may be utilized for example that allow for a given database application implementation to routinely report utilization numbers that occur over time. These reports may be used over time to increase the accuracy of the neural network or to flag problems. For example, if a particular installation appears to be losing ground in resource utilization with respect to the planned resources, then this may indicate that there are problems with the system such as hardware problems or overutilized resources which limit the amount of resources that a particular installation may utilize. For example, if the amount of disk for a given installation drops and the number of main data records rises, then the amount of RAM utilized may result in swapping or thrashing. This information may be utilized to not only update the neural network, but also to alert support personnel that there may be a problem.
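As an illustration of how reported utilization might be checked against planned resources, the following sketch flags any resource whose reported use reaches a chosen fraction of its planned capacity. The 0.9 threshold, the resource names and the sample figures are assumptions made for the example.

```python
def flag_high_utilization(planned, reported, threshold=0.9):
    """Return the sorted names of resources at or above threshold * planned."""
    return sorted(
        name
        for name, used in reported.items()
        if name in planned and used >= threshold * planned[name]
    )


# Hypothetical planned capacities and one reported sample.
planned = {"ram_gb": 16, "disk_gb": 500}
reported = {"ram_gb": 15.5, "disk_gb": 200}
alerts = flag_high_utilization(planned, reported)
```

A non-empty result could be used both to alert support personnel and to select reports worth adding to the training data.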

FIG. 7 shows a first test schema for generating a resource learning session. Test schema 700 utilizes a main data table without lookups and with 5000 product records. The database application may make use of family based data which builds upon an existing hierarchy of manufacturer and category; however, this is optional. The number of PDF files in the 5000 data records is known and is used as an input for training for this test schema. A load module is run against the schema that defines the database application parameters and resource utilization is recorded such as CPU, RAM, disk and/or the network as resource output results. The database application parameters and resource output results (or resource output results rounded up to meet hardware capable of handling the load for example) are saved and input to the neural network for training the neural network. Any factor for increasing the resource output results to add a safety margin is in keeping with the spirit of the invention.

FIG. 8 shows a second test schema for generating a resource learning session. Test schema 800 also includes lookups based on attributes that are associated in the main table with a category as per category-attribute table 801. The attribute names and types are shown in attributes table 802. The main data table in this case utilizes 100,000 records and may have a variety of loads placed on the database application in order to generate one or more performance points for use in training the neural network. Generally, the more training that can be applied to the neural network over varying parameters, the more accurate the resulting recommended resource output results become. Although the example shown in FIG. 8 is simplified for brevity, any number of fields, BLOBs and field widths may be utilized for example in order to provide an array of various tests for a particular database application implementation and given hardware setup.

FIG. 9 shows an architectural diagram having a test server for training the neural net and also a portal with an HTML interface and a webservice interface utilizing XML. Load tester LOAD interfaces with server TEST SERVER associated with database DB. Server TEST SERVER utilizes test schemas 1 through N as inputs for a test. The apparatus obtains the database application parameters associated with each database schema, installation, implementation, version or any other database related element, along with the load test results that result from running load tester LOAD. TEST SERVER or any other computing element coupled with the apparatus then trains neural net NN with these database application parameters and resource output result parameters. When a user of the apparatus desires recommendations for a desired database application, the apparatus obtains the desired database application parameters and provides at least one recommended resource output result based on neural network NN as trained. The interface to the apparatus may include HTML via portal interface HTML or portal interface WEBSERVICE. Any other method of training neural network NN is in keeping with the spirit of the invention so long as database application parameters are utilized in training neural network NN to provide recommended resource output results. (See FIG. 6 for an HTML embodiment of the portal interface HTML).

FIG. 10 shows an embodiment of the XML input and output used by the webservice interface. XML input message 1000 shows elements associated with database parameters residing within element designated DBparameter. The various database application parameters used follow and include NumberOfRecords, NumberOfLookups, NumberOfImages, NumberOfPDFFiles, NumberOfBLOBS and NumberOfFields along with the associated values. XML output message from the webservice includes elements associated with recommended resource capacities residing in element RecommendedCapacity. The various recommended capacity elements used follow and include CPU, RAM, DISK, NETWORK and SERVER. Any variation of the database application parameters and recommended resource output results is in keeping with the spirit of the invention and those shown in FIG. 10 are exemplary.
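A minimal sketch of reading such an input message and building such an output message with the Python standard library is shown below. The element names are those listed above, but the exact nesting, the sample values and the helper names parse_request and build_response are assumptions, since the layout of FIG. 10 is not reproduced in this text.

```python
import xml.etree.ElementTree as ET

# Assumed layout: parameter elements nested directly inside DBparameter.
REQUEST = """<DBparameter>
  <NumberOfRecords>5000</NumberOfRecords>
  <NumberOfLookups>0</NumberOfLookups>
  <NumberOfImages>100</NumberOfImages>
  <NumberOfPDFFiles>50</NumberOfPDFFiles>
  <NumberOfBLOBS>10</NumberOfBLOBS>
  <NumberOfFields>20</NumberOfFields>
</DBparameter>"""


def parse_request(xml_text):
    """Return the database application parameters as a dict of integers."""
    root = ET.fromstring(xml_text)
    return {child.tag: int(child.text) for child in root}


def build_response(capacity):
    """Serialize recommended capacities inside a RecommendedCapacity element."""
    root = ET.Element("RecommendedCapacity")
    for tag in ("CPU", "RAM", "DISK", "NETWORK", "SERVER"):
        ET.SubElement(root, tag).text = str(capacity[tag])
    return ET.tostring(root, encoding="unicode")


params = parse_request(REQUEST)
# Placeholder capacities; real values would come from the trained network.
response = build_response(
    {"CPU": "n/a", "RAM": "n/a", "DISK": "n/a", "NETWORK": "n/a", "SERVER": "n/a"}
)
```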

While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.

Claims

1. A neural network resource sizing computer program product comprising computer readable instruction code executing in a tangible memory medium of a computer, said computer readable instruction code configured to: obtain at least one database application parameter; obtain at least one load test result comprising at least one resource output result from a load test run on a database application test implementation comprising said at least one database application parameter; train a neural network based on said at least one database application parameter and said at least one resource output result; and provide at least one recommended resource output result based on said at least one database application parameter and said at least one load test result for a desired database application implementation.
2. The computer program product of claim 1 wherein said computer readable program code is further configured to obtain at least one database application parameter from said desired database application implementation.
3. A neural network resource sizing computer program product comprising computer readable instruction code executing in a tangible memory medium of a computer, said computer readable instruction code configured to: obtain at least one database application parameter; obtain at least one load test result comprising at least one resource output result from a load test run on a database application test implementation comprising said at least one database application parameter; train a neural network based on said at least one database application parameter and said at least one resource output result; obtain at least one database application parameter from a desired database application implementation; and, provide at least one recommended resource output result based on said at least one database application parameter for said desired database application implementation wherein said at least one database application parameter is input to said neural network.
4. The computer program product of claim 3 wherein said at least one database application parameter is a number of database records.
5. The computer program product of claim 3 wherein said at least one database application parameter is a number of database lookups.
6. The computer program product of claim 3 wherein said at least one database application parameter is a number of images in a database.
7. The computer program product of claim 3 wherein said at least one database application parameter is a number of BLOBs.
8. The computer program product of claim 3 wherein said at least one database application parameter is a number of PDF files stored in a database.
9. The computer program product of claim 3 wherein said at least one database application parameter is a number of database fields or width of a plurality of database fields.
10. The computer program product of claim 3 wherein said at least one recommended resource output result is a central processing unit benchmark number and unit of measure associated with database computer hardware.
11. The computer program product of claim 3 wherein said at least one recommended resource output result is an amount of random access memory associated with database computer hardware.
12. The computer program product of claim 3 wherein said at least one recommended resource output result is an amount of disk storage space for said database hardware.
13. The computer program product of claim 3 wherein said at least one recommended resource output result is a network throughput speed.
14. The computer program product of claim 3 wherein said neural network comprises a perceptron architecture that utilizes a hard-limit transfer function.
15. The computer program product of claim 3 wherein said neural network comprises a multilayer neural network.
16. The computer program product of claim 3 further comprising a portal for providing said at least one recommended resource output result.
17. The computer program product of claim 3 further comprising a portal wherein said portal is configured to provide said at least one recommended resource output result via a webservice XML response.
18. The computer program product of claim 3 further comprising a portal configured to obtain ongoing output results from at least one customer installation over a period of time.
19. The computer program product of claim 3 further comprising a portal configured to obtain ongoing output results from at least one customer installation and report a high utilization to a customer.
20. A neural network resource sizing apparatus for database applications comprising: means for obtaining at least one load test result comprising at least one resource output result from a load test run on a database application test implementation comprising said at least one database application parameter; means for training a neural network based on at least one database application parameter and said at least one resource output result; means for providing at least one recommended resource output result based on said at least one database application parameter for said desired database application implementation wherein said at least one database application parameter is input to said neural network.
