The present invention relates in general to a method and apparatus for storing digital information, and in particular, to storing digital information in relational databases via the methodology of the Data Access Object software design pattern.
Many organizations collect large amounts of data such as employee data, customer data, product data, etc. This data can be useful if presented in the right form to the right people at the right time. The data is usually stored in a computer database. For example, a school will collect data about students, such as an address, birthday, social security number, previous schooling, phone numbers, etc. The school also collects data about the faculty, such as address, experience, start date, and courses they can and are teaching. The school can also collect data on the different courses being offered, who teaches the courses, who is signed up as a student for the courses, and where the courses are held.
All this data is usually stored in a database. General purpose computer applications manipulate the data and often use a specialized computer application known as a database server to store the data. There are several advantages achieved by using a database. First, application data, also known as objects, may be persisted between use when the total amount of data stored in the database exceeds the resources required to keep all data resident within the local memory. Second, objects may be persisted between execution of an application and/or invocations of the machine upon which they are executed. Third, a plurality of computer applications that work together for some common purpose can share data with complex relationships.
Most modern relational databases are first designed and configured, and then they are used by applications. During the design and configuration phase, database instances are created, and tables are created within those database instances. While it is possible for database administrators to use a command line interface to construct database tables, it is a more common practice to build script files which execute the command line interface as a batch operation. This practice facilitates the debugging of the initial database tables as well as the recovery of lost databases. Information such as table names, column names, column types and relationships are defined by these database commands.
Once a database has been designed and implemented, the second phase may commence, namely the use of the database. When an object-oriented computer language such as C++ or Java is used, application data structures tend to be represented as a collection of properties defined as a class. In most cases, it is the non-transient values of class instances which developers are interested in persisting in a database. To achieve the storage and retrieval of class instance properties with respect to database tables, special code must be written. This code takes into account the structure of the database as designed, as well as the association between database columns and class instance properties. It is important to note that a change to the schema of the database will usually result in a change to this process and vice versa.
There are many different database client applications which a developer can use to store, modify and retrieve data from a database server application. These different client applications each have their own specific language or commands to manage the data. There are also many different database server applications each with their own specific language or commands.
Different types of organizations have different types of data, and often collect and need the data in different forms. Very often the people entering the data, and needing the data, are only slightly computer literate, or do not have sufficient time to prepare the data in the proper form. Therefore separate database client applications are needed to collect and present the data. It is desirable to make the client applications, that present the data, be very easy to use, and anticipate the needs of the operator. As the amount of data increases and as the different ways in which the data is to be presented increases, the database client applications become more and more complicated to create. One way to make writing the data base client easier, is to make generic commands for storing, modifying and retrieving the data from different database applications.
In recent years a software design pattern known as Data Access Objects (DAO) has emerged. When employing this pattern in software designs, special DAO classes are developed that encapsulate the design details of the database as well as the procedure calls used to effect changes in the database. This technique is useful when a development team is made up of application and database specialists. By using DAO objects, application developers are able to focus on application development without undue concern for database interaction. Although advantageous in some respects, the DAO introduces yet another layer of coordination that must be considered as database and/or class structures change.
It is an object of the present invention to allow the objectives of DAO while eliminating two of the three points of maintenance associated with changing data structures: schema and the DAO itself. That is to say, a change (addition, modification or removal) to class design is discovered and accommodated by a generic DAO object, and that generic DAO object has the ability to reconfigure the underlying database.
The present invention accomplishes this by evaluating classes as they are referenced through the DAO. As new classes are seen, a schema for that class is programmatically created and the database structure is changed. If the structure of a class has changed, the generic DAO compares the structure of the class properties with the database and effects whatever changes are required to the schema of the database to make the two agree. By providing general purpose methods for functions typically implemented in DAO objects, such as add, get, getMany, update and remove, this generic DAO object is able to interact with all classes persisted to the underlying database.
In a preferred embodiment of the present invention, a server application manages the database, and a client application sends data to the database server application, the database client reads data from the server application, and the client application manipulates the data for presentation to a user.
The client application organizes the data into separate collections, for example a plurality of tables, each with columns and rows. As a further example, a school may have a table for student information, where each row of the table is for a different student, and the columns describe different information for each student, such as name, social security number, address, etc. The school can have another table for course information with each row describing a different course and the columns describing different information about the course.
The client application describes these separate collections or tables by creating a class for each table. The class indicates to which collection or table the class is associated with. The class also indicates the different items or columns in the associated table.
When the client application wants to add data to the database, the client application creates an object based on the appropriate class. For example, if an additional student is to be added to the student table, a student object is made based on the class for the student table. The student object includes local storage for the different items of information of the student. The client application will process the student object and send the database instructions to the database server.
Prior to the present invention, custom methods would need to be established for each table and/or class so that the data in the objects could be entered into the appropriate table. The present invention includes a software module in the client application which analyzes the object and determines the table to which the object belongs. The software module analyzes the database to determine if the database includes that table. If the database does not include that table, the software module creates a table in the database corresponding to the table to which the object belongs. Once the database includes the table for the object, the software module analyzes the object to determine the different items of information or columns in the object. The software module then compares the columns in the object with the columns in the corresponding table. The software module adjusts the columns in the table to match the columns in the object. The software module is preferably created using the Data Access Objects (DAO) pattern. DAO's are already known to the person of ordinary skill in the art of databases, and therefore no further explanation is necessary concerning DAO's.
According to the DAO pattern, the software module is also written with classes and objects. A datasource class is created to process data objects such as the student object. This datasource class has general-purpose commands which operate on a data object. The general-purpose commands can add, read, modify and delete the data of the data object in the database. These general-purpose commands need to know the tables and the columns of the different data classes created by the client application, and present in the database. These general-purpose commands also need to know how to add, read, modify and delete the tables and columns. The present invention provides further subclasses under the datasource class, called the classmetadata which include general commands for processing the tables. The present invention has subclasses to the classmetadata called membermetadata which includes general commands for processing the columns of the tables. In this way, a datasource class can be written to process all data objects for a database.
The developer of a client application only needs to create a class for each of the tables that the developer desires. The class for each table includes sufficient information to describe how the table is to be arranged. The present invention extracts this information from the class and configures a table accordingly, if the table is not already configured. The developer of the client application does not need to know the specific commands for the server application once the datasource class, the classmetadata and the membermetadata are created according to the present invention. The client developer only needs to create the classes, and the corresponding tables will be created automatically.
This is especially advantageous during the development stages of a client application. During initial development, the arrangement of the database often changes frequently. With the present invention, the developer only needs to change the classes in the client application, and the proper tables in the database will be created the next time the client application is executed. Only one change needs to be made, not the many changes such as for the DAO and database schema. Furthermore, the corresponding tables are only created when they are needed. The tables do not need to be created ahead of time by the developer, which saves the developer time, and allows the developer to concentrate on how the data is to be manipulated for presentation to the user. It is also possible with the present invention to have the database client access several different database servers at the same time. The developer does not need to know the specific commands for each server once the datasource classes are provided.
It is often desirable to hash or sort tables based on one or more of the columns: collectively a key. For example, if each student is given a student ID number, the student table can have all the rows sorted by the ID number. In this way, when the data for a particular student needs to be modified, it is very easy to find the row containing that student's data. The datasource class can include methods for sorting a table based on one or more columns. In a particular embodiment of the present invention, one or more columns where the name of the column has the word or suffix “ID,” is selected to be used as key fields in the table. When a row of data is to be added to the table, the datasource class can examine the column that is used for sorting, and places the row in the table according to the sort order of that column. Key fields are evaluated by the datasource as new tables are added or modified.
The various features of novelty which characterize the invention are pointed out with particularity in the claims annexed to and forming a part of this disclosure. For a better understanding of the invention, its operating advantages and specific objects attained by its uses, reference is made to the accompanying drawings and descriptive matter in which preferred embodiments of the invention are illustrated.
The various features of novelty which characterize the invention are pointed out with particularity in the claims annexed to and forming a part of this disclosure. For a better understanding of the invention, its operating advantages and specific objects attained by its uses, reference is made to the accompanying drawings and descriptive matter in which preferred embodiments of the invention are illustrated.
In the drawings:
a is the object-oriented design for a Java class adhering to the Java Bean standard;
b is the object-oriented design notation for a Java based DAO class;
Referring to the drawings, in particular
To create a table in a database, a schema must be provided. This is typically done via a command line interface. The preferred method for executing these commands is via a script file as a batch job. For example to create the grade table 118 from
CREATE DATABASE SCHOOL;
USE SCHOOL;
CREATE TABLE StudentGrade {
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
studentId BIGINT NOT NULL,
professorId BIGINT NOT NULL,
courseID BIGINT NOT NULL,
grade FLOAT NOT NULL
};
In the preferred embodiment using the Java programming language, this table would be represented by the Java class shown in
Referring to
Using these techniques, the same information about the names and types of the properties associated with a bean needs to be entered three times: in the schema, in the bean class and in the DAO. Likewise, whenever a change is made to any one of these components, a similar change must be made in the other two.
Referring to
A DataSource instance maintains information about each type of abstract bean 130 via a ClassMetaData 132 class object. The ClassMetaData 132 is therefore an aggregate of, or belongs to, the DataSource 134, and each DataSource 134 has one ClassMetaData object for each Abstract bean 130 class derivative it has seen. The ClassMetaData 132 describes the abstract bean 130 at a high level. The individual properties, their “getters,” “setters,” name and type information is maintained in a list of MemberMetaData 136 classes objects. Therefore, the MemberMetaData 136 is an aggregate of, or belongs to, the ClassMetaData 132, and there is one MemberMetaData class 136 object per property of the associated abstract bean 134.
Referring to
Referring to
Referring to
In order to adjust an existing database table, the schema of the database table must be understood; therefore the first step in the scanning process is to create a list of column names and their type for each column in the database table associated with the bean at hand 180: the “delete” list. Next, an empty list is created 182 to hold instances of MemberMetaData associated with each property of the bean: the “member” list. At this point an introspection of the bean is performed 184. Again, the introspection will result in a list of bean properties: the “property” list. The resulting “property” list is iterated, and for each bean property, the following steps are taken. First, a MemberMetaData object is instantiated 186 for the property. Second, the “delete” list is checked 188 for a corresponding entry for the property. If the property is in the “delete” list, it is removed 190 from the “delete” list. However, if the property is not in the “delete” list, it is a new property, and a SQL INSERT COLUMN command is executed against the database to add 194 a new column for the new property. In both cases, the newly created MemberMetaData object is added to the “member” list. Finally, once all properties in the “property” list have been processed, a SQL DELETE COLUMN command is executed against the database to remove the properties remaining in the “delete” list. Once all of the scanning steps have been completed, the database schema will agree with the ClassMetaData representation of the bean, and the ClassMetaData object will contain a valid list of MemberMetaData objects.
The process of creating database tables and scanning beans described above will be executed once per DataSource 134 upon the DataSource 134 seeing an Abstract bean 130. This is due to the fact that the MemberMetaData objects are cached in ClassMetaData objects, and the ClassMetaData objects are cached in the DataSource. Since DataSource objects tend to live for the duration of the application, these discovery and maintenance tasks are typically only performed once per application invocation.
Referring to
Referring to
Referring to
The terms line, row, table, column are used in this specification to describe different collections of data and are used to assist the reader in understanding how the present invention relates to a database using a two-dimensional table or sheet format. These terms are not intended to be limited to collections where data must be in line, row, table or column format, or the database must be two-dimensional. Instead these terms are only a preferred form of the data collection, and the present invention can be used with many different forms of data collections.
The use of terms with mixed upper and lower case has been used to make the application easier to read and in accordance with the conventions used in major object oriented programming languages such as Java and C++. The mixed case does not imply any further limitations on the terms.
While specific embodiments of the invention have been shown and described in detail to illustrate the application of the principles of the invention, it will be understood that the invention may be embodied otherwise without departing from such principles.