Object-relational mapping tools (ORMs) have become a fixture in application programming over relational databases. They provide an application developer the ability to develop against a conceptual model, which is generally an entity-relationship model with inheritance. The conceptual model is coupled to a mapping that describes the relationship between the model and a physical database schema. The ORM uses this mapping to translate queries and updates against the model into semantically equivalent ones against the relational database.
When an application changes, however, the conceptual model for the application may need to change as well. Recompilation and validation in response to a change may be time consuming.
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
Briefly, aspects of the subject matter described herein relate to incrementally modifying schemas and mappings. In aspects, an indication of a change to a client schema is received and a compilation directive is received. The compilation directive may indicate how one or more entities or associations in the client schema are to be mapped to a store schema. In response to receiving the indication of the change and the compilation directive, the mapping data and storage schema may be incrementally modified with incremental revalidation and incremental updating of query and update views.
This Summary is provided to briefly identify some aspects of the subject matter that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The phrase “subject matter described herein” refers to subject matter described in the Detailed Description unless the context clearly indicates otherwise. The term “aspects” is to be read as “at least one aspect.” Identifying aspects of the subject matter described in the Detailed Description is not intended to identify key or essential features of the claimed subject matter.
The aspects described above and other aspects of the subject matter described herein are illustrated by way of example and not limitation in the accompanying figures in which like reference numerals indicate similar elements and in which:
As used herein, the term “includes” and its variants are to be read as open-ended terms that mean “includes, but is not limited to.” The term “or” is to be read as “and/or” unless the context clearly dictates otherwise. The term “based on” is to be read as “based at least in part on.” The terms “one embodiment” and “an embodiment” are to be read as “at least one embodiment.” The term “another embodiment” is to be read as “at least one other embodiment.” Other definitions, explicit and implicit, may be included below.
Aspects of the subject matter described herein are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, or configurations that may be suitable for use with aspects of the subject matter described herein comprise personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microcontroller-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, personal digital assistants (PDAs), gaming devices, printers, appliances including set-top, media center, or other appliances, automobile-embedded or attached computing devices, other mobile devices, distributed computing environments that include any of the above systems or devices, and the like.
Aspects of the subject matter described herein may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. Aspects of the subject matter described herein may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
With reference to
The computer 110 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by the computer 110 and includes both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media.
Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVDs) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 110.
Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation,
The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media, discussed above and illustrated in
A user may enter commands and information into the computer 110 through input devices such as a keyboard 162 and pointing device 161, commonly referred to as a mouse, trackball, or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, a touch-sensitive screen, a writing tablet, or the like. These and other input devices are often connected to the processing unit 120 through a user input interface 160 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB).
A monitor 191 or other type of display device is also connected to the system bus 121 via an interface, such as a video interface 190. In addition to the monitor, computers may also include other peripheral output devices such as speakers 197 and printer 196, which may be connected through an output peripheral interface 195.
The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer(s) 180. Each of the remote computer(s) 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in
When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 may include a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160 or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
As mentioned previously, recompilation and validation in response to a mapping change may be time consuming. In an object-to-relational mapping system (hereinafter “ORM”), a user may define a mapping from an object-oriented view of data that applications manipulate into a relational representation of the data that is stored in a database. The term “object” in ORM is to be read broadly to include both object-to-relational systems and entity-to-relational mapping systems, where an entity differs from an object in that it may not support methods (i.e., user-defined functions). An extended-relational model may include inheritance, which is a feature of object models and entity models. Thus the term ORM also includes extended-relational to relational mapping systems, where the extended relational model includes inheritance. Aspects of the subject matter described herein may be applied to the systems above as well as similar systems.
Herein, the terms “entity” and “association” are sometimes used instead of “object” and “relationship”. Furthermore, the terms “entity set” and “association set” are sometimes used instead of “class” and “relationship set”. In addition, “entity type” and “association type” are sometimes used instead of “object type” and “relationship type”. Even though these alternative terms are often used, they are to be understood to include, in alternative embodiments, the terms for which they are alternatives.
Entity types may be related in a hierarchy. If entity type F is a child of entity type E, then entities of type F are also of type E. Therefore entities of type F have the attributes of E. These attributes are called inherited attributes of F, and F is said to inherit attributes from E.
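For illustration, the inherited-attribute relationship can be sketched in a few lines of Python (the hierarchy and attribute names here are hypothetical, introduced only for this sketch):

```python
# Minimal sketch: an entity-type hierarchy as a parent map plus declared attributes.
# The attributes of F are its own declared attributes together with those
# inherited from its ancestors.
parent = {"F": "E", "E": None}           # F is a child of E; E is the root
declared = {"E": {"Id", "A"}, "F": {"B"}}

def att(entity_type):
    """All attributes of an entity type, declared and inherited."""
    attrs = set()
    t = entity_type
    while t is not None:
        attrs |= declared[t]
        t = parent[t]
    return attrs
```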
An ORM may compile a user-defined mapping into a representation that is more convenient for the implementation of run-time operations such as queries and updates over the entity sets. For example,
As illustrated in
For example, a mapping system may offer a declarative language for defining mappings, which are equations that relate data in an entity set view to data in a relational view. The mapping system may compile equations in this language into query views and update views, where query views enable a framework to execute queries against entity sets and update views enable the framework to execute updates against entity sets.
The compilation process may also validate the mapping to ensure it “roundtrips.” A mapping that “roundtrips” means that given a data set D expressed as an entity set, the following process returns exactly D: (1) store D in an entity set, (2) propagate the updated entity set to the database using the update view, and (3) use the query view to retrieve the entity sets stored in the database. Stated more precisely, it means that the composition of the query view and update view is the identity. This process of ensuring the mapping round-trips may be called “mapping validation.”
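The round-tripping test can be sketched as follows, using toy in-memory views (the view bodies here are hypothetical stand-ins for compiled query and update views, not the views a real compiler would emit):

```python
def update_view(entities):
    """Propagate an entity set to the store: here, entities map 1:1 to rows."""
    return [{"Id": e["Id"], "A": e["A"]} for e in entities]

def query_view(rows):
    """Reassemble entities from store rows."""
    return [{"Id": r["Id"], "A": r["A"]} for r in rows]

def roundtrips(entities):
    """Mapping validation: store D via the update view, read it back via the
    query view, and require that exactly D is returned; i.e., the composition
    of the query view and the update view is the identity."""
    return query_view(update_view(entities)) == entities

D = [{"Id": 1, "A": "x"}, {"Id": 2, "A": "y"}]
```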
A user may decide to change a small part of an object-to-relational mapping. For example, the user might decide to add an entity set or an association set. One exemplary way to handle this is to recompile the entire mapping. Recompilation is certainly correct, but it may be slow. For example, in one implementation, the compilation process may take 5-10 minutes, primarily due to the expense of the validation phase. This compilation time may be annoying to a user, who may have only made a simple change, such as adding an entity set.
An alternative to a complete recompilation is to incrementally modify the query and update views and to define another validation test, all of which leverage the assumption that before the user's change the mapping was correct.
User-defined directives about a mapping are sometimes referred to herein as compilation directives. A compilation directive indicates how objects in the client schema are mapped to objects in the store schema. In particular, a compilation directive may indicate that entities are mapped as types using table-per-type (TPT), table-per-concrete-type (TPC), table-per-hierarchy (TPH), or an entity partitioned across tables.
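As a rough illustration of the mapping strategies named above, the following sketch shows where the attributes of an entity of a derived type land under each strategy (the table and attribute names are hypothetical, chosen only for this sketch):

```python
# Sketch of three mapping strategies for an entity e of derived type F
# (attributes Id, A inherited from a base type E, plus F's own attribute B).
def store_tpt(e):
    """Table-per-type: base attributes in one table, derived attributes in
    another, related by the shared key."""
    return {"E_table": {"Id": e["Id"], "A": e["A"]},
            "F_table": {"Id": e["Id"], "B": e["B"]}}

def store_tpc(e):
    """Table-per-concrete-type: all attributes, inherited and not, in one
    table per concrete type."""
    return {"F_table": {"Id": e["Id"], "A": e["A"], "B": e["B"]}}

def store_tph(e):
    """Table-per-hierarchy: one table for the whole hierarchy, with a
    discriminator column recording the concrete type."""
    return {"Hier_table": {"Id": e["Id"], "A": e["A"], "B": e["B"],
                           "Type": "F"}}
```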
To perform the incremental validation and view computation, it may be assumed that there is a client schema C, a store schema S, and a mapping M⊂C×S that is specified by a set Σ of mapping fragments. Furthermore, it may be assumed that:
1. Mapping M roundtrips.
2. For every entity type and association, a corresponding query view has already been computed.
3. For every store relation, a corresponding update view has already been computed.
A mapping fragment may be expressed as a constraint that equates a query over a client schema to a query over a store schema. A query may be a project-select query over a single table or entity type that includes a key. A selection clause may check whether an attribute equals a particular value (e.g., A=5), whether an attribute is or is not null (e.g., A IS NULL or A IS NOT NULL), whether an entity is of a particular type (e.g., E IS OF T), or whether an entity is of exactly one type (e.g., E IS OF (ONLY T)). A selection query may include a combination of selection clauses related by AND or OR.
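A minimal sketch of such selection clauses as predicates, assuming entities carry a type tag and assuming a hypothetical subtypes map (both encodings are assumptions of this sketch):

```python
# Selection clauses as predicates over an entity represented as a dict with a
# "__type__" tag; subtypes maps a type to the set of types derived from it
# (including itself).
subtypes = {"T": {"T", "U"}, "U": {"U"}}

def eq(attr, value):             # A = 5
    return lambda e: e.get(attr) == value

def is_null(attr):               # A IS NULL
    return lambda e: e.get(attr) is None

def is_of(t):                    # E IS OF T: type T or any type derived from T
    return lambda e: e["__type__"] in subtypes[t]

def is_of_only(t):               # E IS OF (ONLY T): exactly type T
    return lambda e: e["__type__"] == t

def AND(p, q):                   # combine selection clauses
    return lambda e: p(e) and q(e)

def OR(p, q):
    return lambda e: p(e) or q(e)
```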
Let C′ represent a new client schema obtained from C by an incremental change, together with a specification that describes how the new part in C′ is to be mapped to S. Then, the mapping mechanism 310 may adapt mapping M into a new valid mapping M′ and compute query and update views for the new mapping M′.
A view (e.g., for entities, associations, and tables) may be represented in the following form:
QE|τE
where QE is a relational-algebra expression, and τE is an expression that performs entity (or row) construction. The entity construction τE may be performed, for example, in Entity SQL as a SELECT VALUE clause. (Entity SQL is an extension of SQL that permits the construction of entity objects and reasoning over entity types.) A complete view of QE|τE may take the following form:
SELECT VALUE τE FROM QE
To simplify the exposition for views QE|τE, it is assumed that the result of QE contains all the attribute names of E, consistently named.
For example, for an entity type E with attributes Id and A that has a derived type F with attributes Id, A and B, and two tables, R1(X, Y) and R2(U, V), the following is an example of a view for entity type E (where π is a relational projection operator and ⟕ is a relational left outer join):
QE: πX AS Id,Y AS A(R1) ⟕ πU AS Id,V AS B,ƒ←true(R2)
τE: if (ƒ=true) then F(Id,A,B) else E(Id,A)
In queries mentioned herein, an extended relational projection operator π may contain constants in the projection lists assigned to specific attribute names. For example, the expression πU,ƒ←true(R) is equivalent to πU,ƒ(R×F) where F is a relation with a single attribute ƒ and a single tuple with the value true. The keyword AS may also be used in the projection list to rename attributes.
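A minimal sketch of this extended projection operator over relations represented as lists of dicts (the encoding of renames and constant assignments as tuples is an assumption of this sketch):

```python
# Extended projection: each item in the projection list is either a plain
# attribute name, a rename ("X", "AS", "Id"), or a constant assignment
# ("f", "<-", True), mirroring the AS and <- notations in the text.
def project(items, relation):
    out = []
    for row in relation:
        new = {}
        for item in items:
            if isinstance(item, str):          # plain attribute
                new[item] = row[item]
            elif item[1] == "AS":              # rename item[0] to item[2]
                new[item[2]] = row[item[0]]
            elif item[1] == "<-":              # constant assignment
                new[item[0]] = item[2]
        out.append(new)
    return out
```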
The complete query QE|τE may be expressed in Microsoft's Entity SQL as follows:
Unless indicated otherwise, relational expressions are expressed herein as natural joins, natural outer joins, etc. To translate a relational expression into Entity SQL, aliases may be used for intermediate results and to specify the attributes over which to perform joins and outer joins.
A query may be constructed as a union of two previously constructed queries. To do this, the results of the queries that are to be unioned may be converted into schema-compatible tables. For example, the following two queries are not schema-compatible and do not even have the same number of attributes:
QF: πId,A,B,C,t1←true( . . . )
QG: πId,C,D,t2←true( . . . )
Given two lists of attributes α1 and α2, their union may be constructed as a new list denoted by (α1 pad α2) as follows:
For example, for the lists α1=(Id, A, B, C, t1←true) and α2=(Id, C, D, t2←true), their union is as follows:
(α1 pad α2)=(Id,A,B,C,t1,D←null,t2←false)
(α2 pad α1)=(Id,C,D,t2,A←null,B←null,t1←false)
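Inferred from the two examples above, the pad operation can be sketched as follows (the list encoding, with an assignment marker per attribute, is an assumption of this sketch):

```python
# Attribute lists are lists of (name, assignment) pairs, where assignment is
# None for a plain attribute, "true"/"false" for a boolean tag, or "null" for
# a padded column.
def pad(a1, a2):
    """(a1 pad a2): keep a1's entries, then append a2's attributes that are
    missing from a1, padded with null, except boolean tag attributes, which
    are appended with the value false."""
    names1 = {name for name, _ in a1}
    result = list(a1)
    for name, assign in a2:
        if name not in names1:
            filler = "false" if assign in ("true", "false") else "null"
            result.append((name, filler))
    return result
```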
Thus, given two queries Q1 and Q2 that have as output lists of attribute names α1 and α2 respectively, a union may be performed by the expression:
πα1∪α2(π(α1 pad α2)(Q1)) ∪ πα1∪α2(π(α2 pad α1)(Q2)) (1)
The outermost projections over attributes α1∪α2 ensure that the order of the attributes in both sides of the union is the same. These projections may be omitted if both projections over (α1 pad α2) and (α2 pad α1) have their attribute names in the same order. For example, for queries QF and QG above, a union may be taken using the following expression:
πId,A,B,C,D←null,t1,t2←false(QF) ∪ πId,A←null,B←null,C,D,t1←false,t2(QG)
This expression may be simplified to obtain:
πId,A,B,C,D←null,t1←true,t2←false( . . . ) ∪ πId,A←null,B←null,C,D,t1←false,t2←true( . . . ) (2)
To simplify the exposition, the symbol {circumflex over (∪)} may be used to denote the above union. With this terminology, the expression:
πId,A,B,C,t1←true( . . . ) {circumflex over (∪)} πId,C,D,t2←true( . . . )
is equivalent to expression (2).
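The padded union can likewise be sketched directly over query results, assuming boolean tag attributes are identified by name (a self-contained illustration of the idea, not the formal operator):

```python
# Padded union: make two result sets schema-compatible by adding each side's
# missing attributes, padded with None (null) or False (for boolean tags),
# then concatenate. Relations are lists of dicts; tag names are hypothetical.
def padded_union(rows1, rows2, tags=("t1", "t2")):
    attrs1 = set(rows1[0]) if rows1 else set()
    attrs2 = set(rows2[0]) if rows2 else set()

    def fill(rows, missing):
        filled = []
        for r in rows:
            r = dict(r)
            for a in missing:
                r[a] = False if a in tags else None
            filled.append(r)
        return filled

    return fill(rows1, attrs2 - attrs1) + fill(rows2, attrs1 - attrs2)
```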
A method for incremental view computation for mapping entity types such as table-per-type (TPT) and table-per-concrete-type (TPC) is described below. The example below relates to a case where a new entity type E is added as a leaf in a (possibly empty) hierarchy, but the teachings described herein may also be applied in other situations. For the example, it may be assumed that PKE is the set of primary key attributes of E, and that E is added to a hierarchy inside an entity set ε. Furthermore, it may be assumed, for this example, that ε is fixed.
When the TPT strategy is followed, the primary key and the non-inherited attributes of E are mapped into a table, for example, T. Then to construct entities of type E, data from T may be joined with data from other tables. On the other hand, when one follows TPC, all the attributes of E are mapped into a table, say R. Thus, in this case, to construct entities of type E, only data from R is needed.
Mapping fragments may be defined to use more general forms to map entities. Below, a generalization of TPT and TPC is used in which an arbitrary set of the attributes of E (along with the primary key) is mapped to some table. The primitive for adding entities is the following:
AddEntity(E,E′,α,P,T,ƒ),
where:
1. E is the new entity type to be added.
2. E′ is the parent of E in the hierarchy (NIL if E is the root of a new hierarchy).
3. α is a subset of the attributes of E, denoted att(E), that contains PKE.
4. P is an ancestor of E in the hierarchy such that α∪att(P)=att(E). P can be specified as NIL in which case it holds that α=att(E). P may not be an abstract entity type.
5. T is a table in the store schema that is not mentioned in any mapping fragment.
6. ƒ:α→att(T) is a 1-1 function that maps the primary key of E to the primary key of T. Unless the context indicates otherwise, functions mentioned herein are total functions. The function ƒ is such that for every attribute A∈α it holds that dom(A)⊂dom(ƒ(A)). Moreover, all attributes in att(T)\ƒ(α) are nullable.
The semantics of the addition of a new entity by using AddEntity(E, E′, α, P, T, ƒ) is given by the following mapping fragment, where the Entity SQL clause “IS OF E” is true for entities of type E:
πα(σIS OF E(ε))=πƒ(α)(T) (3)
The above mapping fragment specifies how the attributes α are mapped into table T. To reconstruct E-entities, the missing attributes may be retrieved from some other tables in the store schema. The reference to the ancestor P in AddEntity is used to deal with this problem. That reference states that all the attributes of E that are not mapped to T are to be mapped as attributes of P.
Both TPT and TPC may be obtained from the above primitive. For instance, to map a new entity E by using the TPC strategy into a table T, the following may be used:
AddEntity(E,E′,att(E),NIL,T,ƒ)
That is, all E attributes (inherited and non-inherited) are mapped to table T. The NIL reference indicates that there is no need to obtain information from any other table to reconstruct entities of type E. On the other hand, to map the new entity E by using the TPT strategy the following may be used:
AddEntity(E,E′,(att(E)\att(E′))∪PKE,E′,T,ƒ)
That is, only the non-inherited attributes of E plus its primary key are mapped to table T. The reference to entity type E′ states that the remaining attributes of entities of type E are mapped in the same way as for E′ (which is the parent of E in the hierarchy).
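The two instantiations can be sketched by computing the attribute argument α for each strategy (att and PKE are modeled here over a toy two-type hierarchy; names are hypothetical):

```python
# Computing the AddEntity attribute argument for the TPC and TPT strategies.
att = {"Person": {"Id", "Name"},
       "Employee": {"Id", "Name", "Department"}}   # Employee derives from Person
PK = {"Employee": {"Id"}}

def alpha_tpc(E):
    """TPC: all attributes of E, inherited and not, go to the new table."""
    return att[E]

def alpha_tpt(E, E_parent):
    """TPT: only E's non-inherited attributes plus its primary key."""
    return (att[E] - att[E_parent]) | PK[E]
```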
Below is presented the formal procedures needed to adapt (and validate) a mapping after the addition of a new entity type, and to create query and update views incrementally.
Adapting the Mapping Fragments.
Assume that a new entity has been added by using AddEntity(E, E′, α, P, T, ƒ), as explained above. Let Σ− be the set of mapping fragments before the addition of the new entity, and let φE denote the mapping fragment
φE: πα(σIS OF E(ε))=πƒ(α)(T).
The set Σ− does not specify a valid mapping for the new client schema since mapping fragments in Σ− do not even mention the new entity type E. Moreover, if φE is added to the set Σ−, the resulting mapping may be not valid. Thus changes are made to Σ− before adding φE in order to ensure that the new mapping roundtrips. An idea behind the process is to construct a set Σ* that is semantically equivalent to Σ− when considering the old schema, but such that the set Σ*∪{φE} specifies a valid mapping for the new schema.
The formal process to adapt the mapping fragments after using AddEntity(E, E′, α, P, T, ƒ) is shown in Algorithm 1 below.
For some terminology, an ancestor of an entity E is an entity E′ that lies on the path from E to the root of the entity set. Every entity is an ancestor of itself. A proper ancestor of an entity is an ancestor that is not the entity itself. Likewise, a descendant of an entity includes all nodes of a tree of which the entity is an ancestor, while a proper descendant of an entity is a descendant of the entity that is not the entity itself.
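This terminology can be sketched over a parent map (the entity-type names are hypothetical):

```python
# Hierarchy terminology over a parent map: E is a child of F, F a child of P.
parent = {"E": "F", "F": "P", "P": None}

def ancestors(e):
    """All nodes on the path from e to the root; every entity is an ancestor
    of itself."""
    out = []
    while e is not None:
        out.append(e)
        e = parent[e]
    return out

def proper_ancestors(e):
    """Ancestors of e other than e itself."""
    return ancestors(e)[1:]

def descendants(e):
    """All entity types of which e is an ancestor (including e itself)."""
    return [x for x in parent if e in ancestors(x)]
```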
Point 0 of the algorithm is used for validating the mapping. The algorithm checks that the addition of new entities of type E does not violate foreign key constraints in the store. An example of failure of this check is shown in
Now consider entity type E which inherits from E′ and is mapped as TPC to table T. An association may relate entities of the new type E, storing the corresponding key values in table R. Notice that since E is mapped as TPC, all the values (including key attributes) for entities of type E are stored in table T. Thus, table S does not contain the key values of entities of type E. This implies that the foreign key constraint β→γ will be violated whenever an entity of type E participates in the association. Point 0 of Algorithm 1 uses the update views described below to skip the adaptation of mappings whenever a foreign key constraint can be violated when storing data of the newly added entity type, as in the case explained above.
Point 1 adapts the mapping for the addition of the new entity. The semantics of the primitive AddEntity(E, E′, α, P, T, ƒ) states that the α attributes of entity type E are to be mapped to table T and all the remaining attributes are to be mapped as for entities of type P. Thus, if some mapping fragment includes a condition of the form IS OF (ONLY P), it is to be replaced by IS OF (ONLY P) OR IS OF E. These changes are performed in Point (1a) of Algorithm 1.
Point (1b) makes a complementary adaptation regarding entity types in between E and P. For example, assume that a mapping fragment includes a condition of the form IS OF F where F is a proper ancestor of E and a proper descendant of P in the hierarchy. Notice that a condition of the form IS OF F will be satisfied by entities of type E (since E is a descendant of F in the hierarchy). Nevertheless, all the attributes of E are mapped either to table T or to the tables to which attributes of P are mapped. Thus, the expression IS OF F is replaced by an expression that rules out entities of type E. Point (1b) does these changes. As an example of a generated expression, consider the hierarchy of
(IS OF (ONLY F1) OR IS OF F3) OR (IS OF (ONLY F2) OR IS OF F4) (4)
The set of mapping fragments Σ* generated by Algorithm 1 is semantically equivalent to Σ− when considering client states that do not have entities of type E (that is, when the client-side constraint σIS OF E (ε)=Ø is imposed). If entity type E is empty then expression (5) is equivalent to IS OF (ONLY P) and expression (6) is equivalent to IS OF F. In particular, considering the hierarchy of
Furthermore, Algorithm 1 makes changes to the previous mapping fragments. Depending on the application scenario, these changes may be undesirable. If that is the case, the adaptation procedure may be changed to a pure validation procedure that tests whether the new entity may be safely added without changing any previous mapping fragment.
Also, if in Point 0 Check b of Algorithm 1, β is not part of the primary key of T, then the algorithm may safely abort without checking the query containment. One reason for this is that mapping fragments only map keys in the client to keys in the store. Thus, since β′ is the primary key of table T′, either β′ is mapped to a key of some entity set or is not mapped at all. In the latter case, T′ is not mentioned in any mapping fragment, and the query containment fails since QT′ returns an empty result. In the former case, the containment implies that values of some non-key attributes of E entities will always be keys of entities of some entity type, which fails in general.
Incrementally Computing Views.
Given an entity type F in the client schema, QF−|τF− denotes the query view for type F before the addition of the new entity type. Similarly, given a table R in the store schema, QR−|τR− denotes the update view for table R before the addition of the new entity type. Assume that a new entity E is added to the client schema by using:
AddEntity(E,E′,α,P,T,ƒ),
and that mapping fragments have been adapted as explained above. Algorithm 2 shows a procedure to compute the new query views for the new entity type and to incrementally recompute query views for the previous entity types as follows:
In Algorithm 2 the expression ƒ(α) AS α denotes a renaming of attributes. For example, if α is the list Id, A, and ƒ(Id)=X and ƒ(A)=Y, then ƒ(α) AS α denotes the list X AS Id, Y AS A.
One assumption mentioned previously is that the result of query view QP− contains all P attributes consistently named. Since att(E)=att(P)∪α, the query QE in Step 1 of Algorithm 2 contains all the attributes in att(E). Also in Algorithm 2, the new attribute tE attached to query views in Steps 2 and 3 is used to test provenance of tuples.
Furthermore, there is not an assumption that att(P)∩α=PKE. That is, att(P) and α may share more attributes than just the primary key of E. This may be used to rewrite some queries in order to obtain queries that are more efficient to execute. In particular, every usage of α in Algorithm 2 may be replaced by the set of attributes α′=(α\att(P))∪PKE. Then for example, the query view for E created in Step 1 of Algorithm 2 for the case of P≠NIL may be rewritten as:
QE: QP− ⋈ πƒ(α′) AS α′(T).
Algorithm 3 (shown below) recomputes update views. Algorithm 3 uses a strategy similar to the strategy used for adapting the mapping fragments.
HR(Id,Name),Emp(Id,Dept),Client(Cid,Eid,Name,Score,Addr),
and the following foreign key constraints: Emp.Id→HR.Id, Client.Eid→Emp.Id as illustrated in
Step 1: adding an entity type: Person. As a first step, an entity type Person (Id, Name) is added as the root of a hierarchy inside an entity set Persons, and this entity type is mapped to table HR as shown in
AddEntity(Person,NIL,(Id,Name),NIL,HR,ƒPerson),
where ƒPerson is such that ƒPerson(Id)=Id and ƒPerson(Name)=Name. Algorithm 1 is then followed. No check of validity is necessary in this case since there is no foreign key from HR to any other table and there is no ancestor of Person in the hierarchy. Also, since Σ0=Ø, there is no previous mapping to adapt. Thus, the algorithm adds the mapping fragment φ1 given by:
πId,Name(σIS OF Person(Persons))=πId,Name(HR),
and considers the new mapping M1 specified by the set Σ1={φ1}.
Next, query and update views are computed for M1. Following Algorithm 2 (Step 1) the following query view is obtained:
QPerson1: πId,Name(HR)
τPerson1: Person(Id,Name).
Following Algorithm 3 (Step 1), the following update view is obtained:
QHR1: πId,Name(σIS OF Person(Persons))
τHR1: HR(Id,Name).
No renaming or padding is needed given that the name of attributes in Person and HR match.
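The effect of these two views can be sketched on a few toy HR rows (the data values are hypothetical; entities are modeled as tagged dicts):

```python
# Applying the step-1 views to toy data.
HR = [{"Id": 1, "Name": "Ann"}, {"Id": 2, "Name": "Bo"}]

def q_person_1(hr):
    """Query view: construct Person(Id, Name) entities from HR rows."""
    return [{"__type__": "Person", "Id": r["Id"], "Name": r["Name"]}
            for r in hr]

def q_hr_1(persons):
    """Update view: project Persons back to HR(Id, Name) rows."""
    return [{"Id": p["Id"], "Name": p["Name"]} for p in persons]
```

Note that applying one view after the other returns the original rows, which is the round-tripping property in miniature.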
Step 2: adding a derived entity type Employee as TPT. To do this, an entity type Employee (Id, Name, Department) is added that derives from Person. This new entity type is mapped as TPT to table Emp, as shown in
AddEntity(Employee,Person,(Id,Department),Person,Emp,ƒEmployee),
where ƒEmployee is such that ƒEmployee(Id)=Id and ƒEmployee(Department)=Dept.
To adapt the mapping, Algorithm 1 is followed. At Step 0, a check is performed to determine that the foreign key constraint Emp.Id→HR.Id is not violated. For this check, an update view for Emp is constructed. Then the foreign key constraint is checked later. Since φ1 does not mention any condition of the form IS OF (ONLY Person), there is no mapping fragment to adapt. Thus, the algorithm considers the mapping M2 specified by the set Σ2={φ1, φ2} where φ2 is the mapping fragment:
πId,Department(σIS OF Employee(Persons))=πId,Dept(Emp)
Next, the query and update views are computed for the new entity type, and the previous queries are incrementally recomputed. Algorithm 2 may be followed first to construct query views. The entity type Person is playing the role of P in the algorithm. Thus, to construct the query view for entity Employee, the previously computed query view QPerson1 (see Step 1 of the algorithm) is used as follows:
QEmployee2: QPerson1 ⋈ πId,Dept AS Department(Emp) =
πId,Name(HR) ⋈ πId,Dept AS Department(Emp)
τEmployee2: Employee(Id,Name,Department)
Now, since Person is playing the role of P, to reconstruct the query view for Person, Step 3 of Algorithm 2 is followed. Thus, the new query view for Person is obtained by considering QPerson1 as follows:
QPerson2: QPerson1 ⟕ πId,Dept AS Department,tEmployee←true(Emp) =
πId,Name(HR) ⟕ πId,Dept AS Department,tEmployee←true(Emp)
τPerson2: if (tEmployee) then τEmployee2 else τPerson1 =
if (tEmployee) then Employee(Id, Name, Department) else Person(Id, Name)
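The recomputed Person view can be sketched as a left outer join of HR with Emp that sets the tEmployee tag, followed by the conditional construction (toy data; this is an illustration, not the generated Entity SQL):

```python
# Sketch of the recomputed Person query view: left outer join HR with Emp,
# tag each row with tEmployee, then construct Employee or Person accordingly.
HR = [{"Id": 1, "Name": "Ann"}, {"Id": 2, "Name": "Bo"}]
Emp = [{"Id": 1, "Dept": "Sales"}]

def q_person_2(hr, emp):
    dept = {r["Id"]: r["Dept"] for r in emp}
    out = []
    for r in hr:
        if r["Id"] in dept:   # tEmployee = true: a matching Emp row exists
            out.append({"__type__": "Employee", "Id": r["Id"],
                        "Name": r["Name"], "Department": dept[r["Id"]]})
        else:                 # tEmployee = false: a plain Person
            out.append({"__type__": "Person", "Id": r["Id"],
                        "Name": r["Name"]})
    return out
```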
To construct update views Algorithm 3 is followed. In Step 1, the algorithm constructs the update view for table Emp as follows:
QEmp2: πId,Department AS Dept(σIS OF Employee(Persons))
τEmp2: Emp(Id,Dept).
The only previously computed update view is the view for table HR. Since this view does not mention an expression of the form IS OF (ONLY Person), it is not changed in Step 2 of the algorithm and the following results:
QHR2: QHR1 = πId,Name(σIS OF Person(Persons))
τHR2: τHR1 = HR(Id,Name).
Now that the update views have been recomputed, Step 0 (part b) of Algorithm 1 may be performed to test whether the foreign key constraint Emp.Id→HR.Id is violated. This step tests the containment:
πId(QEmp2)⊂πId(QHR2).
By unfolding the update views in the previous expression, the check becomes
πId(σIS OF Employee(Persons))⊂πId(σIS OF Person(Persons)),
which holds since Employee inherits from Person. Thus, the mapping is valid and the algorithm has incrementally recomputed all the query and update views.
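The containment test above lends itself to a direct check on extensional data. The following sketch is illustrative only (the tuple layout and the subtype table are assumptions, not part of any algorithm described herein); it evaluates the two unfolded projections over a sample Persons entity set and tests the subset relation:

```python
# Toy extent of the client entity set Persons: each entity carries its
# most-derived type. Employee inherits from Person.
persons = [
    {"Id": 1, "Name": "Ann", "type": "Person"},
    {"Id": 2, "Name": "Bob", "Department": "Sales", "type": "Employee"},
]

# For each type, the set of types satisfying IS OF (i.e., itself and subtypes).
SUBTYPES = {"Person": {"Person", "Employee"}, "Employee": {"Employee"}}

def is_of(entity, entity_type):
    """IS OF T: the entity's most-derived type is T or derives from T."""
    return entity["type"] in SUBTYPES[entity_type]

# Unfolded update views, projected on Id:
ids_emp = {e["Id"] for e in persons if is_of(e, "Employee")}   # pi_Id(Q_Emp)
ids_hr  = {e["Id"] for e in persons if is_of(e, "Person")}     # pi_Id(Q_HR)

# The foreign key Emp.Id -> HR.Id is respected iff the containment holds.
print(ids_emp <= ids_hr)  # True: every Employee is a Person
```

Because every entity satisfying IS OF Employee also satisfies IS OF Person, the subset test succeeds on any extent, mirroring the symbolic argument.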
Step 3: adding a derived entity type Customer as TPC. Now, an entity type Customer(Id, Name, CreditScore, BillingAddr) is added that derives from Person. This new entity type is then added as TPC to table Client as shown in
AddEntity(Customer,Person,(Id,Name,CreditScore,BillingAddr), NIL,Client,ƒCustomer),
where ƒCustomer is such that ƒCustomer(Id)=Cid, ƒCustomer(Name)=Name, ƒCustomer(CreditScore)=Score, and ƒCustomer(BillingAddr)=Addr.
Algorithm 1 is followed to first adapt the mapping fragments. In Step 0, there is no need to check validity with respect to the foreign key constraint Client.Eid→Emp.Id since none of the attributes of the new entity type is mapped to attribute Eid in table Client. Furthermore, NIL is playing the role of P in Algorithm 1. Thus, Person is a proper ancestor of Customer and a proper descendant of P (NIL in this case). Then, since mapping fragment φ1 mentions the expression IS OF Person, that expression is replaced by:
IS OF (ONLY Person) ∨ IS OF Employee
(see formula (6) in Step 1b of Algorithm 1). Thus, the output of the algorithm is the set of mapping fragments Σ3={φ1′, φ2, φ3}, where
φ1′: πId,Name(σIS OF (ONLY Person) ∨ IS OF Employee(Persons))=πId,Name(HR)
φ2: πId,Department(σIS OF Employee(Persons))=πId,Dept(Emp)
φ3: πId,Name,CreditScore,BillingAddr(σIS OF Customer(Persons))=πCid,Name,Score,Addr(Client)
Next, the query and update views for the new entity are computed and the previous queries recomputed. Following Step 1 of Algorithm 2, since P=NIL, for entity Customer the query view is constructed by considering only the table Client:
QCustomer3: πCid AS Id,Name,Score AS CreditScore,Addr AS BillingAddr(Client)
τCustomer3: Customer(Id,Name,CreditScore,BillingAddr)
Person is a proper ancestor of Customer and a proper descendant of P=NIL. Thus, in Step 2 of Algorithm 2, to recompute the query view for Person, QPerson2 and (QCustomer3)* may be used as follows:
QPerson3: QPerson2 ∪̂ πCid AS Id,Name,Score AS CreditScore,Addr AS BillingAddr,t←Customer(Client)
= (πId,Name(HR) ⟕ πId,Dept AS Department,t←Employee(Emp)) ∪̂ πCid AS Id,Name,Score AS CreditScore,Addr AS BillingAddr,t←Customer(Client)
τPerson3: if (t = Customer) then τCustomer3 else τPerson2
= if (t = Customer) then Customer(Id,Name,CreditScore,BillingAddr)
else {if (t = Employee) then Employee(Id,Name,Department) else Person(Id,Name)}
For the case of Employee the query view does not change, giving:
QEmployee3: QEmployee2 = πId,Name(HR) ⋈ πId,Dept AS Department(Emp)
τEmployee3: τEmployee2 = Employee(Id,Name,Department)
To compute update views, Algorithm 3 is followed. In Step 1, the update view for table Client is generated as follows:
QClient3: πId AS Cid,Eid←null,Name,CreditScore AS Score,BillingAddr AS Addr(σIS OF Customer(Persons))
τClient3: Client(Cid,Eid,Name,Score,Addr).
Eid←null appears in the projection list because of the padding operation that is needed to construct the update view (see Step 1 in Algorithm 3). As for the adaptation of the mapping fragments, since the update view QHR2 mentions the expression IS OF Person, in Step 2 of the algorithm that expression is replaced by:
IS OF (ONLY Person) ∨ IS OF Employee.
Thus, the new update view for HR is the following:
QHR3: πId,Name(σIS OF (ONLY Person) ∨ IS OF Employee(Persons))
τHR3: HR(Id,Name).
For the case of Emp the update view does not change and is as follows:
QEmp3: QEmp2 = πId,Department AS Dept(σIS OF Employee(Persons))
τEmp3: τEmp2 = Emp(Id,Dept).
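The adaptation performed in Step 2 above is purely syntactic: each previously computed view that mentions the condition IS OF Person has that condition replaced by IS OF (ONLY Person) ∨ IS OF Employee, and all other views pass through unchanged. A minimal sketch of that rewriting (the string encoding of views is an illustrative assumption, not a representation prescribed by the algorithms):

```python
# Views kept as relational-algebra strings; only the selection condition
# "IS OF Person" needs rewriting when a subtype is added below Person.
update_views = {
    "HR":  "pi[Id,Name](sigma[IS OF Person](Persons))",
    "Emp": None,  # created fresh by Step 1 of Algorithm 3, never adapted
}

def adapt(view, old_cond, new_cond):
    """Step 2 rewriting: views not mentioning old_cond stay unchanged."""
    if view is None or old_cond not in view:
        return view
    return view.replace(old_cond, new_cond)

new_hr = adapt(update_views["HR"],
               "IS OF Person",
               "IS OF (ONLY Person) OR IS OF Employee")
print(new_hr)
# pi[Id,Name](sigma[IS OF (ONLY Person) OR IS OF Employee](Persons))
```

Only the HR view is touched; a production implementation would rewrite expression trees rather than strings, but the incremental character (views not mentioning the condition are left alone) is the same.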
This completes the example for now. Below, a solution is described to add associations and complete the example by adding an association between entity types Customer and Employee.
Two cases relate to adding associations:
1. A new association set is mapped to a join table T where the keys of both endpoints of the association are mapped to the primary key of T. Table T is not previously mentioned in any mapping fragment (and thus, it is not mentioned in any update view).
2. A new association set is mapped to table T, the key of one of the endpoints of the association is mapped to the primary key of T, and the key of the other endpoint is mapped to a different set of attributes in T. Table T is already mentioned in a mapping fragment and thus has an associated update view.
Associations Mapped to Join Tables.
The following primitive may be used to add associations:
AddAssocJT(𝒜,E1,E2,mult,T,ƒ)
where:
1. 𝒜 is the name of the new association set.
2. E1 and E2 are the endpoints of the association.
3. mult is an expression that denotes the multiplicity of the association.
4. T is a join table that is not previously mentioned in any mapping fragment.
5. ƒ is a 1-1 function such that, if α is the set of primary key attributes of E1 and β is the set of primary key attributes of E2, then ƒ(E1·α) and ƒ(E2·β) are sets of attributes in T whose union is the primary key of T.
The semantics of the addition of the new association using the above primitive is given by the following mapping fragment:
πE1.α,E2.β(𝒜)=πƒ(E1·α),ƒ(E2·β)(T)
Adapting the Mapping Fragments. Let Σ− be the set of mapping fragments before the addition of the new association, and let φ𝒜 denote the mapping fragment
πE1.α,E2.β(𝒜)=πƒ(E1·α),ƒ(E2·β)(T)
In this case, there is no need to adapt the previous set of mapping fragments. Rather, it is sufficient to check the validity of the mapping specified by Σ=Σ−∪{φ𝒜}. To do this, Algorithm 4 checks that if T has associated foreign key constraints, these constraints are not violated.
Reconstructing Views. In this case, none of the previous query views changes; however, a query view for the new association set is added. Similarly, for the case of update views, the algorithm creates the update view for table T; all other update views remain unchanged. Algorithm 5 below shows how to create these views.
Associations Mapped to a Previously Used Table.
For this case, the following primitive may be used to add associations:
AddAssocFK(𝒜,E1,E2,mult,T,ƒ)
where
1. 𝒜 is the name of the new association set.
2. E1 and E2 are the endpoints of the association.
3. mult is an expression that denotes the multiplicity of the association. The multiplicity is such that the endpoint corresponding to E2 is not *.
4. T is a table previously mentioned in mapping fragments, and has update view QT−|τT−.
5. ƒ is a 1-1 function that satisfies the following. Assume that α is the set of primary key attributes of E1 and β is the set of primary key attributes of E2. Then ƒ(E1·α) and ƒ(E2·β) are sets of attributes in T, and ƒ(E1·α) is the primary key of T. Moreover, if there is a foreign key from T that mentions the attributes ƒ(E2·β), then it covers all the attributes ƒ(E2·β); that is, it is of the form ƒ(E2·β)→β′.
The semantics of the addition of the new association using the above primitive is given by the following mapping fragment:
πE1.α,E2.β(𝒜)=πƒ(E1·α),ƒ(E2·β)(σƒ(E2·β) IS NOT null(T))
Adapting Mapping Fragments. Let Σ− be the set of mapping fragments before the addition of the new association, and let φ𝒜 denote the mapping fragment
πE1.α,E2.β(𝒜)=πƒ(E1·α),ƒ(E2·β)(σƒ(E2·β) IS NOT null(T))
In this case, there is no need to adapt the previous set of mapping fragments. A check of the validity of the mapping specified by Σ=Σ−∪{φ𝒜} is sufficient. Algorithm 6 performs this validation.
To avoid clashes when the association is mapped to a previously used table, the algorithm ensures that the attribute in T used to store the key of one of the ends of the association has not been previously used in any mapping fragment (Check 1 in Algorithm 6). Moreover, as for the previous case of adding an association, to check the validity of the mapping specified by Σ=Σ−∪{φ𝒜}, a check is performed that if T has a foreign key, it is not violated (Check 3). Furthermore, a check is performed that the endpoint of the association corresponding to entity E1 can be entirely stored in the primary key of T (Check 2).
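Check 1 only requires inspecting the existing mapping fragments. The sketch below illustrates the attribute-freshness test over a toy catalog of fragments (the catalog layout and the recorded attribute sets are assumptions made for illustration):

```python
# Each mapping fragment records the store table it maps to and which of
# that table's attributes it mentions.
fragments = {
    "phi1": {"table": "HR",     "attrs": {"Id", "Name"}},
    "phi2": {"table": "Emp",    "attrs": {"Id", "Dept"}},
    "phi3": {"table": "Client", "attrs": {"Cid", "Name", "Score", "Addr"}},
}

def attr_is_fresh(table, attr):
    """Check 1 of Algorithm 6: the attribute of T chosen to store the key
    of the non-primary endpoint must not occur in any existing fragment."""
    return not any(f["table"] == table and attr in f["attrs"]
                   for f in fragments.values())

print(attr_is_fresh("Client", "Eid"))   # True: Eid unused, so no clash
print(attr_is_fresh("Client", "Name"))  # False: Name already mapped
```

Checks 2 and 3 are containment tests between view expressions and are decided separately (symbolically, or by evaluation on store states).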
Reconstructing Views. In this case, none of the previous query views changes, and the algorithm adds the query view for the new association set 𝒜. Similarly, for the case of update views, the algorithm incrementally recomputes the update view for table T; all other update views remain unchanged. Algorithm 7 shows how to create these views.
An Example of Adding an Association.
Below, the example mentioned previously is continued to add an association between entity types Customer and Employee. In particular, an association Supports is mapped to the foreign key constraint Client.Eid→Emp.Id, as shown in
AddAssocFK(Supports,Customer,Employee,[*−0 . . . 1],Client,ƒsupports)
where ƒSupports is such that ƒSupports(Customer.Id)=Cid and ƒSupports(Employee.Id)=Eid.
In particular, the mapping M4 specified by the set Σ4={φ1′, φ2, φ3, φ4} is considered, where φ4 is the mapping fragment:
πCustomer.Id,Employee.Id(Supports)=πCid,Eid(σEid IS NOT null(Client)).
Algorithm 6 is first used to check whether this new mapping is valid. In Step 1, a check is performed to determine that attribute Eid of table Client is not previously mentioned in any mapping fragment, which is the case (since it is not mentioned in φ1′, φ2, or φ3). In Step 2, a check is performed to determine that the identifiers of entities of type Customer may be stored in table Client by checking the containment
πId(σIS OF Customer(Persons))⊂πCid AS Id(QClient3).
By unfolding the definition of QClient3 (and simplifying the expression), the following is obtained:
πId(σIS OF Customer(Persons))⊂πCid AS Id(σIS OF Customer(Persons)),
which holds. Finally, in Step 3 the foreign key constraint Client.Eid→Emp.Id is checked. The key of entity type Employee is mapped to Eid in table Client, and the foreign key is from table Client to table Emp. In this case, the following containment is tested:
πId(σIS OF Employee(Persons))⊂πId(QEmp3).
By unfolding the view (and simplifying the expression), the following containment check is obtained:
πId(σIS OF Employee(Persons))⊂πId(σIS OF Employee(Persons)),
which holds. Because the above tests succeed, it may be determined that the new mapping is valid.
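Check 2 can also be evaluated on a concrete store state rather than decided by symbolic simplification: materialize the update view QClient3 and verify that every Customer identifier appears in the Cid column. The sketch below is a hand-evaluation over assumed sample data, not an implementation of Algorithm 6:

```python
# Sample client state; only the most-derived type is recorded per entity.
persons = [
    {"Id": 1, "Name": "Ann", "type": "Person"},
    {"Id": 3, "Name": "Cyd", "CreditScore": 700, "BillingAddr": "12 Elm",
     "type": "Customer"},
]

def q_client3(ps):
    """Update view Q_Client^3: Customers projected into Client's columns,
    with Eid padded to null (None here)."""
    return [{"Cid": p["Id"], "Eid": None, "Name": p["Name"],
             "Score": p["CreditScore"], "Addr": p["BillingAddr"]}
            for p in ps if p["type"] == "Customer"]

customer_ids = {p["Id"] for p in persons if p["type"] == "Customer"}
cids = {row["Cid"] for row in q_client3(persons)}
print(customer_ids <= cids)  # True: Customer ids fit in table Client
```

The containment holds on every state here by construction, which is exactly what the symbolic simplification in the text establishes once and for all.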
The query and update views may be computed using Algorithm 7. Step 1 of the algorithm computes the query view for Supports as follows:
QSupports4: πCid AS Customer.Id,Eid AS Employee.Id(σEid IS NOT null(Client))
τSupports4: Supports(Customer.Id,Employee.Id)
For the case of update views, in Step 2 the algorithm recomputes the update view for table Client as follows:
QClient4: πCid,Name,Score,Addr(QClient3) ⟕ πCustomer.Id AS Cid,Employee.Id AS Eid(Supports)
= πId AS Cid,Name,CreditScore AS Score,BillingAddr AS Addr(σIS OF Customer(Persons)) ⟕ πCustomer.Id AS Cid,Employee.Id AS Eid(Supports)
τClient4: Client(Cid,Eid,Name,Score,Addr).
The query and update views for all other components remain unchanged.
With the above, a valid mapping has been incrementally constructed that is given by the set of mapping fragments:
πId,Name(σIS OF (ONLY Person) ∨ IS OF Employee(Persons))=πId,Name(HR)
πId,Department(σIS OF Employee(Persons))=πId,Dept(Emp)
πId,Name,CreditScore,BillingAddr(σIS OF Customer(Persons))=πCid,Name,Score,Addr(Client)
πCustomer.Id,Employee.Id(Supports)=πCid,Eid(σEid IS NOT null(Client)).
As an example, the query view for entity type Person that is computed is:
QPerson: (πId,Name(HR) ⟕ πId,Dept AS Department,t←Employee(Emp)) ∪̂ πCid AS Id,Name,Score AS CreditScore,Addr AS BillingAddr,t←Customer(Client)
τPerson: if (t = Customer) then Customer(Id,Name,CreditScore,BillingAddr)
else {if (t = Employee) then Employee(Id,Name,Department) else Person(Id,Name)}
As an example, the update view for table Client that is computed is:
QClient: πId AS Cid,Name,CreditScore AS Score,BillingAddr AS Addr(σIS OF Customer(Persons)) ⟕ πCustomer.Id AS Cid,Employee.Id AS Eid(Supports)
τClient: Client(Cid,Eid,Name,Score,Addr).
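The roundtripping property of the finished mapping can be spot-checked on data: pushing a client state through the update views and reading it back through the query views must reproduce the state. A compact sketch (the record layout is an assumption; None stands in for the padded null Eid):

```python
# Client state: every entity tagged with its most-derived type.
persons = [
    {"Id": 1, "Name": "Ann", "type": "Person"},
    {"Id": 2, "Name": "Bob", "Department": "Sales", "type": "Employee"},
    {"Id": 3, "Name": "Cyd", "CreditScore": 700, "BillingAddr": "12 Elm",
     "type": "Customer"},
]
supports = [{"Customer.Id": 3, "Employee.Id": 2}]  # association instances

# --- update views: client -> store (HR, Emp, Client) -------------------
hr  = [{"Id": p["Id"], "Name": p["Name"]}
       for p in persons if p["type"] in ("Person", "Employee")]
emp = [{"Id": p["Id"], "Dept": p["Department"]}
       for p in persons if p["type"] == "Employee"]
eid = {s["Customer.Id"]: s["Employee.Id"] for s in supports}
client = [{"Cid": p["Id"], "Eid": eid.get(p["Id"]), "Name": p["Name"],
           "Score": p["CreditScore"], "Addr": p["BillingAddr"]}
          for p in persons if p["type"] == "Customer"]

# --- query views: store -> client --------------------------------------
dept = {e["Id"]: e["Dept"] for e in emp}
back = ([{"Id": h["Id"], "Name": h["Name"], "type": "Person"}
         if h["Id"] not in dept else
         {"Id": h["Id"], "Name": h["Name"], "Department": dept[h["Id"]],
          "type": "Employee"} for h in hr] +
        [{"Id": c["Cid"], "Name": c["Name"], "CreditScore": c["Score"],
          "BillingAddr": c["Addr"], "type": "Customer"} for c in client])
back_supports = [{"Customer.Id": c["Cid"], "Employee.Id": c["Eid"]}
                 for c in client if c["Eid"] is not None]

print(sorted(back, key=lambda p: p["Id"]) == persons)  # True
print(back_supports == supports)                       # True
```

The left-outer-join of Client with Supports in QClient is modeled by the `eid.get(...)` lookup; Customers without a supporting Employee would simply keep a null Eid.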
For a first example, the case is considered in which a new entity type is added to a hierarchy that is completely mapped as TPH. For simplicity of exposition, the entire hierarchy is stored in a single table.
The primitive for adding entities according to the TPH strategy is the following:
AddEntityTPH(E,E′,T,Disc,dE,ƒ),
where:
1. E is the new entity type to be added.
2. E′ is the parent of E in the hierarchy (NIL if E is the root of a new hierarchy).
3. T is the table into which the hierarchy is mapped.
4. Disc is the discriminator column of T.
5. dE is the discriminator value used for entities of type E.
6. ƒ is a 1-1 function from the attributes of E to the attributes of T.
The semantics of the addition of a new entity by using AddEntityTPH (E, E′, T, Disc, dE, ƒ) is given by the following mapping fragment:
πatt(E)(σIS OF (ONLY E)(ε))=πƒ(att(E))(σDisc=dE(T))  (7)
From the above, it can be seen that all the attributes of E are mapped to the same table T.
Adapting the Mapping Fragments.
Assume that a new entity has been added by using AddEntityTPH(E, E′, T, Disc, dE, ƒ), as explained above. Let Σ− be the set of mapping fragments before the addition of the new entity, and let φE denote the mapping fragment (7). Algorithm 8 shows how to validate and adapt the mapping.
Since the whole hierarchy is mapped as TPH into the same table T, every ancestor F of E′ is mapped to T by using a condition of the form IS OF (ONLY F). Furthermore, if E′ was a leaf prior to the addition of E, it may be the case that a condition of the form IS OF E′ had been used to map E′. In Step 1, this expression is replaced by IS OF (ONLY E′) in order to safely add the information of entity type E to table T.
Since dE ∈ dom(Disc), the query in Step 0, Check b, is unsatisfiable if and only if Disc has not been previously mapped to any entity attribute A such that dE ∈ dom(A) and the condition Disc=dE has not been previously used in a mapping fragment. This test can be done by inspecting the mapping fragments.
The addition may be generalized by relaxing the assumption that the whole hierarchy is stored in the same table T as TPH. To do this, some changes may be made to the above validation and adaptation strategy. In particular, first, a check may be performed that the primary key constraint in T is not violated by the addition of the new entities. This may be done by checking the containment πPK
Incrementally Computing Views.
Algorithm 9 may be used to compute the new query views for the new entity type and incrementally recompute query views for the previous entity types.
Algorithm 10 may be used to incrementally recompute update views.
For an example below, a case is considered in which there is an entity type E1(Id, A) mapped to a table T1(Id, A), and an entity type E2(Id, A, B) that inherits from E1 and is mapped as TPT to a table T2(Id, B). In this case, the query view for entity type E1 is
QE1 = πId,A(T1) ⟕ πId,B(T2).
If a new entity type E3(Id, A, C) is added that inherits from E1 and is mapped as TPT to a table T3(Id, C), then following the process previously described, the query view for E3 may be obtained by considering the previously computed query for E1 and table T3. The process constructs the following query view:
QE3 = (πId,A(T1) ⟕ πId,B(T2)) ⋈ πId,C(T3).
In this case, the above view may be rewritten into an equivalent query view for entity type E3 as follows:
QE3′ = πId,A(T1) ⋈ πId,C(T3).
In this rewriting, the following implicit constraint is exploited:
πId(T2)∩πId(T3)=Ø,
which is satisfied by all the store states in the range of the mapping. The constraint is implied because the sets of identifiers of entities of types E2 and E3 are disjoint.
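The rewriting can be checked on any store state satisfying the disjointness constraint: routing through T2 and joining T1 directly with T3 yield the same result. A small sketch over assumed sample tables (dictionaries keyed by Id stand in for the relations):

```python
# A store state satisfying the constraint pi_Id(T2) ∩ pi_Id(T3) = Ø.
t1 = {1: "a1", 2: "a2", 3: "a3"}   # T1(Id, A)
t2 = {2: "b2"}                     # T2(Id, B): ids of E2 entities
t3 = {3: "c3"}                     # T3(Id, C): ids of E3 entities
assert not (t2.keys() & t3.keys())  # the disjointness constraint

# Original view: (T1 left-outer-join T2) joined with T3, projected on Id, A, C.
loj = [(i, t1[i], t2.get(i)) for i in t1]            # T1 left-outer-join T2
original = {(i, a, t3[i]) for (i, a, _b) in loj if i in t3}

# Rewritten view: T1 joined directly with T3.
rewritten = {(i, t1[i], t3[i]) for i in t1 if i in t3}

print(original == rewritten)  # True whenever the constraint holds
```

Intuitively, joining with T3 restricts the result to E3 identifiers; by disjointness those identifiers never match T2, so the outer-joined B column is always null and the projection discards it.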
Below, a case is considered where a new entity type is added using a primitive similar to AddEntity, but in which some client-side conditions are specified to split entities across several tables. The primitive for adding entities that is considered is the following:
AddEntity(E,E′,P,Γ),
where:
1. E is the new entity type to be added.
2. E′ is the parent of E in the hierarchy (NIL if E is the root of a new hierarchy).
3. P is a proper ancestor of E in the hierarchy.
4. Γ is a set of tuples {(α1, ψ1, T1, ƒ1), . . . , (αn, ψn, Tn, ƒn)}, where for every i ∈ {1, . . . , n} it holds that αi is a set of attributes of E that includes the primary key attributes of E, ψi is a condition over the attributes of E, Ti is a table in the store schema, and ƒi is a 1-1 function from αi to the attributes of Ti.
The semantics of this addition is given by the following set of mapping fragments:
πα1(σ(IS OF E) ∧ ψ1(ε))=πƒ1(α1)(T1)
. . .
παn(σ(IS OF E) ∧ ψn(ε))=πƒn(αn)(Tn)  (8)
As for the case of the addition of entity types introduced under the TPT/TPC heading above, the reference to the ancestor P specifies that all the attributes of E that are not mapped by the above mapping fragments are to be mapped as the attributes in P. If a set Γ with a single tuple of the form (α, true, T, ƒ) is considered, the primitive discussed under the TPT/TPC heading is obtained. Nevertheless, in this case the verification process is different. In particular, in the description under the TPT/TPC heading, it was enough to have att(P)∪α=att(E) in order to ensure that entities of type E can be losslessly stored in the store schema. Below, a different condition is checked to test whether the pairs (ψi, αi) cover all the possible entity types of E.
Validation and Adaptation of Mapping Fragments.
To describe what coverage means in this case, some notions are introduced below. Given a conjunction of conditions ψ over some set of attributes and constant values, fix-att(ψ) denotes the set of all attributes A such that there exists a constant value c for which the formula A=c is a logical consequence of ψ. For example, let ψ be the formula
A1=1 ∧ A1=A2 ∧ 2≤A3 ∧ A3≤A4 ∧ A4≤2 ∧ A5≠3,
where the usual interpretation of inequality symbols and natural numbers is assumed.
Then, fix-att(ψ)={A1, A2, A3, A4} since A1=1, A2=1, A3=2, and A4=2 are all logical consequences of ψ. If ψ is just a conjunction of equalities between attributes and constant values, then fix-att(ψ) may be obtained efficiently (assuming that ψ is satisfiable). Furthermore, IS null or IS NOT null conditions may be treated as =null or ≠null, considering null as a certain constant value. Given a pair (α, ψ), the pair covers attribute A if A ∈ α ∪ fix-att(ψ).
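For the efficiently decidable case mentioned above, in which ψ is a conjunction of equalities between attributes and constant values, fix-att can be computed by constant propagation. The sketch below is limited to that case (it does not attempt the interval reasoning that derives A3=2 from 2≤A3≤A4≤2; the input encoding is an illustrative assumption):

```python
def fix_att(equalities):
    """equalities: list of (lhs, rhs) pairs, where lhs is an attribute name
    and rhs is either another attribute name (str) or a constant.
    Returns {attribute: forced constant}; its keys are fix-att(psi)."""
    fixed, attr_eqs = {}, []
    for lhs, rhs in equalities:
        if isinstance(rhs, str):
            attr_eqs.append((lhs, rhs))        # attribute = attribute
        else:
            fixed[lhs] = rhs                   # attribute = constant
    changed = True
    while changed:                             # propagate through A = B links
        changed = False
        for a, b in attr_eqs:
            for x, y in ((a, b), (b, a)):
                if x in fixed and y not in fixed:
                    fixed[y] = fixed[x]
                    changed = True
    return fixed

# A1 = 1 together with A1 = A2 forces A2 = 1 as well:
print(fix_att([("A1", 1), ("A1", "A2")]))  # {'A1': 1, 'A2': 1}
```

The fixpoint loop handles chains of attribute equalities (A1=A2, A2=A3, ...) in any order, at quadratic cost in the worst case, which is adequate for the short conditions that arise in mapping fragments.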
Below is described how coverage is to be checked for entities of type E in this case. Given an attribute A ∈ att(E)\att(P) that is not part of the primary key of E, a check is performed to determine that the mapping fragments (8) completely cover A by doing the following. Let XA ⊆ {1, . . . , n} be the set of indexes i such that A ∈ αi ∪ fix-att(ψi); that is, i ∈ XA if and only if (αi, ψi) covers A. Then a check is performed that the formula
⋁i∈XA ψi
is a tautology. If that is the case, then the attribute A is covered by the mapping fragments. For example, assume that
Γ={(α1,ψ1,ƒ1,T1), (α2,ψ2,ƒ2,T2), (α3,ψ3,ƒ3,T3)}, where α1={A2}, α2={A1}, α3={A1, A2}, and ψ1, ψ2, and ψ3 are the formulas:
ψ1: A1=3 ∧ A2≠4
ψ2: A1≠3 ∧ A2≠4
ψ3: A2=4
Then, the attribute A1 is covered since XA1={1, 2, 3} and the formula
(A1=3 ∧ A2≠4) ∨ (A1≠3 ∧ A2≠4) ∨ (A2=4)
is a tautology. On the other hand, attribute A2 is not covered since XA2={1, 3} and the formula
(A1=3 ∧ A2≠4) ∨ (A2=4)
is not a tautology (e.g., consider A1=1 and A2=3).
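For conditions built only from =/≠ comparisons against constants, the tautology test can be decided by evaluating the disjunction over each constant mentioned plus one fresh value per attribute. The sketch below reproduces the example above (encoding the formulas as Python predicates is an illustrative choice, not a representation used by Algorithm 11):

```python
from itertools import product

psi1 = lambda a1, a2: a1 == 3 and a2 != 4
psi2 = lambda a1, a2: a1 != 3 and a2 != 4
psi3 = lambda a1, a2: a2 == 4

# Constants mentioned in the formulas, plus a fresh value mentioned nowhere.
DOMAIN = [3, 4, "fresh"]

def is_tautology(disjuncts):
    """True iff the disjunction holds under every assignment to (A1, A2)."""
    return all(any(p(a1, a2) for p in disjuncts)
               for a1, a2 in product(DOMAIN, DOMAIN))

print(is_tautology([psi1, psi2, psi3]))  # True:  A1 is covered
print(is_tautology([psi1, psi3]))        # False: A2 is not covered
```

Exhaustive evaluation over the reduced domain is sound here because =/≠ formulas cannot distinguish between two values that are both absent from the formulas; the "fresh" value represents that entire residual class.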
Algorithm 11 performs the above-mentioned cover check.
Algorithm 11 follows a slightly different strategy to minimize the number of tautology tests. In Step 2, the algorithm constructs the set XA for every attribute A and puts it into another set X depending on whether there exists a set of indexes in X that is contained in XA (Step 2c). Then (in Step 2d) the algorithm deletes all the sets in X that contain XA. At the end of the iteration of Step 2, the set X contains only the minimal (w.r.t. ⊆) sets of indexes for attributes in att(E)\att(P). The reason to keep only those minimal sets is that if A and B are attributes such that XA ⊆ XB, then whenever the formula ⋁i∈XA ψi is a tautology, the formula ⋁i∈XB ψi is a tautology as well; thus, only the minimal sets need to be tested.
Algorithm 11 performs a tautology test. Testing whether a formula is a tautology is in general a computationally complex process. In practice, the tautology test may be done by negating the formula and checking unsatisfiability with a satisfiability (SAT) solver.
Algorithm 12 performs the complete adaptation and validation of mapping fragments.
Algorithm 12 repeats a process similar to Algorithm 1 for checking foreign key constraints over tables T1, . . . , Tn and for adapting the mapping fragments. In the algorithm, Σ− is the set of mapping fragments before the addition of the new entity type.
The client-side conditions that may be considered can be arbitrary as long as they can be tested for being a tautology. In some cases, conditions may be handled by considering partitions of the domain of every attribute. The partition scheme may not be suitable to handle conditions of the form A=B, where both sides of the equation are column or property references.
Incrementally Computing Views.
To formally describe how to incrementally compute query views, some additional notation is introduced. Given a formula ψ over attributes and constants att(ψ) may denote the set of all attribute names occurring in ψ. Moreover, asgn(ψ) may denote the set of assignments A←c where A=c is a logical consequence of ψ. For example, let ψ be the formula
A1=1 ∧ A1=A2 ∧ 2≤A3 ∧ A3≤A4 ∧ A4≤2 ∧ A5≠3.
Then, att(ψ)={A1, A2, A3, A4, A5} and asgn(ψ)={A1←1, A2←1, A3←2, A4←2}. Furthermore, the set fix-att(ψ) that was introduced above may be defined as the set of all attributes in att(ψ) that are mentioned in asgn(ψ). If a formula ψ is used when adding a new entity type E, the set of assignments asgn(ψ) may be used to create part of the query view for E. For example, assume that a new entity E is added with attributes KE, A1, A2, A3, A4, with KE the primary key. Further assume that when adding E the tuple (α, ψ, ƒ, T) is included in Γ such that α is the set of attributes {KE, A1, A2, A3}, ψ is the formula A1≠1 ∧ A3=2 ∧ A4=3, and ƒ is a function such that ƒ(KE)=KT, ƒ(A1)=B1, ƒ(A2)=B2, and ƒ(A3)=B3. In this case, att(ψ)={A1, A3, A4}, fix-att(ψ)={A3, A4}, and asgn(ψ)={A3←2, A4←3}. When constructing the query view for E, the following expression may be used:
πKT AS KE,B1 AS A1,B2 AS A2,A3←2,A4←3(T)
That is, for attributes in fix-att(ψ), just the assignments in asgn(ψ) are considered when extracting information from T.
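The construction of such a projection list is mechanical: attributes mapped by ƒ become renamed columns, while attributes in fix-att(ψ) are emitted as the constant assignments from asgn(ψ). A minimal sketch (the string rendering of the projection is an illustrative assumption):

```python
def query_view_projection(alpha, f, asgn):
    """Build the projection list for entity E over table T: renamed columns
    for attributes mapped by f, constant assignments for fixed attributes."""
    items = []
    for attr in alpha:
        if attr in asgn:                       # fixed by psi: use the constant
            items.append(f"{attr}<-{asgn[attr]}")
        else:                                  # stored in T: rename f(attr)
            items.append(f"{f[attr]} AS {attr}")
    for attr, const in asgn.items():
        if attr not in alpha:                  # covered only via fix-att(psi)
            items.append(f"{attr}<-{const}")
    return "pi[" + ", ".join(items) + "](T)"

print(query_view_projection(
    alpha=["KE", "A1", "A2", "A3"],
    f={"KE": "KT", "A1": "B1", "A2": "B2", "A3": "B3"},
    asgn={"A3": 2, "A4": 3}))
# pi[KT AS KE, B1 AS A1, B2 AS A2, A3<-2, A4<-3](T)
```

Note that A3 appears both in α and in asgn(ψ); per the rule above, the constant assignment wins, and A4, which is not stored in T at all, is recovered purely from asgn(ψ).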
Algorithm 13 shows the complete procedure to create query views in this case.
In considering Algorithm 13, entity E is being split into several pieces and these pieces are being mapped according to mapping fragments (8). Since every mapping fragment maps the primary key attribute of E to a different table, the entity E may be reconstructed by taking the full-outer-join of the information stored in every such table. That is what is done in Step 1 of Algorithm 13. Information on the ancestor P is used to reconstruct the missing information of entities of E. Steps 2 and 3 incrementally recompute the query views for the entity types that are ancestors of E by following a strategy similar to the strategy of Algorithm 2.
Algorithm 14 recomputes update views. It follows a strategy similar to the strategy used for recomputing update views in Algorithm 3.
In the presence of store-side constraints (e.g., foreign key constraints), different frameworks may impose differing levels of validity checks. For example, assume that an entity type E is mapped to a table T in the store schema and that T has a foreign key to a table T′. For this mapping to be considered valid, one framework may require an association to be mapped to the foreign key relation as well as an entity type mapped to table T′. This requirement may be imposed even in the case that the foreign key attributes in T do not participate in the mapping from entity E. This validity checking is stronger than checking roundtripability, since if the foreign key attributes in T are nullable, roundtripability alone allows the information of entities of type E to be stored in table T without losing data.
A validity check of a framework may forbid some incremental additions of entity types and associations. For example, a framework may not allow adding an entity type mapped to a table with foreign keys and then adding associations mapped to those foreign keys, since the first addition would result in an invalid mapping.
To deal with this problem, Algorithm 15 is described below that illustrates how an entity and associations may be added to the client schema in a single step. The example considered below is one in which an entity type E is mapped to a table T that has foreign key constraints γ1→γ1′, . . . , γn→γn′ to tables R1, . . . , Rn. In this example, n association sets 𝒜1, . . . , 𝒜n mapped to the foreign key constraints are needed. The case is considered where E is mapped to a single table T.
Consider the following primitives being used together:
where
The semantics of the addition of the new components using the above primitives is given by the following mapping fragments:
Algorithm 15 shows how to validate and adapt the mapping fragments.
Algorithm 15 follows a strategy similar to the strategy followed by Algorithm 1 to check validity with respect to previously added associations and to adapt the mappings (Steps 1 and 3 of Algorithm 15). In addition, a check is performed to determine whether all the foreign key constraints in T are covered by the new associations and whether the associations can be safely mapped to those foreign key constraints (Step 2 of Algorithm 15).
Below is described mapping adaptation and incremental compilation for the case in which an association between two entity types is replaced by an inheritance relationship between those entity types. Although this change is not an incremental addition to the client schema, it may be treated in a way similar to some of the incremental additions covered so far. First, adding a class of associations not previously described above is considered.
Adding Associations with Client Side Referential Constraints.
Below, a procedure is described to address the addition of a class of associations that are manifested in the client schema as explicit client-side referential constraints. These constraints are similar to foreign key constraints in the store schema and are similarly enforced on the client side. As an example, let E1 be an entity type with a key given by a single attribute K1, and E2 an entity type with a primary key composed of two attributes L1, L2. Consider now an association between E1 and E2 with multiplicity 1 for E1 and * for E2. In one framework, it may be specified that the association is given by a referential constraint of the form L1→K1. Thus, an entity e1 of type E1 is associated with an entity e2 of type E2 whenever the value of attribute K1 of e1 is equal to the value of attribute L1 of e2. This referential association establishes a one-to-many association.
Below, adding the above described type of associations to the client schema is described. Because the association is manifested in the client as a referential constraint, the association may be mapped to an already used table T such that the key of one of the endpoints of the association is mapped to the primary key of T and the key of the other endpoint is mapped to a subset of the primary key attributes of T.
Associations may be added with the following primitive:
AddAssocRC(𝒜,E1,E2,mult,T,ƒ)
where:
1. E1 and E2 are the endpoints of the association set 𝒜.
2. 𝒜 is manifested as a referential constraint in the client side of the form β→α, where α is the set of key attributes of E1 and β is a subset of the key attributes of E2.
3. mult is an expression that denotes the multiplicity of the association. The multiplicity is such that the endpoint corresponding to E1 is 1. Depending on the referential constraint, the multiplicity of E2 is to be 0 . . . 1 if β comprises all of the key attributes of E2, and * otherwise.
4. T is a table previously mentioned in mapping fragments and has update view QT−|τT−.
5. ƒ is a 1-1 function that satisfies the following. Assume that γ is the set of primary key attributes of E2. Then ƒ(E2·γ) is the primary key of T. Moreover, if there is a foreign key from T that mentions the attributes ƒ(E2·β), then it covers all the attributes ƒ(E2·β); that is, it is of the form ƒ(E2·β)→β′.
The semantics of the addition of the new association using the above primitive is given by the following mapping fragment:
πE1.α,E2.γ(𝒜)=πƒ(E2·β),ƒ(E2·γ)(T)  (10)
The primary key α of E1 is mapped to the set of attributes ƒ(E2·β) in table T, which are part of the primary key of T. If the primary key α of E1 were instead mapped to a different set of attributes not previously used in table T, the methodology previously described (e.g., under the heading Associations Mapped to a Previously Used Table) may be used to add the association to the client schema.
As in the cases covered by the methodology previously described, there is no need to adapt the previous mapping fragments. Instead, a check may be performed of the validity of the new mapping given by the previous set of mapping fragments plus the mapping fragment (10). Algorithm 17 performs this validation.
As described previously, in this case none of the previous query views changes, and there is only a need to add the query view for the new association set 𝒜. Algorithm 18 shows how to create this view.
For the case of update views, there is no need to make any changes, since the new association is stored in table T just by using the primary key of entities of type E2, which was already mapped to table T and materialized on the store side in the primary key of table T.
Replacing Association by Inheritance.
A diagram of the general case described below is shown in
To formalize the process, assume that initially entity type E1 is part of an entity set ε, entity type E2 is part of an entity set ε′, and that E2 is the root of the hierarchy of types to which it belongs (as shown in
For the refactoring process, the following primitive is considered:
Refact(E1,E2,𝒜)
where:
1. 𝒜 is an association between E1 and E2 given by a referential constraint in the client side from the key attributes of E2 to the key attributes of E1.
2. The multiplicity of E1 in the association is 1 and the multiplicity of E2 is 0 . . . 1.
3. E2 is the root of a hierarchy of entity types.
The application of Refact(E1, E2, 𝒜) has the same effect shown in
Algorithm 19 shows the adaptation of the mapping fragments after Refact(E1, E2, ).
In Algorithm 19, Σ− denotes the set of mapping fragments before the refactoring. For point 1(a) of Algorithm 19, assume, for explanation's sake, that before the refactoring the set of mappings contains a mapping fragment of the form
πE1.α,E2.α(𝒜)=πβ1,β2(T)  (11)
where α is the set of key attributes of E1 and E2, β1 and β2 are sets of attributes, and T is an arbitrary table in the store. After the refactoring, the association set 𝒜 is no longer part of the client schema. In Step 1(a), Algorithm 19 deals with the deletion of 𝒜 by replacing mapping fragment (11) by:
πE2.α,E2.α(σIS OF E2(ε))=πβ1,β2(T)
Above, the reference to 𝒜 has been replaced by a single reference to entity type E2, and both endpoints of 𝒜 have been replaced by references to the key attributes of E2. Similarly, since after the refactoring entity set ε′ is no longer part of the client schema, in Step 1(b) Algorithm 19 replaces any reference to ε′ by ε.
The adaptation does not need any additional validation. Since the mapping was valid before the refactoring, it can be proved that the form of adapting the mapping fragments described above gives a valid mapping.
Algorithm 20 shows how query views may be recomputed. For an entity type E, QE−|τE− denotes the query view for E before the refactoring.
For recomputing update views a strategy similar to the strategy for adapting mapping fragments is followed. The details of the strategy are described in Algorithm 21.
Turning to
Where the system 1105 comprises a single device, an exemplary device that may be configured to act as the system 1105 comprises the computer 110 of
The incremental components 1110 may include a change receiver 1115, an entity manager 1120, an association manager 1125, a validator 1130, a query view manager 1135, an update view manager 1145, and other components (not shown). As used herein, the term component is to be read to include hardware such as all or a portion of a device, a collection of one or more software modules or portions thereof, some combination of one or more software modules or portions thereof and one or more devices or portions thereof, and the like.
The communications mechanism 1155 allows the system 1105 to communicate with other entities. For example, the communications mechanism 1155 may allow the system 1105 to communicate with applications or database management systems (DBMSs) on remote hosts. The communications mechanism 1155 may be a network interface or adapter 170, modem 172, or any other mechanism for establishing communications as described in conjunction with
The store(s) 1150 include any storage media capable of providing access to data. The term data is to be read broadly to include anything that may be represented by one or more computer storage elements. Logically, data may be represented as a series of 1's and 0's in volatile or non-volatile memory. In computers that have a non-binary storage medium, data may be represented according to the capabilities of the storage medium. Data may be organized into different types of data structures including simple data types such as numbers, letters, and the like, hierarchical, linked, or other related data types, data structures that include multiple other data structures or simple data types, and the like. Some examples of data include information, program code, program state, program data, other data, and the like.
The store(s) 1150 may comprise hard disk storage, other non-volatile storage, volatile memory such as RAM, other storage, some combination of the above, and the like and may be distributed across multiple devices. The store(s) 1150 may be external, internal, or include components that are both internal and external to the system 1105.
The store(s) 1150 may host databases and may be accessed via corresponding DBMSs. Access as used herein may include reading data, writing data, deleting data, updating data, a combination including two or more of the above, and the like.
The change receiver 1115 is operable to receive an indication of a change to a client schema (e.g., via message passing, being called, or otherwise) and to receive a compilation directive (e.g., TPT, TPC, TPH, partition, or the like) associated with the change. The client schema is mapped to a store schema via mapping data specified by a set Σ of mapping fragments.
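By way of a non-limiting illustration, the inputs received by the change receiver 1115 may be pictured as a schema-change indication paired with a compilation directive. The following sketch is not from the original text; the type and field names are hypothetical:

```python
from dataclasses import dataclass
from enum import Enum, auto

class Directive(Enum):
    """Hypothetical encoding of the compilation directives named above."""
    TPT = auto()        # table-per-type
    TPC = auto()        # table-per-concrete-type
    TPH = auto()        # table-per-hierarchy
    PARTITION = auto()  # entity partitioned across several tables

@dataclass
class SchemaChange:
    """An indication of a change to the client schema."""
    kind: str        # e.g. "add_entity", "add_association"
    type_name: str   # the entity or association being changed

def receive_change(change: SchemaChange, directive: Directive):
    """Pair a schema change with its associated compilation directive."""
    return (change, directive)
```

A caller might invoke `receive_change(SchemaChange("add_entity", "Customer"), Directive.TPT)` to signal that a new entity type is to be mapped table-per-type.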
The entity manager 1120 is operable to use the compilation directive in incrementally modifying the store schema in response to the change to the client schema. The entity manager 1120 may select one or more of the algorithms described herein to perform the incremental modification. The entity manager 1120 may use the compilation directive to select the algorithm(s). For example, for a TPT or TPC directive, the entity manager 1120 may select Algorithm 1.
The association manager 1125 is operable to use the compilation directive in incrementally modifying the mapping data in response to the change to the client schema. The association manager 1125 may operate similarly to the entity manager 1120 in selecting the algorithm. For example, for a TPT or TPC directive, the association manager 1125 may select Algorithm 1. In one embodiment, the association manager 1125 and the entity manager 1120 may be combined into one component or may share an algorithm selection function.
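The directive-based algorithm selection shared by the entity manager 1120 and the association manager 1125 may be sketched as a simple lookup. This sketch is illustrative only; aside from Algorithm 1 for TPT and TPC (which the text names), the algorithm names are placeholders:

```python
def select_algorithm(directive: str) -> str:
    """Map a compilation directive to the incremental algorithm to run.

    Per the text, both TPT and TPC directives select Algorithm 1; the
    remaining entries are placeholders for the other algorithms herein.
    """
    table = {
        "TPT": "Algorithm 1",
        "TPC": "Algorithm 1",
        "TPH": "TPH algorithm",
        "PARTITION": "partition algorithm",
    }
    return table[directive]
```

Sharing a single selection function in this way is one embodiment of combining the entity and association managers' algorithm selection, as described above.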
The validator 1130 is operable to validate that modifications to the store schema and the mapping data do not violate constraints placed on the store schema. The validator may use as a starting point (e.g., an assumption) that the store schema was valid (e.g., did not violate any constraints) prior to the modifications and may perform an incremental validation (also referred to as a local validation) that is effective for the modifications instead of re-validating the entire store schema. The validator may use appropriate algorithms described herein to validate the store schema.
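One way to picture the validator's incremental (local) validation is to re-check only constraints that mention a modified part of the store schema, relying on the assumption that the schema was valid before the change. The representation below (constraints as pairs of referenced tables and a check function) is a hypothetical simplification:

```python
def incremental_validate(constraints, modified_tables):
    """Re-check only constraints touching a modified table.

    Assumes the store schema was valid prior to the modifications, so
    constraints not affected by the change need not be re-evaluated.
    Each constraint is a pair (referenced_tables, check_fn).
    """
    for tables, check in constraints:
        if tables & modified_tables and not check():
            return False  # a violated constraint: reject the change
    return True
```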
The query view manager 1135 may update query views as described by algorithms herein. Likewise, the update view manager 1145 may modify update views as described herein.
Turning to
At block 1215, a compilation directive associated with the change is received. The compilation directive may be received together with the indication of the change or in a separate communication. The compilation directive may indicate how one or more types in the client schema are mapped to one or more types in the store schema. As mentioned previously, these mapping strategies may include one or more of: table-per-type, table-per-concrete-type, table-per-hierarchy, and partitioned across tables. For example, referring to
At block 1220, validation is performed as needed. As mentioned previously, this validation may forego re-validating the entire store schema and may focus on local validations that validate only a portion of the mapping data where the portion is affected by the change. For example, referring to
At block 1225, incremental actions are performed. Incremental actions may include incrementally modifying the mapping data and storage schema to be consistent with the requested change. For example, if the incremental action is to add an entity type, then the storage schema may be modified to add a table to store the entity type, and the mapping data may be modified to include mapping fragments that express the mapping from the added entity type to the added table. Incrementally modifying the mapping data and storage schema may involve creating new entities and/or relationships, deleting existing entities and/or relationships, updating existing entities and/or relationships, adding mapping fragments, incrementally modifying query and/or update views, and the like. For example, referring to
Incrementally modifying the mapping data and storage schema to be consistent with the requested change may include incrementally modifying only a subset of the mapping data and storage schema, where the subset includes only a minimal portion of the mapping data and storage schema affected by the change.
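The add-entity-type example above can be sketched as follows. This is a minimal illustration, assuming a simplified representation of the store schema (table name to column list) and of the mapping fragment set Σ (pairs of entity and table); the table-naming convention is hypothetical:

```python
def add_entity_type(store_schema, mapping_fragments, entity, columns):
    """Incrementally extend the store schema and mapping for a new entity.

    store_schema: dict mapping table name -> list of columns
    mapping_fragments: list of (entity, table) pairs (simplified fragments)
    """
    table = f"{entity}Table"           # illustrative naming convention
    store_schema[table] = list(columns)            # add a table for the type
    mapping_fragments.append((entity, table))      # add a mapping fragment
    return table
```

Only the new table and the new fragment are touched; the remainder of the schema and mapping data are left unchanged, consistent with modifying only a minimal affected subset.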
At block 1230, other actions, if any, may be performed.
Turning to
At block 1320, a compilation directive associated with the change is received. The compilation directive may be received together with the indication of the change or in a separate communication. The compilation directive may indicate how one or more types in the client schema are mapped to one or more types in the store schema. For example, referring to
At block 1325, if the compilation directive indicates mapping TPT or TPC, the actions continue at block 1330; otherwise, the actions continue at block 1335. At block 1330, a first set of actions is performed to incrementally modify the mapping data. For example, referring to
1. checking whether referential constraints expressed as tests for query containment and associated with modifying the mapping for the new type are valid;
2. if the referential constraints are not valid, aborting modifying the mapping data; and
3. if the referential constraints are valid, then modifying the mapping data for each proper ancestor and for each proper descendant of the new type.
The first set of actions may also include other actions specified by algorithms described herein.
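The three TPT/TPC steps above may be sketched as follows. The referential-constraint tests (expressed in the text as query-containment checks) are stood in for by callables, and the hierarchy representation is hypothetical:

```python
def add_entity_tpt(new_type, hierarchy, mapping, containment_tests):
    """Sketch of the TPT/TPC first set of actions.

    containment_tests: callables standing in for the query-containment
    checks of the referential constraints.
    hierarchy: dict of type -> (proper ancestors, proper descendants).
    """
    if not all(test() for test in containment_tests):
        return False  # abort: a referential constraint is not valid
    ancestors, descendants = hierarchy[new_type]
    for t in ancestors + descendants:
        # illustrative: record the new type in each related type's mapping
        mapping.setdefault(t, []).append(new_type)
    return True
```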
At block 1335, if the compilation directive indicates mapping TPH, the actions continue at block 1340; otherwise, the actions continue at block 1345. At block 1340, a second set of actions is performed to incrementally modify the mapping data. For example, actions under the heading “Adding an Entity: the TPH case” may be performed. These actions may include, for example:
1. checking whether keys associated with modifying the mapping for the new type are valid;
2. if the keys are not valid, aborting modifying the mapping data; and
3. if the keys are valid, identifying a parent type of the new type and modifying the mapping data only for the parent type.
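The TPH steps above may be sketched as follows. In contrast with the TPT/TPC case, on valid keys only the parent type's mapping is touched, since a TPH hierarchy shares one table. The key-validity predicate is a stand-in:

```python
def add_entity_tph(new_type, parent_of, mapping, keys_valid):
    """Sketch of the TPH second set of actions.

    parent_of: dict of type -> parent type.
    keys_valid: predicate standing in for the key-validity check.
    """
    if not keys_valid(new_type):
        return None  # abort: keys associated with the mapping are not valid
    parent = parent_of[new_type]
    # illustrative: only the parent type's mapping entry is modified
    mapping.setdefault(parent, []).append(new_type)
    return parent
```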
At block 1345, if the compilation directive indicates partitioning the new type across a plurality of tables, the actions continue at block 1350; otherwise, the actions continue at block 1355. At block 1350, a third set of actions is performed to incrementally modify the mapping data. For example, the actions under the heading “Adding a New Entity Type Partitioned across Several Tables” may be performed. These actions may include, for example:
1. checking coverage of attributes for the new type;
2. if the coverage fails, aborting modifying the mapping data;
3. if the coverage succeeds, checking whether keys associated with modifying the mapping for the new type are valid;
4. if the keys are not valid, aborting modifying the mapping data; and
5. if the keys are valid, then modifying the mapping data for each proper ancestor and for each proper descendant of the new type.
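The five partitioned-entity steps above may be sketched as follows. The coverage check verifies that the partitions (attribute sets, one per table) together cover every attribute of the new type; the key check and hierarchy representation are the same stand-ins used in the earlier illustrations:

```python
def add_entity_partitioned(new_type, attributes, partitions,
                           keys_valid, hierarchy, mapping):
    """Sketch of the third set of actions for a partitioned entity type.

    partitions: list of attribute sets, one per table, which together
    must cover every attribute of the new type.
    """
    covered = set().union(*partitions) if partitions else set()
    if not set(attributes) <= covered:
        return False  # abort: attribute coverage fails
    if not keys_valid(new_type):
        return False  # abort: keys are not valid
    ancestors, descendants = hierarchy[new_type]
    for t in ancestors + descendants:
        # illustrative: modify mapping for proper ancestors and descendants
        mapping.setdefault(t, []).append(new_type)
    return True
```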
In some implementations there may be other ways of mapping entities and mappings in the client schema to the store schema. At block 1355, if the mapping directive indicates a different mapping, the actions continue at block 1360; otherwise, the actions continue at block 1365. At block 1360, actions appropriate for the mapping directive are performed.
At block 1365, other actions, if any, are performed. These other actions may include, for example:
1. receiving an indication that a new association has been added to the client schema and in response checking validity of mapping fragments for the new association. If the mapping fragments for the new association are invalid, then aborting modifying the mapping data;
2. modifying the mapping data to account for the new type in conjunction with modifying the mapping data to account for the new association;
3. incrementally updating an update view based on the new type;
4. incrementally updating a query view based on the new type;
5. other actions indicated by algorithms described elsewhere herein.
As can be seen from the foregoing detailed description, aspects have been described related to incrementally modifying schemas and mappings. While aspects of the subject matter described herein are susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit aspects of the claimed subject matter to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of various aspects of the subject matter described herein.