The subject disclosure generally relates to mapping programming languages to a destination, and more particularly to mapping languages between a domain modeling system and a relational storage system.
When a large amount of data is stored in a database, such as when a server computer collects large numbers of records, or transactions, of data over long periods of time, other computers sometimes desire access to that data or a targeted subset of that data. In such case, the other computers can query for the desired data via one or more query operators. In this regard, historically, relational databases have evolved for this purpose, and have been used for such large scale data collection, and various query languages have developed which instruct database management software to retrieve data from a relational database, or a set of distributed databases, on behalf of a querying client.
It is often desirable to author source code for such management functions in a declarative programming language. Unlike imperative programming languages, declarative programming languages allow users to write down what they want from their data without having to specify how those desires are met against a given technology or platform. However, current models authored in a declarative modeling language usually go through a series of tools that transform declarative definitions into various concrete implementation artifacts. Moreover, once a model is received by a database, the declarative nature of the model has been transformed into an imperative model, which may be undesirable. Accordingly, there is a need for a method and system for mapping between a domain modeling system and a relational storage system in which the declarative nature of an execution model is preserved.
The above-described deficiencies of current relational database systems and corresponding database management techniques are merely intended to provide an overview of some of the problems of conventional systems, and are not intended to be exhaustive. Other problems with conventional systems and corresponding benefits of the various non-limiting embodiments described herein may become further apparent upon review of the following description.
A simplified summary is provided herein to help enable a basic or general understanding of various aspects of exemplary, non-limiting embodiments that follow in the more detailed description and the accompanying drawings. This summary is not intended, however, as an extensive or exhaustive overview. Instead, the sole purpose of this summary is to present some concepts related to some exemplary non-limiting embodiments in a simplified form as a prelude to the more detailed description of the various embodiments that follow.
Embodiments of a method and system for mapping between constructs in a domain modeling language and a relational storage language are described. In various non-limiting embodiments, the method includes receiving a source code authored in a source language and identifying a set of constructs in the source code. Within such embodiment, the set of constructs in the source code are mapped to a set of constructs in a target language. The source code is then compiled into a target code authored in the target language, such that one of the source code or the target code includes a declarative constraint-based execution model.
In another embodiment, a system is provided. Within such embodiment, a receiving component is configured to receive a source code authored in a source language. The system includes a first construct library configured to store constructs that support the domain modeling language, and a second construct library configured to store constructs that support the relational storage language. The system also includes a processor component configured to execute computer-readable instructions for mapping a set of constructs in the source code to a set of constructs in a target language. And finally, the system includes a compiling component configured to generate a target code authored in the target language and analogous to the source code, such that one of the source code or the target code includes a declarative order-independent execution model.
In yet another embodiment, a system for mapping between an M language programming code and an SQL programming code is provided. The system includes means for receiving a source code authored in either M or SQL, and means for identifying a set of constructs in the source code. The system also includes a means for mapping the set of constructs in the source code to a set of constructs in either M or SQL. A means for compiling the source code into a target code authored in either M or SQL is also provided, such that one of the source code or the target code includes a declarative constraint-based and order-independent execution model.
These and other embodiments are described in more detail below.
Various non-limiting embodiments are further described with reference to the accompanying drawings in which:
As discussed in the background, among other things, conventional systems do not provide an adequate mechanism for translating declarative execution models into a relational storage language. Accordingly, in various non-limiting embodiments, the present invention provides a method and system for mapping constructs between a declarative domain modeling system and a relational storage system. It should be appreciated that an exemplary domain modeling programming language that is compatible with the scope and spirit of the disclosed subject matter is the M programming language (hereinafter “M”), which was developed by the assignee of the subject application. In one aspect, M is a purely declarative language that is operable to produce either an order-independent or constraint-based execution model. In addition to M, however, it is to be understood that other similar programming languages may be used, and that the utility of the disclosed subject matter is not limited to any single programming language.
In one embodiment, the methods described herein are operable with a programming language, such as M, having a constraint-based type system. Such a constraint-based system provides functionality not simply available with traditional, nominal type systems. In
For an illustration of the contrast between a nominally-typed execution model and a constraint-based typed model according to a declarative programming language described herein, such as the M programming language, exemplary code for type declarations of each model are compared below.
First, with respect to a nominally-typed execution model the following exemplary C# code is illustrative:
For this declaration, a rigid type-value relationship exists in which A and B values are considered incomparable even if the values of their fields, Bar and Foo, are identical.
In contrast, with respect to a constraint-based model, the following exemplary M code (discussed in more detail below) is illustrative of how objects can conform to a number of types:
For this declaration, the type-value relationship is much more flexible as all values that conform to type A also conform to B, and vice-versa. Moreover, types in a constraint-based model may be layered on top of each other, which provides flexibility that can be useful, e.g., for programming across various RDBMSs. Indeed, because types in a constraint-based model initially include all values in the universe, a particular value is conformable with all types in which the value does not violate a constraint codified in the type's declaration. The set of values conformable with type defined by the declaration type T:Text where value<128 thus includes “all values in the universe” that do not violate the “Integer” constraint or the “value<128” constraint.
Thus, in one embodiment, the programming language of the source code is a purely declarative language that includes a constraint-based type system as described above, such as implemented in the M programming language.
In an embodiment, the synchronization method described herein may also be operable with a programming language having an order-independent execution model. Similar to the aforementioned constraint-based execution model, such an order-independent execution model provides flexibility that is also particularly useful for programming across various RDBMSs.
In
However, data storage abstraction 400 may have also resulted from the following code:
And each of the two codes above may be functionally equivalent to the following code:
f: Foo*={{Bar=“2”}, {Bar=“3”}, {Bar=“1”}};
In an embodiment, it should be appreciated that aspects disclosed herein may be implemented as part of a compiling unit that is configured to compile source code in any of a plurality of ways. For instance, in
In an aspect, the post-processed definitions 524 include the processed source code and any of a plurality of designtime/runtime artifacts associated with the processed source code. Such artifacts, for example, may include artifacts based on dependencies to subsequent source models 510, other repositories 550, and/or external resources 542 (e.g., CLR assemblies). A packaging component 530 may then package the post-processed definitions 524 as image files, which are installable into particular repositories 550. Within such embodiment, image files include definitions of necessary metadata and extensible storage to store multiple transformed artifacts together with their declarative source model. For example, packaging component 530 may set particular metadata properties and store the declarative source definition together with compiler output artifacts as content parts in an image file.
As illustrated, system 500 may also include synchronization component 540, which may be utilized to manage image files. For example, synchronization component 540 may take an image file as an input and link it with a set of referenced image files. In between or afterwards, there could be several supporting tools (like re-writers, optimizers etc.) operating over the image files by extracting packaged artifacts, processing them and adding more artifacts in the same image file. These tools may also manipulate metadata of the image file to change the state of the image file (e.g., digitally signing an image file to ensure its integrity and security).
Next, a deployment utility deploys the image file and an installation tool installs it into a running execution environment within repositories 550. In an embodiment, repositories 550 are a collection of running relational database management systems (RDBMS). Here, however, it should be noted that the default system catalog of many RDBMSs do not provide adequate assistance to manage deployment of image files themselves or their packaged contents. In order to address this limitation, an embodiment may include having data structures included within an image file extracted and used to populate an extended catalog so as to augment the default system catalog of such RDBMSs.
For some applications, however, it may be desirable for compiler 520 to directly provide target code 522. Accordingly, as illustrated in
Referring next to
In one aspect, processor component 710 is configured to execute computer-readable instructions related to performing any of a plurality of functions. Such functions may include controlling any of memory component 720, interface component 730, M construct library component 740, SQL construct library component 750, and/or compiler component 760. Other functions performed by processor component 710 may include analyzing information and/or generating information that can be utilized by any of memory component 720, interface component 730, M construct library component 740, SQL construct library component 750, and/or compiler component 760. Here, it should also be noted that processor component 710 can be a single processor or a plurality of processors.
In another aspect, memory component 720 is coupled to processor component 710 and configured to store computer-readable instructions executed by processor component 710. Memory component 720 may also be configured to store any of a plurality of other types of data including, for instance, queued source codes to be processed, syntactical rules, compile-time artifacts, etc., as well as data generated by any of interface component 730, M construct library component 740, SQL construct library component 750, and/or compiler component 760. Memory component 720 can be configured in a number of different configurations, including as random access memory, battery-backed memory, hard disk, magnetic tape, etc. Various features can also be implemented upon memory component 720, such as compression and automatic back up (e.g., use of a Redundant Array of Independent Drives configuration).
As shown, mapping system 700 may also include interface component 730. In an embodiment, interface component 730 is coupled to processor component 710 and configured to interface mapping system 700 with external entities. For instance, interface component 730 may be configured to receive source code from either a domain modeling language system and/or a relational store. Interface component 730 may also be configured to display an output to a user, as well as to transmit the output to an external entity (e.g., via a network connection).
In another aspect, mapping system 700 also includes M construct library 740 and SQL construct library 750, as shown. Within such embodiment, M construct library 740 and SQL construct library 750 respectively include a plurality of constructs in M and SQL that may be utilized to translate M code into SQL code, and vice-versa. Moreover, each of M construct library 740 and SQL construct library 750 provide the universe of constructs from which mapping system 700 may select to map a source code to a target code. Here, it should be appreciated that each individual construct in M does not necessarily have a particular corresponding construct in SQL. Nevertheless, as will be discussed in more detail below, by combining various constructs, an extensive mapping landscape may be provided.
Mapping system 700 also includes compiler component 760, as shown. In an embodiment, compiler component 760 is coupled to processor component 710 and configured to compile source code received via interface component 730. Here, it should be noted that compiler 760 may be configured to compile any of a plurality of types of compile-time artifacts. For instance, in an aspect, a plurality of candidate mappings for a given expression might be compiled, wherein particular rules may be utilized to subsequently select which candidate mapping to include at runtime.
Referring next to
In the discussion that follows, an overview of exemplary mappings are discussed within the context of a generic domain modeling language mapping to a generic relational storage language.
In an embodiment, any storage specifier in a domain modeling language may map to a corresponding table in the relational storage language. Within such embodiment, the domain modeling language includes a constraint-based system which maps to particular constraints imposed on tables in the relational storage language. In an aspect, whereas scalars may map to simple cells, entities having multiple identity fields may map to a table having multiple columns. An exemplary mapping of such a multi-field entity is provided in
As illustrated, mapping 900 includes an Entity Foo 910 in a domain modeling language being mapped as a Table Foo 920 in a relational storage language. Within such embodiment, Entity Foo 910 includes Field A 912, Field B 914, and Field C 916, as shown. Here, it should be appreciated that each of Field A 912, Field B 914, and Field C 916 are themselves storage identifiers, which could be mapped to individual tables. Accordingly, each of Field A 912, Field B 914, and Field C 916 may correspond to a particular type of data (e.g., integer32, text, etc.), wherein particular constraints may be imposed on such data. In an embodiment, these data types and constraints are subsequently mapped to Column A 922, Column B 924, and Column C 926 such that Entity Foo 910 is analogous to Table Foo 920.
For some embodiments, it may be desirable for entities to reference other entities. Referring next to
In an aspect, it should be appreciated that mapping 1000 may similarly illustrate either a “one-to-many” or “one-to-one” entity reference. To implement the one-to-many pattern (e.g., Field C 1016 referencing Field D 1032 and Field E 1034, which maps to Column C 1026 referencing Column D 1042 and Column E 1044), an entity reference to the Parent in the child field may be included. Similarly, to implement a one-to-one pattern (e.g., Field C 1016 referencing Field E 1034 which maps to Column C 1026 referencing Column E 1044), an entity reference from one of the fields to the other may be included. Although, both relationships may look the same in the domain modeling language, the one-to-one relationship may be enforced in the relational storage language by utilizing a unique constraint.
In an embodiment, a mapping system implements entity references with a column with the same type as the entity's identity column, and a foreign key from the referencing table to the referenced table. Here, it should also be noted that retrieving particular values from either of Entity Foo 1010 or Entity Bar 1030 may be mapped as executing a “select” command on particular cells from either of Table Foo 1020 or Table Bar 1040, wherein multiple retrievals may be mapped as performing a “union” of multiple “select” commands. A more detailed discussion of such retrievals is provided later within the context of “member access” and “anonymous collections” in M.
Referring next to
It should be further noted that various novel mapping schemes for performing initializations are also provided. To facilitate such initialization embodiments, an “insert” construct may be available in the relational storage language construct library. In
In some embodiments, mapping nested initialization schemes may also be supported. Indeed, as illustrated in
In yet another embodiment, labeled initialization schemes may also be supported, as illustrated in
As stated previously, an exemplary domain modeling programming language that is compatible with the scope and spirit of the disclosed subject matter is the M programming language. In the subsequent discussion, so as to provide additional context for some of the novel features described herein, a plurality of exemplary translations between M and SQL are discussed.
In an embodiment, with respect to basic types, scalar “M” types translate to scalar SQL types with the following mapping. “Constraint” is used when the SQL type does not perfectly represent the “M” type—for example, when it allows a larger range of values than the corresponding “M” type. In these cases, a CHECK constraint may be emitted on the table restricting the range of the value, as shown in Table T-1 below.
In an embodiment, no other scalar types are supported. The unsupported scalar types are: Scientific, Unsigned, Integer, Decimal, Number, and Unsigned64.
With respect to modules, “M”->SQL translates “M” “modules” to SQL “schemas.” Dots in module names may be translated directly to schema names. In an embodiment, all tables, functions and views generated from extents and computed values defined in a module are placed within that schema. In one aspect, module imports and exports are purely “M” constructs and have no effect on the SQL output by “M”->SQL. Exemplary translations for a module and dotted module are provided in Table T-2.
In an embodiment, “M” extents are simply storage specifiers (e.g., “MyInts Integer32*” means “MyInts is a place where I can hold a bunch of integers.”). Within such embodiment, “M”->SQL compilation may translate extents into tables with the same name. Moreover, within such embodiment, all supported extents are converted to SQL tables.
In one aspect, “M”->SQL supports extents of four different variations: Scalar, collection of Scalar, Entity, and collection of Entity. With respect to entities, it should be appreciated that SQL tables may have columns of the same name as the entity fields. Exemplary translations for entities are provided in Table T-3.
The translation of “M” scalar types to SQL scalar types has been discussed previously with respect to translating Basic Types. Nevertheless, it should be appreciated that SQL tables for scalars may have one column named “Item”, as described in Table T-4 below.
As can be seen from the “Scalar” and “Entity” examples above, tables created to hold exactly one value may look the same as tables that hold many values. In some embodiments, “M” may be configured to assume there is always exactly one row in such tables. Enforcement of such assumption may then be achieved by creating appropriate constraints. In other embodiments, it should be appreciated that “M”->SQL may be configured to support collections of collections (such as GroupsOfPeople:(Person*)*).
With respect to identity columns, “M” entities may have one or more identity fields specified that make up a unique key for the entity. In one aspect, such identity is the basis for entity relationships (e.g., one-to-one, one-to-many and many-to-many relationships). “M”->SQL may support single or multiple field identities, and will create a primary key containing those columns. In some embodiments, “M”->SQL supports any type for identity, whereas other embodiments may exclude unconstrained Text. “M”->SQL may also support the identity autonumbering scheme of SQL.
In one aspect, “M”->SQL explicitly specifies the “clustered” keyword for the primary key even though it is the default for SQL. Within such embodiment, the clustered index is the index in which the data for the table is actually stored (so that it could be looked up quickly when the primary key is used to look it up). Exemplary translations for identities are provided in Table T-5.
In some embodiments, unconstrained Text field identities are not supported because SQL cannot create a primary key for them. For such embodiments, in order to use a Text field as an identity, its length might be constrained. In other embodiments, it should also be noted that collection fields and nullable fields may be supported as part of a primary key. Furthermore, although primary keys that would take more than 900 bytes to store are presently unsupported due to limitations in SQL, future embodiments may include such primary keys.
Referring next to entity references, one-to-one and one-to-many relationships in “M” may be modeled using entity references. To implement the one-to-many pattern, an entity reference may be placed to the Parent in the child extent (e.g., OtherTable:OtherType). To implement one-to-one, an entity reference may be placed from one of the extents to the other (e.g., OtherTable:OtherType). It should be appreciated, however, that both of the above implementations look the same in “M”. (Note: to enforce the 1:1 relationship, a unique constraint can be used.)
In an embodiment, “M”->SQL implements entity references with a column with the same type as entity's identity column, and a foreign key from the referencing table to the referenced table. Meanwhile, trees are simply self-referencing one-to-many relationships. In one aspect, it should be noted that “M”->SQL will generate an error for entity references without a membership constraint (e.g., “where value in OtherExtent” or some variant thereof). Exemplary translations for entity references are provided in Table T-6.
“M”->SQL supports one-to-many (and one to one) relationships using collection fields. Unlike other entity fields, however, collection fields do not translate directly to a column in the underlying SQL table. Instead, a many-to-many “child table” is created with the name “<Extent>_<Field>”, with a parent id and an “Item” column to hold the values. If the parent has a multi-column primary key, the parent reference in the child table may be multi-column as well. For collections of entities, the “Item” column(s) of the child table may be a foreign key to the storage in the same manner as a one-to-many relationship. In one aspect, the child table has “on delete cascade” specified, so that if a row in the parent table is deleted, corresponding rows in the child table will also be deleted. In other aspects, collection fields are only supported if the containing entity has an identity field, so that the parent ID could be mapped to. Exemplary translations for many-to-many relationships are provided in Table T-7.
It should be appreciated that fields can have default values in “M”, for example “MyGuid:Guid=NewGuid( )” or “Z: Integer32=0.” “M”->SQL translates simple constant defaults into the “default” keyword for a column. Future embodiments may also include default values for more complex values including references to columns, subqueries, and math operators. As a result, an “insert into” command may be applied to a table without specifying those values, and they will be replaced with their defaults. Here, it should also be noted that an “AutoNumber( )” default, only specifiable on an integer identity field, may also be used which will cause the field type to be “identity.” In future embodiments, it should be further noted that “M”->SQL may handle default values for collection and/or entity columns, including default values that refer to collection or entity columns in the same extent. Exemplary translations for implementing default values are provided in Table T-8.
Referring next to constraints, in one embodiment, aside from membership and identity constraints, all “where” constraints on a type are translated in “M”->SQL as CHECK constraints. In order to allow a richer set of expressions, CHECK constraints may be placed in a function returning a Logical value. It should also be noted that, before CHECK constraints are created, identity and foreign key constraints may be removed from the expression (i.e. they will not be turned into CHECK constraints) by, for example, breaking up the expression by &&'s. So if the “M” has a constraint such as: “where identity(Id) && Address in Addresses && Age >10”, “Age >10” is turned into a check constraint, but the first two constraints will not because they are identity and foreign key constraints, respectively. Exemplary translations for implementing constraints are provided in Table T-9.
In some embodiments, translations of computed values inside an extent may also be contemplated. Here, it should be noted that such computed values translate into computed columns in SQL-speak.
Extents may also be initialized. In an embodiment, “M” extents can have one or more initializers that state the data that needs to be in the table. Within such embodiment, “M”->SQL translates these initializers into insert statements. Nested entity and collection initializers may also be supported. In one aspect, when values that have defaults are not filled in, the corresponding SQL INSERT is not filled in either, and SQL fills in the values itself. Exemplary translations for implementing an initialization are provided in Table T-10.
In the above initializations, it should be noted that any depth of nested entity reference may be supported. It should be further noted that embodiments that include initialization of nested entities and entity references may also be contemplated when the referenced/nested entity is part of an extent with multi-field identity.
Referring next to labeled extent initialization, when inserting instances, the instances may be labeled and referred to in other insert statements or in expressions. Meanwhile, labeled scalars may be translated as constants wherever they are seen. Exemplary translations for labeled entities are provided in Table T-11.
In the above translations, it should be noted that embodiments where references to labeled entities are supported may also be contemplated. Other embodiments that may be contemplated include references to labeled entities with multiple identity fields.
Referring next to identifiers and collisions, all views, functions, tables, columns, constraints, and triggers created in “M” may have names that are similar to the underlying table. However, because SQL only supports 128-character names, if the name of an identifier would be longer than 128 characters, it may be truncated to 128 characters.
In an embodiment, identifier collisions are also detected and disambiguated. It is possible for conflicts between autogenerated names to exist. For example in the following “M” code, it can create a conflict between the autogenerated join table and the declared “People_Addresses” extent:
In an embodiment, when collisions are detected, a number is appended to the database object in question. In the example above, these two objects would be named “People_Addresses” and “People_Addresses1.” How these numbers are assigned may be contemplated in other embodiments.
Next module-level computed values are discussed. In an embodiment, computed values in “M” represent functions, expressions or queries that can be reused. Within such embodiment, computed values can have a range of expressions.
In one aspect, “M”->SQL translates computed values into scalar functions, table-valued functions, or views, depending on their most probable usage in SQL, which may be based on the return type and parameters to the function. Scalar functions may be created when the computed value return type is scalar, which allows these functions to be used like “SELECT*Module.Function(2)+1”. Meanwhile, views may be created when the computed value has no parameters and the return type is an entity or a collection, which allows views to be used like “SELECT*FROM Module.View”. Table-valued functions may then be created when the computed value has parameters and the return type is an entity or a collection, which allows these functions to be used like “SELECT*FROM Module.Function(10)”. It should be noted that, in addition to supporting scalars in parameters, future embodiments in which “M”->SQL supports entity and collection parameters to computed values may also be contemplated. Exemplary translations for module-level computed values are provided in Table T-12.
Query return types may also be supported. For table-valued functions and views, the returned rowset may have the same set of columns that the corresponding table would have. Therefore, in one aspect, any SQL query that can be done on a table of type T, can also be done with a view or function returning type T. The way various types may be represented in tables was discussed previously with respect to extents.
Referring next to expressions, “M”->SQL can translate most “M” expressions including operators (e.g., *, /, +, −), queries (e.g., “from person in People where person.Age>=21”), constants and initializers (e.g, “astring”, {1, 2, 3}), and other expressions such as member access and function calls (e.g., person.Age). These expressions can be used in several different contexts, including computed values, constraints, default values, and initializers. And although “M” supports a wide range of expressions over a wide range of types, some “M”->SQL embodiments may only support expressions that return scalars, entities and collections of scalars or entities (i.e., the same set of types that are supported by extent->table translation).
In an embodiment, it should be noted that expressions are composable—the result of one expression can be used in another expression. Hereinafter, because only a few of a plurality of compositions are discussed, the following statements will be used to mean any “M” expression that can fit into A and B:
Accordingly, wherever MyString.Count+10 is provided in “M”, the translated SQL will look like (LEN(MyString))+10. Query expressions may be similarly composed. For instance,
may take the SQL query expression for A and the SQL query expression for B and place them into those slots. Moreover, “Foos|Bars” may be turned into “(SELECT*FROM Foos) UNION (SELECT*FROM Bars)”.
With respect to constants, it should be noted that the majority of constants may be translated verbatim. An exemplary non-exhaustive list of “M” literals and their SQL equivalents is provided in Table T-13.
With respect to numeric expressions, it should be noted that numeric values have a relatively small number of operators that can act upon them. In one aspect, these operators may thus generally be translated verbatim to SQL. Exemplary translations for numeric operators are provided in Table T-14.
Exceptions to the above, however, may exist including: Integer8/16/32/64 and Unsigned8/16/32 divide. These operators may translate to “convert(decimal(x,6), A)/convert(decimal(x,6), B)” (where x is replaced by the smallest possible decimal value that can represent the entire range of values for the concrete integer type). Translations for Single and Double modulo operators may also be contemplated, as well as translations for string expression operators. Exemplary translations for string expression operators are provided in Table T-15.
Date expression operators including operators for expressions defined on Date, DateTime and Time, may also be supported. Exemplary translations for such date expression operators are provided in Table T-16.
Binary expression operators may also be supported. Exemplary translations for such binary expression operators are provided in Table T-17.
Byte expression operators may also be supported. Exemplary translations for such byte expression operators are provided in Table T-18.
GUID expression operators may also be supported. Exemplary translations for such GUID expression operators are provided in Table T-19.
Logical expression operators may also be supported. Exemplary translations for such logical expression operators are provided in Table T-20.
Conditional operators may also be supported. Exemplary translations for such conditional operators are provided in Table T-21.
Referring next to nullable expressions, in an embodiment, the == and != operators in “M” have the semantic that null ==null and null !=1. In SQL, both of these return “unknown,” a third truth value that may not be supported in “M”. In order to get the “M” semantics, when one side of an operator is a nullable type (for example, F(a:Integer32?, b:Integer32?) {a==b}), a special case logic may be generated that checks for null values to ensure that null == null and null !=1 turn out true. In an embodiment, this applies to all supported scalar types. In other embodiments, it should be noted that translations for nullable entity comparison may also be contemplated. It should be further noted that nullable comparison operators (<, >, <=, >=) operating on nullable integers may actually return Logical?: 1<=2 is true, 2<=1 is false, null <=null is null, and 1<=null is null, which may apply to all supported numeric types. Exemplary translations for such nullable expression operators are provided in Table T-22.
Explicit conversion operators may also be supported. An exemplary translation for such a conversion operator is provided in Table T-23.
Referring next to implicit conversion, it should be noted that although “M” allows logical values to intermingle with other scalars, SQL does not support such intermingling. Therefore, logical expressions like == may not be stored directly in an SQL table, or returned as a scalar or as the result of a query. In an embodiment, “bit” is used for that instead, which may require translating the result of == into a “1” or “0.” Likewise, logical values stored in tables as “bit”s may not be used directly in logical expressions like NOT, AND and OR (or where clauses of queries), so expr==1 may need to be used. So where necessary, expression conversion may automatically convert as described in Table T-24 below.
Referring next to global references, it should again be noted that “M”->SQL may translate computed values and extents into functions, views, and tables. In an embodiment, expressions are capable of referencing them as well. Within such embodiment, when an extent is used as an expression, the expression may always mean “get all the data out of this expression.” Exemplary translations for such global references are provided in Table T-25.
In an embodiment, translations for operations on collection expressions (i.e., expressions which work on collections) may also be supported. Exemplary translations for such operations are provided in Table T-26.
With respect to entity types from different tables, it should be noted that translations for Set Comparison (subset, equal, etc.). In, and Intersection may also be contemplated when used over entity types that come from different tables. Furthermore, with respect to entity types with different fields, translations for Union may be contemplated when used over entity types with different sets of fields (e.g., by using different tables).
In an embodiment, translations for operations specific to entity collections may also be supported. Exemplary translations for such operations are provided in Table T-27.
In an embodiment, translations for operations specific to logical collections may also be contemplated. Exemplary operations which may be specific to logical collections are provided in Table T-28.
In an embodiment, translations for operations specific to numeric collections may also be contemplated. Exemplary operations which may be specific to numeric collections are provided in Table T-29.
It should be noted that query modifiers may provide a more complex and feature-rich query syntax for collections, such as the “from . . . ” statement which has a number of clauses, each of which translates roughly separately from the others. Exemplary translations for such query modifiers are provided in Table T-30.
In Table T-30 above, it should be noted that the “from” expressions bring “ident” into scope as a variable. For scalar collections, future references to “ident” may resolve as [ident].[Item]. For entity collections, references to “ident” may resolve as (select [ident].<column1>, [ident].<column2, . . . ). It should be further noted that Join may support comparison of scalars and entities with identity, but not necessarily collections or other entities.
Next, a discussion of “result gatherers” is provided. Result gatherers are the terminus of a query, which may include “select”, “group”, or “accumulate.” “group” and “accumulate” are presently unsupported. An exemplary translation for a select operation is provided in Table T-31.
Here, it should be appreciated that the CROSS APPLY may be optimized out in simple cases (such as anonymous entities). Therefore, the following query:
Translates to this:
Which is then optimized to this:
In an embodiment, translations for query continuations may also be contemplated. Query continuations allow queries to be chained together, storing the result of a first query in a variable that can then be used in a second query. An exemplary query continuation operation is provided in Table T-32.
Translations for operations on entity expressions may also be supported. Here, it should be noted that entities in “M” are a single collection of field/value pairs (i.e., a row in a table). Exemplary translations for such operations are provided in Table T-33.
In an embodiment, member access in “M” operates over a single entity. Within such embodiment, member access may have several permutations depending on the type of the field being accessed. Exemplary translations for various member access operations are provided in Table T-34.
In an embodiment, entity and entity collection member access retrieves the entities, including all of their fields, from the table where the entities are stored. A full example is provided in Table T-35.
Here, it should be noted that other embodiments may also contemplate member access to Computed Columns.
In an embodiment, “M” supports the creation of anonymous collections in an expression (such as {1, 2, 4, 8, 16}). In one aspect, such creation is supported using queries and UNION ALL so that the collections can be returned from views and used anywhere expressions can be used. Exemplary translations of anonymous collections are provided in Table T-36.
Here, it should be further noted that embodiments may be contemplated in which “M”->SQL supports anonymous collections with entities of different types (different sets of columns) or collections of collections.
In an embodiment, “M” also supports the creation of simple entities in an expression (i.e. an anonymous group of name/value pairs). Such creation may be achieved by using the right side of a “from . . . select” query. In one aspect, “M”->SQL supports these by creating a SELECT statement, naming each field with the given name. Within such embodiment, if an entity reference is put as a column, that id of that entity will be assigned to the field. Exemplary translations of anonymous entities are provided in Table T-37.
In an embodiment, “M”->SQL may also support global built-in functions. An exemplary translation of such a global built-in function is provided in Table T-38.
It should also be appreciated that there are a plurality of “M” constructs that have no translation in “M”->SQL, including “type”, “import” and “export” which have no corresponding meaning in SQL. In an embodiment, these constructs are merely used to create the tables, views or functions, and are then discarded.
One of ordinary skill in the art can appreciate that the various embodiments described herein can be implemented in connection with any computer or other client or server device, which can be deployed as part of a computer network or in a distributed computing environment, and can be connected to any kind of data store. In this regard, the various embodiments described herein can be implemented in any computer system or environment having any number of memory or storage units, and any number of applications and processes occurring across any number of storage units. This includes, but is not limited to, an environment with server computers and client computers deployed in a network environment or a distributed computing environment, having remote or local storage.
Distributed computing provides sharing of computer resources and services by communicative exchange among computing devices and systems. These resources and services include the exchange of information, cache storage and disk storage for objects, such as files. These resources and services also include the sharing of processing power across multiple processing units for load balancing, expansion of resources, specialization of processing, and the like. Distributed computing takes advantage of network connectivity, allowing clients to leverage their collective power to benefit the entire enterprise. In this regard, a variety of devices may have applications, objects or resources that may cooperate to perform one or more aspects of any of the various embodiments of the subject disclosure.
Each object 1510, 1512, etc. and computing objects or devices 1520, 1522, 1524, 1526, 1528, etc. can communicate with one or more other objects 1510, 1512, etc. and computing objects or devices 1520, 1522, 1524, 1526, 1528, etc. by way of the communications network 1540, either directly or indirectly. Even though illustrated as a single element in
There are a variety of systems, components, and network configurations that support distributed computing environments. For example, computing systems can be connected together by wired or wireless systems, by local networks or widely distributed networks. Currently, many networks are coupled to the Internet, which provides an infrastructure for widely distributed computing and encompasses many different networks, though any network infrastructure can be used for exemplary communications made incident to the column based encoding and query processing as described in various embodiments.
Thus, a host of network topologies and network infrastructures, such as client/server, peer-to-peer, or hybrid architectures, can be utilized. The “client” is a member of a class or group that uses the services of another class or group to which it is not related. A client can be a process, i.e., roughly a set of instructions or tasks, that requests a service provided by another program or process. The client process utilizes the requested service without having to “know” any working details about the other program or the service itself.
In a client/server architecture, particularly a networked system, a client is usually a computer that accesses shared network resources provided by another computer, e.g., a server. In the illustration of
A server is typically a remote computer system accessible over a remote or local network, such as the Internet or wireless network infrastructures. The client process may be active in a first computer system, and the server process may be active in a second computer system, communicating with one another over a communications medium, thus providing distributed functionality and allowing multiple clients to take advantage of the information-gathering capabilities of the server. Any software objects utilized pursuant to the column based encoding and query processing can be provided standalone, or distributed across multiple computing devices or objects.
In a network environment in which the communications network/bus 1540 is the Internet, for example, the servers 1510, 1512, etc. can be Web servers with which the clients 1520, 1522, 1524, 1526, 1528, etc. communicate via any of a number of known protocols, such as the hypertext transfer protocol (HTTP). Servers 1510, 1512, etc. may also serve as clients 1520, 1522, 1524, 1526, 1528, etc., as may be characteristic of a distributed computing environment.
As mentioned, advantageously, the techniques described herein can be applied to any device where it is desirable to query large amounts of data quickly. It should be understood, therefore, that handheld, portable and other computing devices and computing objects of all kinds are contemplated for use in connection with the various embodiments, i.e., anywhere that a device may wish to scan or process large amounts of data for fast and efficient results. Accordingly, the below general purpose remote computer described below in
Although not required, embodiments can partly be implemented via an operating system, for use by a developer of services for a device or object, and/or included within application software that operates to perform one or more functional aspects of the various embodiments described herein. Software may be described in the general context of computer-executable instructions, such as program modules, being executed by one or more computers, such as client workstations, servers or other devices. Those skilled in the art will appreciate that computer systems have a variety of configurations and protocols that can be used to communicate data, and thus, no particular configuration or protocol should be considered limiting.
With reference to
Computer 1610 typically includes a variety of computer readable media and can be any available media that can be accessed by computer 1610. The system memory 1630 may include computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) and/or random access memory (RAM). By way of example, and not limitation, memory 1630 may also include an operating system, application programs, other program modules, and program data.
A user can enter commands and information into the computer 1610 through input devices 1640. A monitor or other type of display device is also connected to the system bus 1622 via an interface, such as output interface 1650. In addition to a monitor, computers can also include other peripheral output devices such as speakers and a printer, which may be connected through output interface 1650.
The computer 1610 may operate in a networked or distributed environment using logical connections to one or more other remote computers, such as remote computer 1670. The remote computer 1670 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, or any other remote media consumption or transmission device, and may include any or all of the elements described above relative to the computer 1610. The logical connections depicted in
As mentioned above, while exemplary embodiments have been described in connection with various computing devices and network architectures, the underlying concepts may be applied to any network system and any computing device or system in which it is desirable to compress large scale data or process queries over large scale data.
Also, there are multiple ways to implement the same or similar functionality, e.g., an appropriate API, tool kit, driver code, operating system, control, standalone or downloadable software object, etc. which enables applications and services to use the efficient encoding and querying techniques. Thus, embodiments herein are contemplated from the standpoint of an API (or other software object), as well as from a software or hardware object that provides column based encoding and/or query processing. Thus, various embodiments described herein can have aspects that are wholly in hardware, partly in hardware and partly in software, as well as in software.
The word “exemplary” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used in either the detailed description or the claims, for the avoidance of doubt, such terms are intended to be inclusive in a manner similar to the term “comprising” as an open transition word without precluding any additional or other elements.
As mentioned, the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. As used herein, the terms “component,” “system” and the like are likewise intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on computer and the computer can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
The aforementioned systems have been described with respect to interaction between several components. It can be appreciated that such systems and components can include those components or specified sub-components, some of the specified components or sub-components, and/or additional components, and according to various permutations and combinations of the foregoing. Sub-components can also be implemented as components communicatively coupled to other components rather than included within parent components (hierarchical). Additionally, it should be noted that one or more components may be combined into a single component providing aggregate functionality or divided into several separate sub-components, and that any one or more middle layers, such as a management layer, may be provided to communicatively couple to such sub-components in order to provide integrated functionality. Any components described herein may also interact with one or more other components not specifically described herein but generally known by those of skill in the art.
In view of the exemplary systems described supra, methodologies that may be implemented in accordance with the described subject matter will be better appreciated with reference to the flowcharts of the various figures. While for purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks, it is to be understood and appreciated that the claimed subject matter is not limited by the order of the blocks, as some blocks may occur in different orders and/or concurrently with other blocks from what is depicted and described herein. Where non-sequential, or branched, flow is illustrated via flowchart, it can be appreciated that various other branches, flow paths, and orders of the blocks, may be implemented which achieve the same or a similar result. Moreover, not all illustrated blocks may be required to implement the methodologies described hereinafter.
In addition to the various embodiments described herein, it is to be understood that other similar embodiments can be used or modifications and additions can be made to the described embodiment(s) for performing the same or equivalent function of the corresponding embodiment(s) without deviating therefrom. Still further, multiple processing chips or multiple devices can share the performance of one or more functions described herein, and similarly, storage can be effected across a plurality of devices. Accordingly, the invention should not be limited to any single embodiment, but rather should be construed in breadth, spirit and scope in accordance with the appended claims.
This application claims the benefit of U.S. Provisional Application Ser. No. 61/103,135, filed Oct. 6, 2008 entitled “SYSTEM AND METHOD FOR MAPPING A DOMAIN MODELING LANGUAGE TO A RELATIONAL STORE,” the entirety of which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
61103135 | Oct 2008 | US |