1. Technical Field
This application relates to methods and systems, generally referred to as systems, for defining declarative languages. More particularly, this application relates to a flexible approach for defining a declarative language using existing dynamic languages.
2. Related Art
Declarative languages, such as data definition languages (DDL), may be widely created and used by programmers to simplify a process by which a programmer can implement particularized tasks. For example, many declarative languages are developed to transfer data into and out of storage solutions and application programs, a process known as serialization and deserialization of data. By creating the declarative language, the programmer may subsequently invoke simple, high-level commands to retrieve and/or store complex data objects from and/or to a database. For example, Structured Query Language (SQL) commands, such as create, alter, and the like, provide an easy way for a programmer to work with data in a SQL database.
To create a new declarative language, however, a programmer may need to embark on an elaborate process that includes defining the syntax and semantics of the declarative language. Known processes may seem fairly straightforward, but they may be time consuming and require the programmer to know additional languages used to define the grammar of the new declarative language. Additionally, such processes may require additional processing time and resources to analyze and determine tokens and create a parse tree. Moreover, such processes may need to be repeated for every declarative language a programmer wishes to create. As a result, programmers may be unable to quickly and easily define a new declarative language. Additionally, programmers may be unable to reuse features and commands of existing declarative languages.
A method for generating source code for implementing a declarative language is disclosed. The method may include receiving a first set of information defining at least one data entity for use in a new declarative language, the at least one data entity having an associated type, receiving a second set of information defining translation requirements for translating the at least one data entity to a source code representation of the at least one data entity, and translating, based on the associated type of the at least one data entity and the translation requirements, the data entity into a source code representation of the data entity.
A system for generating source code for implementing a declarative language is disclosed. The system may include a standard data type translation file defining translation requirements for translating a plurality of standard data types into source code representations of the data types, an API analysis file that may define a standard data model for specifying a data entity for use with a new declarative language, and a translation tool adapted to, based on the standard data type translation file and the API analysis file, translate the data entity into a source code representation of the data entity.
These and other aspects are described with reference to the noted Figures and the below detailed description of the preferred embodiments.
Systems and methods, generally referred to as systems, are disclosed for defining a declarative language. Existing technologies may limit the manner in which programmers are able to define declarative languages by requiring lexical and syntactic analysis and corresponding specification files.
The program 110 may not be lexically or syntactically analyzed until the programmer also defines the grammar of the new declarative language using other languages, such as those of lex and yacc. As used in the art, lex is a tool for generating lexical analyzers that use pattern matching to break down the program into a series of lexical tokens. To determine the tokens, lex requires an additional specification file that defines the patterns corresponding to particular tokens. The token series may then be passed to a parser, such as one generated by the parser generation tool yacc, which also requires a specification file that defines the syntax of the declarative language in order to assemble the tokens into the parse tree 114.
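The pattern-matching step described above can be illustrated with a minimal sketch. It is written in Python rather than in a lex specification, and the token names and patterns are assumptions chosen for illustration, not rules taken from this application.

```python
import re

# Hypothetical token patterns; a lex specification file would declare similar rules.
TOKEN_PATTERNS = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*/();]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tokenize(text):
    """Break text into (kind, value) tokens using pattern matching."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Even this small tokenizer needs its own table of patterns, and a parser would need a separate grammar on top of it, which is the overhead the passage above attributes to the conventional approach.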
The systems described herein may allow a flexible way for programmers to quickly and easily define the data entities for use with the new declarative language and automatically generate code for implementing the new language without requiring the programmer to define the syntax of the new language. Additionally, the systems described herein can reuse existing code defining a declarative language using object oriented programming features. Although reference is made below to specific components of the system performing specific features, it should be apparent that such reference is exemplary, is not intended to limit the scope of the claims in any way, and that the functionalities described herein may be implemented in a virtually unlimited number of configurations.
In
The data entities of the new declarative language (new data entities) may be defined using complex and primitive data types of an existing programming language (existing data types). The existing data types may be standard data types of the existing language, or may be user defined data objects. An exemplary set of primitive and complex data types for use in defining a new declarative language and their corresponding C++ and JAVA equivalents are shown in Tables 1 and 2, respectively.
The existing programming language may be a dynamic programming language. As used herein, a dynamic programming language is a language which allows for, during runtime, the addition of new code, the extension of objects and definitions, and/or modification of the type system. An exemplary dynamic language for use in defining a declarative language is Python, created by Guido van Rossum and currently developed by the Python Software Foundation. By defining the new declarative language in terms of entities of a dynamic language, the system may allow for a more flexible manner of extending created declarative languages that have already been defined in terms of the dynamic language.
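As a brief illustration of the runtime properties named above, the following Python sketch adds a method to an existing class and creates a new type while the program is running. The names Record, describe, and IpInfo are hypothetical and are used here only for illustration.

```python
class Record:
    """A minimal object whose definition is extended at runtime."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


def describe(self):
    """Return the instance's field names in sorted order."""
    return sorted(vars(self))


# Extend an existing class with a new method after it was defined.
Record.describe = describe

# Create a brand-new type at runtime with the built-in type() constructor.
IpInfo = type("IpInfo", (Record,), {"doc": "hypothetical entity"})

rec = IpInfo(ip="10.0.0.1", num_urls=3)
print(rec.describe())  # prints ['ip', 'num_urls']
```

Because both the method and the subclass are created by ordinary statements, a language definition built this way can be extended later without a separate grammar or compilation step.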
After defining the new declarative language using the standard data types, predefined functions in the destination programming language for working with the standard types may be created at 214. The predefined functions may define standard functions for dealing with data of a given type, and may be used to implement any function desired of the new declarative language. In other words, the standard functions may include an API for defining the new language. Exemplary predefined functions for a declarative language for storing and retrieving data may include accessor and acceptor functions for use by a program 110 to access data, serialization and deserialization functions for working with a data store, and the like. Other functions for manipulating data in any way may also be used.
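The four kinds of predefined functions mentioned above can be sketched for a single 32-bit integer type. This is a minimal Python illustration under stated assumptions: the class name Int32Field, the method names, and the little-endian byte layout are hypothetical, and the sketch is not the destination-language code the application describes.

```python
import struct


class Int32Field:
    """Holds one 32-bit signed integer with four standard functions."""

    _FORMAT = "<i"  # little-endian int32; the byte order is an assumption

    def __init__(self, value=0):
        self.set(value)

    def get(self):
        """Accessor: return the current value."""
        return self._value

    def set(self, value):
        """Acceptor: store a value after checking the int32 range."""
        if not -2**31 <= value < 2**31:
            raise ValueError("value out of int32 range")
        self._value = value

    def serialize(self):
        """Return the value as four bytes for writing to a data store."""
        return struct.pack(self._FORMAT, self._value)

    @classmethod
    def deserialize(cls, data):
        """Rebuild a field from four bytes read from a data store."""
        (value,) = struct.unpack(cls._FORMAT, data)
        return cls(value)
```

A round trip such as `Int32Field.deserialize(Int32Field(7).serialize()).get()` returns the original value, which is the property the serialization and deserialization functions are expected to preserve.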
The predefined functions and data entity definitions may be used to generate code to implement the new declarative language in another programming language at 216. For example, class definitions may be created in the destination programming language that define the data entities and the functions used to manipulate those entities. These classes may then be easily incorporated into an existing application framework to extend the functionalities of the framework to utilize the newly created declarative language. Code to implement the new declarative language may be generated in any language, such as C, C++, C#, JAVA, and the like. Exemplary generated code in C++ for a data entity of the type “int32” named DataEntityOne is shown in Table 3.
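The generation step can be sketched as a small Python function that emits C++ text for one data entity. The type mapping, the function name generate_cpp_class, and the shape of the emitted class are assumptions made for illustration; they do not reproduce the mapping of Table 1 or the generated code of Table 3.

```python
# Hypothetical mapping from standard type names to C++ types.
CPP_TYPES = {"int32": "int32_t", "string": "std::string"}


def generate_cpp_class(name, type_name):
    """Return C++ source text for a class wrapping one data entity."""
    cpp_type = CPP_TYPES[type_name]
    return "\n".join([
        f"class {name} {{",
        "public:",
        f"    {cpp_type} get() const {{ return value_; }}",  # accessor
        f"    void set({cpp_type} v) {{ value_ = v; }}",      # acceptor
        "private:",
        f"    {cpp_type} value_{{}};",                        # value-initialized member
        "};",
    ])


print(generate_cpp_class("DataEntityOne", "int32"))
```

The emitted text would still need the appropriate C++ headers, such as `<cstdint>` for `int32_t`, before it compiles; the sketch shows only how an entity name and type drive the generated class body.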
Functions defining an API to the system may be created at 320. These functions may include the definition of a standard data model used to define the data entities of the new declarative language and functions for how to interpret the standard data model to generate code in the destination language based on the programmer defined data entities. An exemplary data model defining an API for defining a new declarative language is shown in
Optionally, views of the data contained in the table may also be created to provide a mechanism for collating information from multiple column sets. Typically, views may be used only to retrieve data, and therefore, only accessor functions for obtaining the data specified in the view may be generated in the destination programming language. Alternatively, any type of function, including acceptor functions, may also be generated in the destination programming language for views.
An exemplary table definition for a table including data relating to IP addresses is shown in Table 5. The table definition may include documentation information for the table. Additionally, the table definition may include a column set definition statement defining the column sets included in the table. For example, Table 5 includes an IpInfo columnset, a HostVec columnset, and a DomainVec columnset. The column set definitions may be included in the same file as the table definition, or may be included by reference as illustrated in Table 5. Finally, a view definition statement defining the views included in the table may also be included in the table definition.
Exemplary column set definitions are shown in Table 6. Each column set definition may describe the columns of the column set. Multiple column sets may be defined in a single file, or each column set may be defined in a separate file. Each column set definition may include documentation information relating to the column set, and a list of the columns included in the column set. For example, an IpInfo columnset may include columns defining an IP address, the number of Uniform Resource Locators (URLs) associated with the address, and the number of hosts associated with the address. The columns may be defined explicitly in the column set definition, as illustrated in Table 6, or may be included in separate files. For example, each column may be defined as having a particular data element having a name and a data type selected from the standard data types identified above. Additionally, attributes such as compression information that indicates which method if any is used to compress the columnsets, order information that indicates how the columns are ordered and the like may also be defined in the column set definition.
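One way to picture the column set structure described above is as a pair of small Python data classes. This is a sketch under stated assumptions, not the definition syntax of Table 6: the class names, the column names, the type names, and the compression default are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    type_name: str  # one of the standard data types; names here are placeholders


@dataclass
class ColumnSet:
    name: str
    doc: str = ""                          # documentation for the column set
    columns: list = field(default_factory=list)
    compression: str = "none"              # hypothetical attribute value


# An IpInfo column set with the three columns named in the text.
ip_info = ColumnSet(
    name="IpInfo",
    doc="Per-address summary",
    columns=[
        Column("ip", "string"),
        Column("num_urls", "int32"),
        Column("num_hosts", "int32"),
    ],
)
```

Keeping each column as a name and a type name is enough for a later step to look up the destination-language type and emit the matching functions.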
Once a standard data model is established for defining the data entities of the new declarative language, code may be defined for accessing and analyzing the standard data model. For example, code may be generated to interpret the table definitions (including column set and column definitions) described above, and generate the corresponding destination language code. Exemplary Python code including functions for analyzing a table definition is shown in Table 7.
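The analysis step can be sketched as a function that walks a table definition and checks each column against the known standard types. This is not the code of Table 7: the dictionary layout, the table name IpTable, the column names, and the set of type names are assumptions made for illustration.

```python
STANDARD_TYPES = {"int32", "string"}  # hypothetical subset of the standard types


def analyze_table(table):
    """Return (columnset, column, type) triples, rejecting unknown types."""
    rows = []
    for cs in table["columnsets"]:
        for col_name, type_name in cs["columns"]:
            if type_name not in STANDARD_TYPES:
                raise TypeError(
                    f"{cs['name']}.{col_name}: unknown type {type_name!r}"
                )
            rows.append((cs["name"], col_name, type_name))
    return rows


# Hypothetical table definition with two of the column sets named in the text.
table = {
    "name": "IpTable",
    "columnsets": [
        {"name": "IpInfo", "columns": [("ip", "string"), ("num_urls", "int32")]},
        {"name": "HostVec", "columns": [("host", "string")]},
    ],
}

for row in analyze_table(table):
    print(row)
```

A code generator could consume the resulting triples one at a time, emitting a destination-language member and its accessor for each column.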
Referring again to
An exemplary user specified data entity definition file and subsequently generated C++ code are shown in Tables 9, 10, and 11. As illustrated, a class definition may be created based on the defined data entity (Table 9) and placed in a C++ header file (Table 10). The underlying methods for manipulating the data may be included in a C++ source file, as shown in Table 11. In this example, the data definition is described in a file named “testrec.jr.” Additionally, the toolkit 500 is configured to generate class definitions based on a standard class definition for the standard API data model (referred to as std::Record) and to generate accessor and acceptor functions for retrieving and storing data by a program, as well as serialization and deserialization functions for retrieving and storing data to/from a data store.
It is therefore intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that it is the following claims, including all equivalents, that are intended to define the spirit and scope of this invention.