Information models are generated to model various combinations of data stored in a database. Information models are generated for various purposes such as for performing analysis, generating reports, etc. Examples of various types of information models include an attribute view, an analytical view, and a calculation view. Different editors are used to generate different types of information models. A user is required to specify a type of information model each time an information model is to be generated. Based upon the specified type of information model, a suitable editor is used to generate the information model. However, specifying the type of information model to be generated each time might be inconvenient. Further, if the user mistakenly specifies a wrong information model type, the information model cannot be later changed to another type during generation. Therefore, in order to change the type of the information model the user is required to start all over again which consumes effort and time.
The claims set forth the embodiments with particularity. The embodiments are illustrated by way of examples and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. The embodiments, together with their advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings.
Embodiments of techniques for generating information models are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of the embodiments. One skilled in the relevant art will recognize, however, that the embodiments can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail.
Reference throughout this specification to “one embodiment”, “this embodiment” and similar phrases, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one of the one or more embodiments. Thus, the appearances of these phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
An in-memory database is a database that typically relies on main memory for computation and storage. The in-memory database may be a relational database, an object-oriented database, or a hybrid of both. The in-memory database utilizes the maximum capabilities of the underlying hardware to increase performance. The ability to perform fast calculations on in-memory data allows fast ad-hoc reporting based on business requirements and business transaction data. Thus, complex analytics can be performed based on business requirements. Information required for processing is available in the main memory, so computation and read operations can be executed in the main memory without involving a hard disk input/output operation.
A database table, or table, is typically a two-dimensional data structure with cells organized in rows and columns. However, in the typical in-memory database, memory organization is linear. In a linear memory organization, data may be stored as a row store or a column store. In a row store, the fields of each row of the table are stored sequentially, one row after another, whereas in a column store, the fields of each column are stored together in contiguous memory locations. Row-based storage and column-based storage store data and metadata, which can be accessed by various components of the in-memory management system, such as a development tool, a query processing engine, and the like.
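The difference between the two linear layouts can be sketched as follows. This is a minimal illustration only: the sample table, its values, and the function names `to_row_store` and `to_column_store` are hypothetical and are not part of the described embodiments.

```python
# Hypothetical table with the columns (user ID, user name, city).
rows = [
    (1, "Ann", "Berlin"),
    (2, "Bob", "Paris"),
    (3, "Cid", "Rome"),
]


def to_row_store(table):
    """Linearize row by row: all fields of the first row, then the second, and so on."""
    return [field for row in table for field in row]


def to_column_store(table):
    """Linearize column by column: all values of the first column, then the second, and so on."""
    return [field for column in zip(*table) for field in column]


row_store = to_row_store(rows)
column_store = to_column_store(rows)
```

In the column layout, the values of one column sit next to each other, so a scan over a single column touches one contiguous run of the linear sequence.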
A data field is a container of data or a place where data can be stored within the database table. In one embodiment, the data fields refer to different columns of the database table. For example, in the database table related to users, columns such as ‘user ID,’ ‘user name,’ and ‘user address’ are referred to as data fields.
A quantifiable data field is a type of data field which comprises measurable data or data having a numerical value. For example, the data field ‘sales quantity’ may be defined as a ‘quantifiable’ data field because the ‘sales quantity’ is measurable data and includes numerical values. In one embodiment, the quantifiable data field is referred to as a ‘measure.’
A non-quantifiable data field is a type of data field which comprises non-measurable or non-numerical data. For example, the data field ‘user name’ may be defined as a ‘non-quantifiable’ data field because the ‘user name’ is not measurable. Typically, the non-quantifiable data field comprises descriptive data. In one embodiment, the non-quantifiable data field is referred to as an ‘attribute.’
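One possible way to represent the two kinds of data fields is sketched below. The class names `FieldKind` and `DataField` are assumptions introduced for illustration; the field names are taken from the examples above.

```python
from dataclasses import dataclass
from enum import Enum


class FieldKind(Enum):
    ATTRIBUTE = "attribute"  # non-quantifiable, descriptive data
    MEASURE = "measure"      # quantifiable, numerical data


@dataclass(frozen=True)
class DataField:
    name: str
    kind: FieldKind


fields = [
    DataField("user name", FieldKind.ATTRIBUTE),
    DataField("sales quantity", FieldKind.MEASURE),
]

# Collect the names of the quantifiable data fields (measures).
measures = [field.name for field in fields if field.kind is FieldKind.MEASURE]
```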
A model development environment (MDE) is a type of modeling tool used for generating models based on the business requirements. In one embodiment, the models are generated corresponding to the in-memory database. The MDE may be a front-end development tool such as, for example, ‘information modeler’ within SAP® high-performance analytic appliance (HANA®) studio. In one embodiment, the MDE may be an interactive development wizard, where a corresponding user input is requested at every step. Typically, the MDE provides a graphical user interface to the underlying in-memory database and is used at design-time to create models. Generally, design-time refers to the design or development phase prior to compiling the software code.
An information model, or model, is a design-time artifact which provides a logical view of the data and its relationships, using the underlying tables in the in-memory database. A design-time artifact is a by-product produced during the development of software, such as tables, views, functions, schemas, models, and the like. The model is generated using one or more database tables. In one embodiment, only some data fields of the database tables are selected for generating the model. The database tables are logically connected to generate the model. In one embodiment, the model may be one of the following types: an attribute view, an analytical view, or a calculation view.
An attribute view is a type of model which includes only attributes, i.e., non-quantifiable data fields. In one embodiment, the attribute view models an entity comprising the non-quantifiable data fields from one or more database tables. The entity may be modeled based on a relationship between the data fields of the one or more database tables. In one embodiment, the attribute view is termed a join type of column view.
An analytical view is a type of model which includes one or more quantifiable data fields (measures) from a single database table. Therefore, the analytical view models quantifiable data fields of the single database table. In one embodiment, the analytical view also includes at least one of one or more non-quantifiable data fields and one or more attribute views. In one embodiment, the analytical view is termed an on-line analytical processing (OLAP) type of column view.
A calculation view is a type of model which includes a plurality of quantifiable data fields (measures) from multiple database tables. Typically, the calculation view models a more advanced combination of data stored in the database. In one embodiment, the calculation view also includes at least one of one or more non-quantifiable data fields, one or more attribute views, and one or more analytical views. In one embodiment, the calculation view is termed a calculation type of column view.
The to-be-generated model 120 may be generated using a model development environment (MDE). In one embodiment, the MDE may be a front end development tool.
In one embodiment, the models are generated and stored within a package. The package may be created under another node ‘content’ 230 in the navigator 210. The ‘content’ 230 represents a design-time repository which holds all the packages and models created within the packages. In one embodiment, the models are organized within the package in the ‘content’ 230. A user (e.g., a business analyst) creates the package and then creates one or more models within the package. For example, the user may create the package 2 and a model ‘R’ within the package 2.
In one embodiment, for creating a new package, the user selects the ‘content’ 230. The ‘content’ 230 may be selected by right clicking the ‘content’ 230.
In one embodiment, the input field ‘name’ 410 is mandatory while the input field ‘description’ 420 is an optional field. In one embodiment, the mandatory field is marked with a suitable symbol such as ‘*.’ Once the ‘name’ 410 and/or ‘description’ 420 are provided, the user can select ‘OK’ 430 to complete the generation of the new package ‘my_demo.’ Once the package ‘my_demo’ is created, the package ‘my_demo’ is displayed within the ‘content’ 230, as shown in
Various types of models can be created within the package ‘my_demo.’ In one embodiment, referring to
In one embodiment, as illustrated in
In one embodiment, the user can hide one or more data fields of the selected tables in the model_M. Typically, the data field may be hidden by changing a property (e.g., visibility) of the data field. The visibility of the data field may be changed in a property window (not shown) of the data field. In one embodiment, the user can double click the data field to open its property window to change the visibility of the data field.
In one embodiment, the data fields of table 1 and table 2 displayed on the data foundation tab 910 can be categorized or classified into various types by the user. To specify the type of a data field, the user selects or right clicks the data field. For example, the user right clicks the data field ‘REGION_ID’ of table 1. Once the data field, e.g., ‘REGION_ID,’ is right clicked, a menu 1000 is displayed as shown in
In one embodiment, when the user makes the data field ‘REGION_ID’ the key attribute, a symbol (e.g., symbol ‘*’) may be prefixed to the data field ‘REGION_ID’ to show that the ‘REGION_ID’ is the key attribute. Similarly, a suitable symbol may also be prefixed to the data fields categorized as the attribute and the measure. For example, a symbol may be prefixed to the data field ‘SALES_AMOUNT’ to show that it is the measure.
Once the data fields are categorized as the key attribute, attribute, and measure, the user can select the logical view tab 920. In the logical view tab 920, the user can define a relationship between the tables, e.g., table 1 and table 2. The relationship may be defined by associating the data fields that the tables have in common. For example, the data field ‘REGION_ID’ of table 1 and the data field ‘REGION_ID’ of table 2 are associated or linked, as illustrated in
Once the relationship between the tables is defined, a placeholder for model_M is created and the model_M becomes visible in the ‘content’ 230. The user can activate model_M so that a real model corresponding to the model_M is generated. For activating model_M, the user right clicks the model_M within the ‘content’ 230.
Upon activation of model_M, the editor 110 identifies all the database tables, e.g., table 1 and table 2, associated with the model_M. Once table 1 and table 2 are identified, the editor 110 determines whether any data field of table 1 or table 2 is a quantifiable data field (measure). In case none of the data fields of the tables is a measure, the editor 110 determines that the model_M is of type 1, i.e., an attribute view. In case one or more data fields are quantifiable data fields, the editor 110 determines whether the quantifiable data fields are all from a single table. In case the quantifiable data fields, e.g., ‘SALES_AMOUNT,’ are from the same table, e.g., table 2, the editor 110 determines that the model_M is of type 2, i.e., an analytical view. In case the quantifiable data fields are from multiple tables, the editor 110 determines that the model_M is of type 3, i.e., a calculation view.
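The type determination described above can be sketched as a short function. This is an illustrative sketch under stated assumptions, not the editor's actual code: the names `ModelType` and `determine_model_type`, the dictionary representation of tables, and the data field ‘REGION_NAME’ are hypothetical.

```python
from enum import Enum


class ModelType(Enum):
    ATTRIBUTE_VIEW = 1    # type 1: no measures at all
    ANALYTICAL_VIEW = 2   # type 2: measures from a single table
    CALCULATION_VIEW = 3  # type 3: measures from multiple tables


def determine_model_type(tables):
    """Infer the model type from the categorized data fields.

    `tables` maps a table name to a dict of {data field name: category},
    where the category is 'key attribute', 'attribute', or 'measure'.
    """
    tables_with_measures = [
        name
        for name, fields in tables.items()
        if any(category == "measure" for category in fields.values())
    ]
    if not tables_with_measures:
        return ModelType.ATTRIBUTE_VIEW
    if len(tables_with_measures) == 1:
        return ModelType.ANALYTICAL_VIEW
    return ModelType.CALCULATION_VIEW


# model_M as in the example: the only measure, SALES_AMOUNT, is in table 2.
# REGION_NAME is a hypothetical extra field added for illustration.
model_m = {
    "table 1": {"REGION_ID": "key attribute", "REGION_NAME": "attribute"},
    "table 2": {"REGION_ID": "attribute", "SALES_AMOUNT": "measure"},
}
model_m_type = determine_model_type(model_m)  # ModelType.ANALYTICAL_VIEW
```

Because the type is derived from the categorized data fields at activation, recategorizing a field before activation changes the result without the user restating the model type.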
Once the type of the model_M is determined, the editor 110 automatically selects the section of code corresponding to the type of the model_M. Typically, each section of code corresponds to a type of the model or view. Various sections of code are predefined corresponding to the various model types.
For example, as the type of the model_M is analytical view, the editor 110 selects the section of code C2. The selected section of code, e.g., C2, is executed to generate the model corresponding to the model_M. The generated model is stored in a database. In one embodiment, the database is the in-memory database.
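Selecting and executing a predefined section of code per model type can be sketched as a lookup table of functions. The text above names only the section C2 for the analytical view; the sections `section_c1` and `section_c3`, the function `generate_model`, and the strings the sections return are assumptions made for illustration.

```python
def section_c1(model_name):
    # Hypothetical section of code for type 1 (attribute view).
    return f"attribute view generated for {model_name}"


def section_c2(model_name):
    # Section of code C2 for type 2 (analytical view); the body is a placeholder.
    return f"analytical view generated for {model_name}"


def section_c3(model_name):
    # Hypothetical section of code for type 3 (calculation view).
    return f"calculation view generated for {model_name}"


# Predefined mapping from model type to its section of code.
CODE_SECTIONS = {
    "attribute view": section_c1,
    "analytical view": section_c2,
    "calculation view": section_c3,
}


def generate_model(model_name, model_type):
    """Select the section of code for the given type and execute it."""
    section = CODE_SECTIONS[model_type]
    return section(model_name)


result = generate_model("model_M", "analytical view")
```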
In one embodiment, the editor 110 assigns an identifier to the generated model based upon its type. In one embodiment, the editor 110 assigns the identifier by tagging (e.g., prefixing or postfixing) the name of the model with the identifier. For example, when the model_M is of type ‘attribute view,’ the editor 110 may prefix the name ‘model_M’ with the identifier ‘ATR.’ The user can easily identify the type of a model by reading its identifier. For example, the user can easily make out that the model_M is of type ‘attribute view’ from its identifier ‘ATR.’ In one embodiment, the identifier for each model type is predefined. For example, the identifier for model type 1 (e.g., the attribute view) may be predefined as ‘ATR’ or ‘ATTR,’ the identifier for model type 2 (e.g., the analytical view) may be predefined as ‘AV,’ and the identifier for model type N (e.g., the calculation view) may be predefined as ‘CAL.’ In one embodiment, the identifier may include any alphanumeric characters, special symbols, or a combination thereof.
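A minimal sketch of the tagging step follows. The identifiers ‘ATR,’ ‘AV,’ and ‘CAL’ come from the example above; the function name `tag_model_name` and the underscore separator are assumptions, since the description does not say how the identifier is joined to the model name.

```python
# Predefined identifier for each model type, as in the example above.
IDENTIFIERS = {
    "attribute view": "ATR",
    "analytical view": "AV",
    "calculation view": "CAL",
}


def tag_model_name(name, model_type, position="prefix", separator="_"):
    """Tag a model name with the identifier for its type.

    The separator is a hypothetical choice made for readability.
    """
    identifier = IDENTIFIERS[model_type]
    if position == "prefix":
        return f"{identifier}{separator}{name}"
    if position == "postfix":
        return f"{name}{separator}{identifier}"
    raise ValueError(f"unknown position: {position!r}")


tagged = tag_model_name("model_M", "attribute view")
```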
In one embodiment, the models are stored or segregated in the database based upon their respective types or identifiers. For example, all the models of the type ‘attribute view’ or identifier ‘ATR’ are stored together. In one embodiment, the models having the same prefix or postfix are stored or segregated under the same category.
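Segregation by identifier can be sketched as grouping model names by their prefix. The sketch assumes, hypothetically, that each name carries its identifier as a prefix followed by an underscore; the model names and the function `segregate_by_identifier` are illustrative only.

```python
from collections import defaultdict


def segregate_by_identifier(model_names, separator="_"):
    """Group model names by the identifier prefixed to each name."""
    groups = defaultdict(list)
    for name in model_names:
        # Everything before the first separator is treated as the identifier.
        identifier, _, _ = name.partition(separator)
        groups[identifier].append(name)
    return dict(groups)


# Hypothetical tagged model names.
models = ["ATR_model_M", "AV_model_R", "ATR_model_S"]
categories = segregate_by_identifier(models)
```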
Embodiments described above enable generating different types of models using a single editor. Because there is a single editor, a user is not required to pre-specify the model type for selecting a suitable editor. Also, the user has the flexibility to switch from one model type to another during generation, as the model type is not pre-specified. Therefore, the editor is more generic, flexible, and efficient. Additionally, the editor automatically identifies the model type and segregates the models based upon their respective type or category so that the models can be easily reused later.
Some embodiments may include the above-described methods being written as one or more software components. These components, and the functionality associated with each, may be used by client, server, distributed, or peer computer systems. These components may be written in a computer language corresponding to one or more programming languages such as functional, declarative, procedural, object-oriented, lower-level languages, and the like. They may be linked to other components via various application programming interfaces and then compiled into one complete application for a server or a client. Alternatively, the components may be implemented in server and client applications. Further, these components may be linked together via various distributed programming protocols. Some example embodiments may include remote procedure calls being used to implement one or more of these components across a distributed programming environment. For example, a logic level may reside on a first computer system that is remotely located from a second computer system containing an interface level (e.g., a graphical user interface). These first and second computer systems can be configured in a server-client, peer-to-peer, or some other configuration. The clients can vary in complexity from mobile and handheld devices, to thin clients and on to thick clients or even other servers.
The above-illustrated software components are tangibly stored on a computer readable storage medium as instructions. The term “computer readable storage medium” should be taken to include a single medium or multiple media that store one or more sets of instructions. The term “computer readable storage medium” should be taken to include any physical article that is capable of undergoing a set of physical changes to physically store, encode, or otherwise carry a set of instructions for execution by a computer system which causes the computer system to perform any of the methods or process steps described, represented, or illustrated herein. Examples of computer readable storage media include, but are not limited to: magnetic media, such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs, DVDs and holographic indicator devices; magneto-optical media; and hardware devices that are specially configured to store and execute, such as application-specific integrated circuits (“ASICs”), programmable logic devices (“PLDs”) and ROM and RAM devices. Examples of computer readable instructions include machine code, such as produced by a compiler, and files containing higher-level code that is executed by a computer using an interpreter. For example, an embodiment may be implemented using Java, C++, or other object-oriented programming language and development tools. Another embodiment may be implemented in hard-wired circuitry in place of, or in combination with, machine readable software instructions.
A data source is an information resource. Data sources include sources of data that enable data storage and retrieval. Data sources may include databases, such as, relational, transactional, hierarchical, multi-dimensional (e.g., OLAP), object oriented databases, and the like. Further data sources include tabular data (e.g., spreadsheets, delimited text files), data tagged with a markup language (e.g., XML data), transactional data, unstructured data (e.g., text files, screen scrapings), hierarchical data (e.g., data in a file system, XML data), files, a plurality of reports, and any other data source accessible through an established protocol, such as, Open Database Connectivity (ODBC), produced by an underlying software system, e.g., an ERP system, and the like. Data sources may also include a data source where the data is not tangibly stored or otherwise ephemeral such as data streams, broadcast data, and the like. These data sources can include associated data foundations, semantic layers, management systems, security systems and so on.
In the above description, numerous specific details are set forth to provide a thorough understanding of embodiments. One skilled in the relevant art will recognize, however, that the one or more embodiments can be practiced without one or more of the specific details or with other methods, components, techniques, etc. In other instances, well-known operations or structures are not shown or described in detail.
Although the processes illustrated and described herein include series of steps, it will be appreciated that the different embodiments are not limited by the illustrated ordering of steps, as some steps may occur in different orders, some concurrently with other steps apart from that shown and described herein. In addition, not all illustrated steps may be required to implement a methodology in accordance with the one or more embodiments. Moreover, it will be appreciated that the processes may be implemented in association with the apparatus and systems illustrated and described herein as well as in association with other systems not illustrated.
The above descriptions and illustrations of embodiments, including what is described in the Abstract, are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed. While specific embodiments of, and examples for, the embodiments are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the embodiments, as those skilled in the relevant art will recognize. These modifications can be made to the embodiments in light of the above detailed description. Rather, the scope of the one or more embodiments is to be determined by the following claims, which are to be interpreted in accordance with established doctrines of claim construction.