The field generally relates to data processing and database systems.
Enterprise data is generated at a high speed mostly through transactional systems used by companies. Organizations can gain business value by exploring and analyzing transactional data, which may be generated within the enterprise or from other raw data of internal or external sources (e.g. social media). Business entities may utilize database offering to store big data. Such data may have little meaning without appropriate data analysis. Some business entities may utilize platforms that include databases to handle big data and to perform real time analyses. Such platforms may involve advanced processing features, e.g., of both the platforms and the associated databases. At the databases, object instances may be stored. The object instances may be created based on a data model, including definitions of attributes of the objects. Object instances may be organized in a hierarchical manner according to hierarchy criteria, such as location, time, enterprise organization, etc.
The claims set forth the embodiments with particularity. The embodiments are illustrated by way of examples and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. The embodiments, together with their advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings.
Embodiments of techniques for providing support for hierarchy member selections in queries based on relational databases are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of the embodiments. One skilled in the relevant art will recognize, however, that the embodiments can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail.
Reference throughout this specification to “one embodiment”, “this embodiment” and similar phrases, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one of the one or more embodiments. Thus, the appearances of these phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In one embodiment, a semantic layer may be created to reside between a database storing enterprise data and an end user. The semantic layer may be created to provide a representation of data stored in the database. The semantic layer may be represented as a universe, supporting the user to interact with the stored data in the database, without having to know the complexities of the database or where the data is stored. The universe may facilitate the retrieval of data of interest by the user. A universe may be generated for a data source including a multi-dimensional array of data, such as an Online Analytical Processing (OLAP) cube, or from a query. In this case, the universe may be created automatically from a selected connection to a data source. A universe may be also created by selecting multiple tables and columns from various relational data sources. Once the universe is created, it can be exported to a repository and may be available for user requests to run queries or to create reports.
In one embodiment, relational databases, such as Structured Query Language (SQL) databases, are used by customers to generate reports and to perform data analysis. Some relational databases do not support hierarchy metadata for stored object instances that can be created in a standardized and/or non-database specific manner. A universe from the universes 140 may be created by selecting a data source connection to the DB 120. The DB 120 may be a relational database that do not support hierarchy metadata related to stored object instances. The universe creation process may be automated and started once the connection is selected. This universe creation process may be also started after selecting tables and columns from the relational data source. The structure of the connected data source is mapped to objects from the universe. For example, the universe may include objects defining classes, measures, dimensions, details, etc. The universe structure may be accessed through the design tool 105. The design tool 105 may provide a user interface (UI) including a universe pane. Once a universe is created, the universe may be modified, for example, through the UI of the design tool 105.
In one embodiment, during an activity of analyzing a data set from the DB 120, different aspects with regards to columns in a data set may be taken into account. Classifications of the columns of the database tables in the DB 120 may be determined. For example, the classifications may be related to a data type, an analysis type, an aggregation type, a format, a category, a hierarchy, etc., of the data stored in a column. The columns in the database tables may be defined as attributes of the data objects. Data for data object instances are stored in the database tables of the DB 120. An attribute (e.g. a column) may be classified as a measure or a dimension.
In one embodiment, columns that include qualitative data may be classified as dimensions, and columns that include quantitative or numeric data may be classified as measures For example, when a column includes data related to an amount of money, for example, for received month revenue, then the column may be classified as a measure. In another example, when a column includes data related to a geographical location. e.g. France, then the column may be classified as a dimension. If a column qualifies as a measure, an appropriate aggregation type may be determined. If a column is classified as a dimension, the column may be categorized, e.g. in a geographical or time category. A logical correlation for geographical and time categorized columns may be determined and different hierarchies may be defined. The classification of the columns in the data set may be stored as metadata entries in a corresponding database schema. The data stored in the DB 120 may include hierarchical data for data objects.
The universes 140 represent a dimensional semantic layer that resides between an organization's database, such as the DB 120, and an end user, who performs analysis and interacts with the data residing in the DB 120. A universe from the universes 140 may be created as a business representation of a data warehouse or a transactional database. The definition of universes 140 allows users to interact with data without having to know the complexities of the underlying database or where the data are stored. The universes 140 may be created using familiar to the user business terminology to describe a business environment of an enterprise working with data from the DB 120. A universe of the universes 140 may be created in relation to the connection to the DB 120. A universe may include a definition of a connection parameter to a data structure, such as the DB 120. The universes 140 include objects 160 that map to actual data structures in correspondingly connected databases. The objects 160 may be grouped into classes and subclasses. A universe from the universes 140 may include a schema of tables and joins from the corresponding DB (e.g., DB 120). The objects 160 in the universe may be built based on tables that are included in the schema.
Hierarchies may be detected in relation to dimensions such as time and geographical dimensions. For example, a dimension storing year values and a dimension storing month values may be determined to be part of one hierarchy, where “year” is at a higher level than “month” level. The universes 140 may be modified to include hierarchy characteristics of data from the correspondingly associated database connections. For example, hierarchies for the different dimensions of objects may be detected from a DB such as DB 120. Based on a definition of such hierarchies, hierarchy objects 150 may be generated correspondingly for the universes 140. The creation of the hierarchy objects 150 may be performed through the design tool 105. After the hierarchy objects 150 are generated, they may be stored in corresponding universes in the repository 130. In such a manner, a hierarchy object is created inside a relational universe without adding new metadata at database level. When the hierarchy objects 150 are published in the repository 130 as part of the universes 140, the hierarchy objects 150 may be used for querying data from corresponding databases, such as DB 120, and provide a report (e.g., report 170) as a result. The report 170 may be generated through a query, created within the design tool 105, or from a Business Intelligence (BI) tool for generating reports. The universes 140 including hierarchy objects 150 define a dimensional semantic layer that integrates the concept of storing multi-dimensional data with the concept of hierarchical organization inside universes.
When generating a report, such as report 170, the information for the universes 140 may be consumed by the design tool 105, or by other business intelligence tools for generating reports. Through the design tool 105, hierarchical reports may be generated as outputs based on universes 140, which include hierarchy objects 150. For example, reports with fold and unfold capabilities that support expanding or hiding sections of a report without having to drill down. For example, the generated reports may include level totals, hierarchical columns and graphics, hierarchical formulas, etc. Such reports may facilitate better data analysis on datasets with drill-through capabilities supporting opening a report by clicking a link within another report, and/or navigating through multi-dimensional data. The report 170 may be generated by running powerful hierarchical queries including member selections. The generated hierarchical queries may be based on the provided hierarchy objects 150 in the universes 140.
At 220, a hierarchy object is created. The hierarchy object defines a hierarchy of objects from the universe. The hierarchy object may be such as the hierarchy objects 150,
In one embodiment, a request to provide the updated universe from the repository to the information design tool may be received. The request may be provided by a user, requesting to generate a report, such as the report 170,
An information design tool 325 may be related to the BOP 305. Through the information design tool 325, universes 320 may be invoked, reviewed, and modified. The information design tool 325 may correspond to the design tool 105,
In one embodiment, a user may select a universe from the universes 320, and import it on a local storage through the information design tool 325. The selected universe is imported and a new hierarchy object may be created on top of the universe. The created new hierarchy object may be shared and published into the enterprise repository as part of the universe in the universes 320 section. Once it is published in the enterprise repository, the updated universe including the new hierarchy object becomes accessible for generating business intelligence (BI) reports through reporting tools associated with the BOP 305. The BOP 305 is associated with a BI tool 340, where reports may be created and queries for data may be generated. The reports requested through the BI tool 340 may be based on queries including universe hierarchy objects, as in the universes 320.
The BOP 305 includes a report server 345, which is an environment for running reports. The report server 345 includes a repo proxy 350, a report engine 355, and an information engine 360. The information engine 360 communicates with the BI tool 340 when a request for generating a report is received. The information engine 360 interacts with the report engine 355. The report engine 355 provides information associated with relevant data and columns associated with generated reports. The information engine 360 communicates with the universes 320 to receive information for relevant universes, as requested through the BI tool 340. The information engine 360 is an engine of the semantic business layer of the BOP 305 and defines queries, such as SQL queries. The defined queries in the information engine 360 may be executed on corresponding databases from the database layer 370. The information engine 360 is communicating with the universes 320 to receive universe definitions. The Information Engine 360 includes implemented logic to interpret universes including hierarchy objects, while generating queries including hierarchy member selections. At runtime, the queries defined in the BI tool 340 for requesting generation of reports may be transmitted to the information engine 360 and executed therein. Once the reports are executed and created, they may be communicated to the enterprise repository through the repo proxy 350. Generated reports may be stored in the reports 315 section of the enterprise repository.
In one embodiment, when a request for a report generation is received at the BI tool 340, there may be a correspondingly defined query in the BI tool 340. When the requested query includes hierarchy member selections based on hierarchy objects from a relevant universe, the information engine 360 may first load the relevant universe from the enterprise repository in order to get the definition of the universe hierarchy objects, and a data connection type to a data source related to the universe. According to a relational connection type, the Information Engine 360 may generate a native query script according to the query specification and the hierarchy member selections in the query. The native query script may be executed on the relevant data source (e.g. a DB) from the database layer 370 in order to retrieve query results data sets. Once the query result data set is retrieved from the data source, the data may be formatted according to user-predefined settings and may be returned to the client BI tool. A metadata definition for hierarchy objects associated with the query and the requested report may also be transmitted to the report engine 355 for defining reporting workflows. For example, such workflows may include a collapse or expand options, a fold and unfold options, a use of hierarchy formulas, other.
In one embodiment, a user may create a document on top of a universe X. A query may be created and hierarchy object A may be selected. Together with the selected hierarchy object A, a measure from the universe objects may also be selected and added in the query panel in relation to generating a result object. A hierarchy member selection may be created on the selected hierarchy object A. The selection of a hierarchy member may be performed through a member selector tool, where available members may be selected in set expressions. The query may be run and hierarchy properties of the hierarchy may be included. The hierarchy properties may correspond to the described hierarchy Properties 435,
On
Some embodiments may include the above-described methods being written as one or more software components. These components, and the functionality associated with each, may be used by client, server, distributed, or peer computer systems. These components may be written in a computer language corresponding to one or more programming languages such as, functional, declarative, procedural, object-oriented, lower level languages and the like. They may be linked to other components via various application programming interfaces and then compiled into one complete application for a server or a client. Alternatively, the components maybe implemented in server and client applications. Further, these components may be linked together via various distributed programming protocols. Some example embodiments may include remote procedure calls being used to implement one or more of these components across a distributed programming environment. For example, a logic level may reside on a first computer system that is remotely located from a second computer system containing an interface level (e.g., a graphical user interface). These first and second computer systems can be configured in a server-client, peer-to-peer, or some other configuration. The clients can vary in complexity from mobile and handheld devices, to thin clients and on to thick clients or even other servers.
The above-illustrated software components are tangibly stored on a computer readable storage medium as instructions. The term “computer readable storage medium” should be taken to include a single medium or multiple media that stores one or more sets of instructions. The term “computer readable storage medium” should be taken to include any physical article that is capable of undergoing a set of physical changes to physically store, encode, or otherwise carry a set of instructions for execution by a computer system which causes the computer system to perform any of the methods or process steps described, represented, or illustrated herein. A computer readable storage medium may be a non-transitory computer readable storage medium. Examples of a non-transitory computer readable storage media include, but are not limited to: magnetic media, such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs, DVDs and holographic devices; magneto-optical media; and hardware devices that are specially configured to store and execute, such as application-specific integrated circuits (“ASICs”), programmable logic devices (“PLDs”) and ROM and RAM devices. Examples of computer readable instructions include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter. For example, an embodiment may be implemented using Java, C++, or other object-oriented programming language and development tools. Another embodiment may be implemented in hard-wired circuitry in place of, or in combination with machine readable software instructions.
A data source is an information resource. Data sources include sources of data that enable data storage and retrieval. Data sources may include databases, such as, relational, transactional, hierarchical, multi-dimensional (e.g., OLAP), object oriented databases, and the like. Further data sources include tabular data (e.g., spreadsheets, delimited text files), data tagged with a markup language (e.g., XML data), transactional data, unstructured data (e.g., text files, screen scrapings), hierarchical data (e.g., data in a file system, XML data), files, a plurality of reports, and any other data source accessible through an established protocol, such as, Open DataBase Connectivity (ODBC), produced by an underlying software system (e.g., ERP system), and the like. Data sources may also include a data source where the data is not tangibly stored or otherwise ephemeral such as data streams, broadcast data, and the like. These data sources can include associated data foundations, semantic layers, management systems, security systems and so on.
In the above description, numerous specific details are set forth to provide a thorough understanding of embodiments. One skilled in the relevant art will recognize, however that the embodiments can be practiced without one or more of the specific details or with other methods, components, techniques, etc. In other instances, well-known operations or structures are not shown or described in detail.
Although the processes illustrated and described herein include series of steps, it will be appreciated that the different embodiments are not limited by the illustrated ordering of steps, as some steps may occur in different orders, some concurrently with other steps apart from that shown and described herein. In addition, not all illustrated steps may be required to implement a methodology in accordance with the one or more embodiments. Moreover, it will be appreciated that the processes may be implemented in association with the apparatus and systems illustrated and described herein as well as in association with other systems not illustrated.
The above descriptions and illustrations of embodiments, including what is described in the Abstract, is not intended to be exhaustive or to limit the one or more embodiments to the precise forms disclosed. While specific embodiments of, and examples for, the one or more embodiments are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the one or more embodiments, as those skilled in the relevant art will recognize. These modifications can be made in light of the above detailed description. Rather, the scope is to be determined by the following claims, which are to be interpreted in accordance with established doctrines of claim construction.