The field generally relates to the software arts, and, more specifically, to methods and systems for applying hierarchy information to data items.
A hierarchy is an arrangement of entities in which the entities are represented as being above, below, or at the same level one to another. The hierarchy is simply an ordered set or an acyclic graph. The entities in the hierarchy can be linked directly or indirectly, vertically or horizontally. A system that is largely hierarchical can also include alternative hierarchies. Indirect hierarchical links can extend the hierarchy vertically upwards or downwards via multiple links in the same direction, following a path. All parts of the hierarchy which are not linked vertically one to another can be associated horizontally through a path or a level. A hierarchy can be a nested hierarchy, when it contains a hierarchy of hierarchies.
In a hierarchy, the data can be organized in a tree structure. A data model in which the data is organized in a tree structure is a hierarchical data model. The structure allows repeating information using parent/child relationships: each parent can have many children, but each child has only one parent. All attributes of a specific record are listed under an entity type. In a database, an entity type is the equivalent of a table; each individual record is represented as a row and each attribute as a column. Entity types are related to each other using one-to-many relationships, also known as 1:N mapping. For example, an organization could store employee information in a table that contains attributes/columns such as employee number, first name, last name, and department number. The organization provides each employee with computer hardware as needed, but computer equipment may only be used by the employee to whom it is assigned. The organization could store the computer hardware information in a separate table that includes each part's serial number, type, and the employee that uses it. In this model, the employee data table represents the “parent” part of the hierarchy, while the computer table represents the “child” part of the hierarchy. Each employee may possess several pieces of computer equipment, but each individual piece of computer equipment may have only one employee owner.
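The parent/child relationship between the employee table and the computer hardware table can be sketched as follows. This is only a minimal illustration in Python; the class names, field names, and sample rows are hypothetical and are not taken from any particular embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    # Parent record: one row of the (hypothetical) employee table.
    employee_number: int
    first_name: str
    last_name: str
    department_number: int


@dataclass(frozen=True)
class ComputerEquipment:
    # Child record: references exactly one owning employee.
    serial_number: str
    equipment_type: str
    employee_number: int


def equipment_by_employee(employees, equipment):
    """Group child rows under their parent row (a 1:N mapping)."""
    owned = {e.employee_number: [] for e in employees}
    for item in equipment:
        owned[item.employee_number].append(item)
    return owned


# Illustrative sample data only.
employees = [
    Employee(1, "Ada", "Lovelace", 10),
    Employee(2, "Alan", "Turing", 20),
]
equipment = [
    ComputerEquipment("SN-001", "laptop", 1),
    ComputerEquipment("SN-002", "monitor", 1),
    ComputerEquipment("SN-003", "laptop", 2),
]
owned = equipment_by_employee(employees, equipment)
```

Each employee number maps to a list of equipment rows, while each equipment row carries a single employee number, which mirrors the rule that a parent can have many children but a child has only one parent.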
Often, there is a need to organize given data items into a hierarchy or hierarchies. The data items themselves may not contain the information necessary to create the desired hierarchy. Some databases provide hierarchy information for the data they contain and can apply this hierarchy information to the results of queries to the database. However, these hierarchies are defined within the system, are tightly coupled to the data to which they apply, and originate from the same source as the data. In addition, the hierarchical relationships are often encoded in the relational data itself. For example, a data item may have a field that references the parent's identifier (ID) value. Again, the hierarchy information is bound to the data itself.
Various embodiments of systems and methods for applying hierarchy information to data items are described herein. In various embodiments, the method includes loading an external hierarchy structure including a plurality of entities as nodes of the external hierarchy structure, wherein each entity in the plurality is described with a first set of properties. A property is identified from the first set of properties that is common for the plurality of entities and for a plurality of data items, wherein each data item is described with a second set of properties. Then, the plurality of data items is sorted according to a value of the property, and one or more data items are identified from the plurality of data items that correspond to an entity from the external hierarchy structure based on the value of the property. Finally, the entity from the external hierarchy structure is linked to the one or more data items.
In various embodiments, the system includes an external hierarchy structure including a plurality of entities as nodes, wherein each entity in the plurality is described with a first set of properties. Further, the system includes a database storage unit for storing a plurality of data items, wherein each data item is described with a second set of properties. Also, the system includes a processor in communication with the database storage unit, the processor to load the external hierarchy structure and identify a property from the first set of properties that is common for the plurality of entities and for the plurality of data items. The processor also sorts the plurality of data items according to a value of the property and identifies one or more data items from the plurality of data items that correspond to an entity from the external hierarchy structure based on the value of the property. Finally, the processor links the entity from the external hierarchy structure to the one or more data items.
These and other benefits and features of embodiments of the invention will be apparent upon consideration of the following detailed description of preferred embodiments thereof, presented in connection with the following drawings.
The claims set forth the embodiments of the invention with particularity. The invention is illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. The embodiments of the invention, together with its advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings.
Embodiments of techniques for applying hierarchy information to data items are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
Reference throughout this specification to “one embodiment”, “this embodiment” and similar phrases means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of these phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments, an externally provided hierarchy definition can be used to organize the data items into a hierarchy following the structural organization of this externally provided hierarchy.
Further, the entities of external hierarchy definition structure 100 are linked to the corresponding data items. For example, parent entity 110 has ID=005 and data item 230 has ID=005; thus, entity 110 is linked to data item 230. One hierarchy entity may be linked to more than one data item. As a result, parent entity 105 is linked to data item 255; child entity 115 is linked to data item 240; child entity 120 is linked to data items 245 and 265; child entity 125 is linked to data items 235 and 270; and child entity 130 is linked to data items 250 and 260.
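The linking of entities to data items through a shared property value can be illustrated with a short Python sketch. The function name and the dictionary layout are hypothetical. The pairing of entity 110 with data item 230 through ID=005 follows the example above; the ID value "007" used for entity 120 and data items 245 and 265 is an assumption made only so that the one-to-many case can be shown.

```python
from collections import defaultdict


def link_entities_to_items(entities, items, key="id"):
    """Link each entity to every data item sharing its value of `key`."""
    items_by_value = defaultdict(list)
    for item in items:
        items_by_value[item[key]].append(item["ref"])
    # An entity with no matching data item is linked to an empty list.
    return {e["ref"]: items_by_value.get(e[key], []) for e in entities}


# "ref" holds the reference numeral; the ID "007" is a hypothetical value.
entities = [
    {"ref": 110, "id": "005"},
    {"ref": 120, "id": "007"},
]
items = [
    {"ref": 230, "id": "005"},
    {"ref": 245, "id": "007"},
    {"ref": 265, "id": "007"},
]
links = link_entities_to_items(entities, items)
```

Under these assumptions, entity 110 is linked to data item 230 only, and entity 120 is linked to both data items 245 and 265.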
Regardless of which option is used to handle orphaned data items, information about the data items that were orphaned and the corresponding solution is recorded. This allows clients using the resulting hierarchy of data items to handle these portions of the hierarchy in an appropriate fashion. For example, dummy data items or bridged connections could be displayed differently to make the user aware of the discrepancy between the data items and the hierarchy definition.
At block 705, an external hierarchy structure is loaded. The external hierarchy structure is a hierarchy definition that is external to the data items that need to be organized. The external hierarchy definition structure (e.g., hierarchy structure 100) can be loaded from a location or system that is separate from the one containing the data items. The external hierarchy structure contains a number of entities as nodes of the hierarchy. The hierarchy definition contains information (properties and values) defining the relationship between the entities in the hierarchy. At block 710, a plurality of data items is loaded from a database table. The data items in the plurality are not ordered in any fashion. Each data item is described and stored in the database with a set of properties. At least one property value of a data item should match a property value of an entity from the external hierarchy structure so that these two elements can be linked. At block 715, a common property is identified for the entities of the external hierarchy structure and the plurality of data items. At block 720, the data items in the plurality are sorted according to the value of the common property for each data item. At block 725, the plurality of data items is searched based on the common property value to find the range of items corresponding to each entity in the external hierarchy structure.
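Blocks 720 and 725 can be sketched in Python as follows. The description does not state which search technique is used; this sketch assumes a binary search over the sorted values, implemented with the standard `bisect` module. The function name `find_ranges` and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def find_ranges(entity_values, items, key):
    """Sort items by the common property, then find each entity's range."""
    # Block 720: sort the data items by the common property value.
    sorted_items = sorted(items, key=key)
    keys = [key(item) for item in sorted_items]
    ranges = {}
    for value in entity_values:
        # Block 725: the half-open range [lo, hi) holds all items with `value`.
        lo = bisect_left(keys, value)
        hi = bisect_right(keys, value)
        ranges[value] = sorted_items[lo:hi]
    return ranges


# Unordered (name, common property value) pairs; illustrative data only.
data_items = [("c", 3), ("a", 1), ("b", 3), ("d", 2)]
ranges = find_ranges([1, 3, 9], data_items, key=lambda item: item[1])
```

Sorting once and then searching per entity avoids scanning the full set of data items for every entity. An entity whose value does not occur among the data items (the value 9 here) simply receives an empty range.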
At block 730, one or more data items are identified to correspond to an entity in the external hierarchy structure based on the common property value. At block 735, the corresponding entity is linked to the identified one or more data items. Every entity that shares the same value of the common property with one or more data items is linked to those data items. At block 740, the data items in the plurality are sorted according to the hierarchy organization of the external hierarchy definition structure. As a result, a hierarchy of data items is produced as a structure following the hierarchy of the external hierarchy structure. The new hierarchy of data items also follows the relationship of the entities and the depth of dependencies of the external hierarchy structure. At block 745, the orphaned data items are handled according to the user's preferences. In some embodiments, for data items with no corresponding entity in the external hierarchy structure, the user can choose either to discard the data items or to treat them as if they corresponded to a top-level node in the hierarchy. In other embodiments, the gap produced by a missing data item may be bridged or may be filled by creating a dummy data item. At block 750, operations may be performed on the data items in the new structure. Examples of such operations are report processing operations. Data is extracted from a data source as specified by a report schema, which also specifies how the data is to be processed and formatted. In some embodiments, the report is a business intelligence (BI) document such as a Crystal Report® or SAP® BusinessObjects™ Web Intelligence® report.
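Blocks 740 and 745 can be illustrated with the following Python sketch, which is one possible reading rather than a definitive implementation. It assumes the external hierarchy is available as a mapping from each entity to its children and that the links from block 735 are available as a mapping from each entity to its data items. The function name, the policy names "discard" and "top_level", and the sample data are hypothetical. Only the two options for data items with no corresponding entity are shown; bridging and dummy data items are not covered.

```python
def build_item_hierarchy(children, root, links, all_items, orphan_policy="discard"):
    """Order data items by the external hierarchy and handle orphans."""
    ordered = []   # (depth, item) pairs in depth-first hierarchy order
    linked = set()

    def visit(entity, depth):
        # Block 740: emit the items linked to this entity, then its children.
        for item in links.get(entity, []):
            ordered.append((depth, item))
            linked.add(item)
        for child in children.get(entity, []):
            visit(child, depth + 1)

    visit(root, 0)

    # Block 745: items with no corresponding entity are orphans.
    orphans = [item for item in all_items if item not in linked]
    if orphan_policy == "top_level":
        ordered.extend((0, item) for item in orphans)
    elif orphan_policy != "discard":
        raise ValueError(f"unknown orphan policy: {orphan_policy}")
    # The orphans are returned so that clients can see what was affected.
    return ordered, orphans


# Illustrative data only.
children = {"root": ["a", "b"]}
links = {"root": ["r1"], "a": ["a1", "a2"], "b": []}
all_items = ["a1", "x", "r1", "a2"]

kept, dropped = build_item_hierarchy(children, "root", links, all_items)
promoted, recorded = build_item_hierarchy(
    children, "root", links, all_items, orphan_policy="top_level"
)
```

Returning the list of orphaned items alongside the ordered result keeps a record of which data items were affected and lets a client present them differently if desired.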
Some embodiments of the invention may include the above-described methods being written as one or more software components. These components, and the functionality associated with each, may be used by client, server, distributed, or peer computer systems. These components may be written in a computer language corresponding to one or more programming languages such as functional, declarative, procedural, object-oriented, lower-level languages, and the like. They may be linked to other components via various application programming interfaces and then compiled into one complete application for a server or a client. Alternatively, the components may be implemented in server and client applications. Further, these components may be linked together via various distributed programming protocols. Some example embodiments of the invention may include remote procedure calls being used to implement one or more of these components across a distributed programming environment. For example, a logic level may reside on a first computer system that is remotely located from a second computer system containing an interface level (e.g., a graphical user interface). These first and second computer systems can be configured in a server-client, peer-to-peer, or some other configuration. The clients can vary in complexity from mobile and handheld devices, to thin clients and on to thick clients or even other servers.
The above-illustrated software components are tangibly stored on a computer readable storage medium as instructions. The term “computer readable storage medium” should be taken to include a single medium or multiple media that store one or more sets of instructions. The term “computer readable storage medium” should be taken to include any physical article that is capable of undergoing a set of physical changes to physically store, encode, or otherwise carry a set of instructions for execution by a computer system which causes the computer system to perform any of the methods or process steps described, represented, or illustrated herein. Examples of computer readable storage media include, but are not limited to: magnetic media, such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs, DVDs and holographic devices; magneto-optical media; and hardware devices that are specially configured to store and execute, such as application-specific integrated circuits (“ASICs”), programmable logic devices (“PLDs”) and ROM and RAM devices. Examples of computer readable instructions include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter. For example, an embodiment of the invention may be implemented using Java, C++, or another object-oriented programming language and development tools. Another embodiment of the invention may be implemented in hard-wired circuitry in place of, or in combination with, machine readable software instructions.
A data source 860 is an information resource. Data sources include sources of data that enable data storage and retrieval. Data sources may include databases, such as, relational, transactional, hierarchical, multi-dimensional (e.g., OLAP), object oriented databases, and the like. Further data sources include tabular data (e.g., spreadsheets, delimited text files), data tagged with a markup language (e.g., XML data), transactional data, unstructured data (e.g., text files, screen scrapings), hierarchical data (e.g., data in a file system, XML data), files, a plurality of reports, and any other data source accessible through an established protocol, such as, Open DataBase Connectivity (ODBC), produced by an underlying software system (e.g., ERP system), and the like. Data sources may also include a data source where the data is not tangibly stored or otherwise ephemeral such as data streams, broadcast data, and the like. These data sources can include associated data foundations, semantic layers, management systems, security systems and so on.
In the above description, numerous specific details are set forth to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details or with other methods, components, techniques, etc. In other instances, well-known operations or structures are not shown or described in detail to avoid obscuring aspects of the invention.
Although the processes illustrated and described herein include series of steps, it will be appreciated that the different embodiments of the present invention are not limited by the illustrated ordering of steps, as some steps may occur in different orders, some concurrently with other steps apart from that shown and described herein. In addition, not all illustrated steps may be required to implement a methodology in accordance with the present invention. Moreover, it will be appreciated that the processes may be implemented in association with the apparatus and systems illustrated and described herein as well as in association with other systems not illustrated.
The above descriptions and illustrations of embodiments of the invention, including what is described in the Abstract, are not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize. These modifications can be made to the invention in light of the above detailed description. Rather, the scope of the invention is to be determined by the following claims, which are to be interpreted in accordance with established doctrines of claim construction.