The present invention relates to a system consisting of reusable components for implementing data warehousing (DW) and business intelligence (BI) solutions. The system combines components that give access to best practices as well as to domain and business function specific data models, components and applications. Together these enable an integrated DW and BI infrastructure to be built faster and to be maintained and supported more easily.
Further, the system provides an enriched framework which assists in applying certain unique concepts, experiences, philosophies and pre-packaged solutions to all its data warehousing and business intelligence engagements.
The data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data used to support the strategic decision-making process for the enterprise. The data warehouse supports online analytical processing (OLAP), data mining and other statistical/analytical and related decision support applications, the functional and performance requirements of which are quite different from those of the online transactional processing (OLTP).
Thus, the unique combination of DW and BI addresses every requirement. The components play a very vital role in ensuring the achievement of objectives relating to DW and BI engagements.
Known systems available in the market include Informatica Analytical Applications and Business Objects Application Foundation, to name just a few. Informatica Analytical Applications offers pre-built data models for customer analytics, financial analytics, HR analytics and supply chain analytics. The system also offers the functionality of data integration and information delivery through Informatica PowerCentre and Informatica Power Analyser respectively, which are two separate products. Similarly, Business Objects Application Foundation is a framework for delivering analytical applications. It comes with pre-built matrices and business rules which enable various kinds of analysis, apart from offering the functionality to perform predictive analysis and statistical process control.
The known systems however suffer from certain deficiencies.
1. They are products which need to be purchased.
2. These products are tied to their own extraction and information delivery tools, which means customers have to purchase these tools separately.
3. These products do not come with pre-built data mappings with any of the standard data sources.
4. Lack of pre-built data mappings also means that any version changes in any of the standard data sources would require re-mapping the data sources with the target data models.
An object of the present invention is therefore to provide a system to be utilised in DW and BI service engagements including plan, build and operate and across specialised service offerings.
Another object of the present invention is to enable DW and BI to build an asset base of reusable objects.
Yet another object of the present invention is to increase DW and BI engagement productivity.
A further object of the present invention is to ensure uniformity in approach to all DW and BI engagements.
Still another object of the present invention is to provide a structured channel for capturing engagement knowledge as well as to act as a self-reinforcing feedback loop.
A still further object of the present invention is to develop, test and incorporate, in a cost effective manner, applications that are necessary to plug gaps in the available set of tools and technologies.
The system of the present invention, using reusable components, solves the above deficiencies. The system uses reusable components consisting of pre-built vertical specific data models and key performance indicators (KPIs) and pre-built maplets linking standard source systems to the vertical specific KPIs, to aid faster and cost effective implementation of data warehousing and business intelligence projects. While the system's data model component houses the vertical specific data models and KPIs, the metadata component houses the technical and business metadata along with the associated mappings. Being compliant with the common warehouse metamodel (CWM), the system is extraction and reporting tool neutral. In other words, the technical and business metadata can be exported to any of the CWM compliant extraction and reporting tools available in the market. The system also includes an algorithm to automatically detect version changes in the standard source systems.
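The description does not spell out how version changes in a source system are detected. One plausible approach, shown here only as a minimal sketch, is to compare the stored metadata snapshot of a source system with the metadata currently observed and to flag the mappings that are affected. The names `Schema`, `SchemaDiff`, `diff_schemas` and `mappings_to_review`, and the representation of a schema as a dictionary of column names to data types, are assumptions made for illustration and are not taken from the invention.

```python
from dataclasses import dataclass

# Hypothetical representation: "table.column" -> data type name.
Schema = dict[str, str]


@dataclass(frozen=True)
class SchemaDiff:
    """Differences between a known schema snapshot and an observed one."""
    added: frozenset[str]
    removed: frozenset[str]
    retyped: frozenset[str]

    @property
    def changed(self) -> bool:
        # Any difference is treated as a sign of a version change.
        return bool(self.added or self.removed or self.retyped)


def diff_schemas(known: Schema, observed: Schema) -> SchemaDiff:
    """Compare the stored snapshot with the schema seen in the source system."""
    known_cols, observed_cols = set(known), set(observed)
    common = known_cols & observed_cols
    return SchemaDiff(
        added=frozenset(observed_cols - known_cols),
        removed=frozenset(known_cols - observed_cols),
        retyped=frozenset(c for c in common if known[c] != observed[c]),
    )


def mappings_to_review(diff: SchemaDiff, mappings: dict[str, str]) -> list[str]:
    """Return the mapped source columns that were removed or changed type."""
    affected = diff.removed | diff.retyped
    return sorted(col for col in mappings if col in affected)
```

In such a sketch, only the mappings returned by `mappings_to_review` would need attention after a source system upgrade, rather than the whole source-to-target mapping set.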
The system of the present invention seeks to solve the deficiencies in the products available in the market in the following ways:
Thus the present invention provides a system consisting of reusable components for implementing data warehousing (DW) and business intelligence (BI) solutions, said reusable components comprising a data model component housing exhaustive pre-built vertical and business function specific generic data models and key performance indicator (KPI) libraries; a MetAL component serving as a key repository of all mappings between all standard source data systems and the vertical and function specific data models and KPIs housed in the data model component; and a component with the ability to export these mappings to any ETL and reporting tool, making it a BI tool neutral and platform neutral framework; thereby positioning itself as a technology neutral platform for organizations implementing data warehouses.
The data model component houses exhaustive pre-built vertical and business function specific generic data models and key performance indicator libraries.
The MetAL component serves as a key repository of all mappings between all standard source data systems and the vertical and function specific data models and KPIs, housed in the data model component.
The MetAL component contains the metadata of the various versions of all standard source systems and the metadata of the pre-built data models and KPIs in CWM format. It also contains the associated mappings between these two sets of metadata.
Thus, the MetAL component has four different engines. The BI configuration engine houses the technical and business metadata of pre-built KPI libraries. The data sources engine of the MetAL component houses the technical and business metadata of the standard source systems. The integration engine contains the mappings between the metadata in the data sources engine and the metadata of the BI configuration engine. The mapping export engine exports the metadata to any extraction or information delivery tool.
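The relationship between the four engines can be illustrated with a minimal sketch. All class, field and function names below are hypothetical, and JSON is used purely as a stand-in for the export format; the sketch does not implement the CWM format in which the metadata is described as being held.

```python
import json
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Element name -> description; stands in for one metadata engine."""
    elements: dict[str, str] = field(default_factory=dict)


@dataclass
class IntegrationEngine:
    """Holds mappings between data source metadata and BI configuration metadata."""
    data_sources: MetadataStore      # role of the data sources engine
    bi_configuration: MetadataStore  # role of the BI configuration engine
    mappings: dict[str, str] = field(default_factory=dict)

    def add_mapping(self, source: str, target: str) -> None:
        # A mapping is only accepted if both ends exist in their engines.
        if source not in self.data_sources.elements:
            raise KeyError(f"unknown source element: {source}")
        if target not in self.bi_configuration.elements:
            raise KeyError(f"unknown target element: {target}")
        self.mappings[source] = target


def export_mappings(engine: IntegrationEngine) -> str:
    """Role of the mapping export engine; JSON is an illustrative format only."""
    rows = [{"source": s, "target": t} for s, t in sorted(engine.mappings.items())]
    return json.dumps(rows)
```

Keeping the two metadata sets in separate stores and validating each mapping against both is one way to ensure that an exported mapping never refers to an element that neither engine knows about.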
Some of the additional features of these components are given below:
The MetAL component is able to support the metadata for Oracle applications, SAP, PeopleSoft, Siebel, Oracle CRM, Vantive, Clarify, JD Edwards, BaaN and MfgPro.
It can export the mappings to the following ETL tools: Oracle Warehouse Builder, Informatica, Sagent, SAS, Acta, Visual Datawarehouse Administrator, Abinitio, DTS and Data Junction.
The MetAL component is also able to export mappings to leading BI tools, packaged applications, CASE tools, database and system management tools.
It is able to bring into the framework relational and non-relational data sources: RDBMS, file systems, Dbase, Paradox and Btree.
The MetAL component is provided with a user interface to construct source to target mappings.
The system stores its reusable components in the common warehouse metamodel framework, thereby positioning itself as a technology neutral platform for organizations implementing data warehouses.
The system of the present invention is provided in the form of an application which comprises the following components:
Data model component
MetAL component
Additionally, the system also includes certain add-on applications which have independent uses and which can be provided separately.
The data model component, a part of the overall system of the present invention, provides access to pre-packaged data models, enables their reconfiguration and provides aids to dimensional modeling in the DW and BI context.
The data model component is organized across verticals and business functions across these verticals. The data model component of the system has the following additional features:
The list of in-built data models currently available in the data model component is given in Table 1.
The MetAL component, a part of the system of the present invention, provides for acquisition, maintenance and movement of metadata to and from various architecture components in the enterprise. This component provides for a MetAL database, a central, shared source of metadata including prepackaged metadata, which reduces implementation and maintenance costs and thereby helps customers get more value.
The MetAL component of the system has the following additional features:
The data warehousing and business intelligence implementation methodology is a unique full life-cycle methodology, that is, a defined series of steps, for the implementation and maintenance of data warehousing and business intelligence solutions, covering all the phases. The DW and BI methodology provides a structured and uniform approach to all DW and BI engagements and encapsulates the best practices and the approach/philosophy towards such engagements. The methodology is carried out in five stages: requirements analysis, design, development, deployment, and maintenance and support.
The process flow chart for the implementation methodology is shown in
The requirement analysis stage consists of collecting the requirements from the business users and IT users in the organization, mainly through interviews. Analysis is done on the critical success factors, existing business processes, source data, IT infrastructure and reporting needs, and the requirements are documented and prioritized.
The design stage consists of the following activities. (The logical data modeling and physical database designing are executed in sequence. The other activities are executed more or less in parallel with overlaps/staggered start of activities. Normally back room processes design and end user applications design are taken up after technical architecture and database designing have progressed enough to give inputs to these).
The technical architecture of the solution is defined based on the user requirements and the information about the existing infrastructure. The following are defined as part of the technical architecture:
A conceptual data model is first developed based on analysis of source data and the requirements. From the conceptual data model, the logical data models for the staging area, ODS, data marts/data warehouse are created as required.
Physical database design focuses on defining the physical structures necessary to support the logical data model. Primary elements of this stage involve defining naming standards and setting up the database environment. Preliminary indexing and partition strategies are also determined.
The back room services include the extraction, transformation and loading services, metadata services, and warehouse administration services, if any. This stage involves design/customization of all back room processes/tools.
The back room services design includes:
The end user application (front room) design involves design/customization of all data access components/tools (end user applications), screens and reports.
End user application design involves:
The development stage consists of the following activities:
The back room services development activity involves coding/scripting for all the back room services, including the ETL processes and warehouse administration processes. Alternatively, if any tool from the market needs to be used for extraction and transformation/scrubbing/cleansing, customization of the same is carried out.
In the end user applications development stage, the end user applications are developed by configuring the data access tools and/or developing screens and reports. Administrative modules, if any, are also developed in parallel.
Product installation involves installing and testing all hardware and software, including ETL tools, servers (DB/application/web), DBMS, data access tools, metadata management tools etc.
A prototype of the solution is created and tested. The scope and nature of the prototype is decided in the requirement analysis stage.
Prototyping involves:
The deployment stage consists of the following activities:
The physical databases for the operational data store/data mart/data warehouse are created. Deployment of the custom-developed back room and front room applications is also done in this stage.
For the initial load of the data warehouse, the extraction, transformation and loading processes are executed; the loaded data is then validated against the pre-defined data quality norms to ensure its completeness and correctness.
System tests are conducted as per the system test plan and cover the entire application.
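Validation of the initial load against data quality norms can be sketched as follows. The two checks shown (a completeness ratio and a check that required columns are populated) and the function name `validate_initial_load` are assumptions chosen for illustration; the actual norms are defined per engagement and are not specified here.

```python
from typing import Any, Mapping, Sequence


def validate_initial_load(
    source_rows: Sequence[Mapping[str, Any]],
    loaded_rows: Sequence[Mapping[str, Any]],
    required_columns: Sequence[str],
    min_completeness: float = 1.0,
) -> list[str]:
    """Return a list of data quality issues; an empty list means the load passed."""
    issues: list[str] = []

    # Completeness: share of source rows that reached the warehouse.
    ratio = len(loaded_rows) / len(source_rows) if source_rows else 1.0
    if ratio < min_completeness:
        issues.append(
            f"completeness {ratio:.2f} is below the norm {min_completeness:.2f}"
        )

    # Correctness (simplified): required columns must not be missing or None.
    for index, row in enumerate(loaded_rows):
        for column in required_columns:
            if row.get(column) is None:
                issues.append(f"row {index}: required column '{column}' is empty")
    return issues
```

An empty result indicates that the load met both norms; any returned message identifies the norm that failed and, for column checks, the offending row.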
System test includes:
In the transition stage, the complete solution is handed over to the customer after acceptance tests and user training.
Transition involves:
In the user training stage:
The DW and BI methodology has the following features:
Detailed reference material on the phases, tasks, activities and all relevant templates
Detailed aids, guidelines and best practices reflecting experience, expertise and philosophy relating to such engagements. These would be especially useful in stages relating to technical architecting, dimensional modeling, choice of tools, technologies and approaches, etc.
Systematic documentation relating to the engagement, with structured storage and provision for its import and/or export across locations
Audit trail and configuration management
The unique methodology supports the generic and specialized solution offerings shown in Table 2.
Data warehousing and business intelligence engagements have identified gaps with respect to the availability of appropriate tools and technologies vis-à-vis certain specific requirements. As part of the philosophy of constantly enriching the base of reusable components, the following add-on applications can be included in the system of the present invention.
The desktop version can be installed on PCs as well as laptops. The desktop application can work in stand-alone mode. The data can be extracted from a relational database or flat files into the desktop PC so that the user can work on the data independently without connecting to the corporate data warehouse or data mart. The administrator of this application just needs to plug the model into the application, and this model is then available to the end user for analysis.
The web version of the product gives access to users from any location within the company via the intranet or even over the internet. Any version updates can be replicated for all users by updating the application only at the server, thereby eliminating the need for version updates at different user locations. Also, the model needs to be plugged into the application only on the application server, and the model is then available to all users for their analysis needs. The web version can extract data from XML files in addition to any relational database or flat files.
Currently this application architecture contains two layers.
Application process layer contains the following tiers
The model building layer performs the following processes
This architecture is highly modularized for easy maintenance. The desktop version of the product contains the following modules.
A data mining application is used for prediction, analysis and visualization. It uses algorithm models built by using data mining tools such as Oracle Darwin, SAS E-miner and SPSS Clementine. These models are ‘plugged’ into the application and used for prediction, analysis and visualization. The product comes in two versions, one for desktop users and another web based version.
There are several data mining applications available in the market. Their primary deficiency is that they do not differentiate between expert users and ordinary business users. The data modeling component, involving complex statistical techniques, and the query and visualization component, which helps in interpreting the results, are tied together, thereby making the analysis a difficult proposition for ordinary users.
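The separation between model building by expert users and model use by business users can be sketched as a small plug-in interface. The names `PluggedModel`, `ThresholdModel` and `AnalysisApp` are hypothetical, and the threshold rule is a toy stand-in for a model built in a data mining tool; the sketch does not reflect the export format of any particular tool.

```python
from typing import Mapping, Protocol, Sequence


class PluggedModel(Protocol):
    """Anything an expert user hands over only needs to offer predict()."""

    def predict(self, record: Mapping[str, float]) -> float: ...


class ThresholdModel:
    """Toy stand-in for a model built with a data mining tool."""

    def __init__(self, feature: str, threshold: float) -> None:
        self.feature = feature
        self.threshold = threshold

    def predict(self, record: Mapping[str, float]) -> float:
        # Scores 1.0 when the feature exceeds the threshold, else 0.0.
        return 1.0 if record[self.feature] > self.threshold else 0.0


class AnalysisApp:
    """Business-user side: scores records without exposing model internals."""

    def __init__(self) -> None:
        self._models: dict[str, PluggedModel] = {}

    def plug(self, name: str, model: PluggedModel) -> None:
        # Done once by the administrator; users then refer to the model by name.
        self._models[name] = model

    def score(self, name: str, records: Sequence[Mapping[str, float]]) -> list[float]:
        return [self._models[name].predict(r) for r in records]
```

With such an interface, the statistical techniques stay inside the plugged model, while the business user only selects a model by name and interprets the scores.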
Off-line analysing and processing involves providing information anytime, anywhere. It is an application which provides for multidimensional analysis of the data in stand-alone mode without connecting to the server, transmission of reports to the user via multiple communication channels (push mechanism), and sharing of analytical business data with business partners without compromising on security.
From the foregoing description, it should be understood that the description is made by way of example only and that the invention should not be understood as limited to the particular embodiments described herein. It is also to be understood that various modifications, rearrangements and substitutions can be made by one skilled in the art without departing from the scope and spirit of the invention.
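Multidimensional analysis in stand-alone mode can be illustrated, under simplifying assumptions, as a roll-up over rows that have already been extracted to the local machine. The function name `roll_up` and the row layout are hypothetical and serve only to show aggregation of a measure along chosen dimensions without a server connection.

```python
from collections import defaultdict
from typing import Any, Mapping, Sequence


def roll_up(
    rows: Sequence[Mapping[str, Any]],
    dimensions: Sequence[str],
    measure: str,
) -> dict[tuple, float]:
    """Sum a measure over locally held rows, grouped by the given dimensions."""
    totals: dict[tuple, float] = defaultdict(float)
    for row in rows:
        # The group key is the tuple of this row's values for each dimension.
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)
```

Calling `roll_up` with fewer dimensions gives a coarser view of the same local extract, and calling it with more dimensions gives a drill-down.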
Number | Date | Country |
---|---|---|
60375447 | Apr 2002 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 10422998 | Apr 2003 | US |
Child | 11389491 | Mar 2006 | US |