Data warehousing systems and methods having reusable user transforms

Information

  • Patent Application
  • 20070174308
  • Publication Number
    20070174308
  • Date Filed
    August 02, 2006
    19 years ago
  • Date Published
    July 26, 2007
    18 years ago
Abstract
Data warehousing systems and methods are described herein. In one example methodology, a plurality of transformation templates are stored in a transformation library, each transformation template including software objects that describe how to transform extracted data prior to loading the data into the data warehouse. A unique identifier may be associated with each of the plurality of transformation templates. Using this unique identifier, an instance of at least one of the plurality of transformation templates may be instantiated into a data transformation process by mapping the software objects of the transformation template into the instance of the transform using code segments. The code segments of the data transformation process are then executed to transform and load the extracted data into the data warehouse.
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is an example data warehousing process employing the ETL methodology.



FIG. 2 is an example network topology showing components of a data warehousing system employing a library of reusable user transformations.



FIG. 3 is an example block diagram of a data warehousing system in which a plurality of user defined processes are linked into a plurality of template transformations maintained in a library of reusable user transformations.



FIG. 4 is an example methodology for designing a template transformation and using the template to process and load data into a data warehouse.



FIG. 5 is an example methodology for modifying a template transformation and for propagating such modifications to user defined processes linked to the modified template.



FIG. 6 is an example flow diagram for building a new template transform.



FIG. 7 is an example metadata model representation of a template transformation.



FIG. 8 is an example block diagram showing a process used by the transformation factory to construct an instance of a templated transform.



FIG. 9 is an example block diagram explaining how the metadata contained in the template transforms maps to software items used by the transformation factory to construct an instance of a templated transform in a job process.



FIG. 10 is an example graphical user interface for visualizing impact analysis in a data warehousing system.



FIG. 11 is an example graphical user interface for defining the inputs/outputs of a template transform.



FIG. 12 is an example graphical user interface for defining prompts to be displayed corresponding to the required number of inputs for a template transform.



FIG. 13 is an example graphical user interface for defining options of a template transform.



FIG. 14 is an example graphical user interface for defining details of the options selected from the interface of FIG. 13.



FIG. 15 is an example graphical user interface that displays how a template transform would appear to a user when integrating the template into a user defined job process.



FIG. 16 is an example graphical user interface that shows the options of the template transform as they would be displayed to the user of the transform.



FIG. 17 is an example relationship diagram depicting a system and method of data warehousing.


Claims
  • 1. A data warehousing method, comprising: storing a plurality of transformation templates in a transformation library, each transformation template including metadata objects that describe how to transform extracted data prior to loading the data into a data warehouse;associating a unique identifier with each of the plurality of transformation templates;instantiating an instance of at least one of the plurality of transformation templates into a data transformation process via the unique identifier associated with the at least one transformation template, wherein the metadata objects of the transformation template are mapped into the instance of the transform using executable code segments; andexecuting the data transformation process to transform and load the extracted data into the data warehouse.
  • 2. The data warehousing method of claim 1, further comprising: exporting the executable code segments to a file.
  • 3. The data warehousing method of claim 1, wherein the metadata objects of the transformation template include a main object that describes the name and purpose of the transform, a code object that supplies the generated code for the transform, and an input/output object that describes the inputs and outputs of the transform.
  • 4. The data warehousing method of claim 3, further comprising: a version object that tracks modifications to the transformation template.
  • 5. The data warehousing method of claim 3, further comprising: an options object that describes an interface to a user of the transform, the interface including one or more variable fields that are supplied by the user when customizing the instantiated transform.
  • 6. The data warehousing method of claim 3, wherein the unique identifier associated with each transformation template is stored along with the template as a metadata object.
  • 7. The data warehousing method of claim 1, wherein the executable code segments comprise XML code segments.
  • 8. The data warehousing method of claim 1, further comprising: modifying the at least one transformation template stored in the transformation library; andperforming an impact analysis on the modified template using the unique identifier to determine whether any data transformation processes will be affected by the modifications.
  • 9. The data warehousing method of claim 1, further comprising: accepting the modifications to the at least one transformation template and saving the modified template in the transformation library.
  • 10. The data warehousing method of claim 9, further comprising: propagating the modified template to any data transformation process that is linked to the template via the unique identifier; andautomatically updating the data transformation process using the modified template.
  • 11. The data warehousing method of claim 9, further comprising: propagating the modified template to any data transformation process that is linked to the template via the unique identifier; andprompting a user of the data transformation process to accept or reject the modified template.
  • 12. The data warehousing method of claim 11, further comprising: if the user accepts the modified template, then updating the data transformation process using the modified template; andif the user rejects the modified template, then discarding the modified template.
  • 13. The data warehousing method of claim 4, further comprising: comparing the version object of the transformation template with an associated version object of the instance of the transformation in the data transformation process; andautomatically updating the instance of the transformation in the data transformation process with a current version of the transformation template if the version object of the transformation template indicates a newer version than the version object of the instance.
  • 14. A data warehousing method, comprising: designing a transformation template and storing the template, along with a unique identifier for the template, in a transformation library;instantiating a plurality of instances of the transformation template into a plurality of data transformation processes, each instance of the transformation template being linked to the transformation template stored in the transformation library via the unique identifier;modifying the transformation template stored in the transformation library; andpropagating the modifications to the transformation template to each of the instantiated instances of the template using the unique identifier.
  • 15. The data warehousing method of claim 14, further comprising: executing the data transformation processes to transform and load data into a data warehouse.
  • 16. The data warehousing method of claim 14, further comprising: prior to propagating the modifications, performing an impact analysis on the modifications to determine how the modifications will affect the instantiated instances of the transformation template in the plurality data transformation process.
  • 17. The data warehousing method of claim 16, further comprising: in response to the impact analysis, determining whether or not to save the modifications to the transformation template to the transformation library.
  • 18. The data warehousing method of claim 17, further comprising: if it is determined not to save the modifications to the transformation template, then inhibiting the propagation of the modifications to each of the instantiated instances of the template.
  • 19. The data warehousing method of claim 14, further comprising: notifying a user of the instantiated instance of the transformation template that the template has been modified.
  • 20. The data warehousing method of claim 19, further comprising: prompting the user to accept the modifications to the transformation template, and if the user accepts the modification, then storing the modifications along with the instantiated instance of the transformation template for use in the data transformation process.
  • 21. The data warehousing method of claim 14, further comprising: securing the stored transformation templates from being modified by unauthorized users.
  • 22. A method of generating a transformation template for use in a data warehousing system, comprising: providing a name and description of the transformation template;providing a code body to be executed when the transformation template is instanced into a job process for warehousing data;providing a description of the user interface to be presented by the transformation when it is executed in the job process; andproviding input and output parameters for the transformation;wherein the provided information for the transformation template is stored in a transformation library along with a unique identifier that enables instances of the transformation template in job process to be linked to the stored template.
  • 23. A data transformation template stored in a computer-readable medium for use in transforming data prior to loading into a data warehouse, comprising: a main object for storing a name and description of the transformation template;a code object for storing computer-executable instructions to be executed when the transformation template is instanced into a job process for warehousing data;a user interface object for storing one or more variable fields that are supplied by the user when customizing an instantiated version of the transform;an input/output object for storing the input and output parameters for the transformation; anda transform link field that stores a unique identifier that enables instances of the transformation template in data warehousing job process to be linked to the stored template.
  • 24. The data transformation template of claim 23, further comprising a version object that is updated when the transformation template is modified.
  • 25. The data transformation template of claim 23, wherein the user interface object includes one or more property objects, each property object defining configuration settings for the one or more variable fields that are supplied by user.
  • 26. The data transformation template of claim 25, wherein the property object includes a type field that defines the type of the variable field and a valid values field that defines the permissible values that can be entered in the variable field by the user of the template.
  • 27. The data transformation template of claim 26, wherein the property object includes a default value for the variable field.
Provisional Applications (1)
Number Date Country
60757690 Jan 2006 US