The present invention relates to an improved system and method for analysing data from multiple perspectives. In particular, the present invention relates to a method of or system for generating multiple perspective views of a graphical representation of data based on a single database query.
When analysing data, different results can be provided depending on the perspective in which the data is viewed or analysed.
For example, the data analysis may be performed from a financial perspective where the data analysed may be transaction specific. According to another example, the data analysis may be performed from a machine or device perspective where the data analysis may be machine or device specific. As a further example, the data analysis may be performed from a marketing, consumer or customer perspective where the data analysis may be location specific.
That is, for different query purposes, it is beneficial to look at data in different ways that tie up with a specific desired purpose. For example, the data may be analysed and viewed in the context of location, time, customer groups, device groups, configurations or themes, for example.
In known data analysis systems, the end user is required to understand the structure of the data that is being analysed and perform programmatic query functions to return different sets of results for each different query purpose. That is, the end user must create and/or re-write the query functions each time a different analytical perspective is needed.
This clearly provides a burden on the end user in having to train themselves or other individuals in order to ensure that correct queries are generated so that accurate analysis of the data is performed.
Data visualization systems have been developed to enable the data analysis results to be visualized in a user friendly and intuitive manner. One such system has been created by the current applicants and is described in U.S. Ser. No. 13/000,323 entitled “METHODS, APPARATUS AND SYSTEMS FOR DATA VISUALIZATION AND RELATED APPLICATIONS” filed 20 Dec. 2010.
An object of the present invention is to provide an improved data visualization system or method that provides analytical data query results showing multiple perspectives.
A further object of the present invention is to provide an improved data visualization system or method that improves efficiency in generating multiple perspective query results.
Each object is to be read disjunctively with the object of at least providing the public with a useful choice.
The present invention aims to overcome, or at least alleviate, some or all of the afore-mentioned problems.
Further objects and advantages of the invention will be brought out in the following portions of the specification, wherein the detailed description is for the purpose of fully disclosing the preferred embodiment of the invention without placing limitations thereon.
The background discussion (including any potential prior art) is not to be taken as an admission of the common general knowledge.
It is acknowledged that the terms “comprise”, “comprises” and “comprising” may, under varying jurisdictions, be attributed with either an exclusive or an inclusive meaning. For the purpose of this specification, and unless otherwise noted, these terms are intended to have an inclusive meaning—i.e. they will be taken to mean an inclusion of the listed components that the use directly references, but optionally also the inclusion of other non-specified components or elements.
According to one aspect, the present invention provides, in a data visualisation system, a method of generating multiple perspective views of a graphical representation of data based on a single database query, the method comprising the steps of: receiving a database query, determining the query data associated with the database query, retrieving the query data from a data storage module that is in communication with the data visualisation system, generating multiple perspective query results from the database query, and generating a graphical representation based on one or more of the multiple perspective query results.
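By way of illustration only, the steps of this aspect can be sketched as follows. The sketch assumes that an in-memory list of records stands in for the data storage module and that a "perspective" is simply a grouping dimension; these assumptions, and every name used (DATA_STORE, retrieve_query_data, generate_perspective_results, render_text_bars), are hypothetical and are not taken from the specification.

```python
from collections import defaultdict

# Hypothetical stand-in for the data storage module: a list of transaction records.
DATA_STORE = [
    {"store": "North", "device": "POS-1", "month": "Jan", "sales": 120.0},
    {"store": "North", "device": "POS-2", "month": "Feb", "sales": 80.0},
    {"store": "South", "device": "POS-1", "month": "Jan", "sales": 200.0},
    {"store": "South", "device": "POS-3", "month": "Feb", "sales": 50.0},
]


def retrieve_query_data(query):
    """Retrieve the records matching the query's filter, once per query."""
    return [row for row in DATA_STORE if query["filter"](row)]


def generate_perspective_results(query, rows):
    """Summarise the same retrieved rows once per perspective (grouping dimension)."""
    results = {}
    for dimension in query["perspectives"]:
        totals = defaultdict(float)
        for row in rows:
            totals[row[dimension]] += row[query["measure"]]
        results[dimension] = dict(totals)
    return results


def render_text_bars(perspective, totals, width=20):
    """Produce a rough textual bar chart for one perspective's totals."""
    peak = max(totals.values())
    lines = [f"By {perspective}:"]
    for key, value in sorted(totals.items()):
        bar = "#" * max(1, round(width * value / peak))
        lines.append(f"  {key:<6} {bar} {value:.1f}")
    return "\n".join(lines)


# A single query that names the measure and the perspectives of interest.
query = {
    "measure": "sales",
    "filter": lambda row: row["sales"] > 0,
    "perspectives": ["store", "device", "month"],
}
rows = retrieve_query_data(query)
perspective_results = generate_perspective_results(query, rows)
print(render_text_bars("store", perspective_results["store"]))
```

In this sketch the data is retrieved once, and each additional perspective is a further summary of the same rows, so the end user does not re-write the query for each perspective.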
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
The following described invention is suitable for use in conjunction with other methods, and the incorporation into one or more systems, for example as described in METHODS, APPARATUS AND SYSTEMS FOR DATA VISUALISATION AND RELATED APPLICATIONS (earlier filed by the applicant in the entirety as U.S. provisional patent application Ser. No. 61/074,347 filed on 20 Jun. 2008), which is incorporated by reference, and a portion of which herein follows.
Four key terms (or concepts) form the foundation of the specification set out in this document: Business Performance Drivers (BPDs), BPD Packages, Visual Designs and Visual Documents.
The key terms are defined as follows:
Business Performance Drivers (BPDs): A Business Performance Driver (BPD) is a business metric used to quantify a business objective, for example turnover or sales. BPDs are Facts (sometimes referred to as measures). Facts are data items that can be counted, for example Gross Sales or Units Sold.
BPDs comprise of:
In other words a Business Performance Driver (BPD) is a ‘measure’ that can be normalized. Measures are data items that can be counted. For example, Gross Sales; Units Sold. BPDs might be displayed on visualizations. For example, Revenue earned per store on a map. Restrictions and/or Normalizations could be applied to a BPD. The following table provides examples of these:
BPD Packages: A BPD Package is made up from a set of related BPDs. This relationship (between a BPD Package and its BPDs) is defined using metadata.
BPD Packages can be thought of as the Visual Document's vocabulary.
Visual Designs: Visual Designs are a classification of the different types of visualizations that a user may choose. Within each Visual Design, there are a number of visualizations. For example, the ‘spatial’ category can have retail store location maps or geographical location maps.
The software solution allows users to select one visualization (one visual form within a Visual Design category) to create a Visual Document.
Visual Document: A Visual Document contains visual representations of data. Access to the data used to construct the visual representation is in many ways analogous to a textual document.
A Visual Document is constructed by applying BPD data to a specific Visual Design. It is designed to illustrate at least one specific point (using the visualization), supports the points made with empirical evidence, and may be extended to provide recommendations based on the points made. The Visual Document is a deliverable to the user.
Heatmaps: A heat map is a graphical representation of data where the values taken by a variable in a two-dimensional map are represented as colors. A very similar presentation form is a Tree map.
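The mapping from values to colors can be sketched as follows. This is a minimal illustration, assuming a linear blue-to-red color scale (an assumption; the specification does not prescribe a scale), and the function names value_to_color and heat_map are hypothetical.

```python
def value_to_color(value, low, high):
    """Map a value linearly onto a blue (low) to red (high) RGB color."""
    if high == low:
        fraction = 0.5  # all values equal: use the midpoint of the scale
    else:
        fraction = (value - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))
    red = round(255 * fraction)
    blue = round(255 * (1.0 - fraction))
    return (red, 0, blue)


def heat_map(grid):
    """Convert a two-dimensional grid of values into a grid of RGB colors."""
    flat = [value for row in grid for value in row]
    low, high = min(flat), max(flat)
    return [[value_to_color(value, low, high) for value in row] for row in grid]


values = [
    [0.0, 5.0],
    [10.0, 2.5],
]
colors = heat_map(values)
for row in colors:
    print(row)
```

The smallest value in the grid maps to pure blue and the largest to pure red; intermediate values are blended between the two.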
Heat maps are typically used in Molecular Biology to represent the level of expression of many genes across a number of comparable samples (e.g. cells in different states, samples from different patients) as they are obtained from DNA microarrays.
Heat maps are also used in places where the data is volatile and representation of this data as a heat map improves usability. For example, NASDAQ uses heat maps to show the NASDAQ-100 index volatility. Source: Wikipedia (http://en.wikipedia.org/wiki/Heat_map).
This is shown diagrammatically in
If a user hovers over a stock, additional intra-day data is presented—as shown in
The key terms are set out diagrammatically in
Many organizations face massive and increasing amounts of data to interpret and the need to make more complex decisions faster, and are accordingly turning to data visualization as a tool for transforming their data into a competitive advantage. This is particularly true for high-performance companies, but it also extends to any organization whose intellectual property exists in massive, growing data sets.
One objective of the described solution is to put experts' data visualization techniques in the customer's hands by skillfully guiding the end user through choosing the right parameters, to display the right data, and to create its most useful visualizations to improve business performance.
The described solution is a generic tool and can apply to multiple business areas that require decisions based on, and an understanding of, massive amounts of data. The resulting browser-based output is defined as a ‘Visual Document’.
The solution provided is summarized in
The system identifies user tasks 201 in the form of defining visual documents, requesting visual documents, requesting rendered documents, calls to action, and analyzing results. These tasks are then detected by the system in conjunction with other systems 203, which include CRM applications, third party Business Intelligence (BI) Tools and other third party applications, all of which may access data stored in an enterprise data warehouse (EDW). The visual design layer concept 207 may be utilized within the visual documents 205. The creation of the visual documents is made in conjunction with a number of different defined visual design types 209, BPD packages 211, spatial analysis maps 213 and other application components 215, such as application servers and application infrastructure.
A Visual Document contains visual representations of data. Access to the data used to construct the visual representation is in many ways analogous to a textual document. It is constructed by applying Business Performance Driver(s) (BPD) data to a specific Visual Design (Visual Designs are grouped into ten classifications).
A Visual Document is designed to illustrate at least one specific point (using the visualization), support the points made with empirical evidence, and may be extended to provide recommendations based on the points made. The Visual Document is the actual deliverable from the software to the software user. Visual Documents may be stored, distributed or analyzed later, as needed.
The Visual Document is fed by data and a metadata database that stores definitions of BPDs; the BPDs are the focus of the Visual Document. A Business Performance Driver is a business metric used to quantify a business objective. Examples include gross sales and units sold. For instance, the Visual Document may be used to graphically depict the relationship between several BPDs over time.
In the Visual Document, data is rendered in up to seven layers in one embodiment. However, it will be understood that the number of layers may be varied as needed by the user. Specific Visual Document Layers are described herein. However, it will be understood that further Visual Document Layers may be included over and above the specific types described.
Visual Designs are explicit techniques that facilitate analysis by quickly communicating sets of data (termed BPD Packages) related to BPDs. Once constructed, Visual Documents may be utilized to feed other systems within the enterprise (e.g., Customer Relationship Management (CRM) systems), or directly generate calls to action.
The described solution utilizes the best available technical underpinnings, tools, products and methods to actualize the availability of expert content.
At its foundation, the solution queries data from a high performance enterprise data warehouse characterized by parallel processing. This database can support both homogeneous (identical) and heterogeneous (differing but intersecting) databases. The system is adaptable for use with a plurality of third party database vendors.
A scalable advanced web server framework can be employed to provide the necessary services to run the application and deliver output over the web.
A flexible and controllable graphics rendering engine can be used to maximize the quality and speed levels required to support both static and dynamic (which could be, for example, animated GIF, AVI or MPEG) displays. All components can operate with a robust operating system platform and within secure network architecture.
Pre-existing (and readily available) third party components can be employed to manage user security (e.g. operating system security), industry specific applications and OLAP (Online Analytical Processing) or other more traditional reporting. The described solution is designed to facilitate speedy and reliable interfaces to these products.
A predictive modeling interface assists the user in analyzing forecasted outcomes and in ‘what if’ analysis.
Strict security, testing, change and version control, and documentation standards can govern the development methodology.
Many organizations face massive and increasing amounts of data to interpret and the need to make more complex decisions faster, and are accordingly turning to data visualization as a tool for transforming their data into a competitive advantage. This is particularly true for high-performance companies, but it also extends to any organization whose intellectual property exists in massive, growing data sets.
This clash of (a) more data, (b) the increased complexity of decisions and (c) the need for faster decisions was recently recognized in an IDC White Paper (Gantz, John et al.; IDC White Paper; “Taming Information Chaos: A State-of-the-Art Report on the Use of Business Intelligence for Decision Making” November 2007), which described this clash as the “Perfect Storm” and stated that this ‘storm’ will drive companies to make a quantum leap in their use of and sophistication in analytics.
Today's business tools and the way they operate barely allow business users to cope with historical internal data, let alone internal real time, predictive, and external data.
Hence, a new paradigm in business intelligence solutions is required.
As explained above,
There are five key components to the system. These are:
A description of each of these components is set out below under the respective headings.
The Visual Documents form the core of the solution from a user perspective. This may include visualization(s), associated data and/or metadata (typically the visual form) that the user defines, requests and interacts with. The Visual Documents may consist of single frames or animated frames (which could be, for example, implemented in AVI, GIF or MPEG format or a sequence of still images).
The Visual Document is typically viewed in a dynamic web browser view. In this interactive view the user may observe, select and navigate around the document.
Once created, the Visual Documents may be stored in the database and may be distributed to key persons (printed, emailed etc.) or stored for later use and analysis.
The Visual Designs are a classification of the different types of visualizations that a user may choose. Within each Visual Design category, there are a number of visualizations. For example, the ‘spatial’ category can have retail store location maps, network maps or geographical location maps, such as, for example, maps available from Google™ or Yahoo™.
The described system allows users to select one or more visualizations (e.g. one visual form within a Visual Design category) to create a Visual Document.
There are ten Visual Design categories defined below. However, it will be understood that further Visual Designs are envisaged, and that the number of visualizations within each classification and the number of classifications may vary.
Business Performance Drivers (BPDs) are a metric applied to data to indicate a meaningful measurement within a business area, process or result. BPDs may be absolute or relative in their form of measurement.
The Business Performance Driver (BPD) concept differs from the known KPI concept by introducing BPDs that
(1) may have multiple dimensions,
(2) place the BPD in the context of the factors used to calculate them,
(3) provide well understood points of reference or metadata around which visual document creation decisions can be made, and
(4) may contain one or more methods of normalization of data.
Common groups of BPDs are called BPD Packages. For example, BPDs relating to one industry (say, telecommunications) can be grouped into one BPD Package. BPDs may be classified into one or more BPD Packages. For example, Net Revenue with normalizations available of per customer or per month may be applicable in a number of industries and hence, applicable to a number of BPD Packages.
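The relationship between a BPD, its normalizations and BPD Packages can be sketched as follows. The class names BPD and BPDPackage, their fields, and the sample records are assumptions made purely for illustration; the specification does not define a concrete data structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BPD:
    """A business measure plus named normalizations (hypothetical structure)."""
    name: str
    measure: Callable[[List[dict]], float]
    normalizations: Dict[str, Callable[[List[dict]], float]] = field(default_factory=dict)

    def value(self, rows, normalization=None):
        """Return the raw measure, or the measure divided by the chosen normalizer."""
        raw = self.measure(rows)
        if normalization is None:
            return raw
        return raw / self.normalizations[normalization](rows)


@dataclass
class BPDPackage:
    """A named group of related BPDs."""
    name: str
    bpds: List[BPD] = field(default_factory=list)


net_revenue = BPD(
    name="Net Revenue",
    measure=lambda rows: sum(r["revenue"] for r in rows),
    normalizations={
        "per customer": lambda rows: len({r["customer"] for r in rows}),
        "per month": lambda rows: len({r["month"] for r in rows}),
    },
)

# The same BPD may be classified into more than one BPD Package.
telecom = BPDPackage("Telecommunications", [net_revenue])
retail = BPDPackage("Retail", [net_revenue])

records = [
    {"customer": "A", "month": "Jan", "revenue": 100.0},
    {"customer": "A", "month": "Feb", "revenue": 50.0},
    {"customer": "B", "month": "Jan", "revenue": 150.0},
]
print(net_revenue.value(records))
print(net_revenue.value(records, "per customer"))
print(net_revenue.value(records, "per month"))
```

Here Net Revenue is defined once and shared by two packages, mirroring the statement that a BPD may be applicable to a number of BPD Packages.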
Spatial maps allow for a user-owned and defined spatial map and/or for the user to use publicly available context maps such as Google™ Maps or Yahoo™ Maps. In either case, the user can display selected BPDs on the chosen spatial map.
Typically, a user-owned spatial map may be the inside floor space of a business and a publicly available context map may be used for displaying BPDs on a geographic region e.g. a city, county, state, country or the world.
The described application includes two main components, the Application Servers and the Application Infrastructure.
The Application Server includes a number of servers (or server processes) that include the Rendering Engine (to make (or render) the Visual Documents), Metadata Servers (for the BPD Packages, the Visual Designs and the BPDs) and the Request Queue.
The Application Infrastructure is also comprised of a number of servers (or server processes) that may include a Listener (which ‘listens’ for document requests) and central error logging.
Based on the user selections made above (Visual Documents, Visual Designs and BPDs), the user can click on an action and send a communication to a third party system (CRM, Business Intelligence or other application). The third party system could, for example, load the list from the solution and then send out a personalized email to all members on that list.
According to one embodiment, the described server components of the application are a Java based application and utilize an application framework such as the IBM™ WebSphere application server framework; other platforms and server applications may be utilized as alternatives. The client application may be a mashup that utilizes the server components, or it could be a rich internet application written using the Adobe™ Flash framework.
Other key elements of the system may include:
The diagram shown in
These modules are described in the subsequent table. More detailed descriptions and diagrams of each of the software modules are provided below.
The table below outlines the following four items in relation to each module:
This section contains descriptions and diagrams of the architectural views of the system. The architecture shows how the system components fit and operate together to create an operational system. If compared to a vehicle, the wiring diagrams, the physical body, the driving circle and key complex components like the engine would be shown in architectural views.
This view does not describe how the system is written; it describes the high-level architectural considerations.
Architectural considerations are typically implemented by one or more software modules. The modular view described herein lays out a high-level view of how the software modules are arranged.
The following modules or components are shown:
Web interface Module 4105: User interfaces are browser based or may be a web services client, a rich internet application or may be a thick client. In all cases the user interface uses the same interface to the back end services.
Rendering Definition Module 4110: The user interface is used to define and request the rendering of Visual Documents.
Rendering Use Module 4115: Visual Documents are used for analysis, and precipitate calls to action.
Connectivity Services Module 4120: The definition and rendering of Visual Documents is performed through a set of programs or services called the Connectivity Services.
Configuration Management Tools Module 4125: Multiple versions of the basic elements; BPD, Visual Design, Visual Documents; are managed by a set of programs called the Configuration Management Tools.
Visual Document Management Catalog 4130: One such Configuration Management Tool (4125) is a set of programs that manage a user's catalog of available Visual Documents.
Predictive Modeling Module 4135: Predictive modeling is used for forecasting unknown data elements. These forecasts are used to predict future events and provide estimates for missing data.
Map Management Tool 4140: Another of the Configuration Management Tools (4125) is the Map Management Tool. It is designed to manage versions of the spatial elements of a visual design such as a geographic map or floor plan.
Visual Document Definitions Management Module 4145: Visual Document Definitions are managed through the use of metadata (4175).
Message Queue Submission Module 4150: Requests for Visual Documents are handled through queued messages sent between and within processes.
Visual Design Type Module 4155: Visual Documents are comprised of one or many Visual Designs in these categories.
Visual Document Status Module 4160: The status of Visual Documents is discerned from the metadata and displayed on the user interface.
Interaction and Visual Document View Module 4165: The user interacts with the Visual Documents through the user interface, and appropriate changes to and requests to read are made to the metadata.
List Production Module 4170: Where additional output such as customer lists are required, they are requested using the user interface and stored in the EDW (4215).
Data Packages Metadata Module 4175: Metadata is used to describe and process raw data (data packages).
Message Queue Module 4180: Messages may be queued while awaiting processing (4150).
Visual Design and BPD Metadata Module 4185: Metadata is used to describe and process the BPDs and Visual Designs associated with a particular Visual Document.
Visual Documents Module 4190: Visual Documents may be comprised of layered Visual Designs.
Third Party Modules 4195: Visual Documents may be used with or interact with other third party tools.
Listener Module 4200: The listener processes messages (4150) in the message queue (4180).
Document Controller Module 4205: The document controller is used to provide processed data to the rendering or query engines.
Central Error Logging Module 4210: System errors are detected and logged in the EDW (4215).
EDW 4215: All data is typically stored in a database, typically on multiple fault-tolerant processors in an Enterprise Data Warehouse.
The following architectural components are described in more detail.
The following terms have also been used in
A further high-level system delivery overview of the solution is set out as shown in
The described solution 500 is hosted by the enterprise 510. The figure shows the logical flow from the submission of a request to the end result, viewing the rendered Visual Document.
The data being visualized belongs to the customer 512 and the submitted request is unknown to the entity running the visualization system 500.
The controlling entity, integrators and customers may wish to have summaries of technical performance data (usage patterns, errors etc) sent from the operational system back to the integrator or controlling entity.
The system 500 has access to the data in an EDW 505. The system utilizes a request queue 515 to control requests from a corporate network 510. These requests are forwarded to a document controller 520. The document controller 520 accesses the EDW 505 and reads both the visual designs and BPD metadata services 525 and the data packages metadata services 530.
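The flow from the request queue to the document controller can be sketched as follows. This is a single-process illustration only: the dictionaries standing in for the EDW and the two metadata services, the DocumentController class and the process_requests function are all hypothetical, and a deployed system would use separate server processes rather than in-memory objects.

```python
import queue

# Hypothetical stand-ins for the EDW and the two metadata services.
EDW = {"gross_sales": [("North", 200.0), ("South", 250.0)]}
VISUAL_DESIGN_AND_BPD_METADATA = {
    "sales_map": {"bpd": "gross_sales", "design": "spatial"},
}
DATA_PACKAGE_METADATA = {"gross_sales": {"unit": "USD"}}


class DocumentController:
    """Gathers the data and metadata needed to render one requested document."""

    def handle(self, request):
        definition = VISUAL_DESIGN_AND_BPD_METADATA[request["document"]]
        bpd = definition["bpd"]
        return {
            "document": request["document"],
            "design": definition["design"],
            "unit": DATA_PACKAGE_METADATA[bpd]["unit"],
            "data": EDW[bpd],
        }


def process_requests(request_queue, controller):
    """Drain the queue in arrival order, forwarding each request to the controller."""
    prepared = []
    while True:
        try:
            request = request_queue.get_nowait()
        except queue.Empty:
            break
        prepared.append(controller.handle(request))
    return prepared


request_queue = queue.Queue()
request_queue.put({"document": "sales_map", "user": "analyst"})
prepared_documents = process_requests(request_queue, DocumentController())
print(prepared_documents)
```

The queue decouples the submission of requests from their processing, so requests can accumulate while earlier documents are still being prepared.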
The system described thus enables various methods to be performed. For example, data is transformed into visually interpretable information. The visually interpretable information is in the form of visual representations that are placed within one or more visual documents.
The User Interface 610 allows the user to define BPDs 615 in terms of raw data 627, which become the focus of the Visual Document 630.
Further, the User Interface 610 allows the user, through automated expert help, to create the Metadata 620 and the most appropriate Visual Designs 635 that make up the Visual Document 630, in order to provide detailed analysis of data related to the BPD 615. The data acquisition, visual design rendering and visual document rendering processes utilize massive amounts of raw data 627.
The Metadata 620 is used by the Processes 625 to optimize the acquisition of the appropriate Data 627, processing of the data into useful information, and to optimize the creation and rendering of the Visual Designs 635 and the Visual Document 630 that contains them.
This method includes the steps of providing comprehensive yet easy to understand instructions to an end user that has accessed the system and the visual design application. The instructions assist the end user in obtaining data associated with a theme, wherein the theme may be focused on objectives that have been derived from the data. The objectives may be business objectives, for example. In this way, the system guides a user carefully through the many choices that are available to them in creating the visual representations, and the system automatically tailors its instructions according to not only what the user requires, but also according to the data that is to be represented. The system focuses on providing instructions to enable a visual representation to be created that will enable an end user to more effectively understand the data that has been collated.
Further, the instructions assist the end user in determining one or more summaries of the obtained data that enable the end user to understand the theme, as well as organizing the determined summaries into one or more contextual representations that contribute to the end user's understanding of the theme.
Further, instructions are provided that assist an end user in constructing one or more graphical representations of the data, where each graphical representation is of a predefined type, as discussed in more detail below, and includes multiple layers of elements that contribute to the end user's understanding of the theme.
Finally, instructions are provided to assist an end user in arranging the produced multiple graphical representations in a manner that enables the end user to understand and focus on the theme being represented as well as to display or print the organized graphical representations. The system assists in the organization or arrangement of the representations, or elements thereof, within the visual document so as to ensure certain criteria are met, such as, for example, providing a suitable representation in the space available, using the minimum amount or volume of ink to create the representation, and providing a suitable representation that depicts the theme in a succinct manner, or visually simplistic manner.
The data being processed to create the graphical representations may be particularly relevant to the theme being displayed, disparate information or indeed a combination of relevant and disparate information.
There are multiple types of graphical representations that may be included within the visual document. The types are discussed in more detail below and include a hierarchical type, a spatial type, a virtual type, a classical type, a navigational type, a temporal type, a textual type, a structural type, a pivotal type, and an interactive type.
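The ten predefined types, and the way a visual document may hold layered graphical representations of different types, can be sketched as follows. The class and field names are assumptions for illustration only, and layers are modelled simply as an ordered list of labels.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class VisualDesignType(Enum):
    """The ten predefined types of graphical representation."""
    HIERARCHICAL = "hierarchical"
    SPATIAL = "spatial"
    VIRTUAL = "virtual"
    CLASSICAL = "classical"
    NAVIGATIONAL = "navigational"
    TEMPORAL = "temporal"
    TEXTUAL = "textual"
    STRUCTURAL = "structural"
    PIVOTAL = "pivotal"
    INTERACTIVE = "interactive"


@dataclass
class GraphicalRepresentation:
    """One representation of a predefined type, built from ordered layers."""
    design_type: VisualDesignType
    layers: List[str] = field(default_factory=list)  # bottom layer first


@dataclass
class VisualDocument:
    """A document holding one or more graphical representations."""
    title: str
    representations: List[GraphicalRepresentation] = field(default_factory=list)

    def design_types(self):
        """Return the set of types used anywhere in the document."""
        return {rep.design_type for rep in self.representations}


document = VisualDocument(
    title="Store performance",
    representations=[
        GraphicalRepresentation(VisualDesignType.SPATIAL, ["floor plan", "revenue heat map"]),
        GraphicalRepresentation(VisualDesignType.TEMPORAL, ["timeline", "monthly sales"]),
    ],
)
print(sorted(t.value for t in document.design_types()))
```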
Further, the instructions may assist an end user in arranging the graphical representations in order to display high density data in a manner that conveys important information about the data, rather than swamping the end user with multiple representations that look impressive but do not convey much information.
In addition instructions may be provided to assist the end user in arranging the graphical representations to allow supplementary information to be added, where the supplementary information may be provided in any suitable form. Particular examples provided below depict the supplementary information being provided in subsequent visual layers that overlay the graphical representation. Alternatively, or in addition, supplementary information may include additional elements to be displayed within a single layer of the representation, for example, in the form of widgets.
Step 6105: Process Starts. User decides to manage the business.
Step 6110: Available data is identified and analyzed.
Step 6115: Business Performance Drivers (metrics defined in terms of the data to indicate a meaningful measurement within a business area, process or result) are identified.
Step 6120: Data influencing the BPD metrics are identified.
Step 6125: BPDs are input into a computer system.
Step 6130: BPD is categorized and appropriate metadata describing it is generated.
Step 6135: Visual Designs to display the influential data are created.
Step 6140: Visual Designs are aggregated into Visual Documents and rendered. Adjustments are made based on the freshness of all components (e.g., BPD, available data).
Step 6145: Visual documents are analyzed by the end user.
Step 6150: The end user decides on and implements actions based on the analysis in 6145.
As touched on above, business performance drivers (BPDs) are used to enable more efficient data analysis so as to produce accurate and relevant visual representations of the data. A BPD is a form of advanced business measure wherein additional information is included within the BPD that enables the system using the BPD to understand how to manipulate the BPD. That is, one or more intelligent attributes are included with the business measure to form the BPD, where those attributes reference or include information on how the BPD is to be processed or displayed. The form of processing and display may also be varied according to the device type or media upon which the business measures are to be displayed.
The attributes are attached to the business measure by storing the BPD in the form of a mark up language, such as, for example, HTML or XML. It will however be understood that any other suitable format for storing the BPD may be used where the attributes can be linked to the business measure.
In the example of HTML, the attribute is included as a tag. One such example would be to include the data or business measure within the body of the HTML code and follow the business measure with a tag that references the attributes, or dimensions, associated with that business measure.
Further, the attributes may also be modified or deleted, or indeed new attributes added, during or after the processing of the BPD so that the attributes are maintained, or kept up to date, bearing in mind the requirements of the entity using the BPD to visualize their data.
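Storing a BPD as markup, with the business measure followed by a tag carrying its attributes, can be sketched as follows using XML. The element and attribute names (bpd, measure, attributes, attribute, key, value) are hypothetical choices made for this illustration; the specification does not define a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative only.
bpd = ET.Element("bpd", {"name": "Gross Sales"})
measure = ET.SubElement(bpd, "measure")
measure.text = "125000.00"

# The business measure is followed by a tag that carries its attributes.
attributes = ET.SubElement(bpd, "attributes")
ET.SubElement(attributes, "attribute", {"key": "unit", "value": "USD"})
ET.SubElement(attributes, "attribute", {"key": "display", "value": "heat map"})

# Attributes may be added, modified or deleted after the BPD is created.
ET.SubElement(attributes, "attribute", {"key": "normalization", "value": "per store"})
for attribute in attributes.findall("attribute"):
    if attribute.get("key") == "display":
        attribute.set("value", "bar chart")  # modify an existing attribute
    elif attribute.get("key") == "unit":
        attributes.remove(attribute)  # delete an attribute

# Round-trip through text to show that the attributes travel with the measure.
serialized = ET.tostring(bpd, encoding="unicode")
restored = ET.fromstring(serialized)
restored_attributes = {a.get("key"): a.get("value") for a in restored.iter("attribute")}
print(serialized)
print(restored_attributes)
```

Because the attributes are stored alongside the measure, any system that reads the markup receives both the value and the information on how it is to be processed or displayed.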
The business performance drivers, or measurable business objectives, are identified in order to create graphical representations of the business objectives, where those representations are placed within a visual document. A business objective may be, for example, a metric associated with a business.
Instructions are provided by the system to the end user, in order to assist the end user in establishing multiple business objectives as functions of available metrics, as well as assisting the user in organizing the business objectives into a contextual form that contributes to the end user's understanding of the business objectives.
Further, instructions are provided to assist the end user in constructing one or more graphical representations of the business objectives, where each graphical representation is of a predefined type, as mentioned above and described in more detail below. Further, each graphical representation includes multiple layers of elements that contribute to the end user's understanding of the business objective.
The elements within the graphical representation may include, for example, a shape, position, color, size, or animation of a particular object.
Instructions are also provided by the system to assist the user in arranging multiple graphical representations in a suitable manner that enables the end user to understand and focus on the business objectives being represented.
Finally, the end user is also assisted with instructions on how to display the organized graphical representations.
The following section describes a method of creating a visual representation of data in the form of a visual design.
The method includes the steps of the system providing instructions to an end user to assist the end user in constructing multiple graphical representations of data, where each graphical representation is one of a predefined type, as defined above and explained in more detail below, and the graphical representation includes multiple layers of elements that contribute to the end user's understanding of the data.
The system also provides instructions to an end user that assist the end user with arranging multiple graphical representations of different types within the visual representation in a manner that enables the end user to understand and focus on the data being represented, as well as providing instructions to assist the end user in displaying the visual representation in a suitable manner.
The visual representation may be displayed in a number of different ways, such as on a color video screen or a printed page. The information that is forwarded to the display device to create the visual representation may differ according to the type of display device so that the visual representation is produced in the best known suitable manner utilizing the advantages of the display device, and avoiding any disadvantages.
The data being displayed may be based on a measured metric or an underlying factor that affects a metric.
The elements within the graphical representation may include a shape, position, color, size or animation of a particular object.
Although a single visual document may include only one type of graphical representation, either in the form of multiple graphical representations or a single representation, there will also be situations where multiple types of graphical representations may be organized within a single visual document in order to convey different aspects of the data, such as, for example, temporal as well as spatial information. The inclusion of different types of graphical representations within a single document can provide an end user with a better understanding of the data being visualized.
Further, the single visual representation may be arranged to be displayed as an image on a single page or screen. This may be particularly useful where space is at a premium yet the user requires the visual representation to be provided in a succinct manner. For example, the user may request certain information to be displayed in a visual representation on a single mobile telephone display, or a single screen of a computer display, in order to show a customer or colleague the results of a particular analysis without the need to flick between multiple screens which can result in confusion, a waste of energy and ultimately a loss of understanding of the visual representations.
The same issue applies to printed representations, where the result of the system enabling a user to arrange a single representation, which may include multiple elements or layers, on a single page not only succinctly represents the data being analyzed but also saves the amount of paper being printed on and the amount of ink being used to print the document.
Further, the amount of ink required for a visual representation may be further reduced by providing instructions to the end user in a manner that directs them to control and use white space in a representation in an efficient manner so as to reduce the requirement of ink.
Multiple types of graphical representations may be merged together within a single visual document, or representation.
As mentioned above, instructions can be provided by the system to assist the end user in adding supplementary information to the visual representation, and the supplementary information may be provided in layers within the representation.
The following description provides the visualization framework that will support embodiments of the present invention. The description includes an overview of the importance of Visual Design including a brief historical recount of a world-recognized leading visualization. The description also sets out the Visual Design classifications for the described solution.
It will be understood that the Visual Design examples described in this section are examples for illustrative purposes to identify the concepts behind how the visualization is produced. Therefore, it will further be understood that the concepts described can produce visual designs different to those specifically described. The Visual Design examples shown are also used to help the reader understand the narrative describing the Visual Designs.
The system described is specifically adapted to create actual specific visualization designs relevant to selected vertical and horizontal industry applications being deployed.
A vertical industry application is one that is associated with a solution directed at a specific industry, such as, for example, the entertainment industry. In this example, BPDs relevant to that industry are created, such as rental patterns of movies over different seasons.
A horizontal industry application is one that is associated with solutions across multiple industries. For example, the BPD may be based on CRM analytics, which applies across a whole range of different industries.
Design is now a fundamental part of almost every aspect of how people live, work and breathe. Everything is designed, from a toothbrush to every aspect of a web site. Compare visual design to architectural design: in both cases anybody can draw quite complex pictures. The resulting pictures could have stimulating and well drawn graphic elements. In both cases, the question is why does the world need designers? Exploring this question more deeply one can ask: does it make such a difference to how one perceives and understands a design when it is made by a professional rather than an amateur?
The trend in business intelligence is to design tools to provide flexibility and leave the world of visual design to the amateurs. Stephen Few comments in Information Dashboard Design (Few, Stephen—from white paper “BizViz: The Power of Visual Business Intelligence”—Mar. 7, 2006. www.perceptualedge.com) that “Without a doubt I owe the greatest debt of gratitude to the many software vendors who have done so much to make this book necessary by failing to address or even contemplate the visual design needs of dashboards. Their kind disregard for visual design has given me focus, ignited my passion, and guaranteed my livelihood for years to come.”
Visual Designs within the described framework are well thought through in how the data is displayed. The described system allows good information visualization design concepts to be captured and delivered back to users as Visual Documents using unique data processing and analysis techniques.
According to this embodiment, ten Visual Design types are defined and incorporated into the described system. It will be understood that additional Visual Designs may be further defined including the creation of certain examples and actual Visual Designs for specific industry applications.
The visual design types include:
The following describes a method for the assessment of Visual Design quality. In assessing the quality of a Visual Design the following factors should be considered:
There are seven defined Visual Design Layers which are set out diagrammatically as shown in
These seven Visual Design Layers are described in the following table:
In terms of the Special Layer, two examples of Special Layers are set out below:
Source: Wikipedia (http://en.wikipedia.org/wiki/Voronoi_diagram)
In mathematics, a Voronoi diagram, named after Georgy Voronoi, also called a Voronoi tessellation, a Voronoi decomposition, or a Dirichlet tessellation (after Lejeune Dirichlet), is a special kind of decomposition of a metric space determined by distances to a specified discrete set of objects in the space, e.g., by a discrete set of points.
In the simplest and most common case, in the plane, a set of points S is given, and the Voronoi diagram for S is the partition of the plane which associates a region V(p) with each point p from S in such a way that all points in V(p) are closer to p than to any other point in S.
A Voronoi diagram can thus be defined as a Special Layer, where a set of polygons are generated from a set of points. The resulting polygon layer can then be subjected to thematic treatments, such as coloring.
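As a minimal sketch of the definition above, the following Python example labels each cell of a small grid with the index of its nearest point, which yields a rasterized approximation of the regions V(p). It does not construct the polygon boundaries themselves; the function names, the grid size and the sample points are assumptions made for illustration.

```python
import math

def nearest_site(point, sites):
    """Return the index of the site closest to point (Euclidean distance)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def voronoi_grid(sites, width, height):
    """Label every cell (x, y) of a width x height grid with its nearest site index."""
    return [[nearest_site((x, y), sites) for x in range(width)]
            for y in range(height)]

# Hypothetical set of points S in the plane.
sites = [(1, 1), (6, 2), (3, 6)]
grid = voronoi_grid(sites, 8, 8)   # grid[y][x] is the region that cell belongs to
```

A thematic treatment such as coloring could then be applied by mapping each region index in `grid` to a color derived from the data associated with that point.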
A calendar can be generated as a Special Layer for display of a temporal visual document. This Special Layer would require a ‘start date’ and an ‘end date’; most other information regarding the nature and structure of the Calendar could be determined automatically. The thematic layers would then use the structure of the calendar as a basis for thematic treatments such as coloring and contouring.
In an example from ENTROPÍA (ENTROPÍA; “Más tiempo”; http://www.luispabon.com/entropia/index.php?entry=entry071129-145959) a calendar is shown that can be created into a spiral. The structure and layout of this spiral will be the subject of considerable design discussions by information designers focused on issues such as aesthetics and clarity of information. The result of this discussion is a visual design of a spiral calendar Special Layer. This Special Layer can then be used for thematic treatments such as coloring.
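The following Python sketch illustrates how the structure of a simple calendar Special Layer might be derived automatically from only a start date and an end date. The conventional week-by-weekday grid layout, the function name and the cell fields are assumptions made for this example; a spiral or other designed layout would replace the coordinate calculation.

```python
from datetime import date, timedelta

def calendar_layer(start_date, end_date):
    """Return one cell per day, with grid coordinates derived from the dates alone."""
    # Anchor the grid on the Monday of the week containing the start date.
    first_monday = start_date - timedelta(days=start_date.weekday())
    cells = []
    day = start_date
    while day <= end_date:
        cells.append({
            "date": day,
            "row": (day - first_monday).days // 7,  # week index within the layer
            "col": day.weekday(),                   # 0 = Monday ... 6 = Sunday
        })
        day += timedelta(days=1)
    return cells

# Hypothetical temporal range for a visual document.
layer = calendar_layer(date(2009, 1, 1), date(2009, 1, 31))
```

Each cell can then receive a thematic treatment, for example a color derived from the measure recorded for that day.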
As touched on in the background discussion, it would be beneficial to review, analyze and/or represent data based on the purpose of a query.
Referring to
A set of data interface objects 1203 are generated by the herein described system based on the user's request. A data interface object is an object that provides an interface between a user and the underlying data. For example, data interface objects include BPDs and Dimensions.
A data interface object builder 1205 generates the data interface objects based on a set of rules contained and implemented by a rules engine 1207. That is, the builder 1205 operates in accordance with the rules of the rules engine.
A query builder 1209 receives the generated data interface objects from the builder 1205 to generate queries based on instructions received from a perspective generator 1213. The perspective generator receives input data from a perspective request module 1211 that receives instructions from the user interface 1201.
The query builder develops the relevant queries and accesses the appropriate or required data from a data store 1215. The results are then transferred to a visualization document output module which generates the visualization document. The resultant document is then provided to an output module 1219, such as a display, printer or further storage module.
Various different perspectives can be generated by the perspective generator 1213 without having to rework all of the data interface objects, thus providing a dynamic system. For example, the different perspectives may include the following:
A first perspective wherein a change in state of whatever is being analyzed may be considered irrelevant to the user's data requirements. For example, a recorded change in location within the data may not be considered relevant to the user's requested perspective of the data. Therefore, the only requirement is to access a view of the available data for a defined time period regardless of the state or condition of the item or entity being analyzed. One example would be a financial view of a company as it has moved through various different location changes (another example could be restructuring of a company). When analyzing the financial data for that company for a defined time period, the actual state (e.g. location or structure) of the company at the time may not be considered relevant to a financial analyst. That is, the analyst may only want to see the true financial perspective.
A second perspective wherein a last known state of what is being analyzed is relevant and only the transaction data associated with that last state is retrieved and analyzed for this perspective. For example, a change in location for a set of data may be considered particularly relevant as the analyst only wants to see how that data is reacting to that location.
A third perspective wherein a change in state of whatever is being analyzed may be considered irrelevant to the user's data requirements as in the first perspective, but in this case the data for the time period being analyzed is considered to be relevant for the attribute of the dimensions of the data set as it was at the end time point of the analysis. For example, financial data for a company located in three different locations is used to determine the likely financial state of that company in its current location. That is, the earlier financial data for different locations is considered relevant for the purposes of the current state of the company, even though the location has changed. In terms of a gaming device, for example, for marketing program analysis it is important to know how a device has performed in a number of different locations to get an understanding of how it may perform in its current location. Merely ignoring all previous locations for that device may corrupt the data analysis.
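The three perspectives above can be sketched, under stated assumptions, as three ways of linking the same transactions to state records. The following Python example is one possible reading only: the record layout, the sample data, the exclusive end dates and the names `record_at` and `totals` are hypothetical and are not taken from the described system.

```python
from datetime import date

# Hypothetical state records: (machine, location, valid_from, valid_to),
# where valid_to is exclusive and a far-future date marks the current state.
SCD = [
    ("M1", "Area A", date(2009, 1, 1), date(2009, 1, 16)),
    ("M1", "Area B", date(2009, 1, 16), date(9999, 12, 31)),
    ("M2", "Area C", date(2009, 1, 1), date(9999, 12, 31)),
]
# Hypothetical transactions: (machine, day, amount).
TXNS = [
    ("M1", date(2009, 1, 10), 100),
    ("M1", date(2009, 1, 20), 40),
    ("M2", date(2009, 1, 12), 70),
]

def record_at(machine, day):
    """Return the state record valid for machine on day, or None."""
    for rec in SCD:
        if rec[0] == machine and rec[2] <= day < rec[3]:
            return rec
    return None

def totals(start, end, perspective):
    """Sum amounts by location over [start, end] under one perspective."""
    out = {}
    for machine, day, amount in TXNS:
        if not (start <= day <= end):
            continue
        at_time = record_at(machine, day)  # state when the transaction occurred
        at_end = record_at(machine, end)   # state at the end of the analysis
        if perspective == "first":
            rec = at_time                  # every transaction, state as it was
        elif perspective == "second":
            rec = at_end if at_time == at_end else None  # last known state only
        elif perspective == "third":
            rec = at_end                   # every transaction, attributed to end state
        else:
            raise ValueError(perspective)
        if rec is not None:
            out[rec[1]] = out.get(rec[1], 0) + amount
    return out

first = totals(date(2009, 1, 1), date(2009, 1, 31), "first")
second = totals(date(2009, 1, 1), date(2009, 1, 31), "second")
third = totals(date(2009, 1, 1), date(2009, 1, 31), "third")
```

In this sample the first and third perspectives account for the same overall amount but attribute it differently, while the second perspective omits the transaction made before machine M1 was moved.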
For example, the query may specifically relate to data obtained from:
Each query has specific attributes associated with it. These attributes define the purpose of the query. For example, for different purposes, different time periods may need to be analysed, e.g. what time period must be analysed to satisfy the purpose of the query?
Database queries are generated according to this invention in order to produce multiple perspective query results.
The present application is focussed on a query builder (i.e. an SQL generator) that is used to query a database. A mechanism is provided that enables the query builder to understand a pre-defined set of different perspectives and to generate these multiple perspectives from the query provided.
A single query is generated using standard SQL techniques based on the requirements of the analytical perspective. The query generated is not represented in standard SQL but is based on a set of rules that is then used to build the SQL.
The system then generates a set of multiple pre-defined perspectives from the single query.
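As a hedged illustration of a rule-based query builder, the following Python sketch takes a single query specification and builds one SQL statement per pre-defined perspective, then runs the statements against an in-memory SQLite database. The table names, column names, rule set and sample rows are all assumptions made for this example; the rule format actually used by the described system is not specified here.

```python
import sqlite3

# Hypothetical rules: how each perspective links transactions (t) to state rows (s).
AT_TXN = "t.txn_date >= s.start_date AND t.txn_date < s.end_date"
AT_END = ":end_date >= s.start_date AND :end_date < s.end_date"
RULES = {
    "first": [AT_TXN],           # state as it was when each transaction occurred
    "second": [AT_TXN, AT_END],  # only transactions made in the state held at the end
    "third": [AT_END],           # all transactions, attributed to the end state
}

def build_sql(spec, perspective):
    """Build the SQL for one perspective from the single query specification.

    Identifiers in spec are interpolated directly, so they must come from
    trusted configuration and never from raw user input.
    """
    link = " AND ".join(RULES[perspective])
    return (
        f"SELECT s.{spec['group_by']}, SUM(t.{spec['measure']}) "
        "FROM txn t JOIN machine_scd s "
        f"ON t.machine = s.machine AND {link} "
        "WHERE t.txn_date BETWEEN :start_date AND :end_date "
        f"GROUP BY s.{spec['group_by']} ORDER BY s.{spec['group_by']}"
    )

def run_all(conn, spec, start_date, end_date):
    """Generate and execute every pre-defined perspective for one specification."""
    params = {"start_date": start_date, "end_date": end_date}
    return {p: conn.execute(build_sql(spec, p), params).fetchall() for p in RULES}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE machine_scd "
             "(machine TEXT, location TEXT, start_date TEXT, end_date TEXT)")
conn.execute("CREATE TABLE txn (machine TEXT, txn_date TEXT, amount INTEGER)")
conn.executemany("INSERT INTO machine_scd VALUES (?, ?, ?, ?)", [
    ("M1", "Area A", "2009-01-01", "2009-01-16"),
    ("M1", "Area B", "2009-01-16", "9999-12-31"),
    ("M2", "Area C", "2009-01-01", "9999-12-31"),
])
conn.executemany("INSERT INTO txn VALUES (?, ?, ?)", [
    ("M1", "2009-01-10", 100), ("M1", "2009-01-20", 40), ("M2", "2009-01-12", 70),
])

spec = {"measure": "amount", "group_by": "location"}  # the single query
results = run_all(conn, spec, "2009-01-01", "2009-01-31")
```

Because the perspectives differ only in the linking rule, adding a further perspective in this sketch requires a new entry in `RULES` rather than a rewritten query.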
The system does not carry out location or machine based analysis but instead performs business type analysis on the queries to produce the appropriate kind of SQL.
Two examples of considerations of the SQL generation that take place are:
How is the business type analysis performed on the queries?
What are the basic steps of the business type analysis?
One of the key advantages of the visual representation of the different perspectives is the ability to view or manage representations of the data over a structure. The structure may be a map, layout or hierarchy, for example.
According to one example, slowly changing dimensions (SCDs) are implemented to enable data to be retrieved from specific points in time based on the location or configuration of the item or asset being analysed. It will also be understood that a SCD may also be based on variables other than location or configuration, such as version, product line, rating etc.
According to another example, an extension of the AS AT statement may be utilised to query the database. The AS AT statement is an extension to SQL that some databases support. Where this extension has been implemented then the SQL generation may use these constructs. One advantage of this implementation is that the SQL is easier to understand and the complexity of a slowly changing dimension (SCD) is shielded from the end user. The AS AT statement enables a query to be made to a database to retrieve relevant data associated with a specific point in time.
The herein described system and methods aim to solve the problem associated with structure of data values versus the data value transactions. That is, the system addresses the question: If the structure (e.g. map or hierarchy) of the data values changes, which transactions should be used by the system in order to solve the query?
For example, if an airplane no longer has first class seats it does not make any sense to include the first class fares from historic flights when looking at the rate of revenue being generated by the airplane, as these first class seats can no longer have any effect on the rate of revenue.
According to another example, a website hierarchy changes all the time. That is, the structure and content of the website are continually changing. However, a key defining characteristic of each change is that it has a start date and an end date. For current data, a future value may be allocated to the data to indicate that the data is current. Taking the airplane scenario again, the airplane may have a specific configuration of seats from a specific start date to a specific end date.
A more detailed description of the present invention is now provided.
Embodiments of the present invention are described herein with reference to a system adapted or arranged to perform a method for generating multiple perspective views of a graphical representation of data based on a single database query.
In summary, the system includes at least a processor, one or more memory devices or an interface for connection to one or more memory devices, input and output interfaces for connection to external devices in order to enable the system to receive and operate upon instructions from one or more users or external systems, a data bus for internal and external communications between the various components, and a suitable power supply. Further, the system may include one or more communication devices (wired or wireless) for communicating with external and internal devices, and one or more input/output devices, such as a display, pointing device, keyboard or printing device.
The processor is arranged to perform the steps of a program stored as program instructions within the memory device. The program instructions enable the various methods of performing the invention as described herein to be performed. The program instructions may be developed or implemented using any suitable software programming language and toolkit, such as, for example, a C-based language. Further, the program instructions may be stored in any suitable manner such that they can be transferred to the memory device or read by the processor, such as, for example, being stored on a computer readable medium. The computer readable medium may be any suitable medium, such as, for example, solid state memory, magnetic tape, a compact disc (CD-ROM or CD-R/W), memory card, flash memory, optical disc, magnetic disc or any other suitable computer readable medium.
The system is arranged to be in communication with external or internal data storage systems or devices in order to retrieve the relevant data. For example, an SQL database may be in communication with the system and a database analysis module may be arranged to communicate with the SQL database to retrieve data based on the SQL statements generated by the system.
It will be understood that the system herein described includes one or more elements that are arranged to perform the various functions and methods. The following portion of the description is aimed at providing the reader with an example of a conceptual view of how various modules and/or engines that make up the elements of the system may be interconnected to enable the functions to be implemented. Further, the following portion of the description explains in system related detail how the steps of the herein described method may be performed. The conceptual diagrams are provided to indicate to the reader how the various data elements are processed at different stages by the various different modules and/or engines.
It will be understood that the arrangement and construction of the modules or engines may be adapted accordingly depending on system and user requirements so that various functions may be performed by different modules or engines to those described herein, and that certain modules or engines may be combined into single modules or engines.
It will be understood that the modules and/or engines described may be implemented and provided with instructions using any suitable form of technology. For example, the modules or engines may be implemented or created using any suitable software code written in any suitable language, where the code is then compiled to produce an executable program that may be run on any suitable computing system. Alternatively, or in conjunction with the executable program, the modules or engines may be implemented using any suitable mixture of hardware, firmware and software. For example, portions of the modules may be implemented using an application specific integrated circuit (ASIC), a system-on-a-chip (SoC), field programmable gate arrays (FPGA) or any other suitable adaptable or programmable processing device.
The methods described herein may be implemented using a general purpose computing system specifically programmed to perform the described steps. Alternatively, the methods described herein may be implemented using a specific computer system such as a data visualization computer, a database query computer, a graphical analysis computer, a retail environment analysis computer, a gaming data analysis computer, a manufacturing data analysis computer, a business intelligence computer, a social network data analysis computer, etc., where the computer has been specifically adapted to perform the described steps on specific data captured from an environment associated with a particular field.
The data provided as an input to the system may be of any suitable type of data, for example, real world data including, but not limited to, gaming or gambling data associated with a gaming environment such as a casino, event data, test or quality control data obtained from a manufacturing environment, business data retrieved from an accounting system, sales data retrieved from a company database, data received or accumulated from a social network, etc. All this data may be received by the system in real time (e.g. by receiving streaming data) in a cache memory or may be stored in a more permanent manner.
The following example of an implementation of the herein described invention is directed at gaming floor data. However, it will be understood that the present invention may be applied to other forms of data, such as retail data, business data, social network data, manufacturing data, travel services data (e.g. flight, rail and/or vehicle travel related data) etc.
When analyzing a gaming floor it is important to determine what the question is that the person is looking to answer, as this impacts on the data requirements. When examining these questions it can be seen that there may be more than one correct answer to the question “what are the performance numbers from my gaming floor?”.
To illustrate these different perspectives, three quite different questions are analyzed.
1) From the financial perspective—e.g. how much money is my gaming floor making, and what are the contributions from individual games? For example, this may be a transaction specific query.
2) From a device perspective, such as a gaming machine or slot machine, e.g. how are my devices performing? For example, this may be a device specific query.
3) From the marketing or customer perspective, e.g. what impacts are my marketing programs or incentives having on my gaming floor? For example, this may be a location specific query.
Within the world of transaction databases, current records of financial transactions are typically kept. However, in the world of analytical databases it is more beneficial to put together a seamless view of the world so that the data can be analyzed at a future point in time. For example, if looking at a gaming floor from January 2009 and displaying the results on a map, it is important that the map being used is correct as of January 2009. There are a number of approaches to solving this problem, including the “period type” which database companies such as Teradata have implemented.
A period type typically has a beginning bound and an ending bound; both of these must be of the same type: DATE, TIME, or TIMESTAMP. A period represents a duration starting from the beginning bound and going up to the ending bound; the period does not include the ending bound.
Using the period type the process of querying the database is much simpler as the database handles the storage of the history of the objects.
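The half-open behaviour of a period, in which the beginning bound is included and the ending bound is not, can be sketched in Python as follows. This is an illustrative model only and is not the implementation of any particular database; the class name `Period` and its `contains` method are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Period:
    begin: date  # inclusive beginning bound
    end: date    # exclusive ending bound

    def __post_init__(self):
        # Both bounds must be of the same type and must be correctly ordered.
        if type(self.begin) is not type(self.end):
            raise TypeError("bounds must be of the same type")
        if self.begin >= self.end:
            raise ValueError("beginning bound must precede ending bound")

    def contains(self, moment):
        """True if moment falls within the period; the ending bound is excluded."""
        return self.begin <= moment < self.end

# Hypothetical period covering January 2009.
january = Period(date(2009, 1, 1), date(2009, 2, 1))
```

Excluding the ending bound allows consecutive periods to share a boundary date without any moment belonging to both.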
One approach is to use a slowly changing dimension (SCD). An SCD applies to a situation in which the attribute for a record can vary over time.
Consider, as an example, the case of a carded customer Harold who used to live in Las Vegas. So the initial entry in the player lookup table of Casino Silver Nugget looked like—
Harold has moved to Henderson now, and the Silver Nugget has to modify the customer table to reflect this change. This is known as the SCD problem.
There are many options for implementation of SCD, the most common one being a Type 2 SCD as shown below. These variations involve adding columns for start date and end date to the record.
The Type 2 method tracks historical data by creating multiple records in the dimensional tables with separate keys. With Type 2, unlimited history preservation is provided as a new record is inserted each time a change is made.
In the same example, if the supplier moves to Illinois, the table would look like this:
Another popular method for tuple versioning is to add effective date columns.
Null End_Date signifies current tuple version. In some cases, a standardized surrogate high date (e.g. 9999-12-31) may be used as an end date, so that the field can be included in an index.
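A minimal sketch of tuple versioning with effective date columns is given below in Python, using a surrogate high date to mark the current version. The dates on which Harold's records begin, the field names and the helper functions `apply_change` and `as_at` are hypothetical and serve only to illustrate the technique.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # surrogate end date marking the current version

def apply_change(history, key, attributes, change_date):
    """Close the current version for key (if any) and append a new version."""
    for row in history:
        if row["key"] == key and row["end_date"] == HIGH_DATE:
            row["end_date"] = change_date
    history.append({"key": key, **attributes,
                    "start_date": change_date, "end_date": HIGH_DATE})

def as_at(history, key, day):
    """Return the version of key that was in effect on day, or None."""
    for row in history:
        if row["key"] == key and row["start_date"] <= day < row["end_date"]:
            return row
    return None

# Hypothetical player lookup history; the change dates are assumed.
players = []
apply_change(players, "Harold", {"city": "Las Vegas"}, date(2008, 1, 1))
apply_change(players, "Harold", {"city": "Henderson"}, date(2009, 6, 1))
```

Because no version is ever overwritten, a query for any past date returns the attributes as they were on that date.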
In the world of slots, the definition of a new record in the SCD is a little more complex than one would think. Following are some questions that should be considered before defining the SCD record.
The challenge is that the more SCD records that are added the more fragmented the data becomes. This balance in requirements between analytical fragmentation and the questions that can be answered makes the process of answering these questions a critical part of the warehouse design. It is our recommendation that any changes to theme, location or denomination are tracked and that small adjustments to hold % (which are often corrections) are not tracked.
The alternative is to create a maximum detail SCD that tracks every change as a new record. This alternative requires that additional views of the data are created to build SCDs for the different kinds of analyses. While this maximum detail approach is very appealing, pragmatism normally rules and only one SCD is kept for analytical purposes.
The following is an example of a slot configuration SCD, where each row represents an attribute in the dimension table. The key to this dimension table is that it combines both attributes of the game and the location of the game.
This configuration table is quite a simplification of the machine config table as managed by the gaming system; however it also holds the additional information that shows the location of each gaming device. In other words if a gaming device is moved it will create a new record in the SCD.
This simplification enables users to answer business questions related to changes in the attributes in question. For example, “show me the number of theme changes at this location” or by joining to the transaction table, “show me the total revenue generated by game type by location or by area of the property”.
If the LocationID is not part of the SCD then location related questions may not be possible, so it will be difficult to answer questions like, “how did the game moves affect the performance of the games?” or “how did this game perform in different areas of the gaming floor?”.
With the buzz surrounding downloadable games and the ability to place any game at any location with the push of a button, location based analysis emerges as a dimension in the gaming data. Location based questions like “how are these themes going in this location?” or “what is the best kind of theme in this location?” are difficult to answer. As downloadable games can have dramatic numbers of changes, the management of the associated SCD creates a complex data management problem.
In the building of data warehouses, the data integrity questions associated with the transaction data are relatively easy to answer. For example, the actual win numbers can be added up and compared to the total revenue. However, SCDs are much more difficult to manage. For example, if location based changes are not captured accurately as they happen, it is likely this information will be lost forever. If these changes are not accurately tracked then analysis that depends on the correct dimensional information may be fundamentally flawed.
Referring to
From the financial perspective, the main question is usually “how much money is my gaming floor making?”.
These reports, which contain gaming machines that have been removed or reconfigured, must reconcile back to the overall volume of money that is generated by the gaming floor.
The queries to generate this data include all transactions for the period and link each transaction to the SCD that is appropriate for the time. As with all things, compromises in the construction of the SCD, for example adjustments to the hold percentage, may result in variations in the final numbers.
Three machines, 1, 2 and 3 are being viewed. A number of records are recorded for the machines. That is, machine 1 has a first record 801 and a second record 803. Machine 2 has a first record 805. Machine 3 has a first record 807, a second record 809 and a third record 811. Each record indicates a particular state of the machine for a particular time period. That is, as the location or configuration of a machine changes, a new record is created. This produces slowly changing dimensions (SCDs) associated with the machines.
The circled area for each machine identifies the period or records that are relevant for the appropriate measures and performance analysis. That is, for this form of analysis, the measures that are relevant are for the whole period (between the start date and the end date) with one record per machine configuration.
Referring to
According to this perspective the key question from the slot perspective is “how are my games performing?”.
When moving to the gaming floor analysis, the current or active floor at the end point of the analysis is of greatest importance. The gaming analyst is looking at different locations on the casino floor, the different games at these locations, and the numbers they are generating on different days of the week. For gaming floor analysis we would consider it to be quite misleading to include the performance of historic games in location performance numbers.
According to this example, the Transaction data is restricted to include only values from machines and their locations at the End Date of the visualization. Common issues in the SCD here include not tracking location changes of games, if these new locations do not create new SCD records then the location performance numbers could be blend of the transactions from an old location and the new location. For example, an extreme case of this would be showing the numbers the machine generated while it was placed in the high limit room on the new location the game was moved to.
Again, three machines, 1, 2 and 3 are being viewed. A number of records are recorded for the machines. That is, machine 1 has a first record 901 and a second record 903. Machine 2 has a first record 905. Machine 3 has a first record 907, a second record 909 and a third record 911. Each record indicates a particular state of the machine for a particular time period. That is, as the location or configuration of a machine changes, a new record is created. This produces slowly changing dimensions (SCDs) associated with the machines.
The circled area for each machine identifies the period or records that are relevant for the appropriate measures and performance analysis. That is, for this form of analysis, the relevant measures are those for the machines that are located on the gaming floor at the end date. The dimensions are machines on the floor at the end date.
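As a hedged sketch of the gaming floor perspective, the following restricts transactions to the records that are active at the end date. The record and transaction schemas, the function names, and the sample figures are assumptions made for illustration; the sketch also assumes a record's `valid_to` date is exclusive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ScdRecord:
    machine_id: int
    location: str
    valid_from: date
    valid_to: Optional[date]  # exclusive end; None means still current


@dataclass(frozen=True)
class Transaction:
    machine_id: int
    when: date
    amount: float


def active_at(record: ScdRecord, day: date) -> bool:
    """True if the record describes the machine's state on the given day."""
    return record.valid_from <= day and (
        record.valid_to is None or day < record.valid_to)


def floor_perspective(records: List[ScdRecord], transactions: List[Transaction],
                      start: date, end: date) -> Dict[str, float]:
    """Total amount per location, using only records active at the end date."""
    totals: Dict[str, float] = {}
    for rec in records:
        if not active_at(rec, end):
            continue  # historic record: excluded from floor analysis
        lower = max(start, rec.valid_from)
        for tx in transactions:
            if tx.machine_id == rec.machine_id and lower <= tx.when <= end:
                totals[rec.location] = totals.get(rec.location, 0.0) + tx.amount
    return totals


# Machine 1 sat in the high limit room, then moved to the main floor.
records = [
    ScdRecord(1, "HighLimit", date(2012, 1, 1), date(2012, 2, 1)),
    ScdRecord(1, "MainFloor-12", date(2012, 2, 1), None),
]
transactions = [
    Transaction(1, date(2012, 1, 15), 1000.0),  # earned in the high limit room
    Transaction(1, date(2012, 2, 10), 50.0),    # earned at the new location
]
result = floor_perspective(records, transactions,
                           date(2012, 1, 1), date(2012, 2, 28))
```

Because the move created a new SCD record, the high limit room transaction is not attributed to the new location; only the transaction recorded after the move counts towards it.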
Referring to
From the marketing perspective, the key question is “what impacts are my marketing programs or incentives having on my gaming floor?”.
Now enter the marketing department and the questions become quite different. The marketing analyst is asking “where did my loyalty or incentive dollars get redeemed?” or “where did players who responded to my marketing program play on the gaming floor?”. The marketing person is unlikely to ask to be shown the transactions of only players who played on games that have not been changed by slots.
This quite different question requires quite a different analysis. For this analysis we are only interested in locations on the floor at the end point of the analysis, and we typically do not wish to restrict the analysis to only the games at these locations. The Transaction Data is shown at the location at which it occurred, irrespective of the games that were at the location at the time.
Again, three machines, 1, 2 and 3 are being viewed. A number of records are recorded for the machines. That is, machine 1 has a first record 1001 and a second record 1003. Machine 2 has a first record 1005. Machine 3 has a first record 1007, a second record 1009 and a third record 1011. Each record indicates a particular state of the machine for a particular time period. That is, as the location or configuration of a machine changes, a new record is created. This produces slowly changing dimensions (SCDs) associated with the machines.
The circled area for each machine identifies the period or records that are relevant for the appropriate measures and performance analysis. That is, for this form of analysis, the relevant measures are those for the whole period. The dimensions are machines on the floor at the end date.
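The marketing perspective can be sketched in a similar way. In this illustrative example, which assumes hypothetical `LocationRecord` and `Play` tuples and an exclusive `valid_to` date, every transaction in the whole period is attributed to the location at which it occurred, and only locations that exist on the floor at the end date are reported, regardless of which machine was there at the time.

```python
from datetime import date
from typing import Dict, List, NamedTuple, Optional


class LocationRecord(NamedTuple):
    machine_id: int
    location: str
    valid_from: date
    valid_to: Optional[date]  # exclusive end; None means still current


class Play(NamedTuple):
    machine_id: int
    when: date
    amount: float


def is_active(rec: LocationRecord, day: date) -> bool:
    return rec.valid_from <= day and (rec.valid_to is None or day < rec.valid_to)


def marketing_totals(records: List[LocationRecord], plays: List[Play],
                     start: date, end: date) -> Dict[str, float]:
    """Whole-period totals per location, limited to locations on the floor at end."""
    totals = {r.location: 0.0 for r in records if is_active(r, end)}
    for play in plays:
        if not (start <= play.when <= end):
            continue
        # Find where the machine was when the transaction occurred.
        rec = next((r for r in records
                    if r.machine_id == play.machine_id
                    and is_active(r, play.when)), None)
        if rec is not None and rec.location in totals:
            totals[rec.location] += play.amount
    return totals


records = [
    LocationRecord(1, "A1", date(2012, 1, 1), date(2012, 2, 1)),
    LocationRecord(1, "B7", date(2012, 2, 1), None),
    LocationRecord(2, "C3", date(2012, 1, 1), date(2012, 2, 1)),
    LocationRecord(2, "A1", date(2012, 2, 1), None),
]
plays = [
    Play(1, date(2012, 1, 15), 100.0),  # machine 1 while at A1
    Play(1, date(2012, 2, 10), 40.0),   # machine 1 while at B7
    Play(2, date(2012, 1, 20), 70.0),   # machine 2 while at C3 (not on floor at end)
    Play(2, date(2012, 2, 15), 25.0),   # machine 2 while at A1
]
totals = marketing_totals(records, plays, date(2012, 1, 1), date(2012, 2, 28))
```

Here location A1 accumulates play from both machines that occupied it during the period, while location C3 is omitted because no machine is placed there at the end date.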
The correct management of the SCD enables the answering of questions relating to how the gaming floor is being managed and the impacts of configuration changes and location moves.
As illustrated above, each of the perspectives is correct. Each perspective will generate different data sets, and is only right for the questions that are being asked.
For example, machine 1 has a number of location SCDs that are created each time the machine is moved to a new location. It also has associated with it a number of configuration SCDs that are generated each time the machine's configuration is changed. For machine 1, it can be seen that the machine has been relocated once, hence the creation of LSCD2. Also, for machine 1, the configuration has been changed quite regularly to create three separate configuration SCDs: CSCD1, CSCD2 and CSCD3.
Machine 2 has also been relocated once to create two location SCDs. The configuration of machine 2 was also changed to create CSCD2 when the machine was relocated.
Machine 3 was relocated twice within the time period shown to create three different location SCDs. However, the configuration of the machine remained constant during this time period.
Machine 4 remained in one location during the time period shown. However, its configuration changed once halfway through the time period.
Machine 5 was neither moved nor had a change in configuration during the time period shown.
Therefore, for the various different types of analysis, different SCDs would be retrieved and analyzed to generate the multiple perspectives of the data.
For example, for performance analysis, all transactions would be retrieved that have occurred or are occurring between the start and end times of the period shown for each machine.
As a further example, for machine analysis, the transactions used are those associated with the most current SCDs. That is, for machine 1, LSCD2 (not LSCD1) would be used and CSCD3 (not CSCD1 or CSCD2) would be used.
As a further example, for customer analysis, only transactions from the location SCDs could be used. All transactions for all LSCDs within the time period could be used.
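The three selection rules above can be summarised in a short sketch. The `Scd` tuple, the `select_scds` function, the analysis names used as strings, and the integer day offsets for machine 1 are all hypothetical values chosen for illustration; they are not taken from the figures.

```python
from typing import Dict, List, NamedTuple


class Scd(NamedTuple):
    name: str
    kind: str   # "location" or "configuration"
    start: int  # day offset at which the SCD begins (assumed units)
    end: int    # day offset at which the SCD ends (exclusive)


def select_scds(analysis: str, scds: List[Scd], start: int, end: int) -> List[str]:
    """Return the names of the SCDs used for the given type of analysis."""
    in_window = [s for s in scds if s.start < end and s.end > start]
    if analysis == "performance":
        # Everything that occurred or is occurring within the period.
        return [s.name for s in in_window]
    if analysis == "machine":
        # Only the most current SCD of each kind.
        latest: Dict[str, Scd] = {}
        for s in in_window:
            if s.kind not in latest or s.start > latest[s.kind].start:
                latest[s.kind] = s
        return sorted(s.name for s in latest.values())
    if analysis == "customer":
        # All location SCDs within the period.
        return [s.name for s in in_window if s.kind == "location"]
    raise ValueError(f"unknown analysis type: {analysis}")


# Machine 1: relocated once, configuration changed twice (assumed timings).
machine_1 = [
    Scd("LSCD1", "location", 0, 40),
    Scd("LSCD2", "location", 40, 100),
    Scd("CSCD1", "configuration", 0, 30),
    Scd("CSCD2", "configuration", 30, 60),
    Scd("CSCD3", "configuration", 60, 100),
]
performance = select_scds("performance", machine_1, 0, 100)
machine = select_scds("machine", machine_1, 0, 100)
customer = select_scds("customer", machine_1, 0, 100)
```

With these assumed timings, the machine analysis selects LSCD2 and CSCD3, the customer analysis selects LSCD1 and LSCD2, and the performance analysis selects all five SCDs, so one set of records yields a different data set for each perspective.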
It will be understood that the embodiments of the present invention described herein are by way of example only, and that various changes and modifications may be made without departing from the scope of invention.
This application is a Continuation-in-Part application of PCT/NZ2012/000045, filed 23 Mar. 2012, which claims benefit of U.S. Provisional Ser. No. 61/467,260, filed 24 Mar. 2011. This application is a Continuation-in-Part of U.S. Ser. No. 13/000,323, filed 20 Dec. 2010, which is a National Stage of PCT/NZ2009/000114, filed 19 Jun. 2009, which claims benefit of U.S. Provisional Ser. Nos. 61/074,347, filed 20 Jun. 2008, 61/093,428, filed 1 Sep. 2008, 61/101,670, filed 30 Sep. 2008, 61/101,672, filed 30 Sep. 2008, 61/107,665, filed 22 Oct. 2008, 61/115,036, filed 15 Nov. 2008, 61/118,211, filed 26 Nov. 2008, 61/140,556, filed 23 Dec. 2008, 61/145,775, filed 20 Jan. 2009, 61/146,133 filed 21 Jan. 2009, 61/146,430, filed 22 Jan. 2009, 61/146,525, filed 22 Jan. 2009, and 61/161,472, filed 19 Mar. 2009. All applications are incorporated herein by reference. To the extent appropriate, a claim of priority is made to each of the above disclosed applications.
Number | Date | Country
---|---|---
61467260 | Mar 2011 | US
61074347 | Jun 2008 | US
61093428 | Sep 2008 | US
61101670 | Sep 2008 | US
61101672 | Sep 2008 | US
61107665 | Oct 2008 | US
61115036 | Nov 2008 | US
61118211 | Nov 2008 | US
61140556 | Dec 2008 | US
61145775 | Jan 2009 | US
61146133 | Jan 2009 | US
61146525 | Jan 2009 | US
61146430 | Jan 2009 | US
61161472 | Mar 2009 | US
Relation | Number | Date | Country
---|---|---|---
Parent | PCT/NZ2012/000045 | Mar 2012 | US
Child | 14035247 | | US
Parent | 13000323 | Jul 2011 | US
Child | PCT/NZ2012/000045 | | US