The invention is more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which:
Like reference numerals refer to corresponding parts throughout the several views of the drawings.
Various features associated with the operation of the present invention will now be set forth. Prior to such description, a glossary of terms used throughout this description is provided.
Axis. An axis is a space along which data is arranged. For example, an axis is a line or curve in a visualization that defines a spatial direction within the visualization. An axis can be a line with equal values. A pair of orthogonal axes, e.g., an x-axis and a y-axis, defines a Cartesian coordinate system.
Chart. A chart includes a collection of visual elements used to convey information. A chart is a visualization.
Data. Data is qualitative or quantitative information that is stored in a data source. Data is the information that is presented in a report. Data can have associated metadata.
Dimension. A dimension is a line in a real or abstract space. An example of a real space dimension is a pair of antiparallel cardinal points on a compass, e.g., North and South, North-northwest and South-southeast. Another real dimension is time. An example of an abstract space dimension is a list of stores. The dimension is abstract because the list can be ordered alphabetically by name, by store number, by distance from head office, etc. Examples of dimensions include region, store, year, customer, employee, product line, and the like.
Family. A family is a group of similar or related things. Visualizations can be grouped into families. Charts can be grouped into families. Families of charts include, but are not limited to: status charts (e.g., gauges, barometers/thermometers, LEDs); variation charts (e.g., radar, polar, heat maps); contribution comparison charts (e.g., pie, stacked 100%, pie series); rank compare charts (e.g., horizontal, grouped bar, deviation/zero axis bar, floating, stacked/subdivided); time series charts (e.g., line graph, column, waterfall/floating, deviated/zero axis, stacked/subdivided bar, stock/open-high-low-close, times series line, times series surface); frequency distribution charts (e.g., histogram, histograph); correlation charts (e.g., scatter plot, bubble plot, paired bar chart, paired/multiple scatter plot, bubble chart); combination charts (e.g., bar chart with line, pie slice with stacked bar, pie in time series, table); and other charts (e.g., graphical lists, spie chart, chart, log plot, semi-log plot, stereogram, contour plot, hanging rootogram, box plot, bag plot, mesh plot, graph, network, and tree).
Measure. A measure is a quantity as ascertained by comparison with a standard, usually denoted in some unit, e.g., units sold, dollars. A measure, such as revenue, can be displayed for the dimension “Year”. Corresponding measures can also be displayed for each of the values within a dimension.
Region of focus. The region of focus is an area of the report which the user wishes to explore. The region of focus is either set by default or is definable by a user event.
User event. A user event is an action taken by the operator of a computer. User events include the user clicking on an area of a table, chart, map or portion thereof which displays quantitative information. The user can select one or more: charts, maps, columns or rows in a table, axes or data within a chart, data in a time series, or regions in a map. Alternatively, the user event can include the user specifying a parameter to a report document.
Metadata. Metadata is information about information. Metadata can constitute a subset or representative values of a larger data set. For example, a piece of metadata could be associated with a piece of data and provide a description to that piece of data.
Table. A table maps the logical structure of a set of data into a series of columns or rows. Thus, a table is a visualization. To facilitate representation in two dimensions, higher-dimensional tables of data are often represented in an exploded view comprising a plurality of two dimensional tables. A table can be rectangular, triangular, octagonal, etc. A table can have row and column headings, where each cell in a table can show the value associated with the specific combination of row and column headings. Some tables can hold charts or maps in their cells; this is a spatially economic way to display many charts with common axes. A table is to be conceptually differentiated from a database table.
Value. A dimension includes one or more values, each of which can have associated measures. For example, the “Year” dimension may include 1999, 2000, 2001, 2002 as its values. The “Quarter” dimension would normally have 4 values corresponding to each quarter. Values can be displayed with associated measures.
Visualization. A visualization is a graphic display of quantitative information. Types of visualizations include charts, tables, and maps.
Cross-tab. A cross-tab (abbreviation of cross-tabulation) is a visualization of data that displays the joint distribution of two or more variables simultaneously. Cross-tabs are usually presented in a matrix format. Each cell shows the value associated with the specific combination of row and column headings.
A memory 110 is also connected to the bus 106. In an embodiment, the memory 110 stores one or more of the following modules: an operating system module 112, a graphical user interface (GUI) module 114, a business intelligence (BI) module 116, a data source interface module 118, and a visualization determination module 120.
The operating system module 112 may include instructions for handling various system services, such as file services, or for performing hardware-dependent tasks. The GUI module 114 may rely upon standard techniques to produce graphical components of a user interface, e.g., windows, icons, buttons, menus, and the like, examples of which are discussed below.
The BI module 116 includes executable instructions to perform BI-related functions, such as generating reports, performing queries and analyses, and the like. The BI module 116 can include the data source interface module 118 as a sub-module. The data source interface module 118 includes executable instructions for interfacing with an OLAP data source, such as an OLAP cube or semantic layer. The data source interface module 118 can include executable instructions to allow computer 100 to link, such as via an application program interface, to specific types, versions, or formats of an OLAP data source.
The visualization determination module 120 includes executable instructions to automatically determine if a visualization could be created based on specified data. The module 120 includes rules to determine whether a given chart type can render a meaningful representation for the data. The visualization determination module 120 can be interrogated by the BI module 116 or the data source interface module 118.
The executable modules stored in memory 110 are exemplary. It should be appreciated that the functions of the modules may be combined. In addition, the functions of the modules need not be performed on a single machine. Instead, the functions may be distributed across a network, if desired. Indeed, the invention is commonly implemented in a client-server environment with various components being implemented at the client-side and/or the server-side. It is the functions of the invention that are significant, not where they are performed or the specific manner in which they are performed.
A user interacts with a BI tool in the BI module 116. In an embodiment, the user maps data from a data source, e.g., an OLAP cube, to an axis of a visualization in a GUI. The visualization determination module 120 provides feedback to the BI tool and the user as to which visualizations can be rendered from the data.
For each visualization that the application is capable of rendering, the application queries the visualization determination module 120 to determine if the visualization makes sense for the given data. The application selects a visualization and submits it to the visualization determination module 120 at operation 204. Instructions in the visualization determination module 120 determine if the visualization makes sense for the given data 206. If the visualization is inappropriate (206-No), this fact is reported to the user or application 208. Typically, the application will not use the visualization. If the visualization is possible (206-Yes), then the given visualization is flagged by the application as meaningful 210. In an embodiment, the visualization determination module 120 uses metadata associated with the given data to determine if the visualization makes sense.
The application then determines if there are more visualizations 212. If so (212-Yes), then the current visualization is incremented 214. If not (212-No), then the processing continues at operation 216. In an embodiment, in operation 216 the visualization options are presented to the user. In another embodiment, the visualization options are updated view of a data source.
The following code segment is pseudo code that invokes some of the processing operations of
In this segment, the pseudo code at line AB declares a view and equates it with the view of the current scenario of the application. A view is a data structure that contains details about the data source, e.g., an OLAP data source or semantic layer, and how the data is mapped to the axes of a visualization. At line AC, a data structure is created with all the visualizations the application can render. At lines AD through AI, each of these visualizations is tested against a canRender function. In an embodiment, the canRender function is stored in visualization determination module 120. If the visualization can be rendered (line AG), the visualization is added to a list of commands that the application can execute (line AH). The for loop in lines AD through AI corresponds to processing operations 204 through 214 of the set of processing operations 200 of
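The loop described above can be sketched in Python as follows. This is a minimal illustration, not the original Pseudo Code Segment A: the `View` fields, the `RENDER_CHECKS` registry, and the function name `meaningful_visualizations` are hypothetical names introduced here, and the three checks are simplified stand-ins for canRender functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class View:
    """Hypothetical stand-in for a view: records which axes hold a measure."""
    x_has_measure: bool
    y_has_measure: bool


# Hypothetical registry of every visualization the application can render,
# each paired with a canRender-style check (simplified for illustration).
RENDER_CHECKS: Dict[str, Callable[[View], bool]] = {
    "bar": lambda v: v.x_has_measure != v.y_has_measure,
    "scatter": lambda v: v.x_has_measure and v.y_has_measure,
    "raw data": lambda v: True,
}


def meaningful_visualizations(view: View) -> List[str]:
    """Test each visualization against its check and collect those that pass."""
    commands: List[str] = []
    for name, can_render in RENDER_CHECKS.items():  # select next visualization
        if can_render(view):                        # does it make sense for the data?
            commands.append(name)                   # flag as meaningful
    return commands


if __name__ == "__main__":
    current_view = View(x_has_measure=False, y_has_measure=True)
    print(meaningful_visualizations(current_view))  # prints ['bar', 'raw data']
```

A registry keyed by visualization name is only one possible design; it keeps the loop unchanged when further checks are added.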
The following code segment is pseudo code for the canRender function invoked in Pseudo Code Segment A. Code Segment B is for bar charts.
In this segment, the canRender function takes a view and determines if a bar chart would be a meaningful visualization for the data. In lines BC and BD each axis is checked to see if it contains a measure. The logical exclusive-OR of the result of these checks is returned to the invoking code. In an embodiment, the invoking code is similar to Pseudo Code Segment A. The logic in the Pseudo Code Segment B indicates that a bar chart can be meaningful if either the x-axis or the y-axis contains a measure, but not both. Hence, a bitwise exclusive-OR or XOR (“^”) is used at line BE.
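As a hedged illustration of the bar chart rule just described, the following Python sketch applies an exclusive-OR to the two measure checks. The `Axis` and `View` classes and the name `can_render_bar` are assumptions made for this example and are not taken from Pseudo Code Segment B.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Axis:
    """Hypothetical axis: the measures and dimensions mapped onto it."""
    measures: List[str] = field(default_factory=list)
    dimensions: List[str] = field(default_factory=list)

    def has_measure(self) -> bool:
        return len(self.measures) > 0


@dataclass
class View:
    """Hypothetical view holding the x-axis and y-axis mappings."""
    x: Axis
    y: Axis


def can_render_bar(view: View) -> bool:
    """A bar chart is meaningful if exactly one of the two axes holds a measure."""
    x_has_measure = view.x.has_measure()
    y_has_measure = view.y.has_measure()
    return x_has_measure ^ y_has_measure  # XOR: one or the other, but not both


if __name__ == "__main__":
    revenue_by_year = View(x=Axis(dimensions=["Year"]), y=Axis(measures=["Revenue"]))
    print(can_render_bar(revenue_by_year))  # prints True
```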
The following code segment is pseudo code for the canRender function invoked in Pseudo Code Segment A for pie charts.
In this segment, the canRender function takes a view and determines if a pie chart is a meaningful visualization for the data. Pseudo Code Segments B, C, D and E all contain the same function name “canRender”, but with different interfaces, such that the executable instructions in memory 110 can be easily expanded. In lines CC and CD, the x-axis and the y-axis are each checked to see if they contain a measure. At line CE, the size of the z-axis is determined. The exclusive-OR of the results of the first two checks is combined, by a logical AND operation, with a check that the z-axis has a size greater than zero, and the result is returned to the invoking code (line CF). The logic in the Pseudo Code Segment C indicates that a pie chart can be meaningful if either the x-axis or the y-axis contains a measure, but not both, and the z-axis contains at least one dimension.
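The pie chart rule can be sketched in Python as below. This is an illustrative reading of the description, not the original Pseudo Code Segment C; the `Axis` and `View` classes, the `size` method, and the name `can_render_pie` are hypothetical, and the size check is assumed to apply to the z-axis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Axis:
    """Hypothetical axis: the measures and dimensions mapped onto it."""
    measures: List[str] = field(default_factory=list)
    dimensions: List[str] = field(default_factory=list)

    def has_measure(self) -> bool:
        return len(self.measures) > 0

    def size(self) -> int:
        # Number of items (measures plus dimensions) placed on this axis.
        return len(self.measures) + len(self.dimensions)


@dataclass
class View:
    """Hypothetical view holding the x-, y-, and z-axis mappings."""
    x: Axis
    y: Axis
    z: Axis


def can_render_pie(view: View) -> bool:
    """Exactly one of x or y holds a measure, and the z-axis is not empty."""
    x_has_measure = view.x.has_measure()
    y_has_measure = view.y.has_measure()
    z_size = view.z.size()
    return (x_has_measure ^ y_has_measure) and z_size > 0


if __name__ == "__main__":
    view = View(x=Axis(), y=Axis(measures=["Revenue"]), z=Axis(dimensions=["Region"]))
    print(can_render_pie(view))  # prints True
```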
The following pseudo code segment is for scatter charts. The segment is pseudo code under the canRender function interface invoked in Pseudo Code Segment A.
In this segment, the canRender function takes a view and determines if a scatter chart is a meaningful visualization for the data. In lines DC and DD, each axis is checked to see if it contains a measure. These checks are applied to a logical AND operation and the result is returned to the invoking code. The logic in the Pseudo Code Segment D indicates that a scatter chart can be meaningful if both the x-axis and the y-axis contain a measure. The scatter chart cannot be rendered in the absence of a dimension in the z-axis. However, it is possible to have a scatter chart with a measure in each axis.
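A minimal Python sketch of the logical AND check described for scatter charts follows. It covers only the two measure checks on the x-axis and y-axis; any z-axis condition is deliberately left out. The `Axis` and `View` classes and the name `can_render_scatter` are hypothetical and are not drawn from Pseudo Code Segment D.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Axis:
    """Hypothetical axis: the measures and dimensions mapped onto it."""
    measures: List[str] = field(default_factory=list)
    dimensions: List[str] = field(default_factory=list)

    def has_measure(self) -> bool:
        return len(self.measures) > 0


@dataclass
class View:
    """Hypothetical view holding the x-axis and y-axis mappings."""
    x: Axis
    y: Axis


def can_render_scatter(view: View) -> bool:
    """Both the x-axis and the y-axis must hold a measure."""
    return view.x.has_measure() and view.y.has_measure()


if __name__ == "__main__":
    view = View(x=Axis(measures=["Units Sold"]), y=Axis(measures=["Revenue"]))
    print(can_render_scatter(view))  # prints True
```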
The following pseudo code segment is for displaying data. The segment is pseudo code under the canRender function interface invoked in Pseudo Code Segment A.
In this segment, the canRender function takes a view and determines if raw data can be displayed. Since data can always be displayed, true is returned to the invoking code at line EC.
The workflow depicted in
In an embodiment of the present invention, the list of visualizations for which the view of the data can be checked against is expandable. In an embodiment, to expand the list, executable instructions are loaded into the memory 110 of
The aforementioned rules or instructions encoding the rules could be created using the above pseudo code segments as a guide. That is, canRender functions can be created for more visualizations. For example, a stacked bar chart requires multiple measures, whereas a chart from the status chart family (e.g., a speedometer or thermometer chart) requires a single value per cell (i.e., one measure) and nothing on the z-axis. Many visualizations could be added to embodiments of the present invention. These include the following chart families: status charts, variation charts, times series charts, correlation charts, compare contribution charts, combination charts, frequency distribution charts, and rank compare charts.
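One way such additional rules might be encoded is sketched below in Python. The two checks follow the stacked bar and status chart examples given above; the `View` fields, the `CHECKS` registry, and the `register` helper are hypothetical names chosen for illustration, and counting measures across the x-axis and y-axis is an assumption about how "multiple measures" would be evaluated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class View:
    """Hypothetical view: names of the items mapped to each axis."""
    x_measures: List[str] = field(default_factory=list)
    y_measures: List[str] = field(default_factory=list)
    z_items: List[str] = field(default_factory=list)

    def measure_count(self) -> int:
        # Assumption: measures on the x-axis and y-axis are counted together.
        return len(self.x_measures) + len(self.y_measures)


def can_render_stacked_bar(view: View) -> bool:
    """A stacked bar chart requires multiple measures."""
    return view.measure_count() > 1


def can_render_status(view: View) -> bool:
    """A status chart requires exactly one measure and nothing on the z-axis."""
    return view.measure_count() == 1 and len(view.z_items) == 0


# Hypothetical registry that an application could extend at load time.
CHECKS: Dict[str, Callable[[View], bool]] = {}


def register(name: str, check: Callable[[View], bool]) -> None:
    """Add a new visualization check without changing the invoking loop."""
    CHECKS[name] = check


register("stacked bar", can_render_stacked_bar)
register("status", can_render_status)

if __name__ == "__main__":
    view = View(y_measures=["Revenue"])
    print([name for name, check in CHECKS.items() if check(view)])  # prints ['status']
```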
An embodiment of the present invention relates to a computer storage product with a computer-readable medium having computer code thereon for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable media include, but are not limited to: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs, DVDs and holographic devices; magneto-optical media; and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits (“ASICs”), programmable logic devices (“PLDs”) and ROM and RAM devices. Examples of computer code include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter. For example, an embodiment of the invention may be implemented using Java, C++, or other object-oriented programming language and development tools. Another embodiment of the invention may be implemented in hardwired circuitry in place of, or in combination with, machine-executable software instructions.
The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that specific details are not required in order to practice the invention. Thus, the foregoing descriptions of specific embodiments of the invention are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed; obviously, many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications; they thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the following claims and their equivalents define the scope of the invention.
This application is related to the following pending, commonly owned U.S. patent application entitled “Apparatus And Method For Visualizing Data”, Ser. No. 11/478,836, Attorney Docket No. BOBJ 102/00US, filed Jun. 30, 2006, which is incorporated herein by reference.