The present disclosure generally relates to computer-implemented systems and methods for generating visual data presentations, data analysis, and data management in response to user requests.
Graphs are used to display data and assist in data analysis in many fields, for example, statistical analysis of data sets. Many techniques for generating graphs in connection with statistical analysis require the user to manually create graphs for the data set. In addition, the user may be required to determine what graph types are desirable as well as what particular variables of the data set to utilize.
In accordance with the teachings provided herein, systems and methods for automated generation of graphs related to a data set are provided.
The disclosure provides a graphing computer system comprising a processor and a non-transitory computer-readable storage medium that includes instructions that are configured to be executed by the processor such that, when executed, the instructions cause the graphing computer system to perform operations including receiving a request at the graphing computer system to generate a geometric plot having at least two axes, wherein the request specifies a dataset from which the system will generate at least one shape for the geometric plot, and wherein the request specifies data relative to a plurality of discrete, categorical index values related to at least one axis of the plot, and wherein the instructions further cause the system to perform additional operations of processing the received request to determine a mapping of shape-defining vertices of the at least one shape to respective locations comprising a sum of a discrete, categorical index value of the plot axis and an offset value, generating a set of data in accordance with the received request such that the generated set of data specifies a plot location on the geometric plot for each of the shape-defining vertices of the at least one shape, and providing the generated set of data to a graphing engine configured to render the set of data and generate the geometric plot on an electronic display.
The disclosure further provides a graphing computer system, wherein the dataset comprises a table of numeric and alphanumeric data.
The disclosure further provides a graphing computer system, wherein zero or more of the specified vertex locations on the categorical axis fall in between the categorical tick marks on the categorical axis.
The disclosure further provides a graphing computer system, wherein the categorical index values comprise values that may be non-numerical.
The disclosure further provides a graphing computer system, wherein the dataset of the request specifies a plurality of polygons for the geometric plot.
The disclosure further provides a graphing computer system, wherein the dataset of the request may specify an angle of rotation for at least one polygon of the geometric plot.
The disclosure further provides a graphing computer system, wherein the dataset of the request can specify an angle of rotation for each polygon of the geometric plot that is independent of the remaining polygons of the geometric plot.
The disclosure further provides a graphing computer system, wherein the set of data provided or a statement option specifies fill information of the shape in the geometric plot that is illustrated on the display of the graphing computer system after rendering. That is, with the disclosed system, the user specifies a set of data and statement options to generate the shape and fill information in the geometric plot that is illustrated on the display of the graphing computer system after rendering.
The disclosure further provides a graphing computer system, wherein the shape of the generated geometric plot comprises a closed polygon.
The disclosure further provides a graphing computer system, wherein the received request specifies an alphanumeric label associated with the shape.
The disclosure further provides a graphing computer system, wherein the received request includes a dataset that defines a shape comprising a plurality of polygons that are illustrated on the display of the graphing computer system after rendering.
The disclosure further provides a graphing computer system, wherein the geometric plot includes a first axis and a second axis that is perpendicular to the first axis, and wherein the first axis and second axis may have numeric or categorical index values.
The disclosure further provides a graphing computer system, wherein the set of data may be generated in response to a plurality of requests to generate a geometric plot, and wherein the graphing engine is further configured such that the generated geometric plot includes all the shapes specified in the plurality of requests. In this way, the system can generate a plot having an overlay of two or more plots or charts in one illustration. As described herein, a “graph” is an illustration that is made up of one or more plots or charts. That is, a “geometric plot” comprises a special type of graph generated with polygon shapes or with overlaid multiple polygon shapes. As described further below, the polygon shapes may be generated in response to a “polygonplot” program statement.
The disclosure further provides a graphing computer system, wherein at least one of the shapes specified in the plurality of requests results in a heat map.
The disclosure further provides a graphing computer system, wherein the heat map illustrates a plurality of data values that provide a visualization of data that comprises response data, and the shapes of the heat map comprise a plurality of polygon shapes, such that at least two of the polygon shapes represent different data values having different magnitudes, and the different data values are represented by different corresponding colors.
The disclosure further provides a graphing computer system, wherein the request may specify an angle of rotation for at least one of the polygon shapes, and the angle of rotation may indicate the magnitude of the visualized data.
The disclosure further provides a graphing computer system, wherein at least one axis of the geometric plot has discrete, non-numerical categorical index values.
The disclosure further provides a graphing computer system, wherein the shapes specified in the plurality of requests together comprise a spiral heat map.
The disclosure further provides a graphing computer system, wherein a data set with two or more categorical and numeric columns is provided to the system with a request to create a bar chart with a special option to generate a hygrometer plot. One categorical data column is assigned to the category axis of the bar chart, another is assigned to the group role of the bar chart and one to the response role of the bar chart. The special “Group100” option normalizes all the bar magnitudes to 100%, plotting the positive values above the axis line and negative values below the axis line.
The disclosure further provides a graphing computer system, wherein at least one of the shapes specified in the plurality of requests comprises a pie chart.
The disclosure further provides a graphing computer system, wherein the pie chart provides a visualization of data, wherein the request specifies a dataset and the pie chart comprises a plurality of pie chart segments, such that the data processing apparatus calculates display information to define a pie chart start angle that is perpendicular to one of the geometric plot axes and to produce an arrangement of the pie chart segments such that the first segment is symmetric about the start angle.
The disclosure further provides a graphing computer system, wherein the pie chart segments of the pie chart displayed on the display of the data processing apparatus comprise a plurality of segments that are located in a concentric arrangement such that the first slice in each ring is symmetric about the start angle.
The disclosure further provides a method of operating a graphing computer system, the method comprising:
The disclosure also provides a corresponding computer-program product, tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to be executed to cause a graphing computer system to perform a method that provides the features noted above. The method performed by the computer-program product may include receiving a request at the graphing computer system to generate a geometric plot having at least two axes, wherein the request specifies a dataset from which the system will generate at least one shape for the geometric plot, and wherein the request specifies data relative to a plurality of discrete, categorical index values related to at least one axis of the plot; processing the received request to determine a mapping of shape-defining vertices of the at least one shape to respective locations comprising a sum of a discrete, categorical index value of the plot axis and an offset value; generating a set of data in accordance with the received request such that the generated set of data specifies a plot location on the geometric plot for each of the shape-defining vertices of the at least one shape; and providing the generated set of data to a graphing engine configured to render the set of data and generate the geometric plot on an electronic display.
In accordance with the teachings provided herein, systems and methods for graph processing generate a geometric plot having at least two axes, wherein a dataset from which the plot will be generated specifies at least one shape for the geometric plot and wherein the plot includes at least one column of data having a plurality of discrete, categorical index values. A column of offset values is specified that determines a mapping of one or more shape-defining vertices of the at least one shape to a location that is a fractional distance between two of the discrete, categorical index values, such that a generated set of data specifies a pixel location for each of the shape-defining vertices of the at least one shape.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
Like reference numbers and designations in the various drawings indicate like elements.
This application discloses a graphing computer system and associated techniques for data visualization, including a plot statement, such as the “POLYGONPLOT” statement disclosed herein, and extensions of bar chart and pie chart representations. The system processes a request for a geometric plot by generating a graph having at least two axes, where the request relates to a dataset from which the graph will be generated and specifies at least one geometric shape for the geometric plot. The generated plot includes at least one axis having multiple discrete, categorical index values. One column of offset values is specified that determines a mapping of one or more shape-defining vertices of the at least one shape to a location that is a fractional distance between two of the discrete, categorical index values, such that a generated set of data specifies a pixel location for each of the shape-defining vertices of the at least one shape. More particularly, the data set contains observations of data values, and the computer graphing system maps the data values to geometric shapes and draws the shapes relative to a graph of the data values. The data-to-shape mapping may be relatively direct, such as in the case of the plot statement, illustrated as the “PolygonPlot” statement discussed further below. In other situations, as in the PieChart and BarChart discussed further below, the data-to-shape mapping may not be so direct, and a summarization operation may be performed before the data-to-shape mapping is generated.
In one embodiment, the data visualization techniques utilized in the system present a chart comprising a heat map with an additional indicator to help decode the data when color mapping alone is not sufficient, whether because the color changes are too fine to distinguish, because the graph is presented in a gray-scale medium, or because the person viewing the chart is color blind.
The data visualization techniques may be provided in a computer graphing system in which processes to produce the display features are invoked using a graphing request to generate a geometric plot having at least two axes. For example, in a system that supports graphing techniques in accordance with applications software from SAS Institute Inc. (“SAS”) of Cary, N.C., USA, the display features may be provided through a “polygonplot” statement, in which a geometric plot is generated having at least two axes and at least one shape. For example, the geometric plot may be produced by generating a set of data that specifies a pixel location for vertices of the at least one shape and by rendering the generated set of data.
In one example embodiment, the computer-implemented environment 100 may include a stand-alone computer architecture where a processing system 110 (e.g., one or more computer processors) executes the computer system 104. The processing system 110 has access to a computer-readable memory 112. In another example embodiment, the computer-implemented environment 100 may include a client-server architecture and/or a grid computing architecture. Users 102 may utilize a personal computer (PC) or the like to access servers 106 running a computer system 104 on a processing system 110 via the networks 108. The servers 106 may access a computer-readable memory 112.
A disk controller 210 can interface one or more optional disk drives to the bus 202. These disk drives may be external or internal floppy disk drives such as storage drive 212, external or internal CD-ROM, CD-R, CD-RW, or DVD drives 214, or external or internal hard drive 216. As indicated previously, these various disk drives and disk controllers are optional devices.
A display interface 218 may permit information from the bus 202 to be displayed on a display 220 in audio, graphic, or alphanumeric format. Communication with external devices may optionally occur using various communication ports 222. In addition to the standard computer-type components, the hardware may also include data input devices, such as a keyboard 224, or other input/output devices 226, such as a microphone, remote control, touchpad, keypad, stylus, motion, or gesture sensor, location sensor, still or video camera, pointer, mouse or joystick, which can obtain information from bus 202 via interface 228.
As described herein, a computer graphing system configured in accordance with this disclosure can process geometric data values that specify geometric shapes for a graph and are positioned in the graph with respect to numeric or categorical data values for each vertex of the plot. When the values are categorical, they are further adjusted in position by offset data values relative to the nearest categorical value on the axis. The categorical values are character values positioned along the horizontal axis or vertical axis of the graph. The offset data values are numeric values between −0.5 and +0.5 relative to adjacent categorical values along the axis, and are used to specify positions that are located between the categorical tick values (index marks) on the axis. For example, an offset value of +0.5 (i.e., ½) specifies a position that is half-way between a selected category tick value (i.e., an index mark) and the next (adjacent) tick value. The offset data values are part of the data provided to the plot statement.
When an axis is discrete, an offset value can be used to provide the location of a vertex of the polygon shape depicted in a graph. The vertex data value is generally specified as a category name plus an offset value, such as “A+0.5” or “A−0.5”, where the +/−0.5 is understood to refer to the proportional location between the specified category value, “A”, and the next adjacent category value in the specified positive or negative direction (right or left, up or down, respectively). Thus, the category values plus the offset data values define vertices and otherwise specify the position of the geometric shape that is located between the category name values (index marks) on the display screen. If desired, additional parameters may be included in the data request, such as rotation angles of the shapes. With such additional parameters, it is possible to generate a graph that includes a geometric plot based on a data set that can contain numeric values, discrete categorical values, offset adjustment, and per-polygon screen rotation angles.
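By way of non-limiting illustration, the category-plus-offset addressing described above can be sketched in Python as follows. The sketch assumes that the categories occupy integer tick positions 0, 1, 2, and so on in the order listed; the function name `axis_position` and the sample data are hypothetical and are not part of the disclosed program code.

```python
def axis_position(categories, category, offset=0.0):
    """Return the axis position of a vertex given as category name plus offset.

    Assumption: categories sit at integer tick positions 0, 1, 2, ... in the
    order in which they appear in `categories`.
    """
    if not -0.5 <= offset <= 0.5:
        raise ValueError("offset must lie between -0.5 and +0.5")
    # The position is the sum of the categorical index value and the offset.
    return categories.index(category) + offset


# Hypothetical discrete axis with three categories.
categories = ["A", "B", "C"]

# A triangle whose X coordinates are "A-0.5", "A+0.5", and "B".
triangle_x = [("A", -0.5), ("A", 0.5), ("B", 0.0)]
positions = [axis_position(categories, name, off) for name, off in triangle_x]
print(positions)  # [-0.5, 0.5, 1.0]
```

In this sketch, “A+0.5” resolves to the position half-way between the “A” tick and the “B” tick, consistent with the description above.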
In accordance with the description herein, a single request for a geometric plot specifies a data set having data values from which a graph will be plotted relative to a horizontal axis and a vertical axis. The request specifies at least one polygonal shape defined by a set of vertices, whose locations can be specified by numeric or categorical columns in the data. If the column of data is categorical, an additional numeric column can be provided for offsets. A rotation angle may optionally be included, for a specified shape that is capable of rotation, to specify a rotation of the shape relative to a default orientation. In this way, a single request can be processed to produce a desired geometric plot relative to numeric or category data, without the need for tedious specification of exact pixel values and explicit graphing instructions to produce the desired geometric plot.
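As a non-limiting sketch of how an optional rotation angle might be applied to a polygon, the following Python example rotates a set of vertices about a pivot point. The choice of pivot (the mean of the vertices), the counter-clockwise direction for positive angles, and the function name `rotate_polygon` are assumptions made for illustration only; the disclosure does not prescribe them.

```python
import math


def rotate_polygon(vertices, angle_degrees):
    """Rotate (x, y) vertices about the mean of the vertices.

    Assumptions: the pivot is the vertex mean, and positive angles rotate
    counter-clockwise in a coordinate system whose Y values increase upward.
    """
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in vertices:
        dx, dy = x - cx, y - cy
        # Standard two-dimensional rotation of the offset from the pivot.
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


# Hypothetical square rotated by a quarter turn about its center (1, 1).
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
turned = rotate_polygon(square, 90)
```

Because each polygon carries its own angle, a sketch of this kind can be applied independently to every polygon in the data set.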
In the next operation, indicated by the flowchart box 312, the system generates a set of data in accordance with the received request, such that the generated set of data specifies, for each of the shape-defining vertices of the at least one shape, a pixel location that is directly renderable by the computer graphing system.
In the next operation, indicated by the flowchart box 316, the system provides the generated set of data to a graphing engine configured to render the set of data and generate the geometric plot on a display of the graphing computer system. The geometric plot includes a first axis and a second axis that is perpendicular to the first axis, and at least one of the first axis and the second axis has discrete, categorical index values. For example, the categorical index values may be values that are non-numerical, such as category names.
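One possible way to derive a pixel location from an axis position is a linear interpolation across the plot area, sketched below in Python. The plot-area bounds, the half-tick padding at each end of the discrete axis, the rounding to whole pixels, and the function name `to_pixel` are all assumptions for illustration and are not taken from the disclosed system.

```python
def to_pixel(position, axis_min, axis_max, pixel_start, pixel_end):
    """Linearly map an axis position to a pixel coordinate.

    `pixel_start` corresponds to `axis_min` and `pixel_end` to `axis_max`;
    passing pixel_start > pixel_end handles a screen Y axis that grows downward.
    """
    fraction = (position - axis_min) / (axis_max - axis_min)
    return round(pixel_start + fraction * (pixel_end - pixel_start))


# Hypothetical discrete X axis: three categories at positions 0, 1, 2, with
# half a tick of padding on each side, drawn between pixels 100 and 400.
x_pixel = to_pixel(0.5, -0.5, 2.5, 100, 400)   # "A+0.5" lands at pixel 200

# Hypothetical numeric Y axis from 0 to 10, drawn from pixel 300 up to pixel 50.
y_pixel = to_pixel(5, 0, 10, 300, 50)          # midpoint lands at pixel 175
```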
The operation of the system to produce a graph display comprising a geometric plot may be summarized by the operations specified by the pseudocode listed in TABLE 1 below:
Program code for the TABLE 1 operations may be specified at a user computer terminal of the system via a command line interface or a graphical user interface. Examples of command line code for illustrated graph displays are provided below in accordance with applications software and graphing systems from SAS Institute Inc. as mentioned above. Those skilled in the art will understand how to implement such code in graphing systems and will understand how to specify and implement corresponding code that would be used in similar graphing systems.
In the next operation, represented by the decision box 412, the system initiates processing according to whether the request relates to a polygon plot such as a heat map or spiral heat map, or relates to a bar chart, or relates to a pie chart. For each particular type of requested graph, the system generates data according to the requested type of graph, as represented by the box 416. System operation continues with additional processing, as indicated by the continuation box 420.
In the computer graphing system of
In the program code of TABLE 2, the polygonplot statement and its options are designated by the large bracket at the left margin. For the polygonplot statement, the ID, X, and Y parameters are required, and data columns must be provided for these parameters so that the polygons can be processed. The XOFFSET and YOFFSET parameters are optional and are provided after the “/” character. Other options, such as ROTATE, can also be provided, as described further below. The polygonplot statement continues over four lines as indicated above, and the “;” character appears only at the end. Optional parameters are marked with <>.
In general, the data specification is in the form of a data=dataset instruction, in which the dataset instruction provides the name of a dataset that is accessible to the computer system, such that the system can retrieve the dataset from storage of the computer system, and can obtain data for a geometric plot having at least two axes and at least one shape and operate on the retrieved dataset to produce a graph. In the general format of the non-limiting program code example of TABLE 2 above, “proc sgrender” is the name given to the procedure of the program code that comprises the request for a plot as defined in the associated template, in this case a geometric plot, and initiates the operations that produce the geometric plot. The “proc sgrender” statement above comprises a request to the computer graphing system for a geometric plot, which the computer graphing system is configured to carry out. The “data-set-name” is the name of the dataset that contains the graphing specifications that are desired for the graph of the geometric plot; “polygon” includes an identifier “id” that is arbitrary, and includes data columns for x and y values. The parameter “xoffset” is an X-axis offset value and “yoffset” is a Y-axis offset value that may be provided in case the column assigned to x or y is categorical. If the X-axis index marking is categorical (i.e., discrete), then an X-axis offset value can be specified. If the Y-axis index marking is categorical (i.e., discrete), then a Y-axis offset value can be specified. Thus, the “xoffset” value (or “yoffset” value) can be a constant value (not part of the dataset) or can be specified as a column in the data set. The “other options” indicate that various plotting options can be provided. Such plotting options may include, for example, label locations, label positions, and the like. Such plotting options are illustrated in
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as
In command line format, for example, the operations of TABLE 1 and TABLE 2 may be initiated with a non-limiting example of command line GTL code such as listed in TABLE 3 below to produce the graph plot of
In the TABLE 3 program code above, it can be seen that data for two polygon shapes comprising triangles are provided, in the “data poly” section of code. Thus, each triangle includes three vertices. One
The arrowhead 604 is a polygon having seven vertices. As described further below, the data specification of the program code may be used to define the arrowhead shape 604 as an object having vertices 608, 612, 616, 620, 624, 628, 632. The received request specifies the graph location for each of the vertices of each arrowhead to be plotted in a graph.
Thus, the arrowhead 604 may be viewed as a single polygon shape made up of seven vertices: 608, 620, 624, 628, 632, 616, and 612. It should be apparent that a Y-coordinate value is also needed to specify each vertex. That is, if the “A” values are understood to be specified along a horizontal (X-axis) line, in the left-right dimension, then vertical (Y-axis) line values also must be specified, to indicate distance from the X-axis line, in the up-down dimension of the illustration. If the Y-axis has discrete, categorical values, then offsets can be specified, as with the X-axis in
In
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as
In the TABLE 4 code above, after an initial section in which display and data parameters are provided, it can be seen that a “data poly” section specifies data for the graph display, comprising four polygon shapes. The shapes are positioned in the
Heat Map
In the use case of
The heat map of
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as
In the TABLE 5 code above, after an initial section in which display and data parameters are provided, it can be seen that a “data Authors” section and a “data Grid” section specify data for shapes and plot grid, respectively, of the graph display. Another section of the TABLE 5 code comprises a specification of the graph layout, in the “proc template” sections of code. The first “proc template” section of code corresponds to the color version as in
For a display that is incapable of showing color, or for viewers who cannot discern such colors, the information that is otherwise provided by a color display is lost. In both cases, it may be more difficult to discriminate between the different color shades in the cells representing the popularity value. This same graph can be represented with an additional visual cue to help understand the data, such as the visual cue shown in
In a color display, as illustrated in
Ordinarily, a user would have to utilize a different color coding technique to address different audiences. Using the technique described herein allows the authors of the graph to create only one graph that can be consumed by a wider audience, independent of color capability.
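To illustrate how one graph can carry both a color cue and a color-independent cue, the following Python sketch encodes a single data value as a fill color and as a rotation angle for an indicator shape within a heat map cell. The linear scaling, the white-to-red color ramp, the 0 to 90 degree angle range, and the function name `encode_value` are hypothetical choices made for this sketch and are not specified by the disclosure.

```python
def encode_value(value, vmin, vmax, max_angle=90.0):
    """Encode a data value as both a fill color and an indicator angle.

    Assumptions: linear scaling between vmin and vmax, a white-to-red ramp
    for color, and an indicator that turns from 0 to `max_angle` degrees.
    """
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))          # clamp out-of-range values
    shade = round(255 * (1.0 - t))     # green/blue fade as the value grows
    color = (255, shade, shade)        # (R, G, B)
    angle = t * max_angle              # readable without color
    return color, angle


# Hypothetical popularity values on a 0-100 scale.
low = encode_value(0, 0, 100)      # ((255, 255, 255), 0.0)
quarter = encode_value(25, 0, 100) # ((255, 191, 191), 22.5)
high = encode_value(100, 0, 100)   # ((255, 0, 0), 90.0)
```

With an encoding of this kind, a viewer of a gray-scale rendering can still rank the cells by the orientation of the indicator.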
As noted above, a new POLYGONPLOT statement facilitates the relatively convenient creation of such heat map graphs. For example, in the “Graph Template Language” (GTL) utilized by SAS Institute Inc., a suitable “POLYGONPLOT” statement allows the creation of polygons in the numeric and discrete space.
Specifying points on a polygon typically can be achieved in the numeric space. However, in the graph of
Spiral Heat Map
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as
In the TABLE 6 code above, after an initial section in which display and data parameters are provided, it can be seen that a “data temp” section specifies data for plotting, and a “data spiral_Curve” section and “data spiral_Data” section specify data for constructing the display shapes of the graph display. A data section called “data gridlines” specifies the arrangement of the radial month grid lines. A data section called “data spiral_combined” initiates merging shapes and spiral arrangement of the display. The last section of TABLE 6 code initiates the rendering of the graph display, in the “proc sgplot” section of code. Those skilled in the art will appreciate that some of the lines of code in TABLE 6, such as format specifications, process calls, and calls to an output delivery system (ods) and the like, are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
Centered Pie Chart
Another useful data visualization technique possible with the description herein is the “Centered Pie Chart”. A pie chart is a very useful visual when making “part to whole” comparisons, and especially useful with two slices. This is illustrated in
The pie chart 910 illustrated on the left of
Thus, the pie chart 910 on the left of
As before, the pie chart 1010 on the left has two pie chart portions 1018, 1022 that represent relative data values, but the two portions are not especially easy or aesthetically pleasing to view. The pie chart 1014 on the right has two pie chart portions 1026, 1030 that represent relative data values, and the two portions are better arranged and easier to view and comprehend, in part because they are arranged symmetrically relative to the Y-axis.
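The centered arrangement can be sketched in Python as follows, by way of non-limiting example. The sketch assumes angles measured in degrees counter-clockwise from the positive X-axis, a start angle of 90 degrees (perpendicular to the X-axis), and slices laid out counter-clockwise; the function name `centered_slices` and the sample values are hypothetical.

```python
def centered_slices(values, start_angle=90.0):
    """Return (begin, end) angles in degrees for each pie slice.

    The first slice is placed so that it is symmetric about `start_angle`;
    the remaining slices follow in order (assumed counter-clockwise).
    """
    total = sum(values)
    sweeps = [360.0 * v / total for v in values]
    # Back up by half of the first sweep so the first slice straddles the start angle.
    begin = start_angle - sweeps[0] / 2.0
    slices = []
    for sweep in sweeps:
        slices.append((begin, begin + sweep))
        begin += sweep
    return slices


# Hypothetical two-slice "part to whole" comparison: 25% versus 75%.
slices = centered_slices([25, 75])
print(slices)  # [(45.0, 135.0), (135.0, 405.0)]
```

In the two-slice case, the first slice is centered on 90 degrees and the second is centered on 270 degrees, so both portions are symmetric about the vertical axis.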
In both
The displays of
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as the pie chart graph of
In the TABLE 8 code above, after an initial section in which display and data parameters are provided, it can be seen that data for the graph display is provided in a “GTL_Sedans” data section. Another section of the TABLE 8 code comprises a specification of the graph layout, in the “proc template” section of code. The last section of TABLE 8 code initiates the rendering of the graph displays, in the “proc sgrender” sections of code. Those skilled in the art will appreciate that some of the lines of code in TABLE 8, such as format specifications, process calls, and calls to an output delivery system (ods) and the like, are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
Hygrometer Plot
The G100 feature provides what is referred to as a “percent of frequency” graph, where the sum of all category values in a group is normalized and displayed as a stack such that the category values add up to 100%. Only frequency or percent data can be displayed in this type of graph, and normally only positive frequency values are supported.
In the hygrometer plot display 1200, response values of all groups in a category are normalized and then stacked as one bar so the total height of each bar is 100%. This type of plot supports both positive and negative data. Positive values are stacked above a zero line of the plot, and negative values are stacked below the zero line. The individual combination of positive and negative response values makes each bar “float” at different heights of the graph. Hence, the name “Hygrometer Plot” is used for this type of plot.
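A minimal Python sketch of the normalization and stacking for one category follows. It assumes that each response is normalized by the sum of the absolute values of all responses in the category, so that the total bar height is 100%; the function name `hygrometer_segments`, the group names, and the sample values are hypothetical and are not taken from the disclosed program code.

```python
def hygrometer_segments(responses):
    """Return {group: (lower, upper)} percent extents for one bar.

    Assumption: values are normalized by the sum of absolute values, positive
    groups stack upward from the zero line, negative groups stack downward.
    """
    total = sum(abs(v) for v in responses.values())
    if total == 0:
        raise ValueError("at least one nonzero response is required")
    segments = {}
    top, bottom = 0.0, 0.0
    for group, value in responses.items():
        pct = 100.0 * abs(value) / total
        if value >= 0:
            segments[group] = (top, top + pct)
            top += pct
        else:
            segments[group] = (bottom - pct, bottom)
            bottom -= pct
    return segments


# Hypothetical responses for one day: two incoming groups, one outgoing group.
day = {"Web": 30, "Mail": 20, "Backup": -50}
bar = hygrometer_segments(day)
print(bar)  # {'Web': (0.0, 30.0), 'Mail': (30.0, 50.0), 'Backup': (-50.0, 0.0)}
```

Here the bar extends from −50% to +50%, for a total height of 100%, and a different mix of positive and negative responses would shift the bar up or down relative to the zero line.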
The vertical position of the bar (whether above or below the zero line) indicates the nature of the volume for that day, whether incoming or outgoing. The proportion of each segment within the bar shows which products have higher or lower contributions to the volume. The label in each segment shows the % amount. The legend below the plot identifies the type of traffic. For example, in
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as the hygrometer graph of
In the TABLE 9 code above, overlay techniques are used to generate the desired appearance for the graph. The barchart statement specifies category-variable names and corresponding responses or response-variable values. The “group100” statement is used for generating the desired graph. Those skilled in the art will appreciate that some of the lines of code in TABLE 9 are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
The computer operations for producing a graph like the hygrometer plots of
The
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as
In the TABLE 11 code above, after an initial section in which display and data parameters are provided, it can be seen that data for the graph display is provided in several data sections, noted as a “data G100”, a “data poly”, and a “data Merged” section. Another section of the TABLE 11 code comprises a specification of the graph layout, in the “proc template” section of code. The last section of TABLE 11 code initiates the rendering of the graph display, in the “proc sgrender” section of code. Those skilled in the art will appreciate that some of the lines of code in TABLE 11, such as format specifications, process calls, and calls to an output delivery system (ods) and the like, are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as the bar chart of
In the TABLE 12 code above, after an initial section in which display and data parameters are provided, it can be seen that data for the graph display is provided in several data sections, noted as a “data G100”, a “data G100_Net”, a “data rect”, and a “data Merged2” section. Another section of the TABLE 12 code comprises a specification of the graph layout, in the “proc template” section of code. The last section of TABLE 12 code initiates the rendering of the graph display, in the “proc sgrender” section of code. Those skilled in the art will appreciate that some of the lines of code in TABLE 12, such as format specifications, process calls, and calls to an output delivery system (ods) and the like, are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
Additional examples of output from the polygonplot statement discussed above are shown in
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as the bar chart of
The displays of
In the TABLE 13 code above, after an initial section in which display and data parameters are provided, it can be seen that data for the graph display is provided in a “data class” data section. Another section of the TABLE 13 code comprises a specification of the graph layout, in the “proc template” section of code. The last section of TABLE 13 code initiates the rendering of the graph displays, in the “proc sgrender” sections of code. Those skilled in the art will appreciate that some of the lines of code in TABLE 13, such as format specifications, process calls, and calls to an output delivery system (ods) and the like, are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
As noted above, the GTL of SAS may be used to provide program code that, when executed, will specify graph displays, such as
In the TABLE 14 code above, after an initial section in which display and data parameters are provided, it can be seen that data for the shapes of the US states in the graph display are provided in the “data usa” data section. The data may be obtained from data storage of the computer system. This type of data retrieval illustrates the flexibility of the configuration provided by the POLYGONPLOT technique disclosed in this document. The stored shape data can be utilized in multiple graphing routines, and in that way comprises a library of shapes that are suitable for reuse. Additional data sections in TABLE 14 are noted as a “data Fipsusa”, a “data usa1”, a “data usa1a”, a “data usa2”, a “data null”, and a “data usa3” section. Another section of the TABLE 14 code comprises a specification of the graph layout, in the “proc template” section of code. The last section of TABLE 14 code initiates the rendering of the graph display, in the “proc sgrender” section of code. Those skilled in the art will appreciate that some of the lines of code in TABLE 14, such as format specifications, process calls, and calls to an output delivery system (ods) and the like, are unique to the operating environment of the SAS system referred to above. Those skilled in the art will understand corresponding code that would be utilized in other graphing systems.
Potential Uses for Aspects of the Disclosure.
Most “plot” statements in computer graphing systems create a specific representation of data so that all the data is represented and plotted in a regular, uniform way. As described herein, a computer graphing system according to this disclosure is able to process (x, y) data and generate data for rendering in a way such that each observation (i.e., each (x, y) pair of data) can be plotted in a specific way. This may be referred to as a scatterplot technique. Similarly, the computer graphing system may also use the same (x, y) data, but plot it as a connected line. In this way, each data response value is related to at least one prior data value. The same processing may be applied to all the plot statements described herein.
Each plot statement as described herein is also processed so that its data information is communicated to the axes via the indexes, so that the appropriate amount of space for each plot can be allocated on the axes. The plots may also work with other graphing system objects, such as legends and attribute maps. The plots can be interleaved, as noted above. Other than the pie chart, each plot discussed is attached to, or associated with, a horizontal axis and a vertical axis. Each axis requests the data range of every plot that is attached to that axis, merges those data ranges, and creates a common (combined or unified) data range. The axis uses this common range to derive its data-to-screen mapping.
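By way of a non-limiting illustration, the range merging and data-to-screen mapping described above may be sketched as follows. The sketch is written in Python rather than GTL, and the names `Plot`, `Axis`, `common_range`, and `to_screen` are hypothetical names chosen for this example only; a linear mapping is assumed, and the sketch is not the implementation used by any particular graphing system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Plot:
    """Hypothetical plot that reports the data range it needs on an axis."""
    name: str
    values: List[float]

    def data_range(self) -> Tuple[float, float]:
        return min(self.values), max(self.values)


class Axis:
    """Hypothetical axis that merges the ranges of every attached plot."""

    def __init__(self, screen_min: float, screen_max: float) -> None:
        self.screen_min = screen_min
        self.screen_max = screen_max
        self.plots: List[Plot] = []

    def attach(self, plot: Plot) -> None:
        self.plots.append(plot)

    def common_range(self) -> Tuple[float, float]:
        # Ask each attached plot for its range, then take the union.
        ranges = [p.data_range() for p in self.plots]
        return min(lo for lo, _ in ranges), max(hi for _, hi in ranges)

    def to_screen(self, value: float) -> float:
        # Assumed linear data-to-screen mapping over the common range.
        lo, hi = self.common_range()
        if hi == lo:
            return (self.screen_min + self.screen_max) / 2.0
        fraction = (value - lo) / (hi - lo)
        return self.screen_min + fraction * (self.screen_max - self.screen_min)


axis = Axis(screen_min=0.0, screen_max=400.0)
axis.attach(Plot("bars", [10.0, 40.0, 25.0]))
axis.attach(Plot("polygons", [-10.0, 30.0]))
```

In this sketch, both plots share the single range computed by `common_range`, so a data value is drawn at the same screen position regardless of which attached plot supplies it.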
On the other hand, the system may include an annotation facility that allows users to draw atomic-level graphical elements on top of the graph. These elements can be polygons, but such polygons do not communicate with the axes or the legends, and they do not work with the attribute maps. An annotate operation cannot be interleaved with the plot statements. With commonly known annotation tools, such as the one offered by SAS/GTL, the annotation is drawn either on top of every plot or underneath every plot.
Thus, the geometric plot described herein is a “hybrid” type of plot that provides a feature set between the typical plot statements and the annotate statements of most computer graphing systems. The geometric plot does not necessarily create a specific representation of a graph. Instead, it renders a polygon as described by the user. In this respect, it is similar to annotation. However, as a plot statement, it can interleave with other plot statements, interact with the axes, and work with other GTL components such as legends and attribute maps.
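As summarized earlier in this disclosure, each shape-defining vertex of a geometric plot may be located at the sum of a discrete, categorical index value and an offset value. A minimal Python sketch of that mapping follows. The function names `index_categories` and `place_vertices`, the category labels, and the assignment of indexes in order of first appearance are assumptions made for illustration only and are not part of GTL.

```python
from typing import Dict, List, Tuple

# A vertex is given as (category, offset along the categorical axis, y value).
Vertex = Tuple[str, float, float]


def index_categories(categories: List[str]) -> Dict[str, int]:
    """Assign each discrete category an integer index in order of first appearance."""
    index: Dict[str, int] = {}
    for category in categories:
        if category not in index:
            index[category] = len(index)
    return index


def place_vertices(vertices: List[Vertex], index: Dict[str, int]) -> List[Tuple[float, float]]:
    """Locate each vertex at (categorical index + offset, y)."""
    return [(index[category] + offset, y) for category, offset, y in vertices]


index = index_categories(["Mon", "Tue", "Wed", "Tue"])

# A rectangle centered on the "Tue" category, half a category wide.
rect: List[Vertex] = [
    ("Tue", -0.25, 0.0),
    ("Tue", 0.25, 0.0),
    ("Tue", 0.25, 5.0),
    ("Tue", -0.25, 5.0),
]
placed = place_vertices(rect, index)
```

Because the resulting locations are ordinary axis coordinates, a shape defined in this way can share an axis with other plots rather than being drawn as a free-floating annotation.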
The Centered Pie chart described above helps to deliver part-to-whole information in a way that is easier for people to consume. Centering the pie on one of the four cardinal directions creates a simple, symmetric graph that is easy to understand.
The Hygrometer graph described above is a unique graph that is useful for visualizing data that can be represented as “opposites” but can still be aggregated, for example, “Sell”-initiated or “Buy”-initiated stock volume, or “Arrival” versus “Departure” data. These can be seen individually as a proportion of the daily “volume”, but their magnitudes can also be aggregated to contribute to the daily total (or the total over some other interval). The individual bar of this graph “floats” at a level determined by the ratio of the positive to negative values, and can display the trend at a glance.
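One way to compute the segment extents of such a floating bar is sketched below in Python. The sketch assumes that each segment is sized as a percentage of the aggregated magnitude, that positive values stack upward from the zero line, and that negative values stack downward from it. The function name `hygrometer_segments` and the sample values are hypothetical, and the sketch is not the TABLE 9 code.

```python
from typing import Dict, Tuple


def hygrometer_segments(values: Dict[str, float]) -> Dict[str, Tuple[float, float]]:
    """Return (low, high) percentage extents for each segment of one bar.

    Assumption: the whole bar spans 100 percent of the summed magnitudes;
    positive segments stack above zero and negative segments stack below it.
    """
    total = sum(abs(v) for v in values.values())
    if total == 0:
        return {name: (0.0, 0.0) for name in values}

    up = 0.0    # running top of the positive stack
    down = 0.0  # running bottom of the negative stack
    extents: Dict[str, Tuple[float, float]] = {}
    for name, value in values.items():
        share = 100.0 * abs(value) / total
        if value >= 0:
            extents[name] = (up, up + share)
            up += share
        else:
            extents[name] = (down - share, down)
            down -= share
    return extents


day = {"Arrival A": 30.0, "Arrival B": 30.0, "Departure A": -20.0, "Departure B": -20.0}
segments = hygrometer_segments(day)
bar_low = min(low for low, _ in segments.values())
bar_high = max(high for _, high in segments.values())
```

With these sample values, 60 percent of the bar lies above the zero line and 40 percent lies below it, so the vertical position of the bar reflects the ratio of positive to negative values.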
Systems and methods according to some examples may include data transmissions conveyed via networks (e.g., local area network, wide area network, Internet, or combinations thereof, etc.), fiber optic medium, wireless networks, etc. for communication with one or more data processing devices. The data transmissions can carry any or all of the data disclosed herein that is provided to, or from, a device.
Additionally, the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
The system and method data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, removable memory, flat files, temporary memory, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures may describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network. The processes and logic flows and figures described and shown in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
Generally, a computer can also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data (e.g., magnetic disks, magneto-optical disks, or optical disks). However, a computer need not have such devices. Moreover, a computer can be embedded in another device (e.g., a mobile telephone, a personal digital assistant (PDA), a tablet, a mobile viewing device, a mobile audio player, or a Global Positioning System (GPS) receiver, to name just a few). Computer-readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media, and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices); magnetic disks (e.g., internal hard disks or removable disks); magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
The computer components, software modules, functions, data stores and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes, but is not limited to, a unit of code that performs a software operation, and can be implemented, for example, as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
The computer may include a programmable machine that performs high-speed processing of numbers, as well as of text, graphics, symbols, and sound. The computer can process, generate, or transform data. The computer includes a central processing unit that interprets and executes instructions; input devices, such as a keyboard, keypad, or a mouse, through which data and commands enter the computer; memory that enables the computer to store programs and data; and output devices, such as printers and display screens, that show the results after the computer has processed, generated, or transformed data.
Implementations of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products (i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, data processing apparatus). The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated, processed communication, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question (e.g., code that constitutes processor firmware, a protocol stack, a graphical system, a database management system, an operating system, or a combination of one or more of them).
While this disclosure may contain many specifics, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features specific to particular implementations. Certain features that are described in this specification in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be utilized. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software or hardware product or packaged into multiple software or hardware products.
Some systems may use Apache™ Hadoop®, an open-source software framework for storing and analyzing big data in a distributed computing environment. Some systems may use cloud computing, which can enable ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. Some grid systems may be implemented as a multi-node Hadoop® cluster, as understood by a person of skill in the art. Some systems may use the SAS® LASR™ Analytic Server in order to deliver statistical modeling and machine learning capabilities in a highly interactive programming environment, which may enable multiple users to concurrently manage data, transform variables, perform exploratory analysis, build and compare models, and score. Some systems may use SAS In-Memory Statistics for Hadoop® to read big data once and analyze it several times by persisting it in memory for the entire session.
It should be understood that as used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise. Finally, as used in the description herein and throughout the claims that follow, the meanings of “and” and “or” include both the conjunctive and disjunctive and may be used interchangeably unless the context expressly dictates otherwise; the phrase “exclusive or” may be used to indicate situations where only the disjunctive meaning may apply.
The present disclosure claims the benefit of priority under 35 U.S.C. §119(e) to U.S. Provisional Application No. 61/983522 filed Apr. 24, 2014 and titled “Techniques for Visualization of Data”, by inventors Sanjay Matange, et al., the entirety of which is incorporated herein by reference.
| Number | Date | Country |
|---|---|---|
| 61983522 | Apr 2014 | US |