This invention relates to a method of extracting data and recommending and generating visual displays of data on a computer system. In particular, this invention relates to executing a visualization tool accessible on a Web-based computing platform.
Computing tools include tools that perform simple or complex computations as well as tools that display data and computations in charts, graphs, and other visual forms for an end user. When reviewing a large data source, it can be intimidating, overwhelming, and seemingly impossible to distill the data into subsets and categories for further analysis. For example, the United States Patent and Trademark Office online database of issued patents contains a large volume of data, including classifications, priority dates, filing dates, issue dates, examiners, art units, cited prior art, figures, inventors, and many more categories. Distilling and organizing that data to find trends can be daunting. Moreover, the many ways one could visualize the results may not be readily discernible. A tool that can access the database, analyze the database for trends based on simple or complex commands, recommend visualization options, and convert the data to a visual presentation would be valuable and useful to both casual and sophisticated users.
Current computing tools for visualizing data include hardware tools, software tools, and World Wide Web-based tools. For example, graphing calculators, one type of a visualization tool, are widely used hardware tools for computations. Graphing calculators, however, have a primitive and small display, are highly specialized, lack the ability to expand, and require a user to individually input data. Moreover, graphing calculators require a battery and can be expensive.
Examples of software computing and visualization tools include programs such as Matlab®, Maple™, Mathematica, spreadsheet programs such as Excel®, and PowerPoint®. Typically, however, these software tools involve a complex interface, require expensive hardware, have poor collaboration, have limited customization capability, are subject to restrictive licensing with high licensing fees, involve multiple editions, and require maintenance and upgrade costs. Furthermore, these software tools lack the ability to access, distill, analyze, and organize data and recommend appropriate visualization options to a user.
Due to its wide use in the modern world, the World Wide Web (commonly shortened to “Web”) has become the basis of modern technology and a desirable platform for computing tools. Currently available Web-based computing tools, however, have limited features and poor navigation. Additionally, current Web-based computing tools have confusing interfaces and are not scalable or reusable. Finally, the current tools lack a user community, have no collaboration features, and lack the ability to distill, analyze and organize varied data for visualization.
Recent trends in Web development revolve around Web 2.0, which refers to the transition of websites from isolated information sites to interlinked computing platforms that act like software to the user. In general, Web 2.0 surpasses the original Web with its information storage, creation, and dissemination capabilities. The infrastructure of Web 2.0 includes server-software, content-syndication, messaging-protocols, standards-based browsers with plug-ins and extensions, and various client-applications. Additionally, recent trends include cloud computing, in which IT-related capabilities are provided as a service, allowing users to access technology-enabled services from the internet without knowledge of, expertise with, or control over the technology infrastructure that supports them. Similarly, ubiquitous computing has become prevalent; information processing has been integrated into everyday objects and activities such that a person engages many computational devices and systems simultaneously in the course of ordinary activities, possibly without even being aware of doing so. Another trend, the Semantic Web, is an evolving extension of the World Wide Web in which the semantics of information and services on the Web is defined, making it possible for the Web to understand and satisfy the requests of people and machines to use the Web content.
Web 2.0 supports technologies such as weblogs, social bookmarking, wikis, podcasts, RSS feeds, social software, web application programming interfaces (APIs), and online Web services. Web 2.0 websites exhibit characteristics such as delivering and allowing users to use applications entirely through a Web browser; allowing users to own data on a site and exercise control of the data; having users add value to an application as the user uses it; providing an interactive and rich interface based on Ajax (short for “Asynchronous JavaScript and XML”) or similar frameworks; and providing some social-networking aspects.
With Web 2.0, Web-based applications and desktops have evolved. Through Ajax, Adobe® Flex®, Microsoft® Silverlight™, or similar rich Internet application frameworks, developers have been able to provide richer user experiences through websites that mimic personal computer applications such as word-processing and spreadsheet applications. Additionally, users can now use several browser-based operating systems or online desktops, which function as application platforms rather than as operating systems per se. While these services appear to the user as a desktop operating system, they are capable of running within any modern browser.
In addition to rich Internet application frameworks, Web 2.0 websites typically also include semantically valid XHTML and HTML markups; microformats enriching pages with additional semantics; folksonomies, such as tags or tagclouds; cascading style sheets; and REST and/or XML- and/or JSON-based APIs. Web 2.0 websites also include syndication, aggregation and notification of data in RSS or Atom feeds; client- and server-side mashups, which merge content from different sources; weblog publishing tools; wiki or forum software to support user generated content; OpenID for transferable user identity; and use of open source software. It would be desirable to create a Web-based computing platform with rich interactive features that includes computing and visualization tools.
Accordingly, it is an object of this invention to create a dynamic Web-based computing platform that involves Web services and Web applications that are easy to navigate and understand, are scalable and reusable, and contain rich features. It is particularly an object of this invention to provide a visualization computing tool where users can access the tool and data from any computer, can collaborate with others, can publish results, can solicit assistance from a worldwide audience, and can print or email their visual displays and computations. It is an object of this invention to enable visualization developers to design presentation programs without having to worry about data mining, data processing and compatibility issues. It is a further object of this invention to provide a visualization tool capable of accessing various forms of live and static data, analyzing the data, organizing the data, and presenting a user with various visualization options for displaying the data.
This invention involves a method of extracting data and recommending and generating visual displays of data by executing a visualization tool that operates as part of a comprehensive Web-based computing platform. The computing platform and visualization tool can be accessed via a website, customizable interface, email, telephone, or other remote communication device. The visualization tool operates by accessing the data source and executing an analysis engine to parse and extract numerical and other forms of data. The visualization tool also executes a recommendation engine that considers the extracted data, recommends suitable visual display styles and visual display options, and recommends additional compatible algorithms. Users can also provide their own compatible algorithms for data processing. The user selects one or more display styles or graphs and display options and, if there are compatible algorithms, can also select a pre-programmed algorithm or a user-generated algorithm. The visualization tool then transforms the data with a computation engine according to the user's selections and through execution of any selected algorithms, and outputs a file according to a given protocol. The output file can then be accessed by third-party presentation programs that use the same given protocol to generate a visual display. Finally, the visualization tool can deliver the visual display to the user and can optionally save or publish the display as well.
The data accessed by the visualization tool can be live or static data, and there can be multiple data sources. Additionally, the visualization tool can be applied to user-supplied data, data stored in a data repository on the computing platform, data stored elsewhere on the internet, or mined data. The visualization tool can access and execute a data mining tool and a semantic template editor that are part of the Web-based computing platform to download, parse and extract data from single or multiple pages of numerical or other forms of data. Additionally, the visualization tool can access a conversion tool to convert, for example, between equations and data or between different API protocols, and can access other features of the Web-based computing platform, such as a library of equations, to enhance the visualization tool's features. Through the computing platform, the visualization tool also can be incorporated into user-created applications on the computing platform and can be available for a community of users for sharing and discussing their results and research.
a is a schematic of services and tools offered by the Web-based computing platform.
b is a diagram of the distributed computing model preferably used by the Web-based computing platform.
a is a schematic of the data repository of the visualization tool of the Web-based computing platform.
b is a schematic of the data mining tool of the Web-based computing platform.
c is a schematic of the analysis engine of the Web-based computing platform.
d is a schematic of the markup language parser of the analysis engine of the Web-based computing platform.
e is an illustration of the Web-based interface and template markup of the semantic template editor of the Web-based computing platform.
a is a schematic of the recommendation engine of the Web-based computing platform.
b is a diagram of the ranking and sorting options for the recommendation engine of the Web-based computing platform.
a is a schematic of the transformation interfaces of the Web-based computing platform.
b is an example of the output of the computation engine according to the visualization API protocol and the associated generated display.
c is another example of the output of the computation engine according to the visualization API protocol and the associated generated display.
a is a schematic of an embodiment of the visualization tool for displaying equations as flowcharts.
b is a schematic of an embodiment of the visualization tool where multiple live data sources are compared.
c is a schematic of an embodiment of the visualization tool for generating equations and data where the input data is a graph.
a illustrates the overall features of the Web-based computing platform 100 that includes and supports a visualization tool 110. As shown, the Web-based computing platform includes the visualization tool 110, a data repository 120, a library of equations and relationships 130, a community of users 140, multiple user interfaces 150, and a payment system 160. The Web-based computing platform 100 also includes a recommendation engine 170, transformation interfaces 180, abilities to display visualizations 190, a data and equation generator 200, a gallery 210, Web services 220, and a data mining tool 250. In general, the computing platform 100 provides the ability to create, import, export, tag, share, and store data and visualizations and associated programs and research; access to a community for publishing visualizations and collaborating; and a point-based payment method related to the complexity of the visualizations and associated computations. The visualization tool 110 and overall computing platform 100 can be accessed through a customizable user interface 154, a website interface 151, an email interface 153, an application programming interface 155, or through other remote communication devices and methods 152 such as mobile devices, telephones, and text messaging. The visualization tool 110 can display visualizations 190 and can access the data repository 120, the recommendation engine 170, the transformation interfaces 180, the data and equation generator 200, the gallery 210, the community 140, and the library of equations and relationships 130 of the computing platform 100.
In general, the computing platform 100 and visualization tool 110 can be accessed by any user from any computer or other communication device having access to the World Wide Web. No other special software or hardware is needed on the user's local computer or communication device. The functions performed by the computing platform are preferably done on the server-side so that code execution on the end user's computer is not necessary. Large calculations are handled by multiple computers for faster results. The platform provides an application programming interface (API) that allows users to utilize and implement features the computing platform provides for their own applications, websites or Web services.
b illustrates the distributed computing system preferably used by the Web-based computing platform, which allows the Web-based computing platform features and tools to operate in a decentralized environment. As illustrated in
If a user contacts the computing platform via email, the website may record the email address of the user for billing purposes and optionally for a usage log assigned to each user. For example, when the visualization tool and a presentation program generate a visual display, the display can be delivered to the user via an email and it can also be stored in the data repository of the computing platform. Similarly, if a user contacts the Web-based computing platform via telephone or text message, the computing platform records the incoming phone number and uses voice recognition to record the input. The visualization tool and the presentation program generate the display and then the result or display can be saved as a static or animated file and delivered to the user via a text or picture message. Finally, if the user accesses the visualization tool via the website directly, then the result or visual display can be displayed on the user's monitor or display screen. Additional alternatives include emailing the request and instructions and receiving a picture message in response or telephoning the request and instructions and receiving an email in response. Any combination of input methods and output delivery methods can be used.
When using the computing platform overall and, in particular, when using the visualization tool, a user can register and be assigned a unique username or identification code that he inputs for billing purposes, for recording or logging purposes, for storing and later publishing or exporting data and visual displays, and for designating preferences with respect to delivery methods and other options. Additionally, each user can have a personalized view of the computing platform's website that reflects his particular use of the system. For example, a user's frequently-used equations or functions can be prominently listed or displayed. Likewise, a user's community connections and collaborative efforts can be displayed, highlighted, linked or similarly noted and personalized. Additionally, the website can provide additional social networking features such as displaying a list of associates, forming or joining an interest group, and searching for people with similar interests.
Accessing user-selected data sources with the data repository 120 of the visualization tool 110 involves accessing several data sources and features. The data repository 120 includes, for example, existing stored data on the World Wide Web 121 or in the data repository 120, uploaded or imported data 122, live data 123, generated data 124, mined data 125, and any other form of data that can be delivered to the Web-based computing platform. For example, data can be generated from equations, either user-defined equations or equations accessed through the computing platform, or from visual displays. A user can also import live or static data using the computing platform. For example, a user may want to import data from files such as Excel, Word, PDF, txt, gif, xml, png, or zip files. Data can also be imported, for example, from Matlab® programs. Users can collect live or static data with common devices such as cell phones via SMS or with special hardware such as temperature sensors or blood pressure monitors. Users can also tag and give meanings to pure numbers and data. Additionally, users can access the data repository of the computing platform, link to other external databases, and communicate with other Web services. Users also can cross-reference and perform calculations with multiple data sources, for example a stock market index versus interest rates. Data can also be mined with a data mining feature from external sources.
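As a rough illustration of cross-referencing two data sources, the following Python sketch pairs a stock market index with interest rates on the dates the two series share and then performs a calculation on the paired values. The function names, the date-keyed dictionary layout, and the sample numbers are assumptions made for this example only; the specification does not prescribe how the platform stores or joins its data.

```python
from datetime import date


def cross_reference(series_a, series_b):
    """Pair values from two date-keyed series on the dates they have in common."""
    shared = sorted(series_a.keys() & series_b.keys())
    return [(d, series_a[d], series_b[d]) for d in shared]


def ratio(rows):
    """Example calculation: first value divided by second value per shared date."""
    return {d: a / b for d, a, b in rows}


# Made-up sample values, for illustration only.
stock_index = {
    date(2007, 10, 1): 1500.0,
    date(2007, 10, 2): 1510.0,
    date(2007, 10, 3): 1490.0,
}
interest_rate = {
    date(2007, 10, 2): 5.0,
    date(2007, 10, 3): 4.0,
    date(2007, 10, 4): 4.0,
}

rows = cross_reference(stock_index, interest_rate)
index_per_rate = ratio(rows)
```

Only dates present in both sources survive the join, so a gap in either live feed simply drops that date from the comparison.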
a illustrates the data repository 120 features of the Web-based computing platform. The data repository can store and access any type and format of data. In general, data includes data, data information and data format, and data format includes any information about the data structure including data schema and data patterns. Additionally, data can be numerical, string, or generic, and can be tabular, vector, relational, work flow, hierarchical, animated, or streamed. When using the visualization tool 110, a user enters or selects 400 a data source or chooses to run 401 the data mining tool 250. The data source can be data already stored in the data repository or it can be uploaded data, live data, generated data, data stored elsewhere, or other forms of data. If the user entered, selected, or provided a data source 400, then the data source is reviewed 402 to see if it is already stored in the data repository 120 or elsewhere. If it is stored in the data repository 120, then the visualization tool can proceed to the recommendation engine 170. The recommendation engine 170 will be described in detail below with respect to
b illustrates the data mining tool 250. Through the computing platform's user interface 150, the user inputs, selects, or provides 400 one or more resources, pages, URLs, documents or other data source from which the computing platform should mine data. For example, if the user wants to mine data from U.S. Pat. No. 7,000,000 as available on the USPTO website, the user inputs the URL http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&I=50&s1=7000000.PN.&OS=PN/7000000&RS=PN/7000000. Alternatively, if the user wants to mine data from U.S. Pat. Nos. 7,000,000 through 7,000,005, the user inputs the URL pattern http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&I=50&s1={7000000-7000005}.PN.&OS=PN/{7000000-7000005}&RS=PN/{7000000-7000005}. The URL pattern may also be in regular expression or wild card format. Through the user interface 150, the user also can set additional parameters such as the number of data mining agents and the data mining schedule. Additionally, through the user interface 150 the user can monitor the speed and cost of the data mining job, can save selections as a data mining job for immediate or delayed execution, and can combine multiple saved data mining jobs as a batch data mining job. After the data mining parameters are set, the data mining tool checks the server's robots.txt file for permissions and, if allowed, downloads the first page 411. If the page contains numerical data, the data mining tool 250 accesses and runs the analysis engine 240 and generates a template for parsing the data 414. If the page contains data in the form of text, the data mining tool accesses and runs the semantic template editor 260 and generates a template for parsing the text data 414.
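The URL pattern and robots.txt check described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the platform's implementation: `expand_url_pattern` and `allowed_by_robots` are hypothetical helper names, the sketch assumes every `{start-end}` range in a pattern is identical (as in the example pattern above), and the robots.txt text is passed in as a string so that nothing is downloaded. The robots.txt parsing uses Python's standard `urllib.robotparser` module.

```python
import re
from urllib.robotparser import RobotFileParser

# Matches a numeric range written as {start-end}, e.g. {7000000-7000005}.
RANGE = re.compile(r"\{(\d+)-(\d+)\}")


def expand_url_pattern(pattern):
    """Expand the {start-end} ranges of a URL pattern into concrete URLs.

    Assumption of this sketch: all ranges in one pattern are identical and
    are substituted in lockstep with the same number.
    """
    ranges = {(int(a), int(b)) for a, b in RANGE.findall(pattern)}
    if not ranges:
        return [pattern]
    if len(ranges) != 1:
        raise ValueError("this sketch expects identical ranges in a pattern")
    start, end = ranges.pop()
    return [RANGE.sub(str(n), pattern) for n in range(start, end + 1)]


def allowed_by_robots(robots_txt, url, agent="example-miner"):
    """Check a URL against robots.txt content that was already retrieved."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder host and rules, for illustration only.
urls = expand_url_pattern(
    "http://example.com/doc?s1={7000000-7000005}.PN.&OS=PN/{7000000-7000005}"
)
rules = "User-agent: *\nDisallow: /private/\n"
permitted = [u for u in urls if allowed_by_robots(rules, u)]
```

A production miner would additionally support the regular expression and wild card formats mentioned above, which this sketch leaves out.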
The user can choose whether to run the analysis engine 240 or the semantic template editor 260. Alternatively, the data mining tool first accesses and executes the analysis engine 240 and then accesses the semantic template editor 260 only if the analysis engine 240 was not appropriate. After a template has been generated, the next page of data is downloaded 415 and the data is parsed according to the previously generated template 416. This process repeats for each additional page of data. The data and the database schema are then saved 417 in the data repository 120.
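The generate-once, parse-many loop described above might look like the following Python sketch. It assumes, purely for illustration, that a template is a set of labeled fields turned into regular expressions and that pages are already downloaded as text; the names `build_template`, `parse_page`, and `mine`, the field labels, and the sample pages are all hypothetical.

```python
import re


def build_template(field_labels):
    """Build a parsing template: one compiled regex per labeled field.

    Each regex captures the rest of the line that follows the label.
    """
    return {
        name: re.compile(re.escape(label) + r"\s*(.+)")
        for name, label in field_labels.items()
    }


def parse_page(page_text, template):
    """Extract one record from a page using a previously generated template."""
    record = {}
    for name, pattern in template.items():
        match = pattern.search(page_text)
        record[name] = match.group(1).strip() if match else None
    return record


def mine(pages, field_labels):
    """Generate the template once, then apply it to every page in turn."""
    template = build_template(field_labels)
    return [parse_page(page, template) for page in pages]


# Placeholder pages, for illustration only.
pages = [
    "Title: Widget\nInventor: A. Example",
    "Title: Gadget\nInventor: B. Example",
    "Title: Gizmo",
]
records = mine(pages, {"title": "Title:", "inventor": "Inventor:"})
```

Fields that a later page lacks come back as `None` rather than stopping the job, which keeps the saved records aligned with a single schema.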
c illustrates the analysis engine 240 that is used by both the data mining tool 250 and the data repository 120. The analysis engine 240 begins by retrieving 420 the data if it has not already been retrieved. From the data, it next trims the leading and trailing spaces 421 and runs the markup language parser 450, which will be discussed in detail with respect to
In
e illustrates the semantic template editor 260 used by the data mining tool 250 for data that is not numerical. If the analysis engine 240 determines that the data is not numerical, the semantic template editor 260 is accessed. The semantic template editor 260 is an HTML-like editor where the user can either use a graphical WYSIWYG (What You See Is What You Get) interface 490 or alter the template markup manually 495. To use the graphical interface 490, the first page of the data is displayed and the user highlights the text to use as a data field. Additional information fields may be used to assign meaning to content. For example, in
a illustrates the recommendation engine 170 of the visualization tool 110. The computing platform 100 accesses and executes the recommendation engine 170 to determine what types of visual displays or presentations would be appropriate or suitable given the data. The analysis engine retrieves 501 data information including data and any data schema from the data repository 120 and next compares 502 it to a look-up table of visualization styles 502a. Following is an example of the visualization styles look-up table 502a:
The analysis engine further compares 503 the data information to a look-up table of algorithms 503a. Following is an example of the algorithm look-up table 503a:
The analysis engine compares 507 the data type patterns to a set of stored data type patterns 507a. Following is an example of stored data type patterns:
The analysis engine can additionally rank or sort 504 the results of the visualization style compatibility look-up table 502a and the algorithm compatibility look-up table 503a and stored data type patterns 507a to reflect popularity of styles, uses, preferences, and published information. For example, the results can be sorted based on identification, title, hits, published date, style, mode, developer, and user rating. Additionally, the user can choose how he would like his results sorted. See
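The look-up and ranking steps can be pictured with the short Python sketch below. The entries in `STYLES`, the `accepts` field, and the function name `recommend` are invented for this example and do not reproduce the look-up tables 502a, 503a, or 507a; the sketch only shows a compatibility filter followed by a user-selectable sort.

```python
from dataclasses import dataclass


@dataclass
class Style:
    title: str
    accepts: frozenset  # data structures this style is assumed to display
    hits: int
    user_rating: float


# Invented look-up table entries, for illustration only.
STYLES = [
    Style("line chart", frozenset({"tabular", "streamed"}), hits=120, user_rating=4.1),
    Style("tree map", frozenset({"hierarchical"}), hits=40, user_rating=4.6),
    Style("bar chart", frozenset({"tabular"}), hits=300, user_rating=3.9),
]


def recommend(data_structure, sort_key="user_rating"):
    """Return compatible styles, highest value of the chosen sort key first."""
    matches = [s for s in STYLES if data_structure in s.accepts]
    return sorted(matches, key=lambda s: getattr(s, sort_key), reverse=True)


by_rating = [s.title for s in recommend("tabular")]
by_hits = [s.title for s in recommend("tabular", sort_key="hits")]
```

Letting the caller pass the sort key mirrors the text's point that the user can choose how the results are sorted.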
a illustrates the transformation interfaces 180 accessed and executed by the visualization tool 110. In general, the transformation interfaces 180 include an interface where a user can select display styles, display options, and, if appropriate, additional built-in or pre-programmed algorithms or user-provided algorithms. The transformation interfaces 180 include a computation engine 185 that executes any selected algorithms and performs any calculations associated with the chosen visual style and outputs the data as a file according to a given API protocol, such as a visualization API protocol of the computing platform as shown in
After a user has chosen a built-in or pre-programmed algorithm module or a user-programmed algorithm, the computation engine 185 executes any selected algorithms and performs any calculations associated with the chosen visual style and outputs the data into a file formatted according to the given API protocol 608. If desired, the output file can also include instructions for embedding advertisements when the visual display is later generated. The visualization tool 110 then uses the output of the computation engine 185 with a third-party presentation program 235 compatible with the given API protocol. For example, using the given API protocol, designers can build visualization and Flash-based applications to generate visual displays from the output of the computation engine 185. Third-party presentation programs 235 include programs developed by independent programmers and designers as well as any presentation programs that are part of the Web-based computing platform 100 or directly associated with the visualization tool 110. Additionally, the visualization API protocol of the computing platform 100 is backwards compatible so that its results can also be converted to a protocol designed for use with additional third-party presentation programs. The display created by the third-party presentation program 235 can then be displayed 190 by the Web-based computing platform 100 according to the user's preferred display method, stored in a user's account, or published to a private or public website, a blog, a mobile device, a gallery or elsewhere.
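To make the hand-off between the computation engine and a presentation program concrete, the Python sketch below applies a selected algorithm to the data and serializes the result together with the chosen style and options. The JSON layout and the field names `style`, `options`, and `data` are assumptions made for this example; the format of the visualization API protocol is not given in this text.

```python
import json


def build_output(style, options, series, algorithm=None):
    """Apply the selected algorithm, if any, and serialize the result.

    The JSON structure here is a hypothetical stand-in for the given
    API protocol.
    """
    values = [algorithm(v) for v in series] if algorithm else list(series)
    payload = {"style": style, "options": options, "data": values}
    return json.dumps(payload, sort_keys=True)


def read_output(text):
    """What a compatible presentation program might do first: parse the file."""
    return json.loads(text)


output = build_output(
    "bar chart", {"title": "Doubled values"}, [1, 2, 3], algorithm=lambda v: v * 2
)
parsed = read_output(output)
```

Because both sides depend only on the shared format, a presentation program written against it never needs to know how the data was mined or computed.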
a illustrates one example of the visualization tool 110 as applied to data in the form of an equation. The visualization tool 110 can transform data in the form of equations into a visual algorithm. For this type of visual display, the visualization tool 110 is preferably accessed directly by the website associated with the computing platform. First, the user inputs an equation 801 as the data source and the computing platform accesses and executes 803 the visualization tool. Through the visualization tool, if the user knows he wants to display a flowchart, he can proceed directly to the transformation interface. Otherwise, he can access and execute the recommendation engine 170, which would suggest a flowchart as a suitable display style. Next, the user elects through the transformation interfaces 180 to generate the visual display style of a flowchart. The computation engine 185 then generates an output file according to the visualization API protocol. A third-party presentation program 235 builds a flowchart display 804, which the visualization tool 110 can then display 190 on the user's computer screen. If desired, the solution can also be displayed. Additionally, the equation in a simplified form can be displayed. As with the other visualization options, the process also can be reversed: a flowchart can be supplied by the user and the corresponding equation can be generated.
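One way to picture the equation-to-flowchart transformation is to break an arithmetic expression into an ordered list of operation boxes. The Python sketch below does this with the standard `ast` module; the function name `flowchart_steps`, the `t1`, `t2` box labels, and the restriction to basic binary arithmetic are assumptions of this example, not the method used by the computation engine 185.

```python
import ast

# Operators this sketch knows how to draw as flowchart boxes.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "**"}


def flowchart_steps(expression):
    """Return ordered (label, operation) boxes for an arithmetic expression."""
    steps = []

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            # Visit operands first so inner operations get earlier boxes.
            left, right = visit(node.left), visit(node.right)
            label = f"t{len(steps) + 1}"
            steps.append((label, f"{left} {OPS[type(node.op)]} {right}"))
            return label
        raise ValueError("unsupported expression element")

    visit(ast.parse(expression, mode="eval").body)
    return steps


boxes = flowchart_steps("a * b + c / 2")
```

Each box refers to earlier boxes by label, so a presentation program could draw the arrows directly from the operation strings.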
b illustrates another example of the visualization tool 110 of the computing platform 100 as it is applied to multiple live data sources. In this example, multiple data sources can be analyzed and a custom algorithm can be provided. The user can choose to have the display be a simple message. For example, a user may want to know whether to take Route A to get home from work or Route B to get home from work based on live reports of current traffic conditions. Current traffic conditions are recorded and broadcast by various agencies and news organizations. The user accesses the Web-based computing platform through the website and selects multiple live data sources 821. The visualization tool 110 accesses and executes the analysis engine 240. Then, through the transformation interfaces 180, the user either selects a built-in algorithm module or inputs a user-generated algorithm for comparing the two data sources and determining which one is larger, i.e., the slower route 823. For example, the user elects to use the built-in algorithm Route A - Route B, associates Route A and Route B each with a live data source, and then requests the solution to identify the slower route. The computation engine 185 calculates the data and creates an output file according to the visualization API protocol. A third-party presentation program 235 then generates a display 190 saying which route is slower, which is delivered to the user by the visualization tool 110. For example, the message 824 “Route A is slower” is displayed if the solution is greater than 0, or the message “Route B is slower” is displayed if the solution is less than 0. The visualization tool tracks the live data and the presentation program is executed again when the solution changes 825. As shown by this example, a user can associate a computation with any live data source and can request appropriate messages depending on the solutions.
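The route comparison above reduces to the sign of Route A minus Route B, re-evaluated as the live data changes. The Python sketch below shows that logic; the function names, the use of travel time in minutes as the live value, and the message for a solution of exactly 0 (which the text does not specify) are assumptions of this example.

```python
def slower_route_message(route_a, route_b):
    """Map the sign of (Route A - Route B) to a display message."""
    solution = route_a - route_b
    if solution > 0:
        return "Route A is slower"
    if solution < 0:
        return "Route B is slower"
    # Assumption: the text does not say what to display for a tie.
    return "Route A and Route B are equal"


def messages_on_change(readings):
    """Yield a message each time the solution differs from the previous one."""
    previous = None
    for route_a, route_b in readings:
        solution = route_a - route_b
        if solution != previous:
            yield slower_route_message(route_a, route_b)
            previous = solution


# Made-up live readings (minutes for Route A, minutes for Route B).
live = [(30, 25), (30, 25), (20, 25)]
shown = list(messages_on_change(live))
```

The repeated second reading produces no new message, matching the description that the presentation program runs again only when the solution changes.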
As with the other examples, the user may also input his unique identification code or username for billing purposes and for recording the session. Additionally, a user may save the computation and personalize his access to and use of the platform to always include a display of the equation and its solution.
c illustrates a third example of the visualization tool 110 as applied to data already in the form of a graph. The graphical data can be transformed into an equation and data. First, a user inputs data by drawing a graph on the chosen interface 810.
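As one possible way to turn a drawn graph into an equation and data, the Python sketch below fits a straight line to sampled points by ordinary least squares. The choice of a linear least-squares fit, the function names, and the sample points are assumptions made for illustration; the text does not state which fitting technique the visualization tool uses.

```python
def fit_line(points):
    """Fit y = slope * x + intercept to (x, y) points by least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def as_equation(slope, intercept):
    """Render the fitted line as a readable equation string."""
    return f"y = {slope:g} * x + {intercept:g}"


# Made-up points sampled from a user's drawn graph.
drawn = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
slope, intercept = fit_line(drawn)
equation = as_equation(slope, intercept)
```

The sampled points serve as the generated data, and the fitted coefficients give the generated equation; curved graphs would need a different model than this straight-line sketch.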
An additional feature of the computing platform and visualization tool is a point-based and complexity-based payment system 160 that accurately reflects the complexity of the visualizations displayed and computations performed. See
According to the payment system 160 illustrated in
Finally, the computing platform includes a gallery 210 for showcasing visualizations and algorithms and a community for users 140 to collaborate and exchange expertise with others. Collaborative uses of the website include using it as an education center for schools. For example, teachers can use the visualization tool to track students' performance and create visual displays. The platform can also be used for medical uses, such as collecting and tracking data related to a patient's or user's diet or blood pressure. Additional collaborative uses include scientists and engineers sharing their research and results with colleagues around the world, finding people with similar interests who may contribute to research, publishing live results, and accessing other community members' research and results.
Throughout the specification the aim has been to describe the invention without limiting the invention to any one embodiment or specific collection of features. Persons skilled in the relevant art may realize variations from the specific embodiments that will nonetheless fall within the scope of the invention.
This application claims the benefit of U.S. Provisional Application No. 61/000,618 filed Oct. 26, 2007.
Number | Date | Country | |
---|---|---|---|
61000618 | Oct 2007 | US |