A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the United States Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
Data-analysis is a process involving the organization, examination, display, and analysis of collected data using narratives, figures, charts, graphs and tables. Data analyses are aided by data-analysis processors, which are computational engines, either in hardware or software, that can execute the data-analysis process. High-end data-analysis processors typically have a language component like the R, S, SAS, MATLAB®, Python, and Perl families of languages. The availability of a language component facilitates data-analysis in numerous ways including the following: arbitrary data transformations; applying one analysis result to results from another; abstraction of repeated complex analysis steps; and development of new methodology.
A principal challenge of data-analysis processors is communicating the results of data-analysis to data owners. Generation of reports as part of a data-analysis project typically employs two separate steps. First, the data are analyzed using a data-analysis application based on a data-analysis processor. Second, data-analysis results (tables, graphs, figures) are used as the basis for a report document using a word processor application. Although many data-analysis applications try to support this process by generating pre-formatted tables, graphs and figures that can be easily integrated into a report document using copy-and-paste from the data-analysis application to the word processor application, the basic paradigm is to construct the report document around the results obtained from data-analysis.
Another approach for integration of data-analysis and report document generation is to embed the data-analysis itself into the report document. The concepts of “literate programming systems”, “literate statistical practice” and “literate data-analysis” are major efforts in this area. Proponents of this approach advocate software systems for authoring and distributing these dynamic data-analysis documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are dynamic in that the contents, including figures, tables, etc., can be recalculated each time a view of the document is generated. The advantage of this integration is that it allows readers to both verify and adapt the data-analysis process outlined in the document. A user can readily reproduce the data-analysis at any time in the future, and a user can present the data-analysis results in a different medium.
Whatever the precise merits and features of the prior art in this field, the earlier art does not achieve or fulfill the purposes of the present invention. The prior art does not provide for the following:
Accordingly, a need exists for an object-oriented framework that supports the creation of computer-implemented applications, methods and systems that enable users to generate a data-analysis results collection and at least one data-analysis results document using new software applications or familiar existing software applications like a word processor application.
In accordance with the present invention, the above and other problems are solved by providing the following:
The object-oriented framework comprises a collection of software objects and interfaces that facilitate the following activities:
In accordance with the present invention, a method is provided that employs the object-oriented framework to generate a data-analysis results collection. This method entails instantiating a data-analysis parts container object (comprising a property identifying a platform runtime for the data-analysis parts container); reading the platform runtime property of the data-analysis parts container; instantiating a platform runtime object associated with the platform runtime property; and finally communicating the data-analysis parts container object to the platform runtime object. This method results in the generation of a data-analysis results object.
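By way of illustration only, and not limitation, the four steps of this method can be sketched in Python. The names `PartsContainer`, `EchoRuntime`, `RUNTIMES`, and `generate_results` are hypothetical and are not part of the framework described herein; the runtime merely echoes each code part rather than invoking a real data-analysis processor.

```python
from dataclasses import dataclass, field


@dataclass
class PartsContainer:
    """Hypothetical stand-in for a data-analysis parts container."""
    platform_runtime: str                      # property identifying the runtime
    parts: list = field(default_factory=list)  # code parts to be analyzed


class EchoRuntime:
    """Hypothetical runtime: echoes each part instead of running a processor."""

    def run(self, container):
        return [f"result of {part}" for part in container.parts]


# Hypothetical lookup from platform runtime key to runtime class.
RUNTIMES = {"echo": EchoRuntime}


def generate_results(container):
    key = container.platform_runtime   # read the platform runtime property
    runtime = RUNTIMES[key]()          # instantiate the associated runtime object
    return runtime.run(container)      # communicate the container to the runtime


# Instantiate a container, then generate its results collection.
container = PartsContainer(platform_runtime="echo", parts=["mean(x)", "plot(x)"])
results = generate_results(container)
```

In this sketch the container carries only a string key, so the caller need not know which runtime class will perform the analysis.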
In addition, a computer program product is also provided that employs the object-oriented framework to generate a data-analysis results collection. This computer program product comprises a data-analysis parts container object (comprising a property identifying a platform runtime); a platform runtime associated with the platform runtime property; and an object from the object-oriented framework for instantiating the platform runtime object for generating a data-analysis results collection.
Moreover, a computer program product is additionally provided that employs the object-oriented framework and a word processor application to generate a data-analysis results collection. This computer program product comprises a word processor application; a data-analysis template comprising a word processor document and a data-analysis parts container; and a plug-in module for use in the word processor application which facilitates data-analysis on the data-analysis template using the object-oriented framework.
In accordance with the present invention, a method is provided that employs the object-oriented framework to generate at least one data-analysis results document. This method entails instantiating a data-analysis template object comprising a word processor document object and a data-analysis parts container object; instantiating a data-analysis results object; specifying an export service; instantiating an export service object (associated with the specified export service); and communicating the data-analysis template and the data-analysis results object to the export service object. This method results in the generation of a data-analysis results document object.
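By way of illustration only, the document-generation steps can be sketched in Python as follows. All names (`Template`, `HtmlExportService`, `EXPORT_SERVICES`, `generate_document`) are hypothetical, and the use of `{0}`, `{1}` placeholders to mark where results belong in the document is an assumption made for this sketch only.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical data-analysis template: document text plus code parts."""
    document: str   # text with {0}, {1}, ... placeholders, one per part
    parts: list     # code parts whose results fill the placeholders


class HtmlExportService:
    """Hypothetical export service that renders the results into HTML."""

    def export(self, template, results):
        body = template.document.format(*results)
        return f"<html><body><p>{body}</p></body></html>"


# Hypothetical lookup from export service key to export service class.
EXPORT_SERVICES = {"html": HtmlExportService}


def generate_document(template, results, service_key):
    service = EXPORT_SERVICES[service_key]()   # instantiate the specified export service
    return service.export(template, results)   # communicate template and results to it


template = Template(document="Mean is {0}; median is {1}.",
                    parts=["mean(x)", "median(x)"])
results = ["4.2", "4.0"]   # stands in for a data-analysis results object
html = generate_document(template, results, "html")
```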
In addition, a computer program product is also provided that employs the object-oriented framework to generate at least one data-analysis results document. This computer program product comprises a data-analysis template comprising a word processor document object and a data-analysis parts container object; a data-analysis results object comprising a data-analysis results collection generated using the data-analysis parts container object; a specified export service; an export service object (associated with the specified export service) for generating a data-analysis results document; and an object from the object-oriented framework that communicates the data-analysis template object and the data-analysis results object to the export service object to generate a data-analysis result document.
Finally, a computer program product is additionally provided that employs an object-oriented framework for data-analysis and a word processor application to generate at least one data-analysis results document. This computer program product comprises a word processor application, a data-analysis template comprising a word processor document and a data-analysis parts container, and a plug-in module for use in the word processor application for generating a data-analysis results document using the object-oriented framework.
Referring now to the drawings, in which like numerals represent like elements through several figures, aspects of the present invention and the exemplary operating environment will be described.
The steps of the claimed method and apparatus are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the methods or apparatus of the claims include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The steps of the claimed method and apparatus may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and other computer instructions or components that perform particular tasks or implement particular abstract data types. The methods and apparatus may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network, such as web services. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
With reference to
Computer 110 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 110 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 110. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation,
The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media discussed above and illustrated in
The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in
When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
A software framework is a defined support structure in which another software project can be organized and developed. A framework may include support programs, code libraries, a scripting language, or other software to help develop and glue together the different components of a software project. Frameworks are designed with the intent of facilitating software development, by allowing designers and programmers to spend more time on meeting software requirements rather than dealing with the more tedious low-level details of providing a working system. By bundling a large amount of reusable code into a framework, much time is saved for the developer, since he/she is saved the task of rewriting large amounts of standard code for each new application that is developed. Application frameworks are especially useful when implementing graphical user interfaces (GUIs), since these tend to promote a standard structure for applications. It is also much simpler to create automatic GUI creation tools when a standard framework is used, since the underlying code structure of the application is known in advance. Object-oriented programming techniques are preferably used to implement frameworks such that the unique parts of an application can simply inherit from pre-existing classes in the framework; such frameworks are referred to as “object-oriented frameworks.”
Those skilled in the art will recognize that object-oriented frameworks are items of commerce. For example, one of the first commercial frameworks was MacApp, written by Apple Computer for the Macintosh. Microsoft has developed frameworks for its Windows application development beginning with the Microsoft Foundation Classes (MFC) for Visual C++ to the most recent .NET programming platform for Microsoft Visual Studio.
Referring now to
According to a preferred embodiment of the invention, the platform runtimes (data-analysis processors) are represented by pluggable program modules. This allows a programmer to add and to remove different platform runtimes to the object-oriented framework for data analysis. In this way the object-oriented framework and the platform runtimes may be universally adapted to different kinds of data-analysis requirements.
In one embodiment of the invention, pluggable platform runtimes are constructed for data-analysis processors including but not limited to the following: R developed by R-Project for Statistical Computing; Python developed by Python Foundation; and IronPython developed by Microsoft Corporation. In another embodiment of the invention, data-analysis processors include the following: S-Plus™ processor developed by Insightful Corporation; MATLAB™ processor developed by MathWorks Corporation; Perl processor developed by Perl Foundation; SAS™ processor developed by SAS Institute Corporation; Mathematica™ processor developed by Wolfram Research Corporation; Octave processor developed by the University of Wisconsin; F# processor developed by Microsoft Corporation; Haskell processor developed by the Yale Haskell group; and Ruby processor developed by Gardens Point.
According to a preferred embodiment of the invention, the export services are represented by pluggable program modules. This allows a programmer to add and to remove different export services to the object-oriented framework for data analysis. In this way the object-oriented framework and the export services may be universally adapted to different kinds of data-analysis requirements.
In one embodiment of the invention, pluggable export services are constructed for the file export formats including: single file web page (*.mht, *.mhtml); web page (*.htm, *.html); and binary Word (*.doc). In another embodiment of the invention, the file export formats include but are not limited to the following: portable document format (*.pdf); XML paper specification format (*.xps); extensible markup language file (*.xml); rich text format (*.rtf); plain text (*.txt); Word markup language (*.docx); Word markup language macro-enabled document (*.docm); and TeX format (*.tex).
It should be appreciated by those skilled in the art that while embodiments of the exemplary object-oriented framework illustrated in
Referring now to
Within operation 320, the platform runtime property identifies the platform runtime associated with the data-analysis parts container from the platform runtimes library 271, which contains the collection of installed platform runtimes. Each platform runtime 280 is comprised of a platform runtime object 281 and an XML document 282 that defines one or more properties of the platform runtime. Such properties may include the string key associated with the platform runtime and an additional string that identifies the software object that comprises the platform runtime. The string that identifies the software object is used within objects of the object-oriented framework to dynamically instantiate platform runtime objects. It is also contemplated that .NET reflection may be used in place of the XML document 282 to define an installed platform runtime. The object-oriented framework also contains a runtime manager object 235, which may be used to create a list of the available pluggable platform runtimes installed.
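By way of illustration only, the pairing of a string key with a type-identifying string, and the dynamic instantiation driven by that string, can be sketched in Python. The XML element and attribute names below are assumptions, not the schema of XML document 282, and the standard-library class `fractions.Fraction` merely stands in for a platform runtime class so that the sketch is runnable.

```python
import importlib
import xml.etree.ElementTree as ET

# Hypothetical descriptor; the element and attribute names are assumptions.
DESCRIPTOR = '<runtime key="fraction" type="fractions.Fraction" />'


def load_entry(xml_text):
    """Parse a descriptor into (string key, type-identifying string)."""
    node = ET.fromstring(xml_text)
    return node.get("key"), node.get("type")


def instantiate(type_name, *args):
    """Dynamically create an object from a 'module.ClassName' string."""
    module_name, _, class_name = type_name.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(*args)


key, type_name = load_entry(DESCRIPTOR)
obj = instantiate(type_name, 3, 4)   # the class is resolved only at run time
```

Because the class is looked up from a string at run time, a new runtime could be installed by adding a descriptor, without changing the code that performs the instantiation.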
Within operation 330, the instantiated platform runtime object 281 implements the interface IRuntime 221. The instantiated platform runtime object of operation 330 is further comprised of sub-objects that provide additional functionality. The first sub-object is the runtime info object 283 that defines one or more properties of the platform runtime; this object implements the interface IRuntimeInfo 222. The second sub-object is the runtime engine object 284 that performs data-analysis on an entire data-analysis parts container; this object implements the interface IRuntimeEngine 223. The third sub-object is the runtime evaluator object 285 that performs data-analysis on a portion of a data-analysis part; this object implements the interface IRuntimeEvaluator 224. These interfaces are used to ensure that all pluggable platform runtime objects created for use within the object-oriented framework conform to specific application programming interface contracts, such that the objects can operate interchangeably.
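By way of illustration only, the role of these interface contracts can be sketched in Python with abstract base classes. The interface names follow the text above, but every member name (`key`, `evaluate`, `run`) and the toy "sum" runtime are assumptions made for this sketch; the interface IRuntime is omitted for brevity.

```python
from abc import ABC, abstractmethod


class IRuntimeInfo(ABC):
    """Properties of a platform runtime (member names are assumptions)."""

    @property
    @abstractmethod
    def key(self):
        """String key identifying the runtime."""


class IRuntimeEvaluator(ABC):
    """Performs data-analysis on a portion of a data-analysis part."""

    @abstractmethod
    def evaluate(self, expression):
        """Evaluate one expression and return its result."""


class IRuntimeEngine(ABC):
    """Performs data-analysis on an entire parts container."""

    @abstractmethod
    def run(self, parts):
        """Analyze every part and return the list of results."""


class SumInfo(IRuntimeInfo):
    @property
    def key(self):
        return "sum"


class SumEvaluator(IRuntimeEvaluator):
    def evaluate(self, expression):
        # Toy analysis: add up whitespace-separated numbers.
        return sum(float(token) for token in expression.split())


class SumEngine(IRuntimeEngine):
    def __init__(self, evaluator):
        self._evaluator = evaluator

    def run(self, parts):
        return [self._evaluator.evaluate(part) for part in parts]


class BrokenEvaluator(IRuntimeEvaluator):
    """Omits evaluate(), so the contract prevents instantiation."""


info = SumInfo()
engine = SumEngine(SumEvaluator())
totals = engine.run(["1 2", "3 4 5"])

try:
    BrokenEvaluator()
    contract_enforced = False
except TypeError:
    contract_enforced = True
```

Any evaluator that satisfies the contract could be passed to `SumEngine` in place of `SumEvaluator`, which is the sense in which conforming objects operate interchangeably.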
In another embodiment of the invention, additional objects contained in the object-oriented framework may also be leveraged within the implementation of the platform runtime object of operation 330. Such objects include runtime helper 233 (an object containing a collection of subroutines useful in creating new pluggable platform runtimes), workspace 234 (an object that provides a virtual workspace where temporary files can be written and collected), workspace manager 236 (an object that manages the collection of workspace 234 objects created by the system), and code results 232 (an object that assists with the recording and collection of data-analysis result objects).
Referring to
In one embodiment of the invention, the solution employs Word by Microsoft Corporation as the word processor application and a smart document solution as the plug-in software module. Smart document solutions allow programmatic customization of a word processor document or template, including writing code that executes in response to word processor events or custom user-interface additions.
In another embodiment of the invention, the solution may employ Word by Microsoft Corporation as the word processor application and a Microsoft Word Add-In as the plug-in software module. Microsoft Word Add-In solutions allow the addition of custom functions to the application environment.
Within operation 730, the desired export service is specified from the export services library 272, which contains the collection of installed export services. Each export service 290 is comprised of an export service object 291 and an XML document 292 that defines one or more properties of the export service. Such properties may include the string key associated with the export service and an additional string that identifies the software object that comprises the export service. The string that identifies the software object is used within objects of the object-oriented framework to dynamically instantiate export service objects. In another embodiment of the invention, .NET reflection may be used in place of the XML document 292 to define an installed export service. The object-oriented framework also contains an export services manager object 266, which may be used to create a list of the available pluggable export services installed.
Referring again to
Within operation 740, the instantiated export service object 291 implements the interface IExportService 251. The instantiated export service object of operation 740 is further comprised of a sub-object export service info object 293 that defines one or more properties of the export service; this object implements the interface IExportServiceInfo 252. These interfaces are used to ensure that all pluggable export service objects created for use within the object-oriented framework conform to specific application programming interface contracts, such that the objects can operate interchangeably.
Additional objects contained in the object-oriented framework may also be leveraged within the implementation of the export service object of operation 740. Such objects include export services helper 263 (an object containing a collection of subroutines useful in creating new pluggable export services), data-analysis results UI service 264 (an object providing consistent user-interface presentation for data-analysis results and error handling), and word document service 265 (an object encapsulating routines associated with the generation and modification of Microsoft Word documents).
The object-oriented framework consumes objects contained in the data-analysis parts container object model. Further details on the data-analysis parts container object model are described in co-pending U.S. patent application entitled “Method and Apparatus for Utilizing an Extensible Markup Language Data Structure to Define a Data-Analysis Parts Container for Use in a Word Processor Application,” the disclosure of which is incorporated herein, in its entirety.
The following is a description of objects and interfaces comprising application programming interfaces (API) that constitute the object-oriented framework. Following each of the objects set out below is a description of the operation, properties and methods of the object.
The following are properties and methods of the object.
IRuntimeInfo Interface—This interface defines a runtime info object, which defines properties associated with the platform runtime.
The following are properties and methods of the object.
IRuntimeEngine Interface—This interface defines a runtime engine object, used to perform data-analysis on an entire data-analysis parts container.
The following are properties and methods of the object.
IRuntimeEvaluator Interface—This interface defines a runtime evaluator object, used to perform data-analysis on a portion of a data-analysis part.
The following are properties and methods of the object.
CodeResults Object—This object assists with the recording and collection of data-analysis result objects. This object extends CollectionBase and comprises a collection of CodeResult objects.
The following are properties and methods of the object.
AppendCodeResult Method [Return Type CodeResult]—Searches for a specified code result in the collection—if found, the contents of the specified code result are appended; otherwise, the specified code result is added to the internal collection.
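By way of illustration only, the find-then-append-or-add behavior of this method can be sketched in Python. The `name` field used to locate an existing code result is an assumption, as are the field names `text`, `figures`, and `tables`; the source does not state how a code result is matched.

```python
from dataclasses import dataclass, field


@dataclass
class CodeResult:
    """Hypothetical code result holding text, figure, and table output."""
    name: str                                    # assumed key used for the search
    text: str = ""
    figures: list = field(default_factory=list)  # paths to figure files
    tables: list = field(default_factory=list)   # paths to table files


class CodeResults:
    """Hypothetical collection of CodeResult objects."""

    def __init__(self):
        self._items = []

    def __len__(self):
        return len(self._items)

    def append_code_result(self, result):
        # Search the collection for a result with the same name.
        for existing in self._items:
            if existing.name == result.name:
                # Found: append the new contents to the existing result.
                existing.text += result.text
                existing.figures.extend(result.figures)
                existing.tables.extend(result.tables)
                return existing
        # Not found: add the result to the internal collection.
        self._items.append(result)
        return result


collection = CodeResults()
first = collection.append_code_result(CodeResult("block1", "a\n"))
merged = collection.append_code_result(CodeResult("block1", "b\n", figures=["f.png"]))
other = collection.append_code_result(CodeResult("block2", "c\n"))
```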
CodeResult Object—This object encapsulates the text, figure, and table output associated with executing a portion or the entirety of a code block or expression.
The following are properties and methods of the object.
Figure Object—This object records a path to a figure graphics file.
The following are properties and methods of the object.
Table Object—This object records a path to a structured table file.
The following are properties and methods of the object.
RuntimeHelper Object—This object contains a collection of subroutines useful in creating new pluggable platform runtimes.
The following are properties and methods of the object.
Workspace Object—This object provides a virtual workspace where temporary files can be written and collected.
The following are properties and methods of the object.
WorkspaceManager Object—This object manages the collection of workspace objects created by the system.
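By way of illustration only, a workspace and its manager can be sketched in Python using a temporary directory. The method names (`write`, `close`, `create`, `close_all`) are hypothetical; the source states only that a workspace collects temporary files and that the manager tracks the workspaces created.

```python
import os
import shutil
import tempfile


class Workspace:
    """Hypothetical workspace: a temporary directory that tracks its files."""

    def __init__(self):
        self.path = tempfile.mkdtemp(prefix="workspace-")
        self.files = []

    def write(self, name, text):
        """Write a temporary file and record its path."""
        target = os.path.join(self.path, name)
        with open(target, "w", encoding="utf-8") as handle:
            handle.write(text)
        self.files.append(target)
        return target

    def close(self):
        """Delete the directory and everything written into it."""
        shutil.rmtree(self.path, ignore_errors=True)


class WorkspaceManager:
    """Hypothetical manager for every workspace created by the system."""

    def __init__(self):
        self._workspaces = []

    def create(self):
        workspace = Workspace()
        self._workspaces.append(workspace)
        return workspace

    def close_all(self):
        for workspace in self._workspaces:
            workspace.close()
        self._workspaces.clear()


manager = WorkspaceManager()
workspace = manager.create()
table_path = workspace.write("table.csv", "x,y\n1,2\n")
existed_before_cleanup = os.path.exists(table_path)
manager.close_all()
```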
The following are properties and methods of the object.
RuntimeManager Object—This object creates a list of the available pluggable platform runtimes installed. This object stores the list of available platform runtimes using an internal RuntimeEntities object.
The following are properties and methods of the object.
RuntimeEntities Object—This object comprises the collection of available platform runtimes. This object extends CollectionBase and comprises a collection of RuntimeEntry objects.
The following are properties and methods of the object.
RuntimeEntity Object—This object records the properties necessary to instantiate the required platform runtime objects for use in the pluggable system.
The following are properties and methods of the object.
“BlueRef.Inference.Runtimes.R”).
IExportService Interface—This interface defines an export service object.
The following are properties and methods of the object.
The following are properties and methods of the object.
DataAnalysisResultsService—This object encapsulates a service to generate a data-analysis results collection.
The following are properties and methods of the object.
The following are properties and methods of the object.
The export service can be specified by providing either an export service object (a software object that implements IExportService) or a string key of an export service. An existing CodeResults object can also be specified; if one is not specified, code results will be generated using the DataAnalysisResultsService object.
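By way of illustration only, accepting either an export service object or a string key, and falling back to a fresh analysis when no code results are supplied, can be sketched in Python. All names below are hypothetical, and `run_analysis` is a placeholder for the work performed by the DataAnalysisResultsService object.

```python
class TextExportService:
    """Hypothetical export service producing plain text."""

    def export(self, template, results):
        return template + "\n" + "\n".join(results)


# Hypothetical lookup from export service key to export service class.
EXPORT_SERVICES = {"txt": TextExportService}


def resolve_export_service(service):
    """Accept either a string key or an export service object."""
    if isinstance(service, str):
        return EXPORT_SERVICES[service]()
    return service


def run_analysis(template):
    """Placeholder for generating code results from the template."""
    return [f"analyzed: {template}"]


def generate_document(template, service, code_results=None):
    exporter = resolve_export_service(service)
    if code_results is None:
        # No existing results were specified, so generate them now.
        code_results = run_analysis(template)
    return exporter.export(template, code_results)


by_key = generate_document("Report", "txt")
by_object = generate_document("Report", TextExportService(), code_results=["cached"])
```

Supplying existing code results avoids repeating the analysis when the same results are exported to more than one format.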
DataAnalysisResultsUIService—This object encapsulates a service to show a user-interface dialog presenting the data-analysis results document and/or data-analysis results, and any associated error messages.
The following are properties and methods of the object.
WordDocumentService—This object encapsulates a service to generate Microsoft Word documents from an existing template Word document.
The following are properties and methods of the object.
The following are properties and methods of the object.
ExportServiceEntities Object—This object comprises the collection of available export services. This object extends CollectionBase and comprises a collection of ExportServiceEntry objects.
The following are properties and methods of the object.
ExportServiceEntry Object—This object records the properties necessary to instantiate the required export service objects for use in the pluggable system.
The following are properties and methods of the object.
The following code example illustrates how the object-oriented framework may be used to generate a data-analysis results collection from an existing file comprising a data-analysis parts container:
The following code example illustrates how the object-oriented framework may be used to generate and present to the user a data-analysis result document (in HTML format) from an existing data-analysis template containing a data-analysis parts container:
The following code example illustrates how the object-oriented framework may be used to list out the collection of installed platform runtimes:
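By way of illustration only, and not as a reproduction of the framework's actual API, such a listing can be sketched in Python. The class shapes, the `available` method, and the second registered entry are hypothetical; only the type string "BlueRef.Inference.Runtimes.R" appears in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeEntry:
    """Hypothetical record of an installed platform runtime."""
    key: str         # string key associated with the runtime
    type_name: str   # string identifying the runtime's software object


class RuntimeManager:
    """Hypothetical manager holding the installed runtime entries."""

    def __init__(self, entries):
        self._entries = list(entries)

    def available(self):
        """Return the installed runtimes ordered by key."""
        return sorted(self._entries, key=lambda entry: entry.key)


manager = RuntimeManager([
    RuntimeEntry("R", "BlueRef.Inference.Runtimes.R"),
    RuntimeEntry("Python", "example.runtimes.Python"),  # hypothetical entry
])

listing = [f"{entry.key}: {entry.type_name}" for entry in manager.available()]
for line in listing:
    print(line)
```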
Finally,
Although the foregoing text sets forth a detailed description of numerous different embodiments, it should be understood that the scope of the patent is defined by the words in the claims set forth at the end of the patent. The detailed description is to be construed as exemplary only and does not describe every possible embodiment because describing every possible embodiment would be impractical, if not impossible. Numerous alternative embodiments could be implemented, using either current technology or technology developed after the filing date of this patent, which would still fall within the scope of the claims.
Thus, many modifications and variations may be made in the techniques and structures described and illustrated herein without departing from the spirit and scope of the present claims. Accordingly, it should be understood that the methods and apparatus described herein are illustrative only and not limiting upon the scope of the claims.
U.S. patent application Attorney Docket No. BLUEREF-001, filed on Jan. 3, 2007 and entitled “Method and Apparatus for Utilizing an Extensible Markup Language Data Structure For Defining a Data-Analysis Parts Container For Use in a Word Processor Application,” U.S. patent application Attorney Docket No. BLUEREF-002, filed on Jan. 3, 2007 and entitled “Method and Apparatus for Managing Data-Analysis Parts in a Word Processor Application,” and U.S. patent application Attorney Docket No. BLUEREF-004, filed on Jan. 3, 2007 and entitled “Method and Apparatus for Data Analysis in a Word Processor Application,” which are assigned to the same assignee as the present invention, are hereby incorporated, in their entirety, by reference.