This invention relates to the field of computer systems, and in particular to a method and system for creating business models suitable for processing on computer systems.
Computer systems are often used to model the operation of a business, for financial reporting, planning, and forecasting. The invention of an automated spreadsheet program in the late 1970s, for example, provided a major advancement in the practical use of computers for such business applications. It was one of the first computer applications designed for non-programmers, and specifically for business professionals with little or no programming background. Users could create spreadsheets that presented the financial performance of a business based on actual revenues and expenses, or spreadsheets that projected the future performance of the business based on given assumptions, and so on.
The invention is presented herein using the paradigm of a spreadsheet program as an application that uses a model of a business, or a model of segments of a business, to facilitate an analysis of the operation of the business. Other applications that include the use of a business model will be evident to those skilled in the art, and include, for example, simulation systems, database management systems, inventory control systems, and so on. In like manner, although the invention is presented in the context of business modeling, one of ordinary skill in the art will recognize that the techniques presented herein can be applied other modeling tasks as well.
The strength of a spreadsheet program lies in its ability to use equations that reference cells of the spreadsheet to automatically compute values in other cells of the spreadsheet. For example, a cell at the bottom of a column of ten numbers could be configured to automatically contain the sum of these numbers by a simple formula: =SUM(A1 . . . A10). The cell references A1, A10 identify the first column “A”, and the first “1” through tenth “10” rows, and the ellipsis “ . . . ”identifies the inclusion of all the rows between the first and tenth rows. In most embodiments, the user could merely click on a target cell, and its coordinates would be automatically entered in the equation being created. In complex systems, cells from other spreadsheets can be referenced, so that, for example, spreadsheets that describe the performance of a corporation could be created using data from individual business units within the corporation.
As the complexity of a spreadsheet increases, however, the likelihood of error increases, particularly given that the content of many of the cells is based on reference to contents of other cells, and a mistaken reference can have devastating results. If the mistaken reference is grossly misplaced, the erroneous resultant cell value may be easily recognized, and the mistake corrected; if, on the other hand, the mistaken reference is only slightly off-target, the error may be subtle, and not easily recognized. Debugging such an error, for example, when a ‘Balance Sheet’ doesn't balance, but the source of the error is unknown, can be a time consuming and often frustrating process. An audit of a moderately complex spreadsheet, including a thousand equations or so, often takes days, and sometimes weeks or more, depending upon the complexity and underlying structure of the spreadsheet.
The European Spreadsheet Risks Interest Group (EuSpRIG; www.eusprig.org) maintains a web site that includes compilation of a variety of Spreadsheet-mistake news stories, some of which report spreadsheet mistakes that amounted to millions of dollars, and in some cases, billions of dollars. In the United States, the Sarbanes-Oxley Act was signed into law on 30th Jul. 2002, and introduced highly significant legislative changes to financial practice and corporate governance regulation. It introduced stringent new rules with the stated objective: “to protect investors by improving the accuracy and reliability of corporate disclosures made pursuant to the securities laws”, mandates audits to assure that all financial reports are accurate, and holds corporate executives liable to substantial penalties if they cannot attest to assuring the integrity of corporate financial statements.
One of the fundamental drawbacks of a spreadsheet is the inherent lack of documentation and/or the disjoint nature of the documentation and the actual content of the spreadsheet. The available documentation, if any, is likely to exhibit an underlying structure, whereas the occurrence of equations at cells of a tabular spreadsheet display often obscures this structure, or exhibits a contrary structure.
Similarly, the traditional tabular spreadsheet interface is not conducive to the adoption of a uniform development methodology, and an organization's spreadsheets are likely to be custom-tooled by each individual. These ad hoc development techniques make it difficult for subsequent individuals to support and/or enhance existing spreadsheets, and hinder the application of conventional quality control techniques. This lack of a uniform development methodology also substantially hinders the re-use of existing spreadsheets or parts of spreadsheets in other applications, thereby substantially increasing the cost of development of new spreadsheets.
A number of different approaches have been adopted in an attempt to better manage the development of spreadsheets, to reduce the likelihood of errors in spreadsheets, and/or to simplify the audit of spreadsheets. These approaches generally fall into one of two categories: systems and methods that improve the user interface for developing spreadsheets, and systems and methods that facilitate the audit or analysis of existing spreadsheets. Ideally, a system that is used to improve the user interface for developing spreadsheets will also facilitate an analysis of the resultant spreadsheets.
In “Modeling Spreadsheet Audit: A Rigorous Approach to Automatic Visualization”, Report A-1998-5, University of Joensuu, Jorma Sajaneimi presents a technique for analyzing a spreadsheet that includes drawing arrows representing the use of one cell, or a group of cells, at another cell. Using such a system, misplaced references are often typically identified. In “Goals and Plans in Spreadsheet Calculation”, Report A-1999-1, University of Joensuu, Jorma Sajaneimi et al. present a technique for recognizing a structure underlying a spreadsheet by creating directed graphs that link equations in the spreadsheet. Similarly, US Published Patent Application 2003/0106040, “PARSER, CODE GENERATOR, AND DATA CALCULATION AND TRANSFORMATION ENGINE FOR SPREADSHEET CALCULATIONS” filed 15 Aug. 2002 for Michael H. Rubin et al., and incorporated by reference herein, teaches a process that recognizes predefined data objects and structures in a spreadsheet, and generates spreadsheet-independent program source code to effect the operations defined in the spreadsheet. In “EXCELSIOR: BRINGING THE BENEFITS OF MODULARIZATION TO EXCEL”, published in the European Spreadsheet Risks Interest Group (EuSpRIG) 2005 Conference Report, Jocelyn Paine discloses a formal mathematical representation for spreadsheets, and presents techniques for transforming a conventional spreadsheet into this mathematical representation. A programming language is also presented that uses this mathematical representation, and is suitable for creating spreadsheets. However, as the term “programming language” implies, the use of this language is well suited for programmers, but poorly suited for accountants or business managers who are not typically programmers.
A number of commercial systems are also available to facilitate the creation of spreadsheets, including “ExcelWriter” by SoftArtisans; “Model Master” by J. Paine; “Paradigm” by Management Consultants Limited; “Quantrix Modeler” by Quantrix; and others. In “Excel Writer”, the user creates a template on a spreadsheet that includes data markers, and then generates a new spreadsheet by running a script that opens the template and couples a data source to the data markers. Users can also create a spreadsheet using program-like text input, such as ws.Cells(“A1”).value=“Name”, where “A1” indicates the spreadsheet coordinates. In “Model Master”, the user employs a programming language to describe relationships among “objects”. The user has the option of placing any of the defined variables on a spreadsheet, using a command such as “profit at C”, indicating that the profit is to be displayed in column C of the spreadsheet. Although the language allows a user to specify relationships in a straightforward manner, such as “profit=income−outgoings”, the syntax for using such an equation in Model Maker is not well suited for a non-programmer. “Quantrix Modeler” and “Paradigm” provide for a less cumbersome input format, but each requires the user to create the general structure of the spreadsheet using a conventional spreadsheet graphic user interface.
Of particular note, each of these prior art systems require the user to conceptualize and/or create the two-dimensional structure within which results are computed and displayed, thereby requiring the creator of the business model to create the business model within the context and constraints of the form of the output that displays the results of the operation of the model.
This invention is premised on the observation that creating a description of a business model and creating a description of an output format to display the operation of this model are fundamentally different tasks. A financial business model, for example, is typically defined in terms of inflows and outflows, assets and liabilities, product lines, and so on; and, although a two-dimensional matrix is often a convenient form for displaying the results of the operation of such a model, a typical business professional does not describe the operation of a business in terms of a two dimensional matrix. For example, a business person's description of a business may include statements such as: “The company's profit equals its income less its expenses”; “Expenses include the costs of labor, material, and facilities”; “The company's products include printers, scanners, and plotters”; and so on. Such statements include a number of implicit assumptions and constraints. For example, it would generally be understood that the aforementioned profit would be based on the income and expenses associated with each of the products, that these incomes and expenses would be distributed over time, and so on. These implicit assumptions and constraints need to be included in a system that models the operation of the business and provides financial analyses, but requiring the creator of the business model to include all of these implicit assumptions into a description of the business is inconvenient, and, in most cases, unnecessary.
It is an object of this invention to ease the task of creating a business model, such as a model suitable for execution as a spreadsheet or set of spreadsheets. It is a further object of this invention to provide a modeling language that facilitates describing, comprehending, and auditing the business model. It is a further object of this invention to automate the creation of time-based models, such as spreadsheets.
These objects, and others, are achieved by a system and method that automates the creation of business models. The system and method include generators that generate data structures and models based on general assumptions regarding business models. A time series generator, for example, automatically generates a time series model suitable, for example, for creating a spreadsheet, even though the input description of the business model may be time-independent. In like manner, a cross-category generator creates a cross-category hierarchy, even though the business model is described using independent categorizations, such as market categories, product-line categories, organizational categories, and so on. By automatically replicating the description of variables and relationships among such time-series cross-category hierarchies based on general business model assumptions, the creator of the business model is freed of the tedium generally associated with creating a business model, and the occurrence of errors in the resultant models is substantially reduced. Further, the same input description of the business model can be used as the source of alternative models, depending upon the requirements of the intended application of the model.
The invention is explained in further detail, and by way of example, with reference to the accompanying drawings wherein:
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.
In the following description, for purposes of explanation rather than limitation, specific details are set forth such as the particular architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the concepts of the invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments, which depart from these specific details. In like manner, the text of this description is directed to the example embodiments as illustrated in the Figures, and is not intended to limit the claimed invention beyond the limits expressly included in the claims. For purposes of simplicity and clarity, detailed descriptions of well-known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
As noted above, this invention is premised on the observation that a typical business person describes a business using terms and expressions that are based on implicit assumptions and generalities that are applicable to all, or most businesses. While these assumptions and generalities need to be included in a business model that is suitable for processing on a computer system, or included within the processing application, burdening the business person with the requirement of encoding or otherwise describing such assumptions and generalities is time-consuming, and, in most cases, unnecessary. Similarly, a business person does not describe a business in terms of the output format that may be used to display the performance of the business, and thus coupling the definition of a business model to an output format, such as a spreadsheet format, is also an inefficient and/or ineffective means for creating the definition, even if a spreadsheet program is the intended target for the business model.
Example inputs 101 to the system of
In accordance with this invention, the text processor 110 analyzes and parses the input 101 to also classify the variables as dependent and independent variables, as illustrated in
The input 101 of
Although not required, per se, for creating a business model, the input 101 may also provide information regarding the reports 114 that facilitate analyses of the business, as well as timeframes 115 associated with such reports 114, or associated with other data collection or analysis functions related to the business. The timeframes 115 generally define a start time, such as a year, and a reporting or data collection period, such as monthly or quarterly. As contrast to conventional business modeling systems, the definition of reporting schemes and formats is substantially unrelated to the definitions of categories, variables, and relationships.
As would be evident to one of ordinary skill in the art, the input 101 may include a variety of inputs, as well as a variety of input devices. That is, for example, the input 101 that provides the relationships 112 may be a different file from the source of the category definitions; in like manner, a scanner may be used to create some of the input 101, and a keyboard used to create other parts of the input 101. Similarly, the input 101 could be created using a speech or handwriting recognition/transcription program, or the input 101 could be created as an output of another business application program, and so on.
One of ordinary skill in the art will recognize that each of the example inputs of
As noted above, a premise of this invention is that most business models are based on implicit assumptions or generalities. For example, the profit of the business described by the relationships and categories of
Referring to
The loop 414-495 instantiates each defined variable at each branch or leaf node of the category hierarchy. Depending upon the complexity of the modeled system, a subset of the defined variables may be instantiated, for efficient processing. For example, in some situations, only those defined variables that are required to satisfy the target report requirements may be instantiated (this dependency is illustrated by the dashed arrow between the report definitions 114 and the category-variable generator 130 in
For each variable to be instantiated, a child node to the current element of the category hierarchy is created, at 420. An identifier/name of this node is preferably created as a concatenation of the upper category hierarchy level (e.g. “Standard”), the category element name (e.g. “HighEnd”), and the variable name (e.g. “Revenue”), to form an identifier such as “Standard.HighEnd.Revenue”, as illustrated in
If, at 430, the category element is a leaf node in the hierarchy, the value of the child node is defined based on the type of variable (111 of
It is significant to note that in accordance with this aspect of the invention, the relationship among variables is retained in each instantiation. That is, for example, wherever the variable “Profit” is instantiated within a category, the “=PreTaxProfit−Taxes” relationship is instantiated; wherever the “PreTaxProfit” variable is instantiated, the “=Revenue−Costs” relationship is instantiated; and so on. Alternatively, higher-level instantiations of a variable could include a composite of the lower-level instantiations, such as “Standard.Profit=LowEnd.Profit+HighEnd.Profit”, but such an instantiation does not preserve the relationship among variables at each category level, which could limit the applications for which the resultant category-variable hierarchy 135 or time series model 145 can be used.
If, at 430, the category element is a branch node (i.e. not a leaf node), the variable's “roll-up rule” is used to define the instantiated variables at these higher levels of the category hierarchy. Preferably, the roll-up rule defines a process or procedure for creating a composite of the instantiations at a lower level of the hierarchy. This composite is generally a value that characterizes the multiple lower level instantiations by a single value, such as a summary statistic or other characteristic value. By default, the roll-up rule for instantiations based on independent variables is a “sum” rule, and the roll-up rule for instantiations based on dependent variables is a “copy from child” rule.
A “sum” roll-up rule defines the instantiation of the variable at each branch node as the sum of the instantiations of the variable at each of the child nodes of the branch node. As illustrated in
A “copy” roll-up rule defines the instantiation of the variable at each branch node as a corresponding copy of the instantiation of the variable at the first child node of the branch node. As illustrated in
In a preferred embodiment, other roll-up rules may be applied, either by expanding the default classifications, or by allowing user-defined rules. For example, if one of the defined variables corresponds to an average of other variables, or a peak value (minimum, maximum) of other variables, the roll-up rule for such a variable may also be an average, or a peak value. In like manner, if a variable is used to hold a constant, such as an interest rate, a text field, and so on, the roll-up rule may be a literal copy from level to level. Any number of techniques may be used to associate roll-up rules with variable types, and/or to define variable types. In the examples of
As illustrated in
Because each of the category elements at each level of the hierarchy includes substantiated ‘roll-up’ values, the creation of reports that are organized based on the hierarchy is straightforward, so that report directives such as “subtotal by products type” can be easily accommodated. In like manner, reports based on time-frames can also be easily provided by including such requirements in the time-series generator 140, as illustrated by the dashed arrow between the reports definition 114 and the time series generator 140. For example, in a preferred embodiment, key terms such as day, week, month, quarter, year, etc. are understood, and the user can provide directives such as: get weekly inputs, report monthly outputs, subtotal per quarter, average per year, and so on.
The example of
The loops 710-760 and 720-750 traverse the hierarchy until a leaf element is found, at 730. When each leaf element is found, the next category is instantiated; that is, each leaf element of an upper level category will include a full instantiation of the next level category. In
The instantiation of a lower level category at a leaf element creates a new set of leaf elements, and if there are other categories being replicated, each lower level category will be instantiated at each newly created leaf element in the resultant cross-category hierarchy until the only leaf elements in the hierarchy are the leaf elements of the lowest level category.
As illustrated in
The category-variable generator 130 of
In like manner, the time series generator 140 provides a time series model 145 by replicating each leaf node of the category-variable hierarchy 130 over each time period. The timeframes definitions 115 define the timeframes to be used for this replication. For example, the timeframes definitions 115 may specify “Quarterly, five years, beginning in 2004”, “Monthly, one year”, “Annual, 2003-2007”, and so on. Specifically, the timeframes parameters should include a start time (relative or absolute), a time increment, and an end time (or number of time increments); preferably, a default set of parameters are provided (e.g. year 0, quarterly, 3 years), and the user input 101 allows for a replacement of one or more of these default parameters. In the context of the business model, the replication is per-time-period, for the total number of time-periods.
Optionally, the report definition parameters 114 may be used to further define or refine these timeframe parameters; for example, the data collection (input) timeframe may be weekly or monthly, but the reporting timeframe may be quarterly or annually. In such an embodiment, a different replication may be performed for input (independent) variables and output (dependent) variables; or, each replication can occur at the shorter time period and marked accordingly as an input period, output period, or both.
In a preferred embodiment, either the category-variable hierarchy 135 or the time series model 145 is used as the model that defines the business, depending upon whether the model definition is time-independent or time-dependent.
In a preferred embodiment, two spreadsheets are created, an input spreadsheet and an output spreadsheet. In the vernacular of spreadsheets, the input spreadsheet is commonly termed the “assumptions” spreadsheet, and is configured to contain the data that is used to produce the output spreadsheet. In the terms of this application, the input spreadsheet is configured to contain values for the independent variables, and the output spreadsheet is configured to display the determined values of the report variables, which may include both independent and dependent variables. Other configurations may also be used; for example, an intermediate spreadsheet may be created to provide an area where dependent variables that are not report variables (i.e. are not variables expressly called out to be reported) are determined. In the model illustrated in
At 805, the two (or more) spreadsheets are initialized. Such an initialization may include providing “title” information, such as the name of the report, the originator, the date, and so on, as well as the headings for each column, using techniques common to one of ordinary skill in the art. At 810, an index to the last-used row is determined, based on the number of rows consumed by the title information, the column headings, and so on.
The loop 815-890 steps through each category-variable CV in the input model (145 of
At 835, the row index is incremented, and the column index is initialized (typically to column 1). At 840, the value of the cell at the initial column of the current row is the name of variable. In the example of
Optionally, as each new set of category-variables in the hierarchy is processed, the block 845 can be configured to create ‘non-data’ rows in the spreadsheet to illustrate the hierarchy, as illustrated in FIGS. 9A-B. In the example spreadsheets of FIGS. 9A-B, when the “All Products” identifier of the hierarchy is identified, a row with a name entry of “All Products” is created, and the next row created, corresponding to the first category-variable of this hierarchy. Because the hierarchical name prefix “All Products” is displayed on the previous row, the value assigned to the cell can be the category-variable name less the hierarchical name prefix. (I.e. “PreTax Profit” in
The blocks 845 and 870 are optionally selected, based on the particular target spreadsheet program. In Excel and other spreadsheets, naming a row allows for automatic cell-index referencing, wherein if reference is made at cell (m,n) to a named row, the system automatically assumes that the column index to the referenced cell in the named row is “n”. That is, if cell (r1,c1) references a named row “All Products.PreTax Profit” that is defined as row r2, the reference is automatically determined to be to cell (r2,c1). In such a system, block 840 is used to name the current row as the name of the category-variable. As would be evident to one of ordinary skill in the art, if the syntax required by the target spreadsheet does not conform to the syntax used for category-variable names, the category-variable name is transformed to comply with the required syntax. For example, if the spreadsheet program does not allow spaces in a name of a row, the system will be configured to remove spaces in the category-variable name to provide a properly formed row name.
The loop 850-885 steps though each time period called for in the report, to create a column corresponding to each time period.
At 860, the column index is incremented, and at 865, the cell at the current row and column index is given the value of the current category-variable CV. That is, using the example of
Note that the equations are provided for each time period. That is, the first equation 581 is actually “All Products.Revenue(t)−All Products.Costs(t)”, where t is the time period. As noted above, if the target spreadsheet program automatically assigns column indices to named rows, the time period reference to each variable in each equation is not required. That is, in the example of
If explicit time-period/column referencing is used, each cell across the columns of the variable are expressly named, including the time or column reference, at 870. That is, the first “Revenue” entry at column 2 of the example of
In a preferred embodiment of this invention, the report definition (114 in
At 895, the spreadsheet is post-processed, to provide an efficient and effective display of the input and/or output sheets. For example, the output sheet will likely include a variety of the aforementioned ‘intermediate values’ that are not explicitly identified as report variables. In a preferred embodiment, the post-processing at 895 includes ‘hiding’ such variables, by including filters in the resultant spreadsheets.
Also in a preferred embodiment of this invention, the post-processing 895 includes “locking” the fields created by the spreadsheet, to assure its integrity and to prevent inadvertent changes or erasures. Such locking is particularly valuable for corporate applications, wherein, for example, the corporation provides audited relationships and a controlled database of input assumptions; by locking the fields created based on these audited relationships, the need to audit each resultant spreadsheet is virtually eliminated.
The foregoing merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are thus within its spirit and scope. For example, although each of the above examples included the use of a combination of generators 120, 130, 140, 150 one of ordinary skill in the art will recognize that each generator can be used independently to replicate variables within each dimension. Similarly, each of the generators 120, 130, 140, 150 are illustrated as receiving a single input set 113, 125, 135, 145 for processing, one of ordinary skill in the art will recognize that these input sets 113, 125, 135, 145 could include multiple sets, each of these sets optionally being generated independently. For example, the time series generator 140 may create a time series model 145 based on multiple category-variable hierarchies 135; or, the spreadsheet generator 150 may create a spreadsheet 155 based on multiple time series models 145; and so on.
These and other system configuration and optimization features will be evident to one of ordinary skill in the art in view of this disclosure, and are included within the scope of the following claims.
In interpreting these claims, it should be understood that:
a) the word “comprising” does not exclude the presence of other elements or acts than those listed in a given claim;
b) the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements;
c) any reference signs in the claims do not limit their scope;
d) several “means” may be represented by the same item or hardware or software implemented structure or function;
e) each of the disclosed elements may be comprised of hardware portions (e.g., including discrete and integrated electronic circuitry), software portions (e.g., computer programming), and any combination thereof;
f) hardware portions may be comprised of one or both of analog and digital portions;
g) any of the disclosed devices or portions thereof may be combined together or separated into further portions unless specifically stated otherwise;
h) no specific sequence of acts is intended to be required unless specifically indicated; and
i) the term “plurality of” an element includes two or more of the claimed element, and does not imply any particular range of number of elements; that is, a plurality of elements can be as few as two elements.
This application claims the benefit of U.S. Provisional Patent Application 60/696,870, filed 6 Jul. 2005, and 60/709,742, filed 19 Aug. 2005.
Number | Date | Country | |
---|---|---|---|
60696870 | Jul 2005 | US | |
60709742 | Aug 2005 | US |