The following relates generally to mathematical statement recognition and evaluation and more specifically to methods and systems for recognition and evaluation of hybrid statements mixing symbolic expressions and natural language including English provided in one-dimensional formats using ASCII.
Math software has been developed to perform symbolic and/or numerical mathematical computations that have been otherwise performed by humans. For educational users, such as advanced junior high school students, high school students and college students, math software tools may be used to assist in the learning process. Math software can provide efficient results for what are often time consuming and tedious tasks if performed manually. Once a student has an understanding of a process or calculation, using that process or calculation as a building block in other, perhaps more complex, calculations or processes is an important task. However, the time consuming nature of performing such a task may hinder the amount of tasks that such a student may accomplish. Thus, efficient software tools may assist in the educational process by reducing the amount of such tasks. Furthermore, such software may provide encouragement for the learning of different processes and concepts. In general, for a math software to function, a computer language is required to convert a math problem defined in natural mathematics language, which includes symbolic expressions in natural math notations, and hybrid statements mixing natural language and symbolic expressions for defining operations applicable to and relationships among mathematical entities, into an intermediate representation such as abstract syntax tree (AST) that can invoke appropriate algorithms written in, for example, other lower level languages such as C or Fortran that perform the defined mathematical computations.
The usage of math software is still very limited, especially in education, considering the wide availability of computers in schools and households. Many factors contribute to this phenomenon. Among them, the two most important ones that are directly related to software are:
Item A causes many users to spend a long time learning to use the software, creating a significant barrier to use and consequently impeding market acceptance. The impact is even worse for student users: instead of assisting, the software/devices actually complicate a student's learning because he or she is forced to navigate simultaneously two different sets of notations, one from a textbook and another for the software.
Item B makes it difficult for many users to define math problems, since the expressive power of symbolic expressions is limited beyond imperatives. Consequently, users have to decompose these problems and write procedural code to solve them using primitive constructs provided by the language. This is not feasible for most students learning math or for professional users who are not programming-savvy. In fact, it defeats the purpose of mathematically-oriented languages and software. For instance, any individual able to write detailed procedural code to solve a math problem is less likely to need the help of math software in the first place.
The most noticeable inadequacy in communication between math software and its users is the lack of procedural details documenting how a problem is solved. That deficiency is more consequential for educational applications: elucidating how a problem is solved and elaborating what are the concepts behind the solution are at least as important as the answer itself. For professional users who use mathematics in their daily work, such a gap in syntaxes can impact their acceptance of the software, especially for potential new users. Such users may not have time or the desire to learn a new language.
The present disclosure generally relates to one or more improved systems, methods, and/or apparatuses for parsing and interpretation of mathematical statements, including what are referred to as hybrid mathematical statements that combine mathematical notation and natural language. Embodiments disclosed herein provide a rigorous and practically tractable formal grammar to distill the essence of the following:
1) Symbolic expressions in natural math notation into one-dimensional expressions through leveling, asciilization, direct adoption and/or operator transformation, which act to minimize the syntactical gap between one-dimensional representations and two-dimensional math notation. The one-dimensional representations disclosed herein provide a degree of similarity to natural math notation and cover a wide spectrum of mathematics, easing the definition of math problems for users of the systems, methods, and/or apparatuses; and
2) Some of the most common yet simple syntactic structures found in mathematics language pertinent to problem definition including assertion, command, query and deduction. Such inclusion results in a hybrid language mixing symbolic and natural languages, representing an important early step in computer “comprehension” of scientific knowledge quantitatively expressed through mathematics.
The disclosure herein also provides a unique work flow for receiving and processing one or more statements. The work flow comprises (a) an input step for receiving at least one one-dimensional statement from a user; (b) a parsing procedure, utilizing a grammar library, that operates on the received statement(s) to convert the one-dimensional statement(s) into one or more intermediate representations, such as abstract syntax trees (ASTs), representing mathematical expressions and the manipulations imposed on them; and (c) an interpretation procedure that evaluates the mathematical expressions embedded in the ASTs, executes the manipulations in accordance with the ASTs, and provides a result. In some embodiments, both the result and narratives similar to human-constructed solutions are provided, in a format ready for display in natural mathematical notation.
Other embodiments disclosed herein provide methodologies that enable the composition of step-by-step procedures and narratives explaining the concepts and background involved in solving math problems, for a math software that employs the above-mentioned computer language. The composition is similar to human-made solutions and thus easy to follow for users such as students.
A further aspect of the disclosure provides a computer program product comprising a non-transitory computer readable medium comprising: (a) code for receiving one or more one-dimensional hybrid statements mixing mathematical expressions and natural language; (b) code for converting, via at least one call to a grammar library, portions of the one or more one-dimensional hybrid statements into a plurality of mathematical expressions and one or more abstract syntax tree (AST) of the expressions, wherein the grammar library includes rules for performing the step of converting; (c) code for initially displaying a two dimensional mathematical expression representing the one or more hybrid mathematical expressions; (d) code for evaluating the mathematical expressions in accordance with the AST of the expressions; and (e) code for displaying a two dimensional mathematical expression representing a narrative for the evaluation of the mathematical expressions.
In one example, novel functionality is described for a method for receiving and evaluating a hybrid mathematical statement, comprising: (a) receiving one or more one-dimensional hybrid statements mixing mathematical expressions and natural language; (b) converting, via at least one call to a grammar library, portions of the one or more one-dimensional hybrid statements into one or more abstract syntax trees (ASTs) of the expressions, wherein the grammar library includes rules for performing the step of converting; (c) evaluating the hybrid statements in accordance with the ASTs; and (d) performing at least one of storing and transmitting a result of the evaluation.
A further understanding of the nature and advantages of the present invention may be realized by reference to the following drawings. In the appended figures, similar components or features may have the same reference label.
The present disclosure recognizes that several factors contribute to the aforementioned syntax differences for mathematical expressions between natural mathematics language and the corresponding one-dimensional representations, including: (1) Math notation is two-dimensional in that it uses both symbols and vertical positioning, such as overhead and superscript, to convey semantics. On the other hand, ASCII, the character set that is commonly used for communication between humans and computing software/devices, is one-dimensional in the sense that characters are aligned horizontally when entered; (2) Natural math notation employs many symbols, including Greek letters and specially-created symbols such as and E, that are not included in ASCII; and (3) The syntaxes of math notation can be context-sensitive. This combination presents a serious challenge to the development of a formal grammar that abstracts the syntactical structures, which is the core of a language usable by the software and devices.
The present disclosure provides a mathematical-notation-friendly language defined by a rigorous yet practically tractable formal grammar, so that the one-dimensional representation of a mathematical expression either appears physically similar to, or is easily associable with, its form in natural math notation. To achieve this, a variety of methods, including leveling, asciilization, direct adoption and operator transformation, are used to minimize the syntactical gap between the language and the math notation.
Thus, the following description provides examples, and is not limiting of the scope, applicability, or configuration set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the spirit and scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as appropriate. For instance, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. Also, features described with respect to certain examples may be combined in other examples.
Referring first to
As is well understood, in many cases mathematical problems are solved in a number of discrete steps. Often, the steps that are used to solve a particular problem are as important, if not more important, than the ultimate solution to a problem. Particularly, in educational settings, the different steps that are taken to arrive at a solution are of particular importance to practice, and ensure a student has an understanding of, one or more underlying concepts and the interaction of various concepts. In some embodiments, such as illustrated in
As mentioned above, mathematical notation is often two-dimensional in nature, since it relies not only on symbols themselves but also on their relative positions (superscript, subscript, overhead, etc.) to convey semantics, whereas default computer code written in the ASCII set is one-dimensional, i.e., all characters are aligned horizontally. This can be an obstacle to providing a user-friendly and highly functional mathematical application in situations where it is more convenient for a user to enter information using ASCII characters. Additionally, math notation routinely employs many characters such as Greek letters for variables and constants (π for example), and some of the letters are given special meanings. For example, Σ is used for summation and Π is used for product. Math notation also employs symbols particularly designed for mathematics, such as ∫ for integration, ∂ for partial differentiation, and ∇ for the gradient operator for a vector field. Furthermore, even if input were available to accommodate all sorts of math symbols that are not included in ASCII, and templates for all sorts of position-dependent structures (power, limit, integration, etc.), a formal grammar would be needed in order to capture the rich syntaxes of mathematical language. The syntaxes of mathematical language are far more complicated than what present-day computer languages are capable of expressing directly, whose expressive capacity is usually limited to arithmetic operations (±, ×, ÷) and common control structures (loop, if/else construct, etc.). Many applications have formula editors, but these applications generally require cumbersome navigation of menus and selection of particular expressions, and simply provide a display of the particular expression without problem-solving capability. A formal grammar and language to receive and evaluate relatively complex mathematical expressions is not known to exist currently.
Fortress, a language being developed by a group of computer scientists at Oracle, represents some recent efforts by the IT industry and the government (Fortress was originally sponsored by DARPA). However, it appears that the designers of the language are more interested in developing new and logical notation for parallel processing, such as using Σ . . . for pre-fix parallel summation, rather than accommodating in a computer language the existing, well-established notation that has evolved during the past several hundred years.
Another obstacle to providing a user-friendly and still highly functional grammar for mathematical expressions is the context-dependent character of many symbols and expressions. A common perception about mathematical language is that it is precise. While this is largely true, it is also well known that, just like natural language, it can sometimes be context-sensitive in that one symbol may have multiple interpretations according to its context, i.e., its neighboring symbols. Such context-sensitivity needs to be addressed in any language that tries to capture the essence of natural mathematical notation.
The present disclosure provides that mathematical expressions be input through ASCII characters, and no other character set such as Unicode is included as an input option in various embodiments. This constraint requires that the aforementioned three obstacles be explicitly taken into consideration. The approaches and syntaxes used to solve these problems include several novel features. Exemplary methods employed include leveling, asciilization, direct adoption and operator transformation. The leveling technique refers to the transformation that makes two-dimensional math structures one-dimensional by moving superscript, subscript and overhead symbols that decorate another symbol to normal positions, in a manner that makes it intuitive for users to form a connection to the corresponding natural math notation.
One example of the application of leveling is the expression for a definite integral. It is two-dimensional due to the upper and lower bounds defining the integral. In exemplary embodiments, the bounds can be expressed using a notation for range, “a<x<b”. Additionally, rules are required for where to place such a range and how to link it to the indefinite integral. In one embodiment, the expression “$f(x)dx” is used to express an integration function. Exemplary expressions are included below in Table 1. With respect to the range notation, this can be placed after the integration sign or at the end of the definition of the indefinite integral. In one embodiment, this statement is placed after the indefinite integral, and the symbol “@” is introduced to separate the indefinite integral and the integration domain. Leveling techniques have also been used to represent a family of what are called postfix unary operators, which are formed by placing operator symbols after the symbol ^ to indicate that the operator is positioned on the shoulder of or above the operand, such as ^^ (vector normalization), ^. and ^.. (Newton's notation), and ^′, ^″ and f^(n) for Lagrange's notation. A favorable side effect of having a dedicated representation for the Lagrange notation is that x′ can be used for a variable loosely related to x, which can be useful for applications such as coordinate transformations and those involving Green's functions. The treatments for limit, summation and product are similar, as shown in Table 1. Note that “^-” is a postfix unary operator that means less than but infinitely close to a variable when placed after that variable. That is very convenient in expressing a fundamental concept in calculus, namely the limit:
lim(f(x))@(x->a^-)
Also note that in Table 1, f^(n)(x) means the nth-order derivative of f(x) with respect to x when n≥1; f^n(x) means the n-th power of f(x), with f^-1(x) being reserved for the inverse of f, where f is a function.
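As a rough illustration of how a leveled definite integral might be mapped back to two-dimensional notation, the following Python sketch converts the one-dimensional form into a LaTeX string. The exact input shape “$f(x)dx@(a<x<b)” is an assumption pieced together from the “$”, “@” and range conventions described above (Table 1 may define it differently), and the function name integral_to_latex is hypothetical.

```python
import re

# Assumed shape of a leveled definite integral: $<integrand>d<var>@(<lo><<var><<hi>)
_INTEGRAL = re.compile(
    r"^\$(?P<body>.+)d(?P<var>[a-z])@\((?P<lo>[^<]+)<(?P=var)<(?P<hi>[^<]+)\)$"
)

def integral_to_latex(leveled: str) -> str:
    """Convert a leveled definite integral into a two-dimensional LaTeX form."""
    m = _INTEGRAL.match(leveled.replace(" ", ""))
    if m is None:
        raise ValueError(f"not a leveled definite integral: {leveled!r}")
    # The range after "@" supplies the lower and upper bounds of the integral sign.
    return rf"\int_{{{m['lo']}}}^{{{m['hi']}}} {m['body']}\,d{m['var']}"

latex = integral_to_latex("$f(x)dx@(a<x<b)")
```

A regular expression is enough for this single pattern; a full implementation would presumably handle it in the grammar rather than by pattern matching.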
The asciilization technique is used to decrease the difference between the computer language and natural math notation associated with the lack of special characters/symbols. In this technique, embodiments simply use ASCII characters or strings to represent the special symbols through some sort of connection. One type of connection is through phonetics. The representations of Greek letters are based on this connection. Another connection is based on visual or calligraphic similarity. Examples include using $ to represent the elongated-s symbol ∫ for integration, and { to represent the belong-to symbol ∈. The other connection used in asciilization is the wording of what the symbol symbolizes. That is what is used in representing the summation (Σ) and product (Π) through “SUM” and “PROD”. The representations of a few special functions, such as the Bessel functions and the Dirac delta function, also fall into this category. As shown in the syntax summary table, several options for representing derivatives are provided. One option is based on leveling/asciilization, which expresses du/dx and ∂²ƒ(x,y)/∂x∂y as “du/dx” and “d^2f(x,y)/dxdy”, respectively. One advantage of this representation is that the differentials (du, dx) can be treated, at least for first-order derivatives, as algebraically manipulable symbols. This advantage can be convenient for symbolic manipulations in certain applications, such as solving differential equations by separation of variables.
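The three kinds of connection described above can be pictured as three small lookup tables. The Python sketch below is only an illustrative excerpt: the entries for $, {, SUM and PROD come from the text, while the ASCII spellings chosen for the Greek letters and the function name natural_symbol are assumptions.

```python
# Assumed excerpt of an asciilization table; the Greek spellings are hypothetical.
PHONETIC = {"alpha": "α", "beta": "β", "pi": "π"}   # Greek letters, by sound
VISUAL = {"$": "∫", "{": "∈"}                        # look-alike characters
WORDING = {"SUM": "Σ", "PROD": "Π"}                  # named by what they denote

def natural_symbol(token: str):
    """Return (symbol, connection) for an ASCII token, or None if it is not listed."""
    tables = (("phonetic", PHONETIC), ("visual", VISUAL), ("wording", WORDING))
    for connection, table in tables:
        if token in table:
            return table[token], connection
    return None
```

For example, natural_symbol("$") yields the integral sign together with the label "visual", while an ordinary identifier such as "x" yields None.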
Another option of representing derivatives is through what is referred to as operator-transformation. This is explained as follows. First, note that taking derivative of a function f(x) with respect to a variable x can be viewed as the result of applying a binary operator to its two operands: one is the function, another is the variable differentiated against, as illustrated in
The notation dⁿƒ(x)/dxⁿ represents the quotient and recursive characteristics of the derivative. The quotient nature is preserved by the fraction symbol (the horizontal dividing line, or the forward slash in linear style), and the number of recursive limiting operations is indicated by the superscript that appears in the denominator immediately following the infinitesimal operator. In some cases an expansion may be performed on the compact notation and expressed as:
where each pair of parentheses is associated with one differentiation, and thus the number of pairs is the same as the order of the derivative. One embodiment simply applies the leveling technique to express the derivative as below
The syntax is a little difficult due to the distant pairing of parentheses, and another embodiment uses a variation of the above syntax:
In essence, the technique used above to level the derivative is operator-transformation. It is so named because it changes the binary operator implied by the notation dg(x)/dx to a unary-prefix operator, where g(x) can be a function resulting from differentiation, namely, itself a derivative, as illustrated in
Another technique used by the present disclosure to decrease the syntactic difference between the language and math notation is adoption, i.e., directly borrowing the notation used in mathematics. The single most important example in this category is equation labeling and referencing, which greatly reduces the clutter in statements that is common in existing major math software. Other examples include the usage of a prime ′ following a variable to indicate some sort of connection between the primed variable and the un-primed one, which is common in coordinate transformations and Green's functions. The vertical brackets | . . . |, the norm operator ∥ . . . ∥, and the dot product are also examples of direct adoption.
To establish the language with the syntactical choice discussed above, it is necessary to have a formal grammar to abstract the syntactic structure of the language. For this purpose, a context-free grammar is constructed and a scanner and parser are developed, along with an interpreter for the language, which takes the intermediate representation generated by the parser as its input.
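As a rough, non-authoritative picture of how a scanner, a parser, and an interpreter fit together, the Python sketch below handles only numbers, parentheses, and the operators +, −, *, / and ^. It reads tokens left to right with a single token of look-ahead and uses nested tuples as the intermediate representation. The toy grammar and all function names are assumptions made for illustration; they are not the grammar of the disclosure.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/^()])")
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "^": operator.pow}

def scan(text):
    """Scanner: split a one-dimensional expression into tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """Parser: build a nested-tuple AST, left to right, with one look-ahead token."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def expr():                      # expr  -> term (('+' | '-') term)*
        node = term()
        while peek() in ("+", "-"):
            node = (take(), node, term())
        return node

    def term():                      # term  -> power (('*' | '/') power)*
        node = power()
        while peek() in ("*", "/"):
            node = (take(), node, power())
        return node

    def power():                     # power -> atom ('^' power)?
        node = atom()
        if peek() == "^":
            take()
            node = ("^", node, power())
        return node

    def atom():                      # atom  -> NUMBER | '(' expr ')'
        tok = take()
        if tok == "(":
            node = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        try:
            return float(tok)
        except ValueError:
            raise SyntaxError(f"unexpected token {tok!r}") from None

    tree = expr()
    if peek() is not None:
        raise SyntaxError("trailing input")
    return tree

def interpret(node):
    """Interpreter: evaluate the AST produced by the parser."""
    if isinstance(node, float):
        return node
    op, left, right = node
    return OPS[op](interpret(left), interpret(right))
```

Calling interpret(parse(scan("1 + 2 * (3 - 1)"))) walks the three stages in order. Each grammar rule becomes one function, and the choice of which rule to apply is always made by inspecting only the next token.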
In one embodiment, a context-free grammar is selected with only one look-ahead symbol and statements that are parsed from left to right. However, as mentioned above, another aspect of the present disclosure is context sensitivity. A brief review of the definition of context-free and context-sensitive grammars, as well as what context-sensitive means, is provided. A formal grammar must have four elements: a set of non-terminal symbols V, a set of terminal symbols T, a set of grammar rules or production rules P, and a starting symbol S to initiate the derivation of the language. If an additional constraint is added on the production rules that requires the left side of each production rule to be a single non-terminal symbol, i.e.,
A→(V∪T)*
where * is the Kleene star operator, indicating zero or more elements of the set it decorates, then the grammar is called a context-free grammar. Notice that there are no symbols around A; namely, the production rule for A, which dictates how A is re-written, does not depend on its context, hence the name context-free.
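The four-element definition and the context-free constraint can be written down directly. The Python sketch below stores a grammar as the tuple (V, T, P, S) and checks whether every production has a single non-terminal on its left side. The class name, field names, and the two toy grammars are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # V
    terminals: frozenset     # T
    productions: tuple       # P, as (left side, right side) pairs of symbol tuples
    start: str               # S

def is_context_free(grammar: Grammar) -> bool:
    """True when every left side is exactly one non-terminal symbol."""
    return all(
        len(left) == 1 and left[0] in grammar.nonterminals
        for left, _right in grammar.productions
    )

# Toy grammar (illustrative only): E -> E + n | n
sums = Grammar(
    nonterminals=frozenset({"E"}),
    terminals=frozenset({"+", "n"}),
    productions=((("E",), ("E", "+", "n")), (("E",), ("n",))),
    start="E",
)

# A rule that rewrites A only between "a" and "b" carries context on its left side.
contextual = Grammar(
    nonterminals=frozenset({"A"}),
    terminals=frozenset({"a", "b", "c"}),
    productions=((("a", "A", "b"), ("a", "c", "b")),),
    start="A",
)
```

Here is_context_free(sums) holds, while is_context_free(contextual) does not, because the left side of its only rule contains symbols around A.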
On the contrary, the production rules of context-sensitive grammar have the following form:
αAβ→αγβ
In the above rule, α, β ∈ (V∪T)*, A ∈ V, and γ ∈ (V∪T)+, where + is the Kleene plus operator, indicating one or more elements of the set it decorates. It seems that, from the above definitions, there is no direct link between whether the grammar is context-free and whether the interpretation can be context-sensitive. As is well known, it is not desirable in a programming language to have ambiguities, i.e., multiple possible interpretations of one statement or expression. As is also well known, ambiguities are not uncommon in natural languages such as Chinese and English. Ambiguities occur if multiple parse trees can be generated by a parser from a single sentence or statement using the chosen grammar. The assumption is that a parse tree of a statement/expression disassembles the sentence into clearly defined structural units, from which a decisive interpretation can be obtained. It should be noted that context-dependency will not cause a problem as long as the dependency can be resolved. This is indeed the case for embodiments herein. In one embodiment, the context-dependency is exclusively associated with operators. Namely, the interpretation of some of the operators depends on their context, i.e., their operands. Some concrete examples follow.
One example is the binary operator “×”, which is represented by the ASCII character “*”. Its context-dependency can be resolved by the types of its operands: when both operands are vectors, it means cross product; when one of the two operands is a vector, it defines a multiplication that results in a new vector with components multiplied by the scalar; and when both are scalars, it is the usual multiplication operator. Another example is the vertical brackets | . . . |, which are used as a unary prefix operator. Just as in natural mathematics notation, the interpretation depends on the operand: when the operand is a vector, the pair of brackets means the length of the vector; when the operand is a complex number, it means the magnitude; when the operand is a matrix, it means the determinant; and for numbers, it simply means the absolute value. The usage of the curly bracket “{” is another example, which has two applications. When paired with a right curly bracket “}”, it defines a set through list or set-builder notation. Additionally, it can also represent ∈, the belong-to set operator in natural math notation. The multiple meanings are resolved during parsing purely based on syntax; namely, they are resolved by the grammar itself, instead of being resolved through context using semantic information during execution. Fragments of the grammar rules involving “{” are listed below:
Thus, the disclosure provides a number of syntaxes, summarized in Table 1, that possess similarity to natural math notation and cover a wide spectrum of mathematics. Since the syntax is close or easily associable to natural mathematics, student users may focus on learning mathematics and professional users may focus on analyzing problems rather than learning a new language to solve problems.
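The operand-based resolution described earlier for “*” and for the vertical brackets can be sketched as type dispatch. In the Python sketch below, vectors are plain lists and matrices are lists of lists; these representations and the names star and bars are assumptions for illustration only.

```python
import math

def is_vector(value):
    return isinstance(value, (list, tuple))

def star(a, b):
    """Resolve '*' from the types of its two operands."""
    if is_vector(a) and is_vector(b):          # vector * vector: cross product (3 components)
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]
    if is_vector(a):                           # vector * scalar: scale each component
        return [x * b for x in a]
    if is_vector(b):                           # scalar * vector: scale each component
        return [a * x for x in b]
    return a * b                               # scalar * scalar: ordinary multiplication

def det(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def bars(x):
    """Resolve |x| from the type of its operand."""
    if is_vector(x) and x and all(is_vector(row) for row in x):
        return det(x)                          # matrix: determinant
    if is_vector(x):
        return math.sqrt(sum(c * c for c in x))  # vector: length
    return abs(x)                              # real: absolute value; complex: magnitude
```

For instance, star([1, 0, 0], [0, 1, 0]) takes the cross-product branch, and bars(3 + 4j) takes the magnitude branch. This mirrors resolution by operand type at evaluation time, in contrast to the “{” case, which the text says is resolved by the grammar itself.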
As explained earlier, natural mathematical language comprises hybrid statements that mix symbolic expressions and natural language such as English, to describe the manipulations on, or relationships among, mathematical entities such as functions and matrices. Lack of representation of such hybrid statements in computer languages designed for mathematical applications is one of the major drawbacks identified. The present disclosure provides a number of the most common syntactic structures found in mathematical language, which are summarized in Table 2. As seen in Table 2, four general categories of declaratives are provided, namely “assertion,” “command,” “query,” and “deduction.” The inclusion of these syntactic structures greatly enhances the declarative power of the language such that users can focus on defining the math problem to be solved instead of imperative instructions on how the problem should be solved. This capability provides significant pedagogical value to student users, and may also improve the productivity of other users.
The Assertion declarative structure includes three different types of statements, namely a “is a” assertion, referred to as a-statement in Table 2, which has a hierarchal structure illustrated in
The Command declarative operates on, or deduces properties from, single or multiple entities. Two exemplary hierarchal structures of a Command declaratives are illustrated in
The Query statement imposes a question, or query, about certain attributes/characterizations of math entities. Examples include (with Queries noted in bold):
The pedagogical value of “query” should be self-evident. It enables students to ask questions, the most important activity in human learning. During the execution of a query, the description phrase is forwarded to the entity involved. The entity checks its lexicon and finds the criteria for satisfying this description phrase. Subsequently, a calculation is done against the criteria and the query is answered. Queries involving multiple entities can be processed by one of the entities involved or by a third being that can communicate with all entities involved.
The Deduction declarative structure includes one or more initial constraints and assertions, along with one or more conclusions that are to be shown or proved. Exemplary hierarchal structure of a deduction statement is illustrated in
The hybrid syntactic structures described above, combined with the math-notation friendly syntax for mathematical expressions, are thus used together to provide a user with a relatively straightforward problem definition mechanism. A number of examples of user input, of the interpreted user input, and of the generated output are provided in order to illustrate some exemplary uses of the methods and systems provided herein. In one example, a number of one-dimensional statements, including assertion and command, are used to define the computation of the distance from a point to a plane:
The unique characteristics of the above-mentioned hybrid statements include, but are not limited to: 1) symbolic expressions can be embedded in the hybrid statements with symbols defined in earlier statements; and 2) symbolic statements such as equations and inequalities can be referred to through their labels in the hybrid statements, including but not limited to command statements.
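The label-reference mechanism can be pictured with a small Python sketch that records labeled equations and then resolves a “Solve ... from ...” command against those labels. The regular expressions, the dictionary layout, and the function name parse_statements are assumptions for illustration; they cover only the two statement shapes used in the worked cases of this description, not the grammar of Table 2.

```python
import re

# Assumed shapes: "<equation>; (<label>)" and "Solve <vars> from <labels> [numerically]"
LABELED = re.compile(r"^(.+?);\s*\(([\d.]+)\)$")
SOLVE = re.compile(r"^Solve\s+(.+?)\s+from\s+(\S+)(?:\s+(numerically))?$")

def parse_statements(lines):
    """Collect labeled equations and resolve label references in Solve commands."""
    equations = {}   # label -> equation text, kept in entry order
    commands = []
    for raw in lines:
        line = raw.strip()
        m = LABELED.match(line)
        if m:
            equations[m.group(2)] = m.group(1).strip()
            continue
        m = SOLVE.match(line.rstrip(".;"))
        if m is None:
            raise ValueError(f"unrecognized statement: {raw!r}")
        # "1.1-1.3" is read as an inclusive range over labels in entry order.
        first, _, last = m.group(2).strip("()").partition("-")
        labels = list(equations)
        lo, hi = labels.index(first), labels.index(last or first)
        commands.append({
            "verb": "solve",
            "unknowns": [name.strip() for name in m.group(1).split(",")],
            "equations": [equations[label] for label in labels[lo:hi + 1]],
            "numeric": m.group(3) is not None,
        })
    return equations, commands

equations, commands = parse_statements([
    "3x-4y+2z=0.3; (1.1)",
    "13x+2y-34z=1.5; (1.2)",
    "-5x-12y+0.1z=10.9; (1.3)",
    "Solve x, y, z from 1.1-1.3.",
])
```

The command refers to the equations only through their labels, so the equation text never has to be repeated inside the command statement.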
Composition of the step-by-step procedure and the narrative explaining the concepts involved is done by implementing documenting routines before and after the method/subroutine associated with each operator within an AST that is deemed non-trivial, i.e., any operator other than the arithmetic operators (+, −, *, / and ^ as the power operator) applied to primitive values such as integers, real numbers and rational numbers. In the pre-operator documenting routine, normally but not always, the nature of the operation to be performed by the operator is identified and a pertinent discussion is given. In the post-operator documenting routine, normally but not always, the result of applying the operator is presented.
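A minimal Python sketch of this idea follows, assuming an AST made of nested tuples, vectors stored as lists, and a single non-trivial operator named "dot". The evaluator calls a pre-operator and a post-operator documenting step only when the operator is not plain arithmetic on primitive numbers. All names here are hypothetical.

```python
ARITHMETIC = {"+", "-", "*", "/", "^"}

APPLY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "^": lambda a, b: a ** b,
    "dot": lambda u, v: sum(x * y for x, y in zip(u, v)),  # assumed non-trivial operator
}

def evaluate(node, narrative):
    """Evaluate a tuple AST, appending narrative lines around non-trivial operators."""
    if not isinstance(node, tuple):
        return node                      # leaf: a number or a vector (list)
    op, *children = node
    values = [evaluate(child, narrative) for child in children]
    trivial = op in ARITHMETIC and all(isinstance(v, (int, float)) for v in values)
    if not trivial:                      # pre-operator documenting routine
        narrative.append(f"Next, apply '{op}' to {values}.")
    result = APPLY[op](*values)
    if not trivial:                      # post-operator documenting routine
        narrative.append(f"This gives {result}.")
    return result

narrative = []
value = evaluate(("+", ("dot", [1, 2, 3], [4, 5, 6]), 1), narrative)
```

In this run only the dot product contributes narrative lines; the final addition of two plain numbers is carried out silently.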
Described now are interpretation results that involve the solution of algebraic equations, symbolic differentiation and integration, vector calculus, tabular data input, small matrix manipulation, implicit differentiation, triple integrals, line and surface integrals, and examples of queries and commands in hybrid syntax. Case A solves a set of linear equations. This case demonstrates the equation/relation reference mechanism and the command structure. Case A: Solution of a set of linear equations begins with a user input comprising:
Input:
3x−4y+2z=0.3; (1.1)
13x+2y−34z=1.5; (1.2)
−5x−12y+0.1z=10.9; (1.3)
Interpreted Input:
3x−4y+2z=0.3 (1.1)
(13x+2y)−34z=1.5 (1.2)
−5x−12y+0.1z=10.9 (1.3)
Solve x, y, z from 1.1-1.3.
The one-dimensional statements are used to form ASTs, and the command statement used to determine functions to perform on objects in the ASTs. In this example, a final answer to the command statement is determined and output as follows.
Output:
As mentioned above, in some embodiments intermediate steps may be calculated and displayed to a user. In this example, a series of intermediate results are displayed to a user as follows.
To solve the linear system, we use the Gauss-Jordan elimination method. Let's first write the above set of linear equations in the following matrix form:
The Gauss-Jordan method applies a series of elementary row operations to both the coefficient matrix (A) and the column matrix on the right-hand side (b) simultaneously until A becomes the identity matrix (I). Then, the unknown column matrix is the same as b.
There are two types of elementary row operations involved here. One is row swapping, represented by rᵢ↔rⱼ; the other is row addition, in which a row (rᵢ) is replaced by the sum of or difference between itself and another row (rⱼ) multiplied by a constant (c). Row addition is represented by rᵢ←rᵢ±crⱼ.
Listed below is the sequence of the elementary row operations performed. Notice that the current pivot element is highlighted using boldface font for clarity.
Dividing each row of the coefficient matrix A and the column matrix b by the diagonal element aᵢᵢ, the system becomes:
As can be seen, now the coefficient matrix A is an identity matrix and we have:
The “Intermediate Steps” of this embodiment are created by a “solution composer.” Providing such a display of partial results may provide pedagogical value to students and other users. In some embodiments, a user may not desire to view this sometimes lengthy and/or verbose section of output, and the display of such intermediate results can be turned off by the user. It is worthwhile to point out that the whole linear system, instead of the augmented matrix, is displayed in the step-by-step illustration of the solving process, in the hope that doing so makes the operations involved more intuitive for student users, for example. While row swapping is important for many practical applications, it could be a distraction to users, and may be turned off in various embodiments.
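The Gauss-Jordan procedure narrated above can be sketched in Python as follows. This is an assumed, minimal version: it picks the largest available pivot in each column (which is where row swapping enters), applies row additions to A and b together, records each operation as text, and finally divides by the diagonal. The function name gauss_jordan and the step wording are hypothetical, and the embodiment's actual pivoting policy is not stated in the text.

```python
def gauss_jordan(A, b):
    """Solve A x = b by Gauss-Jordan elimination; return (x, list of row operations)."""
    n = len(A)
    A = [[float(v) for v in row] for row in A]   # work on copies
    b = [float(v) for v in b]
    steps = []
    for i in range(n):
        # Row swapping: bring the largest remaining entry of column i to the pivot position.
        p = max(range(i, n), key=lambda r: abs(A[r][i]))
        if abs(A[p][i]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        if p != i:
            A[i], A[p] = A[p], A[i]
            b[i], b[p] = b[p], b[i]
            steps.append(f"r{i + 1} <-> r{p + 1}")
        # Row addition: eliminate column i from every other row.
        for j in range(n):
            if j != i and A[j][i] != 0.0:
                c = A[j][i] / A[i][i]
                A[j] = [ajk - c * aik for ajk, aik in zip(A[j], A[i])]
                b[j] -= c * b[i]
                steps.append(f"r{j + 1} <- r{j + 1} - ({c:.4g}) r{i + 1}")
    # A is now diagonal; dividing by the diagonal turns it into the identity.
    x = [b[i] / A[i][i] for i in range(n)]
    return x, steps

# Coefficients of equations (1.1)-(1.3) from Case A.
A = [[3, -4, 2], [13, 2, -34], [-5, -12, 0.1]]
b = [0.3, 1.5, 10.9]
x, steps = gauss_jordan(A, b)
```

The returned steps list plays the role of the intermediate narrative, and substituting x back into the three equations is a quick way to check the result.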
From an input and interpretation point of view, the system considers a typical piece of scientific writing as a list of the following basic blocks:
Continuing with the examples, Case B solves a single non-linear equation using a numerical method. This is a simple example showing how symbolic algorithms (differentiation) and numerical algorithms (the Newton-Raphson method) can work together to solve math problems. Although the output examples below are static, namely, they only display a snapshot of the numerical data in the form of tables and graphics, various embodiments allow users to browse the data and interact with its visualizations (2D and 3D) dynamically in an output window. In many cases, users such as engineers and researchers may find that having all data in a problem easily accessible, without having to memorize names, locations, and contents, is useful in their daily work.
Case B: Solution of Non-Linear Equation
Input:
begin
x^2−exp(x)=sin(x)+0.3; (1.1)
Solve x from (1.1) numerically;
end#
Interpreted Input:
\( x^2 - e^x = \sin(x) + 0.3 \) (1.1)
Solve x from 1.1.
Output:
x=−0.5703899316
Intermediate Steps:
Let's try solving the single non-linear equation
\( x^2 - e^x = \sin(x) + 0.3 \)
numerically with the Newton-Raphson method. First, let's transform the equation into a standardized form:
\( f(x) = (x^2 - e^x) - (\sin(x) + 0.3) = 0 \)
The Newton-Raphson method finds a solution of the equation, i.e., a root of the function \( f(x) \), by approximating the function linearly in the vicinity of \( x_k \) through
\( f(x) = f(x_k) + f'(x_k)(x - x_k), \quad k \ge 0 \)
where \( f'(x_k) \) is the derivative of \( f \) with respect to \( x \) evaluated at \( x = x_k \).
The iterative equation can be obtained by setting \( f(x) = 0 \) and solving for \( x \) from the approximating equation:
\( x_{k+1} = x_k - f(x_k) / f'(x_k) \)
The derivative \( df(x)/dx \) can be calculated from the definition of \( f(x) \),
\( f'(x) = 2x - e^x - \cos(x) \)
Substituting the derivative into the iterative equation, we obtain:
\( x_{k+1} = x_k - \dfrac{x_k^2 - e^{x_k} - \sin(x_k) - 0.3}{2 x_k - e^{x_k} - \cos(x_k)} \)
Starting with the initial guess \( x_0 = 1.0 \), the iteration is terminated after 6 iterations when the pre-set convergence criterion \( |x_{k+1} - x_k| \le 10^{-6} \) is met. The intermediate iteration results are listed in the table below.
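The iteration described above can be reproduced with a few lines of code. The following is a minimal sketch in Python, assuming the standardized form of equation (1.1), the initial guess 1.0, and the convergence criterion stated in the text; the names `f`, `f_prime`, and `newton_raphson` are illustrative and do not come from any embodiment.

```python
import math


def f(x):
    # Standardized form of equation (1.1): f(x) = 0.
    return (x**2 - math.exp(x)) - (math.sin(x) + 0.3)


def f_prime(x):
    # Derivative of f with respect to x.
    return 2.0 * x - math.exp(x) - math.cos(x)


def newton_raphson(x0, tol=1e-6, max_iter=50):
    """Iterate x_{k+1} = x_k - f(x_k)/f'(x_k) until |x_{k+1} - x_k| <= tol."""
    x = x0
    history = [x]  # intermediate iterates, as in the iteration table
    for _ in range(max_iter):
        x_next = x - f(x) / f_prime(x)
        history.append(x_next)
        if abs(x_next - x) <= tol:
            return x_next, history
        x = x_next
    raise RuntimeError("Newton-Raphson did not converge")


root, history = newton_raphson(1.0)
```

The list `history` plays the role of the table of intermediate iteration results, and `root` can be compared with the value reported in the Output section.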
Continuing with the examples, Case C illustrates the syntax for integration and the working of a heuristic symbolic integrator, which can solve common integrals using the method of substitution (both the first and the second types) and integration by parts (to a limited depth). The step-by-step illustration of the problem solving process replicates a human-constructed solution. Consistent intermediate representations generated via a formal grammar enable the development of such functionality.
Case C: Symbolic Integration
Input:
begin
function f, g, p, q;
f(x)=$(x^3*(1−x^2))dx;
g(x)=$(kappa*exp(−beta*x)+x/sqrt(a^2−x^2))dx;
p(x)=$(x*cos(x^2))dx;
q(x)=$(sin(x)^3*cos(x)^2)dx;
end#
Interpreted Input:
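\( f(x) = \int x^3 (1 - x^2)\,dx \)
\( g(x) = \int \left( \kappa e^{-\beta x} + \frac{x}{\sqrt{a^2 - x^2}} \right) dx \)
\( p(x) = \int x \cos(x^2)\,dx \)
\( q(x) = \int \sin^3(x) \cos^2(x)\,dx \)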
Output:
Intermediate Steps:
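\( x^3 (1 - x^2) = x^3 - x^5 \)
Applying the additive rule of integration, we have
\( \int (x^3 - x^5)\,dx = \int x^3\,dx - \int x^5\,dx \)
Summing up all integrals, we obtain:
Factoring out the x-independent factor
\( \int \kappa e^{-\beta x}\,dx = \kappa \int e^{-\beta x}\,dx \)
To evaluate the indefinite integral \( \int e^{-\beta x}\,dx \), let's try the substitution
\( u = -\beta x \)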
Applying product rule
Therefore
Thus
and
Factoring out u-independent factor
Substituting u=−βx back to the result above gives
Notice that the integrand of the integral
contains the pattern \( \sqrt{a^2 - x^2} \).
A common approach for solving this type of integral is to use a trigonometric substitution:
\( x = a \sin(\varphi), \quad -\pi/2 \le \varphi \le +\pi/2 \)
We have chosen the value of \( \varphi = \arcsin(x/a) \) to be within its principal values between \( -\pi/2 \) and \( \pi/2 \). As a result, we have \( \cos(\varphi) \ge 0 \) and thus
\( \sqrt{a^2 - x^2} = |a| \cos(\varphi) \)
Applying product rule
Thus
and
Factoring out φ-independent factor
Substituting
back to the above result gives
Summing-up all integrals, we obtain:
\( u = x^2 \)
Thus
and
Factoring out the u-independent factor
Substituting \( u = x^2 \) back to the result above gives
\( \sin^2(x) = 1 - \cos^2(x) \)
to the integrand gives:
To evaluate the indefinite integral \( \int (\cos^2(x) - \cos^4(x)) \sin(x)\,dx \), let's try the substitution
\( u = \cos(x) \)
Thus
and
Applying the additive rule of integration, we have
\( \int (u^2 - u^4)\,du = \int u^2\,du - \int u^4\,du \)
Summing up all integrals, we obtain:
Substituting \( u = \cos(x) \) back to the result above gives
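A result obtained by substitution can be sanity-checked by differentiating it numerically and comparing against the integrand. The sketch below does this in Python for f, p, and q of Case C. The antiderivatives in `CASES` were worked out by hand from the substitutions named in the steps above (the additive rule, u = x^2, and u = cos(x)); they are assumptions of this example rather than output copied from an embodiment, and the helper names are illustrative.

```python
import math


def central_diff(F, x, h=1e-5):
    # Symmetric finite-difference approximation of dF/dx.
    return (F(x + h) - F(x - h)) / (2.0 * h)


# name -> (integrand, hand-derived antiderivative without the constant)
CASES = {
    "f": (lambda x: x**3 * (1 - x**2),
          lambda x: x**4 / 4 - x**6 / 6),
    "p": (lambda x: x * math.cos(x**2),
          lambda x: math.sin(x**2) / 2),
    "q": (lambda x: math.sin(x)**3 * math.cos(x)**2,
          lambda x: -math.cos(x)**3 / 3 + math.cos(x)**5 / 5),
}


def max_residual(integrand, antiderivative, points):
    """Largest |F'(x) - integrand(x)| over the sample points."""
    return max(abs(central_diff(antiderivative, x) - integrand(x))
               for x in points)


residuals = {name: max_residual(g, G, [0.3, 0.7, 1.1])
             for name, (g, G) in CASES.items()}
```

Residuals near the finite-difference noise level indicate that each antiderivative is consistent with its integrand at the sampled points; this is a spot check, not a proof.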
Continuing with the examples, Case D illustrates the relative compactness of the vector syntax of an embodiment and its similarity to the corresponding natural math notation. Vectors are among the most important mathematical concepts and find wide application in engineering and physics. Many users note that vector calculus can be an initial obstacle to students learning mechanics and electrostatics. Software with easy syntax for defining vectors and their operations, along with adequate illustration of the right-hand rule, can be very helpful to students.
Case D: Vector Calculus
Input:
begin
A=a*i^^+b*j^^+c*k^^;
B=x*y^2*i^^+ln(y+z)*j^^+(kappa/x)*k^^;
r=x*i^^+y*j^^+z*k^^;
T_s=A.B*r;
C=(alpha*i^^+beta*j^^)*A^^;
D=grad*B;
E=|B|*grad.B;
end#
Interpreted Input:
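\( A = a\hat{i} + b\hat{j} + c\hat{k} \)
\( B = x y^2 \hat{i} + \ln(y+z) \hat{j} + (\kappa/x) \hat{k} \)
\( r = x\hat{i} + y\hat{j} + z\hat{k} \)
\( T_s = A \cdot B \times r \)
\( C = (\alpha\hat{i} + \beta\hat{j}) \times \hat{A} \)
\( D = \nabla \times B \)
\( E = |B| \, \nabla \cdot B \)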
Output:
\( A = a\hat{i} + b\hat{j} + c\hat{k} \)
\( r = x\hat{i} + y\hat{j} + z\hat{k} \)
Intermediate Steps:
B×r can be calculated from the definition of cross product (or vector product) of two vectors,
The dot product (or scalar product) of two vectors can be calculated by multiplying like components and then adding them up,
The unit vector \( \hat{A} \) is defined by
where \( A_x, A_y, A_z \) are the components of the vector, and \( |A| \) is the length of the vector:
\( |A| = \sqrt{a^2 + b^2 + c^2} \)
thus
\( (\alpha\hat{i} + \beta\hat{j}) \times \hat{A} \) can be calculated from the definition of the cross product (or vector product) of two vectors,
The curl of vector field B can be calculated from
Applying chain rule for composite functions
Applying additive rule
Therefore
Applying quotient rule
Applying quotient rule
Applying product rule
Applying product rule
Applying chain rule for composite functions
Applying additive rule
Therefore
Thus,
The divergence of vector-field B can be calculated from
Applying product rule
Applying chain rule for composite functions
Applying additive rule
Therefore
Applying quotient rule
Thus
\( |B| \) is the length of vector B. By definition, \( |B| = \sqrt{B_x^2 + B_y^2 + B_z^2} \).
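The component-wise definitions used in Case D (dot product, cross product, vector length, unit vector, and the scalar triple product A·B×r) can be written out numerically. The following Python sketch uses plain 3-tuples; the function names are illustrative assumptions, and the sample vectors in the last lines are hypothetical.

```python
import math


def dot(u, v):
    # Multiply like components and add them up.
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    # Cross (vector) product from its component definition.
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    # Length of a vector: square root of the sum of squared components.
    return math.sqrt(dot(u, u))


def unit(u):
    # Unit vector: each component divided by the length.
    length = norm(u)
    return tuple(c / length for c in u)


def scalar_triple(a, b, c):
    # a . (b x c), the form of T_s = A . B x r.
    return dot(a, cross(b, c))


i_hat, j_hat, k_hat = (1, 0, 0), (0, 1, 0), (0, 0, 1)
k_from_cross = cross(i_hat, j_hat)   # right-hand rule: i x j = k
length = norm((3, 4, 12))
```

With the basis vectors above, `cross(i_hat, j_hat)` gives the k direction, in agreement with the right-hand rule mentioned in the introduction to this case.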
Continuing with the examples, Case E illustrates symbolic matrix definition and manipulation.
Case E: Symbolic Matrix Definition and Manipulations
Input:
begin
A=(a 2 psi)
//Below is a 3×3 identity matrix
I=(1, 0, 0)
B=|A^T−kappa*I|;
end#
Interpreted Input:
\( B = |A^T - \kappa I| \)
Output:
B=(a−κ)((c−κ)(z−κ)−yh2)+x(2h2−(c−κ)ψ)
Intermediate Steps:
Multiplying a matrix by a scalar results in a matrix with each element multiplied by the scalar, i.e.,
The transpose of a matrix is obtained by simply swapping elements \( a_{ij} \) and \( a_{ji} \),
Subtracting one matrix from another matrix of the same dimensions results in a matrix with each element being the difference of the corresponding elements of the two matrices, i.e.,
Since the size of the matrix is quite small, we can use Laplace's formula to calculate \( |A^T - \kappa I| \). According to the formula, the determinant is the sum of the products of each element of a selected row (or column) with its corresponding cofactor. Of course, we would choose the row or column having the most zeros, unless the matrix does not have any zero entry; in that case, we simply choose the first row for expansion. Thus,
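The cofactor expansion described above can be sketched recursively. The Python example below is a simplified illustration that only considers rows (not columns) when looking for the line with the most zeros, and falls back to the first row when no entry is zero; the names `minor` and `det` and the sample matrix are assumptions made for the example.

```python
def minor(M, i, j):
    # Matrix M with row i and column j removed.
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Determinant by Laplace (cofactor) expansion along one row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    # Pick the row with the most zeros; ties resolve to the earliest row,
    # so a matrix without zero entries is expanded along its first row.
    i = max(range(n), key=lambda r: sum(1 for v in M[r] if v == 0))
    total = 0
    for j, a in enumerate(M[i]):
        if a == 0:
            continue  # zero entries contribute nothing
        total += (-1) ** (i + j) * a * det(minor(M, i, j))
    return total


# Hypothetical 3x3 matrix whose second row contains two zeros.
example = det([[1, 2, 3], [0, 0, 4], [5, 6, 7]])
```

For the hypothetical matrix, only one cofactor needs to be evaluated because the selected row has a single non-zero entry; the result is 16.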
Continuing with examples, Case F illustrates definition and manipulation of tabular data.
Case F: Definition and Manipulation of Tabular Data
Input:
begin
function f;
kappa=sqrt(<y^2>−<y>^2);
f(x)=a*exp(−((x−mu)/b)^2);
Fit y,sigma to f;
end#
Interpreted Input:
Read the tabular data listed below and assign it to T:
\( \kappa = \sqrt{\langle y^2 \rangle - \langle y \rangle^2} \)
Minimize:
Output:
\( \kappa = 0.5448777363 \)
Using the Levenberg-Marquardt method with the initial guess:
the model parameters are estimated to be:
and
\( \chi^2 = 14.594367981 \)
The table below lists both the input data and the corresponding model prediction using the fitted parameter(s):
Intermediate Steps:
\( \langle y \rangle \) is the average of column y in Table T,
\( \langle y^2 \rangle \) is the average of \( y^2 \),
Since both the observable and its standard error are provided in the input, the merit function can be written as:
where
We use the Levenberg-Marquardt method to estimate the model parameter(s). The Levenberg-Marquardt method uses the gradient, i.e., the first-order derivatives, of the merit function with respect to the model parameter(s) as its guidance in searching for the minimum. To compute the gradient, we need the derivatives of the model with respect to the parameters to be fitted, which are given below:
The algorithm is terminated after 10 iterations, when the pre-set convergence criterion \( |\chi^2_{k+1} - \chi^2_k| < 10^{-3} \) is met. The iteration results are listed in the table below.
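The quantities that feed the fit in Case F can be sketched in a few functions. The Python example below computes κ from the column averages, evaluates the model f(x) = a·exp(−((x−μ)/b)²) from the input, and forms a merit function. The weighted sum-of-squares form of χ² used here is the standard one and is an assumption, since the merit function itself is not reproduced in the text; the partial derivatives were derived by hand from the model, and the data in the last lines are hypothetical. The Levenberg-Marquardt iteration itself is not implemented.

```python
import math


def mean(values):
    return sum(values) / len(values)


def kappa(y):
    # kappa = sqrt(<y^2> - <y>^2)
    return math.sqrt(mean([v * v for v in y]) - mean(y) ** 2)


def model(x, a, mu, b):
    # f(x) = a * exp(-((x - mu) / b)^2), as in the Case F input
    return a * math.exp(-((x - mu) / b) ** 2)


def chi_square(xs, ys, sigmas, a, mu, b):
    # Assumed merit function: sum of squared, sigma-weighted residuals.
    return sum(((y - model(x, a, mu, b)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))


def model_gradient(x, a, mu, b):
    # Hand-derived partial derivatives of the model w.r.t. (a, mu, b).
    e = math.exp(-((x - mu) / b) ** 2)
    return (e,
            a * e * 2.0 * (x - mu) / b ** 2,
            a * e * 2.0 * (x - mu) ** 2 / b ** 3)


# Hypothetical data generated from the model itself, so chi-square is zero.
xs = [0.0, 0.5, 1.0, 1.5]
ys = [model(x, 2.0, 0.5, 1.5) for x in xs]
sigmas = [0.1] * len(xs)
perfect_fit = chi_square(xs, ys, sigmas, 2.0, 0.5, 1.5)
```

A gradient-based minimizer such as Levenberg-Marquardt would combine `model_gradient` with the residuals to update (a, μ, b) at each iteration.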
Continuing with examples, Case G illustrates a triple integral.
Case G: Triple Integral
Input:
begin
A=$$$(x+y+1)dxdydz@(0<x<1, 0<y<x, 0<z<2);
end#
Interpreted Input:
\( A = \int_0^1 \int_0^2 \int_0^x (x + y + 1)\,dy\,dz\,dx \)
Output:
A=2
Intermediate Steps:
The triple integral can be written as:
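\( \int_{x=0}^{1} \int_{z=0}^{2} \int_{y=0}^{x} (x+y+1)\,dy\,dz\,dx = \int_{x=0}^{1} \left( \int_{z=0}^{2} \int_{y=0}^{x} (x+y+1)\,dy\,dz \right) dx \)
The double integral can be written as:
\( \int_{z=0}^{2} \int_{y=0}^{x} (x+y+1)\,dy\,dz = \int_{z=0}^{2} \left( \int_{y=0}^{x} (x+y+1)\,dy \right) dz \)
\( \int (x+y+1)\,dy = \int x\,dy + \int y\,dy + \int 1\,dy \)
\( \int x\,dy = yx \)
\( \int 1\,dy = y \)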
Summing-up all integrals, we obtain:
Applying the fundamental theorem of calculus,
let's first work out the indefinite integral
Applying the fundamental theorem of calculus,
Therefore,
let's first work out the indefinite integral
Factoring out x-independent factor
Applying the additive rule of integration, we have
Factoring out the x-independent factor
Summing-up all integrals, we obtain:
Therefore
Applying the fundamental theorem of calculus,
Therefore,
\( \int_0^1 \int_0^2 \int_0^x (x+y+1)\,dy\,dz\,dx = 2 \)
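The nesting used in the steps above (an outer integral over x wrapping a double integral over z and y) maps directly onto nested one-dimensional quadrature. The Python sketch below checks the result A = 2 numerically with a midpoint rule; the rule, the panel count, and the function names are assumptions of this example, and the symbolic evaluation itself is not reproduced.

```python
def midpoint(func, lo, hi, n=200):
    # Composite midpoint rule on [lo, hi] with n panels.
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def triple_integral(n=200):
    """Approximate the integral of (x + y + 1) over 0<x<1, 0<y<x, 0<z<2."""
    def over_y(x):
        # Innermost integral: y runs from 0 to x.
        return midpoint(lambda y: x + y + 1.0, 0.0, x, n)

    def over_z(x):
        inner = over_y(x)  # the integrand does not depend on z
        return midpoint(lambda z: inner, 0.0, 2.0, n)

    # Outermost integral: x runs from 0 to 1.
    return midpoint(over_z, 0.0, 1.0, n)


A = triple_integral()
```

The numerical value agrees with the symbolic result to within the discretization error of the midpoint rule.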
Continuing with examples, Case H illustrates a line integral of scalar field.
Case H: Line Integral of Scalar Field
Input:
begin
function rho;
rho(x,y)=x+y^2;
C={(x,y)|x=t,y=t, 0<t<1};
ds=sqrt(dx^2+dy^2);
m=$rho(x,y)ds@C;
end#
Interpreted Input:
\( \rho(x,y) = x + y^2 \)
\( C = \{(x,y) \mid x = t,\ y = t,\ 0 < t < 1\} \)
\( ds = \sqrt{(dx)^2 + (dy)^2} \)
\( m = \int_C \rho(x,y)\,ds \)
Output:
\( \rho(x,y) = x + y^2 \)
\( C = \{(x,y) \mid x = t,\ y = t,\ 0 < t < 1\} \)
\( ds = \sqrt{(dx)^2 + (dy)^2} \)
Intermediate Steps:
\( \int_C \rho(x,y)\,ds \) is a line integral along curve C. To compute it, we need to first work out \( \rho(x,y) \) and the differential ds along the curve in terms of the parameter t. Plugging
x=t
y=t
from the curve definition into \( \rho(x,y) \), we obtain
\( \rho(x,y) = t + t^2 \)
\( \int (t + t^2) \sqrt{2}\,dt = \sqrt{2} \int (t + t^2)\,dt \)
Applying the additive rule of integration, we have
\( \int (t + t^2)\,dt = \int t\,dt + \int t^2\,dt \)
Summing-up all integrals, we obtain:
Therefore
Applying the fundamental theorem of calculus,
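The parametrization step of Case H, in which ρ and ds are both expressed in terms of t, can be checked numerically. The Python sketch below integrates ρ(x(t), y(t))·sqrt((dx/dt)² + (dy/dt)²) over 0 < t < 1 with a midpoint rule; the function name, its arguments, and the quadrature choice are assumptions of this example. The reference value 5√2/6 mentioned afterwards was worked out by hand from √2·∫(t + t²)dt and is not quoted from the source.

```python
import math


def line_integral_scalar(rho, x, y, dx_dt, dy_dt, t0, t1, n=1000):
    """Midpoint-rule approximation of the line integral of rho along a curve."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        # ds = sqrt(dx^2 + dy^2) expressed through the parameter t
        ds_dt = math.sqrt(dx_dt(t) ** 2 + dy_dt(t) ** 2)
        total += rho(x(t), y(t)) * ds_dt * h
    return total


m = line_integral_scalar(
    rho=lambda x, y: x + y ** 2,
    x=lambda t: t, y=lambda t: t,            # curve C: x = t, y = t
    dx_dt=lambda t: 1.0, dy_dt=lambda t: 1.0,
    t0=0.0, t1=1.0,
)
```

The value of `m` can be compared with 5√2/6, which is what the hand evaluation of the integrals in the steps above yields.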
Continuing with examples, Case J illustrates a line integral of vector field.
Case J: Line Integral of Vector Field—II
Input:
begin
C={(x,y,z)|x=cos(t), y=sin(t), z=t, 0<t<pi};
F=y*i^^+x*j^^;
r=x*i^^+y*j^^+z*k^^;
N=$F*dr@C;
end#
Interpreted Input:
C={(x,y,z)|x=cos(t), y=sin(t), z=t, 0<t<π}
\( F = y\hat{i} + x\hat{j} \)
\( r = x\hat{i} + y\hat{j} + z\hat{k} \)
\( N = \int_C F \times dr \)
Output:
C={(x,y,z)|x=cos(t), y=sin(t), z=t, 0<t<π}
\( F = y\hat{i} + x\hat{j} \)
\( r = x\hat{i} + y\hat{j} + z\hat{k} \)
\( N = (-2)\hat{j} \)
Intermediate Steps:
Plugging
x=cos(t)
y=sin(t)
z=t
from the curve definition into \( (y\hat{i} + x\hat{j}) \), we obtain
Thus, we have
and
\( \int \cos(t)\,dt = \sin(t) \)
Applying the fundamental theorem of calculus,
\( \int \sin(t)\,dt = -\cos(t) \)
Applying the fundamental theorem of calculus,
\( \int 2 \sin(t) \cos(t)\,dt = 2 \int \sin(t) \cos(t)\,dt \)
To evaluate the indefinite integral \( \int \sin(t) \cos(t)\,dt \), let's try the substitution
\( u = \sin(t) \)
Thus
and
Substituting \( u = \sin(t) \) back to the result above gives
Therefore
Applying the fundamental theorem of calculus,
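The vector-valued line integral of Case J can be approximated component by component once F and dr are written in terms of t. The Python sketch below does this with a midpoint rule; the function names and the quadrature are assumptions of this example, and dr/dt = (−sin t, cos t, 1) follows from differentiating the curve definition.

```python
import math


def cross(u, v):
    # Cross product from its component definition.
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def line_integral_cross(F, r, dr_dt, t0, t1, n=2000):
    """Midpoint-rule approximation of the integral of F x dr along r(t)."""
    h = (t1 - t0) / n
    total = [0.0, 0.0, 0.0]
    for i in range(n):
        t = t0 + (i + 0.5) * h
        c = cross(F(*r(t)), dr_dt(t))
        for k in range(3):
            total[k] += c[k] * h
    return tuple(total)


N = line_integral_cross(
    F=lambda x, y, z: (y, x, 0.0),                       # F = y i + x j
    r=lambda t: (math.cos(t), math.sin(t), t),           # curve C
    dr_dt=lambda t: (-math.sin(t), math.cos(t), 1.0),
    t0=0.0, t1=math.pi,
)
```

The three components of `N` come out close to (0, −2, 0), in agreement with the Output section.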
Continuing with examples, Case K illustrates implicit differentiation.
Case K: Implicit Differentiation
Input:
begin
y^2−4*y=x^2*(x^2−4); (1.1)
calculate dy/dx from (1.1);
end#
Interpreted Input:
\( y^2 - 4y = x^2 (x^2 - 4) \) (1.1)
Calculate \( dy/dx \) from 1.1.
Output:
Intermediate Steps:
This is an implicit differentiation problem. The dependency between y and x is defined implicitly through the equation
\( y^2 - 4y = x^2 (x^2 - 4) \)
To compute the derivative \( dy/dx \),
let's take derivatives with respect to x of both sides of the above equation. Applying the additive rule
Therefore
Applying product rule
The derivative of a constant or an independent variable is zero, thus
Therefore
Equating the expressions resulting from taking derivatives of both sides of the defining equation, we obtain
From which we can solve for \( dy/dx \).
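The outcome of the implicit differentiation can be verified numerically on an explicit branch of the curve. In the Python sketch below, the formula in `dydx` was obtained by hand by differentiating both sides of (1.1), which gives (2y − 4)·dy/dx = 4x³ − 8x; it is an assumption of this example and is not copied from the Output section. The helper `y_upper` is the branch y = 2 + |x² − 2| obtained by solving (1.1) as a quadratic in y.

```python
def dydx(x, y):
    # From (2y - 4) * dy/dx = 4x^3 - 8x (hand-derived, see lead-in).
    return (4.0 * x ** 3 - 8.0 * x) / (2.0 * y - 4.0)


def y_upper(x):
    # One explicit branch of y^2 - 4y = x^2 (x^2 - 4).
    return 2.0 + abs(x * x - 2.0)


def residual(x, y):
    # Zero exactly when (x, y) satisfies equation (1.1).
    return y ** 2 - 4.0 * y - x ** 2 * (x ** 2 - 4.0)


def central_diff(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 1.5
slope_formula = dydx(x0, y_upper(x0))
slope_numeric = central_diff(y_upper, x0)
```

At x = 1.5 the branch reduces to y = x², so both slopes should be close to 3. The formula is undefined where y = 2, which is where the two branches of the curve meet.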
Continuing with examples, Case L illustrates a query of an attribute of a mathematical entity.
Case L: Query the Attribute of a Mathematical Entity
Input:
begin
A=(cos(phi)−sin(phi))
is A orthogonal?
end#
Interpreted Input:
Is A orthogonal?
Output:
No, A is not orthogonal.
Intermediate Steps:
A square matrix is orthogonal if its rows are mutually orthogonal, i.e., the scalar-product of any row with itself is 1, and the products with all other rows are zero:
For i=1, j=1:
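The row-by-row test stated above (each row has scalar product 1 with itself and 0 with every other row) can be sketched as a double loop over row pairs. The Python example below is illustrative only; the function name, the tolerance, and both sample matrices are assumptions, since the matrix A of the Case L input is not fully reproduced in the text.

```python
import math


def is_orthogonal(M, tol=1e-12):
    """Check that the rows of a square matrix are mutually orthonormal."""
    n = len(M)
    if any(len(row) != n for row in M):
        return False  # only square matrices can be orthogonal
    for i in range(n):
        for j in range(n):
            product = sum(M[i][k] * M[j][k] for k in range(n))
            expected = 1.0 if i == j else 0.0
            if abs(product - expected) > tol:
                return False
    return True


phi = 0.3
# Hypothetical examples: a plane rotation (orthogonal) and a shear (not).
rotation = [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]
shear = [[1.0, 1.0],
         [0.0, 1.0]]
checks = (is_orthogonal(rotation), is_orthogonal(shear))
```

A symbolic system would carry out the same comparisons on expressions rather than on floating-point numbers, reporting the first row pair that fails.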
Continuing with examples, Case M illustrates a surface integral of a vector field.
Case M: Surface Integral of Vector Field
Input:
begin
S={(x,y,z)|z=x, 0<x<R, 0<y<x};
u=u_0*z*k^^;
n=((d/dx)z, (d/dy)z, −1);
Q=$(u.n^^)dS@S;
end#
Interpreted Input:
S={(x,y,z)|z=x, 0<x<R, 0<y<x}
\( u = u_0 z \hat{k} \)
\( n = ((d/dx)z, (d/dy)z, (-1)) \)
\( Q = \int_S u \cdot \hat{n}\,dS \)
Output:
S={(x,y,z)|0<x<R, 0<y<x, z=x}
\( u = u_0 z \hat{k} \)
\( n = (d/dx)z\,\hat{i} + (d/dy)z\,\hat{j} - \hat{k} \)
Intermediate Steps:
\( \int_S (u_0 z \hat{k}) \cdot \hat{n}\,dS \) is a surface integral on surface S. To evaluate it, we need to map the integrand \( (u_0 z \hat{k}) \cdot \hat{n} \) onto the surface and work out the differential area dS explicitly in terms of the differential area on the xy-plane, i.e., dxdy. Substituting
z=x
from the surface definition into \( (u_0 z \hat{k}) \cdot \hat{n} \), we have
\( (u_0 z \hat{k}) = u_0 x \hat{k} \)
\( ((d/dx)z\,\hat{i} + (d/dy)z\,\hat{j} - \hat{k}) = \hat{i} - \hat{k} \)
and
The dot product (or scalar product) of two vectors can be calculated by multiplying like components and then adding them up,
thus,
The differential area dS on surface S is related to the differential area dxdy on the xy-plane through the following equation:
\( dS = \sqrt{(\partial z/\partial x)^2 + (\partial z/\partial y)^2 + 1}\,dx\,dy \)
Plugging
z=x
into the above equation, we obtain
\( dS = \sqrt{2}\,dx\,dy \).
Noticing that the projection of the surface on the xy-plane is:
\( D = \{(x,y) \mid 0 < x < R,\ 0 < y < x\} \),
we can express the surface integral as a double integral defined in region D as below:
The double integral can be written as:
\( \int_{x=0}^{R} \int_{y=0}^{x} (-u_0 x)\,dy\,dx = \int_{x=0}^{R} \left( \int_{y=0}^{x} (-u_0 x)\,dy \right) dx \)
\( \int (-u_0 x)\,dy = -y u_0 x \)
Applying the fundamental theorem of calculus,
Applying the fundamental theorem of calculus,
Therefore,
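Once the surface integral has been reduced to a double integral of −u0·x over the region D, it can be checked numerically. The Python sketch below evaluates that double integral with nested midpoint rules; the function names, the panel count, and the sample values u0 = 2 and R = 3 are assumptions of this example. The closed form −u0·R³/3 used for comparison was worked out by hand from the double integral and is not quoted from the source.

```python
def midpoint(func, lo, hi, n=300):
    # Composite midpoint rule on [lo, hi] with n panels.
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def surface_flux(u0, R, n=300):
    """Approximate the double integral of (-u0 * x) over 0<x<R, 0<y<x."""
    def inner(x):
        # Integrate over y from 0 to x at fixed x.
        return midpoint(lambda y: -u0 * x, 0.0, x, n)

    return midpoint(inner, 0.0, R, n)


# Hypothetical parameter values for the check.
Q = surface_flux(u0=2.0, R=3.0)
expected = -2.0 * 3.0 ** 3 / 3.0
```

The factor √2 from dS and the factor 1/√2 from normalizing n cancel, which is why only −u0·x remains in the integrand.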
Continuing with examples, Case N illustrates a hybrid symbolic natural language assertion and command.
Case N: HSNL Assertion and Command
Input:
begin
A=(x_0, y_0, z_0);
S={(x,y,z)|z=x^2+y^2, 0<x<inf, 0<y<x};
P is the tangent plane of S at x=0, y=h;
calculate the distance between P and A;
end#
Interpreted Input:
\( A = (x_0, y_0, z_0) \)
\( S = \{(x,y,z) \mid z = x^2 + y^2,\ 0 < x < \infty,\ 0 < y < x\} \)
P is the tangent plane of S at x=0, y=h;
calculate the distance between P and A;
Output:
\( A = x_0\hat{i} + y_0\hat{j} + z_0\hat{k} \)
\( S = \{(x,y,z) \mid 0 < x < \infty,\ 0 < y < x,\ z = (x^2 + y^2)\} \)
\( P = \{(x,y,z) \mid Ax + By + Cz + D = 0,\ x \in \mathbb{R},\ y \in \mathbb{R}\} \)
where
A=0
The distance is:
Intermediate Steps:
The tangent plane of a surface at a given point (x′, y′, z′) on the surface can be determined by the normal vector at the point, from the fact that the normal vector is perpendicular to any line that lies within the tangent plane, or equivalently, to any vector pointing to any point (x, y, z) on the plane from the given point (see Figure below), i.e.,
\( \hat{n} \cdot ((x - x')\hat{i} + (y - y')\hat{j} + (z - z')\hat{k}) = 0 \)
The normal vector of a surface can be calculated from
From the definition of surface S
\( z = x^2 + y^2 \)
we have:
and
Thus the equation describing the plane can be written as:
or equivalently,
Ax+By+Cz+D=0
A=0
The distance from a given point (x, y, z) to a plane can be calculated by first determining the intersection between the plane and the line that is parallel to the normal vector of the plane and passes through the point, and then computing the distance between the point and the intersection. It can be shown that the formula is as follows:
where A, B, C, D are the coefficients defining the plane. Plugging the coefficients defining the plane and the coordinates of the point (x0, y0, z0) into the formula, we obtain:
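The two steps of Case N, building the tangent plane from the surface normal and then measuring the distance from a point to that plane, can be sketched numerically. The Python example below assumes the normal n = (∂z/∂x, ∂z/∂y, −1), the same form used for n in Case M, and the standard point-to-plane distance |A·x0 + B·y0 + C·z0 + D| / sqrt(A² + B² + C²); neither formula is reproduced in the text, so both are assumptions, as are the function names and the sample value h = 1. A different sign convention for the normal flips the signs of A, B, C, D without changing the plane or the distance.

```python
import math


def tangent_plane_paraboloid(x1, y1):
    """Coefficients (A, B, C, D) of the tangent plane of z = x^2 + y^2."""
    z1 = x1 ** 2 + y1 ** 2
    # Assumed normal: (dz/dx, dz/dy, -1) evaluated at the tangent point.
    A, B, C = 2.0 * x1, 2.0 * y1, -1.0
    # The plane passes through (x1, y1, z1).
    D = -(A * x1 + B * y1 + C * z1)
    return A, B, C, D


def point_plane_distance(point, plane):
    # Standard distance from a point to the plane Ax + By + Cz + D = 0.
    A, B, C, D = plane
    x0, y0, z0 = point
    return abs(A * x0 + B * y0 + C * z0 + D) / math.sqrt(A * A + B * B + C * C)


# Hypothetical numbers: tangent point x = 0, y = h with h = 1.
plane = tangent_plane_paraboloid(0.0, 1.0)
distance_from_origin = point_plane_distance((0.0, 0.0, 0.0), plane)
```

With h = 1 the plane is 2y − z − 1 = 0, so the coefficient A is 0, in line with the A = 0 shown in the Output section, and the distance from the origin is 1/√5.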
The above examples provide a number of exemplary inputs, outputs, and intermediate steps that may be displayed according to methods and systems of the present disclosure. With reference now to
The detailed description set forth above in connection with the appended drawings describes exemplary implementations and does not represent the only examples that may be implemented or that are within the scope of the claims. The term “exemplary” used throughout this description means “serving as an example, instance, or illustration,” and not “preferred” or “advantageous over other embodiments.” The detailed description includes specific details for the purpose of providing an understanding of the described techniques. These techniques, however, may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts as described.
The various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope and spirit of the disclosure and appended claims. For example, due to the nature of software, functions described above can be implemented using software executed by a processor, hardware, firmware, hardwiring, or combinations of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations. Also, as used herein, including in the claims, “or” as used in a list of items prefaced by “at least one of” indicates a disjunctive list such that, for example, a list of “at least one of A, B, or C” means A or B or C or AB or AC or BC or ABC (i.e., A and B and C).
Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.
The previous description of the disclosure is provided to enable a person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.