The disclosure is related to the field of document layout, and in particular, to automatically generating and rendering a template for a pre-defined layout and any constraints associated therewith.
A mixed-content document can be organized to display a combination of text, images, headers, sidebars, or any other elements that are typically dimensioned and arranged to display information to a reader in a coherent, informative, and visually aesthetic manner. Mixed-content documents can be in printed or electronic form, and examples of mixed-content documents include articles, flyers, business cards, newsletters, website displays, brochures, single or multi page advertisements, envelopes, and magazine covers just to name a few. In order to design a layout for a mixed-content document, a document designer selects for each page of the document a number of elements, element dimensions, spacing between elements called “white space,” font size and style for text, background, colors, and an arrangement of the elements.
In recent years, advances in computing devices have accelerated the growth and development of software-based document layout design tools and, as a result, increased the efficiency with which mixed-content documents can be produced. A first type of design tool uses a set of gridlines that can be seen in the document design process but are invisible to the document reader. The gridlines are used to align elements on a page, allow for flexibility by enabling a designer to position elements within a document, and even allow a designer to extend portions of elements outside of the gridlines, depending on how much variation the designer would like to incorporate into the document layout. A second type of document layout design tool is a template. Typical design tools present a document designer with a variety of different templates to choose from for each page of the document.
However, the dimensions of template fields are often fixed, which makes it difficult for document designers to resize images and arrange text to fill particular fields and results in image and text overflows, cropping, or other unpleasant scaling issues.
An example of a method for adjusting an automatic template layout by providing a constraint is disclosed. In one example, raw text, figures, references, and semantic information are received. A check is performed for a constraint. An allocation of text and figures is determined for each page of a document. In addition, for each page of the document, a template for displaying the allocation assigned to the page is determined. The template parameters are set to exhibit the text and figures assigned to the page. The document is then rendered with text and figures allocated to each page within appropriate template fields of the template selected for each page while abiding by the constraint.
Examples are described below with reference to numerous equations and graphical illustrations. In particular, examples are based on Bayes' Theorem from the probability theory branch of mathematics. Although mathematical expressions alone may be sufficient to fully describe and characterize examples disclosed herein, the more graphical, problem-oriented examples and control-flow-diagram approaches included in the following discussion are intended to illustrate examples so that the systems and methods may be accessible to readers with various backgrounds. In order to assist in understanding descriptions of various examples disclosed herein, an overview of Bayes' Theorem is provided in a first subsection, template parameters are introduced in a second subsection, and probabilistic template models based on Bayes' Theorem for determining template parameters are provided in a third subsection.
An Overview of Bayes' Theorem and Related Concepts from Probability Theory
Readers already familiar with Bayes' Theorem and other related concepts from probability theory can skip this subsection and proceed to the next subsection titled Template Parameters. This subsection is intended to provide readers who are unfamiliar with Bayes' Theorem a basis for understanding relevant terminology, notation, and provide a basis for understanding how Bayes' Theorem is used to determine document template parameters as described below. For the sake of simplicity, Bayes' theorem and related topics are described below with reference to sample spaces with discrete events, but one skilled in the art will recognize that these concepts can be extended to sample spaces with continuous distributions of events.
A description of probability begins with a sample space S, which is the mathematical counterpart of an experiment and mathematically serves as a universal set for all possible outcomes of an experiment. For example, a discrete sample space can be composed of all the possible outcomes of tossing a fair coin two times and is represented by:
S={HH, HT, TH, TT}
where H represents the outcome heads, and T represents the outcome tails. An event is a set of outcomes, or a subset of a sample space, to which a probability is assigned. A simple event is a single element of the sample space S, such as the event “both coins are tails” TT, or an event can be a larger subset of S, such as the event “at least one coin toss is tails” comprising the three simple events HT, TH, and TT.
The probability of an event E, denoted by P(E), satisfies the condition 0≤P(E)≤1 and is the sum of the probabilities associated with the simple events comprising the event E. For example, the probability of observing each of the simple events of the set S, representing the outcomes of tossing a fair coin two times, is ¼. The probability of the event “at least one coin is heads” is ¾ (i.e., ¼+¼+¼, which are the probabilities of the simple events HH, HT, and TH, respectively).
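The coin-toss example can be checked by enumerating the sample space directly. The following is a minimal sketch; the names `sample_space`, `prob`, and `probability` are illustrative and do not come from the disclosure.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: HH, HT, TH, TT.
sample_space = ["".join(toss) for toss in product("HT", repeat=2)]

# For a fair coin, every simple event is equally likely.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}


def probability(event):
    """Sum the probabilities of the simple events that make up `event`."""
    return sum(prob[outcome] for outcome in event)


# Events are subsets of the sample space.
at_least_one_head = [o for o in sample_space if "H" in o]
at_least_one_tail = [o for o in sample_space if "T" in o]

print(sample_space)
print(probability(at_least_one_head))
print(probability(at_least_one_tail))
```

Exact fractions are used so that the sums match the hand calculation without rounding.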
Bayes' Theorem provides a formula for calculating conditional probabilities. A conditional probability is the probability of the occurrence of some event A, based on the occurrence of a different event B. Conditional probability can be defined by the following equation:
P(A|B) = P(A∩B) / P(B)
where P(A|B) is read as “the probability of the event A, given the occurrence of the event B,”
P(A∩B) is read as “the probability of the events A and B both occurring,” and
P(B) is simply the probability of the event B occurring regardless of whether or not the event A occurs.
For an example of conditional probabilities, consider a club with four male and five female charter members that elects two women and three men to membership. See also Goldberg, S., “Probability: An Introduction,” 1986, pages 74-75. From the total of 14 members, one person is selected at random, and suppose it is known that the person selected is a charter member. What is the probability that the person selected is male, given that the person is a charter member? In terms of the conditional probability, B is the event “the person selected is a charter member,” and A is the event “the person selected is male.” Nine of the 14 members are charter members, and four of the 14 members are both male and charter members. According to the formula for conditional probability:
P(B)= 9/14, and
P(A∩B)= 4/14
Thus, the probability that the person selected at random is male, given that the person selected is a charter member, is:
P(A|B) = (4/14) / (9/14) = 4/9
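The club example can be verified by counting members directly. This is a small sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
from fractions import Fraction

# (sex, is_charter_member) -> head count, taken from the club example:
# four male and five female charter members, plus three men and two women
# elected later.
members = {
    ("male", True): 4,
    ("female", True): 5,
    ("male", False): 3,
    ("female", False): 2,
}
total = sum(members.values())

# B: the person selected is a charter member.
p_b = Fraction(sum(n for (_, charter), n in members.items() if charter), total)

# A and B: the person selected is male and a charter member.
p_a_and_b = Fraction(members[("male", True)], total)

# Conditional probability P(A|B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b
print(p_b, p_a_and_b, p_a_given_b)
```

Counting gives P(A∩B) as the four male charter members out of 14, not the seven men in the whole club.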
Bayes' theorem relates the conditional probability of the event A given the event B to the probability of the event B given the event A. In other words, Bayes' theorem relates the conditional probabilities P(A|B) and P(B|A) in a single mathematical expression as follows:
P(A|B) = P(B|A) P(A) / P(B)
P(A) is a prior probability of the event A. It is called the “prior” because it does not take into account the occurrence of the event B. P(B|A) is the conditional probability of observing the event B given the observation of the event A. P(A|B) is the conditional probability of observing the event A given the observation of the event B. It is called the “posterior” because it depends on, or is observed after, the occurrence of the event B. P(B) is a prior probability of the event B, and can serve as a normalizing constant.
For an example application of Bayes' theorem consider two urns containing colored balls as specified in Table I:
Suppose one of the urns is selected at random and a blue ball is removed. Bayes' theorem can be used to determine the probability that the ball came from urn 1. Let B denote the event “ball selected is blue.” To account for the occurrence of B there are two hypotheses: A1 is the event urn 1 is selected, and A2 is the event urn 2 is selected.
Because the urn is selected at random,
P(A1)=P(A2)=½
Based on the entries in Table I, conditional probabilities also give:
P(B|A1)= 2/9,and
P(B|A2)= 3/6
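The probability of the event “ball selected is blue,” regardless of which urn is selected, is
P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) = (2/9)(½) + (3/6)(½) = 13/36
Thus, according to Bayes' theorem, the probability the blue ball came from urn 1 is given by:
P(A1|B) = P(B|A1)P(A1) / P(B) = (1/9) / (13/36) = 4/13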
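The urn calculation follows the same pattern for any finite set of hypotheses. The sketch below uses only the prior and conditional probabilities stated in the text; the function name `posterior` and the labels `urn1` and `urn2` are illustrative.

```python
from fractions import Fraction

# Priors: each urn is equally likely to be selected.
prior = {"urn1": Fraction(1, 2), "urn2": Fraction(1, 2)}

# Likelihood of drawing a blue ball from each urn, as stated in the text.
likelihood_blue = {"urn1": Fraction(2, 9), "urn2": Fraction(3, 6)}


def posterior(prior, likelihood):
    """Return (posterior per hypothesis, evidence) using Bayes' theorem."""
    # The evidence P(B) is the normalizing constant.
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    post = {h: prior[h] * likelihood[h] / evidence for h in prior}
    return post, evidence


post, p_blue = posterior(prior, likelihood_blue)
print(p_blue)
print(post["urn1"], post["urn2"])
```

The two posterior values sum to one, which is the role P(B) plays as a normalizing constant.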
Template Parameters
In this subsection, template parameters used to obtain dimensions of image fields and white spaces of a document template are described with reference to just three example document templates. The three examples described below are not intended to be exhaustive of the nearly limitless possible dimensions and arrangements of template elements. Instead, the examples described in this subsection are intended to merely provide a basic understanding of how the dimensions of elements of a template can be characterized, and are intended to introduce the reader to the terminology and notation used to represent template parameters and dimensions of document templates. Note that template parameters are not used to change the dimensions of the text fields or the overall dimensions of the templates. Template parameters are formally determined using probabilistic methods and systems described below in the subsequent subsection.
In preparing a document layout, document designers typically select a style sheet in order to determine the document's overall appearance. The style sheet may include (1) a typeface, character size, and colors for headings, text, and background; (2) format for how front matter, such as preface, figure list, and title page should appear; (3) format for how sections can be arranged in terms of space and number of columns, line spacing, margin widths on all sides, and spacing between headings just to name a few; and (4) any boilerplate content included on certain pages, such as copyright statements. The style sheet typically applies to the entire document. As necessary, specific elements of the style sheet may be overridden for particular sections of the document.
Document templates represent the arrangement of elements for displaying text and images for each page of the document.
The template parameters and dimensions of an image and white space associated with the template 300 can be characterized by vectors as illustrated in
Because both the width wf and the height hf of the image are scaled by the same parameter Θf as described above, the first vector elements of $\vec{x}_1$ and $\vec{y}_1$ are wf and hf, respectively. The other dimensions varied in the template 300 are the widths of the white spaces 316 and 318, which are varied in the y-direction, and the margins, which are varied in the x- and y-directions. For $\vec{x}_1$, the two vector elements corresponding to the parameters Θfp and Θp are “0”, the two vector elements corresponding to the margins mw1 and mw2 are “1”, and the two vector elements corresponding to the margins mh1 and mh2 are “0”. For $\vec{y}_1$, the two vector elements corresponding to the parameters Θfp and Θp are “1”, the two vector elements corresponding to the margins mw1 and mw2 are “0”, and the two vector elements corresponding to the margins mh1 and mh2 are “1”.
The vector elements of $\vec{x}_1$ and $\vec{y}_1$ are arranged to correspond to the parameters of the vector $\vec{\Theta}$ in order to satisfy the following condition in the x-direction:
$$\vec{x}_1^T\vec{\Theta} - W_1 \approx 0$$
and the following condition in the y-direction:
$$\vec{y}_1^T\vec{\Theta} - H_1 \approx 0$$
where
$\vec{x}_1^T\vec{\Theta} = \Theta_f w_f + m_{w1} + m_{w2}$ is the scaled width of the image displayed in the image field 302 together with the margins in the x-direction;
$W_1 = W$ is a variable corresponding to the space available to the image displayed in the image field 302 in the x-direction;
$\vec{y}_1^T\vec{\Theta} = \Theta_f h_f + \Theta_{fp} + \Theta_p + m_{h1} + m_{h2}$ is the sum of the scaled height of the image displayed in the image field 302 and the parameters associated with scaling the white spaces 316 and 318; and
$H_1 = H - H_{p1} - H_{p2}$ is a variable corresponding to the space available for the image displayed in the image field 302 and the widths of the white spaces 316 and 318 in the y-direction.
Probabilistic methods based on Bayes' theorem described below can be used to determine the template parameters so that the conditions $\vec{x}_1^T\vec{\Theta} - W_1 \approx 0$ and $\vec{y}_1^T\vec{\Theta} - H_1 \approx 0$ are satisfied.
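The two conditions for the template 300 can be illustrated numerically. The sketch below is not taken from the disclosure: the vector names `x1`, `y1`, and `theta`, the parameter ordering, and every dimension value are assumptions chosen so that the arithmetic is easy to follow.

```python
# Assumed parameter order for this sketch:
# [theta_f, theta_fp, theta_p, m_w1, m_w2, m_h1, m_h2]
w_f, h_f = 400.0, 300.0            # hypothetical unscaled image width and height

# x-direction vector: image width, no white-space terms, the two width margins.
x1 = [w_f, 0, 0, 1, 1, 0, 0]
# y-direction vector: image height, both white-space terms, the two height margins.
y1 = [h_f, 1, 1, 0, 0, 1, 1]

# Hypothetical page and text-field dimensions.
W, H, H_p1, H_p2 = 500.0, 700.0, 100.0, 150.0
W1 = W                             # space available in the x-direction
H1 = H - H_p1 - H_p2               # space available in the y-direction

# A candidate parameter vector: scale 1.0, white spaces 20 and 30, margins 50.
theta = [1.0, 20.0, 30.0, 50.0, 50.0, 50.0, 50.0]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Residuals of the two conditions; values near zero mean the layout fits.
residual_x = dot(x1, theta) - W1
residual_y = dot(y1, theta) - H1
print(residual_x, residual_y)
```

With these assumed values both residuals are zero, so the candidate parameters fill the available space exactly in both directions.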
The template parameters and dimensions of images and white spaces associated with the template 400 can be characterized by vectors as illustrated in
On the other hand, changes to the template 400 in the y-direction are characterized by two vectors $\vec{y}_1$ and $\vec{y}_2$, each vector accounting for changes in the height of one of the two different images displayed in the image fields 402 and 404 and the white spaces 412 and 414. As shown in
As described above with reference to
$$\vec{x}_1^T\vec{\Theta} - W_1 \approx 0$$
and the following conditions in the y-direction:
$$\vec{y}_1^T\vec{\Theta} - H_1 \approx 0$$
$$\vec{y}_2^T\vec{\Theta} - H_2 \approx 0$$
where
$\vec{x}_1^T\vec{\Theta} = \Theta_{f1} w_{f1} + \Theta_{f2} w_{f2} + \Theta_{ff} + m_{w1} + m_{w2}$ is the scaled width of the images displayed in the image fields 402 and 404 and the width of the white space 410;
$W_1 = W$ is a variable corresponding to the space available for the images displayed in the image fields 402 and 404 and the white space 410 in the x-direction;
$\vec{y}_1^T\vec{\Theta} = \Theta_{f1} h_{f1} + \Theta_{fp} + \Theta_p + m_{h1} + m_{h2}$ is the sum of the scaled height of the image displayed in the image field 402 and the parameters associated with scaling the white spaces 412 and 414;
$\vec{y}_2^T\vec{\Theta} = \Theta_{f2} h_{f2} + \Theta_{fp} + \Theta_p + m_{h1} + m_{h2}$ is the sum of the scaled height of the image displayed in the image field 404 and the parameters associated with scaling the white spaces 412 and 414;
$H_1 = H - H_{p1} - H_{p2}$ is a first variable corresponding to the space available for the image displayed in the image field 402 and the widths of the white spaces 412 and 414 in the y-direction; and
$H_2 = H_1$ is a second constant corresponding to the space available for the image displayed in the image field 404 and the widths of the white spaces 412 and 414 in the y-direction.
Probabilistic methods based on Bayes' theorem described below can be used to determine the template parameters so that the conditions $\vec{x}_1^T\vec{\Theta} - W_1 \approx 0$, $\vec{y}_1^T\vec{\Theta} - H_1 \approx 0$, and $\vec{y}_2^T\vec{\Theta} - H_2 \approx 0$ are satisfied.
The template parameters and dimensions of images and white spaces associated with the template 500 can be characterized by vectors as illustrated in
On the other hand, changes to the template 500 in the y-direction are also characterized by two vectors $\vec{y}_1$ and $\vec{y}_2$. As shown in
As described above with reference to
$$\vec{x}_1^T\vec{\Theta} - W_1 \approx 0$$
$$\vec{x}_2^T\vec{\Theta} - W_2 \approx 0$$
and satisfy the following conditions in the y-direction:
$$\vec{y}_1^T\vec{\Theta} - H_1 \approx 0$$
$$\vec{y}_2^T\vec{\Theta} - H_2 \approx 0$$
where
$\vec{x}_1^T\vec{\Theta} = \Theta_{f1} w_{f1} + \Theta_{fp1} + m_{w1} + m_{w2}$ is the scaled width of the image displayed in the image field 502 and the width of the white space 512;
$W_1 = W - W_{p1}$ is a first variable corresponding to the space available for displaying an image in the image field 502 and the width of the white space 512 in the x-direction;
$\vec{x}_2^T\vec{\Theta} = \Theta_{f2} w_{f2} + \Theta_{fp2} + m_{w1} + m_{w2}$ is the scaled width of the image displayed in the image field 504 and the width of the white space 514;
$W_2 = W - W_{p2}$ is a second variable corresponding to the space available for displaying an image in the image field 504 and the width of the white space 514 in the x-direction;
$\vec{y}_1^T\vec{\Theta} = \Theta_{f1} h_{f1} + \Theta_{fp3} + \Theta_{fp4} + m_{h1} + m_{h2}$ is the sum of the scaled height of the image displayed in the image field 502 and the parameters associated with scaling the white spaces 516 and 518;
$H_1 = H - H_{p2} - H_{p3}$ is a first constant corresponding to the space available for the height of the image displayed in the image field 502 and the widths of the white spaces 516 and 518 in the y-direction;
$\vec{y}_2^T\vec{\Theta} = \Theta_{f2} h_{f2} + \Theta_{fp3} + \Theta_{fp4} + m_{h1} + m_{h2}$ is the sum of the scaled height of the image displayed in the image field 504 and the parameters associated with scaling the white spaces 516 and 518; and
$H_2 = H - H_{p1} - H_{p3}$ is a second constant corresponding to the space available for the height of the image displayed in the image field 504 and the widths of the white spaces 516 and 518 in the y-direction.
Probabilistic methods based on Bayes' theorem described below can be used to determine the template parameters so that the conditions $\vec{x}_1^T\vec{\Theta} - W_1 \approx 0$, $\vec{x}_2^T\vec{\Theta} - W_2 \approx 0$, $\vec{y}_1^T\vec{\Theta} - H_1 \approx 0$, and $\vec{y}_2^T\vec{\Theta} - H_2 \approx 0$ are satisfied.
Note that the templates 300, 400, and 500 are examples representing how the number of constants $W_i$ associated with the space available in the x-direction and corresponding vectors $\vec{x}_i$, and the number of constants $H_j$ associated with the space available in the y-direction and corresponding vectors $\vec{y}_j$, can be determined by the number of image fields and how the image fields are arranged within the template. For example, for the template 300, shown in
On the other hand, as shown in
In summary, a template is defined for a given number of images. In particular, for a template configured with m rows and n columns of image fields, there are constants $W_1, W_2, \ldots, W_m$ and corresponding vectors $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_m$ associated with the m rows, and there are constants $H_1, H_2, \ldots, H_n$ and corresponding vectors $\vec{y}_1, \vec{y}_2, \ldots, \vec{y}_n$ associated with the n columns.
Examples determine an allocation of text, figures, and references for each page of the document. An allocation corresponds to the number of lines of text nL and the number of figures nF assigned to a page. Each page allocation is characterized by a random variable Aj, where j is a non-negative integer used to identify a page of the document. For the first page of the document j equals “0.” A random variable is a function whose domain is a sample space and whose range is a set of real numbers. For example, referring to
S0={[1T;1F],[1T,2T;1F],[1T,2T,3T;1F,2F],[1T,2T,3T,4T;1F,2F]}
where each element in S0 is a bracket listing text blocks and figures that can be allocated to the first page of the document. The random variable A0 assigns a real value to each element in S0. Allocations for pages 2 through P+1 are denoted by A1 through Ap, respectively, and are similarly defined with an allocation for a subsequent page dependent upon the allocation for the previous page. Method and system examples described below determine optimal allocations A0*, A1*, . . . , Ap* for each page. For example, the optimal allocation A0* for page 1 can be the sample space element [1T, 2T, 3T; 1F, 2F].
Returning to
Once an optimal template is determined for each page of the document, an optimal set of template parameters $\vec{\Theta}_j$ associated with dimensioning and spacing template elements is determined as described below, and each page of the document is rendered. For example, returning to
Note that with the exception of the first page, allocations for subsequent pages depend on the allocation for the previous page. For example, consider once again the example allocation of text blocks and figures for the first page, [1T, 2T, 3T; 1F, 2F]. The allocation for the second page cannot also include text blocks 1T, 2T, 3T and
The relationships between allocations, templates, and parameters can be represented by a Bayesian network.
Note that the Bayesian network defines a conditional independence structure. In other words, any node is conditionally independent of its non-descendants given its parents. For nodes like T0, . . . , TP, the probabilities associated with these nodes, P(T0), . . . , P(TP), are not conditioned on any other nodes.
A joint probability distribution that characterizes the conditional probabilities of a Bayesian network is a product of the probabilities of the parent nodes and the conditional probabilities. Thus, the joint probability distribution associated with the Bayesian network 700 is given by:
As shown in
One example is predicated on maximizing $P(\{T_j\}, \{\vec{\Theta}_j\}, \{A_j\})$ with the assumption that the larger the probability $P(\{T_j\}, \{\vec{\Theta}_j\}, \{A_j\})$, the closer the document layout is to having the following desired document properties:
(1) each page of the document should look as good as possible to achieve overall optimal layout quality;
(2) text blocks that reference figures and the corresponding figures should appear on the same page; and
(3) the total number of pages is minimized.
In order to determine the sets $\{T_j\}$, $\{\vec{\Theta}_j\}$, and $\{A_j\}$ for a document that give the maximum probability $P(\{T_j\}, \{\vec{\Theta}_j\}, \{A_j\})$, a maximum joint probability distribution is defined as follows:
Equations (1), (2), and (3) are used to determine optimal allocations, templates, and template parameters using the method of “belief propagation” from Bayesian methods. For the sake of simplicity, determining the set {Aj} of optimal allocations using belief propagation is described first, followed by a description of determining an optimal template for each optimal allocation, and finally determining optimal template parameters for each template. However, in practice, optimal allocations, templates, and template parameters can also be determined simultaneously using belief propagation.
The set of allocations {Aj} that maximize equation (1) can be obtained by first determining the φ's. Each φ is a function of random variables, and is the maximum of a sequence of real numbers, one for each template Tj, as described in equation (2). Hence for each Aj and Aj-1 there is a maximizing template tj*. For the first page, φ(A0) is the maximum of the range of real values associated with the allocation A0. For subsequent pages, φ(Aj, Aj-1) is the maximum of the range of real values associated with the allocations Aj and Aj-1.
Once the φ's have been determined, a set of recursive equations denoted by τ is used to determine the optimal allocations A0*, A1*, . . . , Ap*. First, each τ is computed recursively as follows:
Next, after each of the τj's have been recursively obtained, optimal allocations A0*, A1*, . . . , Ap* can be obtained by solving the τj's in a reverse recursive manner as follows:
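The forward recursion and reverse recovery described above have the shape of a max-product (Viterbi-style) pass over a chain of pages. The sketch below assumes that form, with τ for the first page equal to φ(A0) and each later τ taking the maximum over the previous page's allocation of φ(Aj, Aj-1) times the previous τ. The recursion equations of the disclosure are not reproduced here, so the exact form, the function name `best_allocations`, the allocation labels, and all scores are hypothetical.

```python
def best_allocations(states, phi0, phi):
    """Max-product search over a chain of page allocations.

    states[j]            candidate allocations for page j
    phi0[a]              score of allocation a on the first page
    phi[j][(a, a_prev)]  score of a on page j given a_prev on page j-1
    """
    # Forward pass: tau[j][a] is the best score of any allocation sequence
    # for pages 0..j that ends with allocation a on page j.
    tau = [{a: phi0[a] for a in states[0]}]
    back = []
    for j in range(1, len(states)):
        tau_j, back_j = {}, {}
        for a in states[j]:
            scores = {p: phi[j][(a, p)] * tau[j - 1][p] for p in states[j - 1]}
            best_prev = max(scores, key=scores.get)
            tau_j[a] = scores[best_prev]
            back_j[a] = best_prev
        tau.append(tau_j)
        back.append(back_j)

    # Reverse pass: start from the best final allocation and follow the
    # stored back-pointers to recover the allocation for every page.
    last = max(tau[-1], key=tau[-1].get)
    path = [last]
    for back_j in reversed(back):
        path.append(back_j[path[-1]])
    path.reverse()
    return path, tau[-1][last]


# Hypothetical three-page example with made-up scores.
states = [["p0_few", "p0_many"], ["p1_few", "p1_many"], ["p2_rest"]]
phi0 = {"p0_few": 0.6, "p0_many": 0.4}
phi = {
    1: {("p1_few", "p0_few"): 0.1, ("p1_few", "p0_many"): 0.9,
        ("p1_many", "p0_few"): 0.5, ("p1_many", "p0_many"): 0.2},
    2: {("p2_rest", "p1_few"): 0.5, ("p2_rest", "p1_many"): 0.8},
}
path, score = best_allocations(states, phi0, phi)
print(path, score)
```

The forward pass visits each pair of neighboring allocations once, so the cost grows linearly with the number of pages rather than exponentially with the number of allocation sequences.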
Thus, optimal allocations A0*, A1*, . . . , Ap* for maximizing the probability $P^*(\{T_j\}, \{\vec{\Theta}_j\}, \{A_j\})$ have been determined.
After the set of optimal allocations has been determined, for each optimal allocation, equations (2) and (3) can be used to determine an optimal $T_j$ and $\vec{\Theta}_j$. For each Aj there is a set of Tj's. Once a φ(Aj, Aj-1) is determined, the corresponding Tj maximizes equation (2) and the corresponding template parameters $\vec{\Theta}_j$ maximize equation (3). In equation (3), $P(A_j \mid A_{j-1}, \vec{\Theta}_j)$ is the product of layout quality, reference quality, and page quality probabilities given by:
$$P(A_j \mid A_{j-1}, \vec{\Theta}_j) = P_Q(A_j \mid A_{j-1}, \vec{\Theta}_j)\,P_R(A_j \mid A_{j-1})\,P_P(A_j \mid A_{j-1})$$
The conditional probability $P_Q(A_j \mid A_{j-1}, \vec{\Theta}_j)$ associated with layout quality is determined by a document designer. The reference quality probability can be defined as follows:
PR(Aj|Aj-1) ∝ exp{−γ|RA
where γ is a reference constant assigned by the document designer, and |RA
PP(Aj|Aj-1) ∝ exp{−δ}
where δ is a page constant assigned by the document designer and corresponds to a page number penalty that is used to control the overall number of pages in the final document.
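The exponential penalties above can be combined into a single page score. The sketch below is illustrative only. The definition of the quantity inside the reference penalty is cut off in the text, so the argument `num_penalized_refs` (a non-negative count of references that incur the penalty) is an assumption, as are the function names and the numeric values.

```python
import math


def reference_quality(num_penalized_refs, gamma):
    """Unnormalized reference quality: exp(-gamma * count)."""
    return math.exp(-gamma * num_penalized_refs)


def page_quality(delta):
    """Unnormalized page quality: exp(-delta), a fixed penalty per page."""
    return math.exp(-delta)


def allocation_score(layout_quality, num_penalized_refs, gamma, delta):
    """Product of layout, reference, and page quality terms."""
    return (layout_quality
            * reference_quality(num_penalized_refs, gamma)
            * page_quality(delta))


# Hypothetical values: the same layout quality with more penalized
# references yields a lower score.
print(allocation_score(0.8, 0, gamma=0.5, delta=0.1))
print(allocation_score(0.8, 2, gamma=0.5, delta=0.1))
```

A larger δ makes every additional page more costly, which pushes the optimization toward documents with fewer pages.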
Next, a closed form equation for determining the parameter vector $\vec{\Theta}$ for each template is described. This closed form description can be obtained by considering the relationship between dimensions of elements of a template with m rows of image fields and n columns of image fields and the corresponding parameter vector $\vec{\Theta}$ in terms of Bayes' Theorem from probability theory as follows:
$$P(\vec{\Theta} \mid \vec{W}, \vec{H}, \vec{x}, \vec{y}) \propto P(\vec{W}, \vec{H}, \vec{x}, \vec{y} \mid \vec{\Theta})\,P(\vec{\Theta}) \qquad \text{Equation (1)}$$
where
$\vec{W} = [W_1, W_2, \ldots, W_m]^T$,
$\vec{H} = [H_1, H_2, \ldots, H_n]^T$,
$\vec{x} = [\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_m]^T$,
$\vec{y} = [\vec{y}_1, \vec{y}_2, \ldots, \vec{y}_n]^T$, and
the exponent T represents the transpose from matrix theory.
Vector notation is used to succinctly represent template constants $W_i$ and corresponding vectors $\vec{x}_i$ associated with the m rows and template constants $H_j$ and corresponding vectors $\vec{y}_j$ associated with the n columns of the template.
Equation (1) is in the form of Bayes' Theorem but with the normalizing probability $P(\vec{W}, \vec{H}, \vec{x}, \vec{y})$ excluded from the denominator of the right-hand side of equation (1) (e.g., see the definition of Bayes' Theorem provided in the subsection titled An Overview of Bayes' Theorem and Related Concepts from Probability Theory). As demonstrated below, the normalizing probability $P(\vec{W}, \vec{H}, \vec{x}, \vec{y})$ does not contribute to determining the template parameters $\vec{\Theta}$ that maximize the posterior probability $P(\vec{\Theta} \mid \vec{W}, \vec{H}, \vec{x}, \vec{y})$, and for this reason it can be excluded from the denominator of the right-hand side of equation (1).
In equation (1), the term $P(\vec{\Theta})$ is the prior probability associated with the parameter vector $\vec{\Theta}$ and does not take into account the occurrence of an event composed of $\vec{W}$, $\vec{H}$, $\vec{x}$, and $\vec{y}$. In certain examples, the prior probability can be characterized by a normal, or Gaussian, probability distribution given by:
where
Θ1 is a vector composed of independent mean values for the parameters set by a document designer;
Λ1 is a diagonal matrix of variances for the independent parameters set by the document designer;
$\Lambda_2 = C^T \Delta^T \Delta C$ is a non-diagonal covariance matrix for dependent parameters; and
$\vec{\Theta}_2 = \Lambda^{-1} C^T \Delta^T \Delta \vec{d}$ is a vector composed of dependent mean values for the parameters.
The matrix C and the vector $\vec{d}$ characterize the linear relationships between the parameters of the parameter vector $\vec{\Theta}$ given by $C\vec{\Theta} = \vec{d}$, and Δ is a covariance precision matrix. For example, consider the template 300 described above with reference to
0.2Θf+3.1Θp≈−1.4,and
1.8Θf−0.7Θp+1.1Θp≈−3.1
Thus, in matrix notation, these two equations can be represented as follows:
Returning to equation (1), the term $P(\vec{W}, \vec{H}, \vec{x}, \vec{y} \mid \vec{\Theta})$ is the conditional probability of an event composed of $\vec{W}$, $\vec{H}$, $\vec{x}$, and $\vec{y}$, given the occurrence of the parameters of the parameter vector $\vec{\Theta}$.
The distributions $N(W_i \mid \vec{x}_i^T\vec{\Theta}, \alpha_i^{-1})$ and $N(H_j \mid \vec{y}_j^T\vec{\Theta}, \beta_j^{-1})$ are normal probability distributions. The variables $\alpha_i^{-1}$ and $\beta_j^{-1}$ are variances. Normal distributions can be used to characterize, at least approximately, the probability distribution of a variable that tends to cluster around the mean. In other words, variables close to the mean are more likely to occur than are variables farther from the mean. The normal distributions $N(W_i \mid \vec{x}_i^T\vec{\Theta}, \alpha_i^{-1})$ and $N(H_j \mid \vec{y}_j^T\vec{\Theta}, \beta_j^{-1})$ characterize the probability distributions of the variables $W_i$ and $H_j$ about the mean values $\vec{x}_i^T\vec{\Theta}$ and $\vec{y}_j^T\vec{\Theta}$, respectively.
For the sake of discussion, consider just the distribution $N(W_i \mid \vec{x}_i^T\vec{\Theta}, \alpha_i^{-1})$.
The posterior probability $P(\vec{\Theta} \mid \vec{W}, \vec{H}, \vec{x}, \vec{y})$ can be maximized when the exponents of the normal distributions of equation (2) satisfy the following conditions:
$$\vec{x}_i^T\vec{\Theta} - W_i \approx 0 \quad \text{and} \quad \vec{y}_j^T\vec{\Theta} - H_j \approx 0$$
for all i and j. As described above, for a template, $W_i$ and $H_j$ are constants and the elements of $\vec{x}_i$ and $\vec{y}_j$ are constants. These conditions are satisfied by determining a parameter vector $\vec{\Theta}^{MAP}$ that maximizes the posterior probability $P(\vec{\Theta} \mid \vec{W}, \vec{H}, \vec{x}, \vec{y})$. The parameter vector $\vec{\Theta}^{MAP}$ can be determined by rewriting the posterior probability as a multi-variate normal distribution with a well characterized mean and variance as follows:
The parameter vector $\vec{\Theta}^{MAP}$ is the mean of the normal distribution characterization of the posterior probability $P(\vec{\Theta} \mid \vec{W}, \vec{H}, \vec{x}, \vec{y})$, and maximizes the posterior probability when $\vec{\Theta}$ equals $\vec{\Theta}^{MAP}$. Solving the posterior probability for $\vec{\Theta}$ gives the following closed form expression:
The parameter vector $\vec{\Theta}^{MAP}$ can also be rewritten in matrix form as follows:
$$\vec{\Theta}^{MAP} = A^{-1}\vec{b}$$
where
A is a matrix and $A^{-1}$ is the inverse of A, and
$\vec{b}$ is a vector.
In summary, given a single page template and images to be placed in the image fields of the template, the parameters used to scale the images and white spaces of the template can be determined from the closed form equation for $\vec{\Theta}^{MAP}$.
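A closed form of this kind can be sketched with the standard result for a Gaussian prior combined with Gaussian constraints on linear functions of the parameters. The sketch below assumes a single Gaussian prior with a diagonal precision, so the matrix is the prior precision plus the precision-weighted outer products of the constraint vectors, and the right-hand side is the precision-weighted prior mean plus the precision-weighted targets. This is a simplification and may differ from the expressions of the disclosure, which are not reproduced here. All names and numbers are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def map_parameters(prior_mean, prior_precision, constraints):
    """Posterior mean for a diagonal Gaussian prior and linear constraints.

    constraints: list of (vector, target, precision), one entry per
    x-direction or y-direction condition vector^T theta - target ~ 0.
    """
    n = len(prior_mean)
    # Start from the prior: A = diag(precision), b = precision * mean.
    A = [[prior_precision[i] if i == k else 0.0 for k in range(n)]
         for i in range(n)]
    b = [prior_precision[i] * prior_mean[i] for i in range(n)]
    # Each constraint adds precision * v v^T to A and precision * target * v to b.
    for vec, target, prec in constraints:
        for i in range(n):
            b[i] += prec * target * vec[i]
            for k in range(n):
                A[i][k] += prec * vec[i] * vec[k]
    return solve(A, b)


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


# Hypothetical single-image template; parameter order assumed to be
# [theta_f, theta_fp, theta_p, m_w1, m_w2, m_h1, m_h2].
x1 = [400.0, 0, 0, 1, 1, 0, 0]
y1 = [300.0, 1, 1, 0, 0, 1, 1]
prior_mean = [1.0, 20.0, 30.0, 50.0, 50.0, 50.0, 50.0]
prior_precision = [1.0] * 7
theta_map = map_parameters(
    prior_mean, prior_precision,
    [(x1, 480.0, 100.0), (y1, 450.0, 100.0)],
)
print(dot(x1, theta_map) - 480.0, dot(y1, theta_map) - 450.0)
```

The designer's preferred values enter through the prior mean, and the precisions control how strongly the fit to the available space outweighs those preferences.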
Once the parameters of the parameter vector $\vec{\Theta}^{MAP}$ are determined using the closed form equation for $\vec{\Theta}^{MAP}$, the template is rendered by multiplying un-scaled dimensions of the images and widths of the white spaces by corresponding parameters of the parameter vector $\vec{\Theta}^{MAP}$.
The elements of the parameter vector $\vec{\Theta}^{MAP}$ may also be subject to boundary conditions on the image fields and white space dimensions arising from the minimum width constraints for the margins. In other examples, in order to determine $\vec{\Theta}^{MAP}$ subject to boundary conditions, the vectors $\vec{x}_1$, $\vec{x}_2$, $\vec{y}_1$, and $\vec{y}_2$, the variances $\alpha_1^{-1}$, $\alpha_2^{-1}$, $\beta_1^{-1}$, and $\beta_2^{-1}$, and the constants $W_1$, $W_2$, $H_1$, and $H_2$ are inserted into the linear equation $A\vec{\Theta}^{MAP} = \vec{b}$, and the matrix equation is solved numerically for the parameter vector $\vec{\Theta}^{MAP}$ subject to the boundary conditions on the parameters of $\vec{\Theta}^{MAP}$. The matrix equation $A\vec{\Theta}^{MAP} = \vec{b}$ can be solved using any numerical method in the art for solving matrix equations subject to boundary conditions on the vector $\vec{\Theta}^{MAP}$, such as the conjugate gradient method.
Additionally, in one example,
In one example, the constraint is provided, post-process. For example, an initial layout is provided and one or more adjustments to the layout are provided in a semi-automated manner. In another example, the constraint may be provided pre-process. In yet another example, a plurality of constraints may be provided and they may be provided pre-process, post-process or a combination thereof.
Examples of the type of constraint that may be given include, but are not limited to, changing the template, resizing one or more images, manipulating white space, defining a number of pages within which the layout should fit, and the like.
The document is then rendered with text and figures allocated to each page within appropriate template fields of the template selected for each page while abiding by the constraint. If the constraint occurs post process, then one example provides the framework to accomplish these tasks by automatically reflowing the layout after receiving the constraint. The optimization may be global or fast and localized.
With reference to 904, a style sheet corresponding to the document's overall appearance is input. The style sheet may include (1) a typeface, character size, and colors for headings, text, and background; (2) format for how front matter, such as preface, figure list, and title page should appear; (3) format for how sections can be arranged in terms of space and number of columns, line spacing, margin widths on all sides, and spacing between headings just to name a few; and (4) any boilerplate content included on certain pages, such as copyright statements. The style sheet typically applies to the entire document. As necessary, specific elements of the style sheet may be overridden for particular sections of the document.
With reference now to 925 of
In one example, the constraint 925 of the document may include, but is not limited to, constraining a number of document pages Ap, constraining one or more template selections Tj, constraining one or more scalable template fields $\vec{\Theta}_j$, and the like. Additionally, in one example, there may be a plurality of constraints 925. Further, the constraints may or may not overlap in size, scope, and control.
In one example, constraining one or more template selections Tj may include limiting the template selection to a single template, a small group of templates, or removing one or more templates from the selectable group. In one example, to apply a template constraint, one or more of the non-selected highest ranked templates may be offered as options for the different template. For example, one or more of the possible template options would be provided to the user via a GUI, or the like, and the user would select the new template(s) by clicking on one or more of the possible template options. After selecting the specific template or number of templates, the document generating process would be repeated with the new template constraint. In one example, if the constraint is a specific constraint, the single page would be re-generated while the rest of the document would remain unchanged. If the constraint were a local constraint, the single page plus one or more pages before or after the single page would be re-generated while the remainder of the document would remain unchanged. If the constraint were global, then the entire document would be regenerated with the template constraint being applied for the appropriate page.
In one example, constraining one or more scalable template fields $\vec{\Theta}_j$ of the document may include constraining an image, a margin, an amount of white space, a text size, or a text font. Further, the constraint may make the scalable template fields $\vec{\Theta}_j$ larger or smaller. It is also possible that a first constraint would make a first image larger while a second constraint would make a second image smaller, or vice-versa. In other words, the technology is well suited to providing a number of scalable field constraints on a single page, on a local number of pages, or globally over the entire document.
In one example, the application of each constraint may be global to the document, a single page constraint of the document, a plurality of objects constraint, a single object constraint, a localized portion of the document, the localized portion comprising more than a single page and less than the entire document, and the like.
In one example, the global solution may restrict the choice of templates for the affected page. However, the algorithm that optimizes template parameters is still performed. In so doing, allocation to the affected page is allowed to change and allowed to propagate to all pages.
With respect to the page or local solution, one example will fix allocation for the affected page. For example, instead of performing a global reflowing, one example will optimize template parameters given the new template and the old allocation on the specific page being constrained. In other words, the template may have been changed, or an image resized, but the allocation on the constrained page will remain equal to the allocation on the previous version of the page.
A local neighborhood solution refers to fixing allocation over a small neighborhood, for example, one page before and one page after the constrained page. In one example, global optimization will be run on the neighborhood. In one example, during the running of the local neighborhood optimization, specialized first/last page behaviors will be turned off.
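The page, local neighborhood, and global scopes described above can be illustrated with a minimal sketch. The function name `pages_to_regenerate`, the scope labels, and the `radius` parameter are hypothetical and are introduced only for illustration; pages are assumed to be indexed from zero.

```python
def pages_to_regenerate(scope, page, page_count, radius=1):
    """Return the 0-based indices of the pages that would be re-generated.

    scope      -- "page", "local", or "global" (labels assumed for this sketch)
    page       -- index of the constrained page
    page_count -- number of pages in the current document
    radius     -- pages before and after the constrained page in a local scope
    """
    if not 0 <= page < page_count:
        raise ValueError("constrained page is outside the document")
    if scope == "page":
        # Only the constrained page changes; its allocation stays fixed.
        return [page]
    if scope == "local":
        # A small neighborhood around the constrained page, clipped to the document.
        first = max(0, page - radius)
        last = min(page_count - 1, page + radius)
        return list(range(first, last + 1))
    if scope == "global":
        # Changes are allowed to propagate to every page.
        return list(range(page_count))
    raise ValueError(f"unknown scope: {scope!r}")
```

For a seven-page document with the constraint on the fourth page (index 3), a local scope with a radius of one would cover indices 2 through 4, while a global scope would cover all seven pages.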
For example, the constraint may be to change the template for a given page. As shown in
In block 905, a subroutine called “determine document layout” is called. The subroutine uses an automated method for determining an optimized document layout described below with reference to
In block 906, a document with an optimized layout is output. The document layout includes an optimized allocation of text and images per page, optimized templates for each page, and optimized scaling of images and other design elements including whitespaces. In block 907, blocks 901-906 can be repeated for a different document.
In block 1003, optimal allocations A0*, A1*, . . . , Ap*, that contribute to P*({Tj}, {j}, {Aj}) are determined using reverse recursion, as described above. However, it is noted that at 1003, a check for any input constraints 925 with respect to the number of pages is performed. If no constraint is provided, then optimal allocation using reverse recursion occurs. However, in one example, if the number of pages is constrained to a specific page number, then the forward process can be stopped at the designated page number and then the reverse recursion occurs. In other words, P* would become a specific P* and the reverse recursion occurs.
In one example, if the number of pages is constrained to a range of page numbers, then the optimal allocations A0*, A1*, . . . , Ap*, that contribute to P*({Tj}, {j}, {Aj}) can be determined over the allowable range of page numbers. The forward process can then be stopped at the designated page number, after which the reverse recursion occurs.
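One way to picture a forward pass that stops at a constrained page count, followed by reverse recursion, is the following simplified dynamic-programming sketch. It is not the disclosed probability model: `page_score` is a hypothetical per-page log-score supplied by the caller, content is assumed to be a linear sequence of items, each page is assumed to hold at least one item, and `min_pages` is assumed to be at least one.

```python
import math

def allocate(n_items, page_score, min_pages, max_pages):
    """Split items 0..n_items-1 into consecutive pages.

    page_score(start, end) is a hypothetical log-score for a page that
    holds items start..end-1. Returns (page_count, boundaries), where
    boundaries[i]..boundaries[i+1] delimits the items on page i.
    """
    neg = -math.inf
    # best[p][k]: best total score placing the first k items on p pages.
    best = [[neg] * (n_items + 1) for _ in range(max_pages + 1)]
    prev = [[None] * (n_items + 1) for _ in range(max_pages + 1)]
    best[0][0] = 0.0
    # Forward pass, stopped at the largest allowed page count.
    for p in range(1, max_pages + 1):
        for k in range(1, n_items + 1):
            for j in range(k):
                if best[p - 1][j] == neg:
                    continue
                score = best[p - 1][j] + page_score(j, k)
                if score > best[p][k]:
                    best[p][k] = score
                    prev[p][k] = j
    # Choose the best page count inside the allowed range.
    feasible = [p for p in range(min_pages, max_pages + 1)
                if best[p][n_items] > neg]
    if not feasible:
        raise ValueError("no allocation satisfies the page-count constraint")
    pages = max(feasible, key=lambda p: best[p][n_items])
    # Reverse recursion: walk back from the last page to recover boundaries.
    cuts, k = [n_items], n_items
    for p in range(pages, 0, -1):
        k = prev[p][k]
        cuts.append(k)
    return pages, cuts[::-1]
```

Setting `min_pages` equal to `max_pages` corresponds to constraining the document to a specific page count, while a wider interval corresponds to a range of page numbers.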
Block 1004 is a for-loop that repeats blocks 1005-1007 for each page of the document.
At 1005, in one example, a template is determined using equation (2). However, at 1005 a check for any input constraints 925 with respect to the template type is performed. If no constraint is provided, then a template is determined using equation (2). In one example, if the choice of templates is constrained to a specific template, then 1005 would receive the template selection from input constraint 925. However, in one example, if the number of templates is constrained to a range of templates, or certain pages have template constraints thereon, then the template may be determined using equation (2) over the allowable templates on a per page basis.
In other words, the restriction of the parameters can be easily enforced by equating the upper and lower bounds of equation (2) to the desired parameter values. In one example, the same or different input constraint 925 may be provided at each page of the repeating for-loop of 1005-1007.
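As a minimal sketch of the per-page template check, assume each candidate template already carries a score (standing in here for the quantity evaluated by equation (2), which is not reproduced). The function names `select_template` and `alternatives` and the `allowed` and `excluded` arguments are hypothetical.

```python
def select_template(scores, allowed=None, excluded=()):
    """Pick the highest-scoring template that satisfies the constraints.

    scores   -- dict mapping template name to its score for this page
    allowed  -- optional collection of permitted names (None means no limit)
    excluded -- names removed from the selectable group
    """
    candidates = {
        name: score for name, score in scores.items()
        if (allowed is None or name in allowed) and name not in excluded
    }
    if not candidates:
        raise ValueError("template constraint leaves no selectable template")
    return max(candidates, key=candidates.get)


def alternatives(scores, chosen, count=3):
    """Return the highest-ranked templates other than the chosen one."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if name != chosen][:count]
```

Passing a single name in `allowed` corresponds to constraining the page to a specific template; passing several names corresponds to a range of templates; `alternatives` mirrors offering the non-selected highest-ranked templates as options.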
In block 1006, a subroutine “determine parameters” is called. This subroutine produces a set of optimized template parameters for each template determined in block 1005. However, at 1006 a check for any input constraints 925 with respect to the optimized template parameters for each template type is performed. If no constraint is provided, then optimized template parameters may be determined using the discussion provided with respect to
In other words, the restriction of the parameters can be easily enforced by equating the upper and lower bounds of equation (3) to the desired parameter values. Thus, in one example, at each page of the repeating for-loop of 1005-1007, input constraint 925 may provide information that would affect optimized template parameters for one or more objects or white space on the given template page.
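The idea of enforcing a restriction by equating upper and lower bounds can be sketched as follows. The parameter names, the dictionary layout, and the helpers `apply_constraints` and `clamp` are assumptions made for illustration, and the simple clamping step stands in for whatever bounded optimization is actually used with equation (3).

```python
def apply_constraints(bounds, fixed):
    """Return a copy of bounds in which each fixed parameter has lower == upper.

    bounds -- dict mapping parameter name to a (lower, upper) pair
    fixed  -- dict mapping parameter name to the value required by a constraint
    """
    pinned = dict(bounds)
    for name, value in fixed.items():
        if name not in pinned:
            raise KeyError(f"unknown template parameter: {name}")
        # Equating the bounds leaves the optimizer exactly one admissible value.
        pinned[name] = (value, value)
    return pinned


def clamp(values, bounds):
    """Project proposed parameter values into their (lower, upper) intervals."""
    return {
        name: min(max(value, bounds[name][0]), bounds[name][1])
        for name, value in values.items()
    }
```

For example, pinning a hypothetical `image_scale` parameter to 0.8 forces any proposed value for that parameter to 0.8, while unconstrained parameters remain free to vary within their original bounds.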
In block 1007, a page is rendered with the optimized template and corresponding template parameters. The template page can be rendered by exhibiting the page on a monitor, television set, or any other suitable display, or the template page can be rendered by printing the page on a sheet of paper. In block 1008, when the document includes another page, blocks 1005-1007 are repeated; otherwise, the subroutine returns to “determine document layout” in
Elements of the parameter vector MAP can be determined by solving the matrix equation AMAP= for MAP using the conjugate gradient method or any other matrix equation solver known in the art, where the elements of the vector MAP are subject to boundary conditions, such as minimum constraints placed on the margins. In block 1105, once the parameter vector is determined, rescaled dimensions of the block objects and widths of the white spaces can be obtained by multiplying dimensions of the template elements by the corresponding parameters of the parameter vector MAP. In block 1106, the subroutine returns to the subroutine “determine document layout” of
In one example, the constraints may be provided prior to an initial document layout. For example, any or all of the number of pages, or range of pages for the document; the pool of templates from which one or more pages of the document may be selected; and one or more image scaling constraints may be provided.
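For illustration only, a textbook conjugate gradient solver for a symmetric positive-definite system can be sketched as below. The right-hand side is written as `b`, which is a name assumed for this sketch, and the boundary conditions mentioned above are not handled here. The helper `rescale` shows the multiplication of template dimensions by the solved parameters.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A (list of rows)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # initial search direction
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:          # residual norm small enough: converged
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x


def rescale(dimensions, parameters):
    """Multiply each template dimension by its corresponding parameter."""
    return [d * s for d, s in zip(dimensions, parameters)]
```

In practice a library solver would typically be used, and bound constraints such as minimum margins would require a bounded variant or a projection step that this sketch omits.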
In another example, the constraints may be provided after a document has been generated. For example, after viewing the generated document, a user may adjust an image, for example, by selecting a corner of an image on a GUI and then dragging the corner to rescale the image. In another example, one or more pages of a generated document may be selected by the user and a template change may be invoked. In the following example, a single page of the document being changed is described for clarity; however, the change may also be performed on a number of document pages or for the entire document. Moreover, if the template changes are provided to more than one page in the document, each page that is changed in the document may be changed to the same new template, changed to one of a selected plurality of templates, changed to a different template, or any combination thereof.
In one example, invoking a template change may include the user designating the page of the document to be changed, such as via a GUI or the like. In one example, when a change to the template is invoked, one or more of the non-selected highest ranked templates may be offered as options for the template change. For example, one or more of the possible template options would be provided to the user via a GUI, or the like, and the user would select the new template(s) by clicking on one or more of the possible template options. After selecting the specific template or number of templates, the document generating process would be repeated with the new template constraint. In one example, if the constraint is a specific constraint, the single page would be re-generated while the rest of the document would remain unchanged. Similarly, if the constraint were a local constraint, the single page plus one or more pages before or after the single page would be re-generated while the remainder of the document would remain unchanged. If the constraint were global, then the entire document would be regenerated with the template constraint being applied for the appropriate page.
In one example, after viewing the generated document, the user may select a different page number. For example, if the generated document had 7 pages, the user may invoke a constraint on the document to reduce or increase the total page count. For example, the user may constrain the document to 5 pages, 8 pages, 6 pages or less, etc. After receiving the page number constraint the document would be regenerated with the page number constraint being applied.
Although the post document generation constraints are described in single case examples, e.g., a template change, a page number change, and an image resizing change, more than one of the constraints may be selected. For example, a user may constrain the page count of a generated document and also resize an image within the document. The document would then be re-generated while adhering to both the page constraint and the image resize constraint. In other words, the constraints that may be applied are not limited to a single type of constraint, but may include any number of constraints.
Additionally, the constraints may include pre-document generation constraints as well as post document generation constraints. For example, prior to the first generation of the document a constraint such as image size may have been provided. After the document was generated and provided, a page number constraint may be introduced. The re-generated document would then include both the image size constraint as well as the page number constraint.
In one example, the constraints may be changed or removed between document revisions. Utilizing the above example, the re-generated document included an original image size constraint and a later added page number constraint. Upon review of the re-generated document, the user may remove the image size constraint. The document would then be re-generated with the page number constraint but without the image size constraint. This process could continue for any number of iterations, which may include any number of constraints, changes to constraints, removal of constraints, and the like.
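The accumulation, change, and removal of constraints across revisions can be sketched with a small container. The class name `ConstraintSet`, its methods, and the constraint keys are hypothetical; the actual re-generation step is outside the scope of this sketch.

```python
class ConstraintSet:
    """Tracks the constraints in force for the next document generation."""

    def __init__(self):
        self._items = {}

    def set(self, name, value):
        """Add a constraint, or change the value of an existing one."""
        self._items[name] = value

    def remove(self, name):
        """Drop a constraint; removing an absent constraint is a no-op."""
        self._items.pop(name, None)

    def active(self):
        """Return a copy of the constraints to apply on re-generation."""
        return dict(self._items)


constraints = ConstraintSet()
constraints.set("image_size", 0.8)     # provided before the first generation
first_pass = constraints.active()
constraints.set("page_count", 5)       # added after viewing the document
second_pass = constraints.active()
constraints.remove("image_size")       # removed after the second review
third_pass = constraints.active()
```

Each snapshot returned by `active` corresponds to the set of constraints that one generation pass would have to honor.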
With reference to
In this example, computer system 1200 includes an address/data bus 1201 for conveying digital information between the various components, a central processor unit (CPU) 1202 for processing the digital information and instructions, a volatile main memory 1203 comprised of volatile random access memory (RAM) for storing the digital information and instructions, and a non-volatile read only memory (ROM) 1204 for storing information and instructions of a more permanent nature. In addition, computer system 1200 may also include a data storage device 1205 (e.g., a magnetic, optical, floppy, or tape drive or the like) for storing vast amounts of data. It should be noted that the software program for creating an editable template from a document image can be stored either in volatile memory 1203, data storage device 1205, or in an external storage device (not shown).
Devices which can be coupled to computer system 1200 include a display device 1206 for displaying information to a computer user, an alpha-numeric input device 1207 (e.g., a keyboard), and a cursor control device 1208 (e.g., mouse, trackball, light pen, etc.) for inputting data, selections, updates, etc. Computer system 1200 can also include a mechanism for emitting an audible signal (not shown).
Returning still to
Furthermore, computer system 1200 can include an input/output (I/O) signal unit (e.g., interface) 1209 for interfacing with a peripheral device 1210 (e.g., a computer network, modem, mass storage device, etc.). Accordingly, computer system 1200 may be coupled in a network, such as a client/server environment, whereby a number of clients (e.g., personal computers, workstations, portable computers, minicomputers, terminals, etc.) are used to run processes for performing desired tasks. In particular, computer system 1200 can be coupled in a system for creating an editable template from a document.
A number of embodiments of the present invention are thus described. While the present invention has been described in particular embodiments, it should be appreciated that the present invention should not be construed as limited by such embodiments, but rather construed according to the following claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US10/28147 | 3/22/2010 | WO | 00 | 9/13/2012 |