1. Field of the Invention
This invention relates to systems and methods for filling in digital form documents, and more particularly to systems and methods for interactive, user-driven detection, creation and completion of fillable form fields in digital documents.
2. Description of the Related Art
Filling in digital form documents with fixed form fields that do not embed Form Definition Format (FDF) data typically requires users to print the documents, fill them out by hand, and scan them back into digital form. Alternatively, users could import the document into document editing software, such as Adobe Acrobat® (Adobe Systems Incorporated, San Jose, Calif.), which uses the Portable Document Format (PDF), and carefully overlay text boxes, checkmarks and other characters or symbols over the appropriate locations on the document pages.
Even digital documents in which all form fields can be edited pose problems. Users editing a document with form fields in word processing software must be careful to press the "insert" key (to type in overwrite mode) when completing the form fields, or they risk destroying the format and content of the form document. As a result, even filling in a form in an editable document can be difficult.
Finally, even form-fillable PDF documents, such as that illustrated in
Automatically detecting form-field locations and types is also error prone. Acrobat®'s own "Automatic Form Recognition" feature still requires several steps to accurately create and fill in a form. Furthermore, the tool and its user interface were designed for form publishers to add FDF to their existing documents, not as a way for end users to create their own form fields and then complete a form.
Systems and methods described herein provide interactive, user-driven detection, creation and completion of fillable form fields in digital documents in a single, fluid process. A document with form fields that require completion by a user is received, after which form fields are detected at the direction of the user. Once the user selects a possible form field, the system creates the appropriate fillable form field based on size, type, location, related text and other parameters of the form field and surrounding document. Additional levels of interaction include predictive text, pattern development and automatic completion of previously completed fields.
In one aspect of the invention, a system for detecting and creating fillable form fields in a digital document comprises an input unit which receives input from a user on the location of at least one form field in a digital document; an identification unit which identifies the properties of the at least one form field; a classification unit which classifies the at least one form field in the digital document; and a generation unit which generates a fillable form field at the location of the at least one form field.
In a further aspect, the properties of the at least one form field include the location, size and shape.
In another aspect, the properties of the at least one form field are determined using a boundary search initiated from the location input by the user.
In a yet further aspect, the at least one form field may be classified as a text box, a multi-character text box, a check box or a radio button.
In still another aspect, the classification unit classifies the at least one form field based on text adjacent to the at least one form field.
In a further aspect, the classification unit further classifies the text box based on the content of text to be entered into the fillable form field.
In another aspect, the generation unit provides options for data to be entered into a text box based on the content of the text to be entered.
In still another aspect, the generation unit generates additional fillable form fields in additional locations in the digital document based on the identification and determination of a previous form field.
In a further aspect, the digital document is an image file.
In a further aspect, the fillable form field is created using HTML.
In a still further aspect, the system is a web-based application accessible using an Internet browser.
In another aspect, the user selects the digital document for detecting and completing of the form fields by inputting a uniform resource locator (URL) address corresponding to the location of the digital document.
In a further aspect, the identification unit identifies a first form field on a first page of a multi-page digital document and subsequently identifies identical form fields on additional pages of a multi-page digital document, and wherein the generation unit populates the identical form fields with the data entered by the user in the first form field on the first page.
In a yet further aspect, the identical form fields are highlighted.
In a still further aspect, the information on the fillable form fields generated for a particular digital document is stored for future use with similar digital documents.
In another aspect of the invention, a method for detecting and creating fillable form fields in a digital document comprises receiving an input from a user on the location of at least one form field in a digital document; identifying the properties of the at least one form field; classifying the at least one form field in the digital document; and generating a fillable form field at the location of the at least one form field.
In a further aspect, the method further comprises inputting data into the at least one fillable form field.
In another aspect, the properties of the at least one form field include the location, size and shape.
In a yet further aspect, the at least one form field is classified as a text box, a multi-character text box, a check box or a radio button.
In a still further aspect, the at least one form field is classified based on text adjacent to the at least one form field.
In yet another aspect of the invention, a computer program product for detecting and creating fillable form fields in a digital document is embodied on a computer readable medium and when executed by a computer, performs the method comprising receiving an input from a user on the location of at least one form field in a digital document; identifying the location of the at least one form field; determining the characteristics of the at least one form field in the digital document; and generating a fillable form field at the location of the at least one form field.
Additional aspects related to the invention will be set forth in part in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. Aspects of the invention may be realized and attained by means of the elements and combinations of various elements and aspects particularly pointed out in the following detailed description and the appended claims.
It is to be understood that both the foregoing and the following descriptions are exemplary and explanatory only and are not intended to limit the claimed invention or application thereof in any manner whatsoever.
The accompanying drawings, which are incorporated in and constitute a part of this specification, exemplify the embodiments of the present invention and, together with the description, serve to explain and illustrate principles of the invention. Specifically:
In the following detailed description, reference will be made to the accompanying drawings. The aforementioned accompanying drawings show by way of illustration, and not by way of limitation, specific embodiments and implementations consistent with principles of the present invention.
The systems and methods disclosed herein provide, in one embodiment, an application for viewing a digital document, where each page of a digital document is shown as an image over which users can seamlessly type text, enter checkmarks, select radio buttons, and enter other characters and symbols into form fields, even though the form fields are not predefined in the document image. The application may be web-based, wherein a user can simply upload a digital document to a server on a network which runs the form-filling application. The user may also operate the application within an Internet browser and simply enter the website address of a web-based document, which is then scanned into the system for identifying and creating fillable form fields.
In one embodiment, as illustrated in
An example of a digital document with form fields is shown in
The system also applies previous user interactions to detect other form fields. For example, when a checkbox 124 is identified, the same pattern is searched for in the rest of the document; users can then simply hit "TAB" to move to the next form field for improved efficiency.
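As one illustration of how such a pattern search might be implemented, the sketch below uses template matching, which is an assumption for illustration only; the description above does not specify the matching technique. The image paths, patch coordinates, and the 0.85 threshold are likewise illustrative.

```python
# Hypothetical sketch: find other occurrences of an identified checkbox by
# template matching (OpenCV). Paths, names and the 0.85 threshold are
# illustrative assumptions, not part of the described system.
import cv2
import numpy as np

def find_similar_checkboxes(page_img, checkbox_patch, threshold=0.85):
    """Return (x, y) locations on the page image that closely match the
    pixel pattern of a checkbox the user has already identified."""
    gray_page = cv2.cvtColor(page_img, cv2.COLOR_BGR2GRAY)
    gray_patch = cv2.cvtColor(checkbox_patch, cv2.COLOR_BGR2GRAY)
    scores = cv2.matchTemplate(gray_page, gray_patch, cv2.TM_CCOEFF_NORMED)
    ys, xs = np.where(scores >= threshold)
    return list(zip(xs.tolist(), ys.tolist()))

# Example usage (coordinates are hypothetical):
# page = cv2.imread("page-001.png")
# patch = page[120:136, 80:96]          # pixels of the checkbox the user clicked
# candidates = find_similar_checkboxes(page, patch)
```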
The system also allows seamless editing, where users can select the first single-character box 122 of a multiple single-character form field and keep typing. The characters will appear in the next box automatically. If the user clicks on a box that was already filled, the cursor appears at that position, allowing users to add, backspace or delete characters as in a normal text field. Text alignment in table cells is also automatically set based on the layout of header cells.
The system is also able to recognize multiple single-character boxes and groups of radio buttons based on proximity and on nearby textual content, even if the options look like checkboxes (e.g., [ ] Yes [ ] No).
In another embodiment, the system suggests useful form-completions for fields, for example date/time pickers and place/state/country drop-down menus. The system can also restrict the type of content (e.g. alpha or numeric) to be input (e.g. only digits if followed by % or preceded by $).
The system also stores previous interactions on a given document to benefit others who might need to fill in similar documents. For example, information on the fillable form fields generated for one particular document is stored for future use with a similar document. By storing interactions, the system becomes better at automatically detecting form fields.
In one embodiment, the system converts any document or web page into an image file, and then uses HTML to create the fillable form fields in the appropriate sections, as will be further discussed below.
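Before turning to the rendering details described next, the following minimal sketch shows one way such an HTML overlay could be emitted for a detected field: the page image serves as the background of a positioned container, and an absolutely positioned input element is placed over the field's pixel rectangle. The function names, dictionary keys, and markup details are hypothetical.

```python
# Hypothetical sketch of overlaying an HTML <input> on a page image at the
# pixel rectangle of a detected form field. Names are illustrative only.
def field_to_html(field_id, x, y, width, height, field_type="text"):
    """Emit an absolutely positioned form element for one detected field."""
    if field_type == "checkbox":
        tag = f'<input type="checkbox" id="{field_id}"'
    else:
        tag = f'<input type="text" id="{field_id}"'
    style = (f'position:absolute; left:{x}px; top:{y}px; '
             f'width:{width}px; height:{height}px;')
    return f'{tag} style="{style}">'

def page_to_html(image_url, fields):
    """Wrap the page image and its field overlays in a positioned container."""
    inputs = "\n".join(field_to_html(**f) for f in fields)
    return (f'<div style="position:relative">'
            f'<img src="{image_url}">\n{inputs}\n</div>')
```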
An input document (PDF, Word, PowerPoint, image file) is rendered into page images using available tools such as Ghostscript (www.ghostscript.com) or XPDF (www.foolabs.com/xpdf) (converting, for example, PDF to JPEG or PDF to PNG). A PowerPoint slide could be exported as an image as well, using OpenOffice™ (www.openoffice.org) or the Microsoft® Office Suite (Microsoft Corporation, Redmond, Wash.). Images are shown to the user. When the user clicks a point (x, y) on the image, the system determines the corresponding form-field type and its extent. The user can immediately start typing text in a text-based fillable form field, or the system automatically adds the appropriate mark (e.g. radio button selected or unselected, checkbox checked or unchecked, option circled or not circled).
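A minimal rendering sketch, assuming Ghostscript is installed, is shown below; the 150 dpi resolution and the output naming pattern are arbitrary choices for illustration. An equivalent pdftoppm invocation from the XPDF/Poppler tools is noted in a comment.

```python
# Minimal sketch: render a PDF into per-page PNG images with Ghostscript.
# The 150 dpi resolution and output naming pattern are arbitrary choices.
import subprocess

def render_pdf_to_pngs(pdf_path, out_pattern="page-%03d.png", dpi=150):
    subprocess.run(
        ["gs", "-dNOPAUSE", "-dBATCH", "-dSAFER",
         "-sDEVICE=png16m", f"-r{dpi}",
         f"-sOutputFile={out_pattern}", pdf_path],
        check=True)

# An equivalent alternative using pdftoppm:
#   pdftoppm -png -r 150 input.pdf page
```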
From a page image and a user-selected location, the system determines the properties of the form field, such as its location, extent, and type, for example: 1) a closed box, 2) a box open on the top, 3) a line underneath, or 4) a circle.
A difficulty with general form recognition is coverage of the many different types of forms. Here, however, only a limited set of object types needs to be recognized. The system relies on several image processing steps, including optical character recognition (OCR), line and line-crossing finding, and colored region finding. For OCR, there are a number of commercial systems, e.g., ABBYY (www.abbyy.com), Microsoft® Office Document Imaging (http://office.microsoft.com/en-us/help/about-microsoft-office-document-imaging-HP001077103.aspx), and OCRopus™ (code.google.com/p/ocropus/). Line finding can be performed using edge detection followed by a Hough transform, as described in R. Duda and P. Hart, "Use of the Hough transformation to detect lines and curves in pictures," Comm. ACM, Vol. 15, No. 1, pp. 11-15 (1972). Because forms generally contain horizontal and/or vertical lines rather than lines at other orientations (assuming minimal skew), a simpler approach is to follow the "black" pixels horizontally or vertically across a page, allowing for slight "jogs." For colored region finding, by limiting the search to regions with the same pixel values (or the same average pixel values in a small window), the system can identify the extent of colored regions. In one embodiment, a preprocessing step can also include skew detection; any deskewing algorithm (e.g., that disclosed in Yang Cao, Shuhua Wang and Heng Li, "Skew detection and correction in document images based on straight-line fitting," Pattern Recognition Letters, Vol. 24, No. 12, pp. 1871-1879 (2003)) can be used to deskew a scanned page prior to use of the system.
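A sketch of the simpler "follow the black pixels" line finder is shown below, operating on a binarized page. The jog tolerance and minimum line length are illustrative parameters, not values taken from the description.

```python
# Sketch of the "follow the black pixels" line finder described above,
# operating on a binarized page image (2-D boolean numpy array, True = ink).
# The jog tolerance and minimum line length are illustrative parameters.
import numpy as np

def find_horizontal_lines(binary, min_length=40, jog=1):
    """Trace runs of black pixels from left to right, allowing the run to
    drift up or down by at most `jog` rows, and return (row, x_start, x_end)
    tuples for runs at least `min_length` pixels long."""
    h, w = binary.shape
    visited = np.zeros_like(binary, dtype=bool)
    lines = []
    for y in range(h):
        x = 0
        while x < w:
            if binary[y, x] and not visited[y, x]:
                visited[y, x] = True
                row, x0, x1 = y, x, x
                while x1 + 1 < w:
                    # continue on the same row, or jog slightly up/down
                    candidates = [row + d for d in range(-jog, jog + 1)
                                  if 0 <= row + d < h and binary[row + d, x1 + 1]]
                    if not candidates:
                        break
                    row = candidates[0]
                    x1 += 1
                    visited[row, x1] = True
                if x1 - x0 + 1 >= min_length:
                    lines.append((y, x0, x1))
                x = x1 + 1
            else:
                x += 1
    return lines
```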
In one embodiment, if the system is not correctly identifying the desired region, the user can invoke a fall-back, or default mode where a rectangular region is swept using the mouse. The region is shown in the viewer and the user can type inside the identified rectangular region. The corners of the region can also be adjusted similarly to those found in traditional graphical tools.
Some forms may be colored or have shading to distinguish form fields. For example, the lines or columns defining the boxes may be colored or shaded, as illustrated by the shaded lines 126 in
In the embodiment illustrated in
In
In
In a form field with multiple single-character fields 144, as illustrated in
In one embodiment, multiple single-box fields may be detected after the user clicks on any box. However, if nothing has been entered in any of the boxes, the cursor is automatically positioned at the first box 146.
In one embodiment, the system tries to find more fillable form fields below and above the currently detected line. If text is found on the left of the next line below the current line, the system considers the next line as a different form field, presumably because the text represents a different form category, as illustrated with the “Name” 148 and “Email Address” 150 text in
Common field names such as “Name” 148 or “Email Address” 150 can benefit from the auto-completion features already stored by an Internet browser's auto-complete list. In
The system can detect text fields even when the box 158 is open on the bottom, as illustrated in
When detecting a form field known as a “radio button” 160, as shown in
In
In one embodiment, the system can automatically restrict the type of characters that can be input into a fillable form field based on text found before or after the field. As illustrated in
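A small sketch of how such nearby text might drive an input restriction is shown below; the rule set and function name are illustrative assumptions and not part of the described system.

```python
# Hypothetical sketch: choose an input restriction for a detected field from
# the text immediately before and after it. The rules shown are examples only.
import re

def input_restriction(text_before, text_after):
    """Return 'digits' when the context suggests a numeric field
    (e.g. preceded by '$' or followed by '%'), otherwise 'any'."""
    if text_before.rstrip().endswith("$") or text_after.lstrip().startswith("%"):
        return "digits"
    if re.search(r"\b(phone|zip|ssn)\b", (text_before + text_after).lower()):
        return "digits"
    return "any"

# input_restriction("Amount: $", "")   -> 'digits'
# input_restriction("Discount ", "%")  -> 'digits'
# input_restriction("Name:", "")       -> 'any'
```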
Also, text justification in a table cell is automatically set to the same justification present in the header (left/center/right). In tables 176, as shown in
In another embodiment, common field formats and data patterns are recognized, as shown in
In a further embodiment, the system is also able to recognize identical form fields across multiple pages of a multi-page document. The system may highlight the identical fields with a specific color or shading pattern, or the system may fill in data from the first completed field in the subsequent identical fields so that the user does not have to enter the same data on multiple pages. This situation may occur with data such as dates or Social Security Numbers which often appear on multiple pages of a document. If the system enters the data in subsequent identical fields for the user, the system may still alert the user to the pre-populated data through a message or by highlighting the identical fields with specific colors or shading patterns.
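One way such cross-page propagation could be implemented is sketched below; the field representation (dictionaries with label, position, size, and value keys) and the pixel tolerance used for matching are assumptions made for illustration.

```python
# Sketch of propagating a value entered on one page into matching fields on
# later pages. Fields are treated as dicts with hypothetical keys; matching
# by label text plus approximate geometry is an assumption for illustration.
def propagate_value(filled_field, other_pages_fields, tolerance=5):
    """Copy the value of `filled_field` into fields on other pages that share
    the same label and roughly the same position and size; mark them so the
    user can be alerted to the pre-populated data."""
    matches = []
    for field in other_pages_fields:
        same_label = field.get("label") == filled_field.get("label")
        same_geometry = all(
            abs(field[k] - filled_field[k]) <= tolerance
            for k in ("x", "y", "width", "height"))
        if same_label and same_geometry:
            field["value"] = filled_field["value"]
            field["highlight"] = True   # alert the user to pre-filled data
            matches.append(field)
    return matches
```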
In another embodiment, the system uses auto-complete features such as drop-down menus and widgets in order to suggest entries in the fillable form fields to the user. In
Date pickers can also be added over multiple single-box fields if, for example, 6 or 8 boxes are detected and/or the nearby text reads "date." The 6 or 8 boxes would then be determined to correspond to a date field in month/day/year format (MM/DD/YY or MM/DD/YYYY). Similar treatment applies to other field types, such as 2-box fields near "state" text.
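A minimal sketch of this heuristic is shown below; the return labels are illustrative only.

```python
# Sketch of the heuristic above: a run of 6 or 8 single-character boxes near
# the word "date" is treated as MM/DD/YY or MM/DD/YYYY, and a 2-box run near
# "state" as a state field. Return values are illustrative labels.
def classify_box_group(box_count, nearby_text):
    text = nearby_text.lower()
    if box_count in (6, 8) and "date" in text:
        return "date:MM/DD/YY" if box_count == 6 else "date:MM/DD/YYYY"
    if box_count == 2 and "state" in text:
        return "state"
    return "generic"

# classify_box_group(8, "Date of birth")  -> 'date:MM/DD/YYYY'
# classify_box_group(2, "State")          -> 'state'
```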
Text fields 194 may occur inside a box 196, as illustrated in
In one embodiment, to identify and classify the extent of the form field, the system searches for the boundaries of the form field starting with a user-selected point 198, as illustrated in
In one embodiment, the method of identifying a form field, or element, starts from a raster image of the form page in question and the position and content of the text on that page. The first step, identifying the extent of the form element and classifying it, can be performed as follows:
1) The user selects a point within the desired form field.
2) If the user selection is within a text box where text already exists, the system interprets the form field to be an “option selection” form field. The pre-existing text is selected or circled and processing stops. In cases where the same text is re-selected, the selection/circling would toggle between a selected and unselected state.
3) Using the region-growing methodology, the color of the document background at the user-selected point is used as the seed from which to grow the region that is to become the fillable form field. Alternatively, the background color of the document (or region) may already be determined, in which case the closest background point is used as the seed. This frees the user from positioning errors on forms with small checkboxes.
4) From the user-selected point, the boundaries of the field are found by searching in each direction for an edge or boundary, for example by using the region-growing methodology to find a color significantly different from that of the initial point.
5) The search is performed subject to a maximum reasonable extent, wherein the maximum reasonable extent is determined based on the size of the page and/or size of the text on the page. For example, the extent of the vertical search in
6) In form fields which are text boxes, the baseline 206 of the form must also be analyzed to determine the internal and external boundaries of the form field, as illustrated in
7) In top and bottom heuristics, if the sides of the field are bounded by text boxes or lines with limited extent (as in
8) In one embodiment, the field type (text entry, character box, checkbox) can be determined by the identification unit from the size, shape, and boundary nature of the detected element. The characteristics of the presumed form field may include: the nature of each boundary (i.e., text box boundary, line boundary, lip boundary, or nothing (limit)); the connectedness of the boundary; the width, height, and aspect of the region; and the presence of text (see step 2 above). An example of a set of rules based on these attributes includes: a) if width<W and height<H and the form field is fully bounded, then the form field is a checkbox; b) if width<W and height<H and the form field is bounded only on the sides, then the form field is a parentheses-style checkbox; c) if height>=MinTextHeight and aspect>MinTextAspect, then the form field is a text box; and d) if height>=MinTextHeight and width<MaxCharboxWidth and the form field has a lip, then the form field is a character box. A non-limiting sketch combining this rule-based classification with the boundary search of steps 3) through 5) is provided following this list.
9) In one embodiment, the semantic attributes of the field (date, name, etc.) may be determined by finding the closest text regions. "Closeness" in this context may include both Euclidean distance and graphical distance. For instance, if an interactively-determined form field region is in the same connected component as a text box, it would have distance=0. In addition, horizontal distance may be counted less strongly than vertical distance in assigning text to a field. Also, the predominant direction of the language in use can influence the "closeness." In Western, left-to-right languages, text to the left of the detected field can be considered to have more influence over the semantic attributes of the detected form field region than text to the right of the detected field.
10) For repeated elements, like the character boxes 222 illustrated in
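As a concrete, non-limiting sketch of steps 3) through 5) and step 8), the following code grows a region of background-colored pixels outward from the clicked point, bounds the search by a maximum reasonable extent, and then applies size and shape rules of the kind listed in step 8). All thresholds and constants are illustrative assumptions rather than required values.

```python
# Non-limiting sketch of steps 3)-5) and 8): grow a region of background-
# colored pixels outward from the clicked point (bounded by a maximum
# reasonable extent), then classify the resulting rectangle with simple
# size/shape rules. All thresholds and constants are illustrative only.
from collections import deque
import numpy as np

def grow_field_region(img, seed_xy, color_tol=25, max_extent=(400, 60)):
    """Breadth-first region grow over an H x W x 3 page image from the
    user-selected point; returns the bounding box (x0, y0, x1, y1)."""
    h, w = img.shape[:2]
    sx, sy = seed_xy
    seed_color = img[sy, sx].astype(int)
    max_w, max_h = max_extent
    seen = {(sx, sy)}
    queue = deque([(sx, sy)])
    x0 = x1 = sx
    y0 = y1 = sy
    while queue:
        x, y = queue.popleft()
        x0, x1 = min(x0, x), max(x1, x)
        y0, y1 = min(y0, y), max(y1, y)
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < w and 0 <= ny < h and (nx, ny) not in seen
                    and abs(nx - sx) <= max_w and abs(ny - sy) <= max_h
                    and np.abs(img[ny, nx].astype(int) - seed_color).max() <= color_tol):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return x0, y0, x1, y1

def classify_field(width, height, fully_bounded, bounded_sides_only, has_lip,
                   W=30, H=30, MIN_TEXT_HEIGHT=12, MIN_TEXT_ASPECT=3,
                   MAX_CHARBOX_WIDTH=40):
    """Rule set of the kind described in step 8); constants are placeholders."""
    if width < W and height < H and fully_bounded:
        return "checkbox"
    if width < W and height < H and bounded_sides_only:
        return "parentheses-style checkbox"
    if height >= MIN_TEXT_HEIGHT and width < MAX_CHARBOX_WIDTH and has_lip:
        return "character box"
    if height >= MIN_TEXT_HEIGHT and width / max(height, 1) > MIN_TEXT_ASPECT:
        return "text box"
    return "unknown"
```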
The embodiments and implementations described above are presented in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other implementations may be utilized and that structural changes and/or substitutions of various elements may be made without departing from the scope and spirit of the present invention. The foregoing detailed description is, therefore, not to be construed in a limiting sense. Additionally, the various embodiments of the invention as described may be implemented in the form of software running on a general purpose computer, in the form of specialized hardware, or in a combination of software and hardware.