The invention relates to the reproduction of documents into a requested form. The forms can include print, audio, Braille or an electronic file. It also relates to the distribution of such documents over electronic networks, and remote reproduction. The documents can be either large or small in size.
Print Form Documents
Currently many documents are transmitted on paper, usually via post. One particularly common form of document is the invoice. It is expensive for companies to print and post invoices. When invoices are received, they must be opened, paid and sorted, and often information from the invoice must be entered into a computer. This is expensive for customers. Often customers cannot read the invoice they are sent because they are blind, the type is too small, they have a disability, or it is written in a language they cannot read. This problem extends to several other kinds of document, including bank statements, credit card statements, legal documents and letters.
Commercial computer networks, such as the Internet, have been used as a means of facilitating ordering of books and other reading material by consumers. This is typically achieved by presenting a web site-based user interface to consumers to allow them to order reading material such as books. One example of this is the website Amazon.com. However, the reading material that can be purchased by users of these systems is the same as that offered by a traditional book store. That is, each item of reading material is usually offered in only one format. Further, users must wait whilst the reading material they ordered is retrieved from a warehouse and shipped to them.
Electronic Form Documents
The distribution of electronic documents is generally known, and is described, for example, in International Publication No. WO 00/72235 A1 (Silverbrook Research Pty Ltd, 30 Nov. 2000). Silverbrook describes text being formatted in the Extensible Markup Language (XML) using the Extensible Stylesheet Language (XSL).
Audio Form Documents
Digital Talking Books (DTBs) are one type of audio form document. DTBs are sufficiently well known that technical standards apply to them. One such standard is ANSI/NISO Z39.86-2002 “Specifications for the Digital Talking Book”, published in 2002 by the US National Information Standards Organisation, Bethesda, Md. 20814 (ISBN: 1-880124-52-1). The Z39.86 Standard deals with many aspects of DTBs, including the DTB package file, content format for text, audio file formats, image file formats, synchronisation of media files, navigation control files, portable bookmarks and highlights, resource file, packaging files for distribution and presentation files.
The Z39.86 Standard owes much to the work done by the DAISY Consortium. The DAISY 2.0 specification is based on HTML, and version 2.01, published in February 2001 (www.daisy.org/publication/specifications/daisy
Braille Form Documents
Braille characters are made up of up to six raised impressions in two columns of three impressions. Braille characters are approximately 28 point and are always the same size, and the horizontal space between characters is constant. Letters are mapped to the Braille codes, and this form of Braille is called Grade 1 Braille. Grade 2 Braille has contractions applied to words to make Braille documents smaller and quicker to read. In English there are different contraction rules in the US, the UK and Australia, and there is now a new standard, Universal English Braille Code, which is a fourth set of rules. Many of the rules are the same. In, say, German, the mapping of letters to Braille codes and the contractions may be so different that a German Braille reader who can speak both English and German may not be able to read English Braille. Images in documents need to be described in words, which generally requires additional information to be added. In addition, some graphical information can be provided by Brailled images. A map of Australia can be Brailled, so that the outline of Australia is shown as a series of raised dots on paper that a blind person can feel.
A need exists, however, for the reproduction and electronic distribution of a wide variety of documents in a chosen one of a number of available forms.
The invention generally provides computer programs, methods and computer apparatus/systems for reproducing a requested source document in a requested one of available forms. Additionally, requested documents can be provided in requested formats, and be navigable.
For each one of a plurality of documents: at least one access pathway is applied to a marked-up form of the document, the access pathways defining discrete parts of the document. A fragment of the marked-up document is generated for each said access pathway for each available form.
A requested one or more parts of a source document can be generated in a requested form from the respective stored fragments.
Preferably, the access pathways are defined in a configuration file. A document is assigned to a respective class, and there is a configuration file for each class. The source documents are marked-up according to a schema, and there is a separate schema for each class. The configuration file for each class may contain certain variations for each form.
The schema describes the document fully. The configuration file indicates which pieces of the full document are significant.
Advantageously, an index list is created for each request maker, the index list including a set of documents available to each request maker, and lists the access pathways for each fragment of each document. One fragment comprises the entire source document.
Definitions
Source documents are the subject of a mark-up process according to an appropriate one of a number of schemas. Each such marked-up document is the subject of a build process, in which a document is analysed (according to a schema/set of rules) to determine pieces important as access paths. The access paths are defined for each document. So, for any one document the access paths are then used to create the set of fragments for that document. The fragments enable navigation of the document. The fragments are then each rendered into each one of the forms in which the document is to be available to the customer or customers entitled to see the document (for example, the person to whom the bank statement is addressed). Thus, for any one document, a set of fragments exists for each of the chosen forms that are available.
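The build process described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the element names, the access paths, the list of forms and the `render` placeholder are all hypothetical and are not taken from the specification. It shows access paths being applied to a marked-up document to produce fragments, with each fragment then rendered once per available form.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up 'bank statement'; element names are illustrative only.
STATEMENT_XML = """
<statement>
  <account><number>12345</number><holder>A. Customer</holder></account>
  <transactions>
    <item><date>2004-06-01</date><amount>-20.00</amount></item>
    <item><date>2004-06-02</date><amount>150.00</amount></item>
  </transactions>
</statement>
"""

# Assumed access paths for this class of document, written in the limited
# XPath subset that ElementTree supports. "." selects the root element, so
# one fragment comprises the entire document.
ACCESS_PATHS = [".", "./account", "./transactions", "./transactions/item"]

# Assumed names for the available forms.
FORMS = ["print", "audio", "braille", "ebook"]


def build_fragments(xml_text, access_paths):
    """Return a list of (access_path, position, fragment_xml) tuples."""
    root = ET.fromstring(xml_text)
    fragments = []
    for path in access_paths:
        for position, element in enumerate(root.findall(path), start=1):
            fragment_xml = ET.tostring(element, encoding="unicode")
            fragments.append((path, position, fragment_xml))
    return fragments


def render(fragment_xml, form):
    """Placeholder renderer; a real system would call a print, voice or Braille tool."""
    element = ET.fromstring(fragment_xml)
    text = " ".join(" ".join(element.itertext()).split())
    return f"[{form}] {text}"


fragments = build_fragments(STATEMENT_XML, ACCESS_PATHS)

# One rendered fragment per (access path, position, form).
rendered = {
    (path, position, form): render(fragment_xml, form)
    for path, position, fragment_xml in fragments
    for form in FORMS
}
```

With these assumed inputs there are five fragments (the whole statement, the account block, the transaction list and two transaction items), each held in four forms.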
The source document can be translated as a preliminary step to the build process, so that it is available in other languages to the customer or customers entitled to see the document (e.g. the person to whom a bank statement is addressed). Document formatting choices can also be provided.
A customer request includes the identity of a document to be reproduced, the required form of the document, and optionally desired formatting information. An output file is produced, and is then subject to a reproduction process that utilises the access paths. The resultant forms supported in the embodiment described are a Braille physical document, a printed document, audio (eg. spoken word or music) or a physical storage medium (eg. CDROM or magnetic disk).
Generally, it is desirable to use the customer's existing computer systems, since this allows interfacing with existing financial records and systems (invoices are one form of document that can be requested), and, in the main, is the least troublesome for the customer. Customers who are visually impaired may prefer to use their existing computers and software, rather than install new software and learn how to use it. For example, presenting invoices in a DAISY format may be more convenient for someone used to a particular DAISY reader than requiring the customer to acquire and learn new software.
Some document providers, such as banks, may not easily be able to generate invoices, statements and the like in XML form. In such a situation, a bank would require specific additional software to create and format such documents then forward them to a central repository where the documents can be organised for the user and from which a user can obtain requested documents.
B. Build Process
Building index lists and access pathways
Turning specifically to
For Braille-form mark-up, character strings that need to be treated differently in Braille should be separately tagged and identified. An example is a foreign word that will be spelled out in Grade 1 Braille. Information like phone numbers and web addresses that may be treated differently in the different versions of Braille are likely to be tagged, so that these character strings can be rendered into a standard form more easily (eg. phone numbers with area codes can be written in several different formats and the actual number may or may not have spaces in it). Images and diagrams need to be annotated for the visually impaired, as will be described below.
If a document is to be offered in a different language to the original, then the marked-up form of document is actively processed by a translation system 72 in response to a customer request. The translation will depend on the document type and importance. Certain documents like invoices, bank statements and credit card statements consist of a template into which the content of the document is inserted. The content of such documents often consists largely of numeric information (which does not need language translation), part or product names (which do not need language translation) or single words or phrases that can often be machine translated. If the content contains only numerical information and part or product names, then simply translating the template will translate the document. If the template is constructed so that the information in the template is called from a database, and if the calls to the database include the language, then these documents can be automatically translated at the request of the user. Other information in the invoice can be machine translated or, if the information is, say, standard advertising information, it can be manually translated and temporarily added to the template.
Other documents can be machine translated. More valuable documents (such as legal documents or contracts) may be translated manually. The most valuable documents can be manually translated and manually verified by an independent translator. For manual translation of documents, a work flow process will be instituted for tracking the translation.
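As a rough illustration of template-driven translation, the sketch below keeps the template wording in a lookup table keyed by label and language, so that numeric content and product names pass through untouched. The table contents, label keys and function names are hypothetical; a dictionary stands in for the database mentioned above.

```python
# Hypothetical label table standing in for the database that the template calls.
# Keys are (label key, language code).
LABELS = {
    ("invoice_title", "en"): "Invoice",
    ("invoice_title", "de"): "Rechnung",
    ("amount_due", "en"): "Amount due",
    ("amount_due", "de"): "Fälliger Betrag",
}


def lookup_label(key, language, fallback="en"):
    """Fetch template wording for a language, falling back to the original language."""
    return LABELS.get((key, language), LABELS[(key, fallback)])


def render_invoice(content, language):
    """Fill the template; product names and amounts are not translated."""
    return "\n".join([
        lookup_label("invoice_title", language),
        content["product"],
        f"{lookup_label('amount_due', language)}: {content['amount']}",
    ])


invoice = {"product": "Widget X", "amount": "42.00"}
german = render_invoice(invoice, "de")
english = render_invoice(invoice, "en")
```

Because the language is a parameter of every lookup, the same content can be re-rendered in another language on request without touching the content itself.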
The marked-up documents 70, 70′ are then stored in an XML repository 73.
An index and access pathway builder 74 uses an XML configuration document 75, in turn based on an XML schema 76 providing validation rules, to configure an application that will build an XML document specific to each customer containing a list of all the documents available for a particular customer: the Index list 77. The index list 77 provides various ways for the customer to access those documents (ie. the access paths) determined by an XML schema 78. An XML index list 77 allows searching of, and navigation to any fragments defined in the configuration file 75 which generates and defines the granularity of any fragments. Index documents thus generated are stored in an index store 79.
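A minimal sketch of such an index builder follows. Apart from the "ITMPath" element name, which the specification uses for access path identifiers in the index list, every name here is an assumption made for illustration: the configuration structure, the repository records, and the "IndexList" and "Document" elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-class configuration: the containing elements that serve as
# access paths for each class of document.
CONFIG = {
    "bank statement": ["account", "transactions"],
    "telephone account": ["summary", "calls"],
}

# Hypothetical repository records: (customer id, document id, document class).
REPOSITORY = [
    ("cust-1", "stmt-2004-05", "bank statement"),
    ("cust-1", "stmt-2004-06", "bank statement"),
    ("cust-1", "phone-2004-06", "telephone account"),
    ("cust-2", "stmt-2004-06", "bank statement"),
]


def build_index_list(customer_id):
    """Build an XML index list of the documents available to one customer."""
    index = ET.Element("IndexList", customer=customer_id)
    for owner, doc_id, doc_class in REPOSITORY:
        if owner != customer_id:
            continue
        doc = ET.SubElement(index, "Document", id=doc_id)
        doc.set("class", doc_class)
        # One ITMPath entry per access path defined for this document class.
        for path in CONFIG[doc_class]:
            ET.SubElement(doc, "ITMPath").text = path
    return index


index = build_index_list("cust-1")
index_xml = ET.tostring(index, encoding="unicode")
```

Keeping the access paths in per-class configuration, rather than in the builder itself, means a new class of document only needs a new configuration entry.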
Consider the following example XML code for a ‘bank statement’ class of document 70:
As a separate example, consider an example marked-up XML file for a ‘book’ class of document 70:
Configuration files configure the software applications to provide the necessary functionality. An example XML configuration file 75 for the client index and access pathway builder 74 for a ‘bank statement’ class file is:
The configuration file 75 for the ‘book’ class of document is:
The XML schema 76 providing validation rules for the configuration files is shown in
Access pathways can be applied to any schema, and there is an ability to apply different access paths to the same element (eg. transactions and transaction item). Additionally, it is possible to use only a containing element (ie. a leaf node or one that does not contain lower level elements becomes the container).
Consider the following index list 77 for a particular customer:
This index document holds two ‘bank statement’ records and one ‘telephone account’ record. Each access path consists of a block of one or more elements contained by a single element; these containing elements are the identifiers in the “ITMPath” elements of the index list.
Reference is made to
Building Fragments
A fragment builder 80 has knowledge of the fragments for a particular document, and utilises known application programs to convert each fragment into each of the requested supported forms. The fragments can also include formatting options available to customers (as described below).
One objective is to provide disabled people with the ability to deal with their documents in an efficient manner in their chosen form. It may not apply for all customer document reports. This is described as a ‘navigation’ ability, in that a document can be navigated by its fragments.
For each class of document, analysis and mapping must be carried out to clearly identify the significant blocks of data requiring presentation to the user through navigable means. Consider the bank statement document described above. The following significant blocks of information are needed:
Indeed, these fragments are evident in the ‘bank statement’ XML index list 77 given above.
Relationships and Schema Relating to Fragment Production
The following example is an audio fragment, but it applies equally to any fragment. Firstly, it is important that the processing systems be able to clearly identify the elements of the schema that contain actual text that needs to be “spoken”. A schema may contain hundreds or even thousands of elements, some mandatory, others optional or dependent on higher level elements in the element “tree”. A lesser number of the elements will encapsulate actual text. For this example, assume a schema holds 100 elements: 20 of those elements can contain text, and the remaining 80 provide the context in which those text elements are used, that is, the ancestry of the text. Thus it is important in using the chosen schema for the system to be able to identify which elements contain text and which elements provide the context of the text.
This classification of elements is further complicated by the fact that some elements can contain both text and lower level elements which also contain text, called a mixed model element.
An example of a mixed model is emphasis within a paragraph
It is obvious that the second model is more complex as we cannot simply speak the ‘upara’ element and the ‘emphasis’ element as there would then be two sound blocks, which in all practicality does not work.
The approach is to ignore the mixed element tags (emphasis) and speak all the text contained in the upara element, including that enclosed in the emphasis element, but not the actual tag itself (<emphasis type=‘Italic’>). This entails the need to clearly identify:
Headings could be spoken differently (for example, by using a different voice or tone for each, or even a different volume for the hard of hearing), but it is currently unlikely that this would happen.
Component Identification
Analysis of the chosen schema must be performed to clearly identify the elements that encapsulate complete blocks of text.
Definitions:
Finer granularity enables more precise navigation and searching.
Complete blocks of text may contain in-line or nested tags. Typically these would relate to emphasis or the like, but in reality all text contained within the root element of the document could be read in a single stream (ie. the complete book). Actual tags within the text block (but not their text content) need to be ignored in the reading process, and this applies during recursion of the nesting process.
Where in-line tags occur, or structural tags are treated as in-line tags (such as in treating a complete chapter as a single block of text), it is ensured that removal or ignoring of the inline tags preserves white space and does not cause words to be joined.
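The whitespace rule can be sketched as follows. This is an assumption-laden illustration rather than the described system: the `INLINE_TAGS` set and the function names are hypothetical, and the 'upara' and 'emphasis' element names are borrowed from the discussion above. Tags are dropped, text inside in-line tags is joined directly to its surroundings, and structural tags are treated as word boundaries so that adjacent blocks do not run together.

```python
import xml.etree.ElementTree as ET

# Assumed set of in-line tags; everything else is treated as structural.
INLINE_TAGS = {"emphasis"}


def collect_text(element, parts):
    """Recursively gather text, ignoring tags but keeping their text content."""
    if element.text:
        parts.append(element.text)
    for child in element:
        inline = child.tag in INLINE_TAGS
        if not inline:
            parts.append(" ")  # structural boundary: never join words across it
        collect_text(child, parts)
        if not inline:
            parts.append(" ")
        if child.tail:
            parts.append(child.tail)


def spoken_text(block):
    """Return the text of a block as one stream with normalised whitespace."""
    parts = []
    collect_text(block, parts)
    return " ".join("".join(parts).split())


para = ET.fromstring("<upara>The <emphasis type='Italic'>quick</emphasis> fox.</upara>")
chapter = ET.fromstring(
    "<chapter><title>Chapter One</title>"
    "<upara>It was <emphasis>very</emphasis> late.</upara></chapter>"
)
paragraph_stream = spoken_text(para)
chapter_stream = spoken_text(chapter)
```

Treating in-line and structural tags differently matters for cases such as emphasis applied to part of a word, where inserting a space at the tag boundary would split the word.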
All elements that are not those encapsulating complete blocks of text are either:
The three element types described above are used in the following manner
Although ancestry is less important in voice generation than in, say, the production of printed matter, it still has some significance and the same basic rules apply. Ancestry is important because a heading tag may be used in both the book title and the chapter title: the same element, but with different ancestry (context). The context of the element is used in creation of the navigation component for the DAISY book. The complete ancestry of an element is typically not of interest, rather just whether element X is anywhere in the ancestry. Element X would normally be unique to a single path and sufficient to identify the context.
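The "is element X anywhere in the ancestry" test might be sketched as below. The document structure and the 'frontmatter', 'chapter' and 'heading' names are hypothetical. ElementTree elements do not record their parent, so the sketch first builds a child-to-parent map.

```python
import xml.etree.ElementTree as ET


def build_parent_map(root):
    """Map every element to its parent (ElementTree stores no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def has_ancestor(element, tag, parent_map):
    """Return True if an element with the given tag appears anywhere above element."""
    parent = parent_map.get(element)
    while parent is not None:
        if parent.tag == tag:
            return True
        parent = parent_map.get(parent)
    return False


# Hypothetical 'book' document: the same heading element in two contexts.
BOOK = ET.fromstring(
    "<book><frontmatter><heading>A Book Title</heading></frontmatter>"
    "<chapter><heading>Chapter One</heading><upara>Text.</upara></chapter></book>"
)

parents = build_parent_map(BOOK)
contexts = [
    (heading.text, "chapter" if has_ancestor(heading, "chapter", parents) else "book")
    for heading in BOOK.iter("heading")
]
```

Here the presence of a 'chapter' ancestor is enough to distinguish a chapter heading from the book title, without comparing full paths.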
The fragment builder 80 thus generates—using standard software applications—output files 81 of the appropriate type for each form the source document can take: for example, .pdf for print, MP3 for audio, Braille ASCII for Braille and any convenient file type (eg. MS Reader™) for E-book. These are stored in the fragment store 82.
C. Reproduction
Reproduction is under the control of the management and synchronisation system 84. Both complete rendered documents in the chosen form and rendered document fragments of the chosen form for each navigable component defined by the pathway builder 74 can be reproduced. The chosen reproduction form is achieved by an appropriate mapping process. In one embodiment the following set of applications can be used:
Voice generation system—generates DAISY, MP3 and CD audio forms.
The process is as follows
Braille production is dependent on two principal driving factors. The first is the selected contraction table, which is usually based on the language (US English Braille, UK English Braille, German Braille, etc). The second is the selection of the target Braille code, which maps the characters of the language to the dot based Braille code. Although typically English words would have English contractions and English codes (also German->German->German), English words could be written with German contractions and German codes so that a German Braille reader who could speak English could read the English words without having to learn English Braille codes.
Braille contractions are driven by large translation tables (one for each language supported). These tables contain the word and the Braille contracted word in the target language. There are rules as to where contractions may be applied, for example some words may not have ending contractions applied if immediately followed by punctuation, etc. In this situation the word will be entered several times in the table, with the punctuation mark appended to the word in the additional entries. In the following hypothetical example, the characters “ing” are replaced in the word “running” but not in the word “running.” XML and table fragments illustrate this.
In reality all words in the <para> will be tagged with either true or false, but in this example for clarity we have tagged only “running” and “boy”. Words that are not tagged do not appear in the translation table, and will be written to an exception file, either for addition to the table and reprocessing or to be handled as Grade 1 Braille. The final step is processing to Braille output.
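The table lookup and exception handling described above can be sketched as follows. The table entries are hypothetical and follow the hypothetical “running” example, and the `{ing}` marker is only a stand-in for a contracted cell, not real Braille output. A word followed by punctuation has its own table entry, and words missing from the table are collected as exceptions and left uncontracted.

```python
# Hypothetical contraction table for one language. "{ing}" marks where a
# contraction would be substituted; it is a placeholder, not a Braille code.
CONTRACTION_TABLE = {
    "running": "runn{ing}",  # contraction applied
    "running.": "running.",  # separate entry: no ending contraction before punctuation
    "boy": "boy",            # listed, but nothing to contract
}


def contract(words, table):
    """Return (output words, exceptions) for a sequence of words."""
    output, exceptions = [], []
    for word in words:
        key = word.lower()
        if key in table:
            output.append(table[key])
        else:
            # Not in the table: record it for later review and keep it
            # uncontracted (ie. handled as Grade 1 Braille).
            exceptions.append(word)
            output.append(word)
    return output, exceptions


contracted, missing = contract("the boy was running".split(), CONTRACTION_TABLE)
sentence_end, _ = contract(["running."], CONTRACTION_TABLE)
```

Entering the punctuated form as its own row keeps the lookup itself free of rule logic, at the cost of a larger table.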
E-book Generation Process
Any convenient text conversion software application can be used (eg. Acrobat Reader™).
The document management and synchronisation system 84 manages and tracks the documents, fragments, XML documents and indexes. The management and synchronisation system 84 interacts with three output interfaces: a physical production interface 86, a web interface 88 and a download interface 90.
Physical Production
The physical production centre 86 uses the pre-built output documents and document fragments to produce physical media to be delivered by suitable means to a customer 100. The physical production centre 86 produces the chosen form of either a Braille document 94, a printed document 96, or a storage medium such as a CDROM 98.
The web interface 88 employs web pages to call server functionality to deliver electronic files to the client in the following forms:
The web interface 88 is accessed by the customer 102 by any convenient browser application 104.
The download interface 90 is a simple web-service or other transfer mechanism to move documents to a customer PC for access purposes. This interface 90 is active when a customer chooses to synchronise documents over the internet. The download interface 90 thus communicates with local PC systems 106, under the control of the customer 108.
Turning now to
A download interface 120, 122 is provided for the simple PC system solution and the full-function PC system solution, respectively. A simple PC system solution has an index application 124, whereas a full-function PC system has a management application 126. In both cases the user's files are copied to the reproduction computer, including index files 128, output documents and fragments 130 and XML documents 132, in a common store 62.
The index application 124 has the ability to read and/or search the customer's index list, and search documents using the XML documents store 132 to present complete documents through a reader application.
The management application 126 has the ability to handle various forms of input other than a keyboard or helper application.
Four forms of output are provided. A Braille application 128, 130 generates a Braille document using any convenient commercial system, to be delivered to the user 108 in paper form by post or electronically for local printing or for use on a reader/keyboard device.
Voice output from a voice application 132, 134 is generated as described above. Voice fragments are navigable using standard DAISY functionality, giving limited levels of navigation through these classes of documents. One way to improve the navigability is to concatenate the index and the access pathways to create longer access pathways.
Having done this, the information can be mapped into a DAISY form. This approach delivers navigability in a third party product.
An E-book application 136, 138 can be achieved through the use of XSL(T) transformations.
Finally, a print application 140, 142 generates a PDF output file.
For these simple PC systems, a simple keyboard 150 can be interfaced with the index application 124. For the full-function PC system, a Braille input device 152, voice input device 154 and keyboard 156 can interface with an input conversion application 158, in-turn inputting to the management application 126.
Print Formatting
Referring now to
D. Searching
Searching can be performed on the index 77 or on the whole document. The index is used for navigation to allow rapid retrieval of a document or fragment, and in addition, the index can be searched for content. Not all information need be in the index, and so the document can also be searched for context. In searching for a telephone number on a phone bill, the search could be restricted to the phone numbers in the transaction listing sections (ie. an access pathway) to find a specific number called, because the information is provided in XML as well as in any user-requested format. In the case of presentment in any form, the functionality is available because the XML used to create the presented document is provided as a basis for searching in context. The choice of customer system will define how the result is presented. In the case of a simple storage solution (left-hand side of item 60 in
This will only be able to present a complete document as the result of the search (ie. a phone bill, not a line on the bill). The full function system (right-hand side of item 60 in
E. Other Embodiments
Special Braille Mark-Up
Images can be represented in print and to a lesser extent in Braille. For example, a square can be represented as four intersecting lines of closely spaced Braille impressions forming a square. A pie chart can be represented as a circle of Braille impressions which is intersected by radii at appropriate points. A bar chart can similarly be represented, as can a graph.
A program that can create regular images in print can also be used to create Braille representations at appropriate sizes for the reader.
With images represented in Braille, there are usually descriptions in Braille. These descriptions are usually manually created, as are the Braille images. These manual descriptions or annotations of the diagram can be used directly in Audio Books as well as Braille documents.
A standard text template can be formed for regular images such as geometric shapes, pie or bar charts, graphs and other similar images, and variables can be automatically inserted in the mark-up process so that the particulars of that image can be correctly explained to the Braille reader.
A customer can create a Braille image representation and annotation simply by selecting the image type and inserting the variables to define the image. If an embossed image is required, the mark-up will generate the embossed image with the appropriate labels and insert the text of the variables in the annotation template text in a suitable format so that the Braille reader can quickly find out what the image refers to. This can also be applied to non-English languages.
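A text template of this kind might look like the following sketch. The wording of the template, the variable names and the function name are all hypothetical and are not the variable list of the specification. The customer supplies the variables, and the annotation text is produced by substitution.

```python
from string import Template

# Hypothetical annotation template for a bar chart.
BAR_CHART_TEMPLATE = Template(
    "Bar chart titled '$title'. The horizontal axis shows $x_label and the "
    "vertical axis shows $y_label. $bar_sentences"
)


def describe_bar_chart(title, x_label, y_label, bars):
    """Build the annotation text from the variables that define the chart.

    bars is a list of (label, size) pairs, one per bar.
    """
    bar_sentences = " ".join(
        f"Bar {number}, {label}, has a size of {size}."
        for number, (label, size) in enumerate(bars, start=1)
    )
    return BAR_CHART_TEMPLATE.substitute(
        title=title, x_label=x_label, y_label=y_label, bar_sentences=bar_sentences
    )


annotation = describe_bar_chart("Sales", "quarter", "units", [("Q1", 10), ("Q2", 15)])
```

The same variables could equally drive a drawing routine, and a second template in another language would give a translated annotation from identical inputs.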
For example, a person wanting to create a Braille representation of a simple bar chart shown in
The variables to be filled in are:
Variable 12=Size of bar 3
The template may not include all of the visual information, such as the shading and horizontal lines shown in
The same variables can be used to generate the Braille and also the typeset image of the diagram of
Storage and Retrieval of Braille Images and Image Annotations
Sighted people can search for images from image categories and from descriptions of the images, and can locate possible images and then view the images to select the correct image. Using this technique, in addition to the original image being stored, an annotation of the image and a Braille representation of the image can be stored. In this way, someone who has created a Braille representation of an image of the map of Australia and annotated it can store the original image, the Braille representation of the image and the annotation, and make it available for other people to locate and use without having to redo this work.
Response Capability
The facility for customers to provide responses to documents is provided. For example, one form of document that is reproduced may be a questionnaire, and responses to the questions can be made by the customer in any desired form (supported by the customer computer), and stored on the document server for subsequent attention.
Invoice Classification
A person with normal vision may get the following invoice information sent to him:
A customer may be permitted to classify invoices into categories so that a phone bill from a Telco will be entered correctly into the accounting system. There are two ways to do this: build a table or file using a mapping process that is translated from the XML to some input format for the customer's accounting system, or allow the user to enter his own classification code so that all bills from the Telco will go into chart of accounts entry 23, for example. If the Telco sends accounts for Internet and phones, the customer may be permitted to look at the bill and classify it, or to classify the Telco account number on the invoice.
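The second approach, customer-entered classification codes, can be sketched as a small rule table. The rule structure, the account numbers and entry 31 are hypothetical; entry 23 follows the example above. A rule keyed on the biller alone covers every bill from that biller, and a rule keyed on a specific biller account number overrides it.

```python
# Hypothetical customer-entered rules:
# (biller, biller account number) -> chart of accounts entry.
# A rule whose account number is None applies to every bill from that biller.
CLASSIFICATION_RULES = {
    ("Telco", None): 23,
    ("Telco", "INT-0042"): 31,  # e.g. the Internet account from the same Telco
}


def classify_invoice(biller, account_number, rules):
    """Return the chart of accounts entry for an invoice, or None if unclassified."""
    if (biller, account_number) in rules:
        return rules[(biller, account_number)]
    return rules.get((biller, None))


phone_entry = classify_invoice("Telco", "PH-0001", CLASSIFICATION_RULES)
internet_entry = classify_invoice("Telco", "INT-0042", CLASSIFICATION_RULES)
unknown_entry = classify_invoice("Gas Co", "G-7", CLASSIFICATION_RULES)
```

An invoice that returns no entry would be one the customer is asked to look at and classify.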
These arrangements (ie. response capability and invoice classification) utilise the repository 73 on the document server side.
Number | Date | Country | Kind |
---|---|---|---|
2004903307 | Jun 2004 | AU | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/AU05/00832 | 6/10/2005 | WO | 12/13/2006 |