Method and apparatus for font subsetting

Information

  • Patent Application
  • 20080028304
  • Publication Number
    20080028304
  • Date Filed
    July 25, 2006
  • Date Published
    January 31, 2008
Abstract
A method and apparatus are provided for embedding a font subset in an electronic document. The method in one form includes analyzing a document having characters of a font set where characters may have different forms depending on the location of the character in a word or one or more ligatures represent a combination of characters. A font subset is created corresponding to only the character forms present in the document and the font subset is associated with the document. Advantageously, the embedded font subset only contains the font characters which are used in the document and not all characters which may be present in a complete font set for all font sets referenced in the document.
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a flow chart of one method in accordance with the present invention; and



FIG. 2 is a schematic depicting implementation in accordance with one aspect of the present invention.





DETAILED DESCRIPTION

Referring now to the figures and, in particular, FIG. 1, method 10 comprises analyzing a text document composed of characters of one or more font sets. The document is examined to determine which font sets are used to render the characters in the document (step 20).
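
As an illustration of step 20, the following minimal sketch collects the characters rendered with each font set. It assumes a document can be modeled as a list of (font name, text) runs; the run model, the font names, and the function name `characters_by_font` are hypothetical and do not come from the application.

```python
from collections import defaultdict

def characters_by_font(runs):
    """Map each font name to the set of non-space characters rendered with it."""
    used = defaultdict(set)
    for font_name, text in runs:
        used[font_name].update(ch for ch in text if not ch.isspace())
    return dict(used)

# Hypothetical document: each run pairs a font name with the text it renders.
runs = [("Serif", "office file"), ("Sans", "flag"), ("Serif", "fin")]
usage = characters_by_font(runs)

for font_name in sorted(usage):
    print(font_name, "".join(sorted(usage[font_name])))
```

Grouping by font first keeps the later analysis independent for each font set referenced in the document.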


Next, the characters in the document are analyzed to determine which characters and character combinations are present, which, if any, glyph variants of the font set are used within the document, and/or whether one or more combinations of characters are represented by a single ligature in the font character set used to render the text in the document (step 30). For example, the document may use a font character set in which a character is represented by different glyphs depending on whether it is the initial, middle or final character in a word, as in Arabic and Indic character sets. Further, the font character set may include ligatures, which represent a combination of characters, such as “fi,” which is represented by the ligature “ﬁ,” and “fl,” which is represented by the ligature “ﬂ.” Therefore, if the document includes one or more ligatures, it will be determined that each such ligature is to be included in a font subset to be associated with the document (step 30).
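
The analysis of step 30 can be sketched as follows. This is a deliberately simplified model, not the application's implementation: it assumes words are separated by whitespace, that a form is determined only by position in the word (isolated, initial, medial or final), and that ligatures are found by greedy longest-first matching against a hypothetical table `LIGATURES`. Real shaping engines apply script-specific joining rules that this sketch ignores.

```python
LIGATURES = ("ffi", "fi", "fl")  # hypothetical ligature table, longest first

def positional_form(index, length):
    """Name the form of the unit at `index` in a word of `length` units."""
    if length == 1:
        return "isolated"
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"

def glyph_forms_in_word(word, ligatures=LIGATURES):
    """Return the set of (unit, form) pairs needed to render one word."""
    units = []
    i = 0
    while i < len(word):
        for lig in ligatures:
            if word.startswith(lig, i):  # a ligature replaces its characters
                units.append(lig)
                i += len(lig)
                break
        else:
            units.append(word[i])
            i += 1
    return {(u, positional_form(k, len(units))) for k, u in enumerate(units)}

def glyph_forms_in_text(text):
    """Collect the forms used across every whitespace-separated word."""
    forms = set()
    for word in text.split():
        forms |= glyph_forms_in_word(word)
    return forms

print(sorted(glyph_forms_in_text("a fin")))
```

The result is a set, so a form that occurs many times in the text is recorded once.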


Based on the analysis of the document, a font subset is created which contains all of the character forms present in the document, including all glyphs and ligatures used (step 40). The font subset does not contain extraneous or unused glyphs or ligatures which may be present in a complete font set but are not used within the document analyzed. For example, if the font set includes a glyph form for a character in the initial position of a word, and the document contains no word in which that character is in the initial position, then the font subset created will not include that glyph form.
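
The subset creation of step 40 can be illustrated with a toy font. The sketch assumes a font set can be modeled as a dictionary from (character, form) pairs to outline data; the placeholder outlines, the three-letter alphabet, and the name `create_font_subset` are hypothetical.

```python
def create_font_subset(full_font, used_forms):
    """Keep only the glyph forms that the document actually uses."""
    return {form: outline for form, outline in full_font.items() if form in used_forms}

# Toy font set: three characters, each with three positional forms.
full_font = {
    (ch, form): f"<{ch}-{form} outline>".encode()
    for ch in "abc"
    for form in ("initial", "medial", "final")
}

# Forms needed for the single word "cab".
used_forms = {("c", "initial"), ("a", "medial"), ("b", "final")}
subset = create_font_subset(full_font, used_forms)

print(len(full_font), len(subset))
print(("b", "initial") in subset)
```

Because "b" never starts a word in this document, its initial form is left out of the subset even though the character itself is used.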


At step 50, the font subset is associated with the document as an embedded font set.


Referring now to FIG. 2 along with FIG. 1, the present font subsetting method can be implemented in the form of a computer program initially stored on an appropriate computer medium 100 in the form of computer instructions 110 which can be executed by a computer processor (not shown). Document 120 contains text 130 which is composed of characters of a font set. The document also includes metadata 140 which corresponds to document formatting and other information regarding the document. After the document 120, including text 130 and metadata 140, has been analyzed (step 20) and all character forms present in the text 130 have been determined (step 30), embedded fonts 150 are associated with the document 120 (step 50). The embedded fonts 150 contain only the character forms present in the text 130, that is, only the glyphs and ligatures of the text, and no additional character forms which may be present in the font set used to render the text of document 120.
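
The arrangement of FIG. 2 can be mirrored in a small data structure. This is a sketch under stated assumptions: the `Document` class, its field names, the font name "ExampleSerif", and `embed_font_subset` are hypothetical stand-ins for document 120, text 130, metadata 140 and embedded fonts 150, and no real document format is implied.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str                                            # text 130
    metadata: dict = field(default_factory=dict)         # metadata 140
    embedded_fonts: dict = field(default_factory=dict)   # embedded fonts 150

def embed_font_subset(document, font_name, subset):
    """Associate a font subset with the document by storing a copy inside it (step 50)."""
    document.embedded_fonts[font_name] = dict(subset)
    return document

doc = Document(text="cab", metadata={"font": "ExampleSerif"})
subset = {
    ("c", "initial"): b"<c-initial outline>",
    ("a", "medial"): b"<a-medial outline>",
    ("b", "final"): b"<b-final outline>",
}
embed_font_subset(doc, doc.metadata["font"], subset)

print(sorted(doc.embedded_fonts["ExampleSerif"]))
```

Storing the subset inside the document corresponds to embedding; a variant could instead keep the subset in a separate object that travels with the document.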


It will now be apparent to one of ordinary skill in the art that the present method provides features and advantages not found in prior font embedding methods. For example, the embedded fonts associated with text documents include only those character forms present in the text of the document and not all character forms which are present in the font set used in the document text. As a result, the embedded font subset will have a reduced size as compared with embedded fonts created using the prior art method of font subsetting, because prior embedded font subsets include all character forms or glyphs for every character of a document, regardless of whether a particular glyph form is actually used in the text of the document. Consequently, a document with embedded fonts in accordance with the present invention will have a reduced size, requiring less storage space for the electronic document and less data transmission bandwidth when it is sent as an electronic document.
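
The size reduction can be made concrete with a back-of-the-envelope calculation. The figures are assumptions chosen for illustration only: a 26-character font set, four positional forms per character, and a uniform 100 bytes per glyph form. None of these numbers comes from the application.

```python
FORMS = ("isolated", "initial", "medial", "final")
GLYPH_BYTES = 100  # assumed uniform size of one glyph form, for illustration only

def subset_sizes(alphabet, used_forms):
    """Return (full font, per-character subset, per-form subset) sizes in bytes."""
    full = len(alphabet) * len(FORMS) * GLYPH_BYTES
    used_chars = {ch for ch, _ in used_forms}
    # Prior approach as described above: every form of each used character.
    per_character = len(used_chars) * len(FORMS) * GLYPH_BYTES
    # Present approach: only the forms that actually occur in the text.
    per_form = len(used_forms) * GLYPH_BYTES
    return full, per_character, per_form

used = {("c", "initial"), ("a", "medial"), ("b", "final")}
sizes = subset_sizes("abcdefghijklmnopqrstuvwxyz", used)
print(sizes)  # (10400, 1200, 300)
```

Under these assumptions the per-form subset is one quarter the size of the per-character subset; the actual saving depends on how many forms of each character a given document uses.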


Although the invention has been described above in relation to preferred embodiments thereof, it will be understood by those skilled in the art that variations and modifications can be effected in these preferred embodiments without departing from the scope and spirit of the invention.

Claims
  • 1. A method for font subsetting, said method comprising: analyzing a document comprising characters of a font set wherein the characters are associated with glyphs having different forms depending on a location of a character in a word or the font set where one or more ligatures represent a combination of characters; creating a font subset containing shapes of the glyphs corresponding to only character forms present in the document determined in the analyzing the document; and associating the font subset with the document.
  • 2. The method of claim 1, wherein said associating the font subset comprises storing the font subset within the document.
  • 3. The method of claim 1, wherein one or more of said characters have different positional glyph variants depending on the characters adjacent thereto and whether the character is an isolated, an initial, a medial or a final character of a word.
  • 4. The method of claim 1, wherein a combination of characters in the font set is represented by a single ligature in the font subset.
  • 5. A method for font subsetting, said method comprising: analyzing a document comprising characters which have glyph forms corresponding to: 1) a location in which a character is located within a word or 2) a combination of characters; creating a font character subset comprising only the glyph forms present in the document as determined from analyzing the document; and embedding the font character subset in the document.
  • 6. The method of claim 5, wherein one or more of said characters have different glyph forms depending on the characters adjacent thereto and whether the character is an isolated, an initial, a medial or a final character of a word.
  • 7. The method of claim 6, wherein a combination of characters is represented by a single ligature in the font character subset.
  • 8. A computer-readable medium containing program instructions for font subsetting, said program instructions comprising: analyzing a document comprising characters of a font set wherein said characters comprise different forms depending on a location of a character in a word or one or more ligatures which represent a combination of characters; creating a font subset corresponding to only character forms present in the document determined in the analyzing of the document; and associating the font subset with the document.
  • 9. The computer-readable medium of claim 8, wherein said associating the subset of the font set with the document comprises embedding the font subset in the document or transmitting the font subset along with the document.
  • 10. The computer-readable medium of claim 9, wherein the font subset is transmitted as a separate data stream in a media presentation.
  • 11. A computer system for font subsetting, said system comprising: memory for storing a document comprising characters having glyph forms corresponding to: a) a location in which a character is present within a word or b) a combination of characters; a processor for: i) analyzing the document to determine which glyphs are present therein; ii) creating a subset of a font set corresponding to only character forms present in the document determined in the analyzing of the document; and iii) associating the subset of the font set with the document.
  • 12. The computer system of claim 11, wherein said processor for associating the subset of the font set comprises a processor for embedding the font subset in the document.