The specific features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
In the following description of various embodiments of the present invention, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
1.0 General Definitions:
The definitions provided below are intended to be used in understanding the description of the “Character-Level Font Linker” provided herein. Further, as described following these definitions,
1.1 Character: The smallest component of written language that has a semantic value. A “character” generally refers to the abstract meaning and/or shape, rather than a specific shape. In the context of the Character-Level Font Linker, characters are defined in terms of their Unicode code-point.
1.2 Glyph: The term “glyph” is a synonym for glyph image. In rendering, displaying and/or printing a particular Unicode character, one or more glyphs are selected from a font (or fonts) to depict that particular character.
1.3 Font: A “font” is a set of glyphs for rendering particular characters. The glyphs associated with a particular font generally have stylistic commonalities in order to achieve a consistent appearance when rendering, displaying and/or printing a set of characters comprising a text string. Examples of well known fonts include “Times New Roman” and “Arial.”
1.4 Script: A “script” is a unique set of characters that generally supports all or part of the characters used by a particular language. Typically, many fonts will support (at least in part) one or more scripts. Examples of scripts include Latin, Cyrillic, Hebrew, Greek, Latin Extended-B, etc., to name only a few.
While scripts support characters used by a particular language, scripts are not generally mapped in a one-to-one relationship with particular languages. For example, the Japanese language generally uses several scripts, including Japanese Hiragana, while the Latin script is used for supporting many languages, including, for example, English, Spanish, French, etc., each of which may use particular characters unique to those particular languages.
Further, fonts generally include header information that indicates whether the font provides nominal support for a particular script. However, an indication of script support by a particular font is no guarantee that the font actually provides a glyph for every character intended to be included in that script.
For example,
A particular example of this problem is Unicode code-point 0180 (element 200 for
1.5 Script ID (“SID”): A “SID” provides a Unicode identification of the script (Latin, Cyrillic, Hebrew, etc.) needed to render each run of a text string. Generally, these SIDs are used to determine whether a particular script is supported
1.6 Run: A “run” is a sequence of contiguous characters extracted from a text string that share the same font and/or formatting.
2.0 Exemplary Operating Environment:
At a minimum, to enable a computing device to implement the “Character-Level Font Linker” (as described in further detail below), the computing device 100 must have some minimum computational capability and either a wired or wireless communications interface 130 for receiving and/or sending data to/from the computing device, or a removable and/or non-removable data storage for retrieving that data.
In general,
In fact, the invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held, laptop or mobile computer or communications devices such as cell phones and PDAs, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer in combination with various hardware modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
For example, with reference to
The communications interface 130 is generally used for connecting the computing device 100 to other devices via any conventional interface or bus structures, such as, for example, a parallel port, a game port, a universal serial bus (USB), an IEEE 1394 interface, a Bluetooth™ wireless interface, an IEEE 802.11 wireless interface, etc. Such interfaces 130 are generally used to store or transfer information or program modules to or from the computing device 100.
The input devices 140 generally include devices such as a keyboard and pointing device, commonly referred to as a mouse, trackball, or touch pad. Such input devices may also include other devices such as a joystick, game pad, satellite dish, scanner, radio receiver, and a television or broadcast video receiver, or the like. Conventional output devices 150 include elements such as computer monitors or other display devices, audio output devices, etc. Other input 140 and output 150 devices may include speech or audio input devices, such as a microphone or a microphone array, loudspeakers or other sound output devices, etc.
The data storage 160 of computing device 100 typically includes a variety of computer readable storage media. Computer readable storage media can be any available media that can be accessed by computing device 100 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data.
Computer storage media includes, but is not limited to, RAM, ROM, PROM, EPROM, EEPROM, flash memory, or other memory technology; CD-ROM, digital versatile disks (DVD), or other optical disk storage; magnetic cassettes, magnetic tape, magnetic disk storage, hard disk drives, or other magnetic storage devices. Computer storage media also includes any other medium or communications media which can be used to store, transfer, or execute the desired information or program modules, and which can be accessed by the computing device 100. Communication media typically embodies computer readable instructions, data structures, program modules or other data provided via any conventional information delivery media or system.
The computing device 100 may also operate in a networked environment using logical connections to one or more remote computers, including, for example, a personal computer, a server, a router, a network PC, a peer device, or other common network node, each of which typically includes many or all of the elements described above relative to the computing device 100.
The exemplary operating environments having now been discussed, the remaining part of this description will be devoted to a discussion of the program modules and processes embodying the “Character-Level Font Linker.”
3.0 Introduction:
A “Character-Level Font Linker,” as described herein, provides character-level linking of fonts via Unicode code-point to font mapping. In contrast to conventional dynamic font linking schemes which generally identify whether a font provides nominal support for a particular script (Latin, Cyrillic, Hebrew, Greek and Coptic, Japanese Hiragana, Latin Extended-B, Spacing Modifier Letters, IPA Extensions, Latin-1 Supplement, etc.), the Character-Level Font Linker operates based on a predefined lookup table, or the like, which identifies glyph-level support for particular characters on a Unicode code-point basis for each of a set of available fonts. In other words, the lookup table provided by the Character-Level Font Linker includes a Unicode code-point to font map that allows an immediate determination as to 1) whether a particular font supports a particular character with a corresponding glyph, or 2) given a particular character, which particular font(s) support it with a corresponding glyph.
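By way of illustration only, the two determinations described above can be sketched in Python. The table layout (font name mapped to a set of code-points), the font names, and the function names `font_supports` and `fonts_for` are assumptions made for this sketch and are not specified by the description.

```python
# Hypothetical lookup table: font name -> set of Unicode code-points that
# the font covers with an actual glyph. The entries are illustrative only.
LOOKUP_TABLE = {
    "FontA": {0x0041, 0x0042, 0x00E9},  # A, B, e with acute
    "FontB": {0x0041, 0x0180},          # A, b with stroke (Latin Extended-B)
}


def font_supports(font, char):
    """Determination 1: does this font provide a glyph for this character?"""
    return ord(char) in LOOKUP_TABLE.get(font, set())


def fonts_for(char):
    """Determination 2: which fonts provide a glyph for this character?"""
    return [font for font, points in LOOKUP_TABLE.items() if ord(char) in points]
```

Storing a set of code-points per font keeps the first query a constant-time membership test; a deployed table would more likely index by code-point as well so that the second query avoids scanning every font.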
3.1 System Overview:
As noted above, the Character-Level Font Linker described herein provides a system and method for ensuring that characters in a text string will be rendered with as few “white boxes” as possible by ensuring that fonts assigned to character runs segmented from a text string provide glyphs for each character in each run. In addressing such problems, the Character-Level Font Linker operates either by itself, or in combination with conventional font identification or font assignment systems.
For example, in the case where the Character-Level Font Linker operates in combination with existing font assignment systems, the conventional font selection system will select a default font for rendering one or more runs of text. Then, given this default font, the Character-Level Font Linker will examine whatever default font is selected for rendering a particular text string to determine whether that selected font includes actual glyphs to support each character of the current text run. If the run is supported with actual glyphs, the Character-Level Font Linker does not change the font assigned to those characters. However, in the case where the Character-Level Font Linker determines that the assigned font cannot support one or more characters of any run with glyphs, the Character-Level Font Linker operates as described herein to assign a new font or fonts to those characters prior to rendering, displaying, or printing those characters.
As noted above, the Character-Level Font Linker operates either by itself, or in combination with conventional font identification or font-linking systems. However, for purposes of explanation, the remaining detailed description will address the standalone case for font selection, as the operation of the combination case should be clear to those skilled in the art in view of the detailed description provided herein.
In general, the Character-Level Font Linker begins operation by parsing a text string to be rendered, displayed and/or printed (hereinafter referred to as simply “rendering” or “rendered”) to identify runs of characters that have glyph-level support for all characters in the run with respect to a particular font. Glyph support for particular characters is determined by comparing the Unicode code-point of each character to corresponding entries for the various fonts represented in the lookup table.
In the case where there is a default font (a user specified or preferred font), the Character-Level Font Linker tests that font with respect to the Unicode code-point of the first character of a run (which begins with the first character of the text string) to determine whether that font supports that first character with a glyph. If so, then the Character-Level Font Linker tests the next character, and so on, until a character is found in the text string that is not supported by the current font. Once an unsupported character is identified, the Character-Level Font Linker queries the lookup table to identify a new font that will support that character with a glyph. The newly identified font is then assigned to the current character, which is also used as the beginning of a new run of characters.
In the case where there is no default font, the Character-Level Font Linker simply compares the Unicode code-point of the first character to the lookup table to identify an initial font that includes glyph support for that character. The Character-Level Font Linker then proceeds as summarized above with respect to the subsequent characters in the text string.
In view of the preceding paragraphs, it should be clear that character runs are delimited by examining the characters in the text string relative to the lookup table to find contiguous sets of one or more characters supported by particular fonts that provide a glyph for each character in the run. However, this basic font selection method is further modified in various additional embodiments.
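The run delimiting summarized in the preceding paragraphs can be sketched as follows. This is a minimal Python sketch under stated assumptions: the lookup table is modeled as a dict of font name to code-point set, the first matching font in table order is chosen, and characters that no font covers are grouped into a run whose font is `None`. The function name `segment_runs` is hypothetical.

```python
def segment_runs(text, table, default_font=None):
    """Split text into (font, substring) runs where each font covers its run."""
    runs = []
    current_font = default_font
    start = 0
    for i, ch in enumerate(text):
        cp = ord(ch)
        if current_font is not None and cp in table.get(current_font, set()):
            continue  # current font has a glyph, so the run is extended
        # Unsupported character: query the table for a font that covers it.
        new_font = next((f for f, pts in table.items() if cp in pts), None)
        if new_font == current_font:
            continue  # only reached when both are None: still no coverage
        if i > start:
            runs.append((current_font, text[start:i]))  # close the open run
        current_font, start = new_font, i  # this character begins a new run
    if start < len(text):
        runs.append((current_font, text[start:]))
    return runs
```

For example, with a table of `{"Latin": ..., "Greek": ...}` and `"Latin"` as the default font, the string `"ab αβ c"` is split into a Latin run, a Greek run, and a final Latin run.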
For example, in one embodiment, the lookup table includes a default or user assigned font selection priority. This priority is useful since, for many Unicode code-points, there will be multiple fonts that provide a glyph for the corresponding character. In this case, font selection is achieved by selecting higher priority fonts first when identifying those fonts that support a particular character with an actual glyph.
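One possible reading of this priority embodiment is sketched below in Python. The representation of priority as a dict of font name to rank (lower rank meaning higher priority) and the name `pick_font` are assumptions of this sketch; the description does not state how priorities are stored.

```python
def pick_font(char, table, priority):
    """Return the highest-priority font that has a glyph for char, else None."""
    candidates = [f for f, pts in table.items() if ord(char) in pts]
    # Lower rank number means higher priority; unranked fonts sort last.
    return min(candidates, key=lambda f: priority.get(f, float("inf")), default=None)
```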
In various related embodiments, consideration is given to overall uniformity or consistency of the text string to be rendered. For example, while it may be possible to associate many unique fonts to a text string for rendering all of the characters in that text string, the use of a large number of fonts will tend to reduce the overall uniformity of the rendered text. As a result, in various embodiments, the Character-Level Font Linker will automatically reduce the total number of fonts used by selecting the fewest fonts possible for rendering the overall text string. To accomplish this, the Character-Level Font Linker will first identify all of the fonts included in the lookup table that support each character of the text string, and will then perform a set minimization operation, guided by heuristic rules such as uniformity in terms of font family or style, to find the font, or smallest set of fonts, that will provide glyph support for the characters of the overall text string.
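The description does not name a particular set minimization algorithm. As one illustrative possibility, a greedy set-cover heuristic is sketched below in Python; it is a swapped-in technique that does not guarantee the true minimum, and it ignores the family and style heuristics mentioned above. The name `minimal_font_set` is hypothetical.

```python
def minimal_font_set(text, table):
    """Greedily choose fonts until every coverable character has a glyph.

    Returns (chosen_fonts, uncovered_code_points).
    """
    needed = {ord(ch) for ch in text}
    chosen = []
    while needed:
        # Pick the font that covers the most still-uncovered code-points.
        best = max(table, key=lambda f: len(table[f] & needed), default=None)
        if best is None or not (table[best] & needed):
            break  # the remaining characters have no glyph in any font
        chosen.append(best)
        needed -= table[best]
    return chosen, needed
```

A single font covering the whole string is selected first when one exists; otherwise fonts are added in order of how many remaining characters each one covers.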
In a related embodiment, the Character-Level Font Linker is limited by a default font (user selected or preferred font), such that all characters supported by that font (according to the lookup table) will be rendered using that font. All of the remaining characters will then be rendered by other fonts by consulting the lookup table, again with the limitation that the total number of fonts used to render the remaining characters is minimized to ensure the greatest overall uniformity of the rendered text.
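A minimal Python sketch of this default-font embodiment follows, again assuming a dict-based lookup table and a greedy choice of fallback fonts (an assumption, since the minimization method is not specified). Characters with no glyph in any font are simply left out of the returned mapping. The name `assign_with_default` is hypothetical.

```python
def assign_with_default(text, table, default_font):
    """Map each character to a font, preferring default_font, then few extras."""
    default_pts = table.get(default_font, set())
    # Every character the default font covers is rendered with the default font.
    assignment = {ch: default_font for ch in text if ord(ch) in default_pts}
    remaining = {ch for ch in text if ch not in assignment}
    others = {f: pts for f, pts in table.items() if f != default_font}
    while remaining:
        # Greedy step: the fallback font covering the most leftover characters.
        best = max(
            others,
            key=lambda f: sum(ord(c) in others[f] for c in remaining),
            default=None,
        )
        covered = {c for c in remaining if best is not None and ord(c) in others[best]}
        if not covered:
            break  # nothing left that any fallback font can render
        assignment.update({c: best for c in covered})
        remaining -= covered
    return assignment
```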
Once all of the runs have been identified and assigned corresponding supporting fonts, the text string is rendered by using conventional techniques for displaying and/or printing the glyphs corresponding to the characters in the text string by using the fonts assigned to each run of characters.
3.2 System Architectural Overview:
The processes summarized above are illustrated by the general system diagram of
In general, as illustrated by
As noted above, the lookup table 315 indicates, for every locally available font included in the table, which Unicode code-points are actually supported by each of those fonts with actual glyphs. Therefore, given the code-point for every character of the text data 305, the data parsing module is able to construct the text runs 330 that are supported by single fonts by consulting the lookup table 315.
In one embodiment, if the data parsing module 310 is unable to find a local font that provides a glyph for a particular character of the text data 305, the data parsing module calls a font/glyph retrieval module 320 which connects to a remote font store 325 maintained by one or more remote servers. The font/glyph retrieval module 320 provides the code-point of the needed glyph to the remote font store 325, which then returns either an entire font, or an individual glyph that will support the character that is not supported by a local font store 340 as indicated by the lookup table 315. The returned font or individual glyph is then added to the local font store, and a mapping update module 345 updates the lookup table 315 with the character/script support information of the new font or glyph.
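The fallback to a remote font store might look like the following Python sketch. The remote store is modeled as a plain callable `fetch_remote` that returns either `None` or a `(font_name, code_points)` pair; that interface, along with the name `resolve_font`, is an assumption made for illustration and does not describe any real service.

```python
def resolve_font(cp, table, local_store, fetch_remote):
    """Return a font name with a glyph for code-point cp, fetching if needed."""
    for font, points in table.items():
        if cp in points and font in local_store:
            return font  # a locally available font already covers cp
    fetched = fetch_remote(cp)  # hypothetical call to the remote font store
    if fetched is None:
        return None  # neither local nor remote support exists
    name, points = fetched
    local_store.add(name)  # add the returned font to the local store
    table.setdefault(name, set()).update(points)  # mapping update
    return name
```

Because the lookup table is updated in place, a later request for any code-point covered by the fetched font is answered locally without another remote call.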
In either case, once all of the text runs 330 have been assigned fonts by the data parsing module, those runs are provided to a text rendering module 335 which calls the local font store 340 to render the text data 305 using conventional font rendering techniques.
As noted above, in one embodiment, the local font store 340 can be updated, either by adding or deleting fonts. Such updates can occur automatically because of the actions of some local or remote application, or can occur via manual user action via a user input module 350. In either case, in one embodiment, additions to the local font store 340 trigger the mapping update module 345 to evaluate the newly added fonts to add the character/script support information to the lookup table 315. Similarly, deletions from the local font store 340 trigger the mapping update module 345 to remove the corresponding character/script support information from the lookup table 315.
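The coupling between the local font store and the lookup table can be illustrated with a toy Python class. The class name `FontStore`, its methods, and the modeling of a font as a dict of code-point to glyph data (with `None` marking a missing glyph) are all assumptions of this sketch.

```python
class FontStore:
    """Toy local font store that keeps a lookup table in sync (assumed design)."""

    def __init__(self):
        self.fonts = {}         # font name -> {code-point: glyph data or None}
        self.lookup_table = {}  # font name -> set of supported code-points

    def add_font(self, name, glyphs):
        self.fonts[name] = dict(glyphs)
        # Mapping update: record only code-points that have an actual glyph.
        self.lookup_table[name] = {cp for cp, g in glyphs.items() if g is not None}

    def remove_font(self, name):
        # Deleting a font also removes its support information from the table.
        self.fonts.pop(name, None)
        self.lookup_table.pop(name, None)
```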
In another embodiment, the user can trigger updates to the lookup table 315 via the user input module 350 at any time the user desires. In a related embodiment, the user is provided with the capability to manually access and modify the lookup table 315 via the user input module 350. One example of a user modification to the lookup table includes the capability to manually specify the use of one code-point as a substitute for another code-point, either globally, or with respect to one or more particular fonts. The result of such a modification is that the Character-Level Font Linker will automatically cause a user specified glyph to be rendered whenever a particular character is included in the text data 305.
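A user-specified code-point substitution, applied either globally or for particular fonts, could be represented as sketched below in Python. The nested dict layout, the use of a `None` key for global rules, and the name `effective_code_point` are assumptions of this sketch.

```python
def effective_code_point(cp, font, substitutions):
    """Apply a user substitution, preferring a per-font rule over a global one."""
    per_font = substitutions.get(font, {})
    if cp in per_font:
        return per_font[cp]
    # The None key holds global rules; with no rule, cp is returned unchanged.
    return substitutions.get(None, {}).get(cp, cp)
```

For instance, a global rule might map U+2019 (right single quotation mark) to U+0027 (apostrophe), while a rule for one particular font maps it elsewhere.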
4.0 Operation Overview:
The above-described program modules are employed for implementing the Character-Level Font Linker described herein. As summarized above, this Character-Level Font Linker provides a system and method for ensuring that characters in a text string will be rendered with as few “white boxes” as possible by ensuring that fonts assigned to character runs segmented from a text string provide glyphs for each character in each run. The following sections provide a detailed discussion of the operation of the Character-Level Font Linker, and of exemplary methods for implementing the program modules described in Section 3.2.
4.1 Operational Details of the Character-Level Font Linker:
The following paragraphs detail specific operational embodiments of the Character-Level Font Linker described herein. In particular, the following paragraphs describe an overview of the lookup table with optional remote font/glyph retrieval; text string parsing; text rendering; and operational flow of the Character-Level Font Linker.
4.2 Unicode Code-Point to Font Mapping Table:
As noted above, the “Unicode Code-Point to Font Mapping Table,” also referred to herein as the “lookup table,” provides, for every font included in the table, an indication of which Unicode code-points are actually supported by each font with actual glyphs. In general, the lookup table serves at least two primary purposes: 1) it covers as many Unicode code-points as possible, given a particular set of available fonts; and 2) the use of the lookup table allows the Character-Level Font Linker to use as few fonts as possible when rendering a particular text string.
In one embodiment, construction of the lookup table is performed offline (remotely) based on an automatic evaluation of each of a set of default fonts expected to be available to the user. In general, construction of the lookup table involves examining every code-point of each font for each of the scripts nominally supported by that font to determine whether there is an actual glyph for each corresponding code point. Further, in the unlikely case that a particular font fails to indicate support for a particular script (or any script at all) it is possible to examine every possible code-point for the font to determine what characters are actually supported with glyphs. Since construction is performed offline in one embodiment, the fact that there are approximately one-million code-points in the Unicode international standard isn't a significant concern since such computations can be performed once for each font, with the results then being provided to many end users in the form of the lookup table.
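The construction procedure described above can be sketched in Python as follows. The predicate `has_glyph` is a hypothetical stand-in for reading a real font's character map, and `SCRIPT_RANGES` lists only two Unicode blocks for illustration; both, along with the name `build_entry`, are assumptions of this sketch.

```python
SCRIPT_RANGES = {  # illustrative subset of Unicode blocks
    "Basic Latin": range(0x0000, 0x0080),
    "Latin Extended-B": range(0x0180, 0x0250),
}
MAX_CODE_POINT = 0x10FFFF  # last code-point in the Unicode code space


def build_entry(declared_scripts, has_glyph):
    """Return the set of code-points for which has_glyph(cp) is true.

    Only the declared scripts are examined; if the font declares none,
    every possible code-point is examined instead.
    """
    if declared_scripts:
        candidates = (cp for s in declared_scripts for cp in SCRIPT_RANGES[s])
    else:
        candidates = range(MAX_CODE_POINT + 1)  # no header info: scan everything
    return {cp for cp in candidates if has_glyph(cp)}
```

A font that declares Latin Extended-B but lacks a glyph for U+0180 yields an entry without that code-point, which is exactly the gap that script-level support information cannot reveal.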
As noted above, in various embodiments, the lookup table can also be constructed, updated, or edited locally by individual users. In this case, the lookup table contains the same type of data (actual glyph support for each corresponding code-point for one or more locally available fonts) as the lookup table constructed offline. As discussed above, in one embodiment, the lookup table is user editable via a user interface. Similarly, in various related embodiments, the lookup table is updated whenever one or more fonts are added or deleted from the user's computer system. Such updates are performed either automatically, or upon user request, by automatically evaluating one or more locally available fonts to determine which Unicode code-points are actually supported by each local font with actual glyphs.
Further, also as noted above, in one embodiment, when the Character-Level Font Linker optionally downloads a font or glyph to support a particular character, corresponding updates to the lookup table are performed to indicate local support for that character for use in rendering subsequent text data.
4.3 Text String Parsing:
As discussed above, parsing of the text data or text string involves segmenting that data into a number of “text runs” or “character runs” that are each supported by an individual font. In general, this parsing involves a character level comparison of the text data (as a function of the Unicode code-points associated with each character) to the glyph support information included in the lookup table.
In particular, the Character-Level Font Linker begins this parsing by first identifying a font that supports the first character for the text. If the first character has no font support (according to the lookup table), then the Character-Level Font Linker will examine each succeeding character until a character has font support. The font selected for the current run is referred to as the current font. The Character-Level Font Linker will then terminate the current run at the first subsequent character that is not supported by the current font or that is supported by the default font if the current font is not the default font (See
As noted above, the lookup table is consulted to identify a font that supports each particular character (based on the code-point of each character). However, in the case that the lookup table is constructed remotely and provided to a local user, it is possible that the user will not have a particular font that is included in the lookup table. Consequently, in one embodiment, the Character-Level Font Linker will first evaluate the lookup table to identify a font that supports a particular character. The Character-Level Font Linker will then scan the local system (or a list of local fonts) to see if the identified font is actually available. If the identified font is not available, then the Character-Level Font Linker will either 1) reevaluate the lookup table to identify another font followed by another check of the locally available fonts until a match between a supporting font and a locally available font is made, or 2) fetch that font (or part of that font, e.g. one glyph) from a remote store.
Further, as discussed above, in one embodiment, assignment of fonts to particular runs, and thus the particular segmentation of runs from the text data, is performed to minimize the number of fonts used to render the text. Consequently, in this embodiment, runs are not actually delimited until a determination is made as to the smallest set of fonts that can be used, as described above.
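The run termination rule given earlier in this section (a run ends at the first character that the current font does not support, or that the default font supports when the current font is not the default) can be sketched in Python as shown below. The handling of unsupported characters in the middle of the text, which are skipped here in the same way as leading ones, is an assumption, as are the dict-based table and the name `parse_runs`.

```python
def parse_runs(text, table, default_font):
    """Return (font, run_text) pairs following the termination rule above."""

    def supports(font, ch):
        return font is not None and ord(ch) in table.get(font, set())

    def choose(ch):
        # Prefer the default font; otherwise take the first font covering ch.
        if supports(default_font, ch):
            return default_font
        return next((f for f in table if supports(f, ch)), None)

    runs = []
    i = 0
    while i < len(text):
        font = choose(text[i])
        if font is None:
            i += 1  # no font support: move on to the next character
            continue
        start = i
        i += 1
        # Extend the run while the current font covers the character and,
        # for a non-default font, the default font does not cover it.
        while i < len(text) and supports(font, text[i]) and not (
            font != default_font and supports(default_font, text[i])
        ):
            i += 1
        runs.append((font, text[start:i]))
    return runs
```

The effect of the second condition is that a fallback font is used only for as long as necessary, and rendering returns to the default font at the first character the default font can display.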
4.4 Text Rendering:
As noted above, the Character-Level Font Linker parses a text input into a number of text or character runs, with each run including an assigned font that includes glyph support for each character in each run. Consequently, once this information is available, the Character-Level Font Linker simply renders the text using the assigned font for each run. Rendering of text using assigned fonts (and formatting) is well known to those skilled in the art and will not be described in detail herein.
4.5 Operational Flow of the Character-Level Font Linker:
The processes described above with respect to
The Character-Level Font Linker keeps track of a current font and current character during processing. In general, as illustrated by
If there is no default font 405, the Character-Level Font Linker queries 425 the lookup table 315 to identify a supporting font for the first character of the text data 305, sets the identified supporting font as the current font, and begins 420 a text run with that character.
The next character is then set as the current character 430. Then, to process each new current character, there are three basic scenarios:
The above described processes (boxes 425 through 480 of
In addition to the embodiments illustrated in
The foregoing description of the Character-Level Font Linker has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. Further, it should be noted that any or all of the aforementioned alternate embodiments may be used in any combination desired to form additional hybrid embodiments of the Character-Level Font Linker. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.