This application claims priority to EP 16 290 146.6, filed Jul. 29, 2016, which is hereby incorporated by reference.
The present description relates generally to the field of handwriting recognition systems and methods using computing device interfaces. The present description relates more specifically to systems and methods for beautifying digital ink rendered from superimposed handwriting input to such computing device interfaces.
Computing devices continue to become more ubiquitous to daily life. They take the form of computer desktops, laptop computers, tablet computers, hybrid computers (2-in-1s), e-book readers, mobile phones, smartphones, wearable computers, global positioning system (GPS) units, enterprise digital assistants (EDAs), personal digital assistants (PDAs), game consoles, and the like. Further, computing devices are being incorporated into vehicles and equipment, such as cars, trucks, farm equipment, manufacturing equipment, building environment control (e.g., lighting, HVAC), and home and commercial appliances.
Computing devices generally consist of at least one processing element, such as a central processing unit (CPU), some form of memory, and input and output devices. The variety of computing devices and their subsequent uses necessitate a variety of interfaces and input devices. One such input device is a touch sensitive surface such as a touch screen or touch pad wherein user input is received through contact between the user's finger or an instrument such as a pen or stylus and the touch sensitive surface. Another input device is an input interface that senses gestures made by a user above the input interface. A further input device is a position detection system which detects the relative position of either touch or non-touch interactions with a non-touch physical or virtual surface. Any of these methods of input can be used generally for drawing or inputting text. The user's handwriting is interpreted using a handwriting recognition system or method.
One application of handwriting recognition in non-portable computing devices, such as in-vehicle control and entertainment systems, and in portable computing devices, such as smartphones, phablets and tablets, is for the input of text into various applications run by the computing devices in a manner similar to that traditionally done with a keyboard, either physical or virtual, and voice. However, unlike keyboard input which is governed by strict layout rules, such as text entry on visible or invisible lines in accordance with the position of a displayed cursor, handwriting can be input virtually anywhere on the input interface of the device. Further, unlike voice input for text input, which generally involves speech-to-text conversion, handwriting for text input typically involves conversion of so-called ‘digital ink’, being the rendered display of the user's handwriting or ‘raw ink’, into typed or ‘typeset ink’ with associated typeset structure.
The substantially unconstrained input of handwriting has some effect on the users' ability to interact with the input once it has been rendered as digital ink. This is because, unlike typeset ink text which has characters relatively located in a uniform and known manner, the characters of the digital ink text are not uniform or in relatively known positions, generally due to the irregularity inherent in handwriting. Accordingly, conventional systems and methods typically only allow further interaction to be made on the typeset ink rendered from the recognized text. The conversion to the typeset ink also has problems with respect to interpretation of the relatively free positioning of the digital ink.
These problems of conversion and interaction are exacerbated in certain use environments and cases involving relatively small input interfaces, such as wearables and smartphones, and/or lack of visibility of the interface during input, such as in automotive environments and use cases involving entry during motion, such as performed on smartphones by so-called “petextrians”, e.g., input on screen during walking. For example, in automotive use cases, the driver may not be able to look at the input interface to see what they are typing or writing during driving. Typically, voice recognition systems have been used for such use cases. However, the inherent noisy environment of driving can make such systems inaccurate and ineffective. The same can be said for public use environments which are inherently noisy or require quiet.
For such use cases, handwriting recognition systems that allow input of superimposed handwritten characters in script and cursive or linked form have been developed by the present Applicant and Assignee. These systems are described in United States Patent Publication Nos. 2015/0286886 and 2015/0356360 filed in the name of the present Applicant and Assignee, the entire contents of which are incorporated by reference herein. These systems provide accurate recognition of the input of text and non-text content and commands using superimposed handwriting. That is, at least some of the handwriting is overlaid on previous handwriting on the input interface. This allows a relatively small interface footprint and the ability of users to handwrite without regard to the position of that handwriting, such that ‘no-look’ input is supported.
It is desired to allow users to view the input, so they know that it was recognized accurately, so they can later read the input for further input using non-superimposed handwriting, for example, or so they can interact with their handwriting to edit or add to the input (e.g., add or delete characters or words). Some available applications provide alignment of digital ink through normalization and the like. However, these do not take into account certain geometrical features of the handwriting and therefore the aligning is not properly performed to allow faithful reproduction of the original digital ink or full interaction with it. Further, they do not use recognition to provide these operations, and as such do not provide interaction capabilities within the input signal. Accordingly, later interaction editing can only be performed at the word level, not at the character level, for example.
The examples of the present disclosure that are described herein below provide methods, systems and a computer program product for use in beautifying superimposed handwriting on computing devices.
In some implementations, the present disclosure provides a system for beautifying superimposed handwriting on computing devices, each computing device comprising a processor and at least one non-transitory computer readable medium for processing handwriting input under control of the processor. The at least one non-transitory computer readable medium may be configured to receive handwritten input as a plurality of at least partially superimposed input strokes, determine first geometrical information of a plurality of characters defined by the handwriting input, determine a structuring transformation of the handwritten input in accordance with the first geometrical information to align the plurality of characters in non-superimposed order, and cause display of, on a display interface of the computing device, digital ink in accordance with the structuring transformation.
The first geometrical information may be determined from widths of at least some of the characters.
The structuring transformation may be determined to include at least inter-character spacing between at least some of the characters, wherein the inter-character spacing may be defined by the determined widths.
The structuring transformation may be determined to include inter-word spacing between words defined by at least some of the characters, wherein the inter-word spacing may be defined by the determined widths.
The at least one non-transitory computer readable medium may be configured to determine second geometrical information of the characters, wherein the structuring transformation may be determined in accordance with the first and second geometrical information.
The second geometrical information may be determined from typography parameters of at least some of the characters.
The at least one non-transitory computer readable medium may be configured to determine third geometrical information of an alignment structure of the display interface, wherein the structuring transformation is determined in accordance with the first, second and third geometrical information so that the digital ink is displayed in accordance with the alignment structure.
The alignment structure may be a line pattern, and the third geometrical information may be determined from typography parameters of the line pattern.
In some implementations, the present disclosure provides a method for beautifying superimposed handwriting on computing devices, each computing device comprising a processor and at least one non-transitory computer readable medium for processing handwriting input under control of the processor. The method may comprise receiving handwritten input as a plurality of at least partially superimposed input strokes, determining first geometrical information of a plurality of characters defined by the handwriting input, determining a structuring transformation of the handwritten input in accordance with the first geometrical information to align the plurality of characters in non-superimposed order, and displaying, on a display interface of the computing device, digital ink in accordance with the structuring transformation.
The first geometrical information may be determined from widths of at least some of the characters.
The structuring transformation may be determined to include at least inter-character spacing between at least some of the characters, wherein the inter-character spacing may be defined by the determined widths.
The structuring transformation may be determined to include inter-word spacing between words defined by at least some of the characters, wherein the inter-word spacing may be defined by the determined widths.
The method may further comprise determining second geometrical information of the characters, wherein the structuring transformation may be determined in accordance with the first and second geometrical information.
The second geometrical information may be determined from typography parameters of at least some of the characters.
The method may further comprise determining third geometrical information of an alignment structure of the display interface, wherein the structuring transformation may be determined in accordance with the first, second and third geometrical information so that the digital ink is displayed in accordance with the alignment structure.
The alignment structure may be a line pattern, and the third geometrical information may be determined from typography parameters of the line pattern.
In some implementations, the present disclosure provides a non-transitory computer readable medium having a computer readable program code embodied therein. The computer readable program code may be adapted to be executed to implement a method for beautifying superimposed handwriting on a computing device, the computing device comprising a processor and at least one system non-transitory computer readable medium for processing handwriting input under control of the processor. The method may comprise receiving handwritten input as a plurality of at least partially superimposed input strokes, determining first geometrical information of a plurality of characters defined by the handwriting input, determining a structuring transformation of the handwritten input in accordance with the first geometrical information to align the plurality of characters in non-superimposed order, and displaying, on a display interface of the computing device, digital ink in accordance with the structuring transformation.
The first geometrical information may be determined from widths of at least some of the characters.
The structuring transformation may be determined to include at least inter-character spacing between at least some of the characters, wherein the inter-character spacing may be defined by the determined widths.
The structuring transformation may be determined to include inter-word spacing between words defined by at least some of the characters, wherein the inter-word spacing may be defined by the determined widths.
The method may further comprise determining second geometrical information of the characters, wherein the structuring transformation may be determined in accordance with the first and second geometrical information.
The second geometrical information may be determined from typography parameters of at least some of the characters.
The method may further comprise determining third geometrical information of an alignment structure of the display interface, wherein the structuring transformation may be determined in accordance with the first, second and third geometrical information so that the digital ink is displayed in accordance with the alignment structure.
The alignment structure may be a line pattern, and the third geometrical information may be determined from typography parameters of the line pattern.
The present system and method will be more fully understood from the following detailed description of the examples thereof, taken together with the drawings. In the drawings like reference numerals depict like elements. In the drawings:
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. However, it should be apparent to those skilled in the art that the present teachings may be practiced without such details. In other instances, well known methods, procedures, components, and/or circuitry have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present teachings.
Reference to and discussion of directional features such as up, down, above, below, lowest, highest, horizontal, vertical, etc., are made with respect to the Cartesian coordinate system as applied to the input interface on which the input to be recognized is made. Further, the use of the term ‘text’ in the present description is understood as encompassing all alphanumeric characters, and strings thereof, in any written language and common place non-alphanumeric characters, e.g., symbols, used in written text. Furthermore, the term ‘non-text’ in the present description is understood as encompassing freeform handwritten content and rendered text and image data, as well as non-alphanumeric characters, and strings thereof, and alphanumeric characters, and strings thereof, which are used in non-text contexts.
The various technologies described herein generally relate to capture, processing and management of handwritten content on portable and non-portable computing devices in a manner which allows conversion of that content into publishable documents. The systems and methods described herein may utilize recognition of users' natural writing or drawing styles input to a computing device via an input interface, such as a touch sensitive screen, connected to, or of, the computing device or via an input device, such as a digital pen or mouse, connected to the computing device. Whilst the various examples are described with respect to recognition of handwriting input using so-called online recognition techniques, it is understood that application is possible to other forms of input for recognition, such as offline recognition in which images rather than digital ink are recognized.
The terms hand-drawing and handwriting are used interchangeably herein to define the creation of digital content by users through use of their hands either directly onto a digital or digitally connected medium or via an input tool, such as a hand-held stylus. The term “hand” is used herein to provide concise description of the input techniques; however, the use of other parts of a user's body for similar input is included in this definition, such as foot, mouth and eye.
The illustrated example of the computing device 100 has at least one display 102 for outputting data from the computing device such as images, text, and video. The display 102 may use LCD, plasma, LED, iOLED, CRT, or any other appropriate technology that is or is not touch sensitive as known to those of ordinary skill in the art. At least some of the display 102 is co-located with at least one input interface 104. The input interface 104 may employ technology such as resistive, surface acoustic wave, capacitive, infrared grid, infrared acrylic projection, optical imaging, dispersive signal technology, acoustic pulse recognition, or any other appropriate technology as known to those of ordinary skill in the art to receive user input. The input interface 104 may be bounded by a permanent or video-generated border that clearly identifies its boundaries. Instead of, or additional to, an on-board display, the computing device 100 may have a projected display capability.
The computing device 100 may include one or more additional I/O devices (or peripherals) that are communicatively coupled via a local interface. The additional I/O devices may include input devices such as a keyboard, mouse, scanner, microphone, touchpads, bar code readers, laser readers, radio-frequency device readers, or any other appropriate technology known to those of ordinary skill in the art. Further, the I/O devices may include output devices such as a printer, bar code printers, or any other appropriate technology known to those of ordinary skill in the art. Furthermore, the I/O devices may include communications devices that communicate both inputs and outputs such as a modulator/demodulator (modem; for accessing another device, system, or network), a radio frequency (RF) or other transceiver, a telephonic interface, a bridge, a router, or any other appropriate technology known to those of ordinary skill in the art. The local interface may have additional elements to enable communications, such as controllers, buffers (caches), drivers, repeaters, and receivers, which are omitted for simplicity but known to those of skill in the art. Further, the local interface may include address, control, and/or data connections to enable appropriate communications among the other computer components.
The computing device 100 also includes a processor 106, which is a hardware device for executing software, particularly software stored in memory 108. The processor can be any custom made or commercially available general purpose processor, a central processing unit (CPU), commercially available microprocessors including a semiconductor based microprocessor (in the form of a microchip or chipset), microcontroller, digital signal processor (DSP), application specific integrated circuit (ASIC), field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, state machine, or any combination thereof designed for executing software instructions known to those of ordinary skill in the art.
The memory 108 can include any one or a combination of volatile memory elements (e.g., random access memory (RAM, such as DRAM, SRAM, or SDRAM)) and nonvolatile memory elements (e.g., ROM, EPROM, flash PROM, EEPROM, hard drive, magnetic or optical tape, memory registers, CD-ROM, WORM, DVD, redundant array of inexpensive disks (RAID), another direct access storage device (DASD), or any other magnetic, resistive or phase-change nonvolatile memory). Moreover, the memory 108 may incorporate electronic, magnetic, optical, and/or other types of storage media. The memory 108 can have a distributed architecture where various components are situated remote from one another but can also be accessed by the processor 106. Further, the memory 108 may be remote from the device, such as at a server or cloud-based system, which is remotely accessible by the computing device 100. The memory 108 is coupled to the processor 106, so the processor 106 can read information from and write information to the memory 108. In the alternative, the memory 108 may be integral to the processor 106. In another example, the processor 106 and the memory 108 may both reside in a single ASIC or other integrated circuit.
The software in the memory 108 includes an operating system 110 and an ink management system 112 in the form of a non-transitory computer readable medium having a computer readable program code embodied therein. The software optionally further includes a handwriting recognition (HWR) system 114 which may each include one or more separate computer programs. Each of these has an ordered listing of executable instructions for implementing logical functions. The operating system 110 controls the execution of the ink management system 112 (and the HWR system 114). The operating system 110 may be any proprietary operating system or a commercially or freely available operating system, such as WEBOS, WINDOWS®, MAC and IPHONE OS®, LINUX, and ANDROID. It is understood that other operating systems may also be utilized. Alternatively, the ink management system 112 of the present system and method may be provided without use of an operating system.
The ink management system 112 includes one or more processing elements related to detection, management and treatment of user input (discussed in detail later). The software may also include one or more other applications related to handwriting recognition, different functions, or both. Some examples of other applications include a text editor, telephone dialer, contacts directory, instant messaging facility, computer-aided design (CAD) program, email program, word processing program, web browser, and camera. The ink management system 112, and the other applications, include program(s) provided with the computing device 100 upon manufacture and may further include programs uploaded or downloaded into the computing device 100 after manufacture.
The HWR system 114, with support and compliance capabilities, may be a source program, executable program (object code), script, application, or any other entity having a set of instructions to be performed. When a source program, the program needs to be translated via a compiler, assembler, interpreter, or the like, which may or may not be included within the memory, so as to operate properly in connection with the operating system. Furthermore, the HWR system with support and compliance capabilities can be written in (a) an object oriented programming language, which has classes of data and methods; (b) a procedural programming language, which has routines, subroutines, and/or functions, for example but not limited to C, C++, Pascal, Basic, Fortran, Cobol, Perl, Java, Objective C, Swift, and Ada; or (c) a functional programming language, for example but not limited to Hope, Rex, Common Lisp, Scheme, Clojure, Racket, Erlang, OCaml, Haskell, Prolog, and F#.
Alternatively, the HWR system 114 may be a method or system for communication with a handwriting or mixed input recognition system that is remote from the device, such as a server or cloud-based system, but is remotely accessible by the computing device 100 through communications links using the afore-mentioned communications I/O devices of the computing device 100. Further, the ink management system 112 and the HWR system 114 may operate together or be combined as a single application. Further still, the ink management system 112 and/or the HWR system 114 may be integrated within the operating system 110.
Strokes entered on or via the input interface 104 are processed by the processor 106 as digital ink. A user may enter a stroke with a finger or some instrument such as a pen or stylus suitable for use with the input interface. The user may also enter a stroke by making a gesture above the input interface 104 if technology that senses or images motion in the vicinity of the input interface 104 is being used, or with a peripheral device of the computing device 100, such as a mouse or joystick, or with a projected interface, e.g., image processing of a passive plane surface to determine the stroke and gesture signals.
A stroke is characterized by at least the stroke initiation location, the stroke termination location, and the path connecting the stroke initiation and termination locations. Further information such as timing, pressure, and angle at a number of sample points along the path may also be captured to provide deeper detail of the strokes. Because different users may naturally write the same object, e.g., a letter, a shape, a symbol, with slight variations, the HWR system accommodates a variety of ways in which each object may be entered whilst being recognized as the correct or intended object.
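As a minimal sketch of the stroke characterization above, a stroke could be held in a small data structure such as the following. The class name `Stroke`, its field names, and the choice to store optional per-sample timing and pressure as parallel lists are illustrative assumptions, not structures defined in the present description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Stroke:
    """One pen-down to pen-up trace (hypothetical structure)."""

    path: List[Point]                          # sampled (x, y) points, in input order
    timestamps: Optional[List[float]] = None   # optional, one value per sample
    pressures: Optional[List[float]] = None    # optional, one value per sample

    def __post_init__(self) -> None:
        if not self.path:
            raise ValueError("a stroke needs at least one sample point")

    @property
    def initiation(self) -> Point:
        """Location where the stroke starts."""
        return self.path[0]

    @property
    def termination(self) -> Point:
        """Location where the stroke ends."""
        return self.path[-1]


stroke = Stroke(path=[(0.0, 0.0), (1.0, 2.0), (2.0, 0.5)])
```

Here the initiation and termination locations are simply the first and last samples of the path, so they never fall out of step with it.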
The recognition stage 118 may include different processing elements or experts.
The segmentation expert 122 defines the different ways to segment the input strokes into individual element hypotheses, e.g., alphanumeric characters and mathematical operators, text characters, individual shapes, or sub expression, in order to form expressions, e.g., words, mathematical equations, or groups of shapes. For example, the segmentation expert 122 may form the element hypotheses by grouping consecutive strokes of the original input to obtain a segmentation graph where each node corresponds to at least one element hypothesis and where adjacency constraints between elements are handled by the node connections. Alternatively, the segmentation expert 122 may employ separate experts for different input types, such as text, drawings, equations, and music notation. Nodes of the graph are considered adjacent if the corresponding hypotheses have no common stroke and their strokes are consecutive in the original input.
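The grouping of consecutive strokes into a segmentation graph can be illustrated with a short sketch. It assumes that each node is identified only by the index range of the strokes it groups and that a hypothetical parameter `max_group` caps the number of strokes per hypothesis; neither detail is taken from the present description.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (first stroke index, last stroke index), inclusive


def build_segmentation_graph(n_strokes: int, max_group: int = 3) -> Dict[Node, List[Node]]:
    """Group consecutive strokes into element hypotheses and link adjacent ones."""
    nodes = [
        (start, start + size - 1)
        for start in range(n_strokes)
        for size in range(1, max_group + 1)
        if start + size <= n_strokes
    ]
    # Two hypotheses are adjacent when they share no stroke and the second one
    # begins with the stroke that immediately follows the last stroke of the first.
    return {a: [b for b in nodes if b[0] == a[1] + 1] for a in nodes}


graph = build_segmentation_graph(3, max_group=2)
```

With three strokes and groups of at most two strokes, the hypothesis covering stroke 0 alone is followed either by stroke 1 alone or by strokes 1 and 2 together, so every path through the graph consumes each stroke exactly once.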
The segmentation expert 122 is not limited to handprint writing input where each individual character is separated from its neighbor characters with a pen-up (such as shown in
Alternatively, with respect to cursive input, the HWR system 114 may further be configured to operate as described in the afore-incorporated by reference United States Patent Publication No. 2015/0356360. This alternative configuration is shown in
The recognition expert 124 provides classification of the features extracted by a classifier 128 and outputs a list of element candidates with probabilities or recognition scores for each node of the segmentation graph. Many types of classifiers exist that could be used to address this recognition task, e.g., Support Vector Machines, Hidden Markov Models, or Neural Networks such as Multilayer Perceptrons, Deep, Convolutional or Recurrent Neural Networks. The choice depends on the complexity, accuracy, and speed desired for the task.
The language expert 126 generates linguistic meaning for the different paths in the segmentation graph using language models (e.g., grammar or semantics). The expert 126 checks the candidates suggested by the other experts according to linguistic information 130. The linguistic information 130 can include a lexicon, regular expressions, etc. and is the storage for all static data used by the language expert 126 to execute a language model. A language model can rely on statistical information on a given language. The linguistic information 130 is computed off-line, with or without adaptation according to the results of recognition and user interactions, and provided to the language expert 126.
The language expert 126 aims at finding the best recognition path. In one example, the language expert 126 does this by exploring a language model such as a finite state automaton (FSA) representing the content of linguistic information 130. In addition to the lexicon constraint, the language expert 126 may use a language model with statistical information modeling how frequently a given sequence of elements appears in the specified language or is used by a specific user to evaluate the linguistic likelihood of the interpretation of a given path of the segmentation graph.
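To illustrate how recognition scores and statistical language information might be combined when searching for a best path, the following sketch runs a Viterbi search over a single, fixed segmentation using a bigram table. This is a deliberately simplified stand-in: it is not the FSA exploration described above, and the function name `best_path`, the bigram representation, and the `floor` probability for unseen pairs are all assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple


def best_path(candidates: List[Dict[str, float]],
              bigram: Dict[Tuple[str, str], float],
              floor: float = 1e-4) -> Tuple[str, float]:
    """Viterbi search over per-element candidates.

    candidates[i] maps a candidate character to its recognition probability;
    bigram maps (previous, current) to a language-model probability.
    Unseen bigrams receive the `floor` probability.
    """
    # best[c] = (log score, text) of the best path ending in candidate c
    best = {c: (math.log(p), c) for c, p in candidates[0].items()}
    for position in candidates[1:]:
        step = {}
        for c, p in position.items():
            step[c] = max(
                (score + math.log(bigram.get((prev, c), floor)) + math.log(p), text + c)
                for prev, (score, text) in best.items()
            )
        best = step
    score, text = max(best.values())
    return text, score


candidates = [{"H": 0.6, "N": 0.4}, {"o": 0.5, "a": 0.5}]
bigram = {("H", "o"): 0.3, ("N", "a"): 0.1}
text, score = best_path(candidates, bigram)
```

Working in log space keeps the product of many small probabilities numerically stable, which matters once the input is longer than a few characters.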
The present system and method may make use of the HWR system 114 in order to recognize handwritten input to the device 100. As discussed earlier, unlike keyboard input which is governed by strict layout rules, such as text entry on visible or invisible lines in accordance with the position of a displayed cursor, the present system and method allows handwriting to be input virtually anywhere as superimposed handwriting on the input interface 104 of the computing device 100. This superimposed or ‘free’ input may be rendered as digital ink in the input position, or in an offset position if the handwriting input is offset from the rendering position, at least temporarily so that the user understands that the input has been received.
The HWR system 114 is able to recognize this relatively freely positioned and superimposed handwriting as described above. The present system and method displays the original input (i.e., the raw ink) in its original format (i.e., in digital ink) as if it had been input on a piece of paper or larger input interface, for example. Redisplay of the digital ink, such as ‘untangling’ of superimposed characters, is made with or without typeset ink conversion. In this way, the present system and method allows users to interact with the digital ink itself and provide meaningful guidance and results of that interaction through beautifying of the digital ink.
Beautification may be achieved by properly aligning the digital ink in accordance, for example, with an alignment structure of the input interface 104 or the input interface 104 itself by taking into account certain geometrical features of the handwriting, and of the display if desired (described in detail later). These and other aspects of the present system and method are now described.
The superimposed handwritten inputs 500, 600 and 700 of this example are sequential inputs of content, such as words. Each of the superimposed inputs may be at least temporarily displayed on the input interface 104, or more generally on the display 102, rendered as (first) digital ink having the form as depicted in
As can be seen, the legibility of the inputs 500, 600 and 700 is very low, since the individual characters are not readily ascertainable.
Further,
As described earlier, the ink management system 112 causes the HWR system 114 to recognize the handwritten strokes of the inputs 500, 600 and 700 (or 800), and through the recognition processing described earlier, the HWR system 114 recognizes the characters of
In the present system and method, this recognized output is rendered on the display 102, for example, by the ink management system 112 through application of a structuring transformation on the strokes of the input to beautify the digital ink through proper alignment (and scaling, if desired). Alternatively, or additionally, the structuring transformation may be carried out on the non-recognized input by the ink management system 112 (as described later). In either case, the structuring transformation is performed by determining information regarding the relative geometry of the handwritten input.
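One possible form of such a structuring transformation is sketched below: each character is translated along the x-axis so that the superimposed characters appear side by side in input order. The sketch assumes that inter-character spacing is a fixed fraction (`gap_ratio`, a hypothetical parameter) of the mean character width; the present description states only that spacing may be defined by the determined widths, so the exact rule is an assumption. Scaling and vertical alignment are omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Character = List[List[Point]]  # a character is a list of strokes


def untangle(characters: List[Character], gap_ratio: float = 0.2) -> List[Character]:
    """Translate each character along x so that characters no longer overlap."""
    if not characters:
        return []
    widths = []
    for ch in characters:
        xs = [x for stroke in ch for x, _ in stroke]
        widths.append(max(xs) - min(xs))
    gap = gap_ratio * sum(widths) / len(widths)  # spacing derived from mean width

    laid_out, cursor = [], 0.0
    for ch, width in zip(characters, widths):
        left = min(x for stroke in ch for x, _ in stroke)
        dx = cursor - left  # horizontal shift that puts the left edge at the cursor
        laid_out.append([[(x + dx, y) for x, y in stroke] for stroke in ch])
        cursor += width + gap
    return laid_out


# Two single-stroke characters written on top of each other in x = 0..10.
first = [[(0.0, 0.0), (10.0, 10.0)]]
second = [[(0.0, 10.0), (10.0, 0.0)]]
result = untangle([first, second])
```

Because only a translation is applied, the shape of every stroke is preserved, which keeps the redisplayed ink faithful to what the user wrote.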
In the example in which structure transforming is performed on the recognized input, the ink management system 112 receives the recognition results from the output 128 of the HWR system 114, for example, via the memory 108 of the computing device 100. This recognition processed ink includes two object classes. The first object class is the ink as segmented by the segmentation expert 122, and the second object class is the most probable associated text (character or symbol) candidates for the ink segments. The ink management system 112 may process these first and second object classes together with a third object class, being the raw ink itself, in order to perform the structure transformation. For example, the ink management system 112 may use the third object class to display the transformation of the structured digital ink to the user.
The characters belonging to the character or word candidates of the first object class are used by the ink management system 112 to determine the groupings of the strokes of the second object class. For example, as described earlier, the first three handwritten strokes of the input 500 are recognized as belonging to the text character “H” 502, the fourth handwritten stroke of the input 500 is recognized as belonging to the text character “o” 504, etc. The ink management system 112 may assign the handwritten strokes to each character by causing identification data for each of the strokes to be added to the data for each stroke, e.g., by including an identification tag in the metadata of the strokes.
Alternatively, or additionally, the ink management system 112 may rely on such assignments through the HWR system 114, where the HWR system 114 is configured to provide stroke indexation to form ink objects including the recognition candidates for the stroke groups, as described in U.S. patent application Ser. No. 15/083,195 titled “System and Method for Digital Ink Interactivity” filed claiming a priority date of 7 Jan. 2016 in the name of the present Applicant and Assignee, the entire contents of which are incorporated by reference herein.
In any case, once the membership of the strokes to the different recognized characters is known, certain geometrical features of the characters can be ascertained by the ink management system 112. These geometrical features include the width of each character, the height of each character and the geometrical area of each character, where the width is defined in the x-axis and the height is defined in the y-axis as shown in
For example, the three stroke character 502 illustrated in
It is noted that grouping strokes into candidate characters may involve the transposed grouping of later strokes with earlier strokes that are not in sequential (time) input order. That is, in the example multi-stroke characters 502, 702 and 708, the multiple strokes forming these characters are input in sequential (time) order. However, for other characters having overlaid or disconnected strokes, some delayed strokes may be entered after the initial strokes of the characters are entered, particularly with respect to cursive or linked handwriting input. For example, for the text characters “t” and “i”, the bar “-” and the dot “˙”, respectively, are entered later; similarly, for accented characters, like é, the diacritic is added later. The recognition of such delayed stroke characters may be handled by the HWR system 114, and/or the ink management system 112 as described in the afore-incorporated by reference United States Patent Publication No. 2015/0356360.
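By way of illustration only, the grouping of tagged strokes into characters and the derivation of per-character width, height and area may be sketched as follows. The `Stroke` type, the `char_id` tag and the use of the bounding-box area as the "geometrical area" are assumptions made for this sketch and are not prescribed by the present description. Because grouping is by tag rather than by input order, a delayed stroke is handled in the same way as any other stroke.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Stroke:
    points: list   # [(x, y), ...] sampled along the stroke
    char_id: int   # hypothetical identification tag stored in the stroke metadata


def character_geometry(strokes):
    """Group strokes by their character tag and measure each character.

    Width is measured along the x-axis and height along the y-axis.
    The area is taken as the bounding-box area (an assumption).
    """
    points_by_char = defaultdict(list)
    for stroke in strokes:
        # Grouping ignores time order, so delayed strokes join their character.
        points_by_char[stroke.char_id].extend(stroke.points)

    geometry = {}
    for char_id, pts in points_by_char.items():
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        geometry[char_id] = {"width": width, "height": height, "area": width * height}
    return geometry
```

For instance, three strokes tagged with the same `char_id` (two vertical bars and a cross bar) yield a single entry whose width and height span all three strokes.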
The determined geometric features of the recognized characters forming the complete recognized input, e.g., the question phrase “How are you?”, are compared, or related to a common reference, by the ink management system 112 in order to determine parameters for the structural transformation. For example, the determined widths of the characters may be compared in order to determine the average width. From the average width the ink management system 112 may construct each of the handwritten strokes into the recognized input.
For example, as illustrated in
The ink management system 112 considers the determined widths as (first) geometrical information from which the average width dw is determined. In order to construct the recognized input, the ink management system 112 applies a scaling factor or percentage (e.g., less than 100%) to the average width dw to determine an inter-character distance of dc. For example, a scaling factor of about 0.1 to about 0.8 may be used as a pre-determined constant (settable, for example, via a user interface (UI) or the like of the ink management system 112) or may be dynamically set based on the average width determined.
It is understood that the scaling factor may also be set to be greater than one (i.e., greater than 100%) or to be equal to one (i.e., 100%), based on the size of the handwritten characters. Additionally, different scaling factors may be used for different parts of the handwritten text. For example, the inter-letter spacing between capitalized or upper-case letters may be desired to be larger than that between non-capitalized or lower-case letters. This may be achieved by the ink management system 112 taking the character recognition results of the raw ink into account.
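As a minimal sketch of the above relationship, the inter-character distance dc may be computed as the average character width dw multiplied by the scaling factor. The function names and the default factor of 0.5 (chosen from within the stated range of about 0.1 to about 0.8) are illustrative assumptions only.

```python
def average_width(widths):
    """Average width dw of the recognized characters."""
    if not widths:
        raise ValueError("at least one character width is required")
    return sum(widths) / len(widths)


def inter_character_distance(widths, scale=0.5):
    """Inter-character distance dc = scale * dw.

    `scale` is the scaling factor; 0.5 is an assumed default, and values
    equal to or greater than one are also permitted.
    """
    return scale * average_width(widths)


# Three characters of widths 4, 6 and 8 give dw = 6.0 and, with scale 0.5, dc = 3.0.
d_c = inter_character_distance([4, 6, 8], scale=0.5)
```

Different factors for different parts of the text, e.g., a larger factor between upper-case letters, could be obtained by calling the function with a different `scale` for those character pairs.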
In any case, the inter-character distance is used by the ink management system 112 to provide a substantially uniform separation between at least some of the recognized characters as ordered sequentially in the recognized input. For example, as shown in
As can be seen, this provides a legible spacing of the characters of the constructed input 1000 which is reasonably similar to a character spacing as would be used naturally by users handwriting the recognized input in a non-superimposed fashion. Alternatively, or additionally, the ink management system 112 may apply a pre-determined (and settable) inter-character spacing parameter. However, as such a parameter is unlinked to the sizing of the actual input handwritten characters, it is possible the spacing will not appear natural, i.e., too large or too small. This may lead to users being forced to adjust the spacing manually, if allowed, and in any case impacts on the user experience.
Utilizing the scaled average width as the inter-character spacing (first alignment parameter) in the structurally transformed input provides a character spacing based on the geometrical features of the handwritten input itself. Inter-word spacing is typically larger than inter-character spacing. Accordingly, in order to construct the recognized input, the ink management system 112 uses the average width dw as the inter-word distance. This may be achieved, for example, by setting the scaling factor to one (i.e., 100%) or by omitting the scaling factor at word boundaries (determined as described later).
In this way, the inter-word distance (second alignment parameter) is used by the ink management system 112 to provide a substantially uniform separation between the recognized words as ordered sequentially in the recognized input. For example, as shown in
As can be seen, this provides a legible spacing of the words of the constructed input 1000 which is reasonably similar to a word spacing as would be used naturally by users handwriting the recognized input in a non-superimposed fashion. Alternatively, or additionally, the ink management system 112 may apply a pre-determined (and settable) inter-word spacing parameter. However, as such a parameter is unlinked to the sizing of the actual input handwritten characters and words, it is possible the spacing will not appear natural, i.e., too large or too small. This may lead to users being forced to adjust the spacing manually, if allowed, and in any case impacts on the user experience.
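The use of the two alignment parameters to lay out the characters may be sketched as follows, purely as an example. Each character is represented by its width and a hypothetical word index (standing in for the word boundary information); the scaled average width is applied within a word and the unscaled average width at word boundaries. The representation and the function name are assumptions of this sketch.

```python
def layout_offsets(chars, scale=0.5):
    """Return the left x-position of each character in the constructed input.

    chars: list of (width, word_index) tuples in recognized order.
    Within a word, characters are separated by dc = scale * dw; at a word
    boundary they are separated by dw (scaling factor omitted).
    """
    if not chars:
        return []
    d_w = sum(width for width, _ in chars) / len(chars)  # average width dw
    d_c = scale * d_w                                    # inter-character distance dc

    offsets = []
    x = 0.0
    for i, (_, word) in enumerate(chars):
        if i > 0:
            prev_width, prev_word = chars[i - 1]
            gap = d_c if word == prev_word else d_w
            x += prev_width + gap
        offsets.append(x)
    return offsets


# Two characters in a first word, one in a second word, all of width 4:
# dw = 4.0 and dc = 2.0, giving left positions 0.0, 6.0 and 14.0.
positions = layout_offsets([(4, 0), (4, 0), (4, 1)], scale=0.5)
```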
In order to determine in which positions of the input to apply the inter-character and inter-word spacings, the positions of the words need to be known to the ink management system 112. With respect to the recognized input, such information is provided by the HWR system 114 to the ink management system 112 as part of the recognition results of the output 120, in particular as part of the first object class. The detection of word input in superimposed handwritten input may be performed by the HWR system 114 as described in the afore-incorporated by reference United States Patent Publication No. 2015/0286886, for example.
On the other hand, in examples of the present system and method in which the structure transformation is performed on the raw input rather than recognized input, as described above, the ink management system 112 is configured to determine characters and words of the input itself in order to make the above-described geometrical parameter determinations. This may be achieved by configuring the ink management system 112 with at least partial aspects of the HWR system 114. In particular, components of the ink management system 112 with functions similar to the segmentation, recognition and language experts of the HWR system 114 may be provided in order to determine the likeliest groupings of the strokes into characters and words without performing recognition of the input itself. That is, the structural transformation of the superimposed input may be performed by the present system and method to merely re-arrange the raw ink into structured ink for display to the user. This constructed input may then be recognized by a handwriting recognition process, like the HWR system 114 for example, at a later stage if desired.
Having determined the inter-character and inter-word parameters, the ink management system 112 may cause the constructed input 1000 to be displayed on the display 102 of the digital device 100. This display may be caused in response to receipt of a user request, for example, through the UI or the like of the ink management system 112 or the device 100 itself, or may be performed when input is detected.
For example,
An example alignment pattern is described in U.S. patent application Ser. No. 14/886,195 titled “System and Method of Digital Note Taking” filed claiming a priority date of 25 Aug. 2015 in the name of the present Applicant and Assignee, the entire content of which is incorporated by reference herein. However, other examples of the present system and method may not provide such a line pattern, particularly when further interaction and non-superimposed input to the input interface directly is not envisaged and the mere untangled display of the superimposed input is desired.
As can be seen from
In one example, beautification of the input is provided by configuring the ink management system 112 to use information on further geometrical features of the input based on recognized or determined features of the characters represented by the input and geometrical features of the display, if desired. In the example in which structure transforming is performed on the recognized input, the (second) geometrical information on the recognized characters is provided by the HWR system 114 together with the recognition results. That is, in the recognition process, the recognition engine 118 determines geometrical information related to typography of at least each of the recognized characters of the ink input.
Typography of text and punctuation characters includes several elements related to elements of text, including ascenders and descenders of letters. With respect to handwritten characters the HWR system 114 determines lower and upper extrema of each character. A lower extremum is the lowest point at which the stroke or strokes of a recognized character pass. An upper extremum is the highest point at which the stroke or strokes of a recognized character pass. From these extrema, writing lines at the levels of character, word (e.g., a series of characters), sentence or phrase (e.g., a series of characters and/or words), and text line (e.g., a series of characters, words and/or sentences) are determined. The writing lines include top-, lower-, base- and mid- (or median-) lines. In the alternative example in which the structure transformation is performed on the raw input, the ink management system 112 is configured to determine these extrema itself.
With respect to individual characters the typography information may be applied based on the type of character recognized by the HWR system 114, or otherwise determined by the ink management system 112. That is, for capitalized letters, letters having ascenders and certain punctuation marks, the top-line is defined as the horizontal line passing through a point related to the upper extremum across the horizontal extent of the character. For example, as shown in
For letters having descenders and certain punctuation marks, the lower-line is defined as the horizontal line passing through a point related to the lower extremum across the horizontal extent of the character. For example, as shown in
The base-line is defined as the horizontal line passing through a point related to at least the lower-line across the horizontal extent of the character, and the mid-line is defined as the horizontal line passing through a point related to the top- and base-lines across the horizontal extent of the character. The determination of the points through which these lines pass is particular to the recognition process and language model(s) used by the HWR system 114, or the ink management system 112.
For letters having neither descenders nor ascenders and in lower case form, the mid-line is defined as the top-line and the base-line is defined as the lower-line. For example, as shown in
For capitalized letters, letters having ascenders and certain punctuation marks, the base-line is defined as the lower-line and the mid-line is defined as the horizontal line passing through a point which is the median distance between the top- and base-lines. For example, as shown in
For letters having descenders and certain punctuation marks, the mid-line is defined as the top-line and the base-line is defined as the horizontal line passing through a point which is the median distance, or which is defined by some other metric (such as certain geometrical features of the stroke(s) itself), between the mid- and lower-lines. For example, as shown in
Of course, certain characters, depending on language and handwriting styles, may include both ascenders and descenders, or diacritics (such as accents), such that certain combinations or weightings of the above-described relationships are used to define the typography of the characters. Also, certain punctuation marks and other symbols may not range over more than one of the typography lines (such as full stops and hyphens), such that specific rules are applied for such characters.
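The three basic cases described above may be sketched, by way of example only, as a function mapping a character's upper and lower extrema to its four writing lines. The category labels, the convention that y increases upwards, and the use of the simple midpoint for the "median distance" are assumptions of this sketch; characters combining ascenders, descenders or diacritics, and marks such as full stops and hyphens, would require the further rules noted above and are not covered.

```python
def writing_lines(kind, upper, lower):
    """Return the top-, mid-, base- and lower-lines (as y values) of a character.

    kind:  'plain'     lower-case letter with neither ascender nor descender
           'ascender'  capitalized letter or letter having an ascender
           'descender' letter having a descender
    upper, lower: y values of the upper and lower extrema (y increases upwards).
    """
    if kind == "plain":
        # Mid-line coincides with the top-line; base-line with the lower-line.
        return {"top": upper, "mid": upper, "base": lower, "lower": lower}
    if kind == "ascender":
        # Base-line is the lower-line; mid-line lies midway between top and base.
        return {"top": upper, "mid": (upper + lower) / 2, "base": lower, "lower": lower}
    if kind == "descender":
        # Mid-line is the top-line; base-line lies midway between mid and lower.
        return {"top": upper, "mid": upper, "base": (upper + lower) / 2, "lower": lower}
    raise ValueError(f"unsupported character kind: {kind!r}")


# A letter with an ascender spanning y = 0 to y = 10 has its mid-line at 5.0.
lines_h = writing_lines("ascender", upper=10, lower=0)
```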
The structuring transformation employed by the ink management system 112 may utilize this (second) geometrical information of the input in a number of ways to further beautify the rendering of the digital ink representing the input. In one example, the mid-line of each character is used to align the characters with one another (described below with respect to
In order to display the reconstructed input 1200 the ink management system 112 may cause the reconstructed input 1200 to be displayed as rendered digital ink as shown in
This offset position may be pre-determined or settable (either by the ink management system 112 or by users, e.g., through the UI or the like) based on display or input parameters such as the line pattern 410 height (e.g., the vertical distance between the lines termed the line pattern unit (LPU)) with or without scaling of the digital ink, the size of the handwriting (raw ink) input (e.g., the LPU may be adjusted based on the raw ink height), typography conventions, or a combination thereof.
For example, (third) geometrical information of the line pattern 410 may be determined by the ink management system 112. The line pattern geometrical information includes at least top-, base- and mid-lines of the line pattern 410 as shown in
Accordingly, in the example of
As can be seen from
In order to display the reconstructed input 1500 the ink management system 112 may cause the reconstructed input 1500 to be displayed as rendered digital ink as shown in
As can be seen from
In the afore-described examples, the alignment of the constructed and reconstructed inputs with respect to internal elements of the input and/or external elements to the input is based on one or more of the first, second and third geometrical information applied at the character level of the input. However, the level (character, word, sentence or text line) at which each of these types of geometrical information is applied may differ from one type to another or be the same for all, and may be pre-determined or settable via the UI or the like of the ink management system 112, for example.
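One possible character-level alignment of the kind summarized above may be sketched as a vertical translation of each character so that its mid-line coincides with a common target mid-line, for example the mid-line of a line pattern. The dictionary layout and function name are assumptions of this sketch, and any scaling of the digital ink is omitted.

```python
def align_to_mid_line(characters, target_mid):
    """Translate each character vertically onto a common mid-line.

    characters: list of dicts with keys 'mid' (y value of the character's
                mid-line) and 'points' ([(x, y), ...] of its strokes).
    target_mid: y value of the mid-line to align to (e.g., from a line pattern).
    Only the y coordinates change; the horizontal layout is left untouched.
    """
    aligned = []
    for ch in characters:
        dy = target_mid - ch["mid"]
        aligned.append({
            "mid": target_mid,
            "points": [(x, y + dy) for x, y in ch["points"]],
        })
    return aligned
```

The same translation could be applied at the word or text line level by computing a single `dy` from the mid-line of the whole word or line rather than of each character.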
The present system and method provides users with the ability to view their superimposed handwritten input in digital ink in a manner which allows handwriting recognition feedback and further input using superimposed or non-superimposed handwriting. This is achieved whilst retaining faithful reproduction of the original digital ink. Further, full interaction with the digital ink to edit or add to their handwriting at the level of characters, words, etc., is provided through processing the re-arranged digital ink as ink objects which uses at least aspects of handwriting recognition to provide the re-arrangement transformation or operation.
While the foregoing has described what is considered to be the best mode and/or other examples, it is understood that various modifications may be made therein and that the subject matter disclosed herein may be implemented in various forms and examples, and that they may be applied in numerous other applications, combinations, and environments, only some of which have been described herein. Those of ordinary skill in the art will recognize that the disclosed aspects may be altered or amended without departing from the true spirit and scope of the subject matter. Therefore, the subject matter is not limited to the specific details, exhibits, and illustrated examples in this description. It is intended to protect any and all modifications and variations that fall within the true scope of the advantageous concepts disclosed herein.
Number | Date | Country | Kind |
---|---|---|---|
16290146.6 | Jul 2016 | EP | regional |