This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2015-060058, filed on Mar. 23, 2015; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an image processor, an image processing method and a non-transitory recording medium.
In an image processor that processes handwritten characters, it is desirable to arrange the handwritten characters for easy viewing.
According to one embodiment, an image processor includes an acquisitor and a processor. The acquisitor acquires an input image including a first character string. The processor implements a first operation of generating a first generated image from a first extracted image based on an arranged state of the first character string. The first extracted image is extracted from the input image. The first extracted image relates to the first character string. The first extracted image extends in a first direction. The first generated image extends in a second direction different from the first direction.
According to another embodiment, an image processing method includes acquiring an input image including a first character string. The method includes generating a first generated image from a first extracted image based on an arranged state of the first character string. The first extracted image is extracted from the input image. The first extracted image relates to the first character string. The first extracted image extends in a first direction. The first generated image extends in a second direction different from the first direction.
According to another embodiment, a non-transitory recording medium has an image processing program recorded in the recording medium. The program causes a computer to execute acquiring an input image including a first character string. The program causes the computer to execute generating a first generated image from a first extracted image based on an arranged state of the first character string. The first extracted image is extracted from the input image. The first extracted image relates to the first character string. The first extracted image extends in a first direction. The first generated image extends in a second direction different from the first direction.
Various embodiments of the invention will be described hereinafter with reference to the accompanying drawings.
The drawings are schematic or conceptual; and the relationships between the thicknesses and widths of portions, the proportions of sizes between portions, etc., are not necessarily the same as the actual values thereof. The dimensions and/or the proportions may be illustrated differently between the drawings, even in the case where the same portion is illustrated.
In the drawings and the specification of the application, components similar to those described in regard to a drawing thereinabove are marked with like reference numerals, and a detailed description is omitted as appropriate.
The image processor 110 of the embodiment includes an acquisitor 10 and a processor 20. The acquisitor 10 includes, for example, input/output terminals. The acquisitor 10 includes an input/output interface that communicates with the outside via a wired or wireless method. The processor 20 includes, for example, a calculating device including a CPU (Central Processing Unit), memory, etc. A portion of each block or each entire block of the processor 20 may include an integrated circuit such as LSI (Large Scale Integration), etc., or an IC (Integrated Circuit) chipset. Each block may include an individual circuit; or a circuit in which some or all of the blocks are integrated may be used. The blocks may be provided as one body; or some blocks may be provided separately. Also, for each block, a portion of the block may be provided separately. The integration is not limited to LSI; and a dedicated circuit or a general-purpose processor may be used.
A setter 21, a calculator 22, an extractor 23, and a corrector 24 are provided in the processor 20. For example, these components are realized as an image processing program. In other words, the image processor 110 also may be realized by using a general-purpose computer device as the basic hardware. The functions of each component included in the image processor 110 may be realized by causing a processor mounted in the computer device recited above to execute the image processing program. In such a case, the image processor 110 may be realized by preinstalling the image processing program recited above in the computer device; or the image processor 110 may be realized by storing the image processing program recited above in a storage medium such as CD-ROM, etc., or distributing the image processing program via a network and appropriately installing the image processing program in the computer device. The processor 20 also may be realized by appropriately utilizing a storage medium such as memory, a hard disk, CD-R, CD-RW, DVD-RAM, DVD-R, etc., connected externally or built into the computer device recited above.
For example, the image processor 110 according to the embodiment is applied to application software for arranging an image in which a handwritten character string written on a whiteboard, a blackboard, a notebook, etc., is imaged so that the image is easy to view. For example, such application software is used for a handwritten character string in which the tilt, the character spacing, and the character size fluctuate easily between the characters. An image in which a handwritten character string is imaged by a camera is modified for easy viewing by modifying the size, arrangement, etc., of the handwritten characters included in the handwritten character string.
The acquisitor 10 acquires the input image 30. The input image 30 is, for example, an image formed by imaging multiple handwritten characters written on a whiteboard, a blackboard, etc., by a lecturer performing a lecture, a chairperson chairing a meeting, etc. The acquisitor 10 may acquire the input image 30 from an imaging device such as a digital still camera, etc. The acquisitor 10 may acquire the input image 30 from a storage medium such as a HDD (Hard Disk Drive), etc.
The input image 30 includes a first extracted image 31 relating to a first character string 31a. The first character string 31a includes multiple handwritten characters c. In the embodiment, handwritten characters such as kanji, hiragana, and katakana of Japanese, etc., are used as the multiple characters c. Handwritten numerals, various symbols, figures, etc., also are used as the multiple characters c.
The processor 20 extracts the first extracted image 31 from the input image 30. The first extracted image 31 extends in a first direction D1 relating to the first character string 31a. The processor 20 implements a first operation of generating, from the first extracted image 31 based on the state of the first character string 31a, a first generated image 32 extending in a second direction D2 that is different from the first direction D1. For example, the first character string 31a is tilted with respect to the row direction (the horizontal direction of the input image 30). The tilt with respect to the row direction of the second direction D2 is smaller than the tilt with respect to the row direction of the first direction D1. The first extracted image 31 is shown in
The input image 30 further includes a second extracted image 33 relating to a second character string 33a. The second character string 33a includes multiple handwritten characters c.
The processor 20 extracts the second extracted image 33 from the input image 30. The second extracted image 33 extends in a third direction D3 relating to the second character string 33a. As shown in
The processor 20 generates, from the second extracted image 33 based on the state of the second character string 33a, a second generated image 34 extending in a fourth direction D4 that is different from the third direction D3. For example, the second character string 33a is tilted with respect to the row direction. The tilt of the fourth direction D4 with respect to the row direction is smaller than the tilt of the third direction D3 with respect to the row direction. The fourth direction D4 may be the same as the second direction D2. The second extracted image 33 is shown in
The tilt of the first character string 31a with respect to the row direction is larger than the tilt of the second character string 33a with respect to the row direction. Therefore, the modification amount of the tilt of the first character string 31a is larger than the modification amount of the tilt of the second character string 33a. Thereby, the tilts of the first extracted image 31 and the second extracted image 33 are modified. The first generated image 32 and the second generated image 34 after the tilt modification are arranged to be substantially horizontal in the input image 30. Similarly, the tilt can be modified for other images relating to character strings as well.
Here, there is a reference example in which character recognition is performed for multiple handwritten characters written in a predetermined designated frame as one character string; and the tilt of the character string, etc., are modified after the character recognition. Such a reference example may be unable to accommodate a whiteboard, a blackboard, etc., on which multiple handwritten character strings having different causes of fluctuation are mixed.
In the embodiment, the tilt of the handwritten character string can be modified for each handwritten character string. Therefore, even for the image in which the handwritten character strings are multiply mixed, the handwritten characters can be arranged for easier viewing as handwritten characters.
For example, in a meeting, a participant often handwrites a meeting memo on a whiteboard, etc. It is considered that the handwritten meeting memo is associated with the content of the meeting and remains in the memory of the participant. Therefore, by keeping the image of the meeting memo as a handwritten meeting memo having an arrangement that is easier to view, a participant who views the image can intuitively recall the content of the meeting. It is difficult to obtain such an effect when the handwritten meeting memo is converted into digital character data by the character recognition of such a reference example. According to the embodiment, the handwritten characters can be modified to an arrangement that is easier to view as handwritten characters. Therefore, the ease of viewing can be improved while keeping the merits of being handwritten.
A specific image processing method using the image processor 110 will now be described.
The setter 21 implements setting processing. In the setting processing as shown in
Most simply, one connected component included in the character c may be used as one character candidate region r. As shown in
The calculator 22 implements the calculation processing. In the calculation processing, one of the multiple character candidate regions r is set as a reference region; and evaluation values relating to the ease of forming links between the reference region and each of the multiple character candidate regions r other than the reference region are calculated.
As shown in
Similarly, an evaluation value v13 between the reference region r1 and the character candidate region r3 is calculated. An evaluation value v14 between the reference region r1 and the character candidate region r4 is calculated. An evaluation value v15 between the reference region r1 and the character candidate region r5 is calculated. Thus, it is sufficient to calculate an evaluation value for each combinable pair of character candidate regions in the input image 30.
The extractor 23 implements extraction processing. In the extraction processing as shown in
Similarly, the extractor 23 extracts the second extracted image 33 including the multiple character candidate regions r from the input image 30 based on the evaluation values recited above. The second extracted image 33 is extracted as the set of the multiple character candidate regions r connected by a line 36. A fourth character candidate region r14 and a fifth character candidate region r15 are included in the set connected by the line 36. The fourth character candidate region r14 includes a fourth character c4 positioned at one end of the second extracted image 33. The fifth character candidate region r15 includes a fifth character c5 positioned at the other end of the second extracted image 33. Thus, the extractor 23 extracts the set of the character candidate regions r for each of the multiple images included in the input image 30.
The corrector 24 implements the modification processing. As shown in
Here, it is possible to adjust how much the modification is performed by, for example, using the settings of a user, etc. For example, the setting may be in a range of 0 to 100%, where 0% is the state in which the first tilt of the line segment L1 is not modified, and 100% is the state in which the first tilt of the line segment L1 is modified to be zero (horizontal). In such a case, it is favorable for the image processor 110 to include a display unit. The display unit displays a setting screen to receive the settings set by the user.
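The 0-to-100% setting described above amounts to a linear blend between the unmodified tilt and the fully leveled (horizontal) state. A minimal sketch in Python follows; the function name and signature are illustrative assumptions, not taken from the source.

```python
def apply_tilt_setting(current_tilt_deg, strength_percent):
    """Blend between no modification (0%) and full leveling (100%).

    current_tilt_deg: the first tilt of the line segment L1, in degrees.
    strength_percent: user setting in the range 0 to 100.
    Returns the tilt after the modification.
    """
    t = max(0.0, min(100.0, strength_percent)) / 100.0
    # 0% keeps the tilt as-is; 100% modifies the tilt to zero (horizontal).
    return current_tilt_deg * (1.0 - t)
```

For example, a 50% setting halves the tilt, which matches the intent that the user can adjust how much of the modification is performed.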
Thus, the first generated image 32 of
In the embodiment, the tilt of the handwritten character string can be modified for each handwritten character string. Therefore, even for the image in which the handwritten character strings are multiply mixed, the handwritten characters can be arranged for easier viewing as handwritten characters.
In the example of
By modifying the second tilt of the line segment L2, the positional relationship between the first character candidate region r11 and the second character candidate region r12 is changed. In other words, the positional relationship between the first character c1 and the second character c2 is changed. Specifically, from the perspective of easy viewing, it is favorable for the second tilt of the line segment L2 to be small. By reducing the second tilt of the line segment L2, two mutually-adjacent handwritten characters can be arranged in an easily-viewable state in which the tilt is suppressed.
It is possible to adjust how much the modification is performed by the settings of the user, etc. For example, the setting may be in a range of 0 to 100%, where 0% is the state in which the second tilt of the line segment L2 is not modified, and 100% is the state in which the second tilt of the line segment L2 is modified to a direction parallel to the first direction D1.
Thereby, the two characters of the first extracted image 31 are arranged for easy viewing as handwritten characters. A similar tilt modification is possible for two characters of the second extracted image 33 as well. A similar tilt modification is possible for two characters of images other than the first extracted image 31 and the second extracted image 33 as well.
The corrector 24 may modify the first tilt of the line segment L1 of the input image 30 when the first tilt of the line segment L1 of the first rectangular region rr1 is larger than a first reference tilt. In other words, it is determined whether or not the first tilt of the line segment L1 is larger than the first reference tilt; and the first tilt of the line segment L1 is modified based on the determination result. For example, when the first tilt of the line segment L1 is larger than the first reference tilt, the first tilt of the line segment L1 is modified to be the first reference tilt. On the other hand, when the first tilt of the line segment L1 is smaller than the first reference tilt, the first tilt of the line segment L1 is not modified.
The average value of the first tilt of the first extracted image 31 and a third tilt of the second extracted image 33 may be used as the first reference tilt. As shown in
As shown in
The average value of the tilts of the images relating to all of the character strings included in the input image 30 may be used as the first reference tilt. Zero (horizontal) which is the state of no tilt may be used as the first reference tilt.
In the description recited above, the modification of the first tilt of the line segment L1 may be large when the difference between the first reference tilt and the first tilt of the line segment L1 is large; and the modification of the first tilt of the line segment L1 may be small when the difference between the first reference tilt and the first tilt of the line segment L1 is small. The first tilt of the line segment L1 may be modified so that the difference between the first reference tilt and the first tilt of the line segment L1 is zero. Thus, the modification may be performed so that the overall tilt of the first extracted image 31 approaches the first reference tilt.
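The determination-and-modification logic described above, in the variant where the tilt is modified to become the first reference tilt when it exceeds the reference and is left unmodified otherwise, might be sketched as follows. The function names and the use of the average tilt as the reference are illustrative assumptions.

```python
def modify_tilt(tilt, reference_tilt):
    """Modify a character string's overall tilt based on a reference tilt.

    When the tilt is larger than the reference tilt, it is modified to be
    the reference tilt (the difference is reduced to zero); when it is
    smaller, it is not modified.
    """
    if abs(tilt) > abs(reference_tilt):
        return reference_tilt
    return tilt  # smaller than the reference: leave unmodified


def reference_from_strings(tilts):
    """One possible first reference tilt: the average value of the tilts
    of the images relating to the character strings in the input image.
    (Zero, i.e. horizontal, is another possible reference.)"""
    return sum(tilts) / len(tilts)
```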
Here, in
As shown in
In the binary processing of step S1, the background pixels and the writing pixels are separated from each other. The writing pixels are the pixels corresponding to the handwritten character portions. Specifically, for example, a method such as discriminant analysis or the like is used. In discriminant analysis, the input image 30 is converted to grayscale; a histogram of the pixel values is calculated for each local region; and the boundary that separates the background pixels and the writing pixels is determined adaptively. Sufficient separation performance can be ensured by this method in the case where the contrast between the background pixels and the writing pixels is sufficient in the image in which the whiteboard, the blackboard, or the like is imaged.
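The discriminant analysis recited above can be sketched as a minimal Otsu-style threshold in Python. Note that the embodiment calculates a histogram for each local region, so in practice a function like this would be applied per region rather than globally; the name and data layout are illustrative assumptions.

```python
def otsu_threshold(gray_pixels):
    """Discriminant-analysis threshold for 8-bit grayscale values.

    Builds a histogram of the pixel values and picks the boundary that
    maximizes the between-class variance, separating background pixels
    from writing pixels (the handwritten character portions).
    """
    hist = [0] * 256
    for p in gray_pixels:
        hist[p] += 1
    total = len(gray_pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_b = 0.0   # running sum of values in the background class
    w_b = 0       # running count of background pixels
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                 # background mean
        m_f = (sum_all - sum_b) / w_f     # writing (foreground) mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels above the returned boundary would be treated as writing pixels when the writing is brighter than the background, and vice versa for dark writing on a whiteboard image.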
In the line thinning of step S2, for the writing pixels separated by the binary processing, the pixels of the core line (the core line pixels) at the center of the stroke are separated from the pixels at the periphery of the core line; and only the core line pixels are extracted. Specifically, a 3×3 image filter is applied; and where adjacent writing pixels exist, only the core line pixels are selected by the filter.
In the connected component extraction processing of step S3, attributes such as isolated point, intersection, normal point, etc., are assigned based on the adjacent relationships of the core line pixels. Based on the attributes, the set of the writing pixels adjacent to the core line pixels is extracted as one connected component.
In the character candidate region determination processing of step S4, the character candidate region r is determined based on the result of the connected components. Most simply, the method of using one connected component as the character candidate region r may be used. The result of the character candidate regions r being set by the setter 21 is shown in
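The flow of steps S3 and S4 — extracting connected components and, most simply, using one component as one character candidate region r — can be sketched as follows. This illustrative Python sketch uses 4-connectivity on a binary image given as a list of rows (1 = writing pixel); all names are assumptions, not from the source.

```python
def connected_components(binary):
    """Extract 4-connected components of writing pixels.

    Each component is returned as a set of (row, col) coordinates;
    most simply, one component becomes one character candidate region r.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                stack, comp = [(y, x)], set()
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    comp.add((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                comps.append(comp)
    return comps


def bounding_box(comp):
    """A character candidate region as (min_x, min_y, max_x, max_y)."""
    ys = [p[0] for p in comp]
    xs = [p[1] for p in comp]
    return (min(xs), min(ys), max(xs), max(ys))
```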
The extraction results of the first extracted image 31 and the second extracted image 33 are shown in
As shown in
In the graph construction processing of step S11, the character candidate regions r that are set in the setter 21 are used as basic units (nodes). The graph is constructed by connecting spatially-proximal character candidate regions with arcs.
In the linking cost calculation processing of step S12, the linking cost for determining whether or not the character candidate regions are easy to link to each other is defined for the arcs. For the linking cost, for example, the similarity of the size, the distance between the character candidate regions, etc., may be used as the reference.
In the graph evaluation of step S13, for the constructed graph, it is determined which combination of character candidate regions r has a small linking cost when subdividing a portion of the graph.
In the character candidate region group determination processing of step S14, it is determined that the combination of character candidate regions r determined by the graph evaluation recited above is a group of the same character string (the set of the character candidate regions).
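Steps S11 to S14 can be sketched as follows: the character candidate regions become nodes, a linking cost based on the distance between regions and the similarity of their sizes is defined for the arcs, and regions connected by low-cost arcs are determined to be a group of the same character string. The specific cost formula and the threshold are illustrative assumptions; the embodiment names the references (size similarity, distance) without fixing a formula.

```python
import math


def linking_cost(box_a, box_b):
    """Linking cost for an arc between two character candidate regions
    (each box is (min_x, min_y, max_x, max_y)). Smaller cost means the
    regions are easier to link. Uses center distance scaled by size
    dissimilarity, as one possible choice of reference."""
    def center(b):
        return ((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0)

    def size(b):
        return max(b[2] - b[0], b[3] - b[1]) + 1

    (ax, ay), (bx, by) = center(box_a), center(box_b)
    dist = math.hypot(ax - bx, ay - by)
    ratio = max(size(box_a), size(box_b)) / min(size(box_a), size(box_b))
    return dist * ratio


def group_regions(boxes, max_cost):
    """Determine groups of the same character string: connect nodes whose
    linking cost is below max_cost and take the connected subsets
    (union-find over the implicit graph)."""
    n = len(boxes)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if linking_cost(boxes[i], boxes[j]) < max_cost:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

For example, two nearby same-size boxes form one group while a distant box stays separate, mirroring how spatially-proximal character candidate regions are connected by arcs and grouped.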
In
Thus, according to the embodiment, the overall tilt or the partial tilt of the handwritten character string can be modified for each handwritten character string. Therefore, the handwritten characters can be arranged for easier viewing as handwritten characters even for the image in which the handwritten character strings are multiply mixed.
In the embodiment, the spacing between two mutually-adjacent characters is modified for each handwritten character string. In other words, processing of modifying the spacing between two characters c is implemented (a third operation) when the spacing between the two characters c is larger than a reference spacing.
The acquisitor 10 acquires the input image 60 as shown in
As shown in
The calculator 22 sets one of the multiple character candidate regions r as a reference region and calculates the evaluation values relating to the ease of forming links between the reference region and each of the multiple character candidate regions r other than the reference region.
As shown in
The corrector 24 modifies the spacing between two mutually-adjacent character candidate regions (characters) for the first character string 61 and the second character string 62. For example, the spacing between the two character candidate regions is modified to become small when the spacing between the two character candidate regions is larger than the reference spacing. The spacing between the two character candidate regions may be modified to become large when the spacing between the two character candidate regions is smaller than the reference spacing.
Specifically, for example, the average value of the spacing between the mutually-adjacent character candidate regions is determined for the first to fourth character strings 61 to 64; and the average value is used as the reference spacing. The maximum value of the spacing between the mutually-adjacent character candidate regions may be determined for the first to fourth character strings 61 to 64; and the maximum value may be used as the reference spacing. That is, the character spacing of the first character string 61 is modified to approach the reference spacing when the character spacing is larger than the reference spacing. The character spacing of the first character string 61 may be modified to approach the reference spacing when the character spacing is smaller than the reference spacing. For example, all of the character spacing of the first character string 61 may be modified to become the reference spacing. Similarly, the character spacing of the second character string 62 may be modified. The degree of the modification may be appropriately adjusted by the settings of the user. Thereby, multiple character spacing 65 of
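The third operation described above might be sketched as follows; the function names and the default full-strength setting are illustrative assumptions. The variant that also enlarges spacings smaller than the reference is noted in a comment.

```python
def modify_spacings(spacings, reference, strength=1.0):
    """Modify each character spacing to approach the reference spacing.

    A spacing larger than the reference is made smaller; strength
    (0..1) corresponds to the user's degree-of-modification setting.
    (A spacing smaller than the reference may likewise be enlarged,
    per the optional variant in the embodiment.)
    """
    out = []
    for s in spacings:
        if s > reference:
            out.append(s + (reference - s) * strength)
        else:
            out.append(s)
    return out


def reference_spacing(all_spacings):
    """One possible reference spacing: the average spacing between
    mutually-adjacent character candidate regions over the character
    strings (the maximum value is another possible choice)."""
    return sum(all_spacings) / len(all_spacings)
```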
Thus, according to the embodiment, the character spacing of the handwritten character string can be modified for each handwritten character string. Therefore, the handwritten characters can be arranged for easier viewing as handwritten characters even for the image in which the handwritten character strings are multiply mixed.
In the embodiment, the character size of the characters is modified for each handwritten character string. In other words, processing of modifying the size of the character c (a fourth operation) is implemented when the size of the character c is larger than a reference size.
The acquisitor 10 acquires an input image 80 as shown in
The setter 21 sets the multiple character candidate regions r for the input image 80. Each of the multiple character candidate regions r includes at least a portion of one character c included in the input image 80.
The calculator 22 sets one of the multiple character candidate regions r as a reference region and calculates the evaluation values relating to the ease of forming links between the reference region and each of the multiple character candidate regions r other than the reference region.
The extractor 23 extracts, from the input image 80 based on the evaluation values calculated by the calculator 22, the first character string 81 including a set 91 of the character candidate regions r. Similarly, the extractor 23 extracts the second character string 82 including a set 92. The extractor 23 extracts the third character string 83 including a set 93. The extractor 23 extracts the fourth character string 84 including a set 94. The extractor 23 extracts the fifth character string 85 including a set 95. The extractor 23 extracts the sixth character string 86 including a set 96.
The corrector 24 modifies the size of the character candidate region r (the character c) included in each of the first to sixth character strings 81 to 86. For example, the size of the character candidate region r is modified to become small when the size of the character candidate region r is larger than the reference size. The size of the character candidate region r may be modified to become large when the size of the character candidate region r is smaller than the reference size.
Specifically, for example, the average value of the sizes of the character candidate regions r of the first to sixth character strings 81 to 86 is determined; and the average value is used as the reference size. The maximum value of the sizes of the character candidate regions r of the first to sixth character strings 81 to 86 may be determined; and the maximum value may be used as the reference size. That is, the character size of each of the first to sixth character strings 81 to 86 is modified to approach the reference size when the character size is larger than the reference size. The character size of each of the first to sixth character strings 81 to 86 may be modified to approach the reference size when the character size is smaller than the reference size. For example, all of the character sizes included in the first to sixth character strings 81 to 86 may be modified to be the reference size. The degree of the modification may be appropriately adjusted by using the settings of the user. Thereby, the character sizes of the first to sixth character strings 81 to 86 of
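The fourth operation described above might be sketched as a scale factor applied to each character candidate region; the function names are illustrative assumptions.

```python
def modify_size(size, reference, strength=1.0):
    """Scale factor that brings a character candidate region's size
    toward the reference size; strength (0..1) is the user-set degree
    of modification. The region's width and height would be multiplied
    by the returned factor."""
    target = size + (reference - size) * strength
    return target / size


def reference_size(sizes, use_max=False):
    """Reference size: the average (or, optionally, the maximum) of the
    sizes of the character candidate regions of the character strings."""
    return max(sizes) if use_max else sum(sizes) / len(sizes)
```

For example, a region twice the reference size gets a factor of 0.5 at full strength, so oversized handwritten characters are shrunk toward the common size.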
Thus, according to the embodiment, the character sizes of the characters included in the handwritten character string can be modified for each handwritten character string. Therefore, the handwritten characters can be arranged for easier viewing as handwritten characters even for the image in which the handwritten character strings are multiply mixed.
An image processor 111 of the embodiment includes the acquisitor 10 and the processor 20. The processor 20 includes the setter 21, the calculator 22, the extractor 23, and the corrector 24 and further includes a determiner 25.
The determiner 25 implements determination processing. The determination processing determines whether or not the first character string further includes a noncharacter (also called a noncharacter symbol).
The extractor 23 excludes, from the multiple character candidate regions r, the noncharacter regions determined to be noncharacters.
The acquisitor 10 acquires an input image 100. As shown in
The embodiment focuses on a feature of the noncharacters 102a to 102d including the underlines, the enclosing lines, etc., being different from that of a character, where the feature is the aspect ratio, size, or the like of the configuration of the character candidate region r, the density of the writing pixels for the character candidate region r, etc. For example, there is a method for constructing an identifier using quantities such as the configuration of the character candidate region r, the density of the writing pixels, or the like as the feature. A linear SVM (Support Vector Machine) or the like may be considered as a specific example of the identifier.
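A minimal sketch of such an identifier follows. At inference time a trained linear SVM reduces to a weighted sum plus a bias, so a plain linear decision function stands in for it here; the feature choices (aspect ratio, area, writing-pixel density) follow the description above, while the weights and threshold sign convention are illustrative assumptions.

```python
def region_features(box, num_writing_pixels):
    """Feature vector for a character candidate region
    (box = (min_x, min_y, max_x, max_y)): aspect ratio of the
    configuration, size (area), and density of the writing pixels."""
    w = box[2] - box[0] + 1
    h = box[3] - box[1] + 1
    aspect = max(w, h) / min(w, h)
    area = w * h
    density = num_writing_pixels / area
    return [aspect, area, density]


def linear_identifier(features, weights, bias):
    """Linear decision function (a stand-in for a trained linear SVM):
    a positive score identifies the region as a noncharacter such as
    an underline or an enclosing line."""
    score = sum(f * w for f, w in zip(features, weights)) + bias
    return score > 0
```

An underline region is long and thin (high aspect ratio), so weights that emphasize the aspect-ratio feature separate it from roughly square character regions.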
As shown in
As shown in
Thus, according to the embodiment, it can be determined whether or not each of the multiple character strings includes a noncharacter; and the noncharacters can be excluded from each of the multiple character strings. Therefore, the detection precision of the characters can be increased.
An image processor 112 of the embodiment includes the acquisitor 10 and the processor 20. The processor 20 includes the setter 21, the calculator 22, the extractor 23, the corrector 24, and the determiner 25 and further includes a structuring unit 26.
The structuring unit 26 implements structuring processing. The structuring processing utilizes the determination result of the determiner 25. In the structuring processing, the first extracted image is extracted by recognizing the noncharacters inside the first character string and removing the noncharacters from the first character string.
The acquisitor 10 acquires the input image 120. As shown in
Similarly to the fourth embodiment, the embodiment focuses on a feature of the noncharacter symbols 122a to 122d including the underlines, the partitioning lines, etc., being different from that of a character, where the feature is the aspect ratio, size, or the like of the configuration of the character candidate region r, the density of the writing pixels for the character candidate region r, etc. For example, a method for constructing the identifier may be used in which quantities such as the configuration of the character candidate region r, the density of the writing pixels, or the like are used as the feature. A linear SVM or the like may be considered as a specific example of the identifier.
As shown in
As shown in
As shown in
As shown in
Thus, according to the embodiment, by utilizing the noncharacter symbols such as the partitioning lines, etc., the multiple handwritten character strings are integrated into one character string group; and the spatial arrangement such as the beginning of the lines, the row spacing, etc., of the multiple handwritten character strings can be modified by the unit of character string group. Thereby, it is possible to arrange the handwritten characters for easy viewing with the merits of being handwritten remaining as-is. By utilizing the partitioning lines, etc., mistaken integration of the multiple handwritten character strings can be suppressed.
The first to fifth embodiments are described as being used for kanji, hiragana, and katakana of Japanese. The embodiment may be used for the English alphabet.
As shown in
As shown in
Thus, the embodiment is applicable not only to kanji, hiragana, and katakana of Japanese but also to the English alphabet and all sorts of languages other than Japanese and English.
As shown in
Similarly, as shown in
As shown in
As shown in
An image processor 200 of the embodiment is realizable by various devices such as a desktop or laptop general-purpose computer, a portable general-purpose computer, other portable information devices, an information device that includes an imaging device, a smartphone, other information processors, etc.
As shown in
It is possible to execute the instructions of the processing methods of the embodiment described above based on a program which is software. It is also possible to obtain effects similar to the effects of the image processor of the embodiment described above by having a general-purpose computer system pre-store and read the program. The instructions described in the embodiment described above are recorded, as a program that can be executed by a computer, in a magnetic disk (a flexible disk, a hard disk, etc.), an optical disk (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD±R, DVD±RW, etc.), semiconductor memory, or similar recording media. The storage format of the recording medium may have any form as long as the recording medium is readable by a computer or embedded system. The computer can realize an operation similar to that of the image processor of the embodiment described above by reading the program from the recording medium and causing the CPU to execute the instructions recited in the program. Of course, the computer may perform the acquiring or reading via a network when acquiring or reading the program.
Database management software or the OS (operating system) operating on the computer, MW (middleware) operating on a network, etc., may execute a portion of the processing for realizing the embodiment based on the instructions of the program installed in the computer or the embedded system from the recording medium.
The recording medium of the embodiment is not limited to a recording medium that is independent of the computer or the embedded system; and the recording medium of the embodiment also includes a recording medium that stores or temporarily stores a downloaded program transmitted by a LAN, the Internet, etc. The recording medium is not limited to one type; and the recording medium of the embodiment also includes the case where the processing of the embodiment is executed from multiple recording media. The configuration of the recording medium may be any configuration.
The computer or the embedded system of the embodiment executes the processing of the embodiment based on the program stored in the recording medium and may have any configuration such as a device made of one of a personal computer, a microcomputer, or the like, a system in which multiple devices are connected by a network, etc.
The computer of the embodiment is not limited to a personal computer, also includes a processor included in an information processing device, a microcomputer, etc., and generally refers to devices and apparatuses that can realize the functions of the embodiment by using a program.
According to the embodiments, an image processor, an image processing method, and an image processing program can be provided in which handwritten characters can be arranged for easy viewing.
Hereinabove, embodiments of the invention are described with reference to specific examples. However, the invention is not limited to these specific examples. For example, one skilled in the art may similarly practice the invention by appropriately selecting specific configurations of components such as the acquisitor, the processor, etc., from known art; and such practice is within the scope of the invention to the extent that similar effects can be obtained.
Further, any two or more components of the specific examples may be combined within the extent of technical feasibility and are included in the scope of the invention to the extent that the purport of the invention is included.
Moreover, all image processors, image processing methods and non-transitory recording media practicable by an appropriate design modification by one skilled in the art based on the image processors, image processing methods and non-transitory recording media described above as embodiments of the invention also are within the scope of the invention to the extent that the spirit of the invention is included.
Various other variations and modifications can be conceived by those skilled in the art within the spirit of the invention, and it is understood that such variations and modifications are also encompassed within the scope of the invention.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the invention.
Number | Date | Country | Kind
---|---|---|---
2015-060058 | Mar 2015 | JP | national