This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2012-121306, filed May 28, 2012, the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to a document search apparatus, a document search method, and a program.
Document search apparatuses are known which search a database for a handwritten text that is similar to or matches a handwritten query input or specified by a user.
In general, according to one embodiment, a document search apparatus includes an acquirer, a determiner, a searcher, and a display. The acquirer acquires handwriting data including coordinate data. The determiner determines a shape of the handwriting based on the coordinate data to determine a type of a query. The searcher searches for a document according to a search method corresponding to the type of the query. The display displays the document by a display method corresponding to the type of the query.
An embodiment will be described below with reference to the drawings.
The present embodiment relates to a search system that is used when both a search target and a query are handwritten data. That is, the present system is mainly intended to be a language-independent search system that uses a “handwritten document” as a search target and uses, for the search, a query made up of handwritten characters or the like (hereinafter referred to as a “handwritten query”). The handwritten query is not limited to characters but includes a user's drawing such as a mark or a line.
However, the search target may be a text document. In this case, a handwritten query is converted into a text query for a search. Furthermore, the search target may be a handwritten document, and the query for use for a search may be a text query. In this case, the text query is converted into a handwritten query for a search. In any of these examples of systems, according to the embodiment described below, the type of the handwritten query is determined, a search is carried out by a search method corresponding to the type of the handwritten query, and a search result is displayed by the corresponding appropriate display method.
According to the present embodiment, the handwritten query is classified into, for example, a “string”, a “one-stroke mark”, an “underline”, and an “enclosure”. The search target in a handwritten text which is similar to or matches a handwritten query varies depending on the type of the handwritten query. For example, if the type of the handwritten query is a string, the string itself is the search target. For a one-stroke mark, the search target is not limited to the one-stroke mark itself but includes the strings preceding and succeeding the one-stroke mark. If the type of the handwritten query is an underline or an enclosure, an underlined string or a string enclosed by an enclosure is to be searched for.
Possible types of queries are not limited to those described above. Those skilled in the art can modify the embodiment by expanding the range of handwritten queries or conversely reducing the types of handwritten queries based on the present disclosure.
As shown in
The acquirer 1 acquires handwriting data including coordinate data.
The handwriting data acquired by the acquirer 1 includes time-series coordinate data for each stroke and is expressed, for example, as follows.
Stroke 1: (x(1, 1), y(1, 1)), (x(1, 2), y(1, 2)), . . . , (x(1, N(1)), y(1, N(1)))
Stroke 2: (x(2, 1), y(2, 1)), (x(2, 2), y(2, 2)), . . . , (x(2, N(2)), y(2, N(2)))
Here, “N(i)” denotes the number of points sampled for stroke i.
Such handwriting data as described above is also provided for handwritten documents that are stored in the handwritten document DB 4.
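The time-series stroke representation described above can be sketched as a small data structure. This is only an illustrative sketch: the names Point, Stroke, Handwriting, and num_points are hypothetical and do not appear in the embodiment, and only the x and y coordinates are modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One sampled coordinate (x, y) on the document plane.
Point = Tuple[float, float]


@dataclass
class Stroke:
    # Time-ordered samples of one stroke.
    points: List[Point]

    @property
    def num_points(self) -> int:
        # Corresponds to N(i) for stroke i.
        return len(self.points)


@dataclass
class Handwriting:
    # Strokes in the order in which they were written.
    strokes: List[Stroke]


# Example: two strokes with N(1) = 3 and N(2) = 2 sampled points.
hw = Handwriting(strokes=[
    Stroke(points=[(10.0, 20.0), (12.0, 21.0), (15.0, 23.0)]),
    Stroke(points=[(30.0, 20.0), (31.0, 25.0)]),
])
```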
The query determiner 2 determines the shape of handwriting using handwriting coordinate data acquired by the acquirer 1 to determine the type of the query.
A method for inputting a query will be described with reference to
(i) The direct handwriting is a method in which the user inputs a handwriting corresponding to a query in a handwriting manner using an input device (stylus pen or the like). In
(ii) The direct handwriting selection is a method of directly selecting a handwriting to be used as a query from a displayed handwritten text instead of inputting the query itself in a handwriting manner. For example, the user operates an input device so as to draw a diagonal line 34 to directly select a handwriting 35 enclosed by a rectangle and specified by the diagonal line 34 (in this example, “idea”). Alternatively, the user directly selects a handwriting 36 using the input device or by finger tapping.
(iii) The indirect handwriting selection is a method of indirectly selecting a handwriting to be used as a query from a displayed handwritten text. For example, the user operates the input device so as to draw an underline 37 to indirectly select a handwriting 38 (in this case, “idea”) located above and adjacent to the underline. Alternatively, the user operates the input device so as to draw an enclosure 39 to indirectly select a handwriting 40 (in this case, “idea”) inside the enclosure 39.
In the (iii) indirect handwriting selection, one example has been illustrated wherein the user operates the input device so as to draw the underline 37 to input a query. However, the query determiner 2 may determine the direction of handwriting of the underline 37 and vary the processing of the underline 37 depending on the result of the determination. The direction of the handwriting can be determined based on the magnitude of the above-described coordinate values in the time series of handwriting data. For example, if the underline 37 is drawn from the left to right of the sheet of
A specific process for determining the type of a query will be described with reference to a flowchart in
In step S1, the shape of an input handwriting is determined. Using the coordinate data included in the stroke data of a query input acquired by the acquirer 1, the query determiner 2 determines the shape of the handwriting to determine the type of the query to be one of a string, a one-stroke mark, an underline, and an enclosure.
In step S11 in
A process of determining whether or not the handwriting is contained inside the closed loop will be described with reference to
(1) A straight line f[i](x, y)=0 which joins two points P[i] and P[i+1] together is calculated based on f[i](x, y)=(Y[i+1]−Y[i])*(x−X[i])−(X[i+1]−X[i])*(y−Y[i])=0. However, for i=N, a straight line f[N](x, y)=0 which joins two points P[N] and P[0] is used.
(2) The apparatus determines on which side of the straight line Q(X, Y) is located with respect to the traveling direction of the straight line. To do so, f[i](X, Y) is calculated. If the resultant value is positive, Q(X, Y) is located to the right of the traveling direction. If the resultant value is negative, Q(X, Y) is located to the left of the traveling direction.
(3) The processing in (1) and (2) is repeated for all the values of i. If f[i](X, Y) has the same sign for all the straight lines, Q is determined to be inside the closed loop.
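Steps (1) to (3) can be sketched as follows. This is a minimal sketch, not the apparatus's actual implementation: the function names edge_value and is_inside_loop are hypothetical, the loop is given as a list of vertices with the last vertex joined back to the first, and a point lying exactly on an edge line is treated as not inside. Note that the same-sign test is exact only when the polygonal loop is convex.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def edge_value(p: Point, p_next: Point, q: Point) -> float:
    """Evaluate f(x, y) for the line through p and p_next at the point q."""
    return ((p_next[1] - p[1]) * (q[0] - p[0])
            - (p_next[0] - p[0]) * (q[1] - p[1]))


def is_inside_loop(loop: List[Point], q: Point) -> bool:
    """Return True if q lies on the same side of every edge of the loop."""
    n = len(loop)
    if n < 3:
        return False
    signs = set()
    for i in range(n):
        # The last vertex is joined back to the first one.
        value = edge_value(loop[i], loop[(i + 1) % n], q)
        if value == 0:
            return False  # on an edge line: not strictly inside
        signs.add(value > 0)
    return len(signs) == 1


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
inside = is_inside_loop(square, (2.0, 2.0))
outside = is_inside_loop(square, (5.0, 2.0))
```

In this example, inside is True and outside is False.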
If in step S15, the query is determined to be a closed loop, the type of the query is determined to be an enclosure.
If in step S12 or step S15, the query fails to be determined to be a closed loop, the apparatus determines in step S13 whether the one-stroke handwriting is a horizontal line. For example, a well-known linear regression problem is solved to fit a straight line to the polygonal line. When the regression error determined by this process is equal to or smaller than a threshold, the handwriting is determined to be a straight line. If the handwriting is determined to be a straight line and the absolute value of the inclination of the straight line is equal to or smaller than a given value, the straight line is determined to be horizontal. If in step S13, the query is determined to be a horizontal line, the apparatus determines in step S16 whether or not the handwriting is located near and above the horizontal line.
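The horizontal-line determination can be sketched with an ordinary least-squares fit. This is an illustration under assumptions: the function names, the use of the root-mean-square residual as the regression error, and the threshold values max_error and max_slope are all hypothetical and are not specified in the embodiment.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fit_line(points: List[Point]) -> Optional[Tuple[float, float, float]]:
    """Least-squares fit of y = a*x + b; returns (a, b, rms_error)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None  # all x equal: vertical or degenerate stroke
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    rms = math.sqrt(sum((y - (a * x + b)) ** 2 for x, y in points) / n)
    return a, b, rms


def is_horizontal_line(points: List[Point],
                       max_error: float = 2.0,
                       max_slope: float = 0.1) -> bool:
    """Straight if the fit error is small; horizontal if the slope is small."""
    if len(points) < 2:
        return False
    fit = fit_line(points)
    if fit is None:
        return False
    a, _, rms = fit
    return rms <= max_error and abs(a) <= max_slope


flat = is_horizontal_line([(0, 10), (10, 10.5), (20, 9.8), (30, 10.2)])
diagonal = is_horizontal_line([(0, 0), (10, 10), (20, 20)])
zigzag = is_horizontal_line([(0, 0), (10, 20), (20, 0)])
```

Here the nearly flat stroke passes both tests, the diagonal stroke fails the slope test, and the zigzag stroke fails the regression-error test.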
A process of determining whether or not the handwriting is located near and above the horizontal line will be described with reference to
X[1]<X
X<X[2]
Y>(Y[1]+Y[2])/2
Y<(Y[1]+Y[2])/2+C
In these expressions, “C” denotes a preset threshold.
If in step S16, the handwriting is determined to be located near and above the horizontal line, the type of the query is finally determined to be an “underline”.
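The four conditions above can be sketched as a single predicate. This sketch assumes that (X[1], Y[1]) and (X[2], Y[2]) are the end points of the horizontal line and that (X, Y) is a representative point of the candidate handwriting; these interpretations, and the function name is_near_and_above, are assumptions made for illustration. The inequalities are applied exactly as written in the expressions.

```python
from typing import Tuple

Point = Tuple[float, float]


def is_near_and_above(line_start: Point, line_end: Point,
                      q: Point, c: float) -> bool:
    """Apply the four inequalities to point q; c is the preset threshold C."""
    (x1, y1), (x2, y2) = line_start, line_end
    x, y = q
    mid_y = (y1 + y2) / 2
    within_span = x1 < x < x2          # X[1] < X and X < X[2]
    within_band = mid_y < y < mid_y + c  # Y bounds relative to the line
    return within_span and within_band


near = is_near_and_above((0, 10), (100, 10), (50, 15), c=20)
too_far = is_near_and_above((0, 10), (100, 10), (50, 40), c=20)
beside = is_near_and_above((0, 10), (100, 10), (150, 15), c=20)
```

In this example only the first point satisfies all four conditions.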
If in step S13 or step S16, the straight line is not determined to be horizontal, the apparatus determines in step S14 whether the one-stroke handwriting is a mark. In this case, the similarity between the one-stroke handwriting and each of predetermined marks (for example, “◯”, “Δ”, “⋆”, and “□”) is calculated. If the similarity between the one-stroke handwriting and any of the marks is equal to or greater than a given value, the handwriting is determined to be a “one-stroke mark”. A specific process for calculating the similarity to a predetermined mark may be carried out using a method described in, for example, Japanese Patent No. 3537949.
If the query is not determined to be a mark in step S14 in
In step S2 in
As described above, the determined type of the query and the handwriting data are passed from the query determiner 2 to the searcher 3 and the search result display 5.
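The order of the determinations described for steps S12 to S16 can be summarized as a decision cascade. This is a sketch under assumptions: the individual shape tests are passed in as precomputed booleans, the names QueryType and classify_query are hypothetical, and treating a multi-stroke input and the final fallback as a “string” is an assumption rather than something stated explicitly in the text above.

```python
from enum import Enum


class QueryType(Enum):
    STRING = "string"
    ONE_STROKE_MARK = "one-stroke mark"
    UNDERLINE = "underline"
    ENCLOSURE = "enclosure"


def classify_query(is_one_stroke: bool,
                   is_closed_loop: bool,
                   contains_handwriting: bool,
                   is_horizontal: bool,
                   has_handwriting_above: bool,
                   is_mark: bool) -> QueryType:
    """Apply the shape tests in the order closed loop, horizontal line, mark."""
    if not is_one_stroke:
        return QueryType.STRING  # assumption: multi-stroke input is a string
    if is_closed_loop and contains_handwriting:  # steps S12 and S15
        return QueryType.ENCLOSURE
    if is_horizontal and has_handwriting_above:  # steps S13 and S16
        return QueryType.UNDERLINE
    if is_mark:  # step S14
        return QueryType.ONE_STROKE_MARK
    return QueryType.STRING  # assumption: fallback type


# A closed loop with nothing inside it falls through to the mark test.
result = classify_query(True, True, False, False, False, True)
```

In this example result is QueryType.ONE_STROKE_MARK, because the closed loop contains no handwriting and the stroke matches a predetermined mark.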
Next, the timing at which a search process is initiated will be described. The search process according to the present embodiment can carry out a search for each of the (i) direct handwriting, the (ii) direct handwriting selection, and the (iii) indirect handwriting selection. For the (i) direct handwriting, after a handwriting is input, a search process is carried out when a search button is selected using a pen. This is similar to a series of operations for a text search in which the search button is clicked via a mouse after a text is input. If the search input area is presented, the input handwriting may be set to be a handwriting to be searched for during a pen-up event or if no input has been provided for a predetermined time.
For the (ii) direct handwriting selection, the following is carried out when a query handwriting is determined to be selected.
(1) During a pen-up event, a search process is carried out (mostly in a search mode), or
(2) during a pen-up event, a context menu or a dialog is displayed to ask the user whether to carry out a search.
For the (iii) indirect handwriting selection, an operation similar to that in (ii) is performed when a query handwriting is determined to be selected.
The searcher 3 searches the handwritten document DB 4 for a text in accordance with a search method corresponding to the type of the query determined by the query determiner 2, to obtain a search result. Specifically, the search is carried out in accordance with the search method corresponding to the type of the query as follows.
(a) String: the handwritten document DB 4 is searched for a handwriting that is similar to or matches the string query.
(b) One-stroke mark: the handwritten document DB 4 is searched for a handwriting that is similar to or matches the one-stroke query. However, if the one-stroke mark is selected from the head of a line, then for example, the search target may be limited to the head of a line. Similarly, the priority of a mark present at the head of a line may be increased.
(c) Underline: the handwritten document DB 4 is searched for a horizontal line near and above which the handwriting is located.
(d) Enclosure: the handwritten document DB 4 is searched for a closed curve inside which a handwriting is contained.
Here, a specific example of a process of searching the handwritten document DB 4 for a handwriting that is similar to or matches a string query will be described. The searcher 3 searches for a stroke sequence (a series of strokes) which is similar to a stroke sequence of a query, in a plurality of stroke sequences. The searcher 3 may perform the search by means of matching of a characteristic vector. An example of a more specific structure of stroke data (handwriting data) will be described with reference to
The “stroke” is a stroke of a character input in a handwriting manner and refers to a trajectory obtained after a pen or the like comes into contact with an input surface and before the pen leaves the input surface. Normally, points on the trajectory are sampled at predetermined timings (for example, at regular intervals). Thus, the stroke is expressed by a sequence of sampled points.
In an example in
The structure of a point may depend on the input device. In an example in
The coordinates belong to a coordinate system for the document plane and may be represented by positive values that increase from the upper left corner, which serves as the origin, toward the lower right corner.
Furthermore, if the input device cannot acquire the writing pressure or if the input device can acquire the writing pressure, which is not used during the subsequent processing, the writing pressure in
In examples in
For example, DP matching (DP: Dynamic Programming) may be used as a specific example of characteristic vector matching when a stroke sequence similar to the stroke sequence of a query handwriting is searched for. The number of strokes in the stroke sequence specified by the user is not necessarily the same as the number of strokes in the stroke sequence desired by the user. This is because, for example, some users write two strokes of a character in one stroke, and different users may write a string with the same meaning in different numbers of strokes. Normally, DP matching for strokes is a technique that deals with the correspondence between one stroke and one stroke and optimally associates the two strokes with each other while allowing expansion and contraction of each stroke. Here, DP matching that takes into account the correspondence between one stroke and N strokes is used to enable matching that is robust against variation in the number of strokes (see, for example, “Masuda, Uchida, Sakoe, Experimental optimization of DP Matching in On-line Character Recognition, the Joint Conference of Electrical and Electronics Engineers in Kyushu, H. 17. http://human.ait.kyushu-u.ac.jp/-uchida/Papers/masuda-shibu2005.pdf”).
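The ordinary one-stroke-to-one-stroke case can be illustrated with a classic dynamic time warping recurrence, which aligns two point sequences while allowing local expansion and contraction. This is only a sketch of that basic idea: it is not the one-to-N variant of the cited reference, and the function name dp_distance and the use of the Euclidean point distance are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dp_distance(a: List[Point], b: List[Point]) -> float:
    """Accumulated distance of the best monotonic alignment of a and b."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both strokes need at least one point")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # A point may be matched to one or several points of the
            # other stroke (expansion and contraction).
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


coarse = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
fine = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0)]
shifted = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]

same_shape = dp_distance(coarse, fine)
other_place = dp_distance(coarse, shifted)
```

The same line sampled at a different density yields a much smaller distance than a line drawn elsewhere, which is the robustness that the expansion and contraction is meant to provide.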
For example, a matching target stroke sequence is associated with a stroke sequence that is the user's specified query by setting each of all the strokes included in the matching target stroke sequence to be a start point, and the similarity between the stroke sequences is calculated. The similarity based on each start point is calculated, and the strokes are sorted in order of decreasing similarity. Since each of all the strokes is set to be a start point, overlapping results are obtained. Thereafter, peak detection is carried out to integrate the ranges of overlapping strokes.
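The scan over start points and the integration of overlapping results can be sketched as follows. This is a simplified illustration under assumptions: each candidate window has the same number of strokes as the query, the similarity function is supplied by the caller, and a greedy suppression of overlapping windows stands in for the peak detection. The names scan_matches and min_similarity are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def scan_matches(doc: Sequence, query: Sequence,
                 similarity: Callable[[Sequence, Sequence], float],
                 min_similarity: float) -> List[Tuple[int, int, float]]:
    """Return non-overlapping (start, end, similarity) ranges, best first."""
    k = len(query)
    scored = []
    # Every stroke of the target sequence is tried as a start point.
    for start in range(len(doc) - k + 1):
        s = similarity(doc[start:start + k], query)
        if s >= min_similarity:
            scored.append((s, start))
    # Sort in order of decreasing similarity.
    scored.sort(key=lambda item: (-item[0], item[1]))
    kept: List[Tuple[float, int]] = []
    for s, start in scored:
        # Keep a window only if it overlaps no better window already kept.
        if all(start + k <= other or other + k <= start
               for _, other in kept):
            kept.append((s, start))
    return [(start, start + k, s) for s, start in kept]


# Toy example: strokes are replaced by numbers to keep the sketch short.
doc = [1, 2, 3, 9, 1, 2, 3]
query = [1, 2, 3]
toy_similarity = lambda w, q: -sum(abs(x - y) for x, y in zip(w, q))
matches = scan_matches(doc, query, toy_similarity, min_similarity=-20)
```

In this toy example the two exact occurrences at stroke ranges 0 to 3 and 4 to 7 are kept, and the weaker windows that overlap them are discarded.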
The correspondence between multiple strokes and multiple strokes may be taken with use of the DP matching. For facilitating the understanding of this technique, a case of the application thereof to character recognition is illustrated in
On the other hand,
Various matching methods other than those described above are possible.
The search result display 5 displays the search result obtained by the searcher 3 by a display method corresponding to the type of the query. The following two patterns are available for displaying the search result depending on the type of the query.
(1) The retrieved handwriting and a handwriting located proximate to the retrieved handwriting are displayed.
(2) One page of handwriting including the retrieved handwriting is displayed.
However, for both patterns, a highlighting method may be varied depending on the type of the query as follows.
(a) String: A handwriting located proximate to the retrieved handwriting is also displayed. However, the retrieved handwriting itself is highlighted by changing the color thereof.
(b) One-stroke mark: A handwriting located proximate to the retrieved handwriting is also displayed. However, the retrieved handwriting itself and a handwriting located proximate to and above the retrieved handwriting are highlighted by changing the colors thereof.
(c) Enclosure: A handwriting located proximate to the retrieved handwriting is also displayed. However, the retrieved handwriting itself and a handwriting contained inside the retrieved handwriting are highlighted by changing the colors thereof.
The above-described embodiment can determine the type of the handwritten query, carry out a search by a search method corresponding to the type of the handwritten query, and display the search result by the corresponding appropriate display method. Thus, the appropriate search can be carried out by the search method corresponding to the type of the query.
Variations of the present embodiment will be described.
The searcher 3 according to the present embodiment may search a group of handwritten documents accumulated in the handwritten document DB 4 inside the document search apparatus. Alternatively, if the document search apparatus can be connected to a network such as an intranet and/or the Internet, the searcher 3 may search a group of handwritten documents accessible to the document search apparatus via the network. Alternatively, the searcher 3 may search a group of handwritten documents accumulated in a removable memory connected to the document search apparatus. Alternatively, the above-described methods may be optionally combined together.
The document search apparatus according to the present embodiment can be configured as a standalone apparatus or as a plurality of nodes that can communicate with one another via the network.
Furthermore, the document search apparatus according to the present embodiment can be implemented by various devices such as a general-purpose desktop or laptop computer, a portable general-purpose computer, any other portable information equipment, information equipment with a touch panel, a smart phone, and any other information processing apparatus.
Furthermore, for example, a part of the configuration in
For example,
In the illustrated case, the client 301 is connected to the network 300 via radio communication, and the client 302 is connected to the network 300 via wired communication.
The clients 301 and 302 are normally user apparatuses. The server 303 may be, for example, provided on a LAN such as a corporate LAN or operated by an Internet service provider or the like. Alternatively, the server 303 may be a user apparatus, and a certain user may provide functions to other users.
Various methods are possible for distributing the components in
Furthermore, the instructions shown in the process procedure in the above-described embodiment can be carried out based on a program that is software. When a general-purpose computer system pre-stores this program, reading in the program enables effects similar to those of the document search apparatus according to the above-described embodiment to be exerted. The instructions described above in the embodiment are recorded in a magnetic disk (a flexible disk, a hard disk, or the like), an optical disk (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD±R, DVD±RW, or the like), a semiconductor memory, or a similar recording medium. The recording medium may have any storage form provided that a computer or a built-in system can read the instructions from the recording medium. The computer can perform operations similar to the operations of the document search apparatus described above in the embodiment by reading in the program from the recording medium and allowing a CPU to carry out the instructions described in the program based on the program. Of course, upon acquiring or reading in the program, the computer may acquire or read in the program through the network.
Furthermore, an OS (Operating System) operating on the computer or MW (middleware) such as database management software or network software may carry out a part of the processing for implementing the present embodiment, based on the instructions obtained from the recording medium and installed in the computer or built-in system.
Moreover, examples of the recording medium according to the present embodiment are not limited to those which are independent of the computer or built-in system but include those in which a program transmitted and downloaded via a LAN, the Internet, or the like is stored or temporarily stored.
In addition, the number of recording media is not limited to one, but the present embodiment allows the processing according to the present embodiment to be carried out via a plurality of media. The recording medium may be configured in any manner.
The computer or built-in system according to the present embodiment carries out the processes according to the present embodiment based on the program stored in the recording medium. The computer or built-in system according to the present embodiment may have any configuration such as an apparatus comprising a single personal computer, a single microcomputer, or the like, or a system comprising a plurality of apparatuses connected together via the network.
Furthermore, the computer according to the present embodiment is not limited to a personal computer but may be an arithmetic processing apparatus, a microcomputer, or the like included in information processing equipment. The computer according to the present embodiment collectively refers to equipment and apparatuses which can execute the functions according to the present embodiment in accordance with the program.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2012-121306 | May 2012 | JP | national |
Number | Date | Country | |
---|---|---|---|
20130318120 A1 | Nov 2013 | US |