A technique of the present disclosure relates to a technique of setting a property for a document file.
A method of scanning documents such as paper business forms and converting them to files to store and manage the documents in a storage server on a network has been conventionally widely used. Storing and managing documents in a storage server on a network as document files requires setting of properties such as a file name. For example, as a method of setting the file name, there is a method in which a character recognition process is performed on a scanned image of a document to extract character information and a character string to be used as the file name is selected from the obtained character information.
Moreover, character region information of the character string used by a user to set the file name is stored and accumulated for each type of document. Then, in the case where features of a document form match those of a document stored in the past, the character string used in the past document is automatically proposed as the character string to be used in the file name. In this case, if the character string used for the setting of the file name in the past is incorrect, an incorrect character string is proposed.
In this regard, Japanese Patent Laid-Open No. 2020-46819 discloses a technique that uses a degree of certainty as an index indicating certainty of the character string subjected to the character recognition process and that prompts the user to check or correct the character string subjected to the character recognition process even if the degree of certainty thereof is high.
The aforementioned conventional technique attempts to guarantee the certainty of the character string by making the user check or correct the character string. However, the user may make a mistake in the check or the correction. In such a case, an incorrect character string is set as the file name or the like due to a mistake in the correction or the like made by the user.
A technique of the present disclosure has been developed in view of the aforementioned problems and an object thereof is to allow a user to more easily set character strings with fewer mistakes as character strings used in properties of document files.
The image processing apparatus according to the present disclosure is an image processing apparatus for setting a property of a document file by using a result of a character recognition process performed on a scanned image of a document and includes: at least one memory that stores a program; and at least one processor that executes the program to perform: obtaining a character string by performing the character recognition process on a scanned image relating to a document file to be currently generated; and automatically setting the obtained character string as a character string to be used in a property of the document file to be currently generated in a case where the obtained character string is a character string that is obtained in the character recognition process performed on a scanned image relating to a document file generated in the past and that is approved by a user a certain number of times or more.
Further features of the present disclosure will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Hereinafter, with reference to the attached drawings, the present disclosure is explained in detail in accordance with preferred embodiments. Configurations shown in the following embodiments are merely exemplary and the present disclosure is not limited to the configurations shown schematically.
The MFP 110 is an example of an information processing apparatus with a scan function. The MFP 110 is a multifunction peripheral with multiple functions such as a print function and a BOX storage function in addition to the scan function. The server apparatuses 120 and 130 are examples of an information processing apparatus that provides a cloud service. The server apparatus 120 of the embodiment provides a cloud service of performing image analysis on a scanned image received from the MFP 110 and transferring a request from the MFP 110 to the server apparatus 130 that provides another service. In the following description, the cloud service provided by the server apparatus 120 is referred to as “MFP cooperative service”. The server apparatus 130 provides a cloud service of storing a file received via the Internet and providing a stored file in response to a request from a web browser of a client PC (not illustrated) or the like (hereinafter, referred to as “storage service”). In the embodiment, the server apparatus 120 that provides the MFP cooperative service is referred to as “MFP cooperative server” and the server apparatus 130 that provides the storage service is referred to as “storage server”.
The configuration of the image processing system illustrated in
Function modules of the MFP 110 are roughly divided into two modules of a native function module 410 and an additional function module 420. The native function module 410 is an application generally included in the MFP 110 while the additional function module 420 is an application additionally installed into the MFP 110. The additional function module 420 is an application based on Java (registered trademark) and can easily implement addition of functions to the MFP 110. Note that other additional applications that are not illustrated may be installed in the MFP 110.
The native function module 410 includes a scan execution unit 411 and a scanned image management unit 412. Moreover, the additional function module 420 includes a display control unit 421, a scan control unit 422, a cooperative service request unit 423, and an image processing unit (not illustrated).
The display control unit 421 displays a user interface (UI) screen for receiving various user operations on the touch panel of the operation unit 220. The various user operations include, for example, input of login authentication information for access to the MFP cooperative server 120, scan setting, scan start instruction, file name setting, file storage instruction, and the like.
The scan control unit 422 instructs the scan execution unit 411 to execute a scan process while sending information on scan setting thereto, in response to a user operation (for example, pressing of a “scan start” button 701 to be described later) performed on the UI screen. The scan execution unit 411 causes the scanner unit 222 to execute a document read operation via the scanner I/F 217 according to the scan process execution instruction from the scan control unit 422 and generate scanned image data. The scanned image management unit 412 stores the generated scanned image data in the HDD 214. In this case, the scan control unit 422 is notified of information on a scanned image identifier that uniquely indicates the stored scanned image data. The scanned image identifier is a number, a symbol, a letter, or the like for uniquely identifying an image scanned in the MFP 110. For example, the scan control unit 422 obtains the scanned image data to be converted to a file from the scanned image management unit 412 by using the aforementioned scanned image identifier. Then, the scan control unit 422 instructs the cooperative service request unit 423 to send a request for processes necessary for file conversion to the MFP cooperative server 120.
The cooperative service request unit 423 sends requests for various processes to the MFP cooperative server 120 and receives responses to these requests. The various processes include, for example, login authentication, analysis of the scanned images, transmission of the scanned image data, and the like. A communication protocol such as representational state transfer (REST), SOAP, or the like is used for communication with the MFP cooperative server 120.
The image processing unit (not illustrated) performs a predetermined image process on the scanned image data and generates an image used in the UI screen displayed by the display control unit 421. Details of the predetermined image process are described later.
Note that an apparatus other than the MFP 110 (not-illustrated client PC or the like) may include the aforementioned additional function module 420. Specifically, the system configuration may be such that the request for analysis of the scanned image obtained in the MFP 110, the setting of a file name based on an analysis result, and the like are performed in the client PC.
First, a software configuration of the MFP cooperative server 120 is described. The MFP cooperative server 120 includes a request control unit 431, an image processing unit 432, a cloud storage access unit 433, a file generation unit 434, and a display control unit 435. The request control unit 431 stands by in a state where it can receive a request from an external apparatus, and instructs the image processing unit 432, the cloud storage access unit 433, and the file generation unit 434 to execute predetermined processes depending on the contents of the received request.
The image processing unit 432 performs image modification processes such as rotation and tilt correction in addition to analysis processes such as a character region detection process, a character recognition process (OCR process), a similar document determination process, and an automatic transmission determination process on the scanned image data sent from the MFP 110.
A character region detected from the scanned image is referred to as “character region” in the following description. Moreover, description is given of an example in which a business form such as a quotation or a sales invoice is a target document, and a scanned image of this document is referred to as “business form image”. The cloud storage access unit 433 sends a request for a process to the storage server 130. In a cloud service, various interfaces for storing a file in a storage server and obtaining a stored file by using a protocol such as REST or SOAP are made open to the public. The cloud storage access unit 433 sends a request to the storage server 130 by using the interfaces made open to the public. The file generation unit 434 receives an instruction from the request control unit 431 and generates a document file including a scanned image. In this case, the file generation unit 434 sets a file name based on a result of a character recognition process performed on a character string included in the scanned image. The display control unit 435 receives a request from a web browser operating on a PC or a mobile terminal (both are not illustrated) connected via the Internet and sends back screen configuration information (HyperText Markup Language (HTML), Cascading Style Sheets (CSS), or the like) necessary for screen display. The user can check registered user information and change the scan setting through a screen displayed on the web browser.
Next, a software configuration of the storage server 130 is described. The storage server 130 includes a request control unit (not illustrated), a file management unit (not illustrated), and a display control unit (not illustrated). The request control unit (not illustrated) stands by in a state where it can receive a request from an external apparatus, and instructs the file management unit (not illustrated) to store a received file or read a stored file in response to a request from the MFP cooperative server 120 in the embodiment. Then, the request control unit (not illustrated) sends back a response depending on the request, to the MFP cooperative server 120. The display control unit (not illustrated) receives a request from a web browser operating on a PC or a mobile terminal (both are not illustrated) connected via the Internet and sends back screen configuration information (HTML, CSS, or the like) necessary for screen display. The user can check and obtain a stored file through a screen displayed on the web browser.
Communications among the apparatuses are described below in a chronological order based on the sequence diagram of
In S501, the MFP 110 displays a UI screen for inputting login authentication information used to access the MFP cooperative server 120 (hereinafter, referred to as “login screen”), on the operation unit 220.
In S502, in the case where the user inputs a user ID and a password registered in advance respectively into the ID input field 612 and the password input field 613 on the login screen 610 and presses the “login” button 611, a login authentication request is transmitted to the MFP cooperative server 120.
In S503, the MFP cooperative server 120 having received the login authentication request performs an authentication process by using the user ID and the password included in the request. In the case where the user is confirmed to be a qualified user as a result of the authentication process, the MFP cooperative server 120 sends back an access token to the MFP 110. Hereinafter, sending this access token together with various requests from the MFP 110 to the MFP cooperative server 120 enables determination of the logging-in user. In the embodiment, login to the storage server 130 is completed simultaneously with the completion of login to the MFP cooperative server 120. To this end, the user ID for using the MFP cooperative service is linked in advance with a user ID for using the storage service via a web browser or the like in a PC (not illustrated) on the Internet. This allows login authentication to the storage server 130 to be completed simultaneously with success of the login authentication to the MFP cooperative server 120, so that an operation of login to the storage server 130 can be omitted. Moreover, the MFP cooperative server 120 can handle requests relating to storage services from the user logging into itself. The method of login authentication may be a general, publicly known method (authorization using Basic authentication, Digest authentication, OAuth, or the like).
In S504, in the case where the login is completed, a UI screen for scan setting (hereinafter, referred to as “scan setting screen”) is displayed on the operation unit 220 in the MFP 110.
In S505, in the case where the logging-in user having completed the scan setting sets a document to be scanned on the glass platen of the MFP 110 and presses the “scan start” button 701, the scan is executed. The scanned image data obtained by computerizing a paper document is thus generated.
In S506, after completion of the scanning, the MFP 110 transmits the scanned image data that is data including an image obtained in this scanning to the MFP cooperative server 120, together with an analysis request that is a request for analyzing the scanned image data.
In S507, in the MFP cooperative server 120 having received the analysis request of the scanned image, the request control unit 431 instructs the image processing unit 432 to execute an analysis process (S508 to S511). In this case, the request control unit 431 sends back a request ID (“processId”) by which the received analysis request can be uniquely determined, to the MFP 110.
In S508, the image processing unit 432 having received the instruction to execute the analysis process executes the analysis process on the scanned image relating to the analysis request. In this analysis process, first, the image processing unit 432 performs a process of detecting character regions present in the scanned image (hereinafter, referred to as “target image”) being a target of the analysis process (character region detection process). For example, a known method such as a method of extracting rectangular regions assumed to be characters from an image binarized by using a certain threshold may be applied to the character region detection.
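As an illustration of the kind of known method mentioned for the character region detection, the following minimal sketch binarizes a small grayscale image with a fixed threshold and returns the bounding rectangle of each connected group of dark pixels. The function name, the threshold value, and the use of 4-connectivity are assumptions made for illustration; the embodiment does not specify them.

```python
from collections import deque

def detect_character_regions(gray, threshold=128):
    """Return bounding boxes (x, y, width, height) of dark connected components.

    gray is a list of rows of 0-255 intensities. The threshold and the use of
    4-connectivity are illustrative assumptions, not taken from the embodiment.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    # Binarize: True marks a pixel assumed to belong to a character.
    dark = [[pixel < threshold for pixel in row] for row in gray]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not dark[y][x] or seen[y][x]:
                continue
            # Breadth-first search over one connected component.
            queue = deque([(x, y)])
            seen[y][x] = True
            min_x = max_x = x
            min_y = max_y = y
            while queue:
                cx, cy = queue.popleft()
                min_x, max_x = min(min_x, cx), max(max_x, cx)
                min_y, max_y = min(min_y, cy), max(max_y, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and dark[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes

# Tiny example image: one 2x2 dark block and one 1x2 dark block.
image = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255, 255],
    [255,   0,   0, 255,   0, 255],
    [255, 255, 255, 255,   0, 255],
]
regions = detect_character_regions(image)
```

A practical detector would additionally merge neighboring character rectangles into word-level or line-level regions; that step is omitted here.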
In S509, the image processing unit 432 performs a process of determining whether image data of a paper document being the target of image analysis is similar to an image of a previously computerized paper document by checking the image data against training data. Examples of the paper document being the target of image analysis include business forms such as a quotation. In the following description, the process of determining, by checking against the training data, whether the image data of the business form being the target of image analysis is similar to the image of a previously computerized business form is referred to as “similar business form determination” as appropriate. The similar business form determination uses arrangement information, which indicates the position in the target image of each character region, that is, each region that includes a character string to be a target of a later-described character recognition process. Specifically, the image processing unit 432 compares the arrangement information of the target business form image relating to the document file to be currently generated with the arrangement information of the past business form images accumulated in the training data and determines whether the arrangements of the character regions are the same or similar. This is based on the fact that, if the arrangements of the character regions are in a same or similar relationship, it is possible to assume that these documents are business forms of the same type created by using the same document form (business form). The arrangement information of the past business form images used in the similar business form determination is accumulated as the training data in a “training process” to be described later.
Although whether the business forms are similar or not is determined based only on the matching degree of the arrangement of the character regions in the embodiment, for example, whether the business forms are similar or not may be determined by determining the type of business form (quotation, sales invoice, or the like) from the result of an OCR process to be described later and also taking information on the obtained type into consideration.
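The comparison of arrangement information can be sketched as follows. This is a minimal illustration under stated assumptions: the matching degree is taken to be the fraction of target character regions that overlap some past character region with an intersection-over-union of at least 0.5, and a form is treated as similar when that fraction is at least 0.8. The measure, both thresholds, and all function names are hypothetical; the embodiment only states that the matching degree of the arrangement of the character regions is used.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0

def matching_degree(target_regions, past_regions, min_iou=0.5):
    """Fraction of target regions that overlap some past region well enough."""
    if not target_regions:
        return 0.0
    matched = sum(
        1 for t in target_regions
        if any(iou(t, p) >= min_iou for p in past_regions)
    )
    return matched / len(target_regions)

def find_similar_form(target_regions, training_data, min_degree=0.8):
    """Return the id of the best-matching past form, or None if none is similar."""
    best_id, best_degree = None, 0.0
    for form_id, past_regions in training_data.items():
        degree = matching_degree(target_regions, past_regions)
        if degree > best_degree:
            best_id, best_degree = form_id, degree
    return best_id if best_degree >= min_degree else None

# Hypothetical training data: arrangement information of two past forms.
training_data = {
    "quotation": [(10, 10, 100, 20), (10, 50, 60, 20), (200, 10, 80, 20)],
    "invoice": [(50, 300, 100, 20), (50, 340, 100, 20)],
}
target = [(12, 11, 100, 20), (10, 52, 60, 20), (201, 10, 80, 20)]
similar = find_similar_form(target, training_data)
unrelated = find_similar_form([(400, 400, 10, 10)], training_data)
```

Here the slightly shifted target regions still match the stored quotation layout, while a layout with no overlapping regions yields no similar form.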
In S510, the image processing unit 432 executes the character recognition process on each of the character regions found in the character region detection process. The character recognition process of the image processing unit 432 in S510 is described later by using
In S511, the image processing unit 432 executes a process of determining whether a document file of the target image in the current operation can be set as a target of automatic transmission (hereinafter, referred to as “automatic transmission determination process”). The automatic transmission determination process is described later by using
In S512, during the aforementioned analysis process, the MFP 110 regularly (for example, every several milliseconds to several hundred milliseconds) checks an analysis process status with the MFP cooperative server 120 by using the aforementioned request ID (S512 to S513). This check is repeatedly performed until the MFP 110 obtains a response including information indicating completion of the analysis process from the MFP cooperative server 120. The request control unit 431 of the MFP cooperative server 120 checks a progress status of the analysis process corresponding to the request ID upon receiving the request for checking of the analysis process status. If the analysis process is not completed, the request control unit 431 sends back an in-process response indicating that the analysis process is currently being executed. Meanwhile, if the analysis process is completed, the request control unit 431 sends back a completion response. A character string indicating the current process status is inputted in “status” of these responses. Specifically, in the case where the process is being executed in the MFP cooperative server 120 (in-process response), “processing” is inputted in the “status” and, in the case where the process is completed (completion response), “completed” is inputted in the “status”. Note that character strings indicating other statuses, such as “failed” used in the case where the process fails, may be inputted in the “status”. Moreover, the completion response includes information on a URL or the like indicating a storage location of the image analysis results. In the case where the MFP 110 obtains the completion response of the aforementioned analysis process, the MFP 110 transmits a request for obtaining the analysis process results to the MFP cooperative server 120 by using the URL included in the completion response and indicating the storage location of the image analysis results.
In the case where the request control unit 431 of the MFP cooperative server 120 receives the request, the request control unit 431 sends back information including the analysis process results (result information) to the MFP 110. In this case, the result information includes a result of the character region detection, a result of the similar business form determination, a result of the character recognition process, and a result of the automatic transmission determination process for the target image.
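The status check and result retrieval described above can be sketched as a simple polling loop. The sketch below replaces the MFP cooperative server with an in-memory stand-in so that it runs without a network. The “status” values come from the description; the class, the method names, the “resultUrl” and “automaticTransmission” field names, and the upper bound on the number of polls are hypothetical additions made for illustration.

```python
import time

class FakeCooperativeServer:
    """In-memory stand-in for the MFP cooperative server (illustration only)."""

    def __init__(self, polls_until_done=3):
        self._remaining = polls_until_done

    def check_status(self, process_id):
        # Report "processing" until the simulated analysis has finished.
        if self._remaining > 0:
            self._remaining -= 1
            return {"status": "processing"}
        # "resultUrl" is a hypothetical field name for the storage location.
        return {"status": "completed", "resultUrl": f"/results/{process_id}"}

    def fetch_result(self, url):
        # Hypothetical result information; real contents are described in the text.
        return {"url": url, "automaticTransmission": False}

def wait_for_analysis(server, process_id, interval_s=0.01, max_polls=100):
    """Poll the status until "completed", then fetch the analysis results."""
    for _ in range(max_polls):
        response = server.check_status(process_id)
        if response["status"] == "completed":
            return server.fetch_result(response["resultUrl"])
        if response["status"] == "failed":
            raise RuntimeError("analysis failed")
        time.sleep(interval_s)  # wait before the next status check
    raise TimeoutError("analysis did not complete in time")

result = wait_for_analysis(FakeCooperativeServer(), "abc123")
```

The description repeats the check until a completion response arrives; the `max_polls` limit is added here only so that the sketch always terminates.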
In the case where the result of the automatic transmission determination process is “automatic transmission is not possible”, processes of S515 to S518 to be described later are executed to prompt the user to perform check, correction, or the like. Meanwhile, in the case where the result of the automatic transmission determination process is “automatic transmission is possible”, the processing proceeds to S520 without execution of the processes of S515 to S518 to be described later and the document file of the target image is automatically generated and automatically transmitted to the storage server 130 (S521). In the automatic generation of the document file, the character strings obtained in the character recognition process are used in the file name of this document file.
In the case where the result of the automatic transmission determination process is “automatic transmission is not possible”, in S515, the MFP 110 displays a file name setting screen 900 (see
In the case where the MFP 110 detects pressing of the aforementioned “OK” button 902, in S516, the MFP 110 displays a transmission confirmation screen 1000 (see
In S517, in the case where the request control unit 431 of the MFP cooperative server 120 receives the file name setting information from the MFP 110, the request control unit 431 instructs the image processing unit 432 to store the received file name setting information.
In S518, the image processing unit 432 receives the aforementioned storage instruction from the request control unit 431 and executes a process for storing the file name setting information. Note that details of the process for storing the file name setting information are described later.
In S518a, the image processing unit 432 notifies the request control unit 431 of information including status information indicating completion of the file name setting information storage process. In S518b, in the case where the request control unit 431 obtains the information including the status information indicating completion of the storage of the file name setting information from the image processing unit 432, the request control unit 431 transmits a response including the status information indicating this completion to the MFP 110.
In S519, in the case where the MFP 110 obtains the response transmitted from the request control unit 431 in S518b, the MFP 110 transmits a request for generating a file (file generation request) to the MFP cooperative server 120 based on information included in the obtained response. In the case where the request control unit 431 of the MFP cooperative server 120 receives the file generation request, the request control unit 431 starts a file generation process and sends back a response including information indicating normal reception of the file generation request, to the MFP 110. The MFP 110 terminates the processing upon receiving the response from the request control unit 431 and returns to the scan setting screen display of S504.
In S520, the MFP cooperative server 120 obtains information on a file format from the scan setting registered in advance and converts the target image to a file (generates a file) according to the obtained file format. In this case, for an automatically transmitted document file, a character string that was used in a past file conversion process (that is, approved by the user) and whose accumulated number of approvals satisfies the required number is automatically set as the file name. Meanwhile, for a document file transmitted according to an explicit instruction made by the user, a file name that is created by the user by appropriately correcting the character string presented on the file name setting screen and performing a final check is set.
In S521, the request control unit 431 transmits the generated file to the storage server 130. Then, in S522, in the case where the storage server 130 receives the file transmitted from the MFP cooperative server 120, the storage server 130 sends back a response indicating a status of reception completion of this file, to the MFP cooperative server 120.
The rough flow of the processing in the entire image processing system is as described above. Although the contents of the sequence diagram in
The character recognition process of S510 is described by using
In S1101, the image processing unit 432 executes the character recognition process on the character region obtained in the character region detection process of S508.
Subsequent S1102 and S1103 are processes performed for each of characters recognized in the character region being the target of the character recognition process.
In S1102, the image processing unit 432 determines whether the degree of certainty is lower than a threshold set in advance, the degree of certainty being the index indicating the certainty of the character obtained as a result of the character recognition process in S1101. In the character recognition process, the image processing unit 432 uses dictionary data in which feature amounts (patterns) of various characters are registered in advance and recognizes a character by finding, in the dictionary data, a character whose feature amount matches that of the input character. Accordingly, the higher the feature amount match ratio is, the more accurate the character recognition result is (the more likely the determined character is to be correct). The more blurred or squashed the character is, the lower the match ratio tends to be. Accordingly, the feature amount match ratio in the character recognition result is used as the degree of certainty and, if the value of the degree of certainty is low, the image processing unit 432 determines that the input character is a character that requires creation of a character replacement candidate. Meanwhile, if the value of the degree of certainty is high, the image processing unit 432 determines that the input character is a character that does not require the creation of a character replacement candidate. Note that this degree of certainty depends on the OCR engine used. Accordingly, a reference value of the degree of certainty obtained through experiments with the OCR engine used (for example, a predetermined threshold of 95% or more) is set in advance in the image processing unit 432, and the image processing unit 432 determines whether the degree of certainty is high by performing a process of comparing the threshold and the degree of certainty in the character recognition result obtained in S1101.
Then, the image processing unit 432 determines whether the degree of certainty is high by performing the process of comparing the threshold (95% or more) and the degree of certainty (for example, “score” 1.0, in
In S1103, the image processing unit 432 creates a replacement candidate character for a character whose degree of certainty in the character recognition result is determined to be low. For example, in the case where the recognized character is the number “1” and the degree of certainty in the character recognition result is low, other characters with similar shapes such as the lower case alphabet “l” and the upper case alphabet “I” are set as the replacement candidate characters. In this case, characters with character shapes similar to that of the characters recognized in the character recognition process are assumed to be learned by using a publicly known machine learning function.
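The per-character processing of S1102 and S1103 can be sketched as follows. The 0.95 threshold mirrors the 95% example in the description. The look-alike table is a hand-written, hypothetical stand-in for the similar-shape characters that the description assumes are learned by a machine learning function, and the function name and input format are likewise assumptions.

```python
CERTAINTY_THRESHOLD = 0.95  # example reference value (95%) from the description

# Hypothetical look-alike table; the description assumes this is learned elsewhere.
LOOKALIKES = {
    "1": ["l", "I"],
    "l": ["1", "I"],
    "I": ["1", "l"],
    "0": ["O"],
    "O": ["0"],
}

def replacement_candidates(recognized):
    """Map each low-certainty character position to its look-alike characters.

    recognized is a list of (character, certainty) pairs with certainty in [0, 1].
    """
    candidates = {}
    for index, (char, certainty) in enumerate(recognized):
        if certainty < CERTAINTY_THRESHOLD:               # S1102: certainty is low
            candidates[index] = LOOKALIKES.get(char, [])  # S1103: create candidates
    return candidates

# The digit "1" was recognized with low certainty; "N" and "V" with high certainty.
recognized = [("1", 0.62), ("N", 0.99), ("V", 0.98)]
candidates = replacement_candidates(recognized)
```

Only the first character falls below the threshold, so only that position receives replacement candidate characters.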
Processes of S1104 and beyond are processes performed for each character region set as the target of the character recognition process.
In S1104, the image processing unit 432 uses the replacement candidate character created in S1103 to create a correction candidate character string of the character string obtained from the character region. For example, in the case where the initially obtained character string is “Sales 1nvoice” including the number “1”, the image processing unit 432 creates a correction candidate character string such as “Sales Invoice” in which the number “1” is changed to the upper case alphabet “I”. Moreover, the image processing unit 432 may create a correction candidate character string such as “Sales lnvoice” in which the number portion is changed to the lower case alphabet “l” in some cases.
In S1105, the image processing unit 432 calculates a score for each correction candidate character string created in S1104. In this case, the image processing unit 432 uses a word dictionary stored in advance and gives a high score to the correction candidate character string in the case where the arrangement of characters matches that of a word in the dictionary. For example, regarding the character string “Sales 1nvoice”, “Sales” matches a word in the word dictionary but “1nvoice” does not. In this case, since only one word out of two words matches a word in the word dictionary, the score is 100×1/2=50. Regarding “Sales Invoice”, “Sales” and “Invoice” both match words in the word dictionary and the score is thus 100×2/2=100.
Although the score calculation for each correction candidate character string is described above by using the character string “Sales Invoice”, the score calculation is not limited to this example. For example, in the case where only the character string “Shimomaruko” in the character string “Shimomaruko Co., Ltd.” is subjected to the character recognition process and the character string “Co., Ltd.” is not recognized at all, “Shimomaruko” matches the word in the word dictionary and the score is thus 100×1/1=100. Then, in S1106 to be described later, the character string “Shimomaruko” is determined as a character string recognition result. However, even in such a case, a negative determination result is likely to be given in determination of S1203 or S1204 to be described later. Moreover, the user visually checks the character string recognition result. Accordingly, the character string “Shimomaruko” is unlikely to be set as the target of automatic transmission.
In S1106, the image processing unit 432 determines the correction candidate character string with the highest score calculated in S1105 as the final character string recognition result. The character string determined as described above is referred to as “determined character string” hereinafter.
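The processes of S1104 to S1106 can be sketched together as follows, using a recognized string in which the digit “1” stands where the letter “I” belongs. The scoring follows the formula in the description, 100 × (number of words found in the word dictionary) / (number of words). The tiny dictionary, the case-insensitive word matching, and the function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical word dictionary; a real one would be far larger.
WORD_DICTIONARY = {"sales", "invoice", "quotation"}

def correction_candidates(text, replacements):
    """Build every string obtained by swapping low-certainty characters (S1104).

    replacements maps a character index to its replacement candidate characters.
    The originally recognized character is kept as one of the options.
    """
    options = [
        [char] + replacements.get(index, [])
        for index, char in enumerate(text)
    ]
    return ["".join(chars) for chars in product(*options)]

def score(candidate):
    """100 x (words found in the dictionary) / (number of words) (S1105)."""
    words = candidate.split()
    if not words:
        return 0.0
    # Case-insensitive matching is an assumption of this sketch.
    matched = sum(1 for word in words if word.lower() in WORD_DICTIONARY)
    return 100 * matched / len(words)

def determine_string(text, replacements):
    """Return the highest-scoring candidate as the determined string (S1106)."""
    return max(correction_candidates(text, replacements), key=score)

recognized_text = "Sales 1nvoice"
replacements = {6: ["l", "I"]}  # the digit "1" had a low degree of certainty
candidates = correction_candidates(recognized_text, replacements)
determined = determine_string(recognized_text, replacements)
```

With these inputs the candidates are “Sales 1nvoice”, “Sales lnvoice”, and “Sales Invoice”, which score 50, 50, and 100 respectively, so “Sales Invoice” becomes the determined character string.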
The contents of the character recognition process are as described above.
The automatic transmission determination process of S511 that is executed subsequent to the character recognition process of S510 is described by using
In S1201, the image processing unit 432 obtains information on the determined character string of interest (hereinafter, referred to as interest determined character string) among the determined character strings obtained in the character recognition process. For example, the image processing unit 432 refers to later-described “text” in
In S1202, the image processing unit 432 refers to the aforementioned past data and obtains information on the approved character string most similar to the interest determined character string. For example, if the character string “INV” is present in the stored past data as the approved character string, the image processing unit 432 obtains the character string “INV”.
In S1203, the image processing unit 432 performs matching determination of determining whether the interest determined character string obtained in S1201 matches the approved character string obtained in S1202. “Match” described herein is a concept that does not include partial matching. If the interest determined character string matches the approved character string in the result of the determination, the processing proceeds to S1204. Meanwhile, if the interest determined character string does not match the approved character string, the processing proceeds to S1206.
In S1204, the image processing unit 432 refers to the aforementioned past data and determines whether the accumulated approval number of the approved character string matching the interest determined character string is equal to or higher than a predetermined number (equal to or higher than a certain number). In this case, the predetermined number may be set as appropriate depending on a degree at which the user desires to suppress erroneous transmission of a file. For example, in the case where the user places greater priority on the usability than the suppression of erroneous transmission in the automatic transmission of a file, the user may set the predetermined number to a smaller number (for example, 3 or the like). Meanwhile, in the case where the user desires to suppress erroneous transmission as much as possible, the user may set the predetermined number to a greater number (for example, 10 or the like). If the accumulated approval number of the approved character string is equal to or greater than the predetermined number in the result of the determination, the processing proceeds to S1205. Meanwhile, if the accumulated approval number is smaller than the predetermined number, the processing proceeds to S1206.
In S1205, the image processing unit 432 sets temporary information indicating that the automatic transmission is possible, for the interest determined character string. For example, a flag that indicates whether the automatic transmission is possible in a binary value (for example, “1” is given in the case where “automatic transmission is possible” and “0” is given in the case where the “automatic transmission is not possible”) is conceivable as the temporary information.
In S1206, the image processing unit 432 sets, for example, the aforementioned flag as the temporary information indicating that the automatic transmission is not possible, for the interest determined character string.
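As a non-limiting illustration, the determination of S1202 to S1206 for one interest determined character string may be sketched as follows. The representation of the past data as a mapping from an approved character string to its accumulated approval number, the function name, and the default predetermined number of 3 are assumptions made for this sketch. The lookup by exact key stands in for both obtaining the most similar approved character string (S1202) and the matching determination that excludes partial matching (S1203).

```python
def can_auto_send(determined: str, approved_counts: dict, required: int = 3) -> bool:
    """Return the temporary information for one determined character string.

    True corresponds to the flag "1" (automatic transmission is possible, S1205)
    and False corresponds to the flag "0" (not possible, S1206).
    """
    count = approved_counts.get(determined)  # exact match only, no partial matching
    if count is None:
        return False  # S1203: no matching approved character string
    return count >= required  # S1204: compare with the predetermined number


# Hypothetical past data: "INV" has been approved five times.
past_data = {"INV": 5}
```

With this past data, “INV” is flagged as possible for a predetermined number of 3 and as not possible for a predetermined number of 10.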
In S1207, the image processing unit 432 checks whether there is an unprocessed determined character string. If an unprocessed determined character string is left, the processing returns to the process of S1201 and the image processing unit 432 determines the next determined character string of interest and repeats the same processes. If the processing for all determined character strings is completed, the processing proceeds to S1208.
In S1208, the image processing unit 432 determines whether the automatic transmission is possible for the target image of the analysis process, based on the aforementioned temporary information given to all determined character strings included in the target image. For example, if the temporary information of “automatic transmission is possible” is set for all determined character strings, the image processing unit 432 determines that the automatic transmission of the target image is possible. Alternatively, if a proportion of the determined character strings to which the temporary information of “automatic transmission is possible” is given is equal to or greater than a certain proportion, the image processing unit 432 may determine that the automatic transmission is possible. The image processing unit 432 terminates the processing after this determination.
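As a non-limiting illustration, the determination of S1208 may be sketched as follows. The function name, the representation of the temporary information as a list of Boolean flags, and the treatment of an image with no determined character strings as not transmittable are assumptions made for this sketch.

```python
def image_can_auto_send(flags, min_proportion: float = 1.0) -> bool:
    """Decide whether the automatic transmission of the target image is possible.

    flags holds one Boolean per determined character string (True = possible).
    With min_proportion=1.0, every determined character string must be flagged
    as possible; a smaller value models the "certain proportion" alternative.
    """
    if not flags:
        return False  # assumption: nothing to base the determination on
    return sum(flags) / len(flags) >= min_proportion
```

For example, three flags of “possible” out of four satisfy a proportion of 0.75 but not the default requirement that all flags be “possible”.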
The contents of the automatic transmission determination process are as described above.
Next, description is given of the process of storing the file name setting information (S518) executed in the case where the automatic transmission is determined not to be possible.
In S1301, the image processing unit 432 obtains the file name setting information from the request control unit 431. As described in S516, the file name setting information includes the pieces of information on the character regions in the target image, the type of business form, and the like, in addition to the determined character strings used in the file name. The pieces of information on the character regions and the type of business form are pieces of information corresponding to “rect” and “formId” in the data structure illustrated in
In S1302, the image processing unit 432 determines whether the user has corrected the determined character string, based on information that indicates whether correction for the determined character string is present or absent and that is included in the file name setting information obtained in S1301. The user is assumed to operate the UI screen (for example, a software keyboard) displayed on the operation unit 220 to correct an erroneously recognized determined character string to a proper character string. This correction also includes, for example, changing the character region used in the file name generated in this operation to a character region different from that used in the initial file name, and adding or deleting a separator.
Moreover, the image processing unit 432 performs the aforementioned determination by comparing the value of “value” in “metadataArray” included in the result information with the information that is inputted into the file name input field 901 after the file name setting is performed on the scanned image in this operation. If there is a difference in the result of the comparison, the image processing unit 432 determines that the initial file name has been corrected and proceeds to S1303. If the value matches the information, the image processing unit 432 determines that the initial file name is unchanged and proceeds to S1304. For example, assume a case where, in a character string that should be recognized as “Invoice” in the character recognition process, the upper case alphabet “I” is erroneously recognized as the lower case alphabet “l”, and “lnvoice” is displayed in the file name input field 901 and is corrected by the user. In this case, since the file name setting information includes information indicating that the correction of “l” to “I” has been performed on the determined character string, the image processing unit 432 determines that the determined character string has been corrected based on this information.
In S1303, the image processing unit 432 stores the character string before the correction (that is, the character string erroneously recognized in the character recognition process) and the determined character string after the correction in link with the information on the character region of these character strings. For example, assume a case where the character string before the correction is “lnvoice” including the lower case alphabet “l” and the corrected determined character string is “Invoice” correctly including the upper case alphabet “I”. In this case, the image processing unit 432 links the “lnvoice” before the correction and the “Invoice” after the correction with each other and stores these character strings in the storage unit together with the information on the character region of these character strings as “past data”.
In S1304, the image processing unit 432 stores the determined character string in link with the information on the character region thereof. Moreover, the image processing unit 432 checks whether this determined character string is registered as the approved character string. If the determined character string is not registered, the image processing unit 432 counts the accumulated approval number as “1”. If the determined character string is registered, the image processing unit 432 increments (+1) the current count value.
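As a non-limiting illustration, the branching of S1302 to S1304 may be sketched as follows. The class name, the field names, and the representation of a character region as a tuple are assumptions made for this sketch; the actual data structure of the “past data” is not limited to this.

```python
from dataclasses import dataclass, field


@dataclass
class PastData:
    # approved character string -> accumulated approval number
    approvals: dict = field(default_factory=dict)
    # approved character string -> character region information
    regions: dict = field(default_factory=dict)
    # (string before correction, string after correction, character region)
    corrections: list = field(default_factory=list)


def store_setting(past: PastData, initial: str, final: str, region) -> None:
    """Store one determined character string of the file name setting information."""
    if initial != final:
        # S1302 -> S1303: the user corrected the string, so link before and after.
        past.corrections.append((initial, final, region))
    else:
        # S1302 -> S1304: the string was approved as is; count "1" or increment.
        past.regions[final] = region
        past.approvals[final] = past.approvals.get(final, 0) + 1
```

In this sketch, an uncorrected character string increases its accumulated approval number, whereas a corrected character string is only recorded as a linked pair.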
The contents of the process of storing the file name setting information are as described above.
As described above, according to the embodiment, in the case where a result of the OCR process on the file conversion target image has a record of being used in the file name or the like a certain number of times or more in the past, this result is automatically set as the candidate of the file name.
Specifically, in the case where the result satisfies the accumulated approval number and is determined to be automatically transmitted, the determined character string at this moment is used in the file name.
Meanwhile, in the case where the result does not satisfy the accumulated approval number and is determined not to be automatically transmitted, the temporary file name using the determined character string at this moment is presented to the user.
The result of the similar business form determination may be used to determine whether the automatic transmission is possible or not. For example, if the determined character string matches the approved character string and the type of business form relating to the determined character string is the same as the type of business form relating to the approved character string, the certainty of the determined character string is likely to be higher. Accordingly, the following configuration may be employed. In the case where the determined character string approved by the user is stored, the information indicating the type of business form in which this determined character string is written is stored together with the determined character string. Then, in the comparison between the determined character string and the approved character string, the type of business form of the determined character string is compared with the type of business form of the approved character string and the determination of whether the automatic transmission is possible or not based on the accumulated approval number is performed only in the case where the types of business form match each other.
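As a non-limiting illustration, the configuration in which the type of business form is compared before the accumulated approval number may be sketched as follows. The function name, the mapping from an approved character string to a pair of a form type and an accumulated approval number, and the example form identifiers are assumptions made for this sketch.

```python
def can_auto_send_for_form(determined: str, form_id: str, approved: dict,
                           required: int = 3) -> bool:
    """Apply the approval-number determination only if the form types match.

    approved maps an approved character string to a tuple of
    (type of business form, accumulated approval number).
    """
    entry = approved.get(determined)
    if entry is None:
        return False  # no matching approved character string
    approved_form_id, count = entry
    if approved_form_id != form_id:
        return False  # types of business form differ
    return count >= required


# Hypothetical stored data: "INV" approved four times on form type "form-A".
approved_strings = {"INV": ("form-A", 4)}
```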
In the first embodiment, a character string that has not been approved in the past or a character string that has been approved but whose number of times of approval is less than the predetermined number is not set as the target of automatic transmission to suppress automatic transmission of a file for which an incorrect file name is set. In this case, even if the degrees of certainty for all characters forming the character string in the result of the character recognition process are values close to “1” (for example, “0.9” or more) and this character string is not corrected by the user, this character string is not set as the target of automatic transmission if the character string is not approved the predetermined number of times or more. Accordingly, the first embodiment has a problem of making the user perform work of approval even though the degree of certainty of the character string is high. The second embodiment solves this problem.
Details of the automatic transmission determination process in the second embodiment are described by using
Since S1401 and S1402 are the same as S1201 and S1202 in the flow of
In S1403, as in S1203, the image processing unit 432 performs the matching determination of whether the interest determined character string matches the approved character string. If the interest determined character string matches the approved character string, the processing proceeds to S1404. Meanwhile, if the interest determined character string does not match the approved character string, the processing proceeds to S1406.
In S1404, as in S1204, the image processing unit 432 determines whether the accumulated approval number of the approved character string matching the interest determined character string is equal to or higher than the predetermined number. If the accumulated approval number is equal to or higher than the predetermined number, the processing proceeds to S1406. Meanwhile, if the accumulated approval number is less than the predetermined number, the processing proceeds to S1405.
In S1405, the image processing unit 432 determines whether the degrees of certainty for all characters forming the interest determined character string are equal to or higher than a threshold set in advance. The degrees of certainty calculated in the already executed flow of
In S1406, as in S1205, the image processing unit 432 sets the temporary information relating to the interest determined character string and indicating that the automatic transmission is possible. Moreover, the image processing unit 432 may additionally set information indicating that the degrees of certainty are equal to or higher than the threshold in the case where the degrees of certainty are equal to or higher than the threshold. An example of the information indicating that the degrees of certainty are equal to or higher than the threshold includes “∞” but is not limited to this.
In the case where the information indicating that the degrees of certainty are equal to or higher than the threshold is additionally set, this information may be set instead of the information relating to the accumulated approval number or may be set together with the information relating to the accumulated approval number.
S1407 corresponds to S1206 in the flow of
S1408 and S1409 also correspond to S1207 and S1208 in the flow of
The contents of the automatic transmission determination process relating to the second embodiment are as described above.
As described above, according to the second embodiment, even if the number of times of approval of the determined character string is less than the predetermined number, the temporary information indicating that the automatic transmission is possible is set as long as the degrees of certainty are equal to or higher than the threshold. This increases the number of document files to be targets of automatic transmission from that in the first embodiment and the work of user approval can be eliminated.
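As a non-limiting illustration, the determinations of S1404 and S1405 for an interest determined character string that has already matched an approved character string in S1403 may be sketched as follows. The function name, the default predetermined number of 3, the threshold of 0.9, and the treatment of an empty list of degrees of certainty as not satisfying the threshold are assumptions made for this sketch.

```python
def can_auto_send_with_certainty(approval_count: int, certainties,
                                 required: int = 3,
                                 certainty_threshold: float = 0.9) -> bool:
    """Return True if the automatic transmission is possible for the string.

    certainties holds the degree of certainty of each character of the string.
    """
    if approval_count >= required:
        return True  # S1404: approved the predetermined number of times or more
    # S1405: every character must reach the threshold set in advance.
    return bool(certainties) and all(c >= certainty_threshold for c in certainties)
```

In this sketch, a character string approved only once is still flagged as possible if all of its characters have a degree of certainty of 0.9 or more.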
In the first and second embodiments, the automatic transmission of a document file is determined to be possible in the case where the accumulated approval number of a certain character string is equal to or higher than the predetermined number. Meanwhile, users include users who frequently correct the character strings displayed in the file name input field 901 of
Note that, since a system configuration and a processing procedure in the third embodiment are the same as those described in the first embodiment, description thereof is omitted and only the different points are described.
Details of a file name setting information storage process performed by the image processing unit 432 of the MFP cooperative server 120 in the embodiment are described by using
In S1501, the image processing unit 432 obtains the file name setting information from the request control unit 431. The file name setting information obtained in the embodiment includes not only the information on the character strings used in the file name and the character regions of the character strings but also information on a user ID identifying the target user and information on presence or absence of correction on the determined character strings made by the target user.
In S1502, the image processing unit 432 obtains the “past data” for the target user. Note that the steps of S1503 and beyond are executed for each determined character string.
In S1503, the image processing unit 432 determines whether the target user has corrected the determined character string. Note that, since the details of the determination in this case are the same as those in S1302 in the flow of
If the target user has corrected the determined character string, in S1504, the image processing unit 432 updates the accumulated number of times the target user has corrected the determined character string (accumulated correction number) by obtaining information indicating that the target user has corrected the determined character string. Moreover, the image processing unit 432 checks whether correction history of this determined character string is registered. If the correction history is not registered, the image processing unit 432 counts the accumulated correction number as “1”. If the correction history is registered, the image processing unit 432 increments (+1) the current accumulated correction number. In S1505, the image processing unit 432 calculates the frequency at which the determined character string is corrected by the target user (hereinafter, referred to as correction frequency) based on the accumulated correction number for the determined character string. In S1506, the image processing unit 432 calculates the accumulated approval number necessary for the automatic transmission performed with the determined character string corrected by the target user used in the file name, based on the calculated correction frequency. S1507 corresponds to S1303 in the flow of
If the target user has not corrected the determined character string, in S1508, the image processing unit 432 calculates the correction frequency as in S1505. Then, in S1509, the image processing unit 432 calculates the accumulated approval number as a threshold as in S1506.
In S1510, the image processing unit 432 stores the information on the accumulated approval number and the like for each target user. Specifically, the image processing unit 432 stores the accumulated approval number newly calculated and set for each determined character string and the information identifying the target user (user ID) in link with each other as the file name setting information.
The flow of the file name setting information storage process according to the embodiment is as described above.
The processes of S1504 and beyond in the case where the target user corrects the determined character string are described by using a specific example.
For example, assume a case where a user A has used a character string “Invoice” 19 times and has performed correction 9 times out of the 19 times in the file name setting information storage process executed in the past. Then, assume that, in the 20th file name setting information storage process, the user A corrects the character string “Invoice”. The accumulated correction number for the character string “Invoice” for the user A is then incremented and is updated to 10 times (S1504). The correction frequency of the user A in this case can be obtained from, for example, the following formula (1) (S1505).
“Accumulated correction number/(accumulated correction number+number of times correction is not performed)=correction frequency” formula (1).
At this moment, the correction frequency of the user A is (10 times)/((10 times)+(10 times))=0.5 from the aforementioned formula (1).
Then, the accumulated approval number necessary for the automatic transmission performed with the determined character string “Invoice” used in the file name is calculated based on the correction frequency calculated by using the aforementioned formula (1) (S1506). For example, the following formula (2) may be used for the calculation in this case.
“(Correction frequency×10)+(accumulated approval number set in advance)=new accumulated approval number” formula (2).
Note that “×10” in the aforementioned formula (2) is a constant for converting the correction frequency expressed in a decimal to an integer. In the case where the correction frequency continues to a second decimal place or beyond, the correction frequency may be rounded down to the nearest tenth.
Assume that the accumulated approval number set in advance is 5 times. Then, the new accumulated approval number in the case where the correction frequency is 0.5 is (0.5)×10+(5 times)=10 times. If the approval number calculated based on the aforementioned formula (2) is not an integer, the approval number may be rounded to the nearest integer. As apparent from the aforementioned formula (2), the “new accumulated approval number” never falls below the “accumulated approval number set in advance”. In other words, the “accumulated approval number set in advance” functions as a minimum value of the approval number necessary for the automatic transmission performed with the determined character string used in the file name. Although the case where the accumulated approval number is changed by being increased is described herein, the present disclosure is not limited to this configuration. For example, the accumulated approval number may be changed such that a value greater than the minimum value is set as an initial value of the “accumulated approval number set in advance” and the accumulated approval number is reduced for a character string approved by a user who rarely corrects character strings.
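As a non-limiting illustration, the calculations of formula (1) and formula (2) may be sketched as follows. The function names and the handling of the case with no history are assumptions made for this sketch. The sketch rounds the correction frequency down to the nearest tenth by using integer arithmetic, so that the result of formula (2) is always an integer.

```python
def correction_frequency(corrected: int, not_corrected: int) -> float:
    """Formula (1): accumulated correction number / total number of uses."""
    total = corrected + not_corrected
    if total == 0:
        return 0.0  # assumption: no history means no corrections so far
    return corrected / total


def required_approvals(corrected: int, not_corrected: int, preset: int) -> int:
    """Formula (2): (correction frequency x 10) + preset accumulated approval number.

    The frequency is rounded down to the nearest tenth before it is multiplied
    by 10, which equals the integer division (corrected * 10) // total.
    """
    total = corrected + not_corrected
    if total == 0:
        return preset
    tenths = (corrected * 10) // total
    return preset + tenths


# User A: 10 corrections and 10 uses without correction, preset number of 5.
frequency = correction_frequency(10, 10)
new_threshold = required_approvals(10, 10, 5)
```

For the user A in the example above, the correction frequency is 0.5 and the new accumulated approval number is 10. For a user who has never corrected the character string, the new accumulated approval number remains at the number set in advance.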
According to the embodiment described above, the accumulated approval number can be set as an appropriate threshold depending on the correction frequency of the user.
Embodiment(s) of the present disclosure can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
According to the technique of the present disclosure, the user can more easily set character strings with fewer mistakes as character strings used in properties of document files.
While the present disclosure has been described with reference to exemplary embodiments, it is to be understood that the disclosure is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2020-209197, filed Dec. 17, 2020, which is hereby incorporated by reference herein in its entirety.
Number | Date | Country | Kind |
---|---|---|---|
2020-209197 | Dec 2020 | JP | national |