The present application is based on, and claims priority from JP Application Serial Number 2023-022623, filed Feb. 16, 2023, the disclosure of which is hereby incorporated by reference herein in its entirety.
The present disclosure relates to a scanning system and an information processing program.
There is known a scanning system that reads a document of a plurality of pages to generate image data of the plurality of pages, performs optical character reading (OCR) processing to recognize characters in the document, extracts a table of contents, and generates an electronic document to which bookmark information is added (for example, see JP-A-2021-197616). A user who views the generated electronic document can search for a desired location from a main text by referring to the table of contents in the bookmark information.
JP-A-2021-197616 is an example of the related art.
The document of the plurality of pages may include a cover, a preface, and the like to which page numbers of the main text are not attached. In this case, a difference occurs between page numbers sequentially assigned to image data by the scanning system and page numbers assigned to the main text of the document. Such a difference in the page numbers is inconvenient when searching for a desired location from the main text.
A scanning system according to the present disclosure includes:
In addition, in a non-transitory computer-readable storage medium storing an information processing program according to the present disclosure, the program causes a computer to execute: a recognition function of recognizing a character included in first image data of a plurality of pages read from a document; a difference elimination function of detecting, based on a recognition result obtained by the recognition function, a difference between first page information obtained from the recognition result and second page information sequentially assigned to image data of the pages, and generating table-of-contents information including page information in which the difference is eliminated; and
In addition, a method for generating second image data according to the present disclosure includes:
Hereinafter, an embodiment of the present disclosure will be described. Of course, the following embodiment merely illustrates the present disclosure, and not all of the features described in the embodiment are necessarily essential to the solutions disclosed herein.
First, an overview of a technique included in the present disclosure will be described with reference to examples shown in
As shown in
The image data DA2 of the plurality of pages output from the output unit 62 includes the table-of-contents information T1. The table-of-contents information T1 includes page information in which the difference (Ns) between the first page information PA1 obtained from the recognition result R1 of the image data DA1 of the plurality of pages read from the document OR1 and the second page information PA2 sequentially assigned to image data is eliminated. Therefore, according to the above aspect, it is possible to provide a scanning system that outputs image data having a table of contents in which a difference occurring in page information is eliminated from image data of a plurality of pages read from a document.
Here, the scanning system may be a single device such as a copier (including a multifunction peripheral) or a plurality of devices such as an image reading device and a host device.
The table-of-contents information may be information (for example, bookmark information) added to main body data using image data of a plurality of pages read from a document as main body data, or information in which page information after correction is embedded in a position of a table of contents in the main body data.
The page information of the table-of-contents information may be the first page information or the second page information.
In the present application, “first”, “second”, and the like are terms used to identify each component included in a plurality of components having similarities, and do not mean an order.
An output of the image data of the plurality of pages obtained by the output unit may be an output outside of the output unit, and may be an output to an external device coupled to an image forming device, an e-mail destination, an output to a storage unit in the image forming device, a print of the image data, a display of the image data, or the like.
The above-described additional features are also applied to the following aspects.
As shown in
In the above case, the first page information PA1 is displayed on the table-of-contents information T1 including the link L1 on which the image data of the second page is displayed. Therefore, in the above aspect, it is possible to obtain the image data having a linked table of contents in which the difference occurring in the page information is eliminated.
The first page information may be displayed in the table-of-contents information, or both the first page information and the second page information may be displayed in the table-of-contents information. The additional features are also applied to the following aspects.
As shown in
In the above case, the first page information PA1 starting from the page in which the heading C1 at a start location searched from the recognition result R1 of characters is present is displayed as the bookmark B2a, and the image data of a second page is displayed according to the link L1 of the bookmark B2a. Therefore, in the above aspect, it is possible to provide a preferable example of obtaining the image data having a linked table of contents in which the difference occurring in the page information is eliminated.
As shown in
In the above case, the first page information PA1 included in the table of contents T2 included in the recognition result R1 of the characters is displayed as the bookmark B2a, and the image data of the second page is displayed according to the link L1 of the bookmark B2a. Therefore, in the above aspect, it is also possible to provide a preferable example of obtaining the image data having a linked table of contents in which the difference occurring in the page information is eliminated.
As shown in
The second page information may be displayed in the table-of-contents information, or both the first page information and the second page information may be displayed in the table-of-contents information. The additional features are also applied to the following aspects.
As shown in
In the above case, not only the table of contents T2 but also the difference occurring in the page information included in the image data DA1 of the plurality of pages read from the document OR1 is eliminated. Therefore, in the above aspect, it is possible to provide a preferable example of obtaining image data in which the difference occurring in the page information is eliminated.
As shown in
In the above case, not only the table of contents T2 but also the difference occurring in the page information attached to the main text TXT is eliminated. Therefore, in the above aspect, it is also possible to provide a preferable example of obtaining image data in which the difference occurring in the page information is eliminated.
The difference elimination unit 61 may identify, based on the recognition result R1, a position of page information included in the image data DA1 of the plurality of pages. As shown in
In the above case, when the page information included in the image data DA1 of the plurality of pages indicates the page of another document, the second page information PA2 is not added, and when the page information included in the image data DA1 of the plurality of pages indicates the page of the image data DA1 of the plurality of pages, the second page information PA2 is added. Therefore, in the above aspect, it is possible to provide a preferable example of obtaining image data in which the difference occurring in the page information is eliminated.
The output unit 62 may output the image data DA2 of the plurality of pages having the table-of-contents information T1 as a PDF file. In the aspect, it is possible to provide a preferable example of obtaining the image data having a table of contents in which the difference occurring in the page information is eliminated.
In an information processing program PRO according to an aspect of the present technique, as shown in
According to the above aspect, it is possible to provide an information processing program for acquiring image data of a plurality of pages read from a document and outputting image data having a table of contents in which a difference occurring in the page information regarding the acquired image data is eliminated.
Further, the present technique can be applied to an information processing device included in the above-described scanning system, a complex system including the above-described scanning system, a scanning method, a method for producing scanning data, an information processing method included in the above-described scanning method, a computer-readable medium recording the above-described information processing program, a device for performing information processing, and the like. The above-described information processing device may include a plurality of distributed parts.
The copier 1 shown in
The control unit 10 includes a CPU 11 as a processor, a ROM 12 as a semiconductor memory, a RAM 13 as a semiconductor memory, a storage unit 14, an I/F 15, and the like, and controls the operation panel 20, the document reading unit 30, the file output unit 50, and the like. Here, the CPU is an abbreviation for central processing unit, the ROM is an abbreviation for read only memory, the RAM is an abbreviation for random access memory, and the I/F is an abbreviation for interface. At least one of the storage unit 14 and the ROM 12 stores the information processing program PRO that causes a computer to function as the copier 1. The CPU 11 executes the information processing program PRO while using the RAM 13 as a work area to perform various kinds of processing such as control processing of the operation panel 20, control processing of the document reading unit 30, and control processing of the file output unit 50. The storage unit 14 may be a semiconductor memory called a flash memory, a magnetic recording medium called a hard disk, or the like. When the external device 100 is coupled, the I/F 15 transmits and receives data to and from the external device 100 according to a predetermined communication protocol. The external device 100 may be a personal computer including a tablet terminal, a mobile phone such as a smartphone, or a storage device such as a memory card.
The processor implementing the control unit 10 is not limited to one CPU, and may be a plurality of CPUs, a hardware circuit such as an ASIC, a combination of the CPU and the hardware circuit, or the like. The ASIC is an abbreviation of application specific integrated circuit.
The operation panel 20 includes a display unit 21 that displays a screen, an input unit 22 that receives an operation on a display screen, and the like. The operation panel 20 may include a dedicated CPU. A display panel such as a liquid crystal panel, or the like can be used as the display unit 21. As the input unit 22, a touch panel attached to a surface of the screen of the display unit 21, a hard key such as a keyboard, a pointing device, or the like can be used.
The document reading unit 30 includes a document conveying unit 31 that conveys the document OR1, a reading unit 32 that reads the document OR1, an image processing unit 33 that performs set image processing on the image data DA1, and the like. The document reading unit 30 may include a dedicated CPU. The document reading unit 30 reads the document OR1 and generates the image data DA1 of the plurality of pages read from the document OR1. The document conveying unit 31 includes, for example, a feeding tray, a feeding roller pair, a document separating unit, a multi-feed detection unit, a conveying roller pair, a discharging roller pair, and a discharging tray. The document conveying unit 31 that continuously feeds a plurality of documents OR1 to the reading unit 32 is called an ADF or an automatic feeding device. Here, the ADF is an abbreviation for auto document feeder. The reading unit 32 sequentially reads a plurality of documents OR1, generates the image data DA1 of the plurality of pages corresponding to the plurality of documents OR1, and stores the image data DA1 in the memory 40. The reading unit 32 may be, for example, an image sensor of a contact image sensor type, abbreviated as a CIS type, or of a charge-coupled device type, abbreviated as a CCD type; a CMOS image sensor; a solid-state image sensor such as a line sensor or an area sensor including a CCD; or a digital camera. Here, the CMOS is an abbreviation for complementary metal-oxide semiconductor. The image processing unit 33 performs image processing of adjusting a color and the like according to an image setting such as a color on the image data DA1 of the plurality of pages stored in the memory 40. As the memory 40, a RAM, a nonvolatile semiconductor memory such as a flash memory, or the like can be used.
The file output unit 50 includes an image loading unit 51, the character recognition unit 52, a file generation unit 53, a print engine 54, and the like. The file output unit 50 may include a dedicated CPU. The image loading unit 51 transfers the image data DA1 from the memory 40 to the character recognition unit 52. The character recognition unit 52 sequentially performs OCR processing on the image data DA1 of the plurality of pages, recognizes characters included in the image data DA1, and generates the recognition result R1. Here, the OCR is an abbreviation for optical character reading. The file generation unit 53 generates, based on the recognition result R1 obtained by the character recognition unit 52, the table-of-contents information T1 (see
The information processing program PRO causes the copier 1 to implement the recognition function FU1, the difference elimination function FU2, the output function FU3, and the like. The recognition function FU1 recognizes the characters included in the image data DA1 of the plurality of pages from the document OR1. A recognition program for causing the copier 1 to implement the recognition function FU1 may be executed by the CPU 11 of the control unit 10, may be executed by the CPU of the file output unit 50, or may be executed by both the CPU 11 of the control unit 10 and the CPU of the file output unit 50. The file output unit 50 that executes the recognition program functions as the character recognition unit 52. The difference elimination function FU2 detects, based on the recognition result R1 obtained by the recognition function FU1, a difference between the first page information PA1 obtained from the recognition result R1 and the second page information PA2 sequentially assigned to the image data DA1. The difference elimination function FU2 generates the table-of-contents information T1 including the page information in which the difference is eliminated. A difference elimination program for causing the copier 1 to implement the difference elimination function FU2 may be executed by the CPU 11 of the control unit 10, may be executed by the CPU of the file output unit 50, or may be executed by both the CPU 11 of the control unit 10 and the CPU of the file output unit 50. The control unit 10 and the file output unit 50 that execute the difference elimination program function as the difference elimination unit 61. The output function FU3 outputs the image data DA2 of the plurality of pages having the table-of-contents information T1. 
An output program for causing the copier 1 to implement the output function FU3 may be executed by the CPU 11 of the control unit 10, may be executed by the CPU of the file output unit 50, or may be executed by both the CPU 11 of the control unit 10 and the CPU of the file output unit 50. The control unit 10 and the file output unit 50 that execute the output program function as the output unit 62.
The storage unit 14 storing the information processing program PRO can be said to be a computer-readable medium recording the information processing program PRO. When the information processing program PRO is recorded in an external recording medium, the recording medium can be said to be a computer-readable medium recording the information processing program PRO.
As shown in
In the specific example, in order to cope with the above-described inconvenience, the image data DA2 of the plurality of pages having the table-of-contents information T1 including the page information in which the difference is eliminated is generated based on recognition result R1.
In the document OR1 shown in
The difference elimination unit 61 first generates bookmark information B1 by associating each heading C0 with the second page information PA2 starting from the first page. The bookmark information B1 includes the second page information PA2 starting from the first page both in the part to be the bookmark B2a and in the link L1. Because the part to be the bookmark B2a then shows the second page information PA2, which differs from the first page information PA1 included in the displayed image data DA2, the user may search for a desired location of the image data DA2 according to the first page information PA1 in accordance with the document OR1. For example, when the user looks for the main text starting from the heading “AAA” in the image data DA2 of the plurality of pages and mistakenly searches the image data DA2 for the page information in accordance with the document OR1, the user arrives at the fifth page, which is different from the third page that was actually sought.
Then, the difference elimination unit 61 replaces the second page information PA2 of the part to be the bookmark B2a in the bookmark information B1 with the first page information PA1. The obtained bookmark information B2 is the table-of-contents information T1 in which the first page information PA1 in accordance with the document OR1 is displayed on the bookmark B2a, and is the table-of-contents information T1 including the link L1 on which the image data DA2 corresponding to the second page information PA2 starting from the first page is displayed. The bookmark information B2 can also be referred to as a linked table of contents.
The output unit 62 uses the image data DA1 of the plurality of pages read from the document OR1 as a main body DA2a, adds the bookmark information B2 to the main body DA2a, and generates a file (the image data DA2 of the plurality of pages). For example, when the user looks for the main text starting from the heading “AAA” in the image data DA2 of the plurality of pages and searches the image data DA2 for the first page shown in the bookmark B2a, the user can find the correct page, which is the third page counted from the first page of the image data.
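The two-step generation of the bookmark information described above can be illustrated by the following minimal sketch. It is not the implementation of the difference elimination unit 61; the `Bookmark` class, the function names `build_b1` and `build_b2`, and the heading “BBB” with its page are hypothetical and are introduced only for illustration. The sketch assumes that the skip page number has already been obtained and that each heading has already been located in the recognition result.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    heading: str       # heading text, e.g. "AAA"
    display_page: int  # page number shown in the bookmark (part B2a)
    link_page: int     # page of the image data opened by the link (L1)


def build_b1(heading_pages: dict[str, int]) -> list[Bookmark]:
    """B1: both the displayed page and the link use the sequential page (PA2)."""
    return [Bookmark(h, p, p) for h, p in heading_pages.items()]


def build_b2(b1: list[Bookmark], skip_pages: int) -> list[Bookmark]:
    """B2: the display shows the document page (PA1); the link keeps PA2."""
    return [Bookmark(b.heading, b.link_page - skip_pages, b.link_page) for b in b1]


# Hypothetical input: heading "AAA" is on the third scanned page,
# preceded by two pages without main-text page numbers.
b1 = build_b1({"AAA": 3, "BBB": 5})
b2 = build_b2(b1, skip_pages=2)
```

In this sketch the link target is never modified, so operating the bookmark always opens the page counted from the first page of the image data, while the displayed number follows the document.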
As shown in
In the document OR1 shown in
The difference elimination unit 61 generates the bookmark information B1 based on the recognized table of contents T2. The bookmark information B1 includes the first page information PA1 in accordance with the document OR1 in a part to be the bookmark B2a and in the link L1. Since the first page information PA1, which is different from the second page information PA2 starting from the first page, is in the link L1, the user may search for a desired location according to the link L1 from the image data DA2 of the plurality of pages by an operation of the displayed bookmark B2a. For example, when the user searches for a main text starting from the heading “AAA” from the image data DA2 of the plurality of pages, the user finds a first page that is different from the third page, which is supposed to be originally searched, by the operation of the heading “AAA” included in the bookmark B2a.
Then, the difference elimination unit 61 replaces the first page information PA1 of the link L1 with the second page information PA2. The obtained bookmark information B2 is the table-of-contents information T1 in which the first page information PA1 in accordance with the document OR1 is displayed on the bookmark B2a, and is the table-of-contents information T1 including the link L1 on which the image data DA2 corresponding to the second page information PA2 starting from the first page is displayed.
The output unit 62 uses the image data DA1 of the plurality of pages read from the document OR1 as a main body DA2a, adds the bookmark information B2 to the main body DA2a, and generates a file (the image data DA2 of the plurality of pages). For example, when the user searches for a main text starting from the heading “AAA” from the image data DA2 of the plurality of pages, the user can find the correct third page starting from the first page by the operation of the heading “AAA” included in the bookmark B2a.
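When the bookmark information is built from a recognized table of contents, only the link needs correction. The following is a minimal sketch under the assumption that the table of contents has already been parsed into pairs of a heading and a document page number, and that the skip page number is known; the function name `relink_bookmarks` and the dictionary keys are hypothetical.

```python
def relink_bookmarks(toc_entries, skip_pages):
    """Keep the document page (PA1) for display and point the link at PA2.

    toc_entries: list of (heading, document_page) pairs taken from the
    recognized table of contents (assumed already parsed).
    skip_pages: number of leading pages without main-text page numbers.
    """
    return [
        {
            "heading": heading,
            "display_page": document_page,           # PA1, shown in the bookmark
            "link_page": document_page + skip_pages,  # PA2, opened by the link
        }
        for heading, document_page in toc_entries
    ]


# Heading "AAA" is listed on document page 1 and sits on the third scanned page.
bookmarks = relink_bookmarks([("AAA", 1)], skip_pages=2)
```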
Hereinafter, with reference to
It is assumed that the user sets a plurality of documents OR1 on the document conveying unit 31 serving as an ADF before instructing file generation.
When the file generation processing is started, the document reading unit 30 reads the document OR1 to generate the image data DA1 of a plurality of pages (S102). The document reading unit 30 stores the image data DA1 of the plurality of pages read from the document OR1 in the memory 40 (S104).
Next, the image loading unit 51 transfers the image data DA1 from the memory 40 to the character recognition unit 52, and the character recognition unit 52 performs OCR processing on the image data DA1 of the plurality of pages in a reading order and performs character recognition processing to recognize characters included in the image data DA1 (S106).
Next, the file generation unit 53 searches for the heading C0 of each chapter from the recognition result R1, and acquires a page number NP2 of the image data DA1 in which each heading C0 is present (S108). The page number NP2 is a numerical value indicating the second page information PA2 starting from a first page. The file generation unit 53 can acquire the page number NP2 of the image data DA1 in which each heading C0 is present, by searching from the recognition result R1 for a conspicuous part such as a large font in the main text, and a combination of numbers and other characters. In the example shown in
Next, the file generation unit 53 calculates the skip page number Ns by subtracting 1 from the page number NP2 where the first heading C1 is present (S110). In the example shown in
Next, the file generation unit 53 branches the processing according to whether the table of contents T2 including the page number NP1 in accordance with the document OR1 is included in the recognition result R1 (S112). When there is a page in which the “table of contents” is included in the recognition result R1 and the “table of contents” is a conspicuous part in which a font thereof is larger than that of other parts, the file generation unit 53 can determine that the table of contents T2 is included in the recognition result R1 in a page including the “table of contents”. When the table of contents T2 is included in the recognition result R1 as shown in
When the table of contents T2 is not included in the recognition result R1, the file generation unit 53 generates the bookmark information B1 (see
As described above, the difference elimination unit 61 detects a difference between the first page information PA1 and the second page information PA2 as the skip page number Ns, and generates the table-of-contents information T1 including the page information in which the difference is eliminated. Addition of the first page information PA1 to the bookmark information is not limited to correcting the second page information PA2 to the first page information PA1, and the first page information PA1 may be written together with the second page information PA2. For example, when the page number NP2=3 corresponds to the page number NP1=1, the difference elimination unit 61 may replace a display location “3” in the second page information PA2 with “3(1)”, “1(3)”, or the like.
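The arithmetic of the skip page number and the optional combined display can be sketched as follows. This is an illustration under stated assumptions rather than the actual processing of the file generation unit 53; the function names and the `style` values are hypothetical labels chosen for this sketch.

```python
def skip_page_number(first_heading_page: int) -> int:
    """Ns: the sequential page (NP2) of the first heading minus 1."""
    if first_heading_page < 1:
        raise ValueError("page numbers start at 1")
    return first_heading_page - 1


def display_label(np2: int, ns: int, style: str = "first") -> str:
    """Return the text shown for a bookmark whose sequential page is np2."""
    np1 = np2 - ns  # document page number (first page information)
    if style == "first":
        return str(np1)
    if style == "second(first)":
        return f"{np2}({np1})"
    if style == "first(second)":
        return f"{np1}({np2})"
    raise ValueError(f"unknown style: {style}")


ns = skip_page_number(3)  # the first heading is on the third scanned page
label = display_label(3, ns, style="second(first)")
```

With the first heading on the third scanned page, the sketch yields Ns = 2, so the page number NP2 = 3 is displayed as “1”, “3(1)”, or “1(3)” depending on the chosen style.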
Next, the file generation unit 53 generates an electronic file by using the image data DA1 of the plurality of pages read from the document OR1 as the main body DA2a and adding the bookmark information B2 to the main body DA2a (S118). The obtained file is the image data DA2 of the plurality of pages having the bookmark information B2 as the table-of-contents information T1. The file generation unit 53 can generate, for example, a PDF file including the main body DA2a and the bookmark information B2. Thereafter, the control unit 10 outputs the file according to a setting (S120), and ends file generation processing. An output destination of the file may be the external device 100, an e-mail destination, the storage unit 14 included in the copier 1, the print engine 54 for printing on the main body DA2a, the display unit 21 for display, or the like. For example, when the file is displayed on the external device 100 or the display unit 21, the user can find a correct page by searching for the page of the first page information PA1 indicated by the bookmark B2a from the image data DA2.
As shown in
As described above, the difference elimination unit 61 detects a difference between the first page information PA1 and the second page information PA2 as the skip page number Ns, and generates the table-of-contents information T1 including the page information in which the difference is eliminated.
Next, as in S118, the file generation unit 53 generates an electronic file, for example, a PDF file, by using the image data DA1 of the plurality of pages read from the document OR1 as the main body DA2a and adding the bookmark information B2 to the main body DA2a (S156). The obtained file is the image data DA2 of the plurality of pages having the bookmark information B2 as the table-of-contents information T1. Thereafter, as in S120, the control unit 10 outputs the file according to a setting (S158), and ends file generation processing. For example, when the file is displayed on the external device 100 or the display unit 21, the user can find a correct page starting from the first page according to the link L1 by the operation of the heading included in the bookmark B2a.
As described above, the table-of-contents information T1 included in the image data DA2 of the plurality of pages to be output includes page information in which the difference between the first page information PA1 and the second page information PA2 is eliminated. Therefore, the present scanning system SY1 can output image data having a table of contents in which a difference occurring in page information is eliminated from the image data DA1 of the plurality of pages read from the document OR1.
Even when the table of contents T2 is included in the image data DA1 of the plurality of pages as shown in
S152 to S158 in the file generation processing shown in
In the document OR1 shown in
Since the first page information PA1 indicated by the table of contents T2 included in the image data DA1 of the plurality of pages is different from the second page information PA2 starting from the first page, the user may search for a desired location in the image data DA2 according to the first page information PA1 in accordance with the document OR1. For example, when the user looks for the main text starting from the heading “AAA” included in the table of contents T2 in the image data DA2 of the plurality of pages and mistakenly searches the image data DA2 for the page information in accordance with the document OR1, the user arrives at the second page, which is different from the fifth page that was actually sought.
In addition to the table of contents T2, since the first page information PA1 included in the image data DA1 of the plurality of pages is different from the second page information PA2 starting from the first page, the user may search for a desired location in the image data DA2 according to the first page information PA1 in accordance with the document OR1.
Then, the difference elimination unit 61 identifies, based on the recognition result R1, a position of the first page information PA1 included in the image data DA1 of the plurality of pages, and adds the second page information PA2 to the image data DA1 of the plurality of pages at the position of the first page information PA1. In other words, the difference elimination unit 61 rewrites a description of a page number in an image included in the image data DA1 of the plurality of pages to an actual page number of the image data DA1. A value obtained by subtracting the page number corresponding to the first page information PA1 from the page number corresponding to the second page information PA2 corresponds to the skip page number Ns in the first specific example. The table-of-contents information T1 obtained based on the table of contents T2 is information in which the second page information PA2 starting from the first page is displayed. The output unit 62 generates a file including the image data DA2 of the plurality of pages in which the page information is replaced. For example, when the user looks for the main text starting from the heading “AAA” included in the table-of-contents information T1 in the image data DA2 of the plurality of pages and searches the image data DA2 for the fifth page shown in the table-of-contents information T1, the user can find the correct fifth page starting from the first page.
Hereinafter, with reference to
An upper part of
The file generation processing shown in
When the file generation processing is started, as in S102 shown in
Next, as in S106 shown in
Next, the file generation unit 53 identifies, based on the recognition result R1, the page of the table of contents T2 in the image data DA1 of the plurality of pages, and identifies all positions of the first page information PA1 present in the page of the table of contents T2 (S208). When there is a page in which the “table of contents” is included in the recognition result R1 and the “table of contents” is a conspicuous part in which a font thereof is larger than that of other parts, the file generation unit 53 can determine that the table of contents T2 is included in the recognition result R1 in a page including the “table of contents”. For example, in the example shown in
Next, the file generation unit 53 identifies, based on the recognition result R1, the position of the main text TXT and the position of the first page information PA1 attached to the main text TXT in a page after the page of the table of contents T2 among the image data DA1 of the plurality of pages (S210). The file generation unit 53 may determine, for example, information at a position assumed to be the page information, such as a lower center position or an upper right position in the image data DA1, as the first page information PA1, and may store the position of the first page information PA1. In addition, for example, the file generation unit 53 can determine that information at a position not assumed to be the page information in the image data DA1 is the main text TXT, and may store the position of the main text TXT.
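The position-based distinction between page information and main text can be sketched as follows. This is only one possible heuristic under stated assumptions: the `TextBox` structure, the normalized coordinates, the threshold values, and the digits-only check are all hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    text: str
    x: float  # horizontal centre, 0.0 (left edge) to 1.0 (right edge)
    y: float  # vertical centre, 0.0 (top edge) to 1.0 (bottom edge)


def is_page_info(box: TextBox) -> bool:
    """Treat a digits-only box at the lower center or upper right as page information."""
    if not box.text.strip().isdigit():
        return False
    lower_center = box.y > 0.9 and 0.4 <= box.x <= 0.6  # assumed thresholds
    upper_right = box.y < 0.1 and box.x > 0.8           # assumed thresholds
    return lower_center or upper_right


def split_page(boxes):
    """Separate recognized boxes into page information and main text."""
    page_info = [b for b in boxes if is_page_info(b)]
    main_text = [b for b in boxes if not is_page_info(b)]
    return page_info, main_text


boxes = [
    TextBox("AAA", 0.2, 0.15),
    TextBox("2", 0.5, 0.5),   # a number inside the main text
    TextBox("2", 0.5, 0.95),  # a number at the lower center
]
page_info, main_text = split_page(boxes)
```

In this sketch a number that appears in the body of the page is kept as main text, so only the box at the lower center is stored as a position of page information.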
Next, the file generation unit 53 adds the second page information PA2 starting from the first page to the position of the first page information PA1 in accordance with the document OR1 on the page of the table of contents T2 (S212). Accordingly, as shown in
As described above, the difference elimination unit 61 generates the table-of-contents information T1 in which the second page information PA2 is displayed as the page information in which the difference is eliminated. Addition of the second page information PA2 to the page of the table of contents T2 is not limited to correcting the first page information PA1 to the second page information PA2, and the second page information PA2 may be written together with the first page information PA1. For example, when the page number “5” as the second page information PA2 corresponds to the page number “2” as the first page information PA1, the difference elimination unit 61 may replace a display location “2” of the first page information PA1 with “2(5)”, “5(2)”, or the like.
Next, the file generation unit 53 adds the second page information PA2 to the image data DA1 of the plurality of pages at the position of the first page information PA1 attached to the main text TXT without adding the second page information PA2 to the position of the main text TXT (S214). Accordingly, as shown in
Of course, addition of the second page information PA2 to the position of the first page information PA1 is not limited to correcting the first page information PA1 to the second page information PA2, and the second page information PA2 may be written together with the first page information PA1. For example, when the page number “2” as the second page information PA2 corresponds to the page number “5” as the first page information PA1, the difference elimination unit 61 may replace a display location “2” of the first page information PA1 with “2(5)”, “5(2)”, or the like.
Next, the file generation unit 53 generates an electronic file including the image data DA2 of the plurality of pages having the obtained table-of-contents information T1 (S216). Thereafter, the control unit 10 outputs the file according to a setting (S218), and ends file generation processing. For example, when the file is displayed on the external device 100 or the display unit 21, the user can find a correct page when the user searches for the page of the second page information PA2 indicated by the page of the table of contents T2 from the image data DA2.
As described above, the table-of-contents information T1 included in the image data DA2 of the plurality of pages to be output includes page information in which the difference between the first page information PA1 and the second page information PA2 is eliminated. Therefore, the scanning system SY1 in the second specific example can also output image data having a table of contents in which a difference occurring in page information is eliminated from the image data DA1 of the plurality of pages read from the document OR1.
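The correspondence between the first page information PA1 and the second page information PA2 can be illustrated by the following minimal sketch. It assumes, for illustration only, that the recognition result has already been reduced to one recognized page number per scanned page, with `None` for pages without a page number. The function names are hypothetical.

```python
from typing import Dict, List, Optional


def build_page_map(recognized: List[Optional[int]]) -> Dict[int, int]:
    """Map each recognized page number to the sequential number of the scanned page.

    recognized[i] is the page number read on the i-th scanned page, or None if
    no page number was recognized there. Sequential numbers start from 1.
    """
    mapping: Dict[int, int] = {}
    for index, first in enumerate(recognized):
        if first is not None:
            mapping[first] = index + 1
    return mapping


def has_difference(mapping: Dict[int, int]) -> bool:
    """True if any recognized page number differs from its sequential page number."""
    return any(first != second for first, second in mapping.items())
```

When `has_difference` returns `False` in this sketch, no page information needs to be rewritten, which corresponds to a document in which no difference exists from the beginning.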
In the example shown in
The file generation unit 53 in the third specific example includes a page information determination unit that determines whether the page information included in the main text TXT indicates a page of the main text TXT itself or a page of a document other than the document OR1. The page information determination unit can use, for example, a page information determination model generated by machine learning for determining whether the page information included in the main text TXT indicates the page of the main text TXT itself or the page of the document other than the document OR1.
An upper part of
When the file generation processing is started, the processing of S202 to S214 shown in
Next, the file generation unit 53 determines, based on the recognition result R1, whether each piece of page information included in the main text TXT indicates a page of the main text TXT or a page of another document (S254). Therefore, in S208 to S210 shown in
Next, the file generation unit 53 adds the second page information PA2 starting from the first page to the image data DA1 at the position of the page information PA12 indicating the page of the main text TXT (S256). Therefore, in S212 to S214 shown in
Further, the difference elimination unit 61 adds the second page information PA2 to the image data DA1 of the plurality of pages at the position where the page information included in the image data DA1 of the plurality of pages indicates the page of the image data DA1 of the plurality of pages.
Of course, addition of the second page information PA2 to the position of the first page information PA1 is not limited to correcting the first page information PA1 to the second page information PA2, and the second page information PA2 may be written together with the first page information PA1. For example, when the page number “2” as the second page information PA2 corresponds to the page number “5” as the first page information PA1, the difference elimination unit 61 may replace a display location “2” of the first page information PA1 with “2(5)”, “5(2)”, or the like.
Next, the file generation unit 53 generates an electronic file including the image data DA2 of the plurality of pages with the second page information PA2 added to a position indicating the page of the image data DA1 (S216). Thereafter, the control unit 10 outputs the file according to a setting (S218), and ends file generation processing. For example, when the file is displayed on the external device 100 or the display unit 21, the user can find a correct page when the user searches for the page of the page information PA12 indicating the page of the main text TXT from the main text TXT.
As described above, when the page information included in the image data DA1 indicates the page of another document, the second page information PA2 is not added, and when the page information included in the image data DA1 indicates the page of the image data DA1, the second page information PA2 is added. Therefore, the scanning system SY1 in the third specific example can output image data in which a difference occurring in page information is more appropriately eliminated from the image data DA1 of the plurality of pages read from the document OR1.
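The selective addition in the third specific example can be sketched as follows. The page information determination unit described above may use a model generated by machine learning. The sketch below substitutes a simple keyword rule, `refers_to_own_document`, purely as a hypothetical stand-in, and the pattern `PAGE_REF` and all function names are likewise assumptions.

```python
import re
from typing import Callable, Dict

# Assumed surface form of a page reference inside the main text.
PAGE_REF = re.compile(r"\bpage (\d+)\b")


def refers_to_own_document(sentence: str, match: "re.Match[str]") -> bool:
    """Hypothetical stand-in for the determination model.

    Treats "page N of <something>" as a reference to another document.
    """
    tail = sentence[match.end():].lstrip().lower()
    return not tail.startswith("of ")


def rewrite_references(
    sentence: str,
    first_to_second: Dict[int, int],
    is_own: Callable[[str, "re.Match[str]"], bool] = refers_to_own_document,
) -> str:
    """Rewrite only the page references judged to indicate a page of the document itself."""

    def repl(match: "re.Match[str]") -> str:
        first = int(match.group(1))
        if is_own(sentence, match) and first in first_to_second:
            return f"page {first_to_second[first]}"
        return match.group(0)  # reference to another document, or no mapping: keep as is

    return PAGE_REF.sub(repl, sentence)
```

The predicate is passed as a parameter so that a trained determination model could be used in place of the keyword rule without changing the rewriting step.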
Various modifications of the present disclosure are considered.
For example, the copier 1 may be a multifunction peripheral having a facsimile communication function or the like. In the scanning system SY1, a dedicated scanner, a digital camera, a smartphone, a personal computer, or the like may be used instead of the document reading unit in the copier 1. Alternatively, a dedicated scanner, a digital camera, a smartphone, a personal computer, or the like may include all of the document reading unit, the recognition unit, the difference elimination unit, and the output unit, and may be used in place of the copier 1 itself.
A part of the above-described processing may be performed by the external device 100. In this case, a combination of the copier 1 and the external device 100 is an example of the scanning system SY1.
The above-described processing can be appropriately changed, such as changing the order. For example, in the file generation processing shown in
As described above, according to the present disclosure, it is possible to provide a technique capable of outputting image data having a table of contents in which a difference occurring in page information is eliminated from image data of a plurality of pages read from a document. Of course, basic operations and effects described above can also be obtained by the technique including only constituent features according to the independent claims.
In addition, a configuration in which the configurations disclosed in the above-described examples are mutually replaced or a combination thereof is changed, a configuration in which the configurations disclosed in the known technique and the above-described examples are mutually replaced or a combination thereof is changed, and the like may be implemented. Further, the user may select which aspect of the various aspects described above is used to eliminate a difference occurring in page information, and the selected aspect may be used to eliminate the difference occurring in the page information. In addition, depending on the document, it may be determined that there is no difference from the beginning as a result of recognizing first page information. In such a case, second image data may be generated and output as it is. The present disclosure also includes such configurations.
Number | Date | Country | Kind |
---|---|---|---|
2023-022623 | Feb 2023 | JP | national |