This application is based on Japanese Patent Applications No. 2010-238861 filed on Oct. 25, 2010 and No. 2010-268194 filed on Dec. 1, 2010, the contents of which are incorporated herein by reference.
1. Technical Field
The present invention relates to a data processing device and a data processing method for retrieving image data such as scanned data for storage in a specified format.
2. Description of Related Arts
The multiple functions of recent scanners have increased users' freedom in specifying the saving format for scanned data generated by these scanners. For example, Japanese Patent Application Publication No. 2006-146486 discloses an MFP capable of dividing a document image into small objects such as texts, illustrations, photographs, and tables by means of vector-scanning, and storing them in a user-specified file format for each object type.
In this context, various efforts have been made for effective use of scanned data, and most recent scanners are provided with a function to retrieve scanned data in JPEG format for storage in a specified file format such as Microsoft Word (registered trademark) and Microsoft Excel (registered trademark). These scanners are capable of receiving user's designation of a file format for storage, but their data storage operation will only result in random relocation of the scanned data within the designated format file. For example, if Excel is designated as a storage file format, the entire scanned data will be stored into a single spreadsheet (See
In order to achieve at least one of the objects mentioned above, the data processing device for incorporating one or more scanned data files generated by an image scanning device for storage into a file in a specified file format equipped with a plurality of display areas, which reflect one aspect of the present invention, comprises: a reception unit for receiving user's designation of one of said display areas to which each of said scanned data files is allocated; and a data modification unit for modifying configuration data of said file so that each of said scanned data files is allocated to one of said display areas according to said user's designation received by said reception unit.
The data processing device which reflects another aspect of the invention comprises: an acquisition unit for acquiring image data; a determination unit for determining whether or not image layout of said acquired image data matches with image layout of a predetermined template by comparing said acquired image data with template data area of said template in order to determine whether or not said acquired image data fits into said template data area; and a control unit for generating vector data from said image data, and controlling said data processing device so that said generated vector data is output in image layout of said template with which said determination unit determines that image layout of said acquired image data matches.
The objects, features, and characteristics of the present invention other than those set forth above will become apparent from the description given herein below with reference to preferred embodiments illustrated in the accompanying drawings.
The embodiments of this invention will be described below with reference to the accompanying drawings.
The first embodiment of the present invention will be described below.
The network N is a LAN complying with a standard such as Ethernet (registered trademark), Token Ring, FDDI, etc., or a WAN with a plurality of LANs connected to one another via a dedicated line. The PC 1 and the scanner 2 according to the present embodiment can also be connected directly instead of being connected via the network N. The type and number of pieces of equipment connected to the network N are not limited to the example shown in the figure.
The structures of the aforementioned equipment are described below in detail.
The control unit 11 is a CPU (Central Processing Unit) for controlling operations of each unit in accordance with control programs and executing various arithmetic processing. The storage unit 12 contains a ROM (Read Only Memory) for storing control programs of PC 1's basic functions and various parameters, a RAM (Random Access Memory) for temporarily storing various programs and data files to serve as a working area, and a hard disk for storing an OS (i.e. Basic Software), various control programs for the particular processing shown below, various parameters, etc.
The display unit 13 is a display device such as an LCD for displaying various information to user. The input unit 14 contains a keyboard, a mouse, etc., to be used for receiving various operational instructions from user. The input/output interface 15 is an interface for communication with other devices on the network N.
The PC 1 according to the present embodiment is equipped with various kinds of business software programs including a document preparation program such as Microsoft Word (registered trademark) and a spreadsheet program such as Microsoft Excel (registered trademark). In particular, the PC 1 supports Word and Excel files in the Office Open XML (OOXML) format. The PC 1 is provided with a scanner driver, which is a software product that offers a user interface (UI) screen for setting operations of the scanner 2.
The control unit 21 is a CPU (Central Processing Unit) for controlling operations of each unit in accordance with control programs and executing various arithmetic processing. The storage unit 22 contains a ROM (Read Only Memory) for storing control programs of the basic functions of the scanner 2 and various parameters, a RAM (Random Access Memory) for temporarily storing programs and data files to serve as a working area, a hard disk for storing a program for controlling the particular processing of the scanner 2 and various parameters.
The operation unit 23 is an operation panel such as an LCD for displaying a UI screen for showing various information to user as well as receiving various operational instructions from the user. The input/output interface 25 is an interface for communication with other devices on the network N.
The image scanner 24 has a function to irradiate either a document set on a certain scanning position on a platen, or a document transported to the same scanning position by an ADF (Auto Document Feeder) with a light source such as a fluorescent lamp, and to scan the reflected light from the document with light receiving elements such as CCD and CMOS image sensors to generate digital data of the document image. Such a series of operations is hereinafter referred to as “scanning operation”. The digital data created by the scanning operation is hereinafter referred to as “scanned data” or “scanned data file”.
In the image scanning system S with the aforementioned structure, the PC 1 can retrieve more than one scanned data file created by the scanner 2 into an Excel file in the OOXML format. More specifically, the PC 1 can receive user's designation of a spreadsheet (hereinafter also referred to as “sheet”) as a storage destination of each scanned data file by means of the UI screen offered by the scanner driver, and store each scanned data file into the user-designated sheet.
The Excel file in the OOXML format contains a group of data files, such as text data files in XML (Extensible Markup Language) and binary image data files. Such a group is generally called a package, and individual data files within a package are called part files. The PC 1 according to the present embodiment can create an Excel file with user's desired structure by modifying the part files within the package accordingly. In particular, the PC 1 is capable of incorporating image data such as scanned data into an Excel file by modifying or adding the part files shown in Table 1.
While a newly created Excel file is equipped with three spreadsheets, the system can also incorporate scanned data into additional sheets starting with the fourth sheet by modifying or adding the part files shown in Table 2.
As can be seen from the above, the PC 1 according to the present embodiment is capable of incorporating each scanned data file for storage into the user-designated sheet by modifying the relevant part files in the package of the Excel file.
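The package structure described above can be illustrated with a minimal Python sketch. It assumes only that an OOXML package is a ZIP archive whose entries are the part files; the helper names `list_part_files` and `build_demo_package` and the three demo parts are hypothetical, and the demo archive is a stand-in rather than a valid workbook.

```python
import io
import zipfile


def list_part_files(package_bytes: bytes) -> list[str]:
    """Return the sorted names of the part files inside a ZIP-based package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted(package.namelist())


def build_demo_package() -> bytes:
    """Build a tiny in-memory stand-in package (not a valid Excel workbook)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("xl/workbook.xml", "<workbook/>")
        package.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    return buffer.getvalue()


parts = list_part_files(build_demo_package())
```

Note that ZIP entry names use forward slashes, whereas the folder names in this description are written with the “¥” path separator.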
Next, an overview of the PC 1's operations according to the present embodiment is shown below.
Firstly, the PC 1 executes incorporation setting processing (S101) for receiving user's instructions regarding the incorporation of the scanned data. Specific steps in the incorporation setting processing (S101) are shown below with reference to
In the present embodiment, the file format for incorporating the scanned data can be any file format as long as it is equipped with a plurality of display areas (i.e. pages, sheets, etc.) like Word and Excel. The following explanation assumes that an Excel file in the OOXML format is specified by user.
After receiving user's instructions regarding the number of scanned data files to be generated by the scanner 2, the sheet number of the incorporation destination for each scanned data file, etc. on the UI screen (S202), the PC 1 returns to the flowchart in
With reference to
As can be seen from the above, prior to the generation of the scanned data files by the scanner 2, the PC 1 receives user's instruction regarding the incorporation destination sheet. However, the PC 1 according to the present embodiment can also display a preview of the scanned data files on the UI screen after generating them by the scanner 2, in order to receive user's instruction regarding the incorporation destination via the UI screen.
Next, the PC 1 determines whether or not additional sheets starting with the fourth sheet (Sheet4, . . . ) are designated as the loading destinations of the scanned data files with reference to user's incorporation settings obtained in S101 (S104). This determination step is executed because the newly-created Excel file is equipped with 3 sheets alone, and the designation of additional sheets would entail modification of the relevant part files within the package for creating the additional sheets.
The PC 1 then moves onto the steps from S105 to S110 to be described later if additional sheets are designated as the loading destination (S104: Yes) while moving directly onto S111 if additional sheets are not designated (S104: No). Details of the steps S105 through S110 are shown below.
Firstly, the PC 1 adds the extension of the scanned data files to be incorporated into the Excel file to the part file “[Content_Types].xml” in the package of the Excel file newly created in S103. More specifically, supposing that the file format of the scanned data files created by the scanner 2 is “JPEG”, such a data string as shown in
The PC 1 then adds the information regarding the additional sheets to each of the part files “app.xml” in the folder “¥docProps”, the part file “workbook.xml” in the folder “¥xl”, and the part file “workbook.xml.rels” in the folder “¥xl¥_rels” (S106, S107, and S108, respectively). Next, the PC 1 newly creates part files “sheet4.xml”, . . . for the additional sheets, and adds these part files to the folder “¥xl¥worksheets” (S109). Next, the PC 1 newly creates part files “sheet4.xml.rels”, . . . for the additional sheets, and adds these part files to the folder “¥xl¥worksheets¥_rels” (S110).
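The registration of an additional sheet in “workbook.xml” and “workbook.xml.rels” (S107 and S108) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the embodiment's actual implementation: the function name `add_sheet`, the choice of `sheetId`, and the relationship-ID scheme are hypothetical, while the namespace URIs are the standard SpreadsheetML and packaging namespaces.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
WORKSHEET_REL_TYPE = DOC_REL_NS + "/worksheet"


def add_sheet(workbook_xml: str, rels_xml: str, sheet_number: int) -> tuple[str, str]:
    """Register SheetN in workbook.xml and workbook.xml.rels (hypothetical helper)."""
    workbook = ET.fromstring(workbook_xml)
    rels = ET.fromstring(rels_xml)

    # Pick a relationship ID that is not yet used in workbook.xml.rels.
    used_ids = {rel.get("Id") for rel in rels}
    index = 1
    while f"rId{index}" in used_ids:
        index += 1
    rel_id = f"rId{index}"

    # S108: relationship from the workbook to the new worksheet part.
    ET.SubElement(rels, f"{{{PKG_REL_NS}}}Relationship", {
        "Id": rel_id,
        "Type": WORKSHEET_REL_TYPE,
        "Target": f"worksheets/sheet{sheet_number}.xml",
    })

    # S107: sheet entry in workbook.xml that refers to the relationship ID.
    sheets = workbook.find(f"{{{MAIN_NS}}}sheets")
    ET.SubElement(sheets, f"{{{MAIN_NS}}}sheet", {
        "name": f"Sheet{sheet_number}",
        "sheetId": str(sheet_number),  # assumption: sheetId equals the sheet number
        f"{{{DOC_REL_NS}}}id": rel_id,
    })
    return ET.tostring(workbook, encoding="unicode"), ET.tostring(rels, encoding="unicode")


# Demo input: a simplified three-sheet workbook (real files carry more parts).
WORKBOOK = (
    f'<workbook xmlns="{MAIN_NS}" xmlns:r="{DOC_REL_NS}"><sheets>'
    '<sheet name="Sheet1" sheetId="1" r:id="rId1"/>'
    '<sheet name="Sheet2" sheetId="2" r:id="rId2"/>'
    '<sheet name="Sheet3" sheetId="3" r:id="rId3"/>'
    '</sheets></workbook>'
)
RELS = (
    f'<Relationships xmlns="{PKG_REL_NS}">'
    + "".join(
        f'<Relationship Id="rId{i}" Type="{WORKSHEET_REL_TYPE}" Target="worksheets/sheet{i}.xml"/>'
        for i in (1, 2, 3)
    )
    + '</Relationships>'
)
new_workbook, new_rels = add_sheet(WORKBOOK, RELS, 4)
```

The serialized output uses generated namespace prefixes, which is still well-formed XML; a production implementation would likely preserve the original prefixes.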
The newly created part files in S109 and S110 (i.e. “sheet4.xml”, . . . and “sheet4.xml.rels”, . . . ) will receive information on the additional sheets in the scanned data incorporating processing (S112) to be described later.
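The addition of the scanned data file extension to the content types part (S105) can be sketched as follows. This is a minimal sketch assuming the standard packaging form of a `Default` element with `Extension` and `ContentType` attributes; the helper name `ensure_default_extension` and the simplified demo input are hypothetical, and the exact data string used by the embodiment is not reproduced here.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def ensure_default_extension(content_types_xml: str, extension: str, content_type: str) -> str:
    """Add a Default entry for the extension unless one already exists."""
    types = ET.fromstring(content_types_xml)
    default_tag = f"{{{CT_NS}}}Default"
    for entry in types.findall(default_tag):
        if entry.get("Extension", "").lower() == extension.lower():
            return content_types_xml  # already registered; leave the part untouched
    new_entry = ET.Element(default_tag, {"Extension": extension, "ContentType": content_type})
    types.insert(0, new_entry)  # place the new entry ahead of the existing ones
    return ET.tostring(types, encoding="unicode")


# Demo input: a simplified content types part.
CONTENT_TYPES = (
    f'<Types xmlns="{CT_NS}">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '</Types>'
)
updated_types = ensure_default_extension(CONTENT_TYPES, "jpeg", "image/jpeg")
```

Checking for an existing entry first keeps the step safe to repeat for every scanned data file.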
Next, the PC 1 newly creates a folder “¥xl¥media” for storing the scanned data files as well as the folders “¥xl¥drawings” and “¥xl¥drawings¥_rels” for storing various part files showing information on the scanned data files, and adds these folders to the package (S111).
Next, the PC 1 repeats the scanned data incorporation processing (S112) to be described later for each scanned data file created by the scanner 2. After finishing the scanned data incorporation processing in S112 for all the scanned data files, the PC 1 then rewrites the update time for the part file “core.xml” in the folder “¥docProps” (S113). More specifically, in S113, the relevant portion in the part file “¥docProps¥core.xml” is replaced with the data string shown in
The PC 1 then zips the package after the modification in the steps from S103 to S113, and stores it after changing its file extension to “xlsx” (S114). The storage destination of the zipped file can be either the storage unit 12 of the PC 1 or the storage unit 22 of the scanner 2, or even an external storage device connected to the network N. After that, the PC 1 finishes the file storage processing (End).
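The zipping and storing step (S114) can be sketched in Python as follows. This is a minimal sketch that assumes the modified package is held as a directory tree on disk; the function name `pack_package` and the demo file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def pack_package(package_dir: Path, output_path: Path) -> Path:
    """Zip every file under package_dir and store the archive with an .xlsx extension."""
    output_path = output_path.with_suffix(".xlsx")
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Entry names are relative to the package root, with forward slashes.
                archive.write(path, path.relative_to(package_dir).as_posix())
    return output_path


# Demo: a stand-in package with two part files (not a valid workbook).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "package"
    (root / "xl" / "media").mkdir(parents=True)
    (root / "[Content_Types].xml").write_text("<Types/>", encoding="utf-8")
    (root / "xl" / "media" / "image1.jpeg").write_bytes(b"\xff\xd8\xff\xd9")
    saved = pack_package(root, Path(tmp) / "scan")
    with zipfile.ZipFile(saved) as archive:
        packed_names = sorted(archive.namelist())
    saved_suffix = saved.suffix
```

The output path is kept outside the package directory so that the archive does not include itself.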
Specific steps in the scanned data incorporation processing in S112 are described below with reference to
(a) Adding the extension of the scanned data files. More specifically, such a data string as shown in
(b) Adding the part name (i.e. “Part Name”) and the content type (i.e. “ContentType”) of the scanned data. For example, such a data string as shown in
Next, the PC 1 specifies the sheet number of the scanned data's incorporation destination (S303) by referring to the incorporation settings obtained in S101. The sheet number specified in S303 is hereinafter referred to as “N” for the sake of convenience. The PC 1 then determines whether or not data incorporation into SheetN is the first time ever (S304), and branches the subsequent steps in accordance with the determination result.
If the data incorporation into SheetN is the first time (S304: Yes), the PC 1 creates the part file “drawingN.xml” which describes positional information of the scanned data to be incorporated into SheetN, and adds the part file to the folder “¥xl¥drawings” which has been created in S111. The positional information of the scanned data to be incorporated into SheetN is thus added to the part file “drawingN.xml” in the folder “¥xl¥drawings” (S305). More specifically, such a data string as shown in
Next, the PC 1 newly creates a part file “drawingN.xml.rels” which describes relationship of the part file “drawingN.xml”, and adds the part file to the folder “¥xl¥drawings¥_rels” which has been created in S111. The PC 1 then adds the relationship with the target resource (i.e. scanned data file) to the part file “drawingN.xml.rels” in the folder “¥xl¥drawings¥_rels” (S306).
More specifically, such data strings as shown in
On the other hand, if the incorporation into SheetN is not the first time (S304: No), the PC 1 adds positional information of the scanned data file to be incorporated into SheetN, to the part file “drawingN.xml” which has been added to the folder “¥xl¥drawings” (S307). The PC 1 then adds the relationship with the scanned data file to be incorporated into SheetN, to the part file “drawingN.xml.rels” which has been added to the folder “¥xl¥drawings¥_rels” (S308).
The PC 1 then adds the relationship between SheetN and the scanned data file to be incorporated into SheetN, to the part file “sheetN.xml.rels” in the folder “¥xl¥worksheets¥_rels” (S309). For example, such data strings as shown in
Next, the PC 1 adds the ID information of the relationship which has been added to the part file “sheetN.xml.rels” in S309, to the part file “sheetN.xml” in the folder “¥xl¥worksheets” (S310).
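The branch at S304 for the drawing relationship part (S306 versus S308) can be sketched as follows. The sketch assumes the package is held in memory as a dictionary mapping part names to XML strings; the function name `add_image_relationship`, the sequential relationship-ID scheme, and the image file names are hypothetical. The relationship type URI is the standard one for images.

```python
import xml.etree.ElementTree as ET

PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_REL_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"


def add_image_relationship(parts: dict[str, str], sheet_number: int, image_name: str) -> str:
    """Add an image relationship to drawingN.xml.rels and return its relationship ID."""
    part_name = f"xl/drawings/_rels/drawing{sheet_number}.xml.rels"
    if part_name in parts:
        # S304: No -> the part already exists, so extend it (S308).
        root = ET.fromstring(parts[part_name])
    else:
        # S304: Yes -> first incorporation into SheetN, so create the part (S306).
        root = ET.Element(f"{{{PKG_REL_NS}}}Relationships")
    rel_id = f"rId{len(root) + 1}"  # assumption: IDs are assigned sequentially
    ET.SubElement(root, f"{{{PKG_REL_NS}}}Relationship", {
        "Id": rel_id,
        "Type": IMAGE_REL_TYPE,
        "Target": f"../media/{image_name}",
    })
    parts[part_name] = ET.tostring(root, encoding="unicode")
    return rel_id


package_parts: dict[str, str] = {}
first_id = add_image_relationship(package_parts, 4, "image1.jpeg")
second_id = add_image_relationship(package_parts, 4, "image2.jpeg")
other_sheet_id = add_image_relationship(package_parts, 5, "image3.jpeg")
```

The returned ID is what a drawing part would use to refer to the image, so each sheet keeps its own independent ID sequence.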
After that, the PC 1 repeats the same processing for the remaining scanned data files, and returns to the flowchart of
As can be seen from the above, the PC 1 according to the present embodiment for incorporating one or more scanned data files generated by the scanner 2 for storage into an Excel file in the OOXML format, is capable of modifying the relevant part files of the Excel file so that the scanned data files will be allocated to user-designated display areas (i.e. spreadsheets). Therefore, the present embodiment can substantially reduce the burden on user who tries to make efficient use of the data files incorporated into an Excel file.
The image scanning device according to the present embodiment can also be an MFP (Multifunction Peripheral) equipped with printing and copying functions in addition to a scanning function, while the present embodiment uses the scanner 2 as an example. Furthermore, the data processing device according to the present embodiment can also be a built-in device of an image forming device with a scanning function such as an MFP. This means that the present embodiment also covers the aspect of the present invention where a single image forming device performs all the steps of creating scanned data, incorporating the scanned data into a file in a specified format, and finally storing the file into an internal storage device such as an HDD or an external storage device such as a USB memory storage device, by itself.
The file format of the incorporation destination according to the present invention can be any file format equipped with a plurality of display areas such as spreadsheets and pages although the present embodiment uses an Excel file and a Word file in the OOXML format as examples.
The second embodiment of the present invention will be described below.
The image forming device 4 is connected to a communication line 6 so that it can communicate with the PC 5. The communication line 6 creates a network between the image forming device 4 and the PC 5. The communication line 6 can conform to any communication method as long as it ensures connection between the PC 5 and the image forming device 4. For example, the communication line 6 can be a wired network using an Ethernet (registered trademark) cable, a coaxial cable, optical fiber etc., a wireless network based on various standards, or any combination of these wired and wireless communication methods. The communication line 6 can also be a LAN (Local Area Network), the Internet, or any other network of an arbitrary scale.
The CPU 51 cooperates with the programs stored in the ROM 53, and controls the operations of the PC 5 in accordance with the programs and data read into the RAM 52. The RAM 52 stores various data created as a result of the processing of the CPU 51 as well as temporary data generated in the course of the same processing. The ROM 53 stores the programs and data retrieved by the CPU 51.
The storage unit 54 stores the programs and data retrieved by the CPU 51. The storage unit 54 is a rewritable storage unit formed by a combination of a flash memory, a hard disk drive, and any other rewritable storage device.
The input interface 55 is an interface for receiving an input from an external input device 58. The external input device 58 is typically a keyboard and a mouse, and is used to receive user's manual input.
The output interface 56 is an interface for sending an output to an external output device 59. The external output device 59 is typically a display device such as a CRT or an LCD for displaying an output screen based on the processing result of the CPU 51.
The communication device 57 makes a connection between the PC 5 and an external communication network (e.g. communication line 6) to enable the PC 5 to communicate with external equipment. The communication device 57 is typically a NIC (Network Interface Card), and is capable of making a connection in accordance with various types of communication methods.
The CPU 41 cooperates with the programs stored in the ROM 43, and controls the operations of the image forming device 4 in accordance with the programs and data read into the RAM 42. The RAM 42 stores the data created as a result of the processing of the CPU 41 as well as temporary data generated in the course of the same processing. The ROM 43 stores the programs and data retrieved by the CPU 41.
The storage unit 44 stores the programs and data retrieved by the CPU 41. The storage unit 44 is a rewritable storage unit formed by a combination of a flash memory, a hard disk drive, and any other rewritable storage device.
The input interface 45 is an interface for receiving an input from an input device such as an external input device 49. The external input device 49 is typically an operation panel with a touch screen which allows user to enter various instructions.
The image scanning unit 46 is equipped with an ADF unit, a platen glass, and an optical system such as CCD image sensors, realizing a function to scan a document image placed on the ADF or the platen glass by the optical system. The image data obtained by scanning the image document with the image scanning unit 46 (i.e. analog image signals) is put into A/D conversion and various image processing, before being stored into the storage unit 44 in the form of digital image data (or an image data file) and being output to the image printing unit 47.
The image printing unit 47 executes image forming (i.e. print processing) based on the input image data. The printing method used by the image printing unit 47 can be the electrophotographic method, the ink-jet method, the thermal transfer method, the offset method, etc. In the present embodiment, the image printing unit 47 performs image forming by means of the electrophotographic method.
The communication device 48 makes a connection between the image forming device 4 and an external communication network (e.g. communication line 6) to enable the image forming device 4 to communicate with external equipment. The communication device 48 is typically a NIC (Network Interface Card), and is capable of making a connection in accordance with various types of communication methods.
Next, the image layouts of a paper document to be scanned by the image scanning unit 46 will be described below with reference to
A paper document in the slide mode takes a form of a one-page slide image. The slide image herein refers to an image formed on a PowerPoint slide, and it also contains various images such as letters, lines, tables, figures, and photographs. A paper document in the note mode takes a form of a one-page slide image and a note image. The note image consists of a text image such as a memorandum concerning the slide image. A paper document in the distribution mode takes a form of slide images for a plurality of pages.
Next, the process flow for scanning a paper document to output the scanned paper document in an electronic file which can be edited by PowerPoint is shown below. When a paper document in the note mode is scanned by the image scanning unit 46 as shown in
When a paper document in the distribution mode is scanned by the image scanning unit 46 as shown in
Moreover, when a paper document in the slide mode is scanned by the image scanning unit 46, image data (or an image data file) consisting of a one-page slide image is acquired (not shown), and a file in the slide mode is output in accordance with the procedure illustrated in
Next, the file output processing will be described below with reference to
The following explanation assumes that a paper document in the note mode (See
The following explanation also assumes that a selection screen for allowing user to select an application to be used is displayed on the touch screen of the external input device 49 beforehand, and user selects “PowerPoint” as the application to be used via the external input device 49.
The file output processing (See
Firstly, the image data acquisition is performed (S401). In other words, the image forming device 4 scans the paper document by the image scanning unit 46 to acquire image data. The acquired image data (i.e. analog image signals) is put into A/D conversion. Then, various image processing is applied to the image data after the A/D conversion, and the digital image data after the image processing is stored into the storage unit 44.
After the execution of S401, the image data stored in the storage unit 44 is retrieved, and a matching check of the retrieved image data is carried out by using a predefined template (S402). The matching check is intended to determine whether or not the image layout of the retrieved image data matches with the template. Details of the predefined template are shown below with reference to
The template in the slide mode (See
The following is an explanation of the matching check in the case where a paper document in the note mode (See
Firstly, the acquired image data and the template data area are compared with each other, and it is determined whether or not the image data fits into the template data area. The acquired image data herein consists of a one-page slide image and a note image. In this case, it is determined whether or not the one-page slide image and the note image in the acquired image data fit into the template area of each template.
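The fit determination described above can be illustrated with a minimal Python sketch. It assumes that each detected image region and each template area is an axis-aligned rectangle and that "fits" means full containment; the class `Rect`, the function `fits_template`, the area names, and all coordinates are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other: "Rect") -> bool:
        """True when the other rectangle lies entirely inside this one."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def fits_template(regions: dict[str, Rect], template_areas: dict[str, Rect]) -> bool:
    """True when every detected region lies inside the template area of the same name."""
    return all(
        name in template_areas and template_areas[name].contains(region)
        for name, region in regions.items()
    )


# Hypothetical template areas and a scanned note-mode layout.
NOTE_TEMPLATE = {"main_body": Rect(50, 50, 550, 400), "note": Rect(50, 450, 550, 750)}
SLIDE_TEMPLATE = {"main_body": Rect(0, 0, 600, 800)}
scanned = {"main_body": Rect(60, 60, 540, 390), "note": Rect(60, 460, 540, 700)}

fits_note = fits_template(scanned, NOTE_TEMPLATE)
fits_slide = fits_template(scanned, SLIDE_TEMPLATE)
```

In this sketch the slide-mode template is rejected because it offers no area for the note image.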
In this example, the image data and the template area (i.e. the main body area and the note area) in the note mode (See
Similarly, the image data and the template area (i.e. the main body area) in the slide mode (See
Similarly, the image data and the template area (i.e. the main body area) in the distribution mode (See
The aforementioned matching check can also involve user's selection of a template in the case where the calculated scores are below a certain level. More specifically, selection information showing the image layout of each template for user's selection can be displayed on the touch screen of the external input device 49 for user's selection of the template if the calculated scores are below a certain level. The selection information can also be transmitted to the PC 5 via the communication device 48 to be displayed on the external output device 59.
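The selection logic of the matching check, including the fallback to user's selection, can be sketched as follows. The sketch takes the per-template scores as given, since it does not model how they are calculated; the function name `select_template`, the threshold value, the score values, and the `ask_user` callback standing in for the selection screen are all hypothetical.

```python
from typing import Callable


def select_template(scores: dict[str, float], threshold: float,
                    ask_user: Callable[[list[str]], str]) -> str:
    """Pick the highest-scoring template, or defer to the user if no score is high enough."""
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        # Every score is below the threshold: present the template names for selection.
        return ask_user(sorted(scores))
    return best


def pick_first(names: list[str]) -> str:
    """Stand-in for the selection screen: always picks the first listed template."""
    return names[0]


confident_choice = select_template(
    {"slide": 0.4, "note": 0.9, "distribution": 0.2}, 0.6, pick_first)
fallback_choice = select_template(
    {"slide": 0.3, "note": 0.35, "distribution": 0.2}, 0.6, pick_first)
```

Passing the selection screen in as a callback keeps the scoring logic independent of whether the choice is made on the touch screen or on the PC 5.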
With reference to
After the execution of the steps from S404 to S406, the file output processing is executed (S407). The file output processing is intended to output the vector data in the image layout of the template with which the image layout of the acquired image data matches. More specifically, the file output processing includes steps of appending the vector data to an OOXML file, which can be handled by Microsoft Word (registered trademark), Microsoft Excel (registered trademark), Microsoft PowerPoint (registered trademark), etc., generating a file in the image layout of the template with which the image layout of the acquired image data matches, and storing the generated file to the storage unit 44.
OOXML is a newly adopted file storage format (or a file format) in Microsoft Office (registered trademark) 2007. This means that OOXML is generally used as a file format for Word, Excel, PowerPoint, etc. Therefore, an OOXML file can be handled (i.e. edited) by applications such as Word, Excel, and PowerPoint.
The OOXML extensions for Word, Excel, and PowerPoint are “docx”, “xlsx”, and “pptx”, respectively. User can view the content of an OOXML file “sample.pptx” by changing its extension “pptx” into “zip” and then decompressing the same file with the changed extension “zip” using decompression software, for example. The sample OOXML file “sample.pptx” consists of a plurality of part files as shown in
If the type of the image data shown in
Moreover, the vector data resulting from the OCR processing of the note area needs to be added to the part file “ppt¥notesSlides¥notesSlide1.xml” (See
Moreover, the vector data resulting from the OCR processing of the main body area of the image data needs to be added to the part file “ppt¥notesSlides¥notesSlide1.xml” (See
The relationship regarding the structure of the main body area needs to be added to the part file “ppt¥Slides¥_rels¥Slide1.xml.rels” (See
As can be seen from the above, the addition of the vector data of the main body area and the note area to the file “sample.pptx” (i.e. the addition of the vector data to the relevant part files among a plurality of part files within “sample.pptx”) results in the generation of the file “sample.pptx” in the note mode.
The generated file “sample.pptx” in the note mode is stored into the storage unit 44. When user enters an instruction to edit the file “sample.pptx” in the note mode, the file “sample.pptx” in the note mode is retrieved from the storage unit 44 to be displayed on the touch screen of the external input device 49 or on the external output device 59 (See
As shown in the above, the present embodiment includes determining whether or not the image layout of the acquired image data matches with the image layout of the template by comparing the acquired data and the template area in the template in order to determine whether or not the acquired image data fits into the template area. Thus, the present embodiment ensures high accuracy in determining whether or not the image layout of the image data matches with the image layout of the template. The present embodiment also includes classifying the image data acquired by the image scanning unit 46 into either the main body area or the note area based on the presence or absence of a frame. Therefore, the present embodiment ensures high accuracy in classifying the image data even when the text portion contains a plurality of areas with different attributes (e.g. main body area and the note area). The present embodiment also includes generating the vector data from the image data, and outputting the generated vector data in the image layout of the template with which the image layout of the image data matches, thereby allowing user to edit the file using PowerPoint. The present embodiment hence improves convenience for user.
The present embodiment also includes calculating a score showing the matching degree of the image layout between the image data and the template, and determining that the image layout of the image data matches with the image layout of the template with the highest score, thereby ensuring high accuracy in the matching check of the image layout between the image data and the template.
The present embodiment also includes displaying the selection information for allowing user to select the image layout of the template on the touch screen of the external input device 49 or the external output device 59 if the calculated scores are below a certain level. Thus, the present embodiment allows user to select the image layout of his/her desired template if the calculated scores are below a certain level.
The present embodiment also includes adding the vector data to the file “sample.pptx” in the OOXML format to generate the file “sample.pptx” in the note mode. The present embodiment also includes displaying the file “sample.pptx” in the note mode on the touch screen of the external input device 49 or the external output device 59, thereby allowing user to view the file “sample.pptx” in the note mode and to edit the “sample.pptx” via the external input device 49 or the external input device 58 using PowerPoint.
The image forming device 4 according to the present embodiment can also receive the image data created on the PC 5 via the communication device 48, instead of acquiring it by scanning a paper document with the image scanning unit 46 as illustrated above. The application software according to the present embodiment is not limited to PowerPoint in spite of the explanations set forth above. This means that various other applications (e.g. Word) can also be used instead of PowerPoint. Furthermore, a plurality of templates in different modes (e.g. an N-up mode template) can also be used.
Moreover, the data processing device according to the present invention can also be implemented by a dedicated hardware circuit for executing the aforementioned steps, or a program executed by the CPU to perform the aforementioned steps. If the present invention is implemented by the latter means, the control program of the data processing device can take a form of a computer readable recording medium such as a floppy (registered trademark) disk or CD-ROM, or a downloadable program file supplied on-line via a network such as the Internet. In the former case, the program recorded on the computer readable recording medium is normally transmitted to a memory unit such as a ROM or a hard disk. The control program can also take a form of an application software program or a built-in function of the data processing device.
Number | Date | Country | Kind |
---|---|---|---|
2010-238861 | Oct 2010 | JP | national |
2010-268194 | Dec 2010 | JP | national |