1. Field of the Invention
The present invention relates to a document management server that manages documents, a document management method, and a non-transitory storage medium storing a program.
2. Description of the Related Art
In a document management system constituted by a document management server and a plurality of clients, the server receives a request from a client and transmits information on a requested document to the client. There is also a system disclosed in Japanese Patent Laid-Open No. 2005-222237 in which a server generates image data (display data) of a requested document and transmits the image data to a client so that the image data is displayed, thereby enabling the content of the document to be displayed in a general-purpose browser of the client.
In such a document management system, a document management server may repeatedly receive requests for the same document from a plurality of clients. To deal with this, the server may be provided with a function that accumulates image data of the document as reusable data (hereinafter referred to as cache data), so that requests for the same document can be answered quickly.
In a system disclosed in Japanese Patent Laid-Open No. 2000-076257, cache data of an information entity is accumulated so that the cache data may be searched for by using meta information, and it may also be detected whether or not the cache data of the information entity has already been accumulated by using a meta information table in which the meta information is stored.
However, as in Japanese Patent Laid-Open No. 2000-076257, a technique of referring to cache data by using a table requires an area for storing the table, thereby resulting in consumption of physical resources (hereinafter referred to as resources).
Also, every time new image data generated in response to a request from a client is stored as cache data, the table has to be updated, which may slow the process of responding to the client.
In addition, checking for cache data takes two steps: first, it is checked whether or not applicable meta information exists within the meta information table, and, when the meta information exists, the cache data is then searched for by using the cache data storage location information written in the meta information. Hence, it takes time to determine the presence or absence of cache data.
A document management server includes: a generation unit configured to generate, by using a request received from a client, a character string for identifying a requested document; a determination unit configured to determine whether or not an image file of the document exists in a location represented by the character string by performing a check through a cache data storage area by using the character string; a storage unit configured to, in a case that the determination unit determines that no image file of the document exists in the location, acquire an entity file of the document, generate an image file of the document by using the acquired entity file, and store, as cache data, the generated image file in the location; an acquisition unit configured to, in a case that the determination unit determines that an image file of the document exists in the location, acquire the image file of the document existing in the location; and a transmission unit configured to transmit the acquired image file to the client.
Further features of the present invention will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
A central processing unit (CPU) 201 is constituted by at least one processor, executes the processes in the flowcharts described below by executing a computer program stored in a computer-readable storage medium, and controls the entire computer. A memory 202 is constituted by a random access memory (RAM) or a read only memory (ROM). A video interface 212 outputs an image to the display device 213. An input/output (I/O) interface 203 receives input made by operating the keyboard 204 or the mouse 205. A storage device 208 is a nonvolatile storage device constituted by a hard disk drive (HDD) 209, a flash memory (silicon drive), or the like. A drive 206 is an optical drive for a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), or the like, and is used as a nonvolatile data source. An interconnection bus 207 is a bus via which the blocks communicate under the control of the CPU 201. A network controller (NC) 210 is connected to the network 103 via a network interface 211, and executes a control process for communication with other network devices.
Control programs for causing the document management server 101 and the client PC 102 to execute the processes illustrated in the flowcharts to be described are stored in a storing unit, such as the memory 202 or storage device 208 of each of the apparatuses, and executed by the CPU 201 of each apparatus. The document management server 101 has a database for document management, and the database is also constructed on the storage device 208.
An application 301 displays, on a screen of the display device 213, a user interface for executing various functions, receives a request from a user via the keyboard 204 or the mouse 205, and executes processes for the functions. Reference numerals 302 and 303 denote components that constitute the application 301. The component 302 is an application user interface (UI) unit, which constructs a user interface, receives various input operations performed by the user, and displays a processing result based on an input operation. The component 303 is a library management unit, which manages a library used in the application 301, stores documents in the library, and performs various types of document manipulation, such as browsing, updating, changing the attributes of, and searching for documents in the library. The library here is a unit of storage for document management, and stores document data and document management data, such as document attributes. In this embodiment, the library also stores a past version of a document and update histories of the document. The library management unit 303 passes a request for document manipulation received from the application UI unit 302 to the document management server 101, receives a processing result of the request from the document management server 101, and passes the processing result to the application UI unit 302. Reference numeral 305 denotes an interface for connecting the library management unit 303 to the document management server 101.
A cache data management unit 403 accumulates, as cache data, an image file generated by using entity data of a document, and manages the cache data, and thus the cache data may be repeatedly used in response to a request from the user. The structure of cache data management is decided upon by the cache data management unit 403. A cache data storage area is constructed on the storage device 208. In the first embodiment, cache data management is performed by a file system independent of the database 405.
An image file generation unit 404 extracts image data and non-image data from entity data of a document, and keeps the image data unchanged or performs image conversion, such as scaling, of the image data when necessary. As for non-image data, for example, text is converted into a font image and is rendered as image data, and vector-format data is rendered as image data in accordance with an output size. Then, these pieces of data are combined and thereby converted into an image file in a file format in which the content of the document may be displayed in a browser of the client PC 102. In this embodiment, the image file generation unit 404 is capable of calculating the total number of pages of display image data from an entity file of a document, and generates image files corresponding to pages.
In this embodiment, the document management server 101 provides a web service, and a network address for accessing the topmost layer 501 is assigned to the document management server 101. Identification codes (hereinafter referred to as site IDs) for uniquely identifying, in the topmost layer 501, sites in the second layer 502 are assigned to the sites. In addition, identification codes (hereinafter referred to as document IDs) for uniquely identifying, in each site in the second layer 502, documents arranged in a layer under the site are assigned to the respective documents. Thus, all the documents may be uniquely identified by combinations of the network address, the site IDs, and the document IDs.
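By way of illustration only, the unique identification described above may be sketched as follows. The class name `DocumentKey`, the `/` separator, and the example address and IDs are assumptions made for this sketch; the embodiment does not prescribe a particular string format.

```python
from typing import NamedTuple


class DocumentKey(NamedTuple):
    """Hypothetical composite key: network address + site ID + document ID."""

    network_address: str
    site_id: str
    document_id: str

    def as_string(self) -> str:
        # The "/" separator is an assumption made for this sketch.
        return "/".join(self)


# Document IDs are unique only within a site, so the same document ID
# under two different sites still yields two distinct keys.
key_a = DocumentKey("docs.example.com", "AAA", "0001")
key_b = DocumentKey("docs.example.com", "BBB", "0001")
print(key_a.as_string())
print(key_a != key_b)
```

Because a document ID is unique only within its site, the site ID must be part of the key; the network address in turn distinguishes one server from another.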
For example, in the case where a document management system is operated in a company, the areas accessible by users may be determined as follows. In the second layer 502, sites (AAA, BBB, CCC) are assigned to departments in the company, and in the third layer 503, libraries (DDDD, EEEE) are assigned to sections in each department. Users belonging to each section are authorized to access the folders 504 and the documents 505 existing under each library. Documents of the same type are often shared within a section. For this reason, attribute items predefined for the section are set, and each document is registered with attribute values associated with those attribute items, thereby facilitating a search for each document.
In step 701, the cache data management unit 403 generates a unique character string for identifying a document by using a URI of a document display request, such as the request illustrated in
In step 702, the cache data management unit 403 determines whether or not a folder hierarchy similar to a hierarchy identified by using the character string generated in step 701 exists in the cache data storage area on the storage device 208. When the folder hierarchy exists, the process flow proceeds to step 709, and when no folder hierarchy exists, the process flow proceeds to step 703. A specific example of the determination process in step 702 will be described later.
In step 703, the document management unit 402 executes an entity file acquisition process of acquiring, from the database 405, an entity file of the document (entity data of the document) requested from the client PC 102. As described in the description of
In step 704, the image file generation unit 404 calculates the total number of pages of display image data by using the entity file of the document acquired in step 703.
In step 705, the image file generation unit 404 generates image files corresponding to pages by using the entity file of the document acquired in step 703.
In step 706, the cache data management unit 403 generates, immediately under a top folder in the cache data storage area of the storage device 208, a folder hierarchy in which the structure of the folder hierarchy and folder names are similar to those based on the character string generated in step 701.
In step 707, the cache data management unit 403 generates, under a folder in a bottommost layer among folders generated in step 706, a folder whose folder name is a character string representing the total number of pages of the display image data of the document calculated in step 704.
In step 708, the cache data management unit 403 stores, as cache data, the image files corresponding to the pages generated in step 705 under the folder generated in step 707. At this time, the file names of the image files corresponding to the pages are the page numbers of the respective pages. A description of the folder hierarchy constructed in the cache data storage area of the storage device 208 through the processes in steps 706 to 708 will be provided below together with a description of
In step 709, the cache data management unit 403 acquires, from the cache data storage area of the storage device 208, image files of the document corresponding to pages requested from the client PC 102. In step 710, the document management server 101 transmits the image files acquired in step 709 to the client PC 102.
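By way of illustration only, the flow of steps 701 to 710 may be sketched as follows. The function names `document_key` and `get_page_images`, the way the request URI is split into folder names, and the `render_pages` callback (standing in for steps 703 to 705) are all assumptions made for this sketch. The sketch also returns every page rather than only the requested pages, and is not hardened against hostile path segments.

```python
from pathlib import Path
from typing import Callable, List
from urllib.parse import urlsplit


def document_key(request_uri: str) -> str:
    """Step 701 (assumed format): host plus path segments of the request URI."""
    parts = urlsplit(request_uri)
    segments = [parts.netloc] + [s for s in parts.path.split("/") if s]
    return "/".join(segments)


def get_page_images(
    cache_root: Path,
    request_uri: str,
    render_pages: Callable[[], List[bytes]],
) -> List[bytes]:
    """Return page images for a document, using folders as the cache index."""
    doc_dir = cache_root.joinpath(*document_key(request_uri).split("/"))

    if not doc_dir.is_dir():
        # Step 702: no folder hierarchy, so nothing has been cached yet.
        pages = render_pages()                 # steps 703 to 705 (stand-in)
        page_dir = doc_dir / str(len(pages))   # steps 706 and 707
        page_dir.mkdir(parents=True)
        for number, data in enumerate(pages, start=1):
            # Step 708: each file is named after its page number.
            (page_dir / str(number)).write_bytes(data)
        return pages                           # handed on for step 710

    # Step 709: the single subfolder is named after the total page count.
    page_dir = next(p for p in doc_dir.iterdir() if p.is_dir())
    files = sorted(page_dir.iterdir(), key=lambda f: int(f.name))
    return [f.read_bytes() for f in files]
```

In this sketch the presence check is a single directory lookup on the path derived from the request, so no separate table has to be maintained or consulted.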
Here, a determination technique in step 702 for determining whether or not image files have been cached in the cache data storage area will be described with reference to
As described above, according to the first embodiment, whether or not cache data has already been stored may be determined by using the folder hierarchy generated in the cache data storage area, so a table for determining the presence or absence of cache data, as in Japanese Patent Laid-Open No. 2000-076257, does not have to be prepared. Furthermore, because the presence or absence of cache data is determined directly from the folder hierarchy without looking up a table, a quick response is possible.
As described in the description of
In the second embodiment, the risk of exceeding the limit on the number of subfolders imposed by a typical file system is reduced, and thus a document management system may be provided that defines a folder hierarchy of cache data capable of handling a large number of documents.
In such a document management system as described in the first embodiment, the physical capacity of the storage device 208 that contains the cache data storage area is generally limited. When many image files of documents are generated and stored in response to requests from the client PC 102, the amount of file data is expected to exceed an upper limit of the capacity of the storage device 208. In a third embodiment, a technique for solving the above problem and also obtaining an effect similar to that in the first embodiment will be described.
An access control unit 1302 determines, for each request from an authorized user, whether or not use of a resource of the document management server 101 has been authorized and how the resource may be used (authorization determination). Then, by using a result of the determination, the access control unit 1302 determines whether or not the request is to be accepted and how a function is to be provided in response to the request. In this embodiment, the determination technique may be either a technique of verifying the request against an access control list stored in the document management server 101 or a technique of making the determination by using an external server.
In step 1401, the access control unit 1302 determines whether or not the user having made the request has display authorization to display the document. When the user has the display authorization, the process flow proceeds to step 701, and when the user does not have the display authorization, the process flow proceeds to step 1402. The processes in steps 701 to 710 in
In step 1402, the document management server 101 transmits, to the client PC 102, a report that the user does not have the display authorization to display the document.
In step 1403, the access control unit 1302 determines whether or not the user having made the request has authorization to store reusable image files as cache data, that is, storage authorization to store image files in the cache data storage area. Through this authorization determination process, when the user has the storage authorization, the process flow proceeds to step 706, and when the user does not have the storage authorization, the process flow proceeds to step 710.
In the third embodiment, when the client PC 102 transmits a document display request to the document management server 101 in accordance with an instruction from a user who does not have the storage authorization to store cache data, the image files generated in step 705 are not cached; they are transmitted to the client PC 102 in step 710 and then discarded.
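By way of illustration only, the authorization determinations in steps 1401 and 1403 may be sketched as follows. The `AccessControlList` class, the `handle_display_request` function, and the in-memory dictionary standing in for the cache data storage area are assumptions made for this sketch; determination by an external server is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class AccessControlList:
    """Hypothetical ACL: user names holding each kind of authorization."""

    display: Set[str] = field(default_factory=set)
    store: Set[str] = field(default_factory=set)


def handle_display_request(
    user: str,
    key: str,
    acl: AccessControlList,
    cache: Dict[str, List[bytes]],
    render_pages: Callable[[], List[bytes]],
) -> Optional[List[bytes]]:
    """Return page images, or None when display authorization is missing."""
    if user not in acl.display:
        return None            # step 1401 -> step 1402 (report to the client)
    if key in cache:
        return cache[key]      # step 702 -> step 709
    pages = render_pages()     # steps 703 to 705 (stand-in)
    if user in acl.store:
        cache[key] = pages     # step 1403 -> steps 706 to 708
    return pages               # step 710; uncached pages are simply dropped
```

A user with display authorization but without storage authorization still receives the document; only the side effect on the cache data storage area is suppressed.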
Authorization to store image files in the cache data storage area may be set for users individually by the administrator of the document management server 101. Alternatively, a process may be performed in which frequency of access or frequency of a document display request from a user is compared with a certain threshold, and storage authorization to store image files in the cache data storage area is automatically assigned to a user for which the frequency exceeds the threshold.
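By way of illustration only, the automatic assignment of storage authorization may be sketched as follows. The function name, the representation of past requests as a list of user names, and the absence of any time window are assumptions made for this sketch.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set


def users_with_storage_authorization(
    request_log: Iterable[str],
    threshold: int,
    manually_authorized: FrozenSet[str] = frozenset(),
) -> Set[str]:
    """Combine administrator-set users with users whose request count exceeds the threshold."""
    counts = Counter(request_log)  # display requests per user
    automatic = {user for user, n in counts.items() if n > threshold}
    return automatic | set(manually_authorized)
```

Here a user qualifies only when the count strictly exceeds the threshold, and users set individually by the administrator are kept regardless of their request frequency.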
In the third embodiment, the users who are able to use the cache data storage area are restricted. Thus, a document management system may be provided that prevents the amount of cached image file data from exceeding the upper limit of the capacity of the storage device 208.
Embodiments of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions recorded on a storage medium (e.g., non-transitory computer-readable storage medium) to perform the functions of one or more of the above-described embodiment(s) of the present invention, and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more of a central processing unit (CPU), micro processing unit (MPU), or other circuitry, and may include a network of separate computers or separate computer processors. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2013-120719 filed Jun. 7, 2013, which is hereby incorporated by reference herein in its entirety.
Number | Date | Country | Kind |
---|---|---|---|
2013-120719 | Jun 2013 | JP | national |