Apparatus, program product and method for structured document management

Information

  • Patent Application
  • 20070198559
  • Publication Number
    20070198559
  • Date Filed
    September 21, 2006
  • Date Published
    August 23, 2007
Abstract
The structured document management apparatus includes a document data accepting unit that accepts input of structured document data having a hierarchical logic structure; a structure guide data storage unit that stores structure guide data which is a summary of hierarchical structure information of the structured document data; a structure stream converting unit that syntax-analyzes the accepted structured document data, and converts the structure information in the structured document data into structure stream data as one-dimensional sequence data using the structure guide data; and a structure stream data storage unit that stores the converted structure stream data.
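The conversion described in the abstract can be illustrated with a minimal sketch. It assumes that the structure guide is a table assigning one integer identifier to each distinct root-to-element path, and that the structure stream is the list of those identifiers in document order. Both encodings, and the names StructureGuide, id_for and to_structure_stream, are assumptions made for illustration only; the actual formats are those shown in FIG. 5 and FIG. 6, which are not reproduced here.

```python
import xml.etree.ElementTree as ET


class StructureGuide:
    """Hypothetical structure guide: one integer id per distinct path."""

    def __init__(self):
        self.path_ids = {}  # path tuple, e.g. ("book", "title") -> id

    def id_for(self, path):
        # Register a path the first time it is seen, so that every
        # path in the guide is unique.
        if path not in self.path_ids:
            self.path_ids[path] = len(self.path_ids)
        return self.path_ids[path]


def to_structure_stream(xml_text, guide):
    """Flatten the element hierarchy into a one-dimensional list of ids."""
    stream = []

    def walk(elem, parent_path):
        path = parent_path + (elem.tag,)
        stream.append(guide.id_for(path))
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return stream


guide = StructureGuide()
doc = (
    "<book><title>A</title>"
    "<author><name>B</name></author>"
    "<author><name>C</name></author></book>"
)
stream = to_structure_stream(doc, guide)
print(stream)  # [0, 1, 2, 3, 2, 3]
```

In this sketch the repeated author/name branch reuses identifiers 2 and 3, which is what allows the guide to stay a summary while the stream keeps the full document order.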
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a schematic diagram illustrating an example system configuration of a structured document management system according to a first embodiment of the present invention;



FIG. 2 is a module configuration diagram of a server and a client terminal;



FIG. 3 is a block diagram illustrating a schematic construction of the server and the client terminal;



FIG. 4 is an explanatory diagram illustrating one example of structured document data;



FIG. 5 is an explanatory diagram illustrating one example of structure guide data;



FIG. 6 is an explanatory diagram illustrating one example of structure stream data;



FIG. 7 is a flow chart illustrating a flow of an updating process for the structure guide data;



FIG. 8 is an explanatory diagram illustrating one example of query data;



FIG. 9 is a flow chart schematically illustrating a flow of a path pattern compile process;



FIG. 10 is a schematic diagram illustrating a primary structure graph with respect to query data Q1;



FIG. 11 is a schematic diagram illustrating a secondary structure graph based on the primary structure graph of FIG. 10;



FIG. 12 is an explanatory diagram illustrating one example of a path pattern processing table with respect to the query data Q1;



FIG. 13 is a flowchart illustrating a flow of a process for creating the path pattern processing table;



FIG. 14 is a flowchart illustrating a flow of a process for scanning a structure stream;



FIG. 15 is a flowchart illustrating a flow of a Token pushing process to Place;



FIG. 16 is a progress chart when the structure stream data shown in FIG. 6 is given to the path pattern processing table shown in FIG. 12;



FIG. 17 is an explanatory diagram illustrating one example of query data Q2 according to a second embodiment of the present invention;



FIG. 18 is a schematic diagram illustrating a primary structure graph with respect to the query data Q2;



FIG. 19 is a schematic diagram illustrating a secondary structure graph based on the primary structure graph of FIG. 18;



FIG. 20 is an explanatory diagram illustrating one example of a path pattern processing table for the query data Q2;



FIG. 21 is a progress chart when the structure stream data shown in FIG. 6 is given to the path pattern processing table shown in FIG. 20;



FIG. 22 is an explanatory diagram illustrating one example of the path pattern processing table for processing the query data Q1 and Q2 simultaneously according to a third embodiment of the present invention;



FIG. 23 is an explanatory diagram illustrating one example of structured document data accompanying advance structure information according to a fourth embodiment of the present invention; and



FIG. 24 is an explanatory diagram illustrating one example of the path pattern processing table where a skipping procedure is set.


Claims
  • 1. A structured document management apparatus comprising: a document data accepting unit that accepts input of structured document data having a hierarchical logic structure; a structure guide data storage unit that stores structure guide data which is a summary of hierarchical structure information of the structured document data; a structure stream converting unit that syntax-analyzes the structured document data, and converts the structure information in the structured document data into structure stream data as one-dimensional sequence data using the structure guide data; and a structure stream data storage unit that stores the structure stream data.
  • 2. The apparatus according to claim 1 further comprising: a query data accepting unit that accepts input of query data; a path pattern compile unit that creates a path pattern processing table which specifies a processing procedure specialized for the query data by syntax-analyzing the accepted query data and referring to the structure guide data stored in the structure guide data storage unit; and a structure stream scanning unit that acquires the structure stream data aggregate from the structure stream data storage unit, and gives the structure stream to the path pattern processing table so as to execute the processing procedure.
  • 3. The apparatus according to claim 2, wherein the path pattern compile unit synthesizes the path pattern processing tables relating to the respective query data so as to create a path pattern processing table of the plural query data when the plural query data are processed.
  • 4. The apparatus according to claim 2, wherein the path pattern compile unit incorporates a procedure for skipping some of the structure stream data into the path pattern processing table when structure information of the structured document data is defined.
  • 5. The structured document management apparatus according to claim 2, wherein the path pattern compile unit incorporates a procedure for skipping some of the structure stream data into the path pattern processing table when the structure information appears due to statistics information of the structured document data.
  • 6. The apparatus according to claim 1, wherein the structure guide data satisfy the following conditions (1) to (3): (1) all paths which appear in the structured document data aggregate stored in the system appear in the structure guide data; (2) all paths which appear in the structure guide data appear in the structured document data aggregate stored in the system; and (3) all paths which appear in the structure guide data are unique.
  • 7. A computer program product having a computer readable medium including programmed instructions for managing a structured document, wherein the instructions, when executed by a computer, cause the computer to perform: accepting input of structured document data having a hierarchical logic structure; syntax-analyzing the structured document data, and converting structure information in the structured document data into structure stream data as one-dimensional sequence data using structure guide data which is a summary of hierarchical structure information of the structured document data; and storing the structure stream data in a structure stream data storage unit.
  • 8. The computer program product according to claim 7, wherein the instructions cause the computer to further perform: accepting input of query data; creating a path pattern processing table which specifies a processing procedure specialized for the query data by syntax-analyzing the accepted query data and referring to the structure guide data; and acquiring the structure stream data aggregate from the structure stream data storage unit, and giving the structure stream to the path pattern processing table so as to execute the processing procedure.
  • 9. The computer program product according to claim 8, wherein the path pattern processing tables relating to the respective query data are synthesized so as to create a path pattern processing table of the plural query data when the plural query data are processed.
  • 10. The computer program product according to claim 8, wherein a procedure for skipping some of the structure stream data is incorporated into the path pattern processing table when structure information of the structured document data is defined.
  • 11. The computer program product according to claim 8, wherein a procedure for skipping some of the structure stream data is incorporated into the path pattern processing table when the structure information appears due to statistics information of the structured document data.
  • 12. A method of managing a structured document comprising: accepting input of structured document data having a hierarchical logic structure; syntax-analyzing the structured document data, and converting structure information in the structured document data into structure stream data as one-dimensional sequence data using structure guide data which is a summary of hierarchical structure information of the structured document data; and storing the structure stream data in a structure stream data storage unit.
  • 13. The method according to claim 12 further comprising: accepting input of query data; creating a path pattern processing table which specifies a processing procedure specialized for the query data by syntax-analyzing the accepted query data and referring to the structure guide data; and acquiring the structure stream data aggregate from the structure stream data storage unit, and giving the structure stream to the path pattern processing table so as to execute the processing procedure.
  • 14. The method according to claim 13, wherein the path pattern processing tables relating to the respective query data are synthesized so as to create a path pattern processing table of the plural query data when the plural query data are processed.
  • 15. The method according to claim 13, wherein a procedure for skipping some of the structure stream data is incorporated into the path pattern processing table when structure information of the structured document data is defined.
  • 16. The method according to claim 13, wherein a procedure for skipping some of the structure stream data is incorporated into the path pattern processing table when the structure information appears due to statistics information of the structured document data.
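The query-side steps recited in the claims (compiling query data against the structure guide, merging the tables of plural queries, and scanning the structure stream) can be sketched as follows. This is a deliberately simplified stand-in: the table here is a plain mapping from path identifier to the set of queries that match it, not the Place/Token table of FIG. 12, and the query syntax (only "/" and "//" steps), the sample data, the query names qa and qb, and every function name are assumptions introduced for illustration.

```python
def parse_query(query):
    """Split '/a//b' into [(False, 'a'), (True, 'b')]; True marks '//'."""
    steps, i = [], 0
    while i < len(query):
        descendant = query.startswith("//", i)
        i += 2 if descendant else 1
        j = query.find("/", i)
        j = len(query) if j == -1 else j
        steps.append((descendant, query[i:j]))
        i = j
    return steps


def matches(steps, path):
    """True if the whole path satisfies every step, in order."""
    def rec(si, pi):
        if si == len(steps):
            return pi == len(path)
        if pi >= len(path):
            return False
        descendant, tag = steps[si]
        # A '//' step may skip any number of intermediate elements.
        candidates = range(pi, len(path)) if descendant else [pi]
        return any(path[k] == tag and rec(si + 1, k + 1) for k in candidates)
    return rec(0, 0)


def compile_table(queries, path_ids):
    """Resolve each query against the guide once, before any scanning."""
    table = {}
    for name, query in queries.items():
        steps = parse_query(query)
        for path, pid in path_ids.items():
            if matches(steps, path):
                table.setdefault(pid, set()).add(name)
    return table


def scan(stream, table):
    """Single pass over the stream; report hit positions per query."""
    hits = {}
    for position, pid in enumerate(stream):
        for name in table.get(pid, ()):
            hits.setdefault(name, []).append(position)
    return hits


# Hypothetical structure guide and structure stream for one small document.
path_ids = {
    ("book",): 0,
    ("book", "title"): 1,
    ("book", "author"): 2,
    ("book", "author", "name"): 3,
}
stream = [0, 1, 2, 3, 2, 3]

table = compile_table({"qa": "/book/author/name", "qb": "//title"}, path_ids)
hits = scan(stream, table)
print(hits)  # {'qb': [1], 'qa': [3, 5]}
```

Because both queries are compiled into one table, the stream is read only once however many queries are registered, which is the point of synthesizing the tables when plural query data are processed.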
Priority Claims (1)
Number Date Country Kind
2006-45807 Feb 2006 JP national