This is a continuation application of PCT application No. PCT/JP03/07019 filed on Jun. 3, 2003 in Japan.
1. Field of the Invention
This invention relates to a contents reuse management apparatus and a contents reuse support apparatus, and more particularly to a contents reuse management apparatus and a contents reuse support apparatus for judging the level of reuse among contents, such as a scenario, text, a document, a template, a sentence example, a drawing example, an image, voice, etc., stored in a database using a computer. The contents reuse management apparatus according to the present invention judges the reusability of the contents from the surface information about contents, a keyword, etc., and obtains the presence/absence of a reuse relationship and the level of reuse from the contents similarity and the information associated with the contents. The contents reuse support apparatus according to the present invention provides a user with recommendation information indicating the importance level of contents based on the level of the reuse of contents, and allows the contents at a high importance level to be easily selected, thereby supporting easy reuse of contents.
2. Description of the Related Art
The similarity between contents has conventionally been judged by the number of times keywords appear, etc. However, it has not been checked whether the keywords appear in both documents incidentally or as a result of reuse.
The similarity can also be judged by extracting the longest matching character string from two documents.
Further, Japanese Patent Application Laid-Open No. 2002-118736 (pages 7 to 11) describes checking for a replica by means of an electronic watermark.
Conventionally, a binary judgment of “YES” or “NO” has been performed to check the reuse of contents including a document, an image, voice, etc. in methods such as the one using an electronic watermark. However, this method requires a complicated process of specifically inserting an electronic watermark.
Therefore, it is an object of the present invention to provide a contents reuse management apparatus which can promote and control the reuse of contents by determining the level of the reuse according to surface information including a text string, a byte string, etc. about the contents of a text document, an image document, etc. and the pattern information using a dictionary, without using the above-mentioned electronic watermark, and by grasping derivative relationship between contents.
Furthermore, it is an object of the present invention to provide a contents reuse support apparatus for supporting the reuse of contents by judging the level of the reuse of contents, generating contents recommendation information based on the level of the reuse, and providing a user with the information, thereby allowing the user to easily select contents at a high importance level.
The present invention provides a contents reuse management apparatus for judging the presence/absence of reuse between contents. The apparatus includes a surface information generation unit to generate surface information including a character string, etc. appearing in the contents, and a reuse judging unit to judge the reusability using the surface information. The presence/absence of the reuse relationship between the contents is judged according to the matching level of the surface information between the contents.
The present invention also provides a contents reuse management apparatus for judging the presence/absence of reuse between contents. The apparatus includes a reuse judging unit to generate a keyword contained in the contents and to judge the reusability based on the keyword. The presence/absence of the reuse relationship between the contents is judged according to the matching level of the keyword between the contents.
Furthermore, the present invention provides a contents reuse management apparatus for judging the presence/absence of reuse between contents. The apparatus includes a surface information generation unit to generate surface information including a character string appearing in contents, at least one of a reuse judging unit to judge the presence/absence of a reuse relationship between contents according to the surface information and a reuse judging unit to judge reusability based on a keyword, a meta-data holding unit to hold meta-data which is attribute information about contents, and a meta-data use judging unit to support a judging result of the reuse judging unit using the meta-data. The reuse is further judged based on the reuse judgment result of the reuse judging unit and the meta-data.
The present invention provides a contents reuse management apparatus including first contents to be referred and being able to be reused, second contents to be judged which can be generated by reusing the contents to be referred, a surface information generation unit to generate surface information including a character string appearing in contents, etc., a reuse judging unit having a surface information base reuse judging engine to judge reusability according to the surface information, and a display unit to display information output by the reuse judging unit.
Thus, when there are two contents, surface information can be generated using these contents, and the reuse relationship can be checked by matching the surface information. Therefore, a reuse status can be detected without a complicated process including electronic watermark, or without preparing information including a keyword, meta-data, etc. in advance.
The contents reuse management apparatus according to the present invention includes first contents to be referred and being able to be reused, second contents to be judged which can be generated by reusing the first contents to be referred, a reuse judging unit having a keyword dictionary to hold a keyword, a character string, etc. and a dictionary base reuse judging engine to judge the reusability according to dictionary information about a keyword, a character string, etc., and a display unit to display information output by the reuse judging unit.
Thus, since it is not necessary to extract a pattern from contents, a reuse relationship can be quickly detected.
The contents reuse management apparatus according to the present invention includes first contents to be referred and being able to be reused, meta-data including a generating person, a backup source, etc. of the first contents to be referred, second contents to be judged which can be generated by reusing the first contents to be referred, meta-data relating to the second contents to be judged, a surface information generation unit to generate surface information including a character string, etc. appearing in the contents, a reuse judging unit having a surface information base reuse judging engine to judge the reusability using surface information, or reuse judging unit having a keyword dictionary to hold a keyword, a character string, etc. and a dictionary base reuse judging engine to judge the reusability according to dictionary information including a keyword, a character string, etc., a judgment support unit to support a judgment result of the reuse judging unit using meta-data, a meta-information dictionary to hold meta-data used by the judgment support unit, and a display unit to display information output by the reuse judging unit.
Since meta-data is used in addition to surface information about contents and keyword information, a more accurate reuse judgment can be made, for example by judging that there is a strong possibility of reuse when the same user has generated both documents.
Furthermore, the contents reuse management apparatus according to the present invention includes a reference contents database to store a plurality of first contents to be referred and being able to be reused, second contents to be judged which can be generated by reusing first contents stored in the reference contents database, a surface information generation unit to generate surface information including a character string, etc. appearing in the contents, a reuse judging unit having a surface information base reuse judging engine to judge the reusability using the surface information, or reuse judging unit having a keyword dictionary to hold a keyword, a character string, etc. and a dictionary base reuse judging engine to judge the reusability according to the dictionary information including the keyword, a character string, etc., and a display unit to display information output by the reuse judging unit.
Since a database stores a plurality of contents against which a judgment is made, the contents to be judged can be matched with all contents in a company, all contents in the departments of a company, or any plurality of contents. Thus, a judgment as to which contents in the company have been reused for the contents to be judged can be made more quickly than by matching with each of the contents individually.
The contents reuse management apparatus according to the present invention includes a reference contents database with meta-data to store a plurality of first contents to be referred and being able to be reused and first meta-data of the first contents, second contents to be judged which can be generated by reusing first contents stored in the reference contents database with meta-data, second meta-data relating to the second contents to be judged, a surface information generation unit to generate surface information including a character string, etc. appearing in contents, a reuse judging unit having a surface information base reuse judging engine to judge the reusability using surface information, or reuse judging unit having a keyword dictionary to hold a keyword, a character string, etc. and a dictionary base reuse judging engine to judge the reusability according to the dictionary information including the keyword, a character string, etc., a judgment support unit to support a judgment result of the reuse judging unit using third meta-data, a meta-information dictionary to hold the third meta-data used by the judgment support unit, and a display unit to display information output by the reuse judging unit.
Since meta-data of the respective contents is stored in addition to the plurality of contents in the database, a reuse relationship can be correctly judged using both of the plurality of contents and meta-data.
The contents reuse support apparatus according to the present invention includes a contents holding unit to hold contents, a contents management unit to manage management information indicating the level of reuse of the contents, and a contents recommendation unit to generate contents recommendation information for recommendation of contents according to the contents use information.
The contents reuse support apparatus according to the present invention includes a contents generation support unit to support a user editing contents according to recommendation information generated by a contents recommendation unit.
According to the above-mentioned contents reuse support apparatus of the present invention, contents having a high use rate can be easily selected, and thus the contents can be reused.
The first embodiment of the present invention is explained below by referring to
In
It is judged whether or not the contents 101 to be referred have been reused to generate other contents. It is also judged whether or not the contents 102 to be judged have been generated by reusing other contents.
The surface information base reuse judging engine 201 judges using the surface information about the contents 101 to be referred and the contents 102 to be judged whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred, and is structured by the CPU.
The surface information generation unit 206 generates surface information including a character string (including a punctuation mark) appearing in the contents 101 to be referred and the contents 102 to be judged. In other words, the unit 206 generates a text string or a byte string of a text document and an image document.
The reuse judging unit 210 judges according to the surface information whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred. In other words, the unit 210 obtains such judgment results as (1) totally reused, (2) partially reused, (3) possibly referred to, and (4) no possibility of reuse, etc.
The judgment (1) indicates the case in which the surface information about the contents 102 to be judged substantially matches the surface information about the contents 101 to be referred in the entire contents. The judgment (2) indicates the case in which the surface information about the contents 102 to be judged substantially matches the surface information about the contents 101 to be referred in, for example, the first half portion or the second half portion. The judgment (3) indicates the case in which the surface information matches in a certain number of pieces of data or over a certain length. The judgment (4) indicates the case in which none of the judgments (1) to (3) is obtained. The substantially matching level in the judgment (1), the partially matching level in the judgment (2), and the certain number or the threshold of the length in the judgment (3) are predetermined. When a plurality of pieces of surface information match, it is necessary in judging the reusability that the order of the matching portions is the same in both contents.
The display unit 301 displays the judgment result of the reuse judging unit 210 as indicated by, for example, the judgments (1) to (4), and the user can judge the reuse status of the contents 102 to be judged with respect to the contents 101 to be referred.
Described below is the operation shown in
Then, the surface information base reuse judging engine 201 operates to sequentially compare the surface information about the contents 101 to be referred with the surface information about the contents 102 to be judged, and sequentially discriminates the matching portion. When there is matching surface information and there are a plurality of matching portions, it is further judged whether or not the matching portions also match in order, and at which positions of the contents 102 to be judged the matching occurs.
Based on the judgment, the surface information base reuse judging engine 201 outputs the judgment results of the judgments (1) to (4), and displays the results on the display unit 301.
By watching the display, the user can recognize whether or not the contents 102 to be judged have been obtained by reusing the contents 101 to be referred.
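The comparison of surface information and the check of the appearance order described above can be sketched as follows. This is only an illustration, not the surface information base reuse judging engine 201 itself: the function names `find_matches` and `same_order`, the greedy scan, and the minimum length `min_len` are assumptions introduced for this example.

```python
def find_matches(referred, judged, min_len=5):
    """Greedy scan of the judged text.

    Returns (position_in_referred, position_in_judged, length) tuples
    for every matching portion of at least min_len characters.
    """
    matches = []
    i = 0
    while i < len(judged):
        length = 0
        pos = -1
        # Extend the candidate string one character at a time while it
        # still appears somewhere in the referred contents.
        while i + length < len(judged):
            found = referred.find(judged[i:i + length + 1])
            if found < 0:
                break
            pos = found
            length += 1
        if length >= min_len:
            matches.append((pos, i, length))
            i += length
        else:
            i += 1
    return matches


def same_order(matches):
    """True when the matching portions appear in the same order in both texts."""
    # matches are already ordered by their position in the judged text
    positions = [pos for pos, _, _ in matches]
    return positions == sorted(positions)
```

For example, with the referred text "alpha beta gamma delta", a judged text that quotes "alpha beta" before "gamma delta" passes the order check, while one that quotes them in the opposite order does not.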
The storage device 15 comprises the contents database 420 comprising a contents 21 and meta-data 103 about the generation date, the generating person, etc. of the contents. A reference numeral 106 denotes a contents database. The storage device 20 comprises the surface information generation unit 206 for generating a character string of contents, judgment support unit 204 for judging reuse of contents using a meta-information dictionary, and reuse judging unit 23 for judging the presence/absence of reuse of contents.
The reuse judging unit 23 comprises the reuse judging unit A 210 which is surface information base reuse judging unit to judge reuse of contents according to surface information, and reuse judging unit B 220 which is a dictionary base reuse judging unit and makes a reuse judgment using a keyword dictionary.
The reuse judging unit 210 comprises a contents input unit 31 for inputting contents, a character string analysis unit 32 for analyzing a character string of contents, a contents holding unit 33 for holding an input contents, and a generated character string holding unit 37 for holding a generated character string.
The surface information base reuse judging engine 201 comprises a matching judging unit 61 for judging match between the character strings of the contents A and B and holding a matching character string with a matching character string length, the positions and number of appearances of the matching character strings in the contents A and B, a matching character string holding unit 42 for holding a matching character string, a matching character string number holding unit 43, a reuse judgment threshold holding unit 44 for holding a character for a reuse judgment and holding a matching character threshold for a judgment of matching, a threshold for appearance order matching of a character string, etc., and a reuse judging unit 45 for judging the level of contents reuse relationship by the number of matching character strings and the threshold, the number of matching appearance orders of matching character strings and the threshold, etc. A judgment result holding unit 70 holds the presence/absence of a contents reuse relationship and the reuse level, etc. for each content.
A character string of contents A is generated and held (S1 and S2), and a character string of contents B is generated and held (S3 and S4). A character string of the contents A is compared with a character string of the contents B (S5 and S6). When no matching is detected, the preceding matching character string is held with the length of its character string, the appearance position, the number of appearances, and an index (S7 and S8). It is determined whether or not all data is processed (S10). When YES, the process terminates. When NO, the process for generating the next character string is performed (S11), and the processes in and after S1 are repeated. When no matching character string is detected in S6, it is determined whether or not all character strings have been checked (S10). When YES, the process terminates. When NO, the process of generating the next character string is performed (S11), and the processes in and after S1 are repeated.
A value of L is set as a threshold of the length of a matching character string (S1). A matching character string exceeding L in length is obtained (S2). The ratio of the matching character string to the entire contents and the matching level of the appearance order of the matching character string are obtained (S3). The ratio of the total number of characters of the matching character strings to the total number of characters of the contents is obtained and is compared with the threshold K (S4 and S5). When the ratio exceeds K, it is judged that there is a reuse relationship between the contents A and the contents B. When the ratio does not exceed K, the level of matching in the appearance order of the character strings is compared between the contents A and B (S6 and S7). The matching number or rate of the appearance order of the matching character strings is obtained from the appearance position and the number of appearances of the matching character strings, and when the value exceeds the threshold P, it is judged as “reuse relationship” (S9). When the rate of matching in the appearance order of the character strings does not exceed the threshold P, it is judged as “no reuse relationship” (S8). Then, the judgment result is held (S10).
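The threshold flow above can be sketched as follows. The function name `judge_reuse`, the tuple layout of `matches`, and the way the order rate is computed (the share of adjacent matching strings whose order in contents B agrees with their order in contents A) are assumptions made for this illustration; the text only states that a matching number or rate is compared with P.

```python
def judge_reuse(matches, total_chars, L, K, P):
    """Apply the thresholds L, K and P to a list of matching character strings.

    matches     : (position_in_A, position_in_B, length) tuples
    total_chars : total number of characters of the contents used for the ratio
    """
    # S2: keep only matching character strings longer than L.
    long_matches = [m for m in matches if m[2] > L]

    # S4/S5: ratio of matched characters to all characters of the contents.
    ratio = sum(m[2] for m in long_matches) / total_chars if total_chars else 0.0
    if ratio > K:
        return "reuse relationship"

    # S6/S7: rate of adjacent matches (ordered by position in B) whose
    # positions in A are also increasing. Assumed definition of the rate.
    by_b = sorted(long_matches, key=lambda m: m[1])
    pairs = list(zip(by_b, by_b[1:]))
    order_rate = sum(1 for x, y in pairs if x[0] < y[0]) / len(pairs) if pairs else 0.0

    if order_rate > P:
        return "reuse relationship"       # S9
    return "no reuse relationship"        # S8
```

With a single matching string there are no adjacent pairs, so this sketch falls through to “no reuse relationship” unless the ratio alone exceeds K; a real implementation might choose differently.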
It is judged whether or not the length of a matching character string exceeds 25 characters (S1). When there is no matching within 25 characters, it is judged “no reuse” (S9). In the contents A (in contents A shown in
When the total length of matching character strings in the contents A is lower than 90% in S2, it is judged whether or not the total length of matching character strings exceeds 90% in the contents B (S4). When the rate of the total length of character strings exceeds 90%, it is judged that the reuse relationship between the contents A and the contents B is “partially reused” (S7). When the rate of the total length of character strings does not exceed 90% in S4, it is judged whether or not the matching character string is in the correct appearance order (matching judgment of appearance order of a character string) (S5). When the appearance order of a matching character string is correct (matching), it is judged that there is a “partial reuse relationship” between the contents A and B (S7). When the appearance order of a matching character string is not correct (not matching), it is judged that the reuse relationship between the contents A and B is “reference only” (S8).
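The branches described above for the case in which the total length of matching character strings in the contents A is lower than 90% can be sketched as follows. The function name and its parameters are hypothetical, and the sketch deliberately covers only that case together with the initial 25-character check; it assumes the matching character strings have already been extracted.

```python
def judge_when_a_below_threshold(longest_match, ratio_in_b, order_matches,
                                 min_len=25, ratio_threshold=0.90):
    """Judgment for the case where the match ratio in contents A is below 90%.

    longest_match : length of the longest matching character string
    ratio_in_b    : total matching length divided by the length of contents B
    order_matches : True when the matching strings appear in the same order
    """
    # S1: no matching character string exceeding 25 characters.
    if longest_match <= min_len:
        return "no reuse"               # S9
    # S4: the matching strings cover more than 90% of contents B.
    if ratio_in_b > ratio_threshold:
        return "partially reused"       # S7
    # S5: appearance order of the matching strings.
    if order_matches:
        return "partially reused"       # S7
    return "reference only"             # S8
```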
The contents reuse management apparatus 2 judges whether or not the contents 102 to be judged are generated by reusing the contents 101 to be referred based on a dictionary database including a keyword, a character string stored in the keyword dictionary 203.
The dictionary base reuse judging engine 202 judges whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred using the keyword information stored in the keyword dictionary 203, a character string, and the dictionary information including the thesaurus, etc., and the result is stored by the CPU.
The keyword dictionary 203 stores the keyword information, the character string information, the dictionary information including the thesaurus, etc., together with the description positions of the keyword and the character string, including the page number.
The reuse judging unit 220 judges whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred using the keyword information and the character string information, and judges the reuse level including the judgments (1) to (4) like the reuse judging unit 210 shown in
Described below is the operation shown in
Then, the dictionary base reuse judging engine 202 reads the contents 102 to be judged, and detects the presence of the keyword, character string, etc. stored in the keyword dictionary 203. Based on the detection status including the keyword and the matching appearance order of a character string, etc., the judgments including the above-mentioned judgments (1) to (4) are made, the judgment result is output to the display unit 301, and is displayed for the user.
When a special keyword described only in the contents 101 to be referred is also detected in the contents 102 to be judged, it can be judged that the contents 102 to be judged have reused the contents 101 to be referred containing the special keyword.
A keyword holding unit 58 holds a keyword of the generated contents, and a thesaurus for the keyword.
The reference numeral 202 denotes a dictionary base reuse judging engine. The reference numeral 60 denotes a keyword input unit. The matching judging unit 61 judges the matching keywords between the contents A and B. A matching keyword holding unit 62 holds the appearance position and the appearance order of the matching keyword between the contents A and B. The reuse judgment threshold holding unit 44 holds a threshold for judgment of the presence/absence of reuse and the use level. A reuse judging unit 65 judges the presence/absence of reuse of the contents A and B based on the number of matching keywords and the appearance order.
The judgment result holding unit 70 holds a matching keyword, the position of a keyword in the contents, and the appearance order. The judgment result holding unit 70 also holds the presence/absence of reuse, the judgment result including a reuse level, etc.
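How matching keywords, their appearance positions, and their appearance order might be obtained can be sketched as follows. The function names are hypothetical, and using only the first appearance of each keyword is a simplifying assumption of this example.

```python
def keyword_positions(text, keywords):
    """Return (position, keyword) pairs for the first appearance of each keyword found."""
    found = [(text.find(k), k) for k in keywords if k in text]
    return sorted(found)


def matching_keywords_in_order(contents_a, contents_b, keywords):
    """Return the keywords found in both contents and whether their order agrees."""
    order_a = [k for _, k in keyword_positions(contents_a, keywords)]
    order_b = [k for _, k in keyword_positions(contents_b, keywords)]
    common = set(order_a) & set(order_b)
    seq_a = [k for k in order_a if k in common]
    seq_b = [k for k in order_b if k in common]
    return seq_a, seq_a == seq_b
```

The number of matching keywords and the order flag returned here are the kind of values that a threshold-based judgment could then be applied to.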
The reuse relationship is judged using a matching character string and a matching keyword between the contents A and B (S1). When the presence/absence of the reuse relationship is not certain in S1, or when it is judged that there is “no reuse relationship”, it is judged whether or not a special keyword is contained in the matching keyword (S2 and S3). When there is a special keyword in the matching keyword, it is judged as a “reuse relationship” (S4). When there is no special keyword, it is judged as “no reuse relationship” (S5). The judgment result is held (S6).
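The fallback to a special keyword can be sketched as follows. The function name, the string labels, and the representation of keywords as plain collections are assumptions made for this illustration.

```python
def judge_with_special_keyword(primary_result, matching_keywords, special_keywords):
    """Refine an uncertain or negative primary judgment using special keywords.

    primary_result    : result of S1, e.g. "reuse relationship",
                        "no reuse relationship" or "uncertain"
    matching_keywords : keywords found in both contents A and B
    special_keywords  : keywords regarded as special (assumed to be given)
    """
    if primary_result == "reuse relationship":
        return primary_result
    # S2/S3: is any special keyword among the matching keywords?
    if set(matching_keywords) & set(special_keywords):
        return "reuse relationship"       # S4
    return "no reuse relationship"        # S5
```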
In the description above, the presence/absence of the reuse relationship is judged using a special keyword, but spaces can instead be inserted to indicate specific information in the contents so that the reuse relationship can be judged by analyzing the appearance order of the spaces. For example, either one space or two consecutive spaces are inserted. One space represents 0, and two spaces represent 1. The insertion order of the single and double spaces thus represents 2-bit information having a specific meaning. The spaces in the contents A and B are analyzed. When the 2-bit information obtained from the spaces of one contents matches that of the other, it can be judged as a “reuse relationship”. When the information does not match, it can be judged as “no reuse relationship”.
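The space-based encoding can be sketched as follows. The function names are hypothetical; decoding every run of spaces in the text rather than a fixed number of bits, ignoring runs longer than two spaces, and treating contents without any spaces as carrying no information are assumptions of this example.

```python
import re


def decode_space_bits(text):
    """Read one space as 0 and two consecutive spaces as 1, in order of appearance."""
    bits = []
    for run in re.findall(r" +", text):
        if len(run) == 1:
            bits.append("0")
        elif len(run) == 2:
            bits.append("1")
        # Runs of three or more spaces are ignored (assumption).
    return "".join(bits)


def judge_by_spaces(contents_a, contents_b):
    """Compare the bit patterns carried by the spaces of both contents."""
    bits_a = decode_space_bits(contents_a)
    bits_b = decode_space_bits(contents_b)
    if bits_a and bits_a == bits_b:
        return "reuse relationship"
    return "no reuse relationship"
```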
The third embodiment of the present invention is explained by referring to
The contents reuse management apparatus 3 judges whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred based on the meta-data including the generating person of the contents, the corrector of the contents, the generation date of the contents, etc., and the surface information or the keyword information.
The judgment support unit 204 provides the reuse judging unit 230 with the judgment support information for use in judging whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred. For example, when the generating person of the contents 101 to be referred is A, and the generating person of the contents 102 to be judged is B, the relationship between the generating persons A and B, for example, the members of the same department or project, etc. is extracted from the meta-information dictionary 205 and provided.
The meta-information dictionary 205 stores in advance the relevant information about the meta-data of the contents 101 to be referred and the contents 102 to be judged, and includes the relevant information about each generating person, for example, the department or the project to which each generating person belongs, the friends of each generating person, etc. The reuse judging unit 230 judges whether or not the contents 102 to be judged have been generated by reusing the contents 101 to be referred, and is structured by the reuse judging unit 210 shown in
The operation shown in
The dictionary information including a keyword, a character string, etc. described in advance in the contents 101 to be referred is stored in the keyword dictionary. The reuse judging unit 230 reads the generation date of the meta-data 103 and 103′ and judges it as “no reuse” when the generation date of the contents 102 to be judged precedes the generation date of the referred contents 101, and displays the judgment (4) on the display unit 301.
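The generation-date check can be sketched as follows. The function name and the use of `None` to mean "continue with the other judgments" are assumptions made for this illustration.

```python
from datetime import date


def precheck_by_generation_date(referred_date, judged_date):
    """Rule out reuse when the judged contents were generated first.

    Returns the judgment (4) label when reuse is impossible, otherwise None
    so that the caller can continue with the other judgments.
    """
    if judged_date < referred_date:
        return "(4) no possibility of reuse"
    return None
```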
However, as described above by referring to
The judgment support unit 204 notifies the reuse judging unit 230 of relationship information, for example, that the generating persons A and B belong to the same project, so that the contents generated by the generating person A can be very easily recognized by the generating person B, or that the generating persons A and B have never belonged to the same department or project in the company, so that the generating person B cannot possibly recognize the contents generated by the generating person A.
Thus, when it is not certain whether or not the above-mentioned judgments (1) and (2) hold, the reuse judging unit 230 can clearly judge that the judgment (3) holds when there is a strong possibility of recognition, and that the judgment (4) holds when there is no possibility of recognition. That is, a definite judgment (1), (2), (3), or (4) can be made.
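The use of relationship information to settle an uncertain judgment can be sketched as follows. The function name and the layout of the meta-information dictionary (a mapping from a generating person to the set of departments or projects that person belongs to) are hypothetical.

```python
def resolve_uncertain_judgment(person_a, person_b, meta_information):
    """Settle an uncertain judgment from the relationship between generating persons.

    meta_information : dict mapping a person to a set of departments/projects
                       (assumed layout of the meta-information dictionary)
    """
    groups_a = meta_information.get(person_a, set())
    groups_b = meta_information.get(person_b, set())
    # A shared department or project means B could easily recognize A's contents.
    if groups_a & groups_b:
        return "(3) possibly referred to"
    return "(4) no possibility of reuse"
```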
Also when the reuse judging unit 230 is structured by the reuse judging unit 210 shown in
In the explanation above, meta-information is used to make a judgment on a reuse relationship when the reuse relationship is not certain. However, when the presence/absence of reuse is judged using meta-information and there is the possibility of reuse according to the meta-information, a judgment can be made on the reuse relationship by the matching result of a keyword and a character string. The method in this case is explained below.
In
The reference numeral 33 denotes a contents holding unit. The reuse judging unit 230 inputs contents for judgment on a reuse relationship. A contents selection unit 34 selects the contents judged as “possibly reused” as a result of the primary judgment. The reuse judging unit A 210 judges the contents reuse according to the surface character information. The reuse judging unit B 220 judges the reuse of contents using a keyword. A secondary judgment result holding unit 82 holds the judgment result of the presence/absence of reuse. A meta-data use judging unit 83 compares the generation date between the contents judged as “reused” by the reuse judging unit A and the reuse judging unit B, and judges the contents on the reused side and the contents of the reusing side. A meta-data input unit 84′ inputs the contents generation date. A reference numeral 85 denotes a meta-data holding unit. A meta-data comparison unit 86 judges the generation date. A tertiary judgment result holding unit 87 holds a comparison result of the meta-data comparison unit 86.
The operation of the structure shown in
The contents selection unit 34 selects the contents judged as “possibly reused” from the result of the primary judgment made using the meta-information, and inputs them. The reuse judging unit A 210 judges the reuse of contents by the surface information base reuse judging engine. The reuse judging unit B 220 judges the reuse of contents by a keyword. Based on the result of at least one of the reuse judging unit A and the reuse judging unit B, the secondary judgment result on the reuse of contents is obtained and held in the secondary judgment result holding unit 82. The secondary judgment result can be judged as “reused” only when the results of both the reuse judging unit A and the reuse judging unit B indicate “reused”, or it can be judged as “reused” when the result of at least one of them indicates “reused”. Thus, the way of using the respective judgment results is selected as necessary. The secondary judgment result is held in the judgment result holding unit 70.
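The two ways of combining the results of the reuse judging units A and B can be sketched as follows. The function name and the `policy` parameter are assumptions made for this illustration.

```python
def secondary_judgment(reused_by_a, reused_by_b, policy="either"):
    """Combine the results of reuse judging units A and B.

    policy = "both"   : "reused" only when both units indicate reuse
    policy = "either" : "reused" when at least one unit indicates reuse
    """
    if policy == "both":
        return reused_by_a and reused_by_b
    if policy == "either":
        return reused_by_a or reused_by_b
    raise ValueError(f"unknown policy: {policy}")
```

The "both" policy gives fewer false positives, while the "either" policy misses fewer reused contents; which one fits depends on how the judgment results are used.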
It is judged using meta-data whether the contents judged as “reused” in the secondary judgment result are reused contents or the reusing contents. The generation date of the contents judged as “reused” as a secondary judgment result is selected by the meta-data input unit 84′ from the meta-data 103, and input to the meta-data use judging unit 83. The meta-data comparison unit 86 compares the generation date of the contents (contents A and B) to be compared. It is judged that the contents having a preceding generation date are reused contents, and the contents having a succeeding generation date are reusing contents. The tertiary judgment result is held as associated with the contents in the tertiary judgment result holding unit 87.
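The tertiary judgment by generation date can be sketched as follows. The function name and the returned dictionary are hypothetical, and the case of identical generation dates is not addressed in the text, so this sketch simply returns `None` for it.

```python
from datetime import date


def tertiary_judgment(date_a, date_b):
    """Decide which of contents A and B is the reused side and which is the reusing side."""
    if date_a < date_b:
        return {"reused": "A", "reusing": "B"}
    if date_b < date_a:
        return {"reused": "B", "reusing": "A"}
    return None  # same generation date: left undecided (assumption)
```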
In the explanation above, the contents are narrowed down by judging the possibility of reuse from the department of the contents generating person, but any other meta-information can be used for narrowing down the contents. Alternatively, the category of the contents (a thesis of scientific technology, a patent specification, etc.) can be assigned as meta-information associated with a file name, so that contents belonging to the same category are judged as “possibly reused” and contents belonging to a different category are judged as “impossibly reused”.
The fourth embodiment of the present invention is explained below by referring to
The contents reuse management apparatus 4 judges whether or not the contents 102 to be judged have been generated by reusing any of a plurality of contents to be referred stored in the contents to be referred group 104.
The contents to be referred group 104 is a group of a plurality of contents to be referred, on each of which it is judged whether or not it has been reused to generate other contents, and can be structured by, for example, a server.
The operation shown in
A keyword, a character string, etc. of each of the contents to be referred in the contents to be referred group 104, which is stored in a database, are stored in advance in a keyword dictionary in association with the contents to be referred.
The reuse judging unit 230 reads the contents 102 to be judged, detects the presence of the keyword, the character string, etc. of the first contents to be referred stored in the keyword dictionary, makes the above-mentioned judgments (1) to (4), then detects the presence of a keyword, a character string, etc. of the second contents to be referred, and makes the above-mentioned judgments (1) to (4). Thus, the contents 102 to be judged are compared with the keywords and the character strings of all contents to be referred stored in the keyword dictionary, and the judgment results can be sequentially displayed on the display unit 301.
Thus, the “reused” judgment on the contents of a plurality of contents to be referred groups can be efficiently made.
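The sequential comparison against the keyword dictionary can be sketched as follows. This is only an illustration under assumptions: the dictionary is assumed to map each contents to be referred to its list of keywords or character strings, the function name `keyword_hits` is hypothetical, and the judgments (1) to (4) themselves are not reproduced here.

```python
from typing import Dict, List


def keyword_hits(target_text: str,
                 keyword_dictionary: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """For each contents to be referred, list its keywords found in the target.

    keyword_dictionary maps a referred content's name to its keywords
    (an assumed layout; the embodiment does not fix one).
    """
    hits: Dict[str, List[str]] = {}
    for reference_name, keywords in keyword_dictionary.items():
        # Detect the presence of each keyword / character string in the target.
        hits[reference_name] = [kw for kw in keywords if kw in target_text]
    return hits
```

The resulting hit lists would then be passed, one referred content at a time, to whatever judgment is applied to the detected keywords.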
The reuse judging unit A judges the presence/absence of reuse of contents in the above-mentioned judging method based on a matching character string. The reuse judging unit B judges the presence/absence of reuse of contents using a keyword. Each result is held in the judgment result holding unit 70 for each content. According to the present embodiment, the presence/absence of a reuse relationship of contents to be judged to a plurality of contents to be referred can be efficiently judged. Furthermore, all or a part of the contents judged by one of the reuse judging unit A and the reuse judging unit B can, as necessary, be further judged as to the presence/absence of reuse by the other reuse judging unit.
The fifth embodiment of the present invention is explained below by referring to
The contents reuse management apparatus 5 judges whether or not the contents 102 to be judged have been generated by reusing any of the plurality of contents to be referred stored in the reference contents group 105 with meta-data.
The reference contents group 105 with meta-data is a group of a plurality of contents to be referred for a judgment as to whether or not they are reused to generate other contents, stored in a database with the respective meta-data, and held in, for example, a server.
The operation shown in
The dictionary information including a keyword and a character string relating to a plurality of contents to be referred stored in advance in the reference contents group 105 with meta-data is stored in a keyword dictionary.
The reuse judging unit 230 reads the meta-data of the first contents to be referred stored in the reference contents group with meta-data and the meta-data 103′ of the contents 102 to be judged, judges the contents as not reused when the generation date of the contents 102 to be judged precedes the generation date of the first contents to be referred, and displays the judgment (4) on the display unit 301.
However, as the operation explained by referring to
As a result, as explained above by referring to
The above-mentioned process is sequentially performed on each referred-to content stored in the reference contents database with meta-data, and a judgment result can be displayed on the display unit 301.
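The use of the generation date to exclude contents to be referred before any text comparison can be sketched as follows. The function name `candidates_by_date` and the dictionary layout are assumptions made for illustration; the handling of equal generation dates is not specified above and is chosen here arbitrarily.

```python
from datetime import date
from typing import Dict, List


def candidates_by_date(target_date: date,
                       reference_dates: Dict[str, date]) -> List[str]:
    """Return the referred contents that the target could have reused.

    A target generated before a referred content is judged "not reused"
    with respect to it, so that referred content is dropped.
    Equal dates are kept as candidates (an assumption of this sketch).
    """
    return [name for name, generated in reference_dates.items()
            if not target_date < generated]
```

Only the returned candidates would then be compared using a keyword or a matching character string.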
Thus, a reuse judgment on a plurality of contents to be referred can be efficiently made using meta-data. In the explanation above, the reuse relationship is confirmed using meta-information after the judgment of reuse of contents using a character string or a keyword. However, the contents can be narrowed into those having a “possible reuse relationship” using meta-information in advance, and then a reuse judgment can be made using a keyword and a character string. In the following explanation, the method is used.
In
A meta-information input unit 601 inputs meta-information including the information about the department of the contents generating person. The judgment support unit 204 judges the presence/absence of the possibility of reuse of contents according to the meta-information. For example, the contents of the same department as the contents generating person have strong possibility of reuse. Therefore, the contents can be narrowed such that only the contents belonging to the same department as the generating person can be judged using a keyword or a matching character string. The primary judgment result holding unit 76 holds a judgment result about the possibility of the presence/absence of a reuse relationship obtained using meta-information.
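The narrowing by department can be sketched as follows, as a minimal illustration only. The function name `primary_judgment` and the mapping from a contents name to the department of its generating person are hypothetical.

```python
from typing import Dict, List, Tuple


def primary_judgment(departments: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the pairs of contents judged "possibly reused".

    departments maps a contents name to the department of its generating
    person; only pairs from the same department are kept.
    """
    names = sorted(departments)
    pairs: List[Tuple[str, str]] = []
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if departments[first] == departments[second]:
                pairs.append((first, second))
    return pairs
```

Any other meta-information, such as a category, could be substituted for the department in the same way.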
The keyword input unit 60 inputs a keyword of contents when the keyword has been generated for the contents. The matching character string input unit 68 inputs a matching character string when one has been generated for the contents to be judged. The keyword holding unit 58 holds a keyword of contents.
A reference numeral 220 denotes reuse judging unit B. A reference numeral 210 denotes reuse judging unit A. The secondary judgment result holding unit 82 holds judgment results of the reuse judging unit A and B. A contents selection unit 84 selects the contents judged as having a reuse relationship in the secondary judgment results.
A meta-data input unit 602 inputs a generation date of contents. The meta-data use judging unit 83 compares the generation dates of the contents judged as having a reuse relationship, and judges that the contents having a preceding generation date have been reused by other contents, and the contents having a succeeding generation date have been generated by reusing others. A reference numeral 87 denotes a tertiary judgment result holding unit. The judgment result holding unit 70 holds a reuse judgment result.
With the structure shown in
Contents i and j which are judged as having a “possible reuse relationship” as a primary judgment result are input (S1). The presence/absence of reuse is judged using a keyword and a matching character string (S2 and S3). When a keyword and a matching character string have already been generated for the contents to be judged, they are used. When no keyword or matching character string has been generated for the contents, a keyword and a matching character string are generated, and the presence/absence of reuse is judged in the above-mentioned method. The judgment result of “reused” or “no reuse” is held in the secondary judgment result holding unit (S4, S5, and S6). It is determined whether or not all contents have been judged (S7). When NO, it is determined whether or not the contents j are to be changed. When YES, the next contents j are selected (S9 and S10), and the next contents i are selected in S11. When the contents j are not changed, the next contents i are selected without changing the contents j (S11). The processes in and after S1 are repeated, and when it is determined in S7 that all necessary contents have been judged, the process terminates.
The detailed judging process of a reuse relationship is started by referring to meta-data (S1). The contents i and j having a secondary judgment result “reused” are selected (S2). The generation date of the contents i is defined as Di, and the generation date of the contents j is defined as Dj (S3). Di is collated with Dj for the order of the generation date (S4). When Di follows Dj, it is judged that the contents i are generated by reusing the contents j (S5). When Di precedes Dj, it is judged that the contents j are generated by reusing the contents i (S6). The detailed reuse relationship is held in the tertiary result area (S7). It is judged whether or not all necessary contents have been judged (S8). When not, it is determined whether or not the contents j are to be changed. When the contents j are to be changed, the next contents j are selected in S10. The next contents i are selected in S11, and the processes in and after S2 are repeated.
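The collation of the generation dates Di and Dj described above can be sketched as follows. The function name `tertiary_judgment` and the data layout are assumptions for illustration; the case in which Di equals Dj is not covered by the flow above and is simply skipped in this sketch.

```python
from datetime import date
from typing import Dict, List, Tuple


def tertiary_judgment(reused_pairs: List[Tuple[str, str]],
                      generation_dates: Dict[str, date]) -> List[Tuple[str, str]]:
    """For each pair (i, j) judged "reused", return (reusing, reused) names."""
    relationships: List[Tuple[str, str]] = []
    for i, j in reused_pairs:
        d_i, d_j = generation_dates[i], generation_dates[j]
        if d_i > d_j:
            # Di follows Dj: contents i are generated by reusing contents j.
            relationships.append((i, j))
        elif d_i < d_j:
            # Di precedes Dj: contents j are generated by reusing contents i.
            relationships.append((j, i))
        # Equal dates: direction cannot be decided from the date alone; skipped.
    return relationships
```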
In the explanation above, the narrowed contents are judged by the department, but the contents can be narrowed using other meta-information (for example, the field of contents, etc.).
The sixth embodiment of the present invention is explained by referring to
The contents reuse management apparatus 6 judges whether or not the contents 102 to be judged have been generated by reusing the contents stored in the database management device 106.
The database management device 106 stores the contents stored in the contents management system including groupware, etc. in each department of a company together with the meta-data including directory information, a generating person, a generation date, etc., and is structured by, for example, a server.
The keyword dictionary 203 stores common dictionary information including a keyword, a character string, etc. and a thesaurus, etc. specific to each department in advance.
The operation shown in
For example, when there are contents A, B, and C, it is judged that the contents B are generated by reusing the contents A, and it is stored in the meta-information dictionary 205, and when it is judged that the contents C have been generated by using the contents B, it is judged that the contents C have been generated by using the contents A. Therefore, the value of contents A is highly evaluated, and the reuse and importance of the contents A can be recognized.
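The chain of reuse in the example of the contents A, B, and C can be followed with a simple traversal, sketched below. The function name `derived_contents` and the mapping from a reused content to the contents generated by reusing it are hypothetical; the meta-information dictionary 205 is not assumed to have this layout.

```python
from typing import Dict, List, Set


def derived_contents(direct_reuse: Dict[str, List[str]], source: str) -> Set[str]:
    """Return every content generated, directly or indirectly, by reusing source.

    direct_reuse maps a reused content to the contents generated by reusing it.
    """
    found: Set[str] = set()
    stack = list(direct_reuse.get(source, []))
    while stack:
        name = stack.pop()
        if name not in found:
            found.add(name)
            # Follow the chain: contents that reuse `name` also derive from source.
            stack.extend(direct_reuse.get(name, []))
    return found
```

For the relationship "B reuses A, C reuses B", the set obtained for the contents A contains both B and C, so the size of the set could serve as one indication of how widely the contents A have been reused.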
Thus, according to the present invention, the relationship among the contents groups distributed in a company can be arranged from the viewpoint of reuse. Additionally, according to the present invention, important contents can be extracted from the viewpoint of reuse, and the contents can be used as a sample. The administrator can recommend using the sample among the members in the department, thereby allowing each member to easily generate contents with quality higher than a predetermined level.
With the above-mentioned contents reuse management apparatus according to the present invention, contents generated by reusing other contents, or contents reused by other contents can be easily judged from among a number of contents.
Described below is the contents reuse management apparatus according to the present invention capable of easily reusing contents using the reuse result of contents as obtained above.
As described above, contents refer to, for example, a scenario, a template, a common document (having contents different from a scenario), and information processed by a computer including a text sample, graphic sample, etc. They can also include multimedia data including a moving picture, voice, etc. A scenario refers to a document formatted to a certain extent as, for example, a patent document. A template refers to, for example, an arrangement of only headers of document formats, and enables a document to be generated in a predetermined format based on the template. A document refers to common writing in any format. A text sample can be, for example, formatted salutation, a frequently cited specific sentence, etc. A text sample can be a frequently used portion.
A conventional contents management system registers generated contents in a directory or a library. When contents are reused, necessary contents can be fetched by retrieving a keyword and using a dictionary, and reuse can be realized by copying and pasting the original data.
With the contents reuse support apparatus according to the present invention, reuse can be easily performed on various application contents, and using a number of reused contents, contents can be obtained at a low cost with constant quality. A user requesting reuse of contents can select high-quality contents by obtaining all or a part of evaluation of the contents to be copied, thereby easily generating high-quality contents.
The contents reuse support apparatus according to the present invention evaluates the contents in a database. Based on the given evaluation, a user selects contents and generates a draft of contents. Furthermore, by recording the process of generating the draft, the evaluation of the contents can be updated. Thus, by using the evaluated contents and managing the contents structured by the parts of the contents, the quality of the contents accumulated in the database can be enhanced.
In the storage device 25, a contents recommendation unit 500 generates recommendation information for a user such that the user can determine the importance of contents having a high use frequency, a high use level, etc. A draft generation support unit 600 supports changing and editing contents, etc. according to the recommendation information. A contents parts segmentation support unit 700 supports the process of a user retrieving a common portion based on a plurality of contents. A contents management support unit 800 supports the process of amending the evaluation of contents based on the use frequency of the contents or making contents into new contents parts based on the evaluation of the contents.
In the storage device 26, the contents database 420 holds contents.
The contents database 420 is structured by a contents management unit 430, a contents holding unit 440, a correction point holding unit 445, a common point holding unit 470, a recommendation information holding unit 460, and a contents boundary information holding unit 472. The contents management unit 430 comprises a contents management information holding unit 431 for holding contents management information including the frequency of download, a use rate, and a pointer to the correction point holding unit for each content, a correction point management information holding unit 432 for holding the correction point management information for management of the difference between contents, and a common point management information holding unit 433 for holding common point management information for management of common points between contents. Furthermore, it comprises a management information holding unit 434 for holding other management information including the management information for the recommendation information and the management information for the contents boundary information. The contents holding unit 440 holds various contents including a document, a scenario, a template, a text sample, and a drawing sample. The correction point holding unit 445 holds a correction point between contents. The common point holding unit 470 holds a common point among a plurality of contents. The recommendation information holding unit 460 holds recommendation information.
The contents recommendation unit 500 generates contents recommendation information. In the contents recommendation unit 500, a recommendation information generation unit 501 generates the number of uses of contents, a use level, a retrieval result of the contents reuse management apparatus, reference contents display information (described later), and derivative contents display information (described later). A download information management unit 455 manages downloading contents parts held in the contents holding unit 440, counts the frequency of downloading, and generates a correction history, etc. The management information is transmitted to a contents management unit and held therein. The data of the correction history is held by the correction point holding unit 445. The contents reuse management apparatus 250 is the same as the contents reuse management apparatus according to the present invention.
The correction point management information holding unit 432 holds an index, contents A and contents B whose difference is obtained, a pointer to the correction point management information holding unit 432, etc. A common point management information holding unit 433 holds contents names (contents A and B) whose common point is obtained, and a pointer to a contents management unit 430, etc.
The contents holding unit 440 holds a contents name, contents data, and a pointer to the contents management information holding unit. The correction point holding unit 445 holds an index, correction point data, and a pointer to the correction point management information holding unit. The held correction point is assigned a contents parts name to generate contents parts. The common point holding unit 470 holds an index, common point data, and a pointer to the common point management information holding unit. A common point can be assigned a contents parts name to generate contents parts.
The recommendation information holding unit 460 holds contents recommendation information 521. The contents recommendation information holds the use frequency of contents (frequency of download), use level including total use, partial use, etc. (obtained by the contents reuse management apparatus 250 retrieving a contents database), user information, the retrieval result indicating the contents reuse relationship obtained by retrieving the contents reuse management apparatus according to the present invention, and the system of a contents reuse relationship, etc.
The contents boundary information holding unit 472 holds the information indicating the relationship before and after the use point when contents are used. For example, when a scenario is a patent document and only the embodiment of the original document is changed, boundary information such as “means for solving the problem”, “embodiments of the invention”, and “effect of the invention”, which indicates the boundary between the changed and unchanged portions, is held.
A reference contents display 252 is displayed on a display device. The reference contents display specifies a target document based on the contents reuse relationship 251 of the retrieval result, and systematically shows the use relationship of the document A used by the document and the document used by the document A, etc. In the case of the example shown in
A derivative contents display 253 obtains the contents reuse relationship derived from the specified target document based on the contents reuse relationship retrieval result 251, and systematically displays it. In the example shown in
The contents database further comprises the contents holding unit 440, the correction point holding unit 445, the common point holding unit 470, and the contents boundary information holding unit 472.
The contents recommendation unit 500 comprises the download information management unit 455, a recommendation information generation unit 551, and a reference contents display information generation unit 553. The reference numeral 250 denotes the contents reuse management apparatus according to the present invention. The reference numeral 210 denotes a reuse judging unit. A reference numeral 116 denotes another system using a database. The reference numeral 115 denotes another database.
The operation of the contents management apparatus shown in
In the recommendation unit 500, the recommendation information generation unit 551 generates recommendation information based on the contents management information (number of download times, reuse relationship, use rate, etc.) held in the contents management information holding unit 431, and holds the information in the recommendation information holding unit 460. The reference contents display information generation unit 553 generates reference contents display information based on the contents reuse relationship, and holds the information in the reference contents display information holding unit of the recommendation information holding unit 460. The derivative contents display information generation unit generates the derivative contents display information based on the contents reuse relationship held in the contents reuse relationship holding unit, and holds the information in the derivative contents display information holding unit.
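One possible way of ordering contents for recommendation from the contents management information is sketched below. The ordering rule (number of download times first, use rate second), the class name `ContentsManagementInfo`, and the function name `recommend` are all assumptions made for illustration; the description above does not fix how the recommendation information is computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContentsManagementInfo:
    name: str
    downloads: int    # number of download times
    use_rate: float   # assumed here to lie between 0.0 and 1.0


def recommend(infos: List[ContentsManagementInfo], top_n: int = 3) -> List[str]:
    """Return the names of the top_n contents, most heavily used first."""
    ranked = sorted(infos,
                    key=lambda info: (info.downloads, info.use_rate),
                    reverse=True)
    return [info.name for info in ranked[:top_n]]
```

Other criteria named above, such as the reuse relationship, could be added to the sort key in the same manner.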
Another system 116 can download and use the contents parts through the download information management unit 455. When contents are used and the contents are corrected, the download information management unit 455 generates a correction history, holds the data management information in the contents management information holding unit 431, and the corrected data is held in the correction point holding unit 445 using the difference as a correction point. The user of the contents reuse support apparatus of the present invention can access other databases 115 through the download information management unit 455 and can hold the data as the contents parts of the contents management database.
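Obtaining a correction point as the difference between downloaded contents and their corrected version can be sketched with the Python standard library module `difflib`. This is an illustration under assumptions only: the function name `correction_points` is hypothetical, and the embodiment does not state which difference algorithm is used.

```python
import difflib
from typing import List, Tuple


def correction_points(original: str, corrected: str) -> List[Tuple[str, str]]:
    """Return (old_text, new_text) for every span that differs."""
    matcher = difflib.SequenceMatcher(None, original, corrected, autojunk=False)
    return [(original[i1:i2], corrected[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]  # keep replaced, inserted, and deleted spans
```

Each returned pair corresponds to one candidate correction point that could be held together with an index and a pointer to its management information.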
Based on the reuse relationship held in the reuse relationship holding unit, the information including a matching character string, a keyword, and a reuse level is held in the contents management unit.
The method of obtaining a common point (that is, a common portion) of a plurality of contents by the contents reuse support apparatus according to the present invention is explained below by referring to
Described below is the operation of the contents boundary information generation unit 713 according to the present invention. The contents boundary information generation unit 713 obtains boundary information, which is area information about the areas before and after the common point in the respective contents, based on the common point of a plurality of contents. That is, it is judged what the areas before and after the common point in the respective contents are. For example, when the contents A and B are the templates as shown in
A high number of download times indicates important contents, and when the used portions are distributed, it means the used portions are of importance. In this case, more easily used parts can be generated by setting the portions as original contents parts. When contents are used by a specific user group, more easily used parts can be generated by treating them as contents parts appropriate for the group. For example, it can be realized by generating a new template, etc. by regenerating a header according to the contents boundary information. The contents boundary information can be reference information for use in generating parts by the contents parts management unit.
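The relationship between a common point of two contents and the boundary information about the areas before and after it can be sketched as follows. The sketch takes the longest matching block found by the standard library class `difflib.SequenceMatcher` as the common point, which is an assumption; the function name `common_point_with_boundary` and the returned layout are likewise hypothetical.

```python
import difflib
from typing import Dict, Tuple, Union


def common_point_with_boundary(
        contents_a: str, contents_b: str) -> Dict[str, Union[str, Tuple[str, str]]]:
    """Return the longest common portion and the text before/after it in each content."""
    matcher = difflib.SequenceMatcher(None, contents_a, contents_b, autojunk=False)
    match = matcher.find_longest_match(0, len(contents_a), 0, len(contents_b))
    end_a, end_b = match.a + match.size, match.b + match.size
    return {
        "common_point": contents_a[match.a:end_a],
        # Boundary information: the areas before and after the common point.
        "boundary_a": (contents_a[:match.a], contents_a[end_a:]),
        "boundary_b": (contents_b[:match.b], contents_b[end_b:]),
    }
```

The common point could be registered as a contents part, while the boundary areas indicate which headers or portions surround it in each of the original contents.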
A scenario management support unit 960 manages a scenario as contents by the contents management support unit according to the present invention. A document management support unit 970 manages a document as contents by the contents management support unit according to the present invention. A template management support unit 980 manages a template as contents by the contents management support unit according to the present invention. A text/drawing sample management support unit 990 manages a text/drawing sample as contents by the contents management support unit according to the present invention.
The contents recommendation unit 500 communicates with the scenario database 910, the document database 920, the template database 930, and the text/drawing sample database 940 respectively for a scenario, a document, a template, and a text/drawing sample, receives necessary information for generating contents recommendation information, generates recommendation information, and provides the information for the respective databases. The contents recommendation unit 500 generates contents recommendation information according to the information about the reuse relationship, the reuse level, and the user, etc. of the contents generated by the contents reuse management apparatus 250, and provides the information for each database.
A scenario administrator, a document administrator, a template administrator, and a text/drawing sample administrator use the contents recommendation unit 500, refer to the recommendation information, manage contents, and manage generation of contents parts respectively using the scenario management support unit 960, the document management support unit 970, the template management support unit 980, and the text/drawing sample management support unit 990.
The contents reuse management apparatus according to the present invention accesses each database of the contents reuse support apparatus according to the present invention, judges the reuse relationship of contents, and stores a judgment result in each database. The contents reuse management apparatus 250 according to the present invention can access the database system 115 and judge the contents reuse relationship. The contents reuse support apparatus according to the present invention can also access another database system 115 to store the contents parts as the contents parts of the database of the contents reuse support apparatus of the present invention. Another system 116 can also access and use the contents database of the contents reuse support apparatus of the present invention.
The contents reuse management apparatus of the present invention can generate surface information based on a plurality of contents, and the reuse relationship can be checked only by comparing the surface information. Since not only the surface information about contents or keyword information, but also meta-data can be used in a reuse judgment, the details of the reuse relationship can be easily judged. Furthermore, since meta-information can be used in making a reuse judgment, the contents to be compared can be narrowed from among a number of contents in a database to, for example, all contents in a company or all contents in a department of a company, thereby realizing a high-speed reuse judgment on a number of contents.
The contents reuse support apparatus according to the present invention can easily select frequently used contents according to the contents recommendation information. Therefore, important contents can be selected and reused to easily generate high quality contents. Thus, by using the contents generation support apparatus of the present invention, the contents of a database can be successfully enhanced.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2002-296862 | Oct 2002 | JP | national |

| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/JP03/07019 | Jun 2003 | US |
| Child | 11093090 | Mar 2005 | US |