The present invention relates to a software documentation preparing system capable of outputting software documentation in plural natural languages, and more specifically to a software documentation preparing system capable of preparing, by text processing, documentation on software from a source file of the computer software that includes comments, and of converting such a file by text processing.
Before describing the present invention, the definitions of terms that can be frequently misunderstood are first described below. In the present invention, some types of “languages” are used in a computer system. Therefore, in the present specification, a “natural language” means a language normally used by people such as Japanese, English, Chinese, Korean, etc. A “programming language” generally means a language such as an assembly language, a C language, a Java (registered trademark) language for describing software that operates information equipment.
It is an important strategy for a software house to deliver software products to a number of nations and areas. Generally when software is used, a user reads any documentation to fully understand the software. In this case, the user can learn how to use the software the most efficiently by reading the documentation written in the first natural language (that is, the mother tongue) of the user. That is, the usability of software largely depends on the readability of the documentation relating to the software. Therefore, presenting the documentation relating to software, or generally the documents relating to software, in a language of each nation or area allows the value of the software to be enhanced in each target area.
On the other hand, rapid progress has been made in international software development. A programming language itself is substantially independent of any natural language, and belongs to a knowledge system common to software developers worldwide. Therefore, internationalization is a natural course for software development. However, as software has become increasingly complex, it is very difficult to understand software from its source code alone. As a result, it is common to share or distribute among developers, together with the source code, documents relating to the development of the software written in a natural language (examples: XXX software documentation, YYY internal program specification, etc.). Some of these documents can also help users understand the specification of the software. The problem is the natural language in which a development document is written.
Generally, a development document is in many cases written in the first natural language of a target area, or in English as the internationally standard natural language. However, very few persons are able both to fully understand the necessary language and to develop software successfully. Therefore, reading the software development document in the mother tongue of the reader (that is, a software developer) would help the reader deeply understand the software to be developed and significantly reduce development time. It is thus desired that a software product (or a software component product) be released not only with a document in English as an internationally common natural language, but also with a document written in a local natural language. As a result, the value of the software product can be enhanced in each area of the world.
Therefore, the software developing side needs to prepare documents written in several natural languages. However, preparing a document for software development takes a long time and much labor. Especially when a document must be issued in a large number of natural languages, each issue requires a translating step and a step of confirming the consistency among the documents written in the respective natural languages, which creates a bottleneck in improving the productivity of a software product.
The above-mentioned problems can be easily understood by considering the step of preparing a software developing document as associated with the software itself. Generally, it is costly to separately prepare and maintain a source code and a software developing document (internal specification) of software because internal specification is a document closely related to a source code of software, and it is hard to maintain the consistency in contents between the source code and the document when changes are needed in the source code.
Therefore, it has been widely recognized that a mode of development, in which operable software is integrated with the document by annotating (with comments) the source code of software, is effective in improving the productivity of software products. For example, in the document “Literate Programming”, Knuth shows software in a programming language written along with the description in a natural language by incorporating the source code of software into text, and demonstrates the effectiveness of the mode of development. In the mode of development, a comment in the source code is automatically extracted and adjusted by the comment extracting and document adjusting software, and can be immediately available during software development or operation as a complete document.
It is appropriate to say that an automatic document preparing system is very effective from the viewpoint of the quality maintenance and cost reduction for software and a document. Furthermore, since the document can be immediately available, it is effective in improving the development efficiency. In addition, it has the advantage that the consistency between the software to be executed and the document can be easily guaranteed.
Javadoc of Sun Microsystems, Inc. is an appropriate example of the system. In a program source code of Java (registered trademark), a description is written as a comment in a component of software such as a class, a method, etc. so that a document can be output as an HTML document and a PDF document. When a description is written, a sign indicating the meaning of the description can be added to the description, thereby controlling the description such that the description can be displayed in an appropriate position in the document to be output.
Such automatic preparation of a document has traditionally been implemented for a source file having comments written in a single natural language. This is because it is common practice to describe software in a programming language using an English character set, and the comments added to the source file are also written in English in many cases, owing to the history of the development of computers and the fact that English is currently the international standard natural language.
Notable open implementations that, like Javadoc, automatically prepare a document from software source code include Doxygen, KDOC, and DOC++. However, these tools prepare software documentation written in a single natural language.
A well-known technique relating to electronic documents written in plural foreign languages is the contents filter described in the patent document 1. However, that technique classifies news articles, mainly on current events, into the respective fields of topics; it does not explicitly and concretely indicate a method of classifying and extracting statements written in plural natural languages. Therefore, it does not indicate a system of preparing software documentation in plural natural languages by applying the technique to a source file of software.
In addition, there is a well-known method of using a text preprocessing system, for example, the preprocessor for the C language. In an assumed method, instructions for the preprocessor are embedded in a source file on the preparing side, and the source file is preprocessed before it is input to an automatic document preparing system, thereby removing the comments written in languages other than the natural language to be used in preparing the document, and preparing software documentation described in the target natural language. Examples of instructions for the C language preprocessor are #ifdef and #endif, and the purpose can be attained by fully utilizing these instructions. However, the resulting description is complicated, and because these instructions are originally used in describing the source code of software itself, the management of the codes for identifying a language is frequently thrown into disorder. Such instructions are therefore not appropriate for identifying comments in plural natural languages.
At present, there is no effective technique for enhancing productivity in preparing software documentation written in plural natural languages. Therefore, the present invention aims at providing a system for efficiently preparing software documentation written in plural natural languages.
To attain the above-mentioned objectives, the system for preparing software documentation in plural natural languages according to the present invention as the first aspect includes: input means for inputting a source file including a source code statement written in a programming language and a comment assigned to the source code statement, in which source file, the comment describing one of functions in the source code is described in plural natural languages, each of the descriptions in the natural languages provided with a combined sign of a sign indicating the function and a sign indicating a type of natural language; storage means for interpreting the input source file, identifying the combined sign, associating the sign with a source code statement, and storing a comment on memory; extraction means for extracting only a comment provided with a sign corresponding to the type of the user-specified natural language to be output; and output means for outputting software documentation in the natural language to be output for the source code statement based on the extracted comment.
The system for preparing software documentation in plural natural languages according to the present invention as the second aspect includes: input means for inputting a source file including a source code statement written in a programming language and a comment assigned to the source code statement, in which source file, the comment describing one of functions in the source code is described in plural natural languages, each of the descriptions in the natural languages provided with a combined sign of a sign indicating the function, a sign indicating a type of natural language, and a sign indicating a nation or an area; storage means for interpreting the input source file, identifying the combined sign, associating the sign with a source code statement, and storing a comment on memory; extraction means for extracting only a comment provided with a sign corresponding to the type of the user-specified natural language to be output; and output means for outputting software documentation in the natural language to be output for the source code statement based on the extracted comment.
In this case, the software documentation preparing system further includes translation means for translating a statement in one natural language into a statement in another natural language. When there is no comment provided with a sign corresponding to the type of the natural language to be output specified by a user, the extraction means extracts a comment provided with a sign corresponding to the type of a predetermined natural language from the source file, and the output means causes the translation means to machine-translate the extracted comment, described in the predetermined natural language, into a comment described in the user-specified natural language, and outputs software documentation in the user-specified natural language to be output.
In this case, the system can also be configured such that a sign indicating the type of a primary natural language can be included in a source file to indicate the default of the type of the natural language of a comment to be translated when the machine translation is performed.
The system can also be configured such that, in the software documentation preparing system according to the present invention, a sign added to a comment includes a sign showing the necessity to update a comment, and the output means can output the information about a portion to be updated or a language to be updated in a source file based on a sign showing that the comment is necessary to be updated. According to another aspect of the present invention, each process element (means) is realized as a program. When the program is installed in the information processing device, it functions as the software documentation preparing system according to the present invention. In this case, there is a characteristic in the data structure for configuring a source file structure used in the system. In a source file including a source code statement written in a programming language and a comment assigned to the source code statement, a comment describing a function in a source code is described in plural natural languages, and a sign of a combination of a sign of a function and a sign of the type of a natural language is provided for a description of each natural language. In another source file including a source code statement written in a programming language and a comment assigned to the source code statement, a comment describing a function in a source code is described in plural natural languages following the sign indicating a function, and a sign of the type of the natural language used in the description is added to the comment.
According to the software documentation preparing system of the present invention with the above-mentioned configuration, by including a comment written in plural natural languages in a source file together with a source code, a software developer, an editor of each language, and a translator of each language can be prevented from performing a wrong editing process. Simultaneously, a portion necessary to be translated, such as a comment described in a foreign language can be displayed to a translator, thereby efficiently performing the editing process. As a result, the following problems can be successfully solved.
(Problem 1): Conventionally, a software developer prepares a source file provided with a comment for a source code written in a programming language, and prepares software documentation using a tool such as Javadoc etc. for preparing software documentation by inputting the source file. However, those tools are used for a single natural language. Therefore, it is necessary to translate a file in each natural language and confirm the consistency at a request for software documentation written in plural natural languages, and a comment written in plural natural languages has not been held in the source file for future processing. On the other hand, according to the software documentation preparing system of the present invention, a system capable of describing a comment of a source file in plural natural languages, and preparing software documentation written in plural natural languages can be realized.
(Problem 2): Although a natural language does not correspond one-to-one to a nation or an area, it is necessary to provide appropriate software documentation at a user request. However, no system satisfying the request has been realized. The problem can also be solved by the software documentation preparing system according to the present invention.
(Problem 3): In a source file including a source code written in a programming language and a comment written in a natural language, the method of appropriately determining a portion to be translated has not been clearly described, and it is necessary to perform translation by a human translator, not by a machine translation. Therefore, in the process of manufacturing a software product, a long time and a high cost are required to prepare software documentation. The problem can also be solved by the software documentation preparing system of the present invention.
(Problem 4): Even in the case where machine translation can be applied to a comment in a source file, it is difficult to appropriately select a comment to be translated, and explicit selection means for reflecting the intention of a software developer is required. According to the software documentation preparing system of the present invention, the problem can also be solved.
(Problem 5): Although there are a describing method and a system for a processing method for a source file described by a comment in plural natural languages, it is very difficult to appropriately change and manage the contents of a comment written in each natural language based on the specification change of software and the implementation contents of a source code. For example, when a comment described in a natural language is changed, the changed comment does not match a comment described in another natural language, but it is difficult to manage the information as to which comment is to be amended as the latest information. Up to now, no system has been realized for appropriately changing and managing a source file for which a comment has been written in plural natural languages. According to the software documentation preparing system of the present invention, the problem can also be solved.
The mode for embodying the present invention is described below with reference to a concrete embodiment. In the description of the embodiment, the software documentation preparing system is operated by realizing a system element that functions as processing means by installing software (program) executed on a computer (or information equipment in a broad concept). As a file in this system, an electronic file stored in the storage device of a computer is assumed. The software documentation preparing system can also be configured as a stand-alone software documentation preparing apparatus most parts of which are configured by hardware and maintain the same functions.
A source file input to the software documentation preparing system includes not only source code statements written in the programming language used in developing a program, but also comments assigned and corresponding to the source code statements. The comments give the necessary explanation in as many natural languages as required, that is, in more than one natural language. In this case, signs indicating the meanings of comments describing the functions of the source code and the types of natural languages are assigned as follows.
In a source file input to a normal software documentation preparing system, a comment is provided with a sign indicating the meaning of the comment. Normally, a comment is identified by the sign indicating its meaning through a syntax analysis of the input source file, and is stored in the storage means. However, in a source file targeted by the present invention, each comment is described in plural natural languages. Therefore, a sign obtained as a combination of a sign indicating the type of the natural language of each comment and a sign indicating the meaning of the comment is assigned, and these signs are described together. Accordingly, when a comment is stored, it is identified by the combination of the sign indicating the type of natural language and the sign indicating the meaning of the comment, and then stored.
In the software documentation preparing system, as described above, only a comment assigned a sign corresponding to the type of a user-specified natural language to be output is extracted from the comment identified by a combination sign and stored in the storage means. Thus, the software documentation corresponding to the source code statement and the executable software can be output in a specified natural language to be output.
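The storage and extraction steps described above can be pictured with a minimal Java sketch. It is an illustration under stated assumptions, not the implementation of the embodiment: the class name `CommentStore`, its method names, and the assumed line shape `@<meaning>.<language> <text>` are hypothetical, chosen only to mirror combination signs such as "@u.ja".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: files each comment under the language named by its combined sign.
final class CommentStore {
    // Assumed line shape: optional " * " prefix, then "@<meaning>.<language> <text>".
    private static final Pattern LINE =
            Pattern.compile("^\\s*\\*?\\s*@(\\w+)\\.([a-z]{2,3})\\s+(.*)$");

    // language code -> entries of the form "meaning: text", in source order
    private final Map<String, List<String>> byLanguage = new LinkedHashMap<>();

    /** Storage step: identify the combined sign on each line and store the text by language. */
    void store(List<String> commentLines) {
        for (String line : commentLines) {
            Matcher m = LINE.matcher(line);
            if (m.matches()) {
                byLanguage.computeIfAbsent(m.group(2), k -> new ArrayList<>())
                          .add(m.group(1) + ": " + m.group(3));
            }
        }
    }

    /** Extraction step: return only the entries whose sign names the requested language. */
    List<String> extract(String language) {
        return byLanguage.getOrDefault(language, new ArrayList<>());
    }
}
```

Calling `extract("ja")` after `store(...)` would yield only the entries tagged for Japanese, which an output step could then format as documentation.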
In the software documentation preparing system, not only a sign as a combination of a sign indicating the type of a natural language, a sign indicating the meaning of a comment, but also a sign indicating a nation or an area is assigned to each comment written in each natural language in an input source file, and a comment including the sign is extracted to output software documentation.
Also in the software documentation preparing system, a comment in a specified natural language can be prepared by machine translation from a comment described in another natural language in order to output software documentation. In that case, a sign indicating the type of a primary natural language can be included in the input source file, and the comment in the specified natural language is machine-translated from the comment in the indicated primary natural language, thereby outputting the software documentation.
Furthermore, a comment assigned to a source code statement in an input source file can also carry a combined sign that includes a sign indicating that an update is required, and, by interpreting the comments carrying that sign, information about the portions requiring update can also be output.
It is also effective to provide a system that converts a source file into one that can be input to an existing software documentation preparing system. That is, the software documentation described in a target natural language is not prepared directly from the source file written in plural languages; instead, a source file having comments only in the target natural language is first output by extracting only the comments described in the target natural language together with the source code. Then, the output source file is input to an existing documentation preparing system, thereby finally obtaining software documentation described in the target natural language.
In an example of a source code shown in the attached drawings, the description in the Java (registered trademark) language is illustrated, but the software documentation preparing system according to the present invention is not applied only to a software source code described in the Java (registered trademark) language. That is, the system can be applied not only in the Java (registered trademark) but also in any programming language.
The software documentation preparing system according to an embodiment of the present invention includes as system elements, as shown in
The input unit 11 inputs a source file including a source code statement written in a programming language and a comment assigned to the source code statement. That is, in a source file including a source code statement written in a programming language and a comment assigned to the source code statement, a comment describing a function in a source code is described in plural natural languages, and the data structure of the source file 101 input by the input unit 11 is provided with a sign of a combination of a sign indicating a function and a sign indicating the type of a natural language in the description of each natural language. The comment storing unit 12 performs a syntactic analysis on the source file and stores the source file with each input comment associated with its source code statement. In this case, for example, a comment written in two or more types of natural languages is identified, for each natural language, by a sign (for example, @u.ja etc.) that combines a sign indicating the type of the natural language and a sign indicating the meaning of the comment, and is then stored. The comment extraction unit 13 extracts only the comment provided with a sign (for example, ja etc.) corresponding to the type of the user-specified natural language to be output from the comment storing unit 12. The output unit 14 outputs software documentation in a natural language to be output for the source code statement based on the extracted comment. The translation unit 15 performs machine translation of a statement in one natural language into a statement in another natural language as described later.
A source file input from a user by the input unit 11, or the source file 101 including a comment stored in the comment storing unit 12 is input to the documentation preparing system 102. A user specifies the type of a natural language of the software documentation to be output to the documentation preparing system 102. The documentation preparing system 102 allows the comment extraction unit 13 to extract only a comment provided with a sign corresponding to the specified type of a natural language to be output, thereby allowing the output unit 14 to output the software documentation corresponding to the source code statement and the executable software in the natural language to be output.
When a user specifies the first natural language to the documentation preparing system 102, documentation 103 written in the first natural language is output. When the user specifies the second natural language, documentation 104 written in the second natural language is output. When the user specifies the n-th natural language, documentation 105 written in the n-th natural language is output. In this case, a single natural language can be specified, or more than one natural language can be specified simultaneously.
Thus, the system extracts a comment written in a target natural language from a comment written in a source code in plural types of natural languages, and automatically prepares a document in a target natural language from a source file.
Examples of description formats shown in
Back to
The sign “@u” is defined as a tag indicating the general description of a comment, and is a sign indicating the meaning of a comment. The signs “.ja”, “.ko”, and “.zh” that follow it each indicate the type of natural language used. These signs can be combined: the combination sign “@u.ja” is obtained by combining a “sign indicating the meaning of a comment” with a “sign indicating the type of natural language”, and thus simultaneously represents the meaning of a comment and the type of natural language.
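As a minimal sketch of how such a combination sign could be split into its two parts, the following Java fragment parses a token such as "@u.ja". The class name `CombinedSign` and the exact pattern (letters for the meaning, two or three lowercase letters for the language) are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type for a combination sign such as "@u.ja".
final class CombinedSign {
    // Assumed syntax: '@' + meaning sign + '.' + language sign.
    private static final Pattern SIGN = Pattern.compile("@([A-Za-z]+)\\.([a-z]{2,3})");

    final String meaning;   // e.g. "u" (general description)
    final String language;  // e.g. "ja"

    private CombinedSign(String meaning, String language) {
        this.meaning = meaning;
        this.language = language;
    }

    /** Splits a token into the sign for the meaning and the sign for the natural language. */
    static CombinedSign parse(String token) {
        Matcher m = SIGN.matcher(token);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a combination sign: " + token);
        }
        return new CombinedSign(m.group(1), m.group(2));
    }
}
```

Keeping the two parts separate lets a later step select comments by language while still placing them in the document according to their meaning.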
In the present invention, by using the above-mentioned combination sign, the documentation of plural types of natural languages can be efficiently edited and prepared. During editing a source code, a programmer who develops a program uses these signs, thereby presenting a comment written in plural different types of natural languages.
Described next, as another example of using the signs, is a document comment assigned to a definition 205 of a method “say”. A first half portion 203 is similar to the comment assigned to a class, and describes the outline of the method. In Javadoc, a “@param” tag is placed before a comment that describes an argument of a method. This sign is also a “sign indicating the meaning of a comment”. In this case, to correspond to plural types of natural languages, a sign indicating the type of a natural language is combined with the “@param” tag, and the resultant combination sign is used. That is, a comment 204 uses a “@param.ja (Japanese)” tag, a “@param.ko (Korean)” tag, and a “@param.zh (Chinese)” tag as combination signs corresponding to plural types of natural languages. Using these tags, a documentation preparing system can identify each comment both as a description of an argument of the method and as a comment for a particular natural language.
ISO 639 is established as the international standard for the names of natural languages; ISO 639-1 defines codes of two alphabetical characters, and ISO 639-2 defines codes of three alphabetical characters. In this embodiment, ISO 639-1 is used, but the software documentation preparing system according to the present invention is not limited to the ISO standards, and any appropriate naming of natural languages can be defined for use. By adopting the above-mentioned notation, the software documentation preparing system according to the present invention can extract only the comments relating to the natural language in which documentation is to be prepared.
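For orientation, a source file annotated in this notation might look like the following Java sketch. Only the method name "say" and the tags come from the description; the class name `Greeter`, the method body, and the placeholder comment texts are hypothetical, and the non-English descriptions are shown as placeholders rather than actual translations.

```java
/**
 * @u.en Greets a person by name.
 * @u.ja (the same outline, written in Japanese)
 * @u.ko (the same outline, written in Korean)
 * @u.zh (the same outline, written in Chinese)
 */
final class Greeter {
    /**
     * @u.en Builds a greeting for the given person.
     * @u.ja (the same outline, written in Japanese)
     * @param.en name the person to greet
     * @param.ja name (description of the argument in Japanese)
     * @param.ko name (description of the argument in Korean)
     * @param.zh name (description of the argument in Chinese)
     */
    static String say(String name) {
        return "Hello, " + name;
    }
}
```

Because the combination signs live inside ordinary documentation comments, the file still compiles unchanged; only the documentation tooling needs to understand the language suffixes.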
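A system adopting two-letter codes could check a language sign against the codes known to the Java runtime, as in the sketch below. `Locale.getISOLanguages()` is a standard JDK method that returns the two-letter ISO 639 codes the JDK knows; the wrapper class `LanguageCodes` is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: validates the language part of a combination sign.
final class LanguageCodes {
    // Two-letter ISO 639 codes as reported by the JDK.
    private static final Set<String> TWO_LETTER_CODES =
            new HashSet<>(Arrays.asList(Locale.getISOLanguages()));

    /** Returns true when the code is a two-letter language code known to the JDK. */
    static boolean isKnown(String code) {
        return TWO_LETTER_CODES.contains(code);
    }
}
```

A system that defines its own language names instead of the ISO codes would replace the set with its own table.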
In a source file 300 shown in
With the above-mentioned example, a sign (nation code) indicating the type of a nation in addition to the sign (language code) indicating the type of natural language is further combined and used, thereby obtaining a sign for specifying a target natural language and nation. Therefore, a document can be written in more detail by a sign added to a comment corresponding to a source code.
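The combination of a language code and a nation code can be sketched in Java with `java.util.Locale`. The sign syntax used here, `@<meaning>.<language>_<NATION>` with the nation part optional, is an assumption made for this illustration, as is the class name `RegionalSign`; the notation actually used in the embodiment may differ.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the language and nation named by a combination sign.
final class RegionalSign {
    // Assumed syntax: "@<meaning>.<language>" optionally followed by "_<NATION>".
    private static final Pattern SIGN =
            Pattern.compile("@(\\w+)\\.([a-z]{2})(?:_([A-Z]{2}))?");

    /** Returns the locale named by the sign; the country is empty when no nation code is given. */
    static Locale localeOf(String sign) {
        Matcher m = SIGN.matcher(sign);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a combination sign: " + sign);
        }
        Locale.Builder builder = new Locale.Builder().setLanguage(m.group(2));
        if (m.group(3) != null) {
            builder.setRegion(m.group(3));
        }
        return builder.build();
    }

    /** True when the sign names exactly the requested language and nation. */
    static boolean matches(String sign, Locale requested) {
        return localeOf(sign).equals(requested);
    }
}
```

Exact matching is the simplest policy; whether a request for one nation should fall back to a comment tagged with the language alone is a design decision this sketch leaves open.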
It is an outstanding advantage that, by performing the above-mentioned processes, documentation in a necessary natural language can be prepared without translation by a person. Machine translation, also called electronic translation or computer translation, is a translating process performed by a machine without translation by a person. When the correctness and validity of a translated statement obtained by machine translation are of low reliability, it is desired that the translated statement be provided with a sign indicating that the statement was obtained by machine translation. Using the sign, the machine-translated statement can be checked later.
It is not always necessary to include in a system the system element (machine translating module) for performing the machine translating process. Not only processing by using internal data representation, but also a process of using a software service outside a system using an external file or clip board can be performed. In addition, the system can be configured by adding the function of machine translation by using the framework of an OLE (object linking and embedding).
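The fallback to machine translation, and the marking of machine-translated statements, can be sketched as follows. The `Translator` interface stands in for whatever translation module or external service is used and is purely hypothetical, as are the class name `FallbackExtractor` and the marker text; no real translation API is implied.

```java
import java.util.Map;

// Hypothetical abstraction over an internal module or an external translation service.
interface Translator {
    String translate(String text, String fromLanguage, String toLanguage);
}

final class FallbackExtractor {
    // Assumed marker so that machine-translated statements can be checked later.
    static final String MT_MARK = "[machine-translated] ";

    /**
     * Returns the comment in the target language when one exists; otherwise
     * machine-translates the comment written in the primary language and marks the result.
     */
    static String commentFor(Map<String, String> byLanguage, String target,
                             String primary, Translator translator) {
        String direct = byLanguage.get(target);
        if (direct != null) {
            return direct;
        }
        String source = byLanguage.get(primary);
        if (source == null) {
            throw new IllegalStateException("no comment in primary language: " + primary);
        }
        return MT_MARK + translator.translate(source, primary, target);
    }
}
```

Passing the translator in as an interface reflects the point that the translating element need not be part of the system itself.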
An important point of developing a program using a source code including a comment written in plural types of natural languages using the software documentation preparing system according to the present invention is how to practically prepare a source file including a comment written in plural types of natural languages when a source file is prepared.
For example, in a common development model that does not use the software documentation preparing system according to the present invention, the software designed by a software designer is realized in the form of a source file by software implementers. At that time, the natural language of a comment assigned to a source code is the first natural language prescribed for the current project. The first comment is normally described in that first natural language.
However, using the software documentation preparing system according to the present invention, a primary natural language can be set for each source file, and translating operations in a source code can be performed completely separately. Therefore, it is not always necessary to use the same natural language in each project. As a result, a natural language that can be easily understood by a member of a project and in which an operation can be efficiently performed can be selected, thereby efficiently performing the entire operation.
In the above-mentioned translating operation, it is also important to select the optimum language as the original natural language from which the translation is performed. The original natural language can be determined, for example, (1) by being specified by a user, or (2) by adopting the natural language used in the working environment of the user.
However, there are many cases in which each source file, class, or method is developed by a different developer. In this case, it is desired to provide the information for a documentation preparing system by including a sign indicating the type of a primary natural language in the source file.
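One way to combine these sources of information is sketched below. The order of precedence (an explicit user choice first, then the primary-language sign found in the source file, then the language of the working environment) is an assumption of this sketch and is not prescribed by the description; the class and method names are likewise hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: decides which natural language to translate from.
final class SourceLanguage {
    /**
     * @param userChoice        language specified by the user, or null when none was given
     * @param primarySignInFile language named by a primary-language sign in the source file, or null
     */
    static String resolve(String userChoice, String primarySignInFile) {
        if (userChoice != null) {
            return userChoice;                       // assumed highest priority
        }
        if (primarySignInFile != null) {
            return primarySignInFile;                // information carried by the source file
        }
        return Locale.getDefault().getLanguage();    // language of the working environment
    }
}
```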
In the software documentation preparing system according to the present invention, it is important to maintain the consistency of the meaning among the comments written in each natural language in an input source file.
By performing the above-mentioned editing operation, it can at least be determined whether the translation into the natural languages other than English needs to be performed again. With this method, the consistency of the meaning can be easily maintained among the comments written in each natural language.
In the software documentation preparing system according to the present invention, since the essential point is to assign a sign indicating the necessity to update a comment, the scope of the application of the present invention is not limited to an example illustrated in the present embodiment.
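The idea of a sign indicating the necessity to update a comment can be illustrated with a small data model. This is a sketch under stated assumptions: the class `MultilingualComment` and its methods are hypothetical, and the sign is modeled as an in-memory set of language codes rather than as text embedded in the source file.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class MultilingualComment:
    """One comment held in several natural languages (hypothetical model)."""
    texts: Dict[str, str]                      # language code -> comment text
    needs_update: Set[str] = field(default_factory=set)

    def edit(self, lang: str, new_text: str) -> None:
        """Edit one language version and flag all the others as stale."""
        if self.texts.get(lang) == new_text:
            return  # nothing changed, so nothing becomes inconsistent
        self.texts[lang] = new_text
        # Every other language may no longer match the edited meaning.
        self.needs_update = {other for other in self.texts if other != lang}

    def mark_translated(self, lang: str, new_text: str) -> None:
        """Store a refreshed translation and clear its update sign."""
        self.texts[lang] = new_text
        self.needs_update.discard(lang)
```

After an English comment is edited, `needs_update` lists the remaining languages, which identifies the portions that must be translated again or reviewed.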
The software documentation preparing system according to the present invention can be embodied as a system, apparatus, or method that directly prepares software documentation described in a target natural language by the respective means described above, but it is not always necessary to use an embodiment that prepares the documentation directly. An embodiment that converts an electronic file, extracting only the comments described in a target natural language from comments described in plural natural languages and outputting a source file provided with comments in a single natural language, can also be an effective implementation of the software documentation preparing system according to the present invention.
By inputting the output source files (1203, 1204, . . . , 1205) to the respective existing software documentation preparing systems (1213, 1214, . . . , 1215), software documentation (1223, 1224, . . . , 1225) written in the target natural languages can be prepared, thereby attaining the purpose. Using the existing software documentation preparing systems means that, by applying the present invention, various representations of software documentation can be realized without developing a new system. The approach is therefore very effective.
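The file conversion described above can be sketched as a line filter. The sketch assumes a hypothetical combination sign of the form `@<lang>.<tag>` (for example `@en.brief`), with one tagged comment per line; this syntax, the pattern `SIGN`, and the function `to_single_language` are illustrative assumptions, not the format defined for the input source file.

```python
import re

# Hypothetical combination sign: "@<lang>.<tag>", e.g. "@en.brief".
SIGN = re.compile(r"@([a-z]{2})\.(\w+)")


def to_single_language(source: str, target: str) -> str:
    """Convert a multi-language source file into a single-language one.

    Lines without a sign (source code, plain comment text) are kept.
    Lines tagged with the target language are kept with the language part
    of the sign removed, so that an existing documentation tool sees an
    ordinary tag such as "@brief". Lines tagged with any other language
    are dropped. Multi-line tagged comments are not handled here.
    """
    kept = []
    for line in source.splitlines():
        match = SIGN.search(line)
        if match is None:
            kept.append(line)
        elif match.group(1) == target:
            kept.append(line[:match.start()] + "@" + match.group(2)
                        + line[match.end():])
    return "\n".join(kept)
```

Running the function once per target language on the same input would produce one single-language source file per language, each of which can then be passed to an existing documentation tool.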
A source file to be input in the present invention will include a number of signs, each obtained by combining a sign indicating the type of a natural language and a sign indicating the meaning of a comment. When this type of file is edited on a conventional text editor, the work of describing these combination signs is added to the work of describing a comment in each natural language, which reduces the efficiency of the editing operation. This problem can be avoided by providing an editing system for software development, such as an existing text editor, with a mechanism for inserting the signs.
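A sign-inserting mechanism of this kind could, for instance, generate an empty comment skeleton for the editor to fill in. The sketch below assumes a hypothetical combination sign of the form `@<lang>.<tag>` inside a C-style block comment; the sign syntax and the function `comment_skeleton` are assumptions for illustration.

```python
from typing import Iterable, List


def comment_skeleton(languages: Iterable[str],
                     tags: Iterable[str],
                     indent: str = "") -> str:
    """Build an empty comment block with one line per (tag, language) pair.

    The editor then only has to type the comment text after each sign.
    """
    langs = list(languages)
    lines: List[str] = [f"{indent}/**"]
    for tag in tags:
        for lang in langs:
            # Hypothetical combination sign: language type + comment meaning.
            lines.append(f"{indent} * @{lang}.{tag} ")
    lines.append(f"{indent} */")
    return "\n".join(lines)
```

Calling `comment_skeleton(["en", "ja"], ["brief", "param"])` produces a block with four signed lines, grouped by tag so that the language versions of the same comment stay adjacent.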
When a source file whose comments are described in a number of natural languages is edited on a text editor, the comment and the corresponding source code move further apart on the screen as the number of natural languages increases, and the editing operation becomes difficult. Additionally, there is a risk that an editor erroneously deletes or changes a part of a comment in a natural language that the editor cannot understand, without being aware of the error. This problem can be avoided by providing an editing system for software development, such as an existing text editor, with a mechanism for presenting only the necessary natural language on the editing screen.
Practically, when source code having comments written in a number of natural languages is edited, the comments are described in the format used for an input source file according to the present invention, and that source file is the file to be edited. The text editor can identify the comments for each natural language using the combination sign of a sign indicating the type of natural language and a sign indicating the meaning of a comment, and can present to the editor only the comments described in the necessary natural language. As a result, source code whose comments are described in a number of natural languages can be edited more efficiently, thereby enhancing the quality.
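One way an editor could present only the necessary natural language while leaving the other versions untouched is to build a filtered view that remembers the original line numbers. The sketch assumes a hypothetical combination sign of the form `@<lang>.<tag>` on each tagged line; the functions `visible_lines` and `apply_edit` are illustrative names, not part of the described system.

```python
import re
from typing import List, Tuple

# Hypothetical combination sign: "@<lang>.<tag>", e.g. "@ja.brief".
SIGN = re.compile(r"@([a-z]{2})\.\w+")


def visible_lines(source: str, shown: str) -> List[Tuple[int, str]]:
    """Return (original line number, text) for the lines to display.

    Untagged lines and lines tagged with the shown language are displayed;
    lines tagged with other languages are hidden but remain in the file.
    """
    view = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = SIGN.search(line)
        if match is None or match.group(1) == shown:
            view.append((number, line))
    return view


def apply_edit(source: str, number: int, new_text: str) -> str:
    """Write an edit back to its original line, leaving hidden lines intact."""
    lines = source.splitlines()
    lines[number - 1] = new_text
    return "\n".join(lines)
```

Because hidden lines never appear in the view, they cannot be deleted or altered by mistake, and the retained line numbers let each visible edit be written back to the correct position in the full file.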
The software documentation preparing system according to the present invention is also useful in use of a source file of the source file structure shown in
According to the software documentation preparing system of the present invention, the source code and the comments written in each natural language are included in a single file, so that the software documentation for each nation and area can be prepared from a single source file. Holding the versions in all natural languages in a single source file prevents the information source from being dispersed, and has the effect of maintaining consistency. Thus, the inconsistency among the natural language versions can be reduced, the time required to prepare documentation can be shortened, and the quality of the documentation can be enhanced.
In addition to preparing software documentation in each natural language, appropriate software documentation can be prepared for each nation or area, thereby providing each client with a higher level of service. Furthermore, by explicitly assigning the type of a natural language to a source code comment, a source code comment can be prepared by machine translation, which was impossible with any conventional technique. This leads to outstanding cost and time savings when a software product is distributed all over the world.
In selecting an appropriate statement to be translated, which is an important problem when machine translation is performed, the explicit use of a sign can reflect the intention of the software project, thereby largely contributing to the enhancement of the quality of the software documentation to be prepared. A spelling check and a grammatical check have conventionally been performed on a single natural language, but by allowing these checks to be performed on a file in which plural natural languages are mixed, the quality of a comment can be improved. As for the inconsistency of a comment, which is a problem when a comment is described in plural natural languages, the portion to be corrected can also be determined efficiently by using a sign indicating that an update is necessary. Furthermore, by using the system for converting a source file in which a comment is described in plural natural languages into a source file described in a single natural language, an existing software documentation preparing system can be used effectively.
Number | Date | Country | Kind |
---|---|---|---|
2005-218993 | Jul 2005 | JP | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2006/314607 | 7/25/2006 | WO | 00 | 1/25/2008 |