1. Field of the Invention
The present invention relates to a technology for finding a page that outputs the query parameter input by a user.
2. Description of the Related Art
When a Web application is tested for cross-site scripting (XSS) vulnerability, it is important to analyze the input/output relationship of query parameters, that is, to detect pages on which a character string input to a query parameter is output directly. XSS is an attack in which a program that displays a character string input by a user of a website directly on a screen is made to send a malicious script to the user's browser. Damage caused by XSS includes cookie theft, in which the browser executes the malicious script and its cookie data is thereby stolen.
After such a page is found, the XSS vulnerability can be tested by inserting a script based on the position of an input value on the output page, and testing whether the inserted script is executed on a client. Accordingly, it is important for the XSS vulnerability test to find a page that outputs the value input as the query parameter. A technique for finding such a page is described, for example, in Japanese Patent Application Laid-Open No. 2004-164617.
The conventional technique, however, has a problem in that it searches for a page that outputs the query parameter value only in the page immediately after the query parameter value is input.
Therefore, in the conventional technique, if there is a page, on which the query parameter value is output, other than the response page immediately after the input, such a page cannot be found. In addition, a transition change accompanying a change in the query parameter value cannot be detected.
Furthermore, when a value input as the query parameter is the generally used character string “XXX”, it cannot be determined whether the character string “XXX” output on the page is the query parameter value or a value output irrelevantly to the query parameter value.
It is an object of the present invention to at least solve the problems in the conventional technology.
According to an aspect of the present invention, a method of finding an output page on which a query parameter value input by a user is output, includes detecting an output page, designating not only the page immediately after the query parameter value input by the user, but also a page output by a target website as a detection target region.
According to another aspect of the present invention, an apparatus that finds an output page on which a query parameter value input by a user is output, includes an output page detector that detects an output page, designating not only the page immediately after the query parameter value input by the user, but also a page output by a target website as a detection target region.
According to still another aspect of the present invention, a computer-readable recording medium stores therein a computer program that realizes the above method according to the present invention on a computer.
The above and other objects, features, advantages and technical and industrial significance of this invention will be better understood by reading the following detailed description of presently preferred embodiments of the invention, when considered in connection with the accompanying drawings.
Exemplary embodiments of the present invention will be explained below in detail with reference to the accompanying drawings.
The concept of a query parameter output page group finding apparatus according to one embodiment is explained first.
As shown in
The tracer value is a value that is rarely used in general and is therefore easy to trace when analyzing the query parameter input/output relationship. For example, when “company A” is a query parameter input value, there is a high possibility that “company A” also appears on pages for reasons unrelated to the query parameter input. Hence, the query parameter output page group finding apparatus according to the present embodiment adds “QZ” to “company A” to generate a tracer value “company AQZ”, and tests whether the same page as the page at the time of inputting “company A” is reproduced when “company AQZ” is input.
When the same page as the page at the time of inputting “company A” is reproduced, a page that outputs a query parameter value is searched for by using the tracer value, targeting not only the page immediately after the input, but also the region held as already-known pages by the query parameter output page group finding apparatus. On the other hand, when the same page as that at the time of inputting “company A” is not reproduced, for example because a page warning of an input error is returned instead, another tracer value is generated and the trial is repeated until the page is reproduced.
In this manner, the query parameter output page group finding apparatus according to this embodiment can find a query parameter value output page, which cannot be found according to the conventional method, by searching a query parameter value output page by using the tracer value, targeting not only the page immediately after the input of the query parameter value, but also all pages held as the already-known page.
The query parameter output page group finding apparatus according to the present embodiment generates a tracer value based on the query parameter value, and uses the generated tracer value instead of the query parameter value, thereby preventing misdetection of a page, on which the same character string is used by chance irrelevantly to the query parameter value.
The configuration of the query parameter output page group finding apparatus according to this embodiment is explained next.
As shown in
The page reproduction/reproduction result verifying unit 110 reproduces a page output by a target website, and verifies whether the reproduced page matches an expected page. The page reproduction/reproduction result verifying unit 110 holds a record of reproduction trial methods and the verification method of the respective pages of the target website, to reproduce the page and verify the reproduction result.
The record of the reproduction trial methods is a list of generation methods of requests transmitted to the target website, and the record of the verification method is a list of response properties expected with respect to the list of the generation methods of requests transmitted to the target website. These records can be referred to or changed from outside.
Upon reception of an instruction to perform reproduction trial of an arbitrary page, the page reproduction/reproduction result verifying unit 110 tries reproduction of the page according to the reproduction trial method, verifies by the verification method whether the obtained response list is the expected one, and notifies the result.
The page reproduction/reproduction result verifying unit 110 collects page information from the target website, classifies the collected pages into page classes, and builds a page transition model by modeling the transition of the pages. The page reproduction/reproduction result verifying unit 110 determines one reproduction request used at the time of performing reproduction trial of the respective page classes (classification unit of page group), and determines a prerequisite at the time of transmitting the request (which page class is to be subjected to reproduction trial immediately before the transmission).
When reproduction trial of a page classified in a certain page class is requested, the page reproduction/reproduction result verifying unit 110 sequentially performs reproduction trial of a page class group, which becomes the prerequisite, based on the page transition model, and lastly performs reproduction trial of a specified page class by transmitting a reproduction request. In that case, the page reproduction/reproduction result verifying unit 110 automatically verifies whether the pages obtained by respective requests are classified into the page classes expected at that time. When the obtained page is not classified into the expected page class, the page reproduction/reproduction result verifying unit 110 suspends the reproduction trial and notifies this matter.
That is, when the reproduction trial of “8: output” is requested, the page reproduction/reproduction result verifying unit 110 requests “9:” based on a set reproduction request, and confirms that the page obtained by the request is certainly classified in “9:”, or when the page is not classified in “9:”, notifies this matter.
Subsequently, the page reproduction/reproduction result verifying unit 110 requests “2: menu” based on the set reproduction request, and confirms the obtained page is certainly classified in “2: menu”, or when the page is not classified in “2: menu”, notifies this matter. Hereinafter, the page reproduction/reproduction result verifying unit 110 reproduces likewise until “8: output”. The details of the page reproduction/reproduction result verifying unit 110 are described in, for example, Japanese Patent Application No. 2004-237551.
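The chained reproduction described above can be sketched as follows. This is only an illustrative sketch in Python, not the implementation of the page reproduction/reproduction result verifying unit 110. The names `PageClass`, `ReproductionError`, `reproduce`, `fetch`, and `classify` are assumptions introduced for the illustration, and the page transition model is simplified to one prerequisite per page class.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PageClass:
    name: str                    # page class label, e.g. "2: menu"
    request: str                 # reproduction request for this page class
    prerequisite: Optional[str]  # page class to reproduce immediately before


class ReproductionError(Exception):
    """Raised when an obtained page is not classified as expected."""


def reproduce(target: str,
              model: Dict[str, PageClass],
              fetch: Callable[[str], str],
              classify: Callable[[str], str]) -> List[str]:
    # Walk back through the prerequisites to build the route,
    # then replay the route from its first page class.
    route: List[str] = []
    current: Optional[str] = target
    while current is not None:
        route.append(current)
        current = model[current].prerequisite
    route.reverse()

    pages: List[str] = []
    for name in route:
        page = fetch(model[name].request)
        actual = classify(page)
        if actual != name:
            # Suspend the trial and report the mismatch.
            raise ReproductionError(f"expected {name!r}, got {actual!r}")
        pages.append(page)
    return pages
```

In this sketch, `fetch` stands for sending a request to the target website and `classify` for assigning the obtained page to a page class; both are supplied by the caller.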
The page group information storage unit 120 stores information required for reproduction of pages such as the page transition model and verification of the reproduction result. The page reproduction/reproduction result verifying unit 110 collects information from the target website, constructs the page transition model and the like, and stores the model in the page group information storage unit 120.
The output page group detector 130 detects the query parameter output page by using the page reproduction/reproduction result verifying unit 110, and includes a query parameter receiving unit 131, a traceable character string generator 132, a query parameter input reproducing unit 133, a query parameter output page reproducing unit 134, a found result output unit 135, and a controller 136.
The query parameter receiving unit 131 receives a query parameter group, which is an object of the input/output relationship analysis, the value thereof, a finding principle, and the like from a user.
As shown in
The traceable character string generator 132 generates the tracer value based on the query parameter value received by the query parameter receiving unit 131. Characteristics of the tracer value include “uniqueness” and “acceptance”. The “uniqueness” means that the tracer value is rarely used, so that when it is output from a Web application, the output can be recognized as resulting from the input of the tracer value. The “acceptance” means that the tracer value is accepted by the Web application in the same manner as the original query parameter value, that is, the Web application performs the same control as when the original query parameter value is input.
The traceable character string generator 132 determines a character type forming the tracer value from the query parameter value after URL decoding, so as to satisfy the “acceptance”. Specifically, half-width lower case letters [0x61, 0x7A] are most likely to be accepted, and hence, the character type is determined in the following manner:
(1) When a half-width lower case letter is included in the query parameter value, the character type is determined as the half-width lower case letter.
(2) When a half-width upper case letter [0x41, 0x5A] is included in the query parameter value, the character type is determined as the half-width upper case letter.
(3) When a half-width numeric character [0x30, 0x39] is included in the query parameter value, the character type is determined as the half-width numeric character.
(4) When Japanese Hiragana script is included in the query parameter value, the character type is determined as hiragana.
(5) When a full-width Japanese Katakana script is included in the query parameter value, the character type is determined as full-width katakana.
(6) When half-width Japanese Katakana script is included in the query parameter value, the character type is determined as half-width Katakana.
(7) When other multibyte characters (characters that are not encoded to one byte in Unicode Transformation Format (UTF)-8, excluding non-letter symbols) are included in the query parameter value, the character type is determined as a character type of a language including the character (corresponding to Japanese Hiragana script).
(8) Otherwise, the character type is determined as half-width lower case letter.
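The priority order of rules (1) to (8) can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the function name `determine_char_type` and the returned labels are hypothetical, the Hiragana and Katakana checks use the Unicode ranges U+3041 to U+3096, U+30A1 to U+30FA, and U+FF66 to U+FF9F, and rule (7) is simplified to a single "other-multibyte" label instead of identifying the language of the character.

```python
from urllib.parse import unquote


def determine_char_type(raw_value: str) -> str:
    # The character type is decided on the value after URL decoding.
    value = unquote(raw_value)
    # Rules are tried in the order (1) to (7); the first match wins.
    rules = [
        ("lower", lambda c: "a" <= c <= "z"),                 # [0x61, 0x7A]
        ("upper", lambda c: "A" <= c <= "Z"),                 # [0x41, 0x5A]
        ("digit", lambda c: "0" <= c <= "9"),                 # [0x30, 0x39]
        ("hiragana", lambda c: "\u3041" <= c <= "\u3096"),
        ("fullwidth-katakana", lambda c: "\u30a1" <= c <= "\u30fa"),
        ("halfwidth-katakana", lambda c: "\uff66" <= c <= "\uff9f"),
        # Simplified rule (7): any other non-ASCII letter.
        ("other-multibyte", lambda c: ord(c) > 0x7F and c.isalpha()),
    ]
    for name, matches in rules:
        if any(matches(c) for c in value):
            return name
    return "lower"  # rule (8): fall back to half-width lower case
```

Because the rules are checked in order, a mixed value such as “company A” is classified by its lower case letters even though it also contains an upper case letter.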
The traceable character string generator 132 determines a character string peculiar to the language including the character (character row that is not used commonly), so as to satisfy the “uniqueness”. For example, the peculiar character string is determined in the following manner.
(1) In the case of half-width lower case letter, the character string is “qz”,
(2) In the case of half-width upper case letter, the character string is “QZ”,
(3) In the case of half-width numerical character, the character string is “7654”,
(4) In the case of the Hiragana script, the character string is “”,
(5) In the case of full-width Japanese Katakana script, the character string is “”, and
(6) In the case of half-width Japanese Katakana script, the character string is “” (half-width )
The traceable character string generator 132 also uses an unused character string from the shortest character string space whose size is at least the total number of target query parameters, so as to satisfy the “uniqueness”. For example, when the total number of target query parameters is 676 and the character type is the half-width lower case letter, there are 26 lower case letters, so the space of two-character strings has a size of 26×26=676; two characters is therefore the shortest sufficient length, and an unused character string is taken from the two-character space “aa” to “zz”. When the number of query parameters is 677, the shortest sufficient space is the three-character space, and hence an unused character string is taken from “aaa” to “baa”. In the case of the half-width lower case letter, therefore, the character string becomes “qzaa” or the like.
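The arithmetic in the preceding paragraph can be checked with a short sketch. The Python below is an illustration only; the function names `suffix_length` and `tracer_values` are assumptions, and the sketch assumes that the suffix is at least one character long and that unused strings are handed out in lexicographic order.

```python
import string
from itertools import product
from typing import List


def suffix_length(total_params: int, alphabet_size: int) -> int:
    # Shortest length whose string space holds at least total_params strings.
    length = 1
    while alphabet_size ** length < total_params:
        length += 1
    return length


def tracer_values(prefix: str, alphabet: str, total_params: int) -> List[str]:
    # One distinct tracer per target query parameter:
    # the peculiar string (e.g. "qz") followed by the next unused suffix.
    length = suffix_length(total_params, len(alphabet))
    suffixes = product(alphabet, repeat=length)
    return [prefix + "".join(next(suffixes)) for _ in range(total_params)]
```

With 676 parameters and the 26 lower case letters, `suffix_length` yields 2 and the tracers run from “qzaa” to “qzzz”; with 677 parameters it yields 3 and the tracers run from “qzaaa” to “qzbaa”.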
The traceable character string generator 132 determines a tracer value obtained by connecting a “front-half fixed character string” and a “latter-half fixed character string” specified by a user on the GUI shown in
Thus, since the traceable character string generator 132 generates a tracer value that has excellent “uniqueness” and “acceptance” based on the query parameter value accepted by the query parameter receiving unit 131, the query parameter value output page can be found accurately and efficiently.
Furthermore, the traceable character string generator 132 regenerates the tracer value based on an instruction from the query parameter input reproducing unit 133. Specifically, denoting the value originally given to the query parameter as the original value, the traceable character string generator 132 generates:
(1) original value+“default tracer value”;
(2) “default tracer value”+original value; and
(3) original value+“default tracer value”+original value. “+” means connection of character strings. The original value is connected when the tracer value is regenerated, because a Web application that has accepted the original value is likely to accept a character string in which the original value is added before the tracer value, after it, or on both sides of it.
When the regenerated tracer value is not accepted, the traceable character string generator 132 requests the user to create a tracer value based on an instruction from the query parameter input reproducing unit 133.
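The regeneration order can be sketched as follows. This Python sketch is illustrative only; the names `regenerated_tracers`, `first_accepted`, and the `is_accepted` callback (which stands for a reproduction trial of the original page) are assumptions introduced here.

```python
from typing import Callable, List, Optional


def regenerated_tracers(original: str, default_tracer: str) -> List[str]:
    # Patterns (1) to (3): the original value is connected before,
    # after, or on both sides of the default tracer value.
    return [
        original + default_tracer,
        default_tracer + original,
        original + default_tracer + original,
    ]


def first_accepted(original: str,
                   default_tracer: str,
                   is_accepted: Callable[[str], bool]) -> Optional[str]:
    # Try the default tracer first, then each regenerated candidate.
    candidates = [default_tracer] + regenerated_tracers(original, default_tracer)
    for candidate in candidates:
        if is_accepted(candidate):
            return candidate
    return None  # no candidate accepted; the user may be asked for a value
```

For the original value “company A” and the default tracer “QZ”, the first regenerated candidate is “company AQZ”.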
The query parameter input reproducing unit 133 is a processor that reproduces an input of the query parameter by using the page reproduction/reproduction result verifying unit 110. That is, the query parameter input reproducing unit 133 generates a test request in which the query parameter value is changed to a tracer value from the original request generated by inputting the query parameter, and tries to reproduce a page classified in a page class in which the page obtained by the original request is classified, by using the page reproduction/reproduction result verifying unit 110. As a result, when the page classified in the same page class as the original request is reproduced, it is assumed that the tracer value is accepted by the target Web application.
For example, from the original request http://example.com/?p1=small&p2=CAPITAL, when the target is “p1”, a test request such as http://example.com/?p1=qzaa&p2=CAPITAL, in which the value of “p1” is changed to “qzaa”, is generated, and when the target is “p2”, a test request such as http://example.com/?p1=small&p2=QZAB, in which the value of “p2” is changed to “QZAB”, is generated.
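Generating a test request from an original request can be sketched with the Python standard library. The function name `make_test_request` is an assumption for illustration, and the sketch handles only parameters carried in the URL query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def make_test_request(url: str, param: str, tracer: str) -> str:
    # Replace the value of the target query parameter with the tracer
    # value and leave every other parameter unchanged.
    parts = urlsplit(url)
    pairs = [(key, tracer if key == param else value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit(parts._replace(query=urlencode(pairs)))
```

Calling `make_test_request("http://example.com/?p1=small&p2=CAPITAL", "p1", "qzaa")` gives the first test request of the example above.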
For example, when a query parameter “p” in a reproduction request http://example.com/?p=v in page class “4:” in
Furthermore, when the query parameter input reproducing unit 133 tries to reproduce a page classified in the page class of the page obtained by the original request, and the obtained page is not classified in that expected page class, it is assumed that the tracer value is not accepted by the Web application. This is because a Web application often indicates whether it has accepted an input parameter value on the page immediately after the input.
When retrial is set, that is, it is set to “retry with a value obtained by connecting the original value before and after the tracer value at the time of reproduction failure” on the principle setting GUI shown in
When manual setting of a tracer value is specified when the tracer value regenerated by the traceable character string generator 132 is not accepted, that is, it is set to “display a dialog requesting appropriate input at the time of reproduction failure” on the principle setting GUI shown in
When the user specifies the tracer value, the query parameter input reproducing unit 133 retries reproduction by using the specified tracer value. On the other hand, when the user gives up finding the output page relating to the query parameter, the query parameter input reproducing unit 133 suspends reproduction.
The query parameter output page reproducing unit 134 detects a page that outputs the query parameter value, by using the page reproduction/reproduction result verifying unit 110. That is, when reproduction of the query parameter input by the query parameter input reproducing unit 133 is a success, the query parameter output page reproducing unit 134 uses the successful test request as a reproduction request to reproduce all the page classes, which are candidates to be found, by using the page reproduction/reproduction result verifying unit 110, monitors the output of the tracer value set in the test request, and detects a page that outputs the query parameter value.
For example, it is assumed that a test request http://example.com/?p=qzac, in which the value of “p” is changed to “qzac”, with respect to the query parameter “p” in http://example.com/?p=v, which is a reproduction request of page class “4:” shown in
The query parameter output page reproducing unit 134 then searches the whole page information stored in the page group information storage unit 120 for pages including “v”, which is the original value of “p”, to narrow down the page classes to which the query parameter value can be output, and designates those page classes as a candidate page class group.
For example, it is assumed that “1:”, “7: confirm”, and “8: output” are the candidate page class group in
At the time of reproduction trial, when “4:” is passed, a request to be used at the time of performing reproduction trial of “4:” is replaced by the test request. The query parameter output page reproducing unit 134 monitors whether the tracer value “qzac” is output in “1:”, “7: confirm”, and “8: output” during reproduction, and when the tracer value is output, the page class is output as a found page.
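The two stages described above, narrowing down candidates with the original value and then monitoring the tracer value during reproduction, can be sketched as follows. The Python below is an assumption-laden illustration: `candidate_classes`, `find_output_classes`, and the `reproduce_page` callback are hypothetical names, and each page class is represented by a single page body.

```python
from typing import Callable, Dict, List, Optional


def candidate_classes(known_pages: Dict[str, str], original: str) -> List[str]:
    # Page classes whose stored page already contains the original value.
    return [name for name, body in known_pages.items() if original in body]


def find_output_classes(candidates: List[str],
                        reproduce_page: Callable[[str], Optional[str]],
                        tracer: str) -> List[str]:
    # Reproduce each candidate with the test request in place and
    # report the page classes on which the tracer value appears.
    found: List[str] = []
    for name in candidates:
        body = reproduce_page(name)  # None stands for a failed reproduction
        if body is not None and tracer in body:
            found.append(name)
    return found
```

A candidate that contains the original value only by coincidence is dropped in the second stage, because the tracer value does not appear on it.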
In performing reproduction trial, when reproduction is a failure, it is difficult to guess the cause. Therefore, it is only output that reproduction is a failure. Furthermore, in a page class “1:”, which does not pass through “4:”, when “node other than reproduction route (node is a page class) is also designated as an object to be found” is not set on the principle setting GUI shown in
The found result output unit 135 is a processor that outputs an analysis result such as the query parameter value output page detected by the query parameter output page reproducing unit 134 and the like.
Analysis results for three query parameters, “action”, “address”, and “age”, which are included in a page shifted from node (page class) “3” to “4”, are shown in
For example, regarding “age”, the following information is output.
The original value is “30”, and the page class group, in which “30” is output, is node “6” and “7”, of all pages.
Thereafter, test requests using “76540000000003” and three other values as the tracer value (traceable character string) were tried; these did not shift to the original node “4” but instead shifted to “3”. The traceable character string was not found in the shifted page.
Thereafter, a test request by using “117” as the traceable character string was tried, and as a result, the original node “4” was reproduced, and reproduction of nodes “6, 7” was tried. As a result, “6” was reproduced, however, the traceable character string “117” was not found therein. “7” was also reproduced, and the traceable character string “117” was found therein.
As shown in
The controller 136 is a processor that controls the entire query parameter output page group finding apparatus 100, and specifically, makes the query parameter output page group finding apparatus 100 function as one apparatus, by shifting the control between functional units and transferring data between the functional units and the storage unit.
A process procedure performed by the output page group detector 130 is explained next.
As the repeated processing, the traceable character string generator 132 generates a traceable character string (tracer value) based on the query parameter value (step S103), and the query parameter input reproducing unit 133 instructs the page reproduction/reproduction result verifying unit 110 to perform reproduction trial of the original page by a test request including the traceable character string (step S104).
The query parameter input reproducing unit 133 then determines whether the original page has been reproduced (step S105). As a result, when the original page has not been reproduced, the controller 136 determines whether an at-end condition is satisfied (step S106). When the at-end condition is not satisfied, control returns to step S103, to regenerate a traceable character string. When the at-end condition is satisfied, the query parameter input reproducing unit 133 records the query parameter as a query parameter failed in the finding processing (step S107), and performs processing with respect to a next query parameter. The at-end condition is a condition determined based on a finding principle set by the user on the finding principle setting GUI shown in
On the other hand, when the original page has been reproduced, the controller 136 sets a test request in the reproduction request (step S108), and repeats processing from step S109 to step S116 for each candidate page, on which the original query parameter value is output.
As the repeated processing, the query parameter output page reproducing unit 134 instructs the page reproduction/reproduction result verifying unit 110 to perform reproduction trial of the page by using the traceable character string (step S110), to determine whether the page has been reproduced (step S111).
As a result, when the page has been reproduced, the query parameter output page reproducing unit 134 searches the traceable character string from the page output (step S112), to determine whether the traceable character string has been found (step S113). When the traceable character string has been found, the query parameter output page reproducing unit 134 records the output of the query parameter as a found page (step S114).
On the other hand, when the page has not been reproduced, the query parameter output page reproducing unit 134 records the page as a page failed in the finding processing (step S115).
When the repeated processing from step S109 to step S116 has finished for all candidate pages, the controller 136 returns the reproduction request to the original state, to perform processing with respect to the next query parameter.
Lastly, the found result output unit 135 outputs the found result (analysis result) to finish the processing (step S119).
Thus, the output page group detector 130 generates the traceable character string, and monitors the output of the traceable character string, while reproducing the page by using the page reproduction/reproduction result verifying unit 110, thereby finding the output page of the query parameter.
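The overall procedure can be summarized in a short sketch. This Python illustration is not the implementation of the output page group detector 130: the function `find_output_pages` and its four callbacks are assumptions, the end condition is simplified to exhausting the generated tracer candidates, and a page is assumed to be recorded as failed when it cannot be reproduced.

```python
from typing import Callable, Dict, Iterable, List, Optional


def find_output_pages(
    params: Dict[str, str],
    generate_tracers: Callable[[str], Iterable[str]],
    reproduce_original: Callable[[str, str], bool],
    candidates_for: Callable[[str], List[str]],
    reproduce_candidate: Callable[[str, str, str], Optional[str]],
) -> Dict[str, dict]:
    results: Dict[str, dict] = {}
    for name, original in params.items():
        accepted: Optional[str] = None
        for tracer in generate_tracers(original):      # step S103
            if reproduce_original(name, tracer):       # steps S104 and S105
                accepted = tracer
                break
        found: List[str] = []
        failed: List[str] = []
        if accepted is not None:
            for page in candidates_for(original):      # steps S109 to S116
                body = reproduce_candidate(name, accepted, page)  # step S110
                if body is None:
                    failed.append(page)                # step S115
                elif accepted in body:                 # steps S112 and S113
                    found.append(page)                 # step S114
        # A tracer of None marks a parameter that failed (step S107).
        results[name] = {"tracer": accepted, "found": found, "failed": failed}
    return results
```

The returned dictionary corresponds to the found result output at the end of the procedure, with one entry per query parameter.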
The relationship between the effects of the query parameter output page group finding apparatus 100 according to this embodiment and the main processing flow is explained next.
As shown in
For example, the query parameter output page group finding apparatus 100 selects “apple” as a query parameter (step S1), monitors the output of a character string “apple” while reproducing all pages (step S4), and outputs a page, on which “apple” is output, as a found page (step S5). As a result, the detection range can be enlarged.
Furthermore, the query parameter output page group finding apparatus 100 according to this embodiment selects a query parameter (step S1), generates a traceable character string by referring to an original value of the selected query parameter (step S2), confirms that the original page is reproduced by a new request using the generated traceable character string (step S3), reproduces all pages while monitoring the generated traceable character string (step S4), and outputs a detection result of the output page of the traceable character string (step S5). Accordingly, the query parameter output page group finding apparatus 100 can reduce erroneous detection.
For example, the query parameter output page group finding apparatus 100 selects “apple” as a query parameter (step S1), generates “goggole” by referring to “apple” (step S2), and confirms whether the original page is reproduced by “goggole” (step S3). When the original page is reproduced, the apparatus monitors the output of the character string “goggole” while reproducing all pages (step S4), and outputs a page on which “goggole” is output as a found page (step S5). Because there is a low possibility that “goggole” is output on an irrelevant page, erroneous detection can be reduced.
As described above, in this embodiment, the traceable character string generator 132 generates a tracer value based on the original query parameter value. The query parameter input reproducing unit 133 reproduces an input of a query parameter by using the tracer value, and when the query parameter input is reproduced, the query parameter output page reproducing unit 134 detects a page that outputs the query parameter value with respect to all pages stored in the page group information storage unit 120 by using the tracer value. Accordingly, the output page, on which the query parameter value input by the user is output, can be found highly accurately.
The query parameter input reproducing unit 133 and the query parameter output page reproducing unit 134 reproduce pages by using the page reproduction/reproduction result verifying unit 110, to verify the reproduction results.
In this embodiment, an example where pages are reproduced by using the page reproduction/reproduction result verifying unit 110 has been explained. However, the present invention is not limited thereto, and is also applicable to a case that reproduction trial of a page is performed while actually communicating with a target website.
In this embodiment, the query parameter output page group finding apparatus has been explained. However, by realizing the configuration of the query parameter output page group finding apparatus by software, a query parameter output page group finding program having the same function can be obtained. Therefore, a computer that executes the query parameter output page group finding program is explained.
The RAM 210 is a memory that stores programs and execution interim results of the programs, and the CPU 220 reads out programs from the RAM 210 and executes the programs.
The HDD 230 is a disk device that stores programs and data, and the LAN interface 240 connects the computer 200 to other computers via the LAN.
The input/output interface 250 connects input units such as a mouse and a keyboard and a display unit, and the DVD drive 260 reads data from and writes data in a DVD.
A query parameter output page group finding program 211 executed by the computer 200 is stored in the DVD, read out from the DVD by the DVD drive 260 and installed in the computer 200.
Alternatively, the query parameter output page group finding program 211 is stored in a database of another computer system connected to the computer 200 via the LAN interface 240, read from the database and installed in the computer 200.
The installed query parameter output page group finding program 211 is stored in the HDD 230, read by the RAM 210, and executed by the CPU 220 as a query parameter output page group finding process 221.
According to an embodiment, since more output pages can be detected, detectability of the output page can be improved.
Moreover, since erroneous detection of the output page can be prevented, the output page can be found highly accurately.
Furthermore, since the output page can be detected accurately, the output page can be found highly accurately.
Moreover, since the possibility of finding the output page can be improved, the output page can be found highly accurately.
Furthermore, since the tracer value is changed to the one easily accepted by the target website, the possibility of reproducing the page can be improved.
Moreover, since the user specifies the tracer value, the user himself can improve the possibility of reproducing the page.
Furthermore, since the user is involved in generation of the tracer value, the user himself can improve the possibility of reproducing the page.
Moreover, since the detection range of the output page is limited, the output page can be found efficiently.
Although the invention has been described with respect to a specific embodiment for a complete and clear disclosure, the appended claims are not to be thus limited but are to be construed as embodying all modifications and alternative constructions that may occur to one skilled in the art that fairly fall within the basic teaching herein set forth.
Number | Date | Country | Kind |
---|---|---|---|
2006-001879 | Jan 2006 | JP | national |