The present invention relates to an evaluation apparatus and an evaluation method.
Along with the arrival of the era of the Internet of Things (IoT), various types of devices (IoT devices) are connected to the Internet and used for a multiplicity of purposes. In response to this, security measures for the IoT devices, such as a traffic session anomaly detection system and an intrusion detection system (IDS) designed for the IoT devices, are anticipated.
The aforementioned technology includes, for example, a technology using a probability density estimation device based on unsupervised learning, such as a variational auto encoder (VAE). With this technology, after the probability density of normal communication data is learned, communication having a low probability density is detected as abnormal. For this reason, it is sufficient for the technology to recognize only the normal communication data, and an anomaly can be detected without learning all abnormal data. Therefore, the technology is effective in detecting threats to IoT devices, which are still in a transitional period in which all threat information has not yet been fully identified.
Non-Patent Literature 1: Diederik P Kingma, Max Welling, “Auto-Encoding Variational Bayes”, [searched on Jun. 7, 2018], the Internet <URL: https://arxiv.org/abs/1312.6114>
Here, since the VAE detects an anomaly based on probability, erroneous detection may occur. For example, the erroneous detection includes excess detection in which normal communication is erroneously determined as abnormal. Data that may be excessively detected includes communication for maintenance that occurs only several times a year and an unusually large amount of traffic data generated at the time of an event such as the Olympic Games. To build a practical anomaly detection system, a function is needed with which, when the occurrence of excess detection is noticed, the excess detection data is fed back to improve the detection precision.
Up to now, to feed back the excess detection data, a technique has been used in which a data set is created by mixing the data set used for the initial learning with the data set in which the excess detection occurred, and the VAE model is learned again.
However, the technique in the related art has the following two problems. The first problem is that the initial learning data set used for the initial learning needs to be saved even after the model is generated. The second problem is that, when the amount of the excess detection data set is substantially smaller than that of the initial learning data set, the excess detection data cannot be precisely learned.
In general, the excess detection hardly occurs, and it is difficult in many cases to collect a large amount of excess detection data. For this reason, the second problem among the aforementioned problems is particularly serious. Therefore, a technology is demanded with which the feedback is performed efficiently and precisely even with only a small amount of excess detection data, so that the evaluation precision can be improved.
The present invention has been made in view of the aforementioned circumstances, and it is an object of the present invention to provide an evaluation apparatus and an evaluation method that evaluate the presence or absence of an anomaly of communication data in a highly precise manner.
To solve the aforementioned problems and achieve the object, an evaluation apparatus according to the present invention is characterized by including an acceptance unit configured to accept an input of communication data of an evaluation target, and an evaluation unit configured to estimate a probability density of the communication data of the evaluation target by using a first model in which a feature of a probability density of normal initial learning data is learned and a second model in which a feature of a probability density of normal excess detection data detected as abnormal in a course of evaluation processing is learned, and evaluate presence or absence of an anomaly of the communication data of the evaluation target based on the estimated probability density.
According to the present invention, the evaluation on the presence or absence of the anomaly of the communication data is executed in a highly precise manner.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings. It is noted that the present invention is not intended to be limited by this embodiment. In addition, the same parts are assigned with the same reference signs in the description of the drawings.
An embodiment of the present invention will be described. An evaluation apparatus according to the embodiment generates an excess detection VAE model in which only excess detection data is learned in addition to a learning data VAE model in which normal learning data is learned. The excess detection data is normal communication data evaluated as abnormal in a course of evaluation processing, and only a small amount thereof is generated. In the evaluation apparatus according to the present embodiment, since an evaluation is performed based on a probability density obtained by concatenating the two generated VAE models with each other at a model level, the feedback of the excess detection data and the increase of the detection precision are realized.
It is noted that when an input of a certain data point x_i is accepted, the VAE outputs an anomaly score (abnormality degree) corresponding to the data. When the estimated value of the probability density is denoted by p(x_i), the anomaly score is an approximate value of −log p(x_i). Therefore, the higher the anomaly score output by the VAE, the higher the abnormality degree of the communication data.
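As an illustration of this relationship, the following is a minimal sketch of a VAE used as a probability density estimation device, written here in PyTorch; the network sizes, the Gaussian-style reconstruction term, and the function names are illustrative assumptions rather than the configuration of the embodiment. The anomaly score is computed as the negative evidence lower bound (ELBO), which approximates −log p(x_i).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Minimal VAE used as a probability density estimation device (illustrative sketch)."""
    def __init__(self, in_dim, hidden=64, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def anomaly_score(model, x):
    """Negative ELBO per sample, an approximation of -log p(x)."""
    recon, mu, logvar = model(x)
    # Squared reconstruction error stands in for the negative log-likelihood term.
    recon_err = F.mse_loss(recon, x, reduction="none").sum(dim=1)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    return recon_err + kl
```

Higher values of this score correspond to lower estimated probability densities, which matches the interpretation above.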
In view of the above, a configuration of the evaluation apparatus according to the embodiment will be specifically described.
The communication unit 10 is a communication interface that transmits and receives various types of information to and from another apparatus connected via a network or the like. The communication unit 10 is realized by a network interface card (NIC) or the like, and enables communication between the control unit 12 (described below) and other apparatuses via a telecommunications line such as a local area network (LAN) or the Internet. The communication unit 10 is connected, for example, to an external apparatus via the network or the like, and accepts an input of the communication data of an evaluation target.
The storage unit 11 is realized by a random access memory (RAM), a semiconductor memory element such as a flash memory, or a storage device such as a hard disc or an optical disc, and stores a processing program for causing the evaluation apparatus 1 to operate, data used during the execution of the processing program, and the like. The storage unit 11 has a learning data VAE model 111 and an excess detection VAE model 112.
The learning data VAE model 111 is a learning data VAE model (first model) in which normal learning data is learned, that is, a model in which a feature of the probability density of the normal initial learning data is learned. The excess detection VAE model 112 is an excess detection VAE model (second model) in which only the excess detection data is learned, that is, a model in which a feature of the probability density of the normal excess detection data evaluated as abnormal in the course of the evaluation processing is learned. Each of the models holds the model parameters of the learned VAE.
The control unit 12 includes an internal memory that stores programs defining various types of processing procedures and the like as well as necessary data, and executes various types of processing by using them. For example, the control unit 12 is an electronic circuit such as a central processing unit (CPU) or a micro processing unit (MPU). The control unit 12 includes an acceptance unit 120, a model generation unit 121 (generation unit), and an evaluation unit 123.
The model generation unit 121 includes a VAE 122 as a probability density estimation device. The model generation unit 121 learns input data, and generates a VAE model or updates the parameters of an existing VAE model. The model generation unit 121 stores the model parameters of the generated or updated VAE model in the storage unit 11.
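The following is a rough sketch of how such a generation unit might learn a data set, generate or update a VAE model, and store its parameters. It reuses the hypothetical VAE class and anomaly_score function from the earlier sketch; the optimizer, epoch count, batch size, and the file used as a stand-in for the storage unit 11 are all illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def generate_or_update_model(data, model=None, epochs=20, lr=1e-3,
                             store_path="vae_model_params.pt"):
    """Learn `data` into a new or existing VAE model and persist its parameters."""
    if model is None:                                   # generate a new VAE model
        model = VAE(in_dim=data.shape[1])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(TensorDataset(data), batch_size=32, shuffle=True)
    model.train()
    for _ in range(epochs):
        for (x,) in loader:
            loss = anomaly_score(model, x).mean()       # average negative ELBO over the batch
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    torch.save(model.state_dict(), store_path)          # stand-in for the storage unit 11
    return model
```

Calling such a routine once with the initial learning data and once with the fed-back excess detection data would yield two separate sets of model parameters, corresponding to the learning data VAE model 111 and the excess detection VAE model 112.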
Then, when a learning instruction of the excess detection data is received, the model generation unit 121 learns the input excess detection data, and generates the excess detection VAE model 112 or updates the parameters of the excess detection VAE model 112. Thus, the excess detection data is fed back to the evaluation apparatus 1.
Therefore, for the feedback learning of the excess detection data, it is sufficient for the evaluation apparatus 1 to save only the number of pieces of the initial learning data Ds, not the initial learning data itself. In addition, since only the small amount of excess detection data is learned, the evaluation apparatus 1 can shorten the learning time as compared with a case where the large amount of initial learning data is learned again. Furthermore, since only the excess detection data is learned, the evaluation apparatus 1 can learn it precisely.
Then, the learning data VAE model 111 is a model in which the normal learning data is precisely learned in the initial stage (see (1a)).
The evaluation unit 123 estimates the probability density of the communication data of the evaluation target by using the learning data VAE model 111 and the excess detection VAE model 112, and evaluates the presence or absence of the anomaly of the communication data of the evaluation target based on the estimated probability density. The evaluation unit 123 evaluates the presence or absence of the anomaly of the communication data of the evaluation target based on the probability density obtained by concatenating the following two probability densities with each other. The first probability density is the probability density estimated by applying the learning data VAE model 111. The second probability density is the probability density estimated by applying the excess detection VAE model 112. The evaluation unit 123 detects that the communication data of the evaluation target is abnormal in a case where the concatenated probability density is lower than a predetermined value, and notifies an external response apparatus or the like of the occurrence of the anomaly of the communication data. The evaluation unit 123 includes a concatenation unit 124 and an anomaly existence evaluation unit 126.
The concatenation unit 124 has, for example, the following two VAEs. The first VAE is a first VAE 1251 to which the model parameters of the learning data VAE model 111 are applied. The second VAE is a second VAE 1252 to which the model parameters of the excess detection VAE model 112 are applied. The concatenation unit 124 concatenates the following two probability densities with each other. The first probability density is the probability density estimated by applying the learning data VAE model 111. The second probability density is the probability density estimated by applying the excess detection VAE model 112.
In a case where the excess detection VAE model 112 is generated or updated by the feedback of the excess detection data, the concatenation unit 124 concatenates the excess detection VAE model 112 with the learning data VAE model 111 at the model level. The concatenation at the model level indicates that scores corresponding to the outputs of the respective VAE models are concatenated with each other based on the following Formula (1). In other words, the concatenation unit 124 applies the following two anomaly scores to Formula (1) and calculates a concatenated anomaly score. The first anomaly score is the anomaly score estimated by the first VAE 1251 by applying the learning data VAE model 111. The second anomaly score is the anomaly score estimated by the second VAE 1252 by applying the excess detection VAE model 112.
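The following displayed formula is a reconstruction of Formula (1) under the assumptions stated in the surrounding text, namely that each anomaly score approximates −log p(x) and that the two estimated densities are combined with weights given by the numbers of learned data pieces; it is a consistent reading rather than a verbatim reproduction of the original formula.

```latex
% Reconstruction of Formula (1) under the stated assumptions
\[
  \mathrm{score}_{\mathrm{concat}}
    = -\log\frac{N_{n}\,\exp(-\mathrm{score}_{n}) + N_{od}\,\exp(-\mathrm{score}_{od})}
                {N_{n} + N_{od}}
\]
```

Since exp(−score) recovers the estimated probability density, this is simply the data-count-weighted mixture of the two densities, expressed again as an anomaly score.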
In Formula (1), score_n denotes an anomaly score output by the first VAE 1251 applying the learning data VAE model 111 in which the initial learning data Ds is learned. score_od denotes an anomaly score output by the second VAE 1252 applying the excess detection VAE model 112 in which the excess detection data De is learned. score_concat denotes the concatenated anomaly score. In addition, N_n denotes the number of pieces of learning data, and N_od denotes the number of pieces of excess detection data.
The anomaly existence evaluation unit 126 evaluates the presence or absence of the anomaly of the communication data of the evaluation target based on the probability density concatenated by the concatenation unit 124. The anomaly existence evaluation unit 126 detects the presence or absence of the anomaly of the communication data of the evaluation target based on the concatenated anomaly score calculated by the concatenation unit 124. Specifically, in a case where the concatenated anomaly score is higher than a predetermined value, the anomaly existence evaluation unit 126 evaluates the communication data of the evaluation target as abnormal. On the other hand, in a case where the concatenated anomaly score is equal to or lower than the predetermined value, the anomaly existence evaluation unit 126 evaluates the communication data of the evaluation target as normal.
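A compact sketch of the concatenation and evaluation steps, under the same reconstructed reading of Formula (1); the function names, the threshold, and the numbers in the usage example are illustrative assumptions.

```python
import math

def concatenate_scores(score_n, score_od, n_n, n_od):
    """Combine two anomaly scores as a data-count-weighted mixture of estimated densities."""
    p_n = math.exp(-score_n)    # density estimated via the learning data VAE model
    p_od = math.exp(-score_od)  # density estimated via the excess detection VAE model
    # In practice a log-sum-exp formulation avoids overflow for large scores.
    p_concat = (n_n * p_n + n_od * p_od) / (n_n + n_od)
    return -math.log(p_concat)

def is_abnormal(score_concat, threshold):
    """Evaluate the target as abnormal when the concatenated score exceeds the threshold."""
    return score_concat > threshold

# Usage example with made-up score values and the data counts from the experiment section.
score = concatenate_scores(score_n=30.0, score_od=-20.0, n_n=369, n_od=10)
print(score, is_abnormal(score, threshold=0.0))
```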
Next, learning processing performed in an initial stage by the evaluation apparatus 1 will be described.
In the learning processing in the initial stage, the acceptance unit 120 accepts an input of the initial learning data, and the model generation unit 121 learns the accepted initial learning data to generate the learning data VAE model 111 and stores the model parameters of the generated model in the storage unit 11.
Next, evaluation processing of the evaluation apparatus 1 will be described.
In the evaluation processing, the acceptance unit 120 first accepts an input of the communication data of the evaluation target (step S11). Subsequently, the evaluation unit 123 estimates the probability density of the accepted evaluation data by using the VAE model or models stored in the storage unit 11.
At this time, before the excess detection data is fed back, the storage unit 11 stores only the learning data VAE model 111. In this case, the evaluation unit 123 applies the learning data VAE model 111 to the first VAE and estimates the probability density of the evaluation data. In addition, in a case where the excess detection data is already fed back, the storage unit 11 stores both the learning data VAE model 111 and the excess detection VAE model 112. In this case, the evaluation unit 123 applies the learning data VAE model 111 to the first VAE 1251 and applies the excess detection VAE model 112 to the second VAE 1252, and estimates the probability density of the evaluation data in each of the VAEs.
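A small sketch of this case distinction, reusing the hypothetical anomaly_score and concatenate_scores helpers from the earlier sketches (the expected input is a single evaluation sample shaped as a one-row batch):

```python
def concatenated_anomaly_score(x, vae_n, vae_od=None, n_n=0, n_od=0):
    """Score with only the learning data model before feedback, or with both models after."""
    score_n = float(anomaly_score(vae_n, x))   # x: tensor of shape (1, in_dim)
    if vae_od is None:                         # before the excess detection data is fed back
        return score_n
    score_od = float(anomaly_score(vae_od, x))
    return concatenate_scores(score_n, score_od, n_n, n_od)
```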
Subsequently, the evaluation unit 123 calculates the probability density obtained by concatenating two probability densities with each other (step S14). The first probability density is the probability density estimated by applying the learning data VAE model 111, and the second probability density is the probability density estimated by applying the excess detection VAE model 112. Specifically, in the evaluation unit 123, the following two anomaly scores are applied to Formula (1), and the concatenated anomaly score is calculated. The first anomaly score is the anomaly score estimated by the first VAE 1251 to which the concatenation unit 124 applies the learning data VAE model 111, and the second anomaly score is the anomaly score estimated by the second VAE 1252 to which the concatenation unit 124 applies the excess detection VAE model 112.
Then, in the evaluation unit 123, the anomaly existence evaluation unit 126 evaluates the presence or absence of the anomaly of the communication data of the evaluation target based on the probability density calculated in step S14, and outputs the evaluation result (step S15). In a case where the concatenated anomaly score calculated by the concatenation unit 124 is higher than the predetermined value, the anomaly existence evaluation unit 126 evaluates the communication data of the evaluation target as abnormal.
Subsequently, the control unit 12 determines whether or not the learning instruction of the excess detection data is received (step S16). For example, an administrator analyzes a detection result output from the evaluation unit 123, and in a case where communication data that is detected as abnormal but is actually normal exists, classifies this communication data as the excess detection data. Then, when a predetermined number of pieces of excess detection data are collected, the administrator feeds back the collected excess detection data to the evaluation apparatus 1 and instructs it to learn this excess detection data. Alternatively, an external apparatus analyzes the detection result output from the evaluation unit 123 and classifies the relevant communication data as the excess detection data. Then, when a predetermined number of pieces of the classified communication data are collected, the excess detection data of the learning target is fed back from the external apparatus, and the learning instruction of the excess detection data is also input.
In a case where the control unit 12 determines that the learning instruction of the excess detection data is received (step S16: Yes), the acceptance unit 120 accepts an input of the excess detection data of the learning target (step S17). Subsequently, the model generation unit 121 learns the input excess detection data and newly generates the excess detection VAE model 112 (step S18). Alternatively, the model generation unit 121 learns the fed-back excess detection data and updates the model parameters of the excess detection VAE model 112 (step S18).
In a case where it is determined that the excess detection data learning instruction is not received (step S16: No) or after the processing in step S18 is ended, the control unit 12 determines whether or not an end instruction of the evaluation processing is received (step S19). In a case where it is determined that the end instruction of the evaluation processing is not received (step S19: No), the control unit 12 returns to step S11 and accepts the next input of the evaluation data. In a case where it is determined that the end instruction of the evaluation processing is received (step S19: Yes), the control unit 12 ends the evaluation processing.
For example, the evaluation apparatus 1 according to the present embodiment can be applied to the anomaly detection of the IoT device.
In the evaluation apparatus 1, the model generation unit 121 receives an initial learning data set and an excess detection data set which are set as learning targets, and stores learned models obtained by learning the received data sets in the storage unit 11.
At this time, the learned model applied to the concatenation unit 124 may be the learning data VAE model 111 in which the initial learning data is learned, or may be the excess detection VAE model 112 in which the excess detection data is learned. In addition, a plurality of learning data VAE models 111-1 and 111-2 in which mutually different pieces of learning data are learned may also be applied to the concatenation unit 124 (see an arrow Y11). Of course, only a single learning data VAE model may also be applied to the concatenation unit 124.
Then, a plurality of excess detection VAE models 112-1 and 112-2 in which mutually different pieces of excess detection data are learned may also be applied to the concatenation unit 124 (see an arrow Y12). Of course, since the excess detection VAE model is not generated before the excess detection data is fed back, a configuration may also be adopted where the excess detection VAE model is not applied to the concatenation unit 124. In addition, as described above, only the single excess detection VAE model may also be applied to the concatenation unit 124.
In a case where a plurality of models are applied, the concatenation unit 124 concatenates the anomaly scores by the plurality of applied models with each other based on the following Formula (2).
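Under the same reading as for Formula (1), a natural form of Formula (2) generalizing the concatenation to K applied models is shown below; again, this is a reconstruction consistent with the description of the weights rather than a verbatim reproduction.

```latex
% Reconstruction of Formula (2) for K concatenated models
\[
  \mathrm{score}_{\mathrm{concat}}
    = -\log\frac{\sum_{k=1}^{K} N_{k}\,\exp(-\mathrm{score}_{k})}
                {\sum_{k=1}^{K} N_{k}}
\]
```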
where score_k denotes the score output by the k-th model, and N_k denotes the number of pieces of data learned by the k-th model. In other words, when the anomaly existence evaluation unit 126 performs the evaluation with regard to the evaluation data, a value of Formula (2) is obtained as the concatenated anomaly score. In this manner, the concatenation unit 124 can also concatenate two or more models at the model level with one another.
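A short runnable sketch of this K-model generalization, with the function name and example numbers being illustrative assumptions:

```python
import math

def concatenate_k_scores(scores, counts):
    """Generalize the concatenation to K models: data-count-weighted mixture of K densities."""
    total = sum(counts)
    p_concat = sum(n * math.exp(-s) for s, n in zip(scores, counts)) / total
    return -math.log(p_concat)

# e.g. two learning data models and one excess detection model (made-up values)
print(concatenate_k_scores(scores=[28.0, 31.5, -22.0], counts=[200, 169, 10]))
```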
As described above, at the time of the initial learning, the evaluation apparatus 1 inputs the initial learning data to the model generation unit 121 and obtains the learning data VAE model 111. Then, in the course of the evaluation processing, until some excess detections are discovered, the evaluation apparatus 1 inputs only the learning data VAE model 111 to the concatenation unit 124, and continues sequentially evaluating the traffic information obtained from the network.
Then, in a case where the excess detection is discovered, the evaluation apparatus 1 inputs the data set of the excess detection data to the model generation unit 121, and generates the excess detection VAE model 112 in which the excess detection data is learned. Thereafter, the evaluation apparatus 1 inputs the learning data VAE model 111 and the excess detection VAE model 112 to the concatenation unit 124, and continues sequentially evaluating the traffic information similarly obtained from the network.
The evaluation apparatus 1 sequentially repeats these processes of the excess detection discovery, the excess detection data learning, and the model concatenation, and continues improving the detection precision.
Next, the evaluation method in the related art will be described.
In the evaluation method in the related art, a mixed data set is created by combining the initial learning data set used for the initial learning with the data set in which the excess detection occurred, and a single VAE model is learned again by using the mixed data set. In this case, when the amount of the excess detection data is substantially smaller than that of the initial learning data, the excess detection data cannot be precisely learned.
Therefore, at the time of the evaluation, the VAE model in the related art indicates a low anomaly score for the communication data equivalent to the large amount of learning data, whereas the excess detection data, of which only a small amount is learned, still tends to receive a high anomaly score.
In view of the above, results obtained by evaluating traffic session data between actual IoT devices with the evaluation method in the related art and with the evaluation method according to the present embodiment are described. The learning data corresponds to camera communication (369 pieces of data), and the excess detection data corresponds to SSH communication (10 pieces of data).
As the initial learning, an evaluation result in a case where the VAE model is generated by learning the camera communication will be described. That is, this is the result when the evaluation is performed by using the VAE model in which only the camera communication corresponding to the initial learning data is learned, before the feedback of the excess detection data. In this case, the average score of the learning data was −25.2625. Since the excess detection data was not learned, the average score of the excess detection data was as high as 268.530. The time spent for the learning was 13.452 (sec).
Subsequently, the evaluation result after the feedback learning of the excess detection data is performed by using the evaluation method in the related art will be described. In this case, the average score of the learning data was −16.3808. The average score of the excess detection data was still as high as 44.6441; although some improvement was attained as compared with the case before the feedback of the excess detection data, the precision remained low. The time spent for the relearning was 14.157 (sec), which was longer than the time spent for the initial learning.
In contrast to this, the evaluation result after the feedback learning of the excess detection data is performed by using the evaluation method according to the present embodiment will be described. In this case, the average score of the learning data was −25.2625. The average score of the excess detection data was substantially improved to −24.0182 as compared with the evaluation method in the related art. Furthermore, the time spent for the relearning was substantially shortened to 3.937 (sec) as compared with the evaluation method in the related art.
In this manner, according to the present embodiment, the probability density of the evaluation data is estimated by using the learning data VAE model in which the normal learning data is learned and the excess detection VAE model in which the excess detection data is learned, and the presence or absence of the anomaly of the evaluation data is evaluated based on the estimated probability density. That is, according to the present embodiment, separately from the learning data VAE model in which the normal learning data is learned, the excess detection VAE model in which only the excess detection data is learned through the feedback is generated, and the evaluation is performed based on the probability density obtained by concatenating the probability densities estimated by the two generated VAE models with each other.
According to the evaluation method in the related art, the excess detection data cannot be precisely learned, the large amount of initial learning data needs to be saved for the feedback learning of the excess detection data, and a time equal to or longer than that of the initial learning is needed to generate the VAE model again.
In contrast to this, in the evaluation apparatus 1 according to the present embodiment, only the number of pieces of the initial learning data Ds needs to be saved for the feedback learning of the excess detection data. In addition, in the evaluation apparatus 1, as illustrated by the evaluation experiment results described above, only the small amount of excess detection data needs to be learned in the course of the evaluation processing, and the learning time can be substantially shortened as compared with the learning of the large amount of initial learning data. Furthermore, in the evaluation apparatus 1, as also illustrated by the evaluation experiment results described above, even when the number of pieces of excess detection data and the number of pieces of learning data are biased, the excess detection data can be evaluated in a highly precise manner.
Therefore, according to the present embodiment, the small amount of excess detection data is efficiently fed back to reduce the occurrence of the excess detection, and it is possible to evaluate the presence or absence of the anomaly of the communication data in a highly precise manner.
The respective components of the respective apparatuses illustrated in the drawings are functional concepts, and are not necessarily required to be physically configured as illustrated in the drawings. That is, specific aspects of the distribution and integration of the respective apparatuses are not limited to those illustrated in the drawings, and all or a part of the apparatuses can be configured by being functionally or physically distributed or integrated in any units in accordance with various types of loads, use situations, and the like. Furthermore, all or a part of the respective processing functions performed in the respective apparatuses may be realized by a CPU and programs analyzed and executed by the CPU, or may be realized as hardware based on wired logic.
In addition, among the respective processes described according to the present embodiment, all or a part of the processes described as being automatically performed can also be manually performed, and all or a part of the processes described as being manually performed can also be automatically performed by a known method. In addition, the processing procedures, control procedures, specific names, and various types of data and parameters illustrated in the above document and drawings can be changed as desired unless otherwise specified.
The evaluation apparatus 1 can be implemented by, for example, a computer including a memory 1010, a CPU 1020, a hard disc drive interface 1030, a disc drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. The memory 1010 includes a read only memory (ROM) 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a basic input output system (BIOS). The hard disc drive interface 1030 is connected to a hard disc drive 1090. The disc drive interface 1040 is connected to a disc drive 1100. For example, a detachable storage medium such as a magnetic disc or an optical disc is inserted into the disc drive 1100. The serial port interface 1050 is connected, for example, to a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected, for example, to a display 1130.
The hard disc drive 1090 stores, for example, an operating system (OS) 1091, an application program 1092, a program module 1093, and program data 1094. That is, programs that define the respective processes of the evaluation apparatus 1 are implemented as the program module 1093 in which codes that can be executed by a computer are described. The program module 1093 is stored, for example, in the hard disc drive 1090. For example, the program module 1093 configured to execute the processing similar to the functional configuration in the evaluation apparatus 1 is stored in the hard disc drive 1090. It is noted that the hard disc drive 1090 may also be substituted by a solid state drive (SSD).
In addition, setting data used in the processing according to the aforementioned embodiment is stored as the program data 1094 in the memory 1010 or the hard disc drive 1090, for example. Then, the CPU 1020 appropriately reads out the program module 1093 or the program data 1094 stored in the memory 1010 or the hard disc drive 1090 onto the RAM 1012 for execution.
It is noted that the program module 1093 and the program data 1094 are not only stored in the hard disc drive 1090, but may also be stored in a detachable storage medium and read out by the CPU 1020 via the disc drive 1100 or the like, for example. Alternatively, the program module 1093 and the program data 1094 may also be stored in another computer connected via a network (a LAN, a wide area network (WAN), or the like). Then, the program module 1093 and the program data 1094 may be read out from the other computer by the CPU 1020 via the network interface 1070.
The embodiment to which the invention made by the inventor of the present invention is applied has been described above, but the present invention is not intended to be limited by the description and drawings constituting a part of the disclosure of the present invention based on the present embodiment. That is, the other embodiments, examples, operation technologies, and the like implemented based on the present embodiment by those skilled in the art and the like are all included in the scope of the present invention.
Number | Date | Country | Kind
---|---|---|---
2018-117456 | Jun 2018 | JP | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/JP2019/024167 | 6/18/2019 | WO | 00