This application claims priority pursuant to 35 U.S.C. § 119 from Japanese Patent Application No. 2020-010366, filed on Jan. 24, 2020, the entire disclosure of which is incorporated herein by reference.
The present invention relates to an information processing system and a method for controlling the information processing system, and particularly, to a technology for performing inference by utilizing machine learning.
In recent years, in various fields such as a retail industry and a manufacturing industry, an information processing system for performing inference by utilizing machine learning has been introduced. In such an information processing system, it is required to continuously maintain inference accuracy throughout actual operation. U.S. Patent Application Publication No. 2019/0156247 discloses a technique of evaluating accuracy for inference results performed by each of a plurality of machine learning models and selecting a machine learning model on the basis of the evaluation.
In an information processing system utilizing machine learning, when degradation of inference accuracy is detected in a certain inference environment, improvement of the inference accuracy may be expected by performing retraining using the data acquired as an inference target (hereinafter, referred to as “inference data”) as the data to be trained on (hereinafter, referred to as “training data”). In addition, improvement of the inference accuracy or processing efficiency may be expected by sharing the retrained machine learning model (hereinafter, referred to as “inference model”) with other inference environments. However, if the cause of the accuracy degradation is, for example, a trend change in the inference data that is unique to a certain inference environment, applying the inference model retrained with the inference data acquired in that inference environment as the training data to other inference environments may actually degrade the inference accuracy in those other environments.
The technique disclosed in U.S. Patent Application Publication No. 2019/0156247 is premised on the assumption that the inference data does not have a trend change unique to a particular inference environment, as in natural language processing or image recognition. For this reason, it is difficult to apply the technique to a use case in which the trend of the inference data differs for each inference environment, for example, a case where future sales are forecasted using an inference environment prepared for each store, with sales data transmitted from a plurality of stores as the inference data.
Under the background described above, the present invention provides an information processing system and a method for controlling the information processing system, capable of securing the inference accuracy in each inference environment for a machine learning system that performs inference using a plurality of inference environments.
An aspect of the present invention to achieve the above objective is an information processing system comprising: a plurality of inference units that perform inference by inputting data to one or more inference models; an inference accuracy evaluation unit that evaluates inference accuracy of the inference units; a training unit that generates a new inference model by training on the data input to a first inference unit when degradation of the inference accuracy is detected in the first inference unit; a factor determination unit that determines a factor of the degradation of the inference accuracy; and a deployment determination unit that determines, on the basis of the determined factor, whether or not the new inference model is applied to a second inference unit other than the first inference unit.
Other problems and solutions thereof disclosed in the present application will become more apparent by reading description of the embodiments of the present invention and the accompanying drawings.
According to the present invention, it is possible to secure the inference accuracy under each inference environment in a machine learning system that performs inference using a plurality of inference environments.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In the following description, like or similar reference numerals denote like or similar elements, and repeated description thereof is omitted for simplicity. In addition, suffixes such as letters of the alphabet may be appended to a common reference numeral to distinguish between configurations of the same type. In the following description, the letter “S” preceding a reference numeral denotes a processing step. Machine learning may be abbreviated as “ML”, and the “machine learning model” is also referred to as an “inference model”.
The router 5a performs inference by allocating inference data to at least one of the inference models m1 and m2 applied to the inference environment 2a. In addition, the router 5b performs inference by allocating inference data to at least one of the inference models m1 and m2 applied to the inference environment 2b. Note that the routers 5a and 5b are not essential, and the machine learning models may also be fixedly allocated to the terminal devices 4a and 4b.
In the illustrated information processing system, for example, it is assumed that a trend of the inference data transmitted from the terminal device 4b changes (S1), and accordingly, inference accuracy degradation is detected in the inference performed by the inference model m2 (S2). In this case, for example, retraining is performed using the inference data as the training data in the training environment 3 (S3), and a new inference model m3 generated from the retraining is applied to the inference environment 2a (S4). In addition, when the same inference model m3 is also used in the inference environment 2b, the new inference model m3 is also applied to the inference environment 2b (S5).
Here, if degradation of the inference accuracy detected in a certain inference environment (S2) causes the retrained new inference model m3 to be applied to other inference environments as well, the following problems may occur. That is, in the aforementioned example, if the trend of the inference data changes only in the terminal device 4b that uses the inference environment 2a, and the trend of the inference data transmitted from the terminal devices 4c and 4d that use the inference environment 2b does not change, applying the new inference model m3 to the inference environment 2b may actually degrade the inference accuracy in the inference environment 2b. In addition, when the new inference model m3 is used in, for example, a so-called ensemble algorithm (ensemble model) that obtains the best result by using the inference results of a plurality of inference models, applying the new inference model m3 to the inference environment 2b may degrade the inference accuracy and may wastefully consume computational resources or time required for the inference.
In this regard, the information processing system according to this embodiment determines a factor of the degradation of the inference accuracy, and determines, on the basis of the determined factor, whether or not the retrained new inference model m3 is applied to each inference environment 2.
In such a mechanism, the trained new inference model m3 is applied only to the inference environment 2 by which improvement of the inference accuracy may be expected, and it is possible to prevent degradation of the inference accuracy caused by applying the new inference model m3 to the inference environment 2 by which improvement of the inference accuracy may not be expected. In addition, it is possible to prevent the inference processing from being performed unnecessarily and prevent computational resources or time required for the inference from being consumed wastefully.
The terminal device 4 transmits, for example, actual record values such as sales data as the inference data to the inference server 500. Note that the terminal device 4 transmits the inference data to the inference server 500, for example, along with an inference execution request (inference request). When the inference server 500 receives the inference data, the received inference data is input to the inference model allocated to this inference data (the terminal device 4 as a transmission source of the inference data) to perform an inference processing such as future sales prediction. When degradation of the inference accuracy is detected, the training server 600 generates a new inference model by training the inference data input to the inference model having the degraded inference accuracy. The management server 700 determines an application method (countermeasure method) for the new inference model depending on the factor of the inference accuracy degradation, and applies the new inference model to the inference environment 2 using the determined method.
The inference environment 2a includes an IT infrastructure 400 that provides the inference server 500, a management network 800, and a data network 810. The inference servers 500 existing in the inference environments 2a and 2b are communicably connected via the data network 810. The training environment 3 includes an IT infrastructure 400 that realizes the training server 600 and the management server 700, a management network 800, and a data network 810. The inference server 500, the training server 600, and the management server 700 are communicably connected via the management network 800. In addition, the inference server 500 and the training server 600 are communicably connected via the data network 810. The management network 800 is usually used for management of the inference server 500 or the training server 600. The data network 810 is usually used for communication performed between the inference server 500 and the training server 600 when a service is actually provided to the terminal device 4 (in actual operation). The terminal device 4 is communicably connected to the inference server 500 via the wide area network 820 or the data network 810.
The communication network (the management network 800, the data network 810, and the wide area network 820) consists of, for example, communication infrastructures such as a WAN (Wide Area Network), a LAN (Local Area Network), the Internet, leased lines, and public communication networks. The configuration of the communication network described above is merely an example and is not limited thereto.
The information processing apparatus 10 includes, for example, a desktop personal computer, an office computer, a mainframe, a mobile communication terminal (such as a smart phone, a tablet, a wearable terminal, and a notebook personal computer), or the like. The information processing apparatus 10 may be realized by using virtual information processing resources provided on the basis of a virtualization technology, a process space separation technology, or the like, as in a virtual server provided by a cloud system. In addition, all or a part of the functions of the inference server 500, the training server 600, the management server 700, and the terminal device 4 may be realized, for example, by a service provided by a cloud system using an API (Application Programming Interface) or the like.
The processor 11 is configured of, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a GPU (Graphics Processing Unit), an AI (Artificial Intelligence) chip, an FPGA (Field Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or the like.
The main memory device 12 is a device that stores programs or data, and is configured of, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), a non-volatile memory (NVRAM (Non Volatile RAM)), or the like. The auxiliary memory device 13 includes, for example, an SSD (Solid State Drive), a hard disk drive, an optical memory device (such as CD (Compact Disc) or DVD (Digital Versatile Disc)), a storage system, a read/write device for a recording medium such as an IC card, an SD card, or an optical recording medium, a memory area of a virtual server, or the like. Programs or data may be read into the auxiliary memory device 13 using the recording medium reader device or the communication device 16. The programs or data stored (memorized) in the auxiliary memory device 13 are read into the main memory device 12 from time to time.
The input device 14 is an interface that receives an input from the outside, and includes, for example, a keyboard, a mouse, a touch panel, a card reader, a voice input device, or the like. The output device 15 is an interface that outputs various types of information such as a processing progress and a processing result. The output device 15 includes, for example, a display device (such as a liquid crystal monitor, an LCD (Liquid Crystal Display), or a graphic card) that visualizes the various types of information described above, a device (such as a voice output device (speaker)) that converts the various types of information described above into speech, or a device (such as a printer) that converts the various types of information described above into characters. The output device 15 constitutes a user interface along with the input device 14. Note that, for example, the information processing apparatus 10 may be configured to input or output information to/from other devices (such as a smart phone, a tablet, a notebook computer, or various types of portable information terminals) via the communication device 16.
The communication device 16 realizes communication with other devices. The communication device 16 is a wireless or wired communication interface that realizes communication with other devices via a communication network (including at least any one of the management network 800, the data network 810, and the wide area network 820), and includes, for example, a NIC (Network Interface Card), a radio communication module, a USB (Universal Serial Bus) module, a serial communication module, or the like. Subsequently, the functions of each device will be described.
The memory unit 510 functions as a repository that stores and manages the inference model group 5110 and the inference model allocation table 5120. The memory unit 510 stores these data, for example, as a database table provided by a DBMS or a file provided by a file system.
The inference model group 5110 includes one or more inference models generated by a machine learning algorithm from training data. The inference model is, for example, a model that predicts a future value of time series data using a regression equation, a model that classifies images using a DNN (Deep Neural Network), or the like.
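As an illustration only of the first kind of model mentioned above (a regression equation over time series data), the following sketch fits a linear regression on lag features and forecasts the next value; the data, the number of lags, and the use of scikit-learn are assumptions for this example and not the specific models of the embodiment.

```python
# Toy sketch: forecast the next value of a time series with a regression equation
# built over lag features (illustrative only; not the embodiment's actual model).
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_lag_regressor(series: np.ndarray, n_lags: int = 3) -> LinearRegression:
    # Build rows of n_lags consecutive values and regress onto the next value.
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    y = series[n_lags:]
    return LinearRegression().fit(X, y)

sales = np.array([10.0, 12.0, 13.0, 15.0, 16.0, 18.0, 19.0])
model = fit_lag_regressor(sales)
print(model.predict(sales[-3:].reshape(1, -1)))  # forecast of the next value
```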
The inference model allocation table 5120 includes information on allocation of the inference data transmitted from the terminal device 4 to the inference model.
In the case of the illustrated inference model allocation table 5120, for example, the inference data transmitted from the terminal device 4 whose terminal device ID 5121 is “client001” is input to the inference model whose inference model ID 5122 is “model001”. In addition, the inference model API endpoint 5123 is set to a URL whose domain name indicates the inference environment in which the inference server 500 executing the inference model is installed.
Different instances of the same inference model may be executed in a plurality of inference environments.
Note that, although the inference models to which the inference data transmitted from the terminal device 4 are input are managed by using the inference model allocation table 5120 as described above in this embodiment, they may also be managed by other methods. For example, when name resolution of the inference model that processes inference data is performed using DNS (Domain Name System), network addresses allocated to different inference models (APIs) may be returned to each terminal device 4.
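As a minimal sketch of how such an allocation might be resolved at inference time, the following assumes the inference model allocation table 5120 is held as a simple in-memory list of records; the record field names, client IDs, and endpoint URLs are hypothetical.

```python
# Minimal sketch (not the patented implementation): resolve a terminal device ID to
# the inference model ID(s) and API endpoint(s) allocated to it.
from typing import Dict, List

# Hypothetical contents of the inference model allocation table 5120.
ALLOCATION_TABLE: List[Dict[str, str]] = [
    {"terminal_device_id": "client001", "inference_model_id": "model001",
     "inference_model_api_endpoint": "https://inference-env-a.example.com/api/model001"},
    {"terminal_device_id": "client002", "inference_model_id": "model002",
     "inference_model_api_endpoint": "https://inference-env-b.example.com/api/model002"},
]

def resolve_models(terminal_device_id: str) -> List[Dict[str, str]]:
    """Return every (model ID, endpoint) record allocated to the terminal device."""
    return [r for r in ALLOCATION_TABLE if r["terminal_device_id"] == terminal_device_id]

print(resolve_models("client001"))
```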
The inference unit 520 inputs the inference data transmitted from the terminal device 4 to the inference model allocated to that terminal device 4 in the inference model allocation table 5120, and returns the obtained inference result to the terminal device 4.
Note that the method of transmitting the inference data from the terminal device 4 to the inference unit 520 is not necessarily limited. For example, the API provided by the inference unit 520 may be called from the terminal device 4. In addition, for example, the terminal device 4 may store inference data in a memory area of a storage accessible by both the terminal device 4 and the inference unit 520, and access information (such as a connection target or authentication information) to the inference data stored in the storage may be transmitted from the terminal device 4 to the inference unit 520. Then, the inference unit 520 may acquire the inference data from the storage using the access information when it receives an inference request from the terminal device 4.
The inference unit 520 and the inference model allocation table 5120 may be deployed only in any one of the inference servers 500 for each of the two inference environments 2a and 2b. In addition, the inference server 500 that stores the inference unit 520 and the inference model allocation table 5120 and the inference server 500 that stores the inference model group 5110 may be deployed in different information processing apparatuses. A relationship between the inference unit 520, the inference server 500, and the inference environment 2 is not necessarily limited. For example, the inference unit 520 may be realized by a plurality of inference servers 500. Furthermore, the inference unit 520 and the inference environment 2 may or may not correspond to each other on a one-to-one basis.
The memory unit 610 functions as a repository that stores and manages the training data group 6110. The memory unit 610 stores the training data group 6110, for example, as a database table provided by DBMS or a file provided by a file system. The training data group 6110 includes data serving as a generating source of the training data (hereinafter, referred to as “generator data”) and training data generated by the preprocessing unit 620 on the basis of the generator data. The generator data is, for example, inference data acquired from the terminal device 4.
The preprocessing unit 620 performs various preprocessings on the generator data to generate training data and evaluation data. The preprocessing includes, for example, a processing for imputing missing values in the generator data, a processing for normalizing the generator data, a processing for extracting feature amounts, a processing for dividing the generator data into training data and evaluation data, and the like.
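A hedged sketch of one way such preprocessing could be realized with pandas and scikit-learn is shown below; the imputation strategy, the target column name, and the split ratio are assumptions for illustration only.

```python
# Illustrative sketch of the preprocessing described above: missing-value
# imputation, normalization, and division into training and evaluation data.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def preprocess(generator_data: pd.DataFrame, target_column: str = "sales"):
    # Impute missing values (forward fill, then column mean as a fallback).
    data = generator_data.ffill().fillna(generator_data.mean(numeric_only=True))

    features = data.drop(columns=[target_column])
    target = data[target_column]

    # Normalize the feature columns.
    scaler = StandardScaler()
    features = pd.DataFrame(scaler.fit_transform(features),
                            columns=features.columns, index=features.index)

    # Divide into training data and evaluation data (time order preserved).
    return train_test_split(features, target, test_size=0.2, shuffle=False)
```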
The training unit 630 performs machine learning on the basis of the training data to generate an inference model. The algorithm for generating the inference model is not necessarily limited. For example, the algorithm includes DNN (Deep Neural Network), various regression analyses, time series analyses, ensemble learning, and the like.
The evaluation unit 640 evaluates the performance of the inference model using the evaluation data. The type of the performance of the inference model or the method of evaluating the performance of the inference model is not necessarily limited. For example, the performance type of the inference model includes accuracy, fairness, and the like. For example, the method of evaluating the inference model includes a method of using a mean square error or a mean absolute error with respect to an actual value, or a coefficient of determination as an evaluation index.
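The kind of evaluation mentioned above can be expressed, for example, with the scikit-learn metrics for mean squared error, mean absolute error, and the coefficient of determination; the sample values below are purely illustrative.

```python
# Sketch of accuracy evaluation: compare predicted values with actual values.
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate_inference_model(y_true, y_pred) -> dict:
    return {
        "mse": mean_squared_error(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }

print(evaluate_inference_model([100, 120, 90], [110, 115, 95]))
```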
In the following description, a program or data for realizing the processing for training of the inference model (the processings of each of the preprocessing unit 620, the training unit 630, and the evaluation unit 640) will be referred to as “ML code”. The ML code is updated, for example, when the effective feature amount changes. The ML code may be activated by, for example, a person (such as a developer of the inference model), or may be automatically executed by sequentially calling the ML code using predetermined software. In addition, for example, the predetermined software may execute the ML code under various conditions (algorithm selection or parameter setting) to automatically select the inference model having the highest evaluation.
The memory unit 710 functions as a repository that stores and manages a data trend management table 7110, an inference accuracy management table 7120, an ML code management table 7130, an inference model deployment management table 7140, an inference data/result group 7150, an ML code group 7160, and an inference model group 7170. The memory unit 710 stores such information (data), for example, as a database table provided by DBMS or a file provided by the file system. The memory unit 710 may further store programs or data for realizing the function of managing the ML code or the inference model. For example, the memory unit 710 may store a program that displays a trend of the inference data and a temporal change of the inference accuracy.
The data trend management table 7110 includes information indicating the result of grouping the trends of the inference data transmitted from the terminal device 4 to the inference server 500.
The data trend determination unit 720 determines the trend of the inference data transmitted from the terminal device 4 to the inference server 500. The data trend determination unit 720 stores a result of determination for the trend of the inference data in the data trend management table 7110.
The inference accuracy evaluation unit 730 evaluates accuracy of the inference result of the inference model and detects whether or not the inference accuracy degrades. The inference accuracy evaluation unit 730 manages the evaluation result in the inference accuracy management table 7120.
The ML code deployment unit 740 deploys the ML code included in the ML code group 7160 on the training server 600. The ML code deployment unit 740 manages a relationship between the ML code and the training server 600 on which the ML code is deployed in the ML code management table 7130.
The factor determination unit 750 determines a factor that causes degradation of the inference accuracy when the inference unit 520 detects the degradation of the inference accuracy.
The deployment determination unit 760 determines deployment of the inference model stored in the inference model group 7170 on the inference server 500 or allocation of the inference model to the terminal device 4. The deployment determination unit 760 manages a deployment status of the inference model on the inference server 500 in the inference model deployment management table 7140.
Subsequently, a processing performed by the information processing system 100 will be described.
When the inference data is received from the terminal device 4 along with the inference request (S1311), the inference unit 520 acquires the inference model ID and the inference model API endpoint 5123 corresponding to the terminal device ID of the terminal device 4 from the inference model allocation table 5120 (S1312). Note that, although it is assumed that the terminal device ID is contained, for example, in the inference data transmitted from the terminal device 4, the terminal device ID is not limited thereto, and may be specified on the basis of other methods.
Subsequently, the inference unit 520 transmits inference data to the acquired API endpoint and requests the API to perform inference (S1313). Note that the method of performing the inference request is not necessarily limited. If a plurality of inference model API endpoints 5123 are acquired in S1312, the inference unit 520 inputs the inference data to all the acquired endpoints.
Subsequently, the inference unit 520 acquires the result of the inference performed by the API to which the inference data was input (S1314). Note that, as a method of returning the inference result from the inference model to the inference unit 520, the inference model may return the inference result to the inference unit 520 in a synchronous manner as a response to the API call made by the inference unit 520, or in an asynchronous manner separately from the API call.
Subsequently, the inference unit 520 returns the inference result to the terminal device 4 (S1315). Note that, if a plurality of inference model API endpoints 5123 are acquired in S1312, the inference unit 520 returns, for example, the plurality of inference results received from the respective inference models to the terminal device 4. Alternatively, the inference unit 520 may integrate the plurality of inference results and return the integrated result to the terminal device 4. Examples of such integration include a case where the inference unit 520 acquires, along with each inference result, a score indicating the likelihood of the inference and returns the inference result having the highest score to the terminal device 4, and a case where the result agreed upon by the largest number of the acquired inference results is returned (majority decision). The inference result may be returned to the terminal device 4 in a synchronous manner as a response to the inference request received from the terminal device 4, or may be returned in an asynchronous manner separately.
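The two integration strategies just described (highest-likelihood selection and majority decision) can be sketched as follows; the record format with a "result" and a "score" field is an assumption for illustration.

```python
# Minimal sketch of integrating a plurality of inference results.
from collections import Counter
from typing import Any, Dict, List

def integrate_by_score(results: List[Dict[str, Any]]) -> Any:
    """Return the inference result whose likelihood score is highest."""
    return max(results, key=lambda r: r["score"])["result"]

def integrate_by_majority(results: List[Dict[str, Any]]) -> Any:
    """Return the inference result agreed upon by the largest number of models."""
    counts = Counter(r["result"] for r in results)
    return counts.most_common(1)[0][0]

results = [{"result": "A", "score": 0.7},
           {"result": "B", "score": 0.9},
           {"result": "A", "score": 0.6}]
print(integrate_by_score(results), integrate_by_majority(results))  # -> B A
```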
Subsequently, the inference unit 520 stores the inference data received from the terminal device 4 in S1311 and the inference result acquired from the inference model in S1314 in the inference data/result group 7150 of the management server 700 (S1316). Examples of the storage method include a method in which the management server 700 provides an API for storing the inference data and the inference result in the inference data/result group 7150 and the inference unit 520 calls the API, and a method in which the inference data/result group 7150 is shared between the inference server 500 and the management server 700 via a file sharing protocol or the like and the inference unit 520 writes the inference data and the inference result as files. However, the storage method is not limited thereto, and any other method may also be employed. Thus, the inference processing S1300 is terminated.
First, the data trend determination unit 720 determines groups having similar trends for the inference data stored in the inference data/result group 7150 (for example, newly stored inference data) (S1411). As a determination method, for example, groups having similar trends may be determined by clustering the inference data in a multidimensional space whose axes correspond to the data items of the inference data. However, the determination method is not limited thereto, and any other method may also be employed.
Subsequently, the data trend determination unit 720 stores the determination result and the determination date/time in the data trend management table 7110 (S1412). Thus, the data trend determination processing S1400 is terminated.
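One possible realization of the grouping in S1411 is ordinary k-means clustering over the data items of the inference data, as sketched below; the feature columns, the number of groups, and the use of scikit-learn are assumptions for illustration.

```python
# Sketch: cluster inference data into data trend groups with k-means.
import numpy as np
from sklearn.cluster import KMeans

def assign_trend_groups(inference_data: np.ndarray, n_groups: int = 3) -> np.ndarray:
    """Return a data trend group ID for each row of inference data."""
    model = KMeans(n_clusters=n_groups, n_init=10, random_state=0)
    return model.fit_predict(inference_data)

# Example: each row is inference data aggregated per terminal device.
data = np.array([[100.0, 5.0], [102.0, 4.8], [300.0, 20.0], [310.0, 21.0]])
print(assign_trend_groups(data, n_groups=2))  # e.g. [0 0 1 1]
```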
First, the inference accuracy evaluation unit 730 evaluates the inference accuracy of the inference results in the inference data/result group 7150 (S1511). The method of evaluating the inference accuracy includes, for example, a method in which a person views and evaluates the inference result and the evaluation is received via a user interface, a method of comparing a predicted value obtained as the inference result with an actually measured value, and the like. However, any other method may also be employed for the evaluation.
Subsequently, the inference accuracy evaluation unit 730 stores the inference accuracy evaluation result and the evaluation date/time in the inference accuracy management table 7120 (S1512).
Subsequently, the inference accuracy evaluation unit 730 determines whether or not the inference accuracy of the inference model that output the inference result of the evaluation target has degraded (S1513). The determination method includes, for example, a method of comparing the inference accuracy with a predetermined threshold value and determining that the inference accuracy has degraded if the inference accuracy is lower than the threshold value, and a method of determining that the inference accuracy has degraded when the amount of degradation from the previous inference accuracy is larger than a predetermined threshold value. However, any other method may also be employed for the determination. If the inference accuracy evaluation unit 730 determines that the inference accuracy has degraded (S1513: YES), the process advances to S1514. Otherwise (S1513: NO), the inference accuracy evaluation processing S1500 is terminated.
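The two degradation checks just described can be sketched as follows; the threshold values are illustrative assumptions, not values prescribed by the embodiment.

```python
# Sketch of S1513: detect degradation against an absolute accuracy threshold or
# against a maximum allowed drop from the previous evaluation.
from typing import Optional

def accuracy_degraded(current_accuracy: float,
                      previous_accuracy: Optional[float] = None,
                      min_accuracy: float = 0.8,
                      max_drop: float = 0.05) -> bool:
    if current_accuracy < min_accuracy:
        return True
    if previous_accuracy is not None and (previous_accuracy - current_accuracy) > max_drop:
        return True
    return False

print(accuracy_degraded(0.75))        # True: below the absolute threshold
print(accuracy_degraded(0.85, 0.95))  # True: dropped by more than max_drop
print(accuracy_degraded(0.90, 0.92))  # False
```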
In S1514, the inference accuracy evaluation unit 730 calls the accuracy degradation countermeasure determination processing S1600 of the deployment determination unit 760. Details of the accuracy degradation countermeasure determination processing S1600 will be described below. After the accuracy degradation countermeasure determination processing S1600 is executed, the inference accuracy evaluation processing S1500 is terminated.
First, the factor determination unit 750 determines whether or not a change of the effective feature amount is a factor of the degradation of the inference accuracy (S1611). The determination method is not necessarily limited. For example, there is a method disclosed in “A Unified Approach to Interpreting Model Predictions”, S. Lundberg et al., Neural Information Processing Systems (NIPS), 2017. If the factor determination unit 750 determines that a change of the effective feature amount is the factor of the degradation of the inference accuracy (S1611: YES), the process advances to S1612. Otherwise (S1611: NO), the process advances to S1621.
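The cited paper corresponds to SHAP values. As a hedged sketch only (the shap package, the comparison of mean absolute SHAP values on older versus newer inference data, and the threshold are assumptions, not the embodiment's prescribed method), a change of the effective feature amount might be checked as follows.

```python
# Sketch: compare per-feature SHAP importance on old and new inference data.
import numpy as np
import shap

def feature_importance(model, X: np.ndarray) -> np.ndarray:
    """Mean absolute SHAP value per feature."""
    explainer = shap.Explainer(model, X)
    shap_values = explainer(X)
    return np.abs(shap_values.values).mean(axis=0)

def effective_features_changed(model, X_old, X_new, threshold: float = 0.3) -> bool:
    old_imp = feature_importance(model, X_old)
    new_imp = feature_importance(model, X_new)
    # Normalize and declare a change when the importance distribution shifts strongly.
    old_norm = old_imp / max(old_imp.sum(), 1e-12)
    new_norm = new_imp / max(new_imp.sum(), 1e-12)
    return float(np.abs(old_norm - new_norm).sum()) > threshold
```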
In S1612, the deployment determination unit 760 notifies a person such as a developer of the inference model (by outputting an alert) that the degradation of the inference accuracy of the inference model is caused by a change of the effective feature amount, thereby prompting updating of the ML code. Note that, when the training server 600 can execute software that runs the ML code under various conditions (conditions corresponding to selection of the algorithm or the parameters) and selects the inference model having the highest evaluation, the deployment determination unit 760 may execute the software at this timing.
Subsequently, the deployment determination unit 760 executes the ML code deployment processing S1613 of the ML code deployment unit 740 to deploy the ML code on the training server 600 (S1613). The ML code deployment processing S1613 will be described below.
Subsequently, the deployment determination unit 760 executes the ML code deployed by the ML code deployment processing S1613 to generate a new inference model depending on the change of the effective feature amount (S1614). The deployment determination unit 760 stores the new inference model in the inference model group 7170.
Subsequently, the deployment determination unit 760 deploys the new inference model generated in S1614 on the inference server 500 of the inference environment 2a and the inference server 500 of the inference environment 2b, and stores the inference server ID of each inference server 500, the inference model ID of the model, and the inference model API endpoint in the inference model deployment management table 7140 (S1615).
Subsequently, the deployment determination unit 760 updates the inference model allocation table 5120 and allocates the model generated in S1614 to all of the terminal devices 4 that use the inference model having the degraded inference accuracy (S1616). That is, the deployment determination unit 760 compares the inference model ID of the inference model having the degraded inference accuracy with the inference model ID 5122 of the inference model allocation table 5120, and, for each record in which both inference model IDs match, stores the inference model ID and the inference model API endpoint of the model generated in S1614 in the inference model ID 5122 and the inference model API endpoint 5123, respectively. After this processing is executed, the accuracy degradation countermeasure determination processing S1600 is terminated, and the process returns to the inference accuracy evaluation processing S1500.
In S1621, the deployment determination unit 760 executes the ML code to generate a new inference model. The deployment determination unit 760 stores the generated new inference model in the inference model group 7170.
In S1622, the deployment determination unit 760 deploys the new inference model generated in S1621 on the inference server 500 of the inference environment 2 on which the inference model having the degraded inference accuracy is deployed, and stores the inference server ID of the inference server 500, the inference model ID of the inference model, and the inference model API endpoint in the inference model deployment management table 7140. In this case, the inference model ID and the inference model API endpoint may be overwritten, or a record may be added without overwriting. In the case of overwriting, for example, inference is performed using the inference model generated in S1621 instead of the inference model having the degraded inference accuracy. In addition, when a record is added, inference is performed by an ensemble algorithm that uses both the inference model having the degraded inference accuracy and the new inference model.
Subsequently, the deployment determination unit 760 refers to the inference model allocation table 5120 and specifies the terminal device 4 to which the inference model having the degraded inference accuracy is allocated (S1623). That is, the deployment determination unit 760 compares the inference model ID of the inference model having the degraded accuracy with the inference model ID of the inference model ID 5122 of the inference model allocation table 5120, and specifies the terminal device ID of the record in which both the inference model IDs match.
Subsequently, the deployment determination unit 760 refers to the data trend management table 7110 and specifies the terminal device 4 having a change of the trend of the transmitted inference data among the terminal devices 4 specified in S1623 (S1624). That is, the deployment determination unit 760 compares the terminal device ID specified in S1623 with the terminal device ID of the terminal device ID 7111 of the data trend management table 7110. In addition, for the records in which both the terminal device IDs match, the deployment determination unit 760 determines whether or not the data trend group ID of the data trend group ID 7112 changes during a predetermined period, and specifies the terminal device 4 of the terminal device ID of the record for which the data trend group ID changes as a terminal device 4 having a change of the trend of the inference data.
Subsequently, the deployment determination unit 760 refers to the data trend management table 7110 and specifies the terminal devices 4 belonging to the same data trend group as the terminal device 4 having a trend change in the inference data (S1625). That is, the deployment determination unit 760 compares the terminal device ID of the terminal device 4 specified in S1624 with the terminal device ID 7111 of the data trend management table 7110, acquires the data trend group ID of the record in which both terminal device IDs match, specifies other records having the same data trend group ID, and specifies the terminal devices 4 of the terminal device IDs of the specified records as terminal devices 4 belonging to the same data trend group as the terminal device 4 having a trend change in the inference data.
Subsequently, the deployment determination unit 760 updates the inference model allocation table 5120 and allocates the new inference model generated in S1621 to the terminal devices 4 specified in S1624 and S1625 (S1626). That is, the deployment determination unit 760 compares the terminal device IDs of the terminal devices 4 specified in S1624 and S1625 with the terminal device ID 5121 of the inference model allocation table 5120, and, for each record in which both terminal device IDs match, stores the inference model ID and the inference model API endpoint of the new inference model generated in S1621 in the inference model ID 5122 and the inference model API endpoint 5123, respectively. Note that the inference model allocation table 5120 may also be updated, for example, by providing, in the middle of the data trend determination processing S1400, a processing step for determining whether or not the data trend has changed, and performing the update when it is determined that the data trend has changed. After this processing is executed, the accuracy degradation countermeasure determination processing S1600 is terminated, and the process returns to the inference accuracy evaluation processing S1500.
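A minimal sketch of the allocation update in S1626, assuming the same in-memory record structure as in the earlier allocation-table sketch (field names and URLs are hypothetical), is shown below.

```python
# Sketch: rewrite allocation records so the specified terminal devices point at the
# newly generated inference model and its API endpoint.
from typing import Dict, Iterable, List

def reallocate_model(allocation_table: List[Dict[str, str]],
                     target_terminal_ids: Iterable[str],
                     new_model_id: str,
                     new_endpoint: str) -> None:
    targets = set(target_terminal_ids)
    for record in allocation_table:
        if record["terminal_device_id"] in targets:
            record["inference_model_id"] = new_model_id
            record["inference_model_api_endpoint"] = new_endpoint

table = [
    {"terminal_device_id": "client001", "inference_model_id": "model002",
     "inference_model_api_endpoint": "https://inference-env-b.example.com/api/model002"},
    {"terminal_device_id": "client002", "inference_model_id": "model002",
     "inference_model_api_endpoint": "https://inference-env-b.example.com/api/model002"},
]
reallocate_model(table, ["client001"], "model003",
                 "https://inference-env-b.example.com/api/model003")
print(table)
```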
In S1721, the ML code deployment unit 740 monitors the ML code group 7160. Subsequently, the ML code deployment unit 740 determines whether or not the ML code in the ML code group 7160 has been updated (whether or not the ML code has been updated to a content corresponding to the change of the effective feature amount) (S1722). Note that the ML code update includes adding a new ML code, deleting an existing ML code, changing an existing ML code, and the like. If the ML code deployment unit 740 determines that the ML code has been updated (S1722: YES), the process advances to S1723. If the ML code deployment unit 740 determines that the ML code has not been updated (S1722: NO), the process returns to S1721. In this case, in order to prevent the training server 600 from being overloaded by this processing, the processing may be suspended for a predetermined time before returning to S1721.
In S1723, the ML code deployment unit 740 deploys the updated ML code on the training server 600. Then, the ML code deployment processing S1613 is terminated, and the process advances to S1614 of the accuracy degradation countermeasure determination processing S1600.
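A hedged sketch of the monitoring loop in S1721 through S1723 follows; the repository interface (a version-returning callable), the deployment callback, and the polling interval are assumptions for illustration only.

```python
# Sketch: poll the ML code repository and deploy when an update is detected.
import time
from typing import Callable

def watch_ml_code(get_version: Callable[[], str],
                  deploy: Callable[[str], None],
                  poll_interval_sec: float = 60.0) -> None:
    last_version = get_version()
    while True:
        current = get_version()
        if current != last_version:       # S1722: update detected
            deploy(current)               # S1723: deploy on the training server
            last_version = current
        time.sleep(poll_interval_sec)     # pause to avoid overloading the training server
```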
As described above in detail, when the information processing system 100 according to this embodiment detects that the inference accuracy of an inference model has degraded, it determines a factor of the degradation of the inference accuracy of the inference model. If a change of the effective feature amount is the factor of the degradation, for example, a new inference model corresponding to the change of the effective feature amount is generated using the ML code updated by a developer of the inference model or the like, and the generated new inference model is deployed on each inference server 500 of the inference environments 2. Otherwise, if a change of the effective feature amount is not the factor of the degradation, the information processing system 100 generates a new inference model corresponding to the inference data for which the inference accuracy has degraded, and deploys the new inference model on the inference server 500 of the same inference environment 2 as that of the inference model having the degraded inference accuracy. In addition, the information processing system 100 allocates the new inference model to the terminal device 4 to which the inference model having the degraded inference accuracy is allocated and whose inference data has a trend change, and to the terminal devices 4 belonging to the same data trend group as that terminal device 4. In this manner, the information processing system 100 according to this embodiment appropriately determines the application method of the new inference model depending on the factor of the accuracy degradation of the inference model. Therefore, it is possible to improve the inference accuracy in each of a plurality of inference environments without degrading the inference accuracy or wastefully increasing the load or time required for the inference.
While the embodiments of the present invention have been described hereinbefore, the present invention is not limited to the embodiments described above and encompasses various modifications. In addition, the embodiments have been described in detail for ease of understanding, and the present invention is not necessarily limited to a configuration comprising all of the elements described above. Furthermore, a part of the configuration of each embodiment may be added to, deleted from, or substituted with another configuration.
Each of the aforementioned configurations, functions, processing units, processing means, and the like may be realized by hardware by designing a part or all of them, for example, using an integrated circuit. In addition, they may be realized by a program code of the software that realizes each function of the embodiments. In this case, a storage medium recording the program code is provided to the information processing apparatus (computer), and a processor included in the information processing apparatus reads the program code stored in the storage medium. In this case, the program code itself read from the storage medium realizes the functions of the aforementioned embodiments, and the program code itself and the storage medium storing the program code constitute the present invention. The storage medium for supplying such a program code may include, for example, a hard disk, SSD (Solid State Drive), optical disk, magneto-optical disk, CD-R, flexible disk, CD-ROM, DVD-ROM, magnetic tape, a non-volatile memory card, ROM or the like.
In the embodiments described above, the control lines and information lines indicate what is considered necessary for the explanation, and not all the control lines and information lines on a product are necessarily illustrated. In practice, almost all the configurations may be considered to be connected to each other. In addition, although various types of information are shown in a table format in the aforementioned description, such information may be managed in any format other than the table.