This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2018-081900, filed on Apr. 20, 2018, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a computer-readable recording medium, a machine learning method, and a machine learning device.
Predicting mentally unwell conditions of employees a few months in advance based on their attendance record data, and taking actions such as counseling at an early stage to prevent them from taking a suspension of work (sick leave), has been performed. Generally, dedicated staff members perform a visual check to find an employee whose work conditions exhibit feature patterns such as frequent business trips, long overtime, repeated sudden absences, absence without notice, and combinations of these patterns. It is difficult to clearly define these feature patterns, because the dedicated staff members may each have their own standards. In recent years, machine learning using a decision tree, a random forest, an SVM (Support Vector Machine), or the like has been performed to learn feature patterns specific to mentally unwell conditions including sickness and to automatically provide a prediction that has conventionally been made by the dedicated staff members. Examples of related art are described in International Publication Pamphlet No. WO 2017/073373, Japanese Laid-open Patent Publication No. 2010-198189, Japanese Laid-open Patent Publication No. 2016-151979, and Japanese Laid-open Patent Publication No. 2005-332345.
With the above-described technique, however, learning of a wrong feature pattern may proceed in some cases and degrade the accuracy of learning. For example, there is a possibility that wrong learning proceeds by recognizing taking days off (not making attendance) on holidays such as Saturdays, Sundays, and national holidays, or on a summer vacation or a refresh vacation supplied by a company, as a feature pattern common to many persons, thereby overlooking a true feature pattern that has a large effect on unwell conditions and preventing true learning from proceeding.
According to an aspect of an embodiment, a non-transitory computer-readable recording medium has stored therein a program that causes a computer to execute a process. The process includes: receiving attendance record data for a plurality of employees, the attendance record data corresponding to a period of a calendar and including a plurality of records each having a plurality of items; first generating exclusion data by excluding, from the attendance record data, a record corresponding to an individual holiday that is set differently by the employees and a record corresponding to a common holiday set commonly to the employees; second generating, based on the generated exclusion data, tensor data in which a tensor is created with calendar information and the items as different dimensions; and performing deep learning of a neural network and learning of a method of tensor decomposition with respect to a learning model in which the tensor data is subjected to the tensor decomposition as input tensor data to be input to the neural network.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Preferred embodiments will be explained with reference to the accompanying drawings. These embodiments are, however, not intended to limit the scope of the present invention in any way. Moreover, it is possible to combine the embodiments with one another as appropriate within a scope without inconsistency.
To be specific, the learning device 100 generates a learning model using Deep Tensor (registered trademark), which performs deep learning (DL) on graph-structured data, with attendance record data (label: with sick leave) of an unwell person who has taken sick leave (suspension of work) and attendance record data (label: without sick leave) of a person who has not taken sick leave (suspension of work) as supervised data. Then, the learning device 100 uses the learning model, to which the learning findings are applied, to implement inference of an accurate event (label) with respect to new graph-structured data.
For example, the learning device 100 acquires attendance record data constituted of a plurality of records for a plurality of employees, the records each corresponding to a period of a calendar and having a plurality of items. The learning device 100 then excludes, from the attendance record data, a record corresponding to an individual holiday that is set individually by each of the employees and a record corresponding to a holiday on which attendance is not made among specified holidays including common holidays set commonly to the employees, such as Saturdays, Sundays, and national holidays, and generates tensor data in which a tensor is created with calendar information and the items as different dimensions. After that, the learning device 100 performs deep learning of a neural network and learning of a method of tensor decomposition with respect to a learning model in which the tensor data is subjected to the tensor decomposition as input tensor data to be input to the neural network. In this manner, the learning device 100 generates a plurality of pieces of learning data from the attendance record data and generates a learning model that provides classification into "take sick leave" and "not take sick leave" based on the tensor data of each piece of learning data.
After that, the learning device 100 similarly processes the attendance record data of an employee who is to be determined: it excludes the holidays on which attendance is not made among the specified holidays, then creates a tensor to generate tensor data, and inputs the tensor data to the learned learning model. The learning device 100 then outputs an output value representing a prediction result of whether the target employee "takes sick leave" or "does not take sick leave".
That is, the learning device 100 excludes, from the learning data and the prediction data, the feature of a day that is a holiday on which attendance is not made. Holidays include common holidays set commonly to the employees, such as Saturdays, Sundays, and national holidays, and individual holidays that are set differently by the employees, such as a summer vacation, a refresh vacation, and other leave supplied by the company. When attendance is made on a holiday, the attendance is not excluded from the learning data and the prediction data but is left therein. In this manner, the learning device 100 focuses on the features of days on which attendance is made, thereby eliminating noise and improving the accuracy of learning.
The following explains Deep Tensor. Deep Tensor is deep learning that takes a tensor (graph information) as input. Deep Tensor extracts a partial graph structure that contributes to a determination while learning a neural network. This extraction processing is provided by learning parameters for the tensor decomposition of the input tensor data together with learning the neural network.
Next, the following explains a graph structure with reference to
Such processing to extract a partial graph structure is implemented by a mathematical operation referred to as tensor decomposition. Tensor decomposition is an operation that approximates an input n-th order tensor by a product of tensors of n-th or lower order. For example, the input n-th order tensor is approximated by a product of one n-th order tensor (referred to as a core tensor) and n tensors of order lower than n (when n > 2, second order tensors, that is, matrices, are normally used). This decomposition is not unique, and any desired partial graph structure in the graph structure represented by the input data is able to be included in the core tensor.
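For reference, such a decomposition can be written in the standard Tucker form (a generic formulation given for illustration, not necessarily the exact operation used by Deep Tensor):

$$\mathcal{X} \approx \mathcal{G} \times_1 A^{(1)} \times_2 A^{(2)} \cdots \times_n A^{(n)},$$

where $\mathcal{X}$ is the input n-th order tensor, $\mathcal{G}$ is the core tensor, $A^{(1)}, \ldots, A^{(n)}$ are factor matrices, and $\times_k$ denotes the mode-$k$ product.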
The following explains learning of Deep Tensor.
The learning device 100 executes learning of a prediction model using the extended error back-propagation method obtained by expanding the error back-propagation method. That is, the learning device 100 corrects the various kinds of parameters of the NN so that the classification error becomes smaller as the classification error is propagated toward the lower layers through the input layer, the intermediate layer, and the output layer included in the NN. Furthermore, the learning device 100 causes the classification error to be propagated to the target core tensor, correcting the target core tensor so as to be closer to a partial graph structure that contributes to the prediction, that is, a feature pattern representing a feature of a person who takes a suspension of work or a feature pattern representing a feature of a person who has not taken sick leave. This correction allows a partial pattern that contributes to the prediction to be extracted to the optimized target core tensor.
When a prediction is performed, an input tensor is converted by tensor decomposition into a core tensor (a partial pattern of the input tensor), the core tensor is input to the neural network, and a prediction result is thereby obtained. The tensor decomposition converts the core tensor so as to be similar to the target core tensor. That is, a core tensor having a partial pattern that contributes to the prediction is extracted.
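As a rough illustration of this prediction flow only, the following Python sketch pairs a generic Tucker decomposition (via the tensorly library) with an assumed nn_forward callable wrapping a learned network; Deep Tensor's actual decomposition is learned jointly with the network, which this sketch does not reproduce, and the rank values are assumptions.

```python
# Minimal sketch: decompose the input tensor into a core tensor
# (a partial pattern), then feed the core tensor to the network.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

def predict_sick_leave(input_tensor: np.ndarray, nn_forward) -> float:
    # Ordinary Tucker decomposition as a stand-in for the learned one.
    core, _factors = tucker(tl.tensor(input_tensor), rank=[2, 8, 2, 2])
    # Input the flattened core tensor to the neural network.
    return nn_forward(tl.to_numpy(core).ravel())
```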
Functional Configuration
The communication unit 101 is a processing unit that controls communication with other devices and is provided by, for example, a communication interface. For example, the communication unit 101 receives a process start instruction, attendance record data, and the like from a terminal of an administrator. The communication unit 101 also outputs learning results, prediction results of prediction target data, and the like to the administrator terminal.
The storage unit 102 is an example of a storage device, such as a memory or a hard disk, that stores therein computer programs and data. This storage unit 102 stores therein an attendance record data DB 103, a tensor DB 104, a learning result DB 105, and a prediction target DB 106.
The attendance record data DB 103 is a database that stores therein attendance record data concerning the attendance of employees or the like, input by a user or the like. The attendance record data to be stored is constituted of a plurality of records for a plurality of employees, the records each corresponding to a period of a calendar and having a plurality of items. Furthermore, the attendance record data is organized based on the attendance records used in respective companies, and is able to be obtained from various kinds of well-known attendance management systems or the like.
It is noted that, in the classification item of attendance/absence, a value corresponding to any one of items such as coming to the office, sick leave (suspension of work), accumulated holiday, paid holiday, and refresh vacation is set. In the item of with/without business trip, a value corresponding to whether a business trip is taken is stored. It is noted that the above-described values are able to be distinguished with numbers or the like. For example, it is possible to distinguish the values in such a manner as coming to the office=0, sick leave=1, accumulated holiday=2, and paid holiday=3. It is noted that a record unit of the attendance record data may be not only a daily unit but also a weekly or a monthly unit. Moreover, in accordance with a case where it is possible to take leave in an hourly unit, a value of hourly leave=4 may be set.
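As a minimal sketch, the numeric codes mentioned above could be represented in Python as follows; the mapping is taken from the examples in the text, and any structure beyond that is an assumption.

```python
from enum import IntEnum

class Attendance(IntEnum):
    """Codes for the attendance/absence classification item."""
    OFFICE = 0        # coming to the office
    SICK_LEAVE = 1    # sick leave (suspension of work)
    ACCUMULATED = 2   # accumulated holiday
    PAID = 3          # paid holiday
    HOURLY = 4        # hourly leave, where hourly units are supported

# with/without business trip as a simple 0/1 value
BUSINESS_TRIP = {False: 0, True: 1}
```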
The attendance record data to be stored is learning data and has a teacher label attached thereto.
The tensor DB 104 is a database that stores therein the tensors (tensor data) generated from the attendance record data of the employees. This tensor DB 104 stores therein training data in which tensors and labels are associated with each other. For example, the tensor DB 104 stores therein "tensor data 1, label (with sick leave)", "tensor data 2, label (without sick leave)", and the like, as "tensor data, label".
It is noted that the settings of the record items and the labels of the tensor data in the above-described learning data are merely examples, and are not limited to values and labels such as "with sick leave" and "without sick leave". It is also possible to use various types of values and labels that are able to distinguish the existence of an unwell person, such as "a person who takes a suspension of work" and "a person who has not taken a suspension of work", or "with a suspension of work" and "without a suspension of work".
The learning result DB 105 is a database that stores therein a learning result. For example, the learning result DB 105 stores a determination result (classification result) of learning data by the control unit 110, various parameters of NN and various parameters of Deep Tensor learned through machine learning or deep learning, and the like.
The prediction target DB 106 is a database that stores therein attendance record data of a target for which the existence of sick leave (suspension of work) is predicted using a learned prediction model. For example, the prediction target DB 106 stores therein attendance record data of a prediction target, tensor data generated from the attendance record data of the prediction target, and the like.
The control unit 110 is a processing unit that manages the whole processing of the learning device 100 and is, for example, a processor. This control unit 110 includes an exclusion unit 111, a tensor generator 112, a learning unit 113, and a prediction unit 114. It is noted that the exclusion unit 111, the tensor generator 112, the learning unit 113, and the prediction unit 114 are examples of processes executed by the processor or the like, or of electronic circuits included in the processor or the like.
The exclusion unit 111 is a processing unit that generates exclusion data by excluding, from the attendance record data, a record corresponding to the individual holidays that are set differently by the employees and a record corresponding to the common holidays set commonly to the employees. That is, the exclusion unit 111 excludes, from the attendance record data to be learned, each specified holiday (relevant holiday) on which attendance is not made, among specified holidays including common holidays such as Saturdays, Sundays, and national holidays, individual holidays that are supplied by the company and set differently by the employees, such as a summer vacation and a refresh vacation, and other days specified by the company as holidays. Specifically, the exclusion unit 111 reads out the attendance record data stored in the attendance record data DB 103, excludes from it the specified holidays (relevant holidays) on which attendance is not made, and outputs the attendance record data after the exclusion to the tensor generator 112.
For example, the exclusion unit 111 refers to the attendance/absence item in the attendance record data to locate specified holidays such as Saturdays, Sundays, and national holidays. The exclusion unit 111 then specifies, among the located specified holidays, each specified holiday whose attendance/absence does not fall on "attendance". Then, the exclusion unit 111 excludes (deletes) the specified holiday so identified as a relevant holiday from the attendance record data, and outputs the attendance record data after the exclusion to the tensor generator 112.
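A minimal sketch of this exclusion step follows, reusing the Attendance codes sketched earlier; the record representation (a dict with "attendance" and a precomputed "is_specified_holiday" flag) is an assumption.

```python
def exclude_relevant_holidays(records):
    """Drop specified holidays on which attendance was not made.

    Holidays on which the employee did come to the office are kept,
    since holiday work is itself a meaningful feature.
    """
    return [
        r for r in records
        if not (r["is_specified_holiday"]
                and r["attendance"] != Attendance.OFFICE)
    ]
```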
The tensor generator 112 is a processing unit that generates tensor data by creating a tensor from each piece of attendance record data input from the exclusion unit 111, that is, attendance record data from which the specified holidays (relevant holidays) on which attendance is not made have been excluded. The tensor generator 112 creates a tensor with calendar information and each of the items of "month, date, attendance/absence, with/without business trip, attendance time, and leave time" included in each piece of attendance record data as a dimension. The tensor generator 112 stores, in the tensor DB 104, the created tensor (tensor data) in association with a label (with or without sick leave) specified by a user or the like, or a label (with or without sick leave) determined based on the attendance/absence in the attendance record data. Learning is executed by Deep Tensor with the generated tensor data as input. It is noted that Deep Tensor, during learning, extracts a target core tensor that identifies a partial pattern of the learning data having an influence on the prediction, and executes the prediction based on the extracted target core tensor when the prediction is performed.
The tensor generator 112 generates a tensor from learning data with items that are assumed to characterize a tendency of taking sick leave, such as frequent business trips, long overtime, repeated sudden absences, absence without notice, frequent holiday work, or a combination of any of these items, as dimensions. For example, the tensor generator 112 generates a fourth order tensor of four dimensions using the four elements of month, date, attendance/absence, and with/without business trip. When four months of data are used, the element count for month is "4"; the element count for date is "31", based on the fact that the maximum number of days in a month is 31; the element count for attendance/absence is "3", based on the fact that the types of attendance/absence are coming to the office, leave, and holiday; and the element count for with/without business trip is "2", based on the fact that a business trip is either taken or not taken. Thus, a tensor generated from the learning data is a "4×31×3×2" tensor; the value of an element corresponding to the attendance/absence and with/without business trip items on the months and dates appearing in the learning data is 1, and the value of an element not corresponding to any of those items is 0. Any desired item is selectable as a dimension of the tensor, or dimensions may be determined based on past events.
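The following NumPy sketch builds the 4×31×3×2 tensor described above; the record field names and the condensed 0-2 attendance-type codes for this dimension are assumptions. Note that dates deleted as relevant holidays simply remain all-zero (empty) elements, which matches the empty-element handling described later.

```python
import numpy as np

def build_tensor(records, num_months=4):
    # Dimensions: month x date x attendance-type x business-trip.
    t = np.zeros((num_months, 31, 3, 2))
    for r in records:
        m = r["month_index"]      # 0..3 within the observed period
        d = r["day"] - 1          # 0..30
        a = r["attendance_type"]  # 0: office, 1: leave, 2: holiday
        b = r["business_trip"]    # 0: without, 1: with
        t[m, d, a, b] = 1.0       # mark the item the record falls on
    return t
```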
In the first embodiment, the above-described tensor is described in a simplified form as in
The learning unit 113 is a processing unit that performs deep learning of the neural network and learning of a method of tensor decomposition with respect to a learning model in which the tensor data is subjected to tensor decomposition as input tensor data, so as to be input to the neural network (NN). That is, the learning unit 113 executes learning of the learning model by Deep Tensor with the tensor data generated from each piece of the learning data and the label as input. Specifically, similarly to the method explained in
The prediction unit 114 is a processing unit that predicts, using the learning results, a label of data to be determined. Specifically, the prediction unit 114 reads out the various kinds of parameters from the learning result DB 105 and builds Deep Tensor, including the neural network in which the various kinds of parameters are set. The prediction unit 114 reads out the attendance record data to be predicted from the prediction target DB 106, creates a tensor therefrom after excluding the relevant holidays in the same manner as the exclusion unit 111, and inputs the created tensor to Deep Tensor. After that, the prediction unit 114 outputs a prediction result indicating with or without sick leave. The prediction unit 114 then displays the prediction result on a display or transmits it to the administrator terminal.
It is noted that the prediction unit 114 is able to perform prediction either when the attendance record data to be predicted is input to the learning model as it is, with the relevant holidays retained therein, or when it is input after the relevant holidays are deleted therefrom. Moreover, it is possible to input the attendance record data to be predicted after dividing it into six-month units, or to input it as it is.
Next, the following describes, using
In the example of
Furthermore, in the example of
The number of days remaining per month may differ depending on differences in the number of days in each month, the periods of individually selected holidays, and whether attendance is made on holidays. In this case, the tensor data would differ in size. Therefore, a portion corresponding to a date not having a value in a tensor is treated as an empty element, so that different months have the same date size in the tensor representation.
Specifically, as illustrated in
Deep Tensor enables such an element to be excluded from the target from which a core tensor is extracted by setting its weight to zero, as described above. As illustrated in
It is noted that, in general machine learning, the input of a feature vector having a fixed length is a prerequisite, and it is not possible to correctly perform the learning processing simply by deleting the portions of the relevant holidays. One conceivable approach is to complement elements for the deleted number of days, but an improvement in the accuracy of learning is not expectable even with such complementation.
As illustrated in
In the example of
In the example of
In this manner, in general machine learning, when the relevant holidays are deleted, empty elements are complemented because of the restriction that the input data has a fixed length. However, the complemented empty elements are similar among pieces of learning data, resulting in degradation of the accuracy of learning.
Processing Flow
Then, when there exists a specified holiday on which attendance is not made (S103: Yes), the exclusion unit 111 deletes the specified holiday on which attendance is not made from the attendance record data (S104), and the tensor generator 112 inserts an empty element into the deleted portion to create a tensor from the attendance record data (S105).
By contrast, when there is not any specified holiday on which attendance is not made (S103: No), the tensor generator 112 creates a tensor from the attendance record data as it is (S106).
After that, when there exists unprocessed attendance record data (S107: Yes), S102 and the subsequent steps are repeated for the next piece of attendance record data. By contrast, when there is no unprocessed attendance record data (S107: No), the learning unit 113 executes the learning process using the tensors and labels stored in the tensor DB 104 (S108).
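The flow S102 to S108 can be sketched as follows, reusing the helpers sketched earlier; the learn() routine and the shape of the inputs are assumptions standing in for the Deep Tensor learning process.

```python
def learning_flow(all_attendance_data, labels):
    """One pass over the labeled attendance record data (S102-S108)."""
    tensor_db = []
    for records, label in zip(all_attendance_data, labels):  # S102, S107
        kept = exclude_relevant_holidays(records)            # S103-S104
        # Deleted dates remain empty (zero) elements in the tensor,
        # covering both the S105 and S106 branches.
        tensor_db.append((build_tensor(kept), label))
    learn(tensor_db)                                         # S108
```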
Effect
As described above, the learning device 100 is capable of executing the learning processing with the feature pattern of specified holidays on which attendance is not made excluded from the learning data, so that it is possible to execute the learning process by extracting a feature pattern of days other than the specified holidays, instead of a feature pattern including the specified holidays, which are an item common to the pieces of learning data. As a result, the learning device 100 is capable of preventing learning of a wrong feature pattern from proceeding and thereby preventing degradation of the accuracy of learning.
The learning device 100 is also capable of setting a priority of prediction results by using a learning model (prediction model) from which holiday information such as the specified holidays is excluded, together with a learning model including the holiday information.
Then, the learning device 100 inputs the attendance record data of a prediction target to the learning model A and the learning model B and, according to the prediction results from the respective learning models, determines the risk of sick leave of the employee. Specifically, the learning device 100 determines a result of "both learning models predict an unwell condition" to be high risk, a result of "only one of the learning models predicts an unwell condition" to be middle risk, and a result of "neither learning model predicts an unwell condition" to be low risk.
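A minimal sketch of this two-model risk determination follows; the boolean model outputs and the function shape are assumptions.

```python
def risk_level(pred_a: bool, pred_b: bool) -> str:
    """pred_a / pred_b: True if that model predicts an unwell condition."""
    hits = int(pred_a) + int(pred_b)
    return {2: "high", 1: "middle", 0: "low"}[hits]
```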
In the example of
Although the embodiments of the present invention have been explained, the present invention may be implemented in various kinds of different aspects in addition to the above-described embodiments.
Learning
The above-described learning process may be executed any desired number of times. For example, the learning process may be executed using all pieces of the training data, or may be executed a certain number of times. Furthermore, as a method for calculating the classification error, a known calculation method such as the least squares method may be employed, or a general calculation method used in an NN may be employed. It is noted that learning the weights and the like of a neural network by inputting tensor data to the neural network, so as to be able to classify an event (for example, with sick leave and without sick leave) using the learning data, corresponds to an example of a learning model.
While the explanation is made with attendance record data for six months as example data used for prediction, the data is not limited thereto and may be optionally changed to attendance record data for four months or the like. Moreover, while the explanation is made of the example in which a label is attached to attendance record data for six months depending on whether sick leave (suspension of work) is taken within three months after the end of that period, the period is not limited thereto and may be optionally changed to within two months or the like. The order of the tensor data is not limited to the fourth order; tensor data below the fourth order may be generated, or tensor data of the fifth order or more may be generated. Not only attendance record data but also data of any other format may be used as long as it provides conditions of employees or the like, such as coming to the office, leaving the office, and taking leave.
In the above-described example, the exclusion unit 111 excludes, from the attendance record data, the relevant holidays, that is, holidays on which attendance is not made among the specified holidays; however, the exclusion is not limited thereto, and it is also possible to exclude all of the specified holidays.
Neural Network
In the second embodiment, various kinds of neural networks such as an RNN (Recurrent Neural Network) and a CNN (Convolutional Neural Network) may be used. As a method of learning, various kinds of known methods may be employed in addition to the error back-propagation method. A neural network has a multistage configuration including, for example, an input layer, an intermediate layer (hidden layer), and an output layer, each layer having a structure in which a plurality of nodes are tied with edges. Each layer has a function called an "activation function", each edge has a "weight", and the value of each node is calculated based on the values of the nodes in the previous layer, the weights of the connection edges (weighting factors), and the activation function of the layer. For the calculation, various kinds of known methods are able to be employed.
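For reference, the values of the nodes in a layer $l$ of such a network can be written in the standard feed-forward form (a generic formulation, not one specific to the embodiments):

$$a^{(l)} = f\!\left(W^{(l)} a^{(l-1)} + b^{(l)}\right),$$

where $a^{(l-1)}$ is the vector of node values in the previous layer, $W^{(l)}$ is the matrix of edge weights (weighting factors), $b^{(l)}$ is the bias, and $f$ is the activation function of the layer.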
Learning in a neural network refers to correcting the parameters, that is, the weights and biases, so that the output layer has a correct value. In the error back-propagation method, a "loss function" is determined that indicates how far the value of the output layer is from the proper (desired) condition, and the weights and biases are updated so that the loss function is minimized using the steepest descent method or the like.
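In the steepest descent method mentioned above, each weight and bias is updated in the direction that decreases the loss function $L$, in the standard form (again a generic formulation given for reference):

$$w \leftarrow w - \eta \frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b},$$

where $\eta$ is the learning rate.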
System
Process procedures, control procedures, specific names, and information including various kinds of data and parameters represented in the above description and drawings may be optionally changed unless otherwise specified. The specific examples, distributions, and numeric values explained in the embodiments are merely examples and may be optionally changed.
In addition, each component of each device illustrated in the drawings is a functional concept and is not always configured physically as illustrated in the drawings. That is, specific forms of the distribution and integration of each device are not limited to those in the drawings. In other words, all or part of the devices may be functionally or physically distributed or integrated in any desired unit according to various kinds of loads and operating conditions. Moreover, all or any desired part of the processing functions of the devices may be implemented by a CPU and a computer program analyzed and executed by the CPU, or may be implemented as hardware with wired logic.
Hardware
The communication device 100a is a network interface card or the like, and communicates with other servers. The HDD 100b stores therein computer programs and databases that implement the functions illustrated in
The processor 100d reads out, from the HDD 100b or the like, and loads into the memory 100c a computer program that executes the same processing as that executed by the processing units illustrated in
In this manner, the learning device 100 operates as an information processing device that executes the learning method by reading out and executing the computer program. Moreover, the learning device 100 is capable of implementing the same functions as those described in the above embodiments by causing a media reader to read out the computer program from a recording medium and executing the read computer program. It is noted that the computer program referred to in the other embodiments is not limited to being executed by the learning device 100. For example, the present invention is similarly applicable when another computer or server executes the computer program, or when such a computer and server execute the computer program in cooperation with each other.
This computer program is distributable via a network such as the Internet. Furthermore, this computer program may be stored in a computer-readable recording medium such as a hard disc, a flexible disc (FD), a compact disc read-only memory (CD-ROM), a magneto optical disk (MO), or a digital versatile disc (DVD), and may be executed by being read out from the recording medium.
According to the embodiments, it is possible to prevent the accuracy of learning from being degraded.
All examples and conditional language recited herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.