This invention was made with Japanese Government support awarded by the Japan Science and Technology Agency (JST). The Japanese Government has certain rights in this invention.
The present invention relates to a process for integrating results of works performed by multiple workers, and more particularly to an estimation method, an estimation system, a computer system and a program for estimating abilities of multiple workers in a process for integrating results of works performed by the workers.
Recently, attention has been paid to crowdsourcing in which works are entrusted to an unspecified large number of workers. In crowdsourcing, the same task is entrusted to a lot of workers, work results are integrated, and an integrated work result of the task is obtained. It is possible to obtain the work result with a higher quality by performing the integration appropriately.
When answers from many workers are integrated to obtain a correct answer, one simple method is to determine the correct answer by majority voting. Simple majority voting, however, ignores differences among the correct answer rates of the workers. Therefore, a technique has been proposed for improving the accuracy of the obtained answer by estimating the skills of the workers (for example, their correct answer rates) and the degree of difficulty of the works and thereby performing weighted integration.
In the conventional technique mentioned above, however, it is assumed that individual workers create answers from the beginning. Therefore, the technique is not sufficiently compatible with a step-by-step workflow in which succeeding-stage workers work based on a work result of a preceding-stage worker. In one example, modification may be performed based on a result of Automatic Speech Recognition (ASR) for work of voice or video transcription. In another example, a workflow includes a person modifying a result of Optical Character Recognition (OCR) for work of digitization of a book.
In the above workflows, there is a possibility that succeeding-stage workers are influenced by seeing the answer of a preceding-stage worker. For example, there is a possibility that a worker exists who has a tendency of believing an automatic recognition result and not modifying wrong recognition, and a worker who has, on the contrary, a tendency of being suspicious and excessively modifying even correct recognition.
From the background described above, there has been a demand for development of a technique capable of, in a workflow in which succeeding-stage workers work based on a work result of a preceding-stage worker for the same task, and work results of these workers are integrated, estimating the behaviors and skills of the workers more accurately and performing integration with higher accuracy.
In one embodiment, a computer-implemented method for estimating an ability of a worker in a process for integrating work results of multiple workers for the same task includes acquiring, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker that works based on the work result of the preceding-stage worker. The method also includes estimating multiple parameters of a probability model in which an ability of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker is introduced as a conditioned ability parameter, based on the multiple work results obtained for each of the one or more tasks.
In another embodiment, an estimation system for estimating an ability of a worker in a process for integrating work results of multiple workers for the same task includes an acquisition section configured to acquire, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker that works based on the work result of the preceding-stage worker. The estimation system also includes an estimation section configured to estimate multiple parameters of a probability model in which an ability of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker is introduced as a parameter, based on the multiple work results obtained for each of the one or more tasks.
According to yet another embodiment, a computer system for estimating an ability of a worker in a process for integrating work results of multiple workers for the same task includes a processor and a memory communicating with the processor. The processor is configured to acquire, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker that works based on the work result of the preceding-stage worker. The processor is also configured to estimate multiple parameters of a probability model in which an ability of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker is introduced as a parameter, based on the multiple work results obtained for each of the one or more tasks.
In another embodiment, a computer-readable program product is provided for estimating an ability of a worker in a process for integrating work results of multiple workers for the same task. The program product causes a computer system to function as an acquisition section configured to acquire, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker that works based on the work result of the preceding-stage worker. The program product also causes the computer system to function as an estimation section configured to estimate multiple parameters of a probability model in which an ability of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker is introduced as a parameter, based on the multiple work results obtained for each of the one or more tasks.
In another embodiment, a computer-readable program product for integrating work results of multiple workers for the same task causes a computer system to function as an acceptance section configured to present a work result of a preceding-stage worker to one or more succeeding-stage workers and accept a work result from each of the one or more succeeding-stage workers, for a task. The computer-readable program product also causes the computer system to function as a results integration section configured to estimate a work result to be obtained for the task as a result of integrating the work results of the multiple workers based on a probability model in which an ability of each succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker is introduced as a parameter.
Other aspects and embodiments of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrate by way of example the principles of the invention.
The present invention has been made in view of the inadequate points in the above conventional techniques, and an object of the embodiments described herein is to provide an estimation method, an estimation system, a computer system and a program capable of estimating abilities of succeeding-stage workers that may fluctuate according to the quality of a work result of a preceding-stage worker. Another object of the embodiments described herein is to provide a program for performing a process for integrating work results of multiple workers while estimating the abilities of the succeeding-stage workers that may fluctuate according to the quality of a work result of the preceding-stage worker.
In order to solve the problems described above, one embodiment described herein provides an estimation method having the following features. In the estimation method, a computer system acquires, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker that works based on the work result of the preceding-stage worker. Then, the computer system estimates multiple parameters of a probability model in which abilities of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker are introduced as a parameter based on the multiple work results obtained for each of the one or more tasks. Thereby, the abilities of multiple workers in a process for integrating work results of the multiple workers for the same task are estimated.
Further, according to one embodiment described herein, there is provided an estimation system for estimating abilities of multiple workers in a process for integrating work results of the multiple workers for the same task. The estimation system includes an acquisition section configured to acquire, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker that works based on the work result of the preceding-stage worker. The estimation system further includes an estimation section configured to estimate multiple parameters of a probability model in which abilities of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker are introduced as a parameter, based on the multiple work results obtained for each of the one or more tasks.
Furthermore, according to one embodiment described herein, there is provided a computer system for estimating abilities of multiple workers in a process for integrating work results of the multiple workers for the same task, the computer system comprising a processor and a memory communicating with the processor. The processor of the computer system is configured to acquire, for each of one or more tasks, a work result of a preceding-stage worker and a work result of a succeeding-stage worker and estimate multiple parameters of a probability model in which abilities of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker are introduced as a parameter, based on the multiple work results. Furthermore, according to one embodiment described herein, there is provided a computer-readable program for estimating abilities of multiple workers in a process for integrating work results of the multiple workers for the same task.
Furthermore, according to one embodiment described herein, there is provided a computer-readable program for integrating work results of multiple workers for the same task. This program causes a computer system to function as: an acceptance section configured to present a work result of a preceding-stage worker to one or more succeeding-stage workers and accept a work result from each of the one or more succeeding-stage workers, for a task; and a results integration section configured to estimate a work result to be obtained for the task as a result of integrating the work results of the multiple workers based on a probability model in which abilities of the succeeding-stage worker conditioned by a quality of the work result of the preceding-stage worker are introduced as a parameter.
By the above configuration, it becomes possible to estimate the ability of a succeeding-stage worker that may fluctuate according to the quality of a work result of a preceding-stage worker. Furthermore, it becomes possible to perform processing for integrating work results of multiple workers while estimating the abilities of the succeeding-stage workers that may fluctuate according to the quality of a work result of the preceding-stage worker.
Other effects and advantages of the embodiments described herein will be ascertained from the detailed description explained together with reference to accompanying drawings.
Although embodiments of the present invention are described below, embodiments of the present invention are not limited to the various embodiments described below. In the embodiments described below, a work results integration system 100 for estimating the abilities of multiple workers as an estimation system and integrating work results of the multiple workers will be described as an example.
Each worker terminal 104 is a terminal operated by a worker who processes a task allocated from the management server 110. The worker terminal 104 may be an information terminal such as a personal computer, a tablet computer or a smartphone although not especially limited thereto.
The management server 110 allocates tasks to workers and collects work results of the tasks from the workers. In the described embodiment, the management server 110 can redundantly allocate the same task to the multiple workers, integrate work results from the multiple workers and generate a final work result for the task.
Here, the task is not especially limited, and various information processing tasks, such as an image processing task, a voice processing task and a text processing task, are possible. Examples of the image processing task include a character recognition task for extracting characters or text from an image and an image classification task for classifying the kinds of images. Examples of the voice processing task include a voice recognition task for transcribing characters or text from voice attached data, such as voice data and motion picture data, and a voice classification task for classifying the kinds of voices. Examples of the text processing task include a sentence proofreading task, an inter-language translation task, and a metadata giving task for giving metadata to information such as a message or an article.
The management server 110 stores task data 112 in which the tasks to be allocated to the workers are described. Taking the character recognition task described above as an example, the task data includes image data of an image part that contains characters or text and is obtained by scanning a book or the like. Taking the voice recognition task as an example, the task data includes voice data of a voice part including utterances, recorded from communication voices, lecture video, or the like. Other kinds of tasks will not be described in further detail.
When receiving work results for a task transmitted from the worker terminals 104, the management server 110 stores the work results as a task log 114. The task log 114 includes a work result obtained from each of one or more workers for each of one or more tasks. For the character recognition task, the work result includes characters or text recognized from image data. For the voice recognition task, the work result includes characters or text obtained by transcription from voice attached data.
The management server 110 integrates the work results of the multiple workers for each task included in the task log 114, generates an integrated work result as a consensus of the workers who took charge of the task, and stores the integrated work result as work product data 116. For the character recognition task or the voice recognition task, the work product data 116 includes characters or text presumed as a consensus.
In the work results integration system 100 shown in
In such a system that integrates multiple redundant work results for the same task from multiple workers to obtain a final work product as described above, it is important to appropriately estimate the abilities of the workers. This is because the qualities of the work results differ according to the workers.
As a method of integrating the multiple work results, majority voting, weighted evaluation according to the abilities of the workers or the like can be performed. For example, the ability of each worker can be estimated as the correct answer rate of each worker. However, in such a step-by-step workflow that succeeding-stage workers work based on a work result of a preceding-stage worker, there is a possibility that the ability and behavior of each worker cannot be sufficiently evaluated by the correct answer rate for each worker described above. This is because there is a possibility that work results of the succeeding-stage workers are influenced by seeing the work result of the preceding-stage worker. Therefore, it is important to appropriately model the influence described above in order to improve the quality of an integrated work result.
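The difference between simple majority voting and weighted evaluation can be illustrated by the following minimal sketch. The function names, the worker identifiers and the log-odds weighting are assumptions made for illustration only; the embodiment does not prescribe this particular weighting.

```python
import math
from collections import defaultdict


def majority_vote(answers):
    """Return the answer given by the most workers (first seen wins a tie)."""
    counts = defaultdict(int)
    for answer in answers.values():
        counts[answer] += 1
    return max(counts, key=counts.get)


def weighted_vote(answers, accuracy):
    """Weight each vote by the log-odds of the worker's assumed correct answer rate."""
    scores = defaultdict(float)
    for worker, answer in answers.items():
        p = accuracy[worker]
        scores[answer] += math.log(p / (1.0 - p))
    return max(scores, key=scores.get)


# Hypothetical answers and correct answer rates for a single task.
answers = {"w1": "A", "w2": "B", "w3": "B"}
accuracy = {"w1": 0.95, "w2": 0.6, "w3": 0.6}

print(majority_vote(answers))            # B: two of three workers chose it
print(weighted_vote(answers, accuracy))  # A: the single reliable worker outweighs the other two
```

In this sketch a single correct answer rate is used per worker, which is exactly the simplification that the step-by-step workflow discussed here calls into question.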
For example, there may be a worker who tends to believe the work result of the preceding-stage worker and leave mistakes unmodified and, on the contrary, a worker who tends to be suspicious and excessively modify even a correct result. By estimating the abilities of the succeeding-stage workers, which are influenced by the result of the preceding-stage work, according to the quality of the work result of the preceding-stage worker and appropriately modeling the behavior of workers having the above tendencies, integration with higher accuracy may be performed.
Therefore, in the work results integration system 100 according to the embodiment, the management server 110 estimates the abilities of the succeeding-stage workers that may fluctuate according to the quality of a result of a preceding-stage work using a probability model in which the abilities of the succeeding-stage workers conditioned with the quality of a work result of the preceding-stage worker are introduced as parameters. Thereby, the behavior and skills of the succeeding-stage workers dependent on the quality of a result of a preceding-stage work are estimated more accurately to improve the accuracy of integration.
A work results integration process in the work results integration system 100 according to the embodiment is described below with reference to
The automatic recognition section 210 performs automatic recognition processing of task data by the arithmetic capacity of a computer and outputs an automatic recognition result to the work result acceptance section 220. The automatic recognition section 210 is an Optical Character Recognition (OCR) engine in the case of a character recognition task, and an Automatic Speech Recognition (ASR) engine in the case of a voice recognition task. The automatic recognition section 210 can operate as a preceding-stage worker in the described embodiment.
The work result acceptance section 220 provides a user interface to which the worker terminals 104 operated by workers access and through which the workers perform work for processing a task. Each of the workers of the worker terminals 104 can be either a preceding-stage worker or a succeeding-stage worker, or both of them.
The work result acceptance section 220 transmits task data 112 to be allocated to the worker terminals 104 and requests work for a task from the workers of the worker terminals 104. At that time, the work result acceptance section 220 can present a work result of a preceding-stage worker (including an automatic recognition result by the automatic recognition section 210) that has been already obtained for the task, to the worker terminals 104. The work result acceptance section 220 accepts work results for the task from the worker terminals 104 and stores the work results as a task log 114, associating the work results with pieces of identification information identifying the workers. Further, when accepting an automatic recognition result inputted from the automatic recognition section 210, the work result acceptance section 220 records the automatic recognition result in the task log 114 as a work result from a preceding-stage worker.
The work results integration section 230 reads out accumulated task logs 114 of one or more tasks, estimates the abilities of the workers based on a predetermined probability model, estimates work results to be obtained for the tasks, and generates work product data 116. The work results integration section 230, in more detail, includes an acquisition section 232, a parameter estimation section 234 and an output section 236.
The acquisition section 232 reads out the accumulated task logs 114 and acquires work results of preceding-stage and succeeding-stage workers for each of the one or more tasks. The parameter estimation section 234 estimates a parameter set of the predetermined probability model based on the multiple work results obtained for each of the one or more tasks. For each of the one or more tasks, the output section 236 estimates and outputs a work result to be obtained for the task based on the estimated parameter set. In addition to the work result to be obtained for each of the one or more tasks, the output section 236 can also output a parameter set including the abilities of the workers. Details of the probability model and the estimation of the parameters will be described later.
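The processing of the acquisition section 232 may be pictured as grouping the records of the task log by task. The following is a minimal sketch under stated assumptions: the record layout (task identifier, worker identifier, answer), the class name TaskObservation and the worker identifier "ocr" for the automatic recognition result are all hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

PRECEDING = "ocr"  # hypothetical identifier of the preceding-stage worker


@dataclass
class TaskObservation:
    preceding: Optional[str] = None                           # answer of the preceding-stage worker
    succeeding: Dict[str, str] = field(default_factory=dict)  # worker id -> answer


def group_task_log(records: List[Tuple[int, str, str]]) -> Dict[int, TaskObservation]:
    """Collect, per task, the preceding-stage answer and the succeeding-stage answers."""
    tasks: Dict[int, TaskObservation] = defaultdict(TaskObservation)
    for task_id, worker_id, answer in records:
        if worker_id == PRECEDING:
            tasks[task_id].preceding = answer
        else:
            tasks[task_id].succeeding[worker_id] = answer
    return dict(tasks)


# Hypothetical log of a character recognition task: (task id, worker id, answer).
log = [(1, "ocr", "O"), (1, "w1", "0"), (1, "w2", "O"), (2, "ocr", "l"), (2, "w1", "1")]
grouped = group_task_log(log)
```

Not every worker appears in every task, so the per-task dictionary of succeeding-stage answers also records which workers took charge of which task.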
In the above description, the function sections 210, 220 and 230 shown in
A parameter estimation process using a probability model which is to be executed in the work results integration system 100 according to the embodiment of the present invention will be described below in more detail with reference to
A screen 310 shown in
That is, in the screen 310 shown in
The character image surrounded by a box 316 in
In the description below, a step-by-step workflow will be described as an example in which, via a character proofreading interface screen as shown in
Here, t is an index identifying a task, and it is assumed that there are n tasks. Similarly, i is an index identifying the succeeding-stage workers, and it is assumed that there are m succeeding-stage workers. Not all the workers necessarily give an answer for each task, and the set of succeeding-stage workers who performed a task t is indicated by Wt. Further, it is assumed that each task is independent from the other tasks and that, when the true correct answer zt and the answer ot of the preceding-stage worker are given, answers yit of succeeding-stage workers are mutually independent.
The answer ot of the preceding-stage worker and the answers yit of the succeeding-stage workers are included in a set of answers X that includes all characters as elements (ot∈X, yit∈X). For a certain task t, the number of unique answers included in an answer ot of the preceding-stage worker and answers yit (i∈Wt) of the succeeding-stage workers is indicated by Kt. On the assumption that a correct answer of the task t exists among the Kt unique answers k (k=1, . . . , Kt), the objective is to estimate a true correct answer zt (zt∈X) that is latent.
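The candidate answers of a task and their number Kt can be obtained as in the following minimal sketch. The function name and the example answers are assumptions made for illustration.

```python
def unique_answers(preceding, succeeding):
    """Return the distinct answers of one task in first-seen order."""
    seen = []
    for answer in [preceding, *succeeding.values()]:
        if answer not in seen:
            seen.append(answer)
    return seen


# Hypothetical task: the preceding-stage worker answered "O", three workers followed.
candidates = unique_answers("O", {"w1": "0", "w2": "O", "w3": "Q"})
K_t = len(candidates)  # number of unique answers for this task
```

Restricting the candidates to answers that were actually given keeps Kt small even though the set X of all characters is large.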
Multiple parameters are given to a generative model. In the generative model shown in
A gray arrow (indicating a conditional probability) in the generative model shown in
As shown in
Further, in a parameter estimation method according to one embodiment, a parameter set (αi, βi, γ and πtk) of the generative model shown in
The parameter estimation method in the embodiment is described below in more detail with reference to
Here, before describing the process flow shown in
In the formula (4) above, the parameter set θ, the observed variable D and the latent variable Z are expressed by the formulas (5) to (7) below.
The prior distribution p (θ) in the formula (4) above may be indicated by beta distribution shown in the formulas (8) to (10) below and Dirichlet distribution shown in the formula (11) below for each parameter. In the formulas (8) to (11) below, a1, a2, b1, b2, g1, g2 and ρ are hyper-parameters, and it is assumed that appropriate values are given in advance. In the described embodiment, prior distribution πtk of the latent variable is indicated by symmetric Dirichlet distribution with Kt parameters having the same value. In the case of Kt=2, the prior distribution πtk corresponds to beta distribution.
αi˜Beta(a1,a2) (8)
βi˜Beta(b1,b2) (9)
γ˜Beta(g1,g2) (10)
{πtk}k=1, . . . , Kt˜Dirichlet(ρ, . . . , ρ) (11)
That is, the described embodiment formulates the problem of estimating, based on the answer ot of the preceding-stage worker and the answers yit of the succeeding-stage workers for each of the n tasks, the correct answer rates αi and βi of the m participating succeeding-stage workers, the correct answer rate γ of the preceding-stage worker, and the prior probability πtk of each answer k among the Kt answers for each task, together with the true correct answer zt for each of the n tasks.
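For orientation, the estimation problem may be written in a form consistent with the surrounding description. The following is a sketch under assumptions and should not be read as the exact formulas (4) to (7); in particular, the grouping of the symbols into θ, D and Z is inferred from the text.

```latex
\hat{\theta}
  = \arg\max_{\theta} \ln p(D, \theta)
  = \arg\max_{\theta} \Bigl[ \ln \sum_{Z} p(D, Z \mid \theta) + \ln p(\theta) \Bigr],
\qquad
\theta = \bigl\{ \{\alpha_i\}_{i=1}^{m},\ \{\beta_i\}_{i=1}^{m},\ \gamma,\ \{\pi_{tk}\} \bigr\},
\quad
D = \bigl\{ o_t,\ \{y_{it}\}_{i \in W_t} \bigr\}_{t=1}^{n},
\quad
Z = \{ z_t \}_{t=1}^{n}.
```

The sum over the latent variable Z inside the logarithm is what prevents a direct closed-form maximization.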
It is generally difficult to perform calculation of an optimum solution of a problem as shown in the formula (4) above because a sum exists in a logarithm and, therefore, the amount of calculation is large. Therefore, in the described embodiment, an EM algorithm is applied to perform parameter estimation by repeated calculation. A specific process from applying the EM algorithm to perform MAP estimation of a parameter set θ up to estimating a true correct answer zt for a task using the obtained parameter set θ will be described below with reference to the flowchart shown in
The parameter estimation method shown in
At step S102, the parameter estimation section 234 first calculates an initial value of the posterior distribution μtk of the latent variable zt for each of the n tasks t obtained. The initial value of the posterior distribution μtk of the latent variable zt may be uniform over the Kt unique answers. From a viewpoint of causing repeated calculation to converge earlier, however, the initial value is preferably determined according to majority voting, as expressed by the formula (12) below. In the formula (12) below, δ indicates Kronecker delta.
At step S103, the parameter estimation section 234 calculates initial values of a parameter set θ (αi, βi, γ and πtk) from the value of the posterior distribution μtk of the latent variable. As for a method of calculating the parameter set θ from the posterior distribution μtk, the calculation can be performed with the use of the formulas (17) to (20) below to be used at an M step of the EM algorithm to be described later.
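An initialization by majority voting can be sketched as follows. It is assumed here, for illustration only, that the answer of the preceding-stage worker counts as one vote alongside the answers of the succeeding-stage workers; the function name is hypothetical.

```python
def initial_posterior(preceding, succeeding):
    """Initial mu_tk of one task: the fraction of votes received by each unique answer."""
    votes = [preceding, *succeeding.values()]
    candidates = list(dict.fromkeys(votes))  # unique answers in first-seen order
    return {k: sum(1 for v in votes if v == k) / len(votes) for k in candidates}


# Hypothetical task with one preceding-stage answer and four succeeding-stage answers.
mu = initial_posterior("O", {"w1": "0", "w2": "O", "w3": "0", "w4": "O"})
```

Counting matches between each vote and each candidate plays the role of the Kronecker delta in the text.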
At step S104, a loop is started, and an E (Expectation) step shown at step S105 and the M (Maximization) step shown at step S106 are alternately repeated until it is determined at step S107 that a predetermined convergence condition is satisfied.
At step S105, the parameter estimation section 234 executes the E step in the EM algorithm. Generally, at the E step in the EM algorithm, an expected value R (θ∥θ˜) of the posterior probability of the parameters, which is expressed by the formula (13) below, is calculated with the use of a current value θ˜ (superscript tilde) of the parameter set. The formula (13) below corresponds to determination of the lower bound of logarithm likelihood ln p (D, θ). Then, at the succeeding M step, parameters that maximize the expected value (lower bound) determined at the E step are newly determined. Then, by repeating the E step and the M step, parameters that maximize the posterior probability are determined. When being expressed more specifically, the formula (13) below can be expressed by the formula (14) below.
In the formula (14) above, μtk is posterior distribution of the latent variable zt when observed data (Dt={{yit}i∈Wt, ot}) and the parameter set θ˜ are given, and it is calculated by the formula (15) below.
(wherein the following is satisfied:
In the formula (16) above, rtk(θ) is a part that estimates a correct answer probability from the current values of the parameters taking account of the abilities of all workers. The formula (15) corresponds to weighted evaluation of each of given one or more unique answers k with the abilities (γ, αi and βi; i∈Wt) of the preceding-stage and succeeding-stage workers based on given work results (yit, ot) of the preceding-stage and succeeding-stage workers. In the formula (16) above, δ indicates Kronecker delta.
In the embodiment, the parameter set θ may be explicitly determined from the posterior distribution μtk at the succeeding M step as described later. Therefore, at step S105, the parameter estimation section 234 only has to calculate the posterior distribution μtk of a true correct answer zt for each of one or more tasks (t=1, . . . , n) using the current values θ˜ of the parameter set.
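The E step for a single task can be sketched as follows. The likelihood used here is an assumption made for illustration: αi is taken as the correct answer rate of worker i when the preceding-stage answer is correct, βi as the rate when it is incorrect, and every answer other than the candidate k is treated as incorrect with the complementary probability. The function name and the numerical values are hypothetical.

```python
def e_step(preceding, succeeding, alpha, beta, gamma, prior):
    """Posterior mu_tk over the candidate answers k of one task."""
    r = {}
    for k, pi_k in prior.items():
        prev_correct = (preceding == k)
        # Contribution of the preceding-stage worker.
        score = pi_k * (gamma if prev_correct else 1.0 - gamma)
        # Contribution of each succeeding-stage worker, with the ability
        # chosen according to whether the preceding-stage answer is correct.
        for worker, answer in succeeding.items():
            rate = alpha[worker] if prev_correct else beta[worker]
            score *= rate if answer == k else 1.0 - rate
        r[k] = score
    total = sum(r.values())
    return {k: v / total for k, v in r.items()}


# Hypothetical current parameter values and observations of one task.
mu = e_step(
    preceding="O",
    succeeding={"w1": "0", "w2": "O"},
    alpha={"w1": 0.9, "w2": 0.9},
    beta={"w1": 0.8, "w2": 0.3},
    gamma=0.7,
    prior={"O": 0.5, "0": 0.5},
)
```

In this example the worker w1, who rarely changes a correct answer and often repairs a wrong one, disagrees with the preceding-stage answer, so the posterior favors "0". For tasks with many workers the product is better accumulated as a sum of logarithms to avoid underflow.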
At step S106, the parameter estimation section 234 executes the M step in the EM algorithm. Generally, at the M step in the EM algorithm, updated values θ of the parameters are determined that maximize the expected value R (θ∥θ˜) of the posterior probability determined at the E step, which is a function of the parameter set θ. If the expected value R of the posterior probability expressed by the formula (14) described above is partially differentiated with respect to each of the parameters αi, βi, γ and πtk and the derivative is set to zero, update formulas of the parameters can be obtained in closed forms as shown in the formulas (17) to (20) below. In the formulas (17) and (18) below, Ti indicates a set of tasks performed by a worker i. In the formula (17) below, δ similarly indicates Kronecker delta.
That is, at step S106, the parameter estimation section 234 only has to explicitly calculate the updated values θ of the parameter set that maximize the expected value R of the posterior probability from the calculated new posterior distribution μtk of the latent variable using the update formulas (17) to (20).
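Closed-form updates of this kind can be sketched as follows. The expressions are derived under assumptions made for illustration (αi applies when the preceding-stage answer is correct, βi when it is incorrect, with beta and symmetric Dirichlet priors) and are not asserted to coincide with the formulas (17) to (20). The function name and the default hyper-parameter values are hypothetical.

```python
def m_step(tasks, posteriors, workers, a=(2.0, 2.0), b=(2.0, 2.0), g=(2.0, 2.0), rho=2.0):
    """MAP updates of (alpha, beta, gamma, pi) from the posteriors mu_tk.

    tasks: list of (o_t, {worker: y_it}); posteriors: list of {k: mu_tk}.
    """
    alpha, beta = {}, {}
    for w in workers:
        a_num = a_den = b_num = b_den = 0.0
        for (prev, succ), mu in zip(tasks, posteriors):
            if w not in succ:          # only tasks performed by worker w contribute
                continue
            p_prev = mu[prev]          # probability that the preceding-stage answer is correct
            a_den += p_prev
            b_den += 1.0 - p_prev
            if succ[w] == prev:
                a_num += p_prev        # worker kept an answer that is correct
            else:
                b_num += mu[succ[w]]   # worker's differing answer is the correct one
        alpha[w] = (a[0] - 1.0 + a_num) / (a[0] + a[1] - 2.0 + a_den)
        beta[w] = (b[0] - 1.0 + b_num) / (b[0] + b[1] - 2.0 + b_den)
    g_num = sum(mu[prev] for (prev, _), mu in zip(tasks, posteriors))
    gamma = (g[0] - 1.0 + g_num) / (g[0] + g[1] - 2.0 + len(tasks))
    prior = [{k: (rho - 1.0 + v) / (len(mu) * (rho - 1.0) + 1.0) for k, v in mu.items()}
             for mu in posteriors]
    return alpha, beta, gamma, prior


# Hypothetical observations and posteriors for two tasks.
tasks = [("O", {"w1": "0", "w2": "O"}), ("l", {"w1": "l"})]
posteriors = [{"O": 0.25, "0": 0.75}, {"l": 1.0}]
alpha, beta, gamma, prior = m_step(tasks, posteriors, workers=["w1", "w2"])
```

Each update is a ratio of expected counts smoothed by the hyper-parameters, so no inner numerical optimization is needed.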
At step S107, it is determined whether or not the predetermined convergence condition is satisfied, and the loop is repeated from S104 until the predetermined convergence condition is satisfied. If the predetermined convergence condition is satisfied at step S107, the process exits the loop and proceeds to step S108.
The convergence determination can be a condition for logarithm likelihood L, for example, as expressed by the formula (21) below. In the formula (21) below, L (•) indicates the logarithm likelihood; θold means the previous values of the parameters; and θnew means the latest values of the parameters. The initial value of the logarithm likelihood of θold can be set, for example, to −∞. If the logarithm likelihood is not improved by predetermined ε or more by the formula (21) below, it is regarded that the parameter set has converged.
L(θnew)−L(θold)<ε (21)
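The loop of steps S104 to S107 with the convergence condition of the formula (21) can be sketched as follows. The function name, the iteration cap max_iter and the toy update and likelihood functions are assumptions made for illustration; in the embodiment the update would be one E step followed by one M step.

```python
import math


def run_until_converged(update, log_likelihood, theta, eps=1e-6, max_iter=1000):
    """Repeat update() until the log likelihood improves by less than eps."""
    previous = -math.inf               # initial value of L(theta_old)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        theta = update(theta)          # stands in for one E step and one M step
        current = log_likelihood(theta)
        if current - previous < eps:   # condition of the formula (21)
            break
        previous = current
    return theta, iteration


# Toy stand-ins: the update contracts toward 3, where the toy likelihood peaks.
theta, iterations = run_until_converged(
    update=lambda x: 0.5 * (x + 3.0),
    log_likelihood=lambda x: -(x - 3.0) ** 2,
    theta=0.0,
)
```

The iteration cap is a safeguard added in this sketch and is not part of the described convergence condition.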
At step S108, the parameter estimation section 234 calculates final posterior distribution μtk (k=1, . . . , Kt) of the latent variable for each of the one or more tasks (1, . . . , n) in accordance with the formula (15) above, using estimated values θ after the convergence of the parameter set. At step S109, for each of the one or more tasks, the output section 236 selects an answer k that maximizes the posterior probability μtk as a true correct answer (zt^ (superscript hat)) in accordance with the formula (22) below. That is, the mode of the given latent variable under observation is selected. If there are multiple values that take the same maximum value, for example, one can be selected at random.
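The selection at step S109, including the random choice among tied answers, can be sketched as follows. The function name and the example values are assumptions made for illustration.

```python
import random


def select_answer(mu, rng=random):
    """Return the answer k with the largest posterior mu_tk, breaking ties at random."""
    best = max(mu.values())
    winners = [k for k, v in mu.items() if v == best]
    return winners[0] if len(winners) == 1 else rng.choice(winners)


# Hypothetical final posterior of one task.
estimate = select_answer({"O": 0.27, "0": 0.73})
```

Passing a seeded random.Random instance as rng makes the tie-breaking reproducible.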
At step S110, the output section 236 outputs the true correct answer zt for each of the one or more tasks and the parameter set θ (including the correct answer rates αi and βi indicating the abilities of the succeeding-stage workers) which have been obtained, and the process ends at step S111.
By the configuration described above, it becomes possible to appropriately estimate the abilities of succeeding-stage workers which may fluctuate according to the quality of a work result of a preceding-stage worker. Thereby, it is possible to appropriately model the behavior of the succeeding-stage workers influenced by seeing a work result of the preceding-stage worker and, therefore, improve the accuracy of integration of work results.
Knowledge of the abilities of workers estimated in this way may also be utilized, for example, at the time of determining by which worker a certain task is to be performed or which task a certain worker is to perform. Furthermore, the embodiment described above has an advantage that update formulas of parameters are explicitly determined at the M step of the EM algorithm. Furthermore, since the number of parameters introduced into a model is small, there are also advantages of being robust against sparseness of data and not requiring a long time for calculation.
A method is also conceivable in which the task for which a preceding-stage worker gives an incorrect answer, described above, is regarded simply as a task with a high degree of difficulty and is incorporated into, for example, the model of J. Whitehill, et al., “Whose Vote Should Count More?: Optimal Integration of Labels from Labelers of Unknown Expertise.”, NIPS, Vol. 22, pp. 2035-2043, 2009, December. In this case, however, for a task with a high degree of difficulty, the estimated abilities of the workers tend not to be reflected in differences among their correct answer rates, and the weighted majority vote approaches a common majority vote. In comparison, by introducing into a generative model parameters of the abilities of workers conditioned on the quality of the work results of a preceding-stage worker, the effect of the quality of the work results of the preceding-stage worker cancelling out the variation among the abilities of the workers is eliminated, and a weighted majority vote according to the abilities of the workers thereby becomes possible.
In the parameter estimation method according to the embodiment described above, a true correct answer zt for all n tasks and a parameter set θ are estimated with the use of n task logs prepared in advance. However, such an embodiment of batch learning is not limiting. In another embodiment, an aspect of performing so-called sequential learning or online learning is also possible in which, under a situation of work results for a task being sequentially generated from a preceding-stage worker and succeeding-stage workers, a true correct answer zt for the task is estimated while the parameter set θ is updated, each time data is given.
Further, in the generative model shown in
However, the specific workflow described above is not limiting. For example, in another embodiment, the preceding-stage worker may be a worker who operates the worker terminal 104. Further, in another embodiment, the preceding-stage worker differs for each task, and the correct answer rate γt of the preceding-stage worker for each task may be introduced as a parameter.
Furthermore, in the generative model shown in
Further, in the embodiment described above, two kinds of correct answer rates α and β are introduced, which indicate the ability of a succeeding-stage worker conditioned on whether or not an answer of a preceding-stage worker is correct. However, conditioning of the abilities of the succeeding-stage workers is not limited to such an aspect. In another embodiment, the abilities of the succeeding-stage workers can be further conditioned on the kind of the preceding-stage worker, the individual preceding-stage worker, or both. For example, parameters indicating the ability of a succeeding-stage worker conditioned on whether the preceding-stage worker is a machine or a person can be introduced. As another example, parameters indicating the ability of a succeeding-stage worker conditioned on which automatic recognition engine or which person the preceding-stage worker is can be introduced. Furthermore, in addition to conditioning by two stages of whether or not an answer of the preceding-stage worker is correct, conditioning by three or more stages is also possible in another embodiment.
Further, in the generative model shown in
In the description with reference to
Furthermore, in the embodiment described above, the degree of difficulty of a task is not taken into account. In another embodiment, however, a configuration is also possible in which a parameter of the degree of task difficulty dt for each task t is introduced so that the degree of task difficulty dt is estimated together with the other parameters. Alternatively, it is also possible to introduce the abilities of succeeding-stage workers conditioned on the quality of work results of a preceding-stage worker as parameters into the algorithm for estimating the degree of task difficulty disclosed in J. Whitehill, et al., “Whose Vote Should Count More?: Optimal Integration of Labels from Labelers of Unknown Expertise.”, NIPS, Vol. 22, pp. 2035-2043, 2009, December. In that case, the correct answer rate p (yit=zt) is conditioned on whether or not the preceding-stage worker gives a correct answer, and can be expressed by the formulas (23) and (24) below, wherein the skills of the worker are indicated by αi and βi.
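As a rough illustration of such conditioning, the following sketch assumes a logistic form in the style of the Whitehill et al. model, in which the probability of a correct answer is the logistic function of a worker skill divided by a positive task difficulty. This functional form, the function name and the argument names are assumptions made for illustration only; they are not a reproduction of the formulas (23) and (24).

```python
import math


def correct_answer_probability(alpha_i, beta_i, d_t, preceding_correct):
    """Illustrative p(y_it = z_t) conditioned on the preceding-stage result.

    Assumed form (not taken from the specification):
        sigmoid(alpha_i / d_t) if the preceding-stage answer is correct,
        sigmoid(beta_i / d_t)  otherwise,
    where d_t > 0 is the degree of difficulty of task t.
    """
    if d_t <= 0:
        raise ValueError("task difficulty d_t is assumed to be positive")
    # Pick the skill parameter that matches the preceding-stage outcome.
    skill = alpha_i if preceding_correct else beta_i
    return 1.0 / (1.0 + math.exp(-skill / d_t))
```

Under this assumed form, a larger d_t pulls the probability toward 0.5 regardless of the skill, which mirrors the tendency noted earlier for difficult tasks to flatten differences among workers.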
A computer apparatus that realizes the management server 110 according to the embodiment above is described below.
The CPU 12, the cache memory 14 and the system memory 16 are connected to other devices or drivers, for example, a graphics driver 20 and a network interface card (NIC) 22 via a system bus 18. The graphics driver 20 is connected to an external display 24 via the bus and can display a result of processing by the CPU 12 on a display screen. The NIC 22 connects the computer apparatus 10 to a network 102 that uses an appropriate communication protocol such as TCP/IP at the physical layer level and the data link layer level.
An I/O bus bridge 26 is further connected to the system bus 18. On the downstream side of the I/O bus bridge 26, a hard disk device 30 is connected by IDE, ATA, ATAPI, serial ATA, SCSI, USB or the like via an I/O bus 28 such as PCI. The hard disk device 30 can store, for example, task data 112, task logs 114 and work product data 116. Further, an input device 32, such as a keyboard or a pointing device such as a mouse, is connected to the I/O bus 28 via a bus such as USB, and a user interface is provided by this input device 32.
As the CPU 12 of the computer apparatus, any single-core or multi-core processor may be used. The computer apparatus is controlled by an operating system (hereinafter referred to as an OS) such as Windows® 200X, UNIX® or LINUX®. The computer apparatus realizes the configuration and processing of the function sections described above by loading a program into the system memory 16 or the like under the control of the above OS, executing the program and controlling operation of each hardware resource.
As described below, an experiment was conducted using actual task logs about succeeding-stage workers having performed modification of wrong recognition for an OCR recognition result of a preceding stage. First, character recognition of image data of a book was performed by two OCR engines, and character images for which the two OCR engines had given different answers were extracted. Then, an OCR recognition result and multiple extracted character images for which the common recognition result had been given were displayed via a character proofreading interface screen as shown in
The number of workers was 42. Each of 39,176 unique tasks was allocated to one to three workers, and work results for a total of 116,662 task assignments were obtained. In the character proofreading log data described above, which character was modified, how it was modified, and by which worker it was modified are recorded in a log format, and the log data amounts to 116,662 lines. Descriptive statistics thereof are shown in Table 1 below. In this experiment, the true correct answers indicated by the character images are already known, and, therefore, various estimated values obtained by parameter estimation can be compared with actually measured values.
A management server 110 implemented with the parameter estimation method shown in
A management server 110 was constructed with the same computer as in the experiment example 1. The management server 110 was implemented with the same parameter estimation method as in the experiment example 1, except that the correct answer rates were uniform across the workers, that is, worker-independent correct answer rates α and β were used as parameters. The parameter estimation method was executed for the task logs described above to calculate accuracy. The accuracy for all the tasks was 0.9664. When accuracy was determined for the same set of tasks as in the experiment example 1, namely those for which different answers had been obtained from the workers, the accuracy was 0.7108.
A management server 110 was constructed with the same computer as in the experiment example 1. The management server 110 was implemented with the same parameter estimation method as in the experiment example 1, except that conditioning on whether or not an OCR recognition result had been correct was not performed and, instead, one kind of correct answer rate si=p (yit=zt) for each worker was introduced as a parameter. This corresponds to what is obtained by a simplified latent class model (LC). Further, although a recognition result of the OCR engine can also be considered as an answer by a worker in such a model, the OCR engine was not included in the workers, and only results by the human workers were integrated in the experiment example 3. The parameter estimation method was executed for the task logs described above, and accuracy was determined for the same set of tasks as in the experiment example 1, for which different answers had been obtained from the workers.
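The two accuracy figures reported in these experiments (one over all tasks, one over the tasks on which workers disagreed) can be computed along the following lines. This is a minimal sketch under assumed data structures: dictionaries keyed by a task identifier, with the function names `accuracy` and `disagreement_tasks` being hypothetical.

```python
def accuracy(estimated, truth, task_ids=None):
    """Fraction of tasks whose estimated answer equals the known true answer.

    `estimated` and `truth` are assumed to map task id -> answer.
    If `task_ids` is given, accuracy is computed on that subset only.
    """
    ids = list(estimated) if task_ids is None else list(task_ids)
    if not ids:
        raise ValueError("no tasks to evaluate")
    hits = sum(1 for t in ids if estimated[t] == truth[t])
    return hits / len(ids)


def disagreement_tasks(worker_answers):
    """Return ids of tasks for which the workers gave different answers.

    `worker_answers` is assumed to map task id -> list of worker answers.
    """
    return [t for t, answers in worker_answers.items() if len(set(answers)) > 1]
```

For example, `accuracy(est, truth, disagreement_tasks(answers))` would evaluate only the subset of tasks where the integration method actually has to arbitrate between conflicting answers.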
An experiment example 4 is the same as the experiment example 3 except that an OCR recognition result was included in answers by the workers, and results by the workers including the OCR engine were integrated.
The same computer as in the experiment example 1 was used to implement a program for integrating work results by majority voting, and, for the task logs described above, the accuracy with which the answer determined by majority voting corresponded to the known true correct answer was evaluated. In the experiment example 5, the OCR engine was not included in the workers, and only results by the human workers were integrated. The accuracy for all the tasks was 0.9662. When accuracy was determined for the same set of tasks as in the experiment example 1, for which different answers had been obtained from the workers, the accuracy was 0.6791.
An experiment example 6 is the same as the experiment example 5 except that an OCR recognition result was included in answers by the workers, and results by the workers including the OCR engine were integrated.
Estimation was performed with the answer of the OCR engine regarded as the correct answer as it was, and the accuracy with which that answer corresponded to the known true correct answer was evaluated. The accuracy for all the tasks was 0.7499, and the accuracy for the same set of tasks as in the experiment example 1, for which different answers had been obtained from the workers, was 0.0858.
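A majority-voting baseline of this kind might look like the sketch below. The optional `preceding_answer` argument illustrates the variant in which the OCR recognition result is counted as one more vote. The tie-breaking rule (the first tied answer in input order wins) is an assumption made here for determinism, since the text does not state how ties were resolved; the function name is likewise hypothetical.

```python
from collections import Counter


def majority_vote(answers, preceding_answer=None):
    """Integrate answers for one task by simple majority voting.

    `answers` is a list of answers from human workers. If `preceding_answer`
    is given, it is counted as one additional vote (appended last).
    Ties go to the tied answer that appears first (an assumed rule).
    """
    votes = list(answers)
    if preceding_answer is not None:
        votes.append(preceding_answer)
    if not votes:
        raise ValueError("at least one answer is required")
    counts = Counter(votes)
    top = max(counts.values())
    # Scan in input order so that ties are resolved deterministically.
    for answer in votes:
        if counts[answer] == top:
            return answer
```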
When the experiment results of the experiment examples 1 to 7 are compared, the accuracy for a set of all the tasks in the experiment example 1 is improved by about 0.3% even when compared with the experiment example 5, the result of which is the best among the experiment examples 3 to 7. Referring to
When workers who had performed two hundred or more tasks for which the OCR engine had given an answer corresponding to a known true correct answer, and who had also performed two hundred or more tasks for which the OCR engine had given an incorrect answer, were extracted from all the workers in the log data described above, nineteen of the total of forty-two workers were extracted.
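The extraction condition, together with the actually measured correct answer rates of each extracted worker, can be sketched as follows. The record layout (worker id, worker answer, OCR answer, true answer) and the function name `measured_rates` are assumptions for illustration and do not reflect the actual log format.

```python
from collections import defaultdict


def measured_rates(records, min_tasks=200):
    """Return {worker: (alpha, beta)} for workers with enough data.

    `records` is assumed to be an iterable of
    (worker_id, worker_answer, ocr_answer, true_answer) tuples.
    alpha is the measured correct answer rate on tasks where the OCR answer
    was correct; beta is the rate on tasks where the OCR answer was incorrect.
    A worker is kept only if both groups contain at least `min_tasks` tasks.
    """
    if min_tasks < 1:
        raise ValueError("min_tasks must be at least 1")
    # Per worker: [tasks with OCR correct, hits, tasks with OCR incorrect, hits]
    stats = defaultdict(lambda: [0, 0, 0, 0])
    for worker, answer, ocr_answer, truth in records:
        s = stats[worker]
        offset = 0 if ocr_answer == truth else 2
        s[offset] += 1
        s[offset + 1] += int(answer == truth)
    return {
        w: (s[1] / s[0], s[3] / s[2])
        for w, s in stats.items()
        if s[0] >= min_tasks and s[2] >= min_tasks
    }
```

Requiring a minimum count in both groups keeps the measured rates from being dominated by sampling noise, which is presumably why a threshold was applied before comparing estimates with measurements.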
The middle chart in
The upper chart in
As shown in the upper chart of
It is thought that this is because the correct answer rate of the worker 92 is low when the OCR engine gives a correct answer, and, on the contrary, the correct answer rate of the worker 225 is low when the OCR engine gives an incorrect answer.
Furthermore, as for the RMSE and the rank correlation coefficient, estimation is better as the RMSE is closer to zero, because the RMSE indicates the magnitude of the error between estimated and actual values; and estimation is better as the rank correlation coefficient is closer to one, because the rank correlation coefficient indicates the degree of similarity between the ranking by estimated values and the ranking by actual values. When the RMSE and the rank correlation coefficient are compared between the experiment examples 8 and 9, it is seen that, for the correct answer rate αi in the case of the preceding stage giving a correct answer, both the RMSE and the rank correlation coefficient are significantly improved. For the correct answer rate βi in the case of the preceding stage giving an incorrect answer, it is seen that the RMSE is equal to that of the experiment example 9, but the rank correlation coefficient is improved. In the experiment example 9, there is a tendency to overestimate a worker whose actual correct answer rate is low, and it is thought that the reason is that the estimated correct answer rate is influenced by the correct answer rates of workers who give many answers and have high correct answer rates. In comparison, it can be thought that, in the experiment example 8, this influence, which grows as the number of answers decreases, is reduced.
As described above, according to one embodiment, it is possible to provide an estimation method, an estimation system, a computer system and a program capable of estimating abilities of succeeding-stage workers that may fluctuate according to the quality of a work result of a preceding-stage worker. Furthermore, according to one embodiment, it is possible to provide a program for performing a process for integrating work results of multiple workers while estimating the abilities of the succeeding-stage workers which may fluctuate according to the quality of a work result of the preceding-stage worker.
The estimation method and the like described above can be applied not only to quality management in a process for integrating work results of multiple workers in crowdsourcing and the like but also to education of multiple workers, for example, by giving feedback about estimated abilities of the multiple workers.
Each function section and a process of each function section have been described in order to facilitate understanding of the embodiments described herein. In one embodiment, however, in addition to the particular function sections described above executing particular processes, it is possible to determine to which function sections functions for executing the processes described above are to be assigned in consideration of efficiency of processing and efficiency of programming or the like for implementation.
The above functions may be realized by an apparatus-executable program written, for example, in an object-oriented programming language such as C++, Java®, Java® Beans, Java® Applet, JavaScript®, Perl, Python and Ruby, and it can be stored in an apparatus-readable recording medium and distributed or can be distributed through transmission.
The present invention has been described with particular embodiments. The present invention, however, is not limited to the embodiments described herein and may be changed within a range that one skilled in the art may think of, such as other embodiments, addition, modification and deletion. Any aspect is included in the scope of the present invention as far as the operation/effects of the present invention may be obtained.
Number | Date | Country | Kind |
---|---|---|---|
2014-226645 | Nov 2014 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
20090287546 | Gillespie | Nov 2009 | A1 |
20120029978 | Olding | Feb 2012 | A1 |
20140039870 | Roy | Feb 2014 | A1 |
20150213360 | Venanzi | Jul 2015 | A1 |
20150242447 | Ipeirotis | Aug 2015 | A1 |
20150356488 | Eden | Dec 2015 | A1 |
Number | Date | Country |
---|---|---|
02068658 | Mar 1990 | JP |
10261122 | Sep 1998 | JP |
2001243341 | Sep 2001 | JP |
2004151895 | May 2004 | JP |
2007026135 | Feb 2007 | JP |
2008084051 | Apr 2008 | JP |
2014074966 | Apr 2014 | JP |
Entry |
---|
Dawid, A.P. et al., “Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm,” Journal of the Royal Statistical Society, Series C (Applied Statistics), vol. 28, No. 1, 1979, pp. 20-28. |
Kobayashi, M. et al., “Age-based Task Specialization for Crowdsourced Proofreading,” Universal Access in Human-Computer Interaction, User and Context Diversity, 2013, pp. 104-112. |
Kacorri, H. et al., “Introducing Game Elements in Crowdsourced Video Captioning by Non-Experts,” Proceedings of the 11th Web for All Conference, ACM, Apr. 2014, pp. 1-4. |
Whitehill, J. et al., “Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise,” NIPS, vol. 22, pp. 2035-2043, Dec. 2009. |
Welinder, P. et al., “The Multidimensional Wisdom of Crowds,” In Proceedings of the 24th Annual Conference on Neural Information Processing Systems (NIPS), 2010, pp. 1-9. |
Donmez, P., et al., “Efficiently Learning the Accuracy of Labeling Sources for Selective Sampling,” In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, Jun. 30, 2009, pp. 259-268. |
Number | Date | Country | |
---|---|---|---|
20160132815 A1 | May 2016 | US |