This disclosure relates to an information processing system, an information processing method, and a computer program that process information about classification, for example.
A known system of this type performs a learning process related to a likelihood. For example, Patent Literature 1 discloses learning of a support vector machine used to determine a likelihood. Patent Literature 2 discloses that a support vector machine or logistic regression may be used for the learning of an identifier/classifier using a likelihood.
As another related technology, for example, Patent Literature 3 discloses a technique/technology of performing a folder classification process for an image file on the basis of a predetermined recognition condition and superiority condition, in an apparatus that determines whether or not a person included in an image is a registered person.
This disclosure aims to improve the related techniques/technologies described above.
An information processing system according to an example aspect of this disclosure includes: an acquisition unit that obtains a plurality of elements included in series data; a calculation unit that calculates a likelihood ratio indicating a likelihood of a class to which the series data belong, on the basis of at least two consecutive elements of the plurality of elements; a classification unit that classifies the series data into at least one class of a plurality of classes that are classification candidates, on the basis of the likelihood ratio; and a learning unit that performs learning related to calculation of the likelihood ratio, by using a loss function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio.
An information processing method according to an example aspect of this disclosure includes: obtaining a plurality of elements included in series data; calculating a likelihood ratio indicating a likelihood of a class to which the series data belong, on the basis of at least two consecutive elements of the plurality of elements; classifying the series data into at least one class of a plurality of classes that are classification candidates, on the basis of the likelihood ratio; and performing learning related to calculation of the likelihood ratio, by using a loss function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio.
A computer program according to an example aspect of this disclosure operates a computer: to obtain a plurality of elements included in series data; to calculate a likelihood ratio indicating a likelihood of a class to which the series data belong, on the basis of at least two consecutive elements of the plurality of elements; to classify the series data into at least one class of a plurality of classes that are classification candidates, on the basis of the likelihood ratio; and to perform learning related to calculation of the likelihood ratio, by using a loss function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio.
Hereinafter, an information processing system, an information processing method, and a computer program according to example embodiments will be described with reference to the drawings.
An information processing system according to a first example embodiment will be described with reference to
First, a hardware configuration of the information processing system according to the first example embodiment will be described with reference to
As illustrated in
The processor 11 reads a computer program. For example, the processor 11 is configured to read a computer program stored in at least one of the RAM 12, the ROM 13, and the storage apparatus 14. Alternatively, the processor 11 may read a computer program stored in a computer-readable recording medium by using a not-illustrated recording medium reading apparatus. The processor 11 may obtain (i.e., may read) a computer program from a not-illustrated apparatus disposed outside the information processing system 1, through a network interface. The processor 11 controls the RAM 12, the storage apparatus 14, the input apparatus 15, and the output apparatus 16 by executing the read computer program. Especially in this example embodiment, when the processor 11 executes the read computer program, a functional block for performing a classification using a likelihood ratio and a learning process related to the classification is realized or implemented in the processor 11. Examples of the processor 11 include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a DSP (Digital Signal Processor), and an ASIC (Application Specific Integrated Circuit). The processor 11 may use one of the examples described above, or may use a plurality of them in parallel.
The RAM 12 temporarily stores the computer program to be executed by the processor 11. The RAM 12 also temporarily stores data used by the processor 11 while the processor 11 executes the computer program. The RAM 12 may be, for example, a D-RAM (Dynamic RAM).
The ROM 13 stores the computer program to be executed by the processor 11. The ROM 13 may otherwise store fixed data. The ROM 13 may be, for example, a P-ROM (Programmable ROM).
The storage apparatus 14 stores data that the information processing system 1 retains for a long term. The storage apparatus 14 may operate as a temporary storage apparatus of the processor 11. The storage apparatus 14 may include, for example, at least one of a hard disk apparatus, a magneto-optical disk apparatus, an SSD (Solid State Drive), and a disk array apparatus.
The input apparatus 15 is an apparatus that receives an input instruction from a user of the information processing system 1. The input apparatus 15 may include, for example, at least one of a keyboard, a mouse, and a touch panel. The input apparatus 15 may be a dedicated controller (operation terminal). The input apparatus 15 may also include a terminal owned by the user (e.g., a smartphone, a tablet terminal, etc.). The input apparatus 15 may be, for example, an apparatus that allows audio input, such as a microphone.
The output apparatus 16 is an apparatus that outputs information about the information processing system 1 to the outside. For example, the output apparatus 16 may be a display apparatus (e.g., a display) that is configured to display the information about the information processing system 1. The display apparatus here may be a TV monitor, a personal computer monitor, a smartphone monitor, a tablet terminal monitor, or another portable terminal monitor. The display apparatus may be a large monitor or a digital signage installed in various facilities such as stores. The output apparatus 16 may be an apparatus that outputs the information in a format other than an image. For example, the output apparatus 16 may be a speaker that audio-outputs the information about the information processing system 1.
Next, a functional configuration of the information processing system 1 according to the first example embodiment will be described with reference to
As illustrated in
The data acquisition unit 50 is configured to obtain a plurality of elements included in the series data. The data acquisition unit 50 may directly obtain data from an arbitrary data acquisition apparatus (e.g., a camera, a microphone, etc.), or may read data obtained in advance by a data acquisition apparatus and stored in a storage or the like. When data are obtained from a camera, the data acquisition unit 50 may be configured to obtain the data from each of a plurality of cameras. The elements of the series data obtained by the data acquisition unit 50 are configured to be outputted to the likelihood ratio calculation unit 100. The series data are data including a plurality of elements arranged in a predetermined order; an example thereof is time series data. More specific examples of the series data include, but are not limited to, video data and audio data.
The likelihood ratio calculation unit 100 is configured to calculate a likelihood ratio on the basis of at least two consecutive elements of the plurality of elements obtained by the data acquisition unit 50. The “likelihood ratio” here is an index indicating a likelihood of a class to which the series data belong. A specific example of the likelihood ratio and a specific calculation method thereof will be described in detail in another example embodiment described later.
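For orientation, such a likelihood ratio may take the following standard form for two candidate classes k and l (a sketch shown only for illustration; the notation λ̂_kl is introduced here and reused in the sketches below, and matches the matrix of ratios described in the second example embodiment):

```latex
% Log likelihood ratio of class k (numerator) against class l (denominator)
% for series data X; a standard form shown for illustration.
\hat{\lambda}_{kl}(X) = \log\frac{p(X \mid y = k)}{p(X \mid y = l)}
```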
The class classification unit 200 is configured to classify the series data on the basis of the likelihood ratio calculated by the likelihood ratio calculation unit 100. The class classification unit 200 selects at least one class to which the series data belong, from among a plurality of classes that are classification candidates. The plurality of classes that are classification candidates may be set in advance. Alternatively, the plurality of classes that are classification candidates may be set by the user as appropriate, or may be set as appropriate on the basis of a type of the series data to be handled.
The learning unit 300 performs learning related to the calculation of the likelihood ratio by using a loss function. Specifically, the learning unit 300 performs the learning related to the calculation of the likelihood ratio such that the class classification based on the likelihood ratio is accurately performed. The loss function used by the learning unit 300 is defined as a function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio. The loss function may be set in advance as a function that satisfies such a definition. A specific example of the loss function will be described in detail in another example embodiment described later.
Next, with reference to
As illustrated in
Subsequently, the class classification unit 200 performs the class classification on the basis of the calculated likelihood ratio (step S13). The class classification may determine a single class to which the series data belong, or may determine a plurality of classes to which the series data are likely to belong. The class classification unit 200 may output a result of the class classification to a display or the like. The class classification unit 200 may output the result of the class classification by audio through a speaker or the like.
Next, a flow of operation of the learning unit 300 in the information processing system 1 according to the first example embodiment (i.e., a learning operation related to the calculation of the likelihood ratio) will be described with reference to
As illustrated in
Subsequently, the learning unit 300 calculates the loss function by using the inputted training data (step S102). As already described, the loss function here is a function in which the likelihood ratio increases when the correct answer class to which the series data belong is in the numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in the denominator of the likelihood ratio.
Subsequently, the learning unit 300 adjusts a parameter (specifically, a parameter of a model for calculating the likelihood ratio) such that the calculated loss function becomes small (step S103). That is, the learning unit 300 optimizes the parameter of the model for calculating the likelihood ratio. The optimization of the parameter using the loss function may adopt existing techniques/technologies as appropriate. An example of the optimization method is an error back propagation method, but another method may also be used.
Then, the learning unit 300 determines whether or not all the learning is ended (step S104). The learning unit 300 may determine whether or not all the learning is ended, depending on whether or not all the training data are inputted, for example. Alternatively, the learning unit 300 may determine whether or not all the learning is ended, depending on whether or not a predetermined period elapses from a start of the learning. Alternatively, the learning unit 300 may determine whether or not all the learning is ended, depending on whether or not a predetermined number of loops of the steps S101 to S103 are executed.
When it is determined that all the learning is ended (step S104: YES), a series of processing steps is ended. On the other hand, when it is determined that all the learning is not ended (step S104: NO), the learning unit 300 starts the process from the step S101 again. This allows the learning process using the training data to be repeated, thereby adjusting the parameter to a more optimal value.
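As a minimal sketch of this learning flow (steps S101 to S104), assuming a PyTorch model llr_model that estimates the likelihood ratios and a loss function loss_fn of the type defined above (both names, and the choice of the Adam optimizer, are illustrative assumptions rather than part of this disclosure):

```python
import torch

def train(llr_model, loss_fn, loader, num_loops):
    # The optimizer adjusts the parameters of the model for calculating
    # the likelihood ratio (step S103); Adam is one common choice.
    optimizer = torch.optim.Adam(llr_model.parameters(), lr=1e-3)
    for _ in range(num_loops):            # step S104: one possible end condition
        for x, y in loader:               # step S101: training data are inputted
            llr = llr_model(x)            # likelihood ratios for the series data x
            loss = loss_fn(llr, y)        # step S102: calculate the loss function
            optimizer.zero_grad()
            loss.backward()               # error back propagation
            optimizer.step()              # step S103: make the loss function small
```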
Next, a technical effect obtained by the information processing system 1 according to the first example embodiment will be described.
As described in
When there are a plurality of classes as the classification candidates (i.e., when a so-called multiclass classification is performed), it is not easy to determine what type of likelihood ratio should be considered at the time of learning (e.g., what ratio should be taken). With the loss function described above, however, whether the correct answer class is in the numerator or in the denominator of the likelihood ratio changes the magnitude of the likelihood ratio, and thus changes its influence on the loss function. By using such a loss function, it is possible to properly perform the learning related to the calculation of the likelihood ratio in the multiclass classification. This makes it possible to realize a proper class classification. What type of likelihood ratio should be considered at the time of learning is especially hard to determine when there are three or more classes as the classification candidates. Therefore, the technical effect according to this example embodiment is remarkably exhibited when the classification candidates are three or more classes.
The information processing system 1 according to a second example embodiment will be described with reference to
First, a flow of operation of the learning unit 300 in the information processing system 1 according to the second example embodiment will be described with reference to
As illustrated in
Subsequently, the learning unit 300 calculates the loss function by using the inputted training data, and especially in the second example embodiment, the learning unit 300 calculates the loss function that takes into account the likelihood ratios of N×(N−1) patterns in which the denominator is a likelihood in which the series data belong to one class and the numerator is a likelihood in which the series data belong to another class, out of N classes (wherein N is a natural number) that are classification candidates of the series data (step S201). As in the first example embodiment, this loss function is also a function in which the likelihood ratio increases when the correct answer class to which the series data belong is in the numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in the denominator of the likelihood ratio. The likelihood ratios to be considered in the loss function will be described in detail later with a specific example.
Subsequently, the learning unit 300 adjusts a parameter such that the calculated loss function is small (step S103). That is, the learning unit 300 optimizes the parameter of the model for calculating the likelihood ratio. Then, the learning unit 300 determines whether or not all the learning is ended (step S104). When it is determined that all the learning is ended (step S104: YES), a series of processing steps is ended. On the other hand, when it is determined that all the learning is not ended (step S104: NO), the learning unit 300 starts the process from the step S101 again.
Next, with reference to
As illustrated in
In a first row from the top of the matrix, the numerators of log likelihood ratios (hereinafter simply referred to as “likelihood ratios”) are all p(X|y=0). In a second row from the top of the matrix, the numerators of the likelihood ratios are all p(X|y=1). In a third row from the top of the matrix, the numerators of the likelihood ratios are all p(X|y=2). On the other hand, in a first column from the left of the matrix, the denominators of the likelihood ratios are all p(X|y=0). In a second column from the left of the matrix, the denominators of the likelihood ratios are all p(X|y=1). In a third column from the left of the matrix, the denominators of the likelihood ratios are all p(X|y=2).
In the likelihood ratios on a diagonal line of the matrix (the likelihood ratios shaded in gray in
In particular, the likelihood ratios on the diagonal line, in which the denominator is the same as the numerator, are all log 1 and thus have values of zero. For this reason, these likelihood ratios have substantially meaningless values even when the loss function is considered, and are therefore not considered in the loss function. The number of the remaining likelihood ratios, excluding those on the diagonal line, is N×(N−1), wherein N is the number of classes. In this example embodiment, these likelihood ratios of N×(N−1) patterns (i.e., the likelihood ratios excluding those on the diagonal line in the matrix) are considered in the loss function. A specific example of the loss function that takes into account the likelihood ratios of N×(N−1) patterns will be described in detail in another example embodiment described later.
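The following NumPy sketch illustrates this matrix of pairwise log likelihood ratios (the helper name and the numeric likelihood values are hypothetical, introduced here only for illustration); the diagonal entries are log 1 = 0, and the remaining N×(N−1) entries are the ratios considered in the loss function:

```python
import numpy as np

def pairwise_llr_matrix(log_lik):
    # log_lik[k] = log p(X | y = k); entry (k, l) of the returned matrix is
    # log p(X | y = k) - log p(X | y = l), i.e., row k gives the numerator
    # class and column l gives the denominator class.
    return log_lik[:, None] - log_lik[None, :]

log_lik = np.log(np.array([0.7, 0.2, 0.1]))   # hypothetical likelihoods, N = 3
llr = pairwise_llr_matrix(log_lik)
assert np.allclose(np.diag(llr), 0.0)          # diagonal entries: log 1 = 0
off_diagonal = llr[~np.eye(3, dtype=bool)]     # the N x (N - 1) = 6 ratios
```

Note that the matrix is antisymmetric (entry (k, l) is the negative of entry (l, k)), which is why an "alternating matrix" average appears in the loss functions of later example embodiments.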
Next, a technical effect obtained by the information processing system 1 according to the second example embodiment will be described.
As described in
The information processing system 1 according to a third example embodiment will be described with reference to
First, a flow of operation of the learning unit 300 in the information processing system 1 according to the third example embodiment will be described with reference to
As illustrated in
Subsequently, the learning unit 300 calculates the loss function by using the inputted training data, and especially in the third example embodiment, the learning unit 300 calculates the loss function that takes into account a part of the likelihood ratios of N×(N−1) patterns in which the denominator is a likelihood in which the series data belong to one class and the numerator is a likelihood in which the series data belong to another class, out of N classes that are classification candidates of the series data (step S301). That is, the learning unit 300 according to the third example embodiment considers not all, but a part of the likelihood ratios of N×(N−1) patterns described in the second example embodiment. As in the first example embodiment, this loss function is also a function in which the likelihood ratio increases when the correct answer class to which the series data belong is in the numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in the denominator of the likelihood ratio. A specific example of the loss function that takes into account a part of the likelihood ratios of N×(N−1) patterns will be described in detail in another example embodiment described later.
Subsequently, the learning unit 300 adjusts the parameter such that the calculated loss function is small (step S103). Then, the learning unit 300 determines whether or not all the learning is ended (step S104). When it is determined that all the learning is ended (step S104: YES), a series of processing steps is ended. On the other hand, when it is determined that all the learning is not ended (step S104: NO), the learning unit 300 starts the process from the step S101 again.
Next, a selection example of the likelihood ratios to be considered in the loss function (i.e., a selection example of a part of the likelihood ratios of the N×(N−1) patterns) will be specifically described.
Out of the likelihood ratios of N×(N−1) patterns, a part of the likelihood ratios to be considered in the loss function may be selected in advance by the user or the like, or may be automatically selected by the learning unit 300. When the learning unit 300 selects a part of the likelihood ratios to be considered in the loss function, the learning unit 300 may select the likelihood ratios in accordance with a predetermined rule set in advance. Alternatively, the learning unit 300 may determine whether or not to make a selection on the basis of the values of the calculated likelihood ratios.
A selection example of a part of the likelihood ratios to be considered in the loss function is to select only the likelihood ratios in one row or one column of the matrix illustrated in
In addition, only the likelihood ratios in a part of the plurality of rows or a part of the plurality of columns of the matrix may be selected. Specifically, only the likelihood ratios in the first row and the second row of the matrix may be selected; only the likelihood ratios in the second row and the third row may be selected; or only the likelihood ratios in the third row and the first row may be selected. Alternatively, only the likelihood ratios in the first column and the second column of the matrix may be selected; only the likelihood ratios in the second column and the third column may be selected; or only the likelihood ratios in the third column and the first column may be selected.
The selection example of the likelihood ratio described above is only an example, and another likelihood ratio may be selected as the likelihood ratios to be considered in the loss function. For example, the likelihood ratios to be considered in the loss function may be randomly selected, regardless of the row and the column.
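These selection patterns can be pictured as boolean masks over the matrix, as in the following NumPy sketch (a hypothetical illustration; the variable names are not from the original text):

```python
import numpy as np

N = 3
off_diagonal = ~np.eye(N, dtype=bool)       # the N x (N - 1) candidate ratios

row_mask = np.zeros((N, N), dtype=bool)
row_mask[0] = True                          # only one row (one numerator class)

col_mask = np.zeros((N, N), dtype=bool)
col_mask[:, 1] = True                       # only one column (one denominator class)

rng = np.random.default_rng(seed=0)
random_mask = rng.random((N, N)) < 0.5      # random selection, regardless of row/column

selected = off_diagonal & row_mask          # ratios actually considered in the loss
```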
Next, a technical effect obtained by the information processing system 1 according to the third example embodiment will be described.
As described in
The information processing system 1 according to a fourth example embodiment will be described with reference to
First, a flow of operation of the learning unit 300 in the information processing system 1 according to the fourth example embodiment will be described with reference to
As illustrated in
Subsequently, the learning unit 300 calculates the loss function by using the inputted training data, and especially in the fourth example embodiment, the learning unit 300 calculates the loss function that takes into account the likelihood ratio in which the correct answer class is in the numerator, out of the likelihood ratios of N×(N−1) patterns described above (step S401). That is, the learning unit 300 according to the fourth example embodiment selects the likelihood ratio in which the correct answer class is in the numerator, as a part of the likelihood ratios of N×(N−1) patterns described in the third example embodiment. As in the first example embodiment, this loss function is also a function in which the likelihood ratio increases when the correct answer class to which the series data belong is in the numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in the denominator of the likelihood ratio. A specific example of the loss function that takes into account the likelihood ratio in which the correct answer class is in the numerator will be described in detail in another example embodiment described later.
Subsequently, the learning unit 300 adjusts the parameter such that the calculated loss function is small (step S103). Then, the learning unit 300 determines whether or not all the learning is ended (step S104). When it is determined that all the learning is ended (step S104: YES), a series of processing steps is ended. On the other hand, when it is determined that all the learning is not ended (step S104: NO), the learning unit 300 starts the process from the step S101 again.
Next, with reference to
In the matrix illustrated in
For example, it is assumed that the correct answer class of the series data inputted as the training data is the “class 1”. In this case, the learning unit 300 selects the likelihood ratio in which the class 1 is in the numerator from among the likelihood ratios of N×(N−1) patterns, and considers it in the loss function. Specifically, the learning unit 300 selects only the likelihood ratios in the second row from the top (excluding the likelihood ratios on the diagonal line) in
When the correct answer class of the series data inputted as the training data is the “class 0”, the learning unit 300 may select the likelihood ratio in which the class 0 is in the numerator, from among the likelihood ratios of N×(N−1) patterns, and may consider it in the loss function. Specifically, the learning unit 300 may select only the likelihood ratios in the first row from the top (excluding the likelihood ratios on the diagonal line) in
Similarly, when the correct answer class of the series data inputted as the training data is the “class 2”, the learning unit 300 may select the likelihood ratio in which the class 2 is in the numerator, from among the likelihood ratios of N×(N−1) patterns, and may consider it in the loss function. Specifically, the learning unit 300 may select only the likelihood ratios in the third row from the top (excluding the likelihood ratios on the diagonal line) in
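A sketch of this per-sample selection, building on the pairwise matrix sketch above (the helper name is hypothetical):

```python
import numpy as np

def correct_class_row(llr, label):
    # Select the likelihood ratios in which the correct answer class `label`
    # is in the numerator: row `label` of the N x N pairwise LLR matrix,
    # excluding the diagonal entry (whose value is log 1 = 0).
    n = llr.shape[0]
    return llr[label][np.arange(n) != label]   # N - 1 ratios for this sample
```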
Next, a technical effect obtained by the information processing system 1 according to the fourth example embodiment will be described.
As described in
The information processing system 1 according to a fifth example embodiment will be described with reference to
First, an outline of the loss function used in the information processing system 1 according to the fifth example embodiment will be described with reference to
As illustrated in
As a first example, a description will be given of the loss function when all the likelihood ratios of N×(N−1) patterns are considered, as in the second example embodiment. The likelihood ratio in the equation illustrated below is assumed to be a log likelihood ratio (LLR).
As the loss function that takes into account all the likelihood ratios of N×(N−1) patterns, the following equation (1) is given, for example.
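One form consistent with the symbol definitions and the behavior described below is the following sketch (an assumed reconstruction rather than the verbatim equation (1); here λ̂_kl(X_m^(t)) denotes the estimated log likelihood ratio for the m-th training series at time t, with class k in the numerator and class l in the denominator, y_m is the correct answer class, σ is the sigmoid function, and δ is the Kronecker delta):

```latex
% Sketch of equation (1): all N x (N-1) off-diagonal likelihood ratios.
L = \frac{1}{MT}\cdot\frac{1}{2K}\cdot\frac{1}{K-1}
    \sum_{m=1}^{M}\sum_{t=1}^{T}\sum_{k=1}^{K}\sum_{\substack{l=1\\ l\neq k}}^{K}
    \sigma\!\left((-1)^{\delta_{k,y_m}}\,\hat{\lambda}_{kl}\!\left(X_m^{(t)}\right)\right)
```

Minimizing this form pushes λ̂_kl up when k is the correct answer class (the sigmoid argument is negated) and down otherwise, matching the behavior described below.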
In the above equation (1), K is the number of classes, M is the number of data, and T is a time series length. Furthermore, k is a subscript in the row direction, and l is a subscript in the column direction (i.e., subscripts indicating a row number and a column number in the matrix illustrated in
1/MT in the equation (1) is to take an average over all of the data and over the time-series direction. 1/(2K) is the multiplication of 1/K, which is to take an average over the K rows, by 1/2, which is to take an average over the alternating (antisymmetric) matrix. 1/(K−1) is to take an average over the (K−1) columns obtained by excluding the one entry on the diagonal line from the K columns.
In a loss function L in the equation (1), when the k-th row is the correct answer class, the Kronecker δ is “1”, and when the k-th row is an incorrect answer class (i.e., a class other than the correct answer class), the Kronecker δ is “0”. As a result, when the k-th row is the correct answer class (in other words, when the correct answer class is in the numerator of the likelihood ratio), the value of the likelihood ratio increases. On the other hand, when the k-th row is the incorrect answer class (in other words, when the incorrect answer class is in the numerator of the likelihood ratio), the value of the likelihood ratio decreases.
Next, as a second example, a description will be given of the loss function when a part of the likelihood ratios of the N×(N−1) patterns is considered, as in the third and fourth example embodiments. Hereinafter, in particular, the following describes an example that takes into account only the likelihood ratio in which the correct answer class is in the numerator, out of the N×(N−1) patterns, as in the fourth example embodiment.
As the loss function that takes into account only the likelihood ratio in which the correct answer class is in the numerator out of N×(N−1) patterns, the following equation (2) is given, for example.
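Again as a sketch under the same notation (an assumed reconstruction rather than the verbatim equation (2)):

```latex
% Sketch of equation (2): only the rows whose numerator is the correct class.
L = \frac{1}{MT}\cdot\frac{1}{K-1}
    \sum_{m=1}^{M}\sum_{t=1}^{T}\sum_{\substack{l=1\\ l\neq y_m}}^{K}
    \sigma\!\left(-\hat{\lambda}_{y_m l}\!\left(X_m^{(t)}\right)\right)
```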
In the above equation (2), K is the number of classes, M is the number of data, and T is a time series length. Furthermore, k is a subscript in the row direction, and l is a subscript in the column direction (i.e., subscripts indicating the row number and the column number in the matrix illustrated in
The equation (2) takes into account only the rows of the correct answer class, and thus, compared to the equation (1) previously described, does not include the steps of summing over the K rows and taking the averages over the K rows and over the alternating matrix by means of 1/(2K). In addition, the part corresponding to the Kronecker δ in the equation (1) is fixed to “1”.
Next, a technical effect obtained by the information processing system 1 according to the fifth example embodiment will be described.
As described in
The loss functions mentioned in the fifth example embodiment (i.e., the equation (1) and the equation (2)) are an example, and a different loss function may be created by using the sigmoid function. Furthermore, the loss function may be created by using another nonlinear function, instead of the sigmoid function. For example, the loss function including a logistic function may be used, as in an example embodiment described later.
The information processing system 1 according to a sixth example embodiment will be described with reference to
First, an outline of the loss function used in the information processing system 1 according to the sixth example embodiment will be described with reference to
As illustrated in
As a first example, a description will be given of the loss function when all the likelihood ratios of N×(N−1) patterns are considered, as in the second example embodiment. The likelihood ratio in the equation illustrated below is assumed to be a log likelihood ratio (LLR).
As the loss function that takes into account all the likelihood ratios of N×(N−1) patterns, the following equation (3) is given, for example.
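One form consistent with the symbol definitions and the two-term Kronecker-δ behavior described below is the following sketch (an assumed reconstruction rather than the verbatim equation (3); the logistic term log(1 + e^x) is the softplus form of the logistic loss):

```latex
% Sketch of equation (3): logistic version over all off-diagonal ratios.
L = \frac{1}{MT}\cdot\frac{1}{2K}\cdot\frac{1}{K-1}
    \sum_{m=1}^{M}\sum_{t=1}^{T}\sum_{k=1}^{K}\sum_{\substack{l=1\\ l\neq k}}^{K}
    \left[\delta_{k,y_m}\log\!\left(1+e^{-\hat{\lambda}_{kl}(X_m^{(t)})}\right)
    +\left(1-\delta_{k,y_m}\right)\log\!\left(1+e^{\hat{\lambda}_{kl}(X_m^{(t)})}\right)\right]
```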
In the above equation (3), K is the number of classes, M is the number of data, and T is a time series length. Furthermore, k is a subscript in the row direction, and l is a subscript in the column direction (i.e., subscripts indicating a row number and a column number in the matrix illustrated in
1/MT in the equation (3) is to take an average over all of the data and over the time-series direction. 1/(2K) is the multiplication of 1/K, which is to take an average over the K rows, by 1/2, which is to take an average over the alternating (antisymmetric) matrix. 1/(K−1) is to take an average over the (K−1) columns obtained by excluding the one entry on the diagonal line from the K columns.
In a loss function L in the equation (3), when the k-th row is the correct answer class, the Kronecker δ is “1”, and when the k-th row is an incorrect answer class (i.e., a class other than the correct answer class), the Kronecker δ is “0”. As a result, when the k-th row is the correct answer class (in other words, when the correct answer class is in the numerator of the likelihood ratio), the first-half term of the two terms including the Kronecker δ remains, and the second-half term is zero. On the other hand, when the k-th row is the incorrect answer class (in other words, when the incorrect answer class is in the numerator of the likelihood ratio), the first-half term of the two terms including the Kronecker δ is zero, and the second-half term remains.
Next, as a second example, a description will be given of the loss function when a part of the likelihood ratios of the N×(N−1) patterns is considered, as in the third and fourth example embodiments. Hereinafter, in particular, the following describes an example that takes into account only the likelihood ratio in which the correct answer class is in the numerator, out of the N×(N−1) patterns, as in the fourth example embodiment.
As the loss function that takes into account only the likelihood ratio in which the correct answer class is in the numerator out of N×(N−1) patterns, the following equation (4) is given, for example.
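Again as a sketch under the same notation (an assumed reconstruction rather than the verbatim equation (4)):

```latex
% Sketch of equation (4): logistic version, correct-class rows only.
L = \frac{1}{MT}\cdot\frac{1}{K-1}
    \sum_{m=1}^{M}\sum_{t=1}^{T}\sum_{\substack{l=1\\ l\neq y_m}}^{K}
    \log\!\left(1+e^{-\hat{\lambda}_{y_m l}(X_m^{(t)})}\right)
```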
In the above equation (4), K is the number of classes, M is the number of data, and T is a time series length. Furthermore, k is a subscript in the row direction, and l is a subscript in the column direction (i.e., subscripts indicating the row number and the column number in the matrix illustrated in
The equation (4) takes into account only the rows of the correct answer class, and thus, compared to the equation (3) previously described, does not include the steps of summing over the K rows and taking the averages over the K rows and over the alternating matrix by means of 1/(2K). In addition, only the first-half term of the two terms including the Kronecker δ in the equation (3) remains.
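As a numerical sketch of an equation (4) style loss (under illustrative assumptions not stated in the original text: the pairwise ratios are arranged in an (M, K, K) tensor with the time dimension already folded into the batch; torch.nn.functional.softplus(x) computes log(1 + e^x) in a numerically stable way):

```python
import torch
import torch.nn.functional as F

def correct_row_logistic_loss(llr, y):
    # llr: (M, K, K) pairwise log likelihood ratios, entry (m, k, l) having
    # class k in the numerator and class l in the denominator; y: (M,) labels.
    # Only the rows of the correct answer class are taken into account.
    M, K, _ = llr.shape
    rows = llr[torch.arange(M), y]                 # (M, K): correct-class rows
    mask = torch.ones_like(rows, dtype=torch.bool)
    mask[torch.arange(M), y] = False               # exclude the diagonal (log 1 = 0)
    # softplus(-x) = log(1 + exp(-x)) is large when the correct-class ratio is
    # small, so minimizing it makes that likelihood ratio increase.
    return F.softplus(-rows[mask]).mean()
```

Averaging over the masked entries corresponds to the 1/M and 1/(K−1) factors; the 1/T factor would be recovered by also averaging over the time dimension.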
Next, a technical effect obtained by the information processing system 1 according to the sixth example embodiment will be described.
As described in
The loss functions mentioned in the sixth example embodiment (i.e., the equation (3) and the equation (4)) are an example, and a different loss function may be created by using the logistic function. Furthermore, the loss function may be created by using another nonlinear function, instead of the logistic function. For example, the loss function including a function that is different from the sigmoid function and the logistic function, may be used.
The information processing system 1 according to a seventh example embodiment will be described with reference to
First, a functional configuration of the information processing system 1 according to the seventh example embodiment will be described with reference to
As illustrated in
The first calculation unit 110 is configured to calculate an individual likelihood ratio on the basis of two consecutive elements included in the series data. The individual likelihood ratio is calculated as a likelihood ratio indicating a likelihood of a class to which two consecutive elements belong. The first calculation unit 110 may sequentially obtain elements included in the series data from the data acquisition unit 50, and may sequentially calculate the individual likelihood ratio based on two consecutive elements, for example. The individual likelihood ratio calculated by the first calculation unit 110 is configured to be outputted to the second calculation unit 120.
The second calculation unit 120 is configured to calculate an integrated likelihood ratio on the basis of a plurality of individual likelihood ratios calculated by the first calculation unit 110. The integrated likelihood ratio is calculated as a likelihood ratio indicating a likelihood of a class to which a plurality of elements considered in each of the plurality of individual likelihood ratios belong. In other words, the integrated likelihood ratio is calculated as a likelihood ratio indicating a likelihood of a class to which the series data including a plurality of elements belong. The integrated likelihood ratio calculated by the second calculation unit 120 is configured to be outputted to the class classification unit 200. The class classification unit 200 performs the class classification of the series data on the basis of the integrated likelihood ratio.
The learning unit 300 according to the seventh example embodiment may perform the learning for the entire likelihood ratio calculation unit 100 (i.e., for the first calculation unit 110 and the second calculation unit 120 together), or may perform the learning separately for the first calculation unit 110 and the second calculation unit 120. Alternatively, the learning unit 300 may be separately provided as a first learning unit that performs the learning only for the first calculation unit 110 and a second learning unit that performs the learning only for the second calculation unit 120. In this case, only one of the first learning unit and the second learning unit may be provided.
Next, with reference to
As illustrated in
Then, the first calculation unit 110 calculates the individual likelihood ratio on the basis of the obtained two consecutive elements (step S22). Then, the second calculation unit 120 calculates the integrated likelihood ratio on the basis of a plurality of individual likelihood ratios calculated by the first calculation unit 110 (step S23).
Subsequently, the class classification unit 200 performs the class classification on the basis of the calculated integrated likelihood ratio (step S24). The class classification may determine one class to which the series data belong, or may determine a plurality of classes to which the series data are likely to belong. The class classification unit 200 may output a result of the class classification to a display or the like. The class classification unit 200 may output the result of the class classification by audio through a speaker or the like.
Next, a technical effect obtained by the information processing system 1 according to the seventh example embodiment will be described.
As described in
The information processing system 1 according to an eighth example embodiment will be described with reference to
First, a functional configuration of the information processing system 1 according to the eighth example embodiment will be described with reference to
As illustrated in
The individual likelihood ratio calculation unit 111 is configured to calculate the individual likelihood ratio on the basis of two consecutive elements of the elements sequentially obtained by the data acquisition unit 50. More specifically, the individual likelihood ratio calculation unit 111 calculates the individual likelihood ratio on the basis of a newly obtained element and past data stored in the first storage unit 112. Information stored in the first storage unit 112 is configured to be read by the individual likelihood ratio calculation unit 111. When the first storage unit 112 stores the individual likelihood ratio of the past, the individual likelihood ratio calculation unit 111 reads the stored past individual likelihood ratio and calculates a new individual likelihood ratio in view of the obtained element. On the other hand, when the first storage unit 112 stores the element itself obtained in the past, the individual likelihood ratio calculation unit 111 may calculate the past individual likelihood ratio from the stored past element, and may calculate the likelihood ratio for the newly obtained element.
The integrated likelihood ratio calculation unit 121 is configured to calculate the integrated likelihood ratio on the basis of a plurality of individual likelihood ratios. The integrated likelihood ratio calculation unit 121 calculates a new integrated likelihood ratio by using the individual likelihood ratio calculated by the individual likelihood ratio calculation unit 111 and the integrated likelihood ratio of the past stored in the second storage unit 122. Information stored in the second storage unit 122 (i.e., the past integrated likelihood ratio) is configured to be read by the integrated likelihood ratio calculation unit 121.
Next, a flow of a likelihood ratio calculation operation (i.e., operation of the likelihood ratio calculation unit 100) in the information processing system 1 according to the eighth example embodiment will be described with reference to
As illustrated in
Subsequently, the individual likelihood ratio calculation unit 111 calculates a new individual likelihood ratio (i.e., the individual likelihood ratio for the element obtained this time by the data acquisition unit 50) on the basis of the element obtained by the data acquisition unit 50 and the past data read from the first storage unit 112 (step S32). The individual likelihood ratio calculation unit 111 outputs the calculated individual likelihood ratio to the second calculation unit 120. The individual likelihood ratio calculation unit 111 may store the calculated individual likelihood ratio in the first storage unit 112.
Subsequently, the integrated likelihood ratio calculation unit 121 of the second calculation unit 120 reads the past integrated likelihood ratio from the second storage unit 122 (step S33). The past integrated likelihood ratio may be a processing result of the integrated likelihood ratio calculation unit 121 for the element obtained one time before the element obtained this time by the data acquisition unit 50 (in other words, the integrated likelihood ratio calculated for the previous element), for example.
Subsequently, the integrated likelihood ratio calculation unit 121 calculates a new integrated likelihood ratio (i.e., the integrated likelihood ratio for the element obtained this time by the data acquisition unit 50) on the basis of the likelihood ratio calculated by the individual likelihood ratio calculation unit 111 and the past integrated likelihood ratio read from the second storage unit 122 (step S34). The integrated likelihood ratio calculation unit 121 outputs the calculated integrated likelihood ratio to the class classification unit 200. The integrated likelihood ratio calculation unit 121 may store the calculated integrated likelihood ratio in the second storage unit 122.
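A streaming sketch of the steps S31 to S34 follows, under the assumption (not stated explicitly in this example embodiment) that the integrated likelihood ratio can be updated by adding the newly calculated individual likelihood ratio to the past integrated likelihood ratio, as holds when log p(x_1, ..., x_t | y) decomposes into per-step terms under a first-order Markov model; the class and function names are hypothetical:

```python
import numpy as np

class LikelihoodRatioCalculationUnit:
    def __init__(self, individual_llr_fn, num_classes):
        self.individual_llr_fn = individual_llr_fn  # (prev, cur) -> (K, K) LLR matrix
        self.past_element = None                    # contents of the first storage unit
        self.integrated = np.zeros((num_classes, num_classes))  # second storage unit

    def update(self, element):
        # Steps S31-S32: read the past data and calculate a new individual
        # likelihood ratio from the two consecutive elements.
        if self.past_element is not None:
            individual = self.individual_llr_fn(self.past_element, element)
            # Steps S33-S34: read the past integrated likelihood ratio and
            # combine it with the new individual likelihood ratio.
            self.integrated = self.integrated + individual
        self.past_element = element                 # store for the next element
        return self.integrated
```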
Next, a technical effect obtained by the information processing system 1 according to the eighth example embodiment will be described.
As described in
The information processing system 1 according to a ninth example embodiment will be described with reference to
First, with reference to
As illustrated in
Subsequently, the class classification unit 200 performs the class classification on the basis of the calculated likelihood ratio, and especially in the ninth example embodiment, the class classification unit 200 selects and outputs a plurality of classes to which the series data may belong (step S41). That is, the class classification unit 200 does not determine one class to which the series data belong, but determines a plurality of classes to which the series data are likely to belong. More specifically, the class classification unit 200 performs a process of selecting k classes (wherein k is a natural number of n or less) from n classes that are prepared as classification candidates (wherein n is a natural number).
The class classification unit 200 may output the information about the k classes to which the series data may belong, to a display or the like. Furthermore, the class classification unit 200 may output the information about the k classes to which the series data may belong, by audio through a speaker or the like.
When outputting the information about the k classes to which the series data may belong, the class classification unit 200 may rearrange and output the information. For example, the class classification unit 200 may rearrange and output the information about the k classes in descending order of the likelihood ratios. Alternatively, the class classification unit 200 may output the information about each of the k classes in a different aspect for each class. For example, the class classification unit 200 may perform the output in a display aspect that highlights a class with a high likelihood ratio, while performing the output in a display aspect that does not highlight a class with a low likelihood ratio. In the highlighting, for example, a size or color to be displayed may be changed, or a movement may be given to an object to be displayed.
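A minimal sketch of selecting and ordering the k classes (under the illustrative assumption that each candidate class is scored by a single likelihood-ratio-based value; the function name is hypothetical):

```python
import numpy as np

def top_k_classes(scores, k):
    # scores: (n,) array with one likelihood-ratio-based score per class.
    # Returns the indices of the k classes with the highest scores,
    # rearranged in descending order of the scores.
    order = np.argsort(scores)[::-1]
    return order[:k]

scores = np.array([0.3, 2.1, -0.4, 1.2])    # hypothetical scores, n = 4
print(top_k_classes(scores, k=2))            # -> [1 3]
```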
A configuration of outputting the k classes from the n classes described above will be described with some specific application examples.
The information processing system 1 according to the ninth example embodiment may be used to propose a product in which the user is likely to be interested, at a shopping site on the web. Specifically, the information processing system 1 may select k products (i.e., k classes) in which the user is likely to be interested, from n products (i.e., n classes) that are handled, and may output them to the user (wherein k is a number smaller than n). In this case, an example of the series data to be inputted is a past purchase history, a browsing history, or the like.
Similarly, the information processing system 1 may be used to propose a product or a store on digital signage or the like. In the digital signage, the user's image may be captured by a mounted camera. In this case, the user's feeling may be estimated from the user's image to propose a store or a product in accordance with the feeling. In addition, the user's line of sight may be estimated from the user's image (i.e., the user's viewing area may be estimated) to propose a store or a product in which the user is likely to be interested. Alternatively, the user's attribute (e.g., gender, age, etc.) may be estimated from the user's image to propose a store or a product in which the user is likely to be interested. When information about the user is estimated as described above, the n classes may be weighted in accordance with the estimated information.
The information processing system 1 according to the ninth example embodiment may also be used for criminal investigation. For example, when a real criminal is to be found from among a plurality of suspects, selecting only the single person who is most likely the criminal may cause a big problem if the selection is wrong. The information processing system 1 according to this example embodiment, however, can select and output the high-ranking k suspects who are most likely to be the criminal. Specifically, classes corresponding to the high-ranking k suspects may be selected and outputted from the series data including, as the elements, information about each of the plurality of suspects. In this way, for example, a plurality of suspects who are highly likely to be the criminal may be put under criminal investigation, so as to properly find the real criminal.
The information processing system 1 according to the ninth example embodiment may also be applied to the analysis of radar images. Since most radar images are inherently unclear, it is hard to accurately determine what is captured in an image by a machine alone, for example. The information processing system 1 according to this example embodiment, however, can select and output k candidates for what is likely captured in the radar image. Therefore, it is possible to first output the k candidates, from which the user can make a determination. For example, if a “dog,” a “cat,” a “ship,” and a “tank” are selected as candidates for what is captured in a radar image of a port, the user can easily determine that the “ship,” which is highly related to the port, is captured in the radar image.
The application examples described above are merely examples; in any situation in which it is required to select k candidates from n candidates, a beneficial effect can be achieved by applying the information processing system 1 according to this example embodiment.
A processing method in which a program for allowing the configuration in each of the example embodiments to operate so as to realize the functions of each example embodiment is recorded on a recording medium, and in which the program recorded on the recording medium is read as a code and executed on a computer, is also included in the scope of each of the example embodiments. That is, a computer-readable recording medium is also included in the scope of each of the example embodiments. Not only the recording medium on which the above-described program is recorded, but also the program itself, is included in each example embodiment.
The recording medium may be, for example, a floppy disk (registered trademark), a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a magnetic tape, a nonvolatile memory card, or a ROM. Furthermore, not only the program that is recorded on the recording medium and executes processing alone, but also the program that operates on an OS and executes processing in cooperation with the functions of expansion boards and other software, is included in the scope of each of the example embodiments.
This disclosure is not limited to the examples described above and is allowed to be changed, if desired, without departing from the essence or spirit of this disclosure which can be read from the claims and the entire specification. An information processing apparatus, an information processing method, and a computer program with such changes are also intended to be within the technical scope of this disclosure.
The example embodiments described above may be further described as, but not limited to, the following Supplementary Notes below.
An information processing system described in Supplementary Note 1 is an information processing system including: an acquisition unit that obtains a plurality of elements included in series data; a calculation unit that calculates a likelihood ratio indicating a likelihood of a class to which the series data belong, on the basis of at least two consecutive elements of the plurality of elements; a classification unit that classifies the series data into at least one class of a plurality of classes that are classification candidates, on the basis of the likelihood ratio; and a learning unit that performs learning related to calculation of the likelihood ratio, by using a loss function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio.
An information processing system described in Supplementary Note 2 is the information processing system described in Supplementary Note 1, wherein the learning unit performs the learning by using a loss function that takes into account the likelihood ratios of N×(N−1) patterns in which the denominator is a likelihood in which the series data belong to one class and the numerator is a likelihood in which the series data belong to another class, out of N classes (wherein N is a natural number) that are classification candidates of the series data.
An information processing system described in Supplementary Note 3 is the information processing system described in Supplementary Note 2, wherein the learning unit performs the learning by using a loss function that takes into account a part of the likelihood ratios of the N×(N−1) patterns.
An information processing system described in Supplementary Note 4 is the information processing system described in Supplementary Note 3, wherein the learning unit performs the learning by using a loss function that takes into account the likelihood ratio in which the correct answer class is in the numerator, out of the N×(N−1) patterns.
An information processing system described in Supplementary Note 5 is the information processing system described in any one of Supplementary Notes 1 to 4, wherein the loss function includes a sigmoid function as a nonlinear function that acts on the likelihood ratio.
An information processing system described in Supplementary Note 6 is the information processing system described in any one of Supplementary Notes 1 to 4, wherein the loss function includes a logistic function as a nonlinear function that acts on the likelihood ratio.
An information processing system described in Supplementary Note 7 is the information processing system described in any one of Supplementary Notes 1 to 6, wherein the likelihood ratio is an integrated likelihood ratio that is calculated by taking into account a plurality of individual likelihood ratios that are calculated on the basis of two consecutive elements included in the series data.
An information processing system described in Supplementary Note 8 is the information processing system described in Supplementary Note 7, wherein the acquisition unit sequentially obtains a plurality of elements included in the series data, and the calculation unit calculates a new integrated likelihood ratio by using the individual likelihood ratio that is calculated on the basis of the newly obtained element and the integrated likelihood ratio calculated in the past.
An information processing method described in Supplementary Note 9 is an information processing method including: obtaining a plurality of elements included in series data; calculating a likelihood ratio indicating a likelihood of a class to which the series data belong, on the basis of at least two consecutive elements of the plurality of elements; classifying the series data into at least one class of a plurality of classes that are classification candidates, on the basis of the likelihood ratio; and performing learning related to calculation of the likelihood ratio, by using a loss function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio.
A computer program described in Supplementary Note 10 is a computer program that operates a computer: to obtain a plurality of elements included in series data; to calculate a likelihood ratio indicating a likelihood of a class to which the series data belong, on the basis of at least two consecutive elements of the plurality of elements; to classify the series data into at least one class of a plurality of classes that are classification candidates, on the basis of the likelihood ratio; and to perform learning related to calculation of the likelihood ratio, by using a loss function in which the likelihood ratio increases when a correct answer class to which the series data belong is in a numerator of the likelihood ratio and the likelihood ratio decreases when the correct answer class is in a denominator of the likelihood ratio.
A recording medium described in Supplementary Note 11 is a recording medium on which the computer program described in Supplementary Note 10 is recorded.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2020/048472 | 12/24/2020 | WO | |