The present invention relates to an information processing apparatus that solves an online submodular optimization problem.
Use of online submodular optimization is being considered in order to determine advertisements to be presented to a user regarding web advertising and to determine a product to be sold at a discount in web sales. Online submodular optimization refers to selecting a subset of a given set in each round in order to minimize or maximize a cumulative value of an objective function.
Examples of a known document related to online submodular minimization include Non-patent Literature 1. Non-patent Literature 1 discloses an algorithm for deriving subsets X1, X2, . . . , XT that reduce an expected value of regret Σt∈[T]ft(Xt)−minX∈2S{Σt∈[T]ft(X)} to not more than O((nT)1/2). Note here that ft represents an objective function in a round t.
[Non-patent Literature 1]
E. Hazan and S. Kale, “Online Submodular Minimization”, Journal of Machine Learning Research 13 (2012) 2903-2922
In a method disclosed in Non-patent Literature 1, subsets X1, X2, . . . , XT are derived in which an expected value of regret Σt∈[T]ft(Xt)−minX∈2S{Σt∈[T]ft(X)} is reduced to not more than O((nT)1/2). This causes the following problem. Specifically, useful subsets X1, X2, . . . , XT can be derived for an online submodular minimization problem for which a fixed strategy to select the same subset in all rounds is effective, whereas useful subsets X1, X2, . . . , XT cannot be derived for an online submodular minimization problem for which a fixed strategy is not effective. An online submodular maximization problem also has a similar problem.
An example aspect of the present invention has been made in view of the above problem, and an example object thereof is to provide an information processing apparatus that makes it possible to derive useful subsets X1, X2, . . . , XT also for an online submodular optimization problem for which a fixed strategy is not effective.
An information processing apparatus in accordance with an aspect of the present invention includes: an objective function setting means that sets, as an objective function ft in each round t∈[T] (T is any natural number), a submodular function on a power set 2S of a set S consisting of n elements (n is any natural number); and a subset sequence derivation means that derives a subset sequence X1, X2, . . . , XT∈2S in which an expected value of regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) with respect to any benchmark X1*, X2*, . . . , XT*∈2S satisfying Σt∈[T−1]dH(Xt*, Xt+1*)≤V is not more than an upper limit Max(n,T,V) determined from n, T, and V, assuming that V is a given integer not less than 0,
where dH(Xt*,Xt+1*) is a Hamming distance between subsets, the Hamming distance being defined by dH(Xt*,Xt+1*)=|Xt*∪Xt+1*|−|Xt*∩Xt+1*|.
An information processing apparatus in accordance with an aspect of the present invention includes: an objective function setting means that sets, as an objective function ft in each round t∈[T] (T is any natural number), a normalized submodular function on a power set 2S of a set S consisting of n elements (n is any natural number); and a subset sequence derivation means that derives a subset sequence X1, X2, . . . , XT satisfying the following condition β1 or β2:
the condition β1 being that each subset Xt satisfies |Xt|≤k assuming that k is a given natural number and that an asymptotic behavior of an expected value of α regret αΣt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) with respect to any benchmark X1*, X2*, . . . , XT*∈2S satisfying |Xt*|≤k and Σt∈[T−1]dH(Xt*,Xt+1*)≤V coincides with an asymptotic behavior of a function A(k,T,V) determined from k, T, and V, assuming that V is a given integer not less than 0,
the condition β2 being that the asymptotic behavior of the expected value of the α regret αΣt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) with respect to any benchmark X1*, X2*, . . . , XT*∈2S satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V coincides with an asymptotic behavior of a function B(n,T,V) determined from n, T, and V, assuming that V is a given integer not less than 0,
where dH(Xt*,Xt+1*) is a Hamming distance between subsets, the Hamming distance being defined by dH(Xt*,Xt+1*)=|Xt*∪Xt+1*|−|Xt*∩Xt+1*|.
An example aspect of the present invention makes it possible to provide an information processing apparatus that makes it possible to derive useful subsets X1, X2, . . . , XT also for an online submodular optimization problem for which a fixed strategy is not effective.
A first example embodiment of the present invention will be described in detail with reference to the drawings.
Considered are (i) a set S consisting of n elements and (ii) an objective function ft: 2S→I defined for each round t∈[T]. Note here that n and T each represent any natural number. [T] represents a set of natural numbers not less than 1 and not more than T. 2S represents a power set of the set S, that is, a set consisting of all subsets of the set S. I represents a closed interval on the real line R. In the first example embodiment, it is assumed that I=[−1,1]. This assumption may affect expressions and values to be described below, but the present invention is not limited to the first example embodiment and is therefore not limited by the assumption.
It is assumed that each objective function ft is a submodular function. That is, it is assumed that an inequality ft(X∪{i})−ft(X)≥ft(Y∪{i})−ft(Y) is satisfied for (i) any subsets X,Y∈2S satisfying X⊆Y and (ii) any element i∈S\Y.
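As a concrete illustration, the diminishing-returns inequality can be checked by brute force on a small ground set. The sketch below is not part of the invention; the function names and the coverage function are hypothetical examples.

```python
from itertools import chain, combinations

def is_submodular(f, S, tol=1e-12):
    """Brute-force check of f(X ∪ {i}) − f(X) ≥ f(Y ∪ {i}) − f(Y)
    for all X ⊆ Y ⊆ S and i ∈ S \\ Y. Exponential in |S|; intended
    only for small ground sets."""
    subsets = [frozenset(c) for c in chain.from_iterable(
        combinations(sorted(S), r) for r in range(len(S) + 1))]
    for Y in subsets:
        for X in subsets:
            if not X <= Y:
                continue
            for i in S - Y:
                if f(X | {i}) - f(X) < f(Y | {i}) - f(Y) - tol:
                    return False
    return True

# A coverage function (number of items covered) is a standard submodular example.
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
def f_cov(X):
    return len(set().union(*(cover[i] for i in X))) if X else 0
```

By contrast, a function such as |X|2 has increasing marginal gains, so the same check rejects it.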
Among problems of selecting a subset sequence X1, X2, . . . , XT∈2S, a problem whose target is minimization of a cumulative value Σt∈[T]ft(Xt) of the objective function ft is referred to as an “online submodular minimization problem”. In the first example embodiment, the online submodular minimization problem is studied under the following full-information setting or bandit feedback setting.
Full-information setting: After selecting a subset Xt in a round t, it is possible to refer to a value ft(X) of the objective function ft with respect to any subset X∈2S.
Bandit feedback setting: After selecting the subset Xt in the round t, it is (1) possible to refer to a value ft(Xt) of the objective function ft with respect to the selected subset Xt and (2) impossible to refer to a value ft(X) of the objective function ft with respect to a subset X∈2S that is different from the selected subset.
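The difference between the two settings can be sketched as a protocol loop. The learner interface below is a hypothetical illustration, not the apparatus of the invention; only the feedback passed to the learner differs between the settings.

```python
import random

class RandomLearner:
    """A placeholder learner that selects each element independently
    with probability 1/2 and ignores all feedback."""
    def __init__(self, n, seed=0):
        self.n = n
        self.rng = random.Random(seed)
    def select(self, t):
        return {i for i in range(1, self.n + 1) if self.rng.random() < 0.5}
    def observe_function(self, t, f_t):  # full information: may query f_t(X) for any X
        pass
    def observe_value(self, t, value):   # bandit feedback: only the scalar f_t(X_t)
        pass

def run_rounds(learner, functions, full_info=True):
    """Play T rounds and return the cumulative objective value Σ_t f_t(X_t)."""
    total = 0.0
    for t, f_t in enumerate(functions, start=1):
        X_t = learner.select(t)
        total += f_t(X_t)
        if full_info:
            learner.observe_function(t, f_t)
        else:
            learner.observe_value(t, f_t(X_t))
    return total
```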
A configuration of an information processing apparatus 1 in accordance with the first example embodiment will be described with reference to
The information processing apparatus 1 is an apparatus for solving the online submodular minimization problem related to the set S consisting of the n elements. As illustrated in
The objective function setting unit 11 is a means that sets, as the objective function ft in each round t, a submodular function on the power set 2S of the set S. The objective function setting unit 11 is an example of an “objective function setting means” in the claims. The submodular function that the objective function setting unit 11 sets as the objective function ft may be (i) predetermined, (ii) input by a user via a keyboard or the like, or (iii) input by another apparatus via a communication network or the like. The submodular function that the objective function setting unit 11 sets as the objective function ft may be generated in various processes carried out inside the information processing apparatus 1.
The subset sequence derivation unit 12 is a means that derives a subset sequence X1, X2, . . . , XT satisfying a condition α below. The subset sequence derivation unit 12 is an example of a “subset sequence derivation means” in the claims. The subset sequence X1, X2, . . . , XT that is derived by the subset sequence derivation unit 12 may be provided to a user via a display or the like, or may be provided to another apparatus via a communication network or the like. The subset sequence X1, X2, . . . , XT that is derived by the subset sequence derivation unit 12 may be used in various processes carried out inside the information processing apparatus 1.
The condition α is that an expected value of regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) with respect to any benchmark X1*, X2*, . . . , XT*∈2S satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V is not more than an upper limit Max(n,T,V) determined from n, T, and V, assuming that V is a given integer not less than 0, where dH(Xt*,Xt+1*) is a Hamming distance between subsets, the Hamming distance being defined by dH(Xt*,Xt+1*)=|Xt*∪Xt+1*|−|Xt*∩Xt+1*|.
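In concrete terms, the Hamming distance in the condition α is the size of the symmetric difference of two subsets, and the constraint on the benchmark bounds the total amount by which the benchmark may drift over the T rounds. A small sketch (function names hypothetical):

```python
def hamming(X, Y):
    """d_H(X, Y) = |X ∪ Y| − |X ∩ Y|, i.e. the size of the symmetric difference."""
    return len(X ^ Y)

def path_length(benchmark):
    """Σ_{t ∈ [T−1]} d_H(X*_t, X*_{t+1}) for a benchmark sequence of subsets;
    the condition α requires this quantity to be at most V."""
    return sum(hamming(a, b) for a, b in zip(benchmark, benchmark[1:]))
```

For example, a benchmark that switches once from {1,2} to {2,3} has path length 2 and is admitted already for V=2, whereas a fixed benchmark always has path length 0.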
A flow of an information processing method S1 in accordance with the first example embodiment will be described with reference to
The information processing method S1 is a method for solving the online submodular minimization problem related to the set S consisting of the n elements. As illustrated in
The objective function setting process S11 is a process for setting, as the objective function ft in each round t, the submodular function on the power set 2S of the set S. The objective function setting process S11 is carried out by, for example, the objective function setting unit 11 of the information processing apparatus 1. The subset sequence derivation process S12 is a process for deriving the subset sequence X1, X2, . . . , XT satisfying the condition α shown in the previous section. The subset sequence derivation process S12 is carried out by, for example, the subset sequence derivation unit 12 of the information processing apparatus 1.
In the method disclosed in Non-patent Literature 1, subsets X1, X2, . . . , XT are derived that cause an expected value of regret Σt∈[T]ft(Xt)−minX∈2S{Σt∈[T]ft(X)} to be not more than an upper limit Max(n,T) determined in accordance with n and T. Thus, useful subsets X1, X2, . . . , XT can be derived for the online submodular minimization problem for which a fixed strategy to select the same subset in all rounds is effective, whereas the useful subsets X1, X2, . . . , XT cannot be derived for the online submodular minimization problem for which a fixed strategy is not effective.
In contrast, in the information processing apparatus 1 and the information processing method S1 in accordance with the first example embodiment, the subsets X1, X2, . . . , XT are derived in which the expected value of the regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) is not more than the upper limit Max(n,T,V) determined from n, T, and V. In this case, a benchmark X1*, X2*, . . . , XT* need only satisfy Σt∈[T−1]dH(Xt*,Xt+1*)≤V and need not be constant. It is therefore possible to derive the useful subsets X1, X2, . . . , XT also for the online submodular minimization problem for which the fixed strategy is not effective.
The inventors of the present invention have succeeded in proving, regarding the online submodular minimization problem in full-information setting, the following theorem A.
Theorem A: If a subset sequence X1, X2, . . . , XT∈2[n] is a subset sequence derived by an algorithm shown in Table 1 below, the following inequality (1) holds true for any benchmark X1*, X2*, . . . , XT*∈2[n]. This causes an asymptotic behavior of the expected value of the regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) to coincide with an asymptotic behavior of {T(n+Σt∈[T−1]dH(Xt*,Xt+1*))}1/2. Note that the asymptotic behaviors are compared here in disregard of a polynomial of log T and a polynomial of log n.
where E[·] represents an expected value for internal randomness of the algorithm. Furthermore, ┌⋅┐ represents the smallest natural number not less than ⋅.
The following description will discuss, with reference to
In the subset sequence derivation process S12 in accordance with a specific example of the present invention, a natural number d, a real number η, and d real numbers η(1), η(2), . . . , η(d) are used as constants. Furthermore, a d-dimensional vector pt∈[0,1]d satisfying ∥pt∥=1, an n-dimensional vector xt∈Rn, and d n-dimensional vectors xt(1), xt(2), . . . , xt(d)∈Rn are used as variables. Moreover, T real numbers u1, u2, . . . , uT are used as respective random variables that are uniformly distributed on an interval [0,1].
The initial setting step S121 is a step of setting the constants d, η, η(1), η(2), . . . , and η(d) and initializing the vectors p1, x1(1), x1(2), . . . , and x1(d). In the initial setting step S121, the subset sequence derivation unit 12 sets the constant d to, for example, a number obtained by adding 4 to the smallest natural number not less than log T. The subset sequence derivation unit 12 sets the constant η to, for example, η={log d/(8T)}1/2. For each j∈[d], the subset sequence derivation unit 12 sets a constant η(j) to, for example, η(j)=(n/2j)1/2. The subset sequence derivation unit 12 initializes the vector p1 to, for example, p1=(1/d, 1/d, . . . , 1/d). For each j∈[d], the subset sequence derivation unit 12 initializes a vector x1(j) to, for example, x1(j)=(0, 0, . . . , 0).
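The initialization above can be written out as follows. The base of log T is not stated in the text; the natural logarithm is assumed here, and all names are illustrative.

```python
import math

def init_full_info(n, T):
    """Constants and initial vectors suggested in the initial setting step:
    d = ⌈log T⌉ + 4, η = (log d / (8T))^(1/2), η(j) = (n / 2^j)^(1/2),
    p_1 = (1/d, ..., 1/d), and x_1(j) = (0, ..., 0) for each j ∈ [d]."""
    d = math.ceil(math.log(T)) + 4
    eta = math.sqrt(math.log(d) / (8 * T))
    eta_j = [math.sqrt(n / 2 ** j) for j in range(1, d + 1)]
    p1 = [1.0 / d] * d                      # uniform initial weights over the d copies
    x1 = [[0.0] * n for _ in range(d)]      # one n-dimensional zero vector per copy
    return d, eta, eta_j, p1, x1
```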
The subset derivation step S122 is a step of deriving the subset Xt. In the subset derivation step S122, the subset sequence derivation unit 12 first sets the vector xt to, for example, xt=Σj∈[d]ptjxt(j). Note here that ptj represents a jth component of the vector pt. Next, the subset sequence derivation unit 12 randomly sets a value of a random variable ut. Subsequently, the subset sequence derivation unit 12 derives the subset Xt defined by Xt={i∈[n]|xti≥ut}. Note here that xti represents an ith component of the vector xt.
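The mixing and rounding in this step can be sketched as below; the routine names are illustrative.

```python
import random

def combine(p_t, x_list):
    """x_t = Σ_{j ∈ [d]} p_{t,j} x_t(j): convex combination of the d candidate points."""
    n = len(x_list[0])
    return [sum(p_j * x_j[i] for p_j, x_j in zip(p_t, x_list)) for i in range(n)]

def threshold_round(x_t, u_t=None, rng=random):
    """X_t = {i ∈ [n] | x_{t,i} ≥ u_t} with u_t ~ Uniform[0, 1];
    elements are 1-based to match the notation [n] = {1, ..., n}."""
    if u_t is None:
        u_t = rng.random()
    return {i + 1 for i, x_i in enumerate(x_t) if x_i >= u_t}
```

For x_t∈[0,1]n this rounding makes E[ft(Xt)] equal to the Lovász extension of ft at xt, which is why a subgradient of the Lovász extension is the natural feedback object in this scheme.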
The subgradient derivation step S123 is a step of deriving a subgradient gt at xt of the objective function ft. In the subgradient derivation step S123, it is possible to refer to the value ft(X) of the objective function ft with respect to any subset X∈2[n]. In the subgradient derivation step S123, the subset sequence derivation unit 12 derives, for example, the subgradient gt defined by the following expression (2). In the following expression (2), σ represents a permutation on the set [n] satisfying xtσ(1)≥xtσ(2)≥ . . . ≥xtσ(n). Sσ(i) represents a subset of the set [n] which subset is defined by Sσ(i)={σ(j)|j∈[i]}. χ(i)∈{0,1}n represents an indicator vector in which the ith component is 1 and every other component is 0.
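Expression (2) is not reproduced in this text. A standard candidate consistent with the definitions of σ and Sσ(i) is the greedy (Edmonds) subgradient of the Lovász extension, sketched below under that assumption; the function name is hypothetical.

```python
def lovasz_subgradient(f, x_t):
    """Greedy subgradient of the Lovász extension at x_t: with σ ordering the
    coordinates so that x_{σ(1)} ≥ ... ≥ x_{σ(n)}, the σ(i)-th component is
    f(S_σ(i)) − f(S_σ(i−1)), where S_σ(i) = {σ(1), ..., σ(i)} and S_σ(0) = ∅."""
    n = len(x_t)
    sigma = sorted(range(1, n + 1), key=lambda i: -x_t[i - 1])  # the permutation σ
    g = [0.0] * n
    prefix, prev = set(), f(set())
    for i in sigma:
        prefix = prefix | {i}       # grow the prefix S_σ(i)
        value = f(prefix)
        g[i - 1] = value - prev     # marginal value of adding σ(i)
        prev = value
    return g
```

The components telescope, so the subgradient always sums to f([n])−f(∅).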
The vector update step S124 is a step of updating the vectors pt and xt(1), xt(2), . . . , xt(d). In the vector update step S124, the subset sequence derivation unit 12 updates the vector xt(j) in accordance with, for example, the following expression (3). The subset sequence derivation unit 12 updates the vector pt in accordance with, for example, the following expression (4).
As is clear from the theorem A, use of the subset sequence derivation process S12 in accordance with a specific example of the present invention enables the expected value of the regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) with respect to any benchmark X1*, X2*, . . . , XT* satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V to be not more than the upper limit Max(n,T,V) defined by the following expression (5):
Max(n,T,V)=4√(T(n+2V))+√(32T log(┌log T┐+4))   (5)
The inventors of the present invention have succeeded in proving, regarding the online submodular minimization problem in bandit feedback setting, the following theorem B.
Theorem B: If the subset sequence X1, X2, . . . , XT∈2[n] is a subset sequence derived by an algorithm shown in Table 2 below, the following inequality (6) holds true for any benchmark X1*, X2*, . . . , XT*∈2[n]. This causes the asymptotic behavior of the expected value of the regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) to coincide with an asymptotic behavior of nT2/3{(log log T/n)1/2+(1+Σt∈[T−1]dH(Xt*,Xt+1*)/n)}.
where γ represents a predetermined constant not less than 0 and not more than 1, the predetermined constant being called a search parameter.
The following description will discuss, with reference to
In the subset sequence derivation process S12 in accordance with a specific example of the present invention, the natural number d, the real number η, and the d real numbers η(1), η(2), . . . , η(d) are used as constants. Furthermore, the d-dimensional vector pt∈[0,1]d satisfying ∥pt∥=1, the n-dimensional vector xt∈Rn, and the d n-dimensional vectors xt(1), xt(2), . . . , xt(d)∈Rn are used as variables. Moreover, the T real numbers u1, u2, . . . , uT are used as respective random variables that are uniformly distributed on the interval [0,1]. Further, T integers s1, s2, . . . , sT are used as respective random variables that are uniformly distributed on {0,1, . . . , n}.
The initial setting step S125 is a step of setting the constants d, η, η(1), η(2), . . . , and η(d) and initializing the vectors p1, x1(1), x1(2), . . . , and x1(d). In the initial setting step S125, the subset sequence derivation unit 12 sets the constant d to, for example, a number obtained by quadrupling the smallest natural number not less than log T. The subset sequence derivation unit 12 sets the constant η to, for example, η=[log d/{2(n+1)2T}]1/2. For each j∈[d], the subset sequence derivation unit 12 sets the constant η(j) to, for example, η(j)=(n/2j)1/2. The subset sequence derivation unit 12 initializes the vector p1 to, for example, p1=(1/d, 1/d, . . . , 1/d). For each j∈[d], the subset sequence derivation unit 12 initializes the vector x1(j) to, for example, x1(j)=(0, 0, . . . , 0).
The subset derivation step S126 is a step of deriving the subset Xt. In the subset derivation step S126, the subset sequence derivation unit 12 first sets the vector xt to, for example, xt=Σj∈[d]ptjxt(j). Note here that ptj represents the jth component of the vector pt. Next, the subset sequence derivation unit 12 randomly sets respective values of the random variables ut and st. Subsequently, the subset sequence derivation unit 12 derives either the subset Xt defined by Xt={i∈[n]|xti≥ut} or the subset Xt defined by Xt={σ(j)|j∈[st]}. Note here that xti represents the ith component of the vector xt. Note also that σ represents the permutation on the set [n] satisfying xtσ(1)≥xtσ(2)≥ . . . ≥xtσ(n). A probability with which the subset Xt={i∈[n]|xti≥ut} is derived in the subset derivation step S126 is set to 1−γ. In other words, a probability with which the subset Xt={σ(j)|j∈[st]} is derived in the subset derivation step S126 is set to γ.
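The two-branch sampling in this step can be sketched as follows: with probability 1−γ the fractional point is rounded as in the full-information case, and with probability γ a uniformly random sorted prefix is played. Names are illustrative.

```python
import random

def bandit_select(x_t, gamma, rng=random):
    """With probability 1 − γ return X_t = {i | x_{t,i} ≥ u_t} (exploitation);
    otherwise return the exploration set X_t = {σ(j) | j ∈ [s_t]} for s_t
    uniform on {0, 1, ..., n}, where σ sorts the coordinates of x_t in
    decreasing order."""
    n = len(x_t)
    sigma = sorted(range(1, n + 1), key=lambda i: -x_t[i - 1])
    if rng.random() < 1 - gamma:             # exploitation branch
        u_t = rng.random()
        return {i + 1 for i, x_i in enumerate(x_t) if x_i >= u_t}
    s_t = rng.randint(0, n)                  # exploration branch
    return set(sigma[:s_t])
```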
The unbiased estimator derivation step S127 is a step of deriving an unbiased estimator ĝt of the subgradient gt at xt of the objective function ft. In the unbiased estimator derivation step S127, it is possible to refer only to the value ft(Xt) of the objective function ft with respect to the subset Xt∈2[n] derived in the subset derivation step S126. In the unbiased estimator derivation step S127, the subset sequence derivation unit 12 derives, for example, the unbiased estimator ĝt defined by the following expression (7). In the following expression (7), σ represents the permutation on the set [n] satisfying xtσ(1)≥xtσ(2)≥ . . . ≥xtσ(n). qt represents a vector in which the ith component qti is defined by qti=γ/(1+n)+(1−γ)(xtσ(i)−xtσ(i+1)). it represents a natural number satisfying Xt=Sσ(it).
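The vector qt mixes the uniform exploration distribution with the rounding distribution over sorted prefixes. The boundary conventions xσ(0)=1 and xσ(n+1)=0 in the sketch below are assumptions (the text does not state them); under those conventions, and with xt∈[0,1]n, the components of qt form a probability distribution over the n+1 prefixes.

```python
def prefix_probabilities(x_t, gamma):
    """q_{t,i} = γ/(n+1) + (1−γ)(x_{σ(i)} − x_{σ(i+1)}) for i = 0, ..., n:
    the probability that the prefix S_σ(i) is the set played in round t.
    Boundary values x_{σ(0)} = 1 and x_{σ(n+1)} = 0 are assumed here."""
    n = len(x_t)
    xs = [1.0] + sorted(x_t, reverse=True) + [0.0]  # x_{σ(0)}, ..., x_{σ(n+1)}
    return [gamma / (n + 1) + (1 - gamma) * (xs[i] - xs[i + 1])
            for i in range(n + 1)]
```

Because the differences telescope, the components always sum to γ+(1−γ)=1, and each component is at least γ/(n+1), which keeps the importance weights in the estimator bounded.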
The vector update step S128 is a step of updating the vectors pt and xt(1),xt(2), . . . ,xt(d). In the vector update step S128, the subset sequence derivation unit 12 updates the vector xt(j) in accordance with, for example, the following expression (8). The subset sequence derivation unit 12 updates the vector pt in accordance with, for example, the following expression (9).
As is clear from the theorem B, use of the subset sequence derivation process S12 in accordance with a specific example of the present invention enables the expected value of the regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) with respect to any benchmark X1*, X2*, . . . , XT* satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V to be not more than the upper limit Max(n,T,V) defined by the following expression (10):
A second example embodiment of the present invention will be described in detail with reference to the drawings.
Considered are (i) a set S consisting of n elements and (ii) an objective function ft: 2S→R≥0 defined for each round t∈[T]. Note here that n and T each represent any natural number. [T] represents a set of natural numbers not less than 1 and not more than T. 2S represents the power set of the set S, that is, a set consisting of all subsets of the set S. R≥0 represents the set of all nonnegative real numbers. It is assumed that each objective function ft is a normalized submodular function, that is, a submodular function satisfying ft(Ø)=0.
Among problems of selecting a subset sequence X1, X2, . . . , XT∈2S, a problem whose target is maximization of a cumulative value Σt∈[T]ft(Xt) of the objective function ft is referred to as an “online submodular maximization problem”. In the second example embodiment, the online submodular maximization problem is studied under the full-information setting (described earlier).
A configuration of an information processing apparatus 2 in accordance with the second example embodiment will be described with reference to
The information processing apparatus 2 is an apparatus for solving the online submodular maximization problem related to the set S consisting of the n elements. As illustrated in
The objective function setting unit 21 is a means that sets, as the objective function ft in each round t, a normalized submodular function on the power set 2S of the set S. The objective function setting unit 21 is an example of the “objective function setting means” in the claims. The submodular function that the objective function setting unit 21 sets as the objective function ft may be (i) predetermined, (ii) input by a user via a keyboard or the like, or (iii) input by another apparatus via a communication network or the like. The submodular function that the objective function setting unit 21 sets as the objective function ft may be generated in various processes carried out inside the information processing apparatus 2.
The subset sequence derivation unit 22 is a means that derives a subset sequence X1, X2, . . . , XT satisfying a condition β1 or β2 below. The subset sequence derivation unit 22 is an example of the “subset sequence derivation means” in the claims. The subset sequence X1, X2, . . . , XT that is derived by the subset sequence derivation unit 22 may be provided to a user via a display or the like, or may be provided to another apparatus via a communication network or the like. The subset sequence X1,X2, . . . , XT that is derived by the subset sequence derivation unit 22 may be used in various processes carried out inside the information processing apparatus 2.
The condition β1 is that each subset Xt satisfies |Xt|≤k assuming that k is a given natural number and that an asymptotic behavior of an expected value of α regret αΣt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) with respect to any benchmark X1*, X2*, . . . , XT*∈2S satisfying |Xt*|≤k and Σt∈[T−1]dH(Xt*,Xt+1*)≤V coincides with an asymptotic behavior of a function A(k,T,V) determined from k, T, and V, assuming that V is a given integer not less than 0.
The condition β2 is that the asymptotic behavior of the expected value of the α regret αΣt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) with respect to any benchmark X1*, X2*, . . . , XT*∈2S satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V coincides with an asymptotic behavior of a function B(n,T,V) determined from n, T, and V, assuming that V is a given integer not less than 0.
A flow of an information processing method S2 in accordance with the second example embodiment will be described with reference to
The information processing method S2 is a method for solving the online submodular maximization problem related to the set S consisting of the n elements. As illustrated in
The objective function setting process S21 is a process for setting, as the objective function ft in each round t, the normalized submodular function on the power set 2S of the set S. The objective function setting process S21 is carried out by, for example, the objective function setting unit 21 of the information processing apparatus 2. The subset sequence derivation process S22 is a process for deriving the subset sequence X1, X2, . . . , XT satisfying the condition β1 or β2 shown in the previous section. The subset sequence derivation process S22 is carried out by, for example, the subset sequence derivation unit 22 of the information processing apparatus 2.
In the information processing apparatus 2 and the information processing method S2 in accordance with the second example embodiment, subsets X1, X2, . . . , XT are derived in which the expected value of the α regret αΣt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) is not more than an upper limit Max(k,T,V) or an upper limit Max(n,T,V). In this case, the benchmark X1*, X2*, . . . , XT* need not be constant. It is therefore possible to derive useful subsets X1, X2, . . . , XT also for an online submodular maximization problem for which a fixed strategy is not effective.
The inventors of the present invention have succeeded in proving, regarding the online submodular maximization problem in full-information setting in which the number of elements of the subset Xt is fixed, the following theorem C.
Theorem C: If each objective function ft has monotonicity and a subset sequence X1, X2, . . . , XT∈2[n] constituted by subsets Xt each consisting of k or less elements is a subset sequence derived by algorithms shown in Tables 3 and 4 below, the following evaluation formula (11) holds true for any benchmark X1*, X2*, . . . , XT*∈2[n] constituted by subsets Xt* each consisting of k or less elements. Note here that the objective function ft having monotonicity means that ft(X)≤ft(Y) holds true for any subsets X,Y∈2[n] satisfying X⊆Y. Note also that O of Landau with a tilde above represents an asymptotic behavior in disregard of a polynomial of log T and a polynomial of log n.
[Tables 3 and 4: an algorithm for online submodular maximization under a size constraint, which runs copies of the FSF algorithm (Algorithm 4) with parameters including the number T of rounds, over the base set and the size-constraint parameter. The tables are largely missing or illegible as filed.]
Note that the algorithm shown in Table 4 includes a plurality of fixed share forecaster (FSF) algorithms corresponding to different η(j). Since each of the FSF algorithms is a publicly-known algorithm, a description thereof is omitted here. In the following description, {it1, it2, . . . , its} is referred to as Xts, and {it1, it2, . . . , itk} is referred to as Xt. Furthermore, Xts∪{it,s+1} is referred to as Xt,s+1, and ft(Xts∪{i})−ft(Xts) is referred to as lti.
The following description will discuss, with reference to
The FSF algorithm initialization step S221 is a step of initializing, in accordance with the number T of rounds, k FSF algorithm execution modules FSF*(1), FSF*(2), . . . , FSF*(k) that execute the FSF algorithms.
The subset derivation step S222 is a step of deriving the subset Xt. In the subset derivation step S222, after setting Xt0 to Xt0=Ø, the subset sequence derivation unit 22 repeatedly carries out the following process for s=1,2, . . . , k. First, the subset sequence derivation unit 22 reads a vector pt(s) that is output by an FSF algorithm execution module FSF*(s). Next, the subset sequence derivation unit 22 derives an element its from the read vector pt(s). Subsequently, the subset sequence derivation unit 22 uses the derived element its to generate Xts=Xt,s−1∪{its}. The subset sequence derivation unit 22 derives a subset Xt=Xtk by repeatedly carrying out the above process for s=1,2, . . . , k.
The feed generation step S223 is a step of generating feeds lt(1), lt(2), . . . , lt(k) to be input to the respective FSF algorithm execution modules FSF*(1), FSF*(2), . . . , FSF*(k). In the feed generation step S223, the subset sequence derivation unit 22 generates, in accordance with lti(s)=ft(Xt,s−1∪{i})−ft(Xt,s−1)(i∈[n]), a feed lt(s)=(lt1(s),lt2(s), . . . ,ltn(s)) to be input to the FSF algorithm execution module FSF*(s).
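One round of steps S222 and S223 can be sketched as follows. The module interface is a hypothetical stand-in for FSF*(1), . . . , FSF*(k); how a module turns its vector pt(s) into a proposed element is outside this sketch.

```python
class FixedModule:
    """Hypothetical stand-in for an FSF execution module: always proposes
    the same element and records the last feed it received."""
    def __init__(self, element):
        self.element = element
        self.last_feed = None
    def propose(self):
        return self.element
    def update(self, feed):
        self.last_feed = feed

def greedy_round(f, n, modules):
    """One round of the size-constrained scheme: module s proposes element
    i_{t,s}; X_t grows as X_{t,s} = X_{t,s−1} ∪ {i_{t,s}}; the feed to
    module s is the marginal-gain vector l_{t,i}(s) = f(X_{t,s−1} ∪ {i}) −
    f(X_{t,s−1}) over i ∈ [n], computed before i_{t,s} is added."""
    X = set()
    for module in modules:
        feed = [f(X | {i}) - f(X) for i in range(1, n + 1)]  # l_t(s)
        i_ts = module.propose()
        module.update(feed)
        X = X | {i_ts}
    return X
```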
As is clear from the theorem C, use of the subset sequence derivation process S22 in accordance with a specific example of the present invention enables an asymptotic behavior of an expected value of (1−1/e) regret (1−1/e)Σt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) with respect to any benchmark X1*, X2*, . . . , XT*∈2[n] satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V and constituted by subsets Xt* each consisting of k or less elements to coincide with the asymptotic behavior of the function A(k,T,V) represented by the following expression (12):
A(k,T,V)=√(kT(k+V))   (12)
The inventors of the present invention have succeeded in proving, regarding the online submodular maximization problem in full-information setting in which the number of elements of the subset Xt is not fixed, the following theorem D.
Theorem D: If a subset sequence X1, X2, . . . , XT∈2[n] is a subset sequence derived by an algorithm shown in Table 5 below, the following evaluation formula (13) holds true for any benchmark X1*, X2*, . . . , XT*∈2[n].
[Table 5: an algorithm for online submodular maximization without a size constraint, which runs copies of the FSF algorithm (Algorithm 4) with parameters including the number T of rounds. The table is largely missing or illegible as filed.]
The following description will discuss, with reference to
The FSF algorithm initialization step S224 is a step of initializing, in accordance with the number T of rounds, n FSF algorithm execution modules FSF*(1), FSF*(2), . . . , FSF*(n) that execute the FSF algorithms.
The subset derivation step S225 is a step of deriving the subset Xt. In the subset derivation step S225, after setting Xt0 to Xt0=Ø and setting Yt0 to Yt0=[n], the subset sequence derivation unit 22 repeatedly carries out the following process for s=1, 2, . . . , n. First, the subset sequence derivation unit 22 reads the vector pt(s) that is output by the FSF algorithm execution module FSF*(s) and sets qt(s) to qt(s)=(1+2pt1(s))/4. Next, with a probability qt(s), the subset sequence derivation unit 22 sets Xts to Xts=Xt,s−1∪{s} and sets Yts to Yts=Yt,s−1. Otherwise, the subset sequence derivation unit 22 sets Xts to Xts=Xt,s−1 and sets Yts to Yts=Yt,s−1\{s}. The subset sequence derivation unit 22 derives the subset Xt=Xtn=Ytn by repeatedly carrying out the above process for s=1, 2, . . . , n.
The feed generation step S226 is a step of generating the feeds lt(1), lt(2), . . . , lt(n) to be input to the respective FSF algorithm execution modules FSF*(1), FSF*(2), . . . , FSF*(n). In the feed generation step S226, the subset sequence derivation unit 22 sets αts to αts=ft(Xt,s−1∪{s})−ft(Xt,s−1) and sets βts to βts=ft(Yt,s−1\{s})−ft(Yt,s−1). The subset sequence derivation unit 22 generates, in accordance with lt1(s)=(1−qt(s))αts and lt2(s)=qt(s)βts, a feed lt(s)=(lt1(s), lt2(s)) to be input to the FSF algorithm execution module FSF*(s).
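One round of steps S225 and S226 can be sketched as follows. Here p_heads[s−1] plays the role of pt1(s), and draw is a source of Uniform[0,1] variates; both are illustrative stand-ins.

```python
import random

def unconstrained_round(f, n, p_heads, draw=random.random):
    """One round of the size-unconstrained scheme: sweep s = 1, ..., n with a
    growing set X and a shrinking set Y; with probability
    q_t(s) = (1 + 2 p_{t,1}(s)) / 4 element s joins X, otherwise it leaves Y.
    The feed for module s is (l_1, l_2) = ((1 − q)α, qβ) with
    α = f(X ∪ {s}) − f(X) and β = f(Y \\ {s}) − f(Y), computed before the flip."""
    X, Y = set(), set(range(1, n + 1))
    feeds = []
    for s in range(1, n + 1):
        q = (1 + 2 * p_heads[s - 1]) / 4      # q_t(s) ∈ [1/4, 3/4]
        alpha = f(X | {s}) - f(X)             # gain of adding s to X
        beta = f(Y - {s}) - f(Y)              # gain of deleting s from Y
        feeds.append(((1 - q) * alpha, q * beta))
        if draw() < q:
            X = X | {s}                       # keep element s
        else:
            Y = Y - {s}                       # reject element s
    assert X == Y                             # the two sets meet at X_{t,n} = Y_{t,n}
    return X, feeds
```

Because every element is either added to X or removed from Y exactly once, the two sets coincide after the sweep, which is the set Xt=Xtn=Ytn played in the round.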
As is clear from the theorem D, use of the subset sequence derivation process S22 in accordance with a specific example of the present invention enables an asymptotic behavior of an expected value of (½) regret (½)Σt∈[T]ft(Xt*)−Σt∈[T]ft(Xt) with respect to any benchmark X1*, X2*, . . . , XT*∈2[n] satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V to coincide with an asymptotic behavior of a function B(n,T,V) represented by the following expression (14):
B(n,T,V)=√(T(1+V/n))   (14)
Some or all of functions of the information processing apparatus 1 or 2 can be realized by hardware provided in an integrated circuit (IC chip) or the like or can be alternatively realized by software. In the latter case, the functions of the units of the information processing apparatus 1 or 2 are realized by, for example, a computer that executes instructions of a program that is software.
Examples of the at least one processor C1 encompass a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a micro processing unit (MPU), a floating point unit (FPU), a physics processing unit (PPU), a microcontroller, and a combination thereof. Examples of the at least one memory C2 encompass a flash memory, a hard disk drive (HDD), a solid state drive (SSD), and a combination thereof.
Note that the computer C may further include a random access memory (RAM) in which the program P is to be loaded while being executed and in which various kinds of data are to be temporarily stored. The computer C may further include a communication interface through which data is to be transmitted and received between the computer C and at least one other apparatus. The computer C may further include an input/output interface through which (i) an input apparatus(es) such as a keyboard and/or a mouse and/or (ii) an output apparatus(es) such as a display and/or a printer is/are to be connected to the computer C.
The program P can be recorded in a non-transitory, tangible storage medium M capable of being read by the computer C. Examples of such a storage medium M encompass a tape, a disk, a card, a semiconductor memory, and a programmable logic circuit. The computer C can acquire the program P via the storage medium M. The program P can alternatively be transmitted via a transmission medium. Examples of such a transmission medium encompass a communication network and a broadcast wave. The computer C can alternatively acquire the program P via the transmission medium.
The information processing apparatus 1 or 2 described earlier is applicable to various problems. An example of this is shown below.
It is assumed that a measure is to adjust the respective beer prices of companies in a certain store. For example, in a case where the implemented measure Xt=[0,2,1, . . . ], it is assumed that a first element indicates setting of a beer price of a company A to a fixed price, a second element indicates a 10% increase in a beer price of a company B from a fixed price, and a third element indicates a 10% reduction in a beer price of a company C from a fixed price.
The objective function ft regards the implemented measure Xt as an input and regards, as an output, a result obtained by applying the implemented measure Xt to the respective beer prices of the companies to carry out sales. In this case, application of the above-described optimization method makes it possible to derive optimum setting of the respective beer prices of the companies in the above store.
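As a concrete illustration of this encoding, the following sketch applies a measure vector to per-company base prices. The action codes (0: fixed price, 1: 10% reduction, 2: 10% increase) follow the example above; the mapping to multipliers, the base prices, and the function name are hypothetical.

```python
# Action codes as in the example above; the mapping to price
# multipliers is an illustrative assumption.
ACTION_MULTIPLIER = {0: 1.00, 1: 0.90, 2: 1.10}

def apply_measure(base_prices, measure):
    """Apply the implemented measure X_t to per-company base prices."""
    return [p * ACTION_MULTIPLIER[a] for p, a in zip(base_prices, measure)]

# X_t = [0, 2, 1]: company A fixed, company B +10%, company C -10%
new_prices = apply_measure([5.00, 4.80, 5.20], [0, 2, 1])
```

A real objective function ft would then map such adjusted prices to an observed sales result for round t.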
The following description will discuss a case of application to an investment activity of, for example, an investor. In this case, it is assumed that the implemented measure Xt is investment (purchase, capital increase) with respect to a plurality of financial products (stock brands, etc.) held or to be held by the investor, or selling or holding of the plurality of financial products. For example, in a case where the implemented measure Xt=[1,0,2, . . . ], it is assumed that the first element indicates additional investment in stocks of a company A, the second element indicates holding (neither purchasing nor selling) receivables of a company B, and the third element indicates selling stocks of a company C. The objective function ft regards the implemented measure Xt as the input and regards, as the output, a result obtained by applying the implemented measure Xt to the investment activity with respect to financial products of the companies.
In this case, application of the above-described optimization method makes it possible to derive an optimum investment activity of the investor with respect to each brand.
The following description will discuss a case of application to an administration activity for a clinical trial of a certain drug of a pharmaceutical company. In this case, it is assumed that the implemented measure Xt is a dose of administration or avoidance of administration. For example, in a case where the implemented measure Xt=[1,0,2, . . . ], it is assumed that the first element indicates that administration in a dose 1 is carried out with respect to a subject A, the second element indicates that administration is not carried out with respect to a subject B, and the third element indicates that administration in a dose 2 is carried out with respect to a subject C. The objective function ft regards the implemented measure Xt as the input and regards, as the output, a result obtained by applying the implemented measure Xt to the administration activity with respect to each of the subjects.
In this case, application of the above-described optimization method makes it possible to derive an optimum administration activity with respect to each of the subjects in the clinical trial of the pharmaceutical company.
The following description will discuss a case of application to an advertising activity (marketing measure) in an operating company of a certain electronic commerce site. In this case, it is assumed that the implemented measure Xt is advertising (an online (banner) advertisement, advertising by electronic mail, direct mail, electronic mail transmission of a discount coupon, etc.), with respect to a plurality of customers, for a product or service to be sold by the operating company. For example, in a case where the implemented measure Xt=[1,0,2, . . . ], it is assumed that the first element indicates a banner advertisement with respect to a customer A, the second element indicates that advertising is not carried out with respect to a customer B, and the third element indicates electronic mail transmission of a discount coupon to a customer C. The objective function ft regards the implemented measure Xt as the input and regards, as the output, a result obtained by applying the implemented measure Xt to the advertising activity with respect to each of the customers. Note here that a result of implementation may be whether or not a banner advertisement has been clicked, a purchase amount, a purchase probability, or an expected value of the purchase amount.
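A minimal sketch of such an objective for the advertising example might score a measure by an expected total purchase amount. The action codes, the per-action lift factors, and the names below are illustrative assumptions, not part of the disclosed apparatus.

```python
# Hypothetical per-customer response model: LIFT[a] multiplies a
# customer's baseline expected purchase amount under action a
# (0: no advertising, 1: banner advertisement, 2: coupon by e-mail).
LIFT = {0: 1.0, 1: 1.2, 2: 1.5}

def expected_purchase(measure, baselines):
    """Objective-style score: expected total purchase amount under X_t."""
    return sum(b * LIFT[a] for b, a in zip(baselines, measure))

# X_t = [1, 0, 2]: banner ad to customer A, nothing to B, coupon to C
score = expected_purchase([1, 0, 2], [10.0, 20.0, 30.0])
```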
In this case, application of the optimization method of the second example embodiment makes it possible to derive an optimum advertising activity of the operating company with respect to each of the customers.
The present invention is not limited to the foregoing example embodiments, but may be altered in various ways by a skilled person within the scope of the claims. For example, the present invention also encompasses, in its technical scope, any example embodiment derived by appropriately combining technical means disclosed in the foregoing example embodiments.
The whole or part of the example embodiments disclosed above can be described as, but not limited to, the following supplementary notes.
An information processing apparatus including:
The information processing apparatus according to Supplementary note 1, wherein
Max(n,T,V)=4√{square root over (T(n+2V))}+√{square root over (32T log(⌈log T⌉+4))} (a)
The information processing apparatus according to Supplementary note 2, wherein
The information processing apparatus according to Supplementary note 1, wherein
The information processing apparatus according to Supplementary note 4, wherein
An information processing apparatus including:
An information processing apparatus including:
An information processing apparatus including:
The information processing apparatus according to Supplementary note 8, wherein
A(k,T,V)=√{square root over (kT(k+V))} (c)
The information processing apparatus according to Supplementary note 9, wherein
The information processing apparatus according to Supplementary note 8, wherein
B(n,T,V)=√{square root over (T(1+V/n))} (d)
The information processing apparatus according to Supplementary note 11, wherein
An information processing apparatus including:
An information processing apparatus including:
An information processing method including: setting, as an objective function ft in each round t∈[T] (T is any natural number), a submodular function on a power set 2S of a set S consisting of n elements (n is any natural number); and
An information processing method including: setting, as an objective function ft in each round t∈[T] (T is any natural number), a normalized submodular function on a power set 2S of a set S consisting of n elements (n is any natural number); and
A program for causing a computer to operate as an information processing apparatus,
A computer-readable storage medium storing the program according to Supplementary note 17.
A program for causing a computer to operate as an information processing apparatus,
A computer-readable storage medium storing the program according to Supplementary note 19.
An information processing apparatus including at least one processor, the at least one processor carrying out: an objective function setting process for setting, as an objective function ft in each round t∈[T] (T is any natural number), a submodular function on a power set 2S of a set S consisting of n elements (n is any natural number); and a subset sequence derivation process for deriving a subset sequence X1, X2, . . . , XT∈2S in which an expected value of regret Σt∈[T]ft(Xt)−Σt∈[T]ft(Xt*) with respect to any benchmark X1*, X2*, . . . , Xt*∈2S satisfying Σt∈[T−1]dH(Xt*,Xt+1*)≤V is not more than an upper limit Max (n,T,V) determined from n,T,V, assuming that V is a given integer not less than 0, where dH(Xt*,Xt+1*) is a Hamming distance between subsets, the Hamming distance being defined by dH(Xt*,Xt+1*)=|Xt*∪Xt+1*|−|Xt*∩Xt+1*|.
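The Hamming distance between subsets used in the benchmark constraint is simply the size of their symmetric difference; the following one-line check (the function name is an illustrative assumption) confirms the definition on small sets.

```python
def hamming_distance(X, Y):
    # d_H(X, Y) = |X ∪ Y| - |X ∩ Y|, i.e. the size of X Δ Y
    return len(X | Y) - len(X & Y)

# Example: {1, 2} and {2, 3} differ in elements 1 and 3
d = hamming_distance({1, 2}, {2, 3})
```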
An information processing apparatus including at least one processor, the at least one processor carrying out: an objective function setting process for setting, as an objective function ft in each round t∈[T] (T is any natural number), a normalized submodular function on a power set 2S of a set S consisting of n elements (n is any natural number); and a subset sequence derivation process for deriving a subset sequence X1, X2, . . . , XT satisfying the following condition β1 or β2:
Note that any of these information processing apparatuses may further include a memory, which may store a program for causing the at least one processor to carry out the objective function setting process and the subset sequence derivation process. Note also that the program may be recorded in a non-transitory, tangible computer-readable storage medium.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/038818 | 10/14/2020 | WO |